Senior Site Reliability Operations Engineer

37 minutes ago


São Paulo, São Paulo, Brazil Truelogic Full-time
About Truelogic

Truelogic is a leading provider of nearshore staff augmentation services headquartered in New York. For over two decades, we've been delivering top-tier technology solutions to companies of all sizes, from innovative startups to industry leaders, helping them achieve their digital transformation goals.

Our team of 600+ highly skilled tech professionals, based in Latin America, drives digital disruption by partnering with U.S. companies on their most impactful projects. Whether collaborating with Fortune 500 giants or scaling startups, we deliver results that make a difference.

By applying for this position, you're taking the first step in joining a dynamic team that values your expertise and aspirations. We aim to align your skills with opportunities that foster exceptional career growth and success while contributing to transformative projects that shape the future.

Our Client

A leading financial services company.


Job Summary

The Site Reliability Operations (SRO) team ensures 24/7 stability of Pennymac's internal IT infrastructure and mission-critical backend systems. This role is not DevOps-focused, but it is crucial to monitoring, coordinating, and restoring operations during incidents, particularly in a high-stakes, regulated environment.

The role balances incident command, technical troubleshooting, project leadership, and communication with multiple internal and external stakeholders.

Responsibilities
  • Lead incident response as Incident Commander, coordinating teams, communications, and service restoration

  • Produce executive-level incident reports, run RCAs, and drive continuous improvement

  • Monitor and improve observability using tools like AWS CloudWatch and New Relic, reducing alert noise and gaps

  • Provide hands-on system support across Linux and Windows environments, including complex infrastructure issues

  • Manage and execute deployments via Jenkins, GitLab, or similar CI/CD platforms

  • Own infrastructure initiatives such as migrations, upgrades, and process improvements

  • Enforce change management and risk assessment for production changes

  • Maintain documentation and SOPs, acting as a key liaison between engineering teams and external vendors

  • Participate in a one-week on-call rotation, subject to critical incident call-ins between 6:00 PM and 6:00 AM PT

Qualifications and Job Requirements
  • 5+ years of experience in Windows and Linux environments with proven troubleshooting capabilities.

  • Strong knowledge of monitoring tools such as AWS CloudWatch, New Relic, Nagios, and Sumo Logic.

  • Practical experience with CI/CD tools (Jenkins, GitLab) and backup tools (CommVault, AWS Backup).

  • Strong scripting skills in PowerShell, Python, or equivalent.

  • Outstanding communication skills, especially under pressure, including executive reporting.

  • Experience in high-paced environments and with on-call support models.

  • Autonomous and proactive attitude; capable of managing complex tasks independently.

What We Offer
  • 100% Remote Work: Enjoy the freedom to work from the location that helps you thrive. All it takes is a laptop and a reliable internet connection.

  • Highly Competitive USD Pay: Earn excellent compensation in USD that goes beyond typical market offerings.

  • Paid Time Off: We value your well-being. Our paid time off policies ensure you have the chance to unwind and recharge when needed.

  • Work with Autonomy: Enjoy the freedom to manage your time as long as the work gets done. Focus on results, not the clock.

  • Work with Top American Companies: Grow your expertise working on innovative, high-impact projects with industry-leading U.S. companies.

Why You'll Like Working Here
  • A Culture That Values You: We prioritize well-being and work-life balance, offering engagement activities and fostering dynamic teams to ensure you thrive both personally and professionally.

  • Diverse, Global Network: Connect with over 600 professionals in 25+ countries, expand your network, and collaborate with a multicultural team from Latin America.

  • Team Up with Skilled Professionals: Join forces with senior talent. All of our team members are seasoned experts, ensuring you're working with the best in your field.
