Site Reliability Engineer

1 week ago


São Paulo, São Paulo, Brazil PayRetailers Full-time

Job Description
We're PayRetailers, and we offer cutting-edge payment solutions that empower businesses to succeed in Latin America & Africa. Our collaborative and inclusive work environment encourages creativity and growth, where every employee's contribution is valued.

We've got big plans to expand into new markets and make a meaningful impact on the world of payments. To help us get there, our Technology team is on the lookout for a new Site Reliability Engineer with a strong focus on Data.

About The Role
Site Reliability Engineers are the guardians of our reliability promise. They deliver a highly reliable, resilient, and cost-efficient platform that consistently meets business and customer expectations for availability and performance.

Job Requirements
The ideal candidate meets all of the following requirements.

However, we believe in self-learning and adaptation, so we can be flexible on certain requirements.

What Is a MUST

  • Proactive attitude, always on the lookout for improvement opportunities.
  • Strong scripting skills (Python, Bash).
  • Experience with cloud platforms.
  • Knowledge of Grafana, Application Insights, OpenTelemetry, Prometheus.
  • DBA experience creating and maintaining databases in SQL Server (Mongo or PostgreSQL).
  • Fluent English, with the ability to conduct technical meetings in English.

What Is Nice To Have

  • Experience with non-functional and production testing.
  • Analytical mindset, with the ability to connect the dots and establish cause and effect.
  • Experience with containers and container orchestration platforms (EKS/AKS).
  • Understanding of APIs and asynchronous distributed software architectures.
  • Working knowledge of AI-enabled tools like VS Code, Claude Code, etc.
  • Demonstrable experience applying AI to Site Reliability Engineering.
  • Knowledge of process automation tools such as n8n.
  • Working experience with chaos engineering.

Job Responsibilities

  • Increase automation of operational activities to reduce downtime risk, in collaboration with Platform Engineering and Domain Squads.
  • Drive systemic improvements across engineering teams based on incident RCAs and telemetry insights.
  • Implement non-functional improvements (resilience, performance, reliability) directly in code, with Domain Squads reviewing and approving changes.
  • Promote adoption of SRE best practices across development teams (integration patterns, monitoring, alerting, real-time tracing).
  • Provide cross-platform observability capabilities above and beyond what the Domain Squads provide.
  • Investigate issues and incidents, and propose or implement changes as necessary.
  • Continuously review logs, metrics, and alerts to identify and/or implement continuous improvements.
  • Design non-functional tests and run them continuously to ensure that we build in quality up to and including production.

Job Benefits

  • Individual development plans
  • Excellent working environment and collaboration
  • Private medical insurance covered by the company
  • Meal and Food Allowance
  • Life Insurance
  • Wellhub (Gym)

If you're passionate about tech and innovation, and want to thrive in an environment that values collaboration and diversity, this role might be the perfect fit for you.

Apply today and help us shape the future of the PayTech industry.

To get an idea of what life at PayRetailers is like, check out our Instagram and our About Us page.

Our commitment to diversity, equity & inclusion

At PayRetailers, diversity, equity, and inclusion aren't just values – they're fundamental to who we are. We're dedicated to fostering an environment where every individual feels valued, respected, and empowered to bring their authentic selves to work. We welcome applicants from all backgrounds and identities, recognizing that diversity drives innovation and strengthens our team.

So, if you're passionate about making a difference and excited about the role, we encourage you to apply. Join us in building a global company where everyone can thrive and feel proud to belong.

Please feel free to include your pronouns in your application (e.g. she/her, he/him, they/them, etc.).

