SRE/Production Support Engineer

23 hours ago


Mossoró, Brazil · Tecla · Full-time

*Native/Bilingual English is required for this role (read/written/spoken)
Please upload your CV/résumé in English.

Monthly salary: $4,000 - $5,500 USD

Along with our partner, we are seeking a Senior SRE/Production Support Engineer to lead the operational reliability, stability, and performance of their production systems. The selected professional will serve as a technical leader for incident response, root cause analysis, and long-term operational improvements. This role requires deep expertise in AWS serverless architectures, Python backends, PostgreSQL, and frontend technologies like React/Amplify.

The Senior Production Support Engineer not only resolves incidents but also drives system improvements, mentors junior engineers, and shapes processes for reliability and monitoring.

Responsibilities:
Lead incident management for production issues across AWS Lambda-based microservices, PostgreSQL (RDS), and React/Amplify frontend applications.
Investigate, diagnose, and resolve complex production issues, including performance, data, and configuration problems.
Conduct and lead post-incident reviews and root cause analyses (RCA), driving preventive solutions.
Mentor and guide junior/mid-level production support engineers in troubleshooting and operational best practices.
Maintain and enhance monitoring, alerting, logging, and observability tools (CloudWatch, X-Ray, DataDog, etc.).
Collaborate with engineering teams to improve system reliability, scalability, and maintainability.
Own and improve runbooks, playbooks, and operational documentation.
Participate in on-call rotations, providing technical leadership during high-impact incidents.
Analyze recurring issues and propose architectural or procedural improvements to prevent recurrence.
Support deployment validation, emergency rollbacks, and operational changes.
Partner with DevOps and Engineering teams to optimize performance, cost, and availability of cloud resources.

Required Qualifications:
5+ years of experience in production support, SRE, DevOps, or backend engineering roles.
Strong expertise with AWS services, particularly Lambda, API Gateway, RDS (PostgreSQL), S3, Cognito, and CloudWatch.
Proficient in Python, with the ability to read, debug, and modify code to resolve issues.
Deep understanding of PostgreSQL, including query optimization, data integrity, and troubleshooting.
Experience managing and improving observability, monitoring, and alerting in production systems.
Proven experience handling high-severity incidents and leading incident response.
Strong problem-solving skills and ability to navigate distributed systems.
Excellent communication skills for incident reporting, collaboration, and mentoring.

Preferred Qualifications:
Experience with frontend technologies (React, Amplify) for debugging full-stack issues.
Familiarity with serverless architecture best practices and cost/performance optimization.
Experience with infrastructure-as-code (CloudFormation, CDK, Terraform).
Knowledge of automation and scripting for operational tasks (Python preferred).
Prior experience in defining or improving SLOs, SLAs, and operational KPIs.
Familiarity with modern CI/CD pipelines and automated deployment strategies.
Hands-on experience with observability and monitoring platforms (DataDog, New Relic, Sentry).

Success Indicators:
Production incidents are resolved quickly and effectively, minimizing business impact.
Post-incident RCAs lead to measurable improvements in system reliability.
Operational playbooks and runbooks are well-maintained and widely used.
Junior/mid-level engineers are mentored effectively and develop troubleshooting skills.
Systems are proactively monitored, optimized, and improved for stability, scalability, and cost efficiency.

Tools You May Use:
AWS Services: Lambda, RDS (PostgreSQL), S3, API Gateway, Cognito, CloudWatch, X-Ray, SNS/SQS, EventBridge
Languages & Scripting: Python
Monitoring & Observability: CloudWatch, DataDog, Sentry, X-Ray
Version Control & CI/CD: GitHub/GitLab, CI/CD pipelines
Frontend Collaboration: React, Amplify
Ticketing & Collaboration: Jira, Confluence
AI Prompting: Cursor, ChatGPT

Benefits:
A fully remote position with a structured schedule that supports work-life balance.
The opportunity to join a forward-thinking company transforming the future of film and television production through cutting-edge technology.
Two weeks of paid vacation per year.
10 paid days for local holidays.

Work Schedule: US Pacific Standard Time

*Please note our partner is only looking for full-time dedicated team members who are eager to fully integrate within their team.


