Site Reliability Engineer

1 day ago


Brazil, BR MetaCTO Full-time

About Us

At MetaCTO, we specialize in helping startups and growing companies turn visionary ideas into successful digital products through expert app development and fractional CTO services. As a Site Reliability Engineer (SRE), you will play a critical role in ensuring the reliability, scalability, and security of the backend infrastructure that powers innovative applications for our clients. This role will involve managing cloud environments, optimizing databases, automating deployments, and improving system observability.

Job Description

As a Site Reliability Engineer (SRE) at MetaCTO, you will be responsible for designing, implementing, and maintaining highly available, scalable, and secure infrastructure solutions. You will collaborate with software engineers to improve system performance, automate operations, and ensure the smooth functioning of critical backend services. You’ll work extensively with cloud platforms like AWS, leveraging technologies such as Terraform, Docker, Kubernetes, and CI/CD pipelines to enhance system reliability.

Responsibilities
  • Architect, build, and maintain cloud infrastructure on AWS (Lambda, EC2, RDS, S3, EKS, SQS, CloudWatch).
  • Manage and optimize databases (MySQL, PostgreSQL) for performance, reliability, and security.
  • Implement monitoring, alerting, and logging solutions to ensure system health and performance, using Zabbix and Elastic Logging in particular.
  • Design and maintain CI/CD pipelines for automated deployment and scaling of applications.
  • Work with containerization and orchestration tools such as Docker and Kubernetes.
  • Develop and enforce security best practices for cloud environments and infrastructure.
  • Automate operational processes using Infrastructure-as-Code (Terraform, CloudFormation) and scripting languages like Python or Bash.
  • Troubleshoot and resolve infrastructure-related incidents and optimize system performance.
  • Collaborate with backend engineers to ensure high availability, fault tolerance, and scalable system design, with a strong focus on Django-based applications.
Qualifications
  • 5-10 years of experience in Site Reliability Engineering (SRE), DevOps, or Cloud Engineering roles.
  • Strong expertise in AWS cloud services (EC2, RDS, S3, Lambda, CloudFront, EKS, SQS, IAM).
  • Hands-on experience with containerization (Docker) and orchestration (Kubernetes, ECS, or EKS).
  • Deep knowledge of relational databases (MySQL, PostgreSQL), including performance tuning, query optimization, monitoring, and migration management.
  • Proficiency in Infrastructure-as-Code tools such as Terraform, CloudFormation, or Pulumi.
  • Strong experience with CI/CD pipelines and automation tools (GitHub Actions, Jenkins, CircleCI, or GitLab CI/CD).
  • Proficiency in monitoring tools, specifically Zabbix, and logging solutions like Elastic Logging.
  • Scripting experience with Python, Bash, or Go for automating operational tasks.
  • Experience working with Django-based applications in a cloud environment.
  • Experience implementing security best practices for cloud-based applications.
  • Knowledge of distributed systems and microservices architecture.
Preferred Skills
  • AWS certifications (Solutions Architect, DevOps Engineer) are a plus.
  • Experience with serverless computing and event-driven architectures.
  • Familiarity with message queue services (SQS, RabbitMQ, Kafka).
  • Understanding of zero-downtime deployments and disaster recovery strategies.
Position Details
  • Type: Full-Time
  • Location: 100% Remote
  • Hours: US Pacific Time
How to Apply

If you are passionate about scalability, automation, and reliability, and thrive in a collaborative, fast-paced environment, we’d love to hear from you. Please submit your resume and an optional brief cover letter outlining your relevant experience.

MetaCTO is an equal opportunity employer. We celebrate diversity and are committed to creating an inclusive environment for all employees.


