Site Reliability Engineer
22 hours ago
About the Company
This company operates a global computing platform that enables businesses to programmatically deploy single-tenant Bare Metal instances across multiple regions worldwide.
They are a team of passionate engineers working at the intersection of hardware, software, and network infrastructure, building the fastest, most developer-centric single-tenant cloud infrastructure on the market. If you share this passion, this role offers the opportunity to help shape the future of internet-scale infrastructure.
This position is being managed in partnership with an external recruitment consultancy supporting the company throughout the hiring process.
Summary
The Reliability team is responsible for the health and resilience of the infrastructure powering a global bare metal cloud platform. As a Senior Site Reliability Engineer (SRE), you'll focus on building reliable, observable, and self-healing systems at scale.
SREs here operate at the intersection of software engineering and infrastructure — designing tools that automate operations, improve incident response, and enhance observability, ensuring the platform delivers high performance and reliability to customers worldwide.
This role is ideal for engineers passionate about reliability, automation, distributed systems, and bringing cloud-like experiences to bare metal environments.
Key Responsibilities
- Continuously improve platform reliability and performance.
- Design, build, and maintain tools to automate operational workflows and incident response.
- Implement and enhance observability systems (monitoring, alerting, tracing).
- Collaborate with engineering and platform teams to design scalable and resilient systems.
- Participate in on-call rotations and lead post-incident reviews with a learning-focused approach.
- Develop and document operational playbooks and processes.
- Contribute to defining SLOs/SLIs and driving reliability metrics across teams.
Skills & Qualifications
Required:
- Fluent verbal and written English communication skills
- Advanced experience with Linux/Unix in production environments
- Hands-on experience with Kubernetes and container orchestration
- Proficiency with IaC tools (e.g., Terraform, Ansible)
- Experience with observability stacks (Prometheus, Grafana, Loki, ELK, etc.)
- Proficiency with scripting/programming languages such as Bash, Python, Go, or Ruby
- Working knowledge of Git and CI/CD pipelines
- Experience with incident response and root cause analysis
- Knowledge of cloud-native reliability and security best practices
What’s Offered
- Contractor engagement (PJ, i.e. Pessoa Jurídica: contracting through your own legal entity)
- Paid Time Off
- Competitive compensation package
- Wellness benefit (Wellhub / Gympass equivalent)
- Annual performance-based bonus
- Flexible working hours
- Opportunities for technical and career growth
-
Senior Site Reliability
6 days ago
Manaus, Brazil | Canonical | Full-time
Senior Site Reliability / Gitops Engineer. Canonical is a leading provider of open source software and operating...
-
Site Reliability Engineer
3 weeks ago
Manaus, Brazil | Canonical | Full-time
Canonical is a leading provider of open source software and operating systems to the global enterprise and technology markets. Our platform, Ubuntu, is widely used in breakthrough enterprise initiatives such as public cloud, data science, AI, engineering innovation, and IoT. Our customers...
-
Site Reliability Engineer
1 week ago
Manaus, Brazil | Canonical | Full-time
Canonical is a leading provider of open source software and operating systems to the global enterprise and technology markets. Our platform, Ubuntu, is widely used in breakthrough enterprise initiatives such as public cloud, data science, AI, engineering innovation, and IoT. Our customers include...
-
Senior Site Reliability
1 week ago
Manaus, Brazil | Canonical | Full-time
Senior Site Reliability / Gitops Engineer. Canonical is a leading provider of open source software and operating...
-
Site Reliability Engineer Sr
2 weeks ago
Manaus, Brazil | Mercado Eletrônico | Full-time
Mercado Eletrônico is the Latin American leader in B2B procurement management solutions. Its technologies and services for procurement departments help companies achieve greater savings, agility, governance, and collaboration. With offices in Brazil, the United States, Mexico, and Portugal, it counts more than 1 million suppliers, 10 thousand...
-
Senior Site Reliability Engineer
4 days ago
Manaus, Brazil | Canonical | Full-time
Overview: Senior Site Reliability Engineer role at Canonical. What we are looking for: Senior Site Reliability Engineer. Next-gen operations at scale, with pure Python infra-as-code, from bare metal to containers and applications. Our goal is to perfect enterprise infrastructure devops. We run hundreds of private cloud, Kubernetes, and application clusters for...
-
Software Engineer Site Reliability Engineer
11 hours ago
Manaus, Brazil | Scubyt | Full-time
Software Engineer Site Reliability Engineer. Location: Brazil, REMOTE. Duration: Full-time CLT / REMOTE. About the role: The Application SRE Team supports several critical components of our foundational technologies for real-time protection, as well as our RBI and SSPM services. We are a team of software engineers focused on improving availability, latency, performance,...
-
Site Reliability Engineer Sr
1 week ago
Manaus, Brazil | Mercado Eletrônico | Full-time
Mercado Eletrônico is the Latin American leader in B2B procurement management solutions. Its technologies and services for procurement departments help companies achieve greater savings, agility, governance, and collaboration. With offices in Brazil, the United States, Mexico, and Portugal, it counts more than 1 million suppliers, 10 thousand...
-
Site Reliability Engineer
3 weeks ago
Manaus, Brazil | AgileEngine | Full-time
Site Reliability Engineer (Middle/Senior) ID38916. AgileEngine is an Inc. 5000 company that creates award-winning software for Fortune 500 brands and trailblazing startups across 17+ industries. We rank among the leaders in areas like application development and AI/ML, and our people-first culture has earned us multiple Best Place to Work awards. WHY JOIN US...
-
Site Reliability Engineer Lead
4 weeks ago
Manaus, Brazil | Incognia | Full-time
About the opportunity: With ever-growing demand, we need to evolve our infrastructure for monitoring, CI, CD, automation, SSO, and databases, aiming to help product teams build increasingly reliable and scalable applications. The SRE (Site Reliability Engineer) team sits within the Core Engineering area of...