Site Reliability Engineer
2 hours ago
About the Company
This company operates a global computing platform that enables businesses to programmatically deploy single-tenant Bare Metal instances across multiple regions worldwide. They are a team of passionate engineers working at the intersection of hardware, software, and network infrastructure, building the fastest, most developer-centric single-tenant cloud infrastructure on the market. If you share this passion, this role offers the opportunity to help shape the future of internet-scale infrastructure. This position is being managed in partnership with an external recruitment consultancy supporting the company throughout the hiring process.

Summary
The Reliability team is responsible for the health and resilience of the infrastructure powering a global bare metal cloud platform. As a Senior Site Reliability Engineer (SRE), you'll focus on building reliable, observable, and self-healing systems at scale. SREs here operate at the intersection of software engineering and infrastructure, designing tools that automate operations, improve incident response, and enhance observability so that the platform delivers high performance and reliability to customers worldwide. This role is ideal for engineers passionate about reliability, automation, distributed systems, and bringing cloud-like experiences to bare metal environments.

Key Responsibilities
- Continuously improve platform reliability and performance.
- Design, build, and maintain tools to automate operational workflows and incident response.
- Implement and enhance observability systems (monitoring, alerting, tracing).
- Collaborate with engineering and platform teams to design scalable and resilient systems.
- Participate in on-call rotations and lead post-incident reviews with a learning-focused approach.
- Develop and document operational playbooks and processes.
- Contribute to defining SLOs/SLIs and driving reliability metrics across teams.

Skills & Qualifications
Required:
- Fluent verbal and written English communication skills
- Advanced experience with Linux/Unix in production environments
- Hands-on experience with Kubernetes and container orchestration
- Proficiency with IaC tools (e.g., Terraform, Ansible)
- Experience with observability stacks (Prometheus, Grafana, Loki, ELK, etc.)
- Proficiency with scripting/programming languages such as Bash, Python, Go, or Ruby
- Working knowledge of Git and CI/CD pipelines
- Experience with incident response and root cause analysis
- Knowledge of cloud-native reliability and security best practices

What’s Offered
- Contractor engagement (PJ)
- Paid Time Off
- Competitive compensation package
- Wellness benefit (Wellhub / Gympass equivalent)
- Annual performance-based bonus
- Flexible working hours
- Opportunities for technical and career growth
-
Site Reliability Engineer
23 hours ago
Belo Horizonte, Brazil | MetaCTO | Full-time
About Us
At MetaCTO, we specialize in helping startups and growing companies turn visionary ideas into successful digital products through expert app development and fractional CTO services. As a Site Reliability Engineer (SRE), you will play a critical role in ensuring the reliability, scalability, and security of the backend infrastructure that powers...
-
Software Engineer Site Reliability Engineer
23 hours ago
Belo Horizonte, Brazil | Scubyt | Full-time
Software Engineer Site Reliability Engineer
Location: Brazil REMOTE
Duration: Fulltime CLT / REMOTE
About the role
The Application SRE Team supports several critical components of our foundational technologies for real-time protection, as well as our RBI and SSPM services. We are a team of software engineers focused on improving availability, latency,...
-
Site Reliability Engineer
14 hours ago
Belo Horizonte, Brazil | MetaCTO | Full-time
About Us
At MetaCTO, we specialize in helping startups and growing companies turn visionary ideas into successful digital products through expert app development and fractional CTO services. As a Site Reliability Engineer (SRE), you will play a critical role in ensuring the reliability, scalability, and security of the backend infrastructure that powers...
-
Site Reliability Engineer
1 week ago
Belo Horizonte, Brazil | Bairesdev | Full-time
Site Reliability Engineer - Remote Work | REF#******
We are looking for a Site Reliability Engineer to administer and provide support for the whole project infrastructure hosted in the cloud while implementing CI/CD pipelines for the automation of the deployments.
What You Will Do
Ensure high service availability, performance, security, and...
-
Site Reliability Engineer
4 hours ago
Belo Horizonte, Brazil | BairesDev | Full-time
Site Reliability Engineer - Remote Work | REF#
We are looking for a Site Reliability Engineer to administer and provide support for the whole project infrastructure hosted in the cloud while implementing CI/CD pipelines for the automation of the deployments.
What You Will Do
Ensure high service availability, performance, security, and maintainability....
-
Senior Site Reliability Engineer
1 week ago
Belo Horizonte, Brazil | Canonical | Full-time
Canonical is a leading provider of open source software and operating systems to the global enterprise and technology markets. Our platform, Ubuntu, is widely used in breakthrough enterprise initiatives such as public cloud, data science, AI, engineering innovation and IoT. Our customers include the world's leading public cloud and silicon providers, and...
-
Site Reliability Engineer Sr
2 weeks ago
Belo Horizonte, Brazil | Mercado Eletrônico | Full-time
Mercado Eletrônico is the Latin American leader in B2B procurement management solutions. Its technologies and services for procurement departments help companies achieve greater savings, agility, governance, and collaboration. With offices in Brazil, the United States, Mexico, and Portugal, it counts more than 1 million suppliers, 10 thousand...
-
Senior Site Reliability Engineer
7 days ago
Belo Horizonte, Brazil | YAPP | Full-time
Getrak, a leader in SaaS platforms for vehicle tracking, monitoring, and security, is seeking a Senior Site Reliability Engineer (SRE) to join its Technology and Product team. Working in a high-scale, mission-critical environment, you will be responsible for ensuring the reliability, availability, and performance of our platform, which...
-
Site Reliability Engineer
3 days ago
Belo Horizonte, Brazil | Insight Global | Full-time
Remote Automation Cloud Engineer
Required Skills & Experience
Required Skills & Qualifications:
- Minimum 8 years of experience in infrastructure automation and DevOps.
- Strong hands-on experience with Terraform for IaC across Azure, GCP, and OCI.
- Proficiency in Jenkins pipeline development using Groovy.
- Solid experience with Ansible for configuration...