Site Reliability Engineer
5 days ago
Role Summary
The SRE Technical Member will:
- Deliver engineering, operational, and administrative support for the application and its technology landscape.
- Address reliability and operational challenges such as application failures, production issues, infrastructure performance (disk, memory), monitoring, and security.
- Serve as a mid-level subject matter expert, integrating with multiple teams to develop and evolve SRE practices for Azure-based environments.
- Participate in production support activities, including deployments, upgrades, and critical issue resolution.
This role is central to designing, implementing, and maintaining monitoring, alerting, and reporting solutions across servers, containers, databases, and cloud infrastructure components.
Key Responsibilities
- Collaborate with Central SRE, DevOps, and InfoSec teams on new projects, platform builds, and deployments.
- Contribute to the design, implementation, and operation of large-scale, Azure-based platforms.
- Apply industry best practices in monitoring, alerting, reporting, and cloud architecture.
- Participate in infrastructure, application, and security planning, focusing on scalability, redundancy, and data preservation.
- Work with development teams to support high-availability topologies.
- Produce documentation and weekly operational status reports, detailing project progress and key metrics.
- Provide engineering and support for technical infrastructure, cloud, databases, and application performance.
- Manage incident response, change management, and user permissions following SRE best practices (Google SRE model).
- Maintain close collaboration between Application, Central SRE, DevOps, InfoSec, and business units.
- Assist in configuring and onboarding new applications into the Azure DevOps (ADO) platform.
Core Technical Skills
- Strong understanding of SRE fundamentals: monitoring, alerting, reporting, performance, availability, and incident response.
- Hands-on experience with CI/CD tools (Git, Azure Pipelines, Ansible, etc.).
- Infrastructure as Code (IaC) design, scripting, and setup.
- Deep knowledge of Azure services: installation, configuration, and management.
- Experience administering applications built with .NET, C#, and Angular, with a focus on automation, optimization, and security.
- Proficiency in Cosmos DB and MS SQL operational tasks.
- Excellent troubleshooting, root-cause analysis, and problem-solving skills.
- Experience with disaster recovery, scalability testing, and capacity planning.
Qualifications
- Bachelor’s degree in a technical discipline (Computer Science, Engineering, or related field).
- 5+ years of industry experience in SRE, DevOps, or related technical operations roles.
- Proven experience in cloud infrastructure, automation, and application reliability engineering within large-scale, enterprise environments.
-
Site Reliability Engineer Junior
8 hours ago
Recife, Brazil | Incognia | Full-time
About the opportunity: With ever-growing demand, we need to evolve our monitoring, CI, CD, automation, SSO, and database infrastructure to help product teams build increasingly reliable and scalable applications. The SRE (Site Reliability Engineer) team sits within the Core Engineering area of...
-
Site Readiness Engineer
1 week ago
Recife, Brazil | Pathlock | Full-time
About Pathlock: Pathlock is a leader in application security, access governance, and compliance automation. Our cloud-based solutions help organizations secure critical applications, mitigate risk, and enforce policies across a diverse IT landscape.
Job Summary: We are looking for a skilled Site Readiness Engineer (SRE) with expertise in CI/CD automation and...
-
Site Reliability Engineer Junior
4 days ago
Recife, Pernambuco, Brazil | Incognia | Full-time | R$55,000 - R$81,075 per year
About the opportunity: With ever-growing demand, we need to evolve our monitoring, CI, CD, automation, SSO, and database infrastructure to help product teams build increasingly reliable and scalable applications. The SRE (Site Reliability Engineer) team sits within the Core Engineering area of...
-
Site Reliability Engineer Lead
2 weeks ago
Recife, Pernambuco, Brazil | Incognia | Full-time | R$80,000 - R$120,000 per year
Job description
About the opportunity: With ever-growing demand, we need to evolve our monitoring, CI, CD, automation, SSO, and database infrastructure to help product teams build increasingly reliable and scalable applications. The SRE (Site Reliability Engineer) team sits within the Core...
-
BSAtech | Recife
4 weeks ago
Recife, Brazil | BSA CORP LTDA | Full-time
BSAtech is a company specializing in the development of entertainment games with global reach. Our commitment is to deliver high-quality digital experiences by combining innovation, creativity, and technology. We are expanding and looking for exceptional professionals to help us grow our business areas and...
-
Highly Skilled System Reliability Engineer
8 hours ago
Recife, Brazil | Bebeesystemreliability | Full-time
Job Overview
A site reliability engineer is a vital member of our team, responsible for ensuring the stability and performance of our systems.
Key Responsibilities:
ElasticSearch, Prometheus, and Kibana expertise is required. Kubernetes proficiency and hands-on experience with cloud providers (AWS, Azure or GCP) are essential. Infrastructure as Code (IaC) skills...
-
Senior Site Reliability Engineer
4 days ago
Recife, Pernambuco, Brazil | Docplanner | Full-time
Company Description
Welcome to the good side of tech
You might have heard about us but with a different name: Feegow, Doctoralia or Docplanner. The names are different depending on the country we are located in, but we are all part of Docplanner Group. It all started over 12 years ago when we asked ourselves: is anyone in healthcare thinking about patients?...
-
Senior Site Reliability Engineer
4 days ago
Recife, Pernambuco, Brazil | DocPlanner | Full-time
Company Description
Welcome to the good side of tech
You might have heard about us but with a different name: Feegow, Doctoralia or Docplanner. The names are different depending on the country we are located in, but we are all part of Docplanner Group. It all started over 12 years ago when we asked ourselves: is anyone in healthcare thinking about patients? We...
-
System Reliability Engineer
4 days ago
Recife, Brazil | Bebeedesenvolvedor | Full-time
Job Description: The Senior Site Reliability Engineer will work as part of a team, developing and improving infrastructure solutions to support the growth of our services. Contribute to improving the operational culture in engineering, using the market's best SRE and DevOps practices, with the aim of improving infrastructure automation,...