Site Reliability Engineer
1 week ago
About the Team/Role
We are seeking a Software Development Engineer Level 3 to join our SRE team dedicated to the Mobility line of business. This role is for a professional with a software development background who will apply SRE principles to ensure the reliability, scalability, and performance of our complex software systems.
The ideal candidate will have related experience and will be a key player in fostering a culture of continuous improvement and collaboration across engineering teams.
SRE is an ongoing journey of continuous improvement, and the core principles apply regardless of the technology's complexity, the customer's needs, or the business context. If you're passionate about building resilient and highly available systems, we encourage you to apply.
How you'll make an impact
As a Site Reliability Engineer, your responsibilities will include:
Embrace Observability: You'll build and maintain comprehensive monitoring and observability systems by meticulously instrumenting applications, infrastructure, and dependencies. You'll create clear dashboards that provide a direct view of system health, standardizing metrics, logs, and tracing to enable effective correlation and analysis.
Design for Performance and Resilience: You will design systems with a focus on scalability, redundancy, and fault tolerance. This includes setting clear performance targets (SLIs/SLOs) aligned with business goals and regularly conducting load testing and chaos engineering to find issues proactively.
Proactive Reliability: You'll help shift our team from a reactive to a proactive mindset by defining explicit Service Level Objectives (SLOs) that reflect user expectations. You'll use error budgets to guide the balance between development and operations, slowing down releases when necessary to maintain reliability.
Incident Management and Learning: You will treat outages and performance degradations as opportunities to improve resilience. This involves streamlining incident response with clear procedures and conducting blameless postmortems to learn from mistakes.
Automate Everything (with Caution): You'll automate repetitive and error-prone tasks to minimize toil and free up the team for high-value work. You'll build robust testing and rollback capabilities into automation pipelines, always maintaining careful oversight and human judgment.
Impact Engineering and Corporate Culture: You'll collaborate with development and product teams to improve system quality and performance. This includes highlighting impacts on quality, bringing focus to customer journey bottlenecks, and helping to prioritize product stories related to defects.
Experience you'll bring
Expertise in software design, development, and testing for software enhancements and new products.
Knowledge of automated testing tools and traditional quality assurance approaches.
Experience with cloud development, including designing, developing, and maintaining applications on platforms like Amazon Web Services/EC2.
Understanding of cloud storage services, including EBS, Amazon S3, and EFS.
Ability to create documentation for future maintenance and issue resolution.
Experience with APIs, pre-scripting, post-scripting, and integration testing.
-
Senior Network Engineer
5 days ago
Salvador, Bahia, Brazil Acronis Full-time R$80.000 - R$120.000 per year
Acronis is a world leader in cyber protection, empowering people by providing them with cutting-edge technology that enables them to monitor, control, and protect the data that their businesses and lives depend on. We are looking for a Senior Linux Systems Administrator who is ready to join our mission in creating a #CyberFit future. The Senior Network...
-
ArgoCD Specialist
7 days ago
Salvador, Bahia, Brazil WEX Full-time R$80.000 - R$120.000 per year
About the Team/Role
We're looking for an experienced DevOps Specialist to lead the design, implementation, and operation of our GitOps platform using ArgoCD. You will be responsible for running ArgoCD at enterprise scale, supporting hundreds of Kubernetes clusters across multiple environments, with a focus on reliability, security, and developer...
-
Mid level Site Reliability Engineer
3 days ago
Salvador, Brazil WEX Brazil Technology Services Full-time
About the Team/Role
The WEX Site Reliability Engineering (SRE) team seeks individuals passionate about developing software and solutions for observability, incident response, reliability, performance, operational excellence, and compliance. As part of the Site Reliability Engineering organization, you will support internal stakeholders and
-
Salvador, Brazil Bebeeengineer Full-time
Job Title: Site Reliability and Software Engineer
We are a global technology consulting company seeking highly skilled professionals to join our team. Our organization has over 12 years of experience in delivering complex projects across multiple countries, with a strong focus on innovation and customer satisfaction. We invest in the growth and development of...
-
Senior Ux/Ui Engineer
12 hours ago
Salvador, Brazil Pride Global Full-time
We're seeking a Senior UX/UI Engineer in Brazil - 100% remote
Type: 6-Month Contract (with strong potential for extension)
Location: Remote
About the Role
We're looking for a Senior UX/UI Engineer to help build and improve internal web-based user interfaces for our Site Reliability Engineering teams. This role is ideal for a front-end engineer who loves to...
-
Site Reliability Engineer
7 days ago
Salvador, Brazil AgileEngine Full-time
Overview
Site Reliability Engineer (Middle/Senior) ID38916 at AgileEngine. AgileEngine is an Inc. 5000 company that creates award-winning software for Fortune 500 brands and trailblazing startups across 17+ industries. We rank among the leaders in areas like application development and AI/ML, and our people-first culture has earned us multiple Best Place to...
-
Senior Site Reliability Engineer
5 days ago
Salvador, Brazil Signify Technology Full-time
The Company
A well-established tech organization building advanced AI products for healthcare and clinical research. The team focuses on secure, reliable platforms that process sensitive medical data and support research and clinical workflows.
Role & Responsibilities
As a Senior SRE, you will: Design and automate infrastructure (infrastructure-as-code...
-
Senior Site Reliability Engineer
2 weeks ago
Salvador, Brazil Signify Technology Full-time
The Company
A well-established tech organization building advanced AI products for healthcare and clinical research. The team focuses on secure, reliable platforms that process sensitive medical data and support research and clinical workflows.
Role & Responsibilities
As a Senior SRE, you will: Design and automate infrastructure (infrastructure-as-code...
-
Linux Site Reliability Consultant
2 weeks ago
Salvador, Brazil Pythian Full-time
Overview
Site Reliability Consultant: Linux Site Reliability Consultant role at Pythian. Location: Brazil | Remote | Work from Home. One available position for the following time zone: PST.
Why Pythian
At Pythian, we are experts in strategic database and analytics services, driving digital transformation and operational excellence. Pythian, a...
-
Site Reliability Engineer
1 day ago
Salvador, Brazil AgileEngine Full-time
Overview
Site Reliability Engineer (Middle/Senior) ID***** at AgileEngine. AgileEngine is an Inc. **** company that creates award-winning software for Fortune 500 brands and trailblazing startups across 17+ industries. We rank among the leaders in areas like application development and AI/ML, and our people-first culture has earned us multiple Best Place to...
-
Mid-Level DevOps
2 weeks ago
Salvador, Brazil Conquest One Full-time
⚙️ Mid-Level DevOps | Hybrid (SP) or 100% Remote
Are you passionate about automation, performance, and system reliability? 💻
We are looking for a mid-level DevOps | Site Reliability Engineer (SRE) to join a team that values innovation, efficiency, and cutting-edge technology! 🚀
🏢 Work model:
Hybrid (1x per week at Faria...