Site Reliability Engineer
4 weeks ago

About the Company
This company operates a global computing platform that enables businesses to programmatically deploy single-tenant bare metal instances across multiple regions worldwide. They are a team of passionate engineers working at the intersection of hardware, software, and network infrastructure, building the fastest, most developer-centric single-tenant cloud infrastructure on the market. If you share this passion, this role offers the opportunity to help shape the future of internet-scale infrastructure. This position is being managed in partnership with an external recruitment consultancy supporting the company throughout the hiring process.

Summary
The Reliability team is responsible for the health and resilience of the infrastructure powering a global bare metal cloud platform. As a Senior Site Reliability Engineer (SRE), you'll focus on building reliable, observable, and self-healing systems at scale. SREs here operate at the intersection of software engineering and infrastructure, designing tools that automate operations, improve incident response, and enhance observability, ensuring the platform delivers high performance and reliability to customers worldwide. This role is ideal for engineers passionate about reliability, automation, distributed systems, and bringing cloud-like experiences to bare metal environments.

Key Responsibilities
Continuously improve platform reliability and performance.
Design, build, and maintain tools to automate operational workflows and incident response.
Implement and enhance observability systems (monitoring, alerting, tracing).
Collaborate with engineering and platform teams to design scalable and resilient systems.
Participate in on-call rotations and lead post-incident reviews with a learning-focused approach.
Develop and document operational playbooks and processes.
Contribute to defining SLOs/SLIs and driving reliability metrics across teams.

Skills & Qualifications
Required:
Fluent verbal and written English communication skills
Advanced experience with Linux/Unix in production environments
Hands-on experience with Kubernetes and container orchestration
Proficiency with IaC tools (e.g., Terraform, Ansible)
Experience with observability stacks (Prometheus, Grafana, Loki, ELK, etc.)
Proficiency with scripting/programming languages such as Bash, Python, Go, or Ruby
Working knowledge of Git and CI/CD pipelines
Experience with incident response and root cause analysis
Knowledge of cloud-native reliability and security best practices

What’s Offered
Contractor engagement (PJ)
Paid Time Off
Competitive compensation package
Wellness benefit (Wellhub / Gympass equivalent)
Annual performance-based bonus
Flexible working hours
Opportunities for technical and career growth
-
Site Reliability Engineer
2 weeks ago
Brasília, Brazil Quantum World Technologies Inc. Full-time
We are seeking a Site Reliability Engineer (SRE) who is passionate about large-scale infrastructure and eager to develop deeper expertise in PostgreSQL. In this role, you will join the Database Engineering organization and help strengthen the reliability, resilience, and automation of our database platform. This position is an excellent fit for an...
-
Site Reliability Engineer
4 weeks ago
Brasília, DF, Brazil MetaCTO Full-time
About Us
At MetaCTO, we specialize in helping startups and growing companies turn visionary ideas into successful digital products through expert app development and fractional CTO services. As a Site Reliability Engineer (SRE), you will play a critical role in ensuring the reliability, scalability, and security of the backend infrastructure that powers...
-
Site Reliability Engineer
1 day ago
Brasília, Distrito Federal, Brazil FullStack Full-time
About FullStack
FullStack is the most transparent IT talent network, connecting highly skilled individuals with top global companies and Silicon Valley startups for remote, on-demand projects. We focus on building a trusted, high-performance network where talent can thrive in a positive, respectful, and supportive environment. By prioritizing transparency,...
-
Site Reliability Engineer
4 weeks ago
Brasília, Brazil MetaCTO Full-time
About Us
At MetaCTO, we specialize in helping startups and growing companies turn visionary ideas into successful digital products through expert app development and fractional CTO services. As a Site Reliability Engineer (SRE), you will play a critical role in ensuring the reliability, scalability, and security of the backend infrastructure that powers...
-
Site Reliability Engineer
4 weeks ago
Brasília, Brazil INDI Staffing Services Full-time
At INDI, we're passionate about empowering individuals and businesses worldwide. Our cutting-edge recruiters connect leading companies with top talent, fostering a dynamic environment where innovation thrives. Join us in shaping the future of work.
Overview of the role:
We are looking for a Site Reliability Engineer to build and maintain highly reliable,...
-
Site Reliability Engineer ID45689
3 weeks ago
Brasília, Brazil AgileEngine Full-time
AgileEngine is an Inc. 5000 company that creates award-winning software for Fortune 500 brands and trailblazing startups across 17+ industries. We rank among the leaders in application development and AI/ML, and our people-first culture has earned us multiple Best Place to Work...
-
Software Engineer Site Reliability Engineer
4 weeks ago
Brasília, Brazil Scubyt Full-time
Software Engineer Site Reliability Engineer
Location: Brazil REMOTE
Duration: Full-time CLT / REMOTE
About the role
The Application SRE Team supports several critical components of our foundational technologies for real-time protection, as well as our RBI and SSPM services. We are a team of software engineers focused on improving availability, latency,...
-
Software Engineer Site Reliability Engineer
3 weeks ago
Brasília, Brazil Scubyt Full-time
Software Engineer Site Reliability Engineer
Location: Brazil REMOTE
Duration: Full-time CLT / REMOTE
About the role
The Application SRE Team supports several critical components of our foundational technologies for real-time protection, as well as our RBI and SSPM services. We are a team of software engineers focused on improving availability, latency, performance,...
-
Software Engineer Site Reliability Engineer
4 weeks ago
Brasília, Brazil Scubyt Full-time
Software Engineer Site Reliability Engineer
Location: Brazil REMOTE
Duration: Full-time CLT / REMOTE
About the role
The Application SRE Team supports several critical components of our foundational technologies for real-time protection, as well as our RBI and SSPM services. We are a team of software engineers focused on improving availability, latency, performance,...