Site Reliability Engineer

7 days ago


Rio de Janeiro, Brazil | AgileEngine | Full-time

Site Reliability Engineer (Middle/Senior), ID38916
3 weeks ago

Overview
Site Reliability Engineer (Middle/Senior) role focused on managing alerts, on-call duties, and ensuring high availability for SaaS platforms. The position involves collaboration with Support, Customer Success, Migration, and Professional Services teams to deliver best-in-class service.

Responsibilities
  • Shift: Monday – Thursday, 8AM – 7PM PST (11AM – 10PM EST), with rotating on-call.
  • On-call shifts: every 6 weeks, one week as primary responder and the next week as secondary.
  • Manage alerts daily, check systems, and escalate issues as needed.
  • Provide 24x7 on-call support for critical SaaS events.
  • Be available in emergencies when team members are unavailable or need help.
  • Document issues and remediation steps.
  • Proactively create appropriate monitors in the EKS/K8s ecosystem.
  • Deploy to EKS/K8s clusters using Terraform and Helm.
  • Learn and maintain existing infrastructure running under Docker Swarm.
  • Improve infrastructure health by implementing checks and scripts to address known issues.
  • Maintain and develop deployment code; automate manual tasks.
  • Implement/integrate new technologies in the Cloud Infrastructure.
  • Collaborate with other teams to provide high-level support and assistance.
  • Apply a customer-focused approach when planning deployments/updates, considering customer impact.
  • Work with Support, Customer Success, Migration, and Professional Services to deliver SaaS service excellence.
  • Perform RCA and take corrective actions to prevent recurrence; create alert-related actions after investigations.
  • Handle environment-specific support requests; identify automation requirements to improve RCA.

MUST HAVES
  • 2+ years of professional experience
  • Experience working with Datadog
  • Hands-on experience as an AWS Cloud Engineer
  • Working knowledge of EKS/Terraform/Helm
  • Working experience with Docker and Docker Swarm
  • Good understanding of AWS IAM roles and policies
  • Experience logging and monitoring AWS resources using CloudWatch logs
  • Experience working in a Linux environment
  • Proficient in Bash and/or Python scripting
  • Understanding of web technologies such as REST APIs
  • Experience with monitoring solutions such as Grafana and Prometheus
  • Excellent oral and written communication skills; customer-facing communication skills to explain issues and RCAs
  • Experience in Product/Application Support for SaaS-based products
  • Understanding of APIs, Databases, Systems Architecture, and Design
  • Designing, implementing, and operating in a DevSecOps environment
  • Ability to work independently and in a collaborative environment
  • Technical aptitude with the desire to learn new and evolving technologies
  • Upper-Intermediate English level

NICE TO HAVES
  • Experience with GCP or Azure
  • Certifications: AWS Certified DevOps Engineer – Professional or AWS Certified Advanced Networking Specialty

PERKS AND BENEFITS
  • Professional growth: Accelerate your professional journey with mentorship, TechTalks, and growth roadmaps.
  • Competitive compensation: USD-based compensation with budgets for education, fitness, and team activities.
  • A selection of exciting projects: Projects with modern solutions and Fortune 500 clients.
  • Flextime: Options to work from home or in the office for work-life balance.

Seniority level: Mid-Senior level
Employment type: Full-time
Industries: IT Services and IT Consulting


  • Site Reliability Engineer

    3 weeks ago


    Rio de Janeiro, Brazil | BairesDev | Full-time

    Overview: Site Reliability Engineer at BairesDev. We are looking for a Site Reliability Engineer to build and maintain highly reliable, scalable, and secure OpenShift/Kubernetes clusters. You will approach the problem of building and maintaining production systems from a software engineering perspective with a focus on automation and reliability. What You...

  • Site Reliability Engineer

    2 weeks ago


    Rio de Janeiro, Brazil | BairesDev | Full-time

    Overview: Site Reliability Engineer at BairesDev. We are looking for a Site Reliability Engineer to build and maintain highly reliable, scalable, and secure OpenShift/Kubernetes clusters. You will approach the problem of building and maintaining production systems from a software engineering perspective with a focus on automation and reliability. What You Will...


  • Rio de Janeiro, Brazil | BairesDev | Full-time

    3 days ago. At BairesDev®, we've been leading the way in technology projects for over 15 years. We deliver cutting-edge solutions to giants like Google and the most innovative startups in Silicon Valley. Our diverse 4,000+ team, composed of the world's Top 1% of tech talent, works remotely on roles that drive significant...




  • Rio de Janeiro, Rio de Janeiro, Brazil | Pacifica Continental | Full-time | R$80,000 - R$120,000 per year

    Our client is looking for a Site Reliability Engineer to join its team. The role focuses on ensuring the successful delivery of tasks and promoting the technical health of our services. Responsibilities and duties: - Ensure the successful delivery of tasks and promote the technical health of our services; - Design,...

  • Senior Site Reliability

    2 weeks ago


    Intermediate Geographic Region of Juiz de Fora, Brazil | Canonical | Full-time

    Senior Site Reliability / Gitops Engineer at Canonical. 1 day ago.


  • Rio de Janeiro, Brazil | Canonical | Full-time

    Overview: Senior Site Reliability Engineer role at Canonical. Canonical is a leading provider of open source software and operating systems to the global enterprise and technology markets. Our platform, Ubuntu, is widely used in breakthrough enterprise initiatives such as public cloud, data science, AI, engineering innovation and IoT....

  • Senior Site Reliability

    2 weeks ago


    Intermediate Geographic Region of Juiz de Fora, Brazil | Canonical | Full-time

    Senior Site Reliability / Gitops Engineer at Canonical. 1 day ago. Canonical is a leading provider of open source software and operating...


  • Greater Rio de Janeiro, Brazil | Personetics | Full-time | R$90,000 - R$120,000 per year

    Description: Personetics is shaping the Cognitive Banking era, harnessing AI to help banks anticipate customer needs, provide actionable insights, and deliver intelligent financial guidance. Our platform continuously analyzes and leverages real-time transactional data, enabling banks to proactively support customers in managing their finances and reaching...

  • Site Reliability

    16 hours ago


    Rio de Janeiro, Rio de Janeiro, Brazil | Canonical - Jobs | Full-time | R$80,000 - R$120,000 per year

    Canonical is a leading provider of open source software and operating systems to the global enterprise and technology markets. Our platform, Ubuntu, is very widely used in breakthrough enterprise initiatives such as public cloud, data science, AI, engineering innovation, and IoT. Our customers include the world's leading public cloud and silicon providers,...