Site Reliability Engineer

3 weeks ago

Salvador, Brazil · AgileEngine · Full-time
Overview

Role: Site Reliability Engineer (Middle), ID38916, at AgileEngine.

AgileEngine is an Inc. 5000 company that creates award-winning software for Fortune 500 brands and trailblazing startups across 17+ industries. We rank among the leaders in areas like application development and AI/ML, and our people-first culture has earned us multiple Best Place to Work awards.

If you're looking for a place to grow, make an impact, and work with people who care, we'd love to meet you.

What you will do
  • Shift: Monday – Thursday 8AM – 7PM PST (11AM – 10PM EST) with rotating on-call;
  • Manage alerts daily, check systems, and escalate issues as needed;
  • Be part of a team that provides 24×7 on-call support for critical SaaS events;
  • Be available in case of emergencies when team members are not available or need help;
  • Document issues and remediation steps;
  • Proactively create appropriate monitors in the EKS/K8s ecosystem;
  • Deploy to EKS/K8s clusters using Terraform and Helm;
  • Learn and maintain existing infrastructure running under Docker Swarm;
  • Improve existing infrastructure health by implementing checks and scripts to correct known issues;
  • Maintain and develop deployment code;
  • Automate manual tasks;
  • Implement/integrate new technologies in our Cloud Infrastructure;
  • Collaborate with other teams and departments to provide the highest level of support and assistance;
  • Keep the customer front of mind when planning deployments and updates, and consider the impact on customers before making changes;
  • Work closely with the Support, Customer Success, Migration, and Professional Services teams on solutions that deliver a best-in-class SaaS service to our customers;
  • Perform RCA and take necessary corrective actions to prevent the recurrence of issues;
  • Create and assign alert-related actions to the appropriate team after the investigation;
  • Handle support requests for environment-specific actions;
  • Identify and provide automation requirements to improve RCA.
MUST HAVES
  • 2+ years of professional experience;
  • Experience working with Datadog;
  • Hands-on experience as an AWS Cloud Engineer;
  • Working knowledge of EKS/Terraform/Helm;
  • Working experience with Docker and Docker Swarm;
  • Good understanding of AWS IAM roles and policies;
  • Experience logging and monitoring AWS resources using CloudWatch Logs;
  • Experience working in a Linux environment;
  • Proficient in Bash and/or Python scripting;
  • A strong understanding of web technologies such as REST APIs;
  • Working experience with monitoring solutions, such as Grafana and Prometheus;
  • Excellent oral and written communication skills;
  • Customer-facing communication skills, including the ability to explain issues and RCAs clearly to customers;
  • Experience in Product/Application Support for SaaS-based products;
  • Understanding of APIs, Databases, Systems Architecture, and Design;
  • Experience designing, implementing, and operating in a DevSecOps environment;
  • Excellent communication skills, both written and verbal;
  • Ability to work independently as well as within a collaborative environment;
  • A technical aptitude with the desire to learn new and evolving technologies;
  • Upper-Intermediate English level.
NICE TO HAVES
  • Experience
THE BENEFITS OF JOINING US
  • Professional growth: Accelerate your professional journey with mentorship, TechTalks, and personalized growth roadmaps.
  • Competitive compensation: We match your ever-growing skills, talent, and contributions with competitive USD-based compensation and budgets for education, fitness, and team activities.
  • A selection of exciting projects: Join projects that involve modern solution development for top-tier clients, including Fortune 500 enterprises and leading product brands.
  • Flextime: Tailor your schedule for an optimal work-life balance, with the option of working from home or going to the office, whichever makes you happiest and most productive.

Your application doesn't end here. To unlock the next steps, check your email and complete your registration on our Applicant Site. An incomplete registration results in the termination of your process.

Seniority level
  • Mid-Senior level
Employment type
  • Full-time
Industries
  • IT Services and IT Consulting