Site Reliability Engineer
3 weeks ago
Overview
Site Reliability Engineer (Middle/Senior) ID38916 at AgileEngine. AgileEngine is an Inc. 5000 company that creates award-winning software for Fortune 500 brands and startups across 17+ industries. We value a people-first culture with multiple Best Place to Work awards.

What you will do
Shift: Monday – Thursday 8AM – 7PM PST (11AM – 10PM EST) with rotating on-call
On-call shifts: every 6 weeks, one week as primary responder and the next week as secondary
Manage alerts daily, check systems, and escalate issues as needed
Provide 24×7 on-call support for critical SaaS events
Be available in emergencies when team members are unavailable or need help
Document issues and remediation steps
Proactively create appropriate monitors in the EKS/K8s ecosystem
Deploy to EKS/K8s clusters using Terraform and Helm
Learn and maintain existing infrastructure running under Docker Swarm
Improve infrastructure health by implementing checks and scripts to correct known issues
Maintain and develop deployment code; automate manual tasks
Implement/integrate new technologies in our Cloud Infrastructure
Collaborate with Support, Customer Success, Migration, and Professional Services teams
Apply a customer-focused approach when planning deployments/updates
Work with teams to provide best-in-class SaaS service
Perform RCA and take corrective actions to prevent recurrence
Create and assign alert-related actions after investigation
Handle environment-specific support requests
Identify and provide automation requirements to improve RCA

Must haves
2+ years of professional experience
Experience working with Datadog
Hands-on experience as an AWS Cloud Engineer
Working knowledge of EKS, Terraform, Helm
Working experience with Docker and Docker Swarm
Good understanding of AWS IAM roles and policies
Experience logging and monitoring AWS resources using CloudWatch
Experience working in a Linux environment
Proficient in Bash and/or Python scripting
Understanding of REST APIs and web technologies
Experience with monitoring solutions such as Grafana and Prometheus
Excellent oral and written communication skills; customer-facing RCA explanations
Experience in Product/Application Support for SaaS-based products
Understanding of APIs, databases, systems architecture, and design
DevSecOps-oriented mindset
Ability to work independently and in a team; technical aptitude for learning new technologies
Upper-Intermediate English level

Nice to have
Experience with GCP or Azure
Certifications: AWS Certified DevOps Engineer – Professional or AWS Certified Advanced Networking – Specialty

Perks and benefits
Professional growth: Mentorship, TechTalks, and personalized growth roadmaps
Competitive compensation: USD-based compensation with budgets for education, fitness, and team activities
A selection of exciting projects: Projects with modern solutions and top-tier clients including Fortune 500 enterprises
Flextime: Flexible schedule with options to work from home or at the office

Seniority level: Mid-Senior level
Employment type: Full-time
Industry: IT Services and IT Consulting
-
Site Reliability Engineer Sr
1 week ago
Campinas, Brazil Mercado Eletrônico Full-time
Mercado Eletrônico is the Latin American leader in B2B procurement management solutions. Its technologies and services for procurement departments help companies achieve greater savings, agility, governance, and collaboration. With offices in Brazil, the United States, Mexico, and Portugal, it counts more than 1 million suppliers, 10 thousand...
-
Software Engineer Site Reliability Engineer
17 hours ago
Campinas, Brazil Scubyt Full-time
Software Engineer Site Reliability Engineer Location: Brazil REMOTE Duration: Full-time CLT / REMOTE About the role The Application SRE Team supports several critical components of our foundational technologies for real-time protection, as well as our RBI and SSPM services. We are a team of software engineers focused on improving availability, latency,...
-
Site Reliability Engineer
4 weeks ago
Campinas, Brazil AgileEngine Full-time
Site Reliability Engineer (Middle) ID38916 AgileEngine is an Inc. 5000 company that creates award-winning software for Fortune 500 brands and trailblazing startups across 17+ industries. We rank among the leaders in areas like application development and AI/ML, and our people-first culture has earned us multiple Best Place to Work awards. Why Join Us If...
-
Site Reliability Engineer
2 days ago
Campinas, Brazil AgileEngine Full-time
Site Reliability Engineer (Middle) ID***** AgileEngine is an Inc. **** company that creates award-winning software for Fortune 500 brands and trailblazing startups across 17+ industries. We rank among the leaders in areas like application development and AI/ML, and our people-first culture has earned us multiple Best Place to Work awards. Why Join Us If you're...
-
Senior Software Engineer
1 week ago
Campinas, Brazil Pride Global Full-time
We're Hiring: Sr. Engineer (React + UX/UI Knowledge) Remote (US-based, preferably EST) | 6-9 month contract (with possible extension) | Highly competitive USD compensation package Join our Site Reliability Engineering team and help improve internal web applications built by engineers, for engineers. We're looking for someone who can not only code clean,...
-
Site Reliability Engineer
3 weeks ago
Campinas, Brazil AgileEngine Full-time
Overview Site Reliability Engineer (Middle/Senior) ID38916 at AgileEngine. AgileEngine is an Inc. 5000 company that creates award-winning software for Fortune 500 brands and startups across 17+ industries. We value a people-first culture with multiple Best Place to Work awards. What you will do Shift: Monday – Thursday 8AM – 7PM PST (11AM – 10PM EST)...
-
On-Site IT Support Engineer
3 weeks ago
Campinas, Brazil TECEZE Full-time
Overview We are looking for a dedicated and proactive On-Site IT Support Engineer to provide hands-on support for our local infrastructure, users, and critical systems. This role ensures smooth IT operations, continuity of services, and timely resolution of incidents during the designated support period. The engineer will serve as the primary point of...
-
On-Site IT Support Engineer
3 weeks ago
Campinas, Brazil TECEZE Full-time
Overview We are looking for a dedicated and proactive On-Site IT Support Engineer to provide hands-on support for our local infrastructure, users, and critical systems. This role ensures smooth IT operations, continuity of services, and timely resolution of incidents during the designated support period. The engineer will serve as the primary point of contact...
-
On-Site IT Support Engineer
4 weeks ago
Campinas, Brazil TECEZE Full-time
Overview We are looking for a dedicated and proactive On-Site IT Support Engineer to provide hands-on support for our local infrastructure, users, and critical systems. This role ensures smooth IT operations, continuity of services, and timely resolution of incidents during the designated support period. The engineer will serve as the primary point of contact...
-
Information Technology Support Engineer
4 weeks ago
Campinas, SP, Brazil TECEZE Full-time
Overview Teceze is seeking a dedicated and proactive On-Site IT Support Engineer to deliver hands-on technical support for our client's local infrastructure, end-users, and critical systems. The ideal candidate will ensure smooth IT operations, continuity of services, and timely resolution of incidents during the designated support period. This role acts as...