Site Reliability Engineer
4 weeks ago
Site Reliability Engineer (Middle) ID38916
AgileEngine is an Inc. 5000 company that creates award-winning software for Fortune 500 brands and trailblazing startups across 17+ industries. We rank among the leaders in areas like application development and AI/ML, and our people-first culture has earned us multiple Best Place to Work awards.
WHAT YOU WILL DO
- Shift: Monday – Thursday, 8AM – 7PM PST (11AM – 10PM EST), with rotating on-call.
 - On-call shifts: every 6 weeks, one week as primary responder and the following week as secondary.
 - Manage alerts daily, check systems, and escalate issues as needed.
 - Be part of a team that provides 24×7 on-call support for critical SaaS events.
 - Be available in case of emergencies when team members are not available or need help.
 - Document issues and remediation steps.
 - Proactively create appropriate monitors in the EKS/K8s ecosystem.
 - Deploy to EKS/K8s clusters using Terraform and Helm.
 - Learn and maintain existing infrastructure running under Docker Swarm.
 - Improve existing infrastructure health by implementing checks and scripts to correct known issues.
 - Maintain and develop deployment code.
 - Automate manual tasks.
 - Implement/integrate new technologies in our Cloud Infrastructure.
 - Collaborate with other teams and departments to provide the highest level of support and assistance.
 - Keep the customer front of mind when planning deployments and updates, and consider the impact on them before making changes.
 - Work closely with the Support, Customer Success, Migration, and Professional Services teams on solutions that deliver best-in-class SaaS service to our customers.
 - Perform RCA and take necessary corrective actions to prevent the recurrence of issues.
 - Create and assign alert-related actions to the appropriate team after the investigation.
 - Handle support requests for environment-specific actions.
 - Identify and provide automation requirements to improve RCA.
 
MUST HAVES
- 2+ years of professional experience.
 - Experience working with Datadog.
 - Hands-on experience as an AWS Cloud Engineer.
 - Working knowledge of EKS/Terraform/Helm.
 - Working experience with Docker and Docker Swarm.
 - Good understanding of AWS IAM roles and policies.
 - Experience logging and monitoring AWS resources using CloudWatch Logs.
 - Experience working in a Linux environment.
 - Proficient in Bash and/or Python scripting.
 - A strong understanding of web technologies such as REST APIs.
 - Working experience with monitoring solutions such as Grafana and Prometheus.
 - Excellent oral and written communication skills.
 - Customer-facing communication skills, including the ability to explain issues and RCAs clearly to customers.
 - Experience in Product/Application Support for SaaS-based products.
 - Understanding of APIs, Databases, Systems Architecture, and Design.
 - Experience designing, implementing, and operating in a DevSecOps environment.
 - Excellent communication skills, both written and verbal.
 - Ability to work independently as well as within a collaborative environment.
 - A technical aptitude with the desire to learn new and evolving technologies.
 - Upper-Intermediate English level.
 
NICE TO HAVES
- Experience with GCP or Azure.
 - Certifications: AWS Certified DevOps Engineer – Professional or AWS Certified Advanced Networking – Specialty.
 
THE BENEFITS OF JOINING US
- Professional growth: Accelerate your professional journey with mentorship, TechTalks, and personalized growth roadmaps.
 - Competitive compensation: We match your ever-growing skills, talent, and contributions with competitive USD-based compensation and budgets for education, fitness, and team activities.
 - A selection of exciting projects: Join projects with modern solutions development and top-tier clients that include Fortune 500 enterprises and leading product brands.
 - Flextime: Tailor your schedule for an optimal work-life balance, with the option of working from home or going to the office, whichever makes you happiest and most productive.
 