Infrastructure Site Reliability Engineer
2 days ago
At CVS Health, we're building a world of health around every consumer and surrounding ourselves with dedicated colleagues who are passionate about transforming health care.
As the nation's leading health solutions company, we reach millions of Americans through our local presence, digital channels and more than 300,000 purpose-driven colleagues – caring for people where, when and how they choose in a way that is uniquely more connected, more convenient and more compassionate. And we do it all with heart, each and every day.
Position Summary
As an Infrastructure Site Reliability Engineer, you will be responsible for designing, implementing, and managing the infrastructure systems and tools that enable reliability and performance of our technology platforms supporting various business initiatives within CVS Health. This role requires a strong background in infrastructure engineering and a commitment to proactive monitoring, troubleshooting, and optimizing systems for maximum uptime and performance. Collaborating with diverse teams, you will prioritize high availability, scalability, and resilience to ensure our platforms and services consistently meet and exceed customer expectations.
Primary Responsibilities:
1. Operations: Manage and maintain various systems and infrastructure, such as servers, storage, mainframe, iSeries, backup, archive, and recovery, ensuring the platforms have high availability, scalability, and reliability to meet the business requirements. Participate in on-call rotation to ensure availability and uptime of critical systems and provide timely response and resolution to incidents. Develop and maintain best practices documentation, including system architecture diagrams, standard operating procedures, and runbooks. Perform system and application performance analysis, utilizing monitoring tools, logging systems, and other relevant metrics, to identify and resolve issues and enhance overall system performance.
2. Process Improvement: Streamline and optimize operational processes, procedures, and documentation by implementing industry best practices. Develop, modify, and implement incident and problem management processes to increase efficiency and reduce downtime. Establish a comprehensive SRE process that encompasses the entire software team, ensuring seamless operations and prompt resolution of any escalated issues.
3. System Support: Collaborate with development teams to participate in code reviews, performance optimization, and application deployment processes. Drive reliability engineering practices, including monitoring, alerting, incident management, capacity planning, and disaster recovery. Automate infrastructure deployments, upgrades, and maintenance tasks, utilizing configuration management tools like Ansible and infrastructure-as-code frameworks such as Terraform. Stay abreast of industry trends, emerging technologies, and best practices in infrastructure site reliability engineering and apply that knowledge to continually improve CVS Health's systems and processes. Provide customer support teams with meticulously documented procedures, enabling them to address customer complaints proficiently and deliver optimal service.
4. Capacity Management: Analyze historical usage patterns and growth projections to forecast future capacity requirements. Collaborate with stakeholders such as developers, product managers, and operations teams to understand the demand for resources and estimate the necessary infrastructure capacity. Establish and maintain monitoring systems to track the performance and utilization of critical resources. Identify potential bottlenecks, anomalies, or areas of improvement. Perform regular performance reviews to help ensure systems meet defined service-level objectives (SLOs) and key performance indicators (KPIs).
Required Qualifications
- 7+ years of experience in Infrastructure Engineering, System Administration, or related roles.
- 3+ years of experience with cloud platforms (e.g., Amazon Web Services, Microsoft Azure) and infrastructure-as-code tools (e.g., Terraform, CloudFormation).
- 2+ years of experience in at least one configuration management tool such as Ansible, Puppet, or Chef.
- 2+ years of experience with containerization technologies such as Docker and container orchestration platforms like Kubernetes.
- 2+ years of experience in networking principles and protocols, including TCP/IP, DNS, load balancing, and firewalls.
- 1+ years of experience with incident management, performance monitoring, and capacity planning tools.
Preferred Qualifications
- Excellent troubleshooting and problem-solving skills, with the ability to identify, communicate, and resolve technical issues swiftly.
Education
- Bachelor's degree or equivalent experience (High School Diploma and 4 years relevant experience)
Pay Range
The typical pay range for this role is:
$118, $260,590.00
This pay range represents the base hourly rate or base annual full-time salary for all positions in the job grade within which this position falls. The actual base salary offer will depend on a variety of factors including experience, education, geography and other relevant factors. This position is eligible for a CVS Health bonus, commission or short-term incentive program in addition to the base pay range listed above. This position also includes an award target in the company's equity award program.
Our people fuel our future. Our teams reflect the customers, patients, members and communities we serve and we are committed to fostering a workplace where every colleague feels valued and that they belong.
Great benefits for great people
We take pride in our comprehensive and competitive mix of pay and benefits – investing in the physical, emotional and financial wellness of our colleagues and their families to help them be the healthiest they can be. In addition to our competitive wages, our great benefits include:
Affordable medical plan options, a 401(k) plan (including matching company contributions), and an employee stock purchase plan.
No-cost programs for all colleagues including wellness screenings, tobacco cessation and weight management programs, confidential counseling and financial coaching.
Benefit solutions that address the different needs and preferences of our colleagues including paid time off, flexible work schedules, family leave, dependent care resources, colleague assistance programs, tuition assistance, retiree medical access and many other benefits depending on eligibility.
For more information, visit
We anticipate the application window for this opening will close on: 11/12/2025
Qualified applicants with arrest or conviction records will be considered for employment in accordance with all federal, state and local laws.
-
Site Reliability Engineer
5 days ago
São Paulo, São Paulo, Brazil Truelogic Full-time US$120,000 - US$180,000 per year
About Truelogic: At Truelogic we are a leading provider of nearshore staff augmentation services headquartered in New York. For over two decades, we've been delivering top-tier technology solutions to companies of all sizes, from innovative startups to industry leaders, helping them achieve their digital transformation goals. Our team of 600+ highly skilled...
-
Site Reliability Engineer
1 week ago
São Paulo, State of São Paulo, Brazil INDI Staffing Services Full-time
At INDI, we're passionate about empowering individuals and businesses worldwide. Our cutting-edge recruiters connect leading companies with top talent, fostering a dynamic environment where innovation thrives. Join us in shaping the future of work. Overview of the role: We are looking for a Site Reliability Engineer to build and maintain highly reliable,...
-
Senior Site Reliability Engineer
1 week ago
São Paulo, São Paulo, Brazil Dev Full-time US$120,000 - US$180,000 per year
We are a US-based outsource software development company that has been delivering exceptional software experience to our clients since 2011, helping technology companies to become industry leaders. Over the past few years, we've been hiring specialists all over the world while our main development centers were in Ukraine. Now, we keep expanding and start...
-
Site Reliability Engineer
3 days ago
São Paulo, São Paulo, Brazil WSO2 Full-time R$80,000 - R$150,000 per year
About WSO2: Founded in 2005, WSO2 is the largest independent software vendor providing open-source API management, integration, and identity and access management (IAM) products. WSO2's products and platforms, including our next-gen internal developer platform, Choreo, empower organizations to leverage the full potential of APIs for secure delivery of...
-
Site Reliability Engineer
2 weeks ago
São Paulo, State of São Paulo, Brazil Mouts TI Full-time
At Mouts TI, we deliver solutions that drive digital transformation in an agile, efficient, and uncomplicated way. We are looking for an SRE (Site Reliability Engineer) to work on-site, focusing on infrastructure, automation, and observability in mission-critical environments. Responsibilities: Implement and manage observability solutions...
-
Site Reliability Engineer
2 hours ago
São Paulo, São Paulo, Brazil Canonical - Jobs Full-time R$120,000 - R$240,000 per year
Canonical is a leading provider of open source software and operating systems to the global enterprise and technology markets. Our platform, Ubuntu, is widely used in breakthrough enterprise initiatives such as public cloud, data science, AI, engineering innovation, and IoT. Our customers include the world's leading public cloud and silicon providers, and...
-
Senior Site Reliability
3 hours ago
São Paulo, São Paulo, Brazil Canonical - Jobs Full-time R$80,000 - R$120,000 per year
Canonical is a leading provider of open source software and operating systems to the global enterprise and technology markets. Our platform, Ubuntu, is very widely used in breakthrough enterprise initiatives such as public cloud, data science, AI, engineering innovation, and IoT. Our customers include the world's leading public cloud and silicon providers,...
-
SRE - Senior Site Reliability Engineer
2 days ago
São Paulo, São Paulo, Brazil K2 Solutions Full-time R$80,000 - R$120,000 per year
Hybrid work in the Pinheiros area of São Paulo, 3 days per week in the office. We are hiring a Senior Site Reliability Engineer (SRE) to join our team and play an essential role in maintaining, automating, and improving the reliability of the systems that power the company's logistics network across multiple regions. This person...
-
São Paulo, São Paulo, Brazil Airbnb Full-time R$26,666 - R$33,333
Airbnb was born in 2007 when two hosts welcomed three guests to their San Francisco home, and has since grown to over 5 million hosts who have welcomed over 2 billion guest arrivals in almost every country across the globe. Every day, hosts offer unique stays and experiences that make it possible for guests to connect with communities in a more authentic...
-
Site Reliability Engineer
2 weeks ago
São Paulo, São Paulo, Brazil DELIVER IT Full-time R$80,000 - R$120,000 per year
Do you consider yourself someone who is eager to learn, enjoys working in a team, and wants to grow in your career? Then this opportunity is for you. We are looking for a Junior SRE (Site Reliability Engineer) to join a highly technical team committed to operational excellence. The professional will work with a focus on...