Lead Site Reliability Engineer

1 month ago


Brazil Andela Full-time

Andela exists to connect brilliance and opportunity. Since 2014, we have been dedicated to breaking down global barriers and accelerating the future of work for both technologists and organizations around the world. For technologists, Andela offers competitive long-term career opportunities with leading organizations, access to a global community of professionals, and educational opportunities with leading technology providers.

At Andela, we’re deeply passionate about creating long-lasting and transformative growth opportunities for all, and doing it in an E.P.I.C. way. We’re excited to continue building our remote-first team with incredible people like you. After applying for this role, you will join our Andela Community of brilliant technologists by passing a technical screening and live interview. As a community member, you’ll have access to a multitude of exclusive technologist roles. Join Andela today to access this opportunity and more in our global marketplace.

Our roles are typically filled at lightning speed, so if you’re considering applying, get your application in quickly.

This is a fully remote opportunity for one of our esteemed clients.

About the role:

We are seeking a highly skilled and experienced Lead SRE to oversee the deployment, maintenance, and optimization of the DataDog observability platform across our R&D environment. This role is crucial for ensuring a unified, efficient, and secure monitoring infrastructure. You will lead API integrations, assist in platform modernization, and support teams with architectural insights and best practices for observability and monitoring.

Responsibilities

  • Oversee the deployment, maintenance, and configuration of DataDog for system monitoring, logging, and observability.
  • Act as the primary point of contact for technical issues related to DataDog and observability tools.
  • Lead API integrations and enhance platform capabilities to align with organizational needs.
  • Monitor system performance and health, implementing proactive measures to prevent disruptions.
  • Assist with the migration to service accounts and ensure best practices for user and key management.
  • Provide operational and training support to R&D teams, ensuring efficient use of observability tools.
  • Contribute to platform improvements and guide the adoption of OpenTelemetry or other modernization initiatives.

Required skills:

  • DataDog Expertise (7-9 years): Advanced hands-on experience with DataDog, including monitoring, logging, dashboard creation, and APM configuration.
  • Observability Tools (7-9 years): Proficiency with tools like Prometheus and Grafana for system performance tracking.
  • Cloud Platforms (7-9 years): Extensive experience with AWS, including integration with DataDog for unified monitoring.
  • Containerization and Microservices Monitoring (4-6 years): Expertise in monitoring Kubernetes and containerized environments.
  • Python (4-6 years): Proficiency in Python for scripting and automating monitoring tasks.
  • CI/CD Pipelines (4-6 years): Experience integrating observability tools like DataDog into CI/CD workflows.
  • Installation and configuration of DataDog agents and integrations.
  • User management, including roles, permissions, and security best practices.
  • Leadership skills

Nice-to-have skills:

  • OpenTelemetry Adoption: Experience migrating from proprietary tracing models to OpenTelemetry for distributed tracing.
  • API & Platform Migration: Expertise in transitioning to service account models and consolidating access keys for enhanced security.
  • Automation: Familiarity with automating monitoring setups and API configurations using scripting tools.

Type of contract: Contractor. You will be responsible for your taxes.

Contract length: 3 months (minimum). Renewable contract.

Dedication: Full-time (40 hours/week)

Location: 100% remote

Timezone: You’ll need to overlap at least 6 hours with US PST (UTC-8).

At Andela, we outcompete through diversity. We know that our strengths lie in the multiplicity of talents, perspectives, backgrounds, and orientations of residents in our community and we take pride in that. Andela is committed to a work environment in which all individuals are treated with respect and dignity. Each individual has the right to work in a professional atmosphere that promotes equal employment opportunities and prohibits discriminatory practices. Andela provides equal employment opportunities and workplace to all employees and applicants without regard to factors including but not limited to race, color, religion, gender, sexual orientation, gender identity, national origin, age, disability, pregnancy (including breastfeeding), genetic information, HIV/AIDS or any other medical status, family or parental status, marital status, amnesty or status as a covered veteran in accordance with applicable federal, state and local laws. This commitment applies to all terms and conditions of employment, including but not limited to hiring, placement, promotion, termination, layoff, recall, transfer, leaves of absence, compensation, and training. Our policies expressly prohibit harassment and/or discrimination as stated above.

Andela is home for all, come as you are.

