Lead Site Reliability Engineer
1 month ago
Andela exists to connect brilliance and opportunity. Since 2014, we have been dedicated to breaking down global barriers and accelerating the future of work for both technologists and organizations around the world. For technologists, Andela offers competitive long-term career opportunities with leading organizations, access to a global community of professionals, and educational opportunities with leading technology providers.
At Andela, we’re deeply passionate about creating long-lasting and transformative growth opportunities for all - and doing it in an E.P.I.C. way. We’re excited to continue building our remote-first team with incredible people like you. After applying for this role, you will join our Andela Community of brilliant technologists by passing a technical screening and live interview. As a community member, you’ll have access to a multitude of exclusive technologist roles. Join Andela today to access this opportunity and more in our global marketplace.
Our roles are typically filled at lightning speed, so if you’re considering applying, get your application in quickly.
This is a fully remote opportunity for one of our esteemed clients.
About the role:
We are seeking a highly skilled and experienced Lead SRE to oversee the deployment, maintenance, and optimization of the DataDog observability platform across our R&D environment. This role is crucial for ensuring a unified, efficient, and secure monitoring infrastructure. You will lead API integrations, assist in platform modernization, and support teams with architectural insights and best practices for observability and monitoring.
Responsibilities
- Oversee the deployment, maintenance, and configuration of DataDog for system monitoring, logging, and observability.
- Act as the primary point of contact for technical issues related to DataDog and observability tools.
- Lead API integrations and enhance platform capabilities to align with organizational needs.
- Monitor system performance and health, implementing proactive measures to prevent disruptions.
- Assist with the migration to service accounts and ensure best practices for user and key management.
- Provide operational and training support to R&D teams, ensuring efficient use of observability tools.
- Contribute to platform improvements and guide the adoption of OpenTelemetry or other modernization initiatives.
Required skills:
- DataDog Expertise (7-9 years): Advanced hands-on experience with DataDog, including monitoring, logging, dashboard creation, and APM configuration.
- Observability Tools (7-9 years): Proficiency with tools like Prometheus and Grafana for system performance tracking.
- Cloud Platforms (7-9 years): Extensive experience with AWS, including integration with DataDog for unified monitoring.
- Containerization and Microservices Monitoring (4-6 years): Expertise in monitoring Kubernetes and containerized environments.
- Python (4-6 years): Proficiency in Python for scripting and automating monitoring tasks.
- CI/CD Pipelines (4-6 years): Experience integrating observability tools like DataDog into CI/CD workflows.
- Installation and configuration of DataDog agents and integrations.
- User management, including roles, permissions, and security best practices.
- Leadership skills
Nice-to-have skills:
- OpenTelemetry Adoption: Experience migrating from proprietary tracing models to OpenTelemetry for distributed tracing.
- API & Platform Migration: Expertise in transitioning to service account models and consolidating access keys for enhanced security.
- Automation: Familiarity with automating monitoring setups and API configurations using scripting tools.
Type of contract: Contractor. You will be responsible for your taxes.
Contract length: 3 months (minimum). Renewable contract.
Dedication: Full-time (40 hours/week)
Location: 100% remote
Timezone: You’ll need to overlap at least 6 hours with US PST (UTC-8).
At Andela, we outcompete through diversity. We know that our strengths lie in the multiplicity of talents, perspectives, backgrounds, and orientations of residents in our community and we take pride in that. Andela is committed to a work environment in which all individuals are treated with respect and dignity. Each individual has the right to work in a professional atmosphere that promotes equal employment opportunities and prohibits discriminatory practices. Andela provides equal employment opportunities and workplace to all employees and applicants without regard to factors including but not limited to race, color, religion, gender, sexual orientation, gender identity, national origin, age, disability, pregnancy (including breastfeeding), genetic information, HIV/AIDS or any other medical status, family or parental status, marital status, amnesty or status as a covered veteran in accordance with applicable federal, state and local laws. This commitment applies to all terms and conditions of employment, including but not limited to hiring, placement, promotion, termination, layoff, recall, transfer, leaves of absence, compensation, and training. Our policies expressly prohibit harassment and/or discrimination as stated above.
Andela is home for all, come as you are.
-
Senior Site Reliability Engineer Lead
4 weeks ago
Brazil | Rocket | Full-time
At Rocket.Chat, we're on a mission to reconnect the world through open-source communication. We're seeking an exceptional Senior Site Reliability Engineer Lead to join our team and drive the reliability, scalability, and performance of our platform. We offer a competitive salary of $150,000 per year, making it one of the highest in the industry for this role....
-
Reliability Engineering Lead
11 hours ago
Brazil | Promote Project | Full-time
About the Position: This is a remote job opportunity that requires a high level of autonomy and self-motivation. As a Senior Site Reliability Engineer, you will be responsible for mentoring junior engineers, participating in code reviews, and contributing to the development of our SRE guild. You will also have the opportunity to work on a wide range of...
-
Site Reliability Engineer
4 weeks ago
Brazil, BR | Fulcrum Digital Inc | Full-time
Position Overview: We are looking for a talented and motivated Site Reliability Engineer (SRE) to join our team remotely from LATAM. The ideal candidate will have strong technical skills and exceptional problem-solving abilities. As an SRE, you will ensure the reliability, availability, and performance of critical systems and applications. Key...
-
Site Reliability Engineer Consultant
4 weeks ago
Brazil | Ródio Tech Soluções | Full-time
We are looking for a Senior Site Reliability Engineer Consultant to join our team of exceptional professionals at RÓDIO TECH. Responsibilities: Develop and maintain resilient, scalable systems using programming languages such as Java, GoLang, Kotlin, Groovy, or Shell scripting. Implement and manage...
-
Site Reliability Engineer Consultant
4 weeks ago
Brazil | Ródio Tech Soluções | Full-time
We are looking for a Mid-level Site Reliability Engineer Consultant to join our team of exceptional professionals at RÓDIO TECH. Responsibilities: Develop and maintain resilient systems using programming languages such as Java, GoLang, Kotlin, Groovy, or Shell scripting. Work with containerization tools...
-
Platform Engineer
2 months ago
Brazil | Virtasant | Full-time
Do you want to work on cutting-edge projects with the world’s best IT engineers? Do you wish you could control which projects to work on and choose your own pay rate? Are you interested in the future of work and how the cloud will form teams? If so - this is the role for you. We are looking for an experienced Platform Engineer to join our team. This...
-
Senior Cloud Infrastructure Engineer
3 weeks ago
Brazil | Promote Project | Full-time
Join Promote Project as a Senior Cloud Infrastructure Engineer and take advantage of our remote job opportunity. We offer a competitive salary, ranging from $60,000 to $110,000 per year. About the Role: We are seeking an experienced Senior Site Reliability Engineer to join our team in Brazil. As a key member of our engineering community, you will work alongside...
-
Field Service Engineer Leader
2 weeks ago
Brazil | FLSmidth | Full-time
Role Overview: As a Field Service Engineer with FLSmidth, you will be responsible for overseeing and managing day-to-day operations on construction sites. Your expertise in construction management and technical guidance will be essential in leading site engineers and construction teams. You will also be responsible for coordinating with clients, consultants,...
-
Technical Engineer Lead
1 month ago
Brazil | Launchcode | Full-time
About Us: Launchcode is a cutting-edge technology company focused on revolutionizing the agricultural industry. Our innovative solutions leverage advanced software and IoT technologies to optimize operations and improve efficiency. We are currently seeking a skilled Technical Engineer Lead Full Stack to join our dynamic team. Important facts about this...
-
Lead Software Engineer
3 weeks ago
Brazil | Virtustant | Full-time
Job Title: Senior Software Engineer – .NET and Mobile Applications. Position Description: Join our team to work for our client, a leading provider of comprehensive and user-friendly security guard management software. As a Senior Software Engineer, you will lead a global development team, driving technical innovation and implementing best practices...
-
Senior Site Reliability Engineer
4 weeks ago
Brazil, BR | Vericode | Full-time
If you enjoy challenges and want to show your full potential, we want to meet you! Vericode values an inclusive team rich in diversity in all its forms. All our openings are open to people with disabilities! #VemSerVericoder Responsibilities and duties: Design, implement, and maintain systems...