Senior Site Reliability Engineer, Observability

1 day ago


São Paulo, Brazil Chainlink Labs Full-time

Overview

Chainlink Labs is the primary contributing developer of Chainlink, the decentralized computing platform powering the verifiable web. Chainlink is the industry-standard platform for providing access to real-world data, offchain computation, and secure cross-chain interoperability across any blockchain. Chainlink Labs helps power verifiable applications for banking, DeFi, global trade, and gaming by collaborating with some of the world’s largest financial institutions, notably Swift, DTCC, and ANZ. Chainlink Labs also works with top Web3 teams, including Aave, Compound, GMX, Maker, and Synthetix. Chainlink Labs was ranked as one of the Global Top 100 Most Loved Workplaces by Newsweek 2025.

The Observability Team enables Chainlink development and empowers engineers to continue building and supporting crucial products and services that have a profound impact in the blockchain industry. Reliability is vital to the success of our company. As a Senior SRE, you will help us accelerate and enable other engineering teams by increasing self-service and decreasing cognitive load. This role is a good fit for someone with a strong DevOps mentality, a passion for building and maintaining a mature GitOps environment, and experience focusing on observability. The team is expanding, offering opportunities to build, learn, and grow.

We are committed to diversity and inclusion and encourage you to apply even if you don't match 100% of the job requirements.

Your Impact

  • Build and orchestrate a Modern OTEL-based Observability Platform
  • Support multiple telemetry types, like metrics, logs and traces
  • Define and support modern governance in observability and problems at scale
  • Ensure reliability, security, and performance exceed our defined SLAs
  • Collaborate with engineers across the company to troubleshoot issues, deploy new products and services, and increase velocity while decreasing cognitive load
  • Lead the design and deployment of monitoring/observability services to detect and alert the team of needed action
  • Ingest, aggregate, transform, and utilize data from multiple sources in the real-time data pipeline
  • Oversee the availability, performance, and supportability of observability infrastructure
  • Create processes around alert response operations and support the team to ensure reliable delivery of oracle data
  • Make recommendations to ensure sufficient metrics are collected to create alerts with every new feature release
  • Champion reliability and security by taking the time to do your work right the first time

Requirements

  • 7+ years of relevant professional experience. You have likely worked on a devops, infrastructure, SRE, and/or platform team before
  • Ability to develop software outside of the scope of typical infrastructure requirements and configurations
  • Experience programming in C, C++, Java, Python, Go, Perl, or Ruby
  • Expert knowledge in all aspects of designing, developing, and managing large real-time systems
  • Experience with monitoring and logging. You know how to export metrics using Prometheus, have built a Grafana dashboard, and have experience with a centralized logging solution like an ELK Stack, Splunk or Grafana Stack
  • Experience with distributed systems and container orchestration. You have maintained or built Kubernetes clusters and feel comfortable deploying new services on them
  • Strong communication skills. You can give and receive constructive feedback and participate in planning meetings and code reviews

Desired Qualifications

  • Excitement for blockchain, Web 3.0, and similar decentralized technologies
  • Experience running infrastructure in the blockchain/web3 space
  • Ability to scale systems sustainably through automation and evolve systems for reliability and velocity
  • Experience working remotely in a distributed team
  • A strong desire to grow and challenge yourself by automating services to reduce toil

Tools and Services

AWS; Terraform/Terragrunt; Kubernetes, Calico and ArgoCD; Prometheus and Grafana; GitHub Actions; Packer

We expect you to be comfortable with most of those tools and proficient in several of them.

All roles with Chainlink Labs are global and remote-based. Overlap with Eastern Standard Time (EST) is encouraged unless stated otherwise.

We carefully review all applications and aim to provide a response to every candidate within two weeks after the job posting closes. The closing date is listed on the job advert. We encourage thoughtful preparation of your application and will communicate the status after the closing date.

Commitment to Equal Opportunity

Chainlink Labs is an equal opportunity employer. All qualified applicants will receive equal consideration for employment in compliance with applicable laws, regulations, or ordinances. If you need assistance or accommodation due to a disability or special need when applying for a role or in our recruitment process, please contact us via this form.

Global Data Privacy Notice for Job Candidates and Applicants

Information collected and processed as part of your Chainlink Labs Careers profile, and any job applications you submit, is subject to our Privacy Policy. By submitting your application, you agree to our use and processing of your data as required.

Seniority level: Mid-Senior level
Employment type: Full-time
Job function: Engineering and Information Technology
Industries: Technology, Information and Internet



  • São Paulo, Brazil K2 Solutions Full-time

    Hybrid work in the Pinheiros area of São Paulo (SP), 3 days per week in the office. We are hiring a Senior Site Reliability Engineer (SRE) to join our team and play an essential role in maintaining, automating, and improving the reliability of the systems that power the company's logistics network across multiple regions. This person...


  • São Paulo, Brazil Canonical Full-time

    Senior Site Reliability / Gitops Engineer, posted 1 day ago...


  • São Paulo, Brazil PayRetailers Full-time

    Site Reliability Engineer Join PayRetailers in São Paulo. We are expanding across Latin America and Africa, building cutting‑edge payment solutions. We value creativity, growth, and collaboration. About the role Site Reliability Engineers are guardians of our reliability promise. They deliver a highly reliable, resilient, and cost‑efficient platform...


  • São Paulo, Brazil Lend Full-time

    We are looking for a Senior Site Reliability Engineer to design, operate, and evolve a credit infrastructure that will transform the Brazilian financial market. You will be responsible for ensuring that our platform is reliable, scalable, secure, and cost-efficient, directly impacting our customers and shaping how credit will be...


  • São Paulo, Brazil Dev.Pro Full-time

    Senior Site Reliability Engineer - OPS00023 We are a US‑based outsource software development company that has been delivering exceptional software experience to our clients since 2011, helping technology companies to become industry leaders. Over the past few years, we’ve been hiring specialists all over the world while our main development centers were...


  • São Paulo, Brazil PayRetailers Full-time

    Job Overview We’re PayRetailers, and we offer cutting‑edge payment solutions that empower businesses to succeed in Latin America & Africa. Our collaborative and inclusive work environment encourages creativity and growth, where every employee’s contribution is valued. We’ve got big plans to expand into new markets and make a meaningful impact on the...


  • São Paulo, Brazil Chainlink Labs Full-time

    Senior Site Reliability Engineer, posted 2 weeks ago. About Us: Chainlink Labs is the primary contributing developer of Chainlink, the decentralized...


  • São Paulo, Brazil Mouts TI Full-time

    At Mouts TI, we deliver solutions that drive digital transformation in an agile, efficient, and uncomplicated way. We are looking for an SRE (Site Reliability Engineer) to work on-site, focusing on infrastructure, automation, and observability in mission-critical environments. Responsibilities: Implement and manage observability solutions