Staff Site Reliability Engineer

1 week ago


Caxias do Sul, Brazil Nearsure Full-time

Staff Site Reliability Engineer - Work from home

Join our close-knit LATAM remote team: connect through fun activities like coffee breaks, tech talks, and games with your teammates and management. Say goodbye to micromanagement. We champion autonomy, open communication, and respect for diversity as our core values. Your well-being matters: our People Care team is here from day one to support you with everything from time-off requests to wellness check-ins. Plus, our Accounts Management team ensures smooth, effective client relationships, so you can focus on what you do best. Ready to grow with us?

Here's what we offer you by joining us
- Competitive USD salary: We value your skills and contributions.
- 100% remote work: While you can work from anywhere, you're always welcome to connect with teammates and grow your network at our coworking spaces across LATAM.
- Paid time off: Take the time you need according to your country's regulations, all while receiving your full salary. Rest, recharge, and come back stronger.
- National holidays celebrated: Take time off to celebrate important events and traditions with loved ones, fully embracing your culture.
- Sick leave: Focus on your health without the stress. Take the necessary time to recover and feel better.
- Refundable annual credit: Spend it on the perks you love to enhance your work-life balance.
- Team-building activities: Join us for coffee breaks, tech talks, and after-work gatherings to bond with your Nearsure family and feel part of our vibrant community.
- Birthday day off: Enjoy an extra day off during your birthday week to celebrate in style with friends and family.

About the project
As a Staff Site Reliability Engineer, you will own and optimize OpenTelemetry pipelines, enabling scalable and efficient observability. You'll build tools that empower teams, support incident response, and drive best practices. Your work ensures a reliable, secure infrastructure and actionable alerting across the organization.

What your day-to-day work will look like
- Design, implement, and maintain observability pipelines across the three main signals (logs, metrics, and traces), ensuring standardized, scalable, and efficient data ingestion.
- Optimize ingestion strategies to balance cost, performance, and usability.
- Build self-service automation and tooling that enables development teams to instrument and leverage observability without requiring manual intervention from the SRE team.
- Drive adoption of best practices while ensuring teams own their telemetry.
- Design the processes, playbooks, checklists, and automations that they and other engineers follow during an incident.
- Interact with members of almost all teams across the business to understand their monitoring, alerting, and SLO/SLA requirements, and design systems and processes that ensure we meet or exceed these requirements.
- Influence architectural decisions during initial design stages to ensure resiliency and scale at the outset of software development.
- Leverage Infrastructure-as-Code (IaC) to provision and manage monitoring tools, alerting rules, and our observability configurations across OTEL pipelines.
- Design base-level requirements for new and existing services to ensure that all client infrastructure and code are monitored consistently and accurately at a basic level.
- Take full ownership of client infrastructure reliability, ensuring adherence to key availability and security KPIs.

This would make you the ideal candidate
- Bachelor's degree in Computer Science, Engineering, or a related field.
- 8+ years of experience working as an SRE or in a very similar role, with a focus on observability.
- 5+ years of experience working with cloud (AWS).
- 5+ years of experience working with IaC tools (Terraform) and GitOps CI/CD solutions (ArgoCD, GitHub Actions, or similar).
- 4+ years of experience working with open-source monitoring and logging tools such as Grafana, Prometheus, Elastic/OpenSearch, Loki, and Tempo.
- 4+ years of experience working with Kubernetes, including its core components, deployment methodologies, and monitoring best practices.
- Strong scripting abilities (Python, Go, or similar) for automating observability tasks.
- Experience in managing observability: SLIs, SLOs, log transformation, cardinality management, business and resilience metrics, the 4 Golden Signals, and distributed tracing.
- Experience with automated alerting workflows.
- Exposure to OpenTelemetry pipelines.
- Advanced English level is required for this role, as you will work with US clients. Effective communication in English is essential to deliver the best solutions to our clients and expand your horizons.

What to expect from our hiring process
1. Let's chat about your experience.
2. Impress our recruiters, and you'll move on to a technical interview with our top developers.
3. Nail that, and you'll meet our client: your final step to joining our amazing team.

At Nearsure, we're dedicated to solving complex business challenges through cutting-edge technology, and we believe in the power of tailored solutions. Whether you are passionate about transforming businesses with Generative AI, building innovative software products, or implementing comprehensive enterprise platform solutions, we invite you to be part of our dynamic team. We would love to hear from you if you are eager to make an impact and join a collaborative team that values creativity and expertise. Let's work together to shape the future of technology.

By applying to this position, you authorize Nearsure to collect, store, transfer, and process your personal data in accordance with our Privacy Policy. For more information, please review our Privacy Policy.

Seniority level: Mid-Senior level
Employment type: Full-time
Job function: Information Technology
Industries: Software Development


  • Site Reliability Engineer

    2 weeks ago


    Caxias do Sul, Brazil Canonical Full-time

    Canonical is a leading provider of open source software and operating systems to the global enterprise and technology markets. Our platform, Ubuntu, is widely used in breakthrough enterprise initiatives such as public cloud, data science, AI, engineering innovation, and IoT. Our customers include...

  • Site Reliability Engineer

    1 week ago


    São Bernardo do Campo, Brazil BairesDev Full-time

    Site Reliability Engineer - Remote Work | REF# We are looking for a Site Reliability Engineer to administer and support the whole project infrastructure hosted in the cloud while implementing CI/CD pipelines to automate deployments....


  • Caxias do Sul, Brazil Canonical Full-time

    Staff Security Operations Engineer at Canonical (posted 3 months ago). We have opened several senior/staff Security Operations Engineer (SOC) positions, creating a new team reporting to the CISO. We...

  • Site Reliability Engineer

    1 week ago


    Duque de Caxias, Brazil BairesDev Full-time

    Site Reliability Engineer - Remote Work | REF# At BairesDev®, we've been leading the way in technology projects for over 15 years. We deliver cutting-edge solutions to giants like Google and the most innovative startups in Silicon Valley. Our diverse 4,000+ team, composed of the world's Top 1% of tech talent, works remotely on roles that drive significant...


  • Caxias do Sul, Brazil Pythian Full-time

    Linux Site Reliability Consultant - Brazil | Remote | Work from Home. One available position for the following time zone: PST. Why Pythian: At Pythian, we are experts in strategic database and analytics services, driving digital transformation and operational excellence. Pythian, a multinational company, was founded in **** and started by ensuring the...


  • Rio Grande do Sul, Brazil Goodyear Full-time R$60,000 - R$120,000 per year

    Start something great today. Go Goodyear. Role Description: The Reliability Engineer is responsible for applying preventive and predictive maintenance procedures and for supporting and developing the equipment maintenance plan for the assigned manufacturing area, with the aim of keeping equipment running without interruption in order to reach the set productivity. Main...


  • Caxias do Sul, Brazil Vortigo Digital Full-time

    We are Vortigo. We were born with the purpose of creating mobile apps for a world in constant motion, but we didn't stop there. We expanded our scope, and today we develop software to help companies and startups through digital transformation. Our team is made up of people passionate about giant challenges, changing the experience of...


  • Caxias do Sul, Brazil Signify Technology Full-time

    The Company: A well-established tech organization building advanced AI products for healthcare and clinical research. The team focuses on secure, reliable platforms that process sensitive medical data and support research and clinical workflows. Role & Responsibilities: As a Senior SRE, you will: Design and automate infrastructure (infrastructure-as-code...


  • Caxias do Sul, Brazil Ledn Full-time

    Staff Application Security Engineer at LEDN. Ledn is a global financial services company built for digital assets, helping to improve the everyday lives of Bitcoin holders while building generational wealth for the future. We offer a suite of egalitarian lending, savings and trading products to digital asset holders in over...


  • Caxias do Sul, Brazil LEDN Full-time

    Staff Application Security Engineer at LEDN. Ledn is a global financial services company built for digital assets, helping to improve the everyday lives of Bitcoin holders while building generational wealth for the future. We offer a suite of egalitarian lending, savings and trading products to digital asset holders in...