Senior Site Reliability Engineer

2 days ago


Belo Horizonte, Brazil Articul8 AI Full-time

Senior Site Reliability Engineer (SRE) - (Brazil)

Position Overview

We are seeking an experienced Site Reliability Engineer (SRE) to join our team and help ensure the reliability, performance, and scalability of our GenAI SaaS platform. As an SRE, you will bridge the gap between development and operations, implementing automation and best practices to maintain our service reliability objectives while supporting rapid innovation.

Key Responsibilities

- Architect and maintain scalable, highly available infrastructure for our GenAI platform.
- Design and implement robust monitoring, alerting, and observability solutions to proactively ensure system health and performance.
- Automate deployment, scaling, and management of our cloud-native infrastructure, reducing toil and improving efficiency.
- Define, measure, and improve Service Level Objectives (SLOs) and Service Level Indicators (SLIs) to deliver outstanding service quality.
- Participate in on-call rotations and provide rapid response to production incidents, minimizing downtime and user impact.
- Collaborate closely with development teams to build reliable, scalable, and efficient systems for complex AI workloads.
- Lead incident response efforts, conduct thorough post-mortems, and champion continuous improvement initiatives.
- Optimize infrastructure for performance, scalability, and cost-effectiveness, especially for high-demand AI workloads.
- Implement and enforce security best practices across all systems and environments.
- Create and maintain comprehensive documentation, including runbooks and knowledge base articles, to foster a culture of shared knowledge.

Required Qualifications

- Bachelor's degree in Computer Science, Engineering, or related field, or equivalent practical experience.
- 5+ years of experience in DevOps, SRE, or similar roles.
- Strong experience with cloud platforms (AWS, GCP, or Azure).
- Proficiency in at least one programming/scripting language (Python, Go, Bash, etc.).
- Hands-on experience with infrastructure as code tools (Terraform, CloudFormation, etc.).
- Solid background in containerization technologies (Docker, Kubernetes).
- Proven experience with monitoring and observability tools (Prometheus, Grafana, ELK stack, etc.).
- Strong understanding of CI/CD pipelines and automation.
- Exceptional troubleshooting and problem-solving skills, including the ability to debug complex systems.

Preferred Qualifications

- Experience supporting AI/ML systems in production.
- Knowledge of GPU infrastructure management and optimization.
- Familiarity with distributed systems and high-performance computing.
- Experience with database systems (SQL and NoSQL).
- Certifications in cloud platforms (AWS, GCP, Azure).
- Experience with chaos engineering and resilience testing.
- Knowledge of security best practices and compliance requirements.



  • Belo Horizonte, Brazil Mercado Eletrônico Full-time

    Mercado Eletrônico is the Latin American leader in B2B procurement management solutions. Its technologies and services for procurement teams help companies achieve greater savings, agility, governance, and collaboration. With offices in Brazil, the United States, Mexico, and Portugal, it counts more than 1 million suppliers, 10 thousand...

  • Site Reliability Engineer

    3 weeks ago


    Belo Horizonte, Brazil BairesDev Full-time

    Site Reliability Engineer - Remote Work: At BairesDev, we've been leading the way in technology projects for over 15 years. We deliver cutting-edge solutions to giants like Google and the most innovative startups in Silicon Valley. Our diverse 4,000+ team, composed of the world's Top 1% of tech talent, works remotely on roles that drive significant impact...

  • Site Reliability Engineer

    2 weeks ago


    Belo Horizonte, Brazil AgileEngine Full-time

    Site Reliability Engineer (Middle) ID38916 AgileEngine is an Inc. 5000 company that creates award-winning software for Fortune 500 brands and startups across 17+ industries. We are leaders in application development and AI/ML, with a people-first culture and multiple Best Place to Work awards. If you're looking for a place to grow, make an impact, and work...


  • Belo Horizonte, Brazil Signify Technology Full-time

    The Company: A well-established tech organization building advanced AI products for healthcare and clinical research. The team focuses on secure, reliable platforms that process sensitive medical data and support research and clinical workflows. Role & Responsibilities: As a Senior SRE, you will: Design and automate infrastructure (infrastructure-as-code...


  • Belo Horizonte, Brazil Canonical Full-time

    Overview: Canonical is a leading provider of open source software and operating systems to the global enterprise and technology markets. Our platform, Ubuntu, is widely used in breakthrough enterprise initiatives such as public cloud, data science, AI, engineering innovation and IoT. The company is founder led, profitable and growing, with 1200+ colleagues in...


  • Belo Horizonte, Brazil Microtalent Is Becoming Inspyr Global Solutions Full-time

    WE ARE HIRING: DATA ENGINEER. Offer: 100% remote, Brazil only. Direct contract with client. The Senior Cloud Data Engineer leads the design, architecture, and implementation of secure, scalable data solutions on AWS, utilizing Snowflake, dbt, and modern automation tools. This role drives best practices for data quality, validation, and governance, while optimizing...


  • Belo Horizonte, Brazil Signify Technology Full-time

    The Company: A well-established tech organization building advanced AI products for healthcare and clinical research. The team focuses on secure, reliable platforms that process sensitive medical data and support research and clinical workflows. Role & Responsibilities: As a Senior SRE, you will: Design and automate infrastructure (infrastructure-as-code tools); Build...

  • React Native Engineer

    2 weeks ago


    Belo Horizonte, Brazil AgileEngine Full-time

    React Native Engineer (Lead) ID36430. AgileEngine is one of the Inc. 5000 fastest-growing companies in the US and a top-3 ranked dev shop according to Clutch. We create award-winning custom...


  • Belo Horizonte, Brazil Stone Full-time

    Who is Stone Tech? Stone was founded with the purpose of playing a leading role in transforming the payments industry, striving to offer the best solutions for entrepreneurs in Brazil. With that in mind, we built Stone Tech! It brings together the technology teams of Stone Co. and the group's financial companies, which recognize the entrepreneurial potential of...