Manager, Site Reliability Engineer

2 days ago


São Paulo, Brazil Wildlife Studios Full-time

We're looking for a talented and passionate Site Reliability Engineer Manager to join Wildlife's Cloud Platform team. As an SRE Manager, your goal is to provide easy-to-use, highly available systems to all the engineers in the company. You will enable your team to improve the infrastructure services, using and refining our existing automations, while contributing to technical and business decisions for new services that support the scalability and usability of the company's infrastructure. You will also foster the team's career growth, engagement, and retention. We know that the work we do has a high impact on our company's success and culture. The right person for this position is curious by nature, proactive, loves solving problems, and can thrive in a fast-paced, growing business.

What you'll do
  • Manage a cross-functional team, contributing to the team roadmap and the growth of its individual contributors.
  • Develop, maintain, and optimize infrastructure clusters (e.g., Kubernetes, NATS, ETCD, Postgres, MongoDB, Redis, Elasticsearch), infrastructure services (e.g., Gitlab, Jenkins, Vault, Artifactory, Datadog, Jaeger), and our APIs and automations to manage them (e.g., Kubernetes Operators, Infrastructure as Code, Pipelines, CLIs).
  • Analyze costs of infrastructure services and help define and optimize the budget of our infrastructure and game teams.
  • Contribute to improvements in monitoring and observability patterns for infrastructure services.
  • Troubleshoot, manage, and lead incidents in production.
  • Manage and improve the tools and processes related to infrastructure management across the company (Infrastructure-as-Code standards, CI/CD design, building our Internal Developer Platform).
  • Help partner teams architect and scale their applications and infrastructure with cloud-native best practices.

What you'll need
We expect our Managers to be technical, dedicating around 50% of their time to working with the ICs in their day-to-day work and being an active voice in the team's technical roadmap.
  • Experience managing small teams with an infrastructure background.
  • Some level of leadership skills, including people management, communications, project management, talent development, performance management, team effectiveness, agility, hiring, decision making, planning, budgeting, and collaboration.
  • Coding experience in at least one programming language. We work mostly with Go and Python.
  • University degree in a computing-related field such as Computer Engineering, Computer Science, Information Systems, or Systems Analysis and Development, or equivalent market experience.
  • Solid understanding of computing concepts (operating systems, networking, concurrency, memory management, and algorithm analysis).
  • Experience with cloud computing services such as Amazon AWS, Google Cloud, or Microsoft Azure.
  • Experience with Infrastructure as Code automations, such as Terraform, Packer, Ansible, Crossplane, etc.
  • Experience managing Kubernetes clusters and developing Kubernetes operators.
  • Experience automating routine tasks, such as deployments and monitoring setup.
  • Experience with incident management and being on-call for production systems and workloads.
  • Strong written and spoken communication skills in English.
  • Experience with complex, large-scale, highly available systems.
  • Experience with monitoring and telemetry in applications and infrastructure.
  • History of technical leadership and ownership of critical projects, including the mentoring of junior team members.

More about you
  • Player focused. We are player-oriented, and infrastructure has a great impact on our players' experience. You have empathy with our players and focus on ensuring they have an amazing experience. You aim for top-level infrastructure, guaranteeing the highest availability possible.
  • Automation is key to scaling. We look for engineers who have a history of designing and executing automation projects in order to get rid of manual and repetitive tasks.
  • Calm and pragmatic. When everything seems to be falling apart around you, you have a plan and keep calm.
  • Bleeding edge. You are curious and like to study new technologies, test new solutions, and measure the impact brought by changes. We want to ensure we are using the best stack possible.
  • Metrics-oriented. We make decisions based on data and metrics. We measure the results of our tasks against the expected outcome, and we ensure our work has delivered the correct impact on our customers. We believe in ownership and in shipping features end to end.
  • Bar raiser. You want to elevate your team's skills and raise the bar by mentoring your peers, spreading knowledge, being proactive, and acting as a tech lead.

About Wildlife
Wildlife is one of the leading mobile game developers and publishers in the world. We have released more than 60 titles, reaching billions of people around the globe. Here, we create games that will excite, intrigue, and engage our players for years to come.


  • Site Reliability Engineer

    22 minutes ago


    São Paulo, Brazil PayRetailers Full-time

    Job Overview: We’re PayRetailers, and we offer cutting‑edge payment solutions that empower businesses to succeed in Latin America & Africa. Our collaborative and inclusive work environment encourages creativity and growth, where every employee’s contribution is valued. We’ve got big plans to expand into new markets and make a meaningful impact on the...


  • São Paulo, Brazil K2 Solutions Full-time

    Hybrid work in the Pinheiros area of São Paulo, with 3 days per week in the office. We are looking for a Senior Site Reliability Engineer (SRE) to join our team and play an essential role in maintaining, automating, and improving the reliability of the systems that power the company's logistics network across multiple regions. This person...


  • São Paulo, Brazil INDI Staffing Services Full-time

    Overview: We are looking for a Site Reliability Engineer to build and maintain highly reliable, scalable, and secure OpenShift/Kubernetes clusters. Approach the problem of building and maintaining production systems from a software engineering perspective with a focus on automation and reliability. Responsibilities: Build, automate, and maintain...

  • Site Reliability Engineer

    4 weeks ago


    São Paulo, Brazil INDI Staffing Services Full-time

    At INDI, we're passionate about empowering individuals and businesses worldwide. Our cutting-edge recruiters connect leading companies with top talent, fostering a dynamic environment where innovation thrives. Join us in shaping the future of work. Overview of the role: We are looking for a Site Reliability Engineer to build and maintain highly reliable,...

  • Site Reliability Engineer

    3 weeks ago


    São Paulo, Brazil INDI Staffing Services Full-time

    At INDI, we're passionate about empowering individuals and businesses worldwide. Our cutting-edge recruiters connect leading companies with top talent, fostering a dynamic environment where innovation thrives. Join us in shaping the future of work. Overview of the role: We are looking for a Site Reliability Engineer to build and maintain highly reliable,...

  • Site Reliability Engineer

    4 weeks ago


    São Paulo, Brazil Thales Full-time

    Overview: This position is on-site in our Berrini unit. Position Summary: The candidate will work as an SRE team member, helping the organization ensure the reliability, availability, and performance of large-scale ODC services. SRE will work closely with development teams to...

  • Site Reliability Engineer

    3 weeks ago


    São Paulo, Brazil Thales Full-time

    Overview: Thales people architect identity management and data protection solutions at the heart of digital security. Our technologies and services help banks exchange funds, people cross borders, energy become smarter and much more. More than 30,000 organizations rely on us to verify identities,...


  • São Paulo, Brazil Chainlink Labs Full-time

    Senior Site Reliability Engineer role at Chainlink Labs, posted 2 weeks ago. About Us: Chainlink Labs is the primary contributing developer of Chainlink, the decentralized...


  • São Paulo, Brazil INDI Staffing Services Full-time

    Overview of the role: We are looking for a Site Reliability Engineer to build and maintain highly reliable, scalable, and secure OpenShift/Kubernetes clusters. We will need you to approach the problem of building and maintaining production systems from a software engineering perspective with a focus on automation and reliability. Key responsibilities: Build...


  • São Paulo, Brazil INDI Staffing Services Full-time

    Overview of the role: We are looking for a Site Reliability Engineer to build and maintain highly reliable, scalable, and secure OpenShift/Kubernetes clusters. We will need you to approach the problem of building and maintaining production systems from a software engineering perspective with a focus on automation and reliability. Key responsibilities: Build and...