Staff Systems Engineer

2 days ago


São Paulo, São Paulo, Brazil | Nubank | Full-time | R$120,000 - R$250,000 per year

About Nubank

Nubank was founded in 2013 to free people from a bureaucratic, slow, and inefficient financial system. Since then, through innovative technology and outstanding customer service, the company has been redefining people's relationships with money across Latin America. With operations in Brazil, Mexico, and Colombia, Nubank is today one of the largest digital banking platforms and technology-leading companies in the world.

Today, Nubank is a global company, with offices in São Paulo (Brazil), Mexico City (Mexico), Buenos Aires (Argentina), Bogotá (Colombia), Durham (United States), and Berlin (Germany). It was founded in 2013 in São Paulo by Colombian David Vélez, together with co-founders Cristina Junqueira, a Brazilian, and Edward Wible, an American. For more information, visit 

About the team

We are seeking an experienced and highly motivated Staff Site Reliability Engineer to join our Data Infra SRE team. This critical role will play a pivotal part in shaping the future direction of the SRE team for our Data Platform, contributing significantly to its evolution plan as we move toward a Data Mesh architecture. While the team was initially formed under specific circumstances, our ambitious decentralization goals mean that we cannot scale effectively without a heavy investment in automation. Our vision is therefore to invest heavily in automation with AI and AI agents, leveraging new frameworks like LangGraph alongside more classical automation approaches. For example, we aim to drastically reduce the time and effort involved in data platform crash resolution and coordination through intelligent automation. Another key initiative is to develop a "swarm of AI agents" that will act as a lubricant for the Data Platform, focusing on sophisticated anomaly detection mechanisms and predictive analytics to detect problems early and automatically notify the responsible teams. This approach is intended to raise the reliability and efficiency of the Data Platform. We are looking for an individual with a proactive and entrepreneurial mindset who can drive innovation and excellence in reliability engineering.

Your role

Your contributions in this role will directly address critical challenges, such as:

  • Strategic Direction and Evolution: Proactively identifying opportunities and leading initiatives to define and refine the strategic direction of the SRE team within the data platform context, specifically contributing to the Archipelago evolution plan.
  • Architectural Leadership: Providing architectural guidance and expertise for the design, implementation, and maintenance of highly reliable, scalable, and performant data infrastructure.
  • Incident Management and Resolution: Leading the effort in establishing and refining incident response protocols, ensuring efficient resolution of critical data platform issues, and driving post-incident analysis for continuous improvement.
  • Performance Optimization: Identifying and implementing solutions to optimize the performance, efficiency, and resource utilization of the data platform.
  • Automation and Tooling: Championing the development and adoption of advanced automation solutions, including leveraging AI and AI agents with new frameworks like LangGraph, for tasks such as data platform crash resolution and coordination. You will also oversee the development of more classical automation tools.
  • Proactive System Health: Designing and implementing advanced monitoring, alerting, and anomaly detection mechanisms to ensure the proactive identification and prevention of potential issues. This includes exploring the concept of a "swarm of AI agents" to act as a lubricant for the Data Platform, focusing on predictive analytics and notifying responsible teams preventatively.
  • Mentorship and Leadership: Mentoring less senior SREs, fostering a culture of reliability engineering excellence, and leading technical initiatives within the team.
Our SRE team is formally responsible for:
  • Service Level Objectives (SLO) Management: Defining, monitoring, and enforcing SLOs for critical data platform services.
  • System Observability: Implementing and maintaining comprehensive monitoring, logging, and tracing solutions across the data platform.
  • Toil Reduction: Identifying and automating repetitive manual tasks to improve team efficiency and focus on strategic initiatives.
  • Disaster Recovery and Business Continuity: Developing and testing disaster recovery plans to ensure the resilience of the data platform.
  • Capacity Planning: Forecasting resource needs and planning for infrastructure scaling to meet anticipated demand.
  • Performance Engineering: Optimizing system performance and addressing bottlenecks to ensure efficient operation.
  • Security Best Practices: Implementing and advocating for security best practices within the data platform.
  • Platform APIs: Enabling alert management on virtually any service with simple interactions.

Role Location

This is a fully remote position, with the option to visit the Berlin office whenever you would like.

Benefits
  • Health, dental, and life insurance
  • Meal allowance
  • Transportation assistance
  • 30 days of paid vacation
  • Equity at Nubank
  • Parking partnership - discounted parking at our office
  • Free bike parking with showers available
  • NuCare - Our mental health and wellness assistance program
  • NuLanguage - Our language learning program
  • Gympass partnership
  • Extended maternity and paternity leaves
  • Child care allowance
  • 'Espaço Feijão' - Private nursing and breastfeeding spaces in our buildings
  • Onsite Health Center - Medical support for every Nubanker in our office
Diversity & Inclusion

At Nubank, we want to be sure that we're building a more diverse and inclusive workplace that reflects the customers we serve and seek to empower. That's why we hire based on equality. We consider gender, ethnicity, race, religion, sexual orientation, and other identity markers as enriching elements to our company, while ensuring that none of them represents a barrier when recruiting fantastic talent.


