Site Reliability Engineer

4 weeks ago


Brazil Kraken Full-time
Overview

Kraken is hiring a Site Reliability Engineer for its Data Platform. Join our Data Infrastructure team to uphold the reliability, scalability, and efficiency of our data platform.

Responsibilities
  • Design the data governance mechanisms that ensure our lakehouse is easy to interact with, secure and in compliance with all applicable regulations.
  • Implement the infrastructure we use to ingest our data, store it, catalog it with the right metadata and capture its lineage.
  • Provide a state-of-the-art suite of BI tools for multiple teams within the company.
  • Guarantee the availability, high performance, scalability and cost efficiency of our data platform.
  • Implement self-service data infrastructure solutions that support the needs of 10+ business units and over 100 engineers and data analysts
  • Utilize Infrastructure as Code (IaC) principles to design, provision, and manage both on-premises and cloud (AWS) infrastructure components using tools such as Terraform
  • Develop and maintain Bash/shell scripts to automate operational tasks and deployments
  • Enhance and manage CI/CD pipelines to facilitate consistent software deployments across the data infrastructure
  • Implement robust data monitoring and alerting solutions to proactively detect anomalies and performance issues
  • Manage and implement role-based access control (RBAC) and permissions for multiple user groups and machine workflows across environments
  • Manage and maintain real-time streaming data architecture using technologies like Kafka and Debezium (CDC)
  • Ensure timely and accurate processing of streaming data for insights
  • Utilize Kubernetes to manage containerized applications within the data infrastructure
  • Implement incident response procedures and participate in on-call rotations
  • Collaborate with data analysts, engineers, and cross-functional teams to understand requirements and implement solutions
  • Document architecture, processes, and best practices to enable knowledge sharing and continuous improvement
  • Support AI/ML teams with their infrastructure requests
Qualifications
  • Proven experience (5+ years) as a Site Reliability Engineer, Infrastructure Engineer, Data Infrastructure Engineer, or in a similar role with a focus on data infrastructure and security
  • Experience with real-time data processing technologies such as Kafka, Flink, and Debezium
  • Experience managing hybrid multi-tenant cloud systems, particularly on AWS
  • Experience with Infrastructure as Code tools such as Terraform, Terragrunt and Atlantis
  • Experience with containerization/orchestration tools (Kubernetes, Nomad, Docker)
  • Strong Bash/shell scripting and proficiency in at least one programming language (preferably Python or JVM languages)
  • Experience with data technologies: Apache Airflow, Apache Spark, databases, BI tooling
  • Experience solving data access management challenges in large-scale data lakes
  • Familiarity with CI/CD deployment pipelines and related tools
  • Strong problem-solving skills and ability to troubleshoot complex systems

This job is accepting ongoing applications and there is no application deadline.
Please note, applicants may redact or remove information identifying age, date of birth, or dates of attendance/graduation on their resume.
We consider qualified applicants with criminal histories for employment consistent with the San Francisco Fair Chance Ordinance.

Kraken is powered by people from around the world and we celebrate diverse talents, backgrounds, and perspectives. We hire based on merit and encourage applying for roles even if you don't meet every listed requirement, especially if you're passionate about crypto. Kraken is an equal opportunity employer; we do not tolerate discrimination or harassment. See Kraken's Career and Privacy policies for more information.
