Data Engineer – Azure Databricks
3 days ago
US - Remote

About the team:
Capco’s Data Team helps our clients transform every aspect of their business. We are highly skilled at formulating data strategy, defining business and technology initiatives across the data management lifecycle, and aligning multi-year strategic roadmaps with clients’ business goals. As digital technologies advance and regulations tighten, today’s consumers – and, therefore, today’s businesses – are becoming more aware of the importance of good quality data. We work to establish holistic ways to effectively manage data through the modern data supply chain and facilitate consumption through analytics, modelling, AI, machine learning, dashboarding, and reporting.

About the Job:
The Data Engineer will serve as the lead technical specialist for designing and implementing data science and advanced analytics capabilities on Microsoft Azure Fabric and Databricks. This role focuses on data processing, identity resolution, entity linking, and data warehouse development that enable organizations to unify fragmented data across multiple systems into a trusted, governed, and analytics-ready model. The ideal candidate combines deep hands‑on expertise in Databricks engineering, data modeling, and applied data science with the ability to build scalable, production-grade data solutions in collaboration with business, engineering, and analytics teams.

What You’ll Get to Do:

Data Platform & Warehouse Development
Design and develop data lakehouse and warehouse structures within Azure Databricks and Fabric environments.
Build ETL and ELT pipelines to extract, cleanse, normalize, and enrich data from CRM, ERP, LMS, and financial systems.
Develop reusable data transformation and validation frameworks leveraging PySpark, SQL, and Delta Live Tables.
Support the operationalization of the central data warehouse using Azure SQL and Fabric Data Warehouse.

Identity Resolution & Data Linking
Implement entity resolution models to unify customer, member, or participant records across systems using deterministic and probabilistic matching techniques.
Design and deploy matching algorithms utilizing Databricks MLflow, PySpark, and Azure Machine Learning for cross-system deduplication and linkage.
Collaborate with architects to define unique identifiers, external keys, and golden record frameworks for enterprise data integration.
Monitor and continuously refine data matching accuracy, precision, and recall metrics.

Data Processing & Automation
Develop and schedule data ingestion pipelines in Azure Fabric and Databricks for recurring Excel, CSV, and structured PDF sources using Power Automate, Form Recognizer, and Fabric Dataflows.
Apply data quality and validation rules to flag incomplete, inconsistent, or stale records.
Build and automate data lineage, change tracking (CDC), and error-handling workflows.
Support performance tuning and scalability for high-volume processing environments.

Analytics & Modeling Support
Provide curated and feature-engineered datasets for Power BI dashboards and machine learning use cases.
Partner with data analysts to define KPIs and enable cross-system reporting and predictive insights.
Develop scripts and notebooks to support exploratory data analysis (EDA) and visualization in Databricks.

What You’ll Bring with You:
BA in Data Science, Computer Science, Applied Mathematics, or related discipline.
5+ years of experience in data engineering and applied data science on Azure platforms.
3+ years building and managing pipelines in Azure Databricks (PySpark, Delta Lake, MLflow).
2+ years hands‑on experience with Microsoft Fabric (Data Factory, Dataflow Gen2, Data Warehouse).
Power BI integration and data modeling.
Entity resolution and master data management (MDM) methods.
Statistical modeling, clustering, and record linkage algorithms.
Data governance, lineage tracking, and compliance (PII, HIPAA, etc.).
Proven track record implementing identity resolution and entity linking frameworks.
Strong background in SQL, Python, and large-scale data processing for analytics.

Preferred Certifications:
Microsoft Certified: Fabric Analytics Engineer Associate

Why Capco?
A career at Capco is a chance to help reshape the competitive landscape in financial services. We launch new banks, transform existing ones, and help our clients navigate complex change. As consultants, we work on the front‑end business design all the way through to technology implementation. We are the largest Financial Services focused consultancy in the world, serving everyone from global banks to emerging FinTechs, from strategy through digital transformation, design, business consulting, data and analytics, cyber, cloud, technology architecture, and engineering.

Capco is a young and growing firm. We maintain an entrepreneurial spirit and growth mindset, and have minimal bureaucracy. We have no internal silos that get in the way of your career opportunities or ability to focus on our clients and make a difference to the business. We offer the opportunity for everyone to learn rapidly, take on tough challenges, and get promoted quickly. We take pride in our creative, collaborative, diverse, and inclusive culture, where everyone can #BYAW.

Benefits
We offer highly competitive benefits, including medical, dental and vision insurance, a 401(k) plan, tuition reimbursement, and a work culture focused on innovation and creation of lasting value for our clients and employees.

Ready to take the Next Step?
If this sounds like you, we would love to hear from you. This is an opportunity to make a difference and contribute to a highly successful company with a significant growth trajectory.

US Pay Range
$125,000 - $143,000 USD

U.S. EQUAL OPPORTUNITY EMPLOYMENT INFORMATION
Individuals seeking employment at Capco are considered without regard to race, color, religion, national origin, age, sex, marital status, ancestry, physical or mental disability, veteran status, gender identity, or sexual orientation. You are being given the opportunity to provide the following information in order to help us comply with federal and state Equal Employment Opportunity/Affirmative Action record keeping, reporting, and other legal requirements. Completion of the form is entirely voluntary. Whatever your decision, it will not be considered in the hiring process or thereafter. Any information that you do provide will be recorded and maintained in a confidential file.
-
Senior Solutions Architect
4 weeks ago
Jundiaí, Brazil Databricks Inc. Full-time
Overview: While candidates in the listed location(s) are encouraged for this role, candidates in other locations will be considered. As a Senior Solutions Architect on the Digital Native Strategic team, you will shape the future of the Data & AI landscape by working with the most sophisticated data engineering and data science teams in the world. Databricks'...
-
Senior Azure Data Engineer
3 weeks ago
Jundiaí, Brazil Dataside Full-time
Role in the company: Build data pipelines that deliver data models in the Bronze, Silver, and Gold layers for the business team; be able to analyze data and make decisions. Responsibilities: Write complex SQL queries to perform tasks such as selecting, inserting, updating, and deleting data across multiple tables. Build pipelines of...
-
Senior Data Engineer
2 weeks ago
Jundiaí, Brazil Agricopel Full-time
Job Description: We are looking for a data engineer with the expertise to join the Agricopel team. The candidate must have skills in Python programming, Apache Spark, and SQL, as well as experience with Databricks, Azure (Data Factory), and Git. Responsibilities: Develop,...
-
Lead Data Engineer
7 days ago
Jundiaí, Brazil Fusemachines Full-time
About Fusemachines: Fusemachines is a leading AI strategy, talent, and education services provider. Founded by Sameer Maskey Ph.D., Adjunct Associate Professor at Columbia University, Fusemachines has a core mission of democratizing AI. With a presence in 4 countries (Nepal, United States, Canada, and Dominican...
-
Lead Data Software Engineer
7 days ago
Jundiaí, Brazil Epam Systems Full-time
We are seeking a highly skilled Lead Data Software Engineer to join our remote team, working with a global leader in banking and financial services. With a legacy of over 70 years, this bank offers a wide range of services, including financing and leasing, foreign exchange, and investment banking. As a Lead Data...
-
Data Architect
3 days ago
Jundiaí, Brazil Innolevels Full-time
We are hiring a Data Architect to carry out digital transformation projects, developing innovative tools to deliver the best experience to the users of a large company's platform. We understand that this evolution requires: experience in modern data architecture (Lakehouse, Medallion); Databricks...
-
Data Engineer
1 week ago
Jundiaí, Brazil Siemens Full-time
Do you want to play a key role in building a better future, working on projects that impact society and the environment? If you're driven by the search for innovative solutions that maximize value creation and drive digital transformation, this is your...
-
Python and Kubernetes Software Engineer
2 days ago
Jundiaí, Brazil Canonical Full-time
Python and Kubernetes Software Engineer - Data, AI/ML & Analytics...
-
Senior Backend Engineer
1 week ago
Jundiaí, Brazil Cyera Full-time
Senior Backend Engineer – Build & CI Focus
R&D, Tel Aviv, Full-time
Description: At Cyera, we’re redefining cloud data security. Our platform rapidly scans massive cloud environments, giving customers clarity on their data and empowering them to prevent leaks before they happen. Backed by leading cyber investors and a stellar team, we’re growing fast and...
-
Data Infrastructure Architect
5 days ago
Jundiaí, Brazil Bebeeinfrastructure Full-time
We are seeking a skilled Data Infrastructure Architect to lead the design and optimization of our global data infrastructure. This role requires deep technical expertise across multiple cloud platforms, including AWS and Azure. The successful candidate will be responsible for architecting and managing scalable, fault-tolerant, and high-performance data...