ML Data Pipeline Engineer
3 weeks ago
We're seeking a Data Pipeline Engineer to own and evolve our exercise recognition training data infrastructure. You'll manage the end-to-end pipeline that collects, synchronizes, validates, and prepares IMU sensor and video data for ML model training. This role combines systems engineering, data quality automation, and hands-on problem-solving in a production environment.

What You'll Do

Pipeline Operations & Improvement
  Maintain and enhance our multi-source data collection system: IMU sensors (via mobile app) and synchronized video streams from gym-based cameras.
  Improve video capture software robustness, particularly handling network interruptions and operational monitoring.
  Deploy and monitor services in remote Linux environments with appropriate DevOps practices.

Data Quality & Validation
  Evolve our Python-based QC engine that validates data pre- and post-annotation.
  Implement checks for IMU-video time synchronization, sensor health, and measurement consistency.
  Apply digital signal processing techniques to identify sensor failures, connectivity issues, and measurement irregularities.
  Develop validation logic comparing annotations against sensor data to ensure temporal alignment.

Analysis & Troubleshooting
  Perform ad-hoc analysis on ~1,200+ workout tasks to classify failure modes.
  Identify whether issues stem from pipeline bugs, sensor problems, or annotation errors.
  Prioritize engineering work based on data quality impact and coordinate with the annotation team on fixes.

Tooling and Visualization
  Maintain and extend our NextJS UI serving annotators, data scientists, and stakeholders.
  Create visualizations for QC metrics and signal analysis.
  Integrate with the LabelStudio annotation interface.

What You Bring

Required
  Strong Python programming skills, particularly for data processing pipelines.
  Experience with time-series data and digital signal processing.
  Comfortable working in Linux environments and deploying/monitoring remote services.
  Ability to debug complex multi-component systems (sensors, video, networks, sync).
  Data quality mindset: designing validation rules, tracking metrics, investigating anomalies.
  SQL/database experience for managing pipeline metadata.

Highly Valued
  Video processing experience (RTSP streams, encoding, OCR).
  Working with sensor/IoT data and handling connectivity challenges.
  NextJS or modern web frameworks for data tooling.
  DevOps practices: containerization, monitoring, logging, alerting.
  Experience with annotation pipelines and ML training data workflows.
  Background in biomechanics, sports science, or wearable sensors.

Tech Stack
  Languages: Python (primary), JavaScript/TypeScript (NextJS UI)
  Data: IMU sensor streams, video (RTSP), time-series analysis, DSP
  Tools: LabelStudio, Linux/bash, OCR libraries
  Infrastructure: Remote deployment, monitoring systems

You'll Thrive Here If You
  Enjoy detective work: diagnosing why data doesn't match expectations.
  Balance pragmatism with quality: shipping improvements while maintaining reliability.
  Communicate well across technical and non-technical stakeholders.
  Can work autonomously in a small, mission-driven team.
-
Data Engineer
3 weeks ago
Brazil, HeartCentrix Solutions, Full-time
We are seeking a highly skilled Python Data Engineer with an AI/ML focus to join our client’s growing data & analytics team in Brazil. This role is ideal for someone who loves building scalable data pipelines, operationalizing machine learning workflows, and partnering closely with data scientists to bring models into production. You will design, develop,...
-
Senior Data Engineer – AI/ML Focus
3 weeks ago
Brazil, beBeeData, Full-time
We're seeking a seasoned Python developer to spearhead data engineering initiatives and drive AI/ML innovation within our organization. This role is perfect for someone who's passionate about crafting scalable data pipelines, operationalizing machine learning workflows, and collaborating closely with data scientists to bring models into production. You will...
-
Senior Data Scientist
3 weeks ago
Brazil, Luxoft, Full-time
Project Description: The primary goal of the project is the modernization, maintenance and development of an eCommerce platform for a big US-based retail company, serving millions of omnichannel customers each week. Solutions are delivered by several Product Teams focused on different domains: Customer, Loyalty, Search and Browse, Data Integration, Cart....
-
Data Engineer for Sustainable Commerce
3 weeks ago
Brazil, beBeeData, Full-time
Senior Data Engineer Opportunity
We are seeking a highly skilled Senior Data Engineer with expertise in MLOps to join our growing team. As a Senior Data Engineer, you will be responsible for designing, developing, and maintaining robust data infrastructure across real-time and batch workloads.
Design and implement scalable data pipelines for model training,...
-
Distributed Data Pipeline Specialist
3 weeks ago
Brazil, beBeeDataEngineer, Full-time
Cloud Data Engineer - Remote Contract Opportunity
We are seeking a highly skilled Cloud Data Engineer to join our team in a fully remote contract position. As a Cloud Data Engineer, you will be responsible for designing and implementing scalable data pipelines using cloud-based technologies.
Design and build ETL/ELT pipelines in the cloud using PySpark,...
-
Azure Data Engineer
6 days ago
Brazil, Penta Consulting, Full-time
We are currently looking for a Portuguese-speaking Data Engineer to work across Microsoft Azure solutions, delivering proactive services and workshops to premier enterprise customers remotely in Brazil.
Key Responsibilities
Design and implement end-to-end data architecture solutions on Microsoft Azure.
Develop and manage ETL/ELT pipelines using Azure...