ML Data Pipeline Engineer
2 weeks ago
We're seeking a Data Pipeline Engineer to own and evolve our exercise recognition training data infrastructure. You'll manage the end-to-end pipeline that collects, synchronizes, validates, and prepares IMU sensor and video data for ML model training. This role combines systems engineering, data quality automation, and hands-on problem-solving in a production environment.

What You'll Do

Pipeline Operations & Improvement
Maintain and enhance our multi-source data collection system: IMU sensors (via mobile app) and synchronized video streams from gym-based cameras.
Improve the robustness of our video capture software, particularly its handling of network interruptions and its operational monitoring.
Deploy and monitor services in remote Linux environments with appropriate DevOps practices.

Data Quality & Validation
Evolve our Python-based QC engine that validates data pre- and post-annotation.
Implement checks for IMU-video time synchronization, sensor health, and measurement consistency.
Apply digital signal processing techniques to identify sensor failures, connectivity issues, and measurement irregularities.
Develop validation logic comparing annotations against sensor data to ensure temporal alignment.

Analysis & Troubleshooting
Perform ad-hoc analysis on 1,200+ workout tasks to classify failure modes.
Identify whether issues stem from pipeline bugs, sensor problems, or annotation errors.
Prioritize engineering work based on data quality impact and coordinate with the annotation team on fixes.

Tooling and Visualization
Maintain and extend our NextJS UI serving annotators, data scientists, and stakeholders.
Create visualizations (Chart.js) for QC metrics and signal analysis.
Integrate with the LabelStudio annotation interface.

What You Bring

Required
Strong Python programming skills, particularly for data processing pipelines
Experience with time-series data and digital signal processing
Comfortable working in Linux environments and deploying/monitoring remote services
Ability to debug complex multi-component systems (sensors, video, networks, sync)
Data quality mindset: designing validation rules, tracking metrics, investigating anomalies
SQL/database experience for managing pipeline metadata

Highly Valued
Video processing experience (RTSP streams, encoding, OCR)
Working with sensor/IoT data and handling connectivity challenges
NextJS or modern web frameworks for data tooling
DevOps practices: containerization, monitoring, logging, alerting
Experience with annotation pipelines and ML training data workflows
Background in biomechanics, sports science, or wearable sensors

Tech Stack
Languages: Python (primary), JavaScript/TypeScript (NextJS UI)
Data: IMU sensor streams, video (RTSP), time-series analysis, DSP
Tools: LabelStudio, Chart.js, Linux/bash, OCR libraries
Infrastructure: Remote deployment, monitoring systems

You'll Thrive Here If You
Enjoy detective work: diagnosing why data doesn't match expectations
Balance pragmatism with quality: shipping improvements while maintaining reliability
Communicate well across technical and non-technical stakeholders
Can work autonomously in a small, mission-driven team
-
Lead AI Engineer
1 day ago
Mato Grosso, Brazil GeorgiaTEK Systems Inc. Full-time
Role: Lead AI Engineer. Location: Remote. Type: Contract. Responsibilities: The Lead AI Engineer will be responsible for designing, developing, and deploying scalable machine learning and AI solutions. This includes building end-to-end ML pipelines, implementing intelligent automation, enabling data-driven decision-making, and supporting enterprise-level...
-
Data Engineer
1 day ago
Mato Grosso, Brazil Ascendion Full-time
Overview: We are seeking a highly skilled Data Engineer to support the development of personalized search capabilities and data assimilation initiatives within the organization. This is a remote role aligned to the EST time zone. The ideal candidate will have strong experience working with Python, Databricks, and Kafka, and will contribute directly to...
-
Data Engineer
7 days ago
Mato Grosso, Brazil Design Manager Full-time
Job Title: Data Engineer. Location: Remote (Brazil). Employment Type: Full Time. Compensation: Competitive hourly rate, commensurate with experience. About Design Manager: Design Manager (+DesignSpec) is a leading provider of project management and accounting software tailored specifically for interior design firms. For over 30 years, we've helped thousands of...
-
Sr Data Engineer
1 day ago
Mato Grosso, Brazil Luxoft Full-time
Before you apply, please get familiar with Luxoft. Responsibilities: Design and implement scalable data pipelines using Databricks and Kafka. Build and maintain real-time streaming solutions for high-volume data. Collaborate with cross-functional teams to integrate data flows into broader systems. Optimize...
-
Senior Data Engineer
3 weeks ago
Mato Grosso, Brazil Eightpoint Full-time
About Eightpoint: Eightpoint is an internet technology company specializing in the agile development of products and content that address real-world interests, captivating users and driving significant growth for partners. With offices in the United States and Cayman Islands, Eightpoint collaborates with partners globally on the next generation of...
-
Data Engineer
3 weeks ago
Mato Grosso, Brazil Tata Consultancy Services Full-time
Come to one of the biggest IT services companies in the world! Here you can transform your career. Why join TCS? At TCS we believe that people make the difference, which is why we live a culture of unlimited learning, full of opportunities for improvement and mutual development. The ideal scenario to expand ideas through the right tools, contributing...
-
Sr Python Data Engineer
3 weeks ago
Mato Grosso, Brazil Softensity Inc Full-time
Senior Python Data Engineer
About the Project
Responsibilities: Design, build, and maintain high-performance data processing pipelines using Python libraries (Pandas, Polars). Develop and expose RESTful APIs using FastAPI or similar frameworks. Consume and process normalized Parquet files from multiple upstream sources to generate dynamic Excel reports....
-
Remote Data Scientist
1 day ago
Mato Grosso, Brazil Turing Full-time
About Turing: Turing is one of the world's fastest-growing AI companies accelerating the advancement and deployment of powerful AI systems. Turing helps customers in two ways: Working with the world's leading AI labs to advance frontier model capabilities in thinking, reasoning, coding, agentic behavior, multimodality, multilinguality, STEM and frontier...
-
Cloud Data Expert
6 days ago
Mato Grosso, Brazil beBeeDataEngineer Full-time
Azure Data Engineer Role
Transform your career as an expert data engineer in the cloud. Data Pipeline Design and Development: Build high-performance data processing solutions using Azure Data Factory, Azure Databricks, and other services. Data Science Collaboration: Understand data requirements and deliver clean, reliable datasets for business insights....
-
Sr Python Data Engineer
2 weeks ago
Mato Grosso, Brazil Softensity Inc Full-time
Senior Python Data Engineer
About the Project
Responsibilities: Design, build, and maintain high-performance data processing pipelines using Python libraries (Pandas, Polars). Develop and expose RESTful APIs using FastAPI or similar frameworks. Consume and process normalized Parquet files from multiple upstream sources to generate dynamic Excel reports....