ML Data Pipeline Engineer

1 day ago

Cuiabá, Brazil · Prosigliere · Full-time

We're seeking a Data Pipeline Engineer to own and evolve our exercise recognition training data infrastructure. You'll manage the end-to-end pipeline that collects, synchronizes, validates, and prepares IMU sensor and video data for ML model training.

This role combines systems engineering, data quality automation, and hands-on problem-solving in a production environment.

What You'll Do

Pipeline Operations & Improvement

- Maintain and enhance our multi-source data collection system: IMU sensors (via mobile app) and synchronized video streams from gym-based cameras.
- Improve the robustness of our video capture software, particularly its handling of network interruptions and its operational monitoring.
- Deploy and monitor services in remote Linux environments with appropriate DevOps practices.
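
As an illustration of the kind of robustness work involved, here is a minimal sketch of one common way to survive network interruptions: retrying a connection with exponential backoff. It is not a description of our actual capture software. The function `capture_with_retry`, its parameters, and the `open_stream` callable are all hypothetical names chosen for this example.

```python
import time


def capture_with_retry(open_stream, max_attempts=5, base_delay_s=1.0,
                       max_delay_s=30.0, sleep=time.sleep):
    """Call open_stream() until it succeeds, backing off between attempts.

    open_stream is a hypothetical zero-argument callable that returns a
    stream handle or raises ConnectionError when the network is down.
    """
    delay = base_delay_s
    for attempt in range(1, max_attempts + 1):
        try:
            return open_stream()
        except ConnectionError:
            # Out of attempts: let the caller (or a monitor) see the failure.
            if attempt == max_attempts:
                raise
            sleep(delay)
            # Double the wait each time, capped so recovery is never too slow.
            delay = min(delay * 2, max_delay_s)
```

The `sleep` argument is injectable so the backoff schedule can be tested without real waiting.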

Data Quality & Validation

- Evolve our Python-based QC engine that validates data pre- and post-annotation.
- Implement checks for IMU-video time synchronization, sensor health, and measurement consistency.
- Apply digital signal processing techniques to identify sensor failures, connectivity issues, and measurement irregularities.
- Develop validation logic comparing annotations against sensor data to ensure temporal alignment.
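
To give a flavor of these checks, here is a minimal sketch of a QC pass over a single IMU channel. It is an illustrative example under stated assumptions, not our QC engine: the names `QCResult` and `run_qc`, the 50 Hz sample rate, and the thresholds are all hypothetical. It flags timestamp gaps (possible connectivity drops), a flat signal (possible sensor failure), and annotations that fall outside the recorded time span.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class QCResult:
    gaps: List[Tuple[float, float]]       # (start, end) of each timestamp gap, seconds
    flatline: bool                        # True if the signal barely varies
    annotations_out_of_range: List[int]   # indices of invalid annotations


def run_qc(timestamps, values, annotations, expected_rate_hz=50.0,
           gap_factor=3.0, flatline_eps=1e-6):
    """Hypothetical QC pass; sample rate and thresholds are assumed values."""
    nominal_dt = 1.0 / expected_rate_hz
    # A gap is any step between consecutive samples much longer than nominal.
    gaps = [
        (t0, t1)
        for t0, t1 in zip(timestamps, timestamps[1:])
        if (t1 - t0) > gap_factor * nominal_dt
    ]
    # A (near-)constant signal suggests a stuck or disconnected sensor.
    flatline = (max(values) - min(values)) < flatline_eps if values else True
    # Each annotation (start, end) must be ordered and lie inside the recording.
    start, end = (timestamps[0], timestamps[-1]) if timestamps else (0.0, 0.0)
    bad = [
        i for i, (a0, a1) in enumerate(annotations)
        if a0 < start or a1 > end or a0 >= a1
    ]
    return QCResult(gaps=gaps, flatline=flatline, annotations_out_of_range=bad)
```

A real engine would likely work on multi-axis data and use more careful signal processing; this sketch only shows the shape of rule-based validation.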

Analysis & Troubleshooting

- Perform ad-hoc analysis on 1,200+ workout tasks to classify failure modes
- Identify whether issues stem from pipeline bugs, sensor problems, or annotation errors
- Prioritize engineering work based on data quality impact and coordinate with annotation team on fixes

Tooling & Visualization

- Maintain and extend our Next.js UI serving annotators, data scientists, and stakeholders
- Create visualizations (Chart.js) for QC metrics and signal analysis
- Integrate with the Label Studio annotation interface

What You Bring

Required

- Strong Python programming skills, particularly for data processing pipelines
- Experience with time-series data and digital signal processing
- Comfort working in Linux environments and deploying and monitoring remote services
- Ability to debug complex multi-component systems (sensors, video, networks, sync)
- Data quality mindset: designing validation rules, tracking metrics, investigating anomalies
- SQL/database experience for managing pipeline metadata

Highly Valued

- Video processing experience (RTSP streams, encoding, OCR)
- Working with sensor/IoT data and handling connectivity challenges
- Next.js or other modern web frameworks for data tooling
- DevOps practices: containerization, monitoring, logging, alerting
- Experience with annotation pipelines and ML training data workflows
- Background in biomechanics, sports science, or wearable sensors

Tech Stack

- Languages: Python (primary), JavaScript/TypeScript (Next.js UI)
- Data: IMU sensor streams, video (RTSP), time-series analysis, DSP
- Tools: Label Studio, Chart.js, Linux/bash, OCR libraries
- Infrastructure: Remote deployment, monitoring systems

You'll Thrive Here If You

- Enjoy detective work: diagnosing why data doesn't match expectations
- Balance pragmatism with quality: shipping improvements while maintaining reliability
- Communicate well with both technical and non-technical stakeholders
- Work autonomously in a small, mission-driven team

