ML Data Pipeline Engineer

1 day ago


Índio do Brasil Prosigliere Full-time

We're seeking a Data Pipeline Engineer to own and evolve our exercise recognition training data infrastructure. You'll manage the end-to-end pipeline that collects, synchronizes, validates, and prepares IMU sensor and video data for ML model training.

This role combines systems engineering, data quality automation, and hands-on problem-solving in a production environment.

What You'll Do

Pipeline Operations & Improvement

- Maintain and enhance our multi-source data collection system: IMU sensors (via mobile app) and synchronized video streams from gym-based cameras.
- Improve video capture software robustness, particularly handling network interruptions and operational monitoring.
- Deploy and monitor services in remote Linux environments with appropriate DevOps practices.
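
As an illustration of the kind of robustness work involved in handling network interruptions, here is a minimal sketch of reconnecting with capped exponential backoff. It is not taken from our codebase: `connect` is a hypothetical callable that opens a stream and raises `ConnectionError` on failure, and the delay values are assumed defaults.

```python
import time
from typing import Callable, Iterator, Optional


def backoff_delays(base: float = 1.0, factor: float = 2.0, cap: float = 60.0) -> Iterator[float]:
    """Yield an endless sequence of retry delays: base, base*factor, ... capped at `cap`."""
    delay = base
    while True:
        yield min(delay, cap)
        delay = min(delay * factor, cap)


def connect_with_retry(
    connect: Callable[[], object],
    max_attempts: int = 5,
    sleep: Callable[[float], None] = time.sleep,
) -> object:
    """Call `connect` until it succeeds or `max_attempts` is exhausted.

    `connect` is a hypothetical callable standing in for whatever opens the
    camera stream; it is assumed to raise ConnectionError on failure.
    """
    delays = backoff_delays()
    last_error: Optional[Exception] = None
    for attempt in range(1, max_attempts + 1):
        try:
            return connect()
        except ConnectionError as exc:
            last_error = exc
            # Wait before the next attempt, but not after the final one.
            if attempt < max_attempts:
                sleep(next(delays))
    raise RuntimeError(f"gave up after {max_attempts} attempts") from last_error
```

Passing `sleep` as a parameter keeps the retry loop testable without real waiting; a production version would likely also log each failure for monitoring.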

Data Quality & Validation

- Evolve our Python-based QC engine that validates data pre- and post-annotation.
- Implement checks for IMU-video time synchronization, sensor health, and measurement consistency.
- Apply digital signal processing techniques to identify sensor failures, connectivity issues, and measurement irregularities.
- Develop validation logic comparing annotations against sensor data to ensure temporal alignment.
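
To make the checks above concrete, the following is a rough sketch of two of them, using only the Python standard library. It is an assumption-laden example and not our QC engine: `find_gaps` flags IMU dropouts from timestamp spacing, and `best_lag` estimates the offset between two equally sampled signals (for example, accelerometer magnitude and a motion signal derived from video) by cross-correlation. The function names, the 1.5x tolerance, and the choice of signals are all hypothetical.

```python
from typing import List, Sequence, Tuple


def find_gaps(
    timestamps: Sequence[float], rate_hz: float, tolerance: float = 1.5
) -> List[Tuple[float, float]]:
    """Return (start, end) pairs where consecutive samples are further apart
    than `tolerance` times the nominal sample period."""
    limit = tolerance / rate_hz
    return [
        (prev, curr)
        for prev, curr in zip(timestamps, timestamps[1:])
        if curr - prev > limit
    ]


def best_lag(reference: Sequence[float], signal: Sequence[float], max_lag: int) -> int:
    """Return the lag k (in samples) that maximises sum(ref[i] * sig[i + k])
    after mean-centering. Positive k means `signal` trails `reference`."""
    ref_mean = sum(reference) / len(reference)
    sig_mean = sum(signal) / len(signal)
    ref = [x - ref_mean for x in reference]
    sig = [x - sig_mean for x in signal]

    def score(k: int) -> float:
        # Only sum over indices where both signals have a sample.
        return sum(
            ref[i] * sig[i + k]
            for i in range(len(ref))
            if 0 <= i + k < len(sig)
        )

    return max(range(-max_lag, max_lag + 1), key=score)
```

Dividing the returned lag by the sample rate gives the offset in seconds, which can then be compared against whatever synchronization tolerance the pipeline allows.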

Analysis & Troubleshooting

- Perform ad-hoc analysis on 1,200+ workout tasks to classify failure modes
- Identify whether issues stem from pipeline bugs, sensor problems, or annotation errors
- Prioritize engineering work based on data quality impact and coordinate with annotation team on fixes
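
One simple way to approach this triage is a rule-based mapping from QC flags to a likely failure source, followed by a count per source to guide prioritization. The sketch below is purely illustrative: the flag names, the three source categories, and the priority order are assumptions made for the example, not our actual schema.

```python
from collections import Counter
from typing import Dict, Iterable, List, Set

# Hypothetical QC flag names, grouped by the failure source they point to.
FLAG_SOURCES: Dict[str, str] = {
    "upload_failed": "pipeline",
    "sync_offset_high": "pipeline",
    "imu_gap": "sensor",
    "imu_flatline": "sensor",
    "label_outside_recording": "annotation",
}

# Order in which sources are checked when a task carries several flags.
PRIORITY: List[str] = ["pipeline", "sensor", "annotation"]


def classify(flags: Set[str]) -> str:
    """Map a task's QC flags to a single failure source, or "ok" if none."""
    sources = {FLAG_SOURCES.get(flag, "unknown") for flag in flags}
    if not sources:
        return "ok"
    for source in PRIORITY:
        if source in sources:
            return source
    return "unknown"


def summarize(tasks: Iterable[Set[str]]) -> Counter:
    """Count how many tasks fall under each failure source."""
    return Counter(classify(flags) for flags in tasks)
```

Calling `summarize(...).most_common()` on the flagged tasks gives a ranked view of where most failures come from, which is one input for deciding what to fix first.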

Tooling & Visualization

- Maintain and extend our NextJS UI serving annotators, data scientists, and stakeholders
- Create visualizations (Chart.js) for QC metrics and signal analysis
- Integrate with LabelStudio annotation interface

What You Bring

Required

- Strong Python programming skills, particularly for data processing pipelines
- Experience with time-series data and digital signal processing
- Comfortable working in Linux environments and deploying/monitoring remote services
- Ability to debug complex multi-component systems (sensors, video, networks, sync)
- Data quality mindset: designing validation rules, tracking metrics, investigating anomalies
- SQL/database experience for managing pipeline metadata

Highly Valued

- Video processing experience (RTSP streams, encoding, OCR)
- Working with sensor/IoT data and handling connectivity challenges
- NextJS or modern web frameworks for data tooling
- DevOps practices: containerization, monitoring, logging, alerting
- Experience with annotation pipelines and ML training data workflows
- Background in biomechanics, sports science, or wearable sensors

Tech Stack

- Languages: Python (primary), JavaScript/TypeScript (NextJS UI)
- Data: IMU sensor streams, video (RTSP), time-series analysis, DSP
- Tools: LabelStudio, Chart.js, Linux/bash, OCR libraries
- Infrastructure: Remote deployment, monitoring systems

You'll Thrive Here If You

- Enjoy detective work: diagnosing why data doesn't match expectations
- Balance pragmatism with quality: shipping improvements while maintaining reliability
- Communicate well across technical and non-technical stakeholders
- Can work autonomously in a small, mission-driven team



