AI Agent Evaluation Analyst
3 days ago
At Mindrift, innovation meets opportunity. We believe in using the power of collective human intelligence to ethically shape the future of AI.
The Mindrift platform, launched and powered by Toloka, connects domain experts with cutting-edge AI projects from innovative tech clients. Our mission is to unlock the potential of GenAI by tapping into real-world expertise from across the globe.
Who we're looking for:
We're looking for curious and intellectually proactive contributors, the kind of person who double-checks assumptions and plays devil's advocate.
Are you comfortable with ambiguity and complexity? Does an async, remote, flexible opportunity sound exciting? Would you like to learn how modern AI systems are tested and evaluated?
This is a flexible, project-based opportunity well-suited for:
- Analysts, researchers, or consultants with strong critical thinking skills.
- Students (senior undergrads / grad students) looking for an intellectually interesting gig.
- People open to a part-time and non-permanent opportunity.
About the project:
We're looking for QA reviewers for autonomous AI agents on a new project focused on validating and improving complex task structures, policy logic, and agent evaluation frameworks. Throughout the project, you'll balance quality assurance, research, and logical problem-solving. This opportunity is ideal for people who enjoy looking at systems holistically and thinking through scenarios, implications, and edge cases.
You do not need a coding background, but you must be curious, intellectually rigorous, and capable of evaluating the soundness and consistency of complex setups. If you've ever excelled in things like consulting, CHGK, Olympiads, case solving, or systems thinking, you might be a great fit.
What you'll be doing:
- Reviewing evaluation tasks and scenarios for logic, completeness, and realism.
- Identifying inconsistencies, missing assumptions, or unclear decision points.
- Helping define clear expected behaviors (gold standards) for AI agents.
- Annotating cause-effect relationships, reasoning paths, and plausible alternatives.
- Thinking through complex systems and policies as a human would to ensure agents are tested properly.
- Working closely with QA, writers, or developers to suggest refinements or edge case coverage.
How to get started:
Apply to this post, qualify, and get the chance to contribute to a project aligned with your skills, on your own schedule. Shape the future of AI while building tools that benefit everyone.
Requirements:
- Excellent analytical thinking: Can reason about complex systems, scenarios, and logical implications.
- Strong attention to detail: Can spot contradictions, ambiguities, and vague requirements.
- Familiarity with structured data formats: Can read (not necessarily write) JSON/YAML.
- Can assess scenarios holistically: What's missing, what's unrealistic, what might break?
- Good communication and clear writing (in English) to document your findings.
We also value applicants who have:
- Experience with policy evaluation, logic puzzles, case studies, or structured scenario design.
- Background in consulting, academia, olympiads (e.g. logic/math/informatics), or research.
- Exposure to LLMs, prompt engineering, or AI-generated content.
- Familiarity with QA or test-case thinking (edge cases, failure modes, "what could go wrong").
- Some understanding of how scoring or evaluation works in agent testing (precision, coverage, etc.).
What we offer:
- Get paid for your expertise, with rates that can go up to $15/hour depending on your skills, experience, and project needs.
- Take part in a flexible, remote, freelance project that fits around your primary professional or academic commitments.
- Participate in an advanced AI project and gain valuable experience to enhance your portfolio.
- Influence how future AI models understand and communicate in your field of expertise.
-
Full-Stack AI Agent Engineer
4 weeks ago
Rio de Janeiro, Rio de Janeiro, Brazil | buscojobs Brasil | Full-time
Overview: Early-stage startup (ex-Google, Stanford) seeks an extremely talented full-time contractor as a full-stack AI Agent engineer to help pioneer Enterprise AI for Research and Workflows. Responsibilities / Qualifications: 2+ years experience designing and building large, complex (yet maintainable) AI apps and shipping production-quality features in...
-
Senior AI/ML Engineer, Conversational AI
4 days ago
Rio de Janeiro, Rio de Janeiro, Brazil | Motorola Solutions | Full-time | US$125,000 - US$175,000 per year
Company Overview: At Motorola Solutions, we believe that everything starts with our people. We're a global close-knit community, united by the relentless pursuit to help keep people safer everywhere. Our critical communications, video security and command center technologies support public safety agencies and enterprises alike, enabling the coordination that's...
-
Senior AI/ML Engineer, Conversational AI
4 weeks ago
Rio de Janeiro, Rio de Janeiro, Brazil | Motorola Solutions | Full-time
Company Overview: At Motorola Solutions, we believe that everything starts with our people. We're a global close-knit community, united by the relentless pursuit to help keep people safer everywhere. Our critical communications, video security...
-
Senior Full Stack Engineer, AI Agents
3 weeks ago
Rio de Janeiro, Rio de Janeiro, Brazil | Eversynced | Full-time
Role Overview: Senior Full Stack Developer role, remote (long-term assignment) for Brazil-based candidates. You will join the engineering team of one of our North American clients to elevate customer support operations through AI and agentic applications, and optimize intelligent systems that enhance customer experience across all support channels. As an...
-
Talent Technical Assessment Analyst
4 weeks ago
Rio de Janeiro, Rio de Janeiro, Brazil | BairesDev | Full-time
Talent Technical Assessment Analyst - Remote Work: At BairesDev, we've been leading the way in technology projects for over 15 years. We deliver cutting-edge solutions to giants like Google and the most innovative startups in Silicon Valley. Our diverse 4,000+ team, composed of the world's Top 1% of tech talent, works remotely on roles that drive significant...
-
Senior AI Engineer
3 weeks ago
Rio de Janeiro, Rio de Janeiro, Brazil | Velozient | Full-time
We are seeking a remote, full-time Senior AI Engineer with 5+ years of software and AI/ML engineering experience. Candidates must have a strong background in Python and expertise in Golang, Node.js, or Java, with a strong desire to adopt Golang as the primary back-end technology. In this position, you will be at the heart of the client's product...
-
Senior AI/ML Full Stack Engineer
3 weeks ago
Rio de Janeiro, Rio de Janeiro, Brazil | FullStack Labs | Full-time
Overview: Senior AI/ML Full Stack Engineer - Remote - Latin America. About FullStack: FullStack is the most transparent IT talent network, connecting highly...
-
Lead AI Engineer
3 weeks ago
Rio de Janeiro, Rio de Janeiro, Brazil | SupportYourApp | Full-time
Overview: Our team is expanding. We are in need of a seasoned, self-driven Senior AI Engineer to join us part-time. If you are an expert in taking ownership and solving complex problems with creativity and grit, you'll fit right in. Together we will: Build cutting-edge AI-powered solutions that transform data into real value; Build the backbone systems that...
-
Freelance AI Red Team Engineer
15 hours ago
Rio de Janeiro, Rio de Janeiro, Brazil | Mindrift | Full-time | R$30,000 - R$60,000 per year
This opportunity is only for candidates currently residing in the specified country. Your location may affect eligibility and rates. Please submit your resume in English and indicate your level of English proficiency. At Mindrift, innovation meets opportunity. We believe in using the power of collective intelligence to ethically shape the future of AI. What...
-
Freelance AI Red Team Engineer
14 hours ago
Rio de Janeiro, Rio de Janeiro, Brazil | Mindrift | Full-time | R$30,000 - R$60,000 per year
This opportunity is only for candidates currently residing in the specified country. Your location may affect eligibility and rates. Please submit your resume in English and indicate your level of English proficiency. At Mindrift, innovation meets opportunity. We believe in using the power of collective intelligence to ethically shape the future of AI. What...