Multimodal AI Evaluator
3 days ago
Job Description
iMerit seeks highly detail-oriented and analytically minded professionals to perform nuanced evaluations of AI system outputs across various modalities. Analysts will assess the accuracy, quality, clarity, and cultural alignment of model outputs against complex guidelines.

Role Responsibilities:
Evaluate outputs generated by Large Language Models (LLMs) across multiple modalities.
Assess quality against project-specific criteria such as correctness, coherence, completeness, style, cultural appropriateness, and safety.
Identify subtle errors, hallucinations, or biases in AI responses.
Apply domain expertise and logical reasoning to resolve ambiguous or unclear outputs.
Provide detailed written feedback, tagging, and scoring of outputs to ensure consistency across the evaluation team.
Collaborate with Project Managers and Quality Leads to meet accuracy, reliability, and turnaround benchmarks.

Required Skills & Qualifications:
Strong critical reading, observational, and evaluative skills across different modalities.
Ability to articulate nuanced judgments with precision and clarity.
Excellent English comprehension; additional languages a plus.
Familiarity with LLMs, generative AI, and multimodal systems.
Strong attention to detail and ability to apply guidelines consistently.
Awareness of cultural and linguistic nuances, including potential bias and harm in AI outputs.
Comfort with evolving workflows, rapid feedback cycles, and complex quality frameworks.

Benefits:
Opportunities to shape the evaluation standards for next-generation multimodal AI systems.
Innovative and supportive global working environment.
Competitive compensation and flexible remote working arrangements.
Continuous learning and growth in applied AI evaluation.

What We Offer:
iMerit is an innovative company that empowers professionals to evaluate AI systems accurately and safely. Our collaborative environment fosters personal and professional growth, while our competitive compensation packages ensure a secure work-life balance.

The ideal candidate should be comfortable working in environments where exposure to potentially sensitive content may occur due to imperfections in client-provided datasets.
-
Medical Transcription Evaluators
4 days ago
Fortaleza, Brazil DeepLLMData Full-time
Role: Medical Transcription Evaluators
Location: Remote
Language: Portuguese-Brazilian
Job Type: Contingent (Hourly/Project-Based)
About the Role: The objective of this project is to have human evaluators listen to short audio clips of AI-generated speech (~30 seconds each) and correct the corresponding transcriptions. This ensures that the final transcript...
-
AI Red Team Engineer
2 days ago
Fortaleza, Brazil LILT Full-time
About LILT
AI is changing how the world communicates, and LILT is leading that transformation. We're on a mission to make the world's information accessible to everyone, regardless of the language they speak. We use cutting-edge AI, machine translation, and human-in-the-loop expertise to translate content faster, more accurately, and more...