Multimodal AI Evaluator
2 days ago
Job Description
Evaluate the performance of Large Language Models (LLMs) across multiple modalities including text, image captions, video descriptions, and multimodal interactions.
Assess output quality against project-specific criteria such as correctness, coherence, completeness, style, cultural appropriateness, and safety.
Evaluate LLM outputs for subtle errors, hallucinations, or biases.
Apply domain expertise to resolve ambiguous or unclear outputs.
Collaborate with Project Managers and Quality Leads to meet accuracy, reliability, and turnaround benchmarks.

Required Skills and Qualifications
Critical Evaluation Skills: Strong critical reading, observational, and evaluative skills across different modalities.
Ability to articulate nuanced judgments with precision and clarity.
Excellent English comprehension (CEFR B2 or above); additional languages a plus.
Familiarity with LLMs, generative AI, and multimodal systems.
Awareness of cultural and linguistic nuances, including potential bias and harm in AI outputs.

Benefits
We offer an opportunity to work on complex quality frameworks, evolve workflows, and contribute to refining evaluation guidelines.

Others
This role is ideal for individuals who thrive in dynamic environments and can adapt to rapid feedback cycles.
-
Machine Learning Engineer
6 hours ago
Viamão, Brazil | RealQuant | Full-time
About RealQuant
RealQuant is building the first vertical AI platform for commercial real estate, turning OMs, rent rolls, and financials into structured insights that power underwriting, reporting, and portfolio intelligence. We’re redefining how deals move from OM → LOI in under 30 minutes by combining institutional real estate expertise...