
RAG Architect
6 days ago
We are seeking a Senior AI Engineer to lead the design and implementation of an end-to-end Retrieval-Augmented Generation (RAG) architecture. This role will drive the ingestion of GitHub repositories, Confluence pages, Qtest artifacts, PRDs, and script libraries to power autoscripting, onboarding search, and long-term knowledge reuse. As a technical leader, you will set the strategic direction, select cutting-edge models, and mentor AI and Automation Agent Engineers to deliver a scalable, secure, and innovative platform.
Key Responsibilities
- Architect ingestion and retrieval layers, selecting loaders, chunking strategies (AST-aware for Java), embeddings (e.g., BGE-Code, mxbai), vector stores (e.g., Chroma), cross-encoder rerankers, and LangChain router chains.
- Design CI orchestrations, including daily Jenkins jobs for delta detection, image captioning (e.g., Qwen2-VL, LLaVA), cost/latency guardrails, and rollback strategies.
- Establish model and prompt governance, including prompt templates, few-shot libraries, safety filters, and evaluation rubrics (faithfulness, coverage, compile success).
- Lead architecture for a UI onboarding tool, deciding on hosting (FastAPI + React or Streamlit MVP), SSO/auth flows, token streaming, and feedback mechanisms for continuous learning.
- Oversee data security and compliance: embed privacy policies, source citations, and audit logs, and ensure Confluence/Qtest credentials are managed in Secrets Manager.
- Provide technical leadership by reviewing PRs, setting code quality standards, and conducting architecture workshops for AI and Automation Agent Engineers.
Qualifications
- 6–8 years of experience building data or ML platforms, with at least 2 years deploying LLM/RAG systems in production.
- Deep expertise in LangChain, vector stores (ChromaDB, Qdrant, or pgvector), and cross-encoder rerankers.
- Strong proficiency in Python (FastAPI or Flask) and the ability to analyze Java codebases for chunking boundaries.
- Proven experience designing CI/CD pipelines (Jenkins, GitHub Actions) with delta builds and artifact promotion.
- Hands-on experience managing OpenAI/Anthropic API keys or self-hosting large models.
- Demonstrated expertise in security and compliance, including PII protection, role-based access, and secret rotation.
-
RAG Architect
4 weeks ago
Guarulhos, São Paulo, Brazil
Totalperform
Full-time
We are seeking a Senior AI Engineer to lead the design and implementation of an end-to-end Retrieval-Augmented Generation (RAG) architecture. This role will drive the ingestion of GitHub repositories, Confluence pages, Qtest artifacts, PRDs, and script libraries to power autoscripting, onboarding search, and long-term knowledge reuse. As a technical leader,...
-
AI Architecture Leader
6 days ago
Guarulhos, São Paulo, Brazil
beBeeLeadership
Full-time
We are seeking a senior AI leader to drive the design and implementation of an end-to-end Retrieval-Augmented Generation (RAG) architecture. This role will oversee the ingestion of diverse data sources, including GitHub repositories, Confluence pages, and Qtest artifacts, to power autoscripting, onboarding search, and long-term knowledge reuse. Key...