Systems, benchmarks, and experiments in retrieval, agent memory, and evaluation.
Chrome DevTools-style debugging for AI agents: trace viewer, attribution overlay, diff mode, critical path analysis, portable repro bundles.
Retrieval evaluation + tuning workflow: ingest, index variants, eval suite, recommendations. Built for repeatability.
Hybrid retrieval over literature using Qdrant + Neo4j. Focus: grounding, provenance, and query-time tradeoffs.
Open-source catalog of ETL snippets. 15+ data types and embedding models, searchable, ready to copy into any pipeline.
Benchmark suite for sales agents: 70 tasks, scored across 7 categories. Compares LLM-only vs RAG vs tool-using agents.
Multimodal retrieval + personalization demo. Vibe-based search over a design library, with generation powered by Gemini.