TD

Work

Systems, benchmarks, and experiments in retrieval, agent memory, and evaluation.

Systems

Observability
Agentglass (Switchyard)

Chrome DevTools for AI agents: trace viewer, attribution overlay, diff mode, critical-path analysis, portable repro bundles.

Eval + Tuning
MaxQ

Retrieval evaluation + tuning workflow: ingest, index variants, eval suite, recommendations. Built for repeatability.

Hybrid Retrieval
Biomedical GraphRAG

Hybrid retrieval over biomedical literature using Qdrant + Neo4j. Focus: grounding, provenance, and query-time tradeoffs.

Data Ingestion
Qdrant ETL Cookbook

Open-source, searchable catalog of ETL snippets covering 15+ data types and embedding models, ready to copy into any pipeline.

Benchmarks

Agent Benchmark
GTM Arena

Benchmark suite for sales agents: 70 tasks, scored across 7 categories. Compares LLM-only, RAG, and tool-using agents.

Experiments

Multimodal Search
Atelier

Multimodal retrieval + personalization demo. Vibe-based search over a design library, with Gemini-powered generation.