I build serious AI/ML, agent, and developer-tooling systems with a bias toward products that are technically credible and commercially legible.
- LLM / agent evaluation and benchmarking
- Structured output systems and developer tooling
- Applied AI products with strong evidence, usability, and clear positioning
A structured output portability compiler, CI guardrail, and API for OpenAI, Gemini, Anthropic, and Ollama.
A benchmarking and evaluation effort focused on higher-signal measurement of modern model and agent behavior.
A runtime evidence layer for AI agents in enterprise workflows.
Context optimization tooling for coding workflows.
Business development, analytics, and startup building; now focused on a sharp AI/ML and research-engineering portfolio.