Projects

Evidence-Driven Deep Research Agent

Research agent with an explicit evidence state, step-level reward signals, and a guided planner that uses those signals to decide what to search next. It combines an adversarial decomposer that generates counter-evidence sub-questions, a multi-stage retrieval pipeline with claim extraction and conflict detection, and a principled stopping criterion. In an ablation across two queries, the agent reached full sub-question coverage where the baseline reached only partial coverage, within the same compute envelope.

Agentic AI, Process Rewards, Python, Tavily

GitHub Blog Post Slides

LabPilot — AI Copilot for R&D Experiment Optimization

Decision-support loop for R&D labs: a surrogate model trained on historical experiment data recommends the next best experiment, with uncertainty quantification and adaptive bandit policies (UCB, LinUCB, greedy). An LLM reasoning layer explains each recommendation; a literature search layer adds source-backed justification. Supports a full session loop: submit an observed result, and the model adapts and recommends the next step.
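As a rough illustration of the recommend-observe-adapt loop, here is a minimal sketch of a UCB policy over a fixed set of candidate experiments. It is not LabPilot's implementation: it uses plain UCB1 with running means instead of a surrogate model, and the class name `UCB1`, the exploration constant `c`, and the simulated `true_yield` values are all assumptions made up for the example.

```python
import math


class UCB1:
    """UCB1 over a fixed set of candidate experiments (hypothetical sketch)."""

    def __init__(self, n_arms: int, c: float = 2.0):
        self.counts = [0] * n_arms   # how often each candidate was run
        self.means = [0.0] * n_arms  # running mean of observed results
        self.c = c                   # exploration strength (assumed value)

    def recommend(self) -> int:
        # Run every candidate once before trusting the confidence bounds.
        for arm, n in enumerate(self.counts):
            if n == 0:
                return arm
        total = sum(self.counts)
        # Score = observed mean + bonus that shrinks as a candidate is retried.
        scores = [
            mean + math.sqrt(self.c * math.log(total) / n)
            for mean, n in zip(self.means, self.counts)
        ]
        return max(range(len(scores)), key=scores.__getitem__)

    def update(self, arm: int, reward: float) -> None:
        # Incremental mean update with the newly observed result.
        self.counts[arm] += 1
        self.means[arm] += (reward - self.means[arm]) / self.counts[arm]


# Simulated session: made-up, noise-free yields for three candidates.
true_yield = [0.2, 0.5, 0.9]
policy = UCB1(n_arms=len(true_yield))
first_three = []
for step in range(200):
    arm = policy.recommend()
    if step < 3:
        first_three.append(arm)
    policy.update(arm, true_yield[arm])
```

Each pass through the loop mirrors one session step: the policy recommends a candidate, the observed result is submitted, and the estimate for that candidate is updated before the next recommendation.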

Bandit Policies, Surrogate Modeling, FastAPI, React, Nebius LLM

GitHub Demo Video

Emergency Guidance Agent — CPR Copilot

Voice-and-video CPR coaching assistant using browser camera and microphone, built on Google Gemini Live. The core design decision: a strict finite state machine controls step transitions across six states (intake → escalation → see_patient → start_compressions → continue_cpr → complete). The model provides real-time guidance within each state; the application enforces the safety sequence. The model cannot advance or skip steps.
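The split between model guidance and application-enforced sequencing can be sketched as a small state machine. This is an illustrative sketch only, written in Python although the project lists TypeScript; the six state names come from the description above, while `CprSession`, `advance`, and the rule that a transition request must name the immediate next state are assumptions.

```python
from enum import Enum


class CprState(Enum):
    INTAKE = "intake"
    ESCALATION = "escalation"
    SEE_PATIENT = "see_patient"
    START_COMPRESSIONS = "start_compressions"
    CONTINUE_CPR = "continue_cpr"
    COMPLETE = "complete"


# Enum members iterate in definition order, so each state's only
# permitted successor is the one that follows it in the sequence.
_ORDER = list(CprState)
NEXT_STATE = {a: b for a, b in zip(_ORDER, _ORDER[1:])}


class CprSession:
    """Application-side owner of the current step (hypothetical sketch)."""

    def __init__(self) -> None:
        self.state = CprState.INTAKE

    def advance(self, target: CprState) -> CprState:
        # Reject any request that is not the single allowed next step,
        # so steps can never be skipped or revisited out of order.
        allowed = NEXT_STATE.get(self.state)
        if target is not allowed:
            raise ValueError(
                f"illegal transition {self.state.value} -> {target.value}"
            )
        self.state = target
        return self.state


session = CprSession()
try:
    session.advance(CprState.START_COMPRESSIONS)  # attempt to skip ahead
    skipped = True
except ValueError:
    skipped = False

# Walking the sequence one step at a time is accepted.
for step in _ORDER[1:]:
    session.advance(step)
```

In a design like this the model never holds a reference to `advance`; only application code calls it, which keeps the safety sequence outside the model's control.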

Multimodal, Gemini Live, FSM Safety Design, TypeScript, Pipecat

GitHub Demo Video

Earlier Work