Digital twins
Bringing agents into digital-twin software, rebuilding the hybrid analytics framework from scratch, and building ML models that accelerate simulation and modeling workflows.
I work on agentic AI, digital twins, Large Language Models (LLMs), and model evaluation. In senior R&D at Synopsys, I bring agentic capability into digital-twin software, build Machine Learning (ML) models, and rebuild the hybrid analytics framework for faster simulation and modeling. Off hours, I'm shipping SimPilot, a tool-using multi-agent platform with typed evidence gates, memory, sandboxed execution, and audit traces. My PhD is in Computational Modeling & Simulation, focused on Reinforcement Learning (RL), JAX/GPU training systems, transformers, and model evaluation.
Describe an engineering simulation in plain English. SimPilot's tool-using LLM agents plan the work, run it on sandboxed compute, check their outputs with typed validators, and hand back a report you can audit. Long-horizon agent work, automated end to end.
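To make "typed validators" concrete, here is a minimal sketch of what such a check could look like. It is not SimPilot's actual code: the names `Evidence`, `Verdict`, `finite_check`, `range_check`, and `run_gate` are hypothetical, and the temperature bounds are made up for illustration. The idea is only that a tool output is wrapped in a typed record and must pass every check before it counts as a result.

```python
from __future__ import annotations

import math
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Evidence:
    """A single typed quantity produced by a tool run (hypothetical shape)."""
    name: str
    value: float
    units: str


@dataclass(frozen=True)
class Verdict:
    """Outcome of one check against one piece of evidence."""
    check: str
    passed: bool
    detail: str


Check = Callable[[Evidence], Verdict]


def finite_check(ev: Evidence) -> Verdict:
    # Reject NaN and infinities, which usually signal a diverged solve.
    ok = math.isfinite(ev.value)
    return Verdict("finite", ok, f"{ev.name}={ev.value} {ev.units}")


def range_check(lo: float, hi: float) -> Check:
    # Build a check that the value lies inside a physically plausible band.
    def check(ev: Evidence) -> Verdict:
        ok = math.isfinite(ev.value) and lo <= ev.value <= hi
        return Verdict(f"range[{lo}, {hi}]", ok, f"{ev.name}={ev.value} {ev.units}")
    return check


def run_gate(ev: Evidence, checks: list[Check]) -> tuple[bool, list[Verdict]]:
    # The gate passes only if every check passes; all verdicts are kept
    # so a failure can be audited later.
    verdicts = [check(ev) for check in checks]
    return all(v.passed for v in verdicts), verdicts
```

For example, `run_gate(Evidence("max_temperature", 351.2, "K"), [finite_check, range_check(250.0, 400.0)])` passes, while the same call with a NaN value fails both checks and returns the verdicts that explain why.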
GPU-parallel training and evaluation harness for RL policies and deep sequence models. A week-long sweep now finishes overnight. Built for fast iteration, metrics, and reproducible runs.
Fine-tuned T5/BERT QA models and built task-specific evals to track accuracy, failure modes, and data quality. The best run reached >80% accuracy on the target QA task.
SimPilot turns tool outputs into pass/fail artifacts, run provenance, and debugging traces. It is my practical version of agent evals: evidence first, self-report second.
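One way to read "evidence first, self-report second" is sketched below. This is an illustration under my own assumptions, not SimPilot's schema: `Artifact`, `digest`, `record`, and `trace_line` are hypothetical names. The pass/fail verdict comes only from a validator run on the tool output; the agent's own claim is stored next to it for comparison but never decides the outcome. Hashes of inputs and outputs give each run a simple provenance fingerprint.

```python
from __future__ import annotations

import hashlib
import json
from dataclasses import asdict, dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class Artifact:
    """One audited tool call (hypothetical record layout)."""
    tool: str
    inputs_sha256: str
    output_sha256: str
    passed: bool                 # decided by the validator, not by the agent
    agent_claimed_success: bool  # kept only so disagreements are visible


def digest(obj: Any) -> str:
    # Canonical JSON (sorted keys) so equal content always hashes the same.
    payload = json.dumps(obj, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def record(
    tool: str,
    inputs: dict,
    output: dict,
    validator: Callable[[dict], bool],
    agent_claimed_success: bool,
) -> Artifact:
    # The verdict is computed from the output itself.
    return Artifact(
        tool=tool,
        inputs_sha256=digest(inputs),
        output_sha256=digest(output),
        passed=bool(validator(output)),
        agent_claimed_success=agent_claimed_success,
    )


def trace_line(artifact: Artifact) -> str:
    # One JSON line per artifact, suitable for an append-only audit log.
    return json.dumps(asdict(artifact), sort_keys=True)
```

If an agent reports success on a solve whose residual is `1e-2` and the validator requires a residual below `1e-6`, the artifact records `passed=False` alongside `agent_claimed_success=True`, which is exactly the kind of disagreement a debugging trace should surface.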
Most agent demos work because the tasks are short, the tools forgive everything, and a wrong answer is cheap. I wanted to see what happens at the other extreme.
Engineering simulation is the cleanest test case I know. The ground truth exists, and you can't talk your way around a bad result.
Other things I keep coming back to: dynamical systems and chaos, how to evaluate agents honestly, post-training data quality, the gap between research papers and shipped systems, and why AI tooling is still rougher than it needs to be. I also have a soft spot for movies that sit with ambiguity: quiet character studies, strange sci-fi, and anything that makes the ordinary feel slightly unreal.