Sitemap
A list of all the posts and pages found on the site. For you robots out there, there is an XML version available for digesting as well.
Pages
Posts
Kolmogorov Complexity, Solomonoff Induction, and the Philosophical Limits of Aligned AGI
Published:
Begin with the simplest possible question about intelligence: what does it mean to learn? Not to fit a curve, not to minimize a loss, but to genuinely induce the right explanation from evidence. This question has a precise mathematical answer, one that was worked out in the 1960s by Kolmogorov, Solomonoff, and Chaitin, and extended by Hutter in the 2000s into a formal theory of optimal rational agency. The answer is beautiful and the theory is complete. It is also, on close inspection, deeply troubling for the project of value alignment.
The Mathematics of Adversarial Robustness: Certified Defenses, Lipschitz Geometry, and the Limits of Perturbation Sets
Published:
In 2013 Szegedy et al. showed that a deep convolutional classifier trained on ImageNet (AlexNet in their experiments) could be fooled by adding imperceptibly small perturbations to its input images. The perturbations were invisible to human eyes, no larger than the noise in a compressed JPEG, but they caused confident, catastrophically wrong predictions. The model saw a school bus and called it an ostrich. A decade later, after thousands of papers on attacks and defenses, the phenomenon is still not fully understood. State-of-the-art models remain vulnerable in ways that defy intuitive explanation.
Singular Learning Theory and the Geometry of Neural Network Interpretability
Published:
Ask a mechanistic interpretability researcher why neural networks form the circuits they do and you will get an honest answer: we observe them, name them, and ablate them, but we lack a theory of why they emerge. This is not a complaint about the field; the empirical discoveries are real and important. It is a statement about what is missing. What we need is a mathematical account of why, given a data distribution and an architecture, gradient descent converges to representations with specific structural properties rather than others.
When RAG Fails: Building a GraphRAG System for Multi-Hop Reasoning
Published:
The question that broke our pipeline came from an oncologist: “Which drug was approved after the clinical trial that cited the 2018 KRAS resistance paper, and what is its mechanism?” Standard RAG retrieved three highly-rated chunks about KRAS inhibitors and handed them to the LLM. The LLM answered confidently and completely incorrectly.
Reducing Production Inference Latency by 10x: A Profiling Story
Published:
A model serving endpoint at Synthure had a p99 latency of 4.2 seconds. Physicians were waiting that long, four seconds, for coding recommendations during patient encounters. The product team had assumed LLM inference was the problem. We had discussed switching to a smaller model, accepting worse accuracy in exchange for speed. Before doing that, we profiled.
Implementing AlphaZero for Connect Four: MCTS + Neural Policy in C++ and Python
Published:
Around iteration 40 of training, something changed. The agent, which had been playing essentially random Connect Four with a mild center preference, started blocking threats it had no reason to know about. A human playing against it dropped a piece that created a diagonal three-in-a-row. The agent, on its next move, dropped a piece that blocked the winning extension. Not because it had been told about diagonals. Because 40 iterations of self-play had accumulated enough evidence that unblocked diagonals eventually lead to losses.
The Information Bottleneck: Deriving Optimal Representations From First Principles
Published:
Two models trained on the same data, same architecture, same hyperparameters, except one generalizes to new distributions and the other memorizes the training set. This was the puzzle I kept running into. Validation accuracy looked identical during training. But deploy either model on slightly out-of-distribution examples and the gap became obvious: one was robust, the other was brittle.
Implementing the Transformer in C++ Without ML Libraries: What You Learn From the Metal
Published:
There is a version of understanding a transformer where you can recite the equations and draw the architecture diagram. Then there is a deeper version where you know, concretely, what happens to a float when it enters the attention mechanism, which cache line it lives on, what instruction the CPU uses to multiply it, how many copies of it exist simultaneously in memory. I wanted the second kind of understanding. The only way to get it was to implement a transformer from nothing: no PyTorch, no NumPy, no BLAS wrappers.
The Model That Learned From the Future: A Temporal Leakage Postmortem
Published:
The validation dashboard said 99.7% precision. We had trained a fraud detection model for a healthcare claims processor, and by every metric it was performing remarkably well. The product team was excited. We were cautious, because 99.7% felt too good, but we couldn’t find the flaw, so we deployed.
Building a Bayesian A/B Testing System That Knows When to Stop
Published:
At Synthure, we ran A/B tests the way most startups do: flip a coin on traffic, wait two weeks, check if $p < 0.05$, ship or revert. This worked until we started testing features that affected claim approval rates, where each day of a bad variant cost real money and delayed patient reimbursements. We couldn’t afford to wait two weeks. We also couldn’t afford to stop early and be wrong.
Reproducing Double Descent: The Experiment That Broke Classical Learning Theory
Published:
Classical learning theory told us bias and variance form a unimodal tradeoff: increase model capacity and test error first falls, then rises as the model starts memorizing. Every textbook contains this curve. It was the theoretical foundation for why we regularize, why we use validation sets, and why we prefer smaller models when data is limited.
Why Adam Works: Understanding Every Major Optimizer Through the Loss Landscape
Published:
During my first serious training run at Synthure, I watched a model converge beautifully for 40 epochs, then diverge. Learning rate too high, I assumed. I halved it. It diverged again, faster. I halved it again. Now it converged but plateaued far above the target loss. After two days I realized what was actually wrong: I was using SGD with momentum on a loss landscape with wildly different curvatures along different parameter directions, and no single learning rate could handle both the shallow ravines and the steep walls simultaneously.
Portfolio
Mehra-Prescott with Rare Disasters (Equity Premium Puzzle)
Published:
Extends the Mehra-Prescott consumption-based asset pricing model with a mathematical rare disasters framework calibrated to U.S. data.
RobustSight: AI Safety & Alignment Framework
Published:
Comprehensive CV + LLM safety framework investigating adversarial robustness, interpretability, and human-guided alignment across frontier models.
Traffic Flow Optimization
Published:
Urban traffic management embedding graph theory, linear algebra, and calculus into C++ and JAX to optimize signal timing for city planners.
Volatility Alchemist: Quantitative Options Trading
Published:
End-to-end quant options trading system with real-time data integration, ML volatility models, interactive dashboards, and automated signal generation.
GameNGen: Diffusion World Model for Game Simulation
Published:
Trained a latent diffusion world model on 737K+ Super Mario Bros frames with VAE encoder, CNN reward model, and PPO agents.
Market Microstructure Forecasting with Deep RL
Published:
DRL agent predicting short-horizon price movements from order book snapshots and executing trades via event-driven backtesting.
NFL Veteran Transition: Causal Inference on Player Performance
Published:
Hierarchical mixed-effects models and ML ensembles to isolate the effect of team transitions from individual skill trajectories in NFL player data.
ACL Injury Risk Predictor
Published:
ML pipeline predicting ACL injury risk from biomechanical and performance data, combining sports science domain knowledge with gradient boosting and interpretability tools.
Publications
Multiple Myeloma Gene Signatures
Published in The Oncologist, 2024
Statistical analysis of mutation, pathway, and survival data to identify gene signatures associated with disease progression in multiple myeloma.
Recommended citation: Kannappan, A. et al. (2024). "Multiple Myeloma Gene Signatures." The Oncologist.
Agentic AI Systems for Clinical Reasoning
Published in Book Chapter, 2026
Frameworks for integrating clinical and social data through multi-agent LLM orchestration to improve decision-making and patient-centered care.
Recommended citation: Kannappan, A. et al. (2026). "Agentic AI Systems for Clinical Reasoning." Book Chapter.
