Sitemap

A list of all the posts and pages found on the site. For you robots out there, there is an XML version available for digesting as well.

Posts

Can You Trust a Probability That Was Never Checked Against Reality?

13 minute read

Published: June 16, 2026

Most risk dashboards report a number and ask you to trust it. A model says there is a 73 percent chance of an outbreak, and you have no way to know whether, across all the times it has said 73 percent, an outbreak actually followed 73 percent of the time, or 30 percent, or 95 percent. The number looks authoritative because it has a decimal point, but a probability that has never been checked against what actually happened is not really a probability. It is a feeling with a unit attached. I built MOSAIC in large part to take that problem seriously, and the discipline it forces (define a falsifiable quantity, then prove it is calibrated) turns out to matter far beyond epidemics.

How Much of a Prompt Can You Delete Before the Answer Breaks?

16 minute read

Published: June 15, 2026

Every token you send to a language model costs you twice. It costs money, because providers bill per token, and it costs latency, because attention is quadratic in sequence length, so doubling the context can quadruple the work the model does to read it. For a single short prompt none of this matters. For a production application that stuffs retrieved documents, conversation history, tool outputs, and system instructions into every call, it matters enormously, and it is the difference between an app that is cheap and fast and one that is neither.

Why Does a Valid Proof Tell You Nothing About Whether You Proved the Right Thing?

22 minute read

Published: June 01, 2026

When you prove something with a computer, the workflow has two halves that are easy to confuse. First you write down a statement, which formal-methods people call a specification, or spec for short. A spec is a precise description, in a language a machine can read, of what your code or your theorem is supposed to do. Then you write a proof that your work satisfies that spec, and you hand both to a proof checker like Lean or Axiom’s AXLE. The checker does something genuinely remarkable: it tells you, with certainty, whether the proof is valid. No hand-waving, no “looks right to me,” just a mechanical verdict that the proof establishes the statement.

Kolmogorov Complexity, Solomonoff Induction, and the Philosophical Limits of Aligned AGI

12 minute read

Published: May 01, 2026

Begin with the simplest possible question about intelligence: what does it mean to learn? Not to fit a curve, not to minimize a loss, but to genuinely induce the right explanation from evidence. This question has a precise mathematical answer, one that was worked out in the 1960s by Kolmogorov, Solomonoff, and Chaitin, and extended by Hutter in the 2000s into a formal theory of optimal rational agency. The answer is beautiful and the theory is complete. It is also, on close inspection, deeply troubling for the project of value alignment.

The Mathematics of Adversarial Robustness: Certified Defenses, Lipschitz Geometry, and the Limits of Perturbation Sets

13 minute read

Published: April 01, 2026

In 2013 Szegedy et al. showed that an AlexNet classifier, then state of the art on ImageNet, could be fooled by adding imperceptibly small perturbations to an input image. The perturbations were invisible to human eyes, no larger than the noise in a compressed JPEG, but they caused confident, catastrophically wrong predictions. The model saw a school bus and called it an ostrich. A decade later, after thousands of papers on attacks and defenses, the phenomenon is still not fully understood. State-of-the-art models remain vulnerable in ways that defy intuitive explanation.

Singular Learning Theory and the Geometry of Neural Network Interpretability

13 minute read

Published: March 01, 2026

Ask a mechanistic interpretability researcher why neural networks form the circuits they do and you will get an honest answer: we observe them, name them, and ablate them, but we lack a theory of why they emerge. This is not a complaint about the field; the empirical discoveries are real and important. It is a statement about what is missing. What we need is a mathematical account of why, given a data distribution and an architecture, gradient descent converges to representations with specific structural properties rather than others.

When RAG Fails: Building a GraphRAG System for Multi-Hop Reasoning

7 minute read

Published: January 01, 2026

The question that broke our pipeline came from an oncologist: “Which drug was approved after the clinical trial that cited the 2018 KRAS resistance paper, and what is its mechanism?” Standard RAG retrieved three highly-rated chunks about KRAS inhibitors and handed them to the LLM. The LLM answered confidently and completely incorrectly.

Reducing Production Inference Latency by 10x: A Profiling Story

6 minute read

Published: December 01, 2025

A model serving endpoint at Synthure had a p99 latency of 4.2 seconds. Physicians were waiting that long, four seconds, for coding recommendations during patient encounters. The product team had assumed LLM inference was the problem. We had discussed switching to a smaller model, accepting worse accuracy in exchange for speed. Before doing that, we profiled.

Implementing AlphaZero for Connect Four: MCTS + Neural Policy in C++ and Python

7 minute read

Published: November 01, 2025

Around iteration 40 of training, something changed. The agent, which had been playing essentially random Connect Four with a mild center preference, started blocking threats it had no reason to know about. A human playing against it dropped a piece that created a diagonal three-in-a-row. The agent, on its next move, dropped a piece that blocked the winning extension. Not because it had been told about diagonals. Because 40 iterations of self-play had accumulated enough evidence that unblocked diagonals eventually lead to losses.

The Information Bottleneck: Deriving Optimal Representations From First Principles

7 minute read

Published: October 01, 2025

Two models trained on the same data, same architecture, same hyperparameters, except one generalizes to new distributions and the other memorizes the training set. This was the puzzle I kept running into. Validation accuracy looked identical during training. But deploy either model on slightly out-of-distribution examples and the gap became obvious: one was robust, the other was brittle.

Implementing the Transformer in C++ Without ML Libraries: What You Learn From the Metal

6 minute read

Published: September 01, 2025

There is a version of understanding a transformer where you can recite the equations and draw the architecture diagram. Then there is a deeper version where you know, concretely, what happens to a float when it enters the attention mechanism, which cache line it lives on, what instruction the CPU uses to multiply it, how many copies of it exist simultaneously in memory. I wanted the second kind of understanding. The only way to get it was to implement a transformer from nothing: no PyTorch, no NumPy, no BLAS wrappers.

The Model That Learned From the Future: A Temporal Leakage Postmortem

4 minute read

Published: August 01, 2025

The validation dashboard said 99.7% precision. We had trained a fraud detection model for a healthcare claims processor, and by every metric it was performing remarkably well. The product team was excited. We were cautious (99.7% felt too good), but we couldn’t find the flaw, so we deployed.

Building a Bayesian A/B Testing System That Knows When to Stop

4 minute read

Published: July 01, 2025

At Synthure, we ran A/B tests the way most startups do: flip a coin on traffic, wait two weeks, check if $p < 0.05$, ship or revert. This worked until we started testing features that affected claim approval rates, where each day of a bad variant cost real money and delayed patient reimbursements. We couldn’t afford to wait two weeks. We also couldn’t afford to stop early and be wrong.

Reproducing Double Descent: The Experiment That Broke Classical Learning Theory

4 minute read

Published: June 01, 2025

Classical learning theory told us bias and variance form a unimodal tradeoff: increase model capacity and test error first falls, then rises as the model starts memorizing. Every textbook contains this curve. It was the theoretical foundation for why we regularize, why we use validation sets, and why we prefer smaller models when data is limited.

Why Adam Works: Understanding Every Major Optimizer Through the Loss Landscape

5 minute read

Published: May 01, 2025

During my first serious training run at Synthure, I watched a model converge beautifully for 40 epochs, then diverge. Learning rate too high, I assumed. I halved it. It diverged again, faster. I halved it again. Now it converged but plateaued far above the target loss. After two days I realized what was actually wrong: I was using SGD with momentum on a loss landscape with wildly different curvatures along different parameter directions, and no single learning rate could handle both the shallow ravines and the steep walls simultaneously.

Portfolio

Mehra-Prescott with Rare Disasters (Equity Premium Puzzle)

Published: July 01, 2025

Extends the Mehra-Prescott consumption-based asset pricing model with a mathematical rare disasters framework calibrated to U.S. data.

RobustSight: AI Safety & Alignment Framework

Published: August 01, 2025

Comprehensive CV + LLM safety framework investigating adversarial robustness, interpretability, and human-guided alignment across frontier models.

Traffic Flow Optimization

Published: August 01, 2025

Urban traffic management embedding graph theory, linear algebra, and calculus into C++ and JAX to optimize signal timing for city planners.

Volatility Alchemist: Quantitative Options Trading

Published: August 01, 2025

End-to-end quant options trading system with real-time data integration, ML volatility models, interactive dashboards, and automated signal generation.

GameNGen: Diffusion World Model for Game Simulation

Published: January 01, 2026

Trained a latent diffusion world model on 737K+ Super Mario Bros frames with VAE encoder, CNN reward model, and PPO agents.

Market Microstructure Forecasting with Deep RL

Published: March 01, 2026

DRL agent predicting short-horizon price movements from order book snapshots and executing trades via event-driven backtesting.

NFL Veteran Transition: Causal Inference on Player Performance

Published: March 01, 2026

Hierarchical mixed-effects models and ML ensembles to isolate the effect of team transitions from individual skill trajectories in NFL player data.

ACL Injury Risk Predictor

Published: March 01, 2026

ML pipeline predicting ACL injury risk from biomechanical and performance data, combining sports science domain knowledge with gradient boosting and interpretability tools.

MOSAIC: Calibrated Multi-Stream Outbreak Early Warning

Published: June 05, 2026

Fuses wastewater, genomic, and outbreak-text surveillance into one calibrated probability that transmission is growing right now, and proves the number is trustworthy.

VeriGrad RL: Mechanistic Interpretability for Safety RL

Published: June 10, 2026

An open-source mechanistic interpretability and AI-safety lab for RL post-training: train policies to choose activation-level interventions, then verify they’re behaviorally safe, useful, and mechanistically faithful.

Popper: Falsifying Specifications Before You Prove Them

Published: June 13, 2026

An open-source tool that hunts for counterexamples to mathematical and logical statements, catching wrong specifications rather than just mismatched proofs.

RallyScope: Tennis ML & Computer Vision in the Browser

Published: June 14, 2026

36,342 ATP/WTA matches turned into an unsupervised playstyle map, an exactly-explainable win-probability model, and an in-browser ball tracker, trained at build time and inferred client-side.

PodBench: Agent Evaluation at Fleet Scale

Published: June 15, 2026

Deterministic, resettable task environments for LLM agents with a programmatic verifier, per-run token/cost metering, and Kubernetes autoscaling: pod health and model behavior on one pane.

Context Forge: The Pareto Frontier of Prompt Compression

Published: June 15, 2026

An open, reproducible benchmark of context compression (tokens saved vs. quality retained vs. latency) across the tokenizers production apps actually pay for, on real public data.

Shannon’s Gambit: A Chess Engine That Learns at an Honest Rating

Published: June 18, 2026

An end-to-end chess RL system that routes each position to the method that owns it, retrains itself nightly by gated self-play, and reports only the Elo Stockfish says it can actually play.

Covera: An Insurance Marketplace That Texts You the Right Plan

Published: June 20, 2026

A multi-agent concierge you text like a person: it simulates thousands of years of your real care against every real plan, ranks on risk-adjusted all-in cost, and stays on as a year-round advocate.

Publications

Multiple Myeloma Gene Signatures

Published in The Oncologist, 2024

Statistical analysis of mutation, pathway, and survival data to identify gene signatures associated with disease progression in multiple myeloma.

Recommended citation: Kannappan, A. et al. (2024). "Multiple Myeloma Gene Signatures." The Oncologist.

Agentic AI Systems for Clinical Reasoning

Published in Book Chapter, 2026

Frameworks for integrating clinical and social data through multi-agent LLM orchestration to improve decision-making and patient-centered care.

Recommended citation: Kannappan, A. et al. (2026). "Agentic AI Systems for Clinical Reasoning." Book Chapter.

Research

AI Safety Gridworlds: The Reward You Optimize vs the One You Meant

10 minute read

Published: November 28, 2017

DeepMind’s 2017 suite turns abstract alignment worries into tiny testable RL environments, built around one idea I still use: the visible reward is not the hidden performance function you actually care about.

Path Patching: Turning ‘This Circuit Explains the Behavior’ Into a Testable Claim

10 minute read

Published: April 16, 2023

Goldowsky-Dill et al. give interpretability a quantitative language for localization: express a hypothesis as a set of paths, patch the rest with a counterfactual input, and measure exactly what is left unexplained.

Capability Is Not Propensity: Evaluating Models for Extreme Risk

10 minute read

Published: May 25, 2023

The DeepMind-led paper that crystallized the distinction I use constantly: what a model can do versus what it is inclined to do, and why safety needs to measure both and embed the results in governance.

Safe RLHF: Helpfulness and Harmlessness as a Lagrangian

10 minute read

Published: October 19, 2023

Peking University’s Beaver decouples ‘is it helpful’ from ‘is it safe’ into two models, then balances them with a Lagrange multiplier that moves during training instead of a fixed weight you guess in advance.

ACDC: Teaching the Computer to Find the Circuit for You

10 minute read

Published: October 28, 2023

Conmy et al. automate the slowest step of mechanistic interpretability by pruning a model’s computational graph edge by edge, and are refreshingly honest about where the automation breaks.

Doubly-Efficient Debate: Judging a Superhuman Prover in Constant Time

10 minute read

Published: November 23, 2023

Brown-Cohen, Irving, and Piliouras give debate a complexity-theoretic backbone: two competing polynomial-time provers let a limited verifier check an arbitrary computation with a constant number of human judgments.

Sleeper Agents: When Safety Training Teaches the Model to Hide

10 minute read

Published: January 17, 2024

Anthropic builds deceptive models on purpose and shows that standard safety training does not remove the deception. Adversarial training can even teach the model to hide it better.

The Doubling Clock: Measuring How Long a Task an AI Can Finish

10 minute read

Published: March 18, 2025

METR replaces saturating benchmarks with a human-anchored metric: the length of a task, measured in human time, that a model completes half the time. That length has been doubling roughly every seven months.

A Field Guide to Mechanistic Interpretability

10 minute read

Published: October 13, 2025

Rai et al. organize the whole subfield around what you are trying to learn rather than which tool you happen to know, and the map is more useful than any single technique on it.

From Reward Hacking to Sabotage: How a Cheat Becomes a Character

10 minute read

Published: November 23, 2025

Anthropic shows that when a production model learns to game its reward on real coding tasks, it generalizes to alignment faking and sabotage, and that how you frame the cheating matters as much as whether it happens.

Bridging the Gap: What LLM Judges See That Humans Don’t

10 minute read

Published: December 01, 2025

A statistical framework that models an LLM judge as a shared human preference plus a linear bias on interpretable features, so you can both correct the judge and formally test where it diverges from people.

OpenAgentSafety: What Agents Do When You Give Them Real Tools

10 minute read

Published: February 16, 2026

CMU and AI2 put LLM agents in a sandbox with a real shell, browser, file system, and manipulative coworkers, and find unsafe behavior in half to three-quarters of vulnerable tasks, often on perfectly benign requests.

Aravind Kannappan
