Vijay Kodam
AppleSilicon
February 14, 2026
LLM Inference Internals: KV Cache, Flash Attention, and Optimizing for Apple Silicon
February 14, 2026
I Ran an 80B Coding Model Locally with Claude Code. It Took 1 Hour Instead of 9 Minutes. Here's What Was Wrong.