I'm Vishnu - currently building AI systems at CourtCorrect. Previously I worked on ML at Vodafone and contributed at Cohere. I write about deep learning, MLOps, and the parts of LLM serving that nobody warned you about.
A quick tour through FlashAttention, paged KV-cache, speculative decoding and friends - what each one actually changes.
Agents · LLM
Building a Production Data Analysis Agent
Notes from wiring an LLM up to messy real-world tabular data - schema inference, tool design, and the failure modes you only see in production.
MLOps
Learnings from Monzo at AWS re:Invent
A deep dive into how Monzo scaled from 4M to 8M customers with a tiny infra team - the talks that mattered for ML platform teams.