From the Front Lines of Tech

Sharing strategic insights and lessons from over 20 years of building scalable systems, leading high-performing teams, and navigating complex technology shifts.

Feb 11, 2026 · Strategy
The Build vs Buy Trap for Foundational Models
You are not Google. Your moat is your data, not your ability to pre-train Llama-4. We dissect the math of architecture parity and the rise of Outcome-as-a-Service.
Feb 11, 2026 · Strategy
Spot Market Arbitrage for AI: The Economics of Fault Tolerance
If your training loop isn't fault-tolerant, you're paying a 40% 'insurance tax' to your cloud provider. We look at the architectural cost of 30-second preemption notices.
Feb 10, 2026 · AI Engineering
Visualizing All-Reduce Bandwidth: The Physics of Distributed Training
When your model doesn't fit on one GPU, you're no longer just learning coding; you're learning physics. We dive deep into the primitives of NCCL, distributed collectives, and why the interconnect is the computer.
Feb 9, 2026 · Strategy
Squeezing the Inference Lever: The Economics of LLM Throughput
Inference price isn't a fixed cost; it's an engineering variable. We break down the three distinct levers of efficiency: Model Compression, Runtime Optimization, and Deployment Strategy.
Feb 6, 2026 · AI Infrastructure
My Profiling Nightmare: The Warp Stall
A war story of tracing a 5 ms latency spike back to a single loose thread. How to read Nsight Systems and spot Warp Divergence.
Feb 5, 2026 · Agentic AI
LLMs are Terrible Backends (Unless You Force JSON)
Non-determinism is a bug, not a feature. We explore how to whip the model into compliance using Enforcers, Pydantic, and Constrained Generation.

