From the Front Lines of Tech

Sharing strategic insights and lessons from over 20 years of building scalable systems, leading high-performing teams, and navigating complex technology shifts.

Feb 26, 2026 · Agentic AI
How to Use Silero VAD: Real-Time Voice Activity Detection in Python
How to use Silero VAD for real-time voice activity detection: build a Python audio pipeline with `from silero_vad import load_silero_vad`, endpointing, and barge-in handling.
Feb 25, 2026 · AI Infrastructure
Stateful Agents on K8s: Redis is Your Bottleneck, Not the Vector DB
Agents are stateless. Their memory is not. Scaling the LLM reasoning loop is trivial compared to solving the transactional concurrency of agent memory on Kubernetes.
Feb 24, 2026 · AI Infrastructure
JAX Pallas: Writing GPU Kernels for Maximum Performance
Pallas is a JAX extension for writing custom high-performance kernels for GPUs and TPUs. Write optimized kernels for matrix multiplication and memory access patterns.
- JAX
- XLA
- TPUs
- GCP
- Pallas
- Compilers
Feb 23, 2026 · Strategy
The Context Window ROI: Why RAG is a Tax on Reasoning
At $5 per million tokens with Gemini 2.5 Pro, the context window is no longer scarce. It is an asset class. It is time to rethink the true cost of RAG pipelines.
Feb 22, 2026 · AI Engineering
How LangGraph Supports Cycles: Preventing Infinite Loops in Agent Workflows
How LangGraph supports cycles for multi-agent workflows: learn to detect infinite loops, implement safety limits, and optimize cyclic agent graphs in production.
Feb 21, 2026 · Agentic AI
A2A Architectures: Tools are not just Functions (The Two-Phase Commit)
Why Agent-to-Agent (A2A) interactions with side effects require a two-phase commit for safety.

