
Scaling Structural Bias - Pre-training Custom Qwen3 on TPU v6e
An end-to-end guide to pre-training a custom Qwen3 model on Google Cloud's Trillium (TPU v6e) TPUs. I dive into modifying the Qwen3 architecture for structured JSON outputs, using XPK for orchestration, and serving the final artifacts with vLLM's high-performance OpenXLA backend.