Posts by tag 'Blackwell'

Apr 6, 2026 · AI Infrastructure

AI Training Chip Performance: Real Scaling Data vs Marketing Hype (Blackwell to Hopper)

AI training chip performance data: analyzing real scaling from Hopper to Blackwell. 3.2x training, 50x inference gains, and why memory bandwidth matters more than FLOPs.

Feb 10, 2026 · AI Engineering

Visualizing All-Reduce Bandwidth: The Physics of Distributed Training

When your model doesn't fit on one GPU, you're no longer just learning to code; you're learning physics. We dive deep into NCCL primitives, distributed collectives, and why the interconnect is the computer.

Jan 6, 2026 · AI Engineering

Nvidia Blackwell: Microscaling, FP4, and FP6 Formats

Nvidia Blackwell microscaling and the new FP4 formats double inference speeds. Dive into how the second-generation Transformer Engine uses scale factors and sparsity for AI workloads.

