

Breaking the Bandwidth Wall: Why AI Clusters are Shifting to Ultra Ethernet
To scale past 100k GPUs, the industry is replacing proprietary InfiniBand with AI-optimized Ultra Ethernet.


To scale past 100k GPUs, the industry is replacing proprietary InfiniBand with AI-optimized Ultra Ethernet.


In distributed training, the slowest packet determines the speed of the cluster. We benchmark GCP's 'Circuit Switched' Jupiter fabric against AWS's 'Multipath' SRD protocol.