Top: Without overlap, each step runs Dispatch → Compute → Combine sequentially. GPUs idle during network transfers; the network idles during compute.
Bottom: With DBO, double buffering lets Step N's compute overlap with Step N+1's communication.
Steady-state step time drops from `t_comm + t_compute` to `max(t_comm, t_compute)`.
The shaded idle blocks show wasted GPU time that DBO eliminates.
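The steady-state saving can be sketched with a tiny calculation. The timings below are hypothetical placeholders, not measured values from the figure:

```python
# Hypothetical per-step timings in milliseconds (illustrative only).
t_comm = 3.0     # Dispatch + Combine network time
t_compute = 5.0  # expert Compute time

# Without overlap: each step pays for communication and compute in sequence.
sequential = t_comm + t_compute

# With DBO double buffering: step N's compute hides step N+1's communication,
# so the steady-state step time is bounded by the slower of the two phases.
overlapped = max(t_comm, t_compute)

print(f"sequential: {sequential} ms, overlapped: {overlapped} ms")
```

Note that when `t_comm > t_compute`, overlap hides the compute instead, and the step becomes communication-bound; the idle time eliminated per step is `min(t_comm, t_compute)` in either case.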