Interactive Performance Model

Controls (sliders): batch tokens, EP ranks, latency, bandwidth, compute/tok, expert skew. The widget also offers series toggles and presets.

Execution Timeline

Phases shown: dispatch, compute, combine.

Step Time vs Batch Size

Curves shown: comm (2×dispatch), compute, no-overlap, DBO.

Math panel quantities: dispatch time, compute time, no-overlap total, DBO steady-state.

The widget works in effective terms: L_eff (latency), BW_eff (bandwidth), and c_tok,eff (compute per token). Effects such as VLLM_DBO_COMM_SMS are folded into these sliders rather than modeled as a separate \rho control.

The timeline shows how a step is divided into dispatch, compute, and combine phases. With DBO, communication of the next microbatch overlaps with compute of the current one. The curves show how total step time changes across batch sizes; the dot shows the current slider position.
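
The widget's formulas are not reproduced in the text, so the following is only a rough Python sketch of how the four quantities (dispatch time, compute time, no-overlap total, DBO steady-state) might relate. Everything in it is an assumption rather than the widget's actual math: the latency-plus-size/bandwidth form for dispatch, the `bytes_per_token` parameter, treating expert skew as a simple multiplier on the busiest rank, pricing combine the same as dispatch (suggested by the "comm (2×dispatch)" curve), and modeling DBO as two half-size microbatches whose step time is bounded by the slower of communication and compute. The EP ranks slider is not modeled here. All names (`ModelParams`, `dispatch_time`, and so on) are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ModelParams:
    """Hypothetical parameter bundle loosely mirroring the widget's sliders."""
    l_eff: float            # effective latency per dispatch, seconds
    bw_eff: float           # effective bandwidth, bytes per second
    c_tok_eff: float        # effective compute time per token, seconds
    bytes_per_token: float  # ASSUMED payload per dispatched token, bytes
    skew: float = 1.0       # ASSUMED load multiplier on the busiest rank (>= 1)


def dispatch_time(tokens: float, p: ModelParams) -> float:
    # ASSUMPTION: fixed latency plus payload size divided by bandwidth.
    return p.l_eff + tokens * p.skew * p.bytes_per_token / p.bw_eff


def compute_time(tokens: float, p: ModelParams) -> float:
    # ASSUMPTION: compute scales linearly with tokens on the busiest rank.
    return tokens * p.skew * p.c_tok_eff


def no_overlap_total(tokens: float, p: ModelParams) -> float:
    # dispatch + compute + combine, with combine ASSUMED to cost one dispatch.
    return 2.0 * dispatch_time(tokens, p) + compute_time(tokens, p)


def dbo_steady_state(tokens: float, p: ModelParams) -> float:
    # ASSUMPTION: two microbatches of tokens/2 each; while one computes,
    # the other communicates, so each half-step costs max(comm, compute).
    half = tokens / 2.0
    comm = 2.0 * dispatch_time(half, p)
    return 2.0 * max(comm, compute_time(half, p))


if __name__ == "__main__":
    # Illustrative numbers only, not measurements.
    p = ModelParams(l_eff=20e-6, bw_eff=50e9, c_tok_eff=0.5e-6,
                    bytes_per_token=8192.0, skew=1.2)
    for b in (64, 512, 4096):
        print(b, no_overlap_total(b, p), dbo_steady_state(b, p))
```

Under these assumptions each microbatch pays the latency term on its own, so at small batch sizes the DBO estimate can exceed the no-overlap estimate; overlap only pays off once the per-token terms dominate.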