The wandb sweep command uses flags such as --pause and --resume to control the sweep’s ability to create new W&B runs, with different effects on existing runs:
- --pause: When you pause a sweep, the agent creates no new runs until you resume the sweep. Existing runs continue to execute normally.
- --resume: When you resume a sweep, the agent continues creating new runs according to the search strategy.
- --stop: When you stop a sweep, the agent stops creating new runs. Existing runs continue to completion.
- --cancel: When you cancel a sweep, the agent immediately kills all currently executing runs and stops creating new runs.
Pause a sweep
Pause a sweep so that it temporarily stops creating new runs. Runs that are already executing continue until they complete. Use the wandb sweep --pause command to pause a sweep, and provide the ID of the sweep that you want to pause.
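As a minimal sketch, the pause command can be wrapped in a small shell function. The function name pause_sweep and the entity/project/sweep_ID path are placeholders for illustration, not part of the W&B CLI, and the fully qualified path form is an assumption. The only W&B-specific piece is wandb sweep --pause followed by the sweep ID.

```shell
# Placeholder sweep path (assumption): substitute your own entity, project, and sweep ID.
SWEEP_PATH="entity/project/sweep_ID"

# Hypothetical helper: pauses the sweep given as the first argument.
# Runs that are already executing are left alone.
pause_sweep() {
  wandb sweep --pause "$1"
}

# Call it when you are ready, for example:
#   pause_sweep "$SWEEP_PATH"
```

Defining a function rather than running the command directly means nothing happens until you call it with the sweep you intend to pause.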
Resume a sweep
Resume a paused sweep with the wandb sweep --resume command. The sweep starts creating new runs again according to its search strategy. Provide the ID of the sweep that you want to resume.
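A resume call might look like the following sketch. The helper name resume_sweep and the entity/project/sweep_ID path are illustrative placeholders. Only wandb sweep --resume comes from the text above.

```shell
# Placeholder sweep path (assumption): substitute your own entity, project, and sweep ID.
SWEEP_PATH="entity/project/sweep_ID"

# Hypothetical helper: resumes a previously paused sweep so that
# new runs are created again according to its search strategy.
resume_sweep() {
  wandb sweep --resume "$1"
}

# Example call:
#   resume_sweep "$SWEEP_PATH"
```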
Stop a sweep
Stop a sweep to prevent it from creating new runs while letting currently executing runs finish gracefully. Use the wandb sweep --stop command, and provide the ID of the sweep that you want to stop.
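The stop command follows the same shape. In this sketch, stop_sweep and the entity/project/sweep_ID path are hypothetical names chosen for illustration. The wandb sweep --stop invocation is the part described above.

```shell
# Placeholder sweep path (assumption): substitute your own entity, project, and sweep ID.
SWEEP_PATH="entity/project/sweep_ID"

# Hypothetical helper: stops the sweep from creating new runs.
# Runs that are already executing are allowed to finish.
stop_sweep() {
  wandb sweep --stop "$1"
}

# Example call:
#   stop_sweep "$SWEEP_PATH"
```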
Cancel a sweep
Cancel a sweep to immediately kill all running runs and stop creating new runs. This is the only sweep command that forcibly terminates existing runs. Use the wandb sweep --cancel command to cancel a sweep, and provide the ID of the sweep that you want to cancel.
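Because cancelling kills running runs, one option is to put the command behind an explicit confirmation. The sketch below assumes a hypothetical helper named cancel_sweep. The --yes argument belongs to that helper only and is not a wandb flag, and the entity/project/sweep_ID path is a placeholder.

```shell
# Placeholder sweep path (assumption): substitute your own entity, project, and sweep ID.
SWEEP_PATH="entity/project/sweep_ID"

# Hypothetical helper: cancels the sweep only when the second argument is --yes.
# The --yes check is handled here and is never passed to wandb.
cancel_sweep() {
  if [ "${2:-}" != "--yes" ]; then
    echo "Refusing to cancel $1 without --yes: this kills all running runs." >&2
    return 1
  fi
  wandb sweep --cancel "$1"
}

# Example call:
#   cancel_sweep "$SWEEP_PATH" --yes
```

The guard is a design choice for this sketch, not W&B behavior. Calling wandb sweep --cancel directly works the same way without it.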
Understanding sweep and run statuses
A sweep orchestrates multiple runs to explore hyperparameter combinations. Understanding how sweep status and run status interact is crucial for effectively managing your hyperparameter optimization.
Key differences
- Sweep status controls whether new runs are created (Running, Paused, Stopped, Cancelled, Finished, Failed, Crashed)
- Run status reflects the execution state of individual runs (Pending, Running, Finished, Failed, Crashed, Killed)
Best practices
- Use --pause instead of cancel when you want to temporarily halt exploration without losing running experiments
- Monitor individual run statuses to identify systematic failures
- Use --stop for graceful termination when you’ve found satisfactory hyperparameters
- Reserve --cancel for emergencies when runs are consuming excessive resources or producing errors