Next-level logging in a few lines

If you’d rather dive straight into working code, check out this Google Colab.
Get started: track experiments
Sign up and create an API key
An API key authenticates your machine to W&B. You can generate an API key from your user profile. For a more streamlined approach, you can generate an API key by going directly to the W&B authorization page. Copy the displayed API key and save it in a secure location such as a password manager.
- Click your user profile icon in the upper right corner.
- Select User Settings, then scroll to the API Keys section.
- Click Reveal. Copy the displayed API key. To hide the API key, reload the page.
Install the `wandb` library and log in
To install the `wandb` library locally and log in:
- Set the `WANDB_API_KEY` environment variable to your API key.
- Install the `wandb` library and log in.
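For example, a minimal sketch in Python (assuming the library is already installed with `pip install wandb`; the key below is a placeholder):

```python
import os
import wandb

# Placeholder: paste the API key you copied from your user settings.
os.environ["WANDB_API_KEY"] = "<your-api-key>"

# Logs in using WANDB_API_KEY; alternatively run `wandb login` from the command line.
wandb.login()
```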
Name the project
A W&B Project is where all of the charts, data, and models logged from related runs are stored. Naming your project helps you organize your work and keep all the information about a single project in one place. To add a run to a project, simply set the `WANDB_PROJECT` environment variable to the name of your project. The `WandbCallback` will pick up this project name environment variable and use it when setting up your run.
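For example, from Python (the project name here is just a placeholder):

```python
import os

# Placeholder project name; runs started afterwards are logged to this project.
os.environ["WANDB_PROJECT"] = "my-awesome-project"
```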
Make sure you set the project name before you initialize the `Trainer`. If no project name is specified, it defaults to `huggingface`.
Log your training runs to W&B
The most important step when defining your `Trainer` training arguments, either inside your code or from the command line, is to set `report_to` to `"wandb"` in order to enable logging with W&B.
The `logging_steps` argument in `TrainingArguments` will control how often training metrics are pushed to W&B during training. You can also give a name to the training run in W&B using the `run_name` argument.
That’s it. Now your models will log losses, evaluation metrics, model topology, and gradients to W&B while they train.
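For example, a sketch of the relevant `TrainingArguments` (the output directory, step count, and run name are placeholders):

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./results",   # placeholder output directory
    report_to="wandb",        # enable logging to W&B
    logging_steps=50,         # push training metrics to W&B every 50 steps
    run_name="my-first-run",  # placeholder name for the run in the W&B UI
)
# Pass `training_args` to your Trainer as usual; training metrics stream to W&B.
```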
Using TensorFlow? Just swap the PyTorch `Trainer` for the TensorFlow `TFTrainer`.
Turn on model checkpointing
Using Artifacts, you can store up to 100GB of models and datasets for free and then use the W&B Registry. Using Registry, you can register models to explore and evaluate them, prepare them for staging, or deploy them in your production environment. To log your Hugging Face model checkpoints to Artifacts, set the `WANDB_LOG_MODEL` environment variable to one of:
- `checkpoint`: Upload a checkpoint every `args.save_steps` from the `TrainingArguments`.
- `end`: Upload the model at the end of training, if `load_best_model_at_end` is also set.
- `false`: Do not upload the model.
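For example, from Python before you create the `Trainer` (a sketch):

```python
import os

# "checkpoint" uploads a checkpoint every `args.save_steps`;
# "end" uploads only the final model; "false" disables model uploads.
os.environ["WANDB_LOG_MODEL"] = "checkpoint"
```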
Any `Trainer` you initialize from now on will upload models to your W&B project. The model checkpoints you log will be viewable through the Artifacts UI and include the full model lineage (see an example model checkpoint in the UI here).
By default, your model will be saved to W&B Artifacts as `model-{run_id}` when `WANDB_LOG_MODEL` is set to `end`, or as `checkpoint-{run_id}` when `WANDB_LOG_MODEL` is set to `checkpoint`. However, if you pass a `run_name` in your `TrainingArguments`, the model will be saved as `model-{run_name}` or `checkpoint-{run_name}`.
W&B Registry
Once you have logged your checkpoints to Artifacts, you can register your best model checkpoints and centralize them across your team with Registry. Using Registry, you can organize your best models by task, manage model lifecycles, track and audit the entire ML lifecycle, and automate downstream actions. To link a model Artifact, refer to Registry.
Visualise evaluation outputs during training
Visualising your model outputs during training or evaluation is often essential to really understand how your model is training. By using the callbacks system in the Transformers Trainer, you can log additional helpful data to W&B, such as your model's text generation outputs or other predictions, as W&B Tables. See the Custom logging section below for a full guide on how to log evaluation outputs while training to a W&B Table.
Finish your W&B Run (Notebook only)
If your training is encapsulated in a Python script, the W&B run will end when your script finishes. If you are using a Jupyter or Google Colab notebook, you'll need to tell us when you're done with training by calling `run.finish()`.
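For example, a minimal notebook sketch (the project name is a placeholder):

```python
import wandb

run = wandb.init(project="my-awesome-project")  # placeholder project name
# ... run your training here, e.g. trainer.train() ...
run.finish()  # marks the run as finished so the notebook run closes cleanly
```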
Visualize your results
Once you have logged your training results you can explore your results dynamically in the W&B Dashboard. It's easy to compare across dozens of runs at once, zoom in on interesting findings, and coax insights out of complex data with flexible, interactive visualizations.
Advanced features and FAQs
How do I save the best model?
If you pass `TrainingArguments` with `load_best_model_at_end=True` to your `Trainer`, W&B saves the best performing model checkpoint to Artifacts.
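For example, a sketch of the relevant arguments (the output directory and strategies are illustrative; on older transformers releases the evaluation argument is named `evaluation_strategy`):

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./results",       # placeholder output directory
    eval_strategy="epoch",        # evaluation and save strategies must match
    save_strategy="epoch",
    load_best_model_at_end=True,  # Trainer keeps and reloads the best checkpoint
    report_to="wandb",
)
```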
If you save your model checkpoints as Artifacts, you can promote them to the Registry. In Registry, you can:
- Organize your best model versions by ML task.
- Centralize models and share them with your team.
- Stage models for production or bookmark them for further evaluation.
- Trigger downstream CI/CD processes.
How do I load a saved model?
If you saved your model to W&B Artifacts with `WANDB_LOG_MODEL`, you can download your model weights for additional training or to run inference. You just load them back into the same Hugging Face architecture that you used before.
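For example, a sketch of downloading a logged model Artifact and loading it back (the project, artifact name, and model class are placeholders; use the names shown in your Artifacts UI and the architecture you trained with):

```python
import wandb
from transformers import AutoModelForSequenceClassification

run = wandb.init(project="my-awesome-project")            # placeholder project
artifact = run.use_artifact("model-my-first-run:latest")  # placeholder artifact name
model_dir = artifact.download()                           # local folder with the checkpoint files

# Load the weights into the same architecture you trained with.
model = AutoModelForSequenceClassification.from_pretrained(model_dir)
```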
How do I resume training from a checkpoint?
If you had set `WANDB_LOG_MODEL='checkpoint'`, you can resume training by using the `model_dir` (the downloaded checkpoint directory) as the `model_name_or_path` argument in your `TrainingArguments` and passing `resume_from_checkpoint=True` to `Trainer.train()`.
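For example, a sketch of resuming from a logged checkpoint (the run id, project, model class, and output directory are placeholders):

```python
import wandb
from transformers import AutoModelForSequenceClassification, TrainingArguments, Trainer

last_run_id = "<run-id>"  # placeholder: the id of the run that logged the checkpoint

# Resume the original W&B run and download its latest checkpoint artifact.
with wandb.init(project="my-awesome-project", id=last_run_id, resume="must") as run:
    checkpoint_artifact = run.use_artifact(f"checkpoint-{last_run_id}:latest")
    checkpoint_dir = checkpoint_artifact.download()

    # Rebuild the model and Trainer as before, then resume from the checkpoint.
    model = AutoModelForSequenceClassification.from_pretrained(checkpoint_dir)
    training_args = TrainingArguments(output_dir="./results", report_to="wandb")
    trainer = Trainer(model=model, args=training_args)  # plus your datasets
    trainer.train(resume_from_checkpoint=checkpoint_dir)
```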
How do I log and view evaluation samples during training?
Logging to W&B via the Transformers `Trainer` is taken care of by the `WandbCallback` in the Transformers library. If you need to customize your Hugging Face logging, you can modify this callback by subclassing `WandbCallback` and adding additional functionality that leverages additional methods from the Trainer class.
Below is the general pattern to add this new callback to the HF Trainer, and further down is a code-complete example to log evaluation outputs to a W&B Table:
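A sketch of that pattern (the callback class name is illustrative):

```python
from transformers.integrations import WandbCallback

class MyWandbCallback(WandbCallback):
    """Hypothetical subclass adding custom logging on top of the built-in WandbCallback."""

    def __init__(self, trainer):
        super().__init__()
        self.trainer = trainer  # keep a reference so we can reuse Trainer methods later

# General pattern (assumes `trainer` was already created with your model, args, and datasets):
#     trainer = Trainer(model=model, args=training_args, ...)
#     trainer.add_callback(MyWandbCallback(trainer))
#     trainer.train()
```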
View evaluation samples during training
The following section shows how to customize the `WandbCallback` to run model predictions and log evaluation samples to a W&B Table during training. We will run model predictions every `eval_steps` using the `on_evaluate` method of the Trainer callback.
Here, we wrote a `decode_predictions` function to decode the predictions and labels from the model output using the tokenizer. Then, we create a pandas DataFrame from the predictions and labels and add an `epoch` column to the DataFrame. Finally, we create a `wandb.Table` from the DataFrame and log it to wandb. Additionally, we can control the frequency of logging by logging the predictions every `freq` epochs.
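Putting this together, here is a sketch of such a callback (the class name, the `decode_predictions` helper, the dataset handling, and the `sample_predictions` table key are all illustrative; it assumes a model whose predictions and labels can be decoded with the tokenizer):

```python
import pandas as pd
from transformers.integrations import WandbCallback


def decode_predictions(tokenizer, predictions):
    """Decode model outputs and labels into text using the tokenizer."""
    labels = tokenizer.batch_decode(predictions.label_ids)
    logits = predictions.predictions.argmax(axis=-1)
    prediction_text = tokenizer.batch_decode(logits)
    return {"labels": labels, "predictions": prediction_text}


class WandbPredictionProgressCallback(WandbCallback):
    """Log a sample of model predictions to a W&B Table during training."""

    def __init__(self, trainer, tokenizer, val_dataset, num_samples=100, freq=2):
        super().__init__()
        self.trainer = trainer
        self.tokenizer = tokenizer
        self.sample_dataset = val_dataset.select(range(num_samples))
        self.freq = freq  # log predictions every `freq` epochs

    def on_evaluate(self, args, state, control, **kwargs):
        super().on_evaluate(args, state, control, **kwargs)
        # Control the logging frequency: only log every `freq` epochs.
        if state.epoch is not None and int(state.epoch) % self.freq == 0:
            predictions = self.trainer.predict(self.sample_dataset)
            decoded = decode_predictions(self.tokenizer, predictions)
            # Build a DataFrame with an `epoch` column and log it as a W&B Table.
            predictions_df = pd.DataFrame(decoded)
            predictions_df["epoch"] = state.epoch
            records_table = self._wandb.Table(dataframe=predictions_df)
            self._wandb.log({"sample_predictions": records_table})
```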
Note: Unlike the regular `WandbCallback`, this custom callback needs to be added to the trainer after the `Trainer` is instantiated, not during initialization of the `Trainer`. This is because the `Trainer` instance is passed to the callback during initialization.
What additional W&B settings are available?
Further configuration of what is logged with `Trainer` is possible by setting environment variables. A full list of W&B environment variables can be found here.
Environment Variable | Usage |
---|---|
`WANDB_PROJECT` | Give your project a name (`huggingface` by default) |
`WANDB_LOG_MODEL` | Log the model checkpoint as a W&B Artifact (`false` by default; set to `checkpoint` or `end` as described above) |
`WANDB_WATCH` | Set whether you'd like to log your model's gradients, parameters, or neither (`false` by default) |
`WANDB_DISABLED` | Set to `true` to turn off logging entirely (`false` by default) |
`WANDB_QUIET` | Set to `true` to limit statements logged to standard output to critical statements only (`false` by default) |
`WANDB_SILENT` | Set to `true` to silence the output printed by wandb (`false` by default) |
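For example, setting a few of these from Python before creating the `Trainer` (values are illustrative):

```python
import os

os.environ["WANDB_PROJECT"] = "my-awesome-project"  # placeholder project name
os.environ["WANDB_LOG_MODEL"] = "checkpoint"        # log checkpoints as Artifacts
os.environ["WANDB_WATCH"] = "all"                   # log gradients and parameters
```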
How do I customize `wandb.init`?
The `WandbCallback` that `Trainer` uses will call `wandb.init` under the hood when `Trainer` is initialized. You can alternatively set up your runs manually by calling `wandb.init` before the `Trainer` is initialized. This gives you full control over your W&B run configuration.
An example of what you might want to pass to `init` is below. For `wandb.init()` details, see the `wandb.init()` reference.
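For example, a sketch of a manual `wandb.init` call (all values are placeholders):

```python
import wandb

wandb.init(
    project="my-awesome-project",    # overrides WANDB_PROJECT
    name="bert-base-high-lr",        # display name of the run
    group="bert",                    # group related runs together
    tags=["baseline", "high-lr"],    # searchable tags
    config={"learning_rate": 5e-5},  # extra hyperparameters to track
)
```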