Hugging Face AutoTrain is a no-code tool for training state-of-the-art models on Natural Language Processing (NLP), Computer Vision (CV), Speech, and even Tabular tasks. W&B is directly integrated into Hugging Face AutoTrain, providing experiment tracking and config management. Enabling it is as easy as passing a single parameter to the CLI command for your experiments.
Experiment metrics logging

Install prerequisites

Install autotrain-advanced and wandb.
pip install --upgrade autotrain-advanced wandb
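If W&B is not yet authenticated on your machine, log in once before training so the run can be reported to your account. A minimal sketch from Python (you can equivalently run wandb login from the command line):

import wandb

# Prompts for (or reads) your W&B API key and caches it for subsequent runs
wandb.login()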
To demonstrate this integration, this page fine-tunes an LLM on a math dataset to achieve a state-of-the-art pass@1 result on the GSM8k benchmark.

Prepare the dataset

Hugging Face AutoTrain expects your custom CSV dataset to be in a specific format to work properly.
  • Your training file must contain a text column, which the training uses. For best results, the text column’s data must conform to the ### Human: Question?### Assistant: Answer. format. Review timdettmers/openassistant-guanaco for a great example. However, the MetaMathQA dataset includes the columns query, response, and type. First, pre-process this dataset: remove the type column and combine the content of the query and response columns into a new text column in the ### Human: Query?### Assistant: Response. format, as shown in the sketch below. Training uses the resulting dataset, rishiraj/guanaco-style-metamath.
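The following is a minimal sketch of that preprocessing with the datasets library. It assumes the source dataset’s Hub ID is meta-math/MetaMathQA and that AutoTrain reads a train.csv file from the data/ directory used in the training command below; adjust the names and paths to your setup.

from datasets import load_dataset

# Load the original MetaMathQA dataset (assumed Hub ID: meta-math/MetaMathQA)
dataset = load_dataset("meta-math/MetaMathQA", split="train")

# Combine query and response into a single guanaco-style text column
def to_guanaco_style(example):
    return {"text": f"### Human: {example['query']}### Assistant: {example['response']}"}

dataset = dataset.map(to_guanaco_style, remove_columns=["query", "response", "type"])

# Write the result where the training command below expects it (--data-path data/)
dataset.to_csv("data/train.csv")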

Train using autotrain

You can start training with AutoTrain Advanced from the command line or a notebook. Use the --log argument with a value of wandb (--log wandb) to log your results to a W&B Run.
autotrain llm \
    --train \
    --model HuggingFaceH4/zephyr-7b-alpha \
    --project-name zephyr-math \
    --log wandb \
    --data-path data/ \
    --text-column text \
    --lr 2e-5 \
    --batch-size 4 \
    --epochs 3 \
    --block-size 1024 \
    --warmup-ratio 0.03 \
    --lora-r 16 \
    --lora-alpha 32 \
    --lora-dropout 0.05 \
    --weight-decay 0.0 \
    --gradient-accumulation 4 \
    --logging_steps 10 \
    --fp16 \
    --use-peft \
    --use-int4 \
    --merge-adapter \
    --push-to-hub \
    --token <huggingface-token> \
    --repo-id <huggingface-repository-address>
Experiment config saving
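Along with metrics, the integration saves your experiment’s configuration (the training hyperparameters) to the W&B Run. A minimal sketch for inspecting it afterwards through the public W&B API, assuming placeholder <entity>, <project>, and <run-id> values:

import wandb

# Fetch a finished run and print the hyperparameters stored in its config
api = wandb.Api()
run = api.run("<entity>/<project>/<run-id>")
print(run.config)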

More Resources
