W&B Inference gives you access to leading open-source foundation models through W&B Weave and an OpenAI-compatible API.
  • Using Inference, you can build AI applications and agents without signing up for a hosting provider or self-hosting a model.
  • Using Weave, you can trace, evaluate, monitor, and improve your W&B Inference-powered applications, as sketched below.
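
As a minimal sketch of the Weave integration (assuming the weave and openai packages are installed, and using placeholder credentials), initializing Weave before making requests is enough to capture each chat completion as a trace:

import weave
import openai

# weave.init() instruments the OpenAI client so chat completions made
# through it are logged as traces in the named W&B project.
# "<your-team>/<your-project>" and "<your-api-key>" are placeholders.
weave.init("<your-team>/<your-project>")

client = openai.OpenAI(
    base_url="https://api.inference.wandb.ai/v1",
    api_key="<your-api-key>",
)

# This call appears as a trace in the Weave UI.
response = client.chat.completions.create(
    model="meta-llama/Llama-3.1-8B-Instruct",
    messages=[{"role": "user", "content": "Hello!"}],
)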

Try out Inference in the UI

Navigate to https://wandb.ai/inference to explore available models and try them out in the Weave Playground. For more information on the web interface, see the UI Guide.

Use Inference through the API

This Python example uses the OpenAI SDK to send a chat completion request to a model hosted on W&B Inference.
import openai

client = openai.OpenAI(
    # The custom base URL points to W&B Inference
    base_url="https://api.inference.wandb.ai/v1",

    # Create an API key at https://wandb.ai/settings
    api_key="<your-api-key>",

    # Optional: Team and project for usage tracking
    project="<your-team>/<your-project>",
)

response = client.chat.completions.create(
    model="meta-llama/Llama-3.1-8B-Instruct",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Tell me a joke."}
    ],
)

print(response.choices[0].message.content)
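
Because the endpoint is OpenAI-compatible, the standard SDK options work as usual. As a sketch, reusing the client from the example above (and assuming the model supports streaming, as OpenAI-compatible endpoints typically do), you can stream the response incrementally with stream=True:

# Stream the completion instead of waiting for the full response.
stream = client.chat.completions.create(
    model="meta-llama/Llama-3.1-8B-Instruct",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Tell me a joke."}
    ],
    stream=True,  # deliver the response as incremental chunks
)

for chunk in stream:
    # Each chunk carries a delta with the next piece of the message.
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
print()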

Next steps

  1. Set up your account by completing the prerequisites.
  2. Review the available models, along with usage information and limits.
  3. Use the service through the API or UI.
  4. Try out supported models in the W&B Weave Playground.
  5. Try the usage examples.

For information about pricing, usage limits, and credits, see Usage Information and Limits.