Fine-tuning LLMs with Blackwell, RTX 50 series & Unsloth

Learn how to fine-tune LLMs on NVIDIA's Blackwell RTX 50 series and B200 GPUs with our step-by-step guide.

Unsloth now supports NVIDIA’s Blackwell architecture GPUs, including RTX 50-series GPUs (5060–5090), RTX PRO 6000, and GPUs such as B200, B40, GB100, GB102 and more! You can read the official NVIDIA blog post here.

Unsloth is now compatible with every NVIDIA GPU released from 2018 onward, including the DGX Spark.

Our new Docker image supports Blackwell. Run the Docker image and start training by following the Docker guide.

Pip install

Simply install Unsloth:

pip install unsloth

If you see issues, another option is to create a separate isolated environment:

python -m venv unsloth
source unsloth/bin/activate
pip install unsloth

Note that, depending on your system, the commands may be named pip3 or pip3.13 instead of pip, and python3 or python3.13 instead of python.
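Because the interpreter and pip names vary between systems, one way to avoid a mismatch is to locate an interpreter first and then run pip through it with `-m pip`. The following is a minimal bash sketch; `first_available` is a hypothetical helper written for this example, not part of pip or Unsloth.

```shell
# Hypothetical helper: print the first command name from the arguments
# that exists on PATH, or return non-zero if none of them do.
first_available() {
    local candidate
    for candidate in "$@"; do
        if command -v "$candidate" >/dev/null 2>&1; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}

if PY="$(first_available python3 python python3.13)"; then
    echo "Using interpreter: $PY"
    # Running pip as a module installs into this exact interpreter:
    # "$PY" -m pip install unsloth
else
    echo "No Python interpreter found on PATH" >&2
fi
```

Invoking pip as `"$PY" -m pip` sidesteps the question of whether `pip`, `pip3` or `pip3.13` belongs to the interpreter you intend to use.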

You might encounter some Xformers issues, in which case you should build it from source:

# First uninstall xformers installed by previous libraries
pip uninstall xformers -y

# Clone and build
pip install ninja
export TORCH_CUDA_ARCH_LIST="12.0"
git clone --depth=1 https://github.com/facebookresearch/xformers --recursive
cd xformers && python setup.py install && cd ..
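The value "12.0" corresponds to the compute capability that RTX 50-series cards report. If you are unsure what your GPU reports (data-center Blackwell parts such as B200 are expected to report 10.0, which you should verify on your own machine), the sketch below asks the driver instead of hard-coding the value. It assumes a recent driver whose `nvidia-smi` supports the `compute_cap` query field; `detect_arch_list` is a hypothetical helper name.

```shell
# Hypothetical helper: build a TORCH_CUDA_ARCH_LIST value from the compute
# capabilities of the installed GPUs. Falls back to the first argument
# (default "12.0") when nvidia-smi is missing or returns nothing.
detect_arch_list() {
    local fallback="${1:-12.0}"
    local caps=""
    if command -v nvidia-smi >/dev/null 2>&1; then
        # One capability per GPU, de-duplicated and joined with ';'
        caps="$(nvidia-smi --query-gpu=compute_cap --format=csv,noheader 2>/dev/null \
            | tr -d ' ' | sort -u | paste -sd ';' -)"
    fi
    printf '%s\n' "${caps:-$fallback}"
}

export TORCH_CUDA_ARCH_LIST="$(detect_arch_list)"
echo "TORCH_CUDA_ARCH_LIST=${TORCH_CUDA_ARCH_LIST}"
```

Run it before the xformers build so the compiled kernels target the architecture that is actually installed.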

Docker

unsloth/unsloth is Unsloth's only Docker image. For Blackwell and 50-series GPUs, use this same image - no separate image needed.

For installation instructions, please follow our Unsloth Docker guide.

uv

uv pip install unsloth

uv (Advanced)

The installation order is important, since we want to overwrite bundled dependencies with specific versions (namely, xformers and triton).

We prefer uv over pip because it is faster and better at resolving dependencies, especially for libraries that depend on torch but require a specific CUDA version, as in this scenario.
Install uv
```
curl -LsSf https://astral.sh/uv/install.sh | sh && source $HOME/.local/bin/env
```
Create a project dir and venv:
```
mkdir 'unsloth-blackwell' && cd 'unsloth-blackwell'
uv venv .venv --python=3.12 --seed
source .venv/bin/activate
```
Install vllm
```
uv pip install -U vllm --torch-backend=cu128
```
Note that we have to specify cu128; otherwise vllm will install torch==2.7.0 built against cu126.
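To confirm which CUDA build of torch ended up in the environment, you can read `torch.version.cuda` after the install. The sketch below assumes the interpreter is reachable as `python` (override it with the `PYTHON` variable); `torch_cuda_version` and `expect_cuda` are hypothetical helper names.

```shell
# Print the CUDA version torch was built against, or nothing if torch
# is not importable in the selected interpreter.
torch_cuda_version() {
    "${PYTHON:-python}" -c 'import torch; print(torch.version.cuda or "")' 2>/dev/null
}

# Compare an expected CUDA version ($1) with the one found ($2).
expect_cuda() {
    local expected="$1" actual="$2"
    if [ "$actual" = "$expected" ]; then
        echo "OK: torch built with CUDA $actual"
    else
        echo "MISMATCH: expected CUDA $expected, found '${actual:-none}'"
        return 1
    fi
}

expect_cuda "12.8" "$(torch_cuda_version)" || true
```

A mismatch here usually means the torch backend flag was dropped or a later install replaced torch with a different build.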

Install unsloth dependencies

uv pip install unsloth unsloth_zoo bitsandbytes

If you run into dependency resolution issues caused by Xformers, you can also install Unsloth from source without Xformers:

uv pip install -qqq \
"unsloth_zoo[base] @ git+https://github.com/unslothai/unsloth-zoo" \
"unsloth[base] @ git+https://github.com/unslothai/unsloth"

Download and build xformers (Optional)
Xformers is optional, but it is definitely faster and uses less memory. We'll use PyTorch's native SDPA if you do not want Xformers. Building Xformers from source might be slow, so beware!
```
# First uninstall xformers installed by previous libraries
pip uninstall xformers -y

# Clone and build
pip install ninja
export TORCH_CUDA_ARCH_LIST="12.0"
git clone --depth=1 https://github.com/facebookresearch/xformers --recursive
cd xformers && python setup.py install && cd ..
```
Note that we have to explicitly set TORCH_CUDA_ARCH_LIST=12.0.
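After the build finishes, a quick check shows whether xformers is importable and which kernels it reports. This sketch relies on the `python -m xformers.info` command that xformers ships; `xformers_status` is a hypothetical wrapper, and the interpreter name `python` is an assumption you can override with `PYTHON`.

```shell
# Hypothetical wrapper: report xformers build details when it is importable,
# otherwise note that PyTorch's native SDPA will be used.
xformers_status() {
    local py="${PYTHON:-python}"
    if "$py" -c 'import xformers' 2>/dev/null; then
        "$py" -m xformers.info
    else
        echo "xformers not installed; PyTorch SDPA will be used"
    fi
}

xformers_status
```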
Install transformers
Install any transformers version, but it is best to get the latest.
```
uv pip install -U transformers
```

Conda or mamba (Advanced)

Install conda/mamba

curl -L -O "https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-$(uname)-$(uname -m).sh"

Run the installation script

bash Miniforge3-$(uname)-$(uname -m).sh

Create a conda or mamba environment

conda create --name unsloth-blackwell python==3.12 -y

Activate newly created environment

conda activate unsloth-blackwell
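Since every later step must run inside this environment, a small guard can catch a forgotten activation before anything gets installed into the wrong place. The sketch relies on the `CONDA_DEFAULT_ENV` variable that `conda activate` sets; `require_conda_env` is a hypothetical helper name.

```shell
# Hypothetical guard: succeed only when the named conda/mamba environment
# is the active one.
require_conda_env() {
    local wanted="$1"
    local active="${CONDA_DEFAULT_ENV:-}"
    if [ "$active" = "$wanted" ]; then
        echo "Active environment: $active"
    else
        echo "Expected environment '$wanted', but found '${active:-none}'" >&2
        return 1
    fi
}

require_conda_env unsloth-blackwell || echo "Run: conda activate unsloth-blackwell"
```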

Install vllm
Make sure you are inside the activated conda/mamba environment. You should see the name of your environment as a prefix in your terminal prompt, like this: (unsloth-blackwell)user@machine:
```
pip install -U vllm --extra-index-url https://download.pytorch.org/whl/cu128
```
Note that we have to specify cu128; otherwise vllm will install torch==2.7.0 built against cu126.
Install unsloth dependencies
Make sure you are inside the activated conda/mamba environment. You should see the name of your environment as a prefix in your terminal prompt, like this: (unsloth-blackwell)user@machine:
```
pip install unsloth unsloth_zoo bitsandbytes
```
Download and build xformers (Optional)
Xformers is optional, but it is definitely faster and uses less memory. We'll use PyTorch's native SDPA if you do not want Xformers. Building Xformers from source might be slow, so beware!
You should see the name of your environment as a prefix in your terminal prompt, like this: (unsloth-blackwell)user@machine:
```
# First uninstall xformers installed by previous libraries
pip uninstall xformers -y

# Clone and build
pip install ninja
export TORCH_CUDA_ARCH_LIST="12.0"
git clone --depth=1 https://github.com/facebookresearch/xformers --recursive
cd xformers && python setup.py install && cd ..
```
Note that we have to explicitly set TORCH_CUDA_ARCH_LIST=12.0.
Update triton
Make sure you are inside the activated conda/mamba environment. You should see the name of your environment as a prefix in your terminal prompt, like this: (unsloth-blackwell)user@machine:
```
# Quote the requirement so the shell does not treat ">" as a redirect
pip install -U "triton>=3.3.1"
```
triton>=3.3.1 is required for Blackwell support.
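To check that the installed triton meets this minimum, you can compare version strings with `sort -V`. This is a minimal sketch assuming a `sort` that supports `-V` (GNU coreutils does); `version_ge` is a hypothetical helper, and the version is read from the `Version:` line that `pip show` prints.

```shell
# Hypothetical helper: succeed when version $1 is greater than or equal to $2.
version_ge() {
    local lowest
    lowest="$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)"
    [ "$lowest" = "$2" ]
}

installed="$(pip show triton 2>/dev/null | awk '/^Version:/ { print $2 }')"
if [ -z "$installed" ]; then
    echo "triton is not installed in this environment"
elif version_ge "$installed" "3.3.1"; then
    echo "triton $installed is new enough for Blackwell"
else
    echo "triton $installed is too old; upgrade to 3.3.1 or later"
fi
```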
Install transformers
Install any transformers version, but it is best to get the latest.
```
pip install -U transformers
```

If you are using mamba as your package manager, just replace conda with mamba in all commands shown above.

WSL-Specific Notes

If you're using WSL (Windows Subsystem for Linux) and encounter issues during xformers compilation (reminder: Xformers is optional, but makes training faster), follow these additional steps:

Increase WSL Memory Limit

Create or edit the WSL configuration file:

# Create or edit .wslconfig in your Windows user directory
# (typically C:\Users\YourUsername\.wslconfig)

# Add these lines to the file
[wsl2]
# Minimum 16GB recommended for xformers compilation
memory=16GB
# Adjust based on your CPU cores
processors=4
swap=2GB
localhostForwarding=true

After making these changes, restart WSL from a Windows terminal (PowerShell or Command Prompt):

wsl --shutdown
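Once WSL is running again, you can confirm from inside the Linux shell that the new limits took effect. The sketch below reads `/proc/meminfo` and calls `nproc`; `total_mem_gib` is a hypothetical helper, and the reported memory may be slightly lower than the configured value.

```shell
# Hypothetical helper: print total memory in GiB (one decimal) from a
# meminfo-style file. The optional file argument defaults to /proc/meminfo
# and exists only so the helper can be tried on sample data.
total_mem_gib() {
    awk '/^MemTotal:/ { printf "%.1f\n", $2 / 1024 / 1024 }' "${1:-/proc/meminfo}"
}

echo "Memory visible to WSL: $(total_mem_gib 2>/dev/null) GiB"
echo "Processors visible to WSL: $(nproc 2>/dev/null || echo unknown)"
```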

Install xformers

Use the following command to install xformers with optimized compilation for WSL:

# Set CUDA architecture for Blackwell GPUs
export TORCH_CUDA_ARCH_LIST="12.0"

# Install xformers from source with optimized build flags
pip install -v --no-build-isolation -U git+https://github.com/facebookresearch/xformers.git@main#egg=xformers

The --no-build-isolation flag helps avoid potential build issues in WSL environments.
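If the build still runs out of memory, limiting the number of parallel compile jobs may help. PyTorch's C++ extension build honors the `MAX_JOBS` environment variable. The heuristic below (roughly one job per 4 GiB of RAM, capped at the core count) is an assumption for illustration rather than a measured recommendation, and `suggest_max_jobs` is a hypothetical helper.

```shell
# Hypothetical helper: suggest a MAX_JOBS value from the core count ($1)
# and total memory in whole GiB ($2). Assumes about 4 GiB per compile job.
suggest_max_jobs() {
    local cores="${1:-1}" mem_gib="${2:-0}"
    local jobs=$(( mem_gib / 4 ))
    if [ "$jobs" -lt 1 ]; then jobs=1; fi
    if [ "$jobs" -gt "$cores" ]; then jobs="$cores"; fi
    printf '%s\n' "$jobs"
}

cores="$(nproc 2>/dev/null || echo 1)"
mem_gib="$(awk '/^MemTotal:/ { print int($2 / 1024 / 1024) }' /proc/meminfo 2>/dev/null)"
export MAX_JOBS="$(suggest_max_jobs "$cores" "$mem_gib")"
echo "MAX_JOBS=${MAX_JOBS}"
```

Set `MAX_JOBS` in the same shell session before running the pip command above so the build picks it up.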

