Troubleshooting
Saving to safetensors, not bin format in Colab
In Colab we save to .bin by default because it is roughly 4x faster. To force saving to .safetensors, set safe_serialization = None: use model.save_pretrained(..., safe_serialization = None) or model.push_to_hub(..., safe_serialization = None).
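As a minimal sketch of the call above, the helper below wraps save_pretrained so the safetensors setting is applied in one place. The function name save_as_safetensors and the output_dir argument are hypothetical, chosen only for illustration; the model is assumed to be an already loaded model object whose save_pretrained accepts safe_serialization as described here.

```python
def save_as_safetensors(model, output_dir):
    """Save `model` to `output_dir`, requesting .safetensors output.

    Hypothetical helper: `model` is assumed to expose save_pretrained
    with the safe_serialization argument described in this section.
    """
    # safe_serialization=None is the value this page gives for forcing
    # .safetensors instead of the faster .bin default used in Colab.
    model.save_pretrained(output_dir, safe_serialization=None)
    return output_dir
```

The same keyword would be passed to model.push_to_hub(...) when uploading instead of saving locally.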
If saving to GGUF or vLLM 16bit crashes
You can try reducing the maximum GPU usage during saving by changing maximum_memory_usage. The default is model.save_pretrained(..., maximum_memory_usage = 0.75). Reduce it to, say, 0.5 to use 50% of peak GPU memory, or lower. This can reduce OOM crashes during saving.
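One way to apply this advice is to retry the save with progressively smaller values. The sketch below is only an illustration: the helper name save_with_memory_backoff and the fraction sequence are hypothetical, and it assumes the out-of-memory failure surfaces as a Python RuntimeError whose message contains "out of memory". If the runtime is killed outright, nothing can be caught, and you would need to set a lower value manually before saving.

```python
def save_with_memory_backoff(model, output_dir, fractions=(0.75, 0.5, 0.25)):
    """Try saving with each memory fraction in turn until one succeeds.

    Hypothetical helper: `model` is assumed to expose save_pretrained
    with the maximum_memory_usage argument described in this section.
    Returns the fraction that worked.
    """
    last_error = None
    for fraction in fractions:
        try:
            model.save_pretrained(output_dir, maximum_memory_usage=fraction)
            return fraction
        except RuntimeError as error:
            # Only retry on out-of-memory errors; re-raise anything else.
            if "out of memory" not in str(error).lower():
                raise
            last_error = error
    raise RuntimeError("saving failed at every memory fraction tried") from last_error
```

Starting from the default of 0.75 keeps saving as fast as possible when memory allows, and falls back to lower values only when needed.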