
🖥️ Inference & Deployment

Learn how to save your fine-tuned model so you can run it in your favorite inference engine.

You can also run your fine-tuned models by using Unsloth's 2x faster inference.

llama.cpp - Saving to GGUF

vLLM

SGLang

Unsloth Inference

Troubleshooting

vLLM Engine Arguments

LoRA Hotswapping


Last updated 19 days ago
