👁️Vision Fine-tuning
Details on vision/multimodal fine-tuning with Unsloth
Fine-tuning adapts vision models to specific tasks and datasets, with use cases across many industries. We provide three example notebooks for vision fine-tuning:
Llama 3.2 Vision finetuning for radiography: Notebook. The goal is to help medical professionals analyze X-rays, CT scans, and ultrasounds faster.
Qwen 2 VL finetuning for converting handwriting to LaTeX: Notebook. This lets complex math formulas be transcribed as LaTeX without writing them out manually.
Pixtral 12B 2409 vision finetuning for general Q&A: Notebook. Concatenating general Q&A datasets with more niche datasets helps the finetuned model retain the base model's skills.
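The dataset-mixing idea mentioned for the Pixtral example can be sketched in plain Python. This is only an illustration, not the notebook's code: the helper name `mix_datasets`, the `general_fraction` parameter, and the seed value are all hypothetical, and the samples are treated as opaque items (for example, conversation dicts).

```python
import random


def mix_datasets(niche, general, general_fraction=0.25, seed=3407):
    """Concatenate niche samples with a random subsample of general samples.

    `general_fraction` is the approximate share of general samples in the
    final mix. Hypothetical helper; names and defaults are illustrative.
    """
    if not 0 <= general_fraction < 1:
        raise ValueError("general_fraction must be in [0, 1)")
    rng = random.Random(seed)
    # Solve n_general / (n_niche + n_general) = general_fraction for n_general,
    # capped at the number of general samples actually available.
    wanted = round(len(niche) * general_fraction / (1 - general_fraction))
    n_general = min(len(general), wanted)
    mixed = list(niche) + rng.sample(list(general), n_general)
    # Shuffle so niche and general samples are interleaved during training.
    rng.shuffle(mixed)
    return mixed


# Toy stand-ins for real samples: 30 niche items, 100 general Q&A items.
niche_samples = [("niche", i) for i in range(30)]
general_samples = [("general", i) for i in range(100)]
mixed_samples = mix_datasets(niche_samples, general_samples)
```

The right mixing ratio depends on the task; the 25% used here is a placeholder, not a recommendation from the notebook.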
To finetune vision models, you can now select which parts of the model to finetune: only the vision layers, only the language layers, or the attention / MLP layers. All of them are enabled by default.
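As a rough sketch of how this selection might look in code, the snippet below builds the component flags as a dictionary and passes them to `FastVisionModel.get_peft_model`. The flag names follow those used in the Unsloth vision notebooks, but treat them as an assumption and check them against your installed version. The helper names `lora_target_flags` and `add_lora_adapters`, and the `r` / `lora_alpha` values, are hypothetical and chosen only for illustration.

```python
def lora_target_flags(vision=True, language=True, attention=True, mlp=True):
    """Return the component-selection flags; everything is on by default."""
    return {
        "finetune_vision_layers": vision,        # vision encoder layers
        "finetune_language_layers": language,    # language model layers
        "finetune_attention_modules": attention, # attention projections
        "finetune_mlp_modules": mlp,             # MLP / feed-forward blocks
    }


def add_lora_adapters(model, **component_flags):
    """Attach LoRA adapters to an already loaded vision model.

    Unsloth is imported lazily so the flag helper above can be used
    (and inspected) on machines without a GPU or Unsloth installed.
    """
    from unsloth import FastVisionModel

    return FastVisionModel.get_peft_model(
        model,
        r=16,           # illustrative LoRA rank, not a recommendation
        lora_alpha=16,  # illustrative scaling factor
        **lora_target_flags(**component_flags),
    )


# Example: train only the language side and leave the vision layers untouched.
language_only = lora_target_flags(vision=False)
```

With a model loaded through `FastVisionModel.from_pretrained`, calling `add_lora_adapters(model, vision=False)` would then request adapters on the language layers only.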