What Model Should I Use?
When preparing for fine-tuning, one of the first decisions you'll face is selecting the right model. Here's a step-by-step guide to help you choose:
Use Case: Match the model to your data. For image-based training, select a vision model such as Llama 3.2 Vision; for code datasets, opt for a specialized model like Qwen 2.5 Coder.
Licensing and Requirements: Different models may have specific licensing terms and requirements. Be sure to review these carefully to avoid legal or compatibility issues.
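To make this selection step concrete, here is a minimal sketch in plain Python. The repo ids are illustrative examples only (not recommendations from this guide); verify each name, its license, and its hardware requirements on the Hugging Face Hub before committing to it.

```python
# Minimal sketch: map a use case to a candidate checkpoint.
# The repo ids below are illustrative assumptions -- check each model card
# on the Hugging Face Hub for the exact name, license, and requirements.
CANDIDATES = {
    "vision": "meta-llama/Llama-3.2-11B-Vision-Instruct",  # image-based training
    "code": "Qwen/Qwen2.5-Coder-7B-Instruct",              # code datasets
    "general": "meta-llama/Llama-3.1-8B-Instruct",         # general text tasks
}

def pick_model(use_case: str) -> str:
    """Return a starting-point checkpoint for the given use case."""
    try:
        return CANDIDATES[use_case]
    except KeyError as exc:
        raise ValueError(f"Unknown use case: {use_case!r}") from exc

print(pick_model("code"))  # -> Qwen/Qwen2.5-Coder-7B-Instruct
```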
Beyond picking a model family, you'll also need to decide whether to fine-tune an instruct model or a base model.
Instruct models have already been fine-tuned to follow instructions, making them usable without any further fine-tuning. These models, including the GGUF builds commonly available, are optimized for direct use and respond effectively to prompts right out of the box.
Base models, on the other hand, are the original pre-trained versions without instruction fine-tuning. These are specifically designed for customization through fine-tuning, allowing you to adapt them to your unique needs.
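In practice the two variants usually live under near-identical repo ids that differ only by an instruct-style suffix. The sketch below uses the Hugging Face `transformers` API to show the practical difference; the repo ids are assumptions you should verify on the Hub (and they may be gated behind a license acceptance).

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Illustrative repo ids -- verify the exact names (and accept any license gate) on the Hub.
BASE_MODEL = "meta-llama/Llama-3.1-8B"               # base: plain next-token prediction
INSTRUCT_MODEL = "meta-llama/Llama-3.1-8B-Instruct"  # instruct: tuned to follow prompts

tokenizer = AutoTokenizer.from_pretrained(INSTRUCT_MODEL)
model = AutoModelForCausalLM.from_pretrained(INSTRUCT_MODEL)

# Instruct checkpoints ship a chat template, so prompts are wrapped as messages:
messages = [{"role": "user", "content": "Summarise this ticket in one sentence."}]
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)

# A base checkpoint (BASE_MODEL) has no such template: you feed it raw text and it
# simply continues that text, which is why it is the blank slate you shape with
# your own fine-tuning data and formatting.
```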
The decision often depends on the quantity, quality, and type of your data (a short code sketch after this list encodes the rule of thumb):
1,000+ Rows of Data: If you have a large dataset with over 1,000 rows, it's generally best to fine-tune the base model.
300–1,000 Rows of High-Quality Data: With a medium-sized, high-quality dataset, fine-tuning either the base or the instruct model is a viable option.
Less than 300 Rows: For smaller datasets, the instruct model is typically the better choice. Fine-tuning the instruct model enables it to align with specific needs while preserving its built-in instructional capabilities. This ensures it can follow general instructions without additional input unless you intend to significantly alter its functionality.
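The thresholds above can be written down as a small helper. This is just the rule of thumb from this section encoded in Python; the function name and exact cut-offs are illustrative guidance, not hard rules.

```python
def recommended_variant(num_rows: int, high_quality: bool = True) -> str:
    """Suggest which variant to fine-tune based on dataset size.

    Encodes the heuristic above: large datasets favour the base model,
    medium high-quality datasets can go either way, and small datasets
    favour the instruct model so its instruction-following is preserved.
    """
    if num_rows >= 1000:
        return "base"
    if num_rows >= 300 and high_quality:
        return "base or instruct"
    return "instruct"


print(recommended_variant(250))    # -> instruct
print(recommended_variant(500))    # -> base or instruct
print(recommended_variant(5000))   # -> base
```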
For more information on how big your dataset should be, see the dataset guidelines.