Fine-tuning for GPT-3.5
It won't teach your GPT new tricks, but it can make it faster and more predictable.
Have you seen the news about the fine-tuning of GPT-3.5?
In summary:
- This isn't the fine-tuning we're all used to!
- Training costs are relatively low: $0.008 per 1K tokens.
- Using the fine-tuned model is 8x more expensive than the base GPT-3.5-Turbo.
The announcement can be found on OpenAI's blog.
There's no additional cost for keeping the fine-tuned model deployed. This suggests it's some variant of LoRA adapters plus token manipulation: small adapters can be swapped in on shared infrastructure, so OpenAI doesn't have to dedicate hardware to each customer's copy of the weights.
This fine-tuning tailors the model for a specific task, making it more specialized. You can't teach it new facts easily, and it doesn't replace information retrieval. For more details, see OpenAI's documentation on fine-tuning.
Why go for such tuning? To save on prompt tokens! If there's a standard task that requires a lengthy prompt, a specific output format, or a certain response style, you can fine-tune GPT-3.5 for that task. Then you won't have to send as many few-shot examples with every request.
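For illustration, here's a minimal sketch of that workflow with the openai Python SDK (v1 style). The JSONL chat format is OpenAI's documented fine-tuning format; the file name, the invoice-extraction task, and the fine-tuned model ID are hypothetical.

```python
# train.jsonl -- each line is one training example in OpenAI's chat format
# (shown wrapped in this comment for readability; real JSONL is one object per line).
# The system prompt and examples bake the task into the model, so at
# inference time you no longer need to resend few-shot examples:
# {"messages": [
#   {"role": "system", "content": "Extract invoice fields as JSON."},
#   {"role": "user", "content": "Invoice #42 from ACME, total $19.99"},
#   {"role": "assistant", "content": "{\"number\": 42, \"vendor\": \"ACME\", \"total\": 19.99}"}
# ]}

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Upload the training file and start a fine-tuning job.
training_file = client.files.create(
    file=open("train.jsonl", "rb"), purpose="fine-tune"
)
job = client.fine_tuning.jobs.create(
    training_file=training_file.id, model="gpt-3.5-turbo"
)

# Once the job finishes, call the resulting model with a *short* prompt,
# no few-shot examples attached (the model ID below is a placeholder).
response = client.chat.completions.create(
    model="ft:gpt-3.5-turbo:my-org::abc123",
    messages=[{"role": "user", "content": "Invoice #7 from Initech, total $300"}],
)
print(response.choices[0].message.content)
```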
Since every token you send to the fine-tuned model costs roughly 8x more, the tuning pays off only when it lets you compress the input prompt by more than 8x. As a bonus, the shorter prompt means faster response times.
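A back-of-the-envelope check (prices per 1K input tokens as quoted at the 2023 launch; the token counts are illustrative assumptions, not measurements):

```python
# When does fine-tuning beat a long few-shot prompt on input cost?
BASE_PRICE = 0.0015   # $/1K input tokens, base gpt-3.5-turbo
FT_PRICE = 0.012      # $/1K input tokens, fine-tuned model (8x)

long_prompt = 2000    # few-shot prompt sent to the base model
short_prompt = 150    # compressed prompt sent to the fine-tuned model

base_cost = long_prompt / 1000 * BASE_PRICE
ft_cost = short_prompt / 1000 * FT_PRICE

print(f"base: ${base_cost:.4f}  fine-tuned: ${ft_cost:.4f}")
# base: $0.0030  fine-tuned: $0.0018
# Here the prompt shrank ~13x, beating the 8x break-even,
# so the fine-tuned call is cheaper per request.
```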
This also pushes prompt-engineering libraries like Microsoft's Guidance or TypeChat further toward obsolescence.