Fine Tuning

gllm-training | Fine tuning Guidelines

What is Fine Tuning?

Fine tuning adapts a pre-trained model to perform better on specific tasks or domains by training it on a smaller, specialized dataset. This helps make the model's responses more accurate, relevant, and tailored to particular use cases or requirements. The fine-tuning techniques used in our SDK include:

  1. Supervised Fine Tuning - Trains models on labeled input-output pairs to improve performance on a specific task.

  2. Group Relative Policy Optimization (GRPO) - A reinforcement learning-based method that trains models to maximize reward functions across groups of candidate responses. This enables models to learn preference-aligned behaviors directly from reward signals instead of relying on explicit input-output pairs.
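To make the "groups of candidate responses" idea concrete, the sketch below shows the group-relative step that gives GRPO its name: each candidate's reward is compared against the mean of its own group, so a response is reinforced only to the extent that it beats its siblings. This is a minimal illustration, not the SDK's API. The function name `group_relative_advantages`, the `eps` parameter, the use of the population standard deviation, and the reward values are all assumptions made for the example.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize rewards within one group of candidate responses.

    Hypothetical helper for illustration only; it is not part of the SDK.
    Each reward is centered on the group mean and scaled by the group's
    population standard deviation (eps avoids division by zero when all
    rewards in the group are identical).
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Made-up rewards for four candidate responses to the same prompt.
rewards = [1.0, 0.0, 0.5, 0.5]
advantages = group_relative_advantages(rewards)

# Candidates above the group mean get a positive advantage,
# candidates below it get a negative one.
for reward, advantage in zip(rewards, advantages):
    print(f"reward={reward:.2f} advantage={advantage:+.3f}")
```

In a full training loop these advantages would weight the policy-gradient update for each candidate's tokens. That step is omitted here because it depends on the model and training framework in use.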
