Chapter 67: Model Merging & Optimization
Blend strengths and shrink costs. This chapter builds a model-merging skill to combine checkpoints, apply adapters, and optimize latency and cost for your Task API use cases.
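As a taste of what "combine checkpoints" means in weight space, here is a minimal sketch of a linear merge that interpolates two compatible checkpoints. The file paths and the mixing weight `alpha` are placeholder assumptions, and both checkpoints are assumed to be plain state dicts with identical parameter names:

```python
# Minimal sketch: linear interpolation of two checkpoints' weights.
# Paths and the mixing weight `alpha` are placeholders for illustration.
import torch

alpha = 0.5  # 0.0 = all of checkpoint A, 1.0 = all of checkpoint B

state_a = torch.load("checkpoint_a.pt", map_location="cpu")
state_b = torch.load("checkpoint_b.pt", map_location="cpu")

# Assumes both checkpoints share the same architecture and parameter names.
merged = {
    name: (1 - alpha) * state_a[name] + alpha * state_b[name]
    for name in state_a
}

torch.save(merged, "merged_checkpoint.pt")
```

Linear merging only makes sense between models fine-tuned from the same base; sweeping `alpha` against your eval set is the usual way to pick the mix.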
Goals
- Understand when merging beats retraining
- Apply adapter/LoRA merges and safety checks (adapter-merge sketch after this list)
- Evaluate merged models for quality and regressions
- Optimize latency/cost with quantization and runtime tuning
- Capture repeatable merge/optimize steps in a skill
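To ground the adapter-merge goal, here is a minimal sketch using Hugging Face PEFT's `merge_and_unload()`, which folds LoRA deltas into the base weights. The model ID and adapter path are placeholders:

```python
# Sketch: fold a LoRA adapter into its base model with Hugging Face PEFT.
# "base-model-id" and "path/to/lora-adapter" are placeholders.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained("base-model-id")
model = PeftModel.from_pretrained(base, "path/to/lora-adapter")

# merge_and_unload() bakes the adapter deltas into the base weights
# and returns a plain transformers model (no PEFT wrapper).
merged = model.merge_and_unload()

merged.save_pretrained("merged-model")
AutoTokenizer.from_pretrained("base-model-id").save_pretrained("merged-model")
```

Merging removes the adapter overhead at inference time, but rerun safety and quality checks afterward: baked-in deltas can shift behaviors the adapter-wrapped model handled correctly.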
Lesson Progression
- Build the model-merging skill
- Merging strategies and tooling
- Quality/safety evaluation of merged models
- Optimization (quantization, runtime tweaks; 4-bit loading sketched after this list)
- Capstone: merged/optimized model for Task API; finalize the skill
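For the optimization lesson, one common latency/cost lever is 4-bit loading. A minimal sketch with transformers and bitsandbytes, assuming a CUDA GPU and reusing the placeholder `merged-model` path from the earlier sketch:

```python
# Sketch: load the merged model 4-bit quantized via bitsandbytes.
# Assumes a CUDA GPU; "merged-model" is the placeholder path used above.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",           # NormalFloat4 quantization
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,      # also quantize the quantization constants
)

model = AutoModelForCausalLM.from_pretrained(
    "merged-model",
    quantization_config=bnb_config,
    device_map="auto",
)
```

Quantization trades a small amount of quality for memory and throughput; measure both on your Task API eval set rather than assuming the trade is free.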
Outcome & Method
You finish with a merged/optimized model tuned for your workload and a reusable model-merging skill.
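Before treating the capstone as done, a quick regression smoke test helps catch obvious merge or quantization damage. A hedged sketch; the prompts and model IDs are illustrative placeholders:

```python
# Sketch: regression smoke test comparing base vs. merged outputs.
# Prompts and model IDs are illustrative placeholders.
from transformers import AutoModelForCausalLM, AutoTokenizer

prompts = [
    "Create a task: buy groceries tomorrow at 9am.",
    "List my open tasks.",
]

def sample(model_id: str) -> list[str]:
    tok = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")
    outs = []
    for p in prompts:
        ids = tok(p, return_tensors="pt").to(model.device)
        gen = model.generate(**ids, max_new_tokens=64, do_sample=False)
        outs.append(tok.decode(gen[0], skip_special_tokens=True))
    return outs

for base_out, merged_out in zip(sample("base-model-id"), sample("merged-model")):
    print("CHANGED" if base_out != merged_out else "same", "|", merged_out[:80])
```

A changed output is not automatically a regression; review flagged diffs against the quality/safety criteria from the evaluation lesson.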
Prerequisites
- Chapters 63-66 (data, SFT, persona, function calling)