
Instruction Formats

Your training data must follow a specific format that the training framework understands. Choose the wrong one and training fails silently: the model learns nothing useful.

This lesson covers the three dominant formats: Alpaca (simple, single-turn), ShareGPT (conversational, multi-turn), and ChatML (role-structured, API-compatible). You'll know when to use each and how to convert between them.


The Format Problem

Training frameworks don't read natural conversation. They need structured data that clearly separates:

  1. What the user said (input)
  2. What the model should say (output)
  3. Any system instructions (context)

Different frameworks expect different structures. Pick the wrong format, and your carefully curated data is useless.


Format 1: Alpaca (Single-Turn Simplicity)

The Alpaca format is the simplest instruction format, designed for single-turn interactions.

Structure

```json
{
  "instruction": "What the user wants",
  "input": "Optional context or data",
  "output": "What the model should respond with"
}
```

Real Example: Task API

```json
{
  "instruction": "Create a task to buy groceries",
  "input": "",
  "output": "Created task: 'Buy groceries' | Priority: medium | Due: not set"
}
```

With optional input:

```json
{
  "instruction": "Summarize this task",
  "input": "Task: Review Q4 budget, Priority: high, Due: 2025-01-15, Status: pending",
  "output": "High-priority task to review Q4 budget, due January 15th, currently pending."
}
```

When to Use Alpaca

| Use Case | Why Alpaca Works |
| --- | --- |
| Command-response patterns | One input, one output, no context needed |
| Classification tasks | Input + label, no conversation |
| Single-turn Q&A | Question → answer with no follow-up |
| Code generation from spec | Specification → code, standalone |

When NOT to Use Alpaca

| Use Case | Why Alpaca Fails |
| --- | --- |
| Multi-turn dialogue | No way to represent conversation history |
| Context-dependent responses | Can't track previous turns |
| Clarification workflows | Can't model "ask for more info" patterns |

Alpaca Format Schema (Pydantic)

```python
from pydantic import BaseModel

class AlpacaExample(BaseModel):
    """Single-turn instruction format."""
    instruction: str
    input: str = ""  # Optional context
    output: str

# Validation example
example = AlpacaExample(
    instruction="Create a task to buy groceries",
    input="",
    output="Created task: 'Buy groceries' | Priority: medium | Due: not set",
)
```

Output:

```text
AlpacaExample(instruction='Create a task to buy groceries', input='', output="Created task: 'Buy groceries' | Priority: medium | Due: not set")
```
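Before training, it pays to run the whole dataset file through this schema rather than trusting it by eye. A minimal sketch, assuming the `AlpacaExample` class above and a hypothetical file `alpaca_train.jsonl` (one JSON object per line):

```python
import json
from pydantic import ValidationError

def validate_alpaca_file(path: str) -> int:
    """Validate every line of a JSONL file against AlpacaExample.

    Returns the count of valid examples and prints each failure.
    """
    valid = 0
    with open(path, encoding="utf-8") as f:
        for line_no, line in enumerate(f, start=1):
            try:
                AlpacaExample(**json.loads(line))
                valid += 1
            except (json.JSONDecodeError, ValidationError) as err:
                print(f"Line {line_no}: {err}")
    return valid

print(validate_alpaca_file("alpaca_train.jsonl"))  # hypothetical path
```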

Format 2: ShareGPT (Multi-Turn Conversations)

The ShareGPT format handles multi-turn conversations with alternating human/assistant messages.

Structure

```json
{
  "conversations": [
    {"from": "human", "value": "First user message"},
    {"from": "gpt", "value": "First assistant response"},
    {"from": "human", "value": "Follow-up question"},
    {"from": "gpt", "value": "Follow-up response"}
  ]
}
```

Real Example: Task API with Clarification

```json
{
  "conversations": [
    {
      "from": "human",
      "value": "Create a task for the budget review"
    },
    {
      "from": "gpt",
      "value": "I'll create a task for the budget review. When is it due?"
    },
    {
      "from": "human",
      "value": "Friday, and make it high priority"
    },
    {
      "from": "gpt",
      "value": "Created task: 'Budget review' | Priority: high | Due: 2025-01-17 (Friday)"
    }
  ]
}
```

When to Use ShareGPT

| Use Case | Why ShareGPT Works |
| --- | --- |
| Conversational assistants | Natural back-and-forth dialogue |
| Clarification workflows | Model learns to ask for missing info |
| Context-dependent tasks | Each turn builds on previous |
| Real-world chat interfaces | Matches production conversation flow |

When NOT to Use ShareGPT

| Use Case | Why ShareGPT Is Overkill |
| --- | --- |
| Simple command → response | Extra structure adds no value |
| Classification tasks | Single input/output, no conversation |
| Batch processing | No dialogue needed |

ShareGPT Format Schema (Pydantic)

```python
from pydantic import BaseModel, Field, validator
from typing import List, Literal

class ShareGPTMessage(BaseModel):
    """Single message in a conversation."""
    # 'from' is a reserved word in Python, so the field is aliased
    from_: Literal["human", "gpt"] = Field(alias="from")
    value: str

class ShareGPTExample(BaseModel):
    """Multi-turn conversation format."""
    conversations: List[ShareGPTMessage]

    @validator("conversations")
    def validate_alternation(cls, v):
        """Ensure human/gpt turns alternate, starting with human."""
        for i, msg in enumerate(v):
            expected = "human" if i % 2 == 0 else "gpt"
            if msg.from_ != expected:
                raise ValueError(f"Turn {i} should be {expected}")
        return v
```

Output when validated:

```text
ShareGPTExample(conversations=[...4 messages...])
```
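Because the JSON key is `from` but the Python field is `from_`, raw dictionaries should be parsed through the model rather than constructed field by field. A short usage sketch (Pydantic v1's `parse_obj`, matching the schema style above):

```python
raw = {
    "conversations": [
        {"from": "human", "value": "Create a task for the budget review"},
        {"from": "gpt", "value": "I'll create a task for the budget review. When is it due?"},
        {"from": "human", "value": "Friday, and make it high priority"},
        {"from": "gpt", "value": "Created task: 'Budget review' | Priority: high | Due: 2025-01-17 (Friday)"},
    ]
}

# parse_obj resolves the 'from' alias onto the from_ field
example = ShareGPTExample.parse_obj(raw)
print(len(example.conversations))  # 4
```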

Format 3: ChatML (Role-Based Structure)

The ChatML format uses explicit role markers and is compatible with OpenAI's API structure.

Structure

```text
<|im_start|>system
You are a task management assistant for the Task API.
<|im_end|>
<|im_start|>user
Create a task to buy groceries
<|im_end|>
<|im_start|>assistant
Created task: 'Buy groceries' | Priority: medium | Due: not set
<|im_end|>
```

JSON Representation

```json
{
  "messages": [
    {"role": "system", "content": "You are a task management assistant..."},
    {"role": "user", "content": "Create a task to buy groceries"},
    {"role": "assistant", "content": "Created task: 'Buy groceries' | Priority: medium | Due: not set"}
  ]
}
```
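The token form and the JSON form are mechanically equivalent. In a real pipeline the tokenizer's chat template performs this rendering, and the exact special tokens vary by model; the sketch below only illustrates the mapping:

```python
def render_chatml(example: dict) -> str:
    """Flatten a messages dict into ChatML text (illustrative only)."""
    return "\n".join(
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>"
        for m in example["messages"]
    )
```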

When to Use ChatML

| Use Case | Why ChatML Works |
| --- | --- |
| OpenAI API compatibility | Matches their fine-tuning format exactly |
| System prompt injection | Clear place for persona/instructions |
| Role-based conversations | system/user/assistant separation |
| Production API deployment | Same format as inference calls |

When NOT to Use ChatML

| Use Case | Why ChatML Is Unnecessary |
| --- | --- |
| Training without system prompts | Extra structure adds complexity |
| Non-OpenAI frameworks | May not support ChatML natively |
| Simple Alpaca-style tasks | Overkill for single-turn |

ChatML Format Schema (Pydantic)

```python
from pydantic import BaseModel, validator
from typing import List, Literal

class ChatMLMessage(BaseModel):
    """Single message in ChatML format."""
    role: Literal["system", "user", "assistant"]
    content: str

class ChatMLExample(BaseModel):
    """OpenAI-compatible chat format."""
    messages: List[ChatMLMessage]

    @validator("messages")
    def validate_structure(cls, v):
        """Ensure valid message structure."""
        if not v:
            raise ValueError("Must have at least one message")

        # System message (if present) must be first
        if any(m.role == "system" for m in v[1:]):
            raise ValueError("System message must be first")

        # Last message should be an assistant turn (the response we train on)
        if v[-1].role != "assistant":
            raise ValueError("Last message must be assistant response")

        return v
```
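A quick check that the validator behaves as intended, reusing the JSON representation from earlier; the second example is deliberately malformed:

```python
from pydantic import ValidationError

good = ChatMLExample(messages=[
    {"role": "system", "content": "You are a task management assistant."},
    {"role": "user", "content": "Create a task to buy groceries"},
    {"role": "assistant", "content": "Created task: 'Buy groceries' | Priority: medium | Due: not set"},
])

try:
    ChatMLExample(messages=[
        {"role": "user", "content": "Hi"},
        {"role": "system", "content": "Too late for a system prompt"},
    ])
except ValidationError as err:
    print(err)  # fails the "system message must be first" check
```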

Format Comparison Table

| Aspect | Alpaca | ShareGPT | ChatML |
| --- | --- | --- | --- |
| Turns | Single | Multi | Multi |
| Roles | instruction/input/output | human/gpt | system/user/assistant |
| System prompt | No | No | Yes |
| Best for | Commands, classification | Dialogue, clarification | OpenAI fine-tuning |
| Framework support | Axolotl, Unsloth, most | Unsloth, some adapters | OpenAI, Unsloth |

Format Selection Decision Tree

```text
                  Start
                    │
        Multi-turn conversation?
             ┌──────┴──────┐
            Yes            No
             │              │
    Need system prompt?   Alpaca
        ┌────┴────┐       (simple commands,
       Yes        No       classification,
        │          │       single-turn Q&A)
     ChatML    ShareGPT
        │          │
    (OpenAI    (dialogue,
  fine-tuning) clarification)
```
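The same logic can live in code when you batch-process mixed datasets. A tiny helper encoding the tree above (names are illustrative):

```python
def choose_format(multi_turn: bool, needs_system_prompt: bool) -> str:
    """Encode the decision tree: Alpaca, ShareGPT, or ChatML."""
    if not multi_turn:
        return "alpaca"
    return "chatml" if needs_system_prompt else "sharegpt"

print(choose_format(multi_turn=True, needs_system_prompt=False))  # sharegpt
```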

Converting Between Formats

When your data is in one format but your framework needs another, convert programmatically:

Alpaca to ShareGPT

```python
def alpaca_to_sharegpt(alpaca: dict) -> dict:
    """Convert Alpaca format to ShareGPT format."""
    instruction = alpaca["instruction"]
    input_text = alpaca.get("input", "")
    output = alpaca["output"]

    # Combine instruction and input into one human message
    if input_text:
        human_message = f"{instruction}\n\nContext: {input_text}"
    else:
        human_message = instruction

    return {
        "conversations": [
            {"from": "human", "value": human_message},
            {"from": "gpt", "value": output},
        ]
    }

# Example usage
alpaca_example = {
    "instruction": "Create a task to buy groceries",
    "input": "",
    "output": "Created task: 'Buy groceries'",
}

sharegpt_example = alpaca_to_sharegpt(alpaca_example)
print(sharegpt_example)
```

Output:

```json
{
  "conversations": [
    {"from": "human", "value": "Create a task to buy groceries"},
    {"from": "gpt", "value": "Created task: 'Buy groceries'"}
  ]
}
```

ShareGPT to ChatML

```python
def sharegpt_to_chatml(sharegpt: dict, system_prompt: str = "") -> dict:
    """Convert ShareGPT format to ChatML format."""
    messages = []

    # Add system prompt if provided
    if system_prompt:
        messages.append({"role": "system", "content": system_prompt})

    # Map human -> user and gpt -> assistant
    for msg in sharegpt["conversations"]:
        role = "user" if msg["from"] == "human" else "assistant"
        messages.append({"role": role, "content": msg["value"]})

    return {"messages": messages}

# Example usage
sharegpt_example = {
    "conversations": [
        {"from": "human", "value": "Create a task to buy groceries"},
        {"from": "gpt", "value": "Created task: 'Buy groceries'"},
    ]
}

chatml_example = sharegpt_to_chatml(
    sharegpt_example,
    system_prompt="You are a task management assistant.",
)
print(chatml_example)
```

Output:

```json
{
  "messages": [
    {"role": "system", "content": "You are a task management assistant."},
    {"role": "user", "content": "Create a task to buy groceries"},
    {"role": "assistant", "content": "Created task: 'Buy groceries'"}
  ]
}
```
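Going the other direction is lossy: Alpaca has no system field and no conversation history, so only single-turn ChatML examples convert cleanly. A hedged sketch that folds any system prompt into the instruction (one possible convention, not a standard):

```python
def chatml_to_alpaca(chatml: dict) -> dict:
    """Convert a single-turn ChatML example to Alpaca (lossy sketch)."""
    messages = chatml["messages"]
    system = next((m["content"] for m in messages if m["role"] == "system"), "")
    user_turns = [m["content"] for m in messages if m["role"] == "user"]
    gpt_turns = [m["content"] for m in messages if m["role"] == "assistant"]

    if len(user_turns) != 1 or len(gpt_turns) != 1:
        raise ValueError("Alpaca can only represent single-turn conversations")

    # Fold the system prompt into the instruction, since Alpaca has no system field
    instruction = f"{system}\n\n{user_turns[0]}" if system else user_turns[0]
    return {"instruction": instruction, "input": "", "output": gpt_turns[0]}
```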

Designing Multi-Turn Conversations

For ShareGPT and ChatML formats, you'll design multi-turn conversations. Here are patterns that work:

Pattern 1: Clarification Flow

```json
{
  "conversations": [
    {"from": "human", "value": "Create a task for the meeting"},
    {"from": "gpt", "value": "I can create a task for the meeting. What's the due date?"},
    {"from": "human", "value": "Next Monday, high priority"},
    {"from": "gpt", "value": "Created task: 'Meeting' | Priority: high | Due: 2025-01-20 (Monday)"}
  ]
}
```

Why it works: Models learn to ask for missing information instead of hallucinating.

Pattern 2: Confirmation Flow

```json
{
  "conversations": [
    {"from": "human", "value": "Delete all completed tasks"},
    {"from": "gpt", "value": "You have 12 completed tasks. Are you sure you want to delete all of them?"},
    {"from": "human", "value": "Yes, delete them"},
    {"from": "gpt", "value": "Deleted 12 completed tasks."}
  ]
}
```

Why it works: Models learn to confirm destructive actions.

Pattern 3: Error Recovery Flow

```json
{
  "conversations": [
    {"from": "human", "value": "Update the budget task priority to urgent"},
    {"from": "gpt", "value": "I don't see a task called 'budget'. Did you mean 'Q4 budget review' or 'Annual budget planning'?"},
    {"from": "human", "value": "Q4 budget review"},
    {"from": "gpt", "value": "Updated task: 'Q4 budget review' | Priority: changed to high (Note: 'urgent' maps to high priority)"}
  ]
}
```

Why it works: Models learn graceful error handling and disambiguation.


Framework-Specific Requirements

Different training frameworks have format preferences:

| Framework | Preferred Format | Notes |
| --- | --- | --- |
| Unsloth | ShareGPT | Native support, efficient batching |
| Axolotl | Alpaca or ChatML | Flexible, many adapters |
| OpenAI Fine-tuning | ChatML (JSONL) | Must match exactly |
| Hugging Face TRL | Flexible | Use datasets library |
For this book's examples (Unsloth on Colab Free Tier), we use ShareGPT format.
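As a concrete starting point, a ShareGPT JSONL file (one example per line) can be loaded with the Hugging Face datasets library before handing it to Unsloth. The file name below is a placeholder:

```python
from datasets import load_dataset

# One ShareGPT example per line; path is hypothetical
dataset = load_dataset("json", data_files="sharegpt_train.jsonl", split="train")
print(dataset[0]["conversations"][0])  # {'from': 'human', 'value': '...'}
```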


Try With AI

Prompt 1: Format Detective

Here are three training examples in different formats. For each one:
1. Identify the format (Alpaca, ShareGPT, or ChatML)
2. Explain how you identified it
3. Suggest when this format would be appropriate

Example A:
{"instruction": "Translate to French", "input": "Hello world", "output": "Bonjour le monde"}

Example B:
{"conversations": [{"from": "human", "value": "What's the weather?"}, {"from": "gpt", "value": "I don't have access to weather data. Could you check a weather app?"}]}

Example C:
{"messages": [{"role": "system", "content": "You are a helpful assistant"}, {"role": "user", "content": "Hi"}, {"role": "assistant", "content": "Hello! How can I help?"}]}

What you're learning: Pattern recognition for instruction formats. This skill is essential when working with existing datasets or debugging format errors.

Prompt 2: Design Multi-Turn Conversations

I need to create 5 ShareGPT training examples for a Task API assistant.
The conversations should demonstrate:
1. A clarification flow (asking for missing due date)
2. A confirmation flow (deleting multiple tasks)
3. An error recovery flow (task not found)
4. A multi-step workflow (create then immediately update)
5. An edge case (conflicting instructions)

For each, explain why the conversation structure teaches good behavior.

What you're learning: Designing multi-turn conversations that teach nuanced behaviors. The structure of your conversations directly shapes model behavior.

Prompt 3: Build a Format Converter

Help me build a Python script that:
1. Reads a JSONL file in Alpaca format
2. Converts each example to ShareGPT format
3. Optionally adds a system prompt to each conversation
4. Writes output to a new JSONL file
5. Validates the output format using Pydantic

Include error handling for malformed input and logging for progress.

What you're learning: Practical format conversion for real pipelines. You'll often receive data in one format and need to convert it for your training framework.

Safety Note

Format mismatches cause silent failures. Your training might run without errors but produce a useless model because the framework couldn't parse your data correctly. Always validate format before training—run a small test batch first.