StreamableHTTP Transport: Remote MCP Over HTTP

The Remote Deployment Challenge

You've built a powerful MCP server using FastMCP that processes documents, analyzes code, and handles system operations. It works perfectly on your local machine—stdio transport handles all communication seamlessly.

But now you face a real-world problem: you need to deploy this server to a remote machine. Your clients (Claude instances, browser agents, mobile apps) are distributed across the network. Stdio is a local pipe—it only works when the server and client run on the same machine.

You need remote MCP.

The problem isn't just "how do we send messages over the network?" Any HTTP client can do that. The problem is that HTTP is fundamentally asymmetric:

  • Client → Server: Easy. Client sends request, server responds.
  • Server → Client: Hard. HTTP has no native way for servers to initiate messages.

But MCP requires bidirectional communication. Consider these scenarios:

Server → Client (required by MCP):

  • Progress notifications ("Processing file 3 of 10...")
  • Logging messages ("Tool execution took 2.3 seconds")
  • Sampling requests ("I need another file to make this decision")

Without server-initiated messaging, these workflows fail. The client must continuously poll ("Are you done yet? Are you done yet?"), wasting bandwidth and adding latency.

StreamableHTTP solves this. It transforms HTTP from request-response into a streaming protocol using Server-Sent Events (SSE), enabling true bidirectional MCP communication over the network.

How StreamableHTTP Works: The Dual-Connection Pattern

StreamableHTTP operates on a simple principle: one connection carries client→server requests, another carries server→client messages.

Architecture Overview

Client Application
│
├─ POST /messages ──────────────→ Server     (client sends requests)
│
└─ GET /sse/{session_id} ←─────── Server     (server streams responses + notifications)
        Server-Sent Events: a long-lived HTTP response that delivers
        tool results, progress updates, logging, and sampling requests

The Session ID: Binding Connections Together

Here's the key insight: HTTP connections are stateless. When your client makes a POST request and a separate GET request for SSE, the server needs to know these connections belong to the same session.

That's what the session ID does.

Session Lifecycle:

  1. Initialize: Client sends Initialize request → Server responds with mcp-session-id header
  2. Acknowledge: Client sends Initialized notification with that session ID
  3. Stream: Client opens the SSE connection with GET /sse/{session_id}
  4. Communicate: Client sends requests via POST, server streams responses via SSE
  5. Cleanup: Either side closes connections, server cleans up session

Example Request Flow

CLIENT                                        SERVER
──────────────────────────────────────────────────────

1. POST /initialize
   {"protocol": "mcp"}
   ─────────────────────────────────────────→
   ←─────────────────────────────────────────
   200 OK + header: mcp-session-id: abc123

2. POST /notifications
   {
     "jsonrpc": "2.0",
     "method": "notifications/initialized",
     "params": {"sessionId": "abc123"}
   }
   ─────────────────────────────────────────→
   ←─────────────────────────────────────────
   200 OK

3. GET /sse/abc123
   (client keeps this connection open)
   ─────────────────────────────────────────→
   (connection stays open - server can stream to it)

4. POST /request
   {
     "jsonrpc": "2.0",
     "method": "resources/read",
     "params": {...},
     "sessionId": "abc123"
   }
   ─────────────────────────────────────────→
   ←─────────────────────────────────────────
   200 OK (acknowledges receipt)

5. Server streams the result through the SSE connection opened in step 3:
   data: {"jsonrpc": "2.0", "result": {...}}
   ←─────────────────────────────────────────
   (data arrives through the GET connection)
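
The same handshake, sketched from the client side in Python. This is illustrative, not a FastMCP client: it assumes the endpoint names used above, plus the httpx library for async HTTP and streaming.

import asyncio

import httpx

BASE = "http://localhost:8000"  # assumed server address

async def handshake_and_listen():
    async with httpx.AsyncClient(timeout=None) as client:
        # 1. Initialize: the session ID comes back in a response header.
        resp = await client.post(f"{BASE}/initialize", json={"protocol": "mcp"})
        session_id = resp.headers["mcp-session-id"]

        # 2. Acknowledge the session.
        await client.post(f"{BASE}/notifications", json={
            "jsonrpc": "2.0",
            "method": "notifications/initialized",
            "params": {"sessionId": session_id},
        })

        # 3. Open the SSE stream and print whatever the server pushes.
        async with client.stream("GET", f"{BASE}/sse/{session_id}") as stream:
            async for line in stream.aiter_lines():
                if line.startswith("data: "):
                    print("server pushed:", line[len("data: "):])

asyncio.run(handshake_and_listen())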

Problem 1: Multiple SSE Streams for Different Contexts

MCP servers often handle multiple tool calls simultaneously. Each tool might need:

  • Sampling requests (asking the client for more data)
  • Progress notifications (keeping the client informed)
  • Logging (debugging what happened)

Naive approach: Send everything through one SSE stream.

Problem: Ordering becomes ambiguous. If Tool A sends a sampling request and Tool B logs a message, how does the client know which tool call each message belongs to?

StreamableHTTP solution: Create tool-specific SSE connections.

Primary SSE Connection

Carries:

  • General notifications
  • Tool results (for simple tools)
  • Errors

import json

from fastapi import FastAPI, Request
from fastapi.responses import StreamingResponse

# Shared setup for the hand-rolled endpoint snippets in this lesson
app = FastAPI()

@app.get("/sse/{session_id}")
async def primary_sse(session_id: str):
    """Primary stream for all non-tool-specific messages"""
    session = session_store.get(session_id)
    if not session:
        return {"error": "invalid session"}

    async def event_generator():
        # Forward every queued message to the client as an SSE "data:" frame
        while session.is_active:
            message = await session.queue.get()
            yield f"data: {json.dumps(message)}\n\n"

    return StreamingResponse(event_generator(), media_type="text/event-stream")

Tool-Specific SSE Connections

Created dynamically for tools that need interactive communication:

@app.get("/sse/{session_id}/tool/{tool_name}")
async def tool_sse(session_id: str, tool_name: str):
    """Tool-specific stream for sampling requests and progress"""
    session = session_store.get(session_id)
    if not session:
        return {"error": "invalid session"}

    tool_queue = session.get_tool_queue(tool_name)

    async def event_generator():
        while session.is_active:
            message = await tool_queue.get()
            yield f"data: {json.dumps(message)}\n\n"

    return StreamingResponse(event_generator(), media_type="text/event-stream")

| Connection Type | Purpose | Lifecycle |
|---|---|---|
| Primary SSE | General notifications, tool results, errors | Stays open for the entire session |
| Tool-specific SSE | Sampling requests, progress for one tool | Created per tool call, closes when the tool completes |
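
Both endpoints above rely on a session object that owns the message queues. Neither session_store nor get_tool_queue is defined by FastMCP in these snippets, so here is a minimal sketch of what that per-session state could look like; the class, attributes, and in-memory store are illustrative assumptions, not FastMCP APIs.

import asyncio
import time
import uuid


class Session:
    """Illustrative per-client session state (not part of FastMCP)."""

    def __init__(self, timeout: float = 300.0):
        self.id = uuid.uuid4().hex                 # value sent back as mcp-session-id
        self.is_active = True
        self.timeout = timeout                     # seconds of inactivity allowed
        self.last_seen = time.monotonic()          # refreshed on every request
        self.queue: asyncio.Queue = asyncio.Queue()           # primary SSE messages
        self._tool_queues: dict[str, asyncio.Queue] = {}      # per-tool SSE messages

    def get_tool_queue(self, tool_name: str) -> asyncio.Queue:
        # One queue per tool call keeps messages for different tools
        # from interleaving on the same stream.
        return self._tool_queues.setdefault(tool_name, asyncio.Queue())

    async def handle_request(self, body: dict) -> dict:
        # Dispatch the JSON-RPC request to the MCP server; stubbed here.
        self.last_seen = time.monotonic()
        return {"jsonrpc": "2.0", "id": body.get("id"), "result": {}}


# Simple in-memory registry keyed by session ID.
session_store: dict[str, Session] = {}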

Problem 2: Session ID Routing Across Requests

Every client request must include the session ID so the server routes it correctly:

@app.post("/messages")
async def handle_message(request: Request):
    body = await request.json()

    session_id = body.get("sessionId")  # Extract from request body
    session = session_store.get(session_id)

    if not session:
        return {"error": "Invalid session ID"}

    # Route to the correct session's handler
    result = await session.handle_request(body)
    return result

Client responsibility: Include sessionId in every POST request body.

// Example client code
async function makeRequest(method, params) {
  const response = await fetch(`${SERVER_URL}/messages`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      jsonrpc: "2.0",
      method: method,
      params: params,
      sessionId: SESSION_ID  // Critical: lets the server route to the right session
    })
  });
  return response.json();
}

Why this matters: Without explicit session ID routing, requests from different clients could collide. Session IDs ensure isolation.

Implementing StreamableHTTP with FastMCP

Step 1: Specification (The Deployment Pattern)

Before code, clarify what we're building:

## StreamableHTTP MCP Server Specification

**Intent**: Enable remote MCP clients to connect to FastMCP servers over HTTP
with streaming support for progress notifications and sampling requests.

**Success Criteria**:
- ✅ Client can initialize and establish session
- ✅ Client can send requests (via POST) and receive results
- ✅ Server can push notifications (via SSE) without client polling
- ✅ Server can request samples from client (sampling pattern)
- ✅ Multiple clients can maintain independent sessions
- ✅ Session timeout cleans up stale connections

**Constraints**:
- Use FastMCP's StreamableHTTP mixin
- Configure both primary and tool-specific SSE streams
- Validate session IDs on every request
- Implement connection heartbeat to detect dead clients

Step 2: Configure FastMCP for HTTP

FastMCP provides a StreamableHTTP transport that handles the complexity:

from fastmcp import FastMCP
from fastmcp.server import StreamableHTTP
import json

# Initialize FastMCP with HTTP transport
mcp = FastMCP(
    "document-analyzer",
    transports=[
        StreamableHTTP(
            host="0.0.0.0",
            port=8000,
            session_timeout=300,     # 5 minutes
            enable_tool_sse=True,    # Create tool-specific SSE streams
        )
    ]
)

# Your tools work exactly as before
@mcp.tool()
async def analyze_document(file_path: str) -> str:
    """Analyze document and return insights"""
    # Tool implementation unchanged
    return f"Analysis of {file_path}: ..."

Key configuration options:

| Option | Purpose |
|---|---|
| host | Bind to all interfaces (0.0.0.0) for remote access |
| port | HTTP listening port |
| session_timeout | Seconds before stale sessions are cleaned up (300 = 5 minutes) |
| enable_tool_sse | Create separate SSE streams per tool call |

Step 3: Handle Session Initialization

StreamableHTTP handles session creation automatically, but you should understand the flow:

# FastMCP automatically provides these endpoints:
# POST /initialize → Creates session, returns mcp-session-id
# POST /messages → Client sends requests
# GET /sse/{session_id} → Primary SSE stream
# GET /sse/{session_id}/tool/{tool_name} → Tool-specific SSE

When a client connects:

  1. Client POSTs to /initialize with the MCP protocol version
  2. Server creates a new session and returns the mcp-session-id header
  3. Client POSTs the notifications/initialized notification with the session ID
  4. Client opens the SSE connection with GET /sse/{session_id}
  5. Server is now ready to stream responses
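
FastMCP provides these endpoints for you, but it helps to see roughly what session creation involves. A hand-rolled sketch of an /initialize endpoint, reusing the illustrative Session class and session_store from earlier; this is an assumption about the flow, not FastMCP internals.

from fastapi import Response

@app.post("/initialize")
async def initialize(body: dict, response: Response):
    # Create a fresh session and register it so later requests can find it.
    session = Session(timeout=300)
    session_store[session.id] = session

    # The client reads this header and echoes the ID back on every request.
    response.headers["mcp-session-id"] = session.id
    return {"jsonrpc": "2.0", "result": {"protocolVersion": body.get("protocol", "mcp")}}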

Step 4: Sending Notifications Through SSE

Your tools can send notifications that stream to clients:

import asyncio

@mcp.tool()
async def long_running_operation(data: str) -> str:
    """Demonstrate progress notifications"""

    # Process in steps, notify after each
    for i in range(5):
        await mcp.emit_notification({
            "type": "progress",
            "current": i + 1,
            "total": 5,
            "message": f"Processing batch {i+1}/5..."
        })
        # Do work...
        await asyncio.sleep(1)

    return f"Completed processing {data}"

The notification arrives on the client's SSE stream in real-time—no polling required.
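
Under the hood, the transport has to bridge that call to the SSE stream. In terms of the illustrative Session objects sketched earlier, emitting a notification amounts to putting a message on the right queue; the helper below is hypothetical, not a FastMCP function.

async def emit_to_primary_stream(session: Session, notification: dict) -> None:
    # Wrap the payload as a JSON-RPC notification and enqueue it; the
    # primary SSE generator picks it up and writes it as a "data:" frame.
    await session.queue.put({
        "jsonrpc": "2.0",
        "method": "notifications/progress",
        "params": notification,
    })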

Step 5: Sampling Requests (Interactive Tools)

Tools can request additional context from the client:

@mcp.tool()
async def intelligent_analysis(file_path: str, context_needed: bool = False) -> str:
    """Tool that can request additional context"""

    # Initial analysis
    result = analyze(file_path)

    # If we need more context, ask client
    if context_needed:
        context = await mcp.request_sample({
            "type": "context_request",
            "message": "I need additional context. What domain is this file from?",
            "options": ["Finance", "Healthcare", "E-commerce", "Other"]
        })

        # Use provided context
        result = refine_with_context(result, context)

    return result

The client sees this request stream through SSE and can respond interactively.
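
From the client's perspective, the sampling request arrives as an event on the tool-specific SSE stream, and the answer goes back as an ordinary POST. A rough sketch using httpx and the endpoint shapes from this lesson; the sampling/response method name and payload fields are illustrative assumptions.

import json

import httpx

async def answer_sampling_requests(base: str, session_id: str, tool_name: str) -> None:
    async with httpx.AsyncClient(timeout=None) as client:
        # Listen on the tool-specific SSE stream for sampling requests.
        async with client.stream("GET", f"{base}/sse/{session_id}/tool/{tool_name}") as stream:
            async for line in stream.aiter_lines():
                if not line.startswith("data: "):
                    continue
                event = json.loads(line[len("data: "):])
                if event.get("type") != "context_request":
                    continue
                # Send the chosen option back over the normal POST channel.
                await client.post(f"{base}/messages", json={
                    "jsonrpc": "2.0",
                    "method": "sampling/response",   # illustrative method name
                    "params": {"answer": event["options"][0]},
                    "sessionId": session_id,
                })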

Real-World Deployment: Connecting Remote Claude

Once your StreamableHTTP server is running:

# Server running on remote host at api.example.com:8000
uvicorn your_server:app --host 0.0.0.0 --port 8000

Connect it to Claude:

from anthropic import Anthropic

client = Anthropic()

# Configure MCP connection to remote server
client.configure_mcp_connection(
    transport="streamable-http",
    url="http://api.example.com:8000",
)

# Now Claude can call your remote tools
response = client.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=1024,
    tools=client.get_tools(),  # Discovers tools from remote MCP
    messages=[
        {
            "role": "user",
            "content": "Analyze the document at /data/report.pdf and provide insights"
        }
    ]
)

Claude automatically:

  1. Initializes the session
  2. Opens the SSE stream
  3. Calls your tools
  4. Receives streaming responses and notifications
  5. Cleans up when done

The Session ID Pattern: Why It Matters

This pattern is critical for production systems:

Single-Client Scenario (Appears to Work Without Session ID):

Client 1
├─ POST /initialize → Session ABC
├─ GET /sse/ABC
└─ POST /messages with SessionId: ABC ✓

Multi-Client Scenario (Fails Without Session ID):

Client 1                                  Client 2
├─ POST /initialize → Session ABC         ├─ POST /initialize → Session DEF
├─ GET /sse/ABC                           ├─ GET /sse/DEF
├─ POST /messages (sessionId: ABC)        ├─ POST /messages (sessionId: DEF)
│    "Convert video1"                     │    "Convert video2"
│                                         │
├─ Request routes to Session ABC ✓        ├─ Request routes to Session DEF ✓
└─ Result streams to Client 1 ✓           └─ Result streams to Client 2 ✓

Without explicit session IDs, requests get mixed up. With them, isolation is guaranteed.
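
The specification above also calls for timing out stale sessions. A minimal sketch of that cleanup loop, built on the illustrative Session and session_store from earlier and assuming request handlers refresh session.last_seen; this is a sketch of the pattern, not FastMCP behavior.

import asyncio
import time

async def reap_stale_sessions(interval: float = 30.0) -> None:
    """Background task: close and drop sessions that went quiet."""
    while True:
        now = time.monotonic()
        for session_id, session in list(session_store.items()):
            if now - session.last_seen > session.timeout:
                # Mark inactive so SSE generators exit on their next iteration,
                # then forget the session so its ID can no longer be used.
                session.is_active = False
                session_store.pop(session_id, None)
        await asyncio.sleep(interval)

# Started once at application startup, e.g.:
# asyncio.create_task(reap_stale_sessions())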

Try With AI

Setup: You're designing a document processing service that connects to remote Claude instances. You need to decide between keeping everything on one connection vs using the dual-stream pattern.

Part 1: Understanding the Constraint

Ask AI:

I'm building an MCP server that processes documents and provides real-time progress updates.
The client needs to:
1. Send a request ("Process this file")
2. Receive progress notifications ("50% complete...")
3. Get the final result

Why can't I just use HTTP POST for all communication? What makes this pattern different
from a normal request-response API? Explain the technical constraint and why SSE is the solution.

What you're learning: The fundamental difference between request-response (HTTP's native model) and bidirectional streaming (what MCP requires). Understanding this gap is crucial for designing remote agents. You're discovering why the architecture exists.

Part 2: Designing the Connection Pattern

Based on the explanation above, ask AI:

Given the HTTP limitation, why does StreamableHTTP use TWO separate connections
(one POST for requests, one GET for SSE responses) instead of trying to create
one true bidirectional connection?

What are the advantages of this split architecture? What are potential downsides?
Which problems does this solve, and which remain?

What you're learning: The tradeoff between simplicity (one connection for everything) and clarity (separate concerns). This informs architectural decisions you'll make when deploying real systems. You're refining the pattern based on HTTP's actual constraints.

Part 3: Session Management Under Load

Now consider a real deployment scenario:

I'm deploying my MCP server to handle 50 concurrent clients. Each client:
- Initializes a session (gets unique session ID)
- Opens an SSE stream with that ID
- Sends multiple requests (each includes the session ID)
- Can disconnect/reconnect at any time

Given this scenario, what could go wrong with session ID management?
How would I detect and clean up abandoned sessions? What happens if a client
crashes without closing their SSE stream? How would I prevent session ID collisions?

What you're learning: Production resilience patterns. Session timeouts, heartbeats, and cleanup logic aren't optional—they prevent resource leaks in long-running systems. You're thinking like a distributed systems engineer considering failure modes before they happen.


Reflect on Your Skill

You built an mcp-server skill in Lesson 0. Test and improve it based on what you learned.

Test Your Skill

Using my mcp-server skill, configure a server for remote deployment with StreamableHTTP.
Does my skill include guidance on session ID management, SSE connection patterns, and the dual-connection architecture?

Identify Gaps

Ask yourself:

  • Did my skill include StreamableHTTP configuration and session lifecycle management?
  • Did it explain the difference between primary SSE and tool-specific SSE streams?

Improve Your Skill

If you found gaps:

My mcp-server skill is missing remote deployment patterns with StreamableHTTP.
Update it to include StreamableHTTP transport configuration, session ID routing, SSE connection management, and the architectural patterns for bidirectional HTTP communication.