Building HTTP Servers
You built FastAPI servers in Part 6. You know the pattern: define routes, handle requests, return responses. Now you need the same capability in TypeScript—but the landscape is different. Instead of one dominant framework, you have three strong options optimized for different runtimes.
Your AI chat interface needs an HTTP server. Users send messages, you call the AI provider, and you stream back the response token by token. In Python, you'd use FastAPI with StreamingResponse. In TypeScript, you have choices: Fastify for Node.js enterprise deployments, Hono for edge-first development that runs everywhere, or ElysiaJS for maximum Bun performance.
The right choice depends on where you're deploying. This lesson shows you all three, explains when to use each, and gives you the streaming patterns you need for AI applications.
The TypeScript HTTP Framework Landscape
Unlike Python, where FastAPI dominates the async space, TypeScript has evolved multiple excellent options:
| Framework | Primary Runtime | Key Strength | Best For |
|---|---|---|---|
| Fastify | Node.js | Plugin ecosystem, enterprise features | Production APIs on Node.js infrastructure |
| Hono | Any (Web Standards) | Runs everywhere unchanged | Edge functions, multi-runtime codebases |
| ElysiaJS | Bun | Maximum performance, end-to-end type safety | Bun-native high-performance APIs |
The key insight: Hono uses only Web Standard APIs, so the same code runs on Node.js, Deno, Bun, Cloudflare Workers, and browsers. Fastify and ElysiaJS are optimized for their specific runtimes.
For your AI interface, the choice often comes down to deployment target:
- Deploying to AWS/GCP with Node.js? Fastify
- Deploying to Cloudflare Workers or need runtime flexibility? Hono
- Deploying to Bun for maximum speed? ElysiaJS
Hono: Write Once, Run Everywhere
Hono is built on Web Standards: it uses Request, Response, and fetch, which are available in every modern JavaScript runtime. The framework is tiny (the hono/tiny preset is under 14 KB) and has zero dependencies.
Here's a basic AI chat endpoint:
import { Hono } from "hono";
const app = new Hono();
// Health check
app.get("/health", (c) => {
return c.json({ status: "ok", runtime: detectRuntime() });
});
// Chat endpoint
app.post("/chat", async (c) => {
const { prompt } = await c.req.json<{ prompt: string }>();
// Call AI provider (simplified)
const response = await fetch("https://api.openai.com/v1/chat/completions", {
method: "POST",
headers: {
"Authorization": `Bearer ${process.env.OPENAI_API_KEY}`,
"Content-Type": "application/json",
},
body: JSON.stringify({
model: "gpt-4",
messages: [{ role: "user", content: prompt }],
}),
});
const data = await response.json();
return c.json(data);
});
function detectRuntime(): string {
  // Check globalThis so this compiles without Deno/Bun type declarations
  if ("Deno" in globalThis) return "deno";
  if ("Bun" in globalThis) return "bun";
  return "node";
}
export default app;
Output (when running on Node.js):
GET /health
{"status":"ok","runtime":"node"}
POST /chat with {"prompt": "Hello"}
{"id":"chatcmpl-...","choices":[{"message":{"content":"Hello! How can I help you?"}}]}
This same code runs on any runtime. The only difference is how you start the server:
// node-server.ts
import { serve } from "@hono/node-server";
import app from "./app";
serve({ fetch: app.fetch, port: 3000 });
console.log("Server running on http://localhost:3000");
// bun-server.ts (or just run app.ts directly with Bun)
export default app; // Bun automatically serves exported Hono apps
// deno-server.ts
// deno-server.ts
import app from "./app.ts";
Deno.serve({ port: 3000 }, app.fetch); // Deno's built-in HTTP server, no std import needed
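Cloudflare Workers follows the same pattern as Bun: the runtime calls fetch on the module's default export, so the Hono app can be exported directly. A minimal sketch follows; one caveat is that the examples above read process.env, which assumes a Node-style runtime, whereas on Workers you would read secrets from the env binding (Hono's env() helper from hono/adapter abstracts this difference).

```typescript
// worker.ts (Cloudflare Workers calls fetch() on the default export)
import app from "./app";

export default app;
```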
Streaming with Hono
For AI applications, you need to stream responses token by token. Hono provides a streaming helper:
import { Hono } from "hono";
import { stream } from "hono/streaming";
const app = new Hono();
app.post("/chat/stream", async (c) => {
const { prompt } = await c.req.json<{ prompt: string }>();
return stream(c, async (stream) => {
// Call AI provider with streaming enabled
const response = await fetch("https://api.openai.com/v1/chat/completions", {
method: "POST",
headers: {
"Authorization": `Bearer ${process.env.OPENAI_API_KEY}`,
"Content-Type": "application/json",
},
body: JSON.stringify({
model: "gpt-4",
messages: [{ role: "user", content: prompt }],
stream: true,
}),
});
if (!response.body) {
await stream.write("Error: No response body");
return;
}
// Pipe the AI response directly to the client
const reader = response.body.getReader();
const decoder = new TextDecoder();
while (true) {
const { done, value } = await reader.read();
if (done) break;
const chunk = decoder.decode(value, { stream: true });
await stream.write(chunk);
}
});
});
export default app;
Output (streaming response):
data: {"choices":[{"delta":{"content":"Hello"}}]}
data: {"choices":[{"delta":{"content":"!"}}]}
data: {"choices":[{"delta":{"content":" How"}}]}
data: {"choices":[{"delta":{"content":" can"}}]}
data: {"choices":[{"delta":{"content":" I"}}]}
data: {"choices":[{"delta":{"content":" help"}}]}
data: [DONE]
Compared to FastAPI: This is equivalent to returning StreamingResponse in FastAPI. The stream() helper manages the connection and lets you write chunks as they arrive.
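Hono also provides a streamSSE helper that sets the text/event-stream headers and frames each event for you, which is useful when you build the events yourself instead of proxying the provider's raw SSE bytes. A minimal sketch (the token list is illustrative data, not AI output):

```typescript
import { Hono } from "hono";
import { streamSSE } from "hono/streaming";

const app = new Hono();

app.get("/events", (c) =>
  // streamSSE sets Content-Type: text/event-stream and formats each event
  streamSSE(c, async (stream) => {
    for (const token of ["Hello", "!", " How", " can", " I", " help", "?"]) {
      await stream.writeSSE({ data: token });
      await stream.sleep(100); // brief pause so the stream is visible
    }
  })
);

export default app;
```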
Fastify: Node.js Power with Plugin Ecosystem
Fastify is the established choice for Node.js production APIs. It's faster than Express, has excellent TypeScript support, and offers a rich plugin ecosystem.
import Fastify from "fastify";
const app = Fastify({
logger: true, // Built-in structured logging
});
// Type-safe request body
interface ChatRequest {
prompt: string;
}
interface ChatResponse {
message: string;
model: string;
}
app.post<{
Body: ChatRequest;
Reply: ChatResponse;
}>("/chat", async (request, reply) => {
const { prompt } = request.body;
// Call AI provider
const response = await fetch("https://api.openai.com/v1/chat/completions", {
method: "POST",
headers: {
"Authorization": `Bearer ${process.env.OPENAI_API_KEY}`,
"Content-Type": "application/json",
},
body: JSON.stringify({
model: "gpt-4",
messages: [{ role: "user", content: prompt }],
}),
});
const data = await response.json();
return {
message: data.choices[0].message.content,
model: data.model,
};
});
app.listen({ port: 3000 });
Output:
{"level":30,"time":1704067200000,"msg":"Server listening at http://127.0.0.1:3000"}
POST /chat {"prompt": "Hello"}
{"message":"Hello! How can I assist you today?","model":"gpt-4"}
Streaming with Fastify
Fastify requires manual SSE handling through the raw response object:
import Fastify from "fastify";
const app = Fastify();
app.post("/chat/stream", async (request, reply) => {
const { prompt } = request.body as { prompt: string };
  // Take over the response: hijack() tells Fastify we will send it ourselves via reply.raw
  reply.hijack();

  // Set SSE headers manually
  reply.raw.writeHead(200, {
    "Content-Type": "text/event-stream",
    "Cache-Control": "no-cache",
    "Connection": "keep-alive",
  });
// Call AI with streaming
const response = await fetch("https://api.openai.com/v1/chat/completions", {
method: "POST",
headers: {
"Authorization": `Bearer ${process.env.OPENAI_API_KEY}`,
"Content-Type": "application/json",
},
body: JSON.stringify({
model: "gpt-4",
messages: [{ role: "user", content: prompt }],
stream: true,
}),
});
if (!response.body) {
reply.raw.end();
return;
}
const reader = response.body.getReader();
const decoder = new TextDecoder();
while (true) {
const { done, value } = await reader.read();
if (done) break;
const chunk = decoder.decode(value, { stream: true });
reply.raw.write(chunk);
}
reply.raw.end();
});
app.listen({ port: 3000 });
Key difference from Hono: Fastify hands you Node.js's native HTTP response through reply.raw (after reply.hijack() takes the response out of Fastify's lifecycle). This is more verbose but gives you full control.
Compared to FastAPI: In FastAPI, you'd return StreamingResponse(generate(), media_type="text/event-stream"). Fastify's approach is more manual but follows the same pattern.
ElysiaJS: Bun-Native Performance
ElysiaJS is built specifically for Bun, leveraging its performance optimizations. In synthetic benchmarks it handles well over a million requests per second, putting it in the same range as Go and Rust web frameworks.
import { Elysia, t } from "elysia";
const app = new Elysia()
.post(
"/chat",
async ({ body }) => {
const response = await fetch("https://api.openai.com/v1/chat/completions", {
method: "POST",
headers: {
"Authorization": `Bearer ${process.env.OPENAI_API_KEY}`,
"Content-Type": "application/json",
},
body: JSON.stringify({
model: "gpt-4",
messages: [{ role: "user", content: body.prompt }],
}),
});
const data = await response.json();
return {
message: data.choices[0].message.content,
model: data.model,
};
},
{
// Schema validation with automatic TypeScript inference
body: t.Object({
prompt: t.String(),
}),
response: t.Object({
message: t.String(),
model: t.String(),
}),
}
)
.listen(3000);
console.log(`Server running at ${app.server?.hostname}:${app.server?.port}`);
Output:
Server running at localhost:3000
POST /chat {"prompt": "Hello"}
{"message":"Hello! How can I assist you today?","model":"gpt-4"}
Streaming with ElysiaJS
ElysiaJS supports streaming through generators and async iterators:
import { Elysia, t } from "elysia";
const app = new Elysia()
.post(
"/chat/stream",
async function* ({ body }) {
const response = await fetch("https://api.openai.com/v1/chat/completions", {
method: "POST",
headers: {
"Authorization": `Bearer ${process.env.OPENAI_API_KEY}`,
"Content-Type": "application/json",
},
body: JSON.stringify({
model: "gpt-4",
messages: [{ role: "user", content: body.prompt }],
stream: true,
}),
});
if (!response.body) {
yield "Error: No response body";
return;
}
const reader = response.body.getReader();
const decoder = new TextDecoder();
while (true) {
const { done, value } = await reader.read();
if (done) break;
yield decoder.decode(value, { stream: true });
}
},
{
body: t.Object({
prompt: t.String(),
}),
}
)
.listen(3000);
Key advantage: ElysiaJS uses generator syntax (function* with yield), which is more ergonomic than manual stream handling. The framework sets the SSE headers automatically.
Framework Selection Guide
Here's when to choose each framework:
| Scenario | Recommended | Why |
|---|---|---|
| Existing Node.js infrastructure | Fastify | Best Node.js ecosystem integration |
| Cloudflare Workers deployment | Hono | The only one of these three that runs on Workers |
| Maximum cold start speed | ElysiaJS | Bun's fast startup plus Elysia's optimizations |
| Multi-runtime codebase | Hono | Same code runs everywhere |
| Complex middleware needs | Fastify | Richest plugin ecosystem |
| End-to-end type safety | ElysiaJS | Schema generates client types automatically |
| Edge functions globally | Hono | Designed for edge from the start |
| Learning / prototyping | Hono | Simplest API, works anywhere |
The FastAPI Comparison
For Python developers, here's how patterns translate:
| FastAPI | Fastify | Hono | ElysiaJS |
|---|---|---|---|
| @app.post("/path") | app.post("/path", handler) | app.post("/path", handler) | app.post("/path", handler) |
| StreamingResponse | reply.raw.write() | stream() helper | yield in generator |
| Pydantic validation | JSON Schema | Zod/Valibot middleware | Built-in t.Object() |
| Automatic OpenAPI | Plugin | Middleware | Built-in one-liner |
| Dependency injection | Decorators | Middleware context | Derive/resolve pattern |
The mental shift: FastAPI handles everything through decorators and automatic conversion. TypeScript frameworks are more explicit—you write the streaming logic directly.
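As one concrete example of the validation row above, here is a hedged sketch of the Hono approach, assuming zod and @hono/zod-validator are installed:

```typescript
import { Hono } from "hono";
import { zValidator } from "@hono/zod-validator";
import { z } from "zod";

const chatSchema = z.object({
  prompt: z.string().min(1),
});

const app = new Hono();

// Invalid bodies are rejected with a 400 before the handler runs
app.post("/chat", zValidator("json", chatSchema), (c) => {
  const { prompt } = c.req.valid("json"); // typed from the schema
  return c.json({ echo: prompt });
});

export default app;
```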
Practical Pattern: AI Chat Server
Here's a complete pattern that works for any AI chat interface. This example uses Hono for portability:
import { Hono } from "hono";
import { stream } from "hono/streaming";
import { cors } from "hono/cors";
// Types for AI responses
interface ChatMessage {
role: "user" | "assistant" | "system";
content: string;
}
interface ChatRequest {
messages: ChatMessage[];
model?: string;
}
const app = new Hono();
// Enable CORS for browser clients
app.use("/*", cors());
// Non-streaming endpoint for simple requests
app.post("/api/chat", async (c) => {
const { messages, model = "gpt-4" } = await c.req.json<ChatRequest>();
const response = await fetch("https://api.openai.com/v1/chat/completions", {
method: "POST",
headers: {
"Authorization": `Bearer ${process.env.OPENAI_API_KEY}`,
"Content-Type": "application/json",
},
body: JSON.stringify({ model, messages }),
});
const data = await response.json();
return c.json({
message: data.choices[0].message.content,
usage: data.usage,
});
});
// Streaming endpoint for real-time responses
app.post("/api/chat/stream", async (c) => {
const { messages, model = "gpt-4" } = await c.req.json<ChatRequest>();
return stream(c, async (stream) => {
const response = await fetch("https://api.openai.com/v1/chat/completions", {
method: "POST",
headers: {
"Authorization": `Bearer ${process.env.OPENAI_API_KEY}`,
"Content-Type": "application/json",
},
body: JSON.stringify({
model,
messages,
stream: true,
}),
});
if (!response.body) {
await stream.write("data: {\"error\": \"No response\"}\n\n");
return;
}
const reader = response.body.getReader();
const decoder = new TextDecoder();
while (true) {
const { done, value } = await reader.read();
if (done) break;
await stream.write(decoder.decode(value, { stream: true }));
}
});
});
export default app;
Output (streaming):
POST /api/chat/stream {"messages": [{"role": "user", "content": "Hello"}]}
data: {"choices":[{"delta":{"role":"assistant"}}]}
data: {"choices":[{"delta":{"content":"Hello"}}]}
data: {"choices":[{"delta":{"content":"!"}}]}
data: {"choices":[{"delta":{"content":" How"}}]}
data: {"choices":[{"delta":{"content":" can"}}]}
data: {"choices":[{"delta":{"content":" I"}}]}
data: {"choices":[{"delta":{"content":" help"}}]}
data: {"choices":[{"delta":{"content":"?"}}]}
data: [DONE]
This server:
- Provides both streaming and non-streaming endpoints
- Works on Node.js, Deno, Bun, and edge runtimes
- Handles CORS for browser clients
- Follows the SSE format AI clients expect
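On the client side, consuming the streaming endpoint uses the same reader pattern as the server. A minimal browser-side sketch, assuming the server above is running on localhost:3000:

```typescript
// Read the SSE stream from /api/chat/stream and log each chunk as it arrives
const response = await fetch("http://localhost:3000/api/chat/stream", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({ messages: [{ role: "user", content: "Hello" }] }),
});

const reader = response.body!.getReader();
const decoder = new TextDecoder();

while (true) {
  const { done, value } = await reader.read();
  if (done) break;
  // Each chunk holds one or more "data: ..." SSE lines
  console.log(decoder.decode(value, { stream: true }));
}
```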
Try With AI
Prompt 1: Framework Migration
I have this FastAPI endpoint:
import asyncio
from fastapi import FastAPI
from fastapi.responses import StreamingResponse
app = FastAPI()
async def generate_response(prompt: str):
for word in ["Hello", "from", "the", "AI"]:
yield f"data: {word}\n\n"
await asyncio.sleep(0.5)
@app.post("/stream")
async def stream_response(prompt: str):
return StreamingResponse(
generate_response(prompt),
media_type="text/event-stream"
)
Convert this to:
1. Hono (for edge deployment)
2. Fastify (for Node.js)
3. ElysiaJS (for Bun)
Show me how each framework handles the streaming differently.
What you're learning: Direct translation between FastAPI streaming and TypeScript frameworks—understanding which abstraction each framework provides and when you need manual control.
Prompt 2: Build a Complete Chat Server
I'm building an AI chat interface that needs:
1. A streaming endpoint for real-time token display
2. Error handling when the AI provider times out
3. Rate limiting per user
4. CORS for browser clients
I want to deploy to Cloudflare Workers initially,
but might move to AWS Lambda later.
Which framework should I use? Build me a complete server
with all these features.
What you're learning: Framework selection based on real deployment requirements, and implementing production-ready features like rate limiting and error handling in your chosen framework.
Prompt 3: Compare Performance Tradeoffs
I need to build a high-traffic AI API that will:
- Handle 10,000 requests per minute
- Stream responses with sub-100ms time-to-first-byte
- Run on Bun in production
Compare ElysiaJS vs Hono for this use case:
- What performance differences should I expect?
- Which has better streaming primitives?
- How do I benchmark both to make the decision?
Show me equivalent implementations and how to test them.
What you're learning: Performance-critical framework selection—understanding benchmarking, streaming latency, and when Bun-native optimization matters versus cross-runtime flexibility.
Safety note: When building production AI chat servers, validate all incoming requests. Never stream API keys or sensitive data. Use environment variables for credentials and ensure your streaming endpoints have proper timeout handling to prevent connection exhaustion.
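One hedged sketch of that timeout handling: abort the upstream provider call if it runs too long, using the standard AbortSignal.timeout (available in Node 18+, Deno, and Bun). The 30-second budget here is an illustrative value, not a recommendation:

```typescript
// Abort the provider request (including the streamed body) if it runs longer than 30 seconds
const messages = [{ role: "user", content: "Hello" }];

const response = await fetch("https://api.openai.com/v1/chat/completions", {
  method: "POST",
  headers: {
    "Authorization": `Bearer ${process.env.OPENAI_API_KEY}`,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({ model: "gpt-4", messages, stream: true }),
  signal: AbortSignal.timeout(30_000),
});
```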