Building HTTP Servers
You built FastAPI servers in Part 6. You know the pattern: define routes, handle requests, return responses. Now you need the same capability in TypeScript—but the landscape is different. Instead of one dominant framework, you have three strong options optimized for different runtimes.
Your AI chat interface needs an HTTP server. Users send messages, you call the AI provider, and you stream back the response token by token. In Python, you'd use FastAPI with StreamingResponse. In TypeScript, you have choices: Fastify for Node.js enterprise deployments, Hono for edge-first development that runs everywhere, or ElysiaJS for maximum Bun performance.
The right choice depends on where you're deploying. This lesson shows you all three, explains when to use each, and gives you the streaming patterns you need for AI applications.
The TypeScript HTTP Framework Landscape
Unlike Python, where FastAPI dominates the async space, TypeScript has evolved multiple excellent options:
| Framework | Primary Runtime | Key Strength | Best For |
|---|---|---|---|
| Fastify | Node.js | Plugin ecosystem, enterprise features | Production APIs on Node.js infrastructure |
| Hono | Any (Web Standards) | Runs everywhere unchanged | Edge functions, multi-runtime codebases |
| ElysiaJS | Bun | Maximum performance, end-to-end type safety | Bun-native high-performance APIs |
The key insight: Hono uses only Web Standard APIs, so the same code runs on Node.js, Deno, Bun, Cloudflare Workers, and browsers. Fastify and ElysiaJS are optimized for their specific runtimes.
For your AI interface, the choice often comes down to deployment target:
- Deploying to AWS/GCP with Node.js? Fastify
- Deploying to Cloudflare Workers or need runtime flexibility? Hono
- Deploying to Bun for maximum speed? ElysiaJS
Hono: Write Once, Run Everywhere
Hono is built on Web Standards: it uses Request, Response, and fetch, which are available in every modern JavaScript runtime. The framework is tiny (the hono/tiny preset is under 14 KB) and has zero dependencies.
Here's a basic AI chat endpoint:
import { Hono } from "hono";
const app = new Hono();
// Health check
app.get("/health", (c) => {
return c.json({ status: "ok", runtime: detectRuntime() });
});
// Chat endpoint
app.post("/chat", async (c) => {
const { prompt } = await c.req.json<{ prompt: string }>();
// Call AI provider (simplified)
const response = await fetch("https://api.openai.com/v1/chat/completions", {
method: "POST",
headers: {
"Authorization": `Bearer ${process.env.OPENAI_API_KEY}`,
"Content-Type": "application/json",
},
body: JSON.stringify({
model: "gpt-4",
messages: [{ role: "user", content: prompt }],
}),
});
const data = await response.json();
return c.json(data);
});
function detectRuntime(): string {
  // Check globalThis so this compiles without Deno/Bun type declarations
  if ("Deno" in globalThis) return "deno";
  if ("Bun" in globalThis) return "bun";
  return "node";
}
export default app;
Output (when running on Node.js):
GET /health
{"status":"ok","runtime":"node"}
POST /chat with {"prompt": "Hello"}
{"id":"chatcmpl-...","choices":[{"message":{"content":"Hello! How can I help you?"}}]}
This same code runs on any runtime. The only difference is how you start the server:
// node-server.ts
import { serve } from "@hono/node-server";
import app from "./app";
serve({ fetch: app.fetch, port: 3000 });
console.log("Server running on http://localhost:3000");
// bun-server.ts (or just run app.ts directly with Bun)
export default app; // Bun automatically serves exported Hono apps
// deno-server.ts
// deno-server.ts
import app from "./app.ts";
Deno.serve({ port: 3000 }, app.fetch); // Deno's built-in HTTP server, no std import needed
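Cloudflare Workers follows the same pattern as Bun: the runtime calls fetch on the module's default export, so the Hono app can be exported directly. A minimal sketch follows; one caveat is that the examples above read process.env, which assumes a Node-style runtime, whereas on Workers you would read secrets from the env binding (Hono's env() helper from hono/adapter abstracts this difference).

```typescript
// worker.ts (Cloudflare Workers calls fetch() on the default export)
import app from "./app";

export default app;
```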
Streaming with Hono
For AI applications, you need to stream responses token by token. Hono provides a streaming helper:
import { Hono } from "hono";
import { stream } from "hono/streaming";
const app = new Hono();
app.post("/chat/stream", async (c) => {
const { prompt } = await c.req.json<{ prompt: string }>();
return stream(c, async (stream) => {
// Call AI provider with streaming enabled
const response = await fetch("https://api.openai.com/v1/chat/completions", {
method: "POST",
headers: {
"Authorization": `Bearer ${process.env.OPENAI_API_KEY}`,
"Content-Type": "application/json",
},
body: JSON.stringify({
model: "gpt-4",
messages: [{ role: "user", content: prompt }],
stream: true,
}),
});
if (!response.body) {
await stream.write("Error: No response body");
return;
}
// Pipe the AI response directly to the client
const reader = response.body.getReader();
const decoder = new TextDecoder();
while (true) {
const { done, value } = await reader.read();
if (done) break;
const chunk = decoder.decode(value, { stream: true });
await stream.write(chunk);
}
});
});
export default app;
Output (streaming response):
data: {"choices":[{"delta":{"content":"Hello"}}]}
data: {"choices":[{"delta":{"content":"!"}}]}
data: {"choices":[{"delta":{"content":" How"}}]}
data: {"choices":[{"delta":{"content":" can"}}]}
data: {"choices":[{"delta":{"content":" I"}}]}
data: {"choices":[{"delta":{"content":" help"}}]}
data: [DONE]
Compared to FastAPI: This is equivalent to returning StreamingResponse in FastAPI. The stream() helper manages the connection and lets you write chunks as they arrive.
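Hono also provides a streamSSE helper that sets the text/event-stream headers and frames each event for you, which is useful when you build the events yourself instead of proxying the provider's raw SSE bytes. A minimal sketch (the token list is illustrative data, not AI output):

```typescript
import { Hono } from "hono";
import { streamSSE } from "hono/streaming";

const app = new Hono();

app.get("/events", (c) =>
  // streamSSE sets Content-Type: text/event-stream and formats each event
  streamSSE(c, async (stream) => {
    for (const token of ["Hello", "!", " How", " can", " I", " help", "?"]) {
      await stream.writeSSE({ data: token });
      await stream.sleep(100); // brief pause so the stream is visible
    }
  })
);

export default app;
```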
Fastify: Node.js Power with Plugin Ecosystem
Fastify is the established choice for Node.js production APIs. It's faster than Express, has excellent TypeScript support, and offers a rich plugin ecosystem.
import Fastify from "fastify";
const app = Fastify({
logger: true, // Built-in structured logging
});
// Type-safe request body
interface ChatRequest {
prompt: string;
}
interface ChatResponse {
message: string;
model: string;
}
app.post<{
Body: ChatRequest;
Reply: ChatResponse;
}>("/chat", async (request, reply) => {
const { prompt } = request.body;
// Call AI provider
const response = await fetch("https://api.openai.com/v1/chat/completions", {
method: "POST",
headers: {
"Authorization": `Bearer ${process.env.OPENAI_API_KEY}`,
"Content-Type": "application/json",
},
body: JSON.stringify({
model: "gpt-4",
messages: [{ role: "user", content: prompt }],
}),
});
const data = await response.json();
return {
message: data.choices[0].message.content,
model: data.model,
};
});
app.listen({ port: 3000 });
Output:
{"level":30,"time":1704067200000,"msg":"Server listening at http://127.0.0.1:3000"}
POST /chat {"prompt": "Hello"}
{"message":"Hello! How can I assist you today?","model":"gpt-4"}
Streaming with Fastify
Fastify requires manual SSE handling through the raw response object:
import Fastify from "fastify";
const app = Fastify();
app.post("/chat/stream", async (request, reply) => {
const { prompt } = request.body as { prompt: string };
  // Take over the response: hijack() tells Fastify we will send it ourselves via reply.raw
  reply.hijack();

  // Set SSE headers manually
  reply.raw.writeHead(200, {
    "Content-Type": "text/event-stream",
    "Cache-Control": "no-cache",
    "Connection": "keep-alive",
  });
// Call AI with streaming
const response = await fetch("https://api.openai.com/v1/chat/completions", {
method: "POST",
headers: {
"Authorization": `Bearer ${process.env.OPENAI_API_KEY}`,
"Content-Type": "application/json",
},
body: JSON.stringify({
model: "gpt-4",
messages: [{ role: "user", content: prompt }],
stream: true,
}),
});
if (!response.body) {
reply.raw.end();
return;
}
const reader = response.body.getReader();
const decoder = new TextDecoder();
while (true) {
const { done, value } = await reader.read();
if (done) break;
const chunk = decoder.decode(value, { stream: true });
reply.raw.write(chunk);
}
reply.raw.end();
});
app.listen({ port: 3000 });
Key difference from Hono: Fastify hands you Node.js's native HTTP response through reply.raw (after reply.hijack() takes the response out of Fastify's lifecycle). This is more verbose but gives you full control.
Compared to FastAPI: In FastAPI, you'd return StreamingResponse(generate(), media_type="text/event-stream"). Fastify's approach is more manual but follows the same pattern.
ElysiaJS: Bun-Native Performance
ElysiaJS is built specifically for Bun, leveraging its performance optimizations. In synthetic benchmarks it handles well over a million requests per second, putting it in the same range as Go and Rust web frameworks.
import { Elysia, t } from "elysia";
const app = new Elysia()
.post(
"/chat",
async ({ body }) => {
const response = await fetch("https://api.openai.com/v1/chat/completions", {
method: "POST",
headers: {
"Authorization": `Bearer ${process.env.OPENAI_API_KEY}`,
"Content-Type": "application/json",
},
body: JSON.stringify({
model: "gpt-4",
messages: [{ role: "user", content: body.prompt }],
}),
});
const data = await response.json();
return {
message: data.choices[0].message.content,
model: data.model,
};
},
{
// Schema validation with automatic TypeScript inference
body: t.Object({
prompt: t.String(),
}),
response: t.Object({
message: t.String(),
model: t.String(),
}),
}
)
.listen(3000);
console.log(`Server running at ${app.server?.hostname}:${app.server?.port}`);
Output:
Server running at localhost:3000
POST /chat {"prompt": "Hello"}
{"message":"Hello! How can I assist you today?","model":"gpt-4"}
Streaming with ElysiaJS
ElysiaJS supports streaming through generators and async iterators:
import { Elysia, t } from "elysia";
const app = new Elysia()
.post(
"/chat/stream",
async function* ({ body }) {
const response = await fetch("https://api.openai.com/v1/chat/completions", {
method: "POST",
headers: {
"Authorization": `Bearer ${process.env.OPENAI_API_KEY}`,
"Content-Type": "application/json",
},
body: JSON.stringify({
model: "gpt-4",
messages: [{ role: "user", content: body.prompt }],
stream: true,
}),
});
if (!response.body) {
yield "Error: No response body";
return;
}
const reader = response.body.getReader();
const decoder = new TextDecoder();
while (true) {
const { done, value } = await reader.read();
if (done) break;
yield decoder.decode(value, { stream: true });
}
},
{
body: t.Object({
prompt: t.String(),
}),
}
)
.listen(3000);
Key advantage: ElysiaJS uses generator syntax (function* with yield), which is more ergonomic than manual stream handling. The framework sets the SSE headers automatically.
Framework Selection Guide
Here's when to choose each framework:
| Scenario | Recommended | Why |
|---|---|---|
| Existing Node.js infrastructure | Fastify | Best Node.js ecosystem integration |
| Cloudflare Workers deployment | Hono | The only one of these three that runs on Workers |
| Maximum cold start speed | ElysiaJS | Bun's fast startup plus Elysia's optimizations |
| Multi-runtime codebase | Hono | Same code runs everywhere |
| Complex middleware needs | Fastify | Richest plugin ecosystem |
| End-to-end type safety | ElysiaJS | Schema generates client types automatically |
| Edge functions globally | Hono | Designed for edge from the start |
| Learning / prototyping | Hono | Simplest API, works anywhere |
The FastAPI Comparison
For Python developers, here's how patterns translate:
| FastAPI | Fastify | Hono | ElysiaJS |
|---|---|---|---|
| @app.post("/path") | app.post("/path", handler) | app.post("/path", handler) | app.post("/path", handler) |
| StreamingResponse | reply.raw.write() | stream() helper | yield in generator |
| Pydantic validation | JSON Schema | Zod/Valibot middleware | Built-in t.Object() |
| Automatic OpenAPI | Plugin | Middleware | Built-in one-liner |
| Dependency injection | Decorators | Middleware context | Derive/resolve pattern |
The mental shift: FastAPI handles everything through decorators and automatic conversion. TypeScript frameworks are more explicit—you write the streaming logic directly.
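As one concrete example of the validation row above, here is a hedged sketch of the Hono approach, assuming zod and @hono/zod-validator are installed:

```typescript
import { Hono } from "hono";
import { zValidator } from "@hono/zod-validator";
import { z } from "zod";

const chatSchema = z.object({
  prompt: z.string().min(1),
});

const app = new Hono();

// Invalid bodies are rejected with a 400 before the handler runs
app.post("/chat", zValidator("json", chatSchema), (c) => {
  const { prompt } = c.req.valid("json"); // typed from the schema
  return c.json({ echo: prompt });
});

export default app;
```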
Practical Pattern: AI Chat Server
Here's a complete pattern that works for any AI chat interface. This example uses Hono for portability:
import { Hono } from "hono";
import { stream } from "hono/streaming";
import { cors } from "hono/cors";
// Types for AI responses
interface ChatMessage {
role: "user" | "assistant" | "system";
content: string;
}
interface ChatRequest {
messages: ChatMessage[];
model?: string;
}
const app = new Hono();
// Enable CORS for browser clients
app.use("/*", cors());
// Non-streaming endpoint for simple requests
app.post("/api/chat", async (c) => {
const { messages, model = "gpt-4" } = await c.req.json<ChatRequest>();
const response = await fetch("https://api.openai.com/v1/chat/completions", {
method: "POST",
headers: {
"Authorization": `Bearer ${process.env.OPENAI_API_KEY}`,
"Content-Type": "application/json",
},
body: JSON.stringify({ model, messages }),
});
const data = await response.json();
return c.json({
message: data.choices[0].message.content,
usage: data.usage,
});
});
// Streaming endpoint for real-time responses
app.post("/api/chat/stream", async (c) => {
const { messages, model = "gpt-4" } = await c.req.json<ChatRequest>();
return stream(c, async (stream) => {
const response = await fetch("https://api.openai.com/v1/chat/completions", {
method: "POST",
headers: {
"Authorization": `Bearer ${process.env.OPENAI_API_KEY}`,
"Content-Type": "application/json",
},
body: JSON.stringify({
model,
messages,
stream: true,
}),
});
if (!response.body) {
await stream.write("data: {\"error\": \"No response\"}\n\n");
return;
}
const reader = response.body.getReader();
const decoder = new TextDecoder();
while (true) {
const { done, value } = await reader.read();
if (done) break;
await stream.write(decoder.decode(value, { stream: true }));
}
});
});
export default app;
Output (streaming):
POST /api/chat/stream {"messages": [{"role": "user", "content": "Hello"}]}
data: {"choices":[{"delta":{"role":"assistant"}}]}
data: {"choices":[{"delta":{"content":"Hello"}}]}
data: {"choices":[{"delta":{"content":"!"}}]}
data: {"choices":[{"delta":{"content":" How"}}]}
data: {"choices":[{"delta":{"content":" can"}}]}
data: {"choices":[{"delta":{"content":" I"}}]}
data: {"choices":[{"delta":{"content":" help"}}]}
data: {"choices":[{"delta":{"content":"?"}}]}
data: [DONE]
This server:
- Provides both streaming and non-streaming endpoints
- Works on Node.js, Deno, Bun, and edge runtimes
- Handles CORS for browser clients
- Follows the SSE format AI clients expect
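On the client side, consuming the streaming endpoint uses the same reader pattern as the server. A minimal browser-side sketch, assuming the server above is running on localhost:3000:

```typescript
// Read the SSE stream from /api/chat/stream and log each chunk as it arrives
const response = await fetch("http://localhost:3000/api/chat/stream", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({ messages: [{ role: "user", content: "Hello" }] }),
});

const reader = response.body!.getReader();
const decoder = new TextDecoder();

while (true) {
  const { done, value } = await reader.read();
  if (done) break;
  // Each chunk holds one or more "data: ..." SSE lines
  console.log(decoder.decode(value, { stream: true }));
}
```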
Try With AI
Prompt 1: Framework Migration
I have this FastAPI endpoint:
import asyncio
from fastapi import FastAPI
from fastapi.responses import StreamingResponse
app = FastAPI()
async def generate_response(prompt: str):
for word in ["Hello", "from", "the", "AI"]:
yield f"data: {word}\n\n"
await asyncio.sleep(0.5)
@app.post("/stream")
async def stream_response(prompt: str):
return StreamingResponse(
generate_response(prompt),
media_type="text/event-stream"
)
Convert this to:
1. Hono (for edge deployment)
2. Fastify (for Node.js)
3. ElysiaJS (for Bun)
Show me how each framework handles the streaming differently.
What you're learning: Direct translation between FastAPI streaming and TypeScript frameworks—understanding which abstraction each framework provides and when you need manual control.
Prompt 2: Build a Complete Chat Server
I'm building an AI chat interface that needs:
1. A streaming endpoint for real-time token display
2. Error handling when the AI provider times out
3. Rate limiting per user
4. CORS for browser clients
I want to deploy to Cloudflare Workers initially,
but might move to AWS Lambda later.
Which framework should I use? Build me a complete server
with all these features.
What you're learning: Framework selection based on real deployment requirements, and implementing production-ready features like rate limiting and error handling in your chosen framework.
Prompt 3: Compare Performance Tradeoffs
I need to build a high-traffic AI API that will:
- Handle 10,000 requests per minute
- Stream responses with sub-100ms time-to-first-byte
- Run on Bun in production
Compare ElysiaJS vs Hono for this use case:
- What performance differences should I expect?
- Which has better streaming primitives?
- How do I benchmark both to make the decision?
Show me equivalent implementations and how to test them.
What you're learning: Performance-critical framework selection—understanding benchmarking, streaming latency, and when Bun-native optimization matters versus cross-runtime flexibility.
Safety note: When building production AI chat servers, validate all incoming requests. Never stream API keys or sensitive data. Use environment variables for credentials and ensure your streaming endpoints have proper timeout handling to prevent connection exhaustion.
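One hedged sketch of that timeout handling: abort the upstream provider call if it runs too long, using the standard AbortSignal.timeout (available in Node 18+, Deno, and Bun). The 30-second budget here is an illustrative value, not a recommendation:

```typescript
// Abort the provider request (including the streamed body) if it runs longer than 30 seconds
const messages = [{ role: "user", content: "Hello" }];

const response = await fetch("https://api.openai.com/v1/chat/completions", {
  method: "POST",
  headers: {
    "Authorization": `Bearer ${process.env.OPENAI_API_KEY}`,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({ model: "gpt-4", messages, stream: true }),
  signal: AbortSignal.timeout(30_000),
});
```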