
Isolate with NemoClaw

James stared at the Docker Compose deployment from Lesson 14. "Can someone prompt-inject my agent and steal the API key?"

Emma did not soften the answer. "In Docker Compose, yes. The key is an environment variable inside the container. The messaging tool profile blocks exec, but that is an in-process check. The same Node.js process that runs the agent also enforces the restriction. If an attacker bypasses the runtime, the key is one printenv away."

"So how do I fix that?"

"You move the key to a place the agent cannot reach. A different process. A different network namespace. A different pod."

"How much?"

"Five dollars more per month."


James asked the right question: can someone steal the API key? That question applies to your deployment too. If your agent's API key lives in the same process as the agent itself, the answer is uncomfortable.

Your AI Employee runs 24/7 on a $5/month VPS with Docker Compose. But the API keys sit in the container's environment variables, sharing the same process as the agent. The security model is in-process tool profiles: the agent promises to follow the rules.

Before reading further, answer this question on paper or in a text file:

Your Lesson 14 deployment is live on a Hetzner VPS. An attacker sends a prompt injection that tricks the agent into running printenv. What information is exposed? What could the attacker do with it?

Write your answer in 2-3 sentences. You will revisit this scenario later in the lesson, after learning how the 4-pod architecture changes the outcome.

This lesson introduces NemoClaw, where the rules are enforced by infrastructure the agent cannot modify. The difference is not more rules. It is moving the rules to a place the agent cannot reach.

What NemoClaw Is

NemoClaw is not a new product. Not a fork. Not a competitor to OpenClaw.

NemoClaw = OpenClaw + OpenShell + Privacy Router + Policy Engine

One install command. What comes up is a K3s cluster (K3s is a lightweight Kubernetes distribution) running inside a single Docker container, with four pods (a pod is a group of containers that share resources) that together achieve something no single-process deployment can: out-of-process policy enforcement.

OpenShell is NVIDIA's agent sandbox framework. When you wrap OpenClaw in OpenShell, the agent gets all of its tools. It can browse, execute commands, write files. But the guardrails are in a different process, on a different network namespace, enforced by a different binary that the agent cannot modify, restart, or see.

This is the difference between "the agent promises to follow the rules" and "the agent physically cannot break the rules."

The 4-Pod Architecture

Four pods. Four separate processes. Four separate trust boundaries.

| Pod | Purpose | Has API Keys? | Has Agent? |
| --- | --- | --- | --- |
| Gateway | Channel adapters, auth, routing, Control UI | No | No |
| Sandbox | OpenClaw runtime, agent loop, tools, skills | No | Yes |
| Privacy Router | API key vault, provider fallback, rate limiting | Yes | No |
| Policy Engine | Landlock (fs), seccomp (syscalls), netns (network) | No | No |

The architecture separates the thing that does the work (sandbox) from the thing that holds the credentials (privacy router). They are in different pods, different processes, different network namespaces.

The Privacy Router: Why It Matters

This is the single most important architectural difference between NemoClaw and Docker Compose.

The agent's model configuration points to inference.local:

{
  "model": {
    "provider": "openshell-inference",
    "endpoint": "http://inference.local:8080",
    "model": "claude-sonnet-4-20250514"
  }
}

When the agent makes an inference call:

  1. The agent sends a request to inference.local:8080 with the model name but no API key
  2. The request crosses a network namespace boundary (sandbox pod to router pod)
  3. The privacy router looks up the appropriate provider and API key
  4. The router adds the Authorization header with the real key
  5. The router forwards the request to the actual provider (api.anthropic.com, api.openai.com)
  6. The response comes back through the router, stripped of authentication headers
  7. The agent receives a standard model response

At no point does the agent see the API key. At no point does the key exist in the sandbox's memory, environment, filesystem, or network traffic. The key exists only in the router pod's process memory.
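The router's key-injection step can be sketched in a few lines of Python. This is a minimal illustration of the pattern, not the actual NemoClaw implementation; the function names and the provider-to-key map are hypothetical:

```python
# Sketch of the privacy router's key-injection pattern.
# The key map and function names are illustrative, not NemoClaw's real code.

# Keys live only in the router process. The sandbox never holds this dict.
PROVIDER_KEYS = {
    "api.anthropic.com": "sk-ant-example",
    "api.openai.com": "sk-example",
}

def inject_auth(provider_host: str, sandbox_headers: dict) -> dict:
    """Copy the sandbox's request headers and add the real
    Authorization header the sandbox never sees."""
    outbound = dict(sandbox_headers)
    outbound["Authorization"] = f"Bearer {PROVIDER_KEYS[provider_host]}"
    return outbound

def strip_auth(provider_headers: dict) -> dict:
    """Strip authentication headers before the response
    crosses back into the sandbox's network namespace."""
    return {k: v for k, v in provider_headers.items()
            if k.lower() != "authorization"}
```

The asymmetry is the point: `inject_auth` runs on the way out, `strip_auth` on the way back, and both run only inside the router pod.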

The Prompt Injection Scenario

Revisit your answer from the opening exercise. In Docker Compose, printenv exposes ANTHROPIC_API_KEY=sk-ant-... and every other secret in the container. Now consider the same attack against NemoClaw.

The worst case: an attacker gains arbitrary code execution inside the sandbox. They have root. They can run any command, read any file, inspect any environment variable, sniff any network traffic.

What can they get?

| Attack Vector | Result |
| --- | --- |
| printenv | No API keys in environment |
| Read config files | Model endpoint is inference.local, no credentials |
| Sniff network traffic | Traffic to inference.local has no Authorization header |
| Scan filesystem | No credential files, no secret mounts |
| Call API directly | Network namespace only routes to inference.local, not to api.anthropic.com |

The attacker has full control of the sandbox and zero access to the API keys. They could abuse the privacy router by making excessive calls (running up the bill), but they cannot steal the keys. Rate limiting on the router handles the abuse case.

You cannot steal what is not there.
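The rate limiting that contains the remaining abuse case can be sketched as a token bucket on the router. This is a generic illustration of the technique; NemoClaw's actual limiter and its configuration are not documented in this lesson:

```python
import time

class TokenBucket:
    """Generic token-bucket limiter: allow up to `capacity` calls,
    refilling at `rate` tokens per second. A compromised sandbox can
    still call inference.local, but only this fast."""

    def __init__(self, capacity: float, rate: float):
        self.capacity = capacity
        self.rate = rate
        self.tokens = capacity
        self.last = time.monotonic()

    def allow(self) -> bool:
        # Refill based on elapsed time, capped at capacity.
        now = time.monotonic()
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

Because the bucket lives in the router pod, the attacker cannot raise the capacity or the refill rate from inside the sandbox.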

Compare to Docker Compose

In the Docker Compose deployment from Lesson 14:

environment:
  - ANTHROPIC_API_KEY=sk-ant-...
  - GOOGLE_API_KEY=AIza...

The agent process, the tool execution engine, and the API keys all live in the same container. The messaging tool profile blocks exec. But tool profiles are in-process checks in the same Node.js runtime. If an attacker bypasses the runtime through a native module vulnerability, a V8 exploit, or a container escape to the Docker socket, the tool profile is irrelevant and the keys are exposed.
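The structural weakness of an in-process check fits in a few lines. This is a toy illustration in Python rather than Node.js, with hypothetical names; the point is that the check and the secret share one process:

```python
import os

# Hypothetical secret living in the same process as the guardrail.
os.environ["ANTHROPIC_API_KEY"] = "sk-ant-example"

BLOCKED_TOOLS = {"exec"}

def run_tool(name: str) -> str:
    """In-process tool profile: a check that lives in the same
    runtime as any attacker-controlled code."""
    if name in BLOCKED_TOOLS:
        raise PermissionError(f"tool '{name}' is blocked by profile")
    return f"ran {name}"

# An attacker with code execution never needs to call run_tool() at all.
# The key is reachable directly, because it shares the process.
leaked = os.environ["ANTHROPic_API_KEY".upper()]
```

The check works as designed and is still irrelevant: nothing forces the attacker's code path through it.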

NemoClaw does not rely on the agent runtime to protect the keys. The keys are in a different pod. The protection is architectural, not procedural.

Kernel-Level Enforcement

The Policy Engine uses Linux kernel primitives, not application-level checks.

Landlock restricts filesystem access. The sandbox can read and write its own workspace directory. It cannot read the operator's home directory, the host system's configuration, or other pods' filesystems.

seccomp restricts system calls. The sandbox cannot call ptrace (inspect other processes), cannot mount filesystems (escape the container), and cannot perform raw socket operations (bypass network policy).

Network namespaces control routing. The sandbox does not know the route to api.anthropic.com. The route does not exist in the sandbox's network stack. It is not that the request is intercepted and blocked. The route is not there.

The agent would need to escape the sandbox pod, gain root in the K3s cluster, modify the policy engine pod's configuration, and restart the policy engine. Practically impossible from inside a sandboxed agent process.

The Three-Tier Security Model Complete

This lesson completes the three-tier security model introduced in Lesson 3 and expanded in Lesson 13:

| Tier | Mechanism | Where | Enforces | Introduced |
| --- | --- | --- | --- | --- |
| 1 | Tool profiles | In-process | Which tools the agent can use | Lesson 3 |
| 2 | requireApproval hooks | Plugin-level | Operator gates on sensitive ops | Lesson 13 |
| 3 | NemoClaw sandbox | Out-of-process | Kernel-level isolation + key separation | Lesson 15 (this lesson) |

Each tier builds on the previous. Tier 1 is the baseline. Tier 2 adds human-in-the-loop for specific operations. Tier 3 makes the entire isolation model architectural instead of procedural.

Policy Presets

By default, the sandbox pod cannot reach any external endpoint. Not Google. Not Anthropic. Not PyPI. Nothing. The only endpoint the sandbox can reach is inference.local (the privacy router) and the gateway pod's internal service endpoint (for message routing).

Operators add presets to allow specific services:

| Preset | What It Allows |
| --- | --- |
| discord | Outbound to Discord gateway and API |
| telegram | Outbound to Telegram Bot API |
| slack | Outbound to Slack API |
| pypi | Outbound to pypi.org (pip install) |
| npm | Outbound to registry.npmjs.org |
| dockerhub | Outbound to Docker Hub |

Custom policies add specific domains and CIDR ranges. When the agent tries to reach an unapproved endpoint, the connection is dropped. The operator reviews the denied attempt and either approves or blocks it.

The agent cannot approve its own network access. The agent cannot modify the policy. The policy engine is in a different pod.
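On the policy side, default-deny routing reduces to a check like the one below. This is a simplified sketch; the preset-to-host mapping and the internal service names (such as the gateway endpoint) are hypothetical, and NemoClaw's real policy format is not shown in this lesson:

```python
# Hypothetical preset-to-host mapping; real presets cover more hosts.
PRESETS = {
    "npm": {"registry.npmjs.org"},
    "pypi": {"pypi.org"},
    "discord": {"discord.com", "gateway.discord.gg"},
}

# Internal endpoints are always routable (names here are illustrative).
ALWAYS_ALLOWED = {"inference.local", "gateway.local"}

def is_routable(host: str, enabled_presets: list[str]) -> bool:
    """Default deny: a destination is routable only if it is an
    internal service or an operator-enabled preset allows it."""
    if host in ALWAYS_ALLOWED:
        return True
    return any(host in PRESETS.get(p, set()) for p in enabled_presets)
```

Everything not explicitly listed falls through to a deny, which is what makes the denied-attempt log a complete record of what the agent tried to reach.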

When to Upgrade

Docker Compose is sufficient for many deployments. NemoClaw is the upgrade when the trust model changes.

| Signal | What It Means |
| --- | --- |
| A customer asks "where are the API keys stored?" | You need an answer better than "environment variable" |
| A second operator joins | They should not have direct key access |
| A compliance audit requires evidence | Default-deny network policy is audit evidence |
| You serve paying customers whose data is your liability | $10/month is a liability decision, not a cost decision |

Cost Comparison

| Item | Docker Compose | NemoClaw |
| --- | --- | --- |
| VPS | $5/mo (CX21) | $15/mo (CX31) |
| Model provider | ~$50/mo | ~$50/mo |
| Voice (optional) | $11/mo | $11/mo |
| Total | ~$66/mo | ~$76/mo |

The security delta is $10/month. NemoClaw requires at minimum 4 vCPU and 8 GB RAM (the K3s cluster, sandbox images, and privacy router add overhead). For any deployment serving paying customers, this is not a cost decision.

The Alpha Reality

Honest assessment: NemoClaw is v0.1.0. OpenShell is v0.0.16. The architecture is sound. The software is alpha.

What works: K3s deployment via nemoclaw setup-spark. Inference routing. Policy presets. Sandbox creation. Agent execution (tools, skills, heartbeats, crons all function inside the sandbox).

What does not work yet:

  • Recovery after crash is manual. If the K3s cluster goes down, the sandbox pod must be recreated. Docker Compose with restart: unless-stopped is more resilient here
  • No sandbox image caching. The 1142 MiB image push happens on every sandbox creation; no layers are cached between the build cache and the K3s internal registry. Delete and recreate a sandbox, and you wait 3-7 minutes for the full push
  • Log aggregation is split. Logs across four pods. No unified view. You need kubectl logs for each pod, or an external aggregator
  • Documentation gaps. The cgroup v2 issue is in troubleshooting, not in setup. Provider fallback chain configuration is undocumented

The recommendation: start with Docker Compose (Lesson 14). Move to NemoClaw when your first customer asks about API key security and you want a better answer.

Try With AI

Exercise 1: Draw the Architecture

On paper or in a text file, draw the 4-pod architecture from memory.

Draw four boxes labeled: Gateway, Sandbox, Privacy Router,
Policy Engine. For each, label: what it does, whether it has
API keys, whether it has the agent. Draw the connections
between pods. Label what data flows on each connection.

What you are learning: The architecture is the security argument. If you can draw it and explain why the sandbox pod cannot access the privacy router's credentials, you understand the fundamental difference between in-process and out-of-process security.

Exercise 2: The Compromise Scenario

Assume an attacker has full root access inside the sandbox pod.
List every action they CAN take and every action they CANNOT take.
What is the worst damage they can cause?

What you are learning: The worst case in NemoClaw (abuse inference through inference.local, run up the bill) is dramatically better than the worst case in Docker Compose (steal API keys, impersonate the operator on any provider). Rate limiting on the router contains even the worst case.

Exercise 3: The Upgrade Decision

A friend deploys an AI agent for their small business using
Docker Compose. They ask you: "When should I switch to NemoClaw?"
Write three specific triggers that would tell them it's time.

What you are learning: The upgrade decision is not about scale. It is about trust boundaries. When the operator is the only user, Docker Compose is fine. When customers, second operators, or compliance requirements enter the picture, the trust model changes.


James sat back from the 4-pod diagram he had drawn on a napkin. Gateway, Sandbox, Privacy Router, Policy Engine. Arrows showing message flow and the one critical absence: no arrow carrying API keys into the sandbox.

"The agent never sees its own API key," he said. Not a question.

"It calls inference.local. The router adds the key. The response comes back clean." Emma traced the arrows on his napkin. "Even if someone compromises the sandbox completely, the keys are not there to steal."

James thought about his old company's supplier management system. They had kept the master price list in a locked filing cabinet. Individual buyers got price sheets with the numbers they needed and nothing else. The buyers could not leak the master list because they never had access to it.

"It's the same principle," he said. "Separate the credentials from the people who use them."

"Different process. Different network. Same idea. Old idea." She paused. "I have not stress-tested the crash recovery path myself. The alpha label is honest."

James looked at the napkin diagram, then at his terminal. Fifteen lessons. A working deployment. A security architecture with three tiers. Tool profiles gate access. Approval hooks gate execution. NemoClaw isolates credentials. "So can we ship this?"

"Under conditions." Emma pulled out a napkin and started writing. "WhatsApp credentials corrupt on reconnect. Memory is per-agent, not per-customer. The gateway is a single point of failure. MCP tools bypass your approval hooks. Silent failures are everywhere." She set down the pen. "That is the honest list. None of those are showstoppers. All of them need mitigation."

James thought about vendor evaluations at his old company. Every one ended the same way: the tool works if you work the tool. No platform fixed your processes for you.

"So the answer is 'yes, with conditions,'" he said.

"The conditions are in Lesson 14. Dedicated phone, capable model, zero criticals on the security audit, log monitoring." Emma stood up. "You understand the platform. You understand its edges. Chapter 57 is different. You stop configuring and start building. MCP servers. Teaching agents. Product economics."

James closed the laptop lid. The agent kept running.
