Anatomy of MCP-Wrapping Skills
You've learned that MCP servers are powerful tools for accessing external capabilities. But there's a critical insight that separates students from production developers: MCP servers alone are "dumb tools." They execute requests efficiently, but they don't know when to call themselves, how to filter results for your specific needs, or why some approaches work better than others.
That's where MCP-wrapping skills come in.
A skill that wraps an MCP server adds an intelligence layer on top of raw MCP capability. This layer encodes decision-making: when does this skill activate? What questions does it ask to understand the user's real need? How does it filter the MCP response to avoid token waste? What happens when the MCP server fails?
In this lesson, you'll analyze two real-world MCP-wrapping skills—fetching-library-docs and browsing-with-playwright—to understand this pattern. By the end, you'll be able to recognize why these skills are more valuable than raw MCP invocation, and you'll be ready to build your own in Lesson 4.
What Makes a Skill Different From MCP?
MCP Server: Raw Capability
An MCP server is a tool that does one thing well. For example, Context7 (the MCP server behind fetching-library-docs) fetches documentation. When you invoke it directly with a library name and topic, it returns all relevant documentation.
Typical direct MCP response: 934 tokens of documentation covering concepts, examples, edge cases, deprecations, and reference material.
The problem: You asked for "React useState examples." You're drowning in documentation when you only need 3-5 code examples and the function signature.
Skill: Intelligence Layer
A skill that wraps MCP adds decision-making. Before calling the MCP server, it asks questions. After getting results, it filters them based on content type. If the MCP server fails, it recovers gracefully.
Skill-filtered response: Content-type specific output—code examples, API signatures, setup commands, or concepts—with 60-90% token reduction.
The value: You specify what you need (examples, api-ref, setup, concepts, migration, troubleshooting, patterns) and get exactly that. The skill made decisions about what matters for your use case.
This difference is profound. In production systems where context is precious and API calls are expensive, this intelligence layer transforms MCP from "interesting tool" to "critical production component."
Deconstruction 1: The fetching-library-docs Skill
Let's analyze how fetching-library-docs adds intelligence on top of Context7.
Step 1: Specification First
Before understanding how the skill works, we need to know what it should do:
Intent: Fetch API documentation for published libraries (React, Next.js, Prisma, etc.) with automatic token reduction via intelligent filtering.
What triggers this skill?
Automatic triggers (context detection):
- Implementing code using library APIs (e.g., writing React component with hooks)
- Debugging library-specific errors (e.g., `PrismaClientError` in console)
- Installing or configuring frameworks (e.g., adding Tailwind to project)
- Integrating libraries together (e.g., connecting Prisma with Next.js)
- Upgrading between versions (e.g., Next.js 14 to 15 migration)
Explicit triggers (user requests):
- "Show me React hooks examples"
- "How do I use Prisma queries?"
- "What's the Next.js routing API?"
What should NOT trigger this skill?
- Exploring library source code internals (use `researching-with-deepwiki` instead)
- Accessing local file documentation
- Researching library history or comparisons
Success criteria:
- Returns relevant code examples and API signatures
- Reduces token count by 60%+ compared to raw MCP output
- Works across diverse libraries (React, Next.js, Prisma, Express, etc.)
- Graceful fallback when library not found
Step 2: The Intelligence Layer—Persona and Questions
The skill's persona determines when and how it acts:
Persona: "You are a library documentation specialist. Your role is to proactively fetch documentation when code is being written, errors are encountered, or frameworks are being configured—not wait for explicit requests. Fetch docs BEFORE writing code, not after guessing wrong."
This persona tells us the skill makes three key decisions:
- Detect when to invoke (auto-trigger on implementing, debugging, installing, integrating, upgrading)
- Identify what's needed (library from context, topic from task)
- Filter ruthlessly (content-type based on task type)
Questions the skill asks itself (through its decision logic):
- What library is relevant? (check imports, errors, package.json, or ask user)
- What topic is needed? (from error message, feature being implemented, or user specification)
- What content type fits this task? (implementing → examples, debugging → troubleshooting, installing → setup)
- Do I already have sufficient knowledge, or should I fetch fresh docs?
These questions drive proactive invocation—the skill auto-triggers based on context, not just explicit requests.
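To see the third question as a decision table, here is a minimal sketch, assuming a simple mapping from detected task type to requested content type. The first three mappings are stated above; the last two are illustrative assumptions, not the skill's documented behavior.

```python
# A minimal sketch, not the skill's real implementation: mapping the kind of
# task detected in context to the content type that should be requested.
TASK_TO_CONTENT_TYPE = {
    "implementing": "examples",        # writing code -> code blocks
    "debugging": "troubleshooting",    # error in console -> workarounds
    "installing": "setup",             # adding a framework -> terminal commands
    "upgrading": "migration",          # assumption: version bump -> before/after guidance
    "integrating": "patterns",         # assumption: connecting libraries -> best practices
}

def choose_content_type(task_kind: str) -> str:
    """Answer the third intelligence question: what content type fits this task?"""
    return TASK_TO_CONTENT_TYPE.get(task_kind, "concepts")  # default to a prose overview
```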
Step 3: Result Filtering—The Intelligence Pattern
Now let's see how the skill executes this intelligence. Look at the workflow:
| Step | What Happens | Token Cost |
|---|---|---|
| 1. Identify library + topic + content type | User asks "Show me React useState examples" → content type: `examples` | 0 (intelligence, no MCP call yet) |
| 2. Resolve library (if needed) | `--library react` → calls `resolve-library-id` MCP tool | 1 API call (can skip with `--library-id`) |
| 3. Call Context7 MCP | Fetch React documentation via `query-docs` tool | 1 API call (subprocess, doesn't count toward Claude context) |
| 4. Route by content type | `filter-by-type.sh` routes to the appropriate extractor | 0 (shell processing) |
| 5. Extract content | `extract-code-blocks.sh` extracts js/jsx code blocks | 0 (shell processing) |
| 6. Return filtered output | Return code examples to Claude | 60-70% token savings |
The breakthrough: The MCP response stays in subprocess memory. Content-type filtering extracts only what's needed. Only the filtered result enters Claude's context. 60-90% token savings achieved through content-type filtering.
Available content types: `examples` (code blocks), `api-ref` (signatures), `setup` (terminal commands), `concepts` (prose), `migration` (before/after), `troubleshooting` (workarounds), `patterns` (best practices), `all` (no filtering).
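As a rough illustration of steps 4 and 5, here is a Python sketch of content-type routing. The real skill does this with shell scripts (filter-by-type.sh routing to extractors such as extract-code-blocks.sh); the sketch only shows the shape of the idea: the full MCP response stays in the subprocess, and only the extracted slice is returned.

```python
import re
from typing import Callable

# Hypothetical Python stand-ins for the skill's shell extractors
# (filter-by-type.sh, extract-code-blocks.sh); the logic is illustrative only.
def extract_code_blocks(docs: str) -> str:
    """Keep only fenced js/jsx code blocks from the raw documentation."""
    fence = "`" * 3  # built programmatically so this snippet stays fence-safe
    return "\n\n".join(re.findall(rf"{fence}(?:js|jsx)\n.*?{fence}", docs, flags=re.DOTALL))

def extract_setup_commands(docs: str) -> str:
    """Keep only fenced bash/sh command blocks."""
    fence = "`" * 3
    return "\n\n".join(re.findall(rf"{fence}(?:bash|sh)\n.*?{fence}", docs, flags=re.DOTALL))

EXTRACTORS: dict[str, Callable[[str], str]] = {
    "examples": extract_code_blocks,
    "setup": extract_setup_commands,
    # "api-ref", "concepts", "migration", ... would each get their own extractor.
}

def filter_by_type(raw_docs: str, content_type: str) -> str:
    """Route the raw MCP response to the extractor for the requested content type."""
    extractor = EXTRACTORS.get(content_type)
    return extractor(raw_docs) if extractor else raw_docs  # "all": no filtering
```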
Call budget awareness: Context7 has a 3-call limit per question. The skill uses at most 2 calls (resolve + query when using `--library`), leaving 1 for your retry decisions. Using `--library-id` directly saves 1 call.
Step 4: Error Recovery Patterns
What happens when things go wrong? The skill distinguishes between infrastructure failures (safe to retry) and API errors (count against call budget).
Scenario 1: Library not found
- `resolve-library-id` returns no matches
- Skill returns `[LIBRARY_NOT_FOUND]` with call budget status: "1 of 3 calls used"
- Suggests: try a different spelling; common library IDs listed for reference
- Does NOT auto-retry (would waste call budget)
Scenario 2: Library mismatch
- User asks for "anthropic" but it resolves to an unrelated library
- Skill validates: does resolved ID contain the library name?
- If mismatch: returns `[LIBRARY_MISMATCH]` warning with the resolved ID and options
- Prevents wrong documentation from being used
Scenario 3: Invalid library ID format
- User provides `--library-id react` (missing the `/org/project` format)
- Skill validates format before calling MCP
- Returns `[INVALID_LIBRARY_ID]` with correct format examples
- Saves API calls by catching format errors early
Scenario 4: Network timeout
- MCP call fails with timeout/connection error
- Skill retries with exponential backoff (2s, 5s, 10s delays)
- After 3 retries: returns `[FETCH_FAILED_AFTER_RETRIES]`
- Infrastructure retries don't count against Context7's call limit
Scenario 5: Rate limit hit
- Context7 returns rate limit error
- Skill returns `[RATE_LIMIT_ERROR]` with API key setup instructions
- Does NOT retry (would be blocked anyway)
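To make the split between infrastructure failures (scenario 4) and API errors (scenario 5) concrete, here is a minimal sketch. It assumes a `query` callable that performs the MCP call and raises `TimeoutError` on infrastructure failures; `RateLimitError` is a hypothetical stand-in for Context7's rate-limit response, not a real exception class.

```python
import time

class RateLimitError(Exception):
    """Hypothetical stand-in for Context7's rate-limit response."""

RETRY_DELAYS = [2, 5, 10]  # the skill's backoff schedule, in seconds

def call_with_recovery(query):
    """Retry infrastructure failures; surface API errors immediately."""
    for delay in [0] + RETRY_DELAYS:
        time.sleep(delay)                    # 0s on the first attempt, then 2s/5s/10s
        try:
            return query()                   # the actual MCP call happens here
        except TimeoutError:                 # infrastructure failure: safe to retry,
            continue                         # doesn't count against the call budget
        except RateLimitError:               # API error: retrying would just be blocked
            return "[RATE_LIMIT_ERROR] set up an API key, then retry manually"
    return "[FETCH_FAILED_AFTER_RETRIES]"    # initial attempt + 3 retries exhausted
```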
Step 5: Configuration and Triggering
The skill doesn't call MCP on every prompt. It uses context-based auto-detection:
Activation rules (invoke automatically when):
- Implementing: About to write code using external library API
- Debugging: Error message contains library-specific terms
- Installing: Task involves adding new package or framework
- Integrating: Connecting two libraries/services together
- Upgrading: Version migration mentioned or detected
- Uncertain: About to use unfamiliar API or unsure of correct pattern
Do NOT invoke when:
- Already have sufficient knowledge from training
- Task is about local/private code (use codebase search)
- Comparing libraries (use web search)
This context-detection approach is the key insight: production skills don't wait for users to ask—they proactively fetch docs before writing code, preventing incorrect implementations.
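In the actual skill these activation rules are instructions the model evaluates from conversation context, not a script. Still, a toy heuristic can make the decision shape visible; the trigger keywords and error patterns below are illustrative assumptions.

```python
import re

# Toy heuristic only: the real skill's activation rules are evaluated by the
# model from conversation context, not computed by a standalone script.
TASK_TRIGGERS = ("implement", "install", "configure", "integrate", "upgrade", "migrate")
ERROR_TRIGGERS = {r"PrismaClient\w*Error": "prisma", r"Invalid hook call": "react"}

def detect_trigger(task: str, error_output: str = "") -> str | None:
    """Return a reason to auto-invoke the skill, or None to stay quiet."""
    if any(verb in task.lower() for verb in TASK_TRIGGERS):
        return "task involves library work (implementing/installing/upgrading)"
    for pattern, library in ERROR_TRIGGERS.items():
        if re.search(pattern, error_output):
            return f"library-specific error detected ({library})"
    return None  # no signal: rely on existing knowledge or wait for an explicit request
```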
Deconstruction 2: The browsing-with-playwright Skill
Now let's analyze a different MCP-wrapping skill to see if the pattern repeats.
Specification
Intent: Automate browser interactions for web navigation, form submission, data extraction, and UI testing using Playwright MCP.
What triggers this skill?
- "Fill out this form and submit it"
- "Extract product information from this e-commerce site"
- "Test if the login flow works"
What should NOT trigger this skill?
- Simple HTTP requests (use curl/wget instead)
- API calls (use API client or MCP API wrapper)
- Static content that doesn't require interaction
Success criteria:
- Navigate websites reliably
- Extract data from dynamic pages
- Handle form interactions robustly
- Recover from page load failures
The Intelligence Layer—Persona and Questions
Persona: "You are a web automation orchestrator. Your role is to understand the user's interaction goal, break it into steps (navigate → find elements → interact → validate), execute those steps via Playwright MCP, and report success/failure."
Questions the skill asks itself:
- What's the user trying to accomplish on this website?
- What interactions are required to complete this goal?
- How do we identify elements (by text, by accessibility role, by CSS selector)?
- What should success look like? (page navigation? form submission? data extraction?)
- If an element isn't found, should we retry or escalate?
Result Filtering—Different Pattern, Same Principle
Playwright works differently from Context7, so the filtering pattern differs:
| Step | What Happens | Output Reduction |
|---|---|---|
| 1. Understand goal | User wants to "fill contact form and submit" | Clear scope |
| 2. Navigate to page | Call Playwright MCP `browser_navigate` → returns full page DOM | 500+ tokens (full page snapshot) |
| 3. Get page snapshot | Call `browser_snapshot` → returns accessibility tree of all elements | Could be 1000+ tokens (every element on page) |
| 4. Filter to relevant elements | Intelligence layer: find only form fields + submit button | 50-100 tokens (focused subset) |
| 5. Fill form | Call Playwright MCP `browser_fill_form` with filtered element refs | Isolated interaction |
| 6. Submit and validate | Call `browser_click` + `browser_wait_for` to confirm success | Confirms completion |
Key insight: The intelligence layer reduces MCP call volume and result processing volume by filtering BEFORE processing. Instead of analyzing the full page DOM, it identifies the relevant subset (form fields) and operates on just that.
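Here is a small sketch of step 4, assuming the accessibility snapshot has already been parsed into a list of dictionaries with role, name, and ref keys; this is a simplification of Playwright MCP's actual snapshot format.

```python
# Sketch of the "filter before processing" step on an already-parsed snapshot.
def relevant_form_elements(snapshot: list[dict]) -> list[dict]:
    """Keep only the fields needed to fill and submit the form."""
    field_roles = {"textbox", "combobox", "checkbox", "radio"}
    fields = [el for el in snapshot if el["role"] in field_roles]
    submit = [el for el in snapshot
              if el["role"] == "button" and "submit" in el["name"].lower()]
    return fields + submit  # tens of tokens instead of the full 1000+ token tree

snapshot = [
    {"role": "heading", "name": "Contact us", "ref": "e1"},
    {"role": "textbox", "name": "Email", "ref": "e7"},
    {"role": "textbox", "name": "Message", "ref": "e9"},
    {"role": "button", "name": "Submit", "ref": "e12"},
    {"role": "link", "name": "Privacy policy", "ref": "e15"},
]
print(relevant_form_elements(snapshot))  # only these refs go on to the form-fill step
```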
Error Recovery in Playwright Skill
Scenario 1: Element not found
- MCP `browser_snapshot` returns the page, but the form field is not visible
- Intelligence layer retries: wait for the page to fully load, then snapshot again
- After 3 retries: report "Element not found" with screenshot for debugging
Scenario 2: Click fails
- `browser_click` on the submit button times out
- Intelligence layer: try `browser_hover` first (trigger any hover states), then click
- If still fails: try JavaScript injection (`browser_evaluate`) to trigger the click programmatically
Scenario 3: Page navigation takes longer than expected
- User expects form to be on page immediately
- Intelligence layer: use `browser_wait_for` with a timeout (2 seconds default) before giving up
- If timeout: report the waiting issue and suggest increasing the timeout
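Sketching scenario 2 as an escalation ladder makes the recovery order explicit. The tool names match the ones discussed above, but the `call(tool, args)` helper and the argument shapes are illustrative assumptions rather than the exact Playwright MCP schemas.

```python
# Illustrative escalation only: plain click, then hover-then-click, then a
# JavaScript fallback. `call` is assumed to forward (tool_name, args) to the
# Playwright MCP server and raise TimeoutError when an interaction times out.
def click_with_recovery(call, ref: str) -> str:
    call("browser_wait_for", {"time": 2})            # let the page settle first
    try:
        call("browser_click", {"ref": ref})
        return "clicked"
    except TimeoutError:
        call("browser_hover", {"ref": ref})          # trigger any hover states
    try:
        call("browser_click", {"ref": ref})
        return "clicked after hover"
    except TimeoutError:
        # Last resort: trigger the click programmatically inside the page.
        call("browser_evaluate", {"ref": ref, "function": "el => el.click()"})
        return "clicked via browser_evaluate"
```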
Pattern Recognition: The Skill Wrapping Template
Now that you've seen two different MCP-wrapping skills, let's extract the common pattern:
```
┌─────────────────────────────────────────────┐
│ MCP-WRAPPING SKILL STRUCTURE │
├─────────────────────────────────────────────┤
│ │
│ PERSONA (Identity + Decision Framework) │
│ └─ Who am I? What decisions do I make? │
│ │
│ TRIGGER CONDITIONS (When to Activate) │
│ └─ Auto-detection: context signals │
│ └─ Explicit: user requests │
│ │
│ INTELLIGENCE QUESTIONS (How do I Decide?) │
│ └─ What library? What topic? What type? │
│ │
│ MCP CONFIGURATION (Which MCP Server?) │
│ └─ How do I connect to external tool? │
│ │
│ RESULT FILTERING (What Gets Returned?) │
│ └─ How do I reduce noise/tokens? │
│ │
│ ERROR RECOVERY (What If It Fails?) │
│ └─ How do I handle failures gracefully? │
│ │
│ FALLBACK STRATEGIES (What If MCP Down?) │
│ └─ What's my backup plan? │
│ │
└─────────────────────────────────────────────┘
```
In fetching-library-docs:
- Persona: Library documentation specialist (proactive, fetches before code is written)
- Trigger: Auto-detection (implementing, debugging, installing, integrating, upgrading) + explicit requests
- Questions: What library? (from context) What topic? (from task) What content type? (from task type)
- MCP: Context7 for documentation (2 tools: resolve-library-id, query-docs)
- Filtering: Content-type router → specialized extractors (60-90% token savings)
- Error Recovery: Library validation, format validation, exponential backoff for infrastructure
- Call Budget: Max 2 calls per question, leaving 1 for retry
In browsing-with-playwright:
- Persona: Web automation orchestrator
- Trigger: User wants browser interaction
- Questions: What goal? Which elements? What counts as success?
- MCP: Playwright for browser control
- Filtering: Snapshot → filter to relevant elements only
- Error Recovery: Retry with waits, try hover before click, JavaScript injection
- Fallback: Screenshot + report, suggest manual intervention
Both follow the same structure. They wrap an MCP server with intelligence that makes the tool useful for specific purposes.
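One way to see the shared structure is to write it down as a scaffold. The class below is a hypothetical sketch, not an official skill format; it simply mirrors the boxes in the diagram above so the parallel between the two skills is explicit.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class McpWrappingSkill:                        # hypothetical scaffold, not an official format
    persona: str                               # identity + decision framework
    triggers: list[str]                        # auto-detection signals + explicit requests
    questions: list[str]                       # intelligence questions the skill asks itself
    mcp_server: str                            # which external tool it wraps
    filter_result: Callable[[str], str]        # result filtering (token reduction)
    recover: Callable[[Exception], str]        # error recovery messages
    fallback: Callable[[], str]                # backup plan when the MCP server is down

    def run(self, call_mcp: Callable[[], str]) -> str:
        try:
            return self.filter_result(call_mcp())   # only filtered output enters context
        except Exception as err:
            return self.recover(err) or self.fallback()
```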
The Three Key Insights
Insight 1: MCP Wrapping Is Decision-Making
Raw MCP is powerful but purposeless. A skill that wraps MCP adds purpose through decision-making. The questions your skill asks (implicit or explicit) drive all its behavior: when to call MCP, how to filter results, how to recover from failure.
Insight 2: Token Efficiency Is Intentional
The 60-90% token savings in fetching-library-docs didn't happen by accident. It happened because the skill's designer asked, "What does the user actually need?" and then designed content-type filtering to return only that. Using `--content-type examples` returns code blocks only; using `--content-type api-ref` returns signatures only. This is intelligence: knowing what matters and eliminating what doesn't.
Insight 3: Error Recovery Makes Skills Production-Ready
A skill that works 95% of the time is not production-ready. A skill that fails gracefully 100% of the time (with clear error messages and fallbacks) is. The intelligence layer includes not just "happy path" logic but "failure recovery" logic.
Manual Exercise: Design Your MCP-Wrapping Skill
Now that you understand the anatomy, let's design a skill (without coding) that you'll build in Lesson 4.
Scenario: Choose an MCP server from Chapter 38 that your team implemented, or imagine a new one. Design a skill that wraps it.
Instructions (on paper or in a document):
1. Write the specification (2-3 sentences):
   - What problem does this skill solve?
   - What triggers it?
   - What's the success criterion?
2. Design the persona (1-2 sentences):
   - Who is this skill? (What's its identity and expertise domain?)
   - What decisions does it make?
3. Write 5 intelligence questions (1 per line):
   - What questions does the skill ask itself to make decisions?
   - Make them specific to your domain (not generic)
4. Identify result filtering (1-2 sentences):
   - What does raw MCP return?
   - What should the filtered result contain? (Estimate token reduction: what % is removed?)
5. Plan error recovery (3 failure scenarios):
   - Scenario 1: MCP returns empty results → How does the skill respond?
   - Scenario 2: MCP times out → How does the skill respond?
   - Scenario 3: User input is ambiguous → How does the skill clarify?
6. Design fallback strategy (1-2 sentences):
   - If MCP is unavailable, what's the skill's backup plan?
Example (for reference only—design your own):
Spec: Fetch code examples from GitHub repositories matching search criteria, with result filtering for readability.
Persona: GitHub code search specialist. My role is to understand what coding pattern the user needs, search GitHub, and return only well-commented, production-quality examples.
Questions: What design pattern is the user looking for? What programming language? What project type (library, app, framework plugin)? What code quality matters (star count, recency, test coverage)?
Filtering: GitHub search returns 100 results with metadata. Filter to top 5 by stars, extract code samples only (not README bloat). Token reduction: 500 → 150 (70% savings).
Error Recovery:
- Empty results → Retry with broader search terms, suggest related patterns
- Rate limit exceeded → Queue query, inform user ("I'll fetch this in 60 seconds"), retry
- Ambiguous pattern request → Ask clarifying questions ("Do you mean Factory Pattern or Builder Pattern?")
Fallback: If GitHub API unavailable, return hardcoded examples from local cache for top 10 patterns.
Try With AI
Use your AI companion to analyze and refine your MCP-wrapping skill design.
Prompt 1: Validate Your Filtering Strategy
I've designed a skill that wraps [your chosen MCP server].
Raw MCP returns approximately [X] tokens of output.
My filtering plan is: [describe what you remove/keep].
This should reduce output to approximately [Y] tokens.
Help me validate this filtering strategy:
- Is my token reduction realistic for this type of data?
- What else could I filter without losing critical information?
- Are there hidden assumptions in my filtering logic?
What you're learning: Token efficiency analysis—understanding what filtering decisions are safe and what would cause information loss.
Prompt 2: Stress-Test Your Error Recovery
I've planned three error scenarios for my skill:
1. [Scenario 1 and your recovery strategy]
2. [Scenario 2 and your recovery strategy]
3. [Scenario 3 and your recovery strategy]
For each scenario, tell me:
- Is my recovery strategy realistic?
- What could go wrong with my recovery approach?
- What's a failure case I haven't considered?
Then, suggest one additional error scenario I should design recovery for.
What you're learning: Robustness thinking—anticipating failures and designing recovery strategies that actually work in production.
Prompt 3: Compare Your Design to Reference Skills
I'm designing a skill to wrap [your chosen MCP].
I've heard that fetching-library-docs achieves 60-90% token reduction
(depending on content type) and browsing-with-playwright has sophisticated
error recovery.
Compare my design to these two reference skills:
- Where is my design similar to their pattern?
- Where is my design different (is that intentional or a gap)?
- What could I learn from their approaches (especially call budget management)?
What you're learning: Pattern recognition—understanding how your specific skill design fits within the broader MCP-wrapping template.
Safety Note
As you design your MCP-wrapping skill, remember: intelligent filtering requires understanding what matters in your domain. The best filtering decisions come from domain expertise (knowing your users and their actual needs) combined with iterative testing. Don't over-optimize for token reduction at the expense of functionality—the 60-90% range in fetching-library-docs varies because different content types extract different amounts. Code examples (`--content-type examples`) get 60-70% savings; setup commands (`--content-type setup`) get 80-90%. Match your filtering to what users actually need.