Anatomy of MCP-Wrapping Skills
You've learned that MCP servers are powerful tools for accessing external capabilities. But there's a critical insight that separates students from production developers: MCP servers alone are "dumb tools." They execute requests efficiently, but they don't know when to call themselves, how to filter results for your specific needs, or why some approaches work better than others.
That's where MCP-wrapping skills come in.
A skill that wraps an MCP server adds an intelligence layer on top of raw MCP capability. This layer encodes decision-making: when does this skill activate? What questions does it ask to understand the user's real need? How does it filter the MCP response to avoid token waste? What happens when the MCP server fails?
In this lesson, you'll analyze two real-world MCP-wrapping skills—fetching-library-docs and browsing-with-playwright—to understand this pattern. By the end, you'll be able to recognize why these skills are more valuable than raw MCP invocation, and you'll be ready to build your own in Lesson 4.
What Makes a Skill Different From MCP?
MCP Server: Raw Capability
An MCP server is a tool that does one thing well. For example, Context7 (the MCP server behind fetching-library-docs) fetches documentation. When you invoke it directly with a library name and topic, it returns all relevant documentation.
Typical direct MCP response: 934 tokens of documentation covering concepts, examples, edge cases, deprecations, and reference material.
The problem: You asked for "React useState examples." You're drowning in documentation when you only need 3-5 code examples and the function signature.
Skill: Intelligence Layer
A skill that wraps MCP adds decision-making. Before calling the MCP server, it asks questions. After getting results, it filters them based on content type. If the MCP server fails, it recovers gracefully.
Skill-filtered response: Content-type specific output—code examples, API signatures, setup commands, or concepts—with 60-90% token reduction.
The value: You specify what you need (examples, api-ref, setup, concepts, migration, troubleshooting, patterns) and get exactly that. The skill made decisions about what matters for your use case.
This difference is profound. In production systems where context is precious and API calls are expensive, this intelligence layer transforms MCP from "interesting tool" to "critical production component."
Deconstruction 1: The fetching-library-docs Skill
Let's analyze how fetching-library-docs adds intelligence on top of Context7.
Step 1: Specification First
Before understanding how the skill works, we need to know what it should do:
Intent: Fetch API documentation for published libraries (React, Next.js, Prisma, etc.) with automatic token reduction via intelligent filtering.
What triggers this skill?
Automatic triggers (context detection):
- Implementing code using library APIs (e.g., writing React component with hooks)
- Debugging library-specific errors (e.g., `PrismaClientError` in console)
- Installing or configuring frameworks (e.g., adding Tailwind to project)
- Integrating libraries together (e.g., connecting Prisma with Next.js)
- Upgrading between versions (e.g., Next.js 14 to 15 migration)
Explicit triggers (user requests):
- "Show me React hooks examples"
- "How do I use Prisma queries?"
- "What's the Next.js routing API?"
What should NOT trigger this skill?
- Exploring library source code internals (use `researching-with-deepwiki` instead)
- Accessing local file documentation
- Researching library history or comparisons
Success criteria:
- Returns relevant code examples and API signatures
- Reduces token count by 60%+ compared to raw MCP output
- Works across diverse libraries (React, Next.js, Prisma, Express, etc.)
- Graceful fallback when library not found
Step 2: The Intelligence Layer—Persona and Questions
The skill's persona determines when and how it acts:
Persona: "You are a library documentation specialist. Your role is to proactively fetch documentation when code is being written, errors are encountered, or frameworks are being configured—not wait for explicit requests. Fetch docs BEFORE writing code, not after guessing wrong."
This persona tells us the skill makes three key decisions:
- Detect when to invoke (auto-trigger on implementing, debugging, installing, integrating, upgrading)
- Identify what's needed (library from context, topic from task)
- Filter ruthlessly (content-type based on task type)
Questions the skill asks itself (through its decision logic):
- What library is relevant? (check imports, errors, package.json, or ask user)
- What topic is needed? (from error message, feature being implemented, or user specification)
- What content type fits this task? (implementing → examples, debugging → troubleshooting, installing → setup)
- Do I already have sufficient knowledge, or should I fetch fresh docs?
These questions drive proactive invocation—the skill auto-triggers based on context, not just explicit requests.
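To see the third question as a decision table, here is a minimal sketch, assuming a simple mapping from detected task type to requested content type. The first three mappings are stated above; the last two are illustrative assumptions, not the skill's documented behavior.

```python
# A minimal sketch, not the skill's real implementation: mapping the kind of
# task detected in context to the content type that should be requested.
TASK_TO_CONTENT_TYPE = {
    "implementing": "examples",        # writing code -> code blocks
    "debugging": "troubleshooting",    # error in console -> workarounds
    "installing": "setup",             # adding a framework -> terminal commands
    "upgrading": "migration",          # assumption: version bump -> before/after guidance
    "integrating": "patterns",         # assumption: connecting libraries -> best practices
}

def choose_content_type(task_kind: str) -> str:
    """Answer the third intelligence question: what content type fits this task?"""
    return TASK_TO_CONTENT_TYPE.get(task_kind, "concepts")  # default to a prose overview
```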
Step 3: Result Filtering—The Intelligence Pattern
Now let's see how the skill executes this intelligence. Look at the workflow:
| Step | What Happens | Token Cost |
|---|---|---|
| 1. Identify library + topic + content type | User asks "Show me React useState examples" → content type: `examples` | 0 (intelligence, no MCP call yet) |
| 2. Resolve library (if needed) | `--library react` → calls `resolve-library-id` MCP tool | 1 API call (can skip with `--library-id`) |
| 3. Call Context7 MCP | Fetch React documentation via `query-docs` tool | 1 API call (subprocess, doesn't count toward Claude context) |
| 4. Route by content type | `filter-by-type.sh` routes to the appropriate extractor | 0 (shell processing) |
| 5. Extract content | `extract-code-blocks.sh` extracts js/jsx code blocks | 0 (shell processing) |
| 6. Return filtered output | Return code examples to Claude | 60-70% token savings |
The breakthrough: The MCP response stays in subprocess memory. Content-type filtering extracts only what's needed. Only the filtered result enters Claude's context. 60-90% token savings achieved through content-type filtering.
Available content types: `examples` (code blocks), `api-ref` (signatures), `setup` (terminal commands), `concepts` (prose), `migration` (before/after), `troubleshooting` (workarounds), `patterns` (best practices), `all` (no filtering).
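As a rough illustration of steps 4 and 5, here is a Python sketch of content-type routing. The real skill does this with shell scripts (filter-by-type.sh routing to extractors such as extract-code-blocks.sh); the sketch only shows the shape of the idea: the full MCP response stays in the subprocess, and only the extracted slice is returned.

```python
import re
from typing import Callable

# Hypothetical Python stand-ins for the skill's shell extractors
# (filter-by-type.sh, extract-code-blocks.sh); the logic is illustrative only.
def extract_code_blocks(docs: str) -> str:
    """Keep only fenced js/jsx code blocks from the raw documentation."""
    fence = "`" * 3  # built programmatically so this snippet stays fence-safe
    return "\n\n".join(re.findall(rf"{fence}(?:js|jsx)\n.*?{fence}", docs, flags=re.DOTALL))

def extract_setup_commands(docs: str) -> str:
    """Keep only fenced bash/sh command blocks."""
    fence = "`" * 3
    return "\n\n".join(re.findall(rf"{fence}(?:bash|sh)\n.*?{fence}", docs, flags=re.DOTALL))

EXTRACTORS: dict[str, Callable[[str], str]] = {
    "examples": extract_code_blocks,
    "setup": extract_setup_commands,
    # "api-ref", "concepts", "migration", ... would each get their own extractor.
}

def filter_by_type(raw_docs: str, content_type: str) -> str:
    """Route the raw MCP response to the extractor for the requested content type."""
    extractor = EXTRACTORS.get(content_type)
    return extractor(raw_docs) if extractor else raw_docs  # "all": no filtering
```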
Call budget awareness: Context7 has a 3-call limit per question. The skill uses at most 2 calls (resolve + query when using `--library`), leaving 1 for your retry decisions. Using `--library-id` directly saves 1 call.
Step 4: Error Recovery Patterns
What happens when things go wrong? The skill distinguishes between infrastructure failures (safe to retry) and API errors (count against call budget).
Scenario 1: Library not found
- `resolve-library-id` returns no matches
- Skill returns `[LIBRARY_NOT_FOUND]` with call budget status: "1 of 3 calls used"
- Suggests: try a different spelling; common library IDs listed for reference
- Does NOT auto-retry (would waste call budget)
Scenario 2: Library mismatch
- User asks for "anthropic" but it resolves to an unrelated library
- Skill validates: does resolved ID contain the library name?
- If mismatch: returns `[LIBRARY_MISMATCH]` warning with the resolved ID and options
- Prevents wrong documentation from being used
Scenario 3: Invalid library ID format
- User provides `--library-id react` (missing the `/org/project` format)
- Skill validates format before calling MCP
- Returns `[INVALID_LIBRARY_ID]` with correct format examples
- Saves API calls by catching format errors early
Scenario 4: Network timeout
- MCP call fails with timeout/connection error
- Skill retries with exponential backoff (2s, 5s, 10s delays)
- After 3 retries: returns `[FETCH_FAILED_AFTER_RETRIES]`
- Infrastructure retries don't count against Context7's call limit
Scenario 5: Rate limit hit
- Context7 returns rate limit error
- Skill returns `[RATE_LIMIT_ERROR]` with API key setup instructions
- Does NOT retry (would be blocked anyway)
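To make the split between infrastructure failures (scenario 4) and API errors (scenario 5) concrete, here is a minimal sketch. It assumes a `query` callable that performs the MCP call and raises `TimeoutError` on infrastructure failures; `RateLimitError` is a hypothetical stand-in for Context7's rate-limit response, not a real exception class.

```python
import time

class RateLimitError(Exception):
    """Hypothetical stand-in for Context7's rate-limit response."""

RETRY_DELAYS = [2, 5, 10]  # the skill's backoff schedule, in seconds

def call_with_recovery(query):
    """Retry infrastructure failures; surface API errors immediately."""
    for delay in [0] + RETRY_DELAYS:
        time.sleep(delay)                    # 0s on the first attempt, then 2s/5s/10s
        try:
            return query()                   # the actual MCP call happens here
        except TimeoutError:                 # infrastructure failure: safe to retry,
            continue                         # doesn't count against the call budget
        except RateLimitError:               # API error: retrying would just be blocked
            return "[RATE_LIMIT_ERROR] set up an API key, then retry manually"
    return "[FETCH_FAILED_AFTER_RETRIES]"    # initial attempt + 3 retries exhausted
```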
Step 5: Configuration and Triggering
The skill doesn't call MCP on every prompt. It uses context-based auto-detection:
Activation rules (invoke automatically when):
- Implementing: About to write code using external library API
- Debugging: Error message contains library-specific terms
- Installing: Task involves adding new package or framework
- Integrating: Connecting two libraries/services together
- Upgrading: Version migration mentioned or detected
- Uncertain: About to use unfamiliar API or unsure of correct pattern
Do NOT invoke when:
- Already have sufficient knowledge from training
- Task is about local/private code (use codebase search)
- Comparing libraries (use web search)
This context-detection approach is the key insight: production skills don't wait for users to ask—they proactively fetch docs before writing code, preventing incorrect implementations.
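In the actual skill these activation rules are instructions the model evaluates from conversation context, not a script. Still, a toy heuristic can make the decision shape visible; the trigger keywords and error patterns below are illustrative assumptions.

```python
import re

# Toy heuristic only: the real skill's activation rules are evaluated by the
# model from conversation context, not computed by a standalone script.
TASK_TRIGGERS = ("implement", "install", "configure", "integrate", "upgrade", "migrate")
ERROR_TRIGGERS = {r"PrismaClient\w*Error": "prisma", r"Invalid hook call": "react"}

def detect_trigger(task: str, error_output: str = "") -> str | None:
    """Return a reason to auto-invoke the skill, or None to stay quiet."""
    if any(verb in task.lower() for verb in TASK_TRIGGERS):
        return "task involves library work (implementing/installing/upgrading)"
    for pattern, library in ERROR_TRIGGERS.items():
        if re.search(pattern, error_output):
            return f"library-specific error detected ({library})"
    return None  # no signal: rely on existing knowledge or wait for an explicit request
```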
Deconstruction 2: The browsing-with-playwright Skill
Now let's analyze a different MCP-wrapping skill to see if the pattern repeats.
Specification
Intent: Automate browser interactions for web navigation, form submission, data extraction, and UI testing using Playwright MCP.
What triggers this skill?
- "Fill out this form and submit it"
- "Extract product information from this e-commerce site"
- "Test if the login flow works"
What should NOT trigger this skill?
- Simple HTTP requests (use curl/wget instead)
- API calls (use API client or MCP API wrapper)
- Static content that doesn't require interaction
Success criteria:
- Navigate websites reliably
- Extract data from dynamic pages
- Handle form interactions robustly
- Recover from page load failures
The Intelligence Layer—Persona and Questions
Persona: "You are a web automation orchestrator. Your role is to understand the user's interaction goal, break it into steps (navigate → find elements → interact → validate), execute those steps via Playwright MCP, and report success/failure."
Questions the skill asks itself:
- What's the user trying to accomplish on this website?
- What interactions are required to complete this goal?
- How do we identify elements (by text, by accessibility role, by CSS selector)?
- What should success look like? (page navigation? form submission? data extraction?)
- If an element isn't found, should we retry or escalate?
Result Filtering—Different Pattern, Same Principle
Playwright works differently from Context7, so the filtering pattern differs:
| Step | What Happens | Output Reduction |
|---|---|---|
| 1. Understand goal | User wants to "fill contact form and submit" | Clear scope |
| 2. Navigate to page | Call Playwright MCP `browser_navigate` → returns full page DOM | 500+ tokens (full page snapshot) |
| 3. Get page snapshot | Call `browser_snapshot` → returns accessibility tree of all elements | Could be 1000+ tokens (every element on page) |
| 4. Filter to relevant elements | Intelligence layer: find only form fields + submit button | 50-100 tokens (focused subset) |
| 5. Fill form | Call Playwright MCP `browser_fill_form` with filtered element refs | Isolated interaction |
| 6. Submit and validate | Call `browser_click` + `browser_wait_for` to confirm success | Confirms completion |
Key insight: The intelligence layer reduces MCP call volume and result processing volume by filtering BEFORE processing. Instead of analyzing the full page DOM, it identifies the relevant subset (form fields) and operates on just that.
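Here is a small sketch of step 4, assuming the accessibility snapshot has already been parsed into a list of dictionaries with role, name, and ref keys; this is a simplification of Playwright MCP's actual snapshot format.

```python
# Sketch of the "filter before processing" step on an already-parsed snapshot.
def relevant_form_elements(snapshot: list[dict]) -> list[dict]:
    """Keep only the fields needed to fill and submit the form."""
    field_roles = {"textbox", "combobox", "checkbox", "radio"}
    fields = [el for el in snapshot if el["role"] in field_roles]
    submit = [el for el in snapshot
              if el["role"] == "button" and "submit" in el["name"].lower()]
    return fields + submit  # tens of tokens instead of the full 1000+ token tree

snapshot = [
    {"role": "heading", "name": "Contact us", "ref": "e1"},
    {"role": "textbox", "name": "Email", "ref": "e7"},
    {"role": "textbox", "name": "Message", "ref": "e9"},
    {"role": "button", "name": "Submit", "ref": "e12"},
    {"role": "link", "name": "Privacy policy", "ref": "e15"},
]
print(relevant_form_elements(snapshot))  # only these refs go on to the form-fill step
```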
Error Recovery in Playwright Skill
Scenario 1: Element not found
- MCP `browser_snapshot` returns the page, but the form field is not visible
- Intelligence layer retries: wait for the page to fully load, then snapshot again
- After 3 retries: report "Element not found" with screenshot for debugging
Scenario 2: Click fails
- `browser_click` on the submit button times out
- Intelligence layer: try `browser_hover` first (trigger any hover states), then click
- If still fails: try JavaScript injection (`browser_evaluate`) to trigger the click programmatically
Scenario 3: Page navigation takes longer than expected
- User expects form to be on page immediately
- Intelligence layer: use `browser_wait_for` with a timeout (2 seconds default) before giving up
- If timeout: report the waiting issue and suggest increasing the timeout
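Sketching scenario 2 as an escalation ladder makes the recovery order explicit. The tool names match the ones discussed above, but the `call(tool, args)` helper and the argument shapes are illustrative assumptions rather than the exact Playwright MCP schemas.

```python
# Illustrative escalation only: plain click, then hover-then-click, then a
# JavaScript fallback. `call` is assumed to forward (tool_name, args) to the
# Playwright MCP server and raise TimeoutError when an interaction times out.
def click_with_recovery(call, ref: str) -> str:
    call("browser_wait_for", {"time": 2})            # let the page settle first
    try:
        call("browser_click", {"ref": ref})
        return "clicked"
    except TimeoutError:
        call("browser_hover", {"ref": ref})          # trigger any hover states
    try:
        call("browser_click", {"ref": ref})
        return "clicked after hover"
    except TimeoutError:
        # Last resort: trigger the click programmatically inside the page.
        call("browser_evaluate", {"ref": ref, "function": "el => el.click()"})
        return "clicked via browser_evaluate"
```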
Pattern Recognition: The Skill Wrapping Template
Now that you've seen two different MCP-wrapping skills, let's extract the common pattern:
```
┌─────────────────────────────────────────────┐
│ MCP-WRAPPING SKILL STRUCTURE │
├─────────────────────────────────────────────┤
│ │
│ PERSONA (Identity + Decision Framework) │
│ └─ Who am I? What decisions do I make? │
│ │
│ TRIGGER CONDITIONS (When to Activate) │
│ └─ Auto-detection: context signals │
│ └─ Explicit: user requests │
│ │
│ INTELLIGENCE QUESTIONS (How do I Decide?) │
│ └─ What library? What topic? What type? │
│ │
│ MCP CONFIGURATION (Which MCP Server?) │
│ └─ How do I connect to external tool? │
│ │
│ RESULT FILTERING (What Gets Returned?) │
│ └─ How do I reduce noise/tokens? │
│ │
│ ERROR RECOVERY (What If It Fails?) │
│ └─ How do I handle failures gracefully? │
│ │
│ FALLBACK STRATEGIES (What If MCP Down?) │
│ └─ What's my backup plan? │
│ │
└─────────────────────────────────────────────┘
```
In fetching-library-docs:
- Persona: Library documentation specialist (proactive, fetches before code is written)
- Trigger: Auto-detection (implementing, debugging, installing, integrating, upgrading) + explicit requests
- Questions: What library? (from context) What topic? (from task) What content type? (from task type)
- MCP: Context7 for documentation (2 tools: resolve-library-id, query-docs)
- Filtering: Content-type router → specialized extractors (60-90% token savings)
- Error Recovery: Library validation, format validation, exponential backoff for infrastructure
- Call Budget: Max 2 calls per question, leaving 1 for retry
In browsing-with-playwright:
- Persona: Web automation orchestrator
- Trigger: User wants browser interaction
- Questions: What goal? Which elements? What counts as success?
- MCP: Playwright for browser control
- Filtering: Snapshot → filter to relevant elements only
- Error Recovery: Retry with waits, try hover before click, JavaScript injection
- Fallback: Screenshot + report, suggest manual intervention
Both follow the same structure. They wrap an MCP server with intelligence that makes the tool useful for specific purposes.
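One way to see the shared structure is to write it down as a scaffold. The class below is a hypothetical sketch, not an official skill format; it simply mirrors the boxes in the diagram above so the parallel between the two skills is explicit.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class McpWrappingSkill:                        # hypothetical scaffold, not an official format
    persona: str                               # identity + decision framework
    triggers: list[str]                        # auto-detection signals + explicit requests
    questions: list[str]                       # intelligence questions the skill asks itself
    mcp_server: str                            # which external tool it wraps
    filter_result: Callable[[str], str]        # result filtering (token reduction)
    recover: Callable[[Exception], str]        # error recovery messages
    fallback: Callable[[], str]                # backup plan when the MCP server is down

    def run(self, call_mcp: Callable[[], str]) -> str:
        try:
            return self.filter_result(call_mcp())   # only filtered output enters context
        except Exception as err:
            return self.recover(err) or self.fallback()
```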
The Three Key Insights
Insight 1: MCP Wrapping Is Decision-Making
Raw MCP is powerful but purposeless. A skill that wraps MCP adds purpose through decision-making. The questions your skill asks (implicit or explicit) drive all its behavior: when to call MCP, how to filter results, how to recover from failure.
Insight 2: Token Efficiency Is Intentional
The 60-90% token savings in fetching-library-docs didn't happen by accident. It happened because the skill's designer asked, "What does the user actually need?" and then designed content-type filtering to return only that. Using `--content-type examples` returns code blocks only; using `--content-type api-ref` returns signatures only. This is intelligence: knowing what matters and eliminating what doesn't.
Insight 3: Error Recovery Makes Skills Production-Ready
A skill that works 95% of the time is not production-ready. A skill that fails gracefully 100% of the time (with clear error messages and fallbacks) is. The intelligence layer includes not just "happy path" logic but "failure recovery" logic.
Manual Exercise: Design Your MCP-Wrapping Skill
Now that you understand the anatomy, let's design a skill (without coding) that you'll build in Lesson 4.
Scenario: Choose an MCP server from Chapter 38 that your team implemented, or imagine a new one. Design a skill that wraps it.
Instructions (on paper or in a document):
1. Write the specification (2-3 sentences):
   - What problem does this skill solve?
   - What triggers it?
   - What's the success criterion?
2. Design the persona (1-2 sentences):
   - Who is this skill? (What's its identity and expertise domain?)
   - What decisions does it make?
3. Write 5 intelligence questions (1 per line):
   - What questions does the skill ask itself to make decisions?
   - Make them specific to your domain (not generic)
4. Identify result filtering (1-2 sentences):
   - What does raw MCP return?
   - What should the filtered result contain? (Estimate token reduction: what % is removed?)
5. Plan error recovery (3 failure scenarios):
   - Scenario 1: MCP returns empty results → How does the skill respond?
   - Scenario 2: MCP times out → How does the skill respond?
   - Scenario 3: User input is ambiguous → How does the skill clarify?
6. Design fallback strategy (1-2 sentences):
   - If MCP is unavailable, what's the skill's backup plan?
Example (for reference only—design your own):
Spec: Fetch code examples from GitHub repositories matching search criteria, with result filtering for readability.
Persona: GitHub code search specialist. My role is to understand what coding pattern the user needs, search GitHub, and return only well-commented, production-quality examples.
Questions: What design pattern is the user looking for? What programming language? What project type (library, app, framework plugin)? What code quality matters (star count, recency, test coverage)?
Filtering: GitHub search returns 100 results with metadata. Filter to top 5 by stars, extract code samples only (not README bloat). Token reduction: 500 → 150 (70% savings).
Error Recovery:
- Empty results → Retry with broader search terms, suggest related patterns
- Rate limit exceeded → Queue query, inform user ("I'll fetch this in 60 seconds"), retry
- Ambiguous pattern request → Ask clarifying questions ("Do you mean Factory Pattern or Builder Pattern?")
Fallback: If GitHub API unavailable, return hardcoded examples from local cache for top 10 patterns.
Try With AI
Use your AI companion to analyze and refine your MCP-wrapping skill design.
Prompt 1: Validate Your Filtering Strategy
I've designed a skill that wraps [your chosen MCP server].
Raw MCP returns approximately [X] tokens of output.
My filtering plan is: [describe what you remove/keep].
This should reduce output to approximately [Y] tokens.
Help me validate this filtering strategy:
- Is my token reduction realistic for this type of data?
- What else could I filter without losing critical information?
- Are there hidden assumptions in my filtering logic?
What you're learning: Token efficiency analysis—understanding what filtering decisions are safe and what would cause information loss.
Prompt 2: Stress-Test Your Error Recovery
I've planned three error scenarios for my skill:
1. [Scenario 1 and your recovery strategy]
2. [Scenario 2 and your recovery strategy]
3. [Scenario 3 and your recovery strategy]
For each scenario, tell me:
- Is my recovery strategy realistic?
- What could go wrong with my recovery approach?
- What's a failure case I haven't considered?
Then, suggest one additional error scenario I should design recovery for.
What you're learning: Robustness thinking—anticipating failures and designing recovery strategies that actually work in production.
Prompt 3: Compare Your Design to Reference Skills
I'm designing a skill to wrap [your chosen MCP].
I've heard that fetching-library-docs achieves 60-90% token reduction
(depending on content type) and browsing-with-playwright has sophisticated
error recovery.
Compare my design to these two reference skills:
- Where is my design similar to their pattern?
- Where is my design different (is that intentional or a gap)?
- What could I learn from their approaches (especially call budget management)?
What you're learning: Pattern recognition—understanding how your specific skill design fits within the broader MCP-wrapping template.
Safety Note
As you design your MCP-wrapping skill, remember: intelligent filtering requires understanding what matters in your domain. The best filtering decisions come from domain expertise (knowing your users and their actual needs) combined with iterative testing. Don't over-optimize for token reduction at the expense of functionality—the 60-90% range in fetching-library-docs varies because different content types extract different amounts. Code examples (`--content-type examples`) get 60-70% savings; setup commands (`--content-type setup`) get 80-90%. Match your filtering to what users actually need.