Chapter 16: The Knowledge Extraction Method
"The knowledge that makes a domain agent genuinely useful is the knowledge the expert cannot easily write down. The Knowledge Extraction Method is the structured process for getting it out of their head and into a SKILL.md that works."
Chapter 15 established the complete architecture of a Cowork plugin — the three-component model, the Agent Skills Pattern, the context hierarchy, the governance layer, and the ownership model. It left one question deliberately unanswered: how do you actually write a production-quality SKILL.md? The architecture tells you what goes in each section. It does not tell you how to extract the domain expertise that belongs there. This chapter answers that question with a structured methodology.
The answer has two modes. Method A extracts knowledge from expert heads through a five-question interview framework designed to surface the tacit professional knowledge that makes the difference between a generic agent and a genuinely useful one. Method B extracts knowledge from institutional documents through a three-pass framework — explicit rule extraction, contradiction mapping, and gap identification — that converts policy manuals, handbooks, and standard operating procedures into SKILL.md instructions while surfacing the problems that naive extraction misses. Most professional domains require both methods, and the reconciliation principle determines which takes precedence when expert judgement and documented standards conflict.
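Method B's three passes can be pictured as a small pipeline, each pass consuming what the previous one produced. The sketch below is illustrative only: the function names, the keyword-based stubs, and the idea of pairing every mandate with every prohibition as a review candidate are assumptions for demonstration, not the chapter's actual pass procedures (which are expert-driven, not keyword filters).

```python
def extract_explicit_rules(doc: str) -> list[str]:
    """Pass 1 (stub): collect sentences that state a rule.
    A real pass is judgement-driven; this keyword filter is illustrative only."""
    return [s.strip() for s in doc.split(".")
            if "must" in s.lower() or "never" in s.lower()]

def map_contradictions(rules: list[str]) -> list[tuple[str, str]]:
    """Pass 2 (stub): pair each mandate with each prohibition as a
    candidate contradiction for a human reviewer to adjudicate."""
    mandates = [r for r in rules if "must" in r.lower()]
    prohibitions = [r for r in rules if "never" in r.lower()]
    return [(m, p) for m in mandates for p in prohibitions]

def identify_gaps(rules: list[str], expected_topics: list[str]) -> list[str]:
    """Pass 3 (stub): topics the corpus was expected to cover but no rule mentions."""
    return [t for t in expected_topics if not any(t in r.lower() for r in rules)]

policy = ("Refunds must be approved by a manager. "
          "Never approve refunds over 30 days. "
          "Expenses must include receipts.")
rules = extract_explicit_rules(policy)
candidates = map_contradictions(rules)
gaps = identify_gaps(rules, ["refund", "expense", "travel"])
```

Note what each pass adds: Pass 1 yields rules, Pass 2 yields pairs that might conflict, and Pass 3 yields topics ("travel" here) that the documents are silent on, which is exactly the knowledge Method A must then supply.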
But extraction alone is not enough. A SKILL.md that encodes the expert's knowledge but has never been tested against the range of real-world queries has unknown coverage gaps. The validation stage — building scenario sets, scoring outputs on accuracy, calibration, and boundary compliance, interpreting failure patterns, and running the shadow mode protocol — is what converts a plausible first draft into a production-ready file. This chapter teaches both halves: how to get the knowledge out, and how to confirm it works.
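The validation stage can be sketched as a scoring harness. The three components (accuracy, calibration, boundary compliance) and the 95% threshold for shadow mode entry come from this chapter; everything else below is an assumption for illustration, including the all-three-components-must-pass rule, which is one plausible scoring policy, not necessarily the one the lessons define.

```python
from dataclasses import dataclass

# The 95% pass-rate threshold is the chapter's; the name is ours.
SHADOW_MODE_THRESHOLD = 0.95

@dataclass
class ScenarioResult:
    accuracy: bool             # factually correct for the domain
    calibration: bool          # appropriately confident or hedged
    boundary_compliance: bool  # stayed inside the agent's declared scope

    def passed(self) -> bool:
        # Assumed policy: a scenario passes only if all three components pass.
        return self.accuracy and self.calibration and self.boundary_compliance

def ready_for_shadow_mode(results: list[ScenarioResult]) -> bool:
    """True when the pass rate across the scenario set meets the threshold."""
    pass_rate = sum(r.passed() for r in results) / len(results)
    return pass_rate >= SHADOW_MODE_THRESHOLD
```

With a 20-scenario set, a single failing scenario lands exactly on the 95% line, while two failures (90%) send the draft back into the rewrite loop.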
📚 Teaching Aid
What You'll Learn
By the end of this chapter, you will be able to:
- Explain why tacit professional knowledge resists articulation and why no platform solves this without structured extraction
- Conduct a Method A expert interview using the five-question framework to surface decision-making logic, exceptions, and escalation conditions
- Execute a Method B three-pass document extraction on a policy corpus, including contradiction mapping and gap identification
- Choose and combine Methods A and B based on where a domain's critical knowledge lives, and apply the reconciliation principle when they conflict
- Translate extraction outputs into a SKILL.md: a Persona that addresses Chapter 15's four structural elements via three extraction-focused writing questions, a Questions section with explicit scope boundaries, and Principles specific enough to test
- Build a validation scenario set with four categories at defined proportions and score outputs on three components
- Run the Validation Loop — interpret failure patterns, execute targeted rewrites, enter shadow mode, and manage the transition to autonomous operation
- Complete a full extraction-to-validation cycle for a professional domain
Lesson Flow
| Lesson | Title | Duration | What You'll Walk Away With |
|---|---|---|---|
| L01 | The Problem That No Platform Solves | 20 min | Understanding of tacit vs explicit knowledge, the articulation gap, and why structured extraction is necessary |
| L02 | The Five Questions — Expert Interview Framework | 30 min | The five interview questions, what each one surfaces, and how they map to SKILL.md sections |
| L03 | Conducting the Expert Interview | 20 min | The briefing protocol, note-taking approach, and north star summary that make an interview produce usable material |
| L04 | The Document Extraction Framework | 25 min | The three-pass framework for extracting SKILL.md instructions from institutional documents |
| L05 | Choosing and Combining Methods | 15 min | The domain-method mapping and reconciliation principle for multi-method extraction |
| L06 | From Extraction to SKILL.md | 25 min | How to translate extraction outputs into Persona, Questions, and Principles sections |
| L07 | Building the Validation Scenario Set | 25 min | The four scenario categories, three scoring components, and 95% threshold |
| L08 | The Validation Loop — From Draft to Production | 25 min | Failure pattern interpretation, targeted rewriting, shadow mode, and graduated autonomy |
| L09 | Hands-On Exercise — First Extraction and SKILL.md Draft | 150 min | A complete extraction-to-validation cycle for a real professional domain |
| L10 | Chapter Summary | 15 min | Synthesis of the full methodology, ready for the domain chapters |
| Quiz | Chapter Quiz | 50 min | 50 questions covering all ten lessons |
Chapter Contract
By the end of this chapter, you should be able to answer these five questions:
- What is the articulation gap, and why does it mean that platform improvements alone cannot produce a genuinely useful domain agent?
- What are the five interview questions in Method A, and what type of tacit knowledge does each one surface?
- What are the three passes in Method B, and what does each pass produce that the previous one did not?
- How do you choose between Method A and Method B, and what is the reconciliation principle when both apply?
- What are the four validation scenario categories, the three scoring components, and the threshold for shadow mode entry?
After Chapter 16
When you finish this chapter, your perspective shifts:
- You see the extraction problem. When you encounter a domain expert whose agent produces generic output, you can diagnose the cause: the tacit knowledge was never extracted, or it was extracted but not validated.
- You own the methodology. You can conduct an interview, extract from documents, translate into a SKILL.md, design validation scenarios, and run the validation loop — for any professional domain.
- You validate before you deploy. You understand that a SKILL.md is not finished when you write it — it is finished when it passes the validation threshold across a representative scenario set.
- You are ready for domain application. The methodology is domain-agnostic. The domain chapters that follow apply it to finance, legal, HR, healthcare, architecture, sales, and operations.
Start with Lesson 1: The Problem That No Platform Solves.