Updated Mar 07, 2026

Chapter 16: The Knowledge Extraction Method

"The knowledge that makes a domain agent genuinely useful is the knowledge the expert cannot easily write down. The Knowledge Extraction Method is the structured process for getting it out of their head and into a SKILL.md that works."

Chapter 15 established the complete architecture of a Cowork plugin — the three-component model, the Agent Skills Pattern, the context hierarchy, the governance layer, and the ownership model. It left one question deliberately unanswered: how do you actually write a production-quality SKILL.md? The architecture tells you what goes in each section. It does not tell you how to extract the domain expertise that belongs there. This chapter answers that question with a structured methodology.

The answer has two modes. Method A extracts knowledge from experts' heads through a five-question interview framework, designed to surface the tacit professional knowledge that separates a generic agent from a genuinely useful one. Method B extracts knowledge from institutional documents through a three-pass framework (explicit rule extraction, contradiction mapping, and gap identification) that converts policy manuals, handbooks, and standard operating procedures into SKILL.md instructions while surfacing the problems that naive extraction misses. Most professional domains require both methods, and the reconciliation principle determines which takes precedence when expert judgement and documented standards conflict.
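To make the three passes of Method B concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the chapter's own tooling: the `Rule` record, the function names, the topic labels, and the sample policy statements are all hypothetical, and Pass 1 is represented by a hand-written list rather than real document parsing.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """One explicit rule pulled from a document (hypothetical record shape)."""
    topic: str       # what the rule governs
    statement: str   # the instruction as the document states it
    source: str      # the document the rule was found in


def map_contradictions(rules):
    """Pass 2: return topics that carry more than one distinct statement."""
    by_topic = defaultdict(set)
    for rule in rules:
        by_topic[rule.topic].add(rule.statement)
    return {
        topic: sorted(statements)
        for topic, statements in by_topic.items()
        if len(statements) > 1
    }


def identify_gaps(rules, required_topics):
    """Pass 3: return required topics that no extracted rule covers."""
    covered = {rule.topic for rule in rules}
    return sorted(set(required_topics) - covered)


# Pass 1 output (explicit rule extraction), written by hand for illustration.
rules = [
    Rule("expense_approval", "Manager approval above 500", "handbook"),
    Rule("expense_approval", "Manager approval above 1000", "finance_sop"),
    Rule("travel_booking", "Book through the approved portal", "handbook"),
]

contradictions = map_contradictions(rules)
gaps = identify_gaps(rules, ["expense_approval", "travel_booking", "refunds"])
```

In this toy corpus, `contradictions` flags `expense_approval` because two documents give different limits, and `gaps` reports `refunds` because no document addresses it. Both are the kind of problem a single read-through of the rules would be likely to miss.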

But extraction alone is not enough. A SKILL.md that encodes the expert's knowledge but has never been tested against the range of real-world queries has unknown coverage gaps. The validation stage — building scenario sets, scoring outputs on accuracy, calibration, and boundary compliance, interpreting failure patterns, and running the shadow mode protocol — is what converts a plausible first draft into a production-ready file. This chapter teaches both halves: how to get the knowledge out, and how to confirm it works.
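The scoring step described above can be sketched in a few lines of Python. This is a hedged illustration only. The three component names come from this chapter and the 95% figure from the Lesson 7 summary, but the pass rule (a scenario passes only when all three components pass), the `ScenarioResult` shape, and the placeholder category labels are assumptions made for the example.

```python
from dataclasses import dataclass

SHADOW_MODE_THRESHOLD = 0.95  # the 95% figure from the Lesson 7 summary


@dataclass
class ScenarioResult:
    category: str              # placeholder label; real categories come from Lesson 7
    accuracy: bool
    calibration: bool
    boundary_compliance: bool

    def passed(self) -> bool:
        # Assumption: a scenario passes only if all three components pass.
        return self.accuracy and self.calibration and self.boundary_compliance


def pass_rate(results):
    """Fraction of scenarios that passed on every component."""
    if not results:
        raise ValueError("no scenario results to score")
    return sum(r.passed() for r in results) / len(results)


def failures_by_component(results):
    """Count failures per component to show where a rewrite should focus."""
    counts = {"accuracy": 0, "calibration": 0, "boundary_compliance": 0}
    for r in results:
        for name in counts:
            if not getattr(r, name):
                counts[name] += 1
    return counts


# Twenty hypothetical scenarios: nineteen clean passes, one boundary failure.
results = [ScenarioResult("category_a", True, True, True) for _ in range(19)]
results.append(ScenarioResult("category_b", True, True, False))

rate = pass_rate(results)
ready_for_shadow_mode = rate >= SHADOW_MODE_THRESHOLD
```

Counting failures per component, rather than reporting only the overall rate, is what lets a failure pattern point at a specific part of the SKILL.md to rewrite.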

What You'll Learn

By the end of this chapter, you will be able to:

  • Explain why tacit professional knowledge resists articulation and why no platform solves this without structured extraction
  • Conduct a Method A expert interview using the five-question framework to surface decision-making logic, exceptions, and escalation conditions
  • Execute a Method B three-pass document extraction on a policy corpus, including contradiction mapping and gap identification
  • Choose and combine Methods A and B based on where a domain's critical knowledge lives, and apply the reconciliation principle when they conflict
  • Translate extraction outputs into a SKILL.md with three parts: a Persona that covers Chapter 15's four structural elements by answering three extraction-focused writing questions, a Questions section with explicit scope boundaries, and Principles that are specific enough to test
  • Build a validation scenario set with four categories at defined proportions and score outputs on three components
  • Run the Validation Loop — interpret failure patterns, execute targeted rewrites, enter shadow mode, and manage the transition to autonomous operation
  • Complete a full extraction-to-validation cycle for a professional domain

Lesson Flow

| Lesson | Title | Duration | What You'll Walk Away With |
| --- | --- | --- | --- |
| L01 | The Problem That No Platform Solves | 20 min | Understanding of tacit vs explicit knowledge, the articulation gap, and why structured extraction is necessary |
| L02 | The Five Questions — Expert Interview Framework | 30 min | The five interview questions, what each one surfaces, and how they map to SKILL.md sections |
| L03 | Conducting the Expert Interview | 20 min | The briefing protocol, note-taking approach, and north star summary that make an interview produce usable material |
| L04 | The Document Extraction Framework | 25 min | The three-pass framework for extracting SKILL.md instructions from institutional documents |
| L05 | Choosing and Combining Methods | 15 min | The domain-method mapping and reconciliation principle for multi-method extraction |
| L06 | From Extraction to SKILL.md | 25 min | How to translate extraction outputs into Persona, Questions, and Principles sections |
| L07 | Building the Validation Scenario Set | 25 min | The four scenario categories, three scoring components, and 95% threshold |
| L08 | The Validation Loop — From Draft to Production | 25 min | Failure pattern interpretation, targeted rewriting, shadow mode, and graduated autonomy |
| L09 | Hands-On Exercise — First Extraction and SKILL.md Draft | 150 min | A complete extraction-to-validation cycle for a real professional domain |
| L10 | Chapter Summary | 15 min | Synthesis of the full methodology, ready for the domain chapters |
| Quiz | Chapter Quiz | 50 min | 50 questions covering all ten lessons |

Chapter Contract

By the end of this chapter, you should be able to answer these five questions:

  1. What is the articulation gap, and why does it mean that platform improvements alone cannot produce a genuinely useful domain agent?
  2. What are the five interview questions in Method A, and what type of tacit knowledge does each one surface?
  3. What are the three passes in Method B, and what does each pass produce that the previous one did not?
  4. How do you choose between Method A and Method B, and what is the reconciliation principle when both apply?
  5. What are the four validation scenario categories, the three scoring components, and the threshold for shadow mode entry?

After Chapter 16

When you finish this chapter, your perspective shifts:

  1. You see the extraction problem. When you encounter a domain expert whose agent produces generic output, you can diagnose the cause: the tacit knowledge was never extracted, or it was extracted but not validated.
  2. You own the methodology. You can conduct an interview, extract from documents, translate into a SKILL.md, design validation scenarios, and run the validation loop — for any professional domain.
  3. You validate before you deploy. You understand that a SKILL.md is not finished when you write it — it is finished when it passes the validation threshold across a representative scenario set.
  4. You are ready for domain application. The methodology is domain-agnostic. The domain chapters that follow apply it to finance, legal, HR, healthcare, architecture, sales, and operations.

Start with Lesson 1: The Problem That No Platform Solves.