Skip to main content
Updated Feb 16, 2026

Chapter 8: Computation & Data Extraction Workflow

"If it's math, it belongs in a script. Period."

What You'll Build

By the end of this chapter, you'll have a personal toolbox of reusable utilities:

# Your workflow by chapter end:
cat ~/finances/2025/*.csv | tax-prep

# Output:
# MEDICAL: CVS PHARMACY: $456.70
# MEDICAL: WALGREENS: $234.50
# CHARITABLE: RED CROSS: $200.00
# BUSINESS: ZOOM SUBSCRIPTION: $179.88
#
# --- TOTALS ---
# Medical: $1,891.20
# Charitable: $1,550.00
# Business: $774.32
#
# POTENTIAL DEDUCTIONS: $4,215.52

You'll transform from someone who manually categorizes expenses (tedious and error-prone) to someone who builds verified automation tools that work every tax season.

Prerequisites

From Seven Principles Chapter:

  • You understand ALL Seven Principles conceptually
  • You know why "Bash is the Key" matters (Principle 1)
  • You know why "Verification as Core Step" prevents failures (Principle 3)

From File Processing Chapter:

  • You can navigate directories (cd, ls, pwd)
  • You've run basic Bash commands
  • You understand the pipe operator (|) conceptually

Technical Requirements:

  • Python 3.x installed (see setup below)
  • Unix-like terminal (macOS, Linux, or WSL on Windows)
  • Access to Claude Code or similar AI assistant
  • A bank statement CSV export (most banks offer this)

Python Setup — verify Python is installed before starting Lesson 1:

# macOS / Linux:
python3 --version

# Windows (Command Prompt or PowerShell):
python --version

If you see a version number (3.x), you're ready. If not, install Python from python.org or use your system's package manager:

  • macOS: brew install python
  • Ubuntu/Debian: sudo apt install python3
  • Windows: Download from python.org and check "Add to PATH" during installation

Chapter Structure

LessonTitleDurationKey Skill
1From Broken Math to Your First Tool30 minBuild a Python utility from scratch
2The Testing Loop25 minVerify with exit codes and test data
3Parsing Real Data25 minParse CSV with Python's csv module
4Your Permanent Toolkit20 minMake scripts into permanent commands
5Data Wrangling30 minCategorize with regex pattern matching
6Capstone: Tax Season Prep40 minGenerate tax-ready report

Total Duration: 170 minutes (~3 hours)

Seven Principles in Action

This chapter applies the principles you learned in Chapter 6:

PrincipleHow You'll Apply It
P1: Bash is the KeyUse cat, find, xargs, pipes as your foundation
P2: Code as Universal InterfacePython scripts as reusable components
P3: Verification as Core StepZero-trust debugging with exit codes
P4: Small, Reversible DecompositionEach lesson builds one composable skill
P5: Persisting State in FilesAliases and scripts as persistent tools
P6: Constraints and SafetyTest data prevents production errors
P7: ObservabilityExit codes make failures visible

The Journey

Lesson 1: From Broken Math to Your First Tool

  • Discover why Bash arithmetic fails with decimals
  • Build sum.py — a Python script that reads numbers from stdin and calculates sums

Lesson 2: The Testing Loop

  • Learn why exit code 0 doesn't mean "correct"
  • Create test data with known answers to verify your scripts

Lesson 3: Parsing Real Data

  • Understand why simple text tools (like awk) fail on real CSV
  • Build a CSV parser with Python's csv module

Lesson 4: Your Permanent Toolkit

  • Transform scripts into permanent commands via chmod, aliases, and shell config
  • Close your terminal, open a new one, and your tools still work

Lesson 5: Data Wrangling

  • Use pattern matching to categorize transactions (Medical, Charitable, Business)
  • Handle false positives (Dr. Pepper is not a doctor)

Lesson 6: Capstone

  • Orchestrate everything into a real-world tax preparation workflow
  • Generate a report your accountant can use
  • Process a full year of bank statements with one command

Quick Start for Previous Chapter Graduates

Already comfortable with terminal basics? Here's what's new:

# Bash math FAILS with decimals (you'll discover why)
echo $((14.50 + 23.75)) # Error!

# Simple text tools FAIL on quoted CSV fields
echo '"Amazon, Inc.",-23.99' | awk -F',' '{print $2}'
# Output: Inc." -- WRONG! Broke on comma inside quotes

# Python handles both correctly (you'll build this)
cat bank-statement.csv | python sum-expenses.py
# Output: Total: $2,456.78

# Pattern matching categorizes for taxes (you'll learn this)
cat bank-statement.csv | python tax-categorize.py
# Output: Medical: $1,891.20, Charitable: $1,550.00

The Real-World Payoff

This isn't academic. By chapter end, you'll solve a real problem:

Before this chapter: Tax season means hours of manual work - opening each bank statement, hunting for medical expenses, trying to remember which "SQ *LOCALSTORE" was business vs personal.

After this chapter: Download your bank CSVs, run one command, get a categorized report. Every year. Forever.

The same pattern applies to any data extraction problem - expense reports, invoice processing, log analysis. You're learning to build tools, not just use them.