Lead Scoring
In Lesson 2, you built NexaFlow's ICP and researched five prospects. You know WHO to target. But which prospect should get a call THIS WEEK?
Your CRM says Meridian Logistics is a "warm lead." What does that mean? Someone opened an email? The company matches your industry filter? "Warm" is not actionable. It does not tell Farah whether to call Meridian before TransPak, or whether either of them deserves attention over Al-Safa Transport in Dubai.
This lesson replaces gut instinct with a three-dimension scoring model. Three questions, scored independently: does the company match your ICP (Fit)? Is something happening RIGHT NOW that makes them likely to buy (Timing)? Have they shown interest in NexaFlow (Engagement)? The composite score tells you where to look. The dimension breakdown tells you what to do.
Why Most Lead Scoring Fails
Before building the model, consider two scoring failures from NexaFlow's history.
Failure 1: The "hot" lead that was a terrible fit. Last quarter, a 15-person startup in Lahore scored 78 on the old single-dimension system. They had downloaded three whitepapers, attended a webinar, and replied to two emails. The old system weighted engagement heavily, so the startup looked like a top prospect. Farah's team spent three weeks on calls and demos. The startup could not afford NexaFlow's pricing, had no operations team to implement the product, and churned within two months of signing. High engagement, terrible fit. The old model could not distinguish interest from ability to buy.
Failure 2: The perfect-fit company scored "cold." Falcon Logistics in Abu Dhabi matched NexaFlow's ICP on every dimension -- 180 employees, 3PL operator, legacy WMS, expanding into Saudi Arabia. But they had never visited NexaFlow's website. No email opens, no downloads, no webinar attendance. The old system scored them 22 out of 100 and the team ignored them for six months. Meanwhile, Falcon had just hired a new COO who was actively evaluating workflow automation vendors. The timing signals were there in public data -- hiring announcements, LinkedIn posts about operational efficiency, a Companies House filing showing new investment. The old system could not see them because it only measured engagement.
Both failures have the same root cause: a single-dimension score hides the story. The three-dimension model separates what matters into independent questions so each gets a clear answer.
The Three-Dimension Model
Score every prospect on three independent dimensions:
| Dimension | Points | Question It Answers |
|---|---|---|
| Fit | 0-40 | Does this company match our ICP? |
| Timing | 0-40 | Is something happening NOW? |
| Engagement | 0-20 | Do they know we exist? |
| Total | 0-100 | Composite used for ranking |
The weighting is deliberate. Fit and Timing carry equal weight because both are deal-breakers. A company that matches your ICP perfectly but has no budget right now will not buy. A company with urgent need but the wrong tech stack will not implement. Engagement carries less weight because it is the dimension you can change most directly through outreach and marketing.
Why Not a Single Score?
A prospect scoring 72/100 could be:
- Fit 35 + Timing 35 + Engagement 2 -- great company, great timing, but they have never heard of you. Action: awareness campaign.
- Fit 14 + Timing 38 + Engagement 20 -- wrong company that happens to be buying and follows your blog. Action: disqualify.
- Fit 34 + Timing 18 + Engagement 20 -- decent fit, no urgency, very engaged with content. Action: nurture and monitor for timing signals.
Same composite number. Three completely different next steps. The dimensions are the decision tool, not the total.
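The point is easy to see in code. Below is a minimal sketch with three illustrative profiles that all total 72 (the numbers are examples, not output from the lead-scoring skill):

```python
from dataclasses import dataclass

@dataclass
class LeadScore:
    fit: int         # 0-40: does the company match the ICP?
    timing: int      # 0-40: is something happening now?
    engagement: int  # 0-20: do they know we exist?

    @property
    def total(self) -> int:
        return self.fit + self.timing + self.engagement

# Three different stories hiding behind the same composite
profiles = {
    "awareness campaign": LeadScore(35, 35, 2),   # great company, never heard of you
    "disqualify":         LeadScore(14, 38, 20),  # wrong company, buying anyway
    "nurture":            LeadScore(34, 18, 20),  # right company, wrong moment
}
for action, score in profiles.items():
    print(f"{score.total}  {action}")  # all three print 72
```

Sorting these by `total` alone would treat them as interchangeable; only the breakdown tells you which action to take.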
Classification Tiers
| Classification | Total Score | Dimension Requirements | Recommended Action |
|---|---|---|---|
| HOT | 75-100 | Fit >= 25 AND Timing >= 25 | Immediate outreach or awareness campaign |
| WARM | 55-74 | Fit >= 20 OR Timing >= 20 | Nurture sequence with personalisation |
| CULTIVATE | 35-54 | At least one dimension above 15 | Quarterly check; add to newsletter |
| NOT YET | 0-34 | No dimension above 15 | Disqualify or defer to future quarter |
Notice that classification depends on BOTH the total and the dimension balance. A prospect with Fit 14, Timing 12, and Engagement 10 totals 36 -- inside the CULTIVATE range -- but no dimension is above 15, so the prospect drops to NOT YET. Middling scores everywhere mean there is no single strength worth monitoring.
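The tier rules read naturally as a cascade: test HOT first, fall through to WARM, and so on. A minimal sketch of the threshold table in Python:

```python
def classify(fit: int, timing: int, engagement: int) -> str:
    """Apply the tier thresholds from the classification table."""
    total = fit + timing + engagement
    if total >= 75 and fit >= 25 and timing >= 25:
        return "HOT"
    if total >= 55 and (fit >= 20 or timing >= 20):
        return "WARM"
    if total >= 35 and max(fit, timing, engagement) > 15:
        return "CULTIVATE"
    return "NOT YET"

print(classify(36, 37, 14))  # HOT  (total 87, both dimension gates met)
print(classify(32, 30, 10))  # WARM (total 72 misses the HOT threshold)
print(classify(18, 8, 6))    # NOT YET (total 32, below the CULTIVATE range)
```

The cascade is the design choice that matters: a prospect can land in a tier's total range and still be demoted when its dimension balance fails the gate.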
Score All 5 Prospects
Take the five demo prospects from Lesson 2 and score each one. Start with Meridian:
Use the lead-scoring skill to score Meridian Logistics against
NexaFlow's ICP. Show the full dimension breakdown (Fit, Timing,
Engagement) and classify the lead.
What to expect: The agent reads your Meridian prospect record from demo-data.md and scores it against the ICP in sales-marketing.local.md. Your output will vary, but look for these sections:
| Section | What It Shows | What to Verify |
|---|---|---|
| TOTAL + Classification | Overall score and HOT/WARM/CULTIVATE/NOT YET label | Classification matches the threshold table from earlier in this lesson |
| Fit dimension | Industry, size, tech stack, geography sub-scores | Sub-scores reference data from demo-data.md, not invented details |
| Timing dimension | Trigger events, leadership changes, tech investment | Timing signals reference prospect record specifics |
| Engagement dimension | Website, content, events, email sub-scores | Lower scores for fictional prospects (limited public data) |
| Action recommendation | Next step for this prospect | Recommendation matches classification |
The agent's scores depend on your demo-data.md content and ICP configuration. The teaching point is the structure — three independent dimensions with sub-scores — not the exact numbers. If Meridian scores HOT (75+), the ICP is working. If it scores lower than expected, check which dimension is dragging the score.
If Meridian scores HOT, examine the dimension breakdown to understand why. The Fit dimension should reflect industry match, company size, and tech stack signals from demo-data.md. The Timing dimension should capture trigger events like contract wins and leadership changes. Engagement will typically be moderate for fictional prospects — limited public data means lower website and content interaction scores.
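When verifying the agent's output, one mechanical check is that every sub-score sits inside its dimension's range. A minimal sanity-check helper, assuming the agent's dimension scores have been collected into a dict (the dict shape is an assumption for illustration, not the skill's actual output format):

```python
# Dimension caps from the three-dimension model
CAPS = {"fit": 40, "timing": 40, "engagement": 20}

def check_scores(scores: dict) -> list:
    """Return a list of problems found in a scored dimension dict."""
    problems = []
    for dim, cap in CAPS.items():
        value = scores.get(dim)
        if value is None:
            problems.append(f"missing dimension: {dim}")
        elif not 0 <= value <= cap:
            problems.append(f"{dim}={value} outside 0-{cap}")
    return problems

# Meridian's sample breakdown passes cleanly
print(check_scores({"fit": 36, "timing": 37, "engagement": 14}))  # []
```

An empty list means the breakdown is at least structurally valid; it says nothing about whether the sub-scores reflect real data, which is why the verification column above points you back to demo-data.md.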
Now score the remaining four prospects. Run each one through the same prompt, substituting the prospect details from your Lesson 2 research briefs.
Ranking the 5 Prospects
After scoring all five, rank them:
Rank these 5 scored prospects by total score. Show
the dimension breakdown for each. Highlight the
top-scoring and bottom-scoring prospect.
Sample Ranking:
PROSPECT RANKING — NexaFlow Pipeline
═════════════════════════════════════
Rank Prospect Fit Timing Engage Total Class
──── ────────────────────────── ──── ────── ────── ───── ──────
1 Meridian Logistics (Leeds) 36 37 14 87 HOT
2 Al-Safa Transport (Dubai) 32 30 10 72 WARM
3 TransPak Logistics (KHI) 34 22 8 64 WARM
4 Greenline Express (LHR) 28 18 12 58 WARM
5 Coastal Freight (KHI) 18 8 6 32 NOT YET
The ranking produces a priority list. Meridian is the clear first call. Al-Safa needs a nurture sequence with Dubai-specific content. TransPak has high Fit but moderate Timing -- monitor for trigger events. Greenline is borderline WARM, pulled up by engagement (they read your content regularly). Coastal falls below the threshold on every dimension.
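Ranking is just a sort on the composite with the breakdown carried along. A sketch using the sample numbers from the ranking above:

```python
# (fit, timing, engagement) per prospect, from the sample ranking
prospects = {
    "Meridian Logistics (Leeds)": (36, 37, 14),
    "Al-Safa Transport (Dubai)":  (32, 30, 10),
    "TransPak Logistics (KHI)":   (34, 22, 8),
    "Greenline Express (LHR)":    (28, 18, 12),
    "Coastal Freight (KHI)":      (18, 8, 6),
}

# Sort descending by composite score
ranked = sorted(prospects.items(), key=lambda item: sum(item[1]), reverse=True)
for rank, (name, (fit, timing, eng)) in enumerate(ranked, start=1):
    print(f"{rank}  {name:<28} {fit:>3} {timing:>3} {eng:>3} {fit + timing + eng:>5}")
```

Note that the sort key is the composite, but the printed breakdown is what drives the next step for each prospect, as the following paragraph walks through.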
When the Model Gets It Wrong
Look at Al-Safa Transport. The model scores them 72 -- WARM. The recommended action is a nurture sequence.
But Farah looks at Al-Safa's data and says: "That is our best prospect this quarter. I would call them today."
Why? Farah knows something the model underweights. Al-Safa just lost their warehouse management vendor -- the vendor shut down operations in the UAE with 30 days' notice. Al-Safa has no choice but to find a replacement immediately. This is not a "nice to have" evaluation. This is an emergency procurement with a hard deadline.
The model gave Al-Safa a Timing score of 30/40. That is strong but not maximum. The vendor shutdown signal was captured under "tech investment" but scored the same as a routine system upgrade. The model treats "vendor shutting down" and "considering new software" as equivalent timing signals. They are not. A forced replacement is a higher-urgency signal than a planned evaluation.
Adjusting the Weights
The scoring model underweights forced-replacement signals. To fix this:
Re-score Al-Safa Transport with this adjustment:
Timing signals should distinguish between voluntary
evaluation (planned system change) and forced replacement
(vendor shutdown, contract termination, system failure).
Forced replacement signals should score 15/15 on the
relevant timing sub-dimension.
Adjusted Output:
LEAD SCORE (ADJUSTED) — Al-Safa Transport
══════════════════════════════════════════
TOTAL: 82/100 (was 72)
Classification: HOT (was WARM)
Timing (adjusted): 40/40 (was 30/40)
Forced replacement: 15/15 (vendor shutdown — emergency
procurement, hard deadline)
Hiring signal: 15/15 (posted 2 ops roles in 2 weeks)
Tech investment: 10/10 (must replace system within 30 days)
Fit: 32/40 (unchanged)
Engagement: 10/20 (unchanged)
Al-Safa jumps from 72 to 82. The classification changes from WARM to HOT. Farah's instinct was right -- and now the model captures why.
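One way to encode Farah's correction is to give forced replacement its own timing sub-score instead of folding it into tech investment. A minimal sketch; the signal names and point values are illustrative assumptions, not the lead-scoring skill's actual schema:

```python
# Illustrative timing sub-scores (assumed names and weights)
TIMING_SIGNALS = {
    "planned_evaluation": 5,    # voluntary, no deadline
    "forced_replacement": 15,   # vendor shutdown, contract termination, system failure
    "hiring_surge": 15,         # e.g. 2 ops roles posted in 2 weeks
    "tech_investment": 10,      # budget committed to new systems
}

def timing_score(signals: list, cap: int = 40) -> int:
    """Sum sub-scores for observed signals, capped at the Timing maximum."""
    return min(cap, sum(TIMING_SIGNALS.get(s, 0) for s in signals))

# Before: the vendor shutdown scored like a routine evaluation
before = timing_score(["planned_evaluation", "hiring_surge", "tech_investment"])
# After: forced replacement carries its own maxed sub-score
after = timing_score(["forced_replacement", "hiring_surge", "tech_investment"])
print(before, after)  # 30 40
```

The structural change, not the specific numbers, is the lesson: urgency signals with different consequences should not share a weight.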
The scoring model is only as good as the weights you configure. When a score contradicts what your best rep knows, the agent has not failed -- it scored against the rules you gave it, and the rules were incomplete. Every gap between the model's output and expert judgment is a calibration opportunity: fix the weight, re-score, and the model gets more accurate for every future prospect.
Routing Rules
Scoring without routing is a ranking exercise. Routing turns scores into action by defining who gets which leads, what response time applies, and what the expected next step is.
| Classification | Owner | SLA | Action |
|---|---|---|---|
| HOT | Top rep (e.g., Farah) | Contact within 24h | Personalised outreach referencing specific timing signal |
| WARM | Any rep on rotation | First touch within 3 days | Nurture sequence with ICP-specific content |
| CULTIVATE | Marketing automation | Quarterly review | Add to newsletter; invite to events; monitor for timing |
| NOT YET | No owner assigned | Re-score quarterly | Disqualify or defer; do not invest rep time |
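The routing table maps directly onto a lookup. A minimal sketch; the owner labels and SLAs mirror the table and should be treated as configuration, not fixed values:

```python
# classification -> (owner, SLA), mirroring the routing table
ROUTING = {
    "HOT":       ("top rep",              "contact within 24h"),
    "WARM":      ("rep rotation",         "first touch within 3 days"),
    "CULTIVATE": ("marketing automation", "quarterly review"),
    "NOT YET":   (None,                   "re-score quarterly"),
}

def route(classification: str) -> tuple:
    """Look up the owner and response SLA for a classified lead."""
    return ROUTING[classification]

owner, sla = route("HOT")
print(owner, "--", sla)  # top rep -- contact within 24h
```

Keeping routing as data rather than scattered if-statements makes the next calibration cheap: when the SLA changes, you edit one table.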
Three decisions make routing work:
- HOT prospects go to your best rep. Not the rep who has capacity. Not the rep whose territory includes that geography. The rep most likely to close. Farah does not get HOT leads because she has free time -- she gets them because she converts at 340% of quota. Distributing HOT leads equally across the team sounds fair but costs revenue.
- WARM prospects get a defined sequence, not ad hoc follow-up. "Nurture" is not a strategy. Nurture with what content? Over what timeline? With what exit conditions? Lesson 6 builds the full multi-touch sequence. For now, the routing rule establishes that WARM prospects enter a structured path, not a rep's personal follow-up style.
- NOT YET prospects get zero rep time. This is the hardest rule to enforce. Reps want to "keep the relationship warm" with prospects that scored 28. That is a misallocation. The quarterly re-score catches any NOT YET prospect whose circumstances change. Until then, no outreach, no calls, no "just checking in" emails.
What You Built
- Three-dimension scoring model configured with Fit (0-40) + Timing (0-40) + Engagement (0-20)
- 5 prospects scored and ranked by total with full dimension breakdowns
- Scoring calibration validated against expert judgment (Al-Safa adjustment)
- Routing rules defined for each score tier with owner, SLA, and action
Try With AI
Prompt 1 (Reproduce)
Score all 5 demo prospects from NexaFlow's pipeline
and rank by total score. Show the full dimension
breakdown (Fit, Timing, Engagement) for the top-scoring
prospect. Classify each as HOT, WARM, CULTIVATE,
or NOT YET.
What you are learning: The scoring model turns qualitative judgment into structured analysis. By scoring the same prospects you researched in Lesson 2, you see how ICP quality directly affects score accuracy -- a strong ICP produces scores that match your intuition; a weak ICP produces scores that surprise you. Every surprise is a calibration opportunity.
Prompt 2 (Adapt)
Take the lowest-scoring prospect from the ranking and
identify which dimension (Fit, Timing, or Engagement)
is dragging the score down. What would need to change
in the real world for this prospect to move up one tier?
Be specific — name the signal, the source, and the
expected score impact.
What you are learning: Dimension analysis turns a binary "not ready" verdict into a diagnostic. A low Fit score means this is the wrong company -- no action will fix it. A low Timing score means the right company at the wrong moment -- monitor for trigger events. A low Engagement score means they do not know you exist -- marketing can fix that. The dimension that drags the score determines the response.
Prompt 3 (Apply)
Score a prospect from your own pipeline using the
three-dimension model (Fit 0-40, Timing 0-40,
Engagement 0-20). Before the agent scores, write down
your gut estimate for each dimension. After scoring,
compare. If the model and your gut disagree on any
dimension, identify which one is right and why.
What you are learning: The gap between your gut score and the model's score reveals either domain knowledge the model lacks (you are right, fix the weights) or data the model found that you missed (the model is right, update your understanding). Both outcomes improve your pipeline. The exercise builds the habit of scoring BEFORE checking the agent -- which prevents anchoring bias where you unconsciously accept whatever number the agent produces.