
Choosing Agentic Architectures: A Decision-Driven Crash Course

A conceptual crash course on pattern selection: when to use a sequential workflow, when a single agent + ReAct + tools, when planning + ReAct execution, when a multi-agent specialist system (these are the four core patterns), and when to add a reflection layer on top of any of them. It is for engineers who have already shipped agents and want to choose architectures on principle and by fit rather than for show.

*22 Concepts • 5 Decisions • four learning tracks. Reader track: 2-3 hours of purely conceptual reading (decision tree, five patterns, failure signals, no setup). Beginner / Intermediate / Advanced tracks: roughly 1 day, 2-3 days, and 4-5 days respectively (conceptual reading plus increasing depth in classifying real tasks, sketching deployment topologies, and wiring eval signals specific to each pattern). Honest total estimate: 2-3 hours for the Reader track; 4-5 days for a team to turn pattern selection into a working discipline. Choose your track before the decision lab in Part 5.*

Anchor article: Bala Priya C, "Choosing the Right Agentic Design Pattern: A Decision-Tree Approach," Machine Learning Mastery, May 15, 2026: machinelearningmastery.com/choosing-the-right-agentic-design-pattern-a-decision-tree-approach. That article's decision tree is the backbone of this course. The course adds a composition layer on top of it: what each pattern means for your deployment topology and eval suite.


Plain-language version (read this first)

You have already built agents. Perhaps the customer-support Worker that Maya built in the Digital FTE course, or the evaluation agent from the eval-driven course, or the Tier-1 Support agent taken to production in the cloud deployment course. You can now build an agent. What you cannot yet do in a principled way is decide which kind of agent to build next.

One real failure mode in production AI: engineers rush toward the pattern that looks impressive, often multi-agent, when the task calls for a sequential workflow in which three of the five steps don't even need an LLM. Weeks of orchestration go into a problem that a well-prompted single agent with two tools could have handled in a day. The opposite failure mode is just as real: engineers pick a single agent with a long system prompt when the task genuinely needs decomposition into specialists, and the agent breaks under a context that doesn't fit in one mental model.

Pattern selection is the design work that comes before the build. The question is: "what shape should this agent system actually have?" The principled answer: ask five questions about your task, and the answers map to one of five starting patterns. This course teaches the five questions, the five patterns, the failure signals that tell you the pattern was wrong, and, most important when you are actually shipping, what each pattern means for your deployment topology and eval suite.

The discipline is not "always pick the simplest pattern." The discipline is: "pick the simplest pattern that matches the task's actual requirement, and add complexity only when you can name the specific task property that demands it." A multi-agent system is the right answer when specialization or scale creates a real bottleneck, not when it looks more advanced on a slide.

This course is deliberately shorter than the eval-driven course and the cloud deployment course. The decision-logic framework is tight; padding it with survey content on each pattern's history would dilute the discipline. The tight framework is a feature, not a bug.

📖 If you haven't taken the earlier courses in the Agent Factory track

This course cross-references the operational envelope (Inngest), eval discipline, and cloud deployment, and it uses "Maya's Tier-1 Support agent" from those courses as a running example. You can still use this course fully without having read them. The five-question decision tree, the five patterns, and the failure-signal discipline are a transferable framework in their own right.

If you want a focused first pass without the prior-course context, read in this order:

  • Part 1 (the pattern-selection problem): establishes the discipline
  • Part 2 (the five-question decision tree): the conceptual spine
  • Part 3 (the patterns), but only skim the operational-envelope sidebars on a first pass
  • Part 4 (failure signals and revision)
  • Part 5 (decision lab): the five worked examples make sense even without the Maya context
  • Part 7 (closing)

What to treat as preview or optional on a first pass:

  • Concept 8.5 (SDK primitives): useful if you use the OpenAI Agents SDK; if you use another framework, skim it, because the underlying pattern shapes transfer
  • Concept 8.6 (operational envelope with Inngest): useful if you ship production agentic systems; skim it if you are still at the design-only stage. The argument that "more elaborate patterns need more operational machinery" generalizes beyond Inngest
  • The deployment-composition sidebars in Part 3: useful if you use the same cloud stack; the general principle (which patterns need a sandbox and which don't) transfers to any cloud setup

Treat the cross-references as concrete examples of general principles, not as gatekeeping prerequisites. The framework works without them.

Platform translation table: what each Agent Factory choice maps to

If you are on a different stack, this table maps each Agent Factory reference to common alternatives. The decision tree, the five patterns, the failure signals, and the anti-pattern gallery work the same way on these platforms; only the primitive names change.

| Agent Factory reference (the stack used here) | Common alternatives in 2026 | What the layer does |
| --- | --- | --- |
| Inngest (operational envelope) | Temporal, Restate, Dapr Workflows, AWS Step Functions, Azure Durable Functions, LangGraph (partial; durable execution via checkpointers) | Triggers, durable execution, flow control, HITL gates |
| OpenAI Agents SDK (agent engine) | LangGraph, AutoGen, CrewAI, AWS Strands, Pydantic AI, LlamaIndex Workflows | Agent loop, tool routing, multi-agent composition, structured output |
| Phoenix / Arize (trace observability) | Langfuse, Helicone, LangSmith, Logfire, Honeycomb, Datadog APM | Per-trace agent-behavior observability and the trace-to-eval pipeline |
| Azure Container Apps (harness runtime) | AWS Fargate, Google Cloud Run, Fly.io, Railway, Render, Kubernetes (any cloud) | Long-running HTTP service host, autoscale, secrets, ingress |
| Neon Postgres (durable state) | Supabase, AWS RDS Postgres, PlanetScale, CockroachDB, Google Cloud SQL | Sessions, runs, traces, audit log: durable agent state |
| Cloudflare R2 (file storage) | AWS S3, Google Cloud Storage, Azure Blob, Backblaze B2 | Inputs, outputs, knowledge artifacts; presigned-URL access for the sandbox |
| Cloudflare Sandbox (code execution) | E2B, Modal, Daytona, Vercel Sandbox, Fly.io Machines, Cloudflare Containers | Isolated workspace for agent-generated code |
| @inngest_client.create_function (envelope primitive) | @workflow.defn (Temporal), state machine definition (Step Functions), StateGraph(...) (LangGraph) | Registers a durable function unit |
| ctx.step.run(name, fn) (envelope primitive) | workflow.execute_activity() (Temporal), Task state (Step Functions), node in StateGraph (LangGraph) | Durable checkpoint that memoizes on retry |
| ctx.step.wait_for_event(...) (envelope primitive) | workflow.wait_condition() (Temporal), waitForTaskToken (Step Functions), interrupt() (LangGraph) | Durable suspend until an event or timeout; the HITL primitive |
| Fan-out trigger (envelope primitive) | workflow.execute_child_workflow() in parallel (Temporal), Map state (Step Functions), parallel edges (LangGraph) | One coordinator → N specialist runs |
| Agent(...) + Runner.run() (SDK primitive) | compiled graph + .invoke() (LangGraph), Agent + initiate_chat() (AutoGen), Crew + kickoff() (CrewAI) | Runs the agent loop |
| @function_tool (SDK primitive) | @tool (LangGraph/LangChain), Tool(...) (AutoGen), Pydantic models in CrewAI | Exposes a Python function as an agent tool |
| handoff(target_agent) (SDK primitive) | Command(goto=...) (LangGraph), nested chats (AutoGen), task delegation (CrewAI) | Specialist takes control of the conversation |
| Agent.as_tool() (SDK primitive) | Subgraph-as-node (LangGraph), nested agent calls (AutoGen), as_tool patterns in CrewAI | Coordinator uses the specialist as a tool |
| output_guardrail (SDK primitive) | Custom node + conditional edge (LangGraph), validator pattern (Pydantic AI), AWS Strands guardrails | Critique/validation pass on agent output |

How to use this table. When this course says "wrap Runner.run() in step.run," and you are on Temporal plus LangGraph, read it as: "run graph.invoke() inside an activity called through workflow.execute_activity()." The architectural argument is the same; the syntax differs. One anti-pattern to avoid: don't try to learn the Agent Factory stack just to read this course. Map the primitives, read the framework, and apply it to your own stack.
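The "durable checkpoint that memoizes on retry" row is the idea behind wrapping an agent run in a step. The sketch below is a toy in-memory model of that behavior, not Inngest's or Temporal's API: `StepLog`, `fetch_ticket`, `flaky_agent`, and `run_with_retry` are hypothetical names, and a real envelope persists the log outside the process.

```python
class StepLog:
    """Toy stand-in for a durable step log (in-memory only; an assumption)."""

    def __init__(self):
        self.completed = {}

    def run(self, name, fn, *args):
        # A step that already completed returns its stored output on retry.
        if name in self.completed:
            return self.completed[name]
        result = fn(*args)  # may raise; nothing is recorded in that case
        self.completed[name] = result
        return result


calls = {"fetch": 0, "agent": 0}


def fetch_ticket():
    calls["fetch"] += 1
    return "ticket-42"


def flaky_agent(ticket):
    # Stand-in for an agent run that times out on its first attempt.
    calls["agent"] += 1
    if calls["agent"] == 1:
        raise TimeoutError("model call timed out")
    return f"reply for {ticket}"


def handler(log):
    ticket = log.run("fetch-ticket", fetch_ticket)
    return log.run("run-agent", flaky_agent, ticket)


def run_with_retry(log, attempts=2):
    # Re-run the whole handler; completed steps are skipped via the log.
    for attempt in range(attempts):
        try:
            return handler(log)
        except TimeoutError:
            if attempt == attempts - 1:
                raise
```

On the retry, only the failed step executes again; the first step's result comes from the log. That is the property the table row names, whichever platform provides it.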

One row does not map cleanly: Agent.as_tool() versus handoff(). The OpenAI Agents SDK keeps "the coordinator stays in control" (as_tool) and "the specialist takes control" (handoff) separate as first-class primitives. Most other frameworks either collapse this difference or implement only half of it. What matters architecturally is the distinction itself; the primitive's name is incidental. When you choose between as_tool-style and handoff-style composition in your own framework, you are making the same architectural choice this course names; your framework may simply surface it differently.
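The control-flow difference can be shown without any framework. The sketch below is a toy model, not the OpenAI Agents SDK: `billing_specialist`, `coordinator_as_tool`, and `coordinator_handoff` are hypothetical functions, and the "transcript" is just a list of (speaker, text) pairs.

```python
def billing_specialist(message, transcript):
    # Hypothetical specialist: appends its answer to the shared transcript.
    reply = f"refund policy for: {message}"
    transcript.append(("billing", reply))
    return reply


def coordinator_as_tool(message):
    """as_tool style: the coordinator calls the specialist like a function
    and keeps the final word."""
    transcript = []
    advice = billing_specialist(message, transcript)
    transcript.append(("coordinator", f"summary: {advice}"))
    return transcript


def coordinator_handoff(message):
    """handoff style: the coordinator routes once, then the specialist owns
    the conversation and produces the final output."""
    transcript = [("coordinator", "routing to billing")]
    billing_specialist(message, transcript)
    return transcript
```

The last speaker in each transcript is the practical test: the coordinator under as_tool-style composition, the specialist under handoff-style composition.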


Glossary (read once, refer back as needed)

  • Agentic design pattern. A recurring architectural shape for AI agent systems: sequential workflow, ReAct + tools, planning + execution, reflection, multi-agent specialist. Each pattern makes assumptions about the task; when those assumptions hold, the pattern adds value; when they don't, it becomes overhead.
  • Sequential workflow. A fixed pipeline of steps in which each step's output feeds the next. The solution path is known in advance; LLM calls are reserved for interpretation or generation, not for deciding the next step. Example: invoice intake → extract → validate → store → notify.
  • ReAct (Reason + Act). An agentic loop in which the agent reasons about its current state, takes an action (often a tool call), observes the result, and repeats. Defining property: the next action is decided at runtime, not specified in advance.
  • Planning agent. An agent that builds an explicit plan (a sequence of stages with dependencies) before execution starts. The plan structures the work; individual steps can still use ReAct internally. Example: "research a market" → generate a 5-step plan → execute each step with tools.
  • Reflection (self-critique). A pattern in which the agent generates output, critiques it against explicit criteria, and refines it based on the critique. It adds latency and cost, and is valuable only when the criteria are checkable and errors are expensive. Example: SQL generation with correctness checks.
  • Multi-agent specialist system. A system in which multiple agents with distinct roles (researcher, writer, reviewer) collaborate on a task, coordinated by a routing or supervisor agent. It is justified by specialization, context overload, or parallel-execution needs, not by aesthetics.
  • Solution path. The sequence of steps that solves the task. A known path means the steps can be specified before runtime; an unknown path means the steps emerge from the agent's investigation.
  • Task structure. The major stages and their dependencies. Articulable structure means you can describe the stages before execution; emergent structure means the stages reveal themselves through feedback.
  • Architectural fit. The match between a pattern's assumptions and the task's actual properties. Pattern selection is fit-matching, not capability-matching: choosing the most capable pattern is the wrong heuristic.
  • Coordination overhead. The cost of coordinating routing or handoffs between multiple agents (tokens, latency, debugging complexity, failure modes). Multi-agent systems incur this cost; it has to be justified by the value the coordination delivers.
  • Failure signal. A runtime symptom indicating that the chosen pattern mismatches the task. Examples: a ReAct loop returns to already-solved work (structure missing), a planner produces plans that execution diverges from (overstructured), reflection does not improve the output (vague criteria).
  • Pattern composition. Using different patterns at different layers of a larger system. Example: a planning agent at the top layer, ReAct + tools inside each plan step, reflection on the final synthesis.
  • Agent (OpenAI Agents SDK). The core SDK class: an LLM-driven entity defined by instructions=, optional tools=, optional output_type= for structured output, and optional handoffs=. It is the atomic unit of every pattern in this course.
  • Runner.run(agent, input) (OpenAI Agents SDK). The SDK call that runs an Agent to its final output. The SDK drives the reason-act-observe loop internally, so no hand-rolled loop is needed. The max_turns= parameter is the step budget.
  • @function_tool (OpenAI Agents SDK). A decorator that turns a Python function into a tool the agent can call. Type hints and docstrings automatically become the tool's JSON schema.
  • handoff() (OpenAI Agents SDK). The first-class SDK primitive for multi-agent transitions: one agent explicitly hands the conversation to another, and the SDK preserves context. Use it when a specialist needs to take over the user-facing interaction.
  • Agent.as_tool() (OpenAI Agents SDK). An SDK method that turns an Agent into a callable tool that another Agent can invoke. Use it when the coordinator needs to stay in control and compose specialist outputs.
  • output_guardrail (OpenAI Agents SDK). An SDK decorator that wires a validation/critique agent into another agent's output path. It is the SDK-native primitive for block-bad-outputs-style reflection; when it fires, it raises OutputGuardrailTripwireTriggered.
  • Operational envelope (Inngest). The runtime layer that wakes the agent function (triggers), protects it from mid-flight crashes (durable execution via step.run), limits load (concurrency, throttle, priority), and coordinates HITL (step.wait_for_event). It composes with your cloud deployment and SDK engine. Taught in the operational-envelope course.
  • @inngest_client.create_function (Inngest). A decorator that registers a Python async function with Inngest as a durably executed unit. It declares the trigger surface and the flow-control policy.
  • ctx.step.run(name, fn, args) (Inngest). A durability checkpoint. Completed steps return their memoized output on retry; failed steps retry independently with exponential backoff.
  • ctx.step.wait_for_event(...) (Inngest). A durable suspend until a matching event arrives or a timeout fires. No compute is consumed during the suspension. It is the runtime primitive behind HITL gates.
  • Fan-out trigger pattern (Inngest). One coordinator function emits N events; each event wakes its own subscriber function. It is the runtime primitive behind parallel specialist execution in multi-agent systems.
  • Replay (Inngest). Failed runs persist with their full trace. Ship a fix, click replay, and the function resumes from the failed step with the new code. Successful steps stay memoized.
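One of the failure signals in the glossary, a ReAct loop returning to already-solved work, can be checked mechanically over a trace. The sketch below assumes a hypothetical trace format (a list of dicts with `tool` and `args` keys, where `args` is a string); real trace schemas depend on your observability tool.

```python
from collections import Counter


def repeated_actions(trace, threshold=2):
    """Return (tool, args) pairs that occur at least `threshold` times.

    Repeats of the exact same call are one candidate signal that the agent
    is revisiting solved work; they are a prompt to investigate, not proof.
    """
    counts = Counter((step["tool"], step["args"]) for step in trace)
    return sorted(action for action, n in counts.items() if n >= threshold)


example_trace = [
    {"tool": "search", "args": "pricing"},
    {"tool": "read", "args": "doc-1"},
    {"tool": "search", "args": "pricing"},  # same call again
]
```

A check like this fits in an eval suite as a per-trace metric; the threshold is a tuning choice, not something the course prescribes.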

Are you ready? (prerequisites)

  1. You have built your first agent, or have equivalent experience. The patterns in this course assume you know what an agent loop is, what a tool call looks like, and how a model returns structured output. If you haven't built an agent yet, take the agent-building course first.
  2. You have built at least one working agent. Whether it is the customer-support Worker Maya built, a research agent, a chatbot, or a coding agent, you need the experience of having made an architectural choice (even if you didn't realize it at the time) and living with its consequences.
  3. You can read pseudocode. This is a conceptual course, so there is very little executable code. What does appear is pseudocode that illustrates the patterns; if you can read Python or TypeScript, you can read it.
  4. (Optional but strongly recommended) You have taken the eval and cloud deployment courses. This course's main contribution is composing pattern selection with your deployment topology and eval suite. Readers who haven't taken those courses can still benefit from the framework, but they will miss the integration arguments.

If item 4 is missing, read the course anyway and treat the deployment-composition and eval-composition sidebars as previews. The framework lands without them.

Limits to know up front (honest scope)

  • This is a conceptual course, not a code course. It teaches you to choose an architecture, not to implement one. Implementation discipline lives in the earlier Agent Factory courses. Expect roughly 30 pages of architectural reasoning and 5 pages of pseudocode in total.
  • The five patterns are not exhaustive. In practice there are graph-based agent systems, debate patterns, blackboard patterns, hierarchical task networks, and more that are not covered here. This course covers the five patterns the article identifies as the dominant architectural starting points; as of mid-2026 these five cover the large majority of production agent systems, not all of them.
  • The decision tree is a starting point, not a final answer. Real agent architectures evolve. A system that started as a single agent with tools can become multi-agent as the workload diversifies; a planning-then-execution system can become a sequential workflow once the paths become clear. This course teaches the starting decision, not the evolution.
  • Cost and latency are part of the choice. Reflection adds latency. Multi-agent adds tokens. Planning adds an extra LLM call. This course treats these costs as real constraints; Concept 18 covers when each pattern's overhead is justified.
  • The article is the spine; the composition layer is the extension. Bala Priya C's decision tree is the structural backbone of this course. The course adds two layers that are not in the article: (a) what each pattern means for your deployment topology, and (b) how each pattern's failure modes show up through your eval suite. If you have only read the article, this course adds the production-discipline layer.

Four learning tracks

| Track | Time commitment | What you complete | Who it's for |
| --- | --- | --- | --- |
| Reader (pure conceptual) | ~2-3 hours, no lab | The full conceptual arc: Part 1 (problem), Part 2 (decision tree), Part 3 (five patterns), Part 4 (failure signals), and the Part 7 closing. No classification exercises, no decision lab. | Engineering leaders, platform architects, or curious non-engineer readers deciding whether to invest team time in systematic pattern selection. |
| Beginner | ~1 day | Reader track + Decisions 1-2 in the decision lab. Classify two tasks with the decision tree (Maya's Tier-1 Support and the incident-response agent); sketch the chosen pattern at a high level. | Engineers new to agentic architecture who want one round of guided pattern-selection practice. |
| Intermediate | ~2-3 days | Beginner track + Decisions 3-4. Add the research agent and the enterprise onboarding agent; sketch their deployment topologies on your own cloud stack; identify the eval signals that catch each pattern's failure mode. | Engineers shipping agentic systems who want to compose pattern selection with deployment and evaluation. |
| Advanced | ~4-5 days | Intermediate track + Decision 5 + Parts 6 & 7. Add the coding agent (the hardest case); explore pattern composition; architect a hypothetical agent system end to end with the full discipline. | Senior engineers and tech leads who want to make pattern selection a team-wide discipline. |

Track-fork guidance. Engineering leaders should start with the Reader track. Engineers should default to the Intermediate track, because the decision lab is where the framework is actually internalized. Don't skip Part 5 entirely just because you read Part 2 quickly. The framework sticks when you apply it to real tasks.

🚀 Minimum viable path: the shortest route to working pattern selection. Read Part 1 (problem), Part 2 (decision tree), and Decision 1 of the lab (Maya's Tier-1 Support). That is about 90 minutes; at the end you can classify a new task with the five questions and choose a starting pattern. Everything else deepens the discipline; this is the seed.

What you will have at the end (concrete outcomes)

The Reader track gives you understanding, not artifacts. You can explain why pattern selection matters before any code is written; describe the five patterns and their characteristic task assumptions; and recognize five common failure signals that indicate a pattern mismatch.

The Beginner / Intermediate / Advanced tracks give you a working classification discipline:

  • The ability to walk the five-question decision tree on a new task and choose a principled starting pattern.
  • A sketch of each pattern's deployment topology on your own cloud stack (which components a sequential workflow needs versus a multi-agent or planning system).
  • A mapping from each pattern's likely failure modes to the specific eval signals that will catch them.
  • A team-shareable artifact: a one-page "classify-this-task" template that can be used in design reviews.

TL;DR: four claims this course defends

  1. Pattern selection is architectural fit, not capability matching. Every pattern assumes something about the task. The right pattern is the one whose assumptions match the task's actual properties, not the one with the most capability or the most impressive structure. Multi-agent is not "better" than a sequential workflow; it is better for the specific case where specialization or scale is the bottleneck.

    Four core patterns + one additive layer. Q1-Q3 and Q5 of the decision tree select the core pattern: sequential workflow, single agent + ReAct + tools, planning + ReAct execution, or multi-agent specialist system. Q4 decides whether to add reflection as an additive layer on top of the chosen core. Reflection is not a fifth peer pattern; it is a quality-control layer that wraps any of the four cores. The distinction matters: students who treat reflection as a standalone pattern miss the architectural truth that it composes with the core pattern they have already chosen.

  2. Five questions about the task determine the architecture. Q1-Q3 select the starting core: is the solution path known? Is the workflow fixed? Is the task structure articulable? Q4 decides whether to add the reflection layer: does quality matter more than speed, and are the criteria checkable? Q5 decides whether to upgrade to multi-agent: is specialization, context, or scale the bottleneck? The answers map deterministically to a starting architecture. The article is right that what the literature is missing is the decision logic, not the patterns themselves.

  3. Pattern selection composes with deployment topology and eval signals. Each pattern uses a different subset of the cloud stack: sequential workflows don't need sandbox execution; multi-agent systems need careful audit logging because coordination failures are the hardest bugs in production. Each pattern has characteristic failure modes that your eval suite catches in different ways. Few courses on this topic teach this composition, because it requires the deployment and eval courses as a foundation.

  4. The decision tree gives a starting point, not a final answer. Real systems evolve. The discipline is not "lock the architecture forever"; the discipline is "make the starting decision principled, watch the failure signals, and let runtime evidence guide the evolution." Pattern selection is the first move; pattern revision is the ongoing move.

The shape of what you are learning (one diagram, revisit it throughout the course)

This course introduces 22 Concepts (19 main plus 3 bridge concepts at 8.5, 8.6, and 16.5) and walks through 5 Decisions. Before any of that, here is the decision tree around which everything else composes.

Diagram: the five-question decision tree for agentic pattern selection. At the top is the question "Which pattern fits this task?" Below it are five branching questions in sequence:

  • Q1: Is the solution path known? (Yes → Q2; No → adaptive reasoning needed, go to Q3)
  • Q2: Is it a fixed workflow? (Yes → SEQUENTIAL WORKFLOW; No → revisit the adaptive patterns)
  • Q3: Is the task structure articulable before execution? (Yes → PLANNING + REACT EXECUTION; No → SINGLE AGENT + REACT + TOOLS)
  • Q4: Does quality matter more than speed, with checkable criteria? (Yes → add the REFLECTION layer; No → skip reflection)
  • Q5: Is specialization, context, or scale the bottleneck? (Yes → MULTI-AGENT SPECIALIST SYSTEM; No → keep a single agent)

After the path/structure branch (Q1-Q3), Q4 and Q5 apply to any agentic pattern. At the bottom, the five terminal patterns are shown as colored boxes: green sequential workflow, blue single agent + ReAct, purple planning + ReAct, orange single agent + reflection, red multi-agent specialist. The footer band reads: "Read top to bottom. The result is a starting architecture, not a permanent commitment. Production systems evolve; failure signals (Part 4) tell you when the pattern no longer matches the task."

*The shape of the tree: first ask whether you need an LLM-driven agent at all (Q1-Q2); if so, ask how structured the task is (Q3); then layer on quality (Q4) and scale (Q5) only when they create real value. Whenever a Concept or Decision feels abstract, come back to this diagram.*
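The tree can be written down as a small function, which makes the branch order explicit. This is a minimal sketch, not code from the article or the course: `TaskProfile`, its field names, and `choose_pattern` are hypothetical, and it assumes Q5 is only asked of agentic cores (a sequential workflow stays sequential).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskProfile:
    """Answers to the five questions (field names are illustrative)."""
    path_known: bool             # Q1: is the solution path known?
    workflow_fixed: bool         # Q2: is it a fixed workflow?
    structure_articulable: bool  # Q3: can the stages be described up front?
    quality_over_speed: bool     # Q4: does quality matter more than speed...
    criteria_checkable: bool     # Q4: ...and are the criteria checkable?
    bottleneck: bool             # Q5: specialization, context, or scale?


def choose_pattern(task: TaskProfile) -> tuple[str, bool]:
    """Return (core pattern, add reflection layer?) as a starting point."""
    # Q1-Q2: a known path with a fixed workflow needs no agent loop.
    if task.path_known and task.workflow_fixed:
        core = "sequential workflow"
    # Q3: articulable structure gets a plan; otherwise plain ReAct.
    elif task.structure_articulable:
        core = "planning + ReAct execution"
    else:
        core = "single agent + ReAct + tools"

    # Q5 (assumption: applied to agentic cores only).
    if task.bottleneck and core != "sequential workflow":
        core = "multi-agent specialist system"

    # Q4: reflection is a layer on top of the core, not a fifth core.
    reflection = task.quality_over_speed and task.criteria_checkable
    return core, reflection
```

For example, `choose_pattern(TaskProfile(False, False, True, True, True, False))` yields planning + ReAct execution with the reflection layer on. The return value is a starting architecture; the failure signals in Part 4 decide whether it survives.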


Part 1: The pattern-selection problem

Concept 1: Pattern selection is the design work that comes before the build

Most courses on agentic systems teach you how to build each pattern. This course is about a different question: for a given task, which pattern should you build? That question comes before the build, and it should, but it is rarely taught, for an awkward reason: the implementation of each pattern is well documented; the decision logic for choosing between them is not.

The pattern catalog has matured. ReAct comes from a 2022 paper. Planning-then-execution patterns go back to STRIPS in classical AI and were rediscovered for LLMs in 2023. Reflection has been formalized since 2023. Every major framework teaches multi-agent architectures. You can find a tutorial for any pattern in under five minutes. What you cannot find easily: which pattern fits this specific task under these specific constraints?

The failure mode this creates. Engineers default to whichever pattern they encountered most recently or found most impressive in talks. Multi-agent demos are especially tempting because they look like "real AI": agents talking to each other, dividing work, coordinating. Teams spend weeks building orchestration for problems that a single agent with two well-defined tools could have solved in a day. The result: they ship slower, debug harder, and pay for more tokens than the task needs.

The opposite failure mode is also real and less discussed. Engineers choose "just a single agent and a very long system prompt" when the task genuinely needs structural decomposition. The agent collapses under a context that doesn't fit in one mental model. Tool-calling errors cascade. Reflection is the only fix the team knows, so they add it everywhere, and now every response takes 30 seconds. They ship something brittle that an architectural choice could have prevented.

This course's discipline: pattern selection is architectural fit-matching, not capability matching. Don't ask "what is the best pattern?" (there isn't one). Ask: "what does this task actually require, and what is the smallest pattern that provides it?" The five-question decision tree in Part 2 is the systematic answer.

Why this matters more now than before. In 2023, agentic systems were experimental. Choosing the wrong pattern cost you a weekend. In 2026, agentic systems are in production serving real users; the pattern you choose determines your deployment topology, your eval discipline, and your operational cost at scale. A wrong pattern choice is now expensive in several compounding ways: infrastructure built for the wrong assumption, evals built for the wrong failure modes, runbooks written for the wrong incidents. Pattern selection has gone from a "preference" to a "high-stakes design decision."

Summary: the patterns themselves are well documented; the decision logic for choosing between them is the gap this course fills. Pattern selection is architectural fit-matching, not capability matching. A wrong pattern compounds expensively in production: wrong infrastructure, wrong evals, wrong runbooks. This course teaches the five-question discipline that prevents the most common pattern-selection failures.

Concept 2: Every pattern assumes something different about the task

The deeper idea that makes pattern selection tractable: every agentic pattern is a bet about the shape of the task. When the bet matches reality, the pattern adds value. When the bet is wrong, the pattern becomes overhead: sometimes invisible overhead that only burns tokens, sometimes catastrophic overhead that breaks the whole system.

Here are the bets the five patterns make:

Sequential workflow bets: I know the steps in advance, and they are the same steps every time. The bet is that the solution path is fixed and articulable before runtime. If the bet is right, no LLM is needed to decide what to do next: the workflow knows. You reserve LLM calls for the steps that genuinely need interpretation (extract this from the text, generate that summary). Cost is predictable; latency is bounded; failure modes are obvious. If the bet is wrong, because the steps actually vary with the contents of the input, the workflow either forces the wrong path or fails loudly.
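A minimal sketch of that bet, using the invoice example from the glossary. All function names and the `total:` text format are assumptions for illustration; `extract` uses a regex as a stand-in for the one step that would plausibly call an LLM.

```python
import re


def extract(state):
    # Stand-in for the interpretation step (an LLM call in a real system).
    match = re.search(r"total:\s*(\d+(?:\.\d+)?)", state["raw_text"], re.IGNORECASE)
    if match is None:
        raise ValueError("no total found in invoice text")
    return {**state, "total": float(match.group(1))}


def validate(state):
    if state["total"] <= 0:
        raise ValueError("total must be positive")
    return {**state, "valid": True}


def store(state):
    return {**state, "stored": True}  # placeholder for a database write


def notify(state):
    return {**state, "notified": True}  # placeholder for an email or webhook


# The path is fixed at design time: nothing decides "what next" at runtime.
PIPELINE = [extract, validate, store, notify]


def run_workflow(raw_text):
    state = {"raw_text": raw_text}
    for step in PIPELINE:
        state = step(state)
    return state
```

Note how a wrong bet shows up: an input the pipeline was not designed for raises at a specific step, which is the "fails loudly" behavior described above.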

Single agent + ReAct + tools bets: I don't know the path in advance; the agent will figure it out. The bet is that the task is open-ended enough that the next step has to be decided from what has been observed so far. If the bet is right, only ReAct's loop (reason → act → observe → repeat) can handle it; any predetermined plan would be wrong by step 3. If the bet is wrong, because the path was actually stable enough to write down, ReAct adds latency, cost, and the risk of the agent looping or revisiting solved work, without delivering any value a sequential workflow could not.
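The loop itself is small; what makes it ReAct is that the next action is chosen from the observations so far. The sketch below assumes a scripted `scripted_policy` in place of the LLM and two hypothetical tools, so it runs without any model or SDK.

```python
# Hypothetical tools with canned results.
TOOLS = {
    "lookup_order": lambda order_id: {"order_id": order_id, "status": "delayed"},
    "check_carrier": lambda order_id: {"order_id": order_id, "eta_days": 3},
}


def scripted_policy(question, observations):
    """Stand-in for the model's reasoning step: picks the next action
    from what has been observed so far."""
    if not observations:
        return ("act", "lookup_order", "A-100")
    last = observations[-1]
    if last.get("status") == "delayed":
        return ("act", "check_carrier", last["order_id"])
    return ("finish", f"Order {last['order_id']} arrives in {last['eta_days']} days")


def react_loop(question, policy, max_turns=5):
    observations = []
    for _ in range(max_turns):                      # step budget
        decision = policy(question, observations)   # reason
        if decision[0] == "finish":
            return decision[1]
        _, tool_name, arg = decision
        observations.append(TOOLS[tool_name](arg))  # act, then observe
    raise RuntimeError("step budget exhausted without a final answer")
```

The `max_turns` budget plays the same role as the step budget mentioned for `Runner.run()`: it bounds the cost of a loop whose length is not known in advance.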

Planning + ReAct execution bet karta hai: main major stages aur dependencies pehle se articulate kar sakta hun, lekin har stage ko phir bhi adaptive reasoning chahiye. Bet yeh hai ke kaam ki shape known hai (research → analyze → synthesize → report) lekin har stage ka content investigation require karta hai. Agar sahi ho, plan scaffolding deta hai aur agent ko bhatakne se bachata hai, jabke har stage ke andar ReAct uncertainty handle karta hai. Agar ghalat ho: agar plan actually articulate nahin ho sakta (pure ReAct use karein) ya har stage ko adaptive reasoning chahiye hi nahin (sequential workflow use karein), plan aisa overhead ban jata hai jis se execution waise bhi diverge karta hai.

Reflection bet karti hai: output quality speed se zyada matter karti hai, aur quality checkable hai. Bet yeh hai ke critique pass generator ki missed defects pakar sakta hai, aur "good output" ke criteria itne explicit hain ke critique meaningful ho. Agar sahi ho, reflection first pass ke errors pakar kar reliability improve karti hai (incorrect SQL, weak legal arguments, reports mein factual mistakes). Agar ghalat ho: agar criteria vague hain ya critic aur generator ke blind spots same hain, reflection output improve kiye baghair latency aur cost add karti hai. Is se bhi bura: yeh false confidence de sakti hai ke critique ne woh quality "verify" kar di jo us ne actually verify nahin ki.

Multi-agent specialist system bet karta hai: koi single agent isay achi tarah karne ke liye expertise, context, ya capacity nahin rakhta. Bet yeh hai ke task waqai specialist roles mein partition hota hai (researcher + writer + reviewer; coder + security + docs), aur specialists ke darmiyan coordination ek agent par overload se sasti hai. Agar sahi ho, specialists apne domains mein generalist se behtar outputs dete hain, aur parallel execution throughput improve karti hai. Agar ghalat ho: agar "specialists" mostly wahi kaam kar rahe hain, ya coordination overhead kaam par dominate karta hai, aapne aisi complexity add ki jo kuch nahin khareedti aur naye failure modes introduce kar deti hai (routing errors, integration errors, ownership ambiguity).

The pattern is the bet; the task's actual properties decide whether the bet is right. That is why pattern selection is fit-matching. You are not asking "which pattern is the most powerful?" You are asking "given the task as I actually know it, which pattern's bet matches it best?"

Summary: every agentic pattern is a bet about the task: sequential workflow bets on known fixed paths, ReAct on unknown adaptive paths, planning on articulable structure, reflection on checkable quality criteria, multi-agent on real specialization needs. The right pattern is the one whose bet matches reality; pattern selection is fit-matching, not capability matching.

Concept 3: Two failure modes, overshooting and undershooting

Concept 2 said every pattern is a bet. Concept 3 names the two ways that bet goes wrong, and in real production systems both happen with roughly equal frequency.

Overshooting: choosing a pattern more elaborate than the task needs. This is the better-known failure mode, the one that talks and demos easily lure you into. Examples:

  • Building a three-agent system (researcher, writer, reviewer) for a single LinkedIn-post generation task. The "researcher" agent's output is two paragraphs that the "writer" then has to summarize again. The reviewer rejects 5% of outputs, on issues a self-checking prompt would have caught. Three agents, three times the cost, no measurable quality improvement.
  • Adding planning to a task that is actually a fixed workflow. The planner produces the same plan every time (because the task is the same). Every run pays for an extra LLM call for no reason. Worse: when an input is slightly unusual, the planner produces a slightly different plan, and now the team is debugging "why did the planner take a different path on this input?"
  • Adding reflection to a task with no checkable criteria. Critic and generator share the same model, the same training data, and often the same blind spots. The reflection pass either rubber-stamps the output or generates verbose but non-actionable critique. Latency doubles; quality stays flat.

The overshooting failure pattern: you paid for a capability the task never needed, and undoing it is hard because the orchestration has become load-bearing. Removing a multi-agent system that has run in production for six months is not a refactor; it is a rewrite.

Undershooting: choosing a pattern simpler than the task actually needs. This failure mode shows up less in talks because it is less impressive to dramatize, but it is at least as common. Examples:

  • Using a single agent with a 4,000-token system prompt to handle billing, technical, account, and refund issues. The agent confuses billing rules with technical rules. Reflection helps a little but does not fix the root cause. The task genuinely needed specialist routing; one agent could not hold the context.
  • Using ReAct + tools for a workflow that is really a fixed pipeline. The agent sometimes skips steps, sometimes revisits completed work, and sometimes invents tool calls that do not exist. The team adds "stop conditions" and "progress criteria" to the prompt, treating the symptoms rather than the mismatch. Cost variance becomes a runbook problem.
  • Skipping reflection on outputs that genuinely need verification. SQL queries with subtle errors ship to production. Legal drafts go out to clients with citation mistakes. The team adds tests later, but the natural place to catch these errors was a reflection pass at generation time.

The undershooting failure pattern: you shipped something brittle that survives on manual oversight or luck. Production reveals the gaps; remediation means either adding the pattern you should have started with or accepting the failure rate as a business cost.

Why both failure modes matter equally. Discussions of pattern selection focus on overshooting (because it is the more visible failure: the multi-agent system nobody can debug). But undershooting is just as common and arguably more dangerous: it produces systems that appear to work until they do not, and the failure modes are subtle. A team that learns to avoid overshooting but cannot recognize undershooting has learned only half the discipline.

The decision tree in Part 2 is designed to surface both failure modes. Each question is about a task property (is the path known? is the structure articulable? is quality checkable?); if the answer does not justify a more elaborate pattern, the tree routes to the simpler one (preventing overshoot). If the answer does justify an elaborate pattern, the tree routes there explicitly (preventing undershoot by making the upgrade a conscious choice).

Summary: pattern selection fails in two ways: overshooting (choosing a pattern more elaborate than the task needs, paying for capability that does not help) and undershooting (choosing a pattern simpler than the task needs, shipping something brittle). Both happen with roughly equal frequency; talks highlight overshooting, but undershooting is at least as dangerous because it is subtle. The Part 2 decision tree surfaces both failures by asking about task properties rather than pattern preferences.


Part 2: The five-question decision tree

This part walks the decision tree question by question. Each Concept covers one of the five questions: what it tests, how to answer it for a real task, and which pattern the answer routes to. By the end of Part 2 you will have walked the whole tree once.

Structure of the tree:

# | Question | What it tests | Where it routes
Q1 | Can the solution path be defined in advance? | Whether the process can be specified before runtime | If yes → Q2 (fixed workflow check); if no → adaptive reasoning needed, go to Q3
Q2 | Is the workflow fixed and stable on every run? | Whether the same steps apply every time | If yes → Sequential Workflow; if no → revisit the adaptive patterns
Q3 | Is the task structure articulable before execution? | Whether the major stages and dependencies are clear | If yes → Planning + ReAct execution; if no → Single agent + ReAct + tools
Q4 | Does quality matter more than speed, with checkable criteria? | Whether extra critique/refinement passes are worth the latency/cost | If yes → add a Reflection layer on top of the chosen pattern; if no → skip reflection
Q5 | Is specialization, context, or scale the bottleneck? | Whether one agent falls short on expertise, context, or parallel capacity | If yes → Multi-Agent Specialist System; if no → keep a single agent

Questions 1-3 determine the core pattern. Questions 4-5 are additive layers; they can be applied on top of any core pattern, but only when their assumptions hold.
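The table can be read as a small routing function. The following is a minimal sketch, not a definitive implementation: `TaskProfile`, `Recommendation`, `choose_pattern`, and every field name are hypothetical names introduced here for illustration, and the Q2 = no case is simplified to "fall through to Q3".

```python
from dataclasses import dataclass, field


@dataclass
class TaskProfile:
    """Answers to the five questions for one task (hypothetical field names)."""
    path_definable: bool         # Q1
    workflow_stable: bool        # Q2 (only meaningful when Q1 is yes)
    structure_articulable: bool  # Q3
    quality_over_speed: bool     # Q4, first condition
    criteria_checkable: bool     # Q4, second condition
    bottleneck: bool             # Q5: specialization, context, or scale


@dataclass
class Recommendation:
    core: str
    layers: list = field(default_factory=list)


def choose_pattern(task: TaskProfile) -> Recommendation:
    # Q1 and Q2: a known AND stable path routes to a sequential workflow.
    if task.path_definable and task.workflow_stable:
        core = "sequential workflow"
    # Q1 = no, or Q2 = no (assumption: revisit the adaptive patterns via Q3).
    elif task.structure_articulable:
        core = "planning + ReAct execution"
    else:
        core = "single agent + ReAct + tools"

    layers = []
    # Q4: reflection needs BOTH conditions to hold.
    if task.quality_over_speed and task.criteria_checkable:
        layers.append("reflection")
    # Q5: multi-agent only when a real bottleneck exists.
    if task.bottleneck:
        layers.append("multi-agent specialists")
    return Recommendation(core=core, layers=layers)


if __name__ == "__main__":
    invoice_intake = TaskProfile(True, True, False, False, False, False)
    print(choose_pattern(invoice_intake))
```

The function only encodes the routing; the real work is answering each question honestly for the task, which the Concepts below cover one by one.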

Concept 4: Q1: Can the solution path be defined in advance?

This is the most important question, because it decides whether you need an agentic system at all.

What "solution path" means. Put plainly: if I tell you the input, can you tell me the exact sequence of steps that leads to the output? Not the answer itself, only the path. For invoice intake: receive the email → extract structured fields → validate against the database → store → notify the requester. Five steps, the same five steps, every time. That is a known solution path.

Contrast: a customer asks "why was I charged twice on November 12?" The path depends on what you find. Look at the transaction history. Found it. The two charges are from different merchants: pivot to "was this fraud?" Or it is the same merchant with different timestamps: pivot to "was the second one a retry?" Or the customer account has multiple users: pivot to "did someone else make a purchase?" Each branch leads to a different next step. The path cannot be specified in advance; it emerges from what the investigation reveals. That is an unknown solution path.

How to test this honestly. Three tests, in this order:

  1. Can you write a flowchart of the steps before seeing the input? If yes, the path is known. If the flowchart needs boxes that say "now the agent decides what to do," the path is unknown.
  2. Do the steps repeat unchanged across many runs? Invoice intake repeats. Customer support investigations do not. A research report's outline may have the same shape every time (intro, three sections, conclusion), but content discovery is not a step sequence; it is adaptive search.
  3. Do the steps change when the input changes? A known path produces the same step sequence for different inputs. An unknown path produces different step sequences depending on what each step reveals.

Where teams go wrong here. The most common error is believing the path is known because the task description looks structured. "Process refund requests" sounds known: receive the request, look up the order, issue the refund, notify the customer. Real refund requests do not work that way. Some require dispute investigation (was this a chargeback?), some a policy lookup (does this customer's plan allow refunds?), some escalation (the amount exceeds the agent's authority), and some involve multiple charges that must be disambiguated. The four-step flowchart is wrong; the actual path is adaptive.

The mirror error: treating the path as unknown because the task description sounds open-ended. "Help me find a good restaurant in the city tonight" sounds adaptive, but if the actual implementation is parse the request → query the restaurant database with filters → return the top 5 by rating, the path is known and a sequential workflow is the right pattern. The "agentic" framing was misleading.

Route. If the path is known (and stable; see Q2 next), you are heading toward a sequential workflow. You may not need an LLM-driven agent at all; you need a workflow with LLM calls embedded at the specific steps that require interpretation or generation. If the path is unknown, you need agentic reasoning; the question is whether the structure is articulable (Q3, planning) or not (Q3, pure ReAct).

Useful heuristic. Ask yourself: "If I had to write this as a Python function without LLM calls, would I know its structure?" If yes, the path is probably known; the LLM is needed only for specific moments of reasoning or generation. If no, the path is probably unknown; the LLM is making structural decisions, not just generative ones.
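Applied to the invoice-intake example, the heuristic might look like the sketch below. It is only an illustration under stated assumptions: `llm_extract_fields` is a hypothetical stand-in for the one LLM-backed step (here a trivial `key: value` parser so the code runs), and `process_invoice`, `known_vendors`, and `store` are made-up names.

```python
def llm_extract_fields(email_body: str) -> dict:
    """Stand-in for an LLM extraction call (hypothetical).

    A real system would call a model here; this version parses
    simple 'Key: value' lines so the sketch stays runnable.
    """
    pairs = (line.split(":", 1) for line in email_body.splitlines() if ":" in line)
    return {key.strip().lower(): value.strip() for key, value in pairs}


def process_invoice(email_body: str, known_vendors: set, store: list) -> str:
    # Step 1 (receive) is the function input itself.
    fields = llm_extract_fields(email_body)            # Step 2: extract (interpretation)
    if "amount" not in fields or fields.get("vendor") not in known_vendors:
        raise ValueError("invoice failed validation")  # Step 3: validate
    store.append(fields)                               # Step 4: store
    # Step 5: notify (returned as a message instead of being sent).
    return f"Invoice from {fields['vendor']} for {fields['amount']} recorded."


if __name__ == "__main__":
    db = []
    print(process_invoice("Vendor: Acme\nAmount: 120.50", {"Acme"}, db))
```

The structure of `process_invoice` is fully known without any model: the LLM fills in one step but never decides which step comes next. If you cannot write the body of such a function without an "agent decides" placeholder, the path is probably unknown.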

Summary: Q1 asks whether the solution path can be specified before runtime. Known paths route toward sequential workflows (Q2); unknown paths route toward adaptive agentic reasoning (Q3). The most common error is believing the path is known because the task description looks structured when the actual implementation is adaptive: refund processing, customer support, debugging. The opposite error is believing the path is unknown when it is actually a workflow with LLM-flavored input. Test with the "Python function without LLM calls" heuristic.

Concept 5: Q2: Is the workflow fixed and stable on every run?

You have answered Q1 with "yes, the path is known." Q2 is the second check: is it fixed and stable across the inputs you actually expect? Because "known" and "stable" are not the same thing.

The distinction. A path can be known in principle and still vary in practice. Consider a "research assistant" agent that handles user queries. Sometimes the user wants a quick answer (look up one fact, return it). Sometimes they want multi-source synthesis (search, compare, summarize). Sometimes they want analysis of an uploaded document (read it, extract claims, evaluate them). You can write down the path for each case, but the path varies with the input type. This is known-but-variable, not known-and-stable.

Versus: invoice intake. Every invoice goes through the same five steps. The path is stable. The content of each step varies (different vendors, different amounts), but the step structure does not.

Why this matters. A sequential workflow assumes stability. If you build a fixed pipeline and the path varies, the pipeline will force the wrong path on some inputs: it will either apply steps that do not apply (a quick-answer query gets the full synthesis treatment) or fail loudly (the document-analysis path does not fit the quick-answer step structure).

Test. Look at (or carefully imagine) a representative sample of real inputs. Does the step sequence stay the same across all of them?

  • Yes, every input goes through the same steps → the workflow is stable; build a sequential workflow.
  • No, different inputs need different step sequences → the workflow is variable; you need either (a) a workflow with explicit branching that handles each variant, or (b) an agentic pattern that adapts the path to the input.
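If you already log which steps each run executed, the stability test can be checked mechanically. This is a minimal sketch assuming a hypothetical trace format (one list of step names per run); `distinct_step_sequences` and `is_stable` are illustrative names, not part of any library.

```python
from collections import Counter


def distinct_step_sequences(traces):
    """Count how often each distinct step sequence appears in the sample."""
    return Counter(tuple(trace) for trace in traces)


def is_stable(traces) -> bool:
    """True when every sampled run followed exactly the same step sequence."""
    return len(distinct_step_sequences(traces)) == 1


if __name__ == "__main__":
    invoice_runs = [["extract", "validate", "store", "notify"]] * 3
    assistant_runs = [
        ["look_up_fact", "return_answer"],
        ["search", "compare", "summarize"],
        ["read", "extract_claims", "evaluate"],
    ]
    print(is_stable(invoice_runs), is_stable(assistant_runs))
```

The counter also shows how many variants exist and how common each one is, which feeds directly into the branched-workflow versus agentic-pattern choice discussed next.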

Where teams go wrong here. Treating "known on average" as "known and stable." The 80% case is a fixed workflow; the 20% case requires deviation. Engineers build the workflow for the 80% case and add ad-hoc patches for the 20%. Eventually the patches start to dominate the original workflow, and you end up with an undocumented hybrid nobody understands. This pattern tends to appear when a team does not want to admit the task is more adaptive than they hoped: sequential workflows feel safer than agentic patterns, so they over-fit to them.

Route. If the workflow is fixed and stable → Sequential Workflow. For this branch of the tree, stop here. Skip Question 3 and, usually, Question 4. Consider Q5 only when scale forces parallelization across workflow instances.

If the workflow is known-but-variable → two choices:

  1. Sequential workflow with explicit branching: write each variant as a branch and route deterministically (often with a small LLM call that only classifies the input type, then routes). Best when the variants are few and stable.
  2. Treat the path as effectively unknown: go to Q3 and let agentic reasoning handle the variation. Best when the variants are many or still evolving.

Pragmatic heuristic. If you can count the variants on one hand and they rarely change, use a branched workflow. If not, use an agentic pattern.
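The first choice, a branched workflow with input-type routing, can be sketched as follows for the research-assistant example. `classify_input` is a hypothetical stand-in for the small classification call (keyword rules here, so the example runs without a model), and the branch and step names are illustrative.

```python
def classify_input(query: str) -> str:
    """Stand-in for a small LLM call that only labels the input type (hypothetical)."""
    q = query.lower()
    if "attached" in q or "document" in q:
        return "document_analysis"
    if "compare" in q or "summarize" in q:
        return "multi_source_synthesis"
    return "quick_lookup"


# Each variant is a fixed, written-down step sequence.
BRANCHES = {
    "quick_lookup": ["look_up_fact", "return_answer"],
    "multi_source_synthesis": ["search", "compare", "summarize"],
    "document_analysis": ["read", "extract_claims", "evaluate"],
}


def route(query: str):
    """Classify once, then follow the matching fixed branch."""
    variant = classify_input(query)
    return variant, BRANCHES[variant]


if __name__ == "__main__":
    print(route("Compare the pricing of the two vendors"))
```

The classifier decides which branch runs, but never what happens inside a branch. Once `BRANCHES` keeps growing or its entries keep changing, that is the signal to move to the second choice.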

Summary: Q2 asks whether the known path is also stable across the inputs you expect. Stable paths route to sequential workflows. Known-but-variable paths route either to workflows with explicit branching (few, stable variants) or to agentic patterns (many or evolving variants). The trap is treating "the 80% case is fixed" as "fixed"; the 20% case grows into patches that dominate the original design.

Concept 6: Q3: Is the task structure articulable before execution?

You have answered Q1 with "the path is unknown"; agentic reasoning is needed. Q3 asks the next question: can the high-level structure of the work be articulated in advance, even though the specific steps cannot?

What "structure" means here. Not the steps themselves, because Q1 established that those are unknown. The stages and their dependencies. Example: a market research agent. You cannot specify the steps in advance (which sources to consult, which competitors to investigate, which analyses to run all depend on the findings). But you can articulate the structure: gather data → analyze → synthesize → report. Four stages, in that order, with clear dependencies. That is articulable structure.

Contrast: a customer-support agent handling "I'm having an issue." The agent investigates. Depending on the findings, the work may require an account lookup, then a knowledge-base search, then a policy check, then escalation, or none of that and just a quick redirection. You cannot articulate stages because the work does not fit a stage structure; it is an investigation that is complete when it is complete. That is not articulable.

Test. Before seeing a specific input, try to draw the work as a phase diagram. Can you label the major phases and dependencies?

  • Yes, the phases are clear (gather → analyze → synthesize; or design → implement → test; or research → draft → review) → the structure is articulable; use planning.
  • No, the work does not fit into phases; it is investigation, iteration, or open-ended exploration → the structure is not articulable; use ReAct.

Where teams go wrong here. Inventing structure where there is none. Engineers feel a plan should always be possible, so they force one. The planner generates a plan; execution diverges immediately because the task never actually had those phases. The team then either (a) treats the divergence as a planner bug ("the planner made a bad plan"; rewrite the planner; repeat) or (b) gradually shortens the plan until it is trivial and contributes nothing. The honest answer was: "this task did not need a plan; use ReAct."

The opposite error: missing structure that is actually there. Engineers use pure ReAct for tasks that genuinely have phases. The agent wanders, revisits solved work, or starts to lose track of overall progress. Adding "remember these phases" to the prompt is a workaround; the architectural fix is to add planning on top of the ReAct loop.

Route. If the structure is articulable → Planning + ReAct execution. The planning agent produces the phase structure; inside each phase, ReAct handles the unknown-step adaptation identified in Q1.

If the structure is not articulable → Single agent + ReAct + tools. The agent reasons about the current state, takes the next action, observes the result, and repeats, with no overlaid structure beyond what the agent itself maintains.

A heuristic worth internalizing. Planning helps when the shape of the work is predictable but the content is not. ReAct alone is right when even the shape depends on discovery. The shape-versus-content distinction is the cleanest way to separate the two.
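The shape-versus-content split can be sketched as an outer loop over fixed phases with an adaptive loop inside each one. This is a toy illustration under stated assumptions: `run_plan`, `react_phase`, and `scripted_decide` are hypothetical names, and `scripted_decide` is a deterministic stand-in for the model's reasoning step so the example runs without an LLM.

```python
def react_phase(phase, context, decide, max_steps=5):
    """Run a reason -> act -> observe loop inside a single phase."""
    observations = []
    for _ in range(max_steps):
        action = decide(phase, context + observations)  # reason
        if action is None:                              # agent judges the phase done
            break
        observations.append(f"{phase}:{action}")        # act + observe (stubbed)
    return observations


def run_plan(phases, decide):
    """The plan fixes the SHAPE (phase order); ReAct fills in the CONTENT."""
    context = []
    for phase in phases:
        context += react_phase(phase, context, decide)
    return context


def scripted_decide(phase, seen):
    """Stand-in for the LLM: take two steps per phase, then stop (assumption)."""
    done = sum(1 for item in seen if item.startswith(phase + ":"))
    return None if done >= 2 else f"step{done + 1}"


if __name__ == "__main__":
    for line in run_plan(["gather", "analyze", "synthesize", "report"], scripted_decide):
        print(line)
```

Dropping `run_plan` and calling a single loop with no phase list gives the pure ReAct variant: the same inner mechanism, with no overlaid structure. The `max_steps` bound is a design choice added here so a misbehaving decision function cannot loop forever.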

🔍 Q2 vs. Q3 confusion, disambiguated with examples

Q2 ("is the workflow fixed and stable?") and Q3 ("is the task structure articulable?") trip up even experienced teams. Both ask about predictability; the difference is which kind of predictability:

Question | What it asks | What "yes" means | Where "yes" routes
Q2 | Are the steps themselves fixed across runs? | The same Python function-call sequence produces the right answer every time. No LLM-driven decisions about what to do next. | Sequential workflow
Q3 | Are the major stages articulable in advance, even if step-level work varies? | You can describe the phase structure on a whiteboard before seeing a specific input. The LLM still decides what to do inside each stage. | Planning + ReAct execution

The conflation that bites: engineers see structure in the task ("there are clearly stages here: research, analyze, write") and answer Q2 with YES. But "structure exists" is Q3's question, not Q2's. Q2 asks whether you can predict the exact step sequence that will run; if the agent has to make decisions inside each stage (which sources, which analyses, which framings), the answer to Q2 is NO and you should be at Q3.

Three boundary examples that separate Q2 from Q3:

Example A, Invoice intake (Q2 = YES → Sequential workflow): receive → extract → validate → store → notify. The same five steps every time. The LLM extracts fields and writes the notification, but it does not decide what to do next. The step sequence is fixed.

Example B, Market research report (Q2 = NO, Q3 = YES → Planning + ReAct): gather data → analyze → synthesize → draft → review. The stages are articulable, but inside each stage the agent decides what to do (which sources to consult, which competitors to focus on, which analyses to run). The stages are fixed; the steps inside the stages are adaptive.

Example C, Customer-support investigation (Q2 = NO, Q3 = NO → Single agent + ReAct): the agent investigates the customer's issue. There is no predetermined phase structure: depending on what the agent finds, the work may be a single lookup or five lookups plus a policy check plus escalation. Neither the stages nor the steps are fixed.

Note that Example B is the case the Decisions in Part 5 exercise only partially. If you find yourself saying both "this has clear phases" AND "the planner made a plan but execution kept diverging," you are on the Q2/Q3 boundary, and the answer is almost always Planning + ReAct, not Sequential workflow.

The known-but-variable subcase of Q2 (worth naming). Sometimes Q1 = YES (the path is known) but Q2 = NO (it varies across inputs), for example when the workflow has 3-4 stable variants depending on input type (quick lookup vs. multi-source synthesis vs. document analysis). This is neither a Sequential workflow case nor a Planning + ReAct case; it is a branched workflow with explicit input-type routing. Concept 5 covers it; the Decision 4 variant in the anti-pattern gallery (the row "adding planning to a stable workflow" in Concept 16.5) covers the inverse failure.

Summary: Q3 asks whether the task's high-level structure (stages and dependencies) is articulable before execution. Articulable structure routes to planning + ReAct execution (the plan provides the shape; ReAct handles the unknown content of each stage). Non-articulable structure routes to pure ReAct + tools (the agent discovers both shape and content adaptively). The traps are inventing structure where there is none (forced plans that execution diverges from) and missing structure that is actually there (pure ReAct on phased work, which leads to wandering).

Concept 7: Q4: Does quality matter more than speed, with checkable criteria?

Q4 is the first of the two additive-layer questions. The core pattern (sequential workflow, ReAct, or planning + ReAct) has already been chosen by Q1-Q3. Q4 asks whether to put a reflection layer on top.

What reflection does. After the agent produces an output, a critique pass evaluates it against explicit criteria. If the critique identifies defects, the agent refines (or regenerates) the output. The pattern's bet (from Concept 2): the critique pass can catch errors the generator missed, and the criteria for "good output" are explicit enough for the critique to be meaningful.

Both conditions are required for reflection to be valuable.

  1. Quality matters more than speed. Reflection adds at least one extra LLM call (critique) and often two (critique + refinement). In interactive use cases where latency matters (real-time customer support, conversational agents), that cost is often prohibitive. In batch use cases where humans review the output or it ships to downstream systems (report generation, code generation, document drafting), the latency is usually acceptable. Test: would a 2-5× slower response be acceptable in exchange for meaningfully higher-quality output?
  2. Evaluation criteria are explicit and checkable. Vague criteria produce vague critiques. "Ensure this is good" is not a criterion. "Verify the SQL parses, hits only the listed tables, and does not use SELECT *" is a criterion. Without explicit criteria, the critique pass becomes verbose chatter that does not improve the output, and often gives false confidence that "the AI checked it" when nothing was actually checked.

Both conditions matter equally. Adding reflection to a latency-sensitive task wastes time. Adding reflection to a vague-criteria task produces theater. Both failures are common; both come from skipping Q4 and adding reflection because it feels rigorous.

Test. Ask two questions:

  • If this response took 3-5× longer to produce, would my users (or downstream consumers) be fine with that in exchange for a meaningful quality improvement? If no, reflection is not justified by the latency budget.
  • Can I write down, in 5-10 specific bullet points, exactly what "good output" means for this task, such that a different LLM could read those bullets and check the output? If no, reflection is not justified by criterion clarity.

If both answers are yes, reflection adds value. If either is no, skip reflection.

Where teams go wrong here.

Adding reflection because critics feel rigorous. "Generate, then critique" feels like good engineering. Often it is; sometimes it is only for show. The test is whether the critique measurably changes the output. If you added reflection and the post-reflection output is the same as the pre-reflection output 90% of the time, reflection is not doing work; it is adding cost.
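One way to run that test is to log each output before and after the reflection pass and compute how often they differ. The sketch below assumes you already have such pairs; `reflection_change_rate` and `reflection_earns_its_cost` are hypothetical names, and the 10% floor is an assumption derived from the 90%-unchanged figure above, not a universal threshold.

```python
def reflection_change_rate(pairs):
    """Fraction of (before, after) pairs where reflection changed the output."""
    if not pairs:
        raise ValueError("need at least one (before, after) pair")
    changed = sum(1 for before, after in pairs if before.strip() != after.strip())
    return changed / len(pairs)


def reflection_earns_its_cost(pairs, min_change_rate=0.10):
    """Assumed rule of thumb: at or below the floor, the critic is mostly rubber-stamping."""
    return reflection_change_rate(pairs) > min_change_rate


if __name__ == "__main__":
    logged = [("SELECT id FROM orders", "SELECT id FROM orders")] * 9
    logged.append(("SELECT * FROM orders", "SELECT id, total FROM orders"))
    print(reflection_change_rate(logged), reflection_earns_its_cost(logged))
```

A changed output is not necessarily a better one, so the change rate is only a first filter; a low rate is strong evidence the layer is pure cost, while a high rate still needs a quality comparison.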

Using the same model and prompt style for both generator and critic. The critic has the same training data, the same biases, and the same blind spots as the generator. It starts rubber-stamping. Effective reflection patterns either (a) use a different model for the critic, (b) frame the critic from a fundamentally different perspective ("you are a strict reviewer looking for problems" vs. the generator's helpful framing), or (c) give the critic explicit checking tools (run the SQL, parse the JSON, validate the schema).
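Option (c) can be sketched for the SQL criteria mentioned earlier in this Concept (the query parses, touches only listed tables, and does not use SELECT *). This is a minimal sketch that assumes the SQLite dialect and uses Python's standard `sqlite3` module; `check_sql` and the example schema are hypothetical, and the SELECT * check is a rough regular expression rather than a real SQL parser.

```python
import re
import sqlite3


def check_sql(sql: str, schema: dict) -> list:
    """Return a list of criterion violations; an empty list means all checks passed.

    `schema` maps each allowed table name to its column names (trusted input).
    """
    problems = []
    conn = sqlite3.connect(":memory:")
    try:
        # Build an empty database that contains ONLY the listed tables.
        for table, columns in schema.items():
            conn.execute(f"CREATE TABLE {table} ({', '.join(columns)})")
        try:
            # EXPLAIN compiles the statement without running it, so syntax
            # errors and references to unlisted tables both fail here.
            conn.execute(f"EXPLAIN {sql}")
        except (sqlite3.Error, sqlite3.Warning) as exc:
            problems.append(f"does not parse or touches an unlisted table: {exc}")
    finally:
        conn.close()
    # Rough textual check for SELECT * and SELECT alias.*
    if re.search(r"\bselect\s+(?:\w+\.)?\*", sql, re.IGNORECASE):
        problems.append("uses SELECT *")
    return problems


if __name__ == "__main__":
    allowed = {"orders": ["id", "total"]}
    print(check_sql("SELECT * FROM payments", allowed))
```

The returned list can be handed to the refinement step as concrete feedback. Because the checks are executed rather than judged by a model, they do not share the generator's blind spots.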

Reflecting on tasks where the output is not checkable. Reflection works on tasks where wrongness is defined: SQL with errors, code that does not compile, summaries that miss key facts from the source. It works less well where "good" is subjective: marketing copy, creative writing, conversational responses. Subjective domains benefit more from human-in-the-loop review than from LLM reflection.

Route. If both conditions hold, add a reflection layer on top of the core pattern chosen by Q1-Q3. It does not replace the core pattern; it wraps it. A sequential workflow with reflection runs the workflow, then critiques the final output. A ReAct agent with reflection completes its loop, then critiques the final output. Reflection is post-hoc quality control, not a replacement for the core pattern.

If either condition fails, skip reflection. If you genuinely need quality assurance but the criteria are not checkable, the right fix is human review, not LLM reflection.

Summary: Q4 asks whether quality matters more than speed AND the evaluation criteria are explicit and checkable. Reflection adds value only when both conditions hold. Reflection on latency-sensitive tasks wastes time; reflection on vague-criteria tasks produces theater. The two most common failure modes: adding reflection because it feels rigorous (without checking whether the output changes) and using the same model and prompt style for generator and critic (rubber-stamping). When reflection is justified, it is a layer on top of the core pattern, not a replacement for it.

Concept 8: Q5: Is specialization, context, or scale the bottleneck?

Q5 is the second additive-layer question, and the most consequential, because multi-agent systems are the most expensive pattern to build and, if they turn out to be wrong, the most expensive to remove.

What multi-agent systems bet on. Three distinct claims that are often conflated:

  1. Specialization claim: the task needs distinct expertise that a single agent cannot hold well in one prompt. A coder, a security reviewer, and a documentation writer have different optimal prompts, tools, and evaluation criteria. Fitting all three into one agent yields mediocrity in all three.
  2. Context claim: the task needs more context than a single agent can use effectively. Even if the context window is technically large enough, retrieval and reasoning degrade as context grows. Splitting the work across agents, each with a focused context, preserves reasoning quality.
  3. Scale claim: the task contains work that can run in parallel, and a multi-agent system can execute it faster than a single sequential agent. Researching 10 competitors simultaneously beats researching them one by one.

Each claim has to be tested separately against the actual task.

The specialization claim is the one most often believed without evidence. Engineers look at a task like "build a feature" and decompose it into roles (architect, coder, tester, reviewer) because it feels intuitive. The intuition is wrong as often as it is right. Real feature-building often goes better in one agent with good tool access; the architect-coder-tester separation introduces handoff costs that exceed the specialization gain. Test the claim: would a domain specialist focused only on this slice meaningfully improve the work?

The context claim is more often true at scale. A single agent running ten retrievals across ten knowledge bases accumulates context that degrades reasoning. Splitting into ten retrieval-and-summary agents, each producing a focused brief, and then composing the briefs often outperforms it, because each retrieval agent's context stays small and focused. But this is a real architectural decision, not a default.

The scale claim is the easiest to test: does parallel execution deliver a measurable throughput improvement, and does the task genuinely parallelize cleanly? If the work has strict sequential dependencies (each step needs the previous step's output), parallel multi-agent execution adds coordination cost without adding speed.

Test. Three sub-questions:

  1. Can I name the specific expertise that justifies a specialist? "It will be cleaner" does not count. "The reviewer has to apply OWASP standards that the coder should not need to learn" does count. If you cannot name the expertise, the specialization claim is probably aesthetic.
  2. Will the task's context exceed what a single agent can use effectively? Generally yes if the task requires multiple distinct knowledge bases, long-running investigations across many sources, or a specialized tool set for each phase. Generally no if the context fits in one well-managed prompt.
  3. Does the work really parallelize, with a measurable throughput improvement? If the work is sequential (each step depends on the previous one), parallel execution does not help. If the work is genuinely independent (research 10 competitors, evaluate 10 candidates, summarize 10 documents), parallelization provides real value.

If at least one sub-question gets a strong yes, multi-agent is justified. If all three get "maybe" or "separate agents would be nice for organizational reasons", stay on the single-agent pattern. The coordination overhead is real and substantial.
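The decision rule above is small enough to write down. The following is a minimal sketch, assuming each sub-question is answered on a three-level scale; the names `Answer`, `Q5Answers`, and `route_q5` are hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Answer(Enum):
    STRONG_YES = "strong yes"
    MAYBE = "maybe"
    NO = "no"


@dataclass(frozen=True)
class Q5Answers:
    specialization: Answer  # can you name the specific expertise?
    context: Answer         # will context exceed what one agent uses well?
    scale: Answer           # does the work parallelize with measurable gain?


def route_q5(answers: Q5Answers) -> str:
    """Route to multi-agent only when at least one claim is a strong yes."""
    claims = (answers.specialization, answers.context, answers.scale)
    if any(claim is Answer.STRONG_YES for claim in claims):
        return "multi-agent specialist system"
    # "maybe" everywhere (or organizational reasons) keeps the single agent.
    return "single-agent pattern"


all_maybe = Q5Answers(Answer.MAYBE, Answer.MAYBE, Answer.MAYBE)
parallel_research = Q5Answers(Answer.NO, Answer.MAYBE, Answer.STRONG_YES)

print(route_q5(all_maybe))
print(route_q5(parallel_research))
```

Keeping "maybe" as a distinct value from "strong yes" is the point of the sketch: three lukewarm answers never add up to a justification.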

Where teams go wrong here.

Building multi-agent systems for organizational reasons. "Three teams are working on this; let's build three agents." This turns the agent architecture into a mirror of the org chart, and it is almost always wrong. Multi-agent systems should be designed around task properties, not team boundaries. (Three teams can collaborate on one agent; org structure and agent structure do not have to match.)

Underestimating coordination cost. Every handoff between agents introduces a serialization point (one agent's output becomes another's input), a potential failure point (the handoff format may not match), and debugging difficulty (when something goes wrong, which agent caused it?). Multi-agent systems are roughly an order of magnitude more expensive to debug than single-agent systems: factor this cost in when you decide whether the upgrade is justified.

Building multi-agent to show sophistication. This is the talks-and-demos failure mode. Multi-agent systems look impressive in architecture diagrams; they look like "real AI". If the actual task does not justify them, you have built impressive overhead.

Route. If specialization, context, or scale creates a real bottleneck → Multi-Agent Specialist System. The system may consist of a coordinator/routing agent plus specialists, of specialists with explicit handoff contracts, or of specialists that communicate through shared state. The core pattern (sequential workflow, ReAct, planning + ReAct) still applies inside each specialist's domain; multi-agent is a composition of patterns, not a replacement for them.

If there is no real bottleneck → keep the single-agent pattern. Add reflection if the Q4 conditions hold, but do not add multi-agent for aesthetic reasons.

Quantitative triggers for Q5: concrete metrics that fire the multi-agent decision. "Specialization, context, or scale is a bottleneck" is judgment-based by default, and judgment is exactly where pattern overshoot creeps in. Wherever possible, replace judgment with measurement. The triggers below are rules of thumb that move Q5 from subjective ("it feels like specialists") to defensible ("we measured X and X is above the threshold").

| Bottleneck claim | Quantitative trigger that justifies the upgrade | What the metric measures |
| --- | --- | --- |
| Specialization | Single-agent traces show tool-routing errors concentrated in specific knowledge domains (rough working threshold: about one third of runs in the affected category; calibrate against your own baseline). Example: a unified billing+technical agent picks the wrong tool on a sizeable share of technical queries because billing terminology dominates the context. | Per-trace tool correctness, segmented by query category: a Phoenix evaluator in your eval suite |
| Specialization (qualitative fallback) | Cannot measure? Before upgrading, require a written specification of the specialist roles: each role's responsibilities, tools, and acceptance criteria in plain English. If the spec is vague, or the roles' responsibilities overlap by more than 40%, the specialization claim is aesthetic, not architectural. | Document review, not a metric |
| Context overflow | Accuracy on a holdout set degrades materially as context grows (measure your own curve; rough flag: a drop of about 10 points across a 15K → 45K token sweep is worth investigating). Example: a research agent that loads 25 source documents shows 78% accuracy at 15K context, 71% at 30K, and 62% at 45K. | Context-vs-accuracy curve on the golden dataset |
| Scale (parallelizable) | Each run has >5 independent sub-tasks AND single-agent execution latency is more than 2× the user-facing latency budget. Example: researching 10 competitors takes a single agent 8 minutes sequentially against a 3-minute budget, so parallel multi-agent execution is the only way to fit. | End-to-end latency + sub-task independence analysis |
| Scale (throughput) | Run volume exceeds the single-agent design's rate-limit ceiling by 10× AND per-tenant concurrency caps cannot preserve fairness. Example: against a 500 RPM OpenAI quota, if every tenant requires 5K runs/day, you need fan-out across multiple agent identities or a specialist-style decomposition. | Production load × API rate limits: visible in the operational envelope's flow-control dashboards |
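The measurable rows of the table can be turned into an explicit check. This is a rough sketch, assuming you already collect these numbers from traces and a holdout set; `Q5Measurements` and `fired_triggers` are hypothetical names, and the default thresholds are the illustrative ones from the table, meant to be recalibrated on your own system. The throughput row is left out because it depends on provider-specific rate limits.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Q5Measurements:
    tool_error_rate_in_category: float  # share of runs with a wrong tool, worst query category
    accuracy_low_context: float         # holdout accuracy (%) at the small end of the sweep
    accuracy_high_context: float        # holdout accuracy (%) at the large end of the sweep
    independent_subtasks: int           # independent sub-tasks per run
    single_agent_latency_s: float       # sequential single-agent latency, in seconds
    latency_budget_s: float             # user-facing latency budget, in seconds


def fired_triggers(
    m: Q5Measurements,
    *,
    error_rate_threshold: float = 1 / 3,
    accuracy_drop_points: float = 10.0,
    min_subtasks: int = 5,
    latency_ratio: float = 2.0,
) -> list[str]:
    """Return the bottleneck claims whose illustrative trigger fires."""
    fired = []
    if m.tool_error_rate_in_category >= error_rate_threshold:
        fired.append("specialization")
    if m.accuracy_low_context - m.accuracy_high_context >= accuracy_drop_points:
        fired.append("context")
    over_budget = m.single_agent_latency_s > latency_ratio * m.latency_budget_s
    if m.independent_subtasks > min_subtasks and over_budget:
        fired.append("scale")
    return fired


# Numbers taken from the table's examples; the 10% tool-error rate is made up.
example = Q5Measurements(
    tool_error_rate_in_category=0.10,
    accuracy_low_context=78.0,
    accuracy_high_context=62.0,
    independent_subtasks=10,
    single_agent_latency_s=8 * 60,
    latency_budget_s=3 * 60,
)
print(fired_triggers(example))
```

An empty list is a useful result too: it is the measured version of "stay on the single-agent pattern".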

Hierarchy of evidence. Justifications for multi-agent, from strongest to weakest:

  1. Production trace data that shows the bottleneck (best: you have evidence that the single-agent system really fails this way)
  2. Holdout-set measurements that show the bottleneck (strong: a controlled experiment)
  3. Domain analysis with written specialist-role specifications (acceptable: you have at least defined what you are building)
  4. "Feels like specialists" (insufficient: this is where pattern overshoot lives)

A useful self-check. "What is the smallest single-agent design we can ship first, and which specific failure would later force us into multi-agent?" If the answer is "discovering failure pattern X in production traces", ship the single agent first and let the upgrade trigger fire. Multi-agent is rarely the wrong endpoint; it is almost always the wrong starting point.

Summary: Q5 asks whether specialization, context, or scale creates a real bottleneck that justifies a multi-agent architecture. The three claims have to be tested separately and, wherever possible, against quantitative triggers (illustrative thresholds, to be calibrated on your own system: tool-routing errors in about one third of runs, an accuracy drop of about 10 points at higher context, latency more than 2× the budget). Specialization is the claim most often believed without evidence; context is more often genuinely true at scale; scale is the easiest to test. The biggest failure mode is building multi-agent systems for organizational or aesthetic reasons instead of task-property reasons; the coordination overhead is real and substantial, and removing a deployed multi-agent system is a rewrite, not a refactor. Start with a single agent and let measured triggers force the upgrade.

Concept 8.5: OpenAI Agents SDK primitives, and which ones each pattern uses

Before Part 3 walks through the five patterns, here is the bridge from pattern selection to implementation. Earlier courses taught the OpenAI Agents SDK as the anchor framework. The patterns in this course are not abstract architectural shapes that you reimplement from scratch; they are shapes you compose from SDK primitives you have already met. This concept maps each pattern to the specific SDK primitives that build it.

The five primitives that matter for pattern selection.

| Primitive | What it is | Which patterns use it |
| --- | --- | --- |
| Agent | The core class: an LLM-driven entity with instructions, tools, and an optional structured output schema. The atomic unit of every pattern. | All five patterns |
| Runner.run(agent, input) | Runs the agent loop to its final output. The SDK runs the loop for you: no hand-rolled reason-act-observe cycle. | Single agent + ReAct (most prominent), Planning + ReAct, Multi-agent (per specialist) |
| @function_tool | Decorator that turns a Python function into a tool the agent can call. Type signatures and docstrings automatically become the tool schema. | Single agent + ReAct, Planning + ReAct, Multi-agent (per specialist), Sequential workflow (when an LLM-step needs tools) |
| handoff(target_agent) | First-class SDK primitive for multi-agent transitions: one agent explicitly hands control to another while preserving the conversation context. Cleaner than hand-rolling a coordinator. | Multi-agent (primary use); Planning + ReAct (planner-to-executor) |
| output_guardrail / input_guardrail | SDK primitives for running validation/critique passes on an agent's input or output. The native SDK pattern for reflection. | Reflection (primary use); any pattern that needs input validation |

One more primitive worth naming: Agent.as_tool(). It converts an Agent into a callable tool that another Agent can invoke. This is the SDK's mechanism for hierarchical multi-agent composition: the coordinator agent uses specialist agents as tools, exactly like any other function tool. Multi-agent systems built on Agent.as_tool() are simpler than those built on handoff() because the coordinator stays in control; handoff() is for situations where you really do want the specialist to take over the conversation.

Pattern → primitive mapping at a glance.

Sequential workflow:
    Agent(output_type=...) at the LLM-steps; plain Python everywhere else
    Runner.run() called once per LLM-step: no agentic loop (the agent has no tools)

Single agent + ReAct + tools:
    Agent(instructions=..., tools=[tool_a, tool_b, ...])  # each tool is an @function_tool-decorated function
    Runner.run(agent, input): the SDK runs the reason-act-observe loop

Planning + ReAct execution:
    planner = Agent(output_type=PlanSchema)
    plan = (await Runner.run(planner, task)).final_output  # Runner.run returns a result object; the typed plan is its final_output
    for stage in plan.stages:
        result = await Runner.run(stage.agent, stage.input)

Single agent + reflection:
    Agent(..., output_guardrails=[critic_guardrail])
    OR: Agent(..., tools=[critic_agent.as_tool(tool_name=..., tool_description=...)])

Multi-agent specialist system:
    coordinator = Agent(handoffs=[researcher, writer, reviewer])
    OR: coordinator = Agent(tools=[researcher.as_tool(...), writer.as_tool(...), ...])

The code blocks in Part 3 show all of these in full SDK detail.

Why this mapping matters for pattern selection. SDK primitives are not just implementation conveniences; they encode architectural decisions. Choosing between handoff() and as_tool() is itself a pattern-composition decision. handoff() means "the specialist takes over the conversation"; as_tool() means "the coordinator stays in control and uses the specialist like a function". The first is appropriate when the specialist has to interact with the user directly; the second when the coordinator is composing specialist outputs. Knowing which one to use is downstream of the pattern-selection discipline this course teaches.

Connection to the worked example. The customer-support Worker (Maya's Tier-1 Support agent) uses Agent + @function_tool (for lookup, refund, and escalation) + Runner.run() (inside a FastAPI handler). That is the single agent + ReAct + tools pattern, exactly the one Concept 10 walks through in SDK detail. Maya's implementation is one of this course's five patterns; the other four are the variations you move to when the task properties change.

Bottom line for Concept 8.5: SDK primitives are the building blocks of all five patterns. Agent is the atomic unit; Runner.run() runs the loop; @function_tool exposes Python functions as tools; handoff() and as_tool() compose agents into multi-agent systems; output_guardrail implements reflection. The pattern → primitive mapping makes this course's architectural choices concrete: pattern selection is not abstract; it is a choice of which SDK primitives to compose, and how.

Concept 8.6: Operational envelope considerations for each pattern (concrete example: Inngest)

📖 Standalone-reader note. This Concept is about the operational consequences of pattern choice, not about teaching Inngest. The architectural argument generalizes to any durable-execution platform (Temporal, Restate, Dapr Agents, AWS Step Functions); Inngest is the concrete example because it is what the operational-envelope course teaches. If your platform is different, or you are still at the design stage with no platform chosen, read this for the pattern-architecture argument: the more elaborate the pattern, the more it depends on the operational envelope. Substitute your own platform's primitives for the Inngest ones.

Concept 8.5 mapped patterns to engine primitives (OpenAI Agents SDK). Concept 8.6 maps patterns to operational envelope primitives: the runtime machinery that lets the agent loop survive failures, scale to many concurrent users, and integrate with events from the outside world. The SDK runs the agent loop; the envelope makes the agent loop production-grade. Each pattern uses different envelope primitives, and the more elaborate the pattern, the more it depends on the envelope.

In the Agent Factory track the operational envelope is Inngest. The primitives below are Inngest's; the underlying pattern-architecture argument is general.

The operational-envelope primitives that matter for pattern selection.

| Primitive | What it is | Which patterns use it most |
| --- | --- | --- |
| @inngest_client.create_function | Decorator that registers a function with the durable-execution runtime. The unit of operationally managed work. | All five patterns |
| TriggerEvent, TriggerCron | The trigger surfaces: events fired by the outside world and schedules that wake the function. The agent does not run when you call it; it runs when the world fires a trigger. | All five patterns; cron is most relevant for incident response and batch workflows |
| ctx.step.run(name, fn, ...) | Each call is a durable checkpoint: completed steps return their memoized output on retry; failed steps retry independently. The mechanic underneath production reliability. | Sequential workflow (most direct map), Planning + ReAct (one step.run per stage), Reflection (separate generator/critic steps) |
| ctx.step.wait_for_event(...) | The function suspends durably, consuming zero compute, until a matching event arrives or the timeout fires. The runtime primitive behind HITL gates. | Any pattern that needs human approval; multi-agent (between specialists); reflection (when human judgment is the critic) |
| concurrency, throttle, priority | Per-function flow-control policies. Concurrency caps active runs; throttle caps starts per second; priority orders the queue; per-key concurrency gives multi-tenant fairness. | Multi-agent (most critical: per-specialist limits prevent rate-limit exhaustion); any high-volume single-agent pattern |
| Fan-out triggers | One event wakes N subscribing functions, or a parent fires N child events. The runtime primitive behind parallel specialist execution. | Multi-agent (parallel topology); Planning + ReAct (when stages run in parallel) |
| Replay + dead-letter | Failed runs persist; ship the fix, click replay, and the function resumes from the failed step with the new code. Steps before the failure stay memoized. | All patterns, but the more elaborate the pattern, the more replay matters, because the stakes are higher when a long run fails partway |
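The checkpoint behavior described for ctx.step.run (completed steps return memoized output, only the failed step runs again) can be illustrated without any platform. This is a toy model, not how Inngest is implemented: `StepMemo` is a hypothetical in-memory stand-in, whereas a real durable-execution platform persists step results outside the process so they survive a crash.

```python
from typing import Any, Callable


class StepMemo:
    """Toy step log: remembers the output of every step that completed."""

    def __init__(self) -> None:
        self._completed: dict[str, Any] = {}

    def run(self, name: str, fn: Callable[..., Any], *args: Any) -> Any:
        if name in self._completed:
            return self._completed[name]  # memoized: fn is not called again
        result = fn(*args)                # an exception leaves the step unrecorded
        self._completed[name] = result
        return result


calls = {"extract": 0, "store": 0}
failures_left = {"store": 1}  # the store step fails once, then succeeds


def extract(text: str) -> str:
    calls["extract"] += 1
    return text.upper()


def store(value: str) -> str:
    calls["store"] += 1
    if failures_left["store"] > 0:
        failures_left["store"] -= 1
        raise RuntimeError("database unavailable")
    return f"stored:{value}"


def workflow(memo: StepMemo, text: str) -> str:
    extracted = memo.run("extract", extract, text)
    return memo.run("store", store, extracted)


memo = StepMemo()
try:
    workflow(memo, "invoice 42")       # first attempt crashes in the store step
except RuntimeError:
    pass
result = workflow(memo, "invoice 42")  # retry: extract is served from the memo
print(result, calls)
```

If `extract` were an LLM call, the retry would not pay for it a second time, which is the saving the table attributes to step-level checkpoints.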

Pattern → primitive mapping at a glance.

Sequential workflow:
    @inngest_client.create_function(trigger=TriggerEvent(...))
    async def workflow(ctx):
        a = await ctx.step.run("extract", extractor_agent.run, ...)
        b = await ctx.step.run("validate", validate, a)
        c = await ctx.step.run("store", db.insert, b)
        await ctx.step.run("notify", notifier_agent.run, ...)
        # Each step independently checkpointed; failure → memoized resume

Single agent + ReAct + tools:
    @inngest_client.create_function(
        trigger=TriggerEvent(event="customer/email.received"),
        concurrency=[Concurrency(limit=10, key="event.data.customer_id")],
    )
    async def support(ctx):
        result = await ctx.step.run("agent-loop", Runner.run, support_agent, ctx.event.data["query"])
        # If agent needs HITL escalation, use step.wait_for_event inside the agent's tool
        return result.final_output

Planning + ReAct execution:
    @inngest_client.create_function(trigger=TriggerEvent(event="research/started"))
    async def planning(ctx):
        plan = await ctx.step.run("plan", Runner.run, planner, ctx.event.data["task"])
        results = {}
        for stage in plan.final_output.stages:
            # Each stage = one step.run. Crash mid-stage → only that stage retries.
            results[stage.id] = await ctx.step.run(f"stage-{stage.id}", Runner.run, stage.agent, ...)
        return await ctx.step.run("synthesize", Runner.run, synthesizer, results)

Single agent + reflection:
    @inngest_client.create_function(trigger=TriggerEvent(...))
    async def reflective(ctx):
        output = await ctx.step.run("generate", Runner.run, generator, ctx.event.data["task"])
        critique = await ctx.step.run("critique", Runner.run, critic, output)
        if not critique.final_output.is_safe:
            output = await ctx.step.run("refine", Runner.run, generator, refine_prompt(output, critique))
        return output

Multi-agent specialist system:
    # Coordinator triggers fan-out of specialist events
    @inngest_client.create_function(trigger=TriggerEvent(event="research/landscape.requested"))
    async def coordinator(ctx):
        plan = await ctx.step.run("plan", Runner.run, planner, ctx.event.data["topic"])
        await ctx.step.run("fan-out", fan_out_specialist_events, plan.final_output.competitors)

    # Each specialist runs independently as its own function:
    @inngest_client.create_function(
        trigger=TriggerEvent(event="research/competitor.research"),
        concurrency=[Concurrency(limit=5, key="event.data.tenant_id")],  # per-tenant cap
    )
    async def competitor_research(ctx):
        return await ctx.step.run("research", Runner.run, researcher, ctx.event.data["target"])

The sidebars in Part 3 show these mappings in an explicit operational-envelope section for each pattern.

Why this mapping matters for pattern selection. Two production failure modes are invisible at the architecture-diagram level but bite hard in production:

  1. Crash mid-flight. If a six-step planning + ReAct execution crashes at step 4 (without durable execution), you pay for the first three steps again. The operational-envelope course quantifies this: at GPT-5-class pricing, a multi-stage agent flow can re-pay roughly $0.10-$2.00 on every crashed run. At 1000 runs/day that is about $30-$600/month in work lost to crashes alone (figures that imply a crash rate of roughly 1%, about 300 crashed runs a month). Sequential workflows survive crashes cheaply because the retries are short; multi-agent + reflection systems survive crashes expensively because the retries are long. The more elaborate the pattern, the more the operational envelope's step.run memoization is worth in dollars.
  2. Coordination at scale. A multi-agent system with five specialists, ten tenants, and bursts of 100 events/minute will exhaust its rate limits without per-specialist concurrency caps. The operational envelope turns this into one line: concurrency=[Concurrency(limit=5, key="event.data.tenant_id")]. This course's decision tree chooses the pattern; the operational envelope's flow-control primitives keep the chosen pattern healthy at scale.
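What a per-key concurrency cap does can be sketched with the standard library alone. This is an illustration of the idea behind the `Concurrency(limit=..., key=...)` setting, not of Inngest's implementation; `PerKeyLimiter` and `fake_specialist` are hypothetical names, and the sleep stands in for a rate-limited model call.

```python
import asyncio
from collections import defaultdict


class PerKeyLimiter:
    """Caps how many jobs run at once for each key (for example, each tenant)."""

    def __init__(self, limit: int) -> None:
        self._limit = limit
        self._semaphores: dict[str, asyncio.Semaphore] = {}

    async def run(self, key: str, fn, *args):
        if key not in self._semaphores:
            self._semaphores[key] = asyncio.Semaphore(self._limit)
        async with self._semaphores[key]:
            return await fn(*args)


active: dict[str, int] = defaultdict(int)
peak: dict[str, int] = defaultdict(int)


async def fake_specialist(tenant: str) -> None:
    active[tenant] += 1
    peak[tenant] = max(peak[tenant], active[tenant])
    await asyncio.sleep(0.01)  # stand-in for a rate-limited LLM call
    active[tenant] -= 1


async def main() -> dict[str, int]:
    limiter = PerKeyLimiter(limit=2)
    tenants = ["a"] * 6 + ["b"] * 3
    await asyncio.gather(*(limiter.run(t, fake_specialist, t) for t in tenants))
    return dict(peak)


peaks = asyncio.run(main())
print(peaks)
```

Tenant "a" submits twice as many jobs as tenant "b", yet neither ever has more than two in flight, so a burst from one tenant cannot consume the whole rate limit.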

Deployment composition. The operational envelope (Inngest) and your cloud deployment compose; they do not compete. The cloud deployment course teaches the cloud topology: ACA + Neon + R2 + Cloudflare Sandbox + Phoenix. The operational-envelope course teaches the layer that wraps the SDK runner inside that topology. A real production system uses both: Inngest functions deployed on ACA, running Runner.run() calls inside step.run() blocks, with Neon storing agent traces and the sandbox executing tool code. The deployment-composition sidebars in Part 3 name both layers explicitly.

Eval composition. Inngest's structured trace (each step's input, output, retry count, and latency) flows into Phoenix the same way the SDK's agent trace does, through OpenTelemetry. The eval suite's failure-detection patterns (trace-length anomalies, plan-execution divergence, rubber-stamping) all work on Inngest-instrumented runs; adding the operational envelope does not change the eval suite.

Bottom line for Concept 8.6: the operational envelope (Inngest) is the production substrate for all five patterns. Triggers wake the function; step.run makes it durable; step.wait_for_event implements HITL gates; concurrency, throttle, and priority shape it under load; fan-out coordinates multi-agent specialists; replay handles bug-fix recovery. The more elaborate the pattern, the more the envelope is worth: sequential workflows can survive without it; multi-agent + reflection systems need it. The envelope composes with your cloud deployment and eval suite, not as an alternative to them but as a parallel layer of the production architecture.
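The crash-cost figures quoted in this concept (roughly $0.10-$2.00 re-paid per crashed run, about $30-$600/month at 1000 runs/day) can be reproduced with a few lines of arithmetic. The sketch below assumes a 30-day month and a 1% crash rate; the crash rate is an inference that makes the quoted numbers line up, not a figure stated by the course, and `monthly_crash_rework_cost` is a hypothetical helper.

```python
def monthly_crash_rework_cost(
    runs_per_day: int,
    crash_rate: float,
    rework_cost_per_crash: float,
    days_per_month: int = 30,
) -> float:
    """Dollars per month spent redoing work that was lost to crashes."""
    crashed_runs = runs_per_day * days_per_month * crash_rate
    return crashed_runs * rework_cost_per_crash


# Assumed 1% crash rate; per-crash rework cost from the quoted range.
low = monthly_crash_rework_cost(1000, 0.01, 0.10)
high = monthly_crash_rework_cost(1000, 0.01, 2.00)
print(f"${low:.0f} to ${high:.0f} per month")
```

Plugging in your own crash rate and per-stage cost gives a quick estimate of what step-level memoization saves for a given pattern.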

Three layers, side by side. Concepts 8.5 and 8.6 together establish that any production agentic pattern is a composition of three layers: the operational envelope (Inngest), the engine (OpenAI Agents SDK), and the cloud deployment. The world fires triggers at the top (customer emails, webhooks from billing or Slack or a CRM, a cron schedule, fan-out events from other Workers, human approvals); those triggers flow down through the three layers. The diagram below maps each layer's primitives to the work they do. Come back to it whenever the operational-envelope sidebars in Part 3 feel abstract.

Three stacked layers of a production agentic pattern. At the top, THE WORLD fires triggers: customer emails, webhooks from billing, Slack, or a CRM, a cron schedule, fan-out events, and human approvals. These flow down through three layers. Layer 1, the Operational Envelope (Inngest), is the nervous system that wakes the function, survives crashes, limits load, and coordinates human-in-the-loop, through TriggerEvent and TriggerCron, ctx.step.run, concurrency, throttle, and fan-out controls, step.wait_for_event, and replay. Layer 2, the Engine (OpenAI Agents SDK), is the agent loop itself, the atomic unit that patterns compose, wrapped by the envelope's step.run, through Agent, Runner.run(), the function_tool decorator, handoff() and as_tool(), and output_guardrail. Layer 3, Cloud Deployment, is where the envelope and engine actually run, reachable by real users, through FastAPI on Azure Container Apps, Neon Postgres, R2 plus sandbox, Phoenix plus OpenTelemetry, and the eval suite. A footer notes that a production agentic pattern composes all three layers, and that the more elaborate the pattern, such as multi-agent with reflection, the more critical the operational envelope becomes.

Takeaway: the three layers stack. Inngest (envelope) wraps the SDK (engine), and both run inside the cloud deployment. This course chooses the pattern; the three layers turn the chosen pattern into production reality. All five patterns in Part 3 are compositions of these three layers; what differs from pattern to pattern is which primitives of each layer are used. The more elaborate the pattern (multi-agent with reflection), the more critical the operational-envelope layer, because coordination, durability, and HITL stop being optional.

Try it with AI, after Part 2. You now have five questions. Before reading about the patterns in depth, use them on something real. Open your Claude Code or OpenCode session and paste:

"I am learning to choose agentic architectures. Pick a real task from my actual work that I could build an agent for. Have me describe it, then walk me through the five questions: Q1 (is the solution path known?), Q2 (is the workflow fixed and stable?), Q3 (is the task structure articulable?), Q4 (does quality matter more than speed, with checkable criteria?), Q5 (is specialization, context, or scale a bottleneck?). Push back when my answer is vague or when I am heading toward a more elaborate pattern than the task needs. At the end, tell me which starting pattern my answers point to."

What you are learning. The five questions become reflexes once you run them on a task you actually care about. Doing this once, out loud, with something that pushes back on weak answers, is worth more than reading the next ten pages.


Part 3: The five patterns in depth

Part 2 walked the decision tree at the question level. Part 3 walks it at the pattern level. For each of the five terminal patterns: what the pattern is, what its characteristic implementation looks like, what it means for your deployment topology, and what your eval suite watches to tell you the pattern has been misapplied.

Deployment-and-eval composition is what this course adds on top. Few courses on agentic patterns teach this layer, because it needs the deployment and eval courses as a foundation. If you have not taken those courses, treat the sidebars as previews of what is coming; if you have, the composition is what makes pattern selection operational.

Before walking through the patterns one by one, here is the matrix that summarizes the whole part. Each pattern uses a different subset of the cloud stack; the differences in deployment cost are real and substantial. Come back to it as Concepts 9-13 walk through each pattern in detail.

The matrix maps each pattern (columns) against the cloud deployment components it needs (rows). A check means needed; a cross means not needed; a tilde means conditional.

Pattern-by-deployment-component matrix. The columns are the five patterns: sequential workflow, single agent with ReAct, planning with ReAct, the optional reflection layer, and multi-agent specialist. The rows are deployment components: FastAPI on Azure Container Apps, Neon Postgres, Cloudflare R2, Cloudflare Sandbox, bridge Worker, background-worker pattern, Phoenix observability, and multi-provider model routing, plus a relative-cost row. Cells use a check for needed, a cross for not needed, and a tilde for conditional. Sequential workflow needs the smallest subset (no sandbox, no bridge Worker) at a baseline cost of 1x. Single agent with ReAct is roughly 3 to 10x; planning with ReAct is 5 to 15x (it adds a plan table and a mandatory background worker); the reflection layer adds 2 to 3x on top of its core and may add multi-provider routing; multi-agent specialist is the largest at 5 to 20x, with per-specialist state, bridge Worker, and tracing. The cost figures are illustrative.

The cost discipline encoded in pattern selection: a multi-agent system with reflection can cost many times more than a sequential workflow for the same task volume (an illustrative ratio, on the order of tens of times, not a measured benchmark). A sequential workflow skips the sandbox and bridge-Worker tiers entirely and so avoids a large share of the infrastructure; reaching for ReAct or multi-agent without justification means paying for capability the task does not need.
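The "tens of times" ratio follows from the matrix's illustrative ranges. The sketch below assumes the reflection layer's 2 to 3x acts as a multiplier on the cost of whatever core pattern it wraps; that reading, and the names `CORE_COST` and `with_reflection`, are assumptions made for this example, and the numbers are the matrix's illustrative figures rather than benchmarks.

```python
# Relative cost ranges (low, high) versus a sequential-workflow baseline of 1x.
CORE_COST: dict[str, tuple[int, int]] = {
    "sequential workflow": (1, 1),
    "single agent + ReAct": (3, 10),
    "planning + ReAct": (5, 15),
    "multi-agent specialist": (5, 20),
}
REFLECTION_MULTIPLIER: tuple[int, int] = (2, 3)


def with_reflection(core_pattern: str) -> tuple[int, int]:
    """Cost range of a core pattern once a reflection layer is added on top."""
    low, high = CORE_COST[core_pattern]
    r_low, r_high = REFLECTION_MULTIPLIER
    return (low * r_low, high * r_high)


for pattern in CORE_COST:
    print(pattern, "+ reflection:", with_reflection(pattern))
```

Under these assumptions, multi-agent with reflection lands between 10x and 60x the sequential baseline, which is where the "order of tens of times" comes from.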

Takeaway: sequential workflow gets two clear "not needed" markers (sandbox and bridge Worker), which translates into meaningfully less infrastructure than the agentic patterns. Multi-agent gets the most expansion markers (per-specialist tracing, per-specialist bridge-Worker config). The matrix makes the decision tree's cost discipline visible.

Concept 9: Sequential workflow: characteristic shape, deployment, eval signals

What it is. A fixed pipeline of steps in which each step's output feeds the next. The path is known and stable (Q1=yes, Q2=yes). LLM calls are reserved for the steps that really need interpretation or generation (extraction, summarization, classification), not for deciding what the next step is.

Characteristic implementation in the OpenAI Agents SDK:

from agents import Agent, Runner
from pydantic import BaseModel

# validate_against_db, db, and email are application helpers defined elsewhere (not shown here).

class Invoice(BaseModel):
    vendor: str
    requester: str  # who to notify; read in step 4
    amount_cents: int
    due_date: str
    line_items: list[dict]

class NotificationMessage(BaseModel):
    subject: str
    body: str

class ProcessingResult(BaseModel):
    # Outcome returned to the caller of the workflow.
    status: str
    reason: str | None = None
    record_id: int | str | None = None

# Two narrow agents: each does ONE LLM-step in the workflow.
# Notice: no tools, no agentic loop. Just structured-output extraction.
extractor = Agent(
    name="invoice_extractor",
    instructions="Extract structured invoice fields from the email body. Be strict about field types.",
    output_type=Invoice,
)

notifier = Agent(
    name="notification_writer",
    instructions="Write a brief notification message to the requester, referencing the invoice details.",
    output_type=NotificationMessage,
)

async def invoice_intake_workflow(email_content: str) -> ProcessingResult:
    # Step 1: extraction (SDK Agent with structured output)
    extraction = await Runner.run(extractor, email_content)
    invoice: Invoice = extraction.final_output

    # Step 2: validation (plain Python, no LLM)
    validation = validate_against_db(invoice)
    if not validation.ok:
        return ProcessingResult(status="rejected", reason=validation.reason)

    # Step 3: store (plain Python, no LLM)
    record_id = db.insert(invoice)

    # Step 4: notify (SDK Agent with structured output)
    notif = await Runner.run(notifier, f"Invoice {record_id} from {invoice.vendor} stored. Notify {invoice.requester}.")
    email.send(invoice.requester, notif.final_output.subject, notif.final_output.body)

    return ProcessingResult(status="completed", record_id=record_id)

Notice the SDK shape: two narrow Agent instances, each doing one LLM-only job (extraction, notification writing). Each agent has structured output through output_type=, not free-form text parsing. Runner.run() is called twice, once per LLM-step. No tools, no @function_tool decorators, no handoffs, because the workflow does not need agentic reasoning, only LLM calls embedded in plain Python.

An SDK insight worth internalizing: not every use of Agent is "agentic". An Agent with output_type= and no tools is the SDK's idiomatic way to make "an LLM call with a typed response", which is exactly what the interpretation steps of a sequential workflow need. You are using the SDK, not the agent loop.

Deployment composition. Sequential workflows use the smallest subset of the cloud stack:

  • SDK primitives used: Agent (with output_type= for structured extraction/generation) and Runner.run() for each LLM-step. No @function_tool, no handoff(), no as_tool(), no output_guardrail. The agent loop is unused: Runner.run() returns after one LLM call because the agent has no tools.
  • FastAPI harness on Azure Container Apps: yes, you still need an HTTP service to receive requests.
  • Neon Postgres for durable state: yes, for workflow record-keeping and idempotency.
  • OpenAI API for the LLM calls: yes, but only for the specific steps that need them.
  • Cloudflare R2 for files: maybe, only if the workflow handles file artifacts.
  • Cloudflare Sandbox for execution: no. Sequential workflows do not run agent-generated code; they run deterministic code with embedded LLM calls. The sandbox layer (and the bridge Worker) is not needed.

The most under-appreciated finding about sequential workflows is that they do not need most of the cloud-deployment course's deployment complexity at all. If your task fits a sequential workflow, you can ship on a FastAPI + Postgres + OpenAI stack and skip the sandbox infrastructure entirely. The cost saving is meaningfully less infrastructure than a full agentic deployment, because you skip the sandbox and bridge-Worker tiers completely. Do not pay for capability the pattern does not need.

Eval signals. What the eval suite watches for in sequential workflows:

| Failure mode | How the eval catches it |
| --- | --- |
| Extraction step misreads the input | Output schema validation fails; DeepEval catches the structured-output mismatch |
| Validation logic has a gap | A production case slips through; the trace shows a valid-but-wrong record reaching storage |
| Notification message is off in tone or facts | A Phoenix inline evaluator on the generated message catches it; the case is promoted to the golden dataset |
| Workflow handles a case it was not designed for | The DeepEval test suite includes "edge case inputs"; failures expose the workflow's assumption boundary |

Key insight: sequential workflow evals are about step-level correctness, not agent reasoning quality. You test each LLM-using step independently (does extraction return the right schema? does generation produce the right tone?). You test the workflow's branching points (does validation catch the cases it should?). You do not need to test whether "the agent chose the right path", because the path is fixed.
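
A minimal sketch of what step-level checks can look like in plain Python. Everything named here is hypothetical and for illustration only: the Invoice fields, the KNOWN_VENDORS set (a stand-in for a database lookup), and the two check functions are assumptions, not the course's actual models or the DeepEval API.

```python
from dataclasses import dataclass


# Hypothetical invoice shape; the course's real Invoice model may differ.
@dataclass
class Invoice:
    vendor: str
    amount_cents: int
    requester: str


@dataclass
class Validation:
    ok: bool
    reason: str = ""


KNOWN_VENDORS = {"Acme Corp", "Globex"}  # stand-in for a database lookup


def validate_invoice(invoice: Invoice) -> Validation:
    """Deterministic branching point: no LLM involved, so it is unit-testable."""
    if invoice.vendor not in KNOWN_VENDORS:
        return Validation(ok=False, reason="unknown vendor")
    if invoice.amount_cents <= 0:
        return Validation(ok=False, reason="non-positive amount")
    return Validation(ok=True)


def check_extraction_schema(raw: dict) -> list[str]:
    """Step-level check on one extraction output: list its schema problems."""
    problems = []
    for name, expected in (("vendor", str), ("amount_cents", int), ("requester", str)):
        if name not in raw:
            problems.append(f"missing field: {name}")
        elif not isinstance(raw[name], expected):
            problems.append(f"wrong type for {name}")
    return problems
```

Each check targets one step in isolation, which is the point: a failure names the step that broke rather than the run that failed.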

Where teams go wrong in production. Treating LLM-embedded workflows as agentic. Teams add agent-loop observability, such as tool-call tracing and reasoning-step inspection, to workflows that have neither tool calls nor reasoning steps. All you need is standard request/response tracing plus per-step structured-output validation. Phoenix's agent-reasoning dashboards are overkill here; App Insights' standard request tracing is the right level.

Operational envelope. The sequential workflow is the most direct fit for Inngest's durable-execution model. The pattern's structure (fixed steps, each step a potential failure point, deterministic dependencies) is exactly what Inngest functions were built for.

  • Inngest primitives used: @inngest_client.create_function to register the workflow; TriggerEvent or TriggerCron as the wake signal; one ctx.step.run("step-name", fn, args) per workflow step. No step.wait_for_event (a routine workflow needs no HITL), no fan-out (the workflow is linear), no complex flow control.
  • 1:1 mapping: each step of the sequential workflow becomes one ctx.step.run call in the Inngest function. The four-step invoice intake from Concept 9's code (extract → validate → store → notify) becomes four step.run calls. On a crash at step 3, steps 1-2 return their memoized output and only step 3 retries.
  • Cost benefit: at $0.001-$0.05 per LLM call, a workflow that crashes at step 5 without memoization pays for steps 1-4 again. With memoization, only step 5 retries. The operational-envelope course quantifies this; the savings compound as workflows get longer.
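
The memoization behavior described above can be illustrated with a toy, in-memory step runner. This is not the Inngest API: StepRunner, invoice_workflow, and the fail_store flag are hypothetical names invented for this sketch, which only shows why a retry re-executes the failed step and nothing before it.

```python
from typing import Any, Callable


class StepRunner:
    """Toy stand-in for a durable step runner: remembers each step's output by name."""

    def __init__(self) -> None:
        self.memo: dict[str, Any] = {}
        self.executions: list[str] = []  # every attempt that actually ran

    def run(self, name: str, fn: Callable[[], Any]) -> Any:
        if name in self.memo:  # completed on an earlier attempt: reuse it
            return self.memo[name]
        self.executions.append(name)
        result = fn()  # may raise; nothing is memoized on failure
        self.memo[name] = result
        return result


def invoice_workflow(runner: StepRunner, fail_store: bool) -> str:
    invoice = runner.run("extract", lambda: {"vendor": "Acme Corp"})
    runner.run("validate", lambda: True)

    def store() -> int:
        if fail_store:
            raise RuntimeError("transient database error")
        return 42

    record_id = runner.run("store", store)
    return runner.run("notify", lambda: f"stored {record_id} for {invoice['vendor']}")


runner = StepRunner()
try:
    invoice_workflow(runner, fail_store=True)  # first attempt crashes at "store"
except RuntimeError:
    pass
result = invoice_workflow(runner, fail_store=False)  # retry reuses the same memo
```

After the retry, runner.executions shows "extract" and "validate" once each and "store" twice: the LLM-backed extraction step is not paid for a second time.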

Sequential workflow plus Inngest is the simplest production-ready deployment in the curriculum. Many real workflows that get labeled "agentic systems" should actually be Inngest functions with step.run checkpoints. Q1 of the decision tree ("is the path known?") is essentially asking whether you should use Inngest without an agent loop.

*Bottom line for Concept 9: the sequential workflow is the right pattern when the path is known and stable. It uses the smallest subset of the cloud stack (no sandbox needed), reserves LLM calls for interpretation-only steps, and is evaluated at the step level rather than the agent-reasoning level. The most common production mistake is over-instrumenting workflows with agent-grade observability they do not need.*

Concept 10: Single agent + ReAct + tools, characteristic shape, deployment, eval signals

What it is. The agent alternates between reasoning about its current state and acting (a tool call), observes the result, and repeats. The path is unknown (Q1=no) and the structure is not articulable (Q3=no). The defining property: the agent decides what to do next based on what it has just observed.

The characteristic implementation in the OpenAI Agents SDK:

from agents import Agent, Runner, function_tool

# Tools: plain async Python functions, exposed to the agent via the decorator.
# Type hints and docstrings become the tool's schema automatically.
@function_tool
async def lookup_account(account_id: str) -> dict:
    """Look up an account's current state including balance, plan, and billing status."""
    return await db.accounts.find_by_id(account_id)

@function_tool
async def lookup_transactions(account_id: str, since_days: int = 90) -> list[dict]:
    """Return recent transactions for an account; defaults to last 90 days."""
    return await db.transactions.find(account_id=account_id, since=since_days)

@function_tool
async def issue_refund(transaction_id: str, amount_cents: int, reason: str) -> dict:
    """Issue a refund. Fails if amount exceeds agent's authority ($500). Returns refund_id."""
    return await refund_service.create(transaction_id, amount_cents, reason)

@function_tool
async def escalate_to_human(reason: str, context: dict) -> str:
    """Hand the case to a human reviewer. Returns the escalation ticket id."""
    return await escalation_service.create_ticket(reason, context)

# One Agent with all the tools. The SDK runs the reason-act-observe loop.
support_agent = Agent(
    name="tier1_support",
    instructions=(
        "You are a Tier-1 customer support agent. Investigate the customer's issue "
        "using your tools. Issue refunds only when policy clearly allows and the "
        "amount is under $500. Escalate any ambiguous case. If you cannot determine "
        "the right action within 3 lookups, escalate. State when you are done."
    ),
    tools=[lookup_account, lookup_transactions, issue_refund, escalate_to_human],
)

# The FastAPI handler: exactly the customer-support Worker's shape.
async def handle_support_request(customer_id: str, query: str) -> str:
    result = await Runner.run(
        support_agent,
        input=f"Customer {customer_id} asks: {query}",
        max_turns=25,  # explicit step budget: non-optional in production
    )
    return result.final_output

Note the SDK shape: one Agent with multiple tools, called through Runner.run(). The SDK runs the reason-act-observe loop internally; you do not write for step in range(max_steps): response = llm.chat(...); for tool_call in response.tool_calls: ... yourself. The max_turns parameter is the step budget; when it is hit, the SDK raises MaxTurnsExceeded.

The SDK insight worth internalizing: the canonical ReAct loop is a single Runner.run() call. The complexity lives in the tool definitions and agent instructions; the loop machinery is the SDK's responsibility. This is exactly the pattern behind Maya's Tier-1 Support agent, the customer-support Worker.

Deployment composition. Single-agent ReAct uses more of the cloud stack:

  • SDK primitives used: Agent (with tools= and instructions=), the @function_tool decorator on every Python function exposed as a tool, and Runner.run(agent, input, max_turns=N) for the agentic loop. This is the canonical SDK shape, exactly what the customer-support Worker deploys. No handoff() or as_tool() (those are multi-agent primitives); no output_guardrail (that is reflection).
  • FastAPI harness on Azure Container Apps: yes, for the HTTP service.
  • Neon Postgres for durable state: yes, for sessions, runs, and traces. This is critical because the agent's reasoning trace is the primary debugging artifact.
  • Cloudflare R2 for files: yes, if the agent handles file inputs or outputs.
  • Cloudflare Sandbox for execution: yes, if the agent has code-executing tools. When the agent runs apply_patch, shell commands, or arbitrary Python, that code goes to the sandbox. The bridge Worker is required.
  • Background worker pattern: yes, because ReAct loops can take 30+ seconds and should not block the HTTP request.

Eval signals. ReAct's failure modes are reasoning-level, so the eval signals are reasoning-level too:

| Failure mode | How the eval catches it |
| --- | --- |
| Agent loops, revisiting solved work | Trace-length anomaly: the same tool called repeatedly with similar arguments. Phoenix flags it |
| Agent invokes nonexistent tools (hallucinated tools) | Tool-call validation in the SDK; the structured trace shows the invalid call; a CI eval with DeepEval catches it |
| Agent gives up before solving (premature termination) | Final output compared against expected behavior; the trace shows few steps; DeepEval catches it |
| Agent's reasoning diverges from its actions | Phoenix tool-correctness evaluator: does the agent's stated reason match the tool it called? |
| Tool-call latency cascades (every step is slow) | OTel timing shows aggregate runtime exceeding the latency budget |

Key insight: ReAct evals must capture the reasoning trace, not just input and output. The trace is the data. If you only check whether the agent gave the right answer, you miss the cases where it reached the right answer through lucky tool calls, and the cases where it could have given the right answer but one bad decision prevented it. Phoenix's inline trace evaluators are the load-bearing observability layer for ReAct.
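
As an illustration of a trace-level signal, here is a rough sketch of the looping check. It assumes a hypothetical trace shape (a list of {"tool": ..., "args": ...} records) and matches only identical arguments, which is stricter than "similar"; real Phoenix or OTel spans would need adapting, and find_repeated_tool_calls is an invented helper, not a Phoenix or DeepEval function.

```python
import json
from collections import Counter


def find_repeated_tool_calls(trace: list[dict], threshold: int = 3) -> list[tuple[str, int]]:
    """Return (tool_name, count) for tool calls repeated with identical arguments.

    `trace` is a hypothetical list of {"tool": str, "args": dict} records.
    """
    counts = Counter(
        # Serialize args with sorted keys so key order does not hide a repeat.
        (call["tool"], json.dumps(call["args"], sort_keys=True))
        for call in trace
    )
    return sorted((tool, n) for (tool, _args), n in counts.items() if n >= threshold)
```

A run that returns a non-empty list is a candidate for review even when its final answer was correct, which is the case an output-only eval would miss.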

Where teams go wrong in production. Letting step budgets default to infinity. A ReAct loop without a step cap will eventually meet an input that makes it loop indefinitely, burning tokens, blocking workers, and exhausting rate limits. Always cap steps explicitly (25 is a reasonable default; some tasks need 50; very few need 100). Hitting the cap is a signal to investigate, not a limit to remove as a workaround.

Operational envelope. Single agent + ReAct wraps cleanly in Inngest, with one structural decision to get right: make the whole agent loop a single step.run, or decompose it into multiple steps?

  • Inngest primitives used: @inngest_client.create_function with an event trigger (TriggerEvent(event="customer/email.received"), Maya's exact setup); ctx.step.run("agent-loop", Runner.run, agent, input) to wrap the SDK's Runner.run() call; concurrency and throttle to protect downstream systems; optionally ctx.step.wait_for_event to implement HITL behind the escalation tool.
  • Structural choice: one step.run for the whole agent loop is the standard pattern. The SDK runs the reason-act-observe loop internally; from Inngest's perspective it is a single durable step. On a mid-loop crash the whole loop retries (the SDK traces from the failed attempt are lost, but the function recovers). The alternative, decomposing so that each tool call is wrapped in its own step.run, gives finer-grained durability but requires lifting the SDK loop out of Runner.run(), which is fragile. Default to one step.run per agent loop unless you have a specific reason to decompose.
  • HITL via wait_for_event: the escalation tool from Concept 10's code becomes an Inngest pattern. When the agent calls escalate_to_human, the tool fires an event (refund/approval.requested) and the function suspends via step.wait_for_event until a human responds. The agent code stays clean: it just calls the tool, and the envelope handles durability.
  • Concurrency caps: concurrency=[Concurrency(limit=10, key="event.data.customer_id")] stops one customer's burst from starving the others. This is the operational envelope's per-key concurrency pattern, applied directly to Maya's deployment.

Maya's Tier-1 Support agent is implicitly this composition: SDK Agent + Runner.run() for the engine, ACA + Neon + R2 + sandbox for deployment, plus the Inngest envelope (when present) for triggers, durability, and flow control. Decision 1 in Part 5 makes the composition explicit.

Bottom line for Concept 10: single-agent ReAct is the right pattern when the path is unknown and the structure is not articulable. It uses more of the cloud stack (a sandbox is required if the agent runs code; a bridge Worker is required for Python harnesses). The eval discipline captures the reasoning trace rather than only the final output: Phoenix is load-bearing observability for ReAct because only trace-level signals catch the characteristic failures (looping, hallucinated tools, premature termination, reasoning-action divergence).

Concept 11: Planning + ReAct execution, characteristic shape, deployment, eval signals

What it is. A two-layer pattern: before execution starts, a planning agent produces an explicit plan (stages with dependencies); within each stage, ReAct + tools does the work. At the step level the path is unknown (Q1=no), but at the stage level the structure is articulable (Q3=yes).

The characteristic implementation in the OpenAI Agents SDK:

from agents import Agent, Runner, function_tool
from pydantic import BaseModel
from typing import Literal

class Stage(BaseModel):
    id: str
    description: str
    agent_role: Literal["researcher", "analyzer", "synthesizer"]
    depends_on: list[str]  # other stage ids
    step_budget: int

class Plan(BaseModel):
    task_summary: str
    stages: list[Stage]
    success_criteria: str

# Planner: an Agent that produces a structured plan, no tools.
planner = Agent(
    name="market_research_planner",
    instructions=(
        "Given a research task, produce a plan with 3-7 stages. Each stage has clear "
        "dependencies and a step budget. Prefer fewer broader stages over many narrow ones."
    ),
    output_type=Plan,
)

# Three execution specialists: each with its own tools and instructions.
researcher = Agent(
    name="researcher",
    instructions="Investigate the assigned topic using your tools. Return a structured brief.",
    tools=[web_search, fetch_url, read_document],
)
analyzer = Agent(
    name="analyzer",
    instructions="Analyze the briefs from researchers. Identify patterns, contradictions, gaps.",
    tools=[compute_metrics, compare_briefs],
)
synthesizer = Agent(
    name="synthesizer",
    instructions="Synthesize the analyzed findings into a coherent report.",
    tools=[draft_report, format_citations],
)

ROLE_TO_AGENT = {"researcher": researcher, "analyzer": analyzer, "synthesizer": synthesizer}

async def planning_then_react(task: str, session_id: str) -> str:
    # Stage 1: Generate the plan via the planner Agent
    plan_result = await Runner.run(planner, task)
    plan: Plan = plan_result.final_output
    await db.runs.persist_plan(session_id, plan)  # cloud deployment: plan persistence

    # Stage 2: Execute each stage via the matching specialist Agent
    stage_results: dict[str, str] = {}
    for stage in topological_order(plan.stages):
        agent = ROLE_TO_AGENT[stage.agent_role]
        stage_input = compose_stage_input(stage, stage_results, task)
        stage_run = await Runner.run(agent, stage_input, max_turns=stage.step_budget)
        stage_results[stage.id] = stage_run.final_output
        await db.runs.persist_stage(session_id, stage.id, stage_run.final_output)

    # Stage 3: Final synthesis via the synthesizer one more time
    final = await Runner.run(
        synthesizer,
        f"Compose the final report. Plan: {plan.model_dump_json()}. Results: {stage_results}",
    )
    return final.final_output

Note the SDK shape: the planner is an Agent with output_type=Plan and no tools (it only produces structured output). Each execution stage calls Runner.run() on the specialist Agent matching the stage's role. The plan is structured with Pydantic, so the SDK validates it at the type level: no JSON-parse-and-hope. Plan persistence goes through the cloud deployment's Neon Postgres runs table (the customer-support Worker wires this up).

The SDK insight worth internalizing: a structured-output Agent and a tool-using Agent are the two halves of planning + ReAct execution. The SDK's output_type= makes plans first-class artifacts; the rest is plain orchestration code around Runner.run() calls.
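
The planning code calls topological_order without defining it. One possible sketch, using the standard library's graphlib, is below. StageSpec is a hypothetical stand-in for the Pydantic Stage model, reduced to the two fields the ordering needs, and the unknown-dependency check is an added assumption: an LLM-produced plan can pass type validation and still reference a stage id that does not exist.

```python
from dataclasses import dataclass, field
from graphlib import TopologicalSorter


@dataclass
class StageSpec:
    """Stand-in for the Pydantic Stage model, reduced to the fields used here."""
    id: str
    depends_on: list[str] = field(default_factory=list)


def topological_order(stages: list[StageSpec]) -> list[StageSpec]:
    """Order stages so each one comes after every stage it depends on.

    Raises ValueError for unknown dependency ids; circular dependencies raise
    graphlib.CycleError, which is itself a ValueError subclass.
    """
    by_id = {stage.id: stage for stage in stages}
    for stage in stages:
        unknown = [dep for dep in stage.depends_on if dep not in by_id]
        if unknown:
            raise ValueError(f"stage {stage.id!r} depends on unknown stages: {unknown}")
    # TopologicalSorter takes a mapping of node -> predecessors.
    sorter = TopologicalSorter({stage.id: stage.depends_on for stage in stages})
    return [by_id[stage_id] for stage_id in sorter.static_order()]
```

Failing fast on a malformed plan, before any execution stage spends tokens, keeps plan-quality problems separate from execution problems.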

Deployment composition. Planning + ReAct uses the same components as single-agent ReAct, plus one extra discipline:

  • SDK primitives used: a planner Agent with output_type=Plan (no tools, structured output only); one execution Agent per role with tools=[...] and @function_tool decorators; Runner.run() called once for the planner and once per stage. Plan persistence lives in the cloud deployment's runs table, not in the SDK itself; the SDK is stateless across Runner.run() calls.
  • All of ReAct's deployment requirements from Concept 10: same harness, sandbox, R2, background worker.
  • Plan persistence in Neon. The plan itself is an artifact worth storing for audit and resumability. A new table, or a schema extension of the runs table, tracks plan_id, plan content, and stage-by-stage progress.
  • Long-running runs are more common. Plans often have 5-10 stages, each potentially running 20-30 ReAct steps. End-to-end runs of 5-10 minutes are normal. The background worker pattern is mandatory, not optional.

Eval signals. Planning + ReAct adds new failure modes beyond pure ReAct:

| Failure mode | How the eval catches it |
| --- | --- |
| Planner produces a plan that execution diverges from | Compare the plan against actual stage execution; flag skipped, reordered, or substantively redefined stages |
| Plan has missing stages (an obvious step is not in the plan) | Compare against golden-dataset plans for similar tasks; DeepEval flags structural divergence |
| Stage handoffs lose context | Inspect each stage's input; if stage N cannot reference a critical output from stage M, the handoff lost information |
| Plan is over-detailed (every stage is a single tool call) | Plan-stage size analysis; if every stage executes in 1-2 ReAct steps, the planning layer is not doing any work |
| Plan is under-detailed (one stage covers a huge scope) | Plan-stage size analysis; if one stage runs 50+ ReAct steps, planning did not actually decompose the task |

Key insight: planning + ReAct evals must measure plan quality and execution quality separately. A good plan with bad execution and a bad plan with good execution look different; conflating them produces false diagnoses. The "plan-execution divergence" signal is the most informative: it indicates the planner is producing structure that is not actually in the task.
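
A rough sketch of how the skipped and reordered parts of the divergence signal could be computed from stage ids alone. The function name and the input shape (two lists of stage ids) are assumptions for illustration; detecting "substantively redefined" stages needs a content comparison that this sketch does not attempt.

```python
def plan_execution_divergence(planned: list[str], executed: list[str]) -> dict[str, list[str]]:
    """Compare planned stage ids with the stage ids that actually ran."""
    executed = list(dict.fromkeys(executed))  # drop retry duplicates, keep first-seen order
    planned_set, executed_set = set(planned), set(executed)

    skipped = [s for s in planned if s not in executed_set]
    unplanned = [s for s in executed if s not in planned_set]

    # Reordering: compare the relative order of stages present in both lists.
    common_planned = [s for s in planned if s in executed_set]
    common_executed = [s for s in executed if s in planned_set]
    reordered = [p for p, e in zip(common_planned, common_executed) if p != e]

    return {"skipped": skipped, "unplanned": unplanned, "reordered": reordered}
```

Logging this dictionary per run gives the periodic review something concrete to aggregate: a stage id that shows up under "skipped" across many runs points at the planner rather than at execution.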

Where teams go wrong in production. Trusting the plan like a contract. The plan is a starting structure; execution within a stage can legitimately discover that the next stage needs different work than planned. Treating divergence as always bad creates rigidity; treating it as always fine erases the value of planning. The right discipline: log every divergence, review divergences periodically for patterns (a recurring divergence means the planner needs improving), and let small in-stage adaptations happen without raising an alarm.

Operational envelope. Planning + ReAct execution is the clearest fit for Inngest's step.run model: each stage maps to one step.run, and the durability benefits compound across a multi-stage run.

  • Inngest primitives used: @inngest_client.create_function for the parent function; one ctx.step.run per stage (step.run("plan", Runner.run, planner, task), then one step.run per execution stage); retries= configured per stage if some stages have non-transient failure modes; concurrency to cap parallel runs.
  • Plan-then-execute mapping: step.run("plan", ...) produces the plan; the function then iterates over plan.stages, calling step.run(f"stage-{stage.id}", ...) for each one. If the function crashes mid-execution (say at stage 4 of 6), Inngest restores the plan and stages 1-3 from memoization; only stage 4 retries. Plan persistence comes for free: Inngest stores the plan as the output of the "plan" step.
  • Cost impact: the savings here are larger than for any other pattern. A planning + ReAct run can take 5-10 minutes and 20-30 tool calls; without durability, a crash at minute 8 means paying for everything again. At GPT-5-class pricing, the operational envelope's memoization can save $0.50-$2.00 per crashed run. For a system with 1000 such runs/day (about 30,000 per month) and a 1-5% crash rate from transient infrastructure issues, that is 300-1,500 crashed runs and roughly $150-$3,000/month in directly saved LLM costs.
  • Parallel stage execution: stages that do not depend on each other can be fanned out with the operational envelope's fan-out pattern (one event per stage, each triggering its own function), which parallelizes execution while preserving per-stage durability.
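
The cost estimate in the list above is simple enough to check by hand. The sketch below works through the arithmetic from the stated inputs (runs per day, crash rate, savings per crashed run); the 30-day month and the function name are assumptions made for the illustration.

```python
def monthly_savings(runs_per_day: int, crash_rate: float, saved_per_crash: float, days: int = 30) -> float:
    """Estimated LLM spend avoided per month by not re-paying for completed steps."""
    crashed_runs = runs_per_day * days * crash_rate
    return crashed_runs * saved_per_crash


# Lower bound: 1% crash rate, $0.50 saved per crashed run.
low = monthly_savings(1000, 0.01, 0.50)
# Upper bound: 5% crash rate, $2.00 saved per crashed run.
high = monthly_savings(1000, 0.05, 2.00)
```

With 1000 runs/day the bounds come out at about $150 and $3,000 per month. The estimate scales linearly in every input, so a team can substitute its own crash rate and per-run cost.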

The "plan persistence in Neon" requirement from Concept 11's deployment composition becomes partially unnecessary inside the Inngest envelope, because Inngest stores the plan as the output of the "plan" step. Neon still tracks the run for audit and OTel observability, but Inngest handles the plan-recovery story, not your application code.

Bottom line for Concept 11: planning + ReAct execution is the right pattern when the structure is articulable but the step-level work requires adaptation. It uses the full ReAct deployment stack plus plan persistence and the background-worker pattern. The eval discipline keeps plan quality separate from execution quality; plan-execution divergence is the most informative signal, indicating that the planner is producing structure that is not actually in the task.

Concept 12: Single agent + reflection, characteristic shape, deployment, eval signals

What it is. A layer on top of any core pattern: the agent produces output, then a critique pass evaluates it against explicit criteria; if defects are found, the agent refines or regenerates. Reflection is justified by Q4 (quality > speed AND checkable criteria).

The characteristic implementation in the OpenAI Agents SDK. The SDK offers two distinct primitives for reflection; choose based on whether you want validation (blocking bad outputs) or refinement (improving borderline outputs).

Flavor 1: output_guardrail for validation-style reflection (the lightweight SDK-native pattern):

from agents import Agent, Runner, output_guardrail, GuardrailFunctionOutput, RunContextWrapper
from pydantic import BaseModel

class SQLReview(BaseModel):
    is_safe: bool
    issues: list[str]
    reasoning: str

# A critic Agent: uses a different model from the generator to avoid blind-spot overlap.
# Note: a non-OpenAI model name needs a provider integration configured in the harness
# (for example the SDK's optional LiteLLM extension); the bare name here is illustrative.
sql_critic = Agent(
    name="sql_critic",
    model="claude-opus-4-5",  # different model family from the generator
    instructions=(
        "Review the SQL query. Check that it parses, hits only allowed tables, "
        "does not use SELECT *, and has appropriate WHERE clauses. Flag any issues."
    ),
    output_type=SQLReview,
)

@output_guardrail
async def critic_guardrail(ctx: RunContextWrapper, agent: Agent, output: str) -> GuardrailFunctionOutput:
    review_result = await Runner.run(sql_critic, output)
    review: SQLReview = review_result.final_output
    return GuardrailFunctionOutput(
        output_info={"issues": review.issues, "reasoning": review.reasoning},
        tripwire_triggered=not review.is_safe,
    )

# The generator Agent: uses output_guardrails to invoke the critic.
sql_generator = Agent(
    name="sql_generator",
    model="gpt-5",  # different model family from the critic
    instructions="Generate a SQL query that answers the user's question.",
    tools=[fetch_schema, list_tables],
    output_guardrails=[critic_guardrail],
)

# When tripwire fires, Runner.run raises OutputGuardrailTripwireTriggered.
# Catch it and decide: retry with critique context, escalate, or fail loudly.

Flavor 2: a separate critic-and-refiner loop for refinement-style reflection (when you want the output fixed, not just blocked):

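# The refinement loop must be able to see flawed drafts, so it uses a copy of the
# generator without the blocking guardrail; otherwise the tripwire would raise
# OutputGuardrailTripwireTriggered before any refinement could happen.
draft_generator = sql_generator.clone(output_guardrails=[])

async def with_reflection(task: str, max_refinements: int = 2) -> str:
    output = (await Runner.run(draft_generator, task)).final_output
    for _ in range(max_refinements):
        critique = (await Runner.run(sql_critic, output)).final_output
        if critique.is_safe and not critique.issues:
            return output
        # Refinement: feed the critique (and the original question) back to the generator
        refine_prompt = (
            f"Question:\n{task}\n\nOriginal query:\n{output}\n\n"
            f"Critic flagged: {critique.issues}\n\nRevise the query."
        )
        output = (await Runner.run(draft_generator, refine_prompt)).final_output
    return output  # max refinements reached; output is best-effort and not re-checked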

Note the two SDK shapes: output_guardrail is the SDK's native pattern for "blocking bad outputs": declarative, tied to the agent definition, and run automatically on every Runner.run(). The separate critic-and-refiner loop is the idiomatic pattern for "improving borderline outputs": more flexible, but you write the orchestration. Both patterns use different models for the critic and the generator. This is the discipline from Concept 7, made concrete on each Agent through the model= parameter.

The SDK insight worth internalizing: reflection is not a separate framework primitive in the SDK; it is a composition of Agent + Agent. The output_guardrail decorator is simply the SDK convention for wiring a second agent into the first agent's output path.

Deployment composition. Reflection layers on top of a core pattern, so its deployment composition depends on what sits underneath:

  • SDK primitives used: output_guardrail for block-bad-outputs reflection (the SDK's native validation primitive); or, for refinement-style reflection, two Agent instances (generator + critic) with a Runner.run() for each. Critical: the critic should use a different model= than the generator: same SDK, different model family.
  • If the core is a sequential workflow, reflection adds 1-2 LLM calls; the deployment does not change structurally.
  • If the core is ReAct + tools, reflection adds 1-2 LLM calls after the agent loop completes; the deployment does not change structurally.
  • If the core is planning + ReAct, reflection often sits between stages (critiquing stage N's output before stage N+1) and on the final synthesis; this adds latency.

A new deployment consideration: model variety. If the critic uses a different model than the generator (Claude critiquing GPT, or vice versa), the harness must support multiple model providers. The cloud deployment course teaches a single-provider deployment; adding reflection often turns multi-provider support into a real need. Plan secrets management and routing accordingly.

Eval signals. Reflection has its own characteristic failure modes:

| Failure mode | How the eval catches it |
| --- | --- |
| Reflection does not change the output (rubber-stamping) | Compare pre-reflection and post-reflection outputs; if they are nearly identical more than 80% of the time, reflection is not doing any work |
| Reflection refines in the wrong direction (output gets worse) | Score pre- and post-reflection outputs against the golden dataset; a net negative impact means the critic is misfiring |
| Critic and generator share blind spots | A/B test: same generator, two different critics (different models or prompts); if the critique content correlates strongly, the critics are not independent |
| Criteria drift over time (the criteria list grows or shrinks ad hoc) | Version-control the criteria list; flag changes that do not correspond to documented decisions |
| Refinement loops exceed the budget | The refinement counter exceeds its threshold; investigate why the critic keeps finding defects the generator cannot fix |

Key insight: reflection evals must measure whether reflection is net-positive, not merely whether it runs. A reflection pass that runs without changing the output is overhead; one that makes outputs worse is harmful. The "rubber-stamp" failure mode is the hardest to detect because the surface metrics look healthy (latency up as expected, errors flat) while the system is not earning its cost.
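
One way to make the rubber-stamp check concrete is to measure how often the pre- and post-reflection outputs are nearly identical. The sketch below uses the standard library's difflib for similarity; the function name, the 0.95 "nearly identical" cutoff, and the input shape (a list of (pre, post) string pairs) are all assumptions chosen for illustration, and character-level similarity is only a crude proxy for "did the critique change anything that matters".

```python
from difflib import SequenceMatcher


def rubber_stamp_rate(pairs: list[tuple[str, str]], similarity: float = 0.95) -> float:
    """Fraction of (pre, post) output pairs that reflection left nearly unchanged."""
    if not pairs:
        return 0.0
    unchanged = sum(
        1
        for pre, post in pairs
        if SequenceMatcher(None, pre, post).ratio() >= similarity
    )
    return unchanged / len(pairs)


def reflection_is_rubber_stamping(pairs: list[tuple[str, str]], max_rate: float = 0.80) -> bool:
    """Apply the 80% rule of thumb from the table to a sample of runs."""
    return rubber_stamp_rate(pairs) > max_rate
```

A high rate does not prove reflection is useless (the generator may simply be good), so it works best alongside the golden-dataset comparison of pre- and post-reflection scores.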

Where teams go wrong in production. Adding reflection because it feels rigorous. Teams add a "generate, then critique" pattern without measuring whether the critique catches anything the generator missed. Months later, the reflection pass has spent $X on extra LLM calls and delivered zero measurable quality improvement. The discipline: measure reflection's net contribution in the first month, and remove it if the contribution falls below your threshold.

Operational envelope. Reflection composes well with Inngest's step model: each pass (generate, critique, refine) becomes its own step.run, and the durability benefit is proportional to how many passes had completed before a failure.

  • Inngest primitives used: two to four ctx.step.run calls per run: step.run("generate", ...), step.run("critique", ...), and 0-2 step.run("refine-N", ...) for refinement attempts. Optionally, when the critic is a human, ctx.step.wait_for_event (the function suspends until the human reviewer fires an approval event, the same HITL-gate primitive the operational envelope provides).
  • Durability win: if the generator step completes successfully (it is the most expensive step, producing the output to be critiqued) and the critic step fails transiently (rate limit, network blip), only the critic step retries. The generator's output is memoized and is not regenerated. The operational envelope's step.run discipline keeps reflection's added latency from compounding into double cost on crashes.
  • HITL reflection. When the evaluation criteria are not checkable by another LLM (Concept 7's "subjective domains" caveat), the right answer is often human reflection. Inngest's step.wait_for_event makes this clean: step.run("generate", ...) → step.run("send-to-reviewer", ...) → step.wait_for_event("await-human-decision", timeout=timedelta(hours=4)) → step.run("act-on-decision", ...). During human review the function is suspended and consumes zero compute. The operational-envelope course walks through the HITL pattern in detail.
  • Cost-per-output discipline for reflection: Inngest's run-level cost tracking (the LLM cost of each step.run) makes measuring reflection's net contribution trivial. A per-run cost comparison (with reflection vs. without) is one Phoenix dashboard query away.

Both SDK flavors of reflection from Concept 12 (output_guardrail vs. a separate critic-and-refiner loop) compose naturally with the Inngest envelope. Choose the SDK flavor by reflection style; the envelope discipline is the same for both.

Bottom line for Concept 12: reflection is the right additive layer when quality matters more than speed AND the criteria are checkable. It layers on top of any core pattern. The eval discipline measures whether reflection is net-positive; rubber-stamping is the most insidious failure mode because the surface metrics look healthy. If reflection is not measurably improving outputs within a month of deployment, remove it.

Concept 13: Multi-agent specialist system, characteristic shape, deployment, eval signals

What it is. Multiple agents with distinct roles collaborate on a task. It is justified by Q5: specialization, context, or scale is the real bottleneck. Pattern composition matters: each specialist's internal architecture can be a sequential workflow, ReAct, or planning + ReAct. Multi-agent is not a replacement for the other patterns; it is a composition of them.

Three SDK-native topologies, each using a different SDK primitive.

Topology 1: coordinator with specialists as tools (the SDK's Agent.as_tool() pattern). The coordinator stays in control; specialists are invoked like function tools.

from agents import Agent, Runner, function_tool

# The plain tools (web_search, fetch_url, draft_document, lint_check, fact_check)
# are assumed to be defined elsewhere and decorated with @function_tool.

# Three specialists, each with its own tools and instructions.
researcher = Agent(name="researcher", instructions="...", tools=[web_search, fetch_url])
writer = Agent(name="writer", instructions="...", tools=[draft_document])
reviewer = Agent(name="reviewer", instructions="...", tools=[lint_check, fact_check])

# The coordinator wraps each specialist with as_tool() and calls it like a function.
coordinator = Agent(
    name="coordinator",
    instructions=(
        "Decompose the task into research, writing, and review phases. "
        "Use the specialist tools in order. Compose their outputs into a final report."
    ),
    tools=[
        researcher.as_tool(tool_name="research_topic", tool_description="Investigate a topic and return a brief"),
        writer.as_tool(tool_name="draft_document", tool_description="Draft a document from research notes"),
        reviewer.as_tool(tool_name="review_document", tool_description="Review a draft and return critique"),
    ],
)

async def coordinator_topology(task: str) -> str:
    # The coordinator holds the conversation for the whole run.
    result = await Runner.run(coordinator, task, max_turns=30)
    return result.final_output

Topology 2: sequential handoff (the SDK's handoff() pattern). Specialists take over the conversation; the SDK passes context between them.

from agents import Agent, Runner, handoff

# Define specialists in reverse order; each one declares which agents it can hand off TO.
# web_search and fetch_url are assumed to be function tools defined elsewhere.
final_reviewer = Agent(name="reviewer", instructions="Review the draft and produce the final output.")
writer = Agent(
    name="writer",
    instructions="Draft from the research. When the draft is ready, hand off to the reviewer.",
    handoffs=[handoff(final_reviewer)],
)
researcher = Agent(
    name="researcher",
    instructions="Investigate the topic. When research is complete, hand off to the writer.",
    tools=[web_search, fetch_url],
    handoffs=[handoff(writer)],
)

async def handoff_topology(task: str) -> str:
    # Start with the researcher; the SDK threads control through handoffs.
    result = await Runner.run(researcher, task, max_turns=50)
    return result.final_output  # produced by whichever agent ended up holding the conversation

Topology 3: parallel specialists composed by a synthesizer. Each specialist runs independently through its own Runner.run() call; the synthesizer composes their outputs.

import asyncio
from agents import Agent, Runner

# One specialist definition, run once per competitor (five competitors means five parallel runs).
# web_search, fetch_url, and read_document are assumed to be function tools defined elsewhere.
competitor_specialist = Agent(
    name="competitor_research",
    instructions="Research one competitor in depth: pricing, product, positioning, recent news.",
    tools=[web_search, fetch_url, read_document],
)
synthesizer = Agent(
    name="synthesizer",
    instructions="Compose competitor briefs into a single comparative landscape report.",
)

async def parallel_topology(competitors: list[str]) -> str:
    # Each specialist run is independent: a separate Runner.run() call with no shared conversation.
    parallel_briefs = await asyncio.gather(*[
        Runner.run(competitor_specialist, f"Research: {c}", max_turns=15)
        for c in competitors
    ])
    briefs_text = "\n\n".join(r.final_output for r in parallel_briefs)
    final = await Runner.run(synthesizer, briefs_text)
    return final.final_output

Note the three SDK primitives:

  • Agent.as_tool() wraps an agent as a callable tool. The coordinator stays in control and calls specialists like functions. Best when the coordinator has to compose outputs and decide the next step.
  • handoff() passes the conversation to another agent. Control transfers, and the SDK manages the context. Best when a specialist has to take over the user-facing interaction.
  • Parallel Runner.run() + asyncio.gather() runs specialists independently: no shared conversation, no handoff. Best when specialists work in isolation and a synthesizer composes their outputs.

The SDK insight worth internalizing: the SDK provides native primitives for multi-agent composition, so you do not hand-roll routing logic. Use as_tool() for hierarchical composition, handoff() for sequential takeover, and parallel Runner.run() for fan-out. Choosing among them is a pattern-selection decision in its own right, and it sits downstream of the task properties that Q5 surfaces.

Deployment composition. Multi-agent systems use the full cloud stack plus one critical additional discipline:

  • SDK primitives used: Agent.as_tool() for hierarchical composition (the coordinator stays in control); handoff() for sequential takeover (a specialist takes over the conversation); parallel Runner.run() + asyncio.gather() for fan-out. Each specialist is its own Agent, with its own tools= list and instructions=. The SDK manages context-passing across handoffs; you do not hand-roll routing.
  • All the requirements of single-agent ReAct, for every specialist (harness, sandbox if needed, R2, background worker).
  • Per-specialist runs and traces in Neon. Each specialist's execution is its own run; the multi-agent system is a parent run that references the child runs. The schema needs parent_run_id and agent_role columns.
  • Routing audit logs. Every routing decision (which specialist? what handoff format?) is logged. Multi-agent failures often manifest as a wrong routing decision or as lost context on handoff; without explicit routing logs, debugging is nearly impossible.
  • Per-specialist cost tracking. In multi-agent systems it is easy to lose track of which specialist burned the tokens. Per-specialist cost attribution keeps runaway costs from hiding inside aggregate metrics.
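The parent/child run schema and per-specialist cost attribution described above can be sketched as follows. This is an illustration under stated assumptions: it uses an in-memory SQLite database as a stand-in for Neon (which is Postgres), and the table name `runs`, the `cost_usd` column, and the sample rows are hypothetical. Only `parent_run_id` and `agent_role` come from the text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for a Neon (Postgres) database
conn.execute("""
    CREATE TABLE runs (
        run_id        TEXT PRIMARY KEY,
        parent_run_id TEXT REFERENCES runs(run_id),  -- NULL for the parent run
        agent_role    TEXT NOT NULL,
        cost_usd      REAL NOT NULL DEFAULT 0        -- hypothetical cost column
    )
""")

# One parent run (the multi-agent system) and three child runs (the specialists).
rows = [
    ("run-1", None, "coordinator", 0.02),
    ("run-1a", "run-1", "researcher", 0.31),
    ("run-1b", "run-1", "writer", 0.08),
    ("run-1c", "run-1", "reviewer", 0.05),
]
conn.executemany("INSERT INTO runs VALUES (?, ?, ?, ?)", rows)

# Per-specialist cost attribution for one parent run.
cost_by_role = dict(conn.execute(
    "SELECT agent_role, SUM(cost_usd) FROM runs "
    "WHERE parent_run_id = ? GROUP BY agent_role",
    ("run-1",),
).fetchall())
top_spender = max(cost_by_role, key=cost_by_role.get)
```

Grouping by `agent_role` under a single `parent_run_id` is what keeps one expensive specialist from disappearing into the aggregate cost of the run.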

Bridge Worker plus specialists. If several specialists each run code, you may need multiple bridge-Worker configurations (different Manifests for different specialists' tooling needs) or a single bridge Worker that routes by specialist identity. Complexity escalates faster than people expect: this is where deployment-topology costs start to dominate.

Eval signals. Multi-agent failures are the hardest to evaluate because they can occur at three layers: inside a specialist, in routing/coordination, or in integration:

Failure mode | How the eval catches it
--- | ---
A specialist produces wrong output | Standard per-agent eval on each specialist's role (treat every specialist as a standalone agent for evaluation)
The coordinator routes to the wrong specialist | Routing-accuracy eval: given the task, did it go to the right specialist? Requires labeled routing examples in the golden dataset
A handoff loses information (specialist B cannot use specialist A's output) | Handoff-completeness eval: did specialist B have the information it needed from A? Manual labels initially; automated once patterns are clear
Integration combines specialists' outputs incorrectly | End-to-end eval against the golden dataset; if specialists pass individually but the integrated output fails, the problem is integration
Specialists disagree without resolution | Inconsistency detector: parallel specialists produce conflicting answers; the aggregator either resolves the conflict explicitly or surfaces it
Coordination overhead exceeds the value of the work | Cost-per-correct-output: if multi-agent costs 3× more than single-agent and the quality improvement is under 20%, the architecture is not earning its overhead
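The cost-per-correct-output row can be turned into a small check. This is a sketch only: the function names are hypothetical, and it assumes "quality improvement" means relative improvement over the single-agent score, which the table does not specify.

```python
def cost_per_correct(total_cost: float, correct_outputs: int) -> float:
    """Total spend divided by the number of outputs that passed the eval."""
    if correct_outputs == 0:
        return float("inf")
    return total_cost / correct_outputs


def earns_overhead(
    single_cost: float,
    single_quality: float,
    multi_cost: float,
    multi_quality: float,
    max_cost_ratio: float = 3.0,     # the 3x threshold from the table
    min_quality_gain: float = 0.20,  # the 20% threshold (assumed relative)
) -> bool:
    """False when multi-agent costs >= 3x and improves quality by < 20%."""
    cost_ratio = multi_cost / single_cost
    quality_gain = (multi_quality - single_quality) / single_quality
    return not (cost_ratio >= max_cost_ratio and quality_gain < min_quality_gain)


# Illustrative numbers, not measurements.
weak_case = earns_overhead(1.0, 0.70, 3.5, 0.77)    # 3.5x cost, ~10% gain
strong_case = earns_overhead(1.0, 0.70, 3.5, 0.91)  # 3.5x cost, ~30% gain
```

The thresholds are parameters on purpose: the 3× and 20% figures are a rule of thumb, and a team may reasonably set different ones for its own cost ceiling.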

Key insight: multi-agent evals need three separate scoreboards: specialist quality, routing accuracy, and integration quality. Conflating them produces meaningless aggregate scores. Each specialist's individual quality can be 95%, routing accuracy 90%, and integration quality 80%, and the end-to-end system still performs at about 68% (the product: 0.95 × 0.90 × 0.80 ≈ 0.68). Without the separation you cannot tell which layer to improve.
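The arithmetic behind the three scoreboards is short enough to write down. A minimal sketch, assuming the three layers fail independently so their scores multiply (which is what the "product" framing implies); the dictionary keys are illustrative names.

```python
import math

# One score per scoreboard, using the figures from the paragraph above.
layer_scores = {
    "specialist_quality": 0.95,
    "routing_accuracy": 0.90,
    "integration_quality": 0.80,
}

# End-to-end success under the independence assumption.
end_to_end = math.prod(layer_scores.values())

# The layer to look at first is the one with the lowest individual score.
weakest_layer = min(layer_scores, key=layer_scores.get)
```

Here `end_to_end` comes out at 0.684 and `weakest_layer` is integration quality, which is information the single aggregate number cannot give you.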

Where teams go wrong here in production: treating the multi-agent system as a single unit. When something fails, the team debugs the whole system instead of localizing the failure to a layer. The solution is to enforce per-specialist tracing and per-handoff logging from day one. Without them, multi-agent debugging is substantially harder and slower than single-agent debugging, often by a large multiple, and that is one of the pattern's biggest hidden costs.

Operational envelope. Multi-agent is the pattern that depends most heavily on the Inngest operational envelope. Almost every envelope primitive plays a role: fan-out for parallel specialists, per-key concurrency for tenant fairness, priority for tier-based queueing, HITL gates between specialists, and replay for partial-failure recovery.

  • Inngest primitives used (the most extensive composition in the curriculum):

    • Fan-out trigger pattern for parallel specialist execution: the coordinator function fires N specialist events; each specialist is its own @inngest_client.create_function with its own TriggerEvent. One event wakes N functions; they run in parallel; Inngest tracks each one independently.
    • step.run inside every specialist function, on every specialist run: the same durability story as single-agent ReAct (Concept 10), multiplied by N.
    • Per-key concurrency caps so that no single tenant monopolizes specialist capacity: concurrency=[Concurrency(limit=5, key="event.data.tenant_id")]. Per-key concurrency is the load-bearing pattern here.
    • Priority expressions for tier-based fairness: Enterprise tenant runs jump ahead of Free tier runs in the queue.
    • step.wait_for_event between specialists when a handoff needs human approval (for example, research → human-vetted research → analysis).
    • Replay for partial-failure recovery: when 3 of 5 specialists fail and 2 succeed, fix the failing specialists' code and replay; the outputs of the 2 successful specialists stay memoized.
  • Coordination-cost insight: as noted above, coordination overhead is multi-agent's biggest hidden cost. Inngest primitives absorb most of that overhead: routing logic becomes events + triggers (not a hand-rolled router); handoff contracts become event schemas (Pydantic models validated by the SDK); integration failures become replay candidates (not lost work); per-specialist cost tracking becomes per-function dashboard metrics.

  • Quantified savings. Without Inngest, a multi-agent system typically needs:

    • Custom routing/dispatch layer (~500-2000 lines of code)
    • Custom retry/dead-letter handler (~200-1000 lines)
    • Custom HITL approval queue with timeouts (~500-1500 lines)
    • Per-tenant rate limiting (~300-800 lines)
    • Custom replay/recovery tooling (~500-2000 lines)

    Together: roughly 2,000-7,300 lines of operational-envelope code that has to be tested, debugged, and maintained. With Inngest this becomes ~50-200 lines of trigger declarations and step.run calls. The total-cost difference compounds across the lifetime of a production multi-agent system.

  • Three-scoreboard observability stays. The per-specialist quality, routing accuracy, and integration quality scoreboards from Concept 13's eval signals still apply; Inngest's structured traces flow into Phoenix via OTel, so the eval discipline does not change.
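To make the per-key concurrency idea concrete without depending on Inngest, here is a minimal asyncio sketch. It is an analogy only: one `asyncio.Semaphore` per tenant plays the role of a per-key cap, and `LIMIT_PER_TENANT`, the tenant names, and `run_specialist` are hypothetical. Inngest enforces its caps in its own queue, not like this.

```python
import asyncio
from collections import defaultdict

LIMIT_PER_TENANT = 2  # hypothetical cap; the text's example uses limit=5


async def main() -> tuple[list[str], dict[str, int]]:
    # One semaphore per tenant key, created lazily on first use.
    semaphores: dict[str, asyncio.Semaphore] = defaultdict(
        lambda: asyncio.Semaphore(LIMIT_PER_TENANT)
    )
    active: dict[str, int] = defaultdict(int)  # currently running, per tenant
    peak: dict[str, int] = defaultdict(int)    # highest concurrency seen, per tenant

    async def run_specialist(tenant_id: str, job: int) -> str:
        async with semaphores[tenant_id]:
            active[tenant_id] += 1
            peak[tenant_id] = max(peak[tenant_id], active[tenant_id])
            await asyncio.sleep(0.01)          # stand-in for specialist work
            active[tenant_id] -= 1
        return f"{tenant_id}:{job}"

    # One busy tenant submits five jobs; a second tenant submits one.
    jobs = [("acme", i) for i in range(5)] + [("globex", 0)]
    results = await asyncio.gather(*(run_specialist(t, j) for t, j in jobs))
    return list(results), dict(peak)


results, peak = asyncio.run(main())
```

The busy tenant never exceeds two concurrent specialists, and the second tenant's job is not queued behind the first tenant's backlog, which is the fairness property the per-key cap is there to provide.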

Cloud deployment's "per-specialist tracing, routing audit logs, cost tracking per specialist" requirement is partially absorbed by Inngest. Application-level traces (Phoenix) are still needed, but audit logs and cost tracking become properties of function runs in the Inngest dashboard. The composition is: Inngest for run-level operational data, Phoenix for trace-level evaluation data, Neon for application-level audit. Three layers, each doing what it does best.

Concept 13 bottom line: multi-agent specialist systems use the full cloud stack plus per-specialist tracing, routing audit logs, and cost-per-specialist tracking. Eval discipline requires three separate scoreboards (specialist quality, routing accuracy, integration quality) because aggregate scores hide which layer failed. Coordination overhead is the most underestimated cost; without rigorous per-specialist instrumentation, debugging is much harder and slower than single-agent debugging.

Try it with AI, after Part 3. You have seen what each pattern costs to deploy and how it fails. Make this concrete for the pattern you will actually reach for. Open your Claude Code or OpenCode session and paste:

"Choose the most likely agentic pattern for my next build (sequential workflow, single agent with ReAct and tools, planning with ReAct, or multi-agent specialist system). Walk me through two things for that pattern. First, the deployment topology: which components does it need (HTTP service, durable state, file storage, sandboxed code execution, background workers, trace observability) and which can it skip? Second, which single failure signal should I watch first in production, and which cheap fix should I try before changing the architecture? Be concrete about my pattern, not generic."

What you are learning. A pattern choice becomes real when you can name its run cost and the signal that tells you it is breaking. This turns deployment-and-eval composition from something you read into something you can sketch on a whiteboard.


Part 4: Failure signals and pattern revision

You have chosen a starting pattern. The system runs. What tells you the pattern was wrong, and what should you do then? Part 4 covers five characteristic failure signals from Bala Priya C's article and maps them to specific eval and observability signals in your eval suite, with targeted fixes that can be tried without abandoning the architecture.

Concept 14: Five failure signals (and what each one means)

The article identifies five runtime symptoms that indicate a pattern-task mismatch. Each has a characteristic shape that you can recognize immediately after seeing it twice.

Signal 1: ReAct loops or revisits solved work. The agent calls the same tool with similar arguments multiple times in a single run. Or it produces partial outputs and then re-derives them from scratch. The pattern is missing structure or stop conditions: the agent has no way of knowing that the work is complete.

Where it shows up in observability: trace-length anomalies (a run took 40 steps when most runs take 15); duplicate-tool-call patterns (the same customer_lookup called five times); reasoning-loop signals ("let me try this again" or an equivalent in the model's reasoning text).

Likely meanings, in order of frequency:

  • The agent's prompt does not define when the work is "done"
  • Tool contracts are loose (multiple tools can plausibly do the same job; the agent oscillates between them)
  • The task really needed planning (Q3 should have been yes)
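The duplicate-tool-call and trace-length signals can be checked directly on a trace. A minimal sketch, assuming a trace is a list of dicts with hypothetical `tool` and `args` keys; it matches exact repeated arguments only, so detecting merely "similar" arguments would need a fuzzier comparison than shown here.

```python
import json
from collections import Counter


def duplicate_tool_calls(trace: list[dict], threshold: int = 3) -> dict:
    """Return (tool, serialized args) pairs that occur at least `threshold` times."""
    counts = Counter(
        (step["tool"], json.dumps(step["args"], sort_keys=True))
        for step in trace
    )
    return {key: n for key, n in counts.items() if n >= threshold}


def is_length_anomaly(step_count: int, typical_steps: int, factor: float = 2.0) -> bool:
    """Flag runs that are much longer than the typical run (factor is an assumption)."""
    return step_count > factor * typical_steps


# Illustrative trace: the same lookup repeated five times, plus one other call.
trace = [{"tool": "customer_lookup", "args": {"id": "c-42"}} for _ in range(5)]
trace.append({"tool": "refund_status", "args": {"order": "o-7"}})

flagged = duplicate_tool_calls(trace)
too_long = is_length_anomaly(step_count=40, typical_steps=15)
```

Both checks are cheap enough to run on every trace, which makes them reasonable first alerts before anyone reads reasoning text by hand.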

Signal 2: the planner makes a plan but execution diverges. The plan says "stage 1: research; stage 2: draft; stage 3: review." Execution does stage 1, jumps to stage 3, then comes back to stage 2. Or execution adds stages the planner never included. The task was less predictable than the planning bet assumed.

Where it shows up in observability: a plan-execution divergence metric (compute the edit distance between planned stages and executed stages); reordering signals (stages run out of dependency order); inserted-stage signals (stages in the execution that are not in the plan).

Likely meanings, in order of frequency:

  • The task's structure is partially articulable, not fully; the planner identifies the major phases correctly but misses adaptive sub-phases (use lightweight planning)
  • The planner's training does not match this task domain (improve the planning prompt with domain examples)
  • The task really has no articulable structure (Q3 should have been no; downgrade to pure ReAct)
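The plan-execution divergence metric mentioned above can be computed as a Levenshtein edit distance over stage names. A minimal sketch, assuming stages are represented as plain strings; the stage names are illustrative.

```python
def stage_edit_distance(planned: list[str], executed: list[str]) -> int:
    """Levenshtein distance between two stage sequences (insert, delete, substitute)."""
    prev = list(range(len(executed) + 1))
    for i, p in enumerate(planned, 1):
        curr = [i]
        for j, e in enumerate(executed, 1):
            curr.append(min(
                prev[j] + 1,             # drop a planned stage
                curr[j - 1] + 1,         # insert an unplanned stage
                prev[j - 1] + (p != e),  # keep (cost 0) or substitute (cost 1)
            ))
        prev = curr
    return prev[-1]


def inserted_stages(planned: list[str], executed: list[str]) -> list[str]:
    """Stages that ran but were never in the plan."""
    return [s for s in executed if s not in planned]


planned = ["research", "draft", "review"]
reordered = ["research", "review", "draft"]               # the example from the text
with_extra = ["research", "draft", "fact_check", "review"]

d_reordered = stage_edit_distance(planned, reordered)
d_extra = stage_edit_distance(planned, with_extra)
extra = inserted_stages(planned, with_extra)
```

A distance of zero means execution followed the plan; tracking the distribution of this number across runs shows whether divergence is an occasional edge case or the norm.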

Signal 3: reflection does not improve the answer. The critique pass runs and produces a critique, the agent refines, and the refined output is indistinguishable from the original. Or the refined output is worse. The reflection bet is failing: either the criteria are vague, or the critic and generator share blind spots, or both.

Where it shows up in observability: pre/post-reflection comparison scores (if they are statistically indistinguishable, reflection is not working); criterion-firing rates (which criteria trigger refinement? if it is always the same one, that is the only useful criterion); critic-generator agreement rate (if the critic almost always passes, it is rubber-stamping).

Likely meanings, in order of frequency:

  • The criteria are too vague to drive refinement (make them more specific and checkable)
  • The critic and generator are the same model with similar prompts (use a different model or a fundamentally different critic framing)
  • The task did not actually need reflection (Q4 should have been no: quality may matter, but the criteria are not checkable)
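Two of the reflection signals, the pre/post comparison and the critic pass rate, fit in one small report. This is a simplified sketch: it compares means against a fixed threshold (`min_gain`, an assumption) rather than running a proper paired statistical test, and the cutoff for rubber-stamping (`max_pass_rate`) is likewise hypothetical.

```python
from statistics import mean


def reflection_report(
    pre_scores: list[float],
    post_scores: list[float],
    critic_passed: list[bool],
    min_gain: float = 0.02,       # assumed minimum meaningful improvement
    max_pass_rate: float = 0.95,  # assumed cutoff for "almost always passes"
) -> dict:
    gain = mean(post_scores) - mean(pre_scores)
    pass_rate = sum(critic_passed) / len(critic_passed)
    return {
        "mean_gain": gain,
        "critic_pass_rate": pass_rate,
        "no_measurable_gain": gain < min_gain,
        "rubber_stamping": pass_rate >= max_pass_rate,
    }


# Illustrative scores for four outputs, before and after the refine step.
report = reflection_report(
    pre_scores=[0.70, 0.72, 0.68, 0.71],
    post_scores=[0.70, 0.73, 0.68, 0.71],
    critic_passed=[True, True, True, True],
)
```

When both flags are set, as they are for these numbers, the reflection layer is adding latency and cost without changing outcomes.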

Signal 4: multi-agent routing fails. The coordinator sends the task to the wrong specialist. Or two specialists produce conflicting outputs that the aggregator cannot reconcile. Or a handoff between specialists loses critical information. Coordination overhead is dominating the work.

Where it shows up in observability: a routing accuracy metric (compare the coordinator's routing decisions with golden-dataset labels); handoff-completeness signals (specialist B's input does not reference critical content from specialist A's output); integration-failure rate (specialists pass individually, end-to-end fails).

Likely meanings, in order of frequency:

  • Specialists' roles overlap (clarify the boundaries; merge overlapping specialists)
  • Handoff contracts are implicit (make them explicit; require structured handoff formats)
  • The task did not actually need multi-agent (Q5 should have been no; collapse to a single agent)
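The routing accuracy metric is a straightforward comparison against labels. A minimal sketch, assuming the golden dataset and the coordinator's decisions are both available as dicts from a hypothetical task ID to a specialist name.

```python
from collections import Counter


def routing_accuracy(golden: dict[str, str], predicted: dict[str, str]) -> float:
    """Share of golden tasks that the coordinator routed to the labeled specialist."""
    correct = sum(predicted.get(task) == label for task, label in golden.items())
    return correct / len(golden)


def confusions(golden: dict[str, str], predicted: dict[str, str]) -> Counter:
    """Count (expected, actual) pairs for every misrouted task."""
    return Counter(
        (label, predicted.get(task))
        for task, label in golden.items()
        if predicted.get(task) != label
    )


# Illustrative labels and decisions.
golden = {"t1": "billing", "t2": "technical", "t3": "billing", "t4": "account"}
predicted = {"t1": "billing", "t2": "technical", "t3": "technical", "t4": "account"}

accuracy = routing_accuracy(golden, predicted)
mistakes = confusions(golden, predicted)
```

The confusion counts matter as much as the headline number: a cluster of billing tasks landing on the technical specialist points at overlapping roles, the first likely meaning in the list above.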

Signal 5: the system feels complex but is not better. This is the hardest to diagnose because no single eval signal catches it. The architecture has multiple layers (planning + reflection + multi-agent, say), but output quality is not measurably better than a simpler baseline. The architecture is solving an aesthetic problem instead of the task's bottleneck.

Where it shows up in observability: there is no single observability signal. Detection requires a baseline comparison: implement a simpler version of the same task (single agent + ReAct + tools, no reflection, no multi-agent) and measure quality on the golden dataset. If the simpler version performs within roughly 10% of the complex version, the complex architecture is not earning its cost.

Likely meaning, in nearly all cases:

  • The team layered patterns without testing whether each layer was justified; the overshoot accumulated across multiple decisions
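The baseline comparison reduces to one inequality once both versions have been scored on the golden dataset. A minimal sketch, assuming "within roughly 10%" means within 10% of the complex version's score in relative terms; the function name and the scores are illustrative.

```python
def complexity_earns_cost(
    simple_score: float,
    complex_score: float,
    tolerance: float = 0.10,  # the "roughly 10%" band, treated as relative
) -> bool:
    """True only if the simple baseline falls clearly short of the complex system."""
    if complex_score <= 0:
        return False
    return simple_score < complex_score * (1 - tolerance)


# Illustrative golden-dataset scores, not measurements.
close_call = complexity_earns_cost(simple_score=0.78, complex_score=0.82)
clear_win = complexity_earns_cost(simple_score=0.60, complex_score=0.82)
```

In the first case the baseline is inside the 10% band, so the extra layers are not paying for themselves; in the second the complex architecture has a measurable case.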

Concept 14 bottom line: five characteristic failure signals indicate a pattern-task mismatch: ReAct loops/revisits (missing structure), plan-execution divergence (overstructured), reflection not improving (vague criteria), multi-agent routing failures (overpartitioned), and system-feels-complex-but-not-better (cumulative overshoot). Each signal has a characteristic observability shape. Recognizing the signal is the first step; the fix is not always architectural, and is sometimes prompt tightening or contract clarification.

Concept 15: Targeted fixes that do not require abandoning the architecture

Recognizing a failure signal does not always mean rewriting the architecture. Most fixes happen at the prompt, contract, or instrumentation level, not at the architecture level. This concept maps each signal to the cheapest fix to try first.

Signal | Cheapest fix to try first | If that does not work | Required architectural change
--- | --- | --- | ---
ReAct loops/revisits | Add explicit stop conditions ("you have completed the task when...") and tool boundaries ("use X for purpose Y; do not use X for Z") | Improve tool contracts (better descriptions, clearer return types) | Add a planning layer (upgrade to Concept 11's pattern)
Plan-execution divergence | Switch to lightweight planning (fewer, broader stages) | Improve the planner prompt with domain-specific examples | Downgrade to pure ReAct (Concept 10)
Reflection not improving | Make the criteria more specific and checkable (numeric thresholds, schema validation, explicit rules) | Use a different model for the critic, or explicit checking tools (parser, validator) | Remove reflection entirely if improvement does not materialize
Multi-agent routing fails | Switch the coordinator from LLM-based to deterministic routing for known cases | Make handoff contracts explicit and structured (Pydantic models, not free text) | Merge overlapping specialists; if Q5 does not actually hold, collapse to a single agent
Complex-but-not-better | Remove the topmost layer (the most recently added pattern) and measure | Remove the next layer; iterate | Return to a single agent with a strong baseline; rebuild only with evidence
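The cheapest fix for routing failures, deterministic routing for known cases, can be sketched as a lookup that falls back to the LLM-based router only when nothing matches. The keyword table, the specialist names, and the `llm_router` callable are all hypothetical; a real system would likely match on something sturdier than single keywords.

```python
from typing import Callable

# Hypothetical known cases: keyword -> specialist.
KNOWN_ROUTES = {
    "invoice": "billing",
    "refund": "billing",
    "password": "technical",
    "outage": "technical",
}


def route(task: str, llm_router: Callable[[str], str]) -> tuple[str, str]:
    """Return (specialist, decision_source) so every decision can be audit-logged."""
    words = task.lower().split()
    for keyword, specialist in KNOWN_ROUTES.items():
        if keyword in words:
            return specialist, "deterministic"
    # Unknown case: defer to the LLM-based router supplied by the caller.
    return llm_router(task), "llm"


def fallback(task: str) -> str:
    return "general"  # stand-in for an LLM routing call


known = route("Refund for order 1182", fallback)
unknown = route("Something odd happened yesterday", fallback)
```

Returning the decision source alongside the specialist is a small design choice that pays off in the routing audit log: it separates failures of the lookup table from failures of the LLM router.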

Principle: fix at the smallest scope that works. Prompt tightening is cheaper than tool-contract changes. Tool-contract changes are cheaper than architectural changes. Architectural changes are cheaper than rewrites. Most failure signals can be addressed at the prompt or contract level, so do not reach for the architecture knob first.

Exception: if a failure signal recurs after prompt and contract fixes, that is evidence the architecture really is wrong. Distinguish "I can patch this" from "I keep patching this and it keeps failing in new ways." The latter is the signal to revisit pattern selection.

Concept 15 bottom line: failure signals do not always require architectural changes. Most can be fixed at the prompt level (stop conditions, criteria specification, role boundaries) or the contract level (tool descriptions, handoff structures, routing logic). Architectural change is the last resort, not the first move. Exception: failures that recur after prompt and contract fixes indicate that the pattern itself is wrong; at that point, walk the decision tree again.

Concept 16: When the decision tree is wrong

The decision tree is good. It is not infallible. There are three situations in which the tree's first answer is wrong, each with its own response:

Situation 1: task properties change after deployment. What was a stable workflow becomes adaptive (the business adds 20 edge cases). What was specialized expertise becomes a commodity (the LLM improves and a generalist can now handle work that used to need a specialist). A real example: a customer-support workflow that started as a sequential pipeline (extract → classify → route → respond) becomes adaptive once personalization, history-awareness, and tone-matching are added. The original pattern is now wrong, but the system is in production.

Fix: the failure-signal observability from Concept 14 should catch this. When workflow paths start failing because real inputs no longer match the shape the workflow expects, that is the signal. Walk the decision tree again with the new task properties. Do not pretend the original choice is still right just because it is what is deployed.

Situation 2: different sub-tasks need different patterns. Maya's Tier-1 Support agent handles routing, lookups, refunds, and escalations. Some are workflow-shaped (lookup: deterministic). Some are ReAct-shaped (refund investigation: adaptive). A single-agent ReAct pattern handles all of them, but adequately rather than well. Fix: recognize that this is a multi-pattern composition opportunity. A top-level coordinator routes to pattern-specific sub-systems: a sequential workflow for lookups, ReAct + tools for investigations, planning for complex multi-step disputes. The composition is multi-agent, but the specialists are pattern-based, not role-based.

Situation 3: constraints change the answer. The decision tree assumes you are free to choose the pattern that fits. Sometimes you are not. A hard latency budget rules out reflection. A hard cost budget rules out multi-agent. A hard simplicity requirement rules out planning. When constraints exclude the tree's answer, you have to choose between changing the constraints, changing the task scope, or accepting a worse fit.

Fix: track constraint-driven pattern choices explicitly, as a separate decision. Document it: "the decision tree pointed at multi-agent, but we chose single-agent because of the cost ceiling. Known limitation: specialization-driven failures will be more common." This keeps the constraint-driven choice visible and revisitable; when the constraints change, you know what to reconsider.
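One lightweight way to keep such a decision visible is to record it as structured data next to the code. The following sketch is an assumption about how a team might do this, not a format the course prescribes; the `PatternDecision` class and all of its field names are hypothetical.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class PatternDecision:
    """A record of what the decision tree said versus what was actually built."""

    tree_answer: str       # the pattern the decision tree pointed at
    chosen_pattern: str    # the pattern actually chosen
    constraint: str        # the constraint that forced the difference
    known_limitation: str  # the failure mode accepted as a consequence
    revisit_when: str      # the condition under which to reconsider

    @property
    def constraint_driven(self) -> bool:
        return self.tree_answer != self.chosen_pattern


decision = PatternDecision(
    tree_answer="multi-agent specialist system",
    chosen_pattern="single agent + ReAct + tools",
    constraint="cost ceiling",
    known_limitation="specialization-driven failures will be more common",
    revisit_when="the cost ceiling is raised",
)
record = asdict(decision)  # plain dict, ready to store alongside design docs
```

Because the record names the constraint and the revisit condition, a later change in budget maps directly to a concrete decision to reopen.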

Concept 16 bottom line: the decision tree is a starting point, not a permanent answer. Three situations force a revisit of the tree: task properties change after deployment (catch this with failure-signal observability), different sub-tasks need different patterns (compose multiple patterns), and constraints exclude the tree's answer (document the constraint-driven choice explicitly). Pattern selection is iterative, not one-shot.

Before Part 5 walks through worked examples of correct pattern selection, here is the inverse: a quick gallery of common wrong choices and the better alternative for each. Recognizing anti-patterns is a skill in its own right: students who have internalized the decision tree can still fall into pattern-overshoot or pattern-undershoot when the architectural temptation is strong.

Figure: anti-pattern gallery diagram with two columns. OVERSHOOTING (red, left, "more elaborate than the task needs") lists five anti-patterns, each with a red arrow to a green "better choice" box: multi-agent for simple content generation → single agent + ReAct or workflow; ReAct for fixed invoice processing → sequential workflow; planner for open-ended debugging → single agent + ReAct + tools; reflection on tasks with vague quality criteria → remove reflection or use human review; adding planning to a stable workflow → sequential workflow. UNDERSHOOTING (blue, right, "simpler than the task needs") lists three anti-patterns, each with a blue arrow to a green better-choice box: one giant agent for many domains → multi-agent specialist system; pure single-agent for tasks that need massive context → multi-agent with focused contexts; skipping reflection on outputs that need verification → add a reflection layer. An amber callout, "The asymmetry is real," notes that overshoot is more common (talks and demos favor elaborate patterns) while undershoot is more dangerous in production (the system appears to work until it does not; the failure mode is subtle). The diagram also carries the self-check question ("If a senior engineer reviewed my pattern choice, what objection would they most likely raise?") and a footer noting that the decision tree (Concepts 4 through 8) is designed to surface BOTH failure modes by asking about task properties rather than pattern preferences, and that the gallery is worth returning to during design reviews.

The visual asymmetry, 5 overshoot anti-patterns vs. 3 undershoot, reflects the real frequency in production systems. Overshoot is more visible because elaborate patterns make better demos; undershoot is more dangerous because its failure modes are subtle. Both are worth catching at design-review time. The table below gives the full text of the gallery:

Bad choice | Why it fails | Better starting pattern
--- | --- | ---
Multi-agent for simple content generation (e.g., three agents, researcher + writer + reviewer, for a single LinkedIn post) | Coordination overhead far exceeds the specialization gain. The "researcher" output is one paragraph that the "writer" summarizes. Routing failures, handoff format mismatches, and three times the tokens for no measurable quality improvement. | Single agent + ReAct + tools (Concept 10), or a sequential workflow (Concept 9) if the content shape is fixed. Reach for multi-agent only when Q5 genuinely fires.
ReAct for fixed invoice processing (extract → validate → store → notify) | The agent sometimes skips steps, sometimes re-validates work already done, sometimes invents tool calls. Step-budget exhaustion in 5% of runs. The team adds "stop conditions" to the prompt, treating symptoms instead of the architectural mismatch. | Sequential workflow (Concept 9). The path is known and stable; an LLM-driven loop is the wrong tool.
Planner for open-ended debugging (the planner produces a 5-stage plan; execution diverges immediately) | The task's structure cannot be articulated in advance. The planner produces a plan that is wrong by stage 2. Plan-execution divergence dominates the trace. The team either tightens the planner endlessly or treats the plan as decorative. | Single agent + ReAct + tools (Concept 10). Pure ReAct handles tasks where both shape and content are unknown.
Reflection on tasks with vague quality criteria (marketing copy, conversational responses, subjective content) | The critic and generator share blind spots. Critique becomes rubber-stamping. Latency doubles; quality stays flat. Worse, the team gets false confidence that "the AI checked it." | Either remove reflection entirely (the most common right answer) or replace LLM reflection with human review (Concept 12). LLM reflection works only on checkable criteria.
One giant agent for many domains (billing + technical + account + refund + sales, all in one agent with a 4,000-token system prompt) | Context overflow, role confusion, and tool-routing errors cascade. Reflection helps marginally but does not fix the root cause. The agent answers technical questions with billing policy and vice versa. | Multi-agent specialist system (Concept 13): specialists per domain, with a coordinator routing by intent classification. Q5's specialization claim genuinely fires here.
Adding planning to a stable workflow (the planner produces the same plan every time because the task is the same) | Every run pays for an extra LLM call that contributes nothing. When an input is slightly unusual the planner produces a slightly different plan, and now the team is debugging "why did the planner take a different path?" | Sequential workflow (Concept 9). When the path is fixed, planning is not needed; write the path directly.
Pure single-agent for tasks that need massive context (one agent loads 20 source documents, three knowledge bases, and a database schema into the prompt) | Context window degradation. As context grows the agent's reasoning weakens; the model misses things you would say it should have seen. | Multi-agent specialist system with focused contexts (Concept 13). Each specialist loads only the context it needs; a synthesizer composes the outputs. Q5's context claim genuinely fires here.
Skipping reflection on outputs that need verification (production SQL queries, legal drafts for clients, code changes in repos) | Subtle errors ship. The team adds tests after the fact, which catch fewer errors than catching them at generation time. | A reflection layer on top of the core pattern (Concept 12). When the criteria are checkable, reflection is genuinely valuable. Q4 fires; do not skip it.

The pattern across the anti-pattern gallery: most bad choices are pattern-overshoot driven by aesthetic appeal (multi-agent looks impressive, planning looks rigorous, reflection looks careful). A smaller but equally important subset is pattern-undershoot driven by simplicity bias (one big agent, pure ReAct on workflow tasks, no reflection on checkable outputs). The decision tree is designed to surface both kinds of mistake by asking about task properties rather than pattern preferences.

A useful self-check before locking in a pattern choice: "If a senior engineer reviewed my choice, what objection would they most likely raise?" If you cannot predict the objection and defend against it, the choice is probably not yet principled.

Bottom line of Concept 16.5: pattern selection most often fails through overshoot (more elaborate than needed) and, less often but just as damagingly, through undershoot (simpler than needed). The anti-pattern gallery names the most common shapes of both failure modes. Internalizing them sharpens the discipline of the decision tree; recognizing an anti-pattern in your draft architecture before you build is the framework's practical skill. See the one-page design-review template at the end of this course. It includes an explicit anti-pattern check ("if a senior engineer reviewed this choice, what would they object to?") that operationalizes this discipline for team design reviews.


Part 5: Decision lab

Part 5 walks the decision tree over five real tasks. Each Decision is a worked classification: the task, answers to the five questions, the resulting pattern, a deployment topology sketch, and the eval signals to watch. The point is not the right answer; it is to see the discipline applied.

Every Decision follows the same shape:

  • Task (one paragraph)
  • Walking the tree (the five questions answered with task-specific reasoning)
  • Pattern choice and justification
  • Deployment topology sketch (which cloud components, which new tables in Neon, which bridge-Worker config)
  • Eval signals to watch (which eval patterns, which Phoenix evaluators)
  • Simulated track callout for readers who have not taken the deployment and eval courses
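
The tree walk used in each Decision can be written down as a small function. This is a minimal sketch, not the anchor article's implementation: the `TaskAnswers` fields and `choose_pattern` are hypothetical names, "partially" answers are assumed to be rounded to yes or no by the reader, and a known but unstable path is assumed to fall through to Q3.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskAnswers:
    """Answers to the five decision-tree questions for one task."""
    path_known: bool             # Q1: can the solution path be defined in advance?
    workflow_stable: bool        # Q2: is the workflow fixed across runs? (asked only if Q1 is yes)
    structure_articulable: bool  # Q3: can the stages be named before execution?
    checkable_quality: bool      # Q4: quality outweighs speed AND criteria are checkable
    needs_specialists: bool      # Q5: specialization, context, or scale is the bottleneck


def choose_pattern(a: TaskAnswers) -> list[str]:
    """Return the chosen pattern plus any layers, outermost first."""
    if a.path_known and a.workflow_stable:
        return ["sequential workflow"]  # the tree terminates at Q2
    core = "planning + ReAct" if a.structure_articulable else "single agent + ReAct + tools"
    layers = []
    if a.needs_specialists:
        layers.append("multi-agent specialists")
    layers.append(core)
    if a.checkable_quality:
        layers.append("reflection")
    return layers


# Decision 2 (incident response): Q1 no, Q3 yes, Q4 yes on remediation, Q5 no.
print(choose_pattern(TaskAnswers(False, False, True, True, False)))
```

The function only encodes the routing; the judgment lives in how you answer each question, which is what the five Decisions practice.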

Decision 1: Maya's Tier-1 Support agent

Task. A customer-support agent handles incoming queries. The agent can: look up account information, look up transaction history, look up policy rules, search the knowledge base, issue refunds within its authority limits, and escalate to human review when authority is exceeded or the case is ambiguous. The agent maintains a conversational interaction with the customer.

Your turn. Before reading on, walk the five questions for this task. Commit to a pattern, then check yourself against the worked answer. (Or paste the task into your AI and have it quiz you from Q1 to Q5, asking it to push back when your reasoning is thin.)

Walk it yourself first, then open the worked answer.

Walking the tree.

Q1: Can the solution path be defined in advance? No. Customer queries vary widely: "where is my refund?" needs a lookup; "I was charged twice" needs an investigation; "I want to cancel" may need account changes; "explain my bill" may need a policy lookup and an explanation. The path is unknown.

Q2: N/A (Q1 was no, so Q2 is skipped).

Q3: Is the task structure articulable before execution? No. There are no articulable "stages"; it is an investigation that is complete when it is complete. The agent might do one lookup and respond, or five lookups and three policy checks. No clear stage structure.

Q4: Does quality matter more than speed? Mixed. Speed matters because customers are waiting in a live conversation; quality matters because wrong refund decisions cost the business money. But the evaluation criteria for a "good response" are not checkable in real time. They involve nuanced judgment about whether the customer's situation was handled well. Reflection does not fit here.

Q5: Is specialization, context, or scale the bottleneck? Borderline. The agent has to handle billing, technical, account, and refund issues, which looks like a case for specialization. But the volume of overlap (most customers' questions span categories) means specialist routing would add more handoff friction than specialization benefit. A single agent is the right call.

Pattern choice: Single agent + ReAct + tools. The Concept 10 pattern.

Deployment topology sketch. This is exactly what the cloud deployment of the customer-support Worker built. The full stack: FastAPI on ACA; Neon for sessions, runs, and traces; R2 for attached documents; Cloudflare Sandbox via the bridge Worker for the apply_patch tool, which the agent occasionally uses to generate refund-documentation files; and a background worker for runs longer than 30 seconds. No deployment changes relative to what already ships.

Eval signals to watch. ReAct's characteristic failures:

  • Trace-length anomalies (Phoenix dashboard)
  • Tool-call duplication (the agent looks up the same account three times)
  • Reasoning-action divergence (Phoenix tool-correctness evaluator)
  • Premature termination (the agent says "I can't help" too early)
  • Step-budget exhaustion (the agent has produced no output after 25 steps)

Most likely failure mode in production: the agent loops on ambiguous refund cases. Fix: add explicit stop conditions ("if you cannot determine the right refund amount within 3 lookups, escalate") and clarify the boundary between "investigate further" and "escalate to a human."
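
The stop condition can also be enforced in code around the tool calls rather than only stated in the prompt. The sketch below is one hedged way to do that: `resolve_refund` and the lookup callables are hypothetical stand-ins for the agent's real tools, and the budget of 3 comes from the example rule above.

```python
from typing import Callable, Optional

MAX_LOOKUPS = 3  # from the text: escalate if 3 lookups do not settle the amount


def resolve_refund(lookups: list[Callable[[], Optional[float]]]) -> dict:
    """Try lookups in order; escalate once the lookup budget is spent.

    Each lookup is a hypothetical tool call that returns a refund amount,
    or None when the evidence is still ambiguous.
    """
    for attempt, lookup in enumerate(lookups[:MAX_LOOKUPS], start=1):
        amount = lookup()
        if amount is not None:
            return {"action": "refund", "amount": amount, "lookups_used": attempt}
    # Budget exhausted (or no lookups left) without a clear answer: hand off.
    return {"action": "escalate", "amount": None,
            "lookups_used": min(len(lookups), MAX_LOOKUPS)}


# The second lookup settles the amount, so no escalation is needed.
print(resolve_refund([lambda: None, lambda: 42.0]))
```

A hard budget like this makes the "investigate further" versus "escalate" boundary testable, which a prompt instruction alone is not.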

Operational envelope. Maya's setup is the canonical Inngest composition for a customer-support agent:

  • Trigger: TriggerEvent(event="customer/email.received"). The email-ingestion webhook fires the event; the function wakes for every customer email.
  • Durability: wrap Runner.run(support_agent, ...) in a single step.run("agent-loop", ...). A mid-loop crash retries the whole agent run; sub-steps inside the loop are SDK-internal and not separately durable.
  • HITL on escalation: the escalate_to_human tool fires refund/approval.requested and the function suspends via step.wait_for_event for up to 4 hours. No compute is consumed during the wait. A human approves from Slack; the function resumes with the verdict.
  • Concurrency: concurrency=[Concurrency(limit=10, key="event.data.customer_id"), Concurrency(limit=50)]. The keyed limit caps concurrent runs per customer (10 as written; set it to 2-3 for a tighter per-customer cap), so an angry customer cannot starve everyone else, and the global limit of 50 protects the OpenAI rate limit and the Neon connection pool.
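
To see how a keyed limit and a global limit interact, here is a toy admission check in plain Python. It is only a model of the idea, not Inngest's scheduler: `ConcurrencyGate` is a hypothetical class, and a real queue would hold rejected runs rather than drop them.

```python
from collections import Counter


class ConcurrencyGate:
    """Toy admission check with a per-key cap and a global cap."""

    def __init__(self, per_key_limit: int, global_limit: int) -> None:
        self.per_key_limit = per_key_limit
        self.global_limit = global_limit
        self.active = Counter()  # running count per customer_id

    def try_start(self, customer_id: str) -> bool:
        """Admit the run only if both caps have room; otherwise it would queue."""
        if sum(self.active.values()) >= self.global_limit:
            return False
        if self.active[customer_id] >= self.per_key_limit:
            return False
        self.active[customer_id] += 1
        return True

    def finish(self, customer_id: str) -> None:
        """Release one slot for this customer."""
        if self.active[customer_id] > 0:
            self.active[customer_id] -= 1


gate = ConcurrencyGate(per_key_limit=2, global_limit=3)
print([gate.try_start("angry-customer") for _ in range(3)])  # third run is held back
```

The per-key cap is what gives fairness between customers; the global cap is what protects shared downstream resources.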

Simulated track callout for Decision 1. Even without the deployment and eval courses you can do this exercise on paper: walk the five questions for Maya's task, justify the pattern choice, and sketch which tools the agent would need (account lookup, transaction lookup, policy search, refund issuance, escalation). What Decision 1 teaches is the classification discipline; the deployment specifics deepen it but are not required to internalize the framework.

Decision 2: Incident response agent

Task. An on-call agent receives alerts (from monitoring systems, customer reports, or internal teams) and runs the initial incident response: check service health, correlate with recent deploys, identify the likely root cause, run a remediation runbook if one applies, and escalate to the human on-call if the situation is novel or severe. The agent must produce a clear incident report.

Your turn. Before reading on, walk the five questions for this task. Commit to a pattern, then check yourself against the worked answer. (Or paste the task into your AI and have it quiz you from Q1 to Q5, asking it to push back when your reasoning is thin.)

Walk it yourself first, then open the worked answer.

Walking the tree.

Q1: Can the solution path be defined in advance? Partially. There is a standard structure: "check service health, correlate deploys, identify the cause, attempt remediation, escalate if needed." But the specific path depends on the actual situation. A latency spike in service A may lead to "roll back the recent deploy"; a 500-error spike in service B to "restart the pod"; a customer-reported issue to "investigate the user-specific data flow." The path is unknown at the step level but structured at the stage level, so Q1 resolves to no.

Q2: N/A.

Q3: Is the task structure articulable before execution? Yes. The stages are clear: triage → diagnose → remediate → report. Every incident passes through these stages, even if the specific work inside each stage varies. Articulable structure.

Q4: Does quality matter more than speed? Speed matters enormously in incident response: every minute of an incident costs the business. But quality matters too, because a wrong remediation can make things worse. Reflection on remediation steps before they execute is justified. A quick critique pass that asks "is this remediation safe? does it match the incident's actual symptoms?" is worth the latency. Add reflection on remediation decisions.

Q5: Is specialization, context, or scale the bottleneck? No. One agent with access to monitoring, deploy history, the runbook library, and remediation tools can handle this. Do not go multi-agent.

Pattern choice: Planning + ReAct execution, with reflection on remediation steps. Concepts 11 + 12, layered.

Deployment topology sketch. Built on the ReAct deployment (Concept 10) plus plan persistence (Concept 11). Specific additions:

  • New Neon table: incidents (incident_id, severity, plan, current_stage, remediation_history)
  • The plan is stored explicitly and updated as stages complete
  • Reflection on remediation runs as a separate agent (a different model is recommended, for example a Claude instance critiquing a GPT instance or vice versa, to avoid blind-spot overlap)
  • The background worker pattern is mandatory (incident runs can take 5-15 minutes)
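
A rough sketch of the incidents table and the "plan is updated as stages complete" behavior follows. The column names come from the bullet above; everything else is an assumption for illustration: SQLite stands in for Neon (Postgres), the column types and JSON encoding are guesses, and `open_incident` and `advance_stage` are hypothetical helpers.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stands in for Neon (Postgres) here
conn.execute(
    """
    CREATE TABLE incidents (
        incident_id         TEXT PRIMARY KEY,
        severity            TEXT NOT NULL,
        plan                TEXT NOT NULL,   -- JSON list of stage names
        current_stage       TEXT NOT NULL,
        remediation_history TEXT NOT NULL    -- JSON list of attempted actions
    )
    """
)


def open_incident(incident_id: str, severity: str, plan: list[str]) -> None:
    """Persist the planner's output and start at the first stage."""
    conn.execute("INSERT INTO incidents VALUES (?, ?, ?, ?, ?)",
                 (incident_id, severity, json.dumps(plan), plan[0], "[]"))


def advance_stage(incident_id: str) -> str:
    """Move the incident to the next planned stage and return it."""
    plan_json, current = conn.execute(
        "SELECT plan, current_stage FROM incidents WHERE incident_id = ?",
        (incident_id,)).fetchone()
    plan = json.loads(plan_json)
    nxt = plan[min(plan.index(current) + 1, len(plan) - 1)]  # stay on the last stage
    conn.execute("UPDATE incidents SET current_stage = ? WHERE incident_id = ?",
                 (nxt, incident_id))
    return nxt


open_incident("inc-1", "high", ["triage", "diagnose", "remediate", "report"])
print(advance_stage("inc-1"))
```

Storing the plan next to the current stage is what later makes plan-execution divergence measurable from the database alone.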

Eval signals to watch.

  • Plan-execution divergence (does the plan match the work that actually happened?)
  • Reflection effectiveness on remediation (did the critique catch unsafe remediations? if it has caught none for months, reflection may be rubber-stamping)
  • Time-to-resolution metric (incident response is judged on speed; track it and alert on regression)
  • Escalation accuracy (did the agent escalate when it should have? did it remediate when it should have?)
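
Plan-execution divergence can be reduced to a single number per run. The metric below is one simple, assumed definition (position-by-position mismatch), not the definition used by Phoenix or any specific evaluator; `plan_divergence` is a hypothetical helper.

```python
def plan_divergence(planned: list[str], executed: list[str]) -> float:
    """Fraction of positions where the executed stage differs from the plan.

    Extra or missing stages count as mismatches. 0.0 means the run followed
    the plan exactly; 1.0 means nothing lined up.
    """
    length = max(len(planned), len(executed))
    if length == 0:
        return 0.0
    matches = sum(1 for p, e in zip(planned, executed) if p == e)
    return 1.0 - matches / length


planned = ["triage", "diagnose", "remediate", "report"]
executed = ["triage", "diagnose", "escalate"]  # the agent escalated instead of remediating
print(plan_divergence(planned, executed))
```

Tracked over many incidents, a rising average suggests the planner's stages no longer describe the work the agent actually does.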

Most likely failure mode in production: the planner produces overly detailed plans for simple incidents, adding latency. Fix: train the planner on examples of appropriate plan granularity, short plans for clear incidents and longer plans for ambiguous ones. A plan's value is not in being comprehensive; it is in being right-sized for the situation.

Operational envelope. Incident response is the pattern that uses almost every Inngest primitive (cron, events, fan-out, durability, HITL, replay):

  • Triggers: dual triggers, TriggerCron(cron="*/5 * * * *") for proactive health checks AND TriggerEvent(event="incident/alert.fired") for reactive incidents. The same function shape handles both.
  • Durability per stage: one step.run for the planning stage and one for each remediation step; if remediation fails partway, earlier stages stay memoized.
  • HITL on remediation: between planner output and execution, step.wait_for_event("await-remediation-approval", timeout=timedelta(minutes=15)) gates on a human reviewer. The timeout is tight because incidents are time-sensitive.
  • Replay for false-positive bug fixes: when a bug in a remediation script makes incidents fail in a particular way, fix the script and bulk-replay the failed incidents from the Inngest dashboard. No manual incident re-triage.

Simulated track callout for Decision 2. This is the first Decision that introduces pattern composition (planning + reflection). The exercise is valuable even on paper: notice that the choice to add reflection did not come from Q4 alone; it came from applying Q4 specifically to the remediation step. Reflection is rarely all-or-nothing; it is often layered onto specific high-stakes outputs.

Decision 3: Market research agent

Task. Given a topic ("competitive landscape in agentic AI middleware") and a research brief (key questions, depth requirements, deadline), the agent produces a research report. The work involves: identifying relevant sources, searching multiple databases, reading and extracting from documents, comparing claims across sources, drafting findings, and producing the final report.

Your turn. Before reading on, walk the five questions for this task. Commit to a pattern, then check yourself against the worked answer. (Or paste the task into your AI and have it quiz you from Q1 to Q5, asking it to push back when your reasoning is thin.)

Walk it yourself first, then open the worked answer.

Walking the tree.

Q1: Can the solution path be defined in advance? No. Which sources to consult, which competitors to investigate, and which analyses to run all depend on what is discovered along the way. Unknown path.

Q2: N/A.

Q3: Is the task structure articulable before execution? Yes. The standard research-report shape: gather data → analyze → synthesize → draft → review. Even when the specific sources and analyses are unknown, the major phases are clear. Articulable structure.

Q4: Does quality matter more than speed? Yes, strongly. Research reports are read by decision-makers; factual errors and weak analysis have real consequences. The quality criteria are partially checkable: "all claims are sourced," "competitor analysis covers each major player," "the synthesis answers the brief's questions." Reflection is justified, especially on the synthesis and the final draft.

Q5: Is specialization, context, or scale the bottleneck? Likely yes, for context. Deep research has to load a lot of source material; doing that in one agent's context window risks reasoning degradation. Splitting into research-and-summarize-per-source agents that produce focused briefs, then composing the briefs, is the right pattern. Multi-agent for context-management reasons.

Pattern choice: Multi-agent specialist system, with planning at the top layer, ReAct inside the research specialists, and reflection on the final synthesis. A composition of Concepts 11, 13, and 12.

Deployment topology sketch. Full cloud stack plus multi-agent additions (Concept 13):

  • Parent-run + per-specialist run structure in Neon (parent_run_id, agent_role)
  • Routing audit logs recording which specialist received which source
  • Per-specialist cost tracking (research agents reading 50-page PDFs can burn tokens fast)
  • The bridge Worker handles document-reading tools shared across specialists
  • The aggregator agent reads from a shared Neon table where specialists deposit their summaries

Eval signals to watch.

  • Three separate scoreboards: per-specialist research quality, routing accuracy (did the right specialist get the right source?), integration quality (does the final report synthesize the specialists' findings well?)
  • Plan-execution divergence on the top-level plan
  • Reflection effectiveness on the final synthesis
  • Cost-per-correct-output (multi-agent + reflection makes this expensive; track and justify it)

Most likely failure mode in production: specialists produce excellent individual briefs, but the aggregator cannot synthesize them cleanly because the briefs use inconsistent formats or terminology. Fix: enforce structured handoff formats (Pydantic schemas for the brief structure) so the aggregator receives uniformly shaped inputs.
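
A structured handoff can be as small as one validated record type. The sketch below uses a stdlib dataclass as a stand-in for the Pydantic schema the text recommends, so it runs without third-party packages; the field names and the "claims need sources" rule are illustrative assumptions, not the course's actual schema.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class SpecialistBrief:
    """One specialist's handoff to the aggregator (field names are illustrative)."""
    competitor: str
    summary: str
    claims: list[str] = field(default_factory=list)
    sources: list[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        if not self.competitor.strip():
            raise ValueError("competitor must be non-empty")
        if self.claims and not self.sources:
            raise ValueError("claims need at least one source")


def parse_brief(raw: dict) -> SpecialistBrief:
    """Reject briefs with unexpected keys before they reach the aggregator."""
    allowed = {"competitor", "summary", "claims", "sources"}
    unknown = set(raw) - allowed
    if unknown:
        raise ValueError(f"unexpected fields: {sorted(unknown)}")
    return SpecialistBrief(**raw)  # missing required keys raise TypeError here


brief = parse_brief({"competitor": "Acme", "summary": "Strong in routing.",
                     "claims": ["Supports fan-out"], "sources": ["acme-docs"]})
print(brief.competitor, len(brief.claims))
```

Validating at the handoff boundary means a malformed brief fails in the specialist's run, where it is cheap to retry, rather than inside the synthesis step.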

Operational envelope. Market research is this course's premier fan-out example, the pattern Inngest's flow-control primitives are designed for:

  • Fan-out trigger pattern: the coordinator function fires one research/competitor.research event per competitor; each event starts an independent function run. N competitors → N parallel function runs, all tracked separately, all independently durable.
  • Per-tenant concurrency cap: concurrency=[Concurrency(limit=5, key="event.data.tenant_id")] on the competitor-research function stops one tenant's "research 50 competitors" request from monopolizing the system.
  • Durability per specialist: each competitor-research run has its own step.run calls (web search, document fetch, brief generation); a mid-research crash retries only the failing step, not the whole research run.
  • Aggregation as a separate function: once all specialist runs complete (Inngest emits "all done" events), a synthesizer function triggered by research/landscape.synthesize reads the briefs and composes the final report. Decoupled through events; no shared in-process state.
  • Cost-per-specialist visibility: Inngest's per-function dashboard shows token spend per competitor; outliers (competitor X costing 5× more than the others) are immediately visible.
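
The fan-out shape with a concurrency cap can be modeled locally with asyncio. This is a sketch of the idea only, not Inngest code: `research_competitor` and `fan_out` are hypothetical, the sleep stands in for real research work, and a semaphore plays the role of the per-tenant limit of 5.

```python
import asyncio

PER_TENANT_LIMIT = 5  # mirrors Concurrency(limit=5, key="event.data.tenant_id")


async def research_competitor(name: str, gate: asyncio.Semaphore, stats: dict) -> str:
    """Hypothetical specialist run; the sleep stands in for search and reading."""
    async with gate:
        stats["running"] += 1
        stats["peak"] = max(stats["peak"], stats["running"])
        await asyncio.sleep(0.01)
        stats["running"] -= 1
        return f"brief:{name}"


async def fan_out(competitors: list[str]) -> tuple[list[str], int]:
    """Start one run per competitor, then return all briefs for synthesis."""
    gate = asyncio.Semaphore(PER_TENANT_LIMIT)
    stats = {"running": 0, "peak": 0}
    briefs = await asyncio.gather(
        *(research_competitor(c, gate, stats) for c in competitors))
    return list(briefs), stats["peak"]


briefs, peak = asyncio.run(fan_out([f"competitor-{i}" for i in range(12)]))
print(len(briefs), "briefs; peak concurrent runs:", peak)
```

In the durable version each run would be a separate function invocation with its own retries; the local model only shows why a 50-competitor request cannot exceed the cap.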

Simulated track callout for Decision 3. This Decision shows pattern composition: multi-agent is not a replacement for the other patterns; it is a composition of them. The planning agent uses planning; the research specialists use ReAct; the synthesis agent uses reflection. Multi-agent is the topology; the patterns inside the topology are still the same five patterns.

Decision 4: Enterprise onboarding agent

Task. When a new enterprise customer signs up, the agent runs the onboarding workflow: provision the tenant (create accounts, databases, configuration), populate seed data, invite administrators, schedule kickoff meetings, and send welcome materials. The work involves multiple deterministic provisioning steps and some personalized communications.

Your turn. Before reading on, walk the five questions for this task. Commit to a pattern, then check yourself against the worked answer. (Or paste the task into your AI and have it quiz you from Q1 to Q5, asking it to push back when your reasoning is thin.)

Walk it yourself first, then open the worked answer.

Walking the tree.

Q1: Can the solution path be defined in advance? Yes. Onboarding has a fixed sequence: provision → configure → seed → invite → schedule → send-welcome. Every onboarding passes through these steps in this order. The content of some steps is personalized (the welcome message references the customer's name and industry), but the step sequence is invariant. Known path.

Q2: Is the workflow fixed and stable across runs? Yes. Every enterprise customer follows the same onboarding workflow. Stable.

Q3, Q4, Q5: N/A or no. The decision tree terminates at Q2 because the workflow is fixed.

Pattern choice: Sequential workflow. Concept 9.

Deployment topology sketch. Minimal cloud stack:

  • FastAPI on ACA
  • Neon for onboarding state (which customers are at which step)
  • R2 for documents (welcome PDFs, onboarding guides)
  • Embedded LLM calls at the personalization steps (welcome message generation, account-name suggestions if the customer requests them)
  • No sandbox needed. No bridge Worker needed. No background-worker pattern needed for long-running agentic reasoning (although the workflow itself can run as a background job to handle scale).

This deployment is meaningfully cheaper than the full cloud stack, because the task does not need most of the cloud deployment's complexity.

Eval signals to watch.

  • Step-level correctness (each provisioning step succeeded; extraction returns valid schemas)
  • Workflow completion rate (what fraction of onboardings complete successfully?)
  • Personalization quality (LLM-generated welcome messages; Phoenix can grade tone and factual accuracy)
  • Failure mode: workflow steps applied to wrong inputs (validation gaps)

Most likely failure mode in production: an edge-case enterprise (unusual industry, special compliance requirements) does not fit the standard workflow. Fix: either add explicit branching to the workflow for the edge case (if edge cases are few), or recognize that the workflow is becoming variable and consider upgrading to ReAct + tools (if edge cases proliferate). Watch this transition over time: workflows often start stable and gradually become adaptive.

Operational envelope. Enterprise onboarding is this course's cleanest Inngest sequential workflow example: every step is a step.run, with no agentic complexity:

  • Trigger: TriggerEvent(event="customer/enterprise.signed_up"), fired when the deal closes in the CRM.
  • One step.run per onboarding step: step.run("provision-tenant", ...), step.run("configure-defaults", ...), step.run("seed-data", ...), step.run("invite-admins", ...), step.run("schedule-kickoff", ...), step.run("send-welcome", ...). Each step is durable; a crash at step 4 leaves steps 1-3 memoized.
  • No HITL needed: onboarding is fully automated; there are no step.wait_for_event calls on the standard path.
  • step.sleep for delayed actions: step.sleep("wait-2-days-before-followup", timedelta(days=2)) schedules a follow-up after onboarding completes, with zero compute consumed during the wait.
  • Cron pairing: a separate cron-triggered function (TriggerCron(cron="0 9 * * *")) sweeps the customer database daily for stalled onboardings (a step failed and its retries ran out); the cron function fires recovery events for stuck cases.
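
The "crash at step 4 leaves steps 1-3 memoized" behavior is easy to see in a toy model. `StepRunner` below is a hypothetical in-memory stand-in for durable steps, not the Inngest SDK; a real system persists the memo outside the process.

```python
from typing import Any, Callable


class StepRunner:
    """Toy stand-in for durable steps: a step body runs again only if it failed."""

    def __init__(self) -> None:
        self.memo: dict[str, Any] = {}  # persisted step results in a real system
        self.calls: list[str] = []      # which step bodies actually executed

    def run(self, step_id: str, fn: Callable[[], Any]) -> Any:
        if step_id in self.memo:
            return self.memo[step_id]   # replay: skip the body, reuse the result
        self.calls.append(step_id)
        result = fn()                   # may raise; nothing is memoized then
        self.memo[step_id] = result
        return result


def onboard(steps: StepRunner, fail_invite: bool) -> str:
    """The fixed onboarding sequence, one step per stage."""
    steps.run("provision-tenant", lambda: "tenant-1")
    steps.run("configure-defaults", lambda: "ok")
    steps.run("seed-data", lambda: "ok")

    def invite() -> str:
        if fail_invite:
            raise RuntimeError("mail provider down")
        return "invited"

    steps.run("invite-admins", invite)
    steps.run("schedule-kickoff", lambda: "ok")
    return steps.run("send-welcome", lambda: "sent")


runner = StepRunner()
try:
    onboard(runner, fail_invite=True)   # first attempt crashes at step 4
except RuntimeError:
    pass
onboard(runner, fail_invite=False)      # retry reuses steps 1-3
print(runner.calls)
```

On the retry only the failed step and the steps after it execute, which is why provisioning is not repeated for the same tenant.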

This deployment is substantially cheaper than the others, and Inngest makes the cost discipline visible: the function dashboard shows step-by-step success rates and step-by-step costs, so you can see exactly which onboarding step is the bottleneck.

Simulated track callout for Decision 4. This Decision matters because it is the negative example for agentic patterns. The task does not need agentic reasoning at all. A workflow with embedded LLM calls is cheaper, more reliable, and easier to debug. Do not rush to ReAct when a workflow does the job. This is the decision tree's most important discipline.

Decision 5: Coding agent (advanced track)

Task. A coding agent receives a feature request and produces a working implementation: it reads the existing codebase, designs the change, writes the code, writes tests, runs the tests, fixes failures, and produces a PR ready for human review. The codebase is large, changes can be complex, and correctness matters.

Your turn. Before reading on, walk the five questions for this task. Commit to a pattern, then check yourself against the worked answer. (Or paste the task into your AI and have it quiz you from Q1 to Q5, asking it to push back when your reasoning is thin.)

Walk it yourself first, then open the worked answer.

Walking the tree.

Q1: Can the solution path be defined in advance? No. Coding work involves continuous discovery: what is in the codebase, how the existing code is structured, which edge cases the tests reveal. Unknown path.

Q2: N/A.

Q3: Is the task structure articulable before execution? Partially. The high-level shape is clear: understand the requirement → understand the codebase → design the change → implement → test → fix → produce the PR. But for complex changes the design phase may iterate (design → discover a constraint → revise the design → discover another constraint). Articulable, but with internal adaptation needs.

Q4: Does quality matter more than speed? Yes, very much. Code that ships to production has real consequences. The quality criteria are checkable: tests pass/fail, type checks pass/fail, linter pass/fail, code review identifies specific issues. Reflection is highly justified.

Q5: Is specialization, context, or scale the bottleneck? Genuinely yes, for both specialization and context. Coding involves at least three distinct skill sets: code generation (writing good code), security review (catching vulnerabilities), and documentation (explaining the change). Each benefits from a focused agent. Multi-agent is justified.

Pattern choice: Multi-agent specialist system, with planning at the top, ReAct + tools inside the specialists, and explicit reflection on code outputs. A composition of the other four patterns.

Deployment topology sketch. Full cloud stack plus multi-agent extensions:

  • Coordinator agent: receives the feature request and produces a plan with stages (design → code → review → document)
  • Coder specialist: ReAct + tools (read codebase, write files, run tests). Heavy sandbox use (running tests, executing code). Bridge Worker mandatory.
  • Reviewer specialist: ReAct + tools (read coder output, run security checks, run linters). Lighter sandbox use.
  • Documentation specialist: simpler, possibly sequential (extract changes → generate docs).
  • Reflection layer on the coder's final PR (do all tests pass? does it match the requirement?).
  • Per-specialist runs in Neon; routing audit logs; cost tracking per specialist (coder costs will dominate).

Eval signals to watch. The three multi-agent scoreboards, plus reflection metrics. Particular focus:

  • Code-correctness eval (does the generated code pass the tests?)
  • Security-review effectiveness (does the reviewer catch vulnerabilities? the false-positive rate matters too)
  • Plan-execution divergence (the coordinator's plan vs. what actually shipped)
  • Cost-per-PR (this is an expensive pattern; make sure it earns its cost)

Most likely failure mode in production: the reviewer specialist becomes the bottleneck, either too strict (rejecting valid code over minor style issues) or too permissive (passing code with real bugs). Fix: explicit criteria for reviewer decisions, plus a separate eval that grades reviewer judgments against human reviewer judgments on the same code.
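
The reviewer-versus-human eval can start as three rates over paired verdicts. This is a minimal sketch under simple assumptions: verdicts are booleans (True means approve), both lists cover the same PRs in the same order, and `reviewer_agreement` is a hypothetical helper rather than a Phoenix evaluator.

```python
def reviewer_agreement(agent: list[bool], human: list[bool]) -> dict:
    """Compare approve/reject verdicts on the same PRs (True means approve).

    false_reject: agent rejected code the human approved (too strict).
    false_accept: agent approved code the human rejected (too permissive).
    """
    if len(agent) != len(human) or not agent:
        raise ValueError("need two equal-length, non-empty verdict lists")
    pairs = list(zip(agent, human))
    n = len(pairs)
    return {
        "agreement": sum(a == h for a, h in pairs) / n,
        "false_reject": sum((not a) and h for a, h in pairs) / n,
        "false_accept": sum(a and (not h) for a, h in pairs) / n,
    }


agent_verdicts = [True, False, False, True]
human_verdicts = [True, True, False, False]
print(reviewer_agreement(agent_verdicts, human_verdicts))
```

A high false_reject rate points at the "too strict" failure and a high false_accept rate at the "too permissive" one, so the two should be tracked separately rather than folded into one agreement score.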

Operational envelope. The coding agent uses every Inngest primitive; it is the pattern that justifies the full operational envelope:

  • Triggers: TriggerEvent(event="github/issue.assigned_to_agent"), fired when an issue is assigned to the agent; OR a Slack chat command fires the event.
  • Fan-out coordination: the coordinator function decomposes the feature into stages, then fires events to the specialist functions (coding/specialist.code, coding/specialist.review, coding/specialist.docs). Each specialist is its own function with its own concurrency and durability.
  • step.run per file edit: the coder specialist wraps each file modification in step.run("edit-{path}", ...) so that a crash during a multi-file edit does not lose completed edits. Memoization is especially valuable here: re-running an LLM-generated code change after partial completion is expensive and risks diverging from the original plan.
  • step.wait_for_event on PR merge: after the agent produces the PR, the function suspends on step.wait_for_event("await-human-merge-approval", timeout=timedelta(days=2)). A human reviews and approves on GitHub; the function resumes to perform post-merge cleanup.
  • Per-tenant concurrency: concurrency=[Concurrency(limit=2, key="event.data.tenant_id")] on the coder specialist stops one tenant from monopolizing coding capacity. (Coding is expensive; per-tenant caps are critical.)
  • Priority for tier-based fairness: Enterprise tenants' coding tasks jump ahead of Free-tier tasks in the queue (priority=Priority(run="100 - (event.data.tier_priority * 100)")).
  • Replay for partial failure: when the reviewer specialist rejects code for a fixable reason, the coder fixes it and fires the review event again; the function dashboard shows each PR's iteration history.
  • step.sleep for safety windows: after merge, step.sleep("await-tests-stable", timedelta(hours=2)) waits 2 hours for CI runs to confirm the change does not break downstream tests, then the agent marks the work complete.

Simulated track callout for Decision 5. This is the hardest Decision because the task genuinely demands composing every pattern. The exercise here is not to memorize which patterns apply; it is to see that the decision tree systematically identifies which patterns to compose and where. The coding agent is "advanced" not because it is complex but because the discipline of pattern composition takes practice.


Part 6: Real frontiers

Concept 17: Cost and latency are architectural constraints, not afterthoughts

So far this course has treated pattern selection as if cost and latency were secondary. In production they are often primary. Concept 17 names each pattern's cost and latency profile explicitly, so the decision tree can be walked with budget constraints in view.

Cost profile of each pattern (rough orders of magnitude, assuming GPT-5-class pricing):

Pattern | Cost per task | Cost driver
Sequential workflow | 1× (baseline) | Number of LLM calls (often 1-3 per workflow)
Single agent + ReAct | 3-10× | Number of ReAct iterations (one model call per loop)
Planning + ReAct execution | 5-15× | Planning call + per-stage ReAct loops
Single agent + reflection | 2-3× the underlying pattern | Critique + refinement passes
Multi-agent specialist | 5-20× | Number of specialist runs + coordinator + integration

The numbers are illustrative, not precise. What matters is the ratios: a multi-agent system with reflection can cost 30-60× more than a sequential workflow for the same task volume. When that multiplier is justified by quality, fine. When it is justified by aesthetics, a budget catastrophe is waiting.
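
A quick arithmetic check shows how the multipliers stack. The sketch below simply multiplies the table's own illustrative ranges; `compose` is a hypothetical helper, and treating reflection as a pure multiplier on the underlying pattern is an assumption taken from the table's wording.

```python
# Illustrative cost multipliers from the table (low, high), relative to a
# sequential workflow baseline of 1x.
MULTI_AGENT = (5, 20)
REFLECTION = (2, 3)  # applied on top of the underlying pattern


def compose(base: tuple[int, int], layer: tuple[int, int]) -> tuple[int, int]:
    """Multiply the low ends together and the high ends together."""
    return (base[0] * layer[0], base[1] * layer[1])


low, high = compose(MULTI_AGENT, REFLECTION)
print(f"multi-agent + reflection: {low}x to {high}x the baseline")
```

Stacking the ranges gives 10× to 60×, so the 30-60× figure quoted above corresponds to the upper part of that range, where specialist counts and refinement passes are both high.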

Har pattern ka latency profile:

PatternLatencyDriver
Sequential workflowSab se low (~1-5s)Deterministic steps + sequence mein LLM calls
Single agent + ReActMedium (~10-30s)Har loop par ek model call; loops stretch ho sakte hain
Planning + ReActMedium-high (~30-90s)Planning call + sequential stage execution
Single agent + reflectionUnderlying pattern ka 2-3×Critique + refinement multiplicative latency add karte hain
Multi-agent specialistVariableParallel execution madad karti hai; coordination overhead add karta hai

Integration with the decision tree. Q4 (quality vs. speed) addresses latency implicitly. Q5 (specialization/scale) addresses cost implicitly. But the decision tree never says explicitly, "the answer is one pattern less elaborate than the tree suggests, because your latency budget is hard." That is a constraint-layer decision made on top of the tree.

Practical discipline: write down your latency and cost budgets before you walk the decision tree. If the pattern the tree chooses violates either budget, you have three options:

  1. Change the constraints. Get more budget, raise the latency tolerance, or accept slower delivery.
  2. Change the scope. Reduce what the system has to do, so that a less elaborate pattern can handle it.
  3. Accept a worse fit. Use the less elaborate pattern and accept that some failure modes the elaborate pattern would have caught will occur.

Document which option you chose and why. When the system shows the failure modes the more elaborate pattern would have prevented, you will need to remember which trade-off you made.
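The budget check and the written trade-off can be sketched as follows. This is a minimal illustration, assuming the upper ends of the illustrative ranges in the tables above as worst cases; `Budget`, `TradeoffRecord`, and `budget_violations` are hypothetical names, and multi-agent is left out because its latency is listed as variable.

```python
from dataclasses import dataclass


@dataclass
class Budget:
    max_latency_s: float        # hard user-facing latency budget, in seconds
    max_cost_multiplier: float  # allowed cost relative to the 1x baseline


@dataclass
class TradeoffRecord:
    pattern_from_tree: str
    pattern_shipped: str
    option: str  # "change_constraints", "change_scope", or "accept_worse_fit"
    reason: str


# Worst-case figures: upper ends of the illustrative ranges (assumption).
WORST_CASE = {
    "sequential_workflow": {"latency_s": 5, "cost_x": 1},
    "single_agent_react": {"latency_s": 30, "cost_x": 10},
    "planning_react": {"latency_s": 90, "cost_x": 15},
}


def budget_violations(pattern: str, budget: Budget) -> list[str]:
    """List which budgets the pattern's worst case would exceed."""
    profile = WORST_CASE[pattern]
    violations = []
    if profile["latency_s"] > budget.max_latency_s:
        violations.append("latency")
    if profile["cost_x"] > budget.max_cost_multiplier:
        violations.append("cost")
    return violations


if __name__ == "__main__":
    budget = Budget(max_latency_s=30, max_cost_multiplier=20)
    if budget_violations("planning_react", budget):
        record = TradeoffRecord(
            pattern_from_tree="planning_react",
            pattern_shipped="single_agent_react",
            option="accept_worse_fit",
            reason="30s latency budget is hard; accept occasional ReAct loops",
        )
        print(record)
```

Keeping the record as data rather than in someone's memory is the point: it can be stored next to the design review and read back when the accepted failure modes show up.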

Concept 17 bottom line: cost and latency are architectural constraints, not afterthoughts. Each pattern has a characteristic cost and latency profile, and the multipliers compound when patterns compose. Multi-agent with reflection can cost 30-60× more than a sequential workflow for the same task volume (illustrative ratio). The decision tree addresses these implicitly through Q4 and Q5, but explicit budget constraints sometimes override the tree's answer; document the override and consciously accept the resulting failure modes.

Concept 18: Pattern composition, multiple patterns at different layers

This course has mostly treated patterns as though you pick one. Real systems often compose patterns at different layers: a planning agent at the top, ReAct + tools inside each plan stage, reflection on the final output. Decisions 3 and 5 have already shown this; Concept 18 names it as a first-class architectural move.

Three composition shapes worth recognizing:

Hierarchical composition. A higher-level pattern wraps lower-level patterns. Examples:

  • Planning agent (top) + ReAct + tools (inside each stage)
  • Multi-agent coordinator (top) + sequential workflows (inside specialists)
  • ReAct (top) + sequential workflow (as a tool the ReAct agent calls for deterministic work)

Sequential composition. Patterns run one after another, with the output of the first feeding the second. Examples:

  • Sequential workflow (extracts structured data) → ReAct agent (investigates the structured data)
  • ReAct agent (generates output) → reflection layer (critiques and refines)

Conditional composition. Different patterns handle different cases, and a router selects the pattern. Examples:

  • Route known-shape requests to a sequential workflow; route unknown-shape requests to ReAct
  • Apply reflection to high-stakes outputs; skip it for low-stakes outputs
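Conditional composition can be sketched as a small router. Everything here is a placeholder under stated assumptions: the request types in `KNOWN_SHAPES` are hypothetical, and the three handler functions stand in for a real workflow, agent loop, and reflection layer.

```python
# Hypothetical request types whose handling path is known in advance.
KNOWN_SHAPES = {"refund_request", "address_change"}


def run_sequential_workflow(request: dict) -> str:
    # Stand-in for a deterministic pipeline.
    return f"workflow handled {request['type']}"


def run_react_agent(request: dict) -> str:
    # Stand-in for an agent loop with tools.
    return f"agent investigated {request['type']}"


def reflect(output: str) -> str:
    # Stand-in for a critique-and-refine pass.
    return output + " (reviewed)"


def route(request: dict) -> str:
    """Pick the pattern per request, then apply reflection only when stakes are high."""
    if request["type"] in KNOWN_SHAPES:
        output = run_sequential_workflow(request)
    else:
        output = run_react_agent(request)
    if request.get("high_stakes", False):
        output = reflect(output)
    return output


if __name__ == "__main__":
    print(route({"type": "refund_request"}))
    print(route({"type": "unknown_issue", "high_stakes": True}))
```

The router itself is deterministic code. Whether it should be rules or a classifier is a separate decision this sketch does not take a position on.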

A pragmatic rule for composition: the pattern choice at each layer should be justified by the same five questions, applied at that layer's scope. The top-level pattern is chosen by walking the tree on the overall task. Each sub-component's pattern is chosen by walking the tree on that sub-component's work. Do not compose patterns because composition looks sophisticated; compose them because the task properties at each layer demand it.

The most common composition mistake: adding layers because layers feel like good engineering. A coding agent with multi-agent + planning + reflection on every output + a circuit breaker pattern wrapping everything looks rigorous; it is often unnecessary. Test the composition: remove the topmost layer. If the outputs do not degrade, the layer was not earning its cost.
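The remove-the-top-layer test can be sketched as a small ablation harness. This is a sketch under assumptions: `run_full`, `run_without_top_layer`, and `score` are callables you supply, and the `min_gain` threshold of 0.02 is an arbitrary placeholder, not a recommended value.

```python
from statistics import mean
from typing import Callable, Sequence


def layer_earns_its_cost(
    run_full: Callable[[str], str],
    run_without_top_layer: Callable[[str], str],
    score: Callable[[str, str], float],
    tasks: Sequence[str],
    min_gain: float = 0.02,
) -> bool:
    """True if removing the top layer lowers the mean score by at least min_gain."""
    full = mean(score(t, run_full(t)) for t in tasks)
    ablated = mean(score(t, run_without_top_layer(t)) for t in tasks)
    return (full - ablated) >= min_gain


if __name__ == "__main__":
    # Toy stand-ins: the "layer" upper-cases text, and the scorer rewards that.
    tasks = ["alpha", "beta", "gamma"]
    scorer = lambda task, out: 1.0 if out == task.upper() else 0.0
    print(layer_earns_its_cost(str.upper, lambda t: t, scorer, tasks))
    print(layer_earns_its_cost(str.upper, str.upper, scorer, tasks))
```

Run it on the same task set and scorer your eval suite already uses, so the comparison isolates the layer rather than the test data.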

Concept 18 bottom line: real systems compose patterns at different layers: hierarchical (one pattern wraps another), sequential (one pattern's output feeds another), conditional (different patterns for different cases). The pattern choice at each layer should be justified by walking the decision tree at that layer's scope. The most common composition mistake is adding layers because layered architectures look sophisticated; test by removing the topmost layer and checking whether quality degrades.


Part 7: Closing

Concept 19: Pattern selection is the connective tissue of the Agent Factory curriculum

This course is the bridge between what an agent is (the agent-building course, on agent loops and tools) and what it takes to ship one (the cloud deployment course on production deployment, the eval-driven course on operational evaluation).

Without pattern selection, the connective tissue is missing. You could build an agent and deploy it, but the design decision in between, what kind of agent for this task, was unprincipled. This course fills that gap.

The five questions look simple. Is the path known? Is the workflow stable? Is the structure articulable? Does quality matter more than speed? Is specialization the bottleneck? But they encode architectural distinctions the field has worked on for five years. Pattern catalogs (ReAct, planning, reflection, multi-agent) exist; what was missing was the decision logic for choosing among them. Bala Priya C's article fills that gap; this course extends it with the deployment and evaluation composition that Agent Factory students need.

Deployment composition is the contribution that sets this course apart. Few courses on agentic patterns teach what each pattern means for the cloud stack:

  • Sequential workflows skip the sandbox layer entirely
  • Single-agent ReAct uses the full stack
  • Planning + ReAct adds plan persistence and longer background workers
  • Reflection often introduces multi-provider model routing
  • Multi-agent demands per-specialist tracing, routing audit logs, and per-role cost attribution

These are not abstract concerns. They are the difference between a deployment that costs $130/month for a small workload and one that costs $400/month for the same workload because the pattern was over-elaborate. Pattern selection is a cost discipline as much as an architecture discipline.

Evaluation composition is the second contribution. Each pattern has characteristic failure modes that your eval suite catches in different ways:

  • Sequential workflows: step-level correctness via DeepEval
  • ReAct: reasoning traces via Phoenix
  • Planning + ReAct: plan-execution divergence as a custom metric
  • Reflection: pre/post comparison and rubber-stamp detection
  • Multi-agent: three separate scoreboards for specialist quality, routing, and integration

Without pattern-aware evaluation, the eval suite is generic and misses each pattern's specific failures. This course says, pattern by pattern, what to look for, so that your eval suite becomes pattern-aware.
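As one concrete instance, the reflection signals (pre/post comparison and rubber-stamp detection) can be sketched in plain Python. This is an assumption-laden illustration, not the DeepEval or Phoenix API: `ReflectionRun` is a hypothetical record, and the before/after scores are assumed to come from whatever rubric scorer you already have.

```python
from dataclasses import dataclass


@dataclass
class ReflectionRun:
    draft: str           # output before the critique pass
    final: str           # output after critique + refinement
    score_before: float  # rubric score of the draft (scorer assumed to exist)
    score_after: float   # rubric score of the final output


def rubber_stamp_rate(runs: list[ReflectionRun]) -> float:
    """Fraction of runs where the reflection layer left the draft unchanged."""
    if not runs:
        return 0.0
    unchanged = sum(1 for r in runs if r.draft.strip() == r.final.strip())
    return unchanged / len(runs)


def mean_improvement(runs: list[ReflectionRun]) -> float:
    """Average pre/post score delta; near zero means reflection is not paying off."""
    if not runs:
        return 0.0
    return sum(r.score_after - r.score_before for r in runs) / len(runs)


if __name__ == "__main__":
    runs = [
        ReflectionRun("draft A", "draft A", 0.5, 0.5),
        ReflectionRun("draft B", "better B", 0.5, 1.0),
    ]
    print(rubber_stamp_rate(runs), mean_improvement(runs))
```

A high rubber-stamp rate together with a flat mean improvement is the signature of a critic that approves everything. An unchanged output on its own is not a failure if the draft was already good.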

The closing thesis sentence of the Agent Factory track now reads a little differently. The agent-building course opened with "the agent loop is the engine of an AI-native company." The cloud deployment course closed with "the agent loop, deployed at production scale with the right architectural separation, observed across the right surfaces, and graded continuously against a living eval suite, is what an AI-native company actually runs on." This course adds the missing prefix: the right agent loop for the task is what an AI-native company runs on. Choosing the wrong shape, by overshooting or undershooting, produces systems that ship slowly, cost more, and break in more failure modes. Pattern selection is the first design decision; everything else is downstream of it.

What comes after this course. The cloud deployment course's closing named three frontiers: agent-to-agent commerce, identic-AI deployment specifics, and multi-region active-active. Those still stand as future courses. This course adds one more: pattern-specific testing harnesses. The eval suite is generic; a future course could build pattern-specific test generators (a "sequential workflow tester" that generates inputs covering workflow branches; a "multi-agent routing tester" that generates inputs probing the coordinator's routing logic). This is a real frontier, and it depends on this course's pattern taxonomy as a prerequisite.

Try it with AI, final exercise. Open your Claude Code or OpenCode session. Paste:

"I have just completed a course on agentic pattern selection. Pick a real task from my actual job, one I would build an agent for next quarter, not a toy example. Walk me through the five-question decision tree on it, get the answer to each question from me, and push back if my reasoning is weak. Then tell me which pattern you would recommend, which cloud deployment topology it would need, and which eval signals I should watch. Be specific about the task properties, not generic."

What you are learning. The decision tree only sticks when it is applied to your tasks, not to textbook examples. This exercise forces the discipline into one concrete decision you will actually make. Save the AI's response; revisit it when you start building the agent.

Summary: this course is the connective tissue between agent design (agent loops and tools) and agent deployment (the cloud deployment and eval courses). The five-question decision tree encodes architectural distinctions the literature has worked on for years; the composition layer maps each pattern to a specific deployment and evaluation discipline. Closing thesis: the right agent loop for the task is what an AI-native company runs on, and pattern selection is the first design decision, from which everything else flows downstream. The actionable artifact of this course is the one-page design-review template in the final section before the References: printable, team-shareable, and it walks the same five questions in ~15-20 minutes per architecture proposal.


Cheat sheet: all 22 Concepts and 5 Decisions, grouped by Part

Both friend reviews flagged the cheat sheet as dense; the grouping below maps each row to the Part it belongs to, so you can navigate by section instead of scrolling through 22 rows.

Part 1: The pattern-selection problem

| # | Concept | Key takeaway |
| --- | --- | --- |
| 1 | Pattern selection is design work that comes before the build | The patterns are well documented; the decision logic for choosing among them is not. A wrong choice compounds expensively in production. |
| 2 | Every pattern is a bet about the task | Sequential workflow bets on known paths; ReAct on unknown paths; planning on articulable structure; reflection on checkable criteria; multi-agent on real specialization needs. |
| 3 | Two failure modes, overshoot and undershoot | Overshoot (more elaborate than needed) is the famous mode; undershoot (simpler than needed) is equally common and subtler. |

Part 2: The five-question decision tree

| # | Concept | Key takeaway |
| --- | --- | --- |
| 4 | Q1: Can the solution path be defined in advance? | Known paths route to workflows; unknown paths route to agentic reasoning. Test with the "Python function without LLM calls" heuristic. |
| 5 | Q2: Is the workflow fixed and stable? | Stable paths route to a sequential workflow; known-but-variable paths route to either a branched workflow or agentic patterns. |
| 6 | Q3: Is the task structure articulable? | Articulable → planning + ReAct execution; not articulable → pure ReAct. Shape-vs-content distinction. The Q2/Q3 disambiguation sidebar walks the boundary cases. |
| 7 | Q4: Quality > speed AND checkable criteria? | Reflection adds value only when both conditions hold. Common failures: rubber-stamping, vague criteria, latency budget violations. |
| 8 | Q5: Specialization, context, or scale bottleneck? | The three claims are tested separately, against quantitative triggers where possible: >30% tool-routing errors (specialization), >10% accuracy drop at higher context (overflow), >2× latency budget overrun (scale). |
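The three Q5 triggers can be checked mechanically once you have the measurements. The sketch below is one reading of the thresholds and labels its assumptions: the accuracy drop is treated as an absolute difference between short-context and long-context accuracy, and ">2× latency budget overrun" is read as observed latency exceeding twice the budget. The function and parameter names are hypothetical.

```python
def q5_bottlenecks(
    tool_routing_error_rate: float,
    accuracy_short_context: float,
    accuracy_long_context: float,
    observed_latency_s: float,
    latency_budget_s: float,
) -> list[str]:
    """Return which of the three Q5 claims the measurements support."""
    found = []
    # Specialization: more than 30% of tool-routing decisions are wrong.
    if tool_routing_error_rate > 0.30:
        found.append("specialization")
    # Context overflow: accuracy falls by more than 10 points at higher context
    # (absolute drop is an assumption; the source does not say absolute or relative).
    if accuracy_short_context - accuracy_long_context > 0.10:
        found.append("context_overflow")
    # Scale: observed latency is more than twice the budget (assumed reading).
    if observed_latency_s > 2 * latency_budget_s:
        found.append("scale")
    return found


if __name__ == "__main__":
    print(q5_bottlenecks(0.35, 0.90, 0.85, 10, 8))
    print(q5_bottlenecks(0.10, 0.90, 0.70, 50, 20))
```

An empty list means Q5 is a "no" on the evidence available, which argues for keeping the single-agent pattern.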

Bridge concepts: from pattern selection to implementation

| # | Concept | Key takeaway |
| --- | --- | --- |
| 8.5 | SDK primitives: what each pattern uses | Agent is the atomic unit. Runner.run() runs the loop. @function_tool exposes tools. handoff() for specialist takeover; as_tool() for coordinator-in-charge. output_guardrail for reflection. Pattern selection is a choice about how to compose primitives. |
| 8.6 | Operational envelope of each pattern (Inngest as the concrete example) | Triggers wake the function (TriggerEvent, TriggerCron); step.run makes steps durable; step.wait_for_event implements HITL gates; concurrency/throttle/priority shape load; fan-out coordinates multi-agent specialists; replay handles bug-fix recovery. The more elaborate the pattern, the more critical the envelope. |

Part 3: The five patterns in depth

| # | Concept | Key takeaway |
| --- | --- | --- |
| 9 | Sequential workflow: pattern, deployment, evals, envelope | Uses the smallest subset of the cloud stack (no sandbox needed). Step-level evals, not agent-reasoning evals. The most direct map to Inngest functions. |
| 10 | Single agent + ReAct: pattern, deployment, evals, envelope | Full cloud stack with Bridge Worker. Phoenix trace evals are load-bearing. One step.run for the whole agent loop. |
| 11 | Planning + ReAct execution: pattern, deployment, evals, envelope | Adds plan persistence; longer background workers. Plan-execution divergence is the key eval signal. One step.run per stage. |
| 12 | Single agent + reflection (additive layer): pattern, deployment, evals, envelope | Layers on top of any core pattern. Often introduces multi-provider model routing. Rubber-stamping is the most insidious failure. SDK output_guardrail or a separate generator/critic. |
| 13 | Multi-agent specialist system: pattern, deployment, evals, envelope | Full stack plus per-specialist tracing. Three separate scoreboards required. Uses every Inngest primitive (fan-out, per-tenant concurrency, priority, HITL). Coordination overhead is real. |

Part 4: Failure signals and revision

| # | Concept | Key takeaway |
| --- | --- | --- |
| 14 | Five failure signals | ReAct loops (missing structure), plan-execution divergence (overstructured), reflection no-improve (vague criteria), multi-agent routing fail (overpartitioned), complex-but-not-better (cumulative overshoot). |
| 15 | Fix at the smallest scope first | Prompt-level fixes (stop conditions, criteria specs) before architectural changes, then contract-level fixes (tool descriptions, handoff structures). |
| 16 | When the decision tree is wrong | Task properties change post-deploy, different sub-tasks need different patterns, or constraints exclude the tree's answer. Walk the tree again. |
| 16.5 | Anti-pattern gallery, common wrong choices | Five overshoot anti-patterns + three undershoot. Multi-agent for content (→ single agent); ReAct for invoices (→ workflow); planner for debugging (→ ReAct); reflection on vague criteria (→ remove); one giant agent (→ multi-agent); skipping reflection on checkable output (→ add). |

Part 5: Decision lab (five Decisions, in a separate table below)

Part 6: Honest frontiers

| # | Concept | Key takeaway |
| --- | --- | --- |
| 17 | Cost and latency are architectural constraints | Multi-agent + reflection can cost 30-60× a sequential workflow (illustrative ratio). Document constraint-driven pattern choices explicitly. |
| 18 | Pattern composition at different layers | Hierarchical, sequential, conditional. The pattern choice at each layer is justified by the same five questions at that scope. |

Part 7: Closing

| # | Concept | Key takeaway |
| --- | --- | --- |
| 19 | Pattern selection is the connective tissue | The bridge between agent design (agent loops and tools) and deployment (the cloud deployment course). The right agent loop for the task is what runs an AI-native company. |

The five Decisions (Part 5)

| # | Decision | Core pattern + additive layers |
| --- | --- | --- |
| 1 | Maya's Tier-1 Support agent | Core: Single agent + ReAct + tools (Concept 10). No additive layers. |
| 2 | Incident response agent | Core: Planning + ReAct execution (Concept 11). + Reflection layer on remediation steps (Concept 12). |
| 3 | Market research agent | Core: Multi-agent specialist system (Concept 13), with planning + ReAct inside the specialists. + Reflection layer on synthesis. |
| 4 | Enterprise onboarding agent | Core: Sequential workflow (Concept 9). No additive layers. The negative example for agentic patterns. |
| 5 | Coding agent | Core: Multi-agent specialist system (Concept 13), with planning + ReAct inside the specialists. + Reflection layer on coder output. Advanced case: every architectural decision composed. |

Quick reference: five questions, five patterns

Q1: Can the solution path be defined in advance?
Yes → Q2
No → Q3 (need agentic reasoning)

Q2: Is the workflow fixed and stable across runs?
Yes → SEQUENTIAL WORKFLOW
No → Q3 (or branched workflow if few stable variants)

Q3: Is the task structure articulable before execution?
Yes → PLANNING + REACT EXECUTION
No → SINGLE AGENT + REACT + TOOLS

Q4: Quality > speed AND criteria are checkable?
Yes → Add REFLECTION on top of the chosen pattern
No → Skip reflection

Q5: Specialization, context, or scale bottleneck?
Yes → MULTI-AGENT SPECIALIST SYSTEM
No → Keep single-agent pattern
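The quick reference above can be written down as a small function. This is a minimal sketch, not the anchor article's implementation: `TaskAnswers` and `choose_architecture` are hypothetical names, the answers are assumed to come from a human design review, and the "branched workflow" variant of Q2 is left out for brevity. The output mirrors the three fields of the template's FINAL ARCHITECTURE section.

```python
from dataclasses import dataclass


@dataclass
class TaskAnswers:
    path_known: bool                        # Q1
    workflow_stable: bool                   # Q2 (only consulted when Q1 is yes)
    structure_articulable: bool             # Q3
    quality_over_speed_and_checkable: bool  # Q4
    bottleneck: bool                        # Q5: specialization, context, or scale


def choose_architecture(a: TaskAnswers) -> dict:
    """Walk Q1-Q3 for the core pattern, then Q4-Q5 for the additive layers."""
    if a.path_known and a.workflow_stable:
        core = "sequential workflow"
    elif a.structure_articulable:
        core = "planning + ReAct execution"
    else:
        core = "single agent + ReAct + tools"
    return {
        "core": core,
        "reflection": a.quality_over_speed_and_checkable,
        "multi_agent": a.bottleneck,
    }


if __name__ == "__main__":
    onboarding = TaskAnswers(True, True, False, False, False)
    incident = TaskAnswers(False, False, True, True, False)
    print(choose_architecture(onboarding))
    print(choose_architecture(incident))
```

The function is trivial on purpose: the hard part is the evidence behind each boolean, which is what the design-review template asks reviewers to write down.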

Design-review template (one-page, printable)

*A team-shareable worksheet for applying this course's framework in design reviews. Print one for each architecture proposal. The template walks the same five questions and surfaces the same compositional decisions; the value is not in filling it out solo but in keeping the questions visible during the discussion.*

═══════════════════════════════════════════════════════════════════════
COURSE ELEVEN: Agentic Architecture Design Review
═══════════════════════════════════════════════════════════════════════

Task name: _______________________________________________________

Task description (1-3 sentences):
________________________________________________________________
________________________________________________________________
________________________________________________________________

Reviewer(s): __________________________ Date: ____________________

───────────────────────────────────────────────────────────────────────
CORE PATTERN (Q1-Q3)
───────────────────────────────────────────────────────────────────────

Q1. Can the solution path be defined in advance?
[ ] YES, known → go to Q2
[ ] NO, adaptive → skip to Q3
Evidence:
______________________________________________________________

Q2. Is the workflow fixed and stable across runs?
[ ] YES, stable → CORE = Sequential Workflow → skip to Q4
[ ] NO, variable → continue to Q3
Evidence:
______________________________________________________________

Q3. Is the task's high-level structure articulable before execution?
[ ] YES, articulable → CORE = Planning + ReAct execution
[ ] NO, emergent → CORE = Single Agent + ReAct + tools
Evidence:
______________________________________________________________

→ CORE PATTERN CHOSEN: ________________________________________

───────────────────────────────────────────────────────────────────────
ADDITIVE LAYERS (Q4-Q5)
───────────────────────────────────────────────────────────────────────

Q4. Quality > speed AND criteria are checkable?
[ ] YES: both → ADD Reflection layer
[ ] NO: vague criteria → DO NOT add reflection
[ ] NO: latency budget → DO NOT add reflection (consider human review)
Checkable criteria (if YES):
______________________________________________________________
______________________________________________________________

Q5. Specialization, context, or scale bottleneck?
[ ] YES: specialization (name it): _______________________________
[ ] YES: context overflow (describe): ____________________________
[ ] YES: parallelizable scale (quantify): ________________________
[ ] NO: keep single agent

→ If Q5 is YES → upgrade CORE to: Multi-Agent Specialist System
Specialist roles: ____________________________________________

───────────────────────────────────────────────────────────────────────
FINAL ARCHITECTURE
───────────────────────────────────────────────────────────────────────

Core pattern: ________________________________________________
+ Reflection (Y/N): ________________________________________________
+ Multi-agent (Y/N): ________________________________________________

───────────────────────────────────────────────────────────────────────
IMPLEMENTATION & DEPLOYMENT
───────────────────────────────────────────────────────────────────────

SDK primitives used (Concept 8.5):
[ ] Agent (with output_type if structured)
[ ] Runner.run(agent, input, max_turns=__)
[ ] @function_tool decorators on N tools (N = __)
[ ] handoff() between agents
[ ] Agent.as_tool() for coordinator composition
[ ] output_guardrail (if reflection layer)

Operational envelope primitives (Concept 8.6, if applicable):
[ ] Trigger: ___________________________________________________
[ ] step.run per: _____________________________________________
[ ] step.wait_for_event for: __________________________________
[ ] Concurrency cap: ______ per ______________________________
[ ] Fan-out for: ______________________________________________
[ ] Priority/fairness rule: ___________________________________

Cloud deployment subset needed (Concept 9-13 sidebars):
[ ] FastAPI on ACA (always)
[ ] Neon Postgres
[ ] R2 (if files in/out)
[ ] Sandbox + Bridge Worker (if agent runs code)
[ ] Phoenix (if agentic: any pattern except pure sequential workflow)

───────────────────────────────────────────────────────────────────────
RISK ANALYSIS
───────────────────────────────────────────────────────────────────────

Cost class (Concept 17):
[ ] 1× baseline (Sequential workflow)
[ ] 3-10× (Single agent + ReAct)
[ ] 5-15× (Planning + ReAct)
[ ] +2-3× core (with Reflection)
[ ] 5-20× (Multi-agent)

Latency budget check:
Expected latency: ___________________________________________
User-facing budget: _________________________________________
[ ] Fits [ ] Tight [ ] Will not fit

Most likely failure signal to watch (Concept 14):
[ ] ReAct loops / revisits solved work
[ ] Plan-execution divergence
[ ] Reflection not improving output
[ ] Multi-agent routing failures
[ ] System feels complex but not better
Mitigation if it appears:
______________________________________________________________

Eval signals to wire (Concept 9-13 sidebars):
______________________________________________________________
______________________________________________________________

───────────────────────────────────────────────────────────────────────
ANTI-PATTERN CHECK (Concept 16.5)
───────────────────────────────────────────────────────────────────────

If a senior engineer reviewed this choice, what would they object to?
______________________________________________________________
______________________________________________________________

Counter-argument (why our choice is right despite the objection):
______________________________________________________________
______________________________________________________________

───────────────────────────────────────────────────────────────────────
SIGN-OFF
───────────────────────────────────────────────────────────────────────

Architecture approved for: [ ] Prototype [ ] Pilot [ ] Production
Approved by: ______________________________________________________
Re-review date: ______________________________________________________

═══════════════════════════════════════════════════════════════════════

The template is deliberately walkable in 15-20 minutes per architecture proposal. Filling it out is the discipline; the value is in keeping the questions visible during the team conversation. Print one for each major architecture decision; keep the filled-out versions in your team's design-decision archive.

References

  • Bala Priya C, "Choosing the Right Agentic Design Pattern: A Decision-Tree Approach," Machine Learning Mastery, May 15, 2026, machinelearningmastery.com/choosing-the-right-agentic-design-pattern-a-decision-tree-approach. The decision tree at the spine of this course is theirs.
  • Yao et al., "ReAct: Synergizing Reasoning and Acting in Language Models" (2022), the original ReAct paper.
  • Wang et al., "Voyager: An Open-Ended Embodied Agent with Large Language Models" (2023), an early example of planning + execution composition.
  • Shinn et al., "Reflexion: Language Agents with Verbal Reinforcement Learning" (2023), a formalization of the reflection pattern.
  • OpenAI, "The next evolution of the Agents SDK" (April 2026), the SDK update (model-native harness plus native sandbox execution) that makes the patterns shippable.
  • Agent-building course (Panaversity Agent Factory): agent loops and the engine of an AI-native company.
  • Eval-driven course (Panaversity Agent Factory): eval-driven development and trace-to-eval discipline.
  • Cloud deployment course (Panaversity Agent Factory): deploying the OpenAI Agents SDK harness in the cloud.

A pattern-selection crash course for the Agent Factory track: five questions, five patterns, failure signals, and composition with your deployment, eval suite, and operational envelope (Inngest). Anchor article: Bala Priya C, Machine Learning Mastery, May 15, 2026. It closes the pattern-selection gap between agent design (agent loops and tools) and the production discipline of the deployment and eval courses, is composed with the operational envelope throughout, and is portable to any agentic stack via the translation table.

Study aid: flashcards