From Business Problem to Tech Spec

A field lab that teaches the actual skill of going from a vague founder DM ("approvals are killing us") to a buildable tech spec — with a real tools catalog, real tradeoffs, and an ADR for the biggest decision.

The lab focuses on pattern recognition: given a problem, which architectural shape, which patterns, which tools, and what skills do you need? It's the senior-engineer judgment call, not "fill out this template."

What's here

source/
  guide.md                                  Source guide (markdown)
  Problem_to_Tech_Spec_Teaching_Guide.docx  Generated from guide.md
app/
  index.html        Single-page lab (no backend, no build step)
  styles.css        All styling
  content.js        8 stages, tech catalog, Stride case, per-stage evaluators
  app.js            Renderer, evaluator-driven feedback loop, modes, persistence

Run

cd app
python3 -m http.server 5187
open http://127.0.0.1:5187

No npm, no build, no dependencies. Just a static page.

Two modes

Lab mode (coached). After each submission, an evaluator runs and produces hints if your answer isn't yet at the maturity bar — you retry until it is. Then the model answer + sensitivity scenarios reveal.

Assessment mode (graded). No hints. Submit each stage once and move on. At the end you get a graded report against the rubric.

Both modes share the same 8 stages.

The 8 stages

#	Stage	What you produce	New skill taught
0	Receive the brief	5 clarifying questions	Resist designing — ask outcome/users/non-goals first
1	Problem brief	Measurable problem statement	Translate "feels slow" into numbers
2	Recognize the shape	Workflow / RAG / CRUD / search / realtime / batch / analytics / agent	Pattern recognition (senior-engineer skill)
3	Constraints → architectural responses	Map 8 constraints to 8 patterns	Constraint-driven design
4	Pick the stack, layer by layer	4 layers × real tools (Trigger.dev / Postgres / Slack+email / Sentry) with rationale citing requirements	Real-world tool selection
5	Skills gap + spikes	Team skills inventory; 2 time-boxed spikes for highest-risk gaps	Honest self-assessment, de-risking
6	NFRs as measurable scenarios	Stimulus / environment / response / measure	Load-testable contracts
7	Tech spec + ADR	TL;DR + 5-field ADR for biggest decision	Capturing decisions for future-you

Two playable cases — same 8-stage ladder

Stride (workflow shape) — 4 stack layers, 13 tools

Series-A B2B expense-management startup. The CEO DMs: "Approvals are killing us..."

You walk it up to: shape (workflow), constraints (audit / scale / cost / HITL), stack (Trigger.dev + Postgres events + Slack+email + Sentry), spikes (2 days for HITL signal, 1 day for delivery SLO), NFR (30s p95 delivery at 100/10min), TL;DR + ADR for the workflow-engine choice.

Ember (RAG shape) — 6 stack layers, 23 tools

Series-Seed AI customer-support startup. The CEO DMs: "Customers complain that the bot gives wrong pricing answers..."

You walk the same ladder, but the answers are entirely different: shape (RAG), constraints (hallucination / freshness / latency / cost / citations / PII), stack (pgvector + OpenAI 3-large + Claude Sonnet + structural+hybrid+rerank chunking + LiteLLM + Langfuse), spikes (3 days for chunking lift, 2 days for nightly eval pipeline), NFR (≥98% on golden set + <3s p95 + $0.05/answer), TL;DR + ADR for the chunking/retrieval choice.

The AI-specific coaching catches anti-patterns that bite real teams: jumping to "let's swap LLMs" before diagnosing the failure mode (80% of RAG failures are retrieval, not generation), shipping naive 500-token chunks, skipping evals, picking Voyage embeddings for general-domain content where OpenAI 3-large is cheaper at parity.

Tools catalog (real 2026 options with fit / fail / cost / risk)

Workflow shape (Stride): Trigger.dev · Inngest · Temporal · DIY Postgres+cron; Postgres-events · DynamoDB+Streams · Kafka; email-only · Slack+email pluggable · all-channels; logs-only · Sentry+Grafana/Datadog · Honeycomb/OTel.

RAG shape (Ember):

Retrieval: pgvector · Pinecone · Qdrant · Weaviate
Embedding: OpenAI text-embedding-3-large · Voyage voyage-3-large · Google text-embedding-005 · BGE-M3 self-hosted
LLM: Claude Sonnet 4.6 · GPT-5.x · tiered routing · open-weight via vLLM
Chunking: naive 500-token · structural · structural+hybrid+rerank
Gateway: direct SDKs · LiteLLM · Portkey · OpenRouter
Evals: eyeball · Langfuse · Braintrust · Arize Phoenix

Each option carries "fits when / fails when / cost / risk" — so picking is a thinking exercise, not a memory test.

Feedback design

The lab puts the onus on the learner. It scaffolds the structure of an answer (fields, dropdowns, drag-and-drop) but you bring the judgment. Empty text boxes are drop-off traps; per-layer pickers and bucketed prompts keep you moving while making you think.

On submit, the evaluator returns hints at two severity levels:

MUST — blocks pass. You retry.
TIP — soft suggestion. You can still pass but you'll know what to improve.

The reveal panels (Lab mode only) show three things:

Model answer vs. your final attempt
Sensitivity — what would change my picks if a constraint changed
Wrong-but-common alternatives with the failure mode named

Persistence is in localStorage; reset wipes the session.

Name		Name	Last commit message	Last commit date
Latest commit History 9 Commits
.claude		.claude
app		app
feedback		feedback
labs		labs
scripts		scripts
source		source
web		web
.gitignore		.gitignore
CLAUDE.md		CLAUDE.md
README.md		README.md
STARTING_DOC.md		STARTING_DOC.md
package-lock.json		package-lock.json
package.json		package.json

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

From Business Problem to Tech Spec

What's here

Run

Two modes

The 8 stages

Two playable cases — same 8-stage ladder

Stride (workflow shape) — 4 stack layers, 13 tools

Ember (RAG shape) — 6 stack layers, 23 tools

Tools catalog (real 2026 options with fit / fail / cost / risk)

Feedback design

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

From Business Problem to Tech Spec

What's here

Run

Two modes

The 8 stages

Two playable cases — same 8-stage ladder

Stride (workflow shape) — 4 stack layers, 13 tools

Ember (RAG shape) — 6 stack layers, 23 tools

Tools catalog (real 2026 options with fit / fail / cost / risk)

Feedback design

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages