CS Ventures
Product Team Loop
An autonomous product team, built from agents

It finds its own work, ranks it, and never stops shipping.

Point it at a feature. A team of agents discovers what to build, throws out the slop, scores what survives, and ships sprints of fixes and enhancements. When the backlog runs low it discovers more, and keeps going. A human orchestrates and never writes a line of code.

Continuous discovery, not a one-shot idea list
Evidence-anchored intake filter
Fixes + enhancements every sprint
Loops until the work is done
What makes it different

One loop that never stops

Most agent tools generate a list of ideas and stop. This runs the whole team on a cycle: discover, prioritize, ship a sprint of fixes and enhancements, then refill the backlog and go again.

Loop diagram. Legend: stage hand-off, Replenish (closes the loop), work in flight.

Watch one full turn of the loop

See the team take "Run the product team on Altro's rankings screen" from discovery and the slop filter through a shipped sprint, then loop back for more.

Why the output isn't slop

Two jobs, kept strictly apart

The default failure of agentic discovery is confident, well-formatted nonsense. The fix is to make discovery a strict intake gate and the build a self-checking pipeline. Every move is either a human orchestrating or an agent doing, and the two never blur.

The orchestrator

Routes, never writes code

A human scrum-master slices work, writes tight briefs, scores, accepts, ships, and keeps the backlog. Not one line of feature code. The orchestrator stays lean by passing pointers, such as file paths and symbol names, instead of pasting whole files into its own head.

The agents

Do the work, return receipts

Lens agents find work, the prioritizer scores it, the product agent writes stories, and builders ship them. Each returns a terse manifest, never a code dump, so the orchestrator stays sharp across dozens of hand-offs.

Evidence or it's dropped

Every candidate must cite a file:line, a reproducible flow, or a spec it violates. No anchor, no intake. "This could be nicer" never makes it in.
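The gate above can be sketched as a pure function. This is a minimal illustration, not the team's actual filter: the `Candidate` shape, the two anchor patterns, and title-based duplicate matching are all assumptions made for the example, and the brand-gate check is left out.

```typescript
// Hypothetical candidate shape; field names are illustrative, not from the source.
type Candidate = {
  title: string;
  kind: "bug" | "enh";
  anchor?: string;  // e.g. "useRankings.ts:88" or "design/altro.md §4"
  repro?: string[]; // steps of a reproducible flow, if any
};

type Verdict = { title: string; kept: boolean; reason: string };

// Assumed anchor formats: a file:line pair, or a spec section reference.
const FILE_LINE = /^[\w./-]+\.\w+:\d+$/;
const SPEC_SECTION = /^[\w./-]+\.md §\d+$/;

function hasEvidence(c: Candidate): boolean {
  const anchored =
    c.anchor !== undefined && (FILE_LINE.test(c.anchor) || SPEC_SECTION.test(c.anchor));
  const reproducible = c.repro !== undefined && c.repro.length > 0;
  return anchored || reproducible;
}

// Every candidate gets an explicit verdict, so nothing is dropped silently.
function intakeFilter(candidates: Candidate[], backlogTitles: string[]): Verdict[] {
  return candidates.map((c): Verdict => {
    if (!hasEvidence(c)) {
      return { title: c.title, kept: false, reason: "no evidence" };
    }
    // Simplification: duplicates are matched by exact title only.
    if (backlogTitles.indexOf(c.title) !== -1) {
      return { title: c.title, kept: false, reason: "duplicate" };
    }
    return { title: c.title, kept: true, reason: "kept" };
  });
}
```

A candidate such as "This could be nicer" carries no anchor and no repro steps, so it comes back with `kept: false` and the reason `no evidence`.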

Reviewer is not the Reviser

One agent audits a diff for runtime correctness; a different agent fixes what it flagged. A critic grading its own rework drifts into self-justification.

Agents never self-score

Discovery can't set its own impact or priority. One deterministic pass scores the whole filtered union against an anchored rubric, so two runs rank the same.

The roster

A full product team, dispatched as agents

Each role has a mode and a single job. Read-only roles always run in parallel; anything that writes the same file is sequenced, never raced.

Role                  Mode          Job
Orchestrator          scrum-master  Lifecycle, briefs, scoring, accept, ship, bookkeeping. Writes no feature code.
Discovery × 5 lenses  read-only     Each owns one lens, sweeps every surface, surfaces candidates with evidence.
Prioritization        write         Dedupe, score, rank, assign stable ids, write the backlog.
Product               read-only     Turn top items into dev-ready stories that meet a Definition of Ready.
Builder               write         Implement one story over one disjoint file set.
Diff Reviewer         read-only     Audit the diff for runtime correctness, rank findings P1 / P2 / P3.
Reviser               write         Fix P1 / P2, usually the original Builder resumed. Max two rounds.
Inside discovery

Five lenses, each hunting one kind of problem

Run in parallel, every lens sweeps the whole feature, but only through its own question. When several lenses converge on one gap, that is the highest-confidence signal discovery produces.

Flow

Dead ends, missing CTAs, multi-tap core jobs, states with no way out.

Runtime

Stale state and closures, wrong-state conditionals, broken CRUD wiring, off-by-one, null/NaN, empty/loading/error paths.

Coverage

Specced-or-listed but not actually built or usable. Real capability gaps against the design.

Consistency

House-style and pattern drift across surfaces, plus cross-document conflicts.

Gates

Brand and token violations, accessibility (labels and hit targets), AI-tells, and other hard-gate failures.

Signal vs. noise

Highest signal: human device reports and runtime-correctness audits. Lowest: generic best-practice suggestions, which the filter rejects.

Phase 2 · Prioritize

One number, computed the same way every time

An anchored rubric turns each item into impact, fit, and effort points, then a single deterministic pass ranks the whole union. Fit is persisted, ids are never reused, and ties break by impact, then effort, then id. Every input ends as scored, merged, rejected, or deferred, with no silent drops.

priority score = (impact × fit) / effort
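A minimal sketch of that formula and its tie-breaks. The source gives the formula but not the rubric's point values, so the 1 / 2 / 3 mapping below is an assumption, as are the tie-break directions (higher impact first, lower effort first, then id ascending) and the `BacklogItem` shape.

```typescript
type Level = "L" | "M" | "H";
type BacklogItem = { id: string; impact: Level; fit: Level; effort: Level };

// Assumed point anchors; the real rubric's values are not given in the source.
const POINTS: Record<Level, number> = { L: 1, M: 2, H: 3 };

function priorityScore(item: BacklogItem): number {
  return (POINTS[item.impact] * POINTS[item.fit]) / POINTS[item.effort];
}

// Pure function of its input: no clock, no randomness, no model call,
// so the same backlog always produces the same order.
function rankBacklog(items: BacklogItem[]): BacklogItem[] {
  return items.slice().sort((a, b) => {
    const byScore = priorityScore(b) - priorityScore(a);
    if (byScore !== 0) return byScore;
    const byImpact = POINTS[b.impact] - POINTS[a.impact]; // higher impact first
    if (byImpact !== 0) return byImpact;
    const byEffort = POINTS[a.effort] - POINTS[b.effort]; // lower effort first
    if (byEffort !== 0) return byEffort;
    return a.id < b.id ? -1 : a.id > b.id ? 1 : 0; // ids are unique, so order is total
  });
}

// Levels taken from the worked example's backlog; scores depend on the assumed points.
const demoBacklog: BacklogItem[] = [
  { id: "ENH-034", impact: "L", fit: "L", effort: "L" },
  { id: "ENH-033", impact: "H", fit: "M", effort: "H" },
  { id: "BUG-014", impact: "H", fit: "H", effort: "L" },
  { id: "ENH-032", impact: "M", fit: "M", effort: "L" },
  { id: "ENH-031", impact: "H", fit: "H", effort: "M" },
];

console.log(rankBacklog(demoBacklog).map((i) => i.id + " " + priorityScore(i)).join("\n"));
```

Because the final tie-break is the stable id, input order never affects the result, which is the property "two runs rank the same" relies on.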
Phase 4 · Sprint

Fixes and enhancements, shipped together

Each sprint pulls 3 to 5 disjoint-file stories at an 85% enhancement / 15% bug mix. The Builder builds, a fresh Reviewer audits runtime correctness, and a Reviser clears P1/P2 findings in at most two rounds. Any P1, regression, or device-reported bug drains first. Partial sprint? Ship only the files that passed.

85 / 15 mix · disjoint files only · max 2 reviser rounds
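One way to picture the pull: drain-first items go to the front, then stories are taken in rank order as long as their files do not collide with anything already picked. This is a sketch under assumptions. The `Story` shape and `pickSprint` are hypothetical, the input is assumed to be already ranked, and the 85 / 15 mix is not modeled.

```typescript
// Hypothetical story shape for illustration.
type Story = { id: string; files: string[]; drainFirst: boolean };

function overlaps(a: string[], b: string[]): boolean {
  return a.some((f) => b.indexOf(f) !== -1);
}

// `ranked` is assumed to be sorted by priority score already.
function pickSprint(ranked: Story[], maxStories: number): Story[] {
  // P1s, regressions, and device-reported bugs jump the queue.
  const ordered = ranked
    .filter((s) => s.drainFirst)
    .concat(ranked.filter((s) => !s.drainFirst));

  const picked: Story[] = [];
  let claimed: string[] = [];
  for (const story of ordered) {
    if (picked.length >= maxStories) break;
    // A shared file would make two writers race; leave the story for a later sprint.
    if (overlaps(story.files, claimed)) continue;
    picked.push(story);
    claimed = claimed.concat(story.files);
  }
  return picked;
}
```

Skipping a colliding story rather than queueing it behind the first writer keeps every picked story safe to build in parallel.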
Safety rails

It runs on its own, but never off a cliff

Humans keep the keys

Native builds: routed to a human gate, never auto-shipped
SQL & migrations: blocked, queued for human review
Can't-see-in-source: renders & runtime go to a device report
Each ship: standing authorization, or pause for approval, your call

When something breaks

Regression: a shipped item that breaks reopens, exempt from do-not-propose
Bad ship: revert the deploy first, then fix forward
P1 crash: preempts the running sprint as a solo hotfix, then resumes
Empty sweep: discovery returns nothing valid? pause and notify, never spin
Live demo · worked example

One full turn of the loop

Press play and watch the team take a single instruction through discovery, the slop filter, scoring, stories, a build-review-revise sprint, the ship gate, and the loop back for more. Step through it at your own pace.

It starts with one plain instruction.

You give it one feature and a goal. The team handles discovery, prioritization, stories, and the build, sprint after sprint, reporting at each ship instead of asking what to do next.

Surface: Altro rankings screen
Mix: 85% enhancement / 15% bug
Human input: one instruction

5 lenses sweep in parallel

Flow: dead ends, missing CTAs (2 found)
Runtime: stale state, off-by-one (2 found)
Coverage: specced, not built (1 found)
Consistency: pattern drift (1 found)
Gates: a11y, brand, AI-tells (1 found)

Candidates, each with an anchor

bug: Drag-reorder writes a stale rank index
  useRankings.ts:88 · runtime · reproducible on fast moves

enh: Ranked list has no empty state or CTA
  RankingsScreen.tsx:142 · flow · new users see a blank screen

enh: Reorder handles lack a11y labels, hit target < 44px
  RankRow.tsx:31 · gates · fails the a11y hard gate

enh: Compare-trips view specced but never wired
  design/altro.md §4 · coverage · capability gap vs design

Read-only agents, in parallel. Each finding names a defect or a missing capability and points at exactly where it lives.

The intake filter rules on every candidate

bug: Drag-reorder writes a stale rank index
  useRankings.ts:88 · kept

enh: Ranked list has no empty state or CTA
  RankingsScreen.tsx:142 · kept

enh: Reorder handles lack a11y labels
  RankRow.tsx:31 · kept

enh: Make the rankings screen pop more
  no anchor · rejected (no evidence)

enh: Add a gradient header banner
  cosmetic · rejected (fails brand gate)

bug: Reindex gap on city delete
  already ENH-009 · rejected (duplicate)

Three rejected: no evidence anchor, cosmetic-only, and a duplicate of an existing backlog row. This gate is the whole difference between a product team and an idea generator.

BACKLOG.md · scored & ranked · (impact × fit) / effort

id       title                       impact  fit   effort  score
BUG-014  Stale rank index (P1)       H       High  L       0.0
ENH-031  Empty state + CTA           H       High  M       0.0
ENH-032  a11y labels + hit target    M       Med   L       0.0
ENH-033  Wire compare-trips          H       Med   H       0.0
ENH-034  Rank chips to brand tokens  L       Low   L       0.0

One deterministic pass. The P1 bug drains first, overriding the ratio; the rest sort by score. Two runs of this rubric produce the same ranking.

ENH-031 → Definition of Ready

User value: New users land on a screen that tells them what to do
Acceptance: Empty list shows headline, blurb, and an "Add a city" CTA
Owns files: RankingsEmpty.tsx (new), RankingsScreen.tsx
Constraints: brand tokens, a11y labels, no AI-tells

Contract symbols verified to exist

$ rg -n "useRankings" RankingsScreen.tsx
✓ 142: const { items } = useRankings()
$ rg -n "AddCityButton" components/
✓ Cta.tsx:8: export function AddCityButton

A story is ready only when its files are disjoint and every symbol it leans on is confirmed real. Missing symbol → the story is blocked, not built.
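The readiness rule can be sketched as a check that returns blockers instead of a bare yes or no. Everything named here is hypothetical: the `ReadyStory` shape, `checkReady`, and the injected `symbolExists` lookup, which in practice might wrap a source search like the `rg -n` calls above.

```typescript
// Hypothetical shapes for illustration.
type ReadyStory = { id: string; ownsFiles: string[]; needsSymbols: string[] };
type Readiness = { id: string; ready: boolean; blockers: string[] };

function checkReady(
  story: ReadyStory,
  claimedFiles: string[],                  // files owned by other stories in the sprint
  symbolExists: (symbol: string) => boolean // injected so the check stays pure and testable
): Readiness {
  const blockers: string[] = [];
  for (const file of story.ownsFiles) {
    if (claimedFiles.indexOf(file) !== -1) {
      blockers.push("file already claimed: " + file);
    }
  }
  for (const symbol of story.needsSymbols) {
    if (!symbolExists(symbol)) {
      blockers.push("missing symbol: " + symbol);
    }
  }
  // Any blocker means the story is blocked, not built.
  return { id: story.id, ready: blockers.length === 0, blockers };
}
```

Returning the list of blockers gives the orchestrator a terse receipt it can route on, without reading any source itself.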

Builder: implements each story over its own files
Diff Reviewer: audits runtime correctness, flags P1/P2
Reviser: fixes only what was flagged

BUG-014 · Stale rank index · useRankings.ts · P1 · passed
ENH-031 · Empty state + CTA · RankingsEmpty.tsx · P2 · passed
ENH-032 · a11y labels + hit target · RankRow.tsx · passed

Three stories, three disjoint file sets, so they build in parallel instead of in a line. The Reviewer flags a P1 and a P2; the Reviser clears both.

✓ Shipped
Definition of Done met · ship gate

Ship the sprint

Passed: hard gates clean, typecheck green, P1/P2 cleared
Commit: only the 3 story file sets, no lockfiles or migrations
Held back: a SQL migration stays blocked for your review
Standing authorization ships it, or pause for approval at every sprint. Your call.

Backlog drains as the sprint ships

Ready: 5
Low-water: 3
Shipped 3, ready 2: below low-water
Discover → Prioritize → Stories → Sprint

Ready drops below the low-water mark, so the loop re-runs discovery to refill it, rotating one lens to control cost, and keeps shipping. It stops only when the completion condition is met or the human says so.
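The replenish decision might look like the sketch below. It reads "rotating one lens" as running a single lens per refill in round-robin order, which is an interpretation and not something the page confirms. `nextStep` and `LoopStep` are hypothetical names.

```typescript
// The five lens names come from the page; the round-robin rotation is an assumed reading.
const LENSES = ["Flow", "Runtime", "Coverage", "Consistency", "Gates"];

type LoopStep =
  | { action: "stop" }
  | { action: "sprint" }
  | { action: "discover"; lens: string };

function nextStep(
  readyCount: number,  // stories currently ready to build
  lowWater: number,    // refill threshold
  sweepsSoFar: number, // how many refill sweeps have run, used to rotate the lens
  done: boolean        // completion condition met, or the human said stop
): LoopStep {
  if (done) return { action: "stop" };
  if (readyCount < lowWater) {
    return { action: "discover", lens: LENSES[sweepsSoFar % LENSES.length] };
  }
  return { action: "sprint" };
}

// The worked example: 5 ready, 3 shipped, 2 left, low-water mark of 3.
console.log(nextStep(5 - 3, 3, 0, false));
```

With 2 ready against a low-water mark of 3, the call returns a `discover` step; at 3 or more it returns `sprint`.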
