Concept

The build cycle

Five stages per feature: Define, Build, Review, Test, Finalise. Each stage commits. Each commit is honest. The discipline is what makes shipping with AI agents survivable.

The document chain tells you what to build. The build cycle tells you how to build it, slice by slice, in a way that doesn’t collapse under contact with reality or AI agents. The cycle is per functional build specification. One FBS, one cycle, five stages, five commits.

Phrased crisply: Define → Build → Review → Test → Finalise.

The order matters. The committing matters. The discipline of finishing one stage before starting the next matters most of all. Skipping stages or running them in parallel is exactly what the methodology is built to stop.

Stage 1: Define

Lay down the test scaffolding. Every acceptance criterion in scope on the FBS gets its test suite as a file, with the cases sketched as describe and it blocks (or whatever your framework’s equivalent is) that name what each case will check, derived directly from the AC text. The criterion-to-suite-to-cases structure exists in code before any production code does.

Define isn’t where the assertions get written. The actual test code, the arrange/act/assert that makes each case run, lands in the Build stage that follows. Define is about pinning the shape: what suite covers which AC, what cases each suite contains, what behaviour each case is going to verify. The bodies of the cases are stubs (failing or pending) until Build fills them in alongside the implementation.
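A minimal sketch of what a Define commit might contain, using Python's built-in unittest as the framework equivalent of describe and it blocks. It borrows AC-042-03 (the old photo gets cleaned up within sixty seconds) as the criterion. The class name, the case names, and the run_suite helper are illustrative assumptions, not something RCF prescribes.

```python
import unittest


class TestAC04203OldPhotoCleanup(unittest.TestCase):
    """AC-042-03: the old photo gets cleaned up within sixty seconds."""

    # Case names are hypothetical; each one names what it will check.
    def test_old_photo_is_deleted_after_a_new_upload(self):
        self.skipTest("pending: body lands in Build")

    def test_cleanup_finishes_within_sixty_seconds(self):
        self.skipTest("pending: body lands in Build")


def run_suite(case_class):
    """Run one suite in-process and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(case_class)
    result = unittest.TestResult()
    suite.run(result)
    return result


if __name__ == "__main__":
    result = run_suite(TestAC04203OldPhotoCleanup)
    print(f"{result.testsRun} cases seen, {len(result.skipped)} pending")
```

The runner sees both cases and reports both as pending. That is the shape Define is after: the structure exists, the bodies do not.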

Define is where most teams skip ahead, and where AI agents especially want to skip ahead, because pinning the structure against criteria is slower and takes more thought than writing code that looks plausible. The slow part is exactly where the methodology earns its keep. A suite structure derived from the ACs reflects what the criteria say. A suite invented after the code reflects what the code already does. One of those is testing. The other is wallpaper.

Define ends with a commit. Test files exist. Test runner sees them. Every case in the FBS’s scope is present, even if it’s pending or red. CI may or may not be green at this point depending on how the project handles pending tests, but the structure is in place.

Stage 2: Build

Write the code that makes the tests pass. No more, no less. Build context comes from the FBS’s declared sources: the PRD sections it points at, the TAD components it touches, the modules and schemas it’s allowed to change. A worker doesn’t grep around looking for orientation. If the context didn’t come from the FBS, it isn’t context.

This is the part AI agents are excellent at, when they have a sharp brief and a green-bar target. The Define stage gave them the brief in the form of failing tests. The FBS gave them the brief in the form of scope. With both, the agent’s output narrows fast. Without both, the agent invents behaviour, and you’re back where you started.

Build ends with a commit. Code exists. The tests in scope pass.

Stage 3: Review

Step away from the code. Open the diff. Read it as though you didn’t write it, because you probably didn’t. Three questions, in order.

Does each test actually exercise the criterion it claims to? AC-042-03 said the old photo gets cleaned up within sixty seconds. Did the test that claims to cover it actually check the cleanup time, or did the worker sneak in a sixty-second sleep so the test would pass without the implementation needing to be fast? This is where AI agents get caught. They write tests that pass. They don’t always write tests that prove the AC.
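One way a test can prove the sixty-second bound rather than wait it out is to hand the code a clock the test controls. The sketch below assumes a hypothetical PhotoStore with an injected clock and a configurable cleanup delay; none of these names come from a real codebase.

```python
class FakeClock:
    """Clock the test controls, so no real time passes."""

    def __init__(self):
        self.now = 0.0

    def advance(self, seconds):
        self.now += seconds


class PhotoStore:
    """Hypothetical code under test: deletes a replaced photo after a delay."""

    def __init__(self, clock, cleanup_delay_seconds):
        self._clock = clock
        self._delay = cleanup_delay_seconds
        self._current = {}    # user -> current photo name
        self._stored = set()  # photo names still in storage
        self._due = {}        # replaced photo -> time it becomes deletable

    def upload(self, user, photo):
        old = self._current.get(user)
        self._current[user] = photo
        self._stored.add(photo)
        if old is not None:
            self._due[old] = self._clock.now + self._delay

    def run_cleanup(self):
        # Delete every replaced photo whose due time has arrived.
        for photo, due in list(self._due.items()):
            if due <= self._clock.now:
                self._stored.discard(photo)
                del self._due[photo]

    def exists(self, photo):
        return photo in self._stored


def old_photo_gone_within(store, clock, seconds):
    """The check AC-042-03 asks for: is the old photo gone after `seconds`?"""
    store.upload("user-1", "old.jpg")
    store.upload("user-1", "new.jpg")
    clock.advance(seconds)
    store.run_cleanup()
    return not store.exists("old.jpg")
```

With a thirty-second delay the check passes at the sixty-second mark; with a ninety-second delay it fails. The outcome depends on how fast the implementation is, which is what the reviewer is looking for.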

Did anything sneak in that isn’t in scope? An FBS has a stated story scope and a stated set of files it touches. A change to a third file is either a missed dependency that needs to be promoted to the FBS, or scope creep that needs to be cut. Either way, it needs a decision, not a silent pass.
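The scope question can be partly mechanised. The sketch below assumes the FBS lists the files it is allowed to touch as exact paths; the FbsScope type and the file names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FbsScope:
    """Hypothetical record of the files one FBS declares it may touch."""
    fbs_id: str
    allowed_files: frozenset


def out_of_scope(changed_files, scope):
    """Return the changed files the FBS did not declare, sorted for stable output."""
    return sorted(set(changed_files) - scope.allowed_files)


if __name__ == "__main__":
    scope = FbsScope(
        fbs_id="FBS-042",
        allowed_files=frozenset({"src/identity/photo.py", "tests/test_photo.py"}),
    )
    diff = ["src/identity/photo.py", "src/billing/invoice.py"]
    for path in out_of_scope(diff, scope):
        print(f"{scope.fbs_id}: undeclared change to {path}")
```

A non-empty result does not say which way the decision goes. It only guarantees the decision gets made: promote the file into the FBS, or cut the change.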

Does the slice match the technical architecture? If the TAD says exports go through the job service, the slice had better not be calling out to a directly spawned process. Architecture decisions are checked here, not in production.

Review ends with a commit, even if it’s an empty one signed off as “review pass”. Usually it isn’t empty. Review almost always surfaces something. Fix it. Commit. Move on.

Stage 4: Test

Run the suites. All of them on the project, not just the new ones. The green bar is the project, not the slice.

The reason is simple. A new slice can break something old. A change to the identity service to support photo upload can subtly change how the existing avatar fetch works. The slice’s own tests will be happy. The avatar fetch’s tests will catch the bleed. If the avatar fetch’s tests don’t catch it, the avatar fetch’s tests are weak, which is a finding of its own and a separate issue to log.

Test stage failures aren’t bugs in the slice. They’re evidence that the slice is interacting with the rest of the system in ways the slice didn’t plan for. Fix the interactions, not the tests. If a test was wrong on its own terms, that’s a finding about the AC behind it, not a reason to lower the bar.

Test ends with a commit. Whole project green. Coverage report attached.

Stage 5: Finalise

CI green. Coverage report green for the slice. The FBS status flips to complete. The work item that opened against the FBS closes. The build sequence updates: dependent FBSs that were gated on this one become eligible to start.
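The build sequence update can be sketched as a small dependency check. The status values ("pending", "in_progress", "complete") and the FBS identifiers below are assumptions for illustration; RCF may record them differently.

```python
def eligible_to_start(depends_on, status):
    """Return pending FBSs whose every dependency is marked complete."""
    return sorted(
        fbs
        for fbs, deps in depends_on.items()
        if status.get(fbs) == "pending"
        and all(status.get(dep) == "complete" for dep in deps)
    )


if __name__ == "__main__":
    depends_on = {
        "FBS-041": [],
        "FBS-042": ["FBS-041"],
        "FBS-043": ["FBS-042"],
        "FBS-044": ["FBS-041", "FBS-042"],
    }
    status = {
        "FBS-041": "complete",
        "FBS-042": "in_progress",
        "FBS-043": "pending",
        "FBS-044": "pending",
    }
    print(eligible_to_start(depends_on, status))  # nothing is unblocked yet
    status["FBS-042"] = "complete"                # the Finalise status flip
    print(eligible_to_start(depends_on, status))  # dependents are now eligible
```

Nothing downstream moves until the status flip is recorded, which is why the flip belongs in a commit rather than in someone's head.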

Finalise is the smallest stage but the most underrated. It’s the moment the slice goes from “done in the developer’s head” to “done in the system.” That distinction is what keeps the build sequence honest. Without an explicit Finalise, the definition of “done” drifts over the life of a project, and by month three nobody can tell whether the FBSs marked complete are actually complete or just “completeish, last time we checked.”

Finalise ends with a commit. The commit message names the FBS. The status change is in the same commit. The chain holds.

Why each stage commits

The five commits aren’t for show. They’re the unit of accountability. A senior reviewer should be able to look at any one commit and form an opinion about whether the stage was done properly, without having to understand the other four stages.

The Define commit shows the tests, in isolation, before any code. You can reason about whether they cover the criteria.

The Build commit shows the implementation, in isolation, with the tests it satisfies already present. You can reason about whether the implementation is the right shape for the tests.

The Review commit shows the corrections that came out of reading the diff. Almost always small. Sometimes substantial. Either way, visible.

One optional refinement for teams that want a tighter loop. If Review surfaces material fixes, those fixes can go back through Build as a fresh Build commit, then Review runs again on the new diff. That way the corrections themselves get reviewed, not just authored. Most slices don’t need it. On higher-risk work, the small Build ↔ Review loop is cheap insurance.

The Test commit shows what fell out when the whole project ran. Often nothing. Sometimes a fix to an unrelated bleed.

The Finalise commit shows the status flip and the FBS closing. Smallest commit of the five. The one that makes the state durable.

Squash the cycle into one commit and you lose every one of those reviewable moments. The cycle becomes one big lump that has to be reviewed as a single decision. That’s how you end up shipping commits that are partly wallpaper and partly real work, with no way to tell which is which after the fact.
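A history check can tell a squashed cycle from an intact one. The sketch below assumes a commit subject convention of "FBS id, stage name, colon, summary", which is a hypothetical format rather than an RCF rule. It accepts repeated Build and Review pairs so that the optional Build ↔ Review loop still counts as well formed.

```python
import re

# Assumed subject format: "<FBS id> <stage>: <summary>", oldest commit first.
_CYCLE_SHAPE = re.compile(r"define( build review)+ test finalise")


def stages_for(subjects, fbs_id):
    """Extract the stage names, in order, from the commit subjects of one FBS."""
    prefix = fbs_id + " "
    stages = []
    for subject in subjects:
        if subject.startswith(prefix) and ":" in subject:
            stage = subject[len(prefix):].split(":", 1)[0]
            stages.append(stage.strip().lower())
    return stages


def cycle_is_well_formed(subjects, fbs_id):
    """True when the FBS's commits run Define, Build/Review (one or more), Test, Finalise."""
    sequence = " ".join(stages_for(subjects, fbs_id))
    return _CYCLE_SHAPE.fullmatch(sequence) is not None
```

In practice the subjects could come from something like `git log --reverse --format=%s`. A single squashed commit fails the check, as does a history that skips Review.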

What the cycle costs

Time, mostly. A clean five-stage cycle on a small slice takes longer than the “just build it and we’ll see” approach many teams are used to. Not always by a lot, but enough to feel slower in the first few stages.

The cost shows up early. The benefit shows up steadily, then compounds. A project six months in, run on the cycle, is in a different shape from a project six months in run any other way. The cycled project has a clean build sequence, current FBS statuses, a coherent test suite that maps to criteria, and a commit history a new joiner can read. The other project has a backlog full of features that work most of the time and a team that can’t agree on what “done” means anymore.

Per-feature commits are cheap. The discipline of running them in order, every time, is what’s expensive. It’s also what makes the difference between a methodology that works and a poster on the wall.

That discipline is also what prevents AI drift: the slow, plausible divergence between intent and output that sets in when an agent writes a lot of code with no per-slice checkpoint. (This is different from model drift; it is the team-discipline kind.) The five committing stages are the checkpoint. Every cycle, the work gets re-anchored to the FBS, the ACs, and the test suite. Skip the cycle and the drift starts. Run the cycle and it has nowhere to build up.

The cycle for changes, not just new work

The cycle runs the same way when the trigger is a change to the spec rather than a new greenfield slice. When a requirement gets refined, an AC gets edited, or the architecture grows a new constraint, traceability flags the gap between docs and code, the gap is scoped into a functional build specification, and the FBS runs through Define, Build, Review, Test, Finalise. Define generates fresh test suites from the updated AC text, or amends the existing ones. Build closes the diff. The other three stages run as usual.
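How traceability detects the gap is not spelled out here. One possible mechanism, offered purely as an assumption, is to record a fingerprint of each AC's text when its suite is generated and compare it against the current text. The AC wording in the example is invented for illustration.

```python
import hashlib


def fingerprint(ac_text):
    """Hash the AC text, ignoring whitespace-only differences."""
    normalised = " ".join(ac_text.split())
    return hashlib.sha256(normalised.encode("utf-8")).hexdigest()


def stale_criteria(current_acs, recorded_fingerprints):
    """Return AC ids whose text is new or has changed since the suites were derived."""
    return sorted(
        ac_id
        for ac_id, text in current_acs.items()
        if recorded_fingerprints.get(ac_id) != fingerprint(text)
    )


if __name__ == "__main__":
    recorded = {
        "AC-042-03": fingerprint("The old photo is cleaned up within sixty seconds."),
    }
    current = {
        "AC-042-03": "The old photo is cleaned up within thirty seconds.",  # edited AC
        "AC-042-04": "Cleanup failures are logged.",                        # new AC
    }
    print(stale_criteria(current, recorded))
```

Every id this returns is a gap to scope into an FBS. From there the cycle runs as it would for new work.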

That property is the methodology earning its keep. A bug fix, a new edge case, a whole new module: same five stages, same five commits, same reviewable shape per commit. The size of the diff varies; the discipline doesn’t. See the living spec page for the wider argument and for how the gap-into-FBS pattern applies to changes of every shape.

What is the agentic SDLC?

The agentic SDLC is the software development lifecycle once AI coding agents are part of the team rather than a tool the team uses. RCF’s five-stage cycle is what that lifecycle looks like when you don’t want the agents to wreck the project. Same stages, same commits, agent in the loop instead of (or alongside) a human at the keyboard.

The cycle was designed for human teams. It does double duty for AI-assisted work, and the structure is more important, not less, when an agent is doing the typing. A worker agent can run the whole cycle inside one session, when the brief is sharp enough. Define stage: the agent reads the FBS, generates the test suites from the AC text, commits. Build stage: the agent writes the implementation against the failing tests, commits. Review stage: a second agent (or the same agent in a different role) reads the diff against the FBS and flags scope or correctness issues. Test stage: the runner runs, the agent fixes the bleed, commits. Finalise stage: the agent updates the FBS status, commits.
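The one-session flow can be sketched as a driver that runs the stages in order and commits only when a stage's exit condition holds. The step callables, the commit callback, and the commit message format are all hypothetical; a real agent harness would supply its own.

```python
STAGE_ORDER = ("Define", "Build", "Review", "Test", "Finalise")


def run_cycle(fbs_id, steps, commit):
    """Run each stage in order, committing after each one that succeeds.

    `steps` maps a stage name to a callable that returns True when the
    stage's exit condition holds. The cycle stops at the first stage that
    fails, so no later stage starts on top of unfinished work.
    """
    committed = []
    for stage in STAGE_ORDER:
        if not steps[stage]():
            break
        commit(f"{fbs_id} {stage.lower()}: stage complete")
        committed.append(stage)
    return committed


if __name__ == "__main__":
    messages = []
    steps = {stage: (lambda: True) for stage in STAGE_ORDER}
    steps["Review"] = lambda: False  # Review found something it could not sign off
    print(run_cycle("FBS-042", steps, messages.append))
    print(messages)
```

The driver has no way to run stages in parallel or out of order. A failed Review leaves exactly two commits behind, and Test never starts.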

The cycle is also where the AI trust gap closes, one cycle at a time. An agent operating inside the cycle is constrained by what each stage is allowed to do. Define isn’t allowed to add files outside the FBS scope. Build isn’t allowed to change tests. Review isn’t allowed to silently rewrite the criteria. Each of those constraints is what makes the agent’s output reviewable, and what makes trust in AI-generated code real instead of imagined. That is the build cycle inside an AI SDLC: the discipline that survives contact with agents.
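The three constraints named above can be expressed as a per-stage check on the files a commit touches. This is a sketch under assumptions: the file paths are invented, and the criteria are assumed to live in identifiable files.

```python
def stage_violations(stage, changed_files, scope_files, test_files, criteria_files):
    """Return the changed files that break the rule for the given stage.

    Only the three rules stated in the text are encoded:
      define: may not touch files outside the FBS scope
      build:  may not change tests
      review: may not change the criteria
    Other stages are left unchecked in this sketch.
    """
    changed = set(changed_files)
    if stage == "define":
        bad = changed - set(scope_files)
    elif stage == "build":
        bad = changed & set(test_files)
    elif stage == "review":
        bad = changed & set(criteria_files)
    else:
        bad = set()
    return sorted(bad)


if __name__ == "__main__":
    scope = {"src/identity/photo.py", "tests/test_photo.py"}
    tests = {"tests/test_photo.py"}
    criteria = {"docs/fbs-042.md"}
    build_commit = ["src/identity/photo.py", "tests/test_photo.py"]
    print(stage_violations("build", build_commit, scope, tests, criteria))
```

A check like this could run as a pre-commit step for each stage, so that a violation stops the commit instead of surfacing later in Review.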