The Phase Document System: AI Session Memory That Survives Model Swaps

The Phase Document System: Edmund Ng's journey spoke on governed AI, harness testing, and Vibe Coding for solo founders.

Published Updated 15 min read

ai-architecture · phase-docs

[Figure: Edmund Ng AI architecture and harness hero diagram]

The Phase Document System matters when you move from demo velocity to production scrutiny. This article is Edmund Ng's field notes on AI session memory, harness discipline, and the journey toward auditable AI, written for solo founders and system rule designers who cannot afford silent regressions.

Continue with these journey spokes.

Complete Vibe Coding Guide for Non-Programmers · Multi-Axis Sub-Agent Review · Build with AI Without a Programming Background

Key takeaways

  • A Phase Document System needs written rules, not hero prompts alone.
  • AI session memory keeps demo speed from becoming production regret.
  • Harness discipline connects this spoke to the wider governed production journey.
  • Cross-link Phase docs, Harness retests, and written tradeoff logs before calling work done.

What — phase document system AI — AI session memory — the Phase Document System

The Phase Document System is Edmund Ng's documentation architecture for AI-assisted builds. Each Phase covers one meaningful unit of work with structured sections: goal, pre-read, work items, done-when, scope boundaries, verification.

Unlike chat logs, Phase docs are versioned contracts — they record what was decided, what was tested, and what remains.
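
As a rough illustration, the six sections named above could be modeled as a small data structure. This is a minimal sketch, not Edmund Ng's actual template: the `PhaseDoc` class, its field names, and the `missing_sections` helper are hypothetical, chosen only to mirror the section list, and the sample values are made up.

```python
from dataclasses import dataclass, field, fields


@dataclass
class PhaseDoc:
    """Hypothetical model of one Phase doc; fields mirror the sections listed above."""
    goal: str = ""
    pre_read: list[str] = field(default_factory=list)
    work_items: list[str] = field(default_factory=list)
    done_when: list[str] = field(default_factory=list)
    scope_boundaries: list[str] = field(default_factory=list)
    verification: list[str] = field(default_factory=list)

    def missing_sections(self) -> list[str]:
        """Return the names of sections that are still empty."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]


# Made-up example values, for illustration only.
doc = PhaseDoc(
    goal="Add CSV export to the reporting screen",
    work_items=["Write the export function", "Add a download button"],
    done_when=["Exported file opens in a spreadsheet without errors"],
)
print(doc.missing_sections())  # ['pre_read', 'scope_boundaries', 'verification']
```

Treating empty sections as a visible list is one way to keep the doc a contract rather than a loose note: the gaps are named before any work starts.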

Prerequisite: Complete Vibe Coding Guide — Act 1 hub.

Structured exports and harness retests matter more than demo velocity when reviewers ask for evidence.

In the What layer of this Act 2 architecture and harness spoke, teams work from an operational contract, not a marketing label. Governed exports and harness checkpoints prevent demo velocity from collapsing under multi-axis review or compliance questions. A practical test for what a Phase Document System is: what is frozen before agents sweep, what gets logged at tradeoff time, and which Harness retest proves behavior instead of UI luck. Edmund Ng's field notes emphasize exportable rules and Decision Logs so that auditors six months later can follow the chain. That is the same fast-and-governed bridge that Acts 1–3 teach.

Why — spec driven AI builds — chat memory is not architecture memory

LLM sessions forget. Models swap. Teams change. Without Phase docs:

  • Technical debt accumulates invisibly
  • Past decisions become folklore
  • Hallucination about "current system state" spreads

Phase docs plus Decision Log (We considered A, chose B, because C.) create auditable continuity.
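
The Decision Log sentence pattern quoted above can be sketched as a tiny record type. The `Decision` class and its field names are hypothetical, and the sample tradeoff is invented for illustration; only the sentence shape comes from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Decision:
    """Hypothetical Decision Log entry: one tradeoff, one sentence."""
    considered: str
    chosen: str
    reason: str

    def sentence(self) -> str:
        # Follows the "We considered A, chose B, because C." pattern.
        return f"We considered {self.considered}, chose {self.chosen}, because {self.reason}."


# Made-up tradeoff, for illustration only.
entry = Decision(
    considered="a hosted vector database",
    chosen="a local SQLite index",
    reason="the data must stay on the client's machine",
)
print(entry.sentence())
```

Making the record immutable (`frozen=True`) is a design choice in this sketch: a logged decision is superseded by a new entry rather than edited in place.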

In the Why layer of this Act 2 architecture and harness spoke, teams work from an operational contract, not a marketing label. Governed exports and harness checkpoints prevent demo velocity from collapsing under multi-axis review or compliance questions. A practical test for when to freeze AI session specs: what is frozen before agents sweep, what gets logged at tradeoff time, and which Harness retest proves behavior instead of UI luck. Edmund Ng's field notes emphasize exportable rules and Decision Logs so that auditors six months later can follow the chain. That is the same fast-and-governed bridge that Acts 1–3 teach.

When — AI session memory — adopt Phase docs

Start Phase 0/1 before feature code when:

  • Build spans multiple sessions or models
  • More than one agent role touches the system
  • Regulated or client-scrutiny context applies

Skip only for: throwaway spikes with explicit throwaway label.
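
The adoption checklist above can be read as a small triage function. This is a sketch under assumptions: the function name and parameters are hypothetical, and the text does not say what to do when no trigger applies and the work is not labeled throwaway, so the sketch returns an empty list in that case and leaves the call to the reader.

```python
def phase_doc_triggers(
    spans_sessions_or_models: bool,
    multiple_agent_roles: bool,
    regulated_or_client_scrutiny: bool,
    labeled_throwaway: bool = False,
) -> list[str]:
    """Hypothetical helper: list which adoption triggers from the checklist apply."""
    if labeled_throwaway:
        # The one stated exemption: a spike explicitly labeled as throwaway.
        return []
    checks = {
        "build spans multiple sessions or models": spans_sessions_or_models,
        "more than one agent role touches the system": multiple_agent_roles,
        "regulated or client-scrutiny context applies": regulated_or_client_scrutiny,
    }
    return [reason for reason, applies in checks.items() if applies]


print(phase_doc_triggers(True, False, True))
```

A non-empty result means Phase 0/1 should exist before feature code is written.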

In the When layer of this Act 2 architecture and harness spoke, teams work from an operational contract, not a marketing label. Governed exports and harness checkpoints prevent demo velocity from collapsing under multi-axis review or compliance questions. A practical test for how phase documents survive model swaps: what is frozen before agents sweep, what gets logged at tradeoff time, and which Harness retest proves behavior instead of UI luck. Edmund Ng's field notes emphasize exportable rules and Decision Logs so that auditors six months later can follow the chain. That is the same fast-and-governed bridge that Acts 1–3 teach.

Where — spec driven AI builds — Phase docs in the stack

Artifact | Role
Phase doc | Unit of work contract
§22 gap-hunt appendix | Blindspot sweep before build
Decision Log | Human-readable tradeoff sentence
Dossier / journal | Cross-phase rollup for cold-start

The public blog teaches the pattern and category, not internal repo paths (the sharing boundary).

In the Where layer of this Act 2 architecture and harness spoke, teams work from an operational contract, not a marketing label. Governed exports and harness checkpoints prevent demo velocity from collapsing under multi-axis review or compliance questions. A practical test for what this approach is: what is frozen before agents sweep, what gets logged at tradeoff time, and which Harness retest proves behavior instead of UI luck. Edmund Ng's field notes emphasize exportable rules and Decision Logs so that auditors six months later can follow the chain. That is the same fast-and-governed bridge that Acts 1–3 teach.

How — AI session memory — minimal Phase doc ritual

  1. Goal + non-goals — one screen
  2. Pre-read table — files/docs to skim
  3. Work items + done-when — binary gates
  4. §22 gap-hunt — parallel sub-agent sweep; READY-FOR-BUILDER: YES/NO
  5. Implementation — only after YES
  6. Verification + residuals — closure ritual
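
The six steps above behave like a small state machine with one hard gate: implementation is blocked until the gap-hunt returns YES. The sketch below is one way to express that; the `Step` enum and `next_step` function are hypothetical names, and treating a missing verdict the same as NO is an assumption of this sketch.

```python
from enum import Enum
from typing import Optional


class Step(Enum):
    GOAL = 1
    PRE_READ = 2
    WORK_ITEMS = 3
    GAP_HUNT = 4
    IMPLEMENTATION = 5
    VERIFICATION = 6


def next_step(current: Step, ready_for_builder: Optional[bool] = None) -> Step:
    """Advance through the ritual; block implementation until the gap-hunt says YES."""
    if current is Step.VERIFICATION:
        raise ValueError("ritual is already closed")
    if current is Step.GAP_HUNT and ready_for_builder is not True:
        # A NO (or missing) verdict keeps the phase in the gap-hunt step.
        return Step.GAP_HUNT
    return Step(current.value + 1)


print(next_step(Step.GAP_HUNT, ready_for_builder=False))  # Step.GAP_HUNT
print(next_step(Step.GAP_HUNT, ready_for_builder=True))   # Step.IMPLEMENTATION
```

Keeping the gate binary matches the "binary gates" wording in step 3: there is no partial YES.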

Blog seed: "First draft plans are drafts. The protocol turns misses into permanent checklists."

Next: Harness Engineering.

In the How layer of this Act 2 architecture and harness spoke, teams work from an operational contract, not a marketing label. Governed exports and harness checkpoints prevent demo velocity from collapsing under multi-axis review or compliance questions. A practical test for when to freeze AI session specs: what is frozen before agents sweep, what gets logged at tradeoff time, and which Harness retest proves behavior instead of UI luck. Edmund Ng's field notes emphasize exportable rules and Decision Logs so that auditors six months later can follow the chain. That is the same fast-and-governed bridge that Acts 1–3 teach.

What (extended) — AI session memory — spec driven AI builds

Governed builders treat written rules, frozen snapshots, and harness retests as production requirements—not optional polish after a green demo. Edmund Ng's journey from non-programmer Vibe Coding to auditable AI systems shows why structure beats model churn when stakeholders ask how you decided, what you rejected, and what evidence you can export tomorrow.

In the extended What layer of this Act 2 architecture and harness spoke, teams work from an operational contract, not a marketing label. Governed exports and harness checkpoints prevent demo velocity from collapsing under multi-axis review or compliance questions. A practical test for how phase documents survive model swaps: what is frozen before agents sweep, what gets logged at tradeoff time, and which Harness retest proves behavior instead of UI luck. Edmund Ng's field notes emphasize exportable rules and Decision Logs so that auditors six months later can follow the chain. That is the same fast-and-governed bridge that Acts 1–3 teach.

Why (extended) — spec driven AI builds — AI session memory

In the extended Why layer of this Act 2 architecture and harness spoke, teams work from an operational contract, not a marketing label. Governed exports and harness checkpoints prevent demo velocity from collapsing under multi-axis review or compliance questions. A practical test for what this approach is: what is frozen before agents sweep, what gets logged at tradeoff time, and which Harness retest proves behavior instead of UI luck. Edmund Ng's field notes emphasize exportable rules and Decision Logs so that auditors six months later can follow the chain. That is the same fast-and-governed bridge that Acts 1–3 teach.

Summary

On Edmund Ng's journey, a Phase Document System means shipping with AI session memory, harness retests, and evidence-friendly decisions, not one-off prompts. Models change; written rules, exportable snapshots, and governance patterns endure.

How phase documents survive model swaps

Edmund Ng treats each long-tail question as a production gate: freeze the spec, log the tradeoff, and prove behavior with Harness retests—not demo clicks alone.

What is a Phase Document System

Solo founders in Malaysia and APAC often face professional scrutiny early. Externalizing Phase documents, Decision Logs, and smoke tiers before the demo invitation arrives is cheaper than rebuilding trust after a silent regression reaches a customer walkthrough.

When should you freeze AI session specs

Role separation matters: builder models may sweep diffs, but frontier models should audit frozen snapshots. Mixing those hats in one chat thread is how teams lose reproducibility and inherit context debt that no IDE upgrade fixes.
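
The text does not say how a snapshot is frozen. One common way to make "frozen" checkable is a content hash, so an auditing model (or a human) can confirm it is reading exactly what the builder worked from. The sketch below assumes that approach; the `freeze` and `is_unchanged` names and the sample spec text are hypothetical.

```python
import hashlib


def freeze(spec_text: str) -> str:
    """Return a SHA-256 fingerprint of the spec text at freeze time."""
    return hashlib.sha256(spec_text.encode("utf-8")).hexdigest()


def is_unchanged(spec_text: str, frozen_digest: str) -> bool:
    """True if the text under audit matches the frozen fingerprint."""
    return freeze(spec_text) == frozen_digest


# Made-up spec text, for illustration only.
spec = "Goal: add CSV export.\nDone when: file opens without errors.\n"
digest = freeze(spec)
print(is_unchanged(spec, digest))            # True
print(is_unchanged(spec + "extra", digest))  # False
```

Recording the digest alongside the Decision Log entry would let a later audit state which exact snapshot a decision was made against.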

FAQ

What is the Phase Document System?

Edmund Ng answers with structure first: freeze specs, separate builder and frontier roles, and prove behavior with Harness—not demo clicks. Written rules, Phase documents, and Decision Logs let teams explain tradeoffs months later without reconstructing chat history.

How do phase documents survive model swaps?

Edmund Ng answers with structure first: freeze specs, separate builder and frontier roles, and prove behavior with Harness—not demo clicks. Written rules, Phase documents, and Decision Logs let teams explain tradeoffs months later without reconstructing chat history.

When should you freeze AI session specs?

Edmund Ng answers with structure first: freeze specs, separate builder and frontier roles, and prove behavior with Harness—not demo clicks. Written rules, Phase documents, and Decision Logs let teams explain tradeoffs months later without reconstructing chat history.

Why does AI session memory matter for solo founders?

Edmund Ng answers with structure first: freeze specs, separate builder and frontier roles, and prove behavior with Harness—not demo clicks. Written rules, Phase documents, and Decision Logs let teams explain tradeoffs months later without reconstructing chat history.

Solo founders in Malaysia and APAC often face professional scrutiny early. Externalizing Phase documents, Decision Logs, and smoke tiers before the demo invitation arrives is cheaper than rebuilding trust after a silent regression reaches a customer walkthrough.

When should teams freeze specs before agent sweeps?

Edmund Ng answers with structure first: freeze specs, separate builder and frontier roles, and prove behavior with Harness—not demo clicks. Written rules, Phase documents, and Decision Logs let teams explain tradeoffs months later without reconstructing chat history.

Role separation matters: builder models may sweep diffs, but frontier models should audit frozen snapshots. Mixing those hats in one chat thread is how teams lose reproducibility and inherit context debt that no IDE upgrade fixes.

About the author

Edmund Ng — Malaysia-based solo founder, AI systems architect, and system rule designer. He ships governed AI with Vibe Coding, harness engineering, and auditable evidence chains. About · Projects · LinkedIn.
