How I Score Vibe Coding Projects (20 vs 91)
Edmund Ng's journey spoke on governed AI, harness testing, and Vibe Coding for solo founders.
Tags: vibe-coding, governance, ai-governance

A vibe coding governance score matters when you move from demo velocity to production scrutiny. These are Edmund Ng's field notes on the AI project scorecard, harness discipline, and the journey toward auditable AI, written for solo founders and system rule designers who cannot afford silent regressions.
Continue with these journey spokes.
Complete Vibe Coding Guide for Non-Programmers · The Phase Document System for AI · Build with AI Without a Programming Background
On this page
- What — vibe coding governance score — AI project scorecard — the governance score
- Why — vibe coding score 20 vs 91 — 20 vs 91 is a story, not shame
- When — AI project scorecard — score yourself (honest moments)
- Where — vibe coding score 20 vs 91 — score applies in the stack
- How — AI project scorecard — climb from 20 toward 91
- What it is — extended AI project scorecard — vibe coding score 20 vs 91
- Why it matters — extended vibe coding score 20 vs 91 — AI project scorecard
Key takeaways
- A vibe coding governance score needs written rules, not hero prompts alone.
- An AI project scorecard keeps demo speed from becoming production regret.
- Harness discipline connects this spoke to the wider governed production journey.
- Cross-link Phase docs, Harness retests, and written tradeoff logs before calling work done.
What — vibe coding governance score — AI project scorecard — the governance score
After five rebuilds, Edmund Ng uses an internal governance score (abstracted here as 20 vs 91) to teach one idea:
Vibe Coding without governance is fast failure dressed as progress.
| Score band | What it usually means |
|---|---|
| ~20/100 | Features pass demos; no portable specs; decisions live in chat; Harness never run |
| ~50/100 | Some docs exist; tests happy-path only; Decision Logs written after challenges |
| ~91/100 | Phase contracts portable across models; two-door testing habit; evidence-minded closure |
This is not a certification program. It is a mirror for solo founders and small teams — especially in Malaysia/APAC contexts where professional review arrives early.
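As a rough illustration, the bands in the table can be read as a nearest-anchor lookup. This is a minimal sketch, not the author's rubric: the article publishes no cutoffs or weights, so the `BANDS` mapping and the `nearest_band` helper are hypothetical names, and the nearest-anchor rule is an assumption.

```python
# Hypothetical sketch: the anchor values and descriptions come from the table,
# but the nearest-anchor rule is an assumption. No cutoffs are published.
BANDS = {
    20: "Features pass demos; no portable specs; decisions live in chat; Harness never run",
    50: "Some docs exist; tests happy-path only; Decision Logs written after challenges",
    91: "Phase contracts portable across models; two-door testing habit; evidence-minded closure",
}


def nearest_band(score: int) -> tuple[int, str]:
    """Return the (anchor, description) whose anchor is closest to score."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    anchor = min(BANDS, key=lambda a: abs(a - score))
    return anchor, BANDS[anchor]


anchor, meaning = nearest_band(62)
print(f"~{anchor}/100: {meaning}")
```

A score of 62 lands on the ~50 band here. Treat the output as a prompt for honest discussion rather than a measurement.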
Structured exports and harness retests matter more than demo velocity when reviewers ask for evidence.
In practice, the score describes an operational contract, not a marketing label. Governed exports and harness checkpoints keep demo velocity from collapsing under multi-axis review or compliance questions. A practical test has three parts: what is frozen before agents sweep, what gets logged at tradeoff time, and which Harness retest proves behavior instead of UI luck. Edmund Ng's field notes emphasize exportable rules and Decision Logs so that an auditor six months later can follow the chain. That is the same fast-and-governed bridge Acts 1–3 teach.
Why — vibe coding score 20 vs 91 — 20 vs 91 is a story, not shame
Rebuilds 1–4 in Edmund's arc were engineering-first, demo-first — impressive screens, fragile orchestration, Framework-Skip class failures. Rebuild 5 crystallized governance-first methods: Phase system, evidence thinking, Stage A/B mindset, closure rituals.
Why publish the score: Builders ask "am I ready for production?" Dashboard green and investor demos lie kindly. A simple scale forces honest questions before Act 3 auditable AI.
Why not exact rubric weights: Public surface stays at Pattern/Category layer — no internal ledger IDs or probe scripts.
When — AI project scorecard — score yourself (honest moments)
| Moment | Question |
|---|---|
| Before first paying customer | Can a stranger continue from Phase docs alone? |
| Before calling MVP "done" | Did Harness run, or only manual click-through? |
| After a model swap | Did architecture survive without re-explaining in chat? |
| Before regulated-adjacent pitch | Can you show "considered A, chose B, because C" for major choices? |
If three answers are "no," you are likely in the 20 band — fix structure before marketing.
When to stop obsessing over the number: When Phase + Harness + Decision Logs are habitual — the score becomes a sanity check, not a vanity metric.
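The self-check above can be sketched as a tiny script. It is only an illustration under stated assumptions: the questions are paraphrased from the table as yes/no prompts, `likely_in_20_band` is a hypothetical helper, and "three answers" is read here as "three or more".

```python
# Hypothetical self-check. Questions are paraphrased from the table as yes/no
# prompts; reading the rule as "three or more no answers" is an assumption.
QUESTIONS = [
    "Can a stranger continue from Phase docs alone?",
    "Did Harness run (not only a manual click-through)?",
    "Did the architecture survive a model swap without re-explaining in chat?",
    'Can you show "considered A, chose B, because C" for major choices?',
]


def likely_in_20_band(answers: list[bool]) -> bool:
    """True when at least three of the four answers are 'no' (False)."""
    if len(answers) != len(QUESTIONS):
        raise ValueError(f"expected {len(QUESTIONS)} answers")
    return answers.count(False) >= 3


answers = [False, False, True, False]  # example answers, not real project data
for question, ok in zip(QUESTIONS, answers):
    print(f"[{'yes' if ok else 'no '}] {question}")
if likely_in_20_band(answers):
    print("Likely in the 20 band: fix structure before marketing.")
```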
Where — vibe coding score 20 vs 91 — score applies in the stack
| Layer | Low score symptom | High score signal |
|---|---|---|
| Act 1 Vibe Coding | Prompt-pray; no Constitution | Build Priority Chain enforced |
| Act 2 Structure | Skipped phase documents | Portable specs + multi-axis review |
| Act 3 Trust | "We will add audit later" | Evidence chain thinking early |
Malaysia solo founders: local clients often ask process questions before feature questions — governance score framing helps sales conversations without over-promising compliance.
How — AI project scorecard — climb from 20 toward 91
Week 1 — stop the bleeding:
- One-page Constitution hard stops (what AI must never do)
- Phase 0 doc for the current feature — goal, non-goals, verification
- Refuse Framework-Skip merges — no raw demo code without contract
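One way to make the Week 1 steps concrete is to treat the Phase 0 doc as a small record and refuse a merge when it is missing or incomplete. This is a sketch under assumptions: the three sections (goal, non-goals, verification) come from the list above, while `Phase0Doc`, `missing_sections`, `may_merge`, and the example feature are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Phase0Doc:
    """Hypothetical Phase 0 contract with the three sections named above."""
    feature: str
    goal: str = ""
    non_goals: list[str] = field(default_factory=list)
    verification: list[str] = field(default_factory=list)

    def missing_sections(self) -> list[str]:
        """List the sections that are still empty."""
        missing = []
        if not self.goal.strip():
            missing.append("goal")
        if not self.non_goals:
            missing.append("non-goals")
        if not self.verification:
            missing.append("verification")
        return missing


def may_merge(doc: Optional[Phase0Doc]) -> bool:
    """Refuse the merge when there is no contract or it is incomplete."""
    return doc is not None and not doc.missing_sections()


# Example feature is invented for illustration.
draft = Phase0Doc(feature="invoice export", goal="Export invoices as CSV")
print(draft.missing_sections())  # ['non-goals', 'verification']
print(may_merge(draft))          # False
```

The check is deliberately blunt: it only proves the sections exist, not that they are good.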
Week 2–4 — structure:
- Adopt Phase document system for every meaningful unit
- Run a minimal Harness pass — PRE snapshot → one parallel lane → POST fix
- Write Decision Logs at decision time — template: considered A, chose B, because C
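The Decision Log template above ("considered A, chose B, because C") fits in a few lines. The sketch below is an assumption about shape, not the author's format: `DecisionLogEntry` and its field names are hypothetical, and the timestamp is added only to reflect "at decision time".

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


def _now_utc() -> str:
    """Current UTC time as an ISO 8601 string, to the second."""
    return datetime.now(timezone.utc).isoformat(timespec="seconds")


@dataclass(frozen=True)
class DecisionLogEntry:
    """Hypothetical record following 'considered A, chose B, because C'."""
    considered: str
    chose: str
    because: str
    logged_at: str = field(default_factory=_now_utc)

    def render(self) -> str:
        """Format the entry as one line for an append-only log."""
        return (
            f"[{self.logged_at}] considered {self.considered}, "
            f"chose {self.chose}, because {self.because}"
        )


# Example decision is invented for illustration.
entry = DecisionLogEntry(
    considered="a hosted queue",
    chose="a nightly batch job",
    because="one founder can operate it without on-call",
)
print(entry.render())
```

Making the record frozen means an entry cannot be quietly edited after the fact; a changed mind becomes a new entry.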
Month 2+ — bridge to Act 3:
- Connect testing to 10/80/10 protocol mindset
- Read Building auditable AI systems before scaling multi-tenant promises
The scariest bugs are the ones your demo celebrates — low scores usually mean you celebrated too early.
What it is — extended AI project scorecard — vibe coding score 20 vs 91
Governed builders treat written rules, frozen snapshots, and harness retests as production requirements—not optional polish after a green demo. Edmund Ng's journey from non-programmer Vibe Coding to auditable AI systems shows why structure beats model churn when stakeholders ask how you decided, what you rejected, and what evidence you can export tomorrow.
Why it matters — extended vibe coding score 20 vs 91 — AI project scorecard
Structure beats model churn. Models change, but written rules, frozen snapshots, and harness retests still answer the questions stakeholders ask: how you decided, what you rejected, and what evidence you can export tomorrow.
Summary
On Edmund Ng's journey, a vibe coding governance score means shipping with an AI project scorecard, harness retests, and evidence-friendly decisions, not one-off prompts. Models change; written rules, exportable snapshots, and governance patterns endure.
How to score vibe coding projects
Edmund Ng treats each of these questions as a production gate: freeze the spec, log the tradeoff, and prove behavior with Harness retests, not demo clicks alone.
What is a vibe coding governance score
Solo founders in Malaysia and APAC often face professional scrutiny early. Externalizing Phase documents, Decision Logs, and smoke tiers before the demo invitation arrives is cheaper than rebuilding trust after a silent regression reaches a customer walkthrough.
When does an AI demo score fail in production
Role separation matters: builder models may sweep diffs, but frontier models should audit frozen snapshots. Mixing those hats in one chat thread is how teams lose reproducibility and inherit context debt that no IDE upgrade fixes.
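The article does not say how a snapshot gets frozen. One common way to make "frozen" checkable is a content hash, sketched below as an assumption: `freeze` and `is_unchanged` are hypothetical helpers, and the spec text is an invented example.

```python
import hashlib


def freeze(spec_text: str) -> str:
    """Return a SHA-256 digest that identifies this exact spec text."""
    return hashlib.sha256(spec_text.encode("utf-8")).hexdigest()


def is_unchanged(spec_text: str, frozen_digest: str) -> bool:
    """True when the text still matches the digest recorded at freeze time."""
    return freeze(spec_text) == frozen_digest


# Invented example spec; record the digest before the builder sweep starts.
spec = "Goal: export invoices as CSV\nNon-goals: PDF export\n"
digest = freeze(spec)

# Before the audit, confirm the auditor sees the same text that was frozen.
print(is_unchanged(spec, digest))                         # True
print(is_unchanged(spec + "Also: PDF export\n", digest))  # False
```

Recording the digest alongside the Decision Log gives the auditing model a fixed reference that does not depend on chat history.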
FAQ
What is a vibe coding governance score?
It is the internal scale Edmund Ng uses to show how governed an AI-built project is, abstracted here as 20 vs 91. Around 20, features pass demos but specs are not portable and decisions live in chat. Around 91, Phase contracts are portable across models and closure is evidence-minded. The structure behind a high score is the same each time: freeze specs, separate builder and frontier roles, and prove behavior with Harness, not demo clicks.
How do you score a vibe coding project?
Ask the four questions in the When section: can a stranger continue from Phase docs alone, did Harness run, did the architecture survive a model swap, and can you show "considered A, chose B, because C" for major choices? If three answers are "no," you are likely in the 20 band. Exact rubric weights are not published.
When does an AI demo score fail in production?
When features pass demos but Harness never ran, tests cover only the happy path, and decisions live in chat. Dashboard green and investor demos lie kindly. Written rules, Phase documents, and Decision Logs are what let a team explain tradeoffs months later without reconstructing chat history.
Why does an AI project scorecard matter for solo founders?
Solo founders in Malaysia and APAC often face professional scrutiny early, and local clients often ask process questions before feature questions. Externalizing Phase documents, Decision Logs, and smoke tiers before the demo invitation arrives is cheaper than rebuilding trust after a silent regression reaches a customer walkthrough.
When should teams freeze specs before agent sweeps?
Freeze the spec before the sweep starts, so the audit has a fixed reference. Builder models may sweep diffs, but frontier models should audit frozen snapshots. Mixing those hats in one chat thread is how teams lose reproducibility and inherit context debt that no IDE upgrade fixes.
About the author

Edmund Ng is a Malaysia-based solo founder, AI systems architect, and system rule designer. He ships governed AI with Vibe Coding, harness engineering, and auditable evidence chains.
