
Lessons Learned (Cross-Milestone)

Synthesized from .planning/RETROSPECTIVE.md covering v1.0 through v1.7 retrospectives, with v1.8 observations from active development.


Top 5 Verified Lessons (Across All Milestones)

1. Separate Auth from Feature Work

Auth gates have been a recurring source of integration complexity across four milestones:

  • v1.0/v1.2: Solana Sign-In (SIWS) was mixed with feature phases. Wallet signature verification and session management bled into plan management, mandate creation, and checkout flows. Each feature phase had to independently handle auth edge cases (expired sessions, wallet disconnection, signature replay).
  • v1.3: Email/password auth migration was its own phase but still created downstream churn — every existing endpoint needed auth middleware updates, redirect handling, and session migration logic.
  • v1.5: Public-facing surfaces (vela-checkout, vela-portal) needed their own auth models separate from the dashboard.

Verdict: Auth should always be a standalone phase with explicit integration points. Feature phases should consume auth as a dependency, not co-develop it. The cost of a separate auth phase is 1-2 sessions; the cost of mixed auth+feature is 2-3 sessions of bug fixes plus integration debt.
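
One way to read "consume auth as a dependency" is that feature handlers receive an already-verified session and never touch signature or session logic themselves. The following is a minimal sketch under that assumption; `Session`, `SessionVerifier`, `withAuth`, and `createPlan` are hypothetical names, not taken from the Vela codebase.

```typescript
// Hypothetical types for illustration; not taken from the actual codebase.
interface Session {
  userId: string;
  expiresAt: number; // Unix epoch milliseconds
}

// The auth phase owns this contract: token in, verified session (or null) out.
type SessionVerifier = (token: string) => Promise<Session | null>;

type HandlerResult = { status: number; body: string };

// Feature handlers only ever see a verified session.
type AuthedHandler<I> = (session: Session, input: I) => Promise<HandlerResult>;

// Single integration point: missing and expired sessions are handled here,
// not separately in each feature phase.
function withAuth<I>(verify: SessionVerifier, handler: AuthedHandler<I>) {
  return async (token: string, input: I, now: number = Date.now()): Promise<HandlerResult> => {
    const session = await verify(token);
    if (session === null || session.expiresAt <= now) {
      return { status: 401, body: "unauthenticated" };
    }
    return handler(session, input);
  };
}

// A feature handler with no auth logic of its own.
const createPlan: AuthedHandler<{ name: string }> = async (session, input) => ({
  status: 201,
  body: `plan "${input.name}" created for ${session.userId}`,
});
```

With this shape, swapping SIWS for email/password only changes the `SessionVerifier` implementation; handlers such as `createPlan` stay untouched.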

2. Validation Debt Compounds: Close It in a Dedicated Phase

Deferred validation items don't disappear. They accumulate and eventually block forward progress.

  • v1.2 Phases 20-23: A dedicated gap-closure wave was needed to address wave-0 test stubs that had been deferred during feature development. Closing these gaps in-band with feature work would have slowed feature delivery and risked incomplete closure.
  • v1.3 Milestone Audit: The pre-archival audit found cross-plan integration testing gaps that were invisible during individual phase execution.
  • v1.7 Phases 43-44: Two gap-closure phases (integration drift + traceability drift) were directly caused by skipping the verifier on Phase 42. The ~30% overhead to v1.7 was entirely preventable.

Verdict: Run a gap-closure phase at the end of every milestone. The pattern is: execute feature phases → milestone audit → dedicated gap-closure phase(s). This is cheaper than in-band closure and catches more issues because the audit provides a complete view.

3. On-Chain as Source of Truth

Every shortcut that cached on-chain data in D1 for critical paths eventually required remediation.

  • v1.2 Phase 23: Credential derivation was reading from D1 cache instead of deriving from on-chain state. This caused stale credential data when mandates were upgraded between D1 sync intervals. The fix required refactoring all credential derivation to happen at transaction time from on-chain data.
  • v1.5 Checkout Sessions: Initial implementation stored mandate state in D1 for faster checkout rendering. This created a race condition where checkout could display a cancelled mandate if the webhook hadn't processed yet. Fixed by always reading mandate state from chain at checkout time.
  • v1.7 Protocol Refactor: Dynamic program ID resolution from on-chain config replaced the previous pattern of hardcoded program IDs in environment variables. This ensures the SDK always talks to the correct deployed program.

Verdict: D1 is a UI cache, not a source of truth. Any data that gates a transaction (mandate status, credential validity, approval amounts) must be read from chain at transaction time. D1 is appropriate for: search, filtering, historical analytics, CSV exports.
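
The split between display reads and gating reads can be made structural. The sketch below assumes two hypothetical interfaces, `ChainReader` and `D1Cache`, and hypothetical function names; the real SDK and schema may differ. The point is that the gating function only accepts a chain reader, so a cached value cannot be passed in by accident.

```typescript
// Hypothetical names throughout; the real SDK and D1 schema may differ.
type MandateStatus = "active" | "cancelled";

interface ChainReader {
  // Assumed to fetch and decode the mandate account from an RPC node.
  getMandateStatus(mandateId: string): Promise<MandateStatus>;
}

interface D1Cache {
  // Assumed to return the last synced row, which may lag behind the chain.
  getMandateStatus(mandateId: string): Promise<MandateStatus | undefined>;
}

// Display path: the cache is acceptable for lists, search, and exports.
async function statusForDisplay(
  cache: D1Cache,
  mandateId: string,
): Promise<MandateStatus | "unknown"> {
  return (await cache.getMandateStatus(mandateId)) ?? "unknown";
}

// Gating path: takes only a ChainReader, so cached state cannot gate a transaction.
async function assertMandateActive(chain: ChainReader, mandateId: string): Promise<void> {
  const status = await chain.getMandateStatus(mandateId);
  if (status !== "active") {
    throw new Error(`mandate ${mandateId} is ${status}; refusing to proceed`);
  }
}
```

In the v1.5 checkout race described above, `statusForDisplay` could still report a stale "active" value, but `assertMandateActive` would reject the cancelled mandate before any transaction is built.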

4. Milestone Audits Catch Real Bugs

Mid-milestone and pre-archival audits have consistently found issues that would have been production gaps.

  • v1.2 Audit: Found missing error handling in the keeper scheduling loop and incomplete event emission for pull failures.
  • v1.3 Audit: Discovered cross-plan integration gaps between auth migration and existing webhook endpoints.
  • v1.6 Audit: Caught docs-only issues (broken cross-references, missing migration guides) that would have confused developers.

Verdict: Run gsd-audit-milestone mid-milestone (after ~60% of phases) and again before archival. The mid-milestone audit catches issues while context is fresh; the pre-archival audit catches anything missed.

5. Build-Time Link Validation for Docs

During Phase 38 (docs infrastructure), starlight-links-validator caught 34 broken links across the documentation site. Without build-time validation, these would have been discovered by users.

  • Docs-heavy milestones (v1.6) are particularly susceptible because content is written across many phases in different contexts.
  • Starlight's link validator runs at build time, so broken links are caught before deployment.
  • The cost of adding link validation is near-zero (plugin config); the cost of broken docs is user trust.

Verdict: Always enable build-time link validation for docs milestones. Consider enabling it for any milestone that touches markdown documentation.
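
For reference, enabling the plugin is a small config change. The fragment below follows the plugin's documented setup; the `title` value is a placeholder, and the site's other Starlight options are omitted.

```typescript
// astro.config.ts (fragment; existing Starlight options omitted)
import { defineConfig } from "astro/config";
import starlight from "@astrojs/starlight";
import starlightLinksValidator from "starlight-links-validator";

export default defineConfig({
  integrations: [
    starlight({
      title: "Docs", // placeholder
      // Validates internal links during production builds and fails the build on errors.
      plugins: [starlightLinksValidator()],
    }),
  ],
});
```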


Process Evolution Table

| Milestone | Phases | Plans | Key Process Change |
|---|---|---|---|
| v1.0 | 5 | 20 | Established base workflow: phase → plan → execute cycle |
| v1.1 | 8 | 26 | Added privacy layer, keeper automation |
| v1.2 | 10 | 36 | Workstream structure, explicit validation phases, gap closure pattern |
| v1.3 | 3 | 13 | Milestone audit before archival, centralized auth redirect, Better Auth plugin adoption |
| v1.4 | 2 | 13 | Agent mandate protocol with CU profiling gate |
| v1.5 | 7 | 37 | Shared checkout-session pattern, accepted tech debt tracking |
| v1.6 | 4 | 16 | Docs-only milestone, audience-based IA, build-time link validation |
| v1.7 | 5 | 29 | Protocol/SDK refactor, versioned accounts, two gap-closure phases |
| v1.8 | 3 | 47 | Aggressive phase merging, streaming as net-new primitive |

Patterns Established (Across Milestones)

Infrastructure Patterns

| Pattern | Origin | Description |
|---|---|---|
| Service binding RPC | v1.2 | Admin RPC isolated behind service binding; public routes can't reach admin endpoints |
| Browser-signs, server-logs | v1.2 | Browser handles wallet signing; server handles event logging. Never mix concerns. |
| Credential derivation at tx time | v1.2 Phase 23 | Credentials derived from on-chain state at transaction time, not from D1 cache |
| Dynamic program ID resolution | v1.7 | Program IDs resolved from on-chain config, not hardcoded env vars |
| Reserved space on accounts | v1.7 | Every new account includes _reserved: [u8; 64] for additive migrations |
| Centralized PDAFactory | v1.7 | Single source of truth for all PDA seed derivation in SDK |
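
To illustrate why the reserved-space pattern makes migrations additive, here is a client-side sketch of a byte layout. The layout, field names, and offsets are assumptions made up for this example and do not describe any actual Vela account; only the 64-byte reserved region comes from the pattern above.

```typescript
// Hypothetical account layout for illustration; fields and offsets are assumptions.
// v1: [version: u8][maxPulls: u32 LE][_reserved: 64 bytes]                     = 69 bytes
// v2: [version: u8][maxPulls: u32 LE][cooldownSecs: u32 LE][_reserved: 60 bytes] = 69 bytes
const ACCOUNT_SIZE = 1 + 4 + 64;

interface MandateV2 {
  version: number;
  maxPulls: number;
  cooldownSecs: number; // 0 means "not set"; v1 accounts have zeroed reserved bytes
}

// Writes an account the way a v1 program would: the reserved bytes stay zero.
function encodeV1(maxPulls: number): Uint8Array {
  const buf = new Uint8Array(ACCOUNT_SIZE);
  const view = new DataView(buf.buffer);
  view.setUint8(0, 1);
  view.setUint32(1, maxPulls, true);
  return buf;
}

// Reads any account with the v2 layout; the new field occupies the first
// four bytes of what v1 treated as reserved space.
function decodeAsV2(data: Uint8Array): MandateV2 {
  if (data.length !== ACCOUNT_SIZE) {
    throw new Error("unexpected account size");
  }
  const view = new DataView(data.buffer, data.byteOffset, data.byteLength);
  return {
    version: view.getUint8(0),
    maxPulls: view.getUint32(1, true),
    cooldownSecs: view.getUint32(5, true),
  };
}
```

In this sketch the account size is identical in both versions, so an old account decodes under the new layout with `cooldownSecs` equal to 0 and needs no reallocation.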

Auth & Identity Patterns

| Pattern | Origin | Description |
|---|---|---|
| Email-primary, wallet-secondary | v1.3 | Merchants identify via email; wallet is secondary credential |
| Org = merchant entity | v1.3 | Organization entity maps directly to merchant on-chain identity |
| Typed confirmation gates | v1.2 | Destructive admin actions require typed confirmation strings |
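
A typed confirmation gate can be as small as a string comparison placed in front of the destructive call. The sketch below is an assumption about how such a gate might look; `expectedConfirmation`, `confirmDestructive`, and the phrase format are hypothetical, not the dashboard's actual wording.

```typescript
// Hypothetical helper; the phrase format is an assumption for illustration.
function expectedConfirmation(action: string, target: string): string {
  return `${action} ${target}`;
}

// Runs `run` only if the operator typed the exact confirmation phrase.
function confirmDestructive<T>(
  action: string,
  target: string,
  typed: string,
  run: () => T,
): T {
  const expected = expectedConfirmation(action, target);
  // Exact match after trimming; no case folding, so the phrase must be typed deliberately.
  if (typed.trim() !== expected) {
    throw new Error(`confirmation mismatch: type "${expected}" to proceed`);
  }
  return run();
}
```

A reasonable design choice is to enforce the same check on the server as well as in the UI, so that a direct API call cannot bypass the gate.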

Documentation Patterns

| Pattern | Origin | Description |
|---|---|---|
| starlight-sidebar-topics | v1.6 | Multi-audience docs navigation (developers, merchants, protocol) |
| rehype-mermaid for SSR | v1.6 | Mermaid diagrams rendered server-side for zero-JS docs |
| Build-time link validation | v1.6 | starlight-links-validator catches broken references at build time |

Process Patterns

| Pattern | Origin | Description |
|---|---|---|
| Gap-closure phase | v1.7 | Dedicated phase for closing validation gaps rather than retro-amending |
| Milestone audit before archival | v1.3 | Run audit mid-milestone and before archival |
| Accepted tech debt tracking | v1.5 | Explicitly track deferred items rather than leaving them implicit |
| Wave-0 test stubs upfront | v1.2 | Write test stubs at the start of each wave, fill in during execution |

Key Anti-Patterns Identified

Security Anti-Patterns

  1. Caching on-chain data in D1 for security-critical paths

    • Manifestation: Credential derivation reading from D1 cache
    • Remediation: v1.2 Phase 23 — all credential derivation moved to transaction-time on-chain reads
    • Impact: Could have resulted in stale credentials being used for pull payments
  2. Security middleware applied late in a separate phase

    • Manifestation: v1.3 added auth middleware as a separate phase after feature work
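    • Fix: Schedule the auth phase before the features that depend on it, and wire its middleware into each endpoint in the phase that builds that endpoint, not in a later retrofit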
  3. Skipping verifier on a phase because unit tests pass

    • Manifestation: Phase 42 passed unit tests but had integration gaps
    • Impact: Caused the Phase 43-44 gap closure (~30% overhead to v1.7)
    • Lesson: Unit tests ≠ integration works

Process Anti-Patterns

  1. Missing SUMMARY.md files

    • Blocks progress tracking for the milestone
    • GSD relies on SUMMARY.md for state recovery between sessions
  2. Phase directory fragmentation outside workstream

    • v1.2 Phases 20-22 were created outside the workstream directory
    • Made it harder to find phase artifacts during audit
  3. REQUIREMENTS.md carry-forward without cleanup

    • Requirements from previous milestones carried forward verbatim
    • Creates confusion about what's still relevant vs. already completed

Cost Observations

Overall Metrics

| Metric | Value |
|---|---|
| Total plans completed | 206+ |
| Total commits | 529+ |
| Average duration per plan | ~15 minutes |
| Total AI-assisted execution time | ~36 hours |
| Calendar span | 18 days |
| Operator | Solo founder with AI assistance |

Velocity by Phase Type

| Phase Type | Avg Plans/Phase | Notes |
|---|---|---|
| Feature phases | 5-6 | Higher coupling, more plans |
| Gap-closure phases | 1-3 | Fewer plans but higher debugging ratio |
| Docs-only phases | 3-4 | Lower coupling, can consolidate |
| Protocol/SDK phases | 6-8 | Highest coupling, careful sequencing needed |

Gap-Closure Overhead

  • v1.2: Phases 20-23 (validation debt) — ~20% overhead to milestone
  • v1.7: Phases 43-44 (integration drift + traceability drift) — ~30% overhead to milestone
  • Both were directly attributable to deferred validation
  • Lesson: In v1.7, a single skipped verifier (Phase 42) led to two gap-closure phases

Model Usage

| Role | Model | Purpose |
|---|---|---|
| Primary | Opus 4.6 (quality profile) | Planning, complex decisions, architecture |
| Executor subagent | Sonnet | Plan execution, code generation |
| Planner subagent | Sonnet | Phase planning, task breakdown |
| Researcher subagent | Sonnet | Context gathering, documentation search |

Recommendations for Future Milestones

  1. Always allocate 1-2 gap-closure phases at the end of each milestone
  2. Never skip the verifier regardless of unit test results
  3. Run milestone audit at 60% completion and again before archival
  4. Keep auth as a separate phase with explicit dependency declaration
  5. Enable build-time link validation for any milestone touching docs
  6. Write wave-0 test stubs upfront — they define the acceptance criteria
  7. Track tech debt explicitly rather than leaving it implicit
  8. Derive credentials from chain at tx time — never trust D1 cache for security-critical paths
  9. Reserve space on every new account — 64 bytes of insurance is worth it
  10. Consolidate docs phases aggressively — lower coupling means merging is safe

Internal knowledge base for the Vela Labs workspace.