Lessons Learned¶

⚠️ Engineering Notes – Non-Normative Document

This document collects lessons learned during development. It reflects reasoning, mistakes, and decisions observed over time.

It is not a specification, not a contract, and not a source of truth for system behavior.

Its purpose is to:

preserve engineering context

document recurring pitfalls

support future design decisions

reduce cognitive load when revisiting past work

The contents may be refined or become obsolete as the project evolves.

Table of Contents¶

Lessons Learned
Table of Contents
Scope
1. Process & Decision Making
L0001 – Context must be explicit
L0002 – Decisions must precede implementation
L0003 – Structure must exist before automation
L0004 – Small steps outperform large refactors
2. Documentation as a System Component
L0010 – Documentation is part of the architecture
L0011 – Decisions require lifecycle management
L0012 – Documentation must match reality
3. Tooling & Automation
L0020 – Tooling is production code
L0021 – Automation must be explicit
L0022 – CI is an architectural validator
4. API, CLI, and Interface Design
L0030 – Contracts must be explicit
L0031 – CLI output has semantic layers
L0032 – Quiet and verbose modes must be strict
L0033 – CLI logic must be testable
L0034 – CLI behavior must follow platform conventions
5. Architecture & Code Structure
L0040 – Separation of concerns must be enforced
L0041 – Registries are catalogs, not execution plans
L0042 – Schema evolution must be explicit
6. Testing & Validation
L0050 – Tests must validate intent, not implementation
L0051 – Coverage is a signal, not a goal
L0052 – Integration paths should not be unit-tested aggressively
L0053 – Tests expose architectural flaws early
L0054 – Tests must not depend on environment state
L0055 – Failing tests are more valuable than passing ones
7. Logging & Observability
L0060 – Logging must be designed, not improvised
L0061 – Log levels must have strict meaning
L0062 – Observability beats guesswork
L0063 – Debugging requires control, not intuition
8. Versioning & Release Discipline
L0070 – Version must have a single source of truth
L0071 – CI is authoritative for releases
L0072 – Security rules must be documented
9. Process-Level Principles
L0080 – Assumptions are liabilities
L0081 – Not fixing something can be correct
L0082 – Decisions reduce cognitive load
10. Core Principles
L0090 – Structure before automation
L0091 – Documentation before tooling
L0092 – Decisions before code
End of Document

Scope¶

This document captures generalized, transferable lessons derived from real development and debugging sessions.

All project-specific details have been removed. The focus is on engineering practice, not on a specific codebase.

1. Process & Decision Making¶

L0001 – Context must be explicit¶

Implicit assumptions cause more failures than incorrect code.

Before any technical work begins, the following must be clearly defined:

scope
constraints
goals
non-goals

L0002 – Decisions must precede implementation¶

Writing code before documenting decisions leads to:

scope creep
accidental redesigns
inconsistent behavior

A written decision acts as a constraint, not bureaucracy.

L0003 – Structure must exist before automation¶

Automation applied to an undefined structure creates fragile systems.

Correct order:

define structure
validate assumptions
document decisions
automate

L0004 – Small steps outperform large refactors¶

Incremental changes:

are easier to validate
reduce regression risk
preserve intent

Large refactors without checkpoints increase uncertainty.

2. Documentation as a System Component¶

L0010 – Documentation is part of the architecture¶

Documentation defines:

intent
boundaries
invariants
evolution constraints

If documentation diverges from behavior, neither can be trusted as a description of the system.

L0011 – Decisions require lifecycle management¶

Decision records must be:

versioned
append-only
individually addressable
traceable over time

Single monolithic documents do not scale.
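
As a rough illustration of these properties, the sketch below models decision records as an append-only log. The names `Decision` and `DecisionLog`, the identifier format, and the example titles are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Decision:
    """One immutable decision record (hypothetical structure)."""
    id: str                            # stable identifier, individually addressable
    title: str
    supersedes: Optional[str] = None   # id of the record this one replaces, if any


class DecisionLog:
    """Append-only log: records are added, never edited or removed."""

    def __init__(self) -> None:
        self._records: list[Decision] = []

    def append(self, decision: Decision) -> None:
        if any(d.id == decision.id for d in self._records):
            raise ValueError(f"duplicate id: {decision.id}")
        self._records.append(decision)

    def active(self) -> list[str]:
        """Ids of decisions that no later record has superseded."""
        superseded = {d.supersedes for d in self._records if d.supersedes}
        return [d.id for d in self._records if d.id not in superseded]
```

A changed decision is expressed as a new record that supersedes the old one, so the history stays traceable.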

L0012 – Documentation must match reality¶

If behavior changes, documentation must change in the same commit.

Outdated documentation is a form of technical debt.

3. Tooling & Automation¶

L0020 – Tooling is production code¶

Developer tooling must be:

versioned
reviewed
deterministic
documented

Temporary scripts almost never remain temporary.

L0021 – Automation must be explicit¶

Implicit automation leads to:

hidden dependencies
irreproducible builds
fragile workflows

Automation must declare:

inputs
outputs
failure modes
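
One possible way to make these declarations concrete is to describe each automation step as data. This is a minimal sketch; `StepSpec`, `missing_inputs`, and the file paths are hypothetical and not taken from any specific tool.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class StepSpec:
    """Declarative description of one automation step (hypothetical)."""
    name: str
    inputs: tuple[str, ...]          # files the step reads
    outputs: tuple[str, ...]         # files the step is expected to produce
    failure_modes: tuple[str, ...]   # documented ways the step can fail


def missing_inputs(spec: StepSpec, root: Path) -> list[str]:
    """Return the declared inputs that do not exist under root."""
    return [p for p in spec.inputs if not (root / p).exists()]


# Example declaration; the paths are placeholders.
BUILD_DOCS = StepSpec(
    name="build-docs",
    inputs=("docs/index.md",),
    outputs=("site/index.html",),
    failure_modes=("missing input", "renderer error"),
)
```

Because the declaration is plain data, a preflight check can verify inputs before the step runs instead of failing halfway through.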

L0022 – CI is an architectural validator¶

CI failures often indicate:

design inconsistencies
missing contracts
undocumented assumptions

CI is feedback on architecture, not just correctness.

4. API, CLI, and Interface Design¶

L0030 – Contracts must be explicit¶

If behavior is not defined, each component invents its own rules.

This applies to:

CLI flags
return codes
logging behavior
configuration semantics

L0031 – CLI output has semantic layers¶

Output must be classified as:

user-facing output
diagnostic information
debug/trace data
errors

Mixing these layers makes behavior untestable.
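
A minimal sketch of keeping the layers apart, assuming a Python CLI: results go to stdout through one function, and everything else goes through a single logger on stderr. The function names and the logger name are illustrative assumptions.

```python
import logging
import sys


def emit_result(text: str, stream=None) -> None:
    """User-facing output: the actual result of the command, on stdout."""
    print(text, file=sys.stdout if stream is None else stream)


def make_diagnostics(stream=None, level: int = logging.INFO) -> logging.Logger:
    """Diagnostics, debug data, and errors share one logger on stderr."""
    logger = logging.getLogger("cli.diagnostics")  # hypothetical logger name
    logger.handlers.clear()
    logger.propagate = False
    handler = logging.StreamHandler(sys.stderr if stream is None else stream)
    handler.setFormatter(logging.Formatter("%(levelname)s: %(message)s"))
    logger.addHandler(handler)
    logger.setLevel(level)
    return logger
```

With the streams injectable, a test can assert on each layer separately.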

L0032 – Quiet and verbose modes must be strict¶

Rules:

quiet → no output except errors
verbose → additive only
no behavioral change based on verbosity

Anything else creates ambiguity.
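
These rules can be sketched by letting the flags control a logging threshold and nothing else. The program name `tool` and the exact level mapping are assumptions made for illustration.

```python
import argparse
import logging


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="tool")  # placeholder program name
    group = parser.add_mutually_exclusive_group()
    group.add_argument("-q", "--quiet", action="store_true")
    group.add_argument("-v", "--verbose", action="count", default=0)
    return parser


def log_level(quiet: bool, verbose: int) -> int:
    """Translate verbosity flags into a logging threshold and nothing else."""
    if quiet:
        return logging.ERROR   # errors still get through
    if verbose >= 1:
        return logging.DEBUG   # strictly more output than the default
    return logging.INFO
```

Since the flags only select a threshold, the code path that does the work is identical in every mode.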

L0033 – CLI logic must be testable¶

CLI implementations should:

separate parsing from execution
avoid sys.exit() in business logic
return structured results

Testability must be designed, not patched later.
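
The following sketch shows one way to arrange this in Python with `argparse`. The `greet` command, the `Result` type, and the function names are hypothetical.

```python
import argparse
import sys
from dataclasses import dataclass


@dataclass
class Result:
    """Structured outcome returned by the business logic."""
    exit_code: int
    message: str


def parse_args(argv: list[str]) -> argparse.Namespace:
    """Parsing only: turns argv into a namespace."""
    parser = argparse.ArgumentParser(prog="greet")  # placeholder program name
    parser.add_argument("name")
    return parser.parse_args(argv)


def run(name: str) -> Result:
    """Business logic: no printing, no sys.exit(), just a structured result."""
    if not name.strip():
        return Result(exit_code=1, message="name must not be empty")
    return Result(exit_code=0, message=f"Hello, {name}!")


def main(argv: list[str]) -> int:
    """Thin entry point: parse, execute, report. The caller passes the return value to sys.exit()."""
    args = parse_args(argv)
    result = run(args.name)
    stream = sys.stdout if result.exit_code == 0 else sys.stderr
    print(result.message, file=stream)
    return result.exit_code
```

Tests can call `run()` directly and compare `Result` values without spawning a process or catching `SystemExit`.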

L0034 – CLI behavior must follow platform conventions¶

CLI behavior must follow established platform conventions, including:

separation of stdout and stderr
exit code semantics
argument parsing rules
error reporting conventions

Violating these expectations leads to fragile tests and user confusion.

Flags such as --quiet must only affect informational output and must not suppress errors or diagnostics emitted by the runtime or argument parser.
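
As a concrete case, Python's `argparse` reports usage errors on stderr and exits with status 2, regardless of any application flag. The sketch below captures that behavior so it can be observed; `parse`, `usage_error_report`, and the program name are illustrative assumptions.

```python
import argparse
import contextlib
import io


def parse(argv: list[str]) -> argparse.Namespace:
    parser = argparse.ArgumentParser(prog="tool")  # placeholder program name
    parser.add_argument("--quiet", action="store_true")
    parser.add_argument("path")
    return parser.parse_args(argv)


def usage_error_report(argv: list[str]) -> tuple[int, str]:
    """Run the parser and capture its exit status and stderr text."""
    captured = io.StringIO()
    with contextlib.redirect_stderr(captured):
        try:
            parse(argv)
        except SystemExit as exc:
            # argparse exits with status 2 on invalid arguments.
            return exc.code, captured.getvalue()
    return 0, captured.getvalue()
```

Calling `usage_error_report(["--quiet"])` omits the required `path` argument, so the parser still writes its error to stderr even though `--quiet` was given.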

5. Architecture & Code Structure¶

L0040 – Separation of concerns must be enforced¶

Logic, orchestration, I/O, and diagnostics must not be mixed.

Each layer must have a single responsibility.

L0041 – Registries are catalogs, not execution plans¶

A registry describes what exists.

Execution must be:

explicit
filtered
intentional

Never assume that everything registered must run.
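
A small sketch of the distinction, assuming a decorator-based registry: registration only adds an entry to the catalog, and execution requires an explicit selection. The check names and function names are hypothetical.

```python
from typing import Callable

# The catalog: describes what exists, not what will run.
REGISTRY: dict[str, Callable[[], str]] = {}


def register(name: str):
    """Add a callable to the catalog. Registering does not schedule it."""
    def decorator(func: Callable[[], str]) -> Callable[[], str]:
        REGISTRY[name] = func
        return func
    return decorator


@register("lint")
def lint() -> str:
    return "lint ok"


@register("deploy")
def deploy() -> str:
    return "deployed"


def run_selected(names: list[str]) -> dict[str, str]:
    """Run only the entries the caller asked for, failing on unknown names."""
    unknown = [n for n in names if n not in REGISTRY]
    if unknown:
        raise KeyError(f"not registered: {unknown}")
    return {n: REGISTRY[n]() for n in names}
```

Here `deploy` is registered but never runs unless a caller names it.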

L0042 – Schema evolution must be explicit¶

Schemas must:

declare versions
evolve intentionally
never infer structure implicitly

Schema versioning is part of API stability.
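
One way to apply these rules is to require a declared version and upgrade documents through explicit, stepwise migrations. The field name `schema_version` and the v1-to-v2 rename below are invented for this sketch.

```python
CURRENT_VERSION = 2


def _v1_to_v2(doc: dict) -> dict:
    """Hypothetical migration: v2 renames 'name' to 'title'."""
    migrated = {k: v for k, v in doc.items() if k != "name"}
    migrated["title"] = doc["name"]
    migrated["schema_version"] = 2
    return migrated


# One explicit migration per version step.
MIGRATIONS = {1: _v1_to_v2}


def load(doc: dict) -> dict:
    """Upgrade step by step; refuse documents without a declared version."""
    if "schema_version" not in doc:
        raise ValueError("document does not declare schema_version")
    version = doc["schema_version"]
    while version < CURRENT_VERSION:
        migrate = MIGRATIONS.get(version)
        if migrate is None:
            raise ValueError(f"no migration from schema_version {version}")
        doc = migrate(doc)
        version = doc["schema_version"]
    if version != CURRENT_VERSION:
        raise ValueError(f"unsupported schema_version: {version}")
    return doc
```

The loader never guesses the version from the shape of the data; an undeclared or unknown version is an error.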

6. Testing & Validation¶

L0050 – Tests must validate intent, not implementation¶

Good tests:

verify observable behavior
assert contracts
survive refactors

Bad tests:

mirror internal structure
depend on private state
break on redesign
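
The contrast can be shown with a tiny example. `EventLog` and both test functions are hypothetical; the second test passes today but illustrates the brittle style.

```python
class EventLog:
    """Tiny example class; the internal storage is an implementation detail."""

    def __init__(self) -> None:
        self._events: list[str] = []

    def record(self, event: str) -> None:
        self._events.append(event)

    def count(self, event: str) -> int:
        return self._events.count(event)


def test_counts_recorded_events() -> None:
    # Good: asserts the contract through the public interface.
    log = EventLog()
    log.record("login")
    log.record("login")
    assert log.count("login") == 2
    assert log.count("logout") == 0


def test_internal_list_layout() -> None:
    # Brittle: breaks if storage becomes, say, a dict of tallies,
    # even though the observable behavior would be unchanged.
    log = EventLog()
    log.record("login")
    assert log._events == ["login"]
```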

L0051 – Coverage is a signal, not a goal¶

High coverage does not imply correctness.

Coverage must be interpreted:

per module
by responsibility
in context

L0052 – Integration paths should not be unit-tested aggressively¶

I/O-heavy or environment-dependent code should be validated via:

integration checks
preflight validation
controlled manual verification

Not everything should be unit-tested.

L0053 – Tests expose architectural flaws early¶

Many design issues emerge only when writing tests.

Tests act as a design feedback mechanism.

L0054 – Tests must not depend on environment state¶

Tests are not reliable unit tests if they depend on:

filesystem layout
installed tools
OS-specific behavior
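
One common way to remove such a dependency is to inject the environment lookup so a test can replace it. In this sketch `find_formatter` and the tool names are illustrative; `shutil.which` is the standard-library default.

```python
import shutil
from typing import Callable, Optional


def find_formatter(which: Callable[[str], Optional[str]] = shutil.which) -> str:
    """Pick the first available formatter; the lookup function is injectable."""
    for tool in ("black", "autopep8"):  # example tool names only
        if which(tool):
            return tool
    raise RuntimeError("no supported formatter found")
```

A unit test passes a fake lookup and never touches the real `PATH`:

```python
fake = lambda name: "/usr/bin/autopep8" if name == "autopep8" else None
assert find_formatter(which=fake) == "autopep8"
```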

L0055 – Failing tests are more valuable than passing ones¶

A failing test often reveals:

incorrect assumptions
undocumented behavior
mismatched expectations
hidden coupling between components

Test failures should trigger a review of intent before any code change. They frequently expose design issues rather than implementation bugs.

7. Logging & Observability¶

L0060 – Logging must be designed, not improvised¶

Logging should be:

structured
centralized
predictable

Ad-hoc logging creates noise and makes output unpredictable.
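
A minimal sketch of structured, centralized logging with the standard `logging` module. The JSON field names, the `app` logger name, and `configure_logging` are assumptions made for this example.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        return json.dumps(payload, sort_keys=True)


def configure_logging(stream, level: int = logging.INFO) -> None:
    """The single place where handlers and formats are configured."""
    root = logging.getLogger("app")  # hypothetical application logger
    root.handlers.clear()
    root.propagate = False
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    root.addHandler(handler)
    root.setLevel(level)
```

Modules then call `logging.getLogger("app.<module>")` and inherit this configuration instead of setting up their own handlers.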

L0061 – Log levels must have strict meaning¶

Each level must correspond to a specific intent.

If TRACE exists, it must be explicitly enabled.
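
Python's `logging` module has no built-in TRACE level, so the sketch below defines one as an assumption. The numeric value 5 and the names `TRACE` and `make_logger` are choices made for this example, not a standard.

```python
import logging

TRACE = 5  # below logging.DEBUG (10); the value is a choice, not a standard
logging.addLevelName(TRACE, "TRACE")


def make_logger(stream, trace: bool = False) -> logging.Logger:
    """TRACE records are emitted only when trace=True is passed explicitly."""
    logger = logging.getLogger("trace_demo")  # hypothetical logger name
    logger.handlers.clear()
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))
    logger.addHandler(handler)
    logger.setLevel(TRACE if trace else logging.INFO)
    return logger
```

Trace output is written with `logger.log(TRACE, ...)` and stays silent under the default threshold.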

L0062 – Observability beats guesswork¶

Systems should expose:

decision points
intermediate state
failure context

Debugging without visibility leads to speculation.

L0063 – Debugging requires control, not intuition¶

Effective debugging requires:

deterministic reproduction
explicit breakpoints
controlled execution flow
observable state transitions

Guesswork and ad-hoc logging are poor substitutes for structured inspection. Debugging tools should support reasoning, not replace it.

8. Versioning & Release Discipline¶

L0070 – Version must have a single source of truth¶

Version information must come from:

tags
release metadata

It must never come from duplicated constants.
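
For a Python package, one way to avoid a duplicated constant is to read the version from installed distribution metadata at runtime. This sketch assumes the metadata itself was derived from the release tag at build time; `get_version` and the fallback string are hypothetical.

```python
from importlib import metadata


def get_version(dist_name: str) -> str:
    """Read the version from installed package metadata instead of a constant."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        # Placeholder for source checkouts that were never installed.
        return "0.0.0+unknown"
```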

L0071 – CI is authoritative for releases¶

Local environments must not control:

version numbers
releases
publication logic

L0072 – Security rules must be documented¶

Security behavior must be:

explicit
documented
enforced by tooling

Implicit security assumptions tend to fail over time.

9. Process-Level Principles¶

L0080 – Assumptions are liabilities¶

Every assumption must be:

stated
verified
documented

Unstated assumptions accumulate risk.

L0081 – Not fixing something can be correct¶

Fixing the wrong thing causes more damage than leaving it unchanged.

Restraint is an engineering skill.

L0082 – Decisions reduce cognitive load¶

Writing decisions down:

prevents repeated debates
preserves rationale
accelerates future work

10. Core Principles¶

L0090 – Structure before automation¶

L0091 – Documentation before tooling¶

L0092 – Decisions before code¶

These principles prevent most long-term failures.