Performance Benchmarking¶
Codira performance measurements are developer-facing tooling. They must not change normal CLI behavior or make timing values part of the test contract.
Tools¶
Required system tools:
- hyperfine for command-level timing
Python development extras:
- pyinstrument for optional profile reports
- snakeviz for optional local inspection of .prof files
psutil and hyperfine are treated as system-level tools in this repository,
not Python development dependencies.
Artifact Layout¶
Benchmark artifacts are written under:
.artifacts/benchmarks/<run-id>/
Saved JSON artifacts include:
- UTC run timestamp
- Codira version
- Git commit ID
- active analyzer/backend plugin inventory
- manifest path when applicable
- availability of hyperfine, pyinstrument, and snakeviz
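As a rough illustration of the availability field, the check could be as simple as a PATH lookup. The helper name and the returned dictionary shape are assumptions for this sketch, not the repository's actual metadata code.

```python
import shutil


def tool_availability(tools=("hyperfine", "pyinstrument", "snakeviz")):
    """Map each tool name to True when an executable of that name is on PATH."""
    # Hypothetical helper: the real campaign script may detect tools differently,
    # for example by importing the Python packages instead of probing PATH.
    return {name: shutil.which(name) is not None for name in tools}
```

A missing tool is reported as unavailable rather than raising, which fits the optional role of the profiling extras.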
Profile artifacts are written under:
.artifacts/benchmarks/<run-id>/profiles/
Codira index state for campaign commands is isolated from each target repository and written under:
.artifacts/benchmarks/<run-id>/indexes/<category-label>/
The campaign runner passes this directory through --output-dir for index
and ctx commands.
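The layout above can be sketched with pathlib. The function name is hypothetical, and the sketch assumes that <category-label> means the category and label joined by a hyphen, which this page does not spell out.

```python
from pathlib import Path


def artifact_paths(run_id, category, label, root=Path(".artifacts/benchmarks")):
    """Return the per-run directories described in this section."""
    run_dir = root / run_id
    return {
        "run": run_dir,
        "profiles": run_dir / "profiles",
        # Assumption: "<category-label>" is the category and label joined by "-".
        "index": run_dir / "indexes" / f"{category}-{label}",
    }
```

For example, artifact_paths("20260430-baseline", "small", "codira")["index"] points at .artifacts/benchmarks/20260430-baseline/indexes/small-codira, the directory that would be passed through --output-dir.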
Manifest Format¶
The campaign runner expects a JSON manifest:
{
  "repositories": [
    {
      "label": "codira",
      "category": "small",
      "path": "/path/to/codira",
      "query": "schema migration logic",
      "modes": ["cold", "warm", "partial_change"]
    },
    {
      "label": "fontshow",
      "category": "medium",
      "path": "/path/to/fontshow"
    },
    {
      "label": "texlive",
      "category": "large",
      "path": "/path/to/texlive"
    }
  ]
}
Each repository entry requires:
- label
- category
- path
Optional fields:
- query, defaulting to schema migration logic
- modes, defaulting to cold, warm, and partial_change
- commands, an optional list of Codira subcommand argv arrays benchmarked through Hyperfine in addition to the default index --full, warm index, and ctx --json commands
Missing repository paths fail fast before commands are run.
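A minimal sketch of manifest loading under these rules follows. The function and constant names are illustrative assumptions and the real loader in scripts/benchmark_campaign.py may differ. Only the required fields, the documented defaults, and the fail-fast path check come from this page.

```python
import json
from pathlib import Path

DEFAULT_QUERY = "schema migration logic"
DEFAULT_MODES = ["cold", "warm", "partial_change"]
REQUIRED_FIELDS = ("label", "category", "path")


def load_manifest(manifest_path):
    """Load a manifest, apply documented defaults, and fail fast on bad entries."""
    data = json.loads(Path(manifest_path).read_text(encoding="utf-8"))
    repositories = []
    for entry in data["repositories"]:
        missing = [field for field in REQUIRED_FIELDS if field not in entry]
        if missing:
            raise ValueError(f"repository entry is missing fields: {missing}")
        if not Path(entry["path"]).is_dir():
            # Fail before any benchmark command is run.
            raise FileNotFoundError(f"repository path not found: {entry['path']}")
        repositories.append(
            {
                **entry,
                "query": entry.get("query", DEFAULT_QUERY),
                "modes": list(entry.get("modes", DEFAULT_MODES)),
            }
        )
    return repositories
```

Validating every entry before returning keeps a typo in the last repository path from surfacing only after the earlier repositories have already been benchmarked.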
When commands is present, each entry is a JSON array of tokens excluding the
Codira executable itself, for example:
["sym", "build_parser", "--json"]
Supported manifest-benchmark subcommands are:
- help
- index
- cov
- sym
- symlist
- emb
- calls
- refs
- audit
- ctx
- plugins
- caps
Manifest command tokens may use these placeholders:
- {path} for the repository root path
- {output_dir} for the isolated Codira state directory
- {query} for the repository query string
For path-aware subcommands, the campaign runner appends --path and
--output-dir automatically when they are not already present.
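Placeholder expansion and the automatic flags might look roughly like the sketch below. The function name and the path_aware parameter are assumptions. This page does not list which subcommands count as path-aware, so the caller decides, and the sketch only detects flags passed as separate tokens (not the --path=value form).

```python
import re

# Matches only the three documented placeholders.
PLACEHOLDER = re.compile(r"\{(path|output_dir|query)\}")


def expand_command(tokens, path, output_dir, query, path_aware=True):
    """Substitute manifest placeholders and append default path flags."""
    values = {"path": path, "output_dir": output_dir, "query": query}
    # Single-pass substitution, so a substituted value is never re-expanded.
    expanded = [
        PLACEHOLDER.sub(lambda match: values[match.group(1)], token)
        for token in tokens
    ]
    if path_aware:
        # Append each flag only when the manifest did not already supply it.
        if "--path" not in expanded:
            expanded += ["--path", path]
        if "--output-dir" not in expanded:
            expanded += ["--output-dir", output_dir]
    return expanded
```

With this sketch, ["sym", "build_parser", "--json"] for a repository at /repo with state directory /out expands to ["sym", "build_parser", "--json", "--path", "/repo", "--output-dir", "/out"].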
Adaptive Command Resolution¶
Before building the measured command matrix for one repository, the campaign runs a discovery pass:
- build a temporary Codira index under /tmp
- run codira symlist --json against that temporary index
- score discovered function and method symbols by graph connectivity
- resolve adaptive benchmark commands toward richer repo-specific targets
- persist the selector provenance without persisting the temporary index
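This page does not define how graph connectivity is scored. As one possible reading, the sketch below ranks symbols by degree, meaning the number of call edges that touch each symbol. The function name, the edge-list input, and the scoring formula are all assumptions for illustration.

```python
from collections import Counter


def rank_by_connectivity(symbols, edges):
    """Rank symbols by degree: the number of edges touching each symbol."""
    degree = Counter()
    for caller, callee in edges:
        degree[caller] += 1
        degree[callee] += 1
    # Sort by descending degree, then by name so ties are deterministic.
    return sorted(symbols, key=lambda name: (-degree[name], name))
```

Breaking ties by name keeps the candidate order stable between runs, which matters when the resulting provenance artifacts are compared.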
Adaptive resolution currently applies to these command families:
- sym
- symlist
- emb
- calls
- refs
- ctx
Resolution rules:
- sym, calls, and refs try ranked symbol candidates and keep the highest-scoring command that returns JSON status: "ok" with non-empty results
- emb and ctx share one resolved semantic query chosen from the manifest query plus discovered symbol and module names
- symlist is kept only when it returns a non-empty symbol inventory
- commands that cannot be resolved to meaningful output are skipped for that repository instead of aborting the full campaign
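The rule for sym, calls, and refs can be sketched as a filter over trial runs. The function name and the tuple layout are hypothetical, and the sketch assumes result rows live under a top-level "results" key, which this page implies but does not specify.

```python
import json


def pick_command(trials):
    """Return the best-scoring command whose JSON output is usable, or None.

    trials: iterable of (score, argv, raw_stdout) tuples from trial runs.
    """
    best = None
    for score, argv, raw_stdout in trials:
        try:
            payload = json.loads(raw_stdout)
        except json.JSONDecodeError:
            continue  # Output that is not JSON never qualifies.
        if not isinstance(payload, dict):
            continue
        # Assumption: result rows are stored under a "results" key.
        if payload.get("status") != "ok" or not payload.get("results"):
            continue
        if best is None or score > best[0]:
            best = (score, argv)
    return None if best is None else best[1]
```

A None return corresponds to the skip case: the command is dropped for that repository and the campaign continues.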
The selector writes one JSON provenance artifact per repository under:
.artifacts/benchmarks/<run-id>/selection/
These artifacts record:
- requested query and requested commands
- resolved query and resolved commands
- skipped commands
- discovery candidate scores and command trial details
The main campaign-plan.json also records:
- requested_query
- query (resolved)
- requested_commands
- resolved_commands
- skipped_commands
- selection_artifact
Campaign Command¶
Inspect the planned commands without executing:
python scripts/benchmark_campaign.py benchmarks.json --dry-run
Run a campaign:
python scripts/benchmark_campaign.py benchmarks.json --runs 10 --warmup 2
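Since measured commands are benchmarked through Hyperfine, the --runs and --warmup values are presumably forwarded to it. The sketch below shows one way such an invocation could be assembled. The helper name, the codira executable name, and the use of --export-json are assumptions, not a description of the actual script.

```python
import shlex


def hyperfine_argv(codira_argv, runs, warmup, export_path, executable="codira"):
    """Build a hyperfine argv that times one Codira subcommand."""
    # hyperfine takes the benchmarked command as a single shell string.
    command = shlex.join([executable, *codira_argv])
    return [
        "hyperfine",
        "--runs", str(runs),
        "--warmup", str(warmup),
        "--export-json", str(export_path),
        command,
    ]
```

Using shlex.join keeps a multi-word query such as schema migration logic quoted as one argument inside the command string.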
Use a stable run identifier when comparing artifacts:
python scripts/benchmark_campaign.py benchmarks.json --run-id 20260430-baseline
For issue #30, record paired short-manifest runs for both first-party
backends before broad campaigns:
CODIRA_INDEX_BACKEND=sqlite python scripts/benchmark_campaign.py \
    benchmarks/short_benchmark.local.json --run-id issue-30-short-sqlite
CODIRA_INDEX_BACKEND=duckdb python scripts/benchmark_campaign.py \
    benchmarks/short_benchmark.local.json --run-id issue-30-short-duckdb
Capture the resulting artifact paths in the branch execution ledger before running broader manifests.
Plugin Requirements¶
Benchmark metadata depends on each analyzer and backend exposing stable plugin identity through the normal Codira plugin registry.
First-party analyzer/backend plugins must provide:
- stable plugin name
- stable provider distribution name
- implementation version
- deterministic discovery globs for analyzers
- deterministic loading status through codira plugins --json
The first-party plugin set included in benchmark metadata tests is:
- codira-analyzer-python
- codira-analyzer-json
- codira-analyzer-c
- codira-analyzer-bash
- codira-backend-sqlite
- codira-backend-duckdb
New first-party analyzer or backend packages must update the shared benchmark plugin provider list and its tests.
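A coverage check over the provider list could be as small as a set difference. The constant and function names are hypothetical. The sketch takes provider distribution names as plain strings because the output shape of codira plugins --json is not described on this page.

```python
EXPECTED_PROVIDERS = frozenset(
    {
        "codira-analyzer-python",
        "codira-analyzer-json",
        "codira-analyzer-c",
        "codira-analyzer-bash",
        "codira-backend-sqlite",
        "codira-backend-duckdb",
    }
)


def missing_providers(reported):
    """Return expected first-party providers absent from the reported names."""
    # Sorted output keeps test failure messages deterministic.
    return sorted(EXPECTED_PROVIDERS - set(reported))
```

A test built on this idea fails with the exact missing names when a new first-party package is added without updating the shared list.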
Validation Policy¶
Automated tests validate parser behavior, dry-run command construction, manifest loading, metadata shape, and first-party plugin coverage. They do not assert exact timing values.
Performance campaigns are manual developer workflows. Normal CI should validate the scripts without running noisy wall-clock benchmark gates.