Performance Benchmarks
Fontshow performance benchmarks are local, on demand, and intentionally outside CI. They use fixed downloaded fixtures, temporary output directories, and Hyperfine JSON export.
No font dataset is committed to the repository.
Requirements
Install Hyperfine outside the Python project dependencies:
sudo apt install hyperfine
Activate the repository virtual environment before running benchmarks:
source .venv/bin/activate
The benchmark runner hard-fails when VIRTUAL_ENV is not the repository
.venv or when hyperfine is not available on PATH.
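The two preconditions above can be sketched as a small check. This is an illustrative sketch, not the runner's actual code: the function name `check_benchmark_environment` and its parameters are hypothetical, and the real runner may compare paths differently.

```python
import os
import shutil
from pathlib import Path


def check_benchmark_environment(repo_root, environ=None, which=shutil.which):
    """Return a list of problems; an empty list means both preconditions hold.

    Hypothetical helper: the name and signature are assumptions for illustration.
    """
    environ = os.environ if environ is None else environ
    problems = []

    # VIRTUAL_ENV must point at the repository's own .venv directory.
    expected_venv = (Path(repo_root) / ".venv").resolve()
    active_venv = environ.get("VIRTUAL_ENV")
    if not active_venv or Path(active_venv).resolve() != expected_venv:
        problems.append(f"VIRTUAL_ENV is not {expected_venv}")

    # hyperfine must be discoverable on PATH.
    if which("hyperfine") is None:
        problems.append("hyperfine is not available on PATH")

    return problems
```

A caller that wants the hard-fail behavior could exit with a non-zero status whenever the returned list is not empty.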
Fixture Setup
Generate the light fixture set:
scripts/setup_benchmark_fonts.sh light
Generate the medium fixture set:
scripts/setup_benchmark_fonts.sh medium
Generate the heavy fixture set:
scripts/setup_benchmark_fonts.sh heavy
All profiles write generated fonts under:
tests/fixtures/fonts_dir/
The light profile downloads eight pinned font files from the Google Fonts
repository. The files are selected from ofl/ directories, pinned to a
specific Google Fonts commit, and verified by SHA-256.
Pinned Google Fonts commit:
47831f08ec6d6d7ad6b465f23dc9f9a890a2a04b
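The SHA-256 verification step can be sketched with the standard library. The helper names `sha256_of` and `verify_fixture` are hypothetical, and the setup script may well use a different tool (for example a shell checksum utility); the sketch only shows the kind of check involved.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_fixture(path, expected_sha256):
    """Raise ValueError when the file does not match its pinned digest.

    Hypothetical helper: the pinned digests themselves are not listed here.
    """
    actual = sha256_of(path)
    if actual != expected_sha256.lower():
        raise ValueError(f"{path}: expected {expected_sha256}, got {actual}")
```

Pinning both the commit and the digest means a changed upstream file is detected rather than silently benchmarked.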
Light profile families:
| Fixture | Source family |
|---|---|
| Roboto.ttf | ofl/roboto |
| Roboto-Italic.ttf | ofl/roboto |
| OpenSans.ttf | ofl/opensans |
| NotoSans.ttf | ofl/notosans |
| NotoSerif.ttf | ofl/notoserif |
| Lato-Regular.ttf | ofl/lato |
| SourceCodePro.ttf | ofl/sourcecodepro |
| Inconsolata.ttf | ofl/inconsolata |
The medium and heavy profiles start from the same files and add
replicated local copies under tests/fixtures/fonts_dir/.heavy/. The
medium profile adds three replicated copies for a 32-font inventory. The
heavy profile adds eight replicated copies for a 72-font inventory. This
gives larger deterministic stress inputs without committing or
redistributing fonts.
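The inventory sizes quoted above follow from simple arithmetic, sketched here under the assumption that each replicated copy duplicates all eight base files. The constant and function names are hypothetical.

```python
BASE_FONT_COUNT = 8  # pinned files downloaded by the light profile

# Number of replicated copies each profile adds on top of the base files.
REPLICATED_COPIES = {"light": 0, "medium": 3, "heavy": 8}


def inventory_size(profile):
    """Base files plus one full set of eight per replicated copy (assumption)."""
    return BASE_FONT_COUNT * (1 + REPLICATED_COPIES[profile])
```

With these values, medium gives 8 + 3 × 8 = 32 fonts and heavy gives 8 + 8 × 8 = 72, matching the figures in the text.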
The generated fixture directory is ignored by Git.
Running Benchmarks
Run the light profile:
scripts/benchmark.sh light
Run the medium profile:
scripts/benchmark.sh medium
Run the heavy profile:
scripts/benchmark.sh heavy
The runner prepares fixed pipeline inputs before measuring:
fontshow dump-fonts \
--paths tests/fixtures/fonts_dir \
--output tests/fixtures/raw_inventory.json
fontshow parse-inventory \
tests/fixtures/raw_inventory.json \
--output tests/fixtures/sample_inventory.json
Then Hyperfine measures these stages independently:
fontshow dump-fonts --paths tests/fixtures/fonts_dir
fontshow parse-inventory tests/fixtures/raw_inventory.json
fontshow create-catalog --inventory tests/fixtures/sample_inventory.json
Each command uses one warmup run and three measured runs.
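As a rough sketch of the measurement step, the Hyperfine call could be assembled as follows. `--warmup`, `--runs`, and `--export-json` are standard Hyperfine options; the function name is hypothetical, and the real scripts/benchmark.sh may pass additional options or wrap the commands differently (this sketch does not redirect command output to temporary directories).

```python
# The three stages listed above, measured independently by Hyperfine.
STAGE_COMMANDS = [
    "fontshow dump-fonts --paths tests/fixtures/fonts_dir",
    "fontshow parse-inventory tests/fixtures/raw_inventory.json",
    "fontshow create-catalog --inventory tests/fixtures/sample_inventory.json",
]


def build_hyperfine_argv(export_json, commands=STAGE_COMMANDS, warmup=1, runs=3):
    """Build an argv list for one Hyperfine invocation (hypothetical helper)."""
    return [
        "hyperfine",
        "--warmup", str(warmup),          # one unmeasured warmup run
        "--runs", str(runs),              # three measured runs
        "--export-json", str(export_json),
        *commands,                        # each command is benchmarked separately
    ]
```

The resulting list can be passed to `subprocess.run(argv, check=True)`.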
Measured command outputs are written to temporary directories. Hyperfine results are exported to:
tests/fixtures/benchmark_results/fontshow-light.json
tests/fixtures/benchmark_results/fontshow-heavy.json
Those result files are ignored by Git.
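The exported files use Hyperfine's JSON layout: a top-level `results` list with one entry per benchmarked command, including `command`, `mean`, and `stddev` in seconds. A minimal sketch for reading one export, with a hypothetical function name:

```python
import json


def summarize_export(export_text):
    """Return (command, mean_seconds, stddev_seconds) for each benchmarked command."""
    data = json.loads(export_text)
    return [
        (entry["command"], entry["mean"], entry["stddev"])
        for entry in data["results"]
    ]
```

For example, `summarize_export(Path("tests/fixtures/benchmark_results/fontshow-light.json").read_text())` would list the three stages with their timings once a light run has completed.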
Loadability Batch Benchmarks
LuaLaTeX loadability probing is serial in normal dump-fonts runs. To
evaluate whether bounded parallel batch execution is worth adding later,
run the dedicated local benchmark:
scripts/benchmark_loadability_batches.sh light
For the stress profile:
scripts/benchmark_loadability_batches.sh heavy
For the medium profile:
scripts/benchmark_loadability_batches.sh medium
By default, the loadability benchmark compares:
- jobs=1: serial batch execution
- jobs=2: bounded parallel batch execution
- jobs=4: bounded parallel batch execution
To choose explicit job counts:
scripts/benchmark_loadability_batches.sh heavy 1 2 4
The script prepares a fixed inventory input, then replays only the
LuaLaTeX loadability probe through scripts/run_loadability_probe.py.
Results are exported to:
tests/fixtures/benchmark_results/loadability-light.json
tests/fixtures/benchmark_results/loadability-medium.json
tests/fixtures/benchmark_results/loadability-heavy.json
These files are ignored by Git. dump-fonts and parse-inventory
expose the job count as --loadability-jobs; use --loadability-jobs 1 when
fully serial probing is required. Increase the value only when
measurements show a repeatable wall-clock win without unstable failures
or obvious TeX cache contention.
For a full local font tree, generate an ignored benchmark input first:
fontshow dump-fonts \
--paths /path/to/fonts \
--cache-dir tests/fixtures/benchmark_results/full-input-cache \
--output tests/fixtures/full_loadability_benchmark_inventory.json
Then benchmark explicit job counts with scripts/run_loadability_probe.py
and Hyperfine. On a 12-thread machine with a large font tree, local
measurements showed useful scaling through jobs=8 and a smaller
additional gain at jobs=12, with byte-identical output across job
counts. Prefer jobs=4 or jobs=8 as first full-inventory trials, then
try jobs=12 if the machine can be dedicated to the run.
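When comparing job counts, it can help to express each mean as a speedup over the serial baseline. The helper below is a hypothetical sketch; it expects mean wall-clock seconds per job count, for example taken from the `mean` field of the Hyperfine JSON export.

```python
def speedup_over_serial(mean_seconds_by_jobs):
    """Map each job count to baseline_mean / mean, using jobs=1 as the baseline."""
    baseline = mean_seconds_by_jobs[1]
    return {
        jobs: baseline / mean
        for jobs, mean in sorted(mean_seconds_by_jobs.items())
    }
```

A speedup that stops growing as the job count rises suggests the extra workers are no longer paying for themselves on that machine.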
Readiness Checks
The normal pytest suite does not run benchmarks and does not require Hyperfine or downloaded fonts.
To verify local benchmark readiness without executing benchmarks:
FONTSHOW_BENCHMARK_READINESS=1 pytest -q tests/test_benchmark_workflow.py
These checks only verify:
- Hyperfine is available on PATH
- the generated fixture directory exists and contains font files
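The opt-in gate and the fixture check might look roughly like this. It is a sketch under assumptions: the function names and the set of font suffixes are hypothetical, and the real tests in tests/test_benchmark_workflow.py may be structured differently.

```python
import os
from pathlib import Path

FONT_SUFFIXES = {".ttf", ".otf"}  # assumption: which suffixes count as font files


def readiness_requested(environ=None):
    """True only when FONTSHOW_BENCHMARK_READINESS=1 is set."""
    environ = os.environ if environ is None else environ
    return environ.get("FONTSHOW_BENCHMARK_READINESS") == "1"


def fixture_dir_has_fonts(fonts_dir):
    """True when the directory exists and contains at least one font file."""
    root = Path(fonts_dir)
    return root.is_dir() and any(
        path.is_file() and path.suffix.lower() in FONT_SUFFIXES
        for path in root.rglob("*")  # also descends into dot-directories such as .heavy
    )
```

Gating on an environment variable keeps the normal test run independent of Hyperfine and downloaded fonts.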
Interpreting Results
Use the Hyperfine summary for quick comparisons and the JSON export for recorded measurements. Compare profiles separately:
- light is for fast iteration and command-shape checks
- heavy is for stress and pre-release measurements
Do not compare results across machines without recording CPU, storage, operating system, TeX installation state, and whether caches were warm.
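One way to keep that context next to a result file is a small metadata record. This is a hypothetical sketch: the standard library can report the operating system and CPU details, while storage, TeX installation state, and cache warmth have to be supplied by whoever ran the benchmark.

```python
import os
import platform


def machine_record(storage, tex_installation, caches_warm):
    """Collect the context needed to interpret a benchmark result (hypothetical helper)."""
    return {
        # platform.processor() can be empty on some systems; fall back to the architecture.
        "cpu": platform.processor() or platform.machine(),
        "logical_cpus": os.cpu_count(),
        "os": f"{platform.system()} {platform.release()}",
        "storage": storage,                    # supplied manually, e.g. "NVMe SSD"
        "tex_installation": tex_installation,  # supplied manually
        "caches_warm": bool(caches_warm),
    }
```

The record can be serialized with `json.dumps` and stored alongside the Hyperfine export.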
No performance thresholds are enforced by tests.