Codebase Map

Purpose

This document is a developer-oriented map of the Fontshow codebase.

It explains:

  • the repository structure;
  • the runtime pipeline;
  • how a single font entry moves through the system;
  • where to start when changing a subsystem;
  • which areas are safe or risky to modify;
  • where to look first when triaging a bug.

It is intended as a companion to:

  • docs/architecture.md
  • docs/pipeline.md
  • the command-specific docs under docs/tools/

High-level overview

Fontshow is a CLI pipeline for:

  1. checking the environment;
  2. discovering installed fonts;
  3. serializing a raw inventory;
  4. enriching and validating that inventory;
  5. rendering a LaTeX catalog from the enriched data.

The main entrypoint is src/fontshow/__main__.py.

The top-level commands are:

  • preflight
  • dump-fonts
  • parse-inventory
  • validate-inventory
  • create-catalog

The intended execution flow is:

preflight
   ↓
dump-fonts
   ↓
parse-inventory
   ↓
validate-inventory   (optional as a separate explicit step)
   ↓
create-catalog
   ↓
LaTeX compilation outside the Python pipeline

Package boundaries

src/fontshow/cli/

Command orchestration only.

This layer:

  • parses CLI arguments;
  • coordinates workflow steps;
  • returns exit codes;
  • delegates real work to subsystem modules.

Important files:

  • src/fontshow/__main__.py
  • src/fontshow/cli/dump_fonts.py
  • src/fontshow/cli/parse_inventory.py
  • src/fontshow/cli/validate_inventory.py
  • src/fontshow/cli/create_catalog.py
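
As an illustration of the orchestration-only role described above, here is a minimal sketch of a dispatcher that parses a subcommand, delegates to a handler, and returns an exit code. The handler names and the `HANDLERS` table are hypothetical and only two commands are wired up; the real dispatch logic lives in src/fontshow/__main__.py and may look quite different.

```python
import argparse
from typing import Callable, Dict, List, Optional


# Hypothetical handlers: in a real CLI layer each one would delegate to a
# subsystem module and return a process exit code (0 means success).
def run_preflight(args: argparse.Namespace) -> int:
    return 0


def run_dump_fonts(args: argparse.Namespace) -> int:
    return 0


HANDLERS: Dict[str, Callable[[argparse.Namespace], int]] = {
    "preflight": run_preflight,
    "dump-fonts": run_dump_fonts,
}


def main(argv: Optional[List[str]] = None) -> int:
    """Parse the subcommand, call its handler, and return the exit code."""
    parser = argparse.ArgumentParser(prog="fontshow")
    subparsers = parser.add_subparsers(dest="command", required=True)
    for name in HANDLERS:
        subparsers.add_parser(name)
    args = parser.parse_args(argv)
    return HANDLERS[args.command](args)
```

Keeping handlers as plain functions that return an integer makes exit-code behavior easy to test without spawning a subprocess.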

src/fontshow/inventory/

Inventory model, enrichment, and validation.

This layer handles:

  • raw font descriptor construction;
  • charset normalization;
  • Unicode block derivation;
  • script inference;
  • language inference;
  • semantic validation;
  • warning attachment;
  • specimen generation.

Important files:

  • src/fontshow/inventory/font_descriptor.py
  • src/fontshow/inventory/metadata_processing.py
  • src/fontshow/inventory/infer_languages.py
  • src/fontshow/inventory/script_analysis.py
  • src/fontshow/inventory/semantic_validation.py
  • src/fontshow/inventory/validation.py
  • src/fontshow/inventory/schema_validation.py

src/fontshow/catalog/

Catalog-specific transformation from inventory records to renderable entries.

This layer handles:

  • grouping and filtering;
  • sample selection for catalog display;
  • document assembly;
  • label rendering.

Important files:

  • src/fontshow/catalog/pipeline.py
  • src/fontshow/catalog/document.py
  • src/fontshow/catalog/labels.py
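
The grouping step listed above can be sketched as follows. This is an assumption-laden example, not the code in src/fontshow/catalog/pipeline.py: the `family` and `style` keys are hypothetical field names chosen for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def group_by_family(fonts: Iterable[dict]) -> Dict[str, List[dict]]:
    """Group font entries by family, with deterministic ordering.

    The "family" and "style" keys are hypothetical field names.
    """
    groups: Dict[str, List[dict]] = defaultdict(list)
    for font in fonts:
        groups[font.get("family", "")].append(font)
    # Sort both the families and the faces inside each family so that the
    # result does not depend on discovery order.
    return {
        family: sorted(faces, key=lambda face: face.get("style", ""))
        for family, faces in sorted(groups.items())
    }
```

Sorting at the grouping boundary is one simple way to keep catalog output stable across runs.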

src/fontshow/latex/

Low-level LaTeX rendering support.

This layer handles:

  • escaping;
  • script-aware render policy;
  • templates;
  • fontspec / Polyglossia decisions.

Important files:

  • src/fontshow/latex/render.py
  • src/fontshow/latex/policy.py
  • src/fontshow/latex/templates.py
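
To make the escaping responsibility concrete, here is a minimal sketch of a LaTeX text escaper. The function name `escape_latex` is hypothetical, and the actual helper in src/fontshow/latex/render.py may cover more cases.

```python
# Replacements for the ten characters that are special in LaTeX text mode.
_LATEX_SPECIALS = {
    "\\": r"\textbackslash{}",
    "&": r"\&",
    "%": r"\%",
    "$": r"\$",
    "#": r"\#",
    "_": r"\_",
    "{": r"\{",
    "}": r"\}",
    "~": r"\textasciitilde{}",
    "^": r"\textasciicircum{}",
}


def escape_latex(text: str) -> str:
    """Escape LaTeX special characters in a single pass.

    Working character by character means the braces and backslashes that
    the replacements introduce are never escaped a second time.
    """
    return "".join(_LATEX_SPECIALS.get(ch, ch) for ch in text)
```

A chain of `str.replace` calls would be shorter, but it is order-sensitive: escaping `\` after `{` would corrupt the output.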

src/fontshow/platform/

OS- and tool-specific integration.

This layer handles:

  • installed font discovery;
  • Fontconfig querying;
  • runtime platform comparison.

Important files:

  • src/fontshow/platform/font_discovery.py
  • src/fontshow/platform/fontconfig.py
  • src/fontshow/platform/runtime.py
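
Fontconfig querying can be sketched like this, assuming the `fc-list` tool and its `--format` option are available. The function names and the tab-separated format string are illustrative choices, not the interface of src/fontshow/platform/fontconfig.py.

```python
import shutil
import subprocess
from typing import List, Tuple

# One line per face: file path, a tab, then the family name(s).
FC_FORMAT = "%{file}\t%{family}\n"


def parse_fc_list(output: str) -> List[Tuple[str, str]]:
    """Parse tab-separated fc-list output into (path, family) pairs."""
    pairs = []
    for line in output.splitlines():
        if "\t" not in line:
            continue  # skip blank or malformed lines
        path, family = line.split("\t", 1)
        pairs.append((path, family))
    return pairs


def query_fontconfig() -> List[Tuple[str, str]]:
    """Run fc-list if it is installed; return an empty list otherwise."""
    if shutil.which("fc-list") is None:
        return []
    result = subprocess.run(
        ["fc-list", "--format", FC_FORMAT],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_fc_list(result.stdout)
```

Separating parsing from the subprocess call lets the parser be tested on systems where Fontconfig is not installed.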

src/fontshow/preflight/

Environment validation subsystem.

This layer handles:

  • environment support checks;
  • capability checks;
  • result aggregation;
  • preflight CLI rendering.

Important files:

  • src/fontshow/preflight/runner.py
  • src/fontshow/preflight/model.py
  • src/fontshow/preflight/render.py
  • src/fontshow/preflight/checks/*
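
Result aggregation can be pictured as reducing individual check results to a single worst-case status. The `Status` and `CheckResult` types below are hypothetical stand-ins for whatever src/fontshow/preflight/model.py defines, and the exit-code policy is an assumption.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Iterable


class Status(IntEnum):
    """Hypothetical severity scale; higher values are worse."""
    OK = 0
    WARNING = 1
    ERROR = 2


@dataclass(frozen=True)
class CheckResult:
    name: str
    status: Status
    message: str = ""


def aggregate(results: Iterable[CheckResult]) -> Status:
    """Return the worst status seen, or OK when there are no results."""
    return max((r.status for r in results), default=Status.OK)


def to_exit_code(status: Status) -> int:
    """Assumed policy: only errors produce a non-zero exit code."""
    return 1 if status is Status.ERROR else 0
```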

src/fontshow/core/

Shared infrastructure.

This layer handles:

  • CLI helpers;
  • logging facade;
  • JSON formatting;
  • enum serialization boundary;
  • warning structures;
  • shared types and constants.

Important files:

  • src/fontshow/core/cli_utils.py
  • src/fontshow/core/logging_utils.py
  • src/fontshow/core/json_boundary.py
  • src/fontshow/core/json_format.py
  • src/fontshow/core/warnings.py
  • src/fontshow/core/types.py
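
The enum serialization boundary can be illustrated with a small sketch: enums are used in memory, and plain values are written to disk. The `Severity` enum and both function names are hypothetical and are not taken from src/fontshow/core/json_boundary.py.

```python
import json
from enum import Enum


class Severity(Enum):
    """Hypothetical warning severity used only for this example."""
    INFO = "info"
    WARNING = "warning"


def to_jsonable(value):
    """Recursively replace Enum members with their plain values."""
    if isinstance(value, Enum):
        return value.value
    if isinstance(value, dict):
        return {key: to_jsonable(item) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        return [to_jsonable(item) for item in value]
    return value


def dump(data) -> str:
    """Serialize with stable key order so output is diff-friendly."""
    return json.dumps(to_jsonable(data), indent=2, sort_keys=True, ensure_ascii=False)
```

Converting at one boundary function keeps enum handling out of the rest of the code and makes round-trip behavior easier to test.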

src/fontshow/ontology/ and src/fontshow/unicode/

Static domain knowledge.

This layer provides:

  • Unicode block/script tables;
  • language profiles;
  • script rendering metadata;
  • charset range utilities.

Important files:

  • src/fontshow/ontology/language_tables.py
  • src/fontshow/ontology/unicode_tables.py
  • src/fontshow/unicode/charset_ranges.py
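
As an example of what a charset range utility might do, the sketch below merges overlapping or adjacent inclusive code point ranges. The function name and the tuple representation are assumptions made for illustration. The real utilities in src/fontshow/unicode/charset_ranges.py may use a different representation.

```python
from typing import Iterable, List, Tuple

Range = Tuple[int, int]  # inclusive (start, end) code points


def normalize_ranges(ranges: Iterable[Range]) -> List[Range]:
    """Sort ranges and merge any that overlap or touch."""
    merged: List[Range] = []
    for start, end in sorted(ranges):
        if merged and start <= merged[-1][1] + 1:
            # Overlapping or adjacent: extend the previous range.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged
```

For instance, `normalize_ranges([(0x41, 0x5A), (0x30, 0x39), (0x5B, 0x60)])` yields two ranges, because `0x5B` directly follows `0x5A`.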

How one font entry moves through the system

A single font face starts in src/fontshow/cli/dump_fonts.py.

Stage 1: discovery and raw extraction

dump-fonts:

  • discovers font files;
  • runs fontTools extraction;
  • optionally merges Fontconfig metadata;
  • builds one canonical descriptor per face via src/fontshow/inventory/font_descriptor.py.

At this point the descriptor is still raw.

It mainly contains:

  • extracted metadata from the font binary;
  • optional Fontconfig-derived fields;
  • platform/runtime metadata;
  • low-level per-face properties.

Stage 2: parsing and enrichment

parse-inventory reads the raw inventory and processes each font entry in place.

The main enrichment path is coordinated in src/fontshow/cli/parse_inventory.py and src/fontshow/inventory/metadata_processing.py.

The typical sequence is:

  1. schema and structural checks;
  2. charset decoding and normalization;
  3. Unicode block derivation;
  4. script inference;
  5. language inference;
  6. language normalization;
  7. structured warning attachment;
  8. specimen generation.

Important helper modules involved:

  • src/fontshow/unicode/charset_ranges.py
  • src/fontshow/inventory/script_analysis.py
  • src/fontshow/inventory/infer_languages.py
  • src/fontshow/inventory/semantic_validation.py
  • src/fontshow/inventory/specimens.py
  • src/fontshow/core/warnings.py

After this stage, the entry is enriched.

It now includes, in usable form:

  • normalized coverage metadata;
  • inferred scripts and languages;
  • language normalization output;
  • structured warning records;
  • selected specimen text.
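
The in-place enrichment pattern described in this stage can be sketched as an ordered list of steps that each mutate the same entry. Everything here is a toy: the field names (`charset`, `scripts`, `warnings`), the two-script heuristic, and the warning shape are assumptions, and only three of the eight steps are represented.

```python
from typing import Callable, Dict, List

Entry = Dict[str, object]


def normalize_charset(entry: Entry) -> None:
    """Deduplicate and sort the code points (hypothetical 'charset' field)."""
    entry["charset"] = sorted(set(entry.get("charset", [])))


def infer_scripts(entry: Entry) -> None:
    """Toy heuristic: detect basic Latin letters and the Cyrillic block."""
    scripts = set()
    for cp in entry["charset"]:
        if 0x41 <= cp <= 0x5A or 0x61 <= cp <= 0x7A:
            scripts.add("Latn")
        elif 0x0400 <= cp <= 0x04FF:
            scripts.add("Cyrl")
    entry["scripts"] = sorted(scripts)


def attach_warnings(entry: Entry) -> None:
    """Record a structured warning when no script could be inferred."""
    warnings = entry.setdefault("warnings", [])
    if not entry["scripts"]:
        warnings.append({"code": "no-script", "severity": "warning"})


# Order matters: later steps read fields written by earlier ones.
STEPS: List[Callable[[Entry], None]] = [normalize_charset, infer_scripts, attach_warnings]


def enrich(entry: Entry) -> None:
    """Run every step in order, mutating the entry in place."""
    for step in STEPS:
        step(entry)
```

Because each step depends on fields written by earlier steps, reordering the list is a behavioral change, not a cosmetic one.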

Stage 3: validation

Validation may run explicitly through validate-inventory; otherwise, later pipeline stages implicitly assume that the inventory is valid.

This stage applies:

  • schema validation;
  • structural validation;
  • semantic validation.

Key files:

  • src/fontshow/inventory/schema_validation.py
  • src/fontshow/inventory/validation.py
  • src/fontshow/inventory/semantic_validation.py
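
The three validation layers can be pictured as checks that run in order, each assuming the previous one passed. The sketch below is hypothetical throughout: the `fonts`, `path`, and `languages` fields, the toy rules, and the stop-at-first-failing-layer policy are illustrative assumptions, not the behavior of the modules listed above.

```python
from typing import Any, Callable, Dict, List

Inventory = Dict[str, Any]


def check_schema(inventory: Inventory) -> List[str]:
    """Schema layer: the top-level shape must be right."""
    if not isinstance(inventory.get("fonts"), list):
        return ["schema: 'fonts' must be a list"]
    return []


def check_structure(inventory: Inventory) -> List[str]:
    """Structural layer: every entry needs a non-empty path."""
    return [
        f"structure: entry {i} has no path"
        for i, font in enumerate(inventory["fonts"])
        if not font.get("path")
    ]


def check_semantics(inventory: Inventory) -> List[str]:
    """Semantic layer (toy rule): language tags are lowercase letters."""
    issues = []
    for i, font in enumerate(inventory["fonts"]):
        for tag in font.get("languages", []):
            if not (tag.isalpha() and tag.islower()):
                issues.append(f"semantic: entry {i} has suspicious language tag {tag!r}")
    return issues


LAYERS: List[Callable[[Inventory], List[str]]] = [check_schema, check_structure, check_semantics]


def validate(inventory: Inventory) -> List[str]:
    """Return the issues from the first failing layer, or an empty list."""
    for layer in LAYERS:
        issues = layer(inventory)
        if issues:
            return issues
    return []
```

Stopping at the first failing layer keeps later checks from crashing on data whose basic shape is already known to be wrong.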

Stage 4: catalog generation

create-catalog loads the enriched inventory and hands the font list to catalog helpers.

src/fontshow/catalog/document.py then:

  • selects the primary script;
  • chooses the render policy;
  • selects fontspec and Polyglossia options;
  • formats specimen output;
  • assembles the final LaTeX blocks.

Other important helpers:

  • src/fontshow/catalog/pipeline.py
  • src/fontshow/latex/render.py
  • src/fontshow/latex/policy.py
  • src/fontshow/latex/templates.py

State transitions summary

A font face moves through these states:

  1. discovered filesystem font
  2. raw canonical inventory record
  3. enriched inventory record
  4. catalog-facing render record
  5. LaTeX block in the final document

Change map

| Change area | Primary files | Why these files | Likely tests |
| --- | --- | --- | --- |
| Raw inventory shape | src/fontshow/inventory/font_descriptor.py, src/fontshow/inventory/types.py, src/fontshow/core/types.py, _archive/schema/inventory_v1_2.schema.json | Descriptor construction, shared typed structures, and schema contract are defined here. | tests/test_validate_font_entry.py, tests/test_validate_inventory.py, tests/schema/test_inventory_schema_validation.py, tests/schema/test_schema_validation.py |
| Charset normalization | src/fontshow/unicode/charset_ranges.py, src/fontshow/inventory/metadata_processing.py, src/fontshow/platform/fontconfig.py | Charset decode, range normalization, and block derivation happen here. | tests/test_charset_decoding.py, tests/test_charset_normalization.py, tests/test_charset_to_script_coverage.py, tests/schema/test_schema_validation_charset.py |
| Script inference | src/fontshow/inventory/script_analysis.py, src/fontshow/ontology/unicode_tables.py | Heuristics and script-range knowledge are centralized here. | tests/test_infer_scripts.py, tests/test_charset_to_script_coverage.py |
| Language inference thresholds | src/fontshow/inventory/infer_languages.py, src/fontshow/ontology/language_tables.py | Candidate scoring and ontology-backed language profiles live here. | tests/test_infer_languages.py, tests/test_infer_languages_threshold.py, tests/test_parse_inventory_integration.py |
| Language normalization and semantic language checks | src/fontshow/inventory/semantic_validation.py, src/fontshow/core/warnings.py | Normalization, dropped/deprecated handling, and structured warnings are here. | tests/test_language_normalization.py, tests/test_validate_language_codes.py, tests/test_semantic_validation.py, tests/test_create_catalog_inventory_validation.py |
| Inventory parsing flow | src/fontshow/cli/parse_inventory.py, src/fontshow/inventory/metadata_processing.py, src/fontshow/inventory/io.py | The parse CLI and in-place enrichment path are coordinated here. | tests/test_parse_inventory_integration.py, tests/test_parse_inventory_logging.py, tests/cli/test_parse-inventory.py |
| Schema validation behavior | src/fontshow/inventory/schema_validation.py, _archive/schema/inventory_v1_2.schema.json | Public and strict schema validation are implemented here against the bundled schema. | tests/schema/test_inventory_schema_validation.py, tests/schema/test_schema_validation.py, tests/schema/test_schema_validation_regression.py |
| General inventory validation | src/fontshow/inventory/validation.py, src/fontshow/inventory/entry_validation.py | Structural checks outside pure JSON-schema validation live here. | tests/test_validate_font_entry.py, tests/test_validate_inventory.py |
| Specimen generation | src/fontshow/inventory/specimens.py, src/fontshow/common/specimens.py | Inventory-level specimen fallback and shared sample selection live here. | Indirectly covered by tests/test_parse_inventory_integration.py, tests/test_output_schema_invariants.py |
| Catalog filtering and grouping | src/fontshow/catalog/pipeline.py, src/fontshow/inventory/io.py, src/fontshow/cli/create_catalog.py | Input inventory loading, family grouping, test-font filtering, and orchestration happen here. | tests/test_cli_invariants.py, tests/test_artifact_hygiene.py, tests/test_deterministic_output.py, tests/test_platform_strictness.py |
| Catalog rendering and LaTeX layout | src/fontshow/catalog/document.py, src/fontshow/catalog/labels.py, src/fontshow/latex/render.py, src/fontshow/latex/policy.py, src/fontshow/latex/templates.py | Entry block rendering, escaping, policy selection, and templates are all separated here. | tests/test_deterministic_output.py, tests/test_artifact_hygiene.py, tests/test_cli_invariants.py |
| JSON serialization / enum boundary | src/fontshow/core/json_format.py, src/fontshow/core/json_boundary.py | Pretty-printing and enum normalization across disk/in-memory boundaries are here. | tests/test_enum_json_boundary.py, tests/test_json_formatting.py, tests/test_output_schema_invariants.py |
| Logging behavior | src/fontshow/core/logging_utils.py, src/fontshow/core/cli_utils.py | Structured logging, TRACE support, and CLI-visible logging helpers live here. | tests/test_fc_query_logging.py, tests/test_trace_logging.py, tests/test_parse_inventory_logging.py, tests/cli/test_cli_quiet_verbose.py |
| CLI dispatch and exit codes | src/fontshow/__main__.py, src/fontshow/cli/create_catalog.py, src/fontshow/cli/dump_fonts.py, src/fontshow/cli/parse_inventory.py, src/fontshow/cli/validate_inventory.py | Top-level dispatch and per-command wrapper semantics are defined here. | tests/cli/test_create-catalog.py, tests/cli/test_dump-fonts.py, tests/cli/test_parse-inventory.py, tests/cli/test_fontshow_version.py |
| Preflight checks and policy | src/fontshow/preflight/runner.py, src/fontshow/preflight/model.py, src/fontshow/preflight/render.py, src/fontshow/preflight/checks/base.py, src/fontshow/preflight/checks/environment.py, src/fontshow/preflight/checks/font_discovery.py, src/fontshow/preflight/checks/latex.py, src/fontshow/preflight/checks/ontology.py | Check registration, execution, result modeling, and policy logic are all here. | tests/preflight/test_registry.py, tests/preflight/test_base_check_contract.py, tests/preflight/test_environment_policy.py, tests/preflight/test_environment_matrix.py, tests/preflight/test_font_discovery_policy.py, tests/preflight/test_latex_policy.py, tests/preflight/test_render.py |
| Preflight CLI behavior | src/fontshow/preflight/__main__.py | Standalone preflight CLI wiring, output file handling, and exit-code conversion are here. | tests/cli/test_preflight_cli.py, tests/cli/test_preflight_output_file.py, tests/preflight/test_preflight_internal_exception.py |


Ripple-risk map

| Change area | Safe starting point | Ripple risk | Common downstream modules affected |
| --- | --- | --- | --- |
| Font discovery behavior | src/fontshow/platform/font_discovery.py | Medium | src/fontshow/cli/dump_fonts.py, src/fontshow/platform/fontconfig.py, src/fontshow/inventory/fonttools_extraction.py |
| Raw inventory shape | src/fontshow/inventory/font_descriptor.py | High | src/fontshow/cli/parse_inventory.py, src/fontshow/inventory/validation.py, src/fontshow/catalog/document.py, _archive/schema/inventory_v1_2.schema.json |
| Charset normalization | src/fontshow/unicode/charset_ranges.py | Medium | src/fontshow/inventory/metadata_processing.py, src/fontshow/inventory/script_analysis.py, src/fontshow/inventory/infer_languages.py |
| Script inference | src/fontshow/inventory/script_analysis.py | Medium | src/fontshow/inventory/metadata_processing.py, src/fontshow/catalog/document.py, src/fontshow/latex/policy.py |
| Language inference thresholds | src/fontshow/inventory/infer_languages.py | Medium | src/fontshow/inventory/metadata_processing.py, src/fontshow/inventory/semantic_validation.py, src/fontshow/common/specimens.py |
| Language normalization and semantic checks | src/fontshow/inventory/semantic_validation.py | High | src/fontshow/core/warnings.py, src/fontshow/cli/parse_inventory.py, src/fontshow/cli/create_catalog.py, src/fontshow/diagnostics/inventory_warnings.py |
| Inventory parsing flow | src/fontshow/cli/parse_inventory.py | High | src/fontshow/inventory/metadata_processing.py, src/fontshow/inventory/schema_validation.py, src/fontshow/inventory/specimens.py |
| Schema validation behavior | src/fontshow/inventory/schema_validation.py | High | src/fontshow/cli/validate_inventory.py, src/fontshow/cli/parse_inventory.py, src/fontshow/inventory/validation.py |
| General inventory validation | src/fontshow/inventory/validation.py | Medium | src/fontshow/cli/validate_inventory.py, src/fontshow/cli/create_catalog.py, src/fontshow/inventory/entry_validation.py |
| Specimen generation | src/fontshow/inventory/specimens.py | Medium | src/fontshow/common/specimens.py, src/fontshow/catalog/document.py |
| Catalog filtering and grouping | src/fontshow/catalog/pipeline.py | Medium | src/fontshow/cli/create_catalog.py, src/fontshow/inventory/io.py, src/fontshow/catalog/document.py |
| Catalog rendering and LaTeX layout | src/fontshow/catalog/document.py | High | src/fontshow/catalog/labels.py, src/fontshow/latex/render.py, src/fontshow/latex/policy.py |
| JSON serialization / enum boundary | src/fontshow/core/json_boundary.py | Low | src/fontshow/core/json_format.py, src/fontshow/cli/parse_inventory.py, src/fontshow/cli/dump_fonts.py |
| Logging behavior | src/fontshow/core/logging_utils.py | Medium | src/fontshow/core/cli_utils.py, src/fontshow/platform/fontconfig.py, src/fontshow/cli/parse_inventory.py, src/fontshow/preflight/runner.py |
| CLI dispatch and exit codes | src/fontshow/__main__.py | Medium | all command modules under src/fontshow/cli, src/fontshow/preflight/__main__.py |
| Preflight checks and policy | src/fontshow/preflight/checks/environment.py or another specific check module | Medium | src/fontshow/preflight/runner.py, src/fontshow/preflight/model.py, src/fontshow/preflight/render.py |
| Preflight CLI behavior | src/fontshow/preflight/__main__.py | Low | src/fontshow/preflight/runner.py, src/fontshow/preflight/render.py, src/fontshow/core/cli_utils.py |

Bug triage map

| Symptom | Most likely module | First files to inspect |
| --- | --- | --- |
| fontshow command exits with wrong code or wrong subcommand behavior | CLI dispatch | src/fontshow/__main__.py, src/fontshow/core/cli_utils.py |
| preflight says environment unsupported unexpectedly | Preflight environment policy | src/fontshow/preflight/checks/environment.py, src/fontshow/preflight/runner.py |
| preflight cannot find fc-list or LuaLaTeX | Preflight capability checks | src/fontshow/preflight/checks/font_discovery.py, src/fontshow/preflight/checks/latex.py |
| dump-fonts finds no fonts or too few fonts | Platform discovery | src/fontshow/platform/font_discovery.py, src/fontshow/cli/dump_fonts.py |
| dump-fonts drops valid fonts as unsupported | Face filtering / extraction | src/fontshow/inventory/fonttools_extraction.py, src/fontshow/inventory/validation.py, src/fontshow/cli/dump_fonts.py |
| Fontconfig metadata is missing or malformed | Fontconfig integration | src/fontshow/platform/fontconfig.py, src/fontshow/cli/dump_fonts.py |
| Inventory JSON shape changed or fields are missing unexpectedly | Descriptor construction / schema | src/fontshow/inventory/font_descriptor.py, _archive/schema/inventory_v1_2.schema.json, src/fontshow/core/types.py |
| parse-inventory fails on valid input | Parse orchestration or schema validation | src/fontshow/cli/parse_inventory.py, src/fontshow/inventory/schema_validation.py, src/fontshow/inventory/io.py |
| Languages are missing or look too conservative | Language inference thresholds | src/fontshow/inventory/infer_languages.py, src/fontshow/ontology/language_tables.py |
| Scripts are wrong or "unknown" appears unexpectedly | Script inference | src/fontshow/inventory/script_analysis.py, src/fontshow/ontology/unicode_tables.py |
| Charset ranges or Unicode blocks look wrong | Charset normalization | src/fontshow/unicode/charset_ranges.py, src/fontshow/inventory/metadata_processing.py |
| Language tags are dropped, normalized, or warned unexpectedly | Semantic normalization | src/fontshow/inventory/semantic_validation.py, src/fontshow/core/warnings.py |
| Specimen text is empty, ugly, or from the wrong language | Specimen selection | src/fontshow/inventory/specimens.py, src/fontshow/common/specimens.py, src/fontshow/catalog/document.py |
| validate-inventory rejects data unexpectedly | Schema or semantic validation | src/fontshow/inventory/schema_validation.py, src/fontshow/inventory/validation.py, src/fontshow/inventory/semantic_validation.py |
| create-catalog rejects an inventory due to platform mismatch | Platform compatibility enforcement | src/fontshow/platform/runtime.py, src/fontshow/cli/create_catalog.py |
| Catalog output is nondeterministic | Catalog pipeline / JSON ordering / grouping | src/fontshow/catalog/pipeline.py, src/fontshow/catalog/document.py, src/fontshow/core/json_format.py |
| LaTeX output is broken or escaping is wrong | Catalog rendering / LaTeX helpers | src/fontshow/catalog/document.py, src/fontshow/latex/render.py, src/fontshow/latex/templates.py |
| Wrong script-specific render policy or Polyglossia usage | LaTeX render policy | src/fontshow/latex/policy.py, src/fontshow/ontology/language_tables.py |
| Warning severities serialize incorrectly or JSON roundtrip changes meaning | JSON boundary | src/fontshow/core/json_boundary.py, src/fontshow/core/json_format.py |
| TRACE or DEBUG logs are missing / attributed to wrong caller | Logging facade | src/fontshow/core/logging_utils.py, src/fontshow/platform/fontconfig.py, src/fontshow/cli/parse_inventory.py |

Practical reading order

For a new contributor, this is the most efficient order:

  1. src/fontshow/__main__.py
  2. src/fontshow/cli/dump_fonts.py
  3. src/fontshow/cli/parse_inventory.py
  4. src/fontshow/cli/create_catalog.py
  5. src/fontshow/preflight/__main__.py
  6. then the subsystem files relevant to the feature or bug you care about

This order gives you:

  • the CLI surface;
  • the main data flow;
  • the handoff points between discovery, enrichment, validation, and rendering.