Configuration¶

Codira can run without a config file. Runtime commands create a default user-level config on first use when the platform user config directory is writable, and users can create or inspect config files explicitly with:

codira config init
codira config init --full
codira config dump
codira config explain embeddings.batch_size
codira config validate

Precedence¶

Effective configuration is resolved in this order:

CLI flags
-> CODIRA_* environment variables
-> repository config: .codira/config.toml
-> user config: platform user config directory
-> system config: platform system config directory
-> built-in defaults

Repository config lives at .codira/config.toml. The file can be committed, while normal .codira index artifacts remain ignored.

Generated Config¶

The default generated file is:

config_version = 1

[backend]
name = "sqlite"

[plugins]
disable_third_party = false
disabled_analyzers = []

[embeddings]
enabled = true
model = "sentence-transformers/all-MiniLM-L6-v2"
version = "1"
dimension = 384
device = "cpu"
batch_size = 32
torch_num_threads = 0
torch_num_interop_threads = 0

[embeddings.gpu]
device_id = 0
memory_limit_mb = 0

[embeddings.indexing]
mode = "immediate"
object_types = ["symbol", "documentation"]
max_text_chars = 0
include_paths = []
exclude_paths = []

torch_num_threads = 0 and torch_num_interop_threads = 0 mean Codira leaves Torch defaults unchanged.

embeddings.gpu.memory_limit_mb = 0 means no explicit GPU memory limit is configured.

embeddings.indexing.mode = "immediate" computes embeddings during codira index. Set it to "deferred" to persist structural index rows first and queue embeddings for a later codira index --embeddings-only pass.

embeddings.indexing.object_types limits which persisted object types receive embeddings. Supported values are "symbol" and "documentation". An empty list skips all embedding rows while leaving structural indexing enabled.

embeddings.indexing.max_text_chars = 0 means no text-size limit. Positive values skip embedding payloads longer than the configured number of characters.

embeddings.indexing.include_paths and exclude_paths are repo-root-relative path prefixes. Include filters are evaluated first; exclude filters remove matching files from embedding computation.

Repository Performance Profile¶

This repository commits an explicit .codira/config.toml tuned from the Issue #57 backend and embedding matrix:

backend.name = "duckdb" selects the backend with the strongest measured read/query performance on the bk-cpp benchmark set.
embeddings.indexing.mode = "immediate" keeps the clean matrix path as the default indexing mode. Deferred mode remains available for operators who explicitly want a two-step structural/indexing workflow.
embeddings.indexing.object_types = ["symbol", "documentation"] keeps both retrieval channels active. The matrix showed symbol embeddings dominate runtime, while documentation embeddings are comparatively cheap.
embeddings.indexing.max_text_chars = 0 keeps documentation embeddings uncapped. The capped-docs matrix did not show enough total-runtime benefit to justify reducing retrieval coverage by default.
embeddings.batch_size = 32 and zero Torch thread overrides preserve the current portable defaults. Host-local calibration can still override them through config, CLI flags, or environment variables.

The embedding matrix is hardware-sensitive because embedding throughput, DuckDB memory pressure, and Torch scheduling depend on CPU, RAM, GPU, and local model state. Re-run the matrix after a meaningful hardware change before treating these values as tuned for the new host. The matrix is a long operation; run it only when the expected hardware or backend signal justifies the elapsed time.

Profiles¶

codira config init --profile default writes conservative defaults.

codira config init --full writes the core defaults plus every known first-party plugin option with its default value.

codira config init --profile low-memory lowers the embedding batch size and sets conservative Torch thread counts.

codira config init --profile gpu selects a GPU-oriented device and larger batch size. It includes GPU metadata defaults but does not auto-detect hardware.

Embedding Calibration¶

codira calibrate embeddings runs a bounded offline calibration workflow and prints a config-compatible TOML snippet by default:

codira calibrate embeddings
codira calibrate embeddings --print
make calibrate-embeddings-config
codira calibrate embeddings --output /tmp/codira-embeddings.toml
codira calibrate embeddings --write

--write is the only mode that mutates the user config file. --print and --output do not create or update user config.

Calibration benchmarks deterministic text payloads against locally available embedding model artifacts. It does not download models or contact external services. If the semantic dependency stack or local model artifact is missing, Codira emits safe CPU fallback values instead of failing the command.

The printed block includes the complete [embeddings] section plus [embeddings.gpu], including model identity fields and calibrated runtime parameters.

Environment Overrides¶

Existing process-local environment overrides still work and take precedence over config files:

Variable	Config key
`CODIRA_INDEX_BACKEND`	`backend.name`
`CODIRA_DISABLE_THIRD_PARTY_PLUGINS`	`plugins.disable_third_party`
`CODIRA_EMBED_BATCH_SIZE`	`embeddings.batch_size`
`CODIRA_EMBED_DEVICE`	`embeddings.device`
`CODIRA_TORCH_NUM_THREADS`	`embeddings.torch_num_threads`
`CODIRA_TORCH_NUM_INTEROP_THREADS`	`embeddings.torch_num_interop_threads`

Validation¶

Config validation is strict. Unknown keys, invalid types, invalid enum values, and invalid numeric ranges fail before runtime work proceeds.

When validating the effective config, Codira also validates plugin tables against schemas exposed by loaded plugins. Configured plugin tables for unloaded plugins produce warnings and keep exit status 0; JSON output reports status = "ok_with_warnings".

Plugin Configuration¶

Plugin activation and plugin-specific settings live under namespaced tables:

[plugins.analyzer-python]
enabled = true
include_paths = ["src", "tests"]
exclude_paths = ["tests/fixtures"]
emit_imports = true

[plugins.backend-sqlite]
enabled = true

Common plugin keys:

Key	Type	Default	Description
`enabled`	bool	`true`	Disables the plugin when set to `false`.

Common analyzer keys:

Key	Type	Default	Description
`include_paths`	list[str]	`[]`	Repo-relative POSIX paths to include after suffix/family eligibility. Empty means include all otherwise eligible paths.
`exclude_paths`	list[str]	`[]`	Repo-relative POSIX paths to exclude after suffix/family eligibility. Excludes win over includes.

Path filter values must be non-empty repo-relative paths. Absolute paths and .. traversal segments are invalid.

First-party analyzer options:

Table	Options
`[plugins.analyzer-python]`	`emit_module_documentation`, `emit_imports`, `emit_constants`, `emit_type_aliases`
`[plugins.analyzer-json]`	`enabled_families = ["schema", "package", "release"]`, `emit_dependencies`, `emit_scripts`, `emit_schema_properties`
`[plugins.analyzer-c]`	`use_leading_comments`, `emit_doxygen_documentation`, `include_system_includes`, `emit_macros`
`[plugins.analyzer-cpp]`	`use_leading_comments`, `emit_doxygen_documentation`, `include_system_includes`, `emit_namespaces`, `emit_macros`
`[plugins.analyzer-bash]`	`emit_functions`
`[plugins.analyzer-markdown]`	`strip_front_matter`, `emit_file_artifact_without_headings`, `min_heading_level`, `max_heading_level`
`[plugins.analyzer-text]`	`include_root_files`, `include_docs_directories`, `exclude_generated`, `exclude_fixtures_logs`

First-party backend tables currently accept only common plugin keys:

[plugins.backend-sqlite]
enabled = true

[plugins.backend-duckdb]
enabled = true

Disabling the configured active backend is invalid. Disable an inactive backend only, or change [backend].name first.