
Configuration

Codira can run without a config file. On first use, runtime commands create a default user-level config if the platform user config directory is writable. Users can also create or inspect config files explicitly with:

codira config init
codira config init --full
codira config dump
codira config explain embeddings.batch_size
codira config validate

Precedence

Effective configuration is resolved in this order, from highest to lowest precedence:

CLI flags
-> CODIRA_* environment variables
-> repository config: .codira/config.toml
-> user config: platform user config directory
-> system config: platform system config directory
-> built-in defaults

Repository config lives at .codira/config.toml. The file can be committed, while normal .codira index artifacts remain ignored.
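The precedence chain above can be pictured as a fold over layers, lowest precedence first. The following is a minimal sketch, not Codira's implementation: the function names deep_merge and resolve_config are hypothetical, the layer contents are made-up examples, and merging key by key inside nested tables is an assumption about how layers combine.

```python
from typing import Any, Dict


def deep_merge(base: Dict[str, Any], override: Dict[str, Any]) -> Dict[str, Any]:
    """Return a new dict in which values from `override` win over `base`.

    Nested tables are merged key by key (an assumption, see lead-in).
    """
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


def resolve_config(*layers_low_to_high: Dict[str, Any]) -> Dict[str, Any]:
    """Fold layers from lowest precedence (defaults) to highest (CLI flags)."""
    effective: Dict[str, Any] = {}
    for layer in layers_low_to_high:
        effective = deep_merge(effective, layer)
    return effective


# Hypothetical layer contents, listed from lowest to highest precedence.
defaults = {"backend": {"name": "sqlite"},
            "embeddings": {"batch_size": 32, "device": "cpu"}}
system_cfg: Dict[str, Any] = {}
user_cfg = {"embeddings": {"batch_size": 16}}
repo_cfg = {"backend": {"name": "duckdb"}}
env_cfg = {"embeddings": {"torch_num_threads": 4}}
cli_cfg = {"embeddings": {"batch_size": 64}}

effective = resolve_config(defaults, system_cfg, user_cfg, repo_cfg, env_cfg, cli_cfg)
print(effective)
```

In this sketch the CLI batch size wins over the user config, the repository config selects the backend, and keys that no higher layer sets fall through to the defaults.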

Generated Config

The default generated file is:

config_version = 1

[backend]
name = "sqlite"

[plugins]
disable_third_party = false
disabled_analyzers = []

[embeddings]
enabled = true
model = "sentence-transformers/all-MiniLM-L6-v2"
version = "1"
dimension = 384
device = "cpu"
batch_size = 32
torch_num_threads = 0
torch_num_interop_threads = 0

[embeddings.gpu]
device_id = 0
memory_limit_mb = 0

[embeddings.indexing]
mode = "immediate"
object_types = ["symbol", "documentation"]
max_text_chars = 0
include_paths = []
exclude_paths = []

torch_num_threads = 0 and torch_num_interop_threads = 0 mean Codira leaves Torch defaults unchanged.

embeddings.gpu.memory_limit_mb = 0 means no explicit GPU memory limit is configured.

embeddings.indexing.mode = "immediate" computes embeddings during codira index. Set it to "deferred" to persist structural index rows first and queue embeddings for a later codira index --embeddings-only pass.

embeddings.indexing.object_types limits which persisted object types receive embeddings. Supported values are "symbol" and "documentation". An empty list skips all embedding rows while leaving structural indexing enabled.

embeddings.indexing.max_text_chars = 0 means no text-size limit. Positive values skip embedding payloads longer than the configured number of characters.

embeddings.indexing.include_paths and embeddings.indexing.exclude_paths are repo-root-relative path prefixes. Include filters are evaluated first; exclude filters then remove matching files from embedding computation.
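Taken together, the embeddings.indexing settings described above act as a per-object eligibility check. The sketch below illustrates one way to read them. The function names path_matches and should_embed are hypothetical, and two details are assumptions: prefix matching is done on whole path segments (so "src" does not match "srcs/a.py"), and an empty include_paths list means every path is included.

```python
from typing import Any, Dict, List


def path_matches(path: str, prefixes: List[str]) -> bool:
    """True when `path` equals a prefix or lies beneath it (segment-aware, assumed)."""
    for prefix in prefixes:
        stem = prefix.rstrip("/")
        if path == stem or path.startswith(stem + "/"):
            return True
    return False


def should_embed(object_type: str, path: str, text: str,
                 indexing: Dict[str, Any]) -> bool:
    """Decide whether one persisted object receives an embedding."""
    # Only listed object types are embedded; an empty list skips everything.
    if object_type not in indexing["object_types"]:
        return False
    # 0 means no text-size limit; positive values skip longer payloads.
    limit = indexing["max_text_chars"]
    if limit > 0 and len(text) > limit:
        return False
    # Include filters are evaluated first (empty list assumed to include all).
    include = indexing["include_paths"]
    if include and not path_matches(path, include):
        return False
    # Exclude filters then remove matching files.
    if path_matches(path, indexing["exclude_paths"]):
        return False
    return True


indexing = {
    "object_types": ["symbol", "documentation"],
    "max_text_chars": 10,
    "include_paths": ["src"],
    "exclude_paths": ["src/generated"],
}
print(should_embed("symbol", "src/a.py", "short", indexing))
print(should_embed("symbol", "src/generated/a.py", "short", indexing))
```

Structural indexing is unaffected by any of these checks; they only decide which rows get embedding payloads.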

Repository Performance Profile

This repository commits an explicit .codira/config.toml tuned from the Issue #57 backend and embedding matrix:

  • backend.name = "duckdb" selects the backend with the strongest measured read/query performance on the bk-cpp benchmark set.
  • embeddings.indexing.mode = "immediate" keeps the clean matrix path as the default indexing mode. Deferred mode remains available for operators who explicitly want a two-step workflow: structural indexing first, embeddings in a later pass.
  • embeddings.indexing.object_types = ["symbol", "documentation"] keeps both retrieval channels active. The matrix showed symbol embeddings dominate runtime, while documentation embeddings are comparatively cheap.
  • embeddings.indexing.max_text_chars = 0 keeps documentation embeddings uncapped. The capped-docs matrix did not show enough total-runtime benefit to justify reducing retrieval coverage by default.
  • embeddings.batch_size = 32 and zero Torch thread overrides preserve the current portable defaults. Host-local calibration can still override them through config, CLI flags, or environment variables.

The embedding matrix is hardware-sensitive because embedding throughput, DuckDB memory pressure, and Torch scheduling depend on CPU, RAM, GPU, and local model state. Re-run the matrix after a meaningful hardware change before treating these values as tuned for the new host. The matrix is a long operation; run it only when the expected hardware or backend signal justifies the elapsed time.

Profiles

codira config init --profile default writes conservative defaults.

codira config init --full writes the core defaults plus every known first-party plugin option with its default value.

codira config init --profile low-memory lowers the embedding batch size and sets conservative Torch thread counts.

codira config init --profile gpu selects a GPU-oriented device and larger batch size. It includes GPU metadata defaults but does not auto-detect hardware.

Embedding Calibration

codira calibrate embeddings runs a bounded offline calibration workflow and prints a config-compatible TOML snippet by default:

codira calibrate embeddings
codira calibrate embeddings --print
make calibrate-embeddings-config
codira calibrate embeddings --output /tmp/codira-embeddings.toml
codira calibrate embeddings --write

--write is the only mode that mutates the user config file. --print and --output do not create or update user config.

Calibration benchmarks deterministic text payloads against locally available embedding model artifacts. It does not download models or contact external services. If the semantic dependency stack or local model artifact is missing, Codira emits safe CPU fallback values instead of failing the command.

The printed block includes the complete [embeddings] section plus [embeddings.gpu], including model identity fields and calibrated runtime parameters.

Environment Overrides

Existing process-local environment overrides still work and take precedence over config files:

| Variable | Config key |
| --- | --- |
| CODIRA_INDEX_BACKEND | backend.name |
| CODIRA_DISABLE_THIRD_PARTY_PLUGINS | plugins.disable_third_party |
| CODIRA_EMBED_BATCH_SIZE | embeddings.batch_size |
| CODIRA_EMBED_DEVICE | embeddings.device |
| CODIRA_TORCH_NUM_THREADS | embeddings.torch_num_threads |
| CODIRA_TORCH_NUM_INTEROP_THREADS | embeddings.torch_num_interop_threads |
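The mapping above can be read as a lookup from variable name to dotted config key, with a type conversion per key. The following sketch shows the idea under stated assumptions: the names parse_bool, ENV_TO_KEY, and env_overrides are hypothetical, and the accepted boolean spellings are a guess, not documented Codira behavior.

```python
from typing import Any, Callable, Dict, Mapping, Tuple


def parse_bool(raw: str) -> bool:
    """Parse a boolean environment value (accepted spellings are an assumption)."""
    value = raw.strip().lower()
    if value in {"1", "true", "yes", "on"}:
        return True
    if value in {"0", "false", "no", "off"}:
        return False
    raise ValueError(f"not a boolean: {raw!r}")


# Variable -> (dotted config key, converter), following the mapping above.
ENV_TO_KEY: Dict[str, Tuple[str, Callable[[str], Any]]] = {
    "CODIRA_INDEX_BACKEND": ("backend.name", str),
    "CODIRA_DISABLE_THIRD_PARTY_PLUGINS": ("plugins.disable_third_party", parse_bool),
    "CODIRA_EMBED_BATCH_SIZE": ("embeddings.batch_size", int),
    "CODIRA_EMBED_DEVICE": ("embeddings.device", str),
    "CODIRA_TORCH_NUM_THREADS": ("embeddings.torch_num_threads", int),
    "CODIRA_TORCH_NUM_INTEROP_THREADS": ("embeddings.torch_num_interop_threads", int),
}


def env_overrides(environ: Mapping[str, str]) -> Dict[str, Any]:
    """Collect typed overrides for every known CODIRA_* variable that is set."""
    overrides: Dict[str, Any] = {}
    for variable, (key, convert) in ENV_TO_KEY.items():
        if variable in environ:
            overrides[key] = convert(environ[variable])
    return overrides


sample_env = {"CODIRA_INDEX_BACKEND": "duckdb", "CODIRA_EMBED_BATCH_SIZE": "8"}
print(env_overrides(sample_env))
```

Passing os.environ instead of sample_env would read the real process environment. Variables outside the mapping are ignored by this sketch.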

Validation

Config validation is strict. Unknown keys, invalid types, invalid enum values, and invalid numeric ranges fail before runtime work proceeds.

When validating the effective config, Codira also validates plugin tables against schemas exposed by loaded plugins. Configured plugin tables for unloaded plugins produce warnings and keep exit status 0; JSON output reports status = "ok_with_warnings".
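The strict checks in the first paragraph of this section (unknown keys, types, enum values, numeric ranges) can be sketched as a table-driven validator. This is an illustration only: SCHEMA covers a handful of keys from the generated config, the minimum of 1 for embeddings.batch_size is an assumption, and the names flatten and validate are hypothetical.

```python
from typing import Any, Dict, List

# Partial, illustrative schema. Bounds marked "assumed" are not from the docs.
SCHEMA: Dict[str, Dict[str, Any]] = {
    "backend.name": {"type": str, "enum": {"sqlite", "duckdb"}},
    "embeddings.enabled": {"type": bool},
    "embeddings.batch_size": {"type": int, "min": 1},  # assumed lower bound
    "embeddings.torch_num_threads": {"type": int, "min": 0},
    "embeddings.indexing.mode": {"type": str, "enum": {"immediate", "deferred"}},
}


def flatten(table: Dict[str, Any], prefix: str = "") -> Dict[str, Any]:
    """Turn nested tables into dotted keys, e.g. embeddings.batch_size."""
    flat: Dict[str, Any] = {}
    for key, value in table.items():
        dotted = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, dotted + "."))
        else:
            flat[dotted] = value
    return flat


def validate(config: Dict[str, Any]) -> List[str]:
    """Return one message per violation; an empty list means the config passes."""
    errors: List[str] = []
    for key, value in flatten(config).items():
        rule = SCHEMA.get(key)
        if rule is None:
            errors.append(f"{key}: unknown key")
            continue
        # Exact type match, so a bool is not accepted where an int is expected.
        if type(value) is not rule["type"]:
            errors.append(f"{key}: expected {rule['type'].__name__}")
            continue
        if "enum" in rule and value not in rule["enum"]:
            errors.append(f"{key}: must be one of {sorted(rule['enum'])}")
        if "min" in rule and value < rule["min"]:
            errors.append(f"{key}: must be >= {rule['min']}")
    return errors


bad = {"backend": {"name": "postgres"},
       "embeddings": {"batch_size": 0, "batchsize": 8}}
for message in validate(bad):
    print(message)
```

Collecting every violation before failing, rather than stopping at the first one, is a design choice of this sketch; it lets a user fix a config file in a single pass.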

Plugin Configuration

Plugin activation and plugin-specific settings live under namespaced tables:

[plugins.analyzer-python]
enabled = true
include_paths = ["src", "tests"]
exclude_paths = ["tests/fixtures"]
emit_imports = true

[plugins.backend-sqlite]
enabled = true

Common plugin keys:

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| enabled | bool | true | Disables the plugin when set to false. |

Common analyzer keys:

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| include_paths | list[str] | [] | Repo-relative POSIX paths to include after suffix/family eligibility. Empty means include all otherwise eligible paths. |
| exclude_paths | list[str] | [] | Repo-relative POSIX paths to exclude after suffix/family eligibility. Excludes win over includes. |

Path filter values must be non-empty repo-relative paths. Absolute paths and .. traversal segments are invalid.
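The three rules for path filter values (non-empty, not absolute, no .. segments) are simple enough to sketch directly. The function names check_filter_path and is_valid_filter_path are hypothetical, and the sketch only considers POSIX-style paths, as the table above describes.

```python
def check_filter_path(value: str) -> None:
    """Raise ValueError when `value` is not an acceptable path filter."""
    if not value:
        raise ValueError("path filter must not be empty")
    if value.startswith("/"):
        raise ValueError(f"path filter must be repo-relative, got absolute path {value!r}")
    # Compare whole segments so a name such as "..hidden" stays valid.
    if ".." in value.split("/"):
        raise ValueError(f"path filter must not contain '..' segments: {value!r}")


def is_valid_filter_path(value: str) -> bool:
    """Boolean convenience wrapper around check_filter_path."""
    try:
        check_filter_path(value)
    except ValueError:
        return False
    return True


for candidate in ["src", "tests/fixtures", "", "/etc", "src/../secrets"]:
    print(repr(candidate), is_valid_filter_path(candidate))
```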

First-party analyzer options:

| Table | Options |
| --- | --- |
| [plugins.analyzer-python] | emit_module_documentation, emit_imports, emit_constants, emit_type_aliases |
| [plugins.analyzer-json] | enabled_families = ["schema", "package", "release"], emit_dependencies, emit_scripts, emit_schema_properties |
| [plugins.analyzer-c] | use_leading_comments, emit_doxygen_documentation, include_system_includes, emit_macros |
| [plugins.analyzer-cpp] | use_leading_comments, emit_doxygen_documentation, include_system_includes, emit_namespaces, emit_macros |
| [plugins.analyzer-bash] | emit_functions |
| [plugins.analyzer-markdown] | strip_front_matter, emit_file_artifact_without_headings, min_heading_level, max_heading_level |
| [plugins.analyzer-text] | include_root_files, include_docs_directories, exclude_generated, exclude_fixtures_logs |

First-party backend tables currently accept only common plugin keys:

[plugins.backend-sqlite]
enabled = true

[plugins.backend-duckdb]
enabled = true

Disabling the configured active backend is invalid. Disable an inactive backend only, or change [backend].name first.
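The active-backend rule can be expressed as a small cross-table check. The sketch below assumes the plugin table for a backend is named backend-<name>, as the sqlite and duckdb examples suggest, and that a missing [backend] table falls back to the generated default "sqlite". The function name check_active_backend is hypothetical.

```python
from typing import Any, Dict, List


def check_active_backend(config: Dict[str, Any]) -> List[str]:
    """Report an error when the configured active backend is disabled."""
    # Fall back to the generated default when [backend] is absent (assumed).
    active = config.get("backend", {}).get("name", "sqlite")
    # Assumed naming convention: [plugins.backend-<name>].
    plugin_table = config.get("plugins", {}).get(f"backend-{active}", {})
    if plugin_table.get("enabled", True) is False:
        return [
            f"plugins.backend-{active}.enabled: cannot disable the active backend; "
            "change backend.name first"
        ]
    return []


invalid = {"backend": {"name": "sqlite"},
           "plugins": {"backend-sqlite": {"enabled": False}}}
valid = {"backend": {"name": "duckdb"},
         "plugins": {"backend-sqlite": {"enabled": False},
                     "backend-duckdb": {"enabled": True}}}
print(check_active_backend(invalid))
print(check_active_backend(valid))
```

The second config passes because it disables only the inactive sqlite backend while duckdb remains active.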