System Overview
codira currently consists of four practical layers:
| Layer | Current responsibility |
|---|---|
| scanner | discover repository files for indexing |
| indexer | orchestrate analyzer routing, normalized artifacts, and backend persistence |
| query | resolve exact and semantic retrieval against the repository index |
| CLI | expose repository-local commands and output contracts |
The implementation is intentionally repository-local:
- the CLI operates relative to the current repository root
- index data lives under `.codira/`
- exact and semantic query paths read the same SQLite database
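As a rough illustration of the repository-local layout, the sketch below resolves a database path under `.codira/` and opens it with the standard `sqlite3` module. The file name `index.db` and both helper names are assumptions made for this example; they are not taken from the codira source.

```python
import sqlite3
from pathlib import Path

INDEX_DIR_NAME = ".codira"
INDEX_DB_NAME = "index.db"  # hypothetical file name, chosen for illustration


def index_db_path(repo_root: Path) -> Path:
    """Return the path of the index database under the repository root."""
    return repo_root / INDEX_DIR_NAME / INDEX_DB_NAME


def open_index(repo_root: Path) -> sqlite3.Connection:
    """Open (and create if needed) the repository-local SQLite database."""
    db_path = index_db_path(repo_root)
    db_path.parent.mkdir(parents=True, exist_ok=True)
    return sqlite3.connect(db_path)
```

Because exact and semantic queries read the same database, both paths could share a single connection helper of this kind.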
Current Module Shape
The current branch centers on these modules:
- `src/codira/cli.py` for command parsing and output formatting
- `src/codira/scanner.py` for Git-backed file discovery with filesystem fallback
- `src/codira/registry.py` for backend selection and analyzer activation
- `src/codira/indexer.py` for incremental orchestration and SQLite backend persistence/query implementation
- `src/codira/analyzers/python.py` and `src/codira/analyzers/c.py` for language-specific analysis
- `src/codira/storage.py` for SQLite initialization and schema refresh
- `src/codira/query/exact.py` for exact lookup helpers
- `src/codira/query/producers.py` for shared retrieval producer metadata
- `src/codira/query/context.py` and `src/codira/semantic/search.py` for context retrieval and embedding-backed ranking
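To make "Git-backed file discovery with filesystem fallback" concrete, here is a minimal sketch of one way such a scanner could work. It asks `git ls-files` for tracked files and falls back to a directory walk when Git is unavailable or the directory is not a repository. The function names and the set of skipped directories are assumptions for this example, not the actual `scanner.py` API.

```python
import subprocess
from pathlib import Path

SKIPPED_DIRS = {".git", ".codira"}  # assumed exclusions for the fallback walk


def git_files(repo_root: Path) -> list[Path]:
    """List tracked files via `git ls-files -z` (NUL-separated output)."""
    result = subprocess.run(
        ["git", "ls-files", "-z"],
        cwd=repo_root,
        capture_output=True,
        check=True,
    )
    names = result.stdout.decode("utf-8", errors="surrogateescape").split("\0")
    return sorted(Path(name) for name in names if name)


def walk_files(repo_root: Path) -> list[Path]:
    """Fallback: walk the filesystem, skipping Git and index directories."""
    found = []
    for path in repo_root.rglob("*"):
        rel = path.relative_to(repo_root)
        if path.is_file() and not SKIPPED_DIRS.intersection(rel.parts):
            found.append(rel)
    return sorted(found)


def discover_files(repo_root: Path) -> list[Path]:
    """Prefer Git; fall back to the walk if Git is missing or fails."""
    try:
        return git_files(repo_root)
    except (OSError, subprocess.CalledProcessError):
        return walk_files(repo_root)
```

Sorting both result lists keeps the output deterministic regardless of which path produced it, which matters if later stages compare successive scans.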
ADR-004 Boundary
ADR-004 now defines the architecture that this branch implements:
- one active index backend per repository instance
- multiple language analyzers in one indexing run
- documentation and tests landing alongside architectural refactors
The remaining future work is no longer about introducing these boundaries. It is about extending them without breaking the current contracts.
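The "multiple language analyzers in one indexing run" boundary can be pictured as a routing step ahead of a single backend. The sketch below groups discovered files by analyzer; the analyzer names follow the `python` and `c` modules listed above, while the suffix mapping and the `route_files` helper are assumptions made for illustration only.

```python
from pathlib import Path

# Assumed suffix mapping; the real analyzers may claim files differently.
ANALYZER_SUFFIXES = {
    "python": (".py",),
    "c": (".c", ".h"),
}


def route_files(paths: list[Path]) -> dict[str, list[Path]]:
    """Group files by the first analyzer whose suffixes match."""
    routed: dict[str, list[Path]] = {name: [] for name in ANALYZER_SUFFIXES}
    for path in paths:
        for name, suffixes in ANALYZER_SUFFIXES.items():
            if path.suffix in suffixes:
                routed[name].append(path)
                break  # each file goes to at most one analyzer
    return routed
```

Files that no analyzer claims are simply left out, so every routed group could be written to the one active backend in the same run.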
For retrieval specifically, the current accepted split is:
- analyzers provide indexing-time language knowledge
- shared query producer descriptors provide retrieval-facing capability metadata
- the query layer consumes those descriptors without depending on analyzer internals
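The split above can be sketched with a small descriptor type that the query layer reads without importing any analyzer. Everything in this example (`ProducerDescriptor`, its fields, the capability strings, and `producers_for`) is hypothetical and only illustrates the shape of the contract.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProducerDescriptor:
    """Retrieval-facing metadata; holds no analyzer internals."""

    name: str
    capabilities: frozenset[str]


def producers_for(
    capability: str, descriptors: list[ProducerDescriptor]
) -> list[str]:
    """Return the names of producers that advertise a capability."""
    return [d.name for d in descriptors if capability in d.capabilities]


# Hypothetical descriptors, as the shared producer module might publish them.
DESCRIPTORS = [
    ProducerDescriptor("python", frozenset({"exact", "semantic"})),
    ProducerDescriptor("c", frozenset({"exact"})),
]
```

With these sample descriptors, `producers_for("semantic", DESCRIPTORS)` yields only the `python` entry. The query layer depends on the descriptor type alone, so an analyzer could change its internals without affecting retrieval.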