Embedding Performance Execution
Purpose
This branch-local ledger records the executed steps for the batched embedding performance workstream.
Branch
The implementation branch for this work is:
feat/batch-embedding-indexing
Planned Phases
- Branch bootstrap and execution ledger
- ADR for batched embeddings and tunable runtime controls
- Batched embedding backend implementation
- Same-run payload deduplication in index persistence
- Benchmark script and operator documentation
- Validation, tuning review, and commit preparation
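
The batching and same-run payload deduplication phases listed above can be illustrated with a minimal sketch. It is not the branch's actual implementation: `embed_unique_in_batches`, `fake_embed`, and the `batch_size` parameter are hypothetical names introduced here, and the real backend and persistence code may be organized differently. The idea shown is only that identical payloads within one run are embedded once and that distinct payloads are sent to the backend in batches.

```python
import hashlib
from typing import Callable, Sequence

Vector = list[float]


def embed_unique_in_batches(
    payloads: Sequence[str],
    embed_batch: Callable[[list[str]], list[Vector]],
    batch_size: int = 32,
) -> tuple[list[Vector], int]:
    """Return one vector per payload and the number of backend calls made.

    Hypothetical helper: identical payloads share a single computed vector,
    and distinct payloads are embedded in chunks of at most ``batch_size``.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")

    # Key every payload by a content digest; keep the first text per digest.
    keys: list[str] = []
    unique: dict[str, str] = {}
    for text in payloads:
        key = hashlib.sha256(text.encode("utf-8")).hexdigest()
        keys.append(key)
        unique.setdefault(key, text)

    # Embed only the distinct payloads, one backend call per chunk.
    cache: dict[str, Vector] = {}
    pending = list(unique.items())
    calls = 0
    for start in range(0, len(pending), batch_size):
        chunk = pending[start : start + batch_size]
        vectors = embed_batch([text for _, text in chunk])
        calls += 1
        for (key, _), vector in zip(chunk, vectors):
            cache[key] = vector

    # Fan the cached vectors back out in the original payload order.
    return [cache[key] for key in keys], calls


def fake_embed(texts: list[str]) -> list[Vector]:
    """Stand-in for a real embedding backend (assumption, for illustration)."""
    return [[float(len(text))] for text in texts]


vectors, calls = embed_unique_in_batches(
    ["a", "bb", "a", "ccc", "bb"], fake_embed, batch_size=2
)
```

With five payloads of which three are distinct and a batch size of 2, the stand-in backend is called twice rather than five times, while the caller still receives five vectors in input order.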
Executed Steps
- [x] Created the dedicated implementation branch.
- [x] Added the branch-local execution ledger.
- [x] Added ADR-008 covering batching, same-run payload reuse, and explicit runtime controls.
- [x] Added a batched embedding API and environment-driven runtime settings.
- [x] Updated index persistence to batch recomputed embeddings and reuse identical payload vectors within one flush.
- [x] Added regression tests for batching and same-run payload reuse.
- [x] Added a benchmark script for phase timings and embedding batch metrics.
- [x] Ran the full validation surface: `black --check src scripts tests`, `ruff check src scripts tests`, `mypy src scripts tests`, `pytest -q`.
- [x] Captured one instrumented full-index benchmark on this repository. The first sample showed `embed_texts` and `flush_embedding_rows` dominating wall time, which confirmed the optimization target.
- [x] Ran controlled embedding microbenchmarks after the first pass. On this host, constrained Torch threads and larger batches sometimes helped on the synthetic benchmark.
- [x] Kept runtime tuning operator-controlled after follow-up end-to-end measurements proved too noisy to justify hardcoded thread defaults in this branch.
- [x] Recovered the historical large-repository baseline for `Personalia/Progetti/Software/texlive-2026-source` under `codira 1.4.0`. The timed `codira index --full` run started at `21:30:18` on 02/04/2026 and ended at `05:31:21` on 03/04/2026, for a total wall time of `8h01m03s`, with: `Indexed: 7933`, `Reused: 0`, `Deleted: 0`, `Failed: 0`, `Embeddings recomputed: 43732`, `Embeddings reused: 0`, `Coverage issues: 0`.
- [x] Captured a large-repository full-index benchmark on `Personalia/Progetti/Software/texlive-2026-source` after the 1.7.x performance and audit-policy updates. On 03/04/2026 the command `codira index --full` ran from `17:27:36` to `17:40:21`, for a total wall time of `12m45s`, with: `Indexed: 7933`, `Reused: 0`, `Deleted: 0`, `Failed: 0`, `Embeddings recomputed: 43732`, `Embeddings reused: 0`, `Coverage issues: 0`.
- [x] Recorded an apples-to-apples before/after comparison for the same large repository and the same full-index workload: `8h01m03s` on `codira 1.4.0` versus `12m45s` on the current 1.7.x line, which is roughly a `37.7x` speedup by wall clock.
- [x] Created the final branch commit.
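
The wall-clock figures recorded above can be rechecked with a short calculation. The sketch below assumes the ledger's dates are written day/month/year, which is the only reading consistent with the recorded `8h01m03s` baseline. The helper name `elapsed_seconds` is introduced here for illustration and is not part of the benchmark script.

```python
from datetime import datetime

# Assumption: ledger dates use day/month/year ordering.
FMT = "%d/%m/%Y %H:%M:%S"


def elapsed_seconds(start: str, end: str) -> int:
    """Whole seconds between two ledger timestamps."""
    delta = datetime.strptime(end, FMT) - datetime.strptime(start, FMT)
    return int(delta.total_seconds())


# Timestamps and counts copied from the two recorded full-index runs.
baseline = elapsed_seconds("02/04/2026 21:30:18", "03/04/2026 05:31:21")
current = elapsed_seconds("03/04/2026 17:27:36", "03/04/2026 17:40:21")
embeddings = 43732

speedup = baseline / current
print(baseline, current, round(speedup, 1))  # 28863 765 37.7

# Derived throughput in recomputed embeddings per second of wall time.
print(embeddings / baseline, embeddings / current)
```

Because both runs recomputed the same 43732 embeddings and reused none, the wall-time ratio and the throughput ratio are the same number.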