- [ ] T011 [P] Add config validation tests (env loading, provider/model format, invalid provider) in `tests/test_ai_config.py`
- [ ] T012 [P] Record FR-011 DRY/search evidence in `specs/002-ai-document-processing/research.md`
- [ ] T013 [P] Run focused foundational red/collect checks for `tests/test_ai_config.py` and AI modules
- [ ] T014 [P] User approval checkpoint: confirm foundational red-phase failures are reviewed and approved before continuing implementation work
- [ ] T015 [P] Fix strict lint-rule gates (`PLC0415`, `ANN001`, `F821`, `ANN201`, `B008`, `PLW2901`, `S108`) in `src/tdoc_crawler/ai/` and `tests/test_ai_*.py`
- [ ] T016 [P] Add FR-012 network-policy regression tests in `tests/test_ai_network_policy.py` (core crawler-source traffic must use `create_cached_session()`; AI provider traffic remains exempt)
- [ ] T017 [P] Add FR-012 compliance checks for forbidden direct core-source HTTP usage in `scripts/check.py`
- [ ] T018 [P] Add FR-018 storage-boundary test in `tests/test_ai_storage_boundary.py` verifying AI writes only to AI storage and does not mutate core SQLite schema
- [ ] T019 [P] Add FR-018 integration test in `tests/test_ai_pipeline.py` verifying metadata reads from `TDocDatabase` are read-only while artifacts persist only in AI storage
**Checkpoint**: Foundation stable and constitution-aligned.
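
As a rough illustration of what the T011 "provider/model format" and "invalid provider" cases might exercise, here is a minimal sketch. It assumes the model setting is a `<provider>/<model>` string. The names `parse_model_ref`, `ModelRef`, and `KNOWN_PROVIDERS` are hypothetical and are not taken from `src/tdoc_crawler/ai/`; the allow-list contents are placeholders.

```python
from dataclasses import dataclass

# Hypothetical allow-list for illustration only; the real provider set
# is not specified in this task list.
KNOWN_PROVIDERS = frozenset({"openai", "ollama"})


@dataclass(frozen=True)
class ModelRef:
    """Parsed form of a '<provider>/<model>' setting (assumed format)."""

    provider: str
    model: str


def parse_model_ref(value: str) -> ModelRef:
    """Split '<provider>/<model>' on the first slash and validate both parts."""
    provider, sep, model = value.strip().partition("/")
    if not sep or not provider or not model:
        raise ValueError(f"expected '<provider>/<model>', got {value!r}")
    provider = provider.lower()
    if provider not in KNOWN_PROVIDERS:
        raise ValueError(f"unknown provider {provider!r}")
    return ModelRef(provider, model)
```

Splitting on the first slash only keeps model names that themselves contain a slash intact. A red-phase test could assert that a malformed string and an unknown provider both raise `ValueError` before any implementation exists in the real module.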
- [ ] T043 [US4] Write embedding tests in `tests/test_ai_embeddings.py`
- [ ] T044 [US4] Run red checkpoint for `tests/test_ai_embeddings.py` and record failing output in `tests/test_ai_embeddings.py`
- [ ] T045 [US4] User approval checkpoint: confirm US4 red-phase failures are reviewed and approved before implementation
### Implementation for User Story 4
- [ ] T046 [US4] Implement section-based chunking and overlap logic in `src/tdoc_crawler/ai/operations/embed.py`
- [ ] T047 [US4] Implement embedding generation and model-version metadata in `src/tdoc_crawler/ai/operations/embed.py`
- [ ] T048 [US4] Implement `query_embeddings()` API and pipeline registration in `src/tdoc_crawler/ai/__init__.py` and `src/tdoc_crawler/ai/operations/pipeline.py`
**Checkpoint**: Semantic chunk retrieval is operational.
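
The chunking-with-overlap task above can be sketched as follows. This is one possible shape, not the contents of `embed.py`: it assumes sections have already been extracted into a heading-to-text mapping, counts size in whitespace-separated words, and uses the hypothetical names `chunk_words` and `chunk_sections` with placeholder defaults.

```python
def chunk_words(text: str, max_words: int = 200, overlap: int = 20) -> list[str]:
    """Split text into word windows where consecutive windows share `overlap` words."""
    if max_words <= 0:
        raise ValueError("max_words must be positive")
    if not 0 <= overlap < max_words:
        raise ValueError("overlap must satisfy 0 <= overlap < max_words")
    words = text.split()
    step = max_words - overlap  # how far each new window advances
    chunks: list[str] = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # this window already reached the end of the section
    return chunks


def chunk_sections(
    sections: dict[str, str], max_words: int = 200, overlap: int = 20
) -> list[tuple[str, str]]:
    """Chunk each section separately so no chunk crosses a section boundary."""
    return [
        (heading, chunk)
        for heading, body in sections.items()
        for chunk in chunk_words(body, max_words, overlap)
    ]
```

Keeping the heading alongside each chunk lets a later retrieval step report which section a match came from. A token-based size limit would follow the same pattern with a tokenizer in place of `str.split()`.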