In practice, `kreuzberg` extraction currently uses only `result.content`; `result.tables` and `result.images` are not consistently propagated through all paths.
---
## Phase 0: Compatibility and Unification Design
Goal: Lock a safe integration contract before coding.
- `StructuredExtractionResult` (canonical output shared by process/convert/summarize)
- Add extraction feature toggles to `LightRAGConfig`:
  - `extract_tables: bool = True`
  - `extract_figures: bool = True`
  - `extract_equations: bool = True`
  - `figure_description_enabled: bool = True`
- Keep `ProcessingResult` counters owned by `lightrag/processor.py` (for example `table_count`, `figure_count`, `equation_count`) while using shared element models from `models.py`.
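A minimal sketch of how the four toggles could gate a shared extraction result. Only the toggle names and defaults, `LightRAGConfig`, and `StructuredExtractionResult` come from the plan; the dataclass form, the result's field names (`content`, `tables`, `figures`, `equations`), and the `apply_toggles` helper are assumptions made for illustration, and the real classes may look quite different.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class LightRAGConfig:
    """Illustrative stand-in: only the four extraction toggles are shown."""

    extract_tables: bool = True
    extract_figures: bool = True
    extract_equations: bool = True
    figure_description_enabled: bool = True


@dataclass
class StructuredExtractionResult:
    """Hypothetical shape of the canonical output; field names are assumed."""

    content: str = ""
    tables: list[str] = field(default_factory=list)
    figures: list[str] = field(default_factory=list)
    equations: list[str] = field(default_factory=list)


def apply_toggles(
    result: StructuredExtractionResult, config: LightRAGConfig
) -> StructuredExtractionResult:
    """Return a copy of `result` with disabled element types emptied (hypothetical helper)."""
    return StructuredExtractionResult(
        content=result.content,
        tables=list(result.tables) if config.extract_tables else [],
        figures=list(result.figures) if config.extract_figures else [],
        equations=list(result.equations) if config.extract_equations else [],
    )
```

Under this sketch, counters such as `table_count` could be derived in `lightrag/processor.py` from the lengths of the filtered lists, which keeps counting logic out of the shared models.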
### Provider compatibility matrix (initial)
- Ingestion/query providers currently implemented in `lightrag/rag.py`: `ollama`, `openai`, `zhipu`, `hf`, `jina`.
- Figure-description generation must explicitly handle providers without vision support:
  - If unsupported: skip description generation and log a clear reason.
  - Do not fail full document ingestion for missing vision capability.
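The skip-and-log behaviour described above might be sketched as follows. The function name `describe_figure`, the `describe_fn` callback, and the `vision_providers` argument are all hypothetical; in particular, the set of vision-capable providers is passed in by the caller because the plan does not state which providers support vision.

```python
import logging
from typing import Callable, Collection, Optional

logger = logging.getLogger(__name__)


def describe_figure(
    provider: str,
    image_bytes: bytes,
    describe_fn: Callable[[bytes], str],
    vision_providers: Collection[str],
) -> Optional[str]:
    """Return a figure description, or None when the provider lacks vision support.

    A None result signals "no description"; the caller continues ingestion
    with the figure recorded but undescribed instead of raising.
    """
    if provider not in vision_providers:
        # Log a clear reason and skip; ingestion of the document goes on.
        logger.warning(
            "Skipping figure description: provider %r has no vision support",
            provider,
        )
        return None
    return describe_fn(image_bytes)
```

Returning `None` instead of raising keeps the missing capability local to the figure step, so the rest of the document is still ingested.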
---
## Phase 1: Shared Structured Extraction Core
Goal: Build one extraction flow consumed by all three entrypoints.
- [x] (2026-03-25) E2E command syntax corrected: `workspace add-members` now documented with positional item args (no `--items` flag)
- [x] (2026-03-25) E2E processing path fixed in CLI: replaced invalid `TDocDatabase.get_tdoc()` call and fixed `_logger` usage in `workspace process`
- [x] (2026-03-25) LightRAG insert compatibility fix: `TDocRAG.insert()` retries without kwargs when runtime `ainsert` rejects metadata kwargs
- [x] (2026-03-25) E2E query blocker fixed: removed query-time model override to preserve LightRAG `hashing_kv` injection and wrapped embeddings with `EmbeddingFunc` for hybrid query compatibility
- [x] (2026-03-25) PR-10 end-to-end flow validated on workspace `test-rag-elements-e2e` (create -> add-members -> process -> rag query)
- [x] Phase 0: Compatibility and Unification Design
- [x] Phase 1: Shared Structured Extraction Core
- [x] Phase 2: Table Preservation
- [x] Phase 3: Figure/Image Extraction
- [x] Phase 4: Equation Handling and Structural Chunking
@@ -4,6 +4,7 @@ This document provides a chronological log of all significant changes and improv
## Recent Changes
- **2026-03-25**: [Enhanced RAG pipeline with tables, figures, and equations](history/2026-03-25_SUMMARY_enhanced_rag_pipeline_tables_figures_equations.md)
- **2026-03-24**: [Convert and summarize commands implementation](history/2026-03-24_SUMMARY_convert_summarize_commands_implementation.md)