+270
−0
File added.
+329
−0
File added.
- New extraction.py: Unified extract_document_structured() function for all document types
- New conversion.py: Consolidated PDF conversion with LibreOffice/remote fallback
- processor.py: Rename TDocProcessor → DocumentProcessor, remove duplicated extraction logic
- convert.py: Delegate to unified pipeline, keep only TDoc-specific metadata enrichment
- Tests: Update to use DocumentProcessor and mock unified extraction functions

This refactoring enables specs and arbitrary documents to use the same extraction pipeline as TDocs, with consistent caching, error handling, and artifact persistence. Document type now only matters during checkout and metadata retrieval, not extraction.
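For reviewers who want a mental model of the fallback and caching behaviour described above, here is a minimal sketch. It is not the code in this change: `convert_to_pdf`, `StructuredDocument`, `ConversionError`, the `extract` method, and both stub backends are hypothetical names chosen for illustration, and the real signatures in extraction.py and conversion.py may differ. Only the `DocumentProcessor` class name comes from the description.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Callable, Dict, List, Sequence

# Hypothetical type: a backend takes a source path and returns PDF bytes.
Converter = Callable[[Path], bytes]


class ConversionError(RuntimeError):
    """Raised when every conversion backend has failed."""


def convert_to_pdf(source: Path, backends: Sequence[Converter]) -> bytes:
    """Try each backend in order, e.g. local LibreOffice first, then remote."""
    errors: List[str] = []
    for backend in backends:
        try:
            return backend(source)
        except Exception as exc:  # any backend failure triggers the fallback
            errors.append(f"{getattr(backend, '__name__', 'backend')}: {exc}")
    raise ConversionError("; ".join(errors))


@dataclass
class StructuredDocument:
    """Illustrative result type; the real structure is not shown in the PR."""
    source: Path
    pdf_bytes: bytes


class DocumentProcessor:
    """Type-agnostic processor: TDocs, specs, and other files share one path."""

    def __init__(self, backends: Sequence[Converter]) -> None:
        self._backends = list(backends)
        self._cache: Dict[Path, StructuredDocument] = {}

    def extract(self, source: Path) -> StructuredDocument:
        # Assumed in-memory cache keyed by path; the actual pipeline
        # persists artifacts, which this sketch does not model.
        if source not in self._cache:
            pdf = convert_to_pdf(source, self._backends)
            self._cache[source] = StructuredDocument(source, pdf)
        return self._cache[source]


# Stub backends standing in for the LibreOffice and remote converters.
remote_calls: List[Path] = []


def local_libreoffice(path: Path) -> bytes:
    raise OSError("soffice not found")


def remote_service(path: Path) -> bytes:
    remote_calls.append(path)
    return b"%PDF-stub"


processor = DocumentProcessor([local_libreoffice, remote_service])
doc = processor.extract(Path("spec.docx"))
```

Passing the backends in as an ordered list keeps the fallback order explicit and makes the processor easy to exercise in tests with stubs, which mirrors the "mock unified extraction functions" approach mentioned for the test updates.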