The _score_filename() function already handles all file types defined in STRUCTURED_PATTERNS (.doc, .docx, .pdf) and UNSTRUCTURED_PATTERNS (.ppt, .pptx, .xls, .xlsx, .txt, .csv), but classify_document_files() only searched for *.docx files, so files of the other supported types were never classified.

Fixed by:
- Expanding the glob patterns to match all supported file extensions
- Using a separate glob for each extension: .doc, .docx, .pdf, .ppt, .pptx, .xls, .xlsx, .txt, .csv
- Removing duplicates when a file matches more than one pattern
- Making the warning message more generic
- Removing the DOCX-specific reference from the comment

Renamed and updated the test test_non_docx_files_are_ignored() to test_non_docx_files_are_classified() to reflect the new behavior, where .pptx and .xlsx files are classified rather than ignored.
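
The globbing and de-duplication approach described here can be sketched roughly as follows. This is a minimal illustration and not the actual body of classify_document_files(): the helper name find_supported_files and the constant SUPPORTED_EXTENSIONS are hypothetical, and the extension list is simply copied from the description.

```python
from pathlib import Path

# Hypothetical constant: the extensions named in the description, i.e. the
# union of STRUCTURED_PATTERNS and UNSTRUCTURED_PATTERNS.
SUPPORTED_EXTENSIONS = (
    ".doc", ".docx", ".pdf",                           # structured
    ".ppt", ".pptx", ".xls", ".xlsx", ".txt", ".csv",  # unstructured
)


def find_supported_files(directory):
    """Return files in `directory` with a supported extension, without duplicates.

    Hypothetical helper: runs one glob per extension and keeps the first
    occurrence of each path.
    """
    root = Path(directory)
    seen = set()
    found = []
    for ext in SUPPORTED_EXTENSIONS:
        # One glob per extension; sorted() makes the result order deterministic.
        for path in sorted(root.glob(f"*{ext}")):
            # Guard against a path being returned by more than one pattern.
            if path.is_file() and path not in seen:
                seen.add(path)
                found.append(path)
    return found
```

Tracking seen paths in a set keeps the de-duplication independent of how the patterns are written, so a later change to overlapping patterns would not produce duplicate entries.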