feat(ai): add --accelerate option for embedding backend selection (4d0dac87) · Commits · Jan Reimes / 3gpp-crawler

docs/ai.md

+7 −0

Original line number	Diff line number	Diff line
		@@ -59,6 +59,7 @@ TDC_AI_LLM_API_BASE= # Optional: custom endpoint

		# Embedding Model (HuggingFace sentence-transformers)
		TDC_AI_EMBEDDING_MODEL=sentence-transformers/all-MiniLM-L6-v2 # Default: popular 384-dim model
		TDC_AI_EMBEDDING_BACKEND=torch # torch | onnx | openvino (default: torch)

		# Storage
		TDC_AI_STORE_PATH= # Defaults to <cache_dir>/.ai/lancedb
		@@ -161,6 +162,12 @@ tdoc-crawler ai workspace process -w my-project --force

		### 5. Single TDoc Operations

		Process a single TDoc through the pipeline (classification, extraction, embeddings, graph). Use `--accelerate` to choose the sentence-transformers backend (`torch`, `onnx`, or `openvino`) used for embedding generation.

		```bash
		tdoc-crawler ai process --tdoc-id SP-240001 --accelerate onnx
		```

		______________________________________________________________________

		## CLI Commands

+1 −0

		@@ -4,6 +4,7 @@ This document provides a chronological log of all significant changes and improv

		## Recent Changes

		- 2026-03-06: [AI embeddings accelerate backend option](history/2026-03-06_SUMMARY_01_AI_EMBEDDINGS_ACCELERATE_BACKEND.md)
		- 2026-02-09: [Align CLI options across commands](history/2026-02-09_SUMMARY_01_ALIGN_CLI_OPTIONS_ACROSS_COMMANDS.md)
		- 2026-02-07: [Spec download auto-crawl and bug fixes](history/2026-02-07_SUMMARY_01_SPEC_DOWNLOAD_AUTO_CRAWL_AND_BUG_FIXES.md)
		- 2026-02-03: [Lazy credential resolution](history/2026-02-03_SUMMARY_03_LAZY_CREDENTIAL_RESOLUTION.md)

0 → 100644

+27 −0

		# Summary - AI embeddings accelerate backend option

		Added an `--accelerate` option for `ai process` so callers can select the sentence-transformers backend used for embedding generation (`torch`, `onnx`, or `openvino`). The selected value is passed through `AiConfig` into the embedding pipeline, so every stage uses the same backend.

		## Changes

		### CLI

		- Added `--accelerate/-a` to `ai process` with `torch | onnx | openvino` choices.
		- Passed the selected backend through `AiConfig` into the pipeline.
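
The CLI change above can be pictured with a minimal standard-library sketch. This is not the project's actual command definition: `build_parser`, the `ai-process-sketch` program name, and the `torch` default on the flag are assumptions made for illustration. Only the option names and the three choices come from the summary.

```python
import argparse

# Backend names listed in the summary above.
BACKENDS = ("torch", "onnx", "openvino")


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical stand-in for the `ai process` command definition."""
    parser = argparse.ArgumentParser(prog="ai-process-sketch")
    parser.add_argument("--tdoc-id", required=True, help="TDoc to process")
    parser.add_argument(
        "-a",
        "--accelerate",
        choices=BACKENDS,   # argparse rejects any other value
        default="torch",    # assumed default, mirroring TDC_AI_EMBEDDING_BACKEND
        help="sentence-transformers backend for embeddings",
    )
    return parser


args = build_parser().parse_args(["--tdoc-id", "SP-240001", "-a", "onnx"])
print(args.accelerate)  # prints: onnx
```

Restricting the flag with `choices` means an unsupported backend name fails at argument parsing, before any model is loaded.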

		### AI Configuration

		- Added `embedding_backend` to `AiConfig` with validation and `TDC_AI_EMBEDDING_BACKEND` support.
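
A rough sketch of what validating `embedding_backend` with environment support might look like. The real `AiConfig` is not shown in this commit view, so `AiConfigSketch`, `from_env`, and the normalization rules (trimming and lowercasing) are hypothetical. The field name, the environment variable, the allowed values, and the `torch` default are taken from the docs above.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

ALLOWED_BACKENDS = ("torch", "onnx", "openvino")


@dataclass(frozen=True)
class AiConfigSketch:
    """Hypothetical stand-in for AiConfig; only the backend field is modelled."""

    embedding_backend: str = "torch"

    def __post_init__(self) -> None:
        # Assumed normalization: ignore surrounding whitespace and letter case.
        normalized = self.embedding_backend.strip().lower()
        if normalized not in ALLOWED_BACKENDS:
            raise ValueError(
                f"embedding_backend must be one of {ALLOWED_BACKENDS}, "
                f"got {self.embedding_backend!r}"
            )
        # The dataclass is frozen, so bypass the generated __setattr__.
        object.__setattr__(self, "embedding_backend", normalized)

    @classmethod
    def from_env(cls, env: Optional[Mapping[str, str]] = None) -> "AiConfigSketch":
        env = os.environ if env is None else env
        # An unset or empty variable falls back to the documented default.
        raw = env.get("TDC_AI_EMBEDDING_BACKEND") or "torch"
        return cls(embedding_backend=raw)


print(AiConfigSketch.from_env({"TDC_AI_EMBEDDING_BACKEND": "openvino"}))
```

Validating once at configuration time keeps the later pipeline stages free of their own backend checks.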

		### Embedding Pipeline

		- Threaded backend selection into `EmbeddingsManager` creation and the pipeline's embedding stage.
		- Fixed the placeholder backend argument when constructing `SentenceTransformer`.
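
As an illustration of passing the backend through to model construction, here is a minimal sketch. `create_embedding_model` and its `factory` parameter are hypothetical names, not the project's `EmbeddingsManager` API. The sketch assumes a sentence-transformers release whose `SentenceTransformer` constructor accepts a `backend` keyword, and that the extras for the chosen backend are installed.

```python
from typing import Any, Callable, Optional


def create_embedding_model(
    model_name: str,
    backend: str = "torch",
    factory: Optional[Callable[..., Any]] = None,
) -> Any:
    """Build an embedding model, forwarding the configured backend.

    `factory` is a hypothetical injection point so the wiring can be
    exercised without downloading a model.
    """
    if factory is None:
        # Imported lazily: only needed when a real model is requested.
        from sentence_transformers import SentenceTransformer

        factory = SentenceTransformer
    # Forward the caller's choice instead of a hard-coded placeholder value.
    return factory(model_name, backend=backend)
```

With the package installed, a call might look like `create_embedding_model("sentence-transformers/all-MiniLM-L6-v2", backend="onnx")`, which loads the model on first use.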

		### Dependencies

		- Updated `tdoc-ai` dependencies to ensure all requested backends can be installed.

		## Verification

		- Not run (docs and code changes only).