feat(ai): update embedding model configuration and enhance workspace commands
- Replace Ollama embedding model with HuggingFace sentence-transformers.
- Add new commands for workspace management and artifact clearing.
- Improve documentation for embedding model usage and configuration.
The provider (first segment) is validated against the supported allowlist. The model name (everything after the first `/`) can contain additional slashes for nested model paths.
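
The split-and-validate rule above can be sketched in a few lines of Python. This is an illustration only, not the project's implementation: the function name `parse_model_identifier` and the contents of `SUPPORTED_PROVIDERS` are assumptions made for the example.

```python
# Hypothetical allowlist for illustration; the real set of supported
# providers is defined by the project and may differ.
SUPPORTED_PROVIDERS = frozenset({"openai", "anthropic", "ollama"})


def parse_model_identifier(identifier: str) -> tuple[str, str]:
    """Split '<provider>/<model_name>' at the FIRST slash only.

    Everything after the first slash is kept verbatim as the model name,
    so nested model paths containing further slashes survive intact.
    """
    provider, separator, model_name = identifier.partition("/")
    if not separator or not provider or not model_name:
        raise ValueError(
            f"expected '<provider>/<model_name>', got {identifier!r}"
        )
    if provider not in SUPPORTED_PROVIDERS:
        raise ValueError(f"unsupported provider: {provider!r}")
    return provider, model_name
```

Using `str.partition` rather than `str.split("/")` means the model name never has to be re-joined, which is why additional slashes in it are harmless.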
**Embedding models** use **HuggingFace model IDs** (handled by sentence-transformers):
Embedding models are **local-only**: the model is downloaded from HuggingFace and run locally. The model ID must refer to a valid HuggingFace model that works with sentence-transformers.
tdoc-crawler ai workspace create <name> [--auto-build]
@@ -172,12 +183,17 @@ tdoc-crawler ai workspace activate <name>
# Deactivate the active workspace
tdoc-crawler ai workspace deactivate
# Get workspace details
tdoc-crawler ai workspace get <name>
# Get workspace details (name, status, member counts)
tdoc-crawler ai workspace info <name>
# Remove invalid/inactive members from workspace
tdoc-crawler ai workspace clear-invalid [-w <name>]
# Clear all AI artifacts while preserving members
tdoc-crawler ai workspace clear [-w <name>]
# Delete a workspace
tdoc-crawler ai workspace delete <name>
```
### Querying
Query the knowledge base using semantic embeddings and knowledge graph (RAG + GraphRAG).
@@ -187,11 +203,13 @@ Query the knowledge base using semantic embeddings and knowledge graph (RAG + Gr
tdoc-crawler ai query "your query here"
# Query a specific workspace
tdoc-crawler ai query --workspace <workspace_name> "your query here"
```
tdoc-crawler ai query -w <workspace_name> "your query here"
Note: Uses active workspace if `-w` is not provided. Combines vector embeddings (RAG) and knowledge graph (GraphRAG).
### Single TDoc Operations
# Specify number of results
tdoc-crawler ai query "your query here" -k 10
````
Note: Uses the active workspace if `-w` is not provided. Combines vector embeddings (RAG) with the knowledge graph (GraphRAG). The query is a **positional argument** (no `--query` flag needed).
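
The argument shape described in the note (positional query, optional `-w`, optional `-k`) can be mirrored with a small `argparse` sketch. This is not how `tdoc-crawler` is implemented: the helper names `build_query_parser` and `resolve_workspace`, the `top_k` attribute name, and the absence of a default for `-k` are all assumptions for illustration.

```python
import argparse
from typing import Optional


def build_query_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented `ai query` arguments."""
    parser = argparse.ArgumentParser(prog="tdoc-crawler ai query")
    # The query text is positional, so no --query flag is needed.
    parser.add_argument("query", help="free-text query")
    parser.add_argument(
        "-w", "--workspace", default=None,
        help="workspace to query (falls back to the active workspace)",
    )
    # The real default for -k is not stated in the docs, so none is assumed.
    parser.add_argument(
        "-k", dest="top_k", type=int, default=None,
        help="number of results to return",
    )
    return parser


def resolve_workspace(requested: Optional[str], active: Optional[str]) -> str:
    """Prefer the explicitly requested workspace, else the active one."""
    if requested is not None:
        return requested
    if active is not None:
        return active
    raise ValueError("no workspace given and no active workspace set")
```

For example, `build_query_parser().parse_args(["your query here", "-k", "10"])` yields a namespace with `query` set to the text, `top_k` set to `10`, and `workspace` left as `None`, at which point `resolve_workspace` would fall back to the active workspace.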
- Original models: <https://huggingface.co/models?num_parameters=min:0,max:3B&library=sentence-transformers,onnx&sort=trending&author=sentence-transformers>
- Community models: <https://huggingface.co/models?num_parameters=min:0,max:3B&library=sentence-transformers,onnx&sort=trending>