Commit 30715b9c authored by Jan Reimes

docs(cli): add CLI refactoring implementation plan and issue dependencies

- Outline steps for refactoring CLI components to improve modularity.
- Document dependencies and recommended execution order for related issues.
- Include manual setup instructions for issue tracking and commands.
- Provide estimated effort and time for each refactoring task.
parent 4f19b797
+77 −0
# CLI Refactoring Beads Issue Dependencies

## Issue Dependencies (Recommended Execution Order)

The following beads issues have been created with dependencies that enforce the recommended execution order and avoid conflicting edits.

| Issue ID | Title | Priority | Dependencies | Status |
|-----------|-------|-----------|--------------|--------|
| `tdc-lst` | Refactor cli/fetching.py: Remove duplicate fetch_tdoc() | P2 | None | open |
| `tdc-oot` | Move normalize_portal_meeting_name() from cli/helpers.py to specs/normalization.py | P3 | None | open |
| `tdc-5uc` | Move resolve_meeting_id() from cli/helpers.py to database module | P3 | None | open |
| `tdc-6ts` | Move download_to_path() from cli/helpers.py to http_client.py | P3 | None | open |
| `tdc-n80` | Move prepare_tdoc_file() from cli/helpers.py to checkout.py | P3 | Depends on `tdc-6ts` | open |
| `tdc-72h` | Move database_path() from cli/helpers.py to database module | P4 | None | open |
| `tdc-lmu` | Update src/tdoc_crawler/cli/AGENTS.md with completed refactoring | P4 | Depends on all refactoring | open |
| `tdc-z37` | Run full test suite after CLI refactoring | P1 | Depends on all refactoring | open |

## Dependency Definitions

### Without Dependencies (Can start in parallel)
These issues can be worked on immediately as they don't depend on other tasks:
- `tdc-lst` - Remove duplicate fetch_tdoc()
- `tdc-oot` - Move normalize_portal_meeting_name()
- `tdc-5uc` - Move resolve_meeting_id()
- `tdc-6ts` - Move download_to_path()
- `tdc-72h` - Move database_path()

### With Dependencies
These issues must wait for their dependencies to complete:
- `tdc-n80` - Depends on `tdc-6ts` (download_to_path)
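- `tdc-lmu` - Depends on all six refactoring tasks (`tdc-lst`, `tdc-oot`, `tdc-5uc`, `tdc-6ts`, `tdc-n80`, `tdc-72h`)
- `tdc-z37` - Depends on all six refactoring tasks (`tdc-lst`, `tdc-oot`, `tdc-5uc`, `tdc-6ts`, `tdc-n80`, `tdc-72h`)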
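
The dependency lists above form a small directed graph. As a minimal sketch (not part of the `bd` tooling), Python's standard-library `graphlib.TopologicalSorter` can group the issues into batches whose members may be worked on in parallel. The `DEPENDENCIES` mapping and the `execution_batches` helper are illustrative names; the edges are transcribed from the table.

```python
from graphlib import TopologicalSorter

# The six refactoring issues that the documentation and test issues wait on.
REFACTORING = ["tdc-lst", "tdc-oot", "tdc-5uc", "tdc-6ts", "tdc-n80", "tdc-72h"]

# Maps each issue ID to the issues it depends on (transcribed from the table).
DEPENDENCIES = {
    "tdc-lst": [],
    "tdc-oot": [],
    "tdc-5uc": [],
    "tdc-6ts": [],
    "tdc-n80": ["tdc-6ts"],
    "tdc-72h": [],
    "tdc-lmu": REFACTORING,
    "tdc-z37": REFACTORING,
}


def execution_batches(deps):
    """Group issues into ordered batches; a batch only needs earlier batches."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted only for stable output
        batches.append(ready)
        sorter.done(*ready)
    return batches


for number, batch in enumerate(execution_batches(DEPENDENCIES), start=1):
    print(f"Batch {number}: {', '.join(batch)}")
```

Under these assumptions the batches line up with the three phases of the work queue: the five independent moves first, then `tdc-n80`, then the documentation and test issues.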

## Manual Setup Instructions

To create an issue with dependencies in the `bd` issue tracker, pass them via the `--deps` flag:
```bash
bd create <title> --type <type> --priority <priority> --deps <dependencies>

# Example:
bd create "Move prepare_tdoc_file()" --type task --priority 3 --deps tdc-6ts
```

## Beads Command Reference

Create issue:
```bash
bd create <title> [flags]
```

Close issue:
```bash
bd close <id>
```

Add dependencies to existing issue:
```bash
bd deps add <id> <type>:<dependency-id>
```

## Recommended Work Queue

### Phase 1: Foundation (No dependencies)
1. `tdc-lst` - Fix fetch_tdoc duplication (15 min)
2. `tdc-oot` - Move normalize_portal_meeting_name (15 min)
3. `tdc-5uc` - Move resolve_meeting_id (30 min)
4. `tdc-6ts` - Move download_to_path (15 min)
5. `tdc-72h` - Move database_path (20 min)

### Phase 2: Integration (Depends on Phase 1)
6. `tdc-n80` - Move prepare_tdoc_file (30 min) - depends on `tdc-6ts`

### Phase 3: Documentation & Verification (Depends on all)
7. `tdc-lmu` - Update AGENTS.md (10 min)
8. `tdc-z37` - Run full test suite (10 min)

## Estimated Total Time: ~2.5 hours
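
As a quick sanity check on this total, the per-issue estimates from the work queue can be summed. The `ESTIMATES_MIN` mapping below is only an illustrative transcription of those figures.

```python
# Per-issue estimates in minutes, copied from the work queue above.
ESTIMATES_MIN = {
    "tdc-lst": 15,
    "tdc-oot": 15,
    "tdc-5uc": 30,
    "tdc-6ts": 15,
    "tdc-72h": 20,
    "tdc-n80": 30,
    "tdc-lmu": 10,
    "tdc-z37": 10,
}

total_minutes = sum(ESTIMATES_MIN.values())
total_hours = round(total_minutes / 60, 1)
print(f"{total_minutes} min = about {total_hours} h")
```

The sum is 145 minutes, which the heading rounds up to roughly 2.5 hours.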
+155 −0
# CLI Refactoring Implementation Plan

## Overview
Refactor `src/tdoc_crawler/cli/` to contain only CLI-specific functionality, moving library functions to the core package. This enables `tdoc_crawler` to be used as a standalone library.

## Phase 1: Fix fetching.py Duplication (CRITICAL)

### Issue #1: Remove Duplicate fetch_tdoc() from cli/fetching.py
**Priority:** High
**Complexity:** Low

**Steps:**
1. Remove `fetch_tdoc()` function from `src/tdoc_crawler/cli/fetching.py`
2. Import `fetch_tdoc` from `tdoc_crawler.fetching` at the top of the file
3. Update imports in `src/tdoc_crawler/cli/app.py` if needed
4. Run tests to verify functionality

**Files Changed:**
- `src/tdoc_crawler/cli/fetching.py`
- `src/tdoc_crawler/cli/app.py` (if needed)
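
One way to confirm steps 1 and 2 before running the test suite is a small static check with the standard-library `ast` module. This is a sketch only: the helper names and the `BEFORE`/`AFTER` file contents (including the `fetch_tdoc(tdoc_id)` signature) are hypothetical stand-ins for the real `cli/fetching.py`.

```python
import ast


def defines_function(source: str, name: str) -> bool:
    """Return True if `source` defines a top-level function called `name`."""
    tree = ast.parse(source)
    return any(
        isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
        and node.name == name
        for node in tree.body
    )


def imports_from(source: str, module: str, name: str) -> bool:
    """Return True if `source` has a top-level `from module import name`."""
    tree = ast.parse(source)
    return any(
        isinstance(node, ast.ImportFrom)
        and node.level == 0  # absolute import only
        and node.module == module
        and any(alias.name == name for alias in node.names)
        for node in tree.body
    )


def is_deduplicated(source: str) -> bool:
    """The duplicate is gone and the core implementation is imported instead."""
    return not defines_function(source, "fetch_tdoc") and imports_from(
        source, "tdoc_crawler.fetching", "fetch_tdoc"
    )


# Hypothetical file contents before and after the change.
BEFORE = "def fetch_tdoc(tdoc_id):\n    ...\n"
AFTER = "from tdoc_crawler.fetching import fetch_tdoc\n"

print(is_deduplicated(BEFORE), is_deduplicated(AFTER))
```

In practice the source would be read from disk, for example with `pathlib.Path("src/tdoc_crawler/cli/fetching.py").read_text()`.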

---

## Phase 2: Move Library Functions from cli/helpers.py

### Issue #2: Move normalize_portal_meeting_name() to specs/normalization.py
**Priority:** Medium
**Complexity:** Low

**Steps:**
1. Add `normalize_portal_meeting_name()` function to `src/tdoc_crawler/specs/normalization.py`
2. Update import in `src/tdoc_crawler/cli/helpers.py` to import from core
3. Update any other files that import from `cli.helpers`
4. Run tests to verify

**Files Changed:**
- `src/tdoc_crawler/specs/normalization.py`
- `src/tdoc_crawler/cli/helpers.py`

---

### Issue #3: Move resolve_meeting_id() to database module
**Priority:** Medium
**Complexity:** Medium

**Steps:**
1. Add `resolve_meeting_id()` function to `src/tdoc_crawler/database/__init__.py` or a new helper module
2. Update `src/tdoc_crawler/cli/fetching.py` to import from database module
3. Remove function from `src/tdoc_crawler/cli/helpers.py`
4. Run tests to verify

**Files Changed:**
- `src/tdoc_crawler/database/__init__.py` (or new file)
- `src/tdoc_crawler/cli/helpers.py`
- `src/tdoc_crawler/cli/fetching.py`

---

### Issue #4: Move download_to_path() to http_client module
**Priority:** Medium
**Complexity:** Low

**Steps:**
1. Add `download_to_path()` function to `src/tdoc_crawler/http_client.py`
2. Update `src/tdoc_crawler/cli/helpers.py` to import from core
3. Update `src/tdoc_crawler/checkout.py` to import from core (it currently imports this function from `cli.helpers`)
4. Run tests to verify

**Files Changed:**
- `src/tdoc_crawler/http_client.py`
- `src/tdoc_crawler/cli/helpers.py`
- `src/tdoc_crawler/checkout.py`

---

### Issue #5: Move prepare_tdoc_file() to checkout module
**Priority:** Medium
**Complexity:** Medium

**Steps:**
1. Add `prepare_tdoc_file()` function to `src/tdoc_crawler/checkout.py`
2. Update `src/tdoc_crawler/cli/helpers.py` to import from checkout module
3. Update `src/tdoc_crawler/cli/app.py` if needed
4. Run tests to verify

**Files Changed:**
- `src/tdoc_crawler/checkout.py`
- `src/tdoc_crawler/cli/helpers.py`
- `src/tdoc_crawler/cli/app.py`

---

### Issue #6: Move database_path() to database module
**Priority:** Low
**Complexity:** Low

**Steps:**
1. Add `database_path()` function to `src/tdoc_crawler/database/connection.py` or `__init__.py`
2. Update all files importing from `cli.helpers` to import from database module
3. Remove function from `src/tdoc_crawler/cli/helpers.py`
4. Run tests to verify

**Files Changed:**
- `src/tdoc_crawler/database/connection.py`
- `src/tdoc_crawler/cli/helpers.py`
- All files that import `database_path` from `cli.helpers`
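
Step 2 requires knowing which files still import `database_path` from `cli.helpers`. A minimal sketch of such an audit with the standard-library `ast` module follows; the helper names and the sample file contents in `SAMPLE_SOURCES` are hypothetical, and the match on the trailing module name `helpers` is a simplifying assumption that may also catch unrelated modules of that name.

```python
import ast


def imports_name_from_helpers(source: str, name: str) -> bool:
    """Return True if `source` imports `name` from a module named `helpers`."""
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.ImportFrom) or node.module is None:
            continue
        # Covers absolute (tdoc_crawler.cli.helpers) and relative (.helpers) forms.
        if node.module.split(".")[-1] != "helpers":
            continue
        if any(alias.name == name for alias in node.names):
            return True
    return False


def files_to_update(sources: dict[str, str], name: str) -> list[str]:
    """List the paths whose source still imports `name` from helpers."""
    return sorted(
        path
        for path, source in sources.items()
        if imports_name_from_helpers(source, name)
    )


# Hypothetical file contents, keyed by path, for illustration only.
SAMPLE_SOURCES = {
    "cli/app.py": "from tdoc_crawler.cli.helpers import database_path\n",
    "checkout.py": "from .cli.helpers import download_to_path\n",
    "cli/fetching.py": "from .helpers import database_path, resolve_meeting_id\n",
}

print(files_to_update(SAMPLE_SOURCES, "database_path"))
```

For a real audit, the mapping could be built by reading every `*.py` file under `src/tdoc_crawler/` with `pathlib.Path.rglob`.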

---

## Phase 3: Final Cleanup

### Issue #7: Update AGENTS.md with Final Classification
**Priority:** Low
**Complexity:** Low

**Steps:**
1. Update `src/tdoc_crawler/cli/AGENTS.md` to reflect completed refactoring
2. Document any remaining functions in `cli/helpers.py` and their classification

---

### Issue #8: Run Full Test Suite
**Priority:** Critical
**Complexity:** Low

**Steps:**
1. Run full test suite: `uv run pytest -v`
2. Verify all tests pass
3. Fix any regressions introduced by refactoring

---

## Dependency Order

1. **Issue #1** (fetch_tdoc duplication) - Can be done independently
2. **Issue #4** (download_to_path) - `checkout.py` depends on it
3. **Issue #5** (prepare_tdoc_file) - Depends on download_to_path
4. **Issue #2** (normalize_portal_meeting_name) - Independent
5. **Issue #3** (resolve_meeting_id) - Independent
6. **Issue #6** (database_path) - Requires auditing every import of `database_path` first
7. **Issue #7** (Update AGENTS.md) - After all refactoring
8. **Issue #8** (Full test suite) - After all refactoring

## Estimated Effort

| Issue | Complexity | Files Changed | Estimated Time |
|-------|------------|---------------|----------------|
| #1 fetch_tdoc | Low | 2 | 15 min |
| #2 normalize_portal_meeting_name | Low | 2 | 15 min |
| #3 resolve_meeting_id | Medium | 3 | 30 min |
| #4 download_to_path | Low | 3 | 15 min |
| #5 prepare_tdoc_file | Medium | 3 | 30 min |
| #6 database_path | Low | 4 | 20 min |
| #7 Update AGENTS.md | Low | 1 | 10 min |
| #8 Full test suite | Low | 1 | 10 min |
| **Total** | - | **~15 files** | **~2.5 hours** |