Commit 3e6b619c authored by Jan Reimes

Fix MustDowngradeError: disable HTTP/3 in hishel cache adapter pool

Root cause: niquests installs urllib3-future which supports HTTP/3.
create_cached_session() mounts _NiquetsCacheAdapter (hishel
CacheAdapter subclass). On cache misses, hishel delegates to
requests.adapters.HTTPAdapter → urllib3-future transport, which
ignores the Session(disable_http3=True) setting.

Fix: override init_poolmanager() in _NiquetsCacheAdapter to inject
disabled_svn={HttpVersion.h3} into urllib3-future's pool manager,
preventing HTTP/3 at the connection level.

Also fixed import (HTTPVersion → HttpVersion) and updated AGENTS.md
with accurate two-layer fix documentation.
parent 2da47e73
+3 −3
@@ -39,6 +39,6 @@ call .venv\scripts\activate.bat
3gpp-crawler workspace members

:: convert tdocs/specs to PDF/artefacts for AI processing (portable fallback profile)
:: 3gpp-crawler workspace process --profile pdf-only
3gpp-crawler workspace process --profile markdown-only --docx-direct --device cuda
:: 3gpp-crawler workspace process --profile default
:: 3gpp-crawler workspace process atias --profile pdf-only
3gpp-crawler workspace process atias --profile markdown-only --docx-direct --device cuda
:: 3gpp-crawler workspace process atias --profile default
+19 −13
@@ -34,30 +34,36 @@ with create_cached_session() as session:

## HTTP/3 MustDowngradeError (CRITICAL)

When creating a **raw** `requests.Session()` or `niquests.Session()` (not via `create_cached_session`), **ALWAYS** pass `disable_http3=True`:
`www.3gpp.org` and `portal.3gpp.org` advertise HTTP/3 via `Alt-Svc` header but cannot handle it. This triggers `MustDowngradeError` from `urllib3-future` (bundled with niquests), wasting ~10 seconds per request in retries.
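
To check whether a server response is advertising HTTP/3 in this way, the `Alt-Svc` header value can be inspected directly. The following is a minimal sketch, not part of this commit: `advertises_http3` is a hypothetical helper name, the parsing is deliberately simplified (it assumes no commas inside quoted authorities), and the sample header values are illustrative rather than captured from the 3GPP servers.

```python
def advertises_http3(alt_svc: str) -> bool:
    """Return True if an Alt-Svc header value lists an HTTP/3 alternative.

    Hypothetical helper for diagnostics only; simplified parsing.
    """
    value = alt_svc.strip()
    # "clear" means the server withdraws all previously advertised alternatives.
    if value.lower() == "clear":
        return False
    for entry in value.split(","):
        # Each entry has the shape: protocol-id="host:port"; param=value
        protocol_id = entry.split(";", 1)[0].split("=", 1)[0].strip().lower()
        # "h3" is final HTTP/3; "h3-NN" identifiers were used by draft versions.
        if protocol_id == "h3" or protocol_id.startswith("h3-"):
            return True
    return False


# Illustrative header values (assumed, not taken from real responses).
print(advertises_http3('h3=":443"; ma=86400'))
print(advertises_http3('h2=":443"; ma=3600'))
```

A helper like this is only useful for confirming the symptom; the two fixes described in this section are what actually prevent the upgrade attempt.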

**Two fixes required:**

### 1. `niquests.Session(disable_http3=True)` — for raw sessions

Applies to ALL naked `Session()` calls outside `create_cached_session()`:

```python
# CORRECT — disables HTTP/3 to avoid MustDowngradeError from 3GPP servers
# CORRECT
session = requests.Session(disable_http3=True)
# or
with requests.Session(disable_http3=True) as session:
    ...

# WRONG — will crash with MustDowngradeError on 3GPP portal
session = requests.Session()  # NEVER this
# WRONG — will crash with MustDowngradeError
session = requests.Session()
```

**Why:** `portal.3gpp.org` and `www.3gpp.org` advertise HTTP/3 via Alt-Svc header but cannot actually handle it. niquests tries to upgrade, fails with `MustDowngradeError`, and wastes up to 10 seconds retrying per request.
| Location | Fixed? |
|----------|--------|
| `meetings/sources/portal.py` | ✅ |
| `tdocs/operations/checkout.py` (3 locations) | ✅ |

### 2. `_NiquetsCacheAdapter.init_poolmanager()` — for hishel cached sessions

**Scope:** Applies to ALL naked `Session()` calls outside `create_cached_session()`:
`create_cached_session()` mounts a `_NiquetsCacheAdapter` (a hishel `CacheAdapter` subclass). On cache misses, hishel delegates to `requests.adapters.HTTPAdapter.send()` → `urllib3` transport. Since niquests installs `urllib3-future` (which supports HTTP/3), `disable_http3=True` on the `Session` has **no effect**: the hishel adapter creates its own connection pool.

| Location | File | Fixed? |
|----------|------|--------|
| `meetings/sources/portal.py` | `niquests.Session(...)` | ✅ |
| `http_client/session.py` | `requests.Session(...)` (via `create_cached_session`) | ✅ |
| `tdocs/operations/checkout.py` | `requests.Session(...)` (3 locations) | ✅ |
The fix: `_NiquetsCacheAdapter.init_poolmanager()` injects `disabled_svn={HttpVersion.h3}` into the pool manager kwargs, disabling HTTP/3 at the `urllib3-future` connection level.

**Enforcement:** If you ever create a raw `niquests.Session()` or `requests.Session()` (aliased from niquests), add `disable_http3=True` without exception. The `create_cached_session()` factory already handles this correctly.
**Enforcement:** If you ever create a raw `niquests.Session()` or `requests.Session()` (aliased from niquests), add `disable_http3=True` without exception. If you create a custom adapter mounted on a cached session, ensure `init_poolmanager` disables HTTP/3.
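
For a custom adapter, the kwargs handling that `init_poolmanager` needs can be sketched in isolation. This is an illustration under stated assumptions, not the project's code: `with_http3_disabled` is a hypothetical helper, and the `HttpVersion` enum below is a local stand-in so the snippet runs without `urllib3-future` installed. Unlike an in-place update, it copies any existing `disabled_svn` set so a set owned by the caller is not mutated.

```python
from __future__ import annotations

from enum import Enum
from typing import Any


class HttpVersion(Enum):
    """Local stand-in for the transport library's enum (assumption, illustration only)."""

    h11 = "HTTP/1.1"
    h2 = "HTTP/2.0"
    h3 = "HTTP/3.0"


def with_http3_disabled(pool_kwargs: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of pool_kwargs whose 'disabled_svn' set includes HTTP/3."""
    existing = pool_kwargs.get("disabled_svn")
    # Copy instead of mutating, and fall back to an empty set for unexpected types.
    disabled = set(existing) if isinstance(existing, (set, frozenset)) else set()
    disabled.add(HttpVersion.h3)
    return {**pool_kwargs, "disabled_svn": disabled}


# Usage: versions that were already disabled are preserved, HTTP/3 is added.
merged = with_http3_disabled({"disabled_svn": {HttpVersion.h2}, "retries": 3})
print(sorted(v.name for v in merged["disabled_svn"]))
```

In a real adapter the merged kwargs would be passed on to `super().init_poolmanager(...)`, using the enum imported from the installed transport library rather than the stand-in above.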

## Anti-Duplication (DRY)

+14 −0
@@ -93,6 +93,20 @@ class _NiquetsCacheAdapter(CacheAdapter):
    so hishel's isinstance checks would fail without this bridge.
    """

    def init_poolmanager(self, connections: int, maxsize: int, block: bool = False, **pool_kwargs: object) -> None:
        """Disable HTTP/3 to prevent MustDowngradeError from 3GPP servers."""
        try:
            from urllib3.connection import HttpVersion

            disabled: set[HttpVersion] = pool_kwargs.get("disabled_svn", set())  # type: ignore[assignment]
            if not isinstance(disabled, set):
                disabled = set()
            disabled.add(HttpVersion.h3)
            pool_kwargs["disabled_svn"] = disabled  # type: ignore[assignment]
        except ImportError:
            pass
        super().init_poolmanager(connections, maxsize, block, **pool_kwargs)

    def send(
        self,
        request: requests.models.PreparedRequest,