kite/.planning/phases/05-cloud-storage-backends/05-12-PLAN.md
curo1305 10175ee4b5 fix(05-12): close 3 UAT gaps — OAuth 400 preflight, 502 cloud fallback, upload hint
- oauth_initiate: pre-flight check returns 400 with env-var hint when
  GOOGLE_CLIENT_ID/SECRET or ONEDRIVE_CLIENT_ID/SECRET are not configured,
  preventing opaque MSAL/OAuth library 500 errors on misconfigured servers
- stream_document_content: broad except-clause catches non-CloudConnectionError
  exceptions and returns 502 with user-friendly message (was raw 500)
- docker-compose.yml: add volumes: - ./backend:/app to celery-worker so code
  changes are picked up by docker compose restart without a rebuild
- CloudStorageView: upload hint paragraph directs users to navigate into a
  cloud folder; no DropZone added (no folder context at overview level)
- 3 new backend tests pass; 2 existing tests patched with credential monkeypatch;
  full suite: 293 passed, 0 new failures, 1 pre-existing (test_extract_docx)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-30 17:55:08 +02:00


---
phase: 05-cloud-storage-backends
plan: 12
type: execute
wave: 1
depends_on: []
files_modified:
  - backend/api/cloud.py
  - backend/api/documents.py
  - docker-compose.yml
  - frontend/src/views/CloudStorageView.vue
  - backend/tests/test_cloud_api.py
autonomous: true
requirements:
  - CLOUD-01
  - CLOUD-02
  - CLOUD-07
gap_closure: true
must_haves:
  truths:
    - "OneDrive OAuth initiate returns HTTP 400 with a descriptive message when ONEDRIVE_CLIENT_ID or ONEDRIVE_CLIENT_SECRET is not configured — not a 500 from MSAL"
    - "Google Drive OAuth initiate returns HTTP 400 with a descriptive message when GOOGLE_CLIENT_ID or GOOGLE_CLIENT_SECRET is not configured"
    - "stream_document_content returns 502 (not 500) when a cloud backend raises an unexpected exception"
    - "celery-worker in docker-compose.yml has a volume mount so code changes are picked up by docker compose restart (no rebuild required)"
    - "CloudStorageView shows an upload hint directing users to navigate into a cloud folder to upload files"
  artifacts:
    - path: backend/api/cloud.py
      provides: "Pre-flight config check in oauth_initiate for both onedrive and google_drive providers"
    - path: backend/api/documents.py
      provides: "Broad except-clause in stream_document_content catches non-CloudConnectionError exceptions and returns 502"
    - path: docker-compose.yml
      provides: "celery-worker service has volumes: - ./backend:/app matching the backend service"
    - path: frontend/src/views/CloudStorageView.vue
      provides: "Upload hint paragraph shown when connections exist, directing users to navigate into a folder"
  key_links:
    - from: "frontend Settings → Cloud Storage → Connect OneDrive"
      to: "GET /api/cloud/oauth/initiate/onedrive"
      via: "Returns 400 with readable error when env vars missing"
    - from: "frontend document preview"
      to: "GET /api/documents/{id}/content"
      via: "Returns 502 instead of 500 on cloud backend failure"
---

Close 3 UAT gaps from Phase 5 testing:
  1. OneDrive OAuth 500 (major): When ONEDRIVE_CLIENT_ID/SECRET env vars are not set, MSAL raises an exception that surfaces as a 500 error. Users cannot distinguish misconfiguration from a code bug. Add a pre-flight check that returns 400 with a human-readable message before touching MSAL. Same check for Google Drive.

  2. Cloud document stream opaque 500 (blocker): stream_document_content catches CloudConnectionError → 503, but any other exception from the cloud backend becomes a raw 500. Add a broad except Exception → 502 with a user-friendly message. Also add volumes: ./backend:/app to celery-worker in docker-compose.yml so code changes are reflected by docker compose restart without a full rebuild.

  3. Upload hint in CloudStorageView (blocker): The sidebar "Cloud Storage" link now navigates to /cloud (CloudStorageView) which shows provider connections but has no DropZone. Users expect to be able to upload there. Adding a DropZone would require knowing which cloud folder to target (not available at this level). Instead, add a clear inline hint: "To upload files, navigate into a cloud folder first."

<execution_context> @$HOME/.claude/get-shit-done/workflows/execute-plan.md @$HOME/.claude/get-shit-done/templates/summary.md </execution_context>

@.planning/PROJECT.md @.planning/ROADMAP.md @.planning/STATE.md @.planning/phases/05-cloud-storage-backends/05-UAT.md

From backend/api/cloud.py — oauth_initiate (lines ~314-384):

  • Route: GET /api/cloud/oauth/initiate/{provider}
  • Provider validation at line ~336: if provider not in VALID_OAUTH_PROVIDERS: raise HTTPException(400, ...)
  • google_drive block starts at line ~348 with if provider == "google_drive":
  • onedrive block starts at line ~370 with elif provider == "onedrive":
  • settings fields: settings.google_client_id, settings.google_client_secret (config.py line ~62); settings.onedrive_client_id, settings.onedrive_client_secret, settings.onedrive_tenant_id (config.py line ~64)
  • All three onedrive fields default to empty string — no startup validation

From backend/api/documents.py — stream_document_content (lines ~708-780):

  • CloudConnectionError catch at line ~754 returns 503
  • No broad except-clause after it — any other cloud exception becomes unhandled 500
  • from storage import get_storage_backend, get_storage_backend_for_document at line 40

From docker-compose.yml — celery-worker service (lines ~81-100):

  • Has no volumes: block — backend code changes require docker compose up --build celery-worker
  • backend service at line ~53 has volumes: - ./backend:/app — same pattern needed for celery-worker

From frontend/src/views/CloudStorageView.vue:

  • Content section starts at line ~12: <div class="flex-1 overflow-y-auto px-6 py-5">
  • Empty state div at line ~23: <div v-else-if="connections.length === 0" ...>
  • Connections list at line ~30: <div v-else class="flex flex-col divide-y ..."> (rows listing providers)
  • No DropZone imported or rendered anywhere in the component
  • Upload hint must appear AFTER the connections list (inside the v-else branch) — not in the empty state

From backend/tests/test_cloud_api.py:

  • Read the file to understand existing test fixtures before adding new tests
  • Add tests for the 400 response when env vars are missing (mock settings)

Task 1: Backend — pre-flight config validation in oauth_initiate

Files: backend/api/cloud.py, backend/tests/test_cloud_api.py

Read first:
- backend/api/cloud.py (lines 310-390: oauth_initiate function)
- backend/config.py (lines 60-70: onedrive_client_id, google_client_id fields)
- backend/tests/test_cloud_api.py (full file: existing fixtures and test patterns)

Behavior:
- GET /api/cloud/oauth/initiate/google_drive with google_client_id="" → 400 {"detail": "Google Drive OAuth is not configured on this server. Set GOOGLE_CLIENT_ID and GOOGLE_CLIENT_SECRET in your environment."}
- GET /api/cloud/oauth/initiate/onedrive with onedrive_client_id="" → 400 {"detail": "OneDrive OAuth is not configured on this server. Set ONEDRIVE_CLIENT_ID, ONEDRIVE_CLIENT_SECRET, and ONEDRIVE_TENANT_ID in your environment."}
- GET /api/cloud/oauth/initiate/onedrive with all onedrive env vars set → still attempts MSAL (existing behavior, not broken)
- GET /api/cloud/oauth/initiate/unknown_provider → still 400 "Unsupported OAuth provider" (existing behavior unchanged)

Action:

In backend/api/cloud.py, inside the `oauth_initiate` function, AFTER the existing `VALID_OAUTH_PROVIDERS` check and BEFORE the `if provider == "google_drive":` block, insert two pre-flight checks:
1. For google_drive: immediately before `if provider == "google_drive":`, add:
   ```python
   if provider == "google_drive" and (not settings.google_client_id or not settings.google_client_secret):
       raise HTTPException(
           status_code=status.HTTP_400_BAD_REQUEST,
           detail="Google Drive OAuth is not configured on this server. Set GOOGLE_CLIENT_ID and GOOGLE_CLIENT_SECRET in your environment.",
       )
   ```

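2. For onedrive: directly after the google_drive check above, still before the `if provider == "google_drive":` block (placing a new `if` statement right in front of `elif provider == "onedrive":` would split the existing chain), add: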
   ```python
   if provider == "onedrive" and (not settings.onedrive_client_id or not settings.onedrive_client_secret):
       raise HTTPException(
           status_code=status.HTTP_400_BAD_REQUEST,
           detail="OneDrive OAuth is not configured on this server. Set ONEDRIVE_CLIENT_ID, ONEDRIVE_CLIENT_SECRET, and ONEDRIVE_TENANT_ID in your environment.",
       )
   ```

These checks must fire before the `if provider == "google_drive":` / `elif provider == "onedrive":` blocks — do NOT restructure the if/elif chain.
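
The ordering described above can be sketched in isolation. This is a minimal, framework-free sketch under stated assumptions, not the project's code: `HTTPError`, `FakeSettings`, `preflight_oauth_config`, and `preflight_status` are hypothetical stand-ins for FastAPI's `HTTPException`, the real `settings` object, and the top of `oauth_initiate`, and the contents of `VALID_OAUTH_PROVIDERS` are assumed.

```python
from dataclasses import dataclass

# Assumed provider set; the real VALID_OAUTH_PROVIDERS is defined in cloud.py.
VALID_OAUTH_PROVIDERS = {"google_drive", "onedrive"}


class HTTPError(Exception):
    """Hypothetical stand-in for fastapi.HTTPException."""

    def __init__(self, status_code: int, detail: str) -> None:
        super().__init__(detail)
        self.status_code = status_code
        self.detail = detail


@dataclass
class FakeSettings:
    """Hypothetical stand-in for settings; every field defaults to ""."""

    google_client_id: str = ""
    google_client_secret: str = ""
    onedrive_client_id: str = ""
    onedrive_client_secret: str = ""


def preflight_oauth_config(provider: str, settings: FakeSettings) -> None:
    """Reject unknown or unconfigured providers before any OAuth library runs."""
    # Existing check: unknown providers are rejected first.
    if provider not in VALID_OAUTH_PROVIDERS:
        raise HTTPError(400, "Unsupported OAuth provider")

    # New checks: both sit ahead of the provider-specific if/elif chain.
    if provider == "google_drive" and (
        not settings.google_client_id or not settings.google_client_secret
    ):
        raise HTTPError(
            400,
            "Google Drive OAuth is not configured on this server. "
            "Set GOOGLE_CLIENT_ID and GOOGLE_CLIENT_SECRET in your environment.",
        )
    if provider == "onedrive" and (
        not settings.onedrive_client_id or not settings.onedrive_client_secret
    ):
        raise HTTPError(
            400,
            "OneDrive OAuth is not configured on this server. "
            "Set ONEDRIVE_CLIENT_ID, ONEDRIVE_CLIENT_SECRET, and "
            "ONEDRIVE_TENANT_ID in your environment.",
        )
    # The untouched `if provider == "google_drive": ... elif ...` chain
    # would follow here in the real function.


def preflight_status(provider: str, settings: FakeSettings):
    """Return (status_code, detail) for a rejected request, else None."""
    try:
        preflight_oauth_config(provider, settings)
    except HTTPError as exc:
        return exc.status_code, exc.detail
    return None
```

Calling `preflight_status("onedrive", FakeSettings())` yields a 400 whose detail names `ONEDRIVE_CLIENT_ID`, while a settings object with both OneDrive credentials filled in passes through and returns `None`.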

In backend/tests/test_cloud_api.py, add two tests using `monkeypatch` (pytest) to override settings fields:
1. `test_oauth_initiate_google_drive_not_configured` — monkeypatch `settings.google_client_id = ""` and `settings.google_client_secret = ""`, call GET /api/cloud/oauth/initiate/google_drive as an authenticated regular user, assert 400, assert "GOOGLE_CLIENT_ID" in detail.
2. `test_oauth_initiate_onedrive_not_configured` — monkeypatch `settings.onedrive_client_id = ""`, call GET /api/cloud/oauth/initiate/onedrive, assert 400, assert "ONEDRIVE_CLIENT_ID" in detail.

Use the existing authenticated client fixture from the test file — read the file to find its name before writing tests.

Acceptance criteria:
- backend/api/cloud.py oauth_initiate contains `if provider == "google_drive" and (not settings.google_client_id or not settings.google_client_secret)` before the `if provider == "google_drive":` block
- backend/api/cloud.py oauth_initiate contains `if provider == "onedrive" and (not settings.onedrive_client_id or not settings.onedrive_client_secret)` before the `elif provider == "onedrive":` block
- `pytest backend/tests/test_cloud_api.py::test_oauth_initiate_google_drive_not_configured backend/tests/test_cloud_api.py::test_oauth_initiate_onedrive_not_configured -v` exits 0
- Both tests assert HTTP 400 and the relevant env var name in the detail string

Verify: cd /Users/nik/Documents/Progamming/document_scanner/backend && python -m pytest tests/test_cloud_api.py::test_oauth_initiate_google_drive_not_configured tests/test_cloud_api.py::test_oauth_initiate_onedrive_not_configured -v

Done: Two new tests pass. oauth_initiate returns 400 with descriptive message when provider credentials are empty. Existing tests unchanged.

Task 2: Backend — 502 fallback in stream_document_content + celery-worker volume mount

Files: backend/api/documents.py, docker-compose.yml, backend/tests/test_documents_api.py

Read first:
- backend/api/documents.py (lines 708-780: stream_document_content function)
- docker-compose.yml (lines 53-100: backend and celery-worker service definitions)
- backend/tests/test_documents_api.py (read to find fixtures for mocking get_storage_backend_for_document)

Behavior:
- GET /api/documents/{id}/content when cloud backend raises CloudConnectionError → 503 "Cloud connection requires re-authentication" (EXISTING, unchanged)
- GET /api/documents/{id}/content when cloud backend raises any other Exception (e.g., aiohttp.ClientError, timeout, generic RuntimeError) → 502 "Cloud backend unreachable. Please try again or reconnect in Settings."
- GET /api/documents/{id}/content for a MinIO document → 200 with file bytes (unchanged, MinIO errors are not affected by the new clause)

Action:

### 1. backend/api/documents.py — add broad except-clause

In the `stream_document_content` function, find the `except CloudConnectionError as exc:` block (lines ~754-758). IMMEDIATELY AFTER its closing line (`from exc`), add a second except clause:

```python
except Exception as exc:
    raise HTTPException(
        status_code=502,
        detail="Cloud backend unreachable. Please try again or reconnect in Settings.",
    ) from exc
```

The final try/except structure must be:
```python
try:
    storage_backend = await get_storage_backend_for_document(...)
    file_bytes = await storage_backend.get_object(doc.object_key)
except CloudConnectionError as exc:
    raise HTTPException(503, ...) from exc
except Exception as exc:
    raise HTTPException(502, ...) from exc
```

Do NOT catch Exception before CloudConnectionError — order matters (specific before broad).
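
Why the order matters can be shown with a small, framework-free sketch. It is synchronous for brevity (the real handler is async), and `CloudConnectionError`, `HTTPError`, `fetch_with_fallback`, and `status_of` are hypothetical stand-ins defined here, not the project's classes or functions.

```python
class CloudConnectionError(Exception):
    """Hypothetical stand-in for the project's CloudConnectionError."""


class HTTPError(Exception):
    """Hypothetical stand-in for fastapi.HTTPException."""

    def __init__(self, status_code: int, detail: str) -> None:
        super().__init__(detail)
        self.status_code = status_code
        self.detail = detail


def fetch_with_fallback(fetch):
    """Call fetch() and translate failures: specific clause first, broad second."""
    try:
        return fetch()
    except CloudConnectionError as exc:
        # Known failure mode: the connection needs re-authentication.
        raise HTTPError(503, "Cloud connection requires re-authentication") from exc
    except Exception as exc:
        # Anything else from the backend: static message, original kept as cause.
        raise HTTPError(
            502,
            "Cloud backend unreachable. Please try again or reconnect in Settings.",
        ) from exc


def status_of(fetch):
    """Return (status_code, detail); (200, None) when fetch succeeds."""
    try:
        fetch_with_fallback(fetch)
    except HTTPError as exc:
        return exc.status_code, exc.detail
    return 200, None


def expired_token():
    raise CloudConnectionError("token expired")


def timed_out():
    raise RuntimeError("connection timeout")
```

Here `status_of(expired_token)` reports 503 and `status_of(timed_out)` reports 502 with the static detail string; the text "connection timeout" only survives on the exception's `__cause__`. Because `CloudConnectionError` derives from `Exception`, swapping the two clauses would route every failure to 502. One thing worth checking in the real function: any `HTTPException` raised inside the `try` body would also be caught by the broad clause.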

### 2. docker-compose.yml — add volume mount to celery-worker

In the `celery-worker` service block, add a `volumes:` key with the same bind mount as the `backend` service:
```yaml
volumes:
  - ./backend:/app
```

Place it after `environment:` and before `extra_hosts:` (or after `extra_hosts:` if that reads more cleanly). Match the indentation of surrounding keys (2 spaces).

Also add `PYTHONDONTWRITEBYTECODE=1` to the celery-worker environment if it is not already there (prevents .pyc files from cluttering the bind-mounted source).
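
The effect of that variable can be confirmed with a short standard-library check. This is only a sketch of how one might verify it; `bytecode_disabled_in_child` is a hypothetical helper name, and inside the container the same one-liner could be run directly instead.

```python
import os
import subprocess
import sys


def bytecode_disabled_in_child() -> bool:
    """Start a child interpreter with the variable set and read its flag."""
    env = {**os.environ, "PYTHONDONTWRITEBYTECODE": "1"}
    result = subprocess.run(
        [sys.executable, "-c", "import sys; print(sys.dont_write_bytecode)"],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    # The child prints "True" when it will skip writing .pyc files.
    return result.stdout.strip() == "True"
```

When the function returns `True`, imports in that interpreter leave no `__pycache__` entries behind in the bind-mounted `./backend` directory.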

### 3. backend/tests/test_documents_api.py — add test for 502 path

Add `test_stream_document_content_cloud_backend_error`: 
- Create a document with `storage_backend = "google_drive"` (or any non-minio value)
- Mock `get_storage_backend_for_document` to raise `RuntimeError("connection timeout")`
- Call GET /api/documents/{doc.id}/content as the document owner
- Assert 502 and "Cloud backend unreachable" in the response detail

Read existing document stream tests to find the right fixture pattern before writing.
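
The mocking step can be sketched with the standard library alone. This is an illustration under assumptions, not the project's test: `fake_get_backend` and `await_backend` are hypothetical names, and `"doc-123"` is a made-up id.

```python
import asyncio
from unittest.mock import AsyncMock

# Hypothetical stand-in for get_storage_backend_for_document: awaiting the
# mock raises the configured exception instead of returning a backend.
fake_get_backend = AsyncMock(side_effect=RuntimeError("connection timeout"))


async def await_backend() -> str:
    """Await the mocked factory and report what happened."""
    try:
        await fake_get_backend("doc-123")  # illustrative document id
    except RuntimeError as exc:
        return f"raised: {exc}"
    return "returned a backend"


outcome = asyncio.run(await_backend())
```

In the real test the mock would be installed with pytest's `monkeypatch.setattr` (or `unittest.mock.patch`). Since documents.py imports the function by name, the attribute to replace is most likely the one on the documents module rather than the one in `storage`.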

Acceptance criteria:
- backend/api/documents.py stream_document_content has `except Exception as exc:` AFTER `except CloudConnectionError as exc:` block, raising HTTPException(502)
- docker-compose.yml celery-worker service has `volumes: - ./backend:/app`
- `pytest backend/tests/test_documents_api.py::test_stream_document_content_cloud_backend_error -v` exits 0
- `pytest backend/tests/ -v -k "stream_document_content"` — all stream tests pass (no regression)

Verify: cd /Users/nik/Documents/Progamming/document_scanner/backend && python -m pytest tests/test_documents_api.py -v -k "stream" 2>&1 | tail -20

Done: 502 except-clause added. Volume mount added to docker-compose.yml. New test passes. Existing stream tests pass.

Task 3: Frontend — upload hint in CloudStorageView

Files: frontend/src/views/CloudStorageView.vue

Read first:
- frontend/src/views/CloudStorageView.vue (full file — understand existing template structure)
- frontend/src/views/CloudFolderView.vue (for reference — how upload works inside a folder)

Behavior:
- When at /cloud with at least one active connection, a hint paragraph is visible below the connections list: "To upload files, navigate into a cloud folder first."
- The hint does not appear on the empty state (no connections) — that state already directs to Settings.
- Clicking a connection row still navigates to /cloud/{provider}/root (existing behavior unchanged).
- No DropZone component is added to this view (no cloud folder context is available at this level).

Action:

In frontend/src/views/CloudStorageView.vue, inside the `v-else` block (the div that renders the connections list, starting at `

Verify: cd /Users/nik/Documents/Progamming/document_scanner/frontend && npm run build 2>&1 | tail -5

Done: Upload hint added below connections list. Build passes. No DropZone added. Existing connection click behavior unchanged.

<threat_model>

Trust Boundaries

| Boundary | Description |
|---|---|
| oauth_initiate preflight | User-supplied provider string already validated; new checks only inspect server-side settings values — no user input involved |
| 502 error message | Static string, no user data reflected in the error detail |
| volume mount | Read-write bind mount into the container (the `./backend:/app` entry has no `:ro` flag); same bind mount pattern as backend service |

STRIDE Threat Register

| Threat ID | Category | Component | Disposition | Mitigation Plan |
|---|---|---|---|---|
| T-05-12-01 | Information Disclosure | 400 error message for missing creds | mitigate | Message names env vars (server config), not any user data or secret values — safe to expose |
| T-05-12-02 | Information Disclosure | 502 error message | mitigate | Static string "Cloud backend unreachable" — no stack trace, no exception detail leaked to client |
| T-05-12-03 | Tampering | celery-worker volume mount | accept | Bind mount is same as backend service; only developer-controlled source files are mounted; production deployments use image builds, not bind mounts |
| T-05-12-SC | Tampering | npm/pip installs | mitigate | No new packages installed in this plan |
</threat_model>

After all tasks complete:
- `cd backend && python -m pytest tests/test_cloud_api.py::test_oauth_initiate_google_drive_not_configured tests/test_cloud_api.py::test_oauth_initiate_onedrive_not_configured -v` — 2 tests pass
- `cd backend && python -m pytest tests/test_documents_api.py::test_stream_document_content_cloud_backend_error -v` — 1 test passes
- `cd backend && python -m pytest -v` — zero new failures
- `cd frontend && npm run build` — zero errors
- Manual: docker-compose.yml celery-worker service has `volumes: - ./backend:/app`
- Manual: open CloudStorageView at /cloud — upload hint visible below connections list
- Manual: curl -H "Authorization: Bearer " http://localhost:8000/api/cloud/oauth/initiate/onedrive (with empty OneDrive creds) → 400 with "ONEDRIVE_CLIENT_ID" in detail

<success_criteria>

  • oauth_initiate returns 400 with descriptive env-var hint for unconfigured google_drive and onedrive
  • stream_document_content returns 502 (not 500) for non-CloudConnectionError cloud exceptions
  • celery-worker has volume mount so docker compose restart celery-worker picks up code changes
  • CloudStorageView shows upload hint directing users to navigate into a folder
  • 3 new backend tests pass; full pytest suite has zero new failures; frontend build clean </success_criteria>
Create `.planning/phases/05-cloud-storage-backends/05-12-SUMMARY.md` when done