Files
kite/.planning/phases/05-cloud-storage-backends/05-09-PLAN.md
T
curo1305 f006c00d49 docs(05): create UAT gap closure plans 09-11
Three new plans address all 6 diagnosed gaps from 05-UAT.md:

- 05-09: cloud document open (fetch+Blob URL), re-analyze (cloud-aware
  Celery task), and edit (PATCH /api/documents/{id})
- 05-10: OAuth initiate JSON response fix, Nextcloud custom endpoint
  edit round-trip, Edit button on ERROR rows, confirmation text overflow
- 05-11: admin hard-delete with admin-password confirmation (backend
  UserDeleteConfirm model + frontend inline panel)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-30 10:39:47 +02:00

12 KiB

phase, plan, type, wave, depends_on, files_modified, autonomous, requirements, gap_closure, must_haves
phase plan type wave depends_on files_modified autonomous requirements gap_closure must_haves
05-cloud-storage-backends 09 execute 1
backend/api/documents.py
backend/tasks/document_tasks.py
frontend/src/api/client.js
frontend/src/components/documents/DocumentPreviewModal.vue
frontend/src/views/DocumentView.vue
backend/tests/test_cloud.py
true
CLOUD-03
CLOUD-05
CLOUD-07
true
truths artifacts key_links
Opening a cloud document proxies bytes through the backend and renders without 401
Re-analyzing a cloud document retrieves file bytes from the cloud backend, not MinIO
PATCH /api/documents/{id} accepts filename and folder_id and persists the change
Frontend fetch-with-blob-URL approach works in DocumentPreviewModal and DocumentView
path provides
backend/api/documents.py PATCH /{doc_id} endpoint accepting {filename, folder_id}
path provides
backend/tasks/document_tasks.py Cloud-aware re-analyze: routes to correct backend by doc.storage_backend
path provides
frontend/src/components/documents/DocumentPreviewModal.vue Authenticated fetch + Blob URL for document preview
path provides
frontend/src/views/DocumentView.vue Authenticated fetch + Blob URL for document open
from to via
frontend/src/components/documents/DocumentPreviewModal.vue /api/documents/{id}/content fetch with Authorization header → Blob URL
from to via
backend/tasks/document_tasks.py get_storage_backend_for_document doc.storage_backend != 'minio' branch
Fix three independent root causes that prevent cloud documents from being opened, re-analyzed, or edited.

Purpose: Cloud-stored documents are inaccessible after upload because (1) the preview uses an unauthenticated iframe src, (2) the Celery re-analyze task hardcodes MinIO, and (3) no PATCH endpoint exists for document metadata.

Output: All three flows work end-to-end for cloud documents. Three new/updated tests pass.

<execution_context> @$HOME/.claude/get-shit-done/workflows/execute-plan.md @$HOME/.claude/get-shit-done/templates/summary.md </execution_context>

@.planning/PROJECT.md @.planning/ROADMAP.md @.planning/STATE.md @.planning/phases/05-cloud-storage-backends/05-UAT.md

From backend/api/documents.py:

  • router = APIRouter(prefix="/api/documents")
  • get_regular_user dep: rejects admins (403), requires active user
  • Document ORM fields: id, user_id, filename, folder_id, storage_backend, object_key
  • Ownership check pattern: if doc.user_id != current_user.id: raise HTTPException(404)
  • Existing PATCH for folder move lives in backend/api/folders.py with separate router

From backend/tasks/document_tasks.py:

  • _run(document_id) calls get_storage_backend() unconditionally (returns MinIO backend)
  • doc.storage_backend column holds "minio" | "google_drive" | "onedrive" | "nextcloud" | "webdav"
  • Session is opened fresh per task: async with AsyncSessionLocal() as session:
  • User is fetched: user = await session.get(User, doc.user_id)
  • get_storage_backend_for_document(doc, user, session) already exists in storage/__init__.py

From backend/storage/__init__.py (inferred from documents.py usage):

  • get_storage_backend_for_document(doc, user, session) → returns correct backend instance

From frontend/src/api/client.js:

  • request(path, options) injects Authorization: Bearer header automatically
  • Returns parsed JSON; for binary responses a different approach is needed
  • getDocumentContentUrl(docId) currently returns a plain URL string (no auth)
Task 1: Add PATCH /api/documents/{doc_id} endpoint + cloud-aware Celery re-analyze backend/api/documents.py, backend/tasks/document_tasks.py, backend/tests/test_cloud.py - PATCH /api/documents/{doc_id} with body {filename?: str, folder_id?: uuid | null} returns 200 with updated document dict; non-owner gets 404; admin gets 403 (get_regular_user). - PATCH with only filename updates filename, folder_id unchanged. - PATCH with folder_id=null moves document to root (no folder). - Re-analyze task: when doc.storage_backend != "minio", calls get_storage_backend_for_document(doc, user, session) instead of get_storage_backend(); if user is None returns {"status": "missing_user"}; on CloudConnectionError returns {"status": "extract_failed", "error": "cloud backend error"}. - Re-analyze task: MinIO path (storage_backend == "minio") unchanged. In backend/api/documents.py, add a DocumentPatch Pydantic model with optional fields `filename: Optional[str] = None` and `folder_id: Optional[uuid.UUID] = None` — use a sentinel default of `_UNSET = object()` or `Optional[uuid.UUID]` with a custom flag; the cleanest approach for nullable-vs-absent folder_id is to use `model_fields_set` to distinguish "not provided" from "set to null". Accept the body and update only the fields present in `model_fields_set`. Validate that at least one field is provided (422 if body is empty). Apply ownership guard (doc.user_id != current_user.id → 404). Return the updated document dict using the same shape as GET /api/documents/{id} (call `storage.get_metadata(session, doc_id)` after commit).
In backend/tasks/document_tasks.py, in `_run()`, replace the unconditional `backend = get_storage_backend()` block with:
- If `doc.storage_backend` is None or `doc.storage_backend == "minio"`: use `get_storage_backend()` (existing MinIO path).
- Else: use `get_storage_backend_for_document(doc, user, session)`. If user is None (doc.user_id is None), return `{"document_id": document_id, "status": "missing_user"}`. Import `get_storage_backend_for_document` at the top of `_run()` using a deferred import (same pattern as other imports in this file: `from storage import get_storage_backend_for_document`). Catch `CloudConnectionError` from the get_object call and return `{"document_id": document_id, "status": "extract_failed", "error": "cloud backend error"}` (do NOT include the raw provider error message).

In backend/tests/test_cloud.py, add three tests:
1. `test_patch_document_filename` — create a document, PATCH with {filename: "renamed.pdf"}, assert 200 and updated filename.
2. `test_patch_document_wrong_owner` — create two users, try to PATCH user A's document as user B, assert 404.
3. `test_reanalyze_cloud_document_routes_to_cloud_backend` — mock `get_storage_backend_for_document` to return an AsyncMock backend, set doc.storage_backend="nextcloud", call `_run(doc_id)` directly, assert the mock backend's `get_object` was called and MinIO backend's `get_object` was NOT called.
cd /Users/nik/Documents/Progamming/document_scanner/backend && python -m pytest tests/test_cloud.py::test_patch_document_filename tests/test_cloud.py::test_patch_document_wrong_owner tests/test_cloud.py::test_reanalyze_cloud_document_routes_to_cloud_backend -v Three tests pass. PATCH /api/documents/{id} returns 200 for valid owner, 404 for wrong owner. Re-analyze task uses cloud backend for cloud documents. Task 2: Authenticated document preview — fetch-with-Blob-URL in frontend frontend/src/api/client.js, frontend/src/components/documents/DocumentPreviewModal.vue, frontend/src/views/DocumentView.vue In frontend/src/api/client.js, add a new exported function `fetchDocumentContent(docId)` that calls `fetch(/api/documents/${docId}/content, { headers: { Authorization: Bearer ${authStore.accessToken} }, credentials: "include" })`. On 401, attempt one refresh via `authStore.refresh()` and retry (same pattern as `request()`). Return the raw `Response` object (not parsed JSON) so callers can call `.blob()` on it. Do NOT use the existing `request()` helper because it calls `res.json()` unconditionally.
In frontend/src/components/documents/DocumentPreviewModal.vue:
- Remove the computed `proxyUrl` that returns a raw URL string.
- Add reactive state: `blobUrl = ref(null)`, `loadError = ref(null)`, `loading = ref(true)`.
- On `onMounted` (and watch `props.doc.id` for changes), call `fetchDocumentContent(props.doc.id)` from the API client, then `response.blob()`, then `URL.createObjectURL(blob)` and assign to `blobUrl`.
- In the template, change `iframe :src="proxyUrl"` to `iframe :src="blobUrl"` with a `v-if="blobUrl"`.
- Show a loading spinner while `loading` is true and `blobUrl` is null.
- Show an error message if `loadError` is set.
- On `onUnmounted`, call `URL.revokeObjectURL(blobUrl.value)` to release the object URL.

In frontend/src/views/DocumentView.vue:
- Locate any `window.open()` call that opens `/api/documents/{id}/content` directly.
- Replace it with: call `fetchDocumentContent(docId)`, get the blob, create an object URL, then `window.open(objectUrl)`. After a short delay (e.g. `setTimeout(() => URL.revokeObjectURL(objectUrl), 60000)`), revoke the URL.
- Import `fetchDocumentContent` from the API client.

Note: the `request()` helper in client.js already handles 401 → refresh → retry. The new `fetchDocumentContent` must replicate only the auth injection + single retry, not the JSON parsing. Keep it simple: use the `useAuthStore` lazy import pattern already in `request()`.
cd /Users/nik/Documents/Progamming/document_scanner/frontend && npm run build 2>&1 | tail -5 Frontend build passes with zero errors. DocumentPreviewModal and DocumentView use fetchDocumentContent. No unauthenticated src= URLs remain for the /content endpoint.

<threat_model>

Trust Boundaries

Boundary Description
frontend→/api/documents/{id}/content Previously browser-navigated; now fetch() with Bearer — closes the unauthenticated access gap
PATCH /api/documents/{id} body User-supplied filename and folder_id — validated via Pydantic; ownership enforced
Celery task → cloud backend Task now instantiates cloud backend inside worker process using DB-resident credentials

STRIDE Threat Register

Threat ID Category Component Disposition Mitigation Plan
T-05-09-01 Spoofing PATCH /documents/{id} mitigate get_regular_user enforced; admin → 403; wrong owner → 404
T-05-09-02 Information Disclosure PATCH response mitigate storage.get_metadata() whitelist used for response — no credentials_enc or password_hash
T-05-09-03 Tampering Celery task cloud credentials mitigate get_storage_backend_for_document loads credentials from DB inside task's own session — no credentials in broker message
T-05-09-04 Information Disclosure fetchDocumentContent Blob URL accept Blob URL is same-origin, revoked on unmount — no persistent exposure
T-05-09-SC Tampering npm/pip installs mitigate No new packages installed in this plan
</threat_model>
After both tasks complete: - `pytest backend/tests/test_cloud.py -v` — all three new tests pass, no regressions - `npm run build` — zero errors - Manual: open a cloud document in the app — preview loads without 401 - Manual: re-analyze a cloud document — task completes without NoSuchKey error - Manual: rename a cloud document — PATCH returns 200 with updated filename

<success_criteria>

  • PATCH /api/documents/{id} is callable with {filename} or {folder_id} and returns the updated document
  • Re-analyze Celery task calls the cloud backend for cloud documents (not MinIO)
  • DocumentPreviewModal uses fetch + Blob URL, no unauthenticated iframe src
  • DocumentView uses fetch + Blob URL, no window.open with raw /content URL
  • All three new pytest tests pass; full suite has zero new failures </success_criteria>
Create `.planning/phases/05-cloud-storage-backends/05-09-SUMMARY.md` when done