67edc19a36
Plan refinements: Vitest tests added to 09/10 must-haves, explicit mock_flow two-tuple pattern in 10, test_admin_api.py fixture usage in 11. New artifacts: UAT checklist, UI-SPEC, deferred-items, debug investigation for cloud-doc-operations-fail. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
15 KiB
15 KiB
phase, plan, type, wave, depends_on, files_modified, autonomous, requirements, gap_closure, must_haves
| phase | plan | type | wave | depends_on | files_modified | autonomous | requirements | gap_closure | must_haves | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 05-cloud-storage-backends | 09 | execute | 1 |
|
true |
|
true |
|
Purpose: Cloud-stored documents are inaccessible after upload because (1) the preview uses an unauthenticated iframe src, (2) the Celery re-analyze task hardcodes MinIO, and (3) no PATCH endpoint exists for document metadata.
Output: All three flows work end-to-end for cloud documents. Three new/updated backend tests plus one Vitest unit test pass.
<execution_context> @$HOME/.claude/get-shit-done/workflows/execute-plan.md @$HOME/.claude/get-shit-done/templates/summary.md </execution_context>
@.planning/PROJECT.md @.planning/ROADMAP.md @.planning/STATE.md @.planning/phases/05-cloud-storage-backends/05-UAT.mdFrom backend/api/documents.py:
- router = APIRouter(prefix="/api/documents")
- get_regular_user dep: rejects admins (403), requires active user
- Document ORM fields: id, user_id, filename, folder_id, storage_backend, object_key
- Ownership check pattern:
if doc.user_id != current_user.id: raise HTTPException(404) - Existing PATCH for folder move lives in backend/api/folders.py with separate router
- Try-import fallback pattern (lines 51-54): wraps optional imports in try/except ImportError and sets module to None; callers guard with
if module is not None:
From backend/tasks/document_tasks.py:
_run(document_id)callsget_storage_backend()unconditionally (returns MinIO backend)doc.storage_backendcolumn holds "minio" | "google_drive" | "onedrive" | "nextcloud" | "webdav"- Session is opened fresh per task:
async with AsyncSessionLocal() as session: - User is fetched:
user = await session.get(User, doc.user_id) get_storage_backend_for_document(doc, user, session)already exists in storage/__init__.py
From backend/storage/__init__.py (inferred from documents.py usage):
get_storage_backend_for_document(doc, user, session)→ returns correct backend instance
From backend/storage/google_drive_backend.py:
- Exports
CloudConnectionError— the canonical exception class for all cloud backend failures - Used by the cloud backend when it cannot connect to or retrieve from the remote provider
From frontend/src/api/client.js:
request(path, options)injects Authorization: Bearer header automatically- Returns parsed JSON; for binary responses a different approach is needed
getDocumentContentUrl(docId)currently returns a plain URL string (no auth)
In backend/tasks/document_tasks.py, in `_run()`, replace the unconditional `backend = get_storage_backend()` block with:
- If `doc.storage_backend` is None or `doc.storage_backend == "minio"`: use `get_storage_backend()` (existing MinIO path).
- Else: use `get_storage_backend_for_document(doc, user, session)`. If user is None (doc.user_id is None), return `{"document_id": document_id, "status": "missing_user"}`.
Import `get_storage_backend_for_document` at the top of `_run()` using a deferred import: `from storage import get_storage_backend_for_document`.
Import `CloudConnectionError` inside the cloud-backend branch using the try-import fallback pattern already used in `backend/api/documents.py` (lines 51-54) to avoid hard import errors if the module is absent:
```python
try:
from storage.google_drive_backend import CloudConnectionError
except ImportError:
CloudConnectionError = Exception
```
Place this try-import block at the top of the cloud-backend branch (before the `get_storage_backend_for_document` call), not at module top-level. Catch `CloudConnectionError` from the get_object call and return `{"document_id": document_id, "status": "extract_failed", "error": "cloud backend error"}` (do NOT include the raw provider error message).
In backend/tests/test_cloud.py, add three tests:
1. `test_patch_document_filename` — create a document, PATCH with {filename: "renamed.pdf"}, assert 200 and updated filename.
2. `test_patch_document_wrong_owner` — create two users, try to PATCH user A's document as user B, assert 404.
3. `test_reanalyze_cloud_document_routes_to_cloud_backend` — mock `get_storage_backend_for_document` to return an AsyncMock backend, set doc.storage_backend="nextcloud", call `_run(doc_id)` directly, assert the mock backend's `get_object` was called and MinIO backend's `get_object` was NOT called.
cd /Users/nik/Documents/Progamming/document_scanner/backend && python -m pytest tests/test_cloud.py::test_patch_document_filename tests/test_cloud.py::test_patch_document_wrong_owner tests/test_cloud.py::test_reanalyze_cloud_document_routes_to_cloud_backend -v
Three tests pass. PATCH /api/documents/{id} returns 200 for valid owner, 404 for wrong owner. Re-analyze task uses cloud backend for cloud documents.
Task 2: Authenticated document preview — fetch-with-Blob-URL in frontend + Vitest test
frontend/src/api/client.js, frontend/src/components/documents/DocumentPreviewModal.vue, frontend/src/views/DocumentView.vue, frontend/src/api/__tests__/client.test.js
In frontend/src/api/client.js, add a new exported function `fetchDocumentContent(docId)` that calls `fetch(/api/documents/${docId}/content, { headers: { Authorization: Bearer ${authStore.accessToken} }, credentials: "include" })`. On 401, attempt one refresh via `authStore.refresh()` and retry (same pattern as `request()`). Return the raw `Response` object (not parsed JSON) so callers can call `.blob()` on it. Do NOT use the existing `request()` helper because it calls `res.json()` unconditionally.
In frontend/src/components/documents/DocumentPreviewModal.vue:
- Remove the computed `proxyUrl` that returns a raw URL string.
- Add reactive state: `blobUrl = ref(null)`, `loadError = ref(null)`, `loading = ref(true)`.
- On `onMounted` (and watch `props.doc.id` for changes), call `fetchDocumentContent(props.doc.id)` from the API client, then `response.blob()`, then `URL.createObjectURL(blob)` and assign to `blobUrl`.
- In the template, change `iframe :src="proxyUrl"` to `iframe :src="blobUrl"` with a `v-if="blobUrl"`.
- Show a loading spinner while `loading` is true and `blobUrl` is null.
- Show an error message if `loadError` is set.
- On `onUnmounted`, call `URL.revokeObjectURL(blobUrl.value)` to release the object URL.
In frontend/src/views/DocumentView.vue:
- Locate any `window.open()` call that opens `/api/documents/{id}/content` directly.
- Replace it with: call `fetchDocumentContent(docId)`, get the blob, create an object URL, then `window.open(objectUrl)`. After a short delay (e.g. `setTimeout(() => URL.revokeObjectURL(objectUrl), 60000)`), revoke the URL.
- Import `fetchDocumentContent` from the API client.
Note: the `request()` helper in client.js already handles 401 → refresh → retry. The new `fetchDocumentContent` must replicate only the auth injection + single retry, not the JSON parsing. Keep it simple: use the `useAuthStore` lazy import pattern already in `request()`.
In frontend/src/api/__tests__/client.test.js (create the file if it does not exist), add a Vitest unit test for `fetchDocumentContent`:
- Mock `fetch` globally using `vi.stubGlobal('fetch', vi.fn())`.
- Mock the auth store so `authStore.accessToken` returns a known token string (e.g. "test-token-abc").
- Configure the mock fetch to return a Response-like object with `status: 200` and a `.blob()` method that resolves to a `new Blob(['%PDF-1.4'], { type: 'application/pdf' })`.
- Mock `URL.createObjectURL` to return a fixed string `"blob:http://localhost/fake-uuid"`.
- Call `fetchDocumentContent("doc-123")` and await the result.
- Assert: the first argument to fetch was `/api/documents/doc-123/content`.
- Assert: the `Authorization` header in the fetch call was `"Bearer test-token-abc"`.
- Assert: the returned Response is the mock response (not a Blob URL — callers are responsible for calling `.blob()` and `URL.createObjectURL`).
- Restore all stubs in `afterEach` using `vi.restoreAllMocks()`.
cd /Users/nik/Documents/Progamming/document_scanner/frontend && npm run test -- --reporter=verbose --run src/api/__tests__/client.test.js 2>&1 | tail -20 && npm run build 2>&1 | tail -5
Vitest test for fetchDocumentContent passes. Frontend build passes with zero errors. DocumentPreviewModal and DocumentView use fetchDocumentContent. No unauthenticated src= URLs remain for the /content endpoint.
<threat_model>
Trust Boundaries
| Boundary | Description |
|---|---|
| frontend→/api/documents/{id}/content | Previously browser-navigated; now fetch() with Bearer — closes the unauthenticated access gap |
| PATCH /api/documents/{id} body | User-supplied filename and folder_id — validated via Pydantic; ownership enforced |
| Celery task → cloud backend | Task now instantiates cloud backend inside worker process using DB-resident credentials |
STRIDE Threat Register
| Threat ID | Category | Component | Disposition | Mitigation Plan |
|---|---|---|---|---|
| T-05-09-01 | Spoofing | PATCH /documents/{id} | mitigate | get_regular_user enforced; admin → 403; wrong owner → 404 |
| T-05-09-02 | Information Disclosure | PATCH response | mitigate | storage.get_metadata() whitelist used for response — no credentials_enc or password_hash |
| T-05-09-03 | Tampering | Celery task cloud credentials | mitigate | get_storage_backend_for_document loads credentials from DB inside task's own session — no credentials in broker message |
| T-05-09-04 | Information Disclosure | fetchDocumentContent Blob URL | accept | Blob URL is same-origin, revoked on unmount — no persistent exposure |
| T-05-09-SC | Tampering | npm/pip installs | mitigate | No new packages installed in this plan |
| </threat_model> |
<success_criteria>
- PATCH /api/documents/{id} is callable with {filename} or {folder_id} and returns the updated document
- Re-analyze Celery task calls the cloud backend for cloud documents (not MinIO)
- CloudConnectionError imported via try-import fallback pattern; raw provider error not exposed in response
- DocumentPreviewModal uses fetch + Blob URL, no unauthenticated iframe src
- DocumentView uses fetch + Blob URL, no window.open with raw /content URL
- Vitest test for fetchDocumentContent asserts Bearer header injection and correct fetch URL
- All three new pytest tests pass; Vitest test passes; full suite has zero new failures </success_criteria>