docs(05): add UAT, UI-SPEC, deferred items, debug notes; refine plans 09-11

Plan refinements: Vitest tests added to 09/10 must-haves, explicit mock_flow two-tuple pattern in 10, test_admin_api.py fixture usage in 11. New artifacts: UAT checklist, UI-SPEC, deferred-items, debug investigation for cloud-doc-operations-fail.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-30 11:57:54 +02:00
parent 34f012b4e8
commit 67edc19a36
7 changed files with 1115 additions and 23 deletions
@@ -0,0 +1,67 @@
+---
+status: investigating
+trigger: "Documents stored on cloud backend cannot be opened, re-analyzed, or edited"
+created: 2026-05-30T00:00:00Z
+updated: 2026-05-30T00:00:00Z
+symptoms_prefilled: true
+goal: find_root_cause_only
+---
+
+## Current Focus
+
+hypothesis: "CONFIRMED — three independent root causes across open, re-analyze, and edit flows"
+test: "Full read of documents.py, document_tasks.py, DocumentPreviewModal.vue, client.js"
+expecting: "Three separate bugs identified with specific mechanisms"
+next_action: "return root cause findings"
+
+## Symptoms
+
+expected: "Opening, re-analyzing, and editing a document stored on a cloud backend should work correctly via the backend proxy"
+actual: "User cannot open, re-analyze, or edit any file stored on a cloud backend"
+errors: "None specifically reported, but likely HTTP errors or missing endpoints"
+reproduction: "Test 13 in Phase 5 UAT — after uploading a document to a connected Nextcloud/WebDAV backend, all document operations (open, re-analyze, edit) fail"
+started: "Discovered during UAT of Phase 5 (cloud storage backends)"
+
+## Eliminated
+
+- hypothesis: "GET /api/documents/{id}/content endpoint missing cloud branch"
+  evidence: "The endpoint calls get_storage_backend_for_document() which correctly dispatches to NextcloudBackend/WebDAVBackend based on doc.storage_backend — the backend proxy path is implemented"
+  timestamp: 2026-05-30T00:00:00Z
+
+## Evidence
+
+- timestamp: 2026-05-30T00:00:00Z
+  checked: "DocumentPreviewModal.vue — how it opens documents"
+  found: "Uses raw iframe :src pointing to /api/documents/{id}/content — this is a browser navigation, NOT a fetch() call, so the Authorization: Bearer header is never sent"
+  implication: "The backend /content endpoint uses get_regular_user dep which requires a JWT Bearer token. An iframe or window.open() GET has no Authorization header → 401 Unauthorized → document cannot be opened"
+
+- timestamp: 2026-05-30T00:00:00Z
+  checked: "backend/tasks/document_tasks.py _run() function — the re-analyze (extract_and_classify) Celery task"
+  found: "Line 64: backend = get_storage_backend() — this always returns MinIOBackend regardless of doc.storage_backend. For cloud documents, get_storage_backend_for_document() must be called but the Celery task has no User or Session context to look up CloudConnection credentials"
+  implication: "Re-analysis of a cloud-stored document fails: the task calls MinIO get_object() with a WebDAV path (e.g. 'docuvault/user-id/doc-id.pdf') which does not exist in MinIO → MinIO retrieval error → extract_and_classify returns status='extract_failed'"
+
+- timestamp: 2026-05-30T00:00:00Z
+  checked: "backend/api/documents.py — full route list via @router decorator scan"
+  found: "Only these routes exist: POST /upload-url, POST /upload, POST /{id}/confirm, GET /, GET /{id}, DELETE /{id}, POST /{id}/classify, GET /{id}/content. There is NO PATCH or PUT endpoint for editing document metadata (filename, folder, etc.) on cloud documents."
+  implication: "The 'edit' failure may refer to the classify endpoint (re-analyze) or to a missing document-rename/metadata-update endpoint. The classify endpoint itself works correctly for cloud docs (it uses cached extracted_text, not re-fetching from storage), but re-extraction does not."
+
+- timestamp: 2026-05-30T00:00:00Z
+  checked: "DocumentView.vue — how openPdf() works and how it uses the content URL"
+  found: "openPdf() either calls window.open(api.getDocumentContentUrl(doc.value.id), '_blank') or shows DocumentPreviewModal. Both result in unauthenticated browser requests with no Bearer token."
+  implication: "Both open paths (new tab and in-app preview) hit the /content endpoint without auth → 401 for all documents, not just cloud ones. Cloud documents additionally depend on credential decryption in the backend proxy path, so they could still fail after the auth issue is solved; that path is never reached while the request is rejected with 401."
+
+- timestamp: 2026-05-30T00:00:00Z
+  checked: "client.js getDocumentContentUrl — returns a plain URL string, never does a credentialed fetch"
+  found: "Function returns '/api/documents/{id}/content' as a plain string for use in iframe src or window.open(). No fetch() with Authorization header."
+  implication: "The content endpoint is auth-protected (get_regular_user dep) but the frontend uses unauthenticated browser navigation to reach it — the 401 response is the actual error the user sees for any document, but for cloud documents there is an additional issue in the Celery worker"
+
+## Resolution
+
+root_cause: |
+  Three independent root causes:
+  1. OPEN (401 auth): The /api/documents/{id}/content endpoint requires a JWT Bearer token (get_regular_user dep), but DocumentPreviewModal and DocumentView both access it via iframe src or window.open(), which are browser navigations that send no Authorization header. All documents fail to open, not just cloud ones; cloud documents are additionally affected by root cause 2 in the Celery worker.
+  2. RE-ANALYZE (wrong backend): The extract_and_classify Celery task hardcodes get_storage_backend() (always MinIO) at line 64 of document_tasks.py. For cloud-stored documents it should call get_storage_backend_for_document(), but the Celery task has no User ORM instance and no CloudConnection lookup mechanism. The task reads doc.storage_backend but does nothing with it — it always fetches from MinIO, which 404s on a WebDAV path.
+  3. EDIT (endpoint missing): There is no PATCH or PUT endpoint for updating document metadata (filename/title). The user's "edit" likely refers to the re-analyze/re-extract operation or to metadata editing, neither of which works for cloud docs.
+fix:
+verification:
+files_changed: []
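
Review note on root cause 2: the notes leave `fix:` empty, so the following is only a minimal sketch of the dispatch the task appears to be missing, not the project's implementation. Every name here (`InMemoryBackend`, `Doc`, `resolve_backend`, the `cloud_backend_for` callback, and the `"minio"` value for `storage_backend`) is hypothetical. The real code would presumably route through `get_storage_backend_for_document()` and a `CloudConnection` lookup keyed by the document owner, since the worker has no request-scoped user.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Protocol


class StorageBackend(Protocol):
    def get_object(self, key: str) -> bytes: ...


@dataclass
class InMemoryBackend:
    """Hypothetical stand-in for MinIOBackend / WebDAVBackend."""
    name: str
    objects: Dict[str, bytes]

    def get_object(self, key: str) -> bytes:
        if key not in self.objects:
            raise FileNotFoundError(f"{self.name}: no object at {key!r}")
        return self.objects[key]


@dataclass
class Doc:
    """Hypothetical subset of the document row that the task reads."""
    id: str
    owner_id: str
    storage_backend: str  # assumed values: "minio", "nextcloud", "webdav"
    object_key: str


def resolve_backend(
    doc: Doc,
    default_backend: StorageBackend,
    cloud_backend_for: Callable[[str, str], StorageBackend],
) -> StorageBackend:
    """Choose the backend from doc.storage_backend instead of always using the default."""
    if doc.storage_backend == "minio":
        return default_backend
    # Assumption: cloud credentials can be looked up from the owner id alone,
    # because a Celery worker has no request-scoped User object.
    return cloud_backend_for(doc.owner_id, doc.storage_backend)
```

With this shape, a document whose `storage_backend` is `"webdav"` is read through the cloud backend, while the same key requested from the default backend raises `FileNotFoundError`, which mirrors the MinIO retrieval error described in the evidence.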