docs(05): add UAT, UI-SPEC, deferred items, debug notes; refine plans 09-11
Plan refinements: Vitest tests added to 09/10 must-haves, explicit mock_flow two-tuple pattern in 10, test_admin_api.py fixture usage in 11.

New artifacts: UAT checklist, UI-SPEC, deferred-items, debug investigation for cloud-doc-operations-fail.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@@ -0,0 +1,67 @@
---
status: investigating
trigger: "Documents stored on cloud backend cannot be opened, re-analyzed, or edited"
created: 2026-05-30T00:00:00Z
updated: 2026-05-30T00:00:00Z
symptoms_prefilled: true
goal: find_root_cause_only
---

## Current Focus

hypothesis: "CONFIRMED: three independent root causes across open, re-analyze, and edit flows"
test: "Full read of documents.py, document_tasks.py, DocumentPreviewModal.vue, client.js"
expecting: "Three separate bugs identified with specific mechanisms"
next_action: "return root cause findings"

## Symptoms

expected: "Opening, re-analyzing, and editing a document stored on a cloud backend should work correctly via the backend proxy"
actual: "User cannot open, re-analyze, or edit any file stored on a cloud backend"
errors: "None specifically reported, but likely HTTP errors or missing endpoints"
reproduction: "Test 13 in Phase 5 UAT: after uploading a document to a connected Nextcloud/WebDAV backend, all document operations (open, re-analyze, edit) fail"
started: "Discovered during UAT of Phase 5 (cloud storage backends)"

## Eliminated

- hypothesis: "GET /api/documents/{id}/content endpoint missing cloud branch"
  evidence: "The endpoint calls get_storage_backend_for_document(), which correctly dispatches to NextcloudBackend/WebDAVBackend based on doc.storage_backend, so the backend proxy path is implemented"
  timestamp: 2026-05-30T00:00:00Z

## Evidence

- timestamp: 2026-05-30T00:00:00Z
  checked: "DocumentPreviewModal.vue: how it opens documents"
  found: "Uses raw iframe :src pointing to /api/documents/{id}/content. This is a browser navigation, NOT a fetch() call, so the Authorization: Bearer header is never sent"
  implication: "The backend /content endpoint uses get_regular_user dep which requires a JWT Bearer token. An iframe or window.open() GET has no Authorization header → 401 Unauthorized → document cannot be opened"

- timestamp: 2026-05-30T00:00:00Z
  checked: "backend/tasks/document_tasks.py _run() function: the re-analyze (extract_and_classify) Celery task"
  found: "Line 64: backend = get_storage_backend(). This always returns MinIOBackend regardless of doc.storage_backend. For cloud documents, get_storage_backend_for_document() must be called, but the Celery task has no User or Session context to look up CloudConnection credentials"
  implication: "Re-analysis of a cloud-stored document fails: the task calls MinIO get_object() with a WebDAV path (e.g. 'docuvault/user-id/doc-id.pdf') which does not exist in MinIO → MinIO retrieval error → extract_and_classify returns status='extract_failed'"

- timestamp: 2026-05-30T00:00:00Z
  checked: "backend/api/documents.py: full route list via @router decorator scan"
  found: "Only these routes exist: POST /upload-url, POST /upload, POST /{id}/confirm, GET /, GET /{id}, DELETE /{id}, POST /{id}/classify, GET /{id}/content. There is NO PATCH or PUT endpoint for editing document metadata (filename, folder, etc.) on cloud documents."
  implication: "The 'edit' failure may refer to the classify endpoint (re-analyze) or to a missing document-rename/metadata-update endpoint. The classify endpoint itself works correctly for cloud docs (it uses cached extracted_text, not re-fetching from storage), but re-extraction does not."

- timestamp: 2026-05-30T00:00:00Z
  checked: "DocumentView.vue: how openPdf() works and how it uses the content URL"
  found: "openPdf() either calls window.open(api.getDocumentContentUrl(doc.value.id), '_blank') or shows DocumentPreviewModal. Both result in unauthenticated browser requests with no Bearer token."
  implication: "Both open paths (new tab and in-app preview) hit the /content endpoint without auth → 401 for all documents, not just cloud ones. However, cloud documents additionally require credentials decryption, so they would fail even if the auth issue were solved."

- timestamp: 2026-05-30T00:00:00Z
  checked: "client.js getDocumentContentUrl: returns a plain URL string and never performs an authenticated fetch"
  found: "Function returns '/api/documents/{id}/content' as a plain string for use in iframe src or window.open(). No fetch() with Authorization header."
  implication: "The content endpoint is auth-protected (get_regular_user dep), but the frontend reaches it through unauthenticated browser navigation. The 401 response is the actual error the user sees for any document; for cloud documents there is an additional issue in the Celery worker."

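The open-path findings in this section come down to a header check. As a minimal Python sketch, assuming a simplified model: `status_for_content_request` is a hypothetical stand-in for the `get_regular_user` dependency (the real one validates a JWT; this only checks that a Bearer token is present), and the two header dictionaries are made-up examples of what a browser navigation and a `fetch()` call would send.

```python
from typing import Mapping


def status_for_content_request(headers: Mapping[str, str]) -> int:
    """Hypothetical stand-in for get_regular_user: presence check only."""
    auth = headers.get("Authorization")
    if auth is None or not auth.startswith("Bearer "):
        return 401  # no token attached, request is rejected
    return 200  # token present; real code would also validate it


# An iframe src or window.open() navigation cannot attach custom headers.
navigation_headers = {"Accept": "application/pdf"}

# A fetch() call from the client can attach the token explicitly.
fetch_headers = {"Accept": "application/pdf", "Authorization": "Bearer example-token"}

print(status_for_content_request(navigation_headers))  # prints 401
print(status_for_content_request(fetch_headers))       # prints 200
```

The sketch only models why the navigation requests are rejected; it says nothing about which fix (authenticated fetch, cookie auth, signed URL) is appropriate.
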
## Resolution

root_cause: |
  Three independent root causes:
  1. OPEN (401 auth): The /api/documents/{id}/content endpoint requires a JWT Bearer token (get_regular_user dep), but DocumentPreviewModal and DocumentView both access it via iframe src or window.open(), which are browser navigations that send no Authorization header. All documents fail to open, not only cloud ones; cloud documents are additionally affected by root cause 2.
  2. RE-ANALYZE (wrong backend): The extract_and_classify Celery task hardcodes get_storage_backend() (always MinIO) at line 64 of document_tasks.py. For cloud-stored documents it should call get_storage_backend_for_document(), but the Celery task has no User ORM instance and no CloudConnection lookup mechanism. The task reads doc.storage_backend but does nothing with it: it always fetches from MinIO, which 404s on a WebDAV path.
  3. EDIT (endpoint missing): There is no PATCH or PUT endpoint for updating document metadata (filename/title). The user's "edit" likely refers to the re-analyze/re-extract operation or to metadata editing, neither of which works for cloud docs.
fix:
verification:
files_changed: []
