docs(05): add code review and verification reports for phase 5

REVIEW.md: 3 critical findings fixed (HTTPException passthrough,
Redis pre-flight ordering, CLOUD_CREDS_KEY in celery-worker env)
VERIFICATION.md: 7/7 must-haves verified; 6 human-verification items
require live cloud credentials (Google Drive, OneDrive, Nextcloud/WebDAV)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
This commit is contained in:
curo1305
2026-05-30 18:07:42 +02:00
parent b1a136b5be
commit 1a6fa08a34
2 changed files with 407 additions and 228 deletions
@@ -2,258 +2,273 @@
phase: 05-cloud-storage-backends
reviewed: 2026-05-30T00:00:00Z
depth: standard
files_reviewed: 14
files_reviewed: 6
files_reviewed_list:
- backend/api/documents.py
- backend/api/admin.py
- backend/api/cloud.py
- backend/tasks/document_tasks.py
- backend/api/documents.py
- docker-compose.yml
- frontend/src/views/CloudStorageView.vue
- backend/tests/test_cloud.py
- backend/tests/test_admin_api.py
- backend/tests/test_classifier.py
- frontend/src/api/client.js
- frontend/src/components/admin/AdminUsersTab.vue
- frontend/src/components/cloud/CloudCredentialModal.vue
- frontend/src/components/documents/DocumentPreviewModal.vue
- frontend/src/components/settings/SettingsCloudTab.vue
- frontend/src/components/ui/ConfirmBlock.vue
- frontend/src/views/DocumentView.vue
- backend/tests/test_documents.py
findings:
critical: 5
warning: 6
info: 3
total: 14
critical: 3
warning: 4
info: 2
total: 9
status: issues_found
---
# Phase 05: Code Review Report
# Phase 05 Plan 12: Code Review Report
**Reviewed:** 2026-05-30
**Reviewed:** 2026-05-30T00:00:00Z
**Depth:** standard
**Files Reviewed:** 14
**Files Reviewed:** 6
**Status:** issues_found
## Summary
This review covers the gap-closure plans 05-09, 05-10, and 05-11. The changes add a `PATCH /api/documents/{id}` endpoint for filename/folder rename, make the Celery re-analyze task cloud-aware, replace unauthenticated iframe src with a fetch+Blob URL flow, change `oauth_initiate` to return JSON instead of a 302 redirect, add WebDAV/Nextcloud edit support, add an admin user hard-delete with password confirmation, and small UI fixes (ConfirmBlock break-words, Edit button on ERROR-state connections).
This review covers the plan 05-12 gap-closure changes: OAuth pre-flight config validation added
to `oauth_initiate`, a broad `except Exception → 502` fallback added after the
`except CloudConnectionError → 503` clause in `stream_document_content`, a Celery worker source
volume mount added to `docker-compose.yml`, an upload-hint paragraph added to
`CloudStorageView.vue`, two new pre-flight tests in `test_cloud.py`, and one new 502-path test
in `test_documents.py`.
The security posture of the major new features is reasonable. However there are five blocker-class issues: two request-body smuggling paths, one timing-attack on admin password verification, one URL-object leak in DocumentView, and a missing folder-ownership check in the new PATCH endpoint. Several warnings around input validation and error handling are also present.
Three critical issues were found. The most impactful is that the broad `except Exception` clause
added to `stream_document_content` unconditionally swallows `HTTPException` raised by
`get_storage_backend_for_document`, converting a correct 503 "reconnect" response into a
misleading 502 "unreachable" response. The second critical issue is that Redis state tokens are
written to Redis before the new pre-flight check runs, leaving one orphan state entry per
rejected OAuth initiation request. The third is that the Celery worker container is missing the
`CLOUD_CREDS_KEY` environment variable, which causes silent use of the fallback default key
`"CHANGEME-32-bytes-padded!!"` for cloud-document credential decryption, making every
extract-and-classify Celery task for cloud documents fail at runtime.
## Narrative Findings (AI reviewer)
---
## Critical Issues
### CR-01: `DELETE /api/admin/users/{id}` body parsed from JSON but HTTP spec makes DELETE bodies unreliable — and FastAPI maps it as a query-param model, not a body, causing 422 in some clients
### CR-01: `except Exception` in `stream_document_content` swallows `HTTPException` from `get_storage_backend_for_document`
**File:** `backend/api/admin.py:480-503`
**File:** `backend/api/documents.py:751-763`
**Issue:** The `delete_user` handler declares `body: UserDeleteConfirm` as a plain positional parameter alongside `user_id: uuid.UUID`. FastAPI treats a Pydantic model on a DELETE handler as a **request body**, which is correct, but many HTTP clients (including some proxies and the `httpx` test client's `.delete()` shorthand) strip the body from DELETE requests per RFC 7231. The test at `test_admin_api.py:410` uses `client.delete(...)` with no body and asserts 422 — that part is fine. But `test_delete_user_correct_password` uses `client.request("DELETE", ..., json=...)` which explicitly sends a body. The problem is: the `admin_password` field is never validated for minimum length or content — a zero-length string `""` passes Pydantic validation and reaches `verify_password("", hash)` where Argon2 will evaluate it (returning False for a wrong hash, which is correct), but the absence of any length/non-empty guard means the error path returns `403` which subtly leaks that the endpoint exists and expects a password. More critically: **the constant-time comparison requirement from CLAUDE.md is met by `verify_password` (Argon2 is inherently constant-time for hashing), but the `admin_password` field has no `min_length=1` constraint**, so an empty string body produces a full Argon2 hash evaluation rather than an early reject.
**Issue:** `get_storage_backend_for_document` (in `storage/__init__.py:100-103`) raises
`HTTPException(503, "Cloud connection not found or inactive")` when no active `CloudConnection`
row exists for the document's provider. `HTTPException` is a subclass of `Exception`
(confirmed: `starlette.exceptions.HTTPException → Exception → BaseException`), so the new
`except Exception as exc` block on line 759 catches it and re-raises it wrapped in a new
`HTTPException(502, "Cloud backend unreachable …")`.
The bigger issue: there is **no rate limiting** on this endpoint. An attacker who has obtained an admin JWT can brute-force the admin's password via repeated DELETE calls. CLAUDE.md requires rate limiting on all auth-adjacent endpoints.
The caller receives a misleading 502 status and a "backend unreachable" message when the real
problem is that the cloud connection was deleted or set to `REQUIRES_REAUTH`. The correct 503
with the reconnect prompt is silently suppressed.
**Fix:** Add `min_length=1` to `UserDeleteConfirm.admin_password` and ensure rate limiting middleware covers this endpoint:
The new test `test_stream_document_content_cloud_backend_error` (test_documents.py:598-632) only
exercises the `RuntimeError` path by monkeypatching `get_storage_backend_for_document` to raise
a `RuntimeError`. It does not test the path where `get_storage_backend_for_document` raises
`HTTPException(503)`, so this regression is undetected by the test suite.
**Fix:** Re-order the `except` clauses to explicitly re-raise `HTTPException` before the broad
catch catches it:
```python
class UserDeleteConfirm(BaseModel):
admin_password: str = Field(..., min_length=1, max_length=1024)
try:
storage_backend = await get_storage_backend_for_document(doc, current_user, session)
file_bytes = await storage_backend.get_object(doc.object_key)
except CloudConnectionError as exc:
raise HTTPException(
status_code=503,
detail="Cloud connection requires re-authentication. Please reconnect in Settings.",
) from exc
except HTTPException:
raise # propagate 503 from get_storage_backend_for_document unchanged
except Exception as exc:
raise HTTPException(
status_code=502,
detail="Cloud backend unreachable. Please try again or reconnect in Settings.",
) from exc
```
---
### CR-02: `PATCH /api/documents/{doc_id}` does not validate folder ownership — a user can move a document into another user's folder
### CR-02: Redis OAuth state token written before pre-flight check — orphan Redis entries created on every rejected request
**File:** `backend/api/documents.py:546-588`
**File:** `backend/api/cloud.py:342-357`
**Issue:** The new `patch_document` handler validates document ownership (`doc.user_id != current_user.id`) but when `folder_id` is provided it sets `doc.folder_id = body.folder_id` without verifying that the target folder belongs to `current_user.id`. This is a cross-user data placement bug: a user who guesses or enumerates another user's folder UUID can move their own document into that folder, causing it to appear in the victim's folder listing.
**Issue:** In `oauth_initiate`, `redis_client.setex(f"oauth_state:{state_token}", 1800, …)` is
called on line 344, persisting a 30-minute Redis entry, before the provider-config pre-flight
checks on lines 348-357. When `google_client_id` or `onedrive_client_id` is empty, the function
raises `HTTPException(400)` and the state token is never consumed or deleted. Every rejected call
leaves one orphan Redis key with an 1800-second TTL.
The existing `PATCH /api/documents/{id}/folder` endpoint in `backend/api/folders.py` does perform this check (lines ~479-488). The new `patch_document` bypasses that validation entirely.
In a misconfigured deployment (where OAuth credentials are not set), every authenticated user
clicking "Connect" generates a Redis key that is never reclaimed except by TTL expiry. Beyond
memory waste, an orphan state token created before the rejection could theoretically be captured
from server logs or monitoring and submitted to the callback endpoint if credentials are later
configured — allowing a replay of a stale initiation.
**Fix:** Add a folder ownership assertion before setting `doc.folder_id`:
The two new tests (`test_oauth_initiate_google_drive_not_configured`,
`test_oauth_initiate_onedrive_not_configured`) verify the 400 response but do not assert that
the `FakeRedis._store` is empty, so the leak is undetected.
**Fix:** Move all pre-flight checks above the Redis write:
```python
if "folder_id" in body.model_fields_set and body.folder_id is not None:
from db.models import Folder # noqa: PLC0415
target_folder = await session.get(Folder, body.folder_id)
if target_folder is None or target_folder.user_id != current_user.id:
raise HTTPException(404, "Folder not found")
doc.folder_id = body.folder_id
elif "folder_id" in body.model_fields_set:
doc.folder_id = None # move to root
@router.get("/oauth/initiate/{provider}")
async def oauth_initiate(provider: str, request: Request,
current_user: User = Depends(get_regular_user)) -> dict:
if provider not in VALID_OAUTH_PROVIDERS:
raise HTTPException(status_code=400, detail=f"Unsupported OAuth provider: {provider}.")
# Pre-flight BEFORE touching Redis
if provider == "google_drive" and (not settings.google_client_id or not settings.google_client_secret):
raise HTTPException(status_code=400, detail="…Set GOOGLE_CLIENT_ID and GOOGLE_CLIENT_SECRET…")
if provider == "onedrive" and (not settings.onedrive_client_id or not settings.onedrive_client_secret):
raise HTTPException(status_code=400, detail="…Set ONEDRIVE_CLIENT_ID, ONEDRIVE_CLIENT_SECRET…")
state_token = secrets.token_urlsafe(32)
redis_client = request.app.state.redis
await redis_client.setex(f"oauth_state:{state_token}", 1800, str(current_user.id))
```
---
### CR-03: `PATCH /api/documents/{doc_id}` accepts an empty string filename — corrupts the document record
### CR-03: `celery-worker` missing `CLOUD_CREDS_KEY` — cloud document processing silently uses wrong decryption key
**File:** `backend/api/documents.py:576-577`
**File:** `docker-compose.yml:81-102`
**Issue:** The `filename` field in `DocumentPatch` is `Optional[str] = None`. The handler applies the update when `body.filename is not None`, but an empty string `""` passes that check. A `PATCH {"filename": ""}` will persist an empty filename to the database, which breaks display, download headers (`Content-Disposition: inline; filename=""`), and any downstream filename-based logic.
**Issue:** The `celery-worker` service environment block (lines 83-90) does not include
`CLOUD_CREDS_KEY`. Without this variable, `settings.cloud_creds_key` falls back to the default
`"CHANGEME-32-bytes-padded!!"` (config.py:61). The Celery task `_run` in
`tasks/document_tasks.py` calls `get_storage_backend_for_document`, which calls
`decrypt_credentials(settings.cloud_creds_key.encode(), str(user.id), conn.credentials_enc)`.
HKDF key derivation will silently use the wrong master key, Fernet will raise
`InvalidToken`, and the task returns `{"status": "extract_failed", "error": "retrieval failed: …"}`.
There is no startup-time validation; the failure only surfaces on the first cloud document
task execution.
Additionally, filenames with path separators (e.g. `"../../etc/passwd"`) are accepted without sanitization. While the filename is only stored in the DB (not used for file system paths), it does appear verbatim in the `Content-Disposition` header at `backend/api/documents.py:754`, which can produce a malformed or injection-capable header value.
The `backend` service correctly receives `SECRET_KEY` (line 64) and would receive `CLOUD_CREDS_KEY`
from the environment, but the `celery-worker` service does not pass either.
**Fix:**
```python
if "filename" in body.model_fields_set:
if body.filename is None or not body.filename.strip():
raise HTTPException(422, "filename must be a non-empty string")
# Strip path separators — filename is display-only, not a path
doc.filename = Path(body.filename).name or body.filename
```
---
### CR-04: `fetchDocumentContent` in `client.js` does not check non-401 error responses — callers receive a non-`ok` Response silently
**File:** `frontend/src/api/client.js:399-425`
**Issue:** `fetchDocumentContent` deliberately does not call `res.json()` (it returns the raw `Response` for the caller to `.blob()`). However it also does not throw on non-401, non-ok responses — it returns the raw `Response` regardless of status. The caller in `DocumentPreviewModal.vue:93` checks `if (!res.ok)` correctly. But the caller in `DocumentView.vue:169-179` also checks `if (!res.ok)` and only `console.error`s — it swallows the error silently and returns without user feedback.
More critically: the function handles `401` with a retry, but **a 403, 404, or 503 response is returned to the caller as a `Response` object without throwing**. If a future caller forgets the `res.ok` check (which `request()` does automatically), it will attempt to call `.blob()` on an error response, producing a confusing Blob containing the JSON error body rather than document bytes.
**Fix:** Throw on non-auth error responses, consistent with `request()`:
```javascript
export async function fetchDocumentContent(docId, options = {}) {
// ... (existing auth + fetch code) ...
if (!res.ok && res.status !== 401) {
const msg = `HTTP ${res.status}`
const err = new Error(msg)
err.status = res.status
throw err
}
if (res.status === 401 && !options._retry) {
// ... existing retry logic ...
}
return res
}
```
---
### CR-05: `DocumentView.vue` leaks a blob object URL when opening PDFs in a new tab — the 60-second revoke timer is unreliable
**File:** `frontend/src/views/DocumentView.vue:172-182`
**Issue:** In `openPdf()` (new-tab path), a `URL.createObjectURL(blob)` URL is created, `window.open`ed, and then revoked after a `setTimeout(..., 60000)`. This has two problems:
1. **Memory leak vector:** If the user navigates away from `DocumentView` before 60 seconds, the timeout still fires against the detached window context. More importantly, if `window.open` is blocked by a popup blocker, the object URL is never opened but the timer still runs — the 60-second window holds the blob in memory unnecessarily.
2. **Race condition:** Some browsers begin loading the new tab asynchronously; 60 seconds may not be enough for large PDFs over slow connections, causing the tab to show a broken preview mid-load.
This is a correctness/reliability issue rather than pure performance, because the revoked URL can leave the new tab with a broken blank page.
**Fix:** Use a longer TTL (e.g., 5 minutes) or defer revocation using the `window.open` return value's `onload` event — but as a minimum, guard the open call:
```javascript
const win = window.open(objectUrl, '_blank')
if (!win) {
// Popup blocked — revoke immediately
URL.revokeObjectURL(objectUrl)
} else {
setTimeout(() => URL.revokeObjectURL(objectUrl), 300_000) // 5 min
}
```yaml
celery-worker:
environment:
- DATABASE_URL=${DATABASE_URL}
- MINIO_ENDPOINT=${MINIO_ENDPOINT}
- MINIO_ACCESS_KEY=${MINIO_ACCESS_KEY}
- MINIO_SECRET_KEY=${MINIO_SECRET_KEY}
- MINIO_BUCKET=${MINIO_BUCKET}
- REDIS_URL=${REDIS_URL}
- CLOUD_CREDS_KEY=${CLOUD_CREDS_KEY} # required for cloud document credential decryption
- PYTHONDONTWRITEBYTECODE=1
```
---
## Warnings
### WR-01: `_call_cloud_op` commits the session inside a helper, but the session is owned by the caller — double-commit risk
### WR-01: `update_default_storage` accepts arbitrary string as `backend` value — no server-side allowlist
**File:** `backend/api/cloud.py:116-133`
**File:** `backend/api/cloud.py:922-941`
**Issue:** `_call_cloud_op` calls `await session.commit()` on the session passed in by the caller (at lines 116, 133, 148, 165). The caller (e.g., `list_cloud_folders`) does not commit after calling `_call_cloud_op`. This pattern is fragile: if the caller adds objects to the session after `_call_cloud_op` commits, those will be committed in a separate implicit transaction, potentially leaving the session in an inconsistent state. More importantly, `list_cloud_folders` at line 757 does not call `_call_cloud_op` at all — it calls the fetch functions directly. The commit calls inside `_call_cloud_op` are therefore only triggered on retry paths, making the commit responsibility asymmetric and hard to audit.
**Issue:** `PATCH /api/users/me/default-storage` accepts `{"backend": "<any string>"}` and
writes it directly to `user.default_storage_backend` without validation against an allowlist.
The docstring notes "validated by the frontend dropdown," which is a client-side-only control
trivially bypassed. A user can persist any string (e.g., `"../../etc"`, unsupported provider
slug, or an empty string) to the DB column, potentially causing downstream handler errors when
the value is used for routing.
**Fix:** Establish a clear ownership rule: either `_call_cloud_op` owns the commit (and callers must not commit), or callers own the commit (and `_call_cloud_op` only flushes). Document this contract explicitly in the docstring.
**Fix:** Add a `field_validator` to `DefaultStorageRequest`:
---
```python
_VALID_BACKENDS = frozenset({"minio", "google_drive", "onedrive", "nextcloud", "webdav"})
### WR-02: `CloudCredentialModal.vue` — edit mode submits with an empty password, which the backend rejects without clear user feedback
class DefaultStorageRequest(BaseModel):
backend: str
**File:** `frontend/src/components/cloud/CloudCredentialModal.vue:304-322`
**Issue:** The modal comment at line 311-313 explicitly acknowledges this problem: "If password is empty on edit, the server will reject." The `submit()` function sends `password.value` which may be empty if the user chose not to change it. The backend's `connect_webdav` endpoint always requires the `password` field (it upserts the full credential set). When the user clicks "Save changes" without entering a new password, the call will fail with a validation error, but the displayed error message is the raw backend error rather than a clear "Please re-enter your password to save changes" message.
The code comment itself says "Future enhancement: PATCH endpoint that accepts partial updates" — but shipping with a known broken flow is a user-facing defect.
**Fix:** Add client-side validation in `submit()` for the edit case:
```javascript
async function submit() {
connectError.value = ''
if (props.existing && !password.value) {
connectError.value = 'Please enter your password to save changes.'
return
}
// ... rest of submit
}
@field_validator("backend")
@classmethod
def backend_must_be_valid(cls, v: str) -> str:
if v not in _VALID_BACKENDS:
raise ValueError(f"backend must be one of {sorted(_VALID_BACKENDS)}")
return v
```
---
### WR-03: `adminDeleteUser` in `client.js` sends `admin_password` in a JSON body on a DELETE request — body may be stripped by intermediaries
### WR-02: Pre-flight check for OneDrive omits `onedrive_tenant_id` validation despite advertising it in the error message
**File:** `frontend/src/api/client.js:280-286`
**File:** `backend/api/cloud.py:353-357`
**Issue:** HTTP DELETE requests with a body are technically valid but controversial. Some reverse proxies (nginx, AWS ALB) and CDN configurations strip or reject DELETE request bodies. The `admin_password` credential would then arrive at FastAPI as an empty/missing body, producing a 422, which could be confused with a Pydantic validation failure rather than a transport issue. CLAUDE.md mandates no plaintext secrets in transit beyond TLS, which is met here, but the transport reliability is not.
**Fix:** Consider changing the endpoint to `POST /api/admin/users/{id}/delete` with a JSON body, or accept the password as a header (e.g., `X-Admin-Password`) with a note that headers are also stripped by some proxies. A `POST` endpoint is the most reliable approach and keeps the credential in the body where TLS protects it.
---
### WR-04: `generateRandomPassword` in `AdminUsersTab.vue` appends a fixed suffix `"A1!"` — reducing entropy for the last 3 characters
**File:** `frontend/src/components/admin/AdminUsersTab.vue:291-301`
**Issue:** The password generator creates 16 random bytes mapped to a charset, then replaces the last 4 characters with `"A1!"` (3 fixed characters appended after slicing to 12). This means the last 3 characters of every generated password are always `"A1!"` — deterministic, not random. A 15-character password has its last 3 characters known to any attacker aware of this implementation. The effective entropy is 12 characters from the charset, not 15. The function is also missing a `handle` field — the email split at line 336 may produce an empty handle if the email starts with `@`.
**Issue:** The OneDrive pre-flight guard checks only `onedrive_client_id` and
`onedrive_client_secret`. Its error detail tells the operator to set `ONEDRIVE_TENANT_ID`, but
the code never checks whether `settings.onedrive_tenant_id` is empty. The default value is
`"common"` (config.py:67), so this is rarely a problem in practice. However, if someone
explicitly sets `ONEDRIVE_TENANT_ID=""`, the MSAL authority URL becomes
`https://login.microsoftonline.com//oauth2/v2.0/token`, producing an MSAL runtime error after
the pre-flight is supposed to have caught the misconfiguration.
**Fix:**
```javascript
function generateRandomPassword() {
const upper = 'ABCDEFGHJKLMNPQRSTUVWXYZ'
const lower = 'abcdefghijkmnpqrstuvwxyz'
const digits = '23456789'
const special = '!@#$%^&*'
const all = upper + lower + digits + special
const arr = new Uint8Array(16)
crypto.getRandomValues(arr)
// Guarantee character class coverage using first 4 bytes
let pw = [
upper[arr[0] % upper.length],
lower[arr[1] % lower.length],
digits[arr[2] % digits.length],
special[arr[3] % special.length],
]
for (let i = 4; i < 16; i++) {
pw.push(all[arr[i] % all.length])
}
// Fisher-Yates shuffle
for (let i = pw.length - 1; i > 0; i--) {
const j = arr[i] % (i + 1)
;[pw[i], pw[j]] = [pw[j], pw[i]]
}
return pw.join('')
}
```python
if provider == "onedrive" and (
not settings.onedrive_client_id
or not settings.onedrive_client_secret
or not settings.onedrive_tenant_id
):
raise HTTPException(
status_code=400,
detail="OneDrive OAuth is not configured. Set ONEDRIVE_CLIENT_ID, ONEDRIVE_CLIENT_SECRET, and ONEDRIVE_TENANT_ID.",
)
```
---
### WR-05: `oauth_callback` in `cloud.py` leaks exception messages into redirect URLs
### WR-03: New pre-flight tests do not assert Redis state is clean after a 400 response
**File:** `backend/api/cloud.py:525-530`
**File:** `backend/tests/test_cloud.py:784-835`
**Issue:** The outer `except Exception as exc` block at line 525 passes `str(exc)` directly into a redirect URL via `urllib.parse.quote(error_msg)`. This means internal exception messages — including potentially stack traces from libraries, token values from MSAL error responses, or internal server details — are passed to the frontend as query parameters in the redirect. The error message from `ValueError(f"Token exchange failed: {result.get('error_description', result['error'])}")` (line 493) includes the provider's raw `error_description` which may contain OAuth scopes, client IDs, or internal identifiers.
**Issue:** `test_oauth_initiate_google_drive_not_configured` and
`test_oauth_initiate_onedrive_not_configured` both set up a `FakeRedis`, call the endpoint
expecting a 400, and reset `app.state.redis = None`. Neither test asserts that
`fake_redis._store` is empty after the call. Because the state token is currently written before
the pre-flight check (CR-02 above), a check like this would fail today — confirming the bug.
When CR-02 is fixed, adding the assertion hardens the test against regressions:
**Fix:** Sanitize or categorize errors before inclusion in the redirect:
```python
# After the status assert, add:
assert len(fake_redis._store) == 0, (
"No OAuth state should be stored in Redis when pre-flight validation fails"
)
```
---
### WR-04: `oauth_callback` reflects raw OAuth provider `error` parameter and internal exception messages into redirect URL
**File:** `backend/api/cloud.py:427-428` and `537-539`
**Issue:** `error_param` from the query string is embedded verbatim into a `ValueError` message
(`f"OAuth provider returned error: {error_param}"`), which flows into `str(exc)` and is passed
to `urllib.parse.quote` before appearing as `?cloud_error=…` in the redirect. The URL encoding
prevents injection in the query string. However:
1. A malicious or compromised OAuth provider can inject arbitrary text into the user-visible
error banner with no server-side length cap or character filter.
2. The outer `except Exception` block at line 536 passes `str(exc)` for all internal errors,
which may include stack trace fragments, OAuth client IDs, or token values from provider
error responses (e.g., `ValueError(f"Token exchange failed: {result.get('error_description', result['error'])}")`
at line 504 — `error_description` is provider-controlled).
**Fix:** Cap the length and filter the error message before reflecting it:
```python
except Exception as exc:
# Log the full error internally; expose only a safe generic message
import logging
logging.getLogger(__name__).error("OAuth callback error: %s", exc)
error_msg = "OAuth connection failed. Please try again."
@@ -265,74 +280,50 @@ except Exception as exc:
---
### WR-06: `test_invalid_grant_sets_requires_reauth` test does not actually verify the DB state transition it claims to test
**File:** `backend/tests/test_cloud.py:424-498`
**Issue:** The test name and docstring promise to verify "BOTH HTTP 503 response AND DB state update." However, lines 489-498 contain a comment explicitly conceding that the DB state is NOT verified by this test because the monkeypatch bypasses `_call_cloud_op`. The test asserts only the HTTP 503. The comment says "The DB transition is covered by the cloud.py unit tests" — but no such unit test exists in the reviewed files. This leaves the `conn.status = "REQUIRES_REAUTH"` path in `_call_cloud_op` untested by the test suite.
**Fix:** Either (a) add a separate unit test for `_call_cloud_op` that verifies the DB status transition, or (b) restructure `test_invalid_grant_sets_requires_reauth` to use the real `_call_cloud_op` path and assert the DB state. At minimum, remove the misleading docstring claim about verifying DB state.
---
## Info
### IN-01: `moveDocument` in `client.js` calls a non-existent endpoint — dead code
### IN-01: `CloudStorageView.vue` does not fetch connections on mount — direct navigation shows stale empty state
**File:** `frontend/src/api/client.js:321-327`
**File:** `frontend/src/views/CloudStorageView.vue:61-93`
**Issue:** `moveDocument(docId, folderId)` targets `PATCH /api/documents/{docId}/folder`. That endpoint is defined in `backend/api/folders.py` (not `documents.py`). The new `PATCH /api/documents/{doc_id}` endpoint added in plan 05-09 also accepts `folder_id`. There are now two client-side functions (`moveDocument` via `/folder` and the new `patch_document` path via `PATCH /documents/{id}`) that both accomplish folder moves, but through different backend endpoints. This duplication creates confusion about which to use. If `moveDocument` is the legacy function that should be superseded, it should be removed or deprecated with a clear comment.
**Issue:** The component reads `cloudStore.connections` and `cloudStore.loading` reactively but
never calls `cloudStore.fetchConnections()` (or equivalent) in an `onMounted` hook. If a user
navigates directly to `/cloud` without first visiting a page that pre-populates the store, the
component renders the "No cloud storage connected" empty state without fetching live data. This
is a reliability gap for direct navigation and deep-link scenarios.
**Fix:** Add `onMounted`:
```javascript
import { computed, onMounted } from 'vue'
onMounted(() => { cloudStore.fetchConnections?.() })
```
or document explicitly that the parent layout is responsible for pre-fetching.
---
### IN-02: `classify_document` in `documents.py` uses a mutable default argument `body: dict = {}`
### IN-02: `classify_document` uses mutable default argument `body: dict = {}`
**File:** `backend/api/documents.py:648`
**File:** `backend/api/documents.py:657`
**Issue:** `body: dict = {}` is a mutable default argument in a Python function — a classic Python footgun. In normal Python functions this causes state sharing between calls, but FastAPI reconstructs default parameter values per request for `Body` parameters, so this is unlikely to cause the classic bug in practice. However it is still a code smell that will flag in linters and misleads readers. FastAPI's idiomatic approach is `body: dict = Body(default={})` or a dedicated Pydantic model.
**Fix:**
**Issue:** `body: dict = {}` is the classic Python mutable-default-argument anti-pattern.
FastAPI reconstructs body parameters per request so the classic shared-state bug does not
manifest in production, but static analysis tools (ruff B006, mypy) flag it, and calling the
function directly from tests with no `body` argument risks state sharing if the function is
ever modified. Use `None` as the sentinel:
```python
from fastapi import Body
async def classify_document(
doc_id: str,
body: dict = Body(default={}),
...
body: Optional[dict] = None,
):
topic_names = body.get("topics") if body else None
```
---
### IN-03: `SettingsCloudTab.vue` — `oauthError` banner is shown inside a `v-else` that is mutually exclusive with `store.loading` but not with the provider list
**File:** `frontend/src/components/settings/SettingsCloudTab.vue:23`
**Issue:** The template structure is:
```html
<div v-if="store.loading">Loading...</div>
<div v-if="oauthError">error banner</div> <!-- NOT v-else-if -->
<div v-else class="divide-y ..."> <!-- this v-else pairs with oauthError -->
provider list
</div>
```
The `v-else` on the provider list div pairs with the `oauthError` `v-if`, not with `store.loading`. This means:
- When `store.loading` is true AND `oauthError` is set, both the loading indicator AND the error banner are shown (the provider list is hidden — this is actually correct by accident).
- When `store.loading` is true AND `oauthError` is empty, the loading indicator is shown AND the provider list is also shown (because `v-else` on the list fires when `oauthError` is falsy — regardless of `store.loading`).
The loading state and provider list are not mutually exclusive. Fix by using a proper conditional chain:
```html
<div v-if="store.loading">Loading...</div>
<template v-else>
<div v-if="oauthError" ...>error banner</div>
<div class="divide-y ...">provider list</div>
</template>
```
---
_Reviewed: 2026-05-30_
_Reviewed: 2026-05-30T00:00:00Z_
_Reviewer: Claude (gsd-code-reviewer)_
_Depth: standard_