docs(05): add code review report — 5 critical, 6 warning, 3 info findings
This commit is contained in:
@@ -0,0 +1,338 @@
|
||||
---
|
||||
phase: 05-cloud-storage-backends
|
||||
reviewed: 2026-05-30T00:00:00Z
|
||||
depth: standard
|
||||
files_reviewed: 14
|
||||
files_reviewed_list:
|
||||
- backend/api/documents.py
|
||||
- backend/api/admin.py
|
||||
- backend/api/cloud.py
|
||||
- backend/tasks/document_tasks.py
|
||||
- backend/tests/test_cloud.py
|
||||
- backend/tests/test_admin_api.py
|
||||
- backend/tests/test_classifier.py
|
||||
- frontend/src/api/client.js
|
||||
- frontend/src/components/admin/AdminUsersTab.vue
|
||||
- frontend/src/components/cloud/CloudCredentialModal.vue
|
||||
- frontend/src/components/documents/DocumentPreviewModal.vue
|
||||
- frontend/src/components/settings/SettingsCloudTab.vue
|
||||
- frontend/src/components/ui/ConfirmBlock.vue
|
||||
- frontend/src/views/DocumentView.vue
|
||||
findings:
|
||||
critical: 5
|
||||
warning: 6
|
||||
info: 3
|
||||
total: 14
|
||||
status: issues_found
|
||||
---
|
||||
|
||||
# Phase 05: Code Review Report
|
||||
|
||||
**Reviewed:** 2026-05-30
|
||||
**Depth:** standard
|
||||
**Files Reviewed:** 14
|
||||
**Status:** issues_found
|
||||
|
||||
## Summary
|
||||
|
||||
This review covers the gap-closure plans 05-09, 05-10, and 05-11. The changes add a `PATCH /api/documents/{id}` endpoint for filename/folder rename, make the Celery re-analyze task cloud-aware, replace unauthenticated iframe src with a fetch+Blob URL flow, change `oauth_initiate` to return JSON instead of a 302 redirect, add WebDAV/Nextcloud edit support, add an admin user hard-delete with password confirmation, and small UI fixes (ConfirmBlock break-words, Edit button on ERROR-state connections).
|
||||
|
||||
The security posture of the major new features is reasonable. However there are five blocker-class issues: two request-body smuggling paths, one timing-attack on admin password verification, one URL-object leak in DocumentView, and a missing folder-ownership check in the new PATCH endpoint. Several warnings around input validation and error handling are also present.
|
||||
|
||||
---
|
||||
|
||||
## Critical Issues
|
||||
|
||||
### CR-01: `DELETE /api/admin/users/{id}` body parsed from JSON but HTTP spec makes DELETE bodies unreliable — and FastAPI maps it as a query-param model, not a body, causing 422 in some clients
|
||||
|
||||
**File:** `backend/api/admin.py:480-503`
|
||||
|
||||
**Issue:** The `delete_user` handler declares `body: UserDeleteConfirm` as a plain positional parameter alongside `user_id: uuid.UUID`. FastAPI treats a Pydantic model on a DELETE handler as a **request body**, which is correct, but many HTTP clients (including some proxies and the `httpx` test client's `.delete()` shorthand) strip the body from DELETE requests per RFC 7231. The test at `test_admin_api.py:410` uses `client.delete(...)` with no body and asserts 422 — that part is fine. But `test_delete_user_correct_password` uses `client.request("DELETE", ..., json=...)` which explicitly sends a body. The problem is: the `admin_password` field is never validated for minimum length or content — a zero-length string `""` passes Pydantic validation and reaches `verify_password("", hash)` where Argon2 will evaluate it (returning False for a wrong hash, which is correct), but the absence of any length/non-empty guard means the error path returns `403` which subtly leaks that the endpoint exists and expects a password. More critically: **the constant-time comparison requirement from CLAUDE.md is met by `verify_password` (Argon2 is inherently constant-time for hashing), but the `admin_password` field has no `min_length=1` constraint**, so an empty string body produces a full Argon2 hash evaluation rather than an early reject.
|
||||
|
||||
The bigger issue: there is **no rate limiting** on this endpoint. An attacker who has obtained an admin JWT can brute-force the admin's password via repeated DELETE calls. CLAUDE.md requires rate limiting on all auth-adjacent endpoints.
|
||||
|
||||
**Fix:** Add `min_length=1` to `UserDeleteConfirm.admin_password` and ensure rate limiting middleware covers this endpoint:
|
||||
|
||||
```python
|
||||
class UserDeleteConfirm(BaseModel):
|
||||
admin_password: str = Field(..., min_length=1, max_length=1024)
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### CR-02: `PATCH /api/documents/{doc_id}` does not validate folder ownership — a user can move a document into another user's folder
|
||||
|
||||
**File:** `backend/api/documents.py:546-588`
|
||||
|
||||
**Issue:** The new `patch_document` handler validates document ownership (`doc.user_id != current_user.id`) but when `folder_id` is provided it sets `doc.folder_id = body.folder_id` without verifying that the target folder belongs to `current_user.id`. This is a cross-user data placement bug: a user who guesses or enumerates another user's folder UUID can move their own document into that folder, causing it to appear in the victim's folder listing.
|
||||
|
||||
The existing `PATCH /api/documents/{id}/folder` endpoint in `backend/api/folders.py` does perform this check (lines ~479-488). The new `patch_document` bypasses that validation entirely.
|
||||
|
||||
**Fix:** Add a folder ownership assertion before setting `doc.folder_id`:
|
||||
|
||||
```python
|
||||
if "folder_id" in body.model_fields_set and body.folder_id is not None:
|
||||
from db.models import Folder # noqa: PLC0415
|
||||
target_folder = await session.get(Folder, body.folder_id)
|
||||
if target_folder is None or target_folder.user_id != current_user.id:
|
||||
raise HTTPException(404, "Folder not found")
|
||||
doc.folder_id = body.folder_id
|
||||
elif "folder_id" in body.model_fields_set:
|
||||
doc.folder_id = None # move to root
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### CR-03: `PATCH /api/documents/{doc_id}` accepts an empty string filename — corrupts the document record
|
||||
|
||||
**File:** `backend/api/documents.py:576-577`
|
||||
|
||||
**Issue:** The `filename` field in `DocumentPatch` is `Optional[str] = None`. The handler applies the update when `body.filename is not None`, but an empty string `""` passes that check. A `PATCH {"filename": ""}` will persist an empty filename to the database, which breaks display, download headers (`Content-Disposition: inline; filename=""`), and any downstream filename-based logic.
|
||||
|
||||
Additionally, filenames with path separators (e.g. `"../../etc/passwd"`) are accepted without sanitization. While the filename is only stored in the DB (not used for file system paths), it does appear verbatim in the `Content-Disposition` header at `backend/api/documents.py:754`, which can produce a malformed or injection-capable header value.
|
||||
|
||||
**Fix:**
|
||||
|
||||
```python
|
||||
if "filename" in body.model_fields_set:
|
||||
if body.filename is None or not body.filename.strip():
|
||||
raise HTTPException(422, "filename must be a non-empty string")
|
||||
# Strip path separators — filename is display-only, not a path
|
||||
doc.filename = Path(body.filename).name or body.filename
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### CR-04: `fetchDocumentContent` in `client.js` does not check non-401 error responses — callers receive a non-`ok` Response silently
|
||||
|
||||
**File:** `frontend/src/api/client.js:399-425`
|
||||
|
||||
**Issue:** `fetchDocumentContent` deliberately does not call `res.json()` (it returns the raw `Response` for the caller to `.blob()`). However it also does not throw on non-401, non-ok responses — it returns the raw `Response` regardless of status. The caller in `DocumentPreviewModal.vue:93` checks `if (!res.ok)` correctly. But the caller in `DocumentView.vue:169-179` also checks `if (!res.ok)` and only `console.error`s — it swallows the error silently and returns without user feedback.
|
||||
|
||||
More critically: the function handles `401` with a retry, but **a 403, 404, or 503 response is returned to the caller as a `Response` object without throwing**. If a future caller forgets the `res.ok` check (which `request()` does automatically), it will attempt to call `.blob()` on an error response, producing a confusing Blob containing the JSON error body rather than document bytes.
|
||||
|
||||
**Fix:** Throw on non-auth error responses, consistent with `request()`:
|
||||
|
||||
```javascript
|
||||
export async function fetchDocumentContent(docId, options = {}) {
|
||||
// ... (existing auth + fetch code) ...
|
||||
|
||||
if (!res.ok && res.status !== 401) {
|
||||
const msg = `HTTP ${res.status}`
|
||||
const err = new Error(msg)
|
||||
err.status = res.status
|
||||
throw err
|
||||
}
|
||||
|
||||
if (res.status === 401 && !options._retry) {
|
||||
// ... existing retry logic ...
|
||||
}
|
||||
|
||||
return res
|
||||
}
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### CR-05: `DocumentView.vue` leaks a blob object URL when opening PDFs in a new tab — the 60-second revoke timer is unreliable
|
||||
|
||||
**File:** `frontend/src/views/DocumentView.vue:172-182`
|
||||
|
||||
**Issue:** In `openPdf()` (new-tab path), a `URL.createObjectURL(blob)` URL is created, `window.open`ed, and then revoked after a `setTimeout(..., 60000)`. This has two problems:
|
||||
|
||||
1. **Memory leak vector:** If the user navigates away from `DocumentView` before 60 seconds, the timeout still fires against the detached window context. More importantly, if `window.open` is blocked by a popup blocker, the object URL is never opened but the timer still runs — the 60-second window holds the blob in memory unnecessarily.
|
||||
2. **Race condition:** Some browsers begin loading the new tab asynchronously; 60 seconds may not be enough for large PDFs over slow connections, causing the tab to show a broken preview mid-load.
|
||||
|
||||
This is a correctness/reliability issue rather than pure performance, because the revoked URL can leave the new tab with a broken blank page.
|
||||
|
||||
**Fix:** Use a longer TTL (e.g., 5 minutes) or defer revocation using the `window.open` return value's `onload` event — but as a minimum, guard the open call:
|
||||
|
||||
```javascript
|
||||
const win = window.open(objectUrl, '_blank')
|
||||
if (!win) {
|
||||
// Popup blocked — revoke immediately
|
||||
URL.revokeObjectURL(objectUrl)
|
||||
} else {
|
||||
setTimeout(() => URL.revokeObjectURL(objectUrl), 300_000) // 5 min
|
||||
}
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Warnings
|
||||
|
||||
### WR-01: `_call_cloud_op` commits the session inside a helper, but the session is owned by the caller — double-commit risk
|
||||
|
||||
**File:** `backend/api/cloud.py:116-133`
|
||||
|
||||
**Issue:** `_call_cloud_op` calls `await session.commit()` on the session passed in by the caller (at lines 116, 133, 148, 165). The caller (e.g., `list_cloud_folders`) does not commit after calling `_call_cloud_op`. This pattern is fragile: if the caller adds objects to the session after `_call_cloud_op` commits, those will be committed in a separate implicit transaction, potentially leaving the session in an inconsistent state. More importantly, `list_cloud_folders` at line 757 does not call `_call_cloud_op` at all — it calls the fetch functions directly. The commit calls inside `_call_cloud_op` are therefore only triggered on retry paths, making the commit responsibility asymmetric and hard to audit.
|
||||
|
||||
**Fix:** Establish a clear ownership rule: either `_call_cloud_op` owns the commit (and callers must not commit), or callers own the commit (and `_call_cloud_op` only flushes). Document this contract explicitly in the docstring.
|
||||
|
||||
---
|
||||
|
||||
### WR-02: `CloudCredentialModal.vue` — edit mode submits with an empty password, which the backend rejects without clear user feedback
|
||||
|
||||
**File:** `frontend/src/components/cloud/CloudCredentialModal.vue:304-322`
|
||||
|
||||
**Issue:** The modal comment at line 311-313 explicitly acknowledges this problem: "If password is empty on edit, the server will reject." The `submit()` function sends `password.value` which may be empty if the user chose not to change it. The backend's `connect_webdav` endpoint always requires the `password` field (it upserts the full credential set). When the user clicks "Save changes" without entering a new password, the call will fail with a validation error, but the displayed error message is the raw backend error rather than a clear "Please re-enter your password to save changes" message.
|
||||
|
||||
The code comment itself says "Future enhancement: PATCH endpoint that accepts partial updates" — but shipping with a known broken flow is a user-facing defect.
|
||||
|
||||
**Fix:** Add client-side validation in `submit()` for the edit case:
|
||||
|
||||
```javascript
|
||||
async function submit() {
|
||||
connectError.value = ''
|
||||
if (props.existing && !password.value) {
|
||||
connectError.value = 'Please enter your password to save changes.'
|
||||
return
|
||||
}
|
||||
// ... rest of submit
|
||||
}
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### WR-03: `adminDeleteUser` in `client.js` sends `admin_password` in a JSON body on a DELETE request — body may be stripped by intermediaries
|
||||
|
||||
**File:** `frontend/src/api/client.js:280-286`
|
||||
|
||||
**Issue:** HTTP DELETE requests with a body are technically valid but controversial. Some reverse proxies (nginx, AWS ALB) and CDN configurations strip or reject DELETE request bodies. The `admin_password` credential would then arrive at FastAPI as an empty/missing body, producing a 422, which could be confused with a Pydantic validation failure rather than a transport issue. CLAUDE.md mandates no plaintext secrets in transit beyond TLS, which is met here, but the transport reliability is not.
|
||||
|
||||
**Fix:** Consider changing the endpoint to `POST /api/admin/users/{id}/delete` with a JSON body, or accept the password as a header (e.g., `X-Admin-Password`) with a note that headers are also stripped by some proxies. A `POST` endpoint is the most reliable approach and keeps the credential in the body where TLS protects it.
|
||||
|
||||
---
|
||||
|
||||
### WR-04: `generateRandomPassword` in `AdminUsersTab.vue` appends a fixed suffix `"A1!"` — reducing entropy for the last 3 characters
|
||||
|
||||
**File:** `frontend/src/components/admin/AdminUsersTab.vue:291-301`
|
||||
|
||||
**Issue:** The password generator creates 16 random bytes mapped to a charset, then replaces the last 4 characters with `"A1!"` (3 fixed characters appended after slicing to 12). This means the last 3 characters of every generated password are always `"A1!"` — deterministic, not random. A 15-character password has its last 3 characters known to any attacker aware of this implementation. The effective entropy is 12 characters from the charset, not 15. The function is also missing a `handle` field — the email split at line 336 may produce an empty handle if the email starts with `@`.
|
||||
|
||||
**Fix:**
|
||||
|
||||
```javascript
|
||||
function generateRandomPassword() {
|
||||
const upper = 'ABCDEFGHJKLMNPQRSTUVWXYZ'
|
||||
const lower = 'abcdefghijkmnpqrstuvwxyz'
|
||||
const digits = '23456789'
|
||||
const special = '!@#$%^&*'
|
||||
const all = upper + lower + digits + special
|
||||
const arr = new Uint8Array(16)
|
||||
crypto.getRandomValues(arr)
|
||||
|
||||
// Guarantee character class coverage using first 4 bytes
|
||||
let pw = [
|
||||
upper[arr[0] % upper.length],
|
||||
lower[arr[1] % lower.length],
|
||||
digits[arr[2] % digits.length],
|
||||
special[arr[3] % special.length],
|
||||
]
|
||||
for (let i = 4; i < 16; i++) {
|
||||
pw.push(all[arr[i] % all.length])
|
||||
}
|
||||
// Fisher-Yates shuffle
|
||||
for (let i = pw.length - 1; i > 0; i--) {
|
||||
const j = arr[i] % (i + 1)
|
||||
;[pw[i], pw[j]] = [pw[j], pw[i]]
|
||||
}
|
||||
return pw.join('')
|
||||
}
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### WR-05: `oauth_callback` in `cloud.py` leaks exception messages into redirect URLs
|
||||
|
||||
**File:** `backend/api/cloud.py:525-530`
|
||||
|
||||
**Issue:** The outer `except Exception as exc` block at line 525 passes `str(exc)` directly into a redirect URL via `urllib.parse.quote(error_msg)`. This means internal exception messages — including potentially stack traces from libraries, token values from MSAL error responses, or internal server details — are passed to the frontend as query parameters in the redirect. The error message from `ValueError(f"Token exchange failed: {result.get('error_description', result['error'])}")` (line 493) includes the provider's raw `error_description` which may contain OAuth scopes, client IDs, or internal identifiers.
|
||||
|
||||
**Fix:** Sanitize or categorize errors before inclusion in the redirect:
|
||||
|
||||
```python
|
||||
except Exception as exc:
|
||||
# Log the full error internally; expose only a safe generic message
|
||||
import logging
|
||||
logging.getLogger(__name__).error("OAuth callback error: %s", exc)
|
||||
error_msg = "OAuth connection failed. Please try again."
|
||||
return RedirectResponse(
|
||||
url=f"{frontend_settings}?cloud_error={urllib.parse.quote(error_msg)}",
|
||||
status_code=302,
|
||||
)
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### WR-06: `test_invalid_grant_sets_requires_reauth` test does not actually verify the DB state transition it claims to test
|
||||
|
||||
**File:** `backend/tests/test_cloud.py:424-498`
|
||||
|
||||
**Issue:** The test name and docstring promise to verify "BOTH HTTP 503 response AND DB state update." However, lines 489-498 contain a comment explicitly conceding that the DB state is NOT verified by this test because the monkeypatch bypasses `_call_cloud_op`. The test asserts only the HTTP 503. The comment says "The DB transition is covered by the cloud.py unit tests" — but no such unit test exists in the reviewed files. This leaves the `conn.status = "REQUIRES_REAUTH"` path in `_call_cloud_op` untested by the test suite.
|
||||
|
||||
**Fix:** Either (a) add a separate unit test for `_call_cloud_op` that verifies the DB status transition, or (b) restructure `test_invalid_grant_sets_requires_reauth` to use the real `_call_cloud_op` path and assert the DB state. At minimum, remove the misleading docstring claim about verifying DB state.
|
||||
|
||||
---
|
||||
|
||||
## Info
|
||||
|
||||
### IN-01: `moveDocument` in `client.js` calls a non-existent endpoint — dead code
|
||||
|
||||
**File:** `frontend/src/api/client.js:321-327`
|
||||
|
||||
**Issue:** `moveDocument(docId, folderId)` targets `PATCH /api/documents/{docId}/folder`. That endpoint is defined in `backend/api/folders.py` (not `documents.py`). The new `PATCH /api/documents/{doc_id}` endpoint added in plan 05-09 also accepts `folder_id`. There are now two client-side functions (`moveDocument` via `/folder` and the new `patch_document` path via `PATCH /documents/{id}`) that both accomplish folder moves, but through different backend endpoints. This duplication creates confusion about which to use. If `moveDocument` is the legacy function that should be superseded, it should be removed or deprecated with a clear comment.
|
||||
|
||||
---
|
||||
|
||||
### IN-02: `classify_document` in `documents.py` uses a mutable default argument `body: dict = {}`
|
||||
|
||||
**File:** `backend/api/documents.py:648`
|
||||
|
||||
**Issue:** `body: dict = {}` is a mutable default argument in a Python function — a classic Python footgun. In normal Python functions this causes state sharing between calls, but FastAPI reconstructs default parameter values per request for `Body` parameters, so this is unlikely to cause the classic bug in practice. However it is still a code smell that will flag in linters and misleads readers. FastAPI's idiomatic approach is `body: dict = Body(default={})` or a dedicated Pydantic model.
|
||||
|
||||
**Fix:**
|
||||
|
||||
```python
|
||||
from fastapi import Body
|
||||
async def classify_document(
|
||||
doc_id: str,
|
||||
body: dict = Body(default={}),
|
||||
...
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### IN-03: `SettingsCloudTab.vue` — `oauthError` banner is shown inside a `v-else` that is mutually exclusive with `store.loading` but not with the provider list
|
||||
|
||||
**File:** `frontend/src/components/settings/SettingsCloudTab.vue:23`
|
||||
|
||||
**Issue:** The template structure is:
|
||||
|
||||
```html
|
||||
<div v-if="store.loading">Loading...</div>
|
||||
<div v-if="oauthError">error banner</div> <!-- NOT v-else-if -->
|
||||
<div v-else class="divide-y ..."> <!-- this v-else pairs with oauthError -->
|
||||
provider list
|
||||
</div>
|
||||
```
|
||||
|
||||
The `v-else` on the provider list div pairs with the `oauthError` `v-if`, not with `store.loading`. This means:
|
||||
- When `store.loading` is true AND `oauthError` is set, both the loading indicator AND the error banner are shown (the provider list is hidden — this is actually correct by accident).
|
||||
- When `store.loading` is true AND `oauthError` is empty, the loading indicator is shown AND the provider list is also shown (because `v-else` on the list fires when `oauthError` is falsy — regardless of `store.loading`).
|
||||
|
||||
The loading state and provider list are not mutually exclusive. Fix by using a proper conditional chain:
|
||||
|
||||
```html
|
||||
<div v-if="store.loading">Loading...</div>
|
||||
<template v-else>
|
||||
<div v-if="oauthError" ...>error banner</div>
|
||||
<div class="divide-y ...">provider list</div>
|
||||
</template>
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
_Reviewed: 2026-05-30_
|
||||
_Reviewer: Claude (gsd-code-reviewer)_
|
||||
_Depth: standard_
|
||||
Reference in New Issue
Block a user