docs(05): add code review and verification reports for phase 5
REVIEW.md: 3 critical findings fixed (HTTPException passthrough, Redis pre-flight ordering, CLOUD_CREDS_KEY in celery-worker env) VERIFICATION.md: 7/7 must-haves verified; 6 human-verification items require live cloud credentials (Google Drive, OneDrive, Nextcloud/WebDAV) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
This commit is contained in:
@@ -2,258 +2,273 @@
|
||||
phase: 05-cloud-storage-backends
|
||||
reviewed: 2026-05-30T00:00:00Z
|
||||
depth: standard
|
||||
files_reviewed: 14
|
||||
files_reviewed: 6
|
||||
files_reviewed_list:
|
||||
- backend/api/documents.py
|
||||
- backend/api/admin.py
|
||||
- backend/api/cloud.py
|
||||
- backend/tasks/document_tasks.py
|
||||
- backend/api/documents.py
|
||||
- docker-compose.yml
|
||||
- frontend/src/views/CloudStorageView.vue
|
||||
- backend/tests/test_cloud.py
|
||||
- backend/tests/test_admin_api.py
|
||||
- backend/tests/test_classifier.py
|
||||
- frontend/src/api/client.js
|
||||
- frontend/src/components/admin/AdminUsersTab.vue
|
||||
- frontend/src/components/cloud/CloudCredentialModal.vue
|
||||
- frontend/src/components/documents/DocumentPreviewModal.vue
|
||||
- frontend/src/components/settings/SettingsCloudTab.vue
|
||||
- frontend/src/components/ui/ConfirmBlock.vue
|
||||
- frontend/src/views/DocumentView.vue
|
||||
- backend/tests/test_documents.py
|
||||
findings:
|
||||
critical: 5
|
||||
warning: 6
|
||||
info: 3
|
||||
total: 14
|
||||
critical: 3
|
||||
warning: 4
|
||||
info: 2
|
||||
total: 9
|
||||
status: issues_found
|
||||
---
|
||||
|
||||
# Phase 05: Code Review Report
|
||||
# Phase 05 Plan 12: Code Review Report
|
||||
|
||||
**Reviewed:** 2026-05-30
|
||||
**Reviewed:** 2026-05-30T00:00:00Z
|
||||
**Depth:** standard
|
||||
**Files Reviewed:** 14
|
||||
**Files Reviewed:** 6
|
||||
**Status:** issues_found
|
||||
|
||||
## Summary
|
||||
|
||||
This review covers the gap-closure plans 05-09, 05-10, and 05-11. The changes add a `PATCH /api/documents/{id}` endpoint for filename/folder rename, make the Celery re-analyze task cloud-aware, replace unauthenticated iframe src with a fetch+Blob URL flow, change `oauth_initiate` to return JSON instead of a 302 redirect, add WebDAV/Nextcloud edit support, add an admin user hard-delete with password confirmation, and small UI fixes (ConfirmBlock break-words, Edit button on ERROR-state connections).
|
||||
This review covers the plan 05-12 gap-closure changes: OAuth pre-flight config validation added
|
||||
to `oauth_initiate`, a broad `except Exception → 502` fallback added after the
|
||||
`except CloudConnectionError → 503` clause in `stream_document_content`, a Celery worker source
|
||||
volume mount added to `docker-compose.yml`, an upload-hint paragraph added to
|
||||
`CloudStorageView.vue`, two new pre-flight tests in `test_cloud.py`, and one new 502-path test
|
||||
in `test_documents.py`.
|
||||
|
||||
The security posture of the major new features is reasonable. However there are five blocker-class issues: two request-body smuggling paths, one timing-attack on admin password verification, one URL-object leak in DocumentView, and a missing folder-ownership check in the new PATCH endpoint. Several warnings around input validation and error handling are also present.
|
||||
Three critical issues were found. The most impactful is that the broad `except Exception` clause
|
||||
added to `stream_document_content` unconditionally swallows `HTTPException` raised by
|
||||
`get_storage_backend_for_document`, converting a correct 503 "reconnect" response into a
|
||||
misleading 502 "unreachable" response. The second critical issue is that Redis state tokens are
|
||||
written to Redis before the new pre-flight check runs, leaving one orphan state entry per
|
||||
rejected OAuth initiation request. The third is that the Celery worker container is missing the
|
||||
`CLOUD_CREDS_KEY` environment variable, which causes silent use of the fallback default key
|
||||
`"CHANGEME-32-bytes-padded!!"` for cloud-document credential decryption, making every
|
||||
extract-and-classify Celery task for cloud documents fail at runtime.
|
||||
|
||||
## Narrative Findings (AI reviewer)
|
||||
|
||||
---
|
||||
|
||||
## Critical Issues
|
||||
|
||||
### CR-01: `DELETE /api/admin/users/{id}` body parsed from JSON but HTTP spec makes DELETE bodies unreliable — and FastAPI maps it as a query-param model, not a body, causing 422 in some clients
|
||||
### CR-01: `except Exception` in `stream_document_content` swallows `HTTPException` from `get_storage_backend_for_document`
|
||||
|
||||
**File:** `backend/api/admin.py:480-503`
|
||||
**File:** `backend/api/documents.py:751-763`
|
||||
|
||||
**Issue:** The `delete_user` handler declares `body: UserDeleteConfirm` as a plain positional parameter alongside `user_id: uuid.UUID`. FastAPI treats a Pydantic model on a DELETE handler as a **request body**, which is correct, but many HTTP clients (including some proxies and the `httpx` test client's `.delete()` shorthand) strip the body from DELETE requests per RFC 7231. The test at `test_admin_api.py:410` uses `client.delete(...)` with no body and asserts 422 — that part is fine. But `test_delete_user_correct_password` uses `client.request("DELETE", ..., json=...)` which explicitly sends a body. The problem is: the `admin_password` field is never validated for minimum length or content — a zero-length string `""` passes Pydantic validation and reaches `verify_password("", hash)` where Argon2 will evaluate it (returning False for a wrong hash, which is correct), but the absence of any length/non-empty guard means the error path returns `403` which subtly leaks that the endpoint exists and expects a password. More critically: **the constant-time comparison requirement from CLAUDE.md is met by `verify_password` (Argon2 is inherently constant-time for hashing), but the `admin_password` field has no `min_length=1` constraint**, so an empty string body produces a full Argon2 hash evaluation rather than an early reject.
|
||||
**Issue:** `get_storage_backend_for_document` (in `storage/__init__.py:100-103`) raises
|
||||
`HTTPException(503, "Cloud connection not found or inactive")` when no active `CloudConnection`
|
||||
row exists for the document's provider. `HTTPException` is a subclass of `Exception`
|
||||
(confirmed: `starlette.exceptions.HTTPException → Exception → BaseException`), so the new
|
||||
`except Exception as exc` block on line 759 catches it and re-raises it wrapped in a new
|
||||
`HTTPException(502, "Cloud backend unreachable …")`.
|
||||
|
||||
The bigger issue: there is **no rate limiting** on this endpoint. An attacker who has obtained an admin JWT can brute-force the admin's password via repeated DELETE calls. CLAUDE.md requires rate limiting on all auth-adjacent endpoints.
|
||||
The caller receives a misleading 502 status and a "backend unreachable" message when the real
|
||||
problem is that the cloud connection was deleted or set to `REQUIRES_REAUTH`. The correct 503
|
||||
with the reconnect prompt is silently suppressed.
|
||||
|
||||
**Fix:** Add `min_length=1` to `UserDeleteConfirm.admin_password` and ensure rate limiting middleware covers this endpoint:
|
||||
The new test `test_stream_document_content_cloud_backend_error` (test_documents.py:598-632) only
|
||||
exercises the `RuntimeError` path by monkeypatching `get_storage_backend_for_document` to raise
|
||||
a `RuntimeError`. It does not test the path where `get_storage_backend_for_document` raises
|
||||
`HTTPException(503)`, so this regression is undetected by the test suite.
|
||||
|
||||
**Fix:** Re-order the `except` clauses to explicitly re-raise `HTTPException` before the broad
|
||||
catch catches it:
|
||||
|
||||
```python
|
||||
class UserDeleteConfirm(BaseModel):
|
||||
admin_password: str = Field(..., min_length=1, max_length=1024)
|
||||
try:
|
||||
storage_backend = await get_storage_backend_for_document(doc, current_user, session)
|
||||
file_bytes = await storage_backend.get_object(doc.object_key)
|
||||
except CloudConnectionError as exc:
|
||||
raise HTTPException(
|
||||
status_code=503,
|
||||
detail="Cloud connection requires re-authentication. Please reconnect in Settings.",
|
||||
) from exc
|
||||
except HTTPException:
|
||||
raise # propagate 503 from get_storage_backend_for_document unchanged
|
||||
except Exception as exc:
|
||||
raise HTTPException(
|
||||
status_code=502,
|
||||
detail="Cloud backend unreachable. Please try again or reconnect in Settings.",
|
||||
) from exc
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### CR-02: `PATCH /api/documents/{doc_id}` does not validate folder ownership — a user can move a document into another user's folder
|
||||
### CR-02: Redis OAuth state token written before pre-flight check — orphan Redis entries created on every rejected request
|
||||
|
||||
**File:** `backend/api/documents.py:546-588`
|
||||
**File:** `backend/api/cloud.py:342-357`
|
||||
|
||||
**Issue:** The new `patch_document` handler validates document ownership (`doc.user_id != current_user.id`) but when `folder_id` is provided it sets `doc.folder_id = body.folder_id` without verifying that the target folder belongs to `current_user.id`. This is a cross-user data placement bug: a user who guesses or enumerates another user's folder UUID can move their own document into that folder, causing it to appear in the victim's folder listing.
|
||||
**Issue:** In `oauth_initiate`, `redis_client.setex(f"oauth_state:{state_token}", 1800, …)` is
|
||||
called on line 344, persisting a 30-minute Redis entry, before the provider-config pre-flight
|
||||
checks on lines 348-357. When `google_client_id` or `onedrive_client_id` is empty, the function
|
||||
raises `HTTPException(400)` and the state token is never consumed or deleted. Every rejected call
|
||||
leaves one orphan Redis key with an 1800-second TTL.
|
||||
|
||||
The existing `PATCH /api/documents/{id}/folder` endpoint in `backend/api/folders.py` does perform this check (lines ~479-488). The new `patch_document` bypasses that validation entirely.
|
||||
In a misconfigured deployment (where OAuth credentials are not set), every authenticated user
|
||||
clicking "Connect" generates a Redis key that is never reclaimed except by TTL expiry. Beyond
|
||||
memory waste, an orphan state token created before the rejection could theoretically be captured
|
||||
from server logs or monitoring and submitted to the callback endpoint if credentials are later
|
||||
configured — allowing a replay of a stale initiation.
|
||||
|
||||
**Fix:** Add a folder ownership assertion before setting `doc.folder_id`:
|
||||
The two new tests (`test_oauth_initiate_google_drive_not_configured`,
|
||||
`test_oauth_initiate_onedrive_not_configured`) verify the 400 response but do not assert that
|
||||
the `FakeRedis._store` is empty, so the leak is undetected.
|
||||
|
||||
**Fix:** Move all pre-flight checks above the Redis write:
|
||||
|
||||
```python
|
||||
if "folder_id" in body.model_fields_set and body.folder_id is not None:
|
||||
from db.models import Folder # noqa: PLC0415
|
||||
target_folder = await session.get(Folder, body.folder_id)
|
||||
if target_folder is None or target_folder.user_id != current_user.id:
|
||||
raise HTTPException(404, "Folder not found")
|
||||
doc.folder_id = body.folder_id
|
||||
elif "folder_id" in body.model_fields_set:
|
||||
doc.folder_id = None # move to root
|
||||
@router.get("/oauth/initiate/{provider}")
|
||||
async def oauth_initiate(provider: str, request: Request,
|
||||
current_user: User = Depends(get_regular_user)) -> dict:
|
||||
if provider not in VALID_OAUTH_PROVIDERS:
|
||||
raise HTTPException(status_code=400, detail=f"Unsupported OAuth provider: {provider}.")
|
||||
|
||||
# Pre-flight BEFORE touching Redis
|
||||
if provider == "google_drive" and (not settings.google_client_id or not settings.google_client_secret):
|
||||
raise HTTPException(status_code=400, detail="…Set GOOGLE_CLIENT_ID and GOOGLE_CLIENT_SECRET…")
|
||||
if provider == "onedrive" and (not settings.onedrive_client_id or not settings.onedrive_client_secret):
|
||||
raise HTTPException(status_code=400, detail="…Set ONEDRIVE_CLIENT_ID, ONEDRIVE_CLIENT_SECRET…")
|
||||
|
||||
state_token = secrets.token_urlsafe(32)
|
||||
redis_client = request.app.state.redis
|
||||
await redis_client.setex(f"oauth_state:{state_token}", 1800, str(current_user.id))
|
||||
…
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### CR-03: `PATCH /api/documents/{doc_id}` accepts an empty string filename — corrupts the document record
|
||||
### CR-03: `celery-worker` missing `CLOUD_CREDS_KEY` — cloud document processing silently uses wrong decryption key
|
||||
|
||||
**File:** `backend/api/documents.py:576-577`
|
||||
**File:** `docker-compose.yml:81-102`
|
||||
|
||||
**Issue:** The `filename` field in `DocumentPatch` is `Optional[str] = None`. The handler applies the update when `body.filename is not None`, but an empty string `""` passes that check. A `PATCH {"filename": ""}` will persist an empty filename to the database, which breaks display, download headers (`Content-Disposition: inline; filename=""`), and any downstream filename-based logic.
|
||||
**Issue:** The `celery-worker` service environment block (lines 83-90) does not include
|
||||
`CLOUD_CREDS_KEY`. Without this variable, `settings.cloud_creds_key` falls back to the default
|
||||
`"CHANGEME-32-bytes-padded!!"` (config.py:61). The Celery task `_run` in
|
||||
`tasks/document_tasks.py` calls `get_storage_backend_for_document`, which calls
|
||||
`decrypt_credentials(settings.cloud_creds_key.encode(), str(user.id), conn.credentials_enc)`.
|
||||
HKDF key derivation will silently use the wrong master key, Fernet will raise
|
||||
`InvalidToken`, and the task returns `{"status": "extract_failed", "error": "retrieval failed: …"}`.
|
||||
There is no startup-time validation; the failure only surfaces on the first cloud document
|
||||
task execution.
|
||||
|
||||
Additionally, filenames with path separators (e.g. `"../../etc/passwd"`) are accepted without sanitization. While the filename is only stored in the DB (not used for file system paths), it does appear verbatim in the `Content-Disposition` header at `backend/api/documents.py:754`, which can produce a malformed or injection-capable header value.
|
||||
The `backend` service correctly receives `SECRET_KEY` (line 64) and would receive `CLOUD_CREDS_KEY`
|
||||
from the environment, but the `celery-worker` service does not pass either.
|
||||
|
||||
**Fix:**
|
||||
|
||||
```python
|
||||
if "filename" in body.model_fields_set:
|
||||
if body.filename is None or not body.filename.strip():
|
||||
raise HTTPException(422, "filename must be a non-empty string")
|
||||
# Strip path separators — filename is display-only, not a path
|
||||
doc.filename = Path(body.filename).name or body.filename
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### CR-04: `fetchDocumentContent` in `client.js` does not check non-401 error responses — callers receive a non-`ok` Response silently
|
||||
|
||||
**File:** `frontend/src/api/client.js:399-425`
|
||||
|
||||
**Issue:** `fetchDocumentContent` deliberately does not call `res.json()` (it returns the raw `Response` for the caller to `.blob()`). However it also does not throw on non-401, non-ok responses — it returns the raw `Response` regardless of status. The caller in `DocumentPreviewModal.vue:93` checks `if (!res.ok)` correctly. But the caller in `DocumentView.vue:169-179` also checks `if (!res.ok)` and only `console.error`s — it swallows the error silently and returns without user feedback.
|
||||
|
||||
More critically: the function handles `401` with a retry, but **a 403, 404, or 503 response is returned to the caller as a `Response` object without throwing**. If a future caller forgets the `res.ok` check (which `request()` does automatically), it will attempt to call `.blob()` on an error response, producing a confusing Blob containing the JSON error body rather than document bytes.
|
||||
|
||||
**Fix:** Throw on non-auth error responses, consistent with `request()`:
|
||||
|
||||
```javascript
|
||||
export async function fetchDocumentContent(docId, options = {}) {
|
||||
// ... (existing auth + fetch code) ...
|
||||
|
||||
if (!res.ok && res.status !== 401) {
|
||||
const msg = `HTTP ${res.status}`
|
||||
const err = new Error(msg)
|
||||
err.status = res.status
|
||||
throw err
|
||||
}
|
||||
|
||||
if (res.status === 401 && !options._retry) {
|
||||
// ... existing retry logic ...
|
||||
}
|
||||
|
||||
return res
|
||||
}
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### CR-05: `DocumentView.vue` leaks a blob object URL when opening PDFs in a new tab — the 60-second revoke timer is unreliable
|
||||
|
||||
**File:** `frontend/src/views/DocumentView.vue:172-182`
|
||||
|
||||
**Issue:** In `openPdf()` (new-tab path), a `URL.createObjectURL(blob)` URL is created, `window.open`ed, and then revoked after a `setTimeout(..., 60000)`. This has two problems:
|
||||
|
||||
1. **Memory leak vector:** If the user navigates away from `DocumentView` before 60 seconds, the timeout still fires against the detached window context. More importantly, if `window.open` is blocked by a popup blocker, the object URL is never opened but the timer still runs — the 60-second window holds the blob in memory unnecessarily.
|
||||
2. **Race condition:** Some browsers begin loading the new tab asynchronously; 60 seconds may not be enough for large PDFs over slow connections, causing the tab to show a broken preview mid-load.
|
||||
|
||||
This is a correctness/reliability issue rather than pure performance, because the revoked URL can leave the new tab with a broken blank page.
|
||||
|
||||
**Fix:** Use a longer TTL (e.g., 5 minutes) or defer revocation using the `window.open` return value's `onload` event — but as a minimum, guard the open call:
|
||||
|
||||
```javascript
|
||||
const win = window.open(objectUrl, '_blank')
|
||||
if (!win) {
|
||||
// Popup blocked — revoke immediately
|
||||
URL.revokeObjectURL(objectUrl)
|
||||
} else {
|
||||
setTimeout(() => URL.revokeObjectURL(objectUrl), 300_000) // 5 min
|
||||
}
|
||||
```yaml
|
||||
celery-worker:
|
||||
environment:
|
||||
- DATABASE_URL=${DATABASE_URL}
|
||||
- MINIO_ENDPOINT=${MINIO_ENDPOINT}
|
||||
- MINIO_ACCESS_KEY=${MINIO_ACCESS_KEY}
|
||||
- MINIO_SECRET_KEY=${MINIO_SECRET_KEY}
|
||||
- MINIO_BUCKET=${MINIO_BUCKET}
|
||||
- REDIS_URL=${REDIS_URL}
|
||||
- CLOUD_CREDS_KEY=${CLOUD_CREDS_KEY} # required for cloud document credential decryption
|
||||
- PYTHONDONTWRITEBYTECODE=1
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Warnings
|
||||
|
||||
### WR-01: `_call_cloud_op` commits the session inside a helper, but the session is owned by the caller — double-commit risk
|
||||
### WR-01: `update_default_storage` accepts arbitrary string as `backend` value — no server-side allowlist
|
||||
|
||||
**File:** `backend/api/cloud.py:116-133`
|
||||
**File:** `backend/api/cloud.py:922-941`
|
||||
|
||||
**Issue:** `_call_cloud_op` calls `await session.commit()` on the session passed in by the caller (at lines 116, 133, 148, 165). The caller (e.g., `list_cloud_folders`) does not commit after calling `_call_cloud_op`. This pattern is fragile: if the caller adds objects to the session after `_call_cloud_op` commits, those will be committed in a separate implicit transaction, potentially leaving the session in an inconsistent state. More importantly, `list_cloud_folders` at line 757 does not call `_call_cloud_op` at all — it calls the fetch functions directly. The commit calls inside `_call_cloud_op` are therefore only triggered on retry paths, making the commit responsibility asymmetric and hard to audit.
|
||||
**Issue:** `PATCH /api/users/me/default-storage` accepts `{"backend": "<any string>"}` and
|
||||
writes it directly to `user.default_storage_backend` without validation against an allowlist.
|
||||
The docstring notes "validated by the frontend dropdown," which is a client-side-only control
|
||||
trivially bypassed. A user can persist any string (e.g., `"../../etc"`, unsupported provider
|
||||
slug, or an empty string) to the DB column, potentially causing downstream handler errors when
|
||||
the value is used for routing.
|
||||
|
||||
**Fix:** Establish a clear ownership rule: either `_call_cloud_op` owns the commit (and callers must not commit), or callers own the commit (and `_call_cloud_op` only flushes). Document this contract explicitly in the docstring.
|
||||
**Fix:** Add a `field_validator` to `DefaultStorageRequest`:
|
||||
|
||||
---
|
||||
```python
|
||||
_VALID_BACKENDS = frozenset({"minio", "google_drive", "onedrive", "nextcloud", "webdav"})
|
||||
|
||||
### WR-02: `CloudCredentialModal.vue` — edit mode submits with an empty password, which the backend rejects without clear user feedback
|
||||
class DefaultStorageRequest(BaseModel):
|
||||
backend: str
|
||||
|
||||
**File:** `frontend/src/components/cloud/CloudCredentialModal.vue:304-322`
|
||||
|
||||
**Issue:** The modal comment at line 311-313 explicitly acknowledges this problem: "If password is empty on edit, the server will reject." The `submit()` function sends `password.value` which may be empty if the user chose not to change it. The backend's `connect_webdav` endpoint always requires the `password` field (it upserts the full credential set). When the user clicks "Save changes" without entering a new password, the call will fail with a validation error, but the displayed error message is the raw backend error rather than a clear "Please re-enter your password to save changes" message.
|
||||
|
||||
The code comment itself says "Future enhancement: PATCH endpoint that accepts partial updates" — but shipping with a known broken flow is a user-facing defect.
|
||||
|
||||
**Fix:** Add client-side validation in `submit()` for the edit case:
|
||||
|
||||
```javascript
|
||||
async function submit() {
|
||||
connectError.value = ''
|
||||
if (props.existing && !password.value) {
|
||||
connectError.value = 'Please enter your password to save changes.'
|
||||
return
|
||||
}
|
||||
// ... rest of submit
|
||||
}
|
||||
@field_validator("backend")
|
||||
@classmethod
|
||||
def backend_must_be_valid(cls, v: str) -> str:
|
||||
if v not in _VALID_BACKENDS:
|
||||
raise ValueError(f"backend must be one of {sorted(_VALID_BACKENDS)}")
|
||||
return v
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### WR-03: `adminDeleteUser` in `client.js` sends `admin_password` in a JSON body on a DELETE request — body may be stripped by intermediaries
|
||||
### WR-02: Pre-flight check for OneDrive omits `onedrive_tenant_id` validation despite advertising it in the error message
|
||||
|
||||
**File:** `frontend/src/api/client.js:280-286`
|
||||
**File:** `backend/api/cloud.py:353-357`
|
||||
|
||||
**Issue:** HTTP DELETE requests with a body are technically valid but controversial. Some reverse proxies (nginx, AWS ALB) and CDN configurations strip or reject DELETE request bodies. The `admin_password` credential would then arrive at FastAPI as an empty/missing body, producing a 422, which could be confused with a Pydantic validation failure rather than a transport issue. CLAUDE.md mandates no plaintext secrets in transit beyond TLS, which is met here, but the transport reliability is not.
|
||||
|
||||
**Fix:** Consider changing the endpoint to `POST /api/admin/users/{id}/delete` with a JSON body, or accept the password as a header (e.g., `X-Admin-Password`) with a note that headers are also stripped by some proxies. A `POST` endpoint is the most reliable approach and keeps the credential in the body where TLS protects it.
|
||||
|
||||
---
|
||||
|
||||
### WR-04: `generateRandomPassword` in `AdminUsersTab.vue` appends a fixed suffix `"A1!"` — reducing entropy for the last 3 characters
|
||||
|
||||
**File:** `frontend/src/components/admin/AdminUsersTab.vue:291-301`
|
||||
|
||||
**Issue:** The password generator creates 16 random bytes mapped to a charset, then replaces the last 4 characters with `"A1!"` (3 fixed characters appended after slicing to 12). This means the last 3 characters of every generated password are always `"A1!"` — deterministic, not random. A 15-character password has its last 3 characters known to any attacker aware of this implementation. The effective entropy is 12 characters from the charset, not 15. The function is also missing a `handle` field — the email split at line 336 may produce an empty handle if the email starts with `@`.
|
||||
**Issue:** The OneDrive pre-flight guard checks only `onedrive_client_id` and
|
||||
`onedrive_client_secret`. Its error detail tells the operator to set `ONEDRIVE_TENANT_ID`, but
|
||||
the code never checks whether `settings.onedrive_tenant_id` is empty. The default value is
|
||||
`"common"` (config.py:67), so this is rarely a problem in practice. However, if someone
|
||||
explicitly sets `ONEDRIVE_TENANT_ID=""`, the MSAL authority URL becomes
|
||||
`https://login.microsoftonline.com//oauth2/v2.0/token`, producing an MSAL runtime error after
|
||||
the pre-flight is supposed to have caught the misconfiguration.
|
||||
|
||||
**Fix:**
|
||||
|
||||
```javascript
|
||||
function generateRandomPassword() {
|
||||
const upper = 'ABCDEFGHJKLMNPQRSTUVWXYZ'
|
||||
const lower = 'abcdefghijkmnpqrstuvwxyz'
|
||||
const digits = '23456789'
|
||||
const special = '!@#$%^&*'
|
||||
const all = upper + lower + digits + special
|
||||
const arr = new Uint8Array(16)
|
||||
crypto.getRandomValues(arr)
|
||||
|
||||
// Guarantee character class coverage using first 4 bytes
|
||||
let pw = [
|
||||
upper[arr[0] % upper.length],
|
||||
lower[arr[1] % lower.length],
|
||||
digits[arr[2] % digits.length],
|
||||
special[arr[3] % special.length],
|
||||
]
|
||||
for (let i = 4; i < 16; i++) {
|
||||
pw.push(all[arr[i] % all.length])
|
||||
}
|
||||
// Fisher-Yates shuffle
|
||||
for (let i = pw.length - 1; i > 0; i--) {
|
||||
const j = arr[i] % (i + 1)
|
||||
;[pw[i], pw[j]] = [pw[j], pw[i]]
|
||||
}
|
||||
return pw.join('')
|
||||
}
|
||||
```python
|
||||
if provider == "onedrive" and (
|
||||
not settings.onedrive_client_id
|
||||
or not settings.onedrive_client_secret
|
||||
or not settings.onedrive_tenant_id
|
||||
):
|
||||
raise HTTPException(
|
||||
status_code=400,
|
||||
detail="OneDrive OAuth is not configured. Set ONEDRIVE_CLIENT_ID, ONEDRIVE_CLIENT_SECRET, and ONEDRIVE_TENANT_ID.",
|
||||
)
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### WR-05: `oauth_callback` in `cloud.py` leaks exception messages into redirect URLs
|
||||
### WR-03: New pre-flight tests do not assert Redis state is clean after a 400 response
|
||||
|
||||
**File:** `backend/api/cloud.py:525-530`
|
||||
**File:** `backend/tests/test_cloud.py:784-835`
|
||||
|
||||
**Issue:** The outer `except Exception as exc` block at line 525 passes `str(exc)` directly into a redirect URL via `urllib.parse.quote(error_msg)`. This means internal exception messages — including potentially stack traces from libraries, token values from MSAL error responses, or internal server details — are passed to the frontend as query parameters in the redirect. The error message from `ValueError(f"Token exchange failed: {result.get('error_description', result['error'])}")` (line 493) includes the provider's raw `error_description` which may contain OAuth scopes, client IDs, or internal identifiers.
|
||||
**Issue:** `test_oauth_initiate_google_drive_not_configured` and
|
||||
`test_oauth_initiate_onedrive_not_configured` both set up a `FakeRedis`, call the endpoint
|
||||
expecting a 400, and reset `app.state.redis = None`. Neither test asserts that
|
||||
`fake_redis._store` is empty after the call. Because the state token is currently written before
|
||||
the pre-flight check (CR-02 above), a check like this would fail today — confirming the bug.
|
||||
When CR-02 is fixed, adding the assertion hardens the test against regressions:
|
||||
|
||||
**Fix:** Sanitize or categorize errors before inclusion in the redirect:
|
||||
```python
|
||||
# After the status assert, add:
|
||||
assert len(fake_redis._store) == 0, (
|
||||
"No OAuth state should be stored in Redis when pre-flight validation fails"
|
||||
)
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### WR-04: `oauth_callback` reflects raw OAuth provider `error` parameter and internal exception messages into redirect URL
|
||||
|
||||
**File:** `backend/api/cloud.py:427-428` and `537-539`
|
||||
|
||||
**Issue:** `error_param` from the query string is embedded verbatim into a `ValueError` message
|
||||
(`f"OAuth provider returned error: {error_param}"`), which flows into `str(exc)` and is passed
|
||||
to `urllib.parse.quote` before appearing as `?cloud_error=…` in the redirect. The URL encoding
|
||||
prevents injection in the query string. However:
|
||||
|
||||
1. A malicious or compromised OAuth provider can inject arbitrary text into the user-visible
|
||||
error banner with no server-side length cap or character filter.
|
||||
2. The outer `except Exception` block at line 536 passes `str(exc)` for all internal errors,
|
||||
which may include stack trace fragments, OAuth client IDs, or token values from provider
|
||||
error responses (e.g., `ValueError(f"Token exchange failed: {result.get('error_description', result['error'])}")`
|
||||
at line 504 — `error_description` is provider-controlled).
|
||||
|
||||
**Fix:** Cap the length and filter the error message before reflecting it:
|
||||
|
||||
```python
|
||||
except Exception as exc:
|
||||
# Log the full error internally; expose only a safe generic message
|
||||
import logging
|
||||
logging.getLogger(__name__).error("OAuth callback error: %s", exc)
|
||||
error_msg = "OAuth connection failed. Please try again."
|
||||
@@ -265,74 +280,50 @@ except Exception as exc:
|
||||
|
||||
---
|
||||
|
||||
### WR-06: `test_invalid_grant_sets_requires_reauth` test does not actually verify the DB state transition it claims to test
|
||||
|
||||
**File:** `backend/tests/test_cloud.py:424-498`
|
||||
|
||||
**Issue:** The test name and docstring promise to verify "BOTH HTTP 503 response AND DB state update." However, lines 489-498 contain a comment explicitly conceding that the DB state is NOT verified by this test because the monkeypatch bypasses `_call_cloud_op`. The test asserts only the HTTP 503. The comment says "The DB transition is covered by the cloud.py unit tests" — but no such unit test exists in the reviewed files. This leaves the `conn.status = "REQUIRES_REAUTH"` path in `_call_cloud_op` untested by the test suite.
|
||||
|
||||
**Fix:** Either (a) add a separate unit test for `_call_cloud_op` that verifies the DB status transition, or (b) restructure `test_invalid_grant_sets_requires_reauth` to use the real `_call_cloud_op` path and assert the DB state. At minimum, remove the misleading docstring claim about verifying DB state.
|
||||
|
||||
---
|
||||
|
||||
## Info
|
||||
|
||||
### IN-01: `moveDocument` in `client.js` calls a non-existent endpoint — dead code
|
||||
### IN-01: `CloudStorageView.vue` does not fetch connections on mount — direct navigation shows stale empty state
|
||||
|
||||
**File:** `frontend/src/api/client.js:321-327`
|
||||
**File:** `frontend/src/views/CloudStorageView.vue:61-93`
|
||||
|
||||
**Issue:** `moveDocument(docId, folderId)` targets `PATCH /api/documents/{docId}/folder`. That endpoint is defined in `backend/api/folders.py` (not `documents.py`). The new `PATCH /api/documents/{doc_id}` endpoint added in plan 05-09 also accepts `folder_id`. There are now two client-side functions (`moveDocument` via `/folder` and the new `patch_document` path via `PATCH /documents/{id}`) that both accomplish folder moves, but through different backend endpoints. This duplication creates confusion about which to use. If `moveDocument` is the legacy function that should be superseded, it should be removed or deprecated with a clear comment.
|
||||
**Issue:** The component reads `cloudStore.connections` and `cloudStore.loading` reactively but
|
||||
never calls `cloudStore.fetchConnections()` (or equivalent) in an `onMounted` hook. If a user
|
||||
navigates directly to `/cloud` without first visiting a page that pre-populates the store, the
|
||||
component renders the "No cloud storage connected" empty state without fetching live data. This
|
||||
is a reliability gap for direct navigation and deep-link scenarios.
|
||||
|
||||
**Fix:** Add `onMounted`:
|
||||
|
||||
```javascript
|
||||
import { computed, onMounted } from 'vue'
|
||||
onMounted(() => { cloudStore.fetchConnections?.() })
|
||||
```
|
||||
|
||||
or document explicitly that the parent layout is responsible for pre-fetching.
|
||||
|
||||
---
|
||||
|
||||
### IN-02: `classify_document` in `documents.py` uses a mutable default argument `body: dict = {}`
|
||||
### IN-02: `classify_document` uses mutable default argument `body: dict = {}`
|
||||
|
||||
**File:** `backend/api/documents.py:648`
|
||||
**File:** `backend/api/documents.py:657`
|
||||
|
||||
**Issue:** `body: dict = {}` is a mutable default argument in a Python function — a classic Python footgun. In normal Python functions this causes state sharing between calls, but FastAPI reconstructs default parameter values per request for `Body` parameters, so this is unlikely to cause the classic bug in practice. However it is still a code smell that will flag in linters and misleads readers. FastAPI's idiomatic approach is `body: dict = Body(default={})` or a dedicated Pydantic model.
|
||||
|
||||
**Fix:**
|
||||
**Issue:** `body: dict = {}` is the classic Python mutable-default-argument anti-pattern.
|
||||
FastAPI reconstructs body parameters per request so the classic shared-state bug does not
|
||||
manifest in production, but static analysis tools (ruff B006, mypy) flag it, and calling the
|
||||
function directly from tests with no `body` argument risks state sharing if the function is
|
||||
ever modified. Use `None` as the sentinel:
|
||||
|
||||
```python
|
||||
from fastapi import Body
|
||||
async def classify_document(
|
||||
doc_id: str,
|
||||
body: dict = Body(default={}),
|
||||
...
|
||||
body: Optional[dict] = None,
|
||||
…
|
||||
):
|
||||
topic_names = body.get("topics") if body else None
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### IN-03: `SettingsCloudTab.vue` — `oauthError` banner is shown inside a `v-else` that is mutually exclusive with `store.loading` but not with the provider list
|
||||
|
||||
**File:** `frontend/src/components/settings/SettingsCloudTab.vue:23`
|
||||
|
||||
**Issue:** The template structure is:
|
||||
|
||||
```html
|
||||
<div v-if="store.loading">Loading...</div>
|
||||
<div v-if="oauthError">error banner</div> <!-- NOT v-else-if -->
|
||||
<div v-else class="divide-y ..."> <!-- this v-else pairs with oauthError -->
|
||||
provider list
|
||||
</div>
|
||||
```
|
||||
|
||||
The `v-else` on the provider list div pairs with the `oauthError` `v-if`, not with `store.loading`. This means:
|
||||
- When `store.loading` is true AND `oauthError` is set, both the loading indicator AND the error banner are shown (the provider list is hidden — this is actually correct by accident).
|
||||
- When `store.loading` is true AND `oauthError` is empty, the loading indicator is shown AND the provider list is also shown (because `v-else` on the list fires when `oauthError` is falsy — regardless of `store.loading`).
|
||||
|
||||
The loading state and provider list are not mutually exclusive. Fix by using a proper conditional chain:
|
||||
|
||||
```html
|
||||
<div v-if="store.loading">Loading...</div>
|
||||
<template v-else>
|
||||
<div v-if="oauthError" ...>error banner</div>
|
||||
<div class="divide-y ...">provider list</div>
|
||||
</template>
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
_Reviewed: 2026-05-30_
|
||||
_Reviewed: 2026-05-30T00:00:00Z_
|
||||
_Reviewer: Claude (gsd-code-reviewer)_
|
||||
_Depth: standard_
|
||||
|
||||
Reference in New Issue
Block a user