From 52d6efb8a222ec4e17793b1a55449b9371183137 Mon Sep 17 00:00:00 2001 From: curo1305 Date: Sun, 31 May 2026 20:23:32 +0200 Subject: [PATCH] docs(06.2): add code review report --- .../06.2-REVIEW.md | 431 +++++++++++------- 1 file changed, 275 insertions(+), 156 deletions(-) diff --git a/.planning/phases/06.2-close-v1-sharing-cloud-delete-csv-export-gaps/06.2-REVIEW.md b/.planning/phases/06.2-close-v1-sharing-cloud-delete-csv-export-gaps/06.2-REVIEW.md index ef28809..20b7d7d 100644 --- a/.planning/phases/06.2-close-v1-sharing-cloud-delete-csv-export-gaps/06.2-REVIEW.md +++ b/.planning/phases/06.2-close-v1-sharing-cloud-delete-csv-export-gaps/06.2-REVIEW.md @@ -2,19 +2,22 @@ phase: 06.2-close-v1-sharing-cloud-delete-csv-export-gaps reviewed: 2026-05-31T00:00:00Z depth: standard -files_reviewed: 6 +files_reviewed: 9 files_reviewed_list: - backend/api/audit.py - backend/tests/test_audit.py - backend/tests/test_documents.py - backend/tests/test_shares.py - frontend/src/api/client.js + - frontend/src/components/admin/AdminUsersTab.vue - frontend/src/components/admin/AuditLogTab.vue + - frontend/src/views/AccountView.vue + - frontend/src/views/CloudFolderView.vue findings: - critical: 2 - warning: 5 - info: 3 - total: 10 + critical: 4 + warning: 6 + info: 4 + total: 14 status: issues_found --- @@ -22,78 +25,66 @@ status: issues_found **Reviewed:** 2026-05-31T00:00:00Z **Depth:** standard -**Files Reviewed:** 6 +**Files Reviewed:** 9 **Status:** issues_found ## Summary -This phase closes sharing, cloud-delete, CSV-export, and daily-audit-export gaps. The backend -endpoint logic is structurally sound (parameterized queries, admin-only guards, proper path-traversal -regex on the date parameter). However, two correctness bugs are present that will silently corrupt or -truncate data in production, and five quality/robustness issues need addressing before the phase can -be marked complete. No hardcoded secrets were found and all admin guards are in place. +This phase closes v1 gaps across document sharing, cloud delete, and audit/CSV-export +functionality. The backend audit API is well-structured with appropriate admin guards, +parameterized queries, and a whitelist serializer that correctly prevents sensitive field +leakage. The share and document test coverage is comprehensive. However, four critical +bugs were found that either break features silently in production or expose runtime +crashes, plus six warnings covering security hygiene and robustness. --- ## Critical Issues -### CR-01: CSV export writes raw Python `dict` repr for `metadata_` — not JSON +### CR-01: AuditLogTab event_type filter always returns zero results — feature non-functional -**File:** `backend/api/audit.py:372` +**File:** `frontend/src/components/admin/AuditLogTab.vue:37-41` -**Issue:** `csv.DictWriter.writerow()` receives the dict returned by -`_audit_to_dict_with_handles()`. That dict contains `"metadata_": entry.metadata_`, where -`entry.metadata_` is a Python `dict` (SQLAlchemy deserializes `JSONB` columns into native Python -objects). `csv.DictWriter` calls `str()` on the value, producing `{'size_bytes': 100}` — -Python repr syntax with single quotes — not valid JSON. Any downstream consumer that tries to -`JSON.parse()` the metadata cell will fail. The same issue exists in the empty-handle early-return -path's header-only CSV, but that branch writes no data rows so the corruption is less visible there. +**Issue:** The filter dropdown submits bare category prefixes (`"auth"`, `"document"`, +`"folder"`, `"share"`, `"admin"`) to the backend. The backend applies exact-equality +filtering (`AuditLog.event_type == event_type` — `backend/api/audit.py:120, 159, 284`). +Every actual event type in the codebase uses dot-notation: +`"auth.login"`, `"auth.logout"`, `"document.uploaded"`, `"document.deleted"`, +`"share.granted"`, `"share.revoked"`, `"admin.user_created"`, `"admin.quota_changed"`, etc. -Proof (reproduced): -```python -import csv, io -output = io.StringIO() -writer = csv.DictWriter(output, fieldnames=["metadata_"]) -writer.writeheader() -writer.writerow({"metadata_": {"size_bytes": 100}}) -print(output.getvalue()) -# metadata_ -# {'size_bytes': 100} ← Python repr, NOT JSON -``` +An exact match of `"auth"` against `"auth.login"` never fires. Selecting any category +from the dropdown silently returns an empty list instead of filtered results — the entire +filter feature is non-functional. -**Fix:** Serialize `metadata_` to JSON in the dict helper before the CSV writer sees it, or only -at the CSV call site: +**Fix:** Change the backend filter from exact equality to a prefix `LIKE` match (most +flexible) or update the dropdown to send exact event-type strings: ```python -import json - -# Option A — at writerow call site (surgical fix): -for row in rows: - entry, user_handle_val, actor_handle_val = row[0], row[1], row[2] - record = _audit_to_dict_with_handles(entry, user_handle_val, actor_handle_val) - record["metadata_"] = json.dumps(record["metadata_"]) if record["metadata_"] is not None else "" - writer.writerow(record) +# backend/api/audit.py — at all three filter call sites (lines 120, 159, 284): +if event_type is not None: + q = q.where(AuditLog.event_type.like(f"{event_type}%")) ``` -The `test_audit_log_export_csv` test does not assert the content of the metadata cell, so this -bug currently passes all tests despite corrupting the exported data. +```python +# Also fix the count_q inline block in list_audit_log (lines 283-284): +if event_type is not None: + count_q = count_q.where(AuditLog.event_type.like(f"{event_type}%")) +``` --- -### CR-02: `download_daily_export` silently crashes on non-MinIO backends — no 404, no guard +### CR-02: `download_daily_export` accesses `backend._client` without a MinIOBackend guard — AttributeError crash on non-MinIO deployments -**File:** `backend/api/audit.py:219-239` +**File:** `backend/api/audit.py:219-228` -**Issue:** `list_daily_exports` (line 182) correctly guards with -`if not isinstance(backend, MinIOBackend): return {"items": []}` and returns an empty list for -non-MinIO deployments. But `download_daily_export` has **no such guard**. If `get_storage_backend()` -returns anything other than a `MinIOBackend` (e.g. a local filesystem backend), line 223 calls -`backend._client.get_object(...)`, which will raise `AttributeError: '...' object has no attribute -'_client'`. The broad `except Exception` on line 231 catches this and re-raises it as a 404, but -only because the error happens inside `_get()`. If the backend raises during construction of `_get` -or if `backend` is `None`, the error propagates as an unhandled 500. More importantly, any new -backend type that happens to have a `_client` attribute with a different meaning would silently -produce wrong results. +**Issue:** `list_daily_exports` (line 182) correctly checks +`isinstance(backend, MinIOBackend)` and returns an empty list for non-MinIO storage. +`download_daily_export` (line 219) calls `get_storage_backend()` and immediately uses +`backend._client` with no type guard. On deployments where the primary storage backend +is Google Drive, OneDrive, or Nextcloud, `backend._client` does not exist on those +classes, raising `AttributeError`. The broad `except Exception` on line 232 catches +this and re-raises it as a 404, masking a real misconfiguration entirely — the admin +sees "Export not found" when the real problem is a storage type mismatch. **Fix:** Mirror the guard from `list_daily_exports`: @@ -107,7 +98,7 @@ async def download_daily_export( raise HTTPException(status_code=404, detail="Invalid date format") backend = get_storage_backend() - if not isinstance(backend, MinIOBackend): # <-- add this guard + if not isinstance(backend, MinIOBackend): # <-- add this guard raise HTTPException(status_code=404, detail="Export not found") key = f"audit-logs/{date}.csv" @@ -116,87 +107,169 @@ async def download_daily_export( --- -## Warnings +### CR-03: CSV export writes raw Python `dict` repr for `metadata_` — not valid JSON -### WR-01: `export_audit_log` loads the entire audit log into memory before streaming +**File:** `backend/api/audit.py:372` -**File:** `backend/api/audit.py:351-377` - -**Issue:** The export endpoint issues `result = await session.execute(q)` followed by -`rows = result.all()`, then builds the full CSV in an `io.StringIO` before returning it in a -`StreamingResponse`. For large deployments the audit log may contain millions of rows; this will -exhaust server memory and cause an OOM kill rather than a graceful error. `StreamingResponse` is -the correct mechanism but it is not being streamed — the CSV is fully materialized first. - -**Fix:** Use a generator that yields CSV chunks from a cursor: +**Issue:** `csv.DictWriter.writerow()` receives the dict returned by +`_audit_to_dict_with_handles()`, which includes `"metadata_": entry.metadata_` where +`entry.metadata_` is a Python `dict` (SQLAlchemy deserializes `JSONB` columns to native +Python objects). `csv.DictWriter` calls `str()` on values it cannot serialize natively, +producing Python repr syntax with single quotes (`{'size_bytes': 100}`) rather than +valid JSON (`{"size_bytes": 100}`). Any downstream system that tries to parse the +`metadata_` CSV column as JSON will fail. +Reproduced: ```python -async def _generate_csv(session, q, fields): - output = io.StringIO() - writer = csv.DictWriter(output, fieldnames=fields) - writer.writeheader() - yield output.getvalue() - - async for row in await session.stream(q): - output = io.StringIO() - entry, user_handle_val, actor_handle_val = row[0], row[1], row[2] - record = _audit_to_dict_with_handles(entry, user_handle_val, actor_handle_val) - record["metadata_"] = json.dumps(record["metadata_"]) if record["metadata_"] is not None else "" - writer = csv.DictWriter(output, fieldnames=fields) - writer.writerow(record) - yield output.getvalue() - -return StreamingResponse(_generate_csv(session, q, fields), media_type="text/csv", ...) +import csv, io +output = io.StringIO() +writer = csv.DictWriter(output, fieldnames=["metadata_"]) +writer.writeheader() +writer.writerow({"metadata_": {"size_bytes": 100}}) +# output: "{'size_bytes': 100}" ← Python repr, NOT JSON ``` -### WR-02: `format` query parameter is accepted but ignored — always returns CSV +**Fix:** +```python +import json + +# In export_audit_log(), at the writerow call site (line 372): +for row in rows: + entry, user_handle_val, actor_handle_val = row[0], row[1], row[2] + record = _audit_to_dict_with_handles(entry, user_handle_val, actor_handle_val) + record["metadata_"] = json.dumps(record["metadata_"]) if record["metadata_"] is not None else "" + writer.writerow(record) +``` + +The `test_audit_log_export_csv` test does not assert the content of the metadata cell +so this bug currently passes all tests while corrupting exported data. + +--- + +### CR-04: Three admin fetch functions lack 401-refresh-retry — silently fail on token expiry + +**File:** `frontend/src/api/client.js:398-483` + +**Issue:** `adminExportAuditLogCsv`, `adminListDailyExports`, and +`adminDownloadDailyExport` all use raw `fetch()` calls without the 401-refresh-then-retry +logic that `request()` (line 27-30) and `fetchDocumentContent()` (lines 520-529) both +implement. When the 15-minute access token expires while an admin is on the audit log +page, all three functions immediately throw (e.g. `Error("Export failed: 401")`) with +no session recovery. The user's session is still valid (they have a live refresh token) +but the export/list operations fail hard. The auth store is not cleared, so the user +cannot tell why the operation failed. + +Note: `adminListDailyExports` returns JSON and could be routed through `request()` +directly, which already has the retry logic. + +**Fix:** +```js +// Simplest fix for adminListDailyExports — route through request() which handles 401 retry: +export function adminListDailyExports() { + return request('/api/admin/audit-log/daily-exports') +} + +// For CSV-returning functions, add the same retry block as fetchDocumentContent: +// In adminExportAuditLogCsv and adminDownloadDailyExport, replace: +// if (!res.ok) throw new Error(...) +// with: + if (res.status === 401) { + try { + await authStore.refresh() + return adminExportAuditLogCsv(params) // retry + } catch { + authStore.accessToken = null + authStore.user = null + throw new Error('Session expired') + } + } + if (!res.ok) throw new Error(`Export failed: ${res.status}`) +``` + +--- + +## Warnings + +### WR-01: Temporary password generation has a fixed suffix that leaks 3 bytes of entropy + +**File:** `frontend/src/components/admin/AdminUsersTab.vue:301` + +**Issue:** `generateRandomPassword()` generates 16 bytes of random data mapped through +a 65-character charset, then unconditionally overwrites the last 4 characters with the +literal `'A1!'`: +```js +pw = pw.slice(0, 12) + 'A1!' +``` +The generated password is always exactly 15 characters with the last 3 bytes fixed +(`A`, `1`, `!`). An attacker who reverse-engineers this algorithm can reduce the +brute-force search space significantly — the last 3 positions are always known. There is +also modulo bias: `256 % 65 = 61`, meaning the first 61 characters of the charset are +slightly overrepresented (each maps from 4 byte values instead of 3). While these +temporary passwords must be changed on first login, they are the sole credential +protecting accounts between creation and first use. + +**Fix:** +```js +function generateRandomPassword() { + const upper = 'ABCDEFGHJKLMNPQRSTUVWXYZ' + const lower = 'abcdefghijkmnpqrstuvwxyz' + const digits = '23456789' + const special = '!@#$%^&*' + const all = upper + lower + digits + special // 58 chars — 256 % 58 = 24, bias reduced + const arr = new Uint32Array(16) + crypto.getRandomValues(arr) + let chars = Array.from(arr.slice(0, 12), v => all[v % all.length]) + // Guarantee one char from each required class by replacing 4 specific positions + const positions = new Uint32Array(4) + crypto.getRandomValues(positions) + chars[positions[0] % 12] = upper[positions[0] % upper.length] + chars[positions[1] % 12] = lower[positions[1] % lower.length] + chars[positions[2] % 12] = digits[positions[2] % digits.length] + chars[positions[3] % 12] = special[positions[3] % special.length] + return chars.join('') +} +``` + +--- + +### WR-02: `format` query parameter is accepted but ignored — always returns CSV regardless **File:** `backend/api/audit.py:313` -**Issue:** The `export_audit_log` endpoint declares `format: str = Query(default="csv")` with a -`# noqa: A002` suppression but never reads the `format` variable in the handler body. Passing -`?format=json` returns a CSV response silently. This is a logic dead-end: the parameter is -either unused dead code or is intended as a future extension. If it is dead code it should be -removed; if it is a future hook, it must at minimum validate that only `"csv"` is accepted and -return 422 for other values. Currently an arbitrary string is accepted with no effect. +**Issue:** The `export_audit_log` endpoint declares `format: str = Query(default="csv")` +but the variable `format` is never read in the handler body. Passing `?format=json` +silently returns a CSV response. The parameter is either dead code or a future extension +stub. Either way the current behaviour is incorrect: the API contract implies multiple +formats are accepted when only CSV exists. Any string passes FastAPI's validation, +meaning callers receive misleading 200 OK responses for unsupported format values. -**Fix (remove the dead parameter):** +**Fix (if only CSV will ever be supported):** Remove the parameter entirely. + +**Fix (if future extension is planned):** Validate at the handler level: ```python -@router.get("/audit-log/export") -async def export_audit_log( - start: Optional[datetime] = Query(default=None), - end: Optional[datetime] = Query(default=None), - user_handle: Optional[str] = Query(default=None), - event_type: Optional[str] = Query(default=None), - # Remove: format: str = Query(default="csv"), - session: AsyncSession = Depends(get_db), - _admin: User = Depends(get_current_admin), -) -> StreamingResponse: +from typing import Literal +format: Literal["csv"] = Query(default="csv"), # noqa: A002 ``` -**Fix (validate if keeping):** -```python - if format not in ("csv",): - raise HTTPException(status_code=422, detail="Unsupported format. Only 'csv' is accepted.") -``` +--- -### WR-03: Pagination "Next" button in `AuditLogTab.vue` disables on the wrong condition +### WR-03: Pagination "Next" button uses wrong heuristic — breaks on last page when total is exact multiple of page size -**File:** `frontend/src/components/admin/AuditLogTab.vue:122` +**File:** `frontend/src/components/admin/AuditLogTab.vue:137,266` -**Issue:** The "Next" button is disabled when `entries.length < perPage`. This is a heuristic: -it assumes the last page has fewer entries than `perPage`. But if the total count is an exact -multiple of `perPage`, the last page has exactly `perPage` entries and the "Next" button remains -enabled. Clicking it fetches a page beyond the last, returns an empty items list, and leaves the -user on a blank page with no way to go back (the empty-entries check at line 73 replaces the -table with the "No entries" message, but the page counter still shows the incremented value). -The `total` ref is populated from `data.total` (line 212) — it should be used instead. +**Issue:** The "Next" button is disabled when `entries.value.length < perPage` (line 137 +in template, line 266 in script). If the last page contains exactly `perPage` (50) +entries, the button stays enabled. Clicking it dispatches a page request that returns an +empty items list, leaving the user on a blank page with the page counter incremented. The +`total` ref is populated from `data.total` (line 227) but is never used for pagination. **Fix:** -```javascript -// Replace line 122: +```html + :disabled="page * perPage >= total" -// And in nextPage(): +``` +```js +// Script, nextPage(): function nextPage() { if (page.value * perPage < total.value) { page.value++ @@ -205,44 +278,71 @@ function nextPage() { } ``` -### WR-04: `loadDailyExports` swallows errors silently — no user-visible feedback +--- -**File:** `frontend/src/components/admin/AuditLogTab.vue:256-266` +### WR-04: `loadDailyExports` swallows errors silently — admin sees "No daily exports" instead of an error -**Issue:** The `loadDailyExports` catch block sets `dailyExports.value = []` but does **not** set -`exportsError.value`. The template renders `

` for the download error, but -the load-failure case is silently dropped. An admin whose MinIO is misconfigured or unreachable -sees "No daily exports available" — identical to the normal empty-bucket state — with no -indication that an error occurred. +**File:** `frontend/src/components/admin/AuditLogTab.vue:289-299` + +**Issue:** The `loadDailyExports` catch block sets `dailyExports.value = []` but does +not set `exportsError.value`. The template renders `

` for +download errors, but load failures produce no user-visible message. An admin whose MinIO +is misconfigured sees "No daily exports available" — identical to a legitimately empty +bucket — with no indication that an error occurred. **Fix:** -```javascript +```js } catch (e) { dailyExports.value = [] - exportsError.value = 'Failed to load daily exports. Please try again.' // add this line - } finally { + exportsError.value = 'Failed to load daily exports. Please try again.' // add + } ``` -### WR-05: `listShares` in `client.js` uses string interpolation instead of `URLSearchParams` +--- + +### WR-05: `URL.revokeObjectURL` called synchronously before browser initiates download — potential silent cancellation + +**File:** `frontend/src/api/client.js:425-426, 481-482` + +**Issue:** In both `adminExportAuditLogCsv` and `adminDownloadDailyExport`: +```js +a.click() +URL.revokeObjectURL(url) // immediate — race condition +``` +`a.click()` triggers the download asynchronously. Revoking the blob URL before the +browser has transferred it to the OS download manager can silently cancel the download +on slow or memory-constrained devices (Chrome has a documented race here; this is a +known pattern issue). The anchor element is also not appended to the DOM (`document.body.appendChild(a)`) +which causes the download to fail entirely in some Firefox versions. + +**Fix:** +```js +document.body.appendChild(a) +a.click() +document.body.removeChild(a) +setTimeout(() => URL.revokeObjectURL(url), 1000) +``` + +--- + +### WR-06: `listShares` uses raw string interpolation in query string — inconsistent and fragile **File:** `frontend/src/api/client.js:353` **Issue:** -```javascript +```js export function listShares(docId) { return request(`/api/shares?document_id=${docId}`) } ``` -`docId` is always a UUID from the DB so the practical risk is low, but the pattern is inconsistent -with the rest of the file (compare `listDocuments`, `adminListAuditLog`, and `listFolders` which -all use `URLSearchParams`). If a caller ever passes a non-UUID value (e.g. a stringified object -from a Vue ref), this will silently produce a malformed URL without any encoding. Unlike the other -route-param-interpolation cases in this file (`/documents/${id}`, `/folders/${folderId}`) — where -the value is an ID used in a path segment — this one places an unencoded value into a query -parameter, making URL injection more likely. +`docId` is always a UUID in normal use, but it is not encoded via `URLSearchParams`. +All other query-parameter functions in this file (`listDocuments`, `adminListAuditLog`, +`listFolders`) use `URLSearchParams` consistently. While the practical risk from a UUID +is low, the pattern is inconsistent and breaks if the value ever contains special +characters. **Fix:** -```javascript +```js export function listShares(docId) { const params = new URLSearchParams({ document_id: docId }) return request(`/api/shares?${params}`) @@ -253,46 +353,65 @@ export function listShares(docId) { ## Info -### IN-01: Unused `import re` in `test_documents.py` +### IN-01: `_build_filtered_query` is defined but never called — dead code + +**File:** `backend/api/audit.py:97-121` + +**Issue:** `_build_filtered_query` is documented as providing the COUNT query helper +(to avoid multi-column JOIN ambiguity), but `list_audit_log` (lines 276-284) manually +re-implements identical filter logic inline without calling this helper. The function +is never referenced anywhere in the file. It is dead code that will confuse maintainers. + +**Fix:** Either delete `_build_filtered_query` or refactor the inline count query in +`list_audit_log` to use it. + +--- + +### IN-02: Unused `import re` in `test_documents.py` **File:** `backend/tests/test_documents.py:9` -**Issue:** `import re` is present at the top of the file but `re.` is never called anywhere in -the module. This is dead code left over from a previous version of the test. +**Issue:** `import re` is present at line 9 but `re.` is never used anywhere in the +module. Dead import from a previous version of the test file. **Fix:** Remove `import re` from line 9. -### IN-02: `deleteDocumentRemoveOnly` is a trivial wrapper that duplicates `deleteDocument(id, true)` +--- + +### IN-03: `deleteDocumentRemoveOnly` is a trivial one-liner wrapper — unnecessary exported symbol **File:** `frontend/src/api/client.js:80-82` **Issue:** -```javascript +```js export function deleteDocumentRemoveOnly(id) { return deleteDocument(id, true) } ``` -This function is a one-liner alias that adds no value beyond calling `deleteDocument(id, true)` -directly. It creates two public API symbols for the same operation, which will confuse future -callers about which to use. The caller in `DocumentView.vue` already uses this alias, but the -wrapper itself adds surface area without benefit. +This adds a second exported name for `deleteDocument(id, true)` with no additional +behaviour. Having two symbols for the same operation creates confusion about which to +use. The wrapper itself provides no documentation, type safety, or default-arg isolation. -**Fix:** Either remove the wrapper and update `DocumentView.vue` to call -`deleteDocument(doc.value.id, true)` directly, or keep it but document why the alias exists. +**Fix:** Remove the wrapper and call `deleteDocument(id, true)` directly at callsites, +or keep only the wrapper and make `removeOnly` an implementation detail. -### IN-03: `test_delete_cloud_remove_only` does not assert quota is unchanged +--- + +### IN-04: `test_delete_cloud_remove_only` does not assert quota is unchanged after DB-only removal **File:** `backend/tests/test_documents.py:897-925` -**Issue:** The test verifies the DB row is removed but does not check that the user's -`used_bytes` quota was NOT decremented (which it should not be — cloud docs are not quota-tracked -per T-06.2-03-01). The `Quota` model is imported but never queried in this test. A future -regression that incorrectly decrements quota on `remove_only` paths would go undetected. +**Issue:** The test verifies the DB row is removed but does not verify that the user's +`used_bytes` quota was not decremented. Cloud documents are not quota-tracked, so a +future regression that incorrectly decrements quota on `remove_only` paths would go +undetected by this test. -**Fix:** Add a quota assertion: +**Fix:** ```python - # Quota must remain unchanged — cloud docs are not quota-tracked + # Add after the deleted-row assertion: + from db.models import Quota quota = await db_session.get(Quota, auth_user["user"].id) + # Cloud docs are not quota-tracked — used_bytes must remain 0 assert quota.used_bytes == 0, ( f"remove_only should not decrement quota, got used_bytes={quota.used_bytes}" )