docs(06.2): add code review report

This commit is contained in:
curo1305
2026-05-31 20:23:32 +02:00
parent 33697f2713
commit 52d6efb8a2
@@ -2,19 +2,22 @@
phase: 06.2-close-v1-sharing-cloud-delete-csv-export-gaps
reviewed: 2026-05-31T00:00:00Z
depth: standard
files_reviewed: 6
files_reviewed: 9
files_reviewed_list:
- backend/api/audit.py
- backend/tests/test_audit.py
- backend/tests/test_documents.py
- backend/tests/test_shares.py
- frontend/src/api/client.js
- frontend/src/components/admin/AdminUsersTab.vue
- frontend/src/components/admin/AuditLogTab.vue
- frontend/src/views/AccountView.vue
- frontend/src/views/CloudFolderView.vue
findings:
critical: 2
warning: 5
info: 3
total: 10
critical: 4
warning: 6
info: 4
total: 14
status: issues_found
---
@@ -22,78 +25,66 @@ status: issues_found
**Reviewed:** 2026-05-31T00:00:00Z
**Depth:** standard
**Files Reviewed:** 6
**Files Reviewed:** 9
**Status:** issues_found
## Summary
This phase closes sharing, cloud-delete, CSV-export, and daily-audit-export gaps. The backend
endpoint logic is structurally sound (parameterized queries, admin-only guards, proper path-traversal
regex on the date parameter). However, two correctness bugs are present that will silently corrupt or
truncate data in production, and five quality/robustness issues need addressing before the phase can
be marked complete. No hardcoded secrets were found and all admin guards are in place.
This phase closes v1 gaps across document sharing, cloud delete, and audit/CSV-export
functionality. The backend audit API is well-structured with appropriate admin guards,
parameterized queries, and a whitelist serializer that correctly prevents sensitive field
leakage. The share and document test coverage is comprehensive. However, four critical
bugs were found that either break features silently in production or expose runtime
crashes, plus six warnings covering security hygiene and robustness.
---
## Critical Issues
### CR-01: CSV export writes raw Python `dict` repr for `metadata_` — not JSON
### CR-01: AuditLogTab event_type filter always returns zero results — feature non-functional
**File:** `backend/api/audit.py:372`
**File:** `frontend/src/components/admin/AuditLogTab.vue:37-41`
**Issue:** `csv.DictWriter.writerow()` receives the dict returned by
`_audit_to_dict_with_handles()`. That dict contains `"metadata_": entry.metadata_`, where
`entry.metadata_` is a Python `dict` (SQLAlchemy deserializes `JSONB` columns into native Python
objects). `csv.DictWriter` calls `str()` on the value, producing `{'size_bytes': 100}`
Python repr syntax with single quotes — not valid JSON. Any downstream consumer that tries to
`JSON.parse()` the metadata cell will fail. The same issue exists in the empty-handle early-return
path's header-only CSV, but that branch writes no data rows so the corruption is less visible there.
**Issue:** The filter dropdown submits bare category prefixes (`"auth"`, `"document"`,
`"folder"`, `"share"`, `"admin"`) to the backend. The backend applies exact-equality
filtering (`AuditLog.event_type == event_type``backend/api/audit.py:120, 159, 284`).
Every actual event type in the codebase uses dot-notation:
`"auth.login"`, `"auth.logout"`, `"document.uploaded"`, `"document.deleted"`,
`"share.granted"`, `"share.revoked"`, `"admin.user_created"`, `"admin.quota_changed"`, etc.
Proof (reproduced):
```python
import csv, io
output = io.StringIO()
writer = csv.DictWriter(output, fieldnames=["metadata_"])
writer.writeheader()
writer.writerow({"metadata_": {"size_bytes": 100}})
print(output.getvalue())
# metadata_
# {'size_bytes': 100} ← Python repr, NOT JSON
```
An exact match of `"auth"` against `"auth.login"` never fires. Selecting any category
from the dropdown silently returns an empty list instead of filtered results — the entire
filter feature is non-functional.
**Fix:** Serialize `metadata_` to JSON in the dict helper before the CSV writer sees it, or only
at the CSV call site:
**Fix:** Change the backend filter from exact equality to a prefix `LIKE` match (most
flexible) or update the dropdown to send exact event-type strings:
```python
import json
# Option A — at writerow call site (surgical fix):
for row in rows:
entry, user_handle_val, actor_handle_val = row[0], row[1], row[2]
record = _audit_to_dict_with_handles(entry, user_handle_val, actor_handle_val)
record["metadata_"] = json.dumps(record["metadata_"]) if record["metadata_"] is not None else ""
writer.writerow(record)
# backend/api/audit.py — at all three filter call sites (lines 120, 159, 284):
if event_type is not None:
q = q.where(AuditLog.event_type.like(f"{event_type}%"))
```
The `test_audit_log_export_csv` test does not assert the content of the metadata cell, so this
bug currently passes all tests despite corrupting the exported data.
```python
# Also fix the count_q inline block in list_audit_log (lines 283-284):
if event_type is not None:
count_q = count_q.where(AuditLog.event_type.like(f"{event_type}%"))
```
---
### CR-02: `download_daily_export` silently crashes on non-MinIO backends — no 404, no guard
### CR-02: `download_daily_export` accesses `backend._client` without a MinIOBackend guard — AttributeError crash on non-MinIO deployments
**File:** `backend/api/audit.py:219-239`
**File:** `backend/api/audit.py:219-228`
**Issue:** `list_daily_exports` (line 182) correctly guards with
`if not isinstance(backend, MinIOBackend): return {"items": []}` and returns an empty list for
non-MinIO deployments. But `download_daily_export` has **no such guard**. If `get_storage_backend()`
returns anything other than a `MinIOBackend` (e.g. a local filesystem backend), line 223 calls
`backend._client.get_object(...)`, which will raise `AttributeError: '...' object has no attribute
'_client'`. The broad `except Exception` on line 231 catches this and re-raises it as a 404, but
only because the error happens inside `_get()`. If the backend raises during construction of `_get`
or if `backend` is `None`, the error propagates as an unhandled 500. More importantly, any new
backend type that happens to have a `_client` attribute with a different meaning would silently
produce wrong results.
**Issue:** `list_daily_exports` (line 182) correctly checks
`isinstance(backend, MinIOBackend)` and returns an empty list for non-MinIO storage.
`download_daily_export` (line 219) calls `get_storage_backend()` and immediately uses
`backend._client` with no type guard. On deployments where the primary storage backend
is Google Drive, OneDrive, or Nextcloud, `backend._client` does not exist on those
classes, raising `AttributeError`. The broad `except Exception` on line 232 catches
this and re-raises it as a 404, masking a real misconfiguration entirely — the admin
sees "Export not found" when the real problem is a storage type mismatch.
**Fix:** Mirror the guard from `list_daily_exports`:
@@ -107,7 +98,7 @@ async def download_daily_export(
raise HTTPException(status_code=404, detail="Invalid date format")
backend = get_storage_backend()
if not isinstance(backend, MinIOBackend): # <-- add this guard
if not isinstance(backend, MinIOBackend): # <-- add this guard
raise HTTPException(status_code=404, detail="Export not found")
key = f"audit-logs/{date}.csv"
@@ -116,87 +107,169 @@ async def download_daily_export(
---
## Warnings
### CR-03: CSV export writes raw Python `dict` repr for `metadata_` — not valid JSON
### WR-01: `export_audit_log` loads the entire audit log into memory before streaming
**File:** `backend/api/audit.py:372`
**File:** `backend/api/audit.py:351-377`
**Issue:** The export endpoint issues `result = await session.execute(q)` followed by
`rows = result.all()`, then builds the full CSV in an `io.StringIO` before returning it in a
`StreamingResponse`. For large deployments the audit log may contain millions of rows; this will
exhaust server memory and cause an OOM kill rather than a graceful error. `StreamingResponse` is
the correct mechanism but it is not being streamed — the CSV is fully materialized first.
**Fix:** Use a generator that yields CSV chunks from a cursor:
**Issue:** `csv.DictWriter.writerow()` receives the dict returned by
`_audit_to_dict_with_handles()`, which includes `"metadata_": entry.metadata_` where
`entry.metadata_` is a Python `dict` (SQLAlchemy deserializes `JSONB` columns to native
Python objects). `csv.DictWriter` calls `str()` on values it cannot serialize natively,
producing Python repr syntax with single quotes (`{'size_bytes': 100}`) rather than
valid JSON (`{"size_bytes": 100}`). Any downstream system that tries to parse the
`metadata_` CSV column as JSON will fail.
Reproduced:
```python
async def _generate_csv(session, q, fields):
output = io.StringIO()
writer = csv.DictWriter(output, fieldnames=fields)
writer.writeheader()
yield output.getvalue()
async for row in await session.stream(q):
output = io.StringIO()
entry, user_handle_val, actor_handle_val = row[0], row[1], row[2]
record = _audit_to_dict_with_handles(entry, user_handle_val, actor_handle_val)
record["metadata_"] = json.dumps(record["metadata_"]) if record["metadata_"] is not None else ""
writer = csv.DictWriter(output, fieldnames=fields)
writer.writerow(record)
yield output.getvalue()
return StreamingResponse(_generate_csv(session, q, fields), media_type="text/csv", ...)
import csv, io
output = io.StringIO()
writer = csv.DictWriter(output, fieldnames=["metadata_"])
writer.writeheader()
writer.writerow({"metadata_": {"size_bytes": 100}})
# output: "{'size_bytes': 100}" ← Python repr, NOT JSON
```
### WR-02: `format` query parameter is accepted but ignored — always returns CSV
**Fix:**
```python
import json
# In export_audit_log(), at the writerow call site (line 372):
for row in rows:
entry, user_handle_val, actor_handle_val = row[0], row[1], row[2]
record = _audit_to_dict_with_handles(entry, user_handle_val, actor_handle_val)
record["metadata_"] = json.dumps(record["metadata_"]) if record["metadata_"] is not None else ""
writer.writerow(record)
```
The `test_audit_log_export_csv` test does not assert the content of the metadata cell
so this bug currently passes all tests while corrupting exported data.
---
### CR-04: Three admin fetch functions lack 401-refresh-retry — silently fail on token expiry
**File:** `frontend/src/api/client.js:398-483`
**Issue:** `adminExportAuditLogCsv`, `adminListDailyExports`, and
`adminDownloadDailyExport` all use raw `fetch()` calls without the 401-refresh-then-retry
logic that `request()` (line 27-30) and `fetchDocumentContent()` (lines 520-529) both
implement. When the 15-minute access token expires while an admin is on the audit log
page, all three functions immediately throw (e.g. `Error("Export failed: 401")`) with
no session recovery. The user's session is still valid (they have a live refresh token)
but the export/list operations fail hard. The auth store is not cleared, so the user
cannot tell why the operation failed.
Note: `adminListDailyExports` returns JSON and could be routed through `request()`
directly, which already has the retry logic.
**Fix:**
```js
// Simplest fix for adminListDailyExports — route through request() which handles 401 retry:
export function adminListDailyExports() {
return request('/api/admin/audit-log/daily-exports')
}
// For CSV-returning functions, add the same retry block as fetchDocumentContent:
// In adminExportAuditLogCsv and adminDownloadDailyExport, replace:
// if (!res.ok) throw new Error(...)
// with:
if (res.status === 401) {
try {
await authStore.refresh()
return adminExportAuditLogCsv(params) // retry
} catch {
authStore.accessToken = null
authStore.user = null
throw new Error('Session expired')
}
}
if (!res.ok) throw new Error(`Export failed: ${res.status}`)
```
---
## Warnings
### WR-01: Temporary password generation has a fixed suffix that leaks 3 bytes of entropy
**File:** `frontend/src/components/admin/AdminUsersTab.vue:301`
**Issue:** `generateRandomPassword()` generates 16 bytes of random data mapped through
a 65-character charset, then unconditionally overwrites the last 4 characters with the
literal `'A1!'`:
```js
pw = pw.slice(0, 12) + 'A1!'
```
The generated password is always exactly 15 characters with the last 3 bytes fixed
(`A`, `1`, `!`). An attacker who reverse-engineers this algorithm can reduce the
brute-force search space significantly — the last 3 positions are always known. There is
also modulo bias: `256 % 65 = 61`, meaning the first 61 characters of the charset are
slightly overrepresented (each maps from 4 byte values instead of 3). While these
temporary passwords must be changed on first login, they are the sole credential
protecting accounts between creation and first use.
**Fix:**
```js
function generateRandomPassword() {
const upper = 'ABCDEFGHJKLMNPQRSTUVWXYZ'
const lower = 'abcdefghijkmnpqrstuvwxyz'
const digits = '23456789'
const special = '!@#$%^&*'
const all = upper + lower + digits + special // 58 chars — 256 % 58 = 24, bias reduced
const arr = new Uint32Array(16)
crypto.getRandomValues(arr)
let chars = Array.from(arr.slice(0, 12), v => all[v % all.length])
// Guarantee one char from each required class by replacing 4 specific positions
const positions = new Uint32Array(4)
crypto.getRandomValues(positions)
chars[positions[0] % 12] = upper[positions[0] % upper.length]
chars[positions[1] % 12] = lower[positions[1] % lower.length]
chars[positions[2] % 12] = digits[positions[2] % digits.length]
chars[positions[3] % 12] = special[positions[3] % special.length]
return chars.join('')
}
```
---
### WR-02: `format` query parameter is accepted but ignored — always returns CSV regardless
**File:** `backend/api/audit.py:313`
**Issue:** The `export_audit_log` endpoint declares `format: str = Query(default="csv")` with a
`# noqa: A002` suppression but never reads the `format` variable in the handler body. Passing
`?format=json` returns a CSV response silently. This is a logic dead-end: the parameter is
either unused dead code or is intended as a future extension. If it is dead code it should be
removed; if it is a future hook, it must at minimum validate that only `"csv"` is accepted and
return 422 for other values. Currently an arbitrary string is accepted with no effect.
**Issue:** The `export_audit_log` endpoint declares `format: str = Query(default="csv")`
but the variable `format` is never read in the handler body. Passing `?format=json`
silently returns a CSV response. The parameter is either dead code or a future extension
stub. Either way the current behaviour is incorrect: the API contract implies multiple
formats are accepted when only CSV exists. Any string passes FastAPI's validation,
meaning callers receive misleading 200 OK responses for unsupported format values.
**Fix (remove the dead parameter):**
**Fix (if only CSV will ever be supported):** Remove the parameter entirely.
**Fix (if future extension is planned):** Validate at the handler level:
```python
@router.get("/audit-log/export")
async def export_audit_log(
start: Optional[datetime] = Query(default=None),
end: Optional[datetime] = Query(default=None),
user_handle: Optional[str] = Query(default=None),
event_type: Optional[str] = Query(default=None),
# Remove: format: str = Query(default="csv"),
session: AsyncSession = Depends(get_db),
_admin: User = Depends(get_current_admin),
) -> StreamingResponse:
from typing import Literal
format: Literal["csv"] = Query(default="csv"), # noqa: A002
```
**Fix (validate if keeping):**
```python
if format not in ("csv",):
raise HTTPException(status_code=422, detail="Unsupported format. Only 'csv' is accepted.")
```
---
### WR-03: Pagination "Next" button in `AuditLogTab.vue` disables on the wrong condition
### WR-03: Pagination "Next" button uses wrong heuristic — breaks on last page when total is exact multiple of page size
**File:** `frontend/src/components/admin/AuditLogTab.vue:122`
**File:** `frontend/src/components/admin/AuditLogTab.vue:137,266`
**Issue:** The "Next" button is disabled when `entries.length < perPage`. This is a heuristic:
it assumes the last page has fewer entries than `perPage`. But if the total count is an exact
multiple of `perPage`, the last page has exactly `perPage` entries and the "Next" button remains
enabled. Clicking it fetches a page beyond the last, returns an empty items list, and leaves the
user on a blank page with no way to go back (the empty-entries check at line 73 replaces the
table with the "No entries" message, but the page counter still shows the incremented value).
The `total` ref is populated from `data.total` (line 212) — it should be used instead.
**Issue:** The "Next" button is disabled when `entries.value.length < perPage` (line 137
in template, line 266 in script). If the last page contains exactly `perPage` (50)
entries, the button stays enabled. Clicking it dispatches a page request that returns an
empty items list, leaving the user on a blank page with the page counter incremented. The
`total` ref is populated from `data.total` (line 227) but is never used for pagination.
**Fix:**
```javascript
// Replace line 122:
```html
<!-- Template, line 137: -->
:disabled="page * perPage >= total"
// And in nextPage():
```
```js
// Script, nextPage():
function nextPage() {
if (page.value * perPage < total.value) {
page.value++
@@ -205,44 +278,71 @@ function nextPage() {
}
```
### WR-04: `loadDailyExports` swallows errors silently — no user-visible feedback
---
**File:** `frontend/src/components/admin/AuditLogTab.vue:256-266`
### WR-04: `loadDailyExports` swallows errors silently — admin sees "No daily exports" instead of an error
**Issue:** The `loadDailyExports` catch block sets `dailyExports.value = []` but does **not** set
`exportsError.value`. The template renders `<p v-if="exportsError">` for the download error, but
the load-failure case is silently dropped. An admin whose MinIO is misconfigured or unreachable
sees "No daily exports available" — identical to the normal empty-bucket state — with no
indication that an error occurred.
**File:** `frontend/src/components/admin/AuditLogTab.vue:289-299`
**Issue:** The `loadDailyExports` catch block sets `dailyExports.value = []` but does
not set `exportsError.value`. The template renders `<p v-if="exportsError">` for
download errors, but load failures produce no user-visible message. An admin whose MinIO
is misconfigured sees "No daily exports available" — identical to a legitimately empty
bucket — with no indication that an error occurred.
**Fix:**
```javascript
```js
} catch (e) {
dailyExports.value = []
exportsError.value = 'Failed to load daily exports. Please try again.' // add this line
} finally {
exportsError.value = 'Failed to load daily exports. Please try again.' // add
}
```
### WR-05: `listShares` in `client.js` uses string interpolation instead of `URLSearchParams`
---
### WR-05: `URL.revokeObjectURL` called synchronously before browser initiates download — potential silent cancellation
**File:** `frontend/src/api/client.js:425-426, 481-482`
**Issue:** In both `adminExportAuditLogCsv` and `adminDownloadDailyExport`:
```js
a.click()
URL.revokeObjectURL(url) // immediate — race condition
```
`a.click()` triggers the download asynchronously. Revoking the blob URL before the
browser has transferred it to the OS download manager can silently cancel the download
on slow or memory-constrained devices (Chrome has a documented race here; this is a
known pattern issue). The anchor element is also not appended to the DOM (`document.body.appendChild(a)`)
which causes the download to fail entirely in some Firefox versions.
**Fix:**
```js
document.body.appendChild(a)
a.click()
document.body.removeChild(a)
setTimeout(() => URL.revokeObjectURL(url), 1000)
```
---
### WR-06: `listShares` uses raw string interpolation in query string — inconsistent and fragile
**File:** `frontend/src/api/client.js:353`
**Issue:**
```javascript
```js
export function listShares(docId) {
return request(`/api/shares?document_id=${docId}`)
}
```
`docId` is always a UUID from the DB so the practical risk is low, but the pattern is inconsistent
with the rest of the file (compare `listDocuments`, `adminListAuditLog`, and `listFolders` which
all use `URLSearchParams`). If a caller ever passes a non-UUID value (e.g. a stringified object
from a Vue ref), this will silently produce a malformed URL without any encoding. Unlike the other
route-param-interpolation cases in this file (`/documents/${id}`, `/folders/${folderId}`) — where
the value is an ID used in a path segment — this one places an unencoded value into a query
parameter, making URL injection more likely.
`docId` is always a UUID in normal use, but it is not encoded via `URLSearchParams`.
All other query-parameter functions in this file (`listDocuments`, `adminListAuditLog`,
`listFolders`) use `URLSearchParams` consistently. While the practical risk from a UUID
is low, the pattern is inconsistent and breaks if the value ever contains special
characters.
**Fix:**
```javascript
```js
export function listShares(docId) {
const params = new URLSearchParams({ document_id: docId })
return request(`/api/shares?${params}`)
@@ -253,46 +353,65 @@ export function listShares(docId) {
## Info
### IN-01: Unused `import re` in `test_documents.py`
### IN-01: `_build_filtered_query` is defined but never called — dead code
**File:** `backend/api/audit.py:97-121`
**Issue:** `_build_filtered_query` is documented as providing the COUNT query helper
(to avoid multi-column JOIN ambiguity), but `list_audit_log` (lines 276-284) manually
re-implements identical filter logic inline without calling this helper. The function
is never referenced anywhere in the file. It is dead code that will confuse maintainers.
**Fix:** Either delete `_build_filtered_query` or refactor the inline count query in
`list_audit_log` to use it.
---
### IN-02: Unused `import re` in `test_documents.py`
**File:** `backend/tests/test_documents.py:9`
**Issue:** `import re` is present at the top of the file but `re.` is never called anywhere in
the module. This is dead code left over from a previous version of the test.
**Issue:** `import re` is present at line 9 but `re.` is never used anywhere in the
module. Dead import from a previous version of the test file.
**Fix:** Remove `import re` from line 9.
### IN-02: `deleteDocumentRemoveOnly` is a trivial wrapper that duplicates `deleteDocument(id, true)`
---
### IN-03: `deleteDocumentRemoveOnly` is a trivial one-liner wrapper — unnecessary exported symbol
**File:** `frontend/src/api/client.js:80-82`
**Issue:**
```javascript
```js
export function deleteDocumentRemoveOnly(id) {
return deleteDocument(id, true)
}
```
This function is a one-liner alias that adds no value beyond calling `deleteDocument(id, true)`
directly. It creates two public API symbols for the same operation, which will confuse future
callers about which to use. The caller in `DocumentView.vue` already uses this alias, but the
wrapper itself adds surface area without benefit.
This adds a second exported name for `deleteDocument(id, true)` with no additional
behaviour. Having two symbols for the same operation creates confusion about which to
use. The wrapper itself provides no documentation, type safety, or default-arg isolation.
**Fix:** Either remove the wrapper and update `DocumentView.vue` to call
`deleteDocument(doc.value.id, true)` directly, or keep it but document why the alias exists.
**Fix:** Remove the wrapper and call `deleteDocument(id, true)` directly at callsites,
or keep only the wrapper and make `removeOnly` an implementation detail.
### IN-03: `test_delete_cloud_remove_only` does not assert quota is unchanged
---
### IN-04: `test_delete_cloud_remove_only` does not assert quota is unchanged after DB-only removal
**File:** `backend/tests/test_documents.py:897-925`
**Issue:** The test verifies the DB row is removed but does not check that the user's
`used_bytes` quota was NOT decremented (which it should not be — cloud docs are not quota-tracked
per T-06.2-03-01). The `Quota` model is imported but never queried in this test. A future
regression that incorrectly decrements quota on `remove_only` paths would go undetected.
**Issue:** The test verifies the DB row is removed but does not verify that the user's
`used_bytes` quota was not decremented. Cloud documents are not quota-tracked, so a
future regression that incorrectly decrements quota on `remove_only` paths would go
undetected by this test.
**Fix:** Add a quota assertion:
**Fix:**
```python
# Quota must remain unchanged — cloud docs are not quota-tracked
# Add after the deleted-row assertion:
from db.models import Quota
quota = await db_session.get(Quota, auth_user["user"].id)
# Cloud docs are not quota-tracked — used_bytes must remain 0
assert quota.used_bytes == 0, (
f"remove_only should not decrement quota, got used_bytes={quota.used_bytes}"
)