docs(06.2): add code review report
This commit is contained in:
@@ -0,0 +1,305 @@
|
||||
---
|
||||
phase: 06.2-close-v1-sharing-cloud-delete-csv-export-gaps
|
||||
reviewed: 2026-05-31T00:00:00Z
|
||||
depth: standard
|
||||
files_reviewed: 6
|
||||
files_reviewed_list:
|
||||
- backend/api/audit.py
|
||||
- backend/tests/test_audit.py
|
||||
- backend/tests/test_documents.py
|
||||
- backend/tests/test_shares.py
|
||||
- frontend/src/api/client.js
|
||||
- frontend/src/components/admin/AuditLogTab.vue
|
||||
findings:
|
||||
critical: 2
|
||||
warning: 5
|
||||
info: 3
|
||||
total: 10
|
||||
status: issues_found
|
||||
---
|
||||
|
||||
# Phase 06.2: Code Review Report
|
||||
|
||||
**Reviewed:** 2026-05-31T00:00:00Z
|
||||
**Depth:** standard
|
||||
**Files Reviewed:** 6
|
||||
**Status:** issues_found
|
||||
|
||||
## Summary
|
||||
|
||||
This phase closes sharing, cloud-delete, CSV-export, and daily-audit-export gaps. The backend
|
||||
endpoint logic is structurally sound (parameterized queries, admin-only guards, proper path-traversal
|
||||
regex on the date parameter). However, two correctness bugs are present that will silently corrupt or
|
||||
truncate data in production, and five quality/robustness issues need addressing before the phase can
|
||||
be marked complete. No hardcoded secrets were found and all admin guards are in place.
|
||||
|
||||
---
|
||||
|
||||
## Critical Issues
|
||||
|
||||
### CR-01: CSV export writes raw Python `dict` repr for `metadata_` — not JSON
|
||||
|
||||
**File:** `backend/api/audit.py:372`
|
||||
|
||||
**Issue:** `csv.DictWriter.writerow()` receives the dict returned by
|
||||
`_audit_to_dict_with_handles()`. That dict contains `"metadata_": entry.metadata_`, where
|
||||
`entry.metadata_` is a Python `dict` (SQLAlchemy deserializes `JSONB` columns into native Python
|
||||
objects). `csv.DictWriter` calls `str()` on the value, producing `{'size_bytes': 100}` —
|
||||
Python repr syntax with single quotes — not valid JSON. Any downstream consumer that tries to
|
||||
`JSON.parse()` the metadata cell will fail. The same issue exists in the empty-handle early-return
|
||||
path's header-only CSV, but that branch writes no data rows so the corruption is less visible there.
|
||||
|
||||
Proof (reproduced):
|
||||
```python
|
||||
import csv, io
|
||||
output = io.StringIO()
|
||||
writer = csv.DictWriter(output, fieldnames=["metadata_"])
|
||||
writer.writeheader()
|
||||
writer.writerow({"metadata_": {"size_bytes": 100}})
|
||||
print(output.getvalue())
|
||||
# metadata_
|
||||
# {'size_bytes': 100} ← Python repr, NOT JSON
|
||||
```
|
||||
|
||||
**Fix:** Serialize `metadata_` to JSON in the dict helper before the CSV writer sees it, or only
|
||||
at the CSV call site:
|
||||
|
||||
```python
|
||||
import json
|
||||
|
||||
# Option A — at writerow call site (surgical fix):
|
||||
for row in rows:
|
||||
entry, user_handle_val, actor_handle_val = row[0], row[1], row[2]
|
||||
record = _audit_to_dict_with_handles(entry, user_handle_val, actor_handle_val)
|
||||
record["metadata_"] = json.dumps(record["metadata_"]) if record["metadata_"] is not None else ""
|
||||
writer.writerow(record)
|
||||
```
|
||||
|
||||
The `test_audit_log_export_csv` test does not assert the content of the metadata cell, so this
|
||||
bug currently passes all tests despite corrupting the exported data.
|
||||
|
||||
---
|
||||
|
||||
### CR-02: `download_daily_export` silently crashes on non-MinIO backends — no 404, no guard
|
||||
|
||||
**File:** `backend/api/audit.py:219-239`
|
||||
|
||||
**Issue:** `list_daily_exports` (line 182) correctly guards with
|
||||
`if not isinstance(backend, MinIOBackend): return {"items": []}` and returns an empty list for
|
||||
non-MinIO deployments. But `download_daily_export` has **no such guard**. If `get_storage_backend()`
|
||||
returns anything other than a `MinIOBackend` (e.g. a local filesystem backend), line 223 calls
|
||||
`backend._client.get_object(...)`, which will raise `AttributeError: '...' object has no attribute
|
||||
'_client'`. The broad `except Exception` on line 231 catches this and re-raises it as a 404, but
|
||||
only because the error happens inside `_get()`. If the backend raises during construction of `_get`
|
||||
or if `backend` is `None`, the error propagates as an unhandled 500. More importantly, any new
|
||||
backend type that happens to have a `_client` attribute with a different meaning would silently
|
||||
produce wrong results.
|
||||
|
||||
**Fix:** Mirror the guard from `list_daily_exports`:
|
||||
|
||||
```python
|
||||
@router.get("/audit-log/daily-exports/{date}")
|
||||
async def download_daily_export(
|
||||
date: str,
|
||||
_admin: User = Depends(get_current_admin),
|
||||
) -> StreamingResponse:
|
||||
if not re.fullmatch(r"\d{4}-\d{2}-\d{2}", date):
|
||||
raise HTTPException(status_code=404, detail="Invalid date format")
|
||||
|
||||
backend = get_storage_backend()
|
||||
if not isinstance(backend, MinIOBackend): # <-- add this guard
|
||||
raise HTTPException(status_code=404, detail="Export not found")
|
||||
|
||||
key = f"audit-logs/{date}.csv"
|
||||
...
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Warnings
|
||||
|
||||
### WR-01: `export_audit_log` loads the entire audit log into memory before streaming
|
||||
|
||||
**File:** `backend/api/audit.py:351-377`
|
||||
|
||||
**Issue:** The export endpoint issues `result = await session.execute(q)` followed by
|
||||
`rows = result.all()`, then builds the full CSV in an `io.StringIO` before returning it in a
|
||||
`StreamingResponse`. For large deployments the audit log may contain millions of rows; this will
|
||||
exhaust server memory and cause an OOM kill rather than a graceful error. `StreamingResponse` is
|
||||
the correct mechanism but it is not being streamed — the CSV is fully materialized first.
|
||||
|
||||
**Fix:** Use a generator that yields CSV chunks from a cursor:
|
||||
|
||||
```python
|
||||
async def _generate_csv(session, q, fields):
|
||||
output = io.StringIO()
|
||||
writer = csv.DictWriter(output, fieldnames=fields)
|
||||
writer.writeheader()
|
||||
yield output.getvalue()
|
||||
|
||||
async for row in await session.stream(q):
|
||||
output = io.StringIO()
|
||||
entry, user_handle_val, actor_handle_val = row[0], row[1], row[2]
|
||||
record = _audit_to_dict_with_handles(entry, user_handle_val, actor_handle_val)
|
||||
record["metadata_"] = json.dumps(record["metadata_"]) if record["metadata_"] is not None else ""
|
||||
writer = csv.DictWriter(output, fieldnames=fields)
|
||||
writer.writerow(record)
|
||||
yield output.getvalue()
|
||||
|
||||
return StreamingResponse(_generate_csv(session, q, fields), media_type="text/csv", ...)
|
||||
```
|
||||
|
||||
### WR-02: `format` query parameter is accepted but ignored — always returns CSV
|
||||
|
||||
**File:** `backend/api/audit.py:313`
|
||||
|
||||
**Issue:** The `export_audit_log` endpoint declares `format: str = Query(default="csv")` with a
|
||||
`# noqa: A002` suppression but never reads the `format` variable in the handler body. Passing
|
||||
`?format=json` returns a CSV response silently. This is a logic dead-end: the parameter is
|
||||
either unused dead code or is intended as a future extension. If it is dead code it should be
|
||||
removed; if it is a future hook, it must at minimum validate that only `"csv"` is accepted and
|
||||
return 422 for other values. Currently an arbitrary string is accepted with no effect.
|
||||
|
||||
**Fix (remove the dead parameter):**
|
||||
```python
|
||||
@router.get("/audit-log/export")
|
||||
async def export_audit_log(
|
||||
start: Optional[datetime] = Query(default=None),
|
||||
end: Optional[datetime] = Query(default=None),
|
||||
user_handle: Optional[str] = Query(default=None),
|
||||
event_type: Optional[str] = Query(default=None),
|
||||
# Remove: format: str = Query(default="csv"),
|
||||
session: AsyncSession = Depends(get_db),
|
||||
_admin: User = Depends(get_current_admin),
|
||||
) -> StreamingResponse:
|
||||
```
|
||||
|
||||
**Fix (validate if keeping):**
|
||||
```python
|
||||
if format not in ("csv",):
|
||||
raise HTTPException(status_code=422, detail="Unsupported format. Only 'csv' is accepted.")
|
||||
```
|
||||
|
||||
### WR-03: Pagination "Next" button in `AuditLogTab.vue` disables on the wrong condition
|
||||
|
||||
**File:** `frontend/src/components/admin/AuditLogTab.vue:122`
|
||||
|
||||
**Issue:** The "Next" button is disabled when `entries.length < perPage`. This is a heuristic:
|
||||
it assumes the last page has fewer entries than `perPage`. But if the total count is an exact
|
||||
multiple of `perPage`, the last page has exactly `perPage` entries and the "Next" button remains
|
||||
enabled. Clicking it fetches a page beyond the last, returns an empty items list, and leaves the
|
||||
user on a blank page with no way to go back (the empty-entries check at line 73 replaces the
|
||||
table with the "No entries" message, but the page counter still shows the incremented value).
|
||||
The `total` ref is populated from `data.total` (line 212) — it should be used instead.
|
||||
|
||||
**Fix:**
|
||||
```javascript
|
||||
// Replace line 122:
|
||||
:disabled="page * perPage >= total"
|
||||
// And in nextPage():
|
||||
function nextPage() {
|
||||
if (page.value * perPage < total.value) {
|
||||
page.value++
|
||||
fetchLog()
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### WR-04: `loadDailyExports` swallows errors silently — no user-visible feedback
|
||||
|
||||
**File:** `frontend/src/components/admin/AuditLogTab.vue:256-266`
|
||||
|
||||
**Issue:** The `loadDailyExports` catch block sets `dailyExports.value = []` but does **not** set
|
||||
`exportsError.value`. The template renders `<p v-if="exportsError">` for the download error, but
|
||||
the load-failure case is silently dropped. An admin whose MinIO is misconfigured or unreachable
|
||||
sees "No daily exports available" — identical to the normal empty-bucket state — with no
|
||||
indication that an error occurred.
|
||||
|
||||
**Fix:**
|
||||
```javascript
|
||||
} catch (e) {
|
||||
dailyExports.value = []
|
||||
exportsError.value = 'Failed to load daily exports. Please try again.' // add this line
|
||||
} finally {
|
||||
```
|
||||
|
||||
### WR-05: `listShares` in `client.js` uses string interpolation instead of `URLSearchParams`
|
||||
|
||||
**File:** `frontend/src/api/client.js:353`
|
||||
|
||||
**Issue:**
|
||||
```javascript
|
||||
export function listShares(docId) {
|
||||
return request(`/api/shares?document_id=${docId}`)
|
||||
}
|
||||
```
|
||||
`docId` is always a UUID from the DB so the practical risk is low, but the pattern is inconsistent
|
||||
with the rest of the file (compare `listDocuments`, `adminListAuditLog`, and `listFolders` which
|
||||
all use `URLSearchParams`). If a caller ever passes a non-UUID value (e.g. a stringified object
|
||||
from a Vue ref), this will silently produce a malformed URL without any encoding. Unlike the other
|
||||
route-param-interpolation cases in this file (`/documents/${id}`, `/folders/${folderId}`) — where
|
||||
the value is an ID used in a path segment — this one places an unencoded value into a query
|
||||
parameter, making URL injection more likely.
|
||||
|
||||
**Fix:**
|
||||
```javascript
|
||||
export function listShares(docId) {
|
||||
const params = new URLSearchParams({ document_id: docId })
|
||||
return request(`/api/shares?${params}`)
|
||||
}
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Info
|
||||
|
||||
### IN-01: Unused `import re` in `test_documents.py`
|
||||
|
||||
**File:** `backend/tests/test_documents.py:9`
|
||||
|
||||
**Issue:** `import re` is present at the top of the file but `re.` is never called anywhere in
|
||||
the module. This is dead code left over from a previous version of the test.
|
||||
|
||||
**Fix:** Remove `import re` from line 9.
|
||||
|
||||
### IN-02: `deleteDocumentRemoveOnly` is a trivial wrapper that duplicates `deleteDocument(id, true)`
|
||||
|
||||
**File:** `frontend/src/api/client.js:80-82`
|
||||
|
||||
**Issue:**
|
||||
```javascript
|
||||
export function deleteDocumentRemoveOnly(id) {
|
||||
return deleteDocument(id, true)
|
||||
}
|
||||
```
|
||||
This function is a one-liner alias that adds no value beyond calling `deleteDocument(id, true)`
|
||||
directly. It creates two public API symbols for the same operation, which will confuse future
|
||||
callers about which to use. The caller in `DocumentView.vue` already uses this alias, but the
|
||||
wrapper itself adds surface area without benefit.
|
||||
|
||||
**Fix:** Either remove the wrapper and update `DocumentView.vue` to call
|
||||
`deleteDocument(doc.value.id, true)` directly, or keep it but document why the alias exists.
|
||||
|
||||
### IN-03: `test_delete_cloud_remove_only` does not assert quota is unchanged
|
||||
|
||||
**File:** `backend/tests/test_documents.py:897-925`
|
||||
|
||||
**Issue:** The test verifies the DB row is removed but does not check that the user's
|
||||
`used_bytes` quota was NOT decremented (which it should not be — cloud docs are not quota-tracked
|
||||
per T-06.2-03-01). The `Quota` model is imported but never queried in this test. A future
|
||||
regression that incorrectly decrements quota on `remove_only` paths would go undetected.
|
||||
|
||||
**Fix:** Add a quota assertion:
|
||||
```python
|
||||
# Quota must remain unchanged — cloud docs are not quota-tracked
|
||||
quota = await db_session.get(Quota, auth_user["user"].id)
|
||||
assert quota.used_bytes == 0, (
|
||||
f"remove_only should not decrement quota, got used_bytes={quota.used_bytes}"
|
||||
)
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
_Reviewed: 2026-05-31T00:00:00Z_
|
||||
_Reviewer: Claude (gsd-code-reviewer)_
|
||||
_Depth: standard_
|
||||
Reference in New Issue
Block a user