diff --git a/.planning/phases/06.2-close-v1-sharing-cloud-delete-csv-export-gaps/06.2-REVIEW.md b/.planning/phases/06.2-close-v1-sharing-cloud-delete-csv-export-gaps/06.2-REVIEW.md new file mode 100644 index 0000000..ef28809 --- /dev/null +++ b/.planning/phases/06.2-close-v1-sharing-cloud-delete-csv-export-gaps/06.2-REVIEW.md @@ -0,0 +1,305 @@ +--- +phase: 06.2-close-v1-sharing-cloud-delete-csv-export-gaps +reviewed: 2026-05-31T00:00:00Z +depth: standard +files_reviewed: 6 +files_reviewed_list: + - backend/api/audit.py + - backend/tests/test_audit.py + - backend/tests/test_documents.py + - backend/tests/test_shares.py + - frontend/src/api/client.js + - frontend/src/components/admin/AuditLogTab.vue +findings: + critical: 2 + warning: 5 + info: 3 + total: 10 +status: issues_found +--- + +# Phase 06.2: Code Review Report + +**Reviewed:** 2026-05-31T00:00:00Z +**Depth:** standard +**Files Reviewed:** 6 +**Status:** issues_found + +## Summary + +This phase closes sharing, cloud-delete, CSV-export, and daily-audit-export gaps. The backend +endpoint logic is structurally sound (parameterized queries, admin-only guards, proper path-traversal +regex on the date parameter). However, two correctness bugs are present that will silently corrupt or +truncate data in production, and five quality/robustness issues need addressing before the phase can +be marked complete. No hardcoded secrets were found and all admin guards are in place. + +--- + +## Critical Issues + +### CR-01: CSV export writes raw Python `dict` repr for `metadata_` — not JSON + +**File:** `backend/api/audit.py:372` + +**Issue:** `csv.DictWriter.writerow()` receives the dict returned by +`_audit_to_dict_with_handles()`. That dict contains `"metadata_": entry.metadata_`, where +`entry.metadata_` is a Python `dict` (SQLAlchemy deserializes `JSONB` columns into native Python +objects). `csv.DictWriter` calls `str()` on the value, producing `{'size_bytes': 100}` — +Python repr syntax with single quotes — not valid JSON. Any downstream consumer that tries to +`JSON.parse()` the metadata cell will fail. The same issue exists in the empty-handle early-return +path's header-only CSV, but that branch writes no data rows so the corruption is less visible there. + +Proof (reproduced): +```python +import csv, io +output = io.StringIO() +writer = csv.DictWriter(output, fieldnames=["metadata_"]) +writer.writeheader() +writer.writerow({"metadata_": {"size_bytes": 100}}) +print(output.getvalue()) +# metadata_ +# {'size_bytes': 100} ← Python repr, NOT JSON +``` + +**Fix:** Serialize `metadata_` to JSON in the dict helper before the CSV writer sees it, or only +at the CSV call site: + +```python +import json + +# Option A — at writerow call site (surgical fix): +for row in rows: + entry, user_handle_val, actor_handle_val = row[0], row[1], row[2] + record = _audit_to_dict_with_handles(entry, user_handle_val, actor_handle_val) + record["metadata_"] = json.dumps(record["metadata_"]) if record["metadata_"] is not None else "" + writer.writerow(record) +``` + +The `test_audit_log_export_csv` test does not assert the content of the metadata cell, so this +bug currently passes all tests despite corrupting the exported data. + +--- + +### CR-02: `download_daily_export` silently crashes on non-MinIO backends — no 404, no guard + +**File:** `backend/api/audit.py:219-239` + +**Issue:** `list_daily_exports` (line 182) correctly guards with +`if not isinstance(backend, MinIOBackend): return {"items": []}` and returns an empty list for +non-MinIO deployments. But `download_daily_export` has **no such guard**. If `get_storage_backend()` +returns anything other than a `MinIOBackend` (e.g. a local filesystem backend), line 223 calls +`backend._client.get_object(...)`, which will raise `AttributeError: '...' object has no attribute +'_client'`. The broad `except Exception` on line 231 catches this and re-raises it as a 404, but +only because the error happens inside `_get()`. If the backend raises during construction of `_get` +or if `backend` is `None`, the error propagates as an unhandled 500. More importantly, any new +backend type that happens to have a `_client` attribute with a different meaning would silently +produce wrong results. + +**Fix:** Mirror the guard from `list_daily_exports`: + +```python +@router.get("/audit-log/daily-exports/{date}") +async def download_daily_export( + date: str, + _admin: User = Depends(get_current_admin), +) -> StreamingResponse: + if not re.fullmatch(r"\d{4}-\d{2}-\d{2}", date): + raise HTTPException(status_code=404, detail="Invalid date format") + + backend = get_storage_backend() + if not isinstance(backend, MinIOBackend): # <-- add this guard + raise HTTPException(status_code=404, detail="Export not found") + + key = f"audit-logs/{date}.csv" + ... +``` + +--- + +## Warnings + +### WR-01: `export_audit_log` loads the entire audit log into memory before streaming + +**File:** `backend/api/audit.py:351-377` + +**Issue:** The export endpoint issues `result = await session.execute(q)` followed by +`rows = result.all()`, then builds the full CSV in an `io.StringIO` before returning it in a +`StreamingResponse`. For large deployments the audit log may contain millions of rows; this will +exhaust server memory and cause an OOM kill rather than a graceful error. `StreamingResponse` is +the correct mechanism but it is not being streamed — the CSV is fully materialized first. + +**Fix:** Use a generator that yields CSV chunks from a cursor: + +```python +async def _generate_csv(session, q, fields): + output = io.StringIO() + writer = csv.DictWriter(output, fieldnames=fields) + writer.writeheader() + yield output.getvalue() + + async for row in await session.stream(q): + output = io.StringIO() + entry, user_handle_val, actor_handle_val = row[0], row[1], row[2] + record = _audit_to_dict_with_handles(entry, user_handle_val, actor_handle_val) + record["metadata_"] = json.dumps(record["metadata_"]) if record["metadata_"] is not None else "" + writer = csv.DictWriter(output, fieldnames=fields) + writer.writerow(record) + yield output.getvalue() + +return StreamingResponse(_generate_csv(session, q, fields), media_type="text/csv", ...) +``` + +### WR-02: `format` query parameter is accepted but ignored — always returns CSV + +**File:** `backend/api/audit.py:313` + +**Issue:** The `export_audit_log` endpoint declares `format: str = Query(default="csv")` with a +`# noqa: A002` suppression but never reads the `format` variable in the handler body. Passing +`?format=json` returns a CSV response silently. This is a logic dead-end: the parameter is +either unused dead code or is intended as a future extension. If it is dead code it should be +removed; if it is a future hook, it must at minimum validate that only `"csv"` is accepted and +return 422 for other values. Currently an arbitrary string is accepted with no effect. + +**Fix (remove the dead parameter):** +```python +@router.get("/audit-log/export") +async def export_audit_log( + start: Optional[datetime] = Query(default=None), + end: Optional[datetime] = Query(default=None), + user_handle: Optional[str] = Query(default=None), + event_type: Optional[str] = Query(default=None), + # Remove: format: str = Query(default="csv"), + session: AsyncSession = Depends(get_db), + _admin: User = Depends(get_current_admin), +) -> StreamingResponse: +``` + +**Fix (validate if keeping):** +```python + if format not in ("csv",): + raise HTTPException(status_code=422, detail="Unsupported format. Only 'csv' is accepted.") +``` + +### WR-03: Pagination "Next" button in `AuditLogTab.vue` disables on the wrong condition + +**File:** `frontend/src/components/admin/AuditLogTab.vue:122` + +**Issue:** The "Next" button is disabled when `entries.length < perPage`. This is a heuristic: +it assumes the last page has fewer entries than `perPage`. But if the total count is an exact +multiple of `perPage`, the last page has exactly `perPage` entries and the "Next" button remains +enabled. Clicking it fetches a page beyond the last, returns an empty items list, and leaves the +user on a blank page with no way to go back (the empty-entries check at line 73 replaces the +table with the "No entries" message, but the page counter still shows the incremented value). +The `total` ref is populated from `data.total` (line 212) — it should be used instead. + +**Fix:** +```javascript +// Replace line 122: +:disabled="page * perPage >= total" +// And in nextPage(): +function nextPage() { + if (page.value * perPage < total.value) { + page.value++ + fetchLog() + } +} +``` + +### WR-04: `loadDailyExports` swallows errors silently — no user-visible feedback + +**File:** `frontend/src/components/admin/AuditLogTab.vue:256-266` + +**Issue:** The `loadDailyExports` catch block sets `dailyExports.value = []` but does **not** set +`exportsError.value`. The template renders `

` for the download error, but +the load-failure case is silently dropped. An admin whose MinIO is misconfigured or unreachable +sees "No daily exports available" — identical to the normal empty-bucket state — with no +indication that an error occurred. + +**Fix:** +```javascript + } catch (e) { + dailyExports.value = [] + exportsError.value = 'Failed to load daily exports. Please try again.' // add this line + } finally { +``` + +### WR-05: `listShares` in `client.js` uses string interpolation instead of `URLSearchParams` + +**File:** `frontend/src/api/client.js:353` + +**Issue:** +```javascript +export function listShares(docId) { + return request(`/api/shares?document_id=${docId}`) +} +``` +`docId` is always a UUID from the DB so the practical risk is low, but the pattern is inconsistent +with the rest of the file (compare `listDocuments`, `adminListAuditLog`, and `listFolders` which +all use `URLSearchParams`). If a caller ever passes a non-UUID value (e.g. a stringified object +from a Vue ref), this will silently produce a malformed URL without any encoding. Unlike the other +route-param-interpolation cases in this file (`/documents/${id}`, `/folders/${folderId}`) — where +the value is an ID used in a path segment — this one places an unencoded value into a query +parameter, making URL injection more likely. + +**Fix:** +```javascript +export function listShares(docId) { + const params = new URLSearchParams({ document_id: docId }) + return request(`/api/shares?${params}`) +} +``` + +--- + +## Info + +### IN-01: Unused `import re` in `test_documents.py` + +**File:** `backend/tests/test_documents.py:9` + +**Issue:** `import re` is present at the top of the file but `re.` is never called anywhere in +the module. This is dead code left over from a previous version of the test. + +**Fix:** Remove `import re` from line 9. + +### IN-02: `deleteDocumentRemoveOnly` is a trivial wrapper that duplicates `deleteDocument(id, true)` + +**File:** `frontend/src/api/client.js:80-82` + +**Issue:** +```javascript +export function deleteDocumentRemoveOnly(id) { + return deleteDocument(id, true) +} +``` +This function is a one-liner alias that adds no value beyond calling `deleteDocument(id, true)` +directly. It creates two public API symbols for the same operation, which will confuse future +callers about which to use. The caller in `DocumentView.vue` already uses this alias, but the +wrapper itself adds surface area without benefit. + +**Fix:** Either remove the wrapper and update `DocumentView.vue` to call +`deleteDocument(doc.value.id, true)` directly, or keep it but document why the alias exists. + +### IN-03: `test_delete_cloud_remove_only` does not assert quota is unchanged + +**File:** `backend/tests/test_documents.py:897-925` + +**Issue:** The test verifies the DB row is removed but does not check that the user's +`used_bytes` quota was NOT decremented (which it should not be — cloud docs are not quota-tracked +per T-06.2-03-01). The `Quota` model is imported but never queried in this test. A future +regression that incorrectly decrements quota on `remove_only` paths would go undetected. + +**Fix:** Add a quota assertion: +```python + # Quota must remain unchanged — cloud docs are not quota-tracked + quota = await db_session.get(Quota, auth_user["user"].id) + assert quota.used_bytes == 0, ( + f"remove_only should not decrement quota, got used_bytes={quota.used_bytes}" + ) +``` + +--- + +_Reviewed: 2026-05-31T00:00:00Z_ +_Reviewer: Claude (gsd-code-reviewer)_ +_Depth: standard_