12 KiB
phase, reviewed, depth, files_reviewed, files_reviewed_list, findings, status
| phase | reviewed | depth | files_reviewed | files_reviewed_list | findings | status | ||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 06.2-close-v1-sharing-cloud-delete-csv-export-gaps | 2026-05-31T00:00:00Z | standard | 6 |
|
|
issues_found |
Phase 06.2: Code Review Report
Reviewed: 2026-05-31T00:00:00Z Depth: standard Files Reviewed: 6 Status: issues_found
Summary
This phase closes sharing, cloud-delete, CSV-export, and daily-audit-export gaps. The backend endpoint logic is structurally sound (parameterized queries, admin-only guards, proper path-traversal regex on the date parameter). However, two correctness bugs are present that will silently corrupt or truncate data in production, and five quality/robustness issues need addressing before the phase can be marked complete. No hardcoded secrets were found and all admin guards are in place.
Critical Issues
CR-01: CSV export writes raw Python dict repr for metadata_ — not JSON
File: backend/api/audit.py:372
Issue: csv.DictWriter.writerow() receives the dict returned by
_audit_to_dict_with_handles(). That dict contains "metadata_": entry.metadata_, where
entry.metadata_ is a Python dict (SQLAlchemy deserializes JSONB columns into native Python
objects). csv.DictWriter calls str() on the value, producing {'size_bytes': 100} —
Python repr syntax with single quotes — not valid JSON. Any downstream consumer that tries to
JSON.parse() the metadata cell will fail. The same issue exists in the empty-handle early-return
path's header-only CSV, but that branch writes no data rows so the corruption is less visible there.
Proof (reproduced):
import csv, io
output = io.StringIO()
writer = csv.DictWriter(output, fieldnames=["metadata_"])
writer.writeheader()
writer.writerow({"metadata_": {"size_bytes": 100}})
print(output.getvalue())
# metadata_
# {'size_bytes': 100} ← Python repr, NOT JSON
Fix: Serialize metadata_ to JSON in the dict helper before the CSV writer sees it, or only
at the CSV call site:
import json
# Option A — at writerow call site (surgical fix):
for row in rows:
entry, user_handle_val, actor_handle_val = row[0], row[1], row[2]
record = _audit_to_dict_with_handles(entry, user_handle_val, actor_handle_val)
record["metadata_"] = json.dumps(record["metadata_"]) if record["metadata_"] is not None else ""
writer.writerow(record)
The test_audit_log_export_csv test does not assert the content of the metadata cell, so this
bug currently passes all tests despite corrupting the exported data.
CR-02: download_daily_export silently crashes on non-MinIO backends — no 404, no guard
File: backend/api/audit.py:219-239
Issue: list_daily_exports (line 182) correctly guards with
if not isinstance(backend, MinIOBackend): return {"items": []} and returns an empty list for
non-MinIO deployments. But download_daily_export has no such guard. If get_storage_backend()
returns anything other than a MinIOBackend (e.g. a local filesystem backend), line 223 calls
backend._client.get_object(...), which will raise AttributeError: '...' object has no attribute '_client'. The broad except Exception on line 231 catches this and re-raises it as a 404, but
only because the error happens inside _get(). If the backend raises during construction of _get
or if backend is None, the error propagates as an unhandled 500. More importantly, any new
backend type that happens to have a _client attribute with a different meaning would silently
produce wrong results.
Fix: Mirror the guard from list_daily_exports:
@router.get("/audit-log/daily-exports/{date}")
async def download_daily_export(
date: str,
_admin: User = Depends(get_current_admin),
) -> StreamingResponse:
if not re.fullmatch(r"\d{4}-\d{2}-\d{2}", date):
raise HTTPException(status_code=404, detail="Invalid date format")
backend = get_storage_backend()
if not isinstance(backend, MinIOBackend): # <-- add this guard
raise HTTPException(status_code=404, detail="Export not found")
key = f"audit-logs/{date}.csv"
...
Warnings
WR-01: export_audit_log loads the entire audit log into memory before streaming
File: backend/api/audit.py:351-377
Issue: The export endpoint issues result = await session.execute(q) followed by
rows = result.all(), then builds the full CSV in an io.StringIO before returning it in a
StreamingResponse. For large deployments the audit log may contain millions of rows; this will
exhaust server memory and cause an OOM kill rather than a graceful error. StreamingResponse is
the correct mechanism but it is not being streamed — the CSV is fully materialized first.
Fix: Use a generator that yields CSV chunks from a cursor:
async def _generate_csv(session, q, fields):
output = io.StringIO()
writer = csv.DictWriter(output, fieldnames=fields)
writer.writeheader()
yield output.getvalue()
async for row in await session.stream(q):
output = io.StringIO()
entry, user_handle_val, actor_handle_val = row[0], row[1], row[2]
record = _audit_to_dict_with_handles(entry, user_handle_val, actor_handle_val)
record["metadata_"] = json.dumps(record["metadata_"]) if record["metadata_"] is not None else ""
writer = csv.DictWriter(output, fieldnames=fields)
writer.writerow(record)
yield output.getvalue()
return StreamingResponse(_generate_csv(session, q, fields), media_type="text/csv", ...)
WR-02: format query parameter is accepted but ignored — always returns CSV
File: backend/api/audit.py:313
Issue: The export_audit_log endpoint declares format: str = Query(default="csv") with a
# noqa: A002 suppression but never reads the format variable in the handler body. Passing
?format=json returns a CSV response silently. This is a logic dead-end: the parameter is
either unused dead code or is intended as a future extension. If it is dead code it should be
removed; if it is a future hook, it must at minimum validate that only "csv" is accepted and
return 422 for other values. Currently an arbitrary string is accepted with no effect.
Fix (remove the dead parameter):
@router.get("/audit-log/export")
async def export_audit_log(
start: Optional[datetime] = Query(default=None),
end: Optional[datetime] = Query(default=None),
user_handle: Optional[str] = Query(default=None),
event_type: Optional[str] = Query(default=None),
# Remove: format: str = Query(default="csv"),
session: AsyncSession = Depends(get_db),
_admin: User = Depends(get_current_admin),
) -> StreamingResponse:
Fix (validate if keeping):
if format not in ("csv",):
raise HTTPException(status_code=422, detail="Unsupported format. Only 'csv' is accepted.")
WR-03: Pagination "Next" button in AuditLogTab.vue disables on the wrong condition
File: frontend/src/components/admin/AuditLogTab.vue:122
Issue: The "Next" button is disabled when entries.length < perPage. This is a heuristic:
it assumes the last page has fewer entries than perPage. But if the total count is an exact
multiple of perPage, the last page has exactly perPage entries and the "Next" button remains
enabled. Clicking it fetches a page beyond the last, returns an empty items list, and leaves the
user on a blank page with no way to go back (the empty-entries check at line 73 replaces the
table with the "No entries" message, but the page counter still shows the incremented value).
The total ref is populated from data.total (line 212) — it should be used instead.
Fix:
// Replace line 122:
:disabled="page * perPage >= total"
// And in nextPage():
function nextPage() {
if (page.value * perPage < total.value) {
page.value++
fetchLog()
}
}
WR-04: loadDailyExports swallows errors silently — no user-visible feedback
File: frontend/src/components/admin/AuditLogTab.vue:256-266
Issue: The loadDailyExports catch block sets dailyExports.value = [] but does not set
exportsError.value. The template renders <p v-if="exportsError"> for the download error, but
the load-failure case is silently dropped. An admin whose MinIO is misconfigured or unreachable
sees "No daily exports available" — identical to the normal empty-bucket state — with no
indication that an error occurred.
Fix:
} catch (e) {
dailyExports.value = []
exportsError.value = 'Failed to load daily exports. Please try again.' // add this line
} finally {
WR-05: listShares in client.js uses string interpolation instead of URLSearchParams
File: frontend/src/api/client.js:353
Issue:
export function listShares(docId) {
return request(`/api/shares?document_id=${docId}`)
}
docId is always a UUID from the DB so the practical risk is low, but the pattern is inconsistent
with the rest of the file (compare listDocuments, adminListAuditLog, and listFolders which
all use URLSearchParams). If a caller ever passes a non-UUID value (e.g. a stringified object
from a Vue ref), this will silently produce a malformed URL without any encoding. Unlike the other
route-param-interpolation cases in this file (/documents/${id}, /folders/${folderId}) — where
the value is an ID used in a path segment — this one places an unencoded value into a query
parameter, making URL injection more likely.
Fix:
export function listShares(docId) {
const params = new URLSearchParams({ document_id: docId })
return request(`/api/shares?${params}`)
}
Info
IN-01: Unused import re in test_documents.py
File: backend/tests/test_documents.py:9
Issue: import re is present at the top of the file but re. is never called anywhere in
the module. This is dead code left over from a previous version of the test.
Fix: Remove import re from line 9.
IN-02: deleteDocumentRemoveOnly is a trivial wrapper that duplicates deleteDocument(id, true)
File: frontend/src/api/client.js:80-82
Issue:
export function deleteDocumentRemoveOnly(id) {
return deleteDocument(id, true)
}
This function is a one-liner alias that adds no value beyond calling deleteDocument(id, true)
directly. It creates two public API symbols for the same operation, which will confuse future
callers about which to use. The caller in DocumentView.vue already uses this alias, but the
wrapper itself adds surface area without benefit.
Fix: Either remove the wrapper and update DocumentView.vue to call
deleteDocument(doc.value.id, true) directly, or keep it but document why the alias exists.
IN-03: test_delete_cloud_remove_only does not assert quota is unchanged
File: backend/tests/test_documents.py:897-925
Issue: The test verifies the DB row is removed but does not check that the user's
used_bytes quota was NOT decremented (which it should not be — cloud docs are not quota-tracked
per T-06.2-03-01). The Quota model is imported but never queried in this test. A future
regression that incorrectly decrements quota on remove_only paths would go undetected.
Fix: Add a quota assertion:
# Quota must remain unchanged — cloud docs are not quota-tracked
quota = await db_session.get(Quota, auth_user["user"].id)
assert quota.used_bytes == 0, (
f"remove_only should not decrement quota, got used_bytes={quota.used_bytes}"
)
Reviewed: 2026-05-31T00:00:00Z Reviewer: Claude (gsd-code-reviewer) Depth: standard