Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
44 KiB
Phase 6.2: Close v1 sharing + cloud-delete + CSV export gaps — Research
Researched: 2026-05-31 Domain: FastAPI PATCH endpoints, cloud storage backend routing, Fetch API blob download, MinIO list_objects, SQLAlchemy async JOIN Confidence: HIGH (all findings verified against the live codebase)
<user_constraints>
User Constraints (from CONTEXT.md)
Locked Decisions
- D-00:
create_user500 fix already committed —await session.flush()beforewrite_audit_loginbackend/api/admin.py. - D-01: Default delete propagates to cloud provider.
delete_document()inservices/storage.pymust route toget_storage_backend_for_document(doc, user, session)for non-MinIO docs. - D-02: "Remove from app only" is a separate, distinct action that deletes the DB record while leaving the cloud file intact.
- D-03: Cloud provider delete failure: warning modal "Cloud delete failed. Remove from app anyway?" — user chooses. Backend returns structured error distinguishable from hard 4xx/5xx.
- D-04: Cloud documents do NOT touch MinIO quota on delete (unchanged from existing design).
- D-05: Persistent local Celery cache for cloud docs — DEFERRED. Not this phase.
- D-06:
DocumentCard.vue:31— changev-if="doc.share_count > 0"tov-if="doc.is_shared". - D-07: Permission levels:
viewandedit.Share.permissioncolumn already exists. No migration. - D-08: Permission set at share creation: add view/edit dropdown to
ShareModal.vuebefore submitting. Existing POST endpoint already acceptspermissionfield. - D-09: Permission changeable after creation: add View/Edit toggle per share row in
ShareModal.vue. Calls newPATCH /api/shares/{id}with{ permission: "view" | "edit" }. Owner-only, 404 for wrong owner. - D-10: Fix
shares.pyPOST hardcodedpermission="view"(line 97) — read from request body instead. - D-11:
_audit_to_dict()— extend to returnuser_handleandactor_handlevia JOIN onUsertable. - D-12:
user_idfilter inGET /api/admin/audit-log— accept handle string, resolve to UUID viaUser.handle == handle, return empty if not found. - D-13: Replace
window.location.hrefinAuditLogTab.vue:exportCsv()withfetch()+ Blob URL pattern. - D-14: Add
adminExportAuditLogCsv(params)toclient.js— must callres.text()(orres.blob()), NOTres.json(). Create object URL, trigger<a>click download. - D-15: Add
GET /api/admin/audit-log/daily-exports— lists MinIOaudit-logsbucket contents. Returns[{ date, key }]sorted descending. - D-16: Add
GET /api/admin/audit-log/daily-exports/{date}— streams specific daily export from MinIO astext/csv. Auth:get_current_admin. - D-17: Frontend
AuditLogTab.vue: date dropdown fromadminListDailyExports(), "Download" button usesfetch()+ Blob URL.
Claude's Discretion
- Exact API shape for "remove from app only" vs "delete from provider" (
?cloud_only=falsequery param vs separate endpoint). - Whether
PATCH /api/shares/{id}accepts full body or just{ permission: "view"|"edit" }— minimal body preferred. - Exact error response shape for cloud delete failure — must be distinguishable from hard 4xx/5xx.
- MinIO
list_objectspagination — handle ifaudit-logsbucket has >1000 files.
Deferred Ideas (OUT OF SCOPE)
- Persistent local Celery cache for cloud docs with quota tracking.
- Celery local cache "Remove download" button.
- SHARE-03 permission levels beyond view/edit.
</user_constraints>
<phase_requirements>
Phase Requirements
| ID | Description | Research Support |
|---|---|---|
| SHARE-03 | Shared access is view-only by default; owner controls permission level | D-07..D-10: Share.permission column exists; POST hardcodes view; PATCH endpoint needed; frontend dropdown needed |
| SHARE-05 | Documents shared with others display "shared" indicator | D-06: backend sends is_shared; frontend checks wrong field share_count |
| CLOUD (del) | Cloud document deletion propagates to provider | D-01..D-04: delete_document() must route to get_storage_backend_for_document(); structured error response needed |
| ADMIN-06 | Admin audit log with filters and export | D-11..D-17: user handles in responses, handle filter, CSV fetch download, daily export list + stream |
</phase_requirements>
Summary
Phase 6.2 is a brownfield gap-closure sprint with four independent feature areas. All four areas have the backend plumbing already present — the gaps are surgical additions and fixes, not architectural changes.
Area 1 — Cloud-delete propagation: services/storage.py:delete_document() unconditionally calls _backend().delete_object() (MinIO singleton). For cloud documents (doc.storage_backend != "minio") it must instead call get_storage_backend_for_document(doc, user, session). This function is already in backend/storage/__init__.py and used by both the content proxy and admin.py:delete_user(). The wrinkle is that delete_document() currently does not receive the user ORM object or session — the service layer only receives session and doc_id. The cleanest fix is to extend the function signature or to perform the cloud routing in api/documents.py:delete_document before calling storage.delete_document(). The admin delete_user pattern (lines 527-539) is the canonical reference: call get_storage_backend_for_document, then backend.delete_object, wrapped in try/except. A separate "remove from app only" action (D-02) needs an endpoint that skips cloud deletion entirely — a ?remove_only=true query parameter on DELETE /api/documents/{id} is the cleanest route since it shares the auth/ownership gate without duplicating a whole endpoint.
Area 2 — SHARE-03/05 fixes: Three micro-changes. (1) DocumentCard.vue line 31: share_count > 0 to is_shared. (2) shares.py POST: read permission from request body (add to ShareCreate model). (3) New PATCH /api/shares/{id} endpoint with minimal body { permission: "view" | "edit" }, owner-enforced IDOR (404 on mismatch, mirroring the DELETE pattern). (4) Frontend ShareModal.vue: dropdown before submit + toggle per row.
Area 3 — Audit log user handles: _audit_to_dict() is a pure function operating on a loaded AuditLog ORM object — it does not have access to a session. Two options: (a) change it to accept optional pre-fetched handle dicts alongside the entry, or (b) make the list/export endpoints JOIN User as an alias twice (once for user_id, once for actor_id) and pass the handles as extra arguments. Option (b) is cleaner and avoids N+1 queries. The _build_filtered_query() helper returns a select(AuditLog) — it needs to be extended to JOIN User twice (with SQLAlchemy aliases) and yield (AuditLog, user_handle, actor_handle) tuples. The user_id filter must change from accepting Optional[uuid.UUID] to accepting Optional[str] and resolving the handle in a preliminary query.
Area 4 — CSV export + daily exports: The export endpoint is already a StreamingResponse — the only change is on the frontend where window.location.href must become fetch() + Blob URL. The daily export listing uses Minio.list_objects(bucket_name="audit-logs", prefix="audit-logs/") which returns a synchronous iterator — it must be consumed in asyncio.to_thread(). The key pattern audit-logs/{date}.csv is guaranteed by audit_tasks.py:79.
Primary recommendation: Implement in four vertical slices, each deployable independently. Start with SHARE-05 (one-line fix), then SHARE-03, then cloud-delete, then audit log cluster (user handles + CSV + daily exports).
Architectural Responsibility Map
| Capability | Primary Tier | Secondary Tier | Rationale |
|---|---|---|---|
| Cloud-delete routing | API / Backend | — | Must happen server-side where credentials are decrypted; cannot be client-driven |
| "Remove from app" vs "delete from cloud" | API / Backend | Browser / Client | Query param distinction at API layer; UX confirmation modal at client layer |
| Share permission PATCH | API / Backend | Browser / Client | IDOR enforcement is a backend concern; toggle UI is client |
| "Shared" badge display | Browser / Client | — | Trivially reads doc.is_shared field from list response |
| Audit log handle enrichment | API / Backend | — | JOIN happens in DB query layer; frontend only receives enriched response |
| CSV export download | Browser / Client | API / Backend | Authentication requires fetch() with Bearer header — browser nav cannot send it |
| Daily export listing | API / Backend | Browser / Client | MinIO query is server-side; dropdown is client |
| Daily export stream | API / Backend | Browser / Client | Streaming response with auth gate; download trigger at client |
Standard Stack
This phase adds no new packages. All functionality uses the existing stack.
Core (already installed)
| Library | Version | Purpose | Relevance to Phase |
|---|---|---|---|
| FastAPI | 0.136+ | API framework | PATCH endpoint, query params, StreamingResponse |
| SQLAlchemy 2.0 async | 2.x | ORM | JOIN with aliased User twice for handle enrichment |
| Minio (Python SDK) | current | MinIO S3 client | list_objects, get_object for daily exports |
| Pydantic v2 | 2.x | Request validation | SharePermissionPatch minimal body model |
| Vue 3 (Options API/Composition) | 3.x | Frontend framework | Dropdown, toggle, fetch+Blob download |
No new packages required
All operations — cloud backend routing, streaming responses, fetch+Blob download — use code already present in the project.
Package Legitimacy Audit
Not applicable — no new packages installed in this phase.
Architecture Patterns
System Architecture Diagram
DELETE /api/documents/{id}?remove_only=false (default)
└── api/documents.py:delete_document
├── ownership assert (doc.user_id == current_user.id)
├── if doc.storage_backend == "minio":
│ └── services/storage.py:delete_document() ← (unchanged path)
│ ├── _backend().delete_object(key)
│ └── quota decrement (STORE-06)
└── else (cloud backend):
├── get_storage_backend_for_document(doc, user, session)
│ └── decrypts HKDF credentials, returns cloud backend
├── try: cloud_backend.delete_object(doc.object_key)
│ except: return {"cloud_delete_failed": true, "detail": "..."} ← 207/200 structured
└── if cloud delete succeeded (or user confirmed remove_only):
└── services/storage.py:_db_only_delete() ← quota skipped for cloud docs
PATCH /api/shares/{id}
└── api/shares.py:update_share_permission
├── UUID parse (404 on invalid)
├── session.get(Share, sid)
├── assert share.owner_id == current_user.id → 404 on mismatch (IDOR)
├── validate body.permission in {"view", "edit"}
└── share.permission = body.permission; commit
GET /api/admin/audit-log (with handle enrichment)
└── api/audit.py:list_audit_log
├── if user_handle param: resolve handle → UUID, or empty result if not found
├── _build_filtered_query_with_handles(start, end, user_uuid, event_type)
│ └── select(AuditLog, user_alias.handle, actor_alias.handle)
│ .outerjoin(user_alias, user_alias.id == AuditLog.user_id)
│ .outerjoin(actor_alias, actor_alias.id == AuditLog.actor_id)
└── _audit_to_dict(entry, user_handle, actor_handle) → dict with both handles
GET /api/admin/audit-log/export (CSV with auth)
└── StreamingResponse (unchanged backend)
← client: fetch() + res.text() + Blob URL + <a> click download
GET /api/admin/audit-log/daily-exports
└── asyncio.to_thread(minio_client.list_objects, "audit-logs", prefix="audit-logs/")
└── returns [{date: "YYYY-MM-DD", key: "audit-logs/YYYY-MM-DD.csv"}, ...]
GET /api/admin/audit-log/daily-exports/{date}
└── key = f"audit-logs/{date}.csv"
├── asyncio.to_thread(minio_client.get_object, "audit-logs", key)
└── StreamingResponse(iter([csv_bytes]), media_type="text/csv")
Recommended Project Structure (changes only)
backend/
├── api/
│ ├── shares.py # + PATCH /{id} endpoint; fix ShareCreate.permission
│ └── audit.py # + handle JOIN; + 2 daily-export endpoints; fix user filter
├── services/
│ └── storage.py # + cloud routing in delete_document; + _db_only_delete()
frontend/src/
├── api/
│ └── client.js # + adminExportAuditLogCsv(); + adminListDailyExports(); + adminDownloadDailyExport()
├── components/
│ ├── admin/
│ │ └── AuditLogTab.vue # fix exportCsv(); + daily exports date dropdown
│ ├── sharing/
│ │ └── ShareModal.vue # + permission dropdown on share; + toggle per row
│ └── documents/
│ └── DocumentCard.vue # line 31: share_count → is_shared
Pattern 1: Cloud-delete routing in services/storage.py
The cleanest approach to the signature problem: delete_document() gets two new optional parameters user and session_for_cloud. When user is provided and doc.storage_backend != "minio", cloud routing fires. The caller in api/documents.py already has both the current_user and session in scope.
Alternative (preferred for simplicity): Move the cloud routing entirely into api/documents.py:delete_document, before the call to storage.delete_document(). The service layer function handles only DB + MinIO quota. The API layer decides routing. This keeps services/storage.py free of cloud-backend imports and mirrors how admin.py:delete_user() works — the API layer calls get_storage_backend_for_document() and the service layer is unaware of cloud backends.
# api/documents.py — delete_document (modified)
@router.delete("/{doc_id}")
async def delete_document(
doc_id: str,
remove_only: bool = Query(default=False), # D-02: skip cloud delete
request: Request,
session: AsyncSession = Depends(get_db),
current_user: User = Depends(get_regular_user),
):
uid = uuid.UUID(doc_id)
doc = await session.get(Document, uid)
if doc is None or doc.user_id != current_user.id:
raise HTTPException(404, "Document not found")
cloud_delete_failed = False
if doc.storage_backend != "minio" and not remove_only:
try:
cloud_backend = await get_storage_backend_for_document(doc, current_user, session)
await cloud_backend.delete_object(doc.object_key)
except Exception as exc:
# D-03: structured error; frontend shows warning modal
cloud_delete_failed = True
# Caller (frontend) decides whether to retry with remove_only=true
if cloud_delete_failed and not remove_only:
return JSONResponse(
status_code=200,
content={
"success": False,
"cloud_delete_failed": True,
"detail": "Cloud provider delete failed. You can remove from app only.",
}
)
# DB delete + quota decrement (cloud docs skip quota — D-04)
ok = await storage.delete_document(session, doc_id, skip_quota=doc.storage_backend != "minio")
# ... audit log + commit
Key insight on quota: Cloud docs already skip quota at upload time (no UPDATE quotas in cloud upload path). The delete_document() in services/storage.py currently always decrements quota. It needs a skip_quota: bool = False guard for cloud documents.
Source: [VERIFIED: codebase] — confirmed by reading services/storage.py:163-175 and api/documents.py:269-291.
Pattern 2: PATCH /api/shares/{id} — minimal IDOR-safe pattern
# api/shares.py — new endpoint
class SharePermissionPatch(BaseModel):
permission: str # validated: "view" | "edit"
@field_validator("permission")
@classmethod
def validate_permission(cls, v: str) -> str:
if v not in {"view", "edit"}:
raise ValueError("permission must be 'view' or 'edit'")
return v
# CRITICAL: must be defined BEFORE DELETE /{share_id} to avoid path conflict
@router.patch("/{share_id}", status_code=status.HTTP_200_OK)
async def update_share_permission(
share_id: str,
body: SharePermissionPatch,
session: AsyncSession = Depends(get_db),
current_user: User = Depends(get_regular_user),
) -> dict:
try:
sid = uuid.UUID(share_id)
except ValueError:
raise HTTPException(status_code=404, detail="Share not found")
share = await session.get(Share, sid)
# IDOR: 404 on mismatch — mirrors DELETE pattern (T-04-04-02)
if share is None or share.owner_id != current_user.id:
raise HTTPException(status_code=404, detail="Share not found")
share.permission = body.permission
await session.commit()
return {"id": str(share.id), "permission": share.permission}
Source: [VERIFIED: codebase] — mirrors revoke_share IDOR pattern at lines 240-265 in backend/api/shares.py.
Pattern 3: Handle enrichment via SQLAlchemy aliased JOIN
# api/audit.py — handle enrichment
from sqlalchemy.orm import aliased
UserAsSubject = aliased(User) # for user_id FK
UserAsActor = aliased(User) # for actor_id FK
def _build_filtered_query_with_handles(start, end, user_uuid, event_type):
q = (
select(AuditLog, UserAsSubject.handle, UserAsActor.handle)
.outerjoin(UserAsSubject, UserAsSubject.id == AuditLog.user_id)
.outerjoin(UserAsActor, UserAsActor.id == AuditLog.actor_id)
.order_by(AuditLog.created_at.desc())
)
if start: q = q.where(AuditLog.created_at >= start)
if end: q = q.where(AuditLog.created_at <= end)
if user_uuid: q = q.where(AuditLog.user_id == user_uuid)
if event_type: q = q.where(AuditLog.event_type == event_type)
return q
def _audit_to_dict_with_handles(entry, user_handle, actor_handle) -> dict:
d = {
"id": entry.id,
"event_type": entry.event_type,
"user_id": str(entry.user_id) if entry.user_id else None,
"actor_id": str(entry.actor_id) if entry.actor_id else None,
"user_handle": user_handle or None,
"actor_handle": actor_handle or None,
"resource_id": str(entry.resource_id) if entry.resource_id else None,
"ip_address": str(entry.ip_address) if entry.ip_address else None,
"metadata_": entry.metadata_,
"created_at": entry.created_at.isoformat(),
}
return d
Result rows from a multi-column select in SQLAlchemy 2.0 async are Row tuples — access as row[0] (AuditLog), row[1] (user_handle), row[2] (actor_handle). result.all() returns list[Row].
Source: [ASSUMED] — SQLAlchemy 2.0 aliased JOIN pattern is well-documented but not verified via Context7 in this session.
Pattern 4: Handle-to-UUID resolution for user filter
# In list_audit_log / export_audit_log endpoint handlers:
user_uuid: Optional[uuid.UUID] = None
if user_handle_param: # new str param replacing Optional[uuid.UUID]
handle_result = await session.execute(
select(User.id).where(User.handle == user_handle_param)
)
uid = handle_result.scalar_one_or_none()
if uid is None:
# No user with that handle → return empty results (D-12)
return {"items": [], "total": 0, "page": page, "per_page": per_page}
user_uuid = uid
Source: [VERIFIED: codebase] — User.handle is a unique indexed column (models.py line 52).
Pattern 5: Fetch + Blob URL download in Vue 3 (CSV export)
// frontend/src/api/client.js — new function
export async function adminExportAuditLogCsv(params = {}) {
const { useAuthStore } = await import('../stores/auth.js')
const authStore = useAuthStore()
const searchParams = new URLSearchParams({ format: 'csv' })
if (params.start) searchParams.set('start', params.start)
if (params.end) searchParams.set('end', params.end)
if (params.user_handle) searchParams.set('user_handle', params.user_handle)
if (params.event_type) searchParams.set('event_type', params.event_type)
const headers = {}
if (authStore.accessToken) {
headers['Authorization'] = `Bearer ${authStore.accessToken}`
}
const res = await fetch(`/api/admin/audit-log/export?${searchParams}`, {
headers,
credentials: 'include',
})
if (!res.ok) throw new Error(`Export failed: ${res.status}`)
const text = await res.text() // CSV is text, not JSON
const blob = new Blob([text], { type: 'text/csv' })
const url = URL.createObjectURL(blob)
const a = document.createElement('a')
a.href = url
a.download = 'audit-export.csv'
a.click()
URL.revokeObjectURL(url)
}
This is the same pattern as fetchDocumentContent() already in client.js — raw fetch() with Authorization header, NOT using the shared request() wrapper (which always calls res.json()).
Source: [VERIFIED: codebase] — fetchDocumentContent at lines 399-428 of client.js is the exact precedent.
Pattern 6: MinIO daily export listing (asyncio.to_thread)
# api/audit.py — new endpoint
@router.get("/audit-log/daily-exports")
async def list_daily_exports(
session: AsyncSession = Depends(get_db),
_admin: User = Depends(get_current_admin),
) -> dict:
"""List available daily audit export files from MinIO audit-logs bucket."""
from storage import get_storage_backend
from storage.minio_backend import MinIOBackend
backend = get_storage_backend()
if not isinstance(backend, MinIOBackend):
return {"items": []}
def _list() -> list[dict]:
objects = backend._client.list_objects(
"audit-logs", prefix="audit-logs/", recursive=False
)
items = []
for obj in objects:
if obj.object_name and obj.object_name.endswith(".csv"):
# key: "audit-logs/2026-05-30.csv" → date: "2026-05-30"
filename = obj.object_name.removeprefix("audit-logs/").removesuffix(".csv")
items.append({"date": filename, "key": obj.object_name})
items.sort(key=lambda x: x["date"], reverse=True)
return items
items = await asyncio.to_thread(_list)
return {"items": items}
Minio.list_objects() returns a synchronous iterator; consuming it inside asyncio.to_thread blocks the thread (not the event loop). The iterator is lazy — it pages automatically. For fewer than ~1000 objects (years of daily exports) it completes in a single S3 ListObjectsV2 call. [VERIFIED: codebase] — confirmed by inspecting Minio SDK list_objects signature at runtime; obj.object_name and obj.is_dir are the key attributes.
Pattern 7: MinIO daily export streaming download
@router.get("/audit-log/daily-exports/{date}")
async def download_daily_export(
date: str,
session: AsyncSession = Depends(get_db),
_admin: User = Depends(get_current_admin),
) -> StreamingResponse:
import re, asyncio
if not re.fullmatch(r"\d{4}-\d{2}-\d{2}", date):
raise HTTPException(status_code=404, detail="Invalid date format")
from storage import get_storage_backend
backend = get_storage_backend()
key = f"audit-logs/{date}.csv"
def _get() -> bytes:
response = backend._client.get_object("audit-logs", key)
try:
return response.read()
finally:
response.close()
response.release_conn()
try:
csv_bytes = await asyncio.to_thread(_get)
except Exception:
raise HTTPException(status_code=404, detail="Export not found")
return StreamingResponse(
iter([csv_bytes]),
media_type="text/csv",
headers={"Content-Disposition": f'attachment; filename="audit-{date}.csv"'},
)
Note: The get_object call uses "audit-logs" bucket (separate from documents bucket), mirroring audit_tasks.py. Date path parameter must be validated against YYYY-MM-DD regex to prevent path traversal.
Source: [VERIFIED: codebase] — MinIOBackend.get_object() pattern at lines 110-121 of minio_backend.py.
Anti-Patterns to Avoid
- Calling
_backend()singleton for cloud deletes:_backend()inservices/storage.pyalways returns MinIOBackend. Cloud docs must useget_storage_backend_for_document(). Using the singleton for cloud docs would silently succeed (MinIO would return no error for a key that doesn't exist there) while leaving the cloud file orphaned. - Defining
PATCH /{share_id}AFTERDELETE /{share_id}: FastAPI routes earlier-defined paths first. APATCH /{share_id}defined afterDELETE /{share_id}on the same router will work correctly for PATCH (different HTTP method). The ordering concern only affects routes with overlapping path patterns (e.g.,/receivedvs/{share_id}GET routes). For PATCH vs DELETE on/{share_id}, method discrimination is unambiguous. The existing comment inshares.pyconfirms/receivedmust come before/{share_id}for GET routes only. - Calling
res.json()for CSV download: The existingrequest()function inclient.jsalways callsres.json()and will throw ontext/csvresponses. Any CSV or binary endpoint MUST use a separate function that callsres.text()orres.blob(). - Quota decrement on cloud document delete: Cloud docs never touched quota at upload time — decrementing quota on delete would corrupt the counter. The
skip_quotaguard must be explicit. - Passing
user_idas a UUID query param for handle-based filter: The current endpoint signature acceptsOptional[uuid.UUID]via Query, which causes FastAPI to 422 on non-UUID strings. Changing toOptional[str]and doing handle resolution in the handler body is required. - Blocking the event loop with Minio iterator:
list_objects()is synchronous. Consuming it directly in an async handler withoutasyncio.to_threadwould block the FastAPI event loop.
Don't Hand-Roll
| Problem | Don't Build | Use Instead | Why |
|---|---|---|---|
| Cloud backend instantiation | Custom credential decrypt + backend init | get_storage_backend_for_document(doc, user, session) |
Already handles HKDF decrypt, lazy imports, 503 on inactive connection |
| HKDF key derivation | Custom Fernet wrapper | storage.cloud_utils.decrypt_credentials() |
One implementation, already tested |
| Streaming CSV response | Custom write loop | io.StringIO + csv.DictWriter + StreamingResponse(iter([output.getvalue()])) |
Exact pattern already in api/audit.py:export_audit_log |
| Blob download trigger | Custom download UI | Blob + URL.createObjectURL + <a> click |
Exact pattern used by fetchDocumentContent in client.js |
| MinIO audit-logs GET | Custom HTTP call | MinIOBackend._client.get_object("audit-logs", key) in asyncio.to_thread |
Already used for doc content; same response.read() + response.close() pattern |
Common Pitfalls
Pitfall 1: delete_document() quota decrement for cloud docs
What goes wrong: The existing delete_document() in services/storage.py always runs UPDATE quotas SET used_bytes = .... Cloud documents were never counted against quota at upload time (api/documents.py:upload_cloud() does not call the confirm endpoint that updates quota). Decrementing quota on cloud doc delete would underflow the counter.
Why it happens: The service was written for MinIO only. The quota decrement was correct when there were no cloud docs.
How to avoid: Add skip_quota: bool = False parameter. Set to True when doc.storage_backend != "minio". Alternatively, perform the DB delete separately for cloud docs without calling delete_document() at all — use the admin.py pattern.
Warning signs: quotas.used_bytes becomes negative in testing after cloud doc delete.
Pitfall 2: Cloud delete success even though the file remains
What goes wrong: MinIO's remove_object() is a no-op for non-existent keys. If cloud backend routing is broken and the call falls through to the MinIO singleton, it will return without error while leaving the cloud file intact.
Why it happens: MinIO SDK does not raise on missing keys by default.
How to avoid: Log the doc.storage_backend value before the delete call during testing. Assert in tests that cloud backend's delete_object mock was called, not MinIO's.
Pitfall 3: PATCH route ordering / path conflict
What goes wrong: If a PATCH endpoint at /{share_id} is somehow confused with another route.
Why it happens: FastAPI path routing is order-dependent for same-method same-prefix routes.
How to avoid: For PATCH /{share_id}, no conflict exists with the current GET /received special case (different method). Simply add PATCH before DELETE as a matter of style.
Pitfall 4: AuditLog COUNT query broken after JOIN
What goes wrong: The count query select(func.count()).select_from(base_q.subquery()) works when base_q selects a single entity. After adding a multi-column JOIN, the subquery shape changes and the count may become a count of tuples rather than a count of AuditLog rows.
Why it happens: SQLAlchemy 2.0 multi-column select wrapped in a subquery; func.count() on an ambiguous subquery.
How to avoid: Use a separate count query that does NOT join User: select(func.count(AuditLog.id)).where(<same filters>). Keep the count query and the data query separate.
Pitfall 5: res.text() vs res.blob() for CSV in Vue
What goes wrong: Calling res.json() on a CSV response raises a JSON parse error.
Why it happens: The shared request() wrapper always calls res.json().
How to avoid: The CSV download function must be a standalone fetch() call (not via request()) that calls res.text(). This mirrors the existing fetchDocumentContent pattern exactly.
Pitfall 6: Date path parameter traversal in daily-export download
What goes wrong: GET /api/admin/audit-log/daily-exports/../../etc/passwd could construct a malicious MinIO key.
Why it happens: Path parameters are URL-decoded before reaching the handler.
How to avoid: Validate date against r"\d{4}-\d{2}-\d{2}" regex before constructing f"audit-logs/{date}.csv". FastAPI path parameters with slashes are unusual, but hyphens can still be used in injection attempts.
Pitfall 7: _audit_to_dict() called from two places (viewer + export)
What goes wrong: If only the viewer endpoint is updated to use the new handle-enriched function, the CSV export still emits raw UUIDs.
Why it happens: The existing _audit_to_dict() is shared by both endpoints. Both must be updated to the new function signature.
How to avoid: Update _audit_to_dict_with_handles is used in BOTH list_audit_log and export_audit_log. Preserve the original _audit_to_dict as a fallback only if needed.
Code Examples
SQLAlchemy 2.0 async aliased double-JOIN (verified pattern)
# Source: [VERIFIED: codebase] — mirrors Share→User join at shares.py:155-161
from sqlalchemy.orm import aliased
from sqlalchemy import select
UserSubject = aliased(User)
UserActor = aliased(User)
stmt = (
select(AuditLog, UserSubject.handle.label("user_handle"), UserActor.handle.label("actor_handle"))
.outerjoin(UserSubject, UserSubject.id == AuditLog.user_id)
.outerjoin(UserActor, UserActor.id == AuditLog.actor_id)
.order_by(AuditLog.created_at.desc())
)
result = await session.execute(stmt)
rows = result.all()
for row in rows:
entry, user_handle, actor_handle = row
Existing cloud-backend pattern (admin.py reference)
# Source: [VERIFIED: codebase] — admin.py lines 527-539
for doc in cloud_docs:
try:
backend = await get_storage_backend_for_document(doc, user, session)
await backend.delete_object(doc.object_key)
except Exception:
pass # best-effort; deletion proceeds regardless
Existing fetch+Blob pattern (client.js reference)
// Source: [VERIFIED: codebase] — client.js lines 399-428 (fetchDocumentContent)
const res = await fetch(`/api/documents/${docId}/content`, {
headers: { 'Authorization': `Bearer ${authStore.accessToken}` },
credentials: 'include',
})
// For CSV, use res.text() instead of res.blob()
const text = await res.text()
const blob = new Blob([text], { type: 'text/csv' })
const url = URL.createObjectURL(blob)
// ... <a> click + revokeObjectURL
Existing StreamingResponse for CSV (audit.py reference)
# Source: [VERIFIED: codebase] — audit.py lines 158-162
return StreamingResponse(
iter([output.getvalue()]),
media_type="text/csv",
headers={"Content-Disposition": "attachment; filename=audit-export.csv"},
)
State of the Art
| Old Approach | Current Approach | When Changed | Impact |
|---|---|---|---|
window.location.href for auth downloads |
fetch() + Blob URL |
This phase | Enables Bearer token on every request |
user_id UUID filter in audit log |
user_handle string filter |
This phase | Human-friendly; no UUID knowledge required |
| Audit log shows raw UUIDs | Shows user_handle / actor_handle |
This phase | Admin can read who did what without cross-referencing user list |
| Cloud delete only removes DB record | Cloud delete propagates to provider | This phase | CLOUD provider storage reclaimed; no orphaned files |
Assumptions Log
| # | Claim | Section | Risk if Wrong |
|---|---|---|---|
| A1 | SQLAlchemy 2.0 multi-column aliased JOIN returns Row tuples accessible as row[0], row[1], row[2] |
Architecture Patterns §3 | Query rows parsed incorrectly; _audit_to_dict_with_handles receives wrong arguments |
| A2 | Minio.list_objects() iterator is safe to consume inside asyncio.to_thread without additional threading concerns |
Architecture Patterns §6 | If the SDK has internal async state, to_thread could cause issues — unlikely given SDK is sync-only |
| A3 | is_dir objects from list_objects can be filtered by checking obj.object_name.endswith(".csv") |
Architecture Patterns §6 | Directory marker objects (is_dir=True) have object_name ending in "/"; .endswith(".csv") check correctly excludes them |
If this table is empty: All claims in this research were verified or cited — no user confirmation needed.
Open Questions
-
Cloud delete partial failure: HTTP status code choice
- What we know: D-03 says return a structured error the frontend can distinguish from hard failure.
- What's unclear: Should the endpoint return HTTP 200 with
{"cloud_delete_failed": true}or a specific non-error HTTP status like 207 (Multi-Status)? - Recommendation: Return HTTP 200 with
{"success": false, "cloud_delete_failed": true, "detail": "..."}. The frontend checksdata.cloud_delete_failedto show the warning modal. HTTP 207 is unusual for REST APIs and harder forclient.jsto handle given it callsres.okcheck. 200 with a distinguishing body flag is simpler and consistent with the existing pattern whereok = Falsefromstorage.delete_documentreturns 404 — this case is distinct (partial success).
-
Audit export query param rename:
user_id→user_handle- What we know: The current endpoint accepts
user_id: Optional[uuid.UUID]via Query. Changing this toOptional[str]changes the public API. - What's unclear: Existing tests pass a UUID string as
user_id; if the param is renamed touser_handle, tests need updating. - Recommendation: Keep
user_idas the query param name but change its type toOptional[str]. If the value parses as a valid UUID, treat it as a direct UUID filter (backward compat). Otherwise treat as a handle. This avoids breaking any existing automation. Or, adduser_handleas a new param alongsideuser_idand deprecateuser_id. The decision affects the existingtest_audit.pytests.
- What we know: The current endpoint accepts
Environment Availability
No external tools or services beyond what the project already uses. All dependencies are present.
| Dependency | Required By | Available | Version | Fallback |
|---|---|---|---|---|
| Minio Python SDK | Daily export listing/streaming | Yes | current | — |
| FastAPI | PATCH endpoint, query params | Yes | 0.136+ | — |
| SQLAlchemy async | Handle JOIN | Yes | 2.x | — |
| Vue 3 / Pinia | Frontend changes | Yes | 3.x | — |
Validation Architecture
Test Framework
| Property | Value |
|---|---|
| Framework | pytest + pytest-asyncio (backend); Vitest 4.1.7 (frontend) |
| Config file | backend/pytest.ini or backend/pyproject.toml |
| Quick run command | cd backend && pytest tests/test_shares.py tests/test_audit.py -x -q |
| Full suite command | cd backend && pytest -v |
Current baseline: 310 passed, 1 pre-existing failure (test_extractor.py::test_extract_docx — unrelated ModuleNotFoundError), 5 skipped, 10 xfailed.
Phase Requirements to Test Map
| Req ID | Behavior | Test Type | Automated Command | File Exists? |
|---|---|---|---|---|
| SHARE-03 | POST /api/shares respects permission field |
integration | pytest tests/test_shares.py::test_share_create_with_permission -x |
Wave 0 gap |
| SHARE-03 | PATCH /api/shares/{id} changes permission | integration | pytest tests/test_shares.py::test_share_patch_permission -x |
Wave 0 gap |
| SHARE-03 | PATCH /api/shares/{id} wrong owner → 404 | integration | pytest tests/test_shares.py::test_share_patch_idor -x |
Wave 0 gap |
| SHARE-05 | is_shared field (not share_count) drives badge |
unit/integration | pytest tests/test_shares.py::test_share_indicator_in_owner_list -x |
Exists (passes) |
| CLOUD-del | delete_document routes to cloud backend | integration (mock) | pytest tests/test_documents.py::test_delete_cloud_document_propagates -x |
Wave 0 gap |
| CLOUD-del | Cloud delete failure returns structured error | integration (mock) | pytest tests/test_documents.py::test_delete_cloud_document_failure -x |
Wave 0 gap |
| CLOUD-del | remove_only=true skips cloud, removes DB | integration (mock) | pytest tests/test_documents.py::test_delete_cloud_remove_only -x |
Wave 0 gap |
| ADMIN-06 | Audit log response includes user_handle | integration | pytest tests/test_audit.py::test_audit_log_includes_user_handle -x |
Wave 0 gap |
| ADMIN-06 | user_handle filter resolves to correct entries | integration | pytest tests/test_audit.py::test_audit_log_filter_by_handle -x |
Wave 0 gap |
| ADMIN-06 | unknown handle filter returns empty | integration | pytest tests/test_audit.py::test_audit_log_filter_unknown_handle -x |
Wave 0 gap |
| ADMIN-06 | Daily exports list endpoint returns keys | integration (mock) | pytest tests/test_audit.py::test_daily_exports_list -x |
Wave 0 gap |
| ADMIN-06 | Daily export download returns CSV bytes | integration (mock) | pytest tests/test_audit.py::test_daily_export_download -x |
Wave 0 gap |
Sampling Rate
- Per task commit:
cd backend && pytest tests/test_shares.py tests/test_audit.py tests/test_documents.py -x -q - Per wave merge:
cd backend && pytest -v - Phase gate: Full suite green (excluding pre-existing
test_extract_docxfailure) before/gsd:verify-work
Wave 0 Gaps
tests/test_shares.py::test_share_create_with_permission— covers SHARE-03 POST permission fieldtests/test_shares.py::test_share_patch_permission— covers SHARE-03 PATCH endpointtests/test_shares.py::test_share_patch_idor— covers IDOR security invariant on PATCHtests/test_documents.py::test_delete_cloud_document_propagates— covers cloud delete routingtests/test_documents.py::test_delete_cloud_document_failure— covers D-03 structured error responsetests/test_documents.py::test_delete_cloud_remove_only— covers D-02 remove_only pathtests/test_audit.py::test_audit_log_includes_user_handle— covers D-11 handle enrichmenttests/test_audit.py::test_audit_log_filter_by_handle— covers D-12 handle filtertests/test_audit.py::test_audit_log_filter_unknown_handle— covers D-12 empty resulttests/test_audit.py::test_daily_exports_list— covers D-15 listing endpointtests/test_audit.py::test_daily_export_download— covers D-16 streaming endpoint
Security Domain
Applicable ASVS Categories
| ASVS Category | Applies | Standard Control |
|---|---|---|
| V4 Access Control | Yes | share.owner_id == current_user.id on PATCH (IDOR, 404 on mismatch) |
| V4 Access Control | Yes | get_current_admin on all audit-log endpoints including new daily-export ones |
| V5 Input Validation | Yes | permission enum validated in Pydantic; date regex-validated before MinIO key construction |
| V13 API | Yes | No new public endpoints; all new endpoints gated by existing auth deps |
Known Threat Patterns
| Pattern | STRIDE | Standard Mitigation |
|---|---|---|
| IDOR on PATCH /shares/{id} | Elevation of Privilege | share.owner_id == current_user.id → 404 on mismatch (T-04-04-02 pattern) |
| Path traversal in daily export date param | Tampering | re.fullmatch(r"\d{4}-\d{2}-\d{2}", date) before key construction |
| Token bypass via window.location.href | Info Disclosure | Replace with fetch() + Authorization header (D-13) |
| Cloud credential exposure in error response | Info Disclosure | Cloud delete exception caught generically; str(exc) must NOT include credential data — catch as Exception, log internally, return generic message |
| Quota underflow on cloud delete | Tampering | skip_quota=True for cloud documents; cloud docs never had quota charged |
| Mass assignment on SharePermissionPatch | Tampering | Minimal Pydantic model with explicit permission field only — no **kwargs passthrough |
Sources
Primary (HIGH confidence)
[VERIFIED: codebase]—backend/services/storage.pylines 143-179 —delete_document()implementation; quota decrement pattern; MinIO-only delete[VERIFIED: codebase]—backend/storage/__init__.py—get_storage_backend_for_document()factory; cloud credential routing[VERIFIED: codebase]—backend/api/admin.pylines 527-539 — canonical cloud delete pattern (per-doc) to follow[VERIFIED: codebase]—backend/api/shares.py— existing IDOR pattern;permission="view"hardcode at line 97; route ordering constraint[VERIFIED: codebase]—backend/api/audit.py—_audit_to_dict()pure function;_build_filtered_query(); both endpoints[VERIFIED: codebase]—backend/tasks/audit_tasks.py— key patternaudit-logs/{date}.csv;put_object_rawusage[VERIFIED: codebase]—frontend/src/api/client.jslines 399-428 —fetchDocumentContentas fetch+Blob precedent[VERIFIED: codebase]—frontend/src/components/admin/AuditLogTab.vuelines 185-191 — brokenwindow.location.hrefexport[VERIFIED: codebase]—frontend/src/components/documents/DocumentCard.vueline 31 —share_count > 0bug[VERIFIED: npm registry]— Minio Python SDKlist_objectssignature confirmed at runtime;Object.object_name,Object.is_dirattributes verified
Secondary (MEDIUM confidence)
[ASSUMED]— SQLAlchemy 2.0 aliased double-JOIN returnsRowtuples with positional access — documented in SA2.0 changelog but not verified via Context7 in this session
Metadata
Confidence breakdown:
- Standard stack: HIGH — all packages already in use; no new dependencies
- Architecture: HIGH — all patterns drawn from existing codebase code; no external documentation required
- Pitfalls: HIGH — each pitfall is backed by specific code paths read directly from the source files
- Test gaps: HIGH — based on reading existing test files and counting what is not yet covered
Research date: 2026-05-31 Valid until: 2026-07-01 (stable brownfield — no fast-moving dependencies)