# Phase 6.2: Close v1 sharing + cloud-delete + CSV export gaps — Research **Researched:** 2026-05-31 **Domain:** FastAPI PATCH endpoints, cloud storage backend routing, Fetch API blob download, MinIO list_objects, SQLAlchemy async JOIN **Confidence:** HIGH (all findings verified against the live codebase) --- ## User Constraints (from CONTEXT.md) ### Locked Decisions - **D-00:** `create_user` 500 fix already committed — `await session.flush()` before `write_audit_log` in `backend/api/admin.py`. - **D-01:** Default delete propagates to cloud provider. `delete_document()` in `services/storage.py` must route to `get_storage_backend_for_document(doc, user, session)` for non-MinIO docs. - **D-02:** "Remove from app only" is a separate, distinct action that deletes the DB record while leaving the cloud file intact. - **D-03:** Cloud provider delete failure: warning modal "Cloud delete failed. Remove from app anyway?" — user chooses. Backend returns structured error distinguishable from hard 4xx/5xx. - **D-04:** Cloud documents do NOT touch MinIO quota on delete (unchanged from existing design). - **D-05:** Persistent local Celery cache for cloud docs — DEFERRED. Not this phase. - **D-06:** `DocumentCard.vue:31` — change `v-if="doc.share_count > 0"` to `v-if="doc.is_shared"`. - **D-07:** Permission levels: `view` and `edit`. `Share.permission` column already exists. No migration. - **D-08:** Permission set at share creation: add view/edit dropdown to `ShareModal.vue` before submitting. Existing POST endpoint already accepts `permission` field. - **D-09:** Permission changeable after creation: add View/Edit toggle per share row in `ShareModal.vue`. Calls new `PATCH /api/shares/{id}` with `{ permission: "view" | "edit" }`. Owner-only, 404 for wrong owner. - **D-10:** Fix `shares.py` POST hardcoded `permission="view"` (line 97) — read from request body instead. - **D-11:** `_audit_to_dict()` — extend to return `user_handle` and `actor_handle` via JOIN on `User` table. - **D-12:** `user_id` filter in `GET /api/admin/audit-log` — accept handle string, resolve to UUID via `User.handle == handle`, return empty if not found. - **D-13:** Replace `window.location.href` in `AuditLogTab.vue:exportCsv()` with `fetch()` + Blob URL pattern. - **D-14:** Add `adminExportAuditLogCsv(params)` to `client.js` — must call `res.text()` (or `res.blob()`), NOT `res.json()`. Create object URL, trigger `` click download. - **D-15:** Add `GET /api/admin/audit-log/daily-exports` — lists MinIO `audit-logs` bucket contents. Returns `[{ date, key }]` sorted descending. - **D-16:** Add `GET /api/admin/audit-log/daily-exports/{date}` — streams specific daily export from MinIO as `text/csv`. Auth: `get_current_admin`. - **D-17:** Frontend `AuditLogTab.vue`: date dropdown from `adminListDailyExports()`, "Download" button uses `fetch()` + Blob URL. ### Claude's Discretion - Exact API shape for "remove from app only" vs "delete from provider" (`?cloud_only=false` query param vs separate endpoint). - Whether `PATCH /api/shares/{id}` accepts full body or just `{ permission: "view"|"edit" }` — minimal body preferred. - Exact error response shape for cloud delete failure — must be distinguishable from hard 4xx/5xx. - MinIO `list_objects` pagination — handle if `audit-logs` bucket has >1000 files. ### Deferred Ideas (OUT OF SCOPE) - Persistent local Celery cache for cloud docs with quota tracking. - Celery local cache "Remove download" button. - SHARE-03 permission levels beyond view/edit. ## Phase Requirements | ID | Description | Research Support | |----|-------------|------------------| | SHARE-03 | Shared access is view-only by default; owner controls permission level | D-07..D-10: `Share.permission` column exists; POST hardcodes view; PATCH endpoint needed; frontend dropdown needed | | SHARE-05 | Documents shared with others display "shared" indicator | D-06: backend sends `is_shared`; frontend checks wrong field `share_count` | | CLOUD (del) | Cloud document deletion propagates to provider | D-01..D-04: `delete_document()` must route to `get_storage_backend_for_document()`; structured error response needed | | ADMIN-06 | Admin audit log with filters and export | D-11..D-17: user handles in responses, handle filter, CSV fetch download, daily export list + stream | --- ## Summary Phase 6.2 is a brownfield gap-closure sprint with four independent feature areas. All four areas have the backend plumbing already present — the gaps are surgical additions and fixes, not architectural changes. **Area 1 — Cloud-delete propagation:** `services/storage.py:delete_document()` unconditionally calls `_backend().delete_object()` (MinIO singleton). For cloud documents (`doc.storage_backend != "minio"`) it must instead call `get_storage_backend_for_document(doc, user, session)`. This function is already in `backend/storage/__init__.py` and used by both the content proxy and `admin.py:delete_user()`. The wrinkle is that `delete_document()` currently does not receive the `user` ORM object or session — the service layer only receives `session` and `doc_id`. The cleanest fix is to extend the function signature or to perform the cloud routing in `api/documents.py:delete_document` before calling `storage.delete_document()`. The admin delete_user pattern (lines 527-539) is the canonical reference: call `get_storage_backend_for_document`, then `backend.delete_object`, wrapped in `try/except`. A separate "remove from app only" action (D-02) needs an endpoint that skips cloud deletion entirely — a `?remove_only=true` query parameter on `DELETE /api/documents/{id}` is the cleanest route since it shares the auth/ownership gate without duplicating a whole endpoint. **Area 2 — SHARE-03/05 fixes:** Three micro-changes. (1) `DocumentCard.vue` line 31: `share_count > 0` to `is_shared`. (2) `shares.py` POST: read `permission` from request body (add to `ShareCreate` model). (3) New `PATCH /api/shares/{id}` endpoint with minimal body `{ permission: "view" | "edit" }`, owner-enforced IDOR (404 on mismatch, mirroring the DELETE pattern). (4) Frontend `ShareModal.vue`: dropdown before submit + toggle per row. **Area 3 — Audit log user handles:** `_audit_to_dict()` is a pure function operating on a loaded `AuditLog` ORM object — it does not have access to a session. Two options: (a) change it to accept optional pre-fetched handle dicts alongside the entry, or (b) make the list/export endpoints JOIN `User` as an alias twice (once for `user_id`, once for `actor_id`) and pass the handles as extra arguments. Option (b) is cleaner and avoids N+1 queries. The `_build_filtered_query()` helper returns a `select(AuditLog)` — it needs to be extended to JOIN User twice (with SQLAlchemy aliases) and yield `(AuditLog, user_handle, actor_handle)` tuples. The `user_id` filter must change from accepting `Optional[uuid.UUID]` to accepting `Optional[str]` and resolving the handle in a preliminary query. **Area 4 — CSV export + daily exports:** The export endpoint is already a `StreamingResponse` — the only change is on the frontend where `window.location.href` must become `fetch()` + Blob URL. The daily export listing uses `Minio.list_objects(bucket_name="audit-logs", prefix="audit-logs/")` which returns a synchronous iterator — it must be consumed in `asyncio.to_thread()`. The key pattern `audit-logs/{date}.csv` is guaranteed by `audit_tasks.py:79`. **Primary recommendation:** Implement in four vertical slices, each deployable independently. Start with SHARE-05 (one-line fix), then SHARE-03, then cloud-delete, then audit log cluster (user handles + CSV + daily exports). --- ## Architectural Responsibility Map | Capability | Primary Tier | Secondary Tier | Rationale | |------------|-------------|----------------|-----------| | Cloud-delete routing | API / Backend | — | Must happen server-side where credentials are decrypted; cannot be client-driven | | "Remove from app" vs "delete from cloud" | API / Backend | Browser / Client | Query param distinction at API layer; UX confirmation modal at client layer | | Share permission PATCH | API / Backend | Browser / Client | IDOR enforcement is a backend concern; toggle UI is client | | "Shared" badge display | Browser / Client | — | Trivially reads `doc.is_shared` field from list response | | Audit log handle enrichment | API / Backend | — | JOIN happens in DB query layer; frontend only receives enriched response | | CSV export download | Browser / Client | API / Backend | Authentication requires `fetch()` with Bearer header — browser nav cannot send it | | Daily export listing | API / Backend | Browser / Client | MinIO query is server-side; dropdown is client | | Daily export stream | API / Backend | Browser / Client | Streaming response with auth gate; download trigger at client | --- ## Standard Stack This phase adds no new packages. All functionality uses the existing stack. ### Core (already installed) | Library | Version | Purpose | Relevance to Phase | |---------|---------|---------|-------------------| | FastAPI | 0.136+ | API framework | PATCH endpoint, query params, StreamingResponse | | SQLAlchemy 2.0 async | 2.x | ORM | JOIN with aliased User twice for handle enrichment | | Minio (Python SDK) | current | MinIO S3 client | `list_objects`, `get_object` for daily exports | | Pydantic v2 | 2.x | Request validation | `SharePermissionPatch` minimal body model | | Vue 3 (Options API/Composition) | 3.x | Frontend framework | Dropdown, toggle, fetch+Blob download | ### No new packages required All operations — cloud backend routing, streaming responses, fetch+Blob download — use code already present in the project. ## Package Legitimacy Audit Not applicable — no new packages installed in this phase. --- ## Architecture Patterns ### System Architecture Diagram ``` DELETE /api/documents/{id}?remove_only=false (default) └── api/documents.py:delete_document ├── ownership assert (doc.user_id == current_user.id) ├── if doc.storage_backend == "minio": │ └── services/storage.py:delete_document() ← (unchanged path) │ ├── _backend().delete_object(key) │ └── quota decrement (STORE-06) └── else (cloud backend): ├── get_storage_backend_for_document(doc, user, session) │ └── decrypts HKDF credentials, returns cloud backend ├── try: cloud_backend.delete_object(doc.object_key) │ except: return {"cloud_delete_failed": true, "detail": "..."} ← 207/200 structured └── if cloud delete succeeded (or user confirmed remove_only): └── services/storage.py:_db_only_delete() ← quota skipped for cloud docs PATCH /api/shares/{id} └── api/shares.py:update_share_permission ├── UUID parse (404 on invalid) ├── session.get(Share, sid) ├── assert share.owner_id == current_user.id → 404 on mismatch (IDOR) ├── validate body.permission in {"view", "edit"} └── share.permission = body.permission; commit GET /api/admin/audit-log (with handle enrichment) └── api/audit.py:list_audit_log ├── if user_handle param: resolve handle → UUID, or empty result if not found ├── _build_filtered_query_with_handles(start, end, user_uuid, event_type) │ └── select(AuditLog, user_alias.handle, actor_alias.handle) │ .outerjoin(user_alias, user_alias.id == AuditLog.user_id) │ .outerjoin(actor_alias, actor_alias.id == AuditLog.actor_id) └── _audit_to_dict(entry, user_handle, actor_handle) → dict with both handles GET /api/admin/audit-log/export (CSV with auth) └── StreamingResponse (unchanged backend) ← client: fetch() + res.text() + Blob URL + click download GET /api/admin/audit-log/daily-exports └── asyncio.to_thread(minio_client.list_objects, "audit-logs", prefix="audit-logs/") └── returns [{date: "YYYY-MM-DD", key: "audit-logs/YYYY-MM-DD.csv"}, ...] GET /api/admin/audit-log/daily-exports/{date} └── key = f"audit-logs/{date}.csv" ├── asyncio.to_thread(minio_client.get_object, "audit-logs", key) └── StreamingResponse(iter([csv_bytes]), media_type="text/csv") ``` ### Recommended Project Structure (changes only) ``` backend/ ├── api/ │ ├── shares.py # + PATCH /{id} endpoint; fix ShareCreate.permission │ └── audit.py # + handle JOIN; + 2 daily-export endpoints; fix user filter ├── services/ │ └── storage.py # + cloud routing in delete_document; + _db_only_delete() frontend/src/ ├── api/ │ └── client.js # + adminExportAuditLogCsv(); + adminListDailyExports(); + adminDownloadDailyExport() ├── components/ │ ├── admin/ │ │ └── AuditLogTab.vue # fix exportCsv(); + daily exports date dropdown │ ├── sharing/ │ │ └── ShareModal.vue # + permission dropdown on share; + toggle per row │ └── documents/ │ └── DocumentCard.vue # line 31: share_count → is_shared ``` ### Pattern 1: Cloud-delete routing in services/storage.py The cleanest approach to the signature problem: `delete_document()` gets two new optional parameters `user` and `session_for_cloud`. When `user` is provided and `doc.storage_backend != "minio"`, cloud routing fires. The caller in `api/documents.py` already has both the `current_user` and `session` in scope. **Alternative (preferred for simplicity):** Move the cloud routing entirely into `api/documents.py:delete_document`, before the call to `storage.delete_document()`. The service layer function handles only DB + MinIO quota. The API layer decides routing. This keeps `services/storage.py` free of cloud-backend imports and mirrors how `admin.py:delete_user()` works — the API layer calls `get_storage_backend_for_document()` and the service layer is unaware of cloud backends. ```python # api/documents.py — delete_document (modified) @router.delete("/{doc_id}") async def delete_document( doc_id: str, remove_only: bool = Query(default=False), # D-02: skip cloud delete request: Request, session: AsyncSession = Depends(get_db), current_user: User = Depends(get_regular_user), ): uid = uuid.UUID(doc_id) doc = await session.get(Document, uid) if doc is None or doc.user_id != current_user.id: raise HTTPException(404, "Document not found") cloud_delete_failed = False if doc.storage_backend != "minio" and not remove_only: try: cloud_backend = await get_storage_backend_for_document(doc, current_user, session) await cloud_backend.delete_object(doc.object_key) except Exception as exc: # D-03: structured error; frontend shows warning modal cloud_delete_failed = True # Caller (frontend) decides whether to retry with remove_only=true if cloud_delete_failed and not remove_only: return JSONResponse( status_code=200, content={ "success": False, "cloud_delete_failed": True, "detail": "Cloud provider delete failed. You can remove from app only.", } ) # DB delete + quota decrement (cloud docs skip quota — D-04) ok = await storage.delete_document(session, doc_id, skip_quota=doc.storage_backend != "minio") # ... audit log + commit ``` **Key insight on quota:** Cloud docs already skip quota at upload time (no `UPDATE quotas` in cloud upload path). The `delete_document()` in `services/storage.py` currently always decrements quota. It needs a `skip_quota: bool = False` guard for cloud documents. Source: `[VERIFIED: codebase]` — confirmed by reading `services/storage.py:163-175` and `api/documents.py:269-291`. ### Pattern 2: PATCH /api/shares/{id} — minimal IDOR-safe pattern ```python # api/shares.py — new endpoint class SharePermissionPatch(BaseModel): permission: str # validated: "view" | "edit" @field_validator("permission") @classmethod def validate_permission(cls, v: str) -> str: if v not in {"view", "edit"}: raise ValueError("permission must be 'view' or 'edit'") return v # CRITICAL: must be defined BEFORE DELETE /{share_id} to avoid path conflict @router.patch("/{share_id}", status_code=status.HTTP_200_OK) async def update_share_permission( share_id: str, body: SharePermissionPatch, session: AsyncSession = Depends(get_db), current_user: User = Depends(get_regular_user), ) -> dict: try: sid = uuid.UUID(share_id) except ValueError: raise HTTPException(status_code=404, detail="Share not found") share = await session.get(Share, sid) # IDOR: 404 on mismatch — mirrors DELETE pattern (T-04-04-02) if share is None or share.owner_id != current_user.id: raise HTTPException(status_code=404, detail="Share not found") share.permission = body.permission await session.commit() return {"id": str(share.id), "permission": share.permission} ``` Source: `[VERIFIED: codebase]` — mirrors `revoke_share` IDOR pattern at lines 240-265 in `backend/api/shares.py`. ### Pattern 3: Handle enrichment via SQLAlchemy aliased JOIN ```python # api/audit.py — handle enrichment from sqlalchemy.orm import aliased UserAsSubject = aliased(User) # for user_id FK UserAsActor = aliased(User) # for actor_id FK def _build_filtered_query_with_handles(start, end, user_uuid, event_type): q = ( select(AuditLog, UserAsSubject.handle, UserAsActor.handle) .outerjoin(UserAsSubject, UserAsSubject.id == AuditLog.user_id) .outerjoin(UserAsActor, UserAsActor.id == AuditLog.actor_id) .order_by(AuditLog.created_at.desc()) ) if start: q = q.where(AuditLog.created_at >= start) if end: q = q.where(AuditLog.created_at <= end) if user_uuid: q = q.where(AuditLog.user_id == user_uuid) if event_type: q = q.where(AuditLog.event_type == event_type) return q def _audit_to_dict_with_handles(entry, user_handle, actor_handle) -> dict: d = { "id": entry.id, "event_type": entry.event_type, "user_id": str(entry.user_id) if entry.user_id else None, "actor_id": str(entry.actor_id) if entry.actor_id else None, "user_handle": user_handle or None, "actor_handle": actor_handle or None, "resource_id": str(entry.resource_id) if entry.resource_id else None, "ip_address": str(entry.ip_address) if entry.ip_address else None, "metadata_": entry.metadata_, "created_at": entry.created_at.isoformat(), } return d ``` Result rows from a multi-column select in SQLAlchemy 2.0 async are `Row` tuples — access as `row[0]` (AuditLog), `row[1]` (user_handle), `row[2]` (actor_handle). `result.all()` returns `list[Row]`. Source: `[ASSUMED]` — SQLAlchemy 2.0 aliased JOIN pattern is well-documented but not verified via Context7 in this session. ### Pattern 4: Handle-to-UUID resolution for user filter ```python # In list_audit_log / export_audit_log endpoint handlers: user_uuid: Optional[uuid.UUID] = None if user_handle_param: # new str param replacing Optional[uuid.UUID] handle_result = await session.execute( select(User.id).where(User.handle == user_handle_param) ) uid = handle_result.scalar_one_or_none() if uid is None: # No user with that handle → return empty results (D-12) return {"items": [], "total": 0, "page": page, "per_page": per_page} user_uuid = uid ``` Source: `[VERIFIED: codebase]` — `User.handle` is a unique indexed column (models.py line 52). ### Pattern 5: Fetch + Blob URL download in Vue 3 (CSV export) ```javascript // frontend/src/api/client.js — new function export async function adminExportAuditLogCsv(params = {}) { const { useAuthStore } = await import('../stores/auth.js') const authStore = useAuthStore() const searchParams = new URLSearchParams({ format: 'csv' }) if (params.start) searchParams.set('start', params.start) if (params.end) searchParams.set('end', params.end) if (params.user_handle) searchParams.set('user_handle', params.user_handle) if (params.event_type) searchParams.set('event_type', params.event_type) const headers = {} if (authStore.accessToken) { headers['Authorization'] = `Bearer ${authStore.accessToken}` } const res = await fetch(`/api/admin/audit-log/export?${searchParams}`, { headers, credentials: 'include', }) if (!res.ok) throw new Error(`Export failed: ${res.status}`) const text = await res.text() // CSV is text, not JSON const blob = new Blob([text], { type: 'text/csv' }) const url = URL.createObjectURL(blob) const a = document.createElement('a') a.href = url a.download = 'audit-export.csv' a.click() URL.revokeObjectURL(url) } ``` This is the same pattern as `fetchDocumentContent()` already in `client.js` — raw `fetch()` with Authorization header, NOT using the shared `request()` wrapper (which always calls `res.json()`). Source: `[VERIFIED: codebase]` — `fetchDocumentContent` at lines 399-428 of `client.js` is the exact precedent. ### Pattern 6: MinIO daily export listing (asyncio.to_thread) ```python # api/audit.py — new endpoint @router.get("/audit-log/daily-exports") async def list_daily_exports( session: AsyncSession = Depends(get_db), _admin: User = Depends(get_current_admin), ) -> dict: """List available daily audit export files from MinIO audit-logs bucket.""" from storage import get_storage_backend from storage.minio_backend import MinIOBackend backend = get_storage_backend() if not isinstance(backend, MinIOBackend): return {"items": []} def _list() -> list[dict]: objects = backend._client.list_objects( "audit-logs", prefix="audit-logs/", recursive=False ) items = [] for obj in objects: if obj.object_name and obj.object_name.endswith(".csv"): # key: "audit-logs/2026-05-30.csv" → date: "2026-05-30" filename = obj.object_name.removeprefix("audit-logs/").removesuffix(".csv") items.append({"date": filename, "key": obj.object_name}) items.sort(key=lambda x: x["date"], reverse=True) return items items = await asyncio.to_thread(_list) return {"items": items} ``` `Minio.list_objects()` returns a synchronous iterator; consuming it inside `asyncio.to_thread` blocks the thread (not the event loop). The iterator is lazy — it pages automatically. For fewer than ~1000 objects (years of daily exports) it completes in a single S3 ListObjectsV2 call. `[VERIFIED: codebase]` — confirmed by inspecting Minio SDK `list_objects` signature at runtime; `obj.object_name` and `obj.is_dir` are the key attributes. ### Pattern 7: MinIO daily export streaming download ```python @router.get("/audit-log/daily-exports/{date}") async def download_daily_export( date: str, session: AsyncSession = Depends(get_db), _admin: User = Depends(get_current_admin), ) -> StreamingResponse: import re, asyncio if not re.fullmatch(r"\d{4}-\d{2}-\d{2}", date): raise HTTPException(status_code=404, detail="Invalid date format") from storage import get_storage_backend backend = get_storage_backend() key = f"audit-logs/{date}.csv" def _get() -> bytes: response = backend._client.get_object("audit-logs", key) try: return response.read() finally: response.close() response.release_conn() try: csv_bytes = await asyncio.to_thread(_get) except Exception: raise HTTPException(status_code=404, detail="Export not found") return StreamingResponse( iter([csv_bytes]), media_type="text/csv", headers={"Content-Disposition": f'attachment; filename="audit-{date}.csv"'}, ) ``` Note: The `get_object` call uses `"audit-logs"` bucket (separate from documents bucket), mirroring `audit_tasks.py`. Date path parameter must be validated against `YYYY-MM-DD` regex to prevent path traversal. Source: `[VERIFIED: codebase]` — `MinIOBackend.get_object()` pattern at lines 110-121 of `minio_backend.py`. ### Anti-Patterns to Avoid - **Calling `_backend()` singleton for cloud deletes:** `_backend()` in `services/storage.py` always returns MinIOBackend. Cloud docs must use `get_storage_backend_for_document()`. Using the singleton for cloud docs would silently succeed (MinIO would return no error for a key that doesn't exist there) while leaving the cloud file orphaned. - **Defining `PATCH /{share_id}` AFTER `DELETE /{share_id}`:** FastAPI routes earlier-defined paths first. A `PATCH /{share_id}` defined after `DELETE /{share_id}` on the same router will work correctly for PATCH (different HTTP method). The ordering concern only affects routes with overlapping path patterns (e.g., `/received` vs `/{share_id}` GET routes). For PATCH vs DELETE on `/{share_id}`, method discrimination is unambiguous. The existing comment in `shares.py` confirms `/received` must come before `/{share_id}` for GET routes only. - **Calling `res.json()` for CSV download:** The existing `request()` function in `client.js` always calls `res.json()` and will throw on `text/csv` responses. Any CSV or binary endpoint MUST use a separate function that calls `res.text()` or `res.blob()`. - **Quota decrement on cloud document delete:** Cloud docs never touched quota at upload time — decrementing quota on delete would corrupt the counter. The `skip_quota` guard must be explicit. - **Passing `user_id` as a UUID query param for handle-based filter:** The current endpoint signature accepts `Optional[uuid.UUID]` via Query, which causes FastAPI to 422 on non-UUID strings. Changing to `Optional[str]` and doing handle resolution in the handler body is required. - **Blocking the event loop with Minio iterator:** `list_objects()` is synchronous. Consuming it directly in an async handler without `asyncio.to_thread` would block the FastAPI event loop. --- ## Don't Hand-Roll | Problem | Don't Build | Use Instead | Why | |---------|-------------|-------------|-----| | Cloud backend instantiation | Custom credential decrypt + backend init | `get_storage_backend_for_document(doc, user, session)` | Already handles HKDF decrypt, lazy imports, 503 on inactive connection | | HKDF key derivation | Custom Fernet wrapper | `storage.cloud_utils.decrypt_credentials()` | One implementation, already tested | | Streaming CSV response | Custom write loop | `io.StringIO` + `csv.DictWriter` + `StreamingResponse(iter([output.getvalue()]))` | Exact pattern already in `api/audit.py:export_audit_log` | | Blob download trigger | Custom download UI | `Blob` + `URL.createObjectURL` + `` click | Exact pattern used by `fetchDocumentContent` in `client.js` | | MinIO audit-logs GET | Custom HTTP call | `MinIOBackend._client.get_object("audit-logs", key)` in `asyncio.to_thread` | Already used for doc content; same `response.read()` + `response.close()` pattern | --- ## Common Pitfalls ### Pitfall 1: delete_document() quota decrement for cloud docs **What goes wrong:** The existing `delete_document()` in `services/storage.py` always runs `UPDATE quotas SET used_bytes = ...`. Cloud documents were never counted against quota at upload time (`api/documents.py:upload_cloud()` does not call the confirm endpoint that updates quota). Decrementing quota on cloud doc delete would underflow the counter. **Why it happens:** The service was written for MinIO only. The quota decrement was correct when there were no cloud docs. **How to avoid:** Add `skip_quota: bool = False` parameter. Set to `True` when `doc.storage_backend != "minio"`. Alternatively, perform the DB delete separately for cloud docs without calling `delete_document()` at all — use the admin.py pattern. **Warning signs:** `quotas.used_bytes` becomes negative in testing after cloud doc delete. ### Pitfall 2: Cloud delete success even though the file remains **What goes wrong:** MinIO's `remove_object()` is a no-op for non-existent keys. If cloud backend routing is broken and the call falls through to the MinIO singleton, it will return without error while leaving the cloud file intact. **Why it happens:** MinIO SDK does not raise on missing keys by default. **How to avoid:** Log the `doc.storage_backend` value before the delete call during testing. Assert in tests that cloud backend's `delete_object` mock was called, not MinIO's. ### Pitfall 3: PATCH route ordering / path conflict **What goes wrong:** If a PATCH endpoint at `/{share_id}` is somehow confused with another route. **Why it happens:** FastAPI path routing is order-dependent for same-method same-prefix routes. **How to avoid:** For `PATCH /{share_id}`, no conflict exists with the current `GET /received` special case (different method). Simply add PATCH before DELETE as a matter of style. ### Pitfall 4: AuditLog COUNT query broken after JOIN **What goes wrong:** The count query `select(func.count()).select_from(base_q.subquery())` works when `base_q` selects a single entity. After adding a multi-column JOIN, the subquery shape changes and the count may become a count of tuples rather than a count of AuditLog rows. **Why it happens:** SQLAlchemy 2.0 multi-column select wrapped in a subquery; `func.count()` on an ambiguous subquery. **How to avoid:** Use a separate count query that does NOT join User: `select(func.count(AuditLog.id)).where()`. Keep the count query and the data query separate. ### Pitfall 5: `res.text()` vs `res.blob()` for CSV in Vue **What goes wrong:** Calling `res.json()` on a CSV response raises a JSON parse error. **Why it happens:** The shared `request()` wrapper always calls `res.json()`. **How to avoid:** The CSV download function must be a standalone `fetch()` call (not via `request()`) that calls `res.text()`. This mirrors the existing `fetchDocumentContent` pattern exactly. ### Pitfall 6: Date path parameter traversal in daily-export download **What goes wrong:** `GET /api/admin/audit-log/daily-exports/../../etc/passwd` could construct a malicious MinIO key. **Why it happens:** Path parameters are URL-decoded before reaching the handler. **How to avoid:** Validate `date` against `r"\d{4}-\d{2}-\d{2}"` regex before constructing `f"audit-logs/{date}.csv"`. FastAPI path parameters with slashes are unusual, but hyphens can still be used in injection attempts. ### Pitfall 7: `_audit_to_dict()` called from two places (viewer + export) **What goes wrong:** If only the viewer endpoint is updated to use the new handle-enriched function, the CSV export still emits raw UUIDs. **Why it happens:** The existing `_audit_to_dict()` is shared by both endpoints. Both must be updated to the new function signature. **How to avoid:** Update `_audit_to_dict_with_handles` is used in BOTH `list_audit_log` and `export_audit_log`. Preserve the original `_audit_to_dict` as a fallback only if needed. --- ## Code Examples ### SQLAlchemy 2.0 async aliased double-JOIN (verified pattern) ```python # Source: [VERIFIED: codebase] — mirrors Share→User join at shares.py:155-161 from sqlalchemy.orm import aliased from sqlalchemy import select UserSubject = aliased(User) UserActor = aliased(User) stmt = ( select(AuditLog, UserSubject.handle.label("user_handle"), UserActor.handle.label("actor_handle")) .outerjoin(UserSubject, UserSubject.id == AuditLog.user_id) .outerjoin(UserActor, UserActor.id == AuditLog.actor_id) .order_by(AuditLog.created_at.desc()) ) result = await session.execute(stmt) rows = result.all() for row in rows: entry, user_handle, actor_handle = row ``` ### Existing cloud-backend pattern (admin.py reference) ```python # Source: [VERIFIED: codebase] — admin.py lines 527-539 for doc in cloud_docs: try: backend = await get_storage_backend_for_document(doc, user, session) await backend.delete_object(doc.object_key) except Exception: pass # best-effort; deletion proceeds regardless ``` ### Existing fetch+Blob pattern (client.js reference) ```javascript // Source: [VERIFIED: codebase] — client.js lines 399-428 (fetchDocumentContent) const res = await fetch(`/api/documents/${docId}/content`, { headers: { 'Authorization': `Bearer ${authStore.accessToken}` }, credentials: 'include', }) // For CSV, use res.text() instead of res.blob() const text = await res.text() const blob = new Blob([text], { type: 'text/csv' }) const url = URL.createObjectURL(blob) // ... click + revokeObjectURL ``` ### Existing StreamingResponse for CSV (audit.py reference) ```python # Source: [VERIFIED: codebase] — audit.py lines 158-162 return StreamingResponse( iter([output.getvalue()]), media_type="text/csv", headers={"Content-Disposition": "attachment; filename=audit-export.csv"}, ) ``` --- ## State of the Art | Old Approach | Current Approach | When Changed | Impact | |--------------|------------------|--------------|--------| | `window.location.href` for auth downloads | `fetch()` + Blob URL | This phase | Enables Bearer token on every request | | `user_id` UUID filter in audit log | `user_handle` string filter | This phase | Human-friendly; no UUID knowledge required | | Audit log shows raw UUIDs | Shows `user_handle` / `actor_handle` | This phase | Admin can read who did what without cross-referencing user list | | Cloud delete only removes DB record | Cloud delete propagates to provider | This phase | CLOUD provider storage reclaimed; no orphaned files | --- ## Assumptions Log | # | Claim | Section | Risk if Wrong | |---|-------|---------|---------------| | A1 | SQLAlchemy 2.0 multi-column aliased JOIN returns `Row` tuples accessible as `row[0]`, `row[1]`, `row[2]` | Architecture Patterns §3 | Query rows parsed incorrectly; `_audit_to_dict_with_handles` receives wrong arguments | | A2 | `Minio.list_objects()` iterator is safe to consume inside `asyncio.to_thread` without additional threading concerns | Architecture Patterns §6 | If the SDK has internal async state, `to_thread` could cause issues — unlikely given SDK is sync-only | | A3 | `is_dir` objects from `list_objects` can be filtered by checking `obj.object_name.endswith(".csv")` | Architecture Patterns §6 | Directory marker objects (is_dir=True) have object_name ending in "/"; `.endswith(".csv")` check correctly excludes them | **If this table is empty:** All claims in this research were verified or cited — no user confirmation needed. --- ## Open Questions (RESOLVED) 1. **Cloud delete partial failure: HTTP status code choice** - What we know: D-03 says return a structured error the frontend can distinguish from hard failure. - What's unclear: Should the endpoint return HTTP 200 with `{"cloud_delete_failed": true}` or a specific non-error HTTP status like 207 (Multi-Status)? - Recommendation: Return HTTP 200 with `{"success": false, "cloud_delete_failed": true, "detail": "..."}`. The frontend checks `data.cloud_delete_failed` to show the warning modal. HTTP 207 is unusual for REST APIs and harder for `client.js` to handle given it calls `res.ok` check. 200 with a distinguishing body flag is simpler and consistent with the existing pattern where `ok = False` from `storage.delete_document` returns 404 — this case is distinct (partial success). 2. **Audit export query param rename: `user_id` → `user_handle`** - What we know: The current endpoint accepts `user_id: Optional[uuid.UUID]` via Query. Changing this to `Optional[str]` changes the public API. - What's unclear: Existing tests pass a UUID string as `user_id`; if the param is renamed to `user_handle`, tests need updating. - Recommendation: Keep `user_id` as the query param name but change its type to `Optional[str]`. If the value parses as a valid UUID, treat it as a direct UUID filter (backward compat). Otherwise treat as a handle. This avoids breaking any existing automation. Or, add `user_handle` as a new param alongside `user_id` and deprecate `user_id`. The decision affects the existing `test_audit.py` tests. --- ## Environment Availability No external tools or services beyond what the project already uses. All dependencies are present. | Dependency | Required By | Available | Version | Fallback | |------------|------------|-----------|---------|----------| | Minio Python SDK | Daily export listing/streaming | Yes | current | — | | FastAPI | PATCH endpoint, query params | Yes | 0.136+ | — | | SQLAlchemy async | Handle JOIN | Yes | 2.x | — | | Vue 3 / Pinia | Frontend changes | Yes | 3.x | — | --- ## Validation Architecture ### Test Framework | Property | Value | |----------|-------| | Framework | pytest + pytest-asyncio (backend); Vitest 4.1.7 (frontend) | | Config file | `backend/pytest.ini` or `backend/pyproject.toml` | | Quick run command | `cd backend && pytest tests/test_shares.py tests/test_audit.py -x -q` | | Full suite command | `cd backend && pytest -v` | Current baseline: **310 passed, 1 pre-existing failure** (`test_extractor.py::test_extract_docx` — unrelated ModuleNotFoundError), 5 skipped, 10 xfailed. ### Phase Requirements to Test Map | Req ID | Behavior | Test Type | Automated Command | File Exists? | |--------|----------|-----------|-------------------|-------------| | SHARE-03 | POST /api/shares respects `permission` field | integration | `pytest tests/test_shares.py::test_share_create_with_permission -x` | Wave 0 gap | | SHARE-03 | PATCH /api/shares/{id} changes permission | integration | `pytest tests/test_shares.py::test_share_patch_permission -x` | Wave 0 gap | | SHARE-03 | PATCH /api/shares/{id} wrong owner → 404 | integration | `pytest tests/test_shares.py::test_share_patch_idor -x` | Wave 0 gap | | SHARE-05 | `is_shared` field (not `share_count`) drives badge | unit/integration | `pytest tests/test_shares.py::test_share_indicator_in_owner_list -x` | Exists (passes) | | CLOUD-del | delete_document routes to cloud backend | integration (mock) | `pytest tests/test_documents.py::test_delete_cloud_document_propagates -x` | Wave 0 gap | | CLOUD-del | Cloud delete failure returns structured error | integration (mock) | `pytest tests/test_documents.py::test_delete_cloud_document_failure -x` | Wave 0 gap | | CLOUD-del | remove_only=true skips cloud, removes DB | integration (mock) | `pytest tests/test_documents.py::test_delete_cloud_remove_only -x` | Wave 0 gap | | ADMIN-06 | Audit log response includes user_handle | integration | `pytest tests/test_audit.py::test_audit_log_includes_user_handle -x` | Wave 0 gap | | ADMIN-06 | user_handle filter resolves to correct entries | integration | `pytest tests/test_audit.py::test_audit_log_filter_by_handle -x` | Wave 0 gap | | ADMIN-06 | unknown handle filter returns empty | integration | `pytest tests/test_audit.py::test_audit_log_filter_unknown_handle -x` | Wave 0 gap | | ADMIN-06 | Daily exports list endpoint returns keys | integration (mock) | `pytest tests/test_audit.py::test_daily_exports_list -x` | Wave 0 gap | | ADMIN-06 | Daily export download returns CSV bytes | integration (mock) | `pytest tests/test_audit.py::test_daily_export_download -x` | Wave 0 gap | ### Sampling Rate - **Per task commit:** `cd backend && pytest tests/test_shares.py tests/test_audit.py tests/test_documents.py -x -q` - **Per wave merge:** `cd backend && pytest -v` - **Phase gate:** Full suite green (excluding pre-existing `test_extract_docx` failure) before `/gsd:verify-work` ### Wave 0 Gaps - [ ] `tests/test_shares.py::test_share_create_with_permission` — covers SHARE-03 POST permission field - [ ] `tests/test_shares.py::test_share_patch_permission` — covers SHARE-03 PATCH endpoint - [ ] `tests/test_shares.py::test_share_patch_idor` — covers IDOR security invariant on PATCH - [ ] `tests/test_documents.py::test_delete_cloud_document_propagates` — covers cloud delete routing - [ ] `tests/test_documents.py::test_delete_cloud_document_failure` — covers D-03 structured error response - [ ] `tests/test_documents.py::test_delete_cloud_remove_only` — covers D-02 remove_only path - [ ] `tests/test_audit.py::test_audit_log_includes_user_handle` — covers D-11 handle enrichment - [ ] `tests/test_audit.py::test_audit_log_filter_by_handle` — covers D-12 handle filter - [ ] `tests/test_audit.py::test_audit_log_filter_unknown_handle` — covers D-12 empty result - [ ] `tests/test_audit.py::test_daily_exports_list` — covers D-15 listing endpoint - [ ] `tests/test_audit.py::test_daily_export_download` — covers D-16 streaming endpoint --- ## Security Domain ### Applicable ASVS Categories | ASVS Category | Applies | Standard Control | |---------------|---------|-----------------| | V4 Access Control | Yes | `share.owner_id == current_user.id` on PATCH (IDOR, 404 on mismatch) | | V4 Access Control | Yes | `get_current_admin` on all audit-log endpoints including new daily-export ones | | V5 Input Validation | Yes | `permission` enum validated in Pydantic; `date` regex-validated before MinIO key construction | | V13 API | Yes | No new public endpoints; all new endpoints gated by existing auth deps | ### Known Threat Patterns | Pattern | STRIDE | Standard Mitigation | |---------|--------|---------------------| | IDOR on PATCH /shares/{id} | Elevation of Privilege | `share.owner_id == current_user.id` → 404 on mismatch (T-04-04-02 pattern) | | Path traversal in daily export date param | Tampering | `re.fullmatch(r"\d{4}-\d{2}-\d{2}", date)` before key construction | | Token bypass via window.location.href | Info Disclosure | Replace with `fetch()` + `Authorization` header (D-13) | | Cloud credential exposure in error response | Info Disclosure | Cloud delete exception caught generically; `str(exc)` must NOT include credential data — catch as `Exception`, log internally, return generic message | | Quota underflow on cloud delete | Tampering | `skip_quota=True` for cloud documents; cloud docs never had quota charged | | Mass assignment on SharePermissionPatch | Tampering | Minimal Pydantic model with explicit `permission` field only — no `**kwargs` passthrough | --- ## Sources ### Primary (HIGH confidence) - `[VERIFIED: codebase]` — `backend/services/storage.py` lines 143-179 — `delete_document()` implementation; quota decrement pattern; MinIO-only delete - `[VERIFIED: codebase]` — `backend/storage/__init__.py` — `get_storage_backend_for_document()` factory; cloud credential routing - `[VERIFIED: codebase]` — `backend/api/admin.py` lines 527-539 — canonical cloud delete pattern (per-doc) to follow - `[VERIFIED: codebase]` — `backend/api/shares.py` — existing IDOR pattern; `permission="view"` hardcode at line 97; route ordering constraint - `[VERIFIED: codebase]` — `backend/api/audit.py` — `_audit_to_dict()` pure function; `_build_filtered_query()`; both endpoints - `[VERIFIED: codebase]` — `backend/tasks/audit_tasks.py` — key pattern `audit-logs/{date}.csv`; `put_object_raw` usage - `[VERIFIED: codebase]` — `frontend/src/api/client.js` lines 399-428 — `fetchDocumentContent` as fetch+Blob precedent - `[VERIFIED: codebase]` — `frontend/src/components/admin/AuditLogTab.vue` lines 185-191 — broken `window.location.href` export - `[VERIFIED: codebase]` — `frontend/src/components/documents/DocumentCard.vue` line 31 — `share_count > 0` bug - `[VERIFIED: npm registry]` — Minio Python SDK `list_objects` signature confirmed at runtime; `Object.object_name`, `Object.is_dir` attributes verified ### Secondary (MEDIUM confidence) - `[ASSUMED]` — SQLAlchemy 2.0 aliased double-JOIN returns `Row` tuples with positional access — documented in SA2.0 changelog but not verified via Context7 in this session --- ## Metadata **Confidence breakdown:** - Standard stack: HIGH — all packages already in use; no new dependencies - Architecture: HIGH — all patterns drawn from existing codebase code; no external documentation required - Pitfalls: HIGH — each pitfall is backed by specific code paths read directly from the source files - Test gaps: HIGH — based on reading existing test files and counting what is not yet covered **Research date:** 2026-05-31 **Valid until:** 2026-07-01 (stable brownfield — no fast-moving dependencies)