--- phase: 04-folders-sharing-quotas-document-ux plan: 03 type: execute wave: 2 depends_on: - "04-01" - "04-02" files_modified: - backend/services/audit.py - backend/api/folders.py - backend/api/documents.py - backend/main.py autonomous: true requirements: - FOLD-01 - FOLD-02 - FOLD-03 - FOLD-04 - FOLD-05 must_haves: truths: - "User can create, rename, delete, and list folders via REST API" - "Deleting a non-empty folder cascade-deletes all documents (MinIO + DB) and decrements quota atomically" - "GET /api/folders/{id} returns a breadcrumb array from root to the requested folder" - "Document list supports sort by name, date, size via ?sort= query param" - "Full-text search via ?q= uses plainto_tsquery on documents.extracted_text GIN index" - "write_audit_log() helper is available for all Phase 4 handlers" - "Duplicate folder name under same parent returns 409" artifacts: - path: "backend/services/audit.py" provides: "write_audit_log() async helper — flush-not-commit, never-raises" exports: ["write_audit_log"] - path: "backend/api/folders.py" provides: "FOLD-01..05 endpoints: POST /api/folders, GET /api/folders, GET /api/folders/{id}, PATCH /api/folders/{id}, DELETE /api/folders/{id}, PATCH /api/documents/{id}/folder" - path: "backend/main.py" provides: "folders router registered; all Phase 4 routers wired" key_links: - from: "backend/api/folders.py" to: "backend/services/audit.py" via: "write_audit_log() call after successful folder operations" pattern: "write_audit_log" - from: "backend/api/folders.py" to: "backend/db/models.py" via: "Folder, Document, Quota ORM models" pattern: "from db.models import Folder" - from: "backend/api/documents.py" to: "backend/api/folders.py" via: "PATCH /api/documents/{id}/folder uses same ownership assertion as documents.py" pattern: "get_regular_user" --- Implement the audit service helper and all folder backend endpoints (FOLD-01..05). This plan introduces write_audit_log() which subsequent plans (shares, proxy, audit viewer) depend on. Purpose: Deliver the complete folder CRUD and document organization API. Output: backend/services/audit.py + backend/api/folders.py + main.py router registration. @$HOME/.claude/get-shit-done/workflows/execute-plan.md @$HOME/.claude/get-shit-done/templates/summary.md @.planning/phases/04-folders-sharing-quotas-document-ux/04-CONTEXT.md @.planning/phases/04-folders-sharing-quotas-document-ux/04-PATTERNS.md @.planning/phases/04-folders-sharing-quotas-document-ux/04-RESEARCH.md @backend/api/documents.py @backend/db/models.py @backend/deps/auth.py @backend/deps/db.py Task 1: Create backend/services/audit.py — write_audit_log() helper backend/services/audit.py backend/services/storage.py — read the entire file; extract the import pattern for async service modules and how AsyncSession is used; note the module-level logger pattern backend/db/models.py — search for the AuditLog class definition; read it fully to confirm attribute names: event_type, user_id, actor_id, resource_id, ip_address, metadata_ (ORM attribute; DB column name is "metadata" per CLAUDE.md note) - write_audit_log() called with all fields: session, event_type (str), user_id (Optional[uuid.UUID]), actor_id (Optional[uuid.UUID]), resource_id (Optional[uuid.UUID]), ip_address (Optional[str]), metadata_ (Optional[dict]) - Function creates AuditLog ORM instance, adds to session, calls session.flush() (NOT commit) - Function NEVER raises — exception caught, logged as warning, then swallowed - AuditLog.metadata_ ORM attribute (not "metadata") used per CLAUDE.md note about reserved SQLAlchemy attribute name Create backend/services/audit.py. Imports: `from __future__ import annotations`, `import logging`, `import uuid`, `from typing import Optional`, `from sqlalchemy.ext.asyncio import AsyncSession`, `from db.models import AuditLog`. Module-level: `logger = logging.getLogger(__name__)`. Implement async def write_audit_log with signature: `(session: AsyncSession, event_type: str, user_id: Optional[uuid.UUID], actor_id: Optional[uuid.UUID], resource_id: Optional[uuid.UUID], ip_address: Optional[str], metadata_: Optional[dict] = None) -> None`. Body: try block that creates `AuditLog(event_type=event_type, user_id=user_id, actor_id=actor_id, resource_id=resource_id, ip_address=ip_address, metadata_=metadata_)`, calls `session.add(entry)`, then `await session.flush()`. Except block: `except Exception as exc: logger.warning("audit log write failed: %s", exc)`. No re-raise. Critical: use `session.flush()` not `session.commit()` — this is a hard architectural requirement (D-14). cd /Users/nik/Documents/Progamming/document_scanner/backend && python -c "from services.audit import write_audit_log; import inspect; sig = inspect.signature(write_audit_log); print(list(sig.parameters.keys()))" - backend/services/audit.py exists and is importable - write_audit_log is an async function (grep: `async def write_audit_log`) - Function uses `await session.flush()` not `await session.commit()` (grep: `session.flush` present; `session.commit` absent from this file) - Function has a bare `except Exception` that logs and does NOT re-raise (grep: `logger.warning` inside except block) - `python -c "from services.audit import write_audit_log"` exits 0 write_audit_log() is importable; uses flush-not-commit; never raises. Task 2: Create backend/api/folders.py — all FOLD-01..05 endpoints backend/api/folders.py, backend/main.py backend/api/documents.py — read the entire file; extract the import block, UUID parse pattern (try/except ValueError → 404), ownership assertion pattern (resource is None or resource.user_id != current_user.id → 404), atomic quota UPDATE text(), and the list_documents handler signature for the sort/search extension backend/main.py — read the entire file; identify the include_router section to know where to add the folders router backend/db/models.py — read the Folder class and Document class fully to confirm all column names (parent_id, folder_id FK on Document) POST /api/folders: - Body: {name: str, parent_id: Optional[str] = null} - Auth: get_regular_user (403 for admin per CLAUDE.md) - If parent_id provided: assert parent folder exists and belongs to current_user (→ 404 if not) - Create Folder(user_id=current_user.id, name=name, parent_id=parsed_parent_uuid or None) - Catch IntegrityError → 409 "A folder with that name already exists here" (Pitfall 6) - Call write_audit_log after commit: event_type="folder.created", resource_id=folder.id, metadata_={"name": folder.name} - Return 201 with folder JSON GET /api/folders (list top-level): returns folders where user_id=current_user.id and parent_id IS NULL GET /api/folders/{id}: returns folder JSON + breadcrumb array - Breadcrumb: iterative Python walk up parent chain (not WITH RECURSIVE — avoids SQLite incompatibility in unit tests) - Walk: start from folder, load parent via session.get(Folder, folder.parent_id), repeat until parent_id is None - Breadcrumb array: [{id: str, name: str}, ...] from root to current folder (root first) - Response: {id, name, parent_id, user_id, created_at, breadcrumb: [...]} PATCH /api/folders/{id}: rename folder - Body: {name: str} - Assert folder.user_id == current_user.id → 404 if not - Catch IntegrityError → 409 on duplicate name - Call write_audit_log: event_type="folder.renamed" DELETE /api/folders/{id}: delete folder - Assert folder.user_id == current_user.id → 404 if not - Collect all documents in this folder only (direct children — parent is not responsible for sub-folder docs via cascade; D-03 says cascade-delete but the UI only deletes the selected folder tree, not sub-folders of sub-folders; use WITH RECURSIVE via sqlalchemy text() for PostgreSQL; in test environment without PostgreSQL the recursive CTE will be skipped via a try/except fallback to direct children only) - Use WITH RECURSIVE CTE (via text()) to find all folder IDs in subtree (Pitfall 1) — wrap in try/except OperationalError for SQLite test compat, fallback collects only direct children - Sum size_bytes of all collected documents - Atomic quota decrement with CASE WHEN pattern - Delete MinIO objects best-effort (try/except per each object, per PATTERNS.md Pattern 2) - Delete all documents via ORM, then delete folder via session.delete(folder) - Call write_audit_log: event_type="folder.deleted", metadata_={"doc_count": len(docs), "name": folder.name} - Return 204 PATCH /api/documents/{id}/folder (move document): - Body: {folder_id: Optional[str]} — null = move to root (no folder) - Assert doc.user_id == current_user.id → 404 if not - If folder_id provided: assert target folder.user_id == current_user.id → 404 if not (Pitfall from CONTEXT.md) - Update doc.folder_id and commit - Return 200 with updated doc JSON Extension to GET /api/documents list endpoint (in documents.py, not folders.py): - Add query params: sort (str, default "date"), order (str, default "desc"), folder_id (Optional[str]), q (Optional[str]) - sort=name → order_by(Document.filename), sort=size → order_by(Document.size_bytes), sort=date → order_by(Document.created_at) - order=asc → .asc(), order=desc → .desc() - folder_id provided → add .where(Document.folder_id == parsed_uuid) - q provided and len >= 2: add FTS where clause using func.to_tsvector("english", func.coalesce(Document.extracted_text, "")).op("@@")(func.plainto_tsquery("english", q)) — import sqlalchemy.func - Add is_shared subquery: select(Share.document_id).where(Share.owner_id == current_user.id).scalar_subquery(); add Document.id.in_(shared_subq).label("is_shared") to select Create backend/api/folders.py with APIRouter(prefix="/api/folders", tags=["folders"]). Imports: `from __future__ import annotations`, `import uuid`, `from typing import Optional`, `from fastapi import APIRouter, Depends, HTTPException, Query, Request, status`, `from pydantic import BaseModel`, `from sqlalchemy import select, text, delete, func`, `from sqlalchemy.exc import IntegrityError, OperationalError`, `from sqlalchemy.ext.asyncio import AsyncSession`, `from db.models import Document, Folder, Quota, Share`, `from deps.auth import get_regular_user`, `from deps.db import get_db`, `from storage import get_storage_backend`, `from services.audit import write_audit_log`. Pydantic models: FolderCreate(name: str, parent_id: Optional[str] = None), FolderRename(name: str), DocumentMove(folder_id: Optional[str] = None). Implement all five endpoints as specified in the behavior block. For the WITH RECURSIVE CTE, use: `await session.execute(text("WITH RECURSIVE subtree AS (SELECT id FROM folders WHERE id = :root_id AND user_id = :uid UNION ALL SELECT f.id FROM folders f JOIN subtree s ON f.parent_id = s.id WHERE f.user_id = :uid) SELECT id FROM subtree"), {"root_id": str(folder.id), "uid": str(current_user.id)})` Wrap in try/except OperationalError and fallback to `select(Document).where(Document.folder_id == folder.id, Document.user_id == current_user.id)`. For the ip_address extraction pattern (per Pitfall 5): `ip_address = request.headers.get("X-Forwarded-For") or (request.client.host if request.client else None)`. After creating backend/api/folders.py, modify backend/main.py: add `from api.folders import router as folders_router` and `app.include_router(folders_router)` in the routes section alongside the other routers. Also extend backend/api/documents.py: add sort, order, folder_id, q query params to the list_documents handler. Add import for func and Share. Add the is_shared subquery to the select. Add FTS where clause when q is present. Do NOT break existing tests — these are additive changes. cd /Users/nik/Documents/Progamming/document_scanner/backend && python -m pytest tests/test_folders.py -x -v --no-header 2>&1 | tail -30 - backend/api/folders.py exists with all five endpoint functions (grep: `async def create_folder`, `async def list_folders`, `async def get_folder`, `async def rename_folder`, `async def delete_folder`) - backend/api/folders.py contains PATCH /api/documents/{id}/folder endpoint (grep: `move_document` or equivalent) - backend/main.py includes folders router (grep: `folders_router`) - `python -c "from api.folders import router"` exits 0 - All folder ownership assertions use 404 not 403 (grep: `status_code=404` in folders.py; no `status_code=403`) - IntegrityError caught and returns 409 (grep: `IntegrityError` and `409` in folders.py) - write_audit_log called after folder create, rename, delete (grep: `write_audit_log` appears at least 3 times) - test_create_folder and test_rename_folder turn green (xpass) or remain xfail — no FAILED - `cd backend && python -m pytest tests/ -x --no-header 2>&1 | grep -E "^FAILED"` returns nothing Folders API endpoints are implemented; document list has sort/search; all existing tests still pass. ## Trust Boundaries | Boundary | Description | |----------|-------------| | Client → POST /api/folders | Untrusted name/parent_id input; Pydantic validates types; IntegrityError → 409 | | Client → DELETE /api/folders/{id} | Untrusted folder_id path parameter; UUID parse → 404; ownership assertion → 404 | | Client → GET /api/documents?q= | Untrusted search query; plainto_tsquery is injection-safe (parameterized via SQLAlchemy func) | | Client → PATCH /api/documents/{id}/folder | Untrusted doc_id + folder_id; both validated via ownership assertion | ## STRIDE Threat Register | Threat ID | Category | Component | Disposition | Mitigation Plan | |-----------|----------|-----------|-------------|-----------------| | T-04-03-01 | Elevation of Privilege | POST/PATCH/DELETE /api/folders | mitigate | get_regular_user dep: admin role returns 403; regular user passes through | | T-04-03-02 | Information Disclosure | GET /api/documents?q= (FTS scope) | mitigate | FTS query MUST include Document.user_id == current_user.id filter; plainto_tsquery is parameterized (no injection) | | T-04-03-03 | Tampering | DELETE /api/folders/{id} cascade-delete | mitigate | Ownership assertion on folder before cascade; atomic quota decrement (CASE WHEN); MinIO deletion is best-effort (does not abort primary operation on failure) | | T-04-03-04 | Information Disclosure | folder IDOR via path parameter | mitigate | All folder endpoints assert folder.user_id == current_user.id → 404 (not 403 — prevents attacker enumeration of folder IDs) | | T-04-03-05 | Information Disclosure | PATCH /api/documents/{id}/folder cross-user target folder | mitigate | Both document ownership AND target folder ownership asserted → 404 on mismatch | | T-04-03-06 | Tampering | folder name UniqueConstraint violation surfaced as 500 | mitigate | IntegrityError caught → 409 Conflict with clear error message (Pitfall 6) | | T-04-SC | Tampering | npm/pip/cargo installs | accept | No new packages installed in this plan | 1. Endpoint smoke: `cd backend && python -m pytest tests/test_folders.py -v --no-header 2>&1 | tail -30` 2. Full suite: `cd backend && python -m pytest -v --no-header 2>&1 | grep -E "FAILED|ERROR"` — expect empty 3. Import check: `cd backend && python -c "from api.folders import router; from services.audit import write_audit_log; print('OK')"` 4. Security grep: `grep -n "get_regular_user\|status_code=404\|IntegrityError\|write_audit_log\|plainto_tsquery" backend/api/folders.py` - write_audit_log() is importable from services.audit; uses session.flush(); never raises - All five folder endpoints exist in api/folders.py; router registered in main.py - Document list endpoint supports sort, folder_id, q params; is_shared field returned - test_folders.py tests turn from xfail to xpass (green) or remain xfail — zero FAILED - Full pytest suite is green Create `.planning/phases/04-folders-sharing-quotas-document-ux/04-03-SUMMARY.md` when done.