# Phase 3: Document Migration & Multi-User Isolation - Context

Gathered: 2026-05-23
Status: Ready for planning

## Phase Boundary

Enforce per-user ownership on all documents: make documents.user_id NOT NULL (Phase 1 D-03 deferred to here), add get_current_user guards to all /api/documents/* endpoints (Phase 2 D-07 deferred to here), implement presigned PUT URL upload flow, enforce atomic quota on upload and delete, wire per-user AI classification config from DB, and retire the flat-file settings system. Existing document UI continues to work — updated to use the new two-step upload flow.

This phase does NOT include folder navigation, sharing, or PDF preview (Phase 4). It does NOT include cloud storage backends (Phase 5). The quota bar frontend component is included (STORE-04 is scoped here per REQUIREMENTS.md traceability).

STORE-08 (Celery+Redis) was completed in Phase 1 — no work needed.

## Implementation Decisions

### Null-User Record Cleanup

  • D-01: All documents with user_id=NULL are deleted (both DB rows and their MinIO objects) before the NOT NULL constraint is added. These are dev/test data only — consistent with Phase 1 D-04 which deleted flat-file test data with the same reasoning. Zero production data loss.
  • D-02: Cleanup is baked into the Alembic migration's upgrade() function — the migration first deletes all null-user Document rows (and calls the storage backend to delete corresponding MinIO objects), then adds the NOT NULL constraint to documents.user_id. One command, atomic flow.
  • D-03: After null-user cleanup, reconcile quota used_bytes from actual document data: UPDATE quotas SET used_bytes = (SELECT COALESCE(SUM(size_bytes), 0) FROM documents WHERE documents.user_id = quotas.user_id). Phase 3 starts with accurate quota state for all users.
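The cleanup-then-reconcile order in D-01 to D-03 can be sketched against an in-memory SQLite database. This is only an illustration of the SQL, not the Alembic migration itself: the `storage_key` column and the `delete_object` callback are hypothetical names, and the NOT NULL constraint step is left out because it depends on the real schema and dialect.

```python
import sqlite3


def cleanup_and_reconcile(conn, delete_object):
    """Delete null-user documents, then recompute quotas.used_bytes."""
    # Collect storage keys first so the objects can be removed with the rows.
    orphan_keys = [
        row[0]
        for row in conn.execute(
            "SELECT storage_key FROM documents WHERE user_id IS NULL"
        )
    ]
    for key in orphan_keys:
        delete_object(key)  # hypothetical hook into the storage backend
    conn.execute("DELETE FROM documents WHERE user_id IS NULL")
    # Reconciliation statement from D-03, unchanged.
    conn.execute(
        "UPDATE quotas SET used_bytes = ("
        " SELECT COALESCE(SUM(size_bytes), 0) FROM documents"
        " WHERE documents.user_id = quotas.user_id)"
    )
    return orphan_keys


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE documents (
        id INTEGER PRIMARY KEY, user_id INTEGER,
        size_bytes INTEGER, storage_key TEXT
    );
    CREATE TABLE quotas (
        user_id INTEGER PRIMARY KEY, used_bytes INTEGER, limit_bytes INTEGER
    );
    INSERT INTO documents VALUES
        (1, 1, 100, '1/a'), (2, 1, 50, '1/b'), (3, NULL, 999, 'orphan');
    INSERT INTO quotas VALUES (1, 7, 1000), (2, 42, 1000);
    """
)

deleted = []
cleanup_and_reconcile(conn, deleted.append)
quotas = dict(conn.execute("SELECT user_id, used_bytes FROM quotas"))
```

Here user 1 ends up with the sum of their two remaining documents, and user 2, who owns no documents, is reset to zero through the `COALESCE`.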

### Presigned Upload Flow

  • D-04: Phase 3 implements direct-to-MinIO presigned PUT uploads per CLAUDE.md architectural rule ("bytes never pass through the API layer"). The existing multipart POST-to-FastAPI upload endpoint is replaced.
  • D-05: Two-step upload flow (two API calls, upload-url and confirm, with a direct PUT to MinIO between them):
    • Step 1 — POST /api/documents/upload-url: FastAPI creates a Document row (status='pending'), generates a presigned PUT URL (15-min TTL), returns {upload_url, document_id}. Quota is NOT reserved at this step.
    • Step 2 — Frontend PUTs bytes directly to MinIO using the presigned URL.
    • Step 3 — POST /api/documents/{id}/confirm: FastAPI retrieves file size from MinIO stat (authoritative), runs atomic quota UPDATE, updates Document row (status='uploaded'), and enqueues extract_and_classify.delay(document_id).
  • D-06: Abandoned uploads (presigned URL fetched but /confirm never called): Celery beat periodic task deletes Document rows older than 1 hour with status='pending' and their MinIO objects. Quota is never reserved for pending rows — no cleanup of quota needed.
  • D-07: Quota is enforced atomically at the /confirm step using the file size retrieved from MinIO stat (not client-supplied). The atomic SQL pattern (from CLAUDE.md) applies: UPDATE quotas SET used_bytes = used_bytes + $delta WHERE (used_bytes + $delta) <= limit_bytes RETURNING used_bytes. A 413 response is returned if the UPDATE returns no rows (quota exceeded). Document delete atomically decrements: UPDATE quotas SET used_bytes = GREATEST(0, used_bytes - $delta).
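The flow in D-05 to D-07 can be walked through with an in-memory stand-in for MinIO. This is a rough sketch under stated assumptions: `FakeStorage`, `Doc`, `request_upload_url` and `confirm_upload` are hypothetical names, the quota check is written as plain Python rather than the atomic SQL UPDATE, and enqueueing `extract_and_classify` is omitted.

```python
import uuid
from dataclasses import dataclass

PRESIGNED_TTL_SECONDS = 15 * 60  # 15-minute TTL from D-05


class QuotaExceeded(Exception):
    """Would map to an HTTP 413 response in the real API."""


@dataclass
class Doc:
    id: str
    user_id: int
    key: str
    status: str = "pending"
    size_bytes: int = 0


class FakeStorage:
    """In-memory stand-in for MinIO (hypothetical; not the MinIO SDK)."""

    def __init__(self):
        self.objects = {}

    def presigned_put_url(self, key, ttl_seconds):
        return f"https://minio.invalid/{key}?ttl={ttl_seconds}"

    def stat_size(self, key):
        return len(self.objects[key])


def request_upload_url(docs, storage, user_id, ext):
    """Step 1: create a pending row and hand back a presigned PUT URL."""
    doc_id = str(uuid.uuid4())
    key = f"{user_id}/{doc_id}/{uuid.uuid4()}{ext}"
    docs[doc_id] = Doc(id=doc_id, user_id=user_id, key=key)
    url = storage.presigned_put_url(key, PRESIGNED_TTL_SECONDS)
    return {"upload_url": url, "document_id": doc_id}  # no quota reserved yet


def confirm_upload(docs, storage, quota, doc_id):
    """Step 3: read the size from storage, then charge the quota."""
    doc = docs[doc_id]
    size = storage.stat_size(doc.key)  # authoritative, not client-supplied
    # The real system does this check-and-add as ONE SQL UPDATE (D-07);
    # the two Python statements below are not safe under concurrency.
    if quota["used_bytes"] + size > quota["limit_bytes"]:
        raise QuotaExceeded(size)
    quota["used_bytes"] += size
    doc.size_bytes, doc.status = size, "uploaded"
    return doc


docs, storage, quota = {}, FakeStorage(), {"used_bytes": 0, "limit_bytes": 10}

first = request_upload_url(docs, storage, user_id=7, ext=".pdf")
storage.objects[docs[first["document_id"]].key] = b"12345678"  # step 2: browser PUT
confirmed = confirm_upload(docs, storage, quota, first["document_id"])

second = request_upload_url(docs, storage, user_id=7, ext=".pdf")
storage.objects[docs[second["document_id"]].key] = b"12345678"
try:
    confirm_upload(docs, storage, quota, second["document_id"])
    rejected = False
except QuotaExceeded:
    rejected = True
```

The second upload is rejected at confirm time and its row stays `pending`, which is the state the abandoned-upload cleanup in D-06 is meant to sweep up.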

### Topics Isolation Model

  • D-08: Layered topic namespace: system topics (user_id=NULL) are visible to all users as defaults; per-user topics (user_id=current_user.id) are visible only to that user. A user's topic list is the union of system topics + their own topics.
  • D-09: Only admin can create, edit, and delete system topics via a new POST /api/admin/topics endpoint. Regular users can only CRUD their own per-user topics via /api/topics/* (now auth-gated with get_current_user).
  • D-10: All existing topics in the DB (currently user_id=NULL from Phase 1/2 test sessions) are deleted in Phase 3 migration — consistent with null-user document cleanup. Admin seeds system topics fresh post-Phase 3.
  • D-11: AI classification receives system topics + user's own topics as the existing-topics input. New AI-suggested topics are created in the user's namespace (user_id=current_user.id), not as system topics.
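The layered namespace in D-08 amounts to a single filter. The sketch below runs it on an in-memory SQLite table; the table layout and topic names are made up for illustration. It also includes a counter-example showing that an `IN` list containing `NULL` does not match NULL rows, so the `OR ... IS NULL` form is the one that returns system topics.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE topics (id INTEGER PRIMARY KEY, name TEXT, user_id INTEGER);
    CREATE INDEX ix_topics_user_id ON topics (user_id);
    INSERT INTO topics (name, user_id) VALUES
        ('Invoices', NULL), ('Contracts', NULL), ('Recipes', 1), ('Taxes', 2);
    """
)


def visible_topics(conn, user_id):
    """System topics (user_id IS NULL) plus the caller's own topics."""
    rows = conn.execute(
        "SELECT name FROM topics"
        " WHERE user_id = ? OR user_id IS NULL ORDER BY name",
        (user_id,),
    )
    return [row[0] for row in rows]


def in_list_topics(conn, user_id):
    """Counter-example: NULL inside an IN list never compares equal."""
    rows = conn.execute(
        "SELECT name FROM topics WHERE user_id IN (?, NULL) ORDER BY name",
        (user_id,),
    )
    return [row[0] for row in rows]


user1_topics = visible_topics(conn, 1)
user1_in_list = in_list_topics(conn, 1)
```

With this data, `visible_topics(conn, 1)` returns the two system topics plus `Recipes`, while the `IN` variant returns only `Recipes`.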

### Settings Flat-File Retirement

  • D-12: /api/settings endpoint is removed entirely in Phase 3. services/storage.py load_settings() / save_settings() flat-file functions are deleted. settings.json is deleted. All AI config comes from DB (users.ai_provider / users.ai_model set by admin).
  • D-13: System prompt moves to a SYSTEM_PROMPT env var in config.py (optional). If not set, services/classifier.py uses a hardcoded default prompt string. No DB table needed.
  • D-14: Celery extract_and_classify task resolves AI config via doc.user_id → users.ai_provider + users.ai_model (a second DB lookup within the same task session). No user_id parameter added to the task signature.
  • D-15: If user.ai_provider is None (user has no admin-assigned AI config), classifier falls back to DEFAULT_AI_PROVIDER + DEFAULT_AI_MODEL env vars (both optional in config.py; code default: "ollama" / "llama3.2").
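The fallback chain in D-13 and D-15 might look like the following. It is a sketch, not the project's classifier code: `resolve_ai_config` and `resolve_system_prompt` are hypothetical helper names, the placeholder prompt text is invented, and treating an empty env var as unset is an assumption.

```python
import os
from typing import Mapping, Optional, Tuple

CODE_DEFAULT_PROVIDER = "ollama"   # code defaults named in D-15
CODE_DEFAULT_MODEL = "llama3.2"
FALLBACK_SYSTEM_PROMPT = "Classify the document."  # placeholder, not the real prompt


def resolve_ai_config(
    user_provider: Optional[str],
    user_model: Optional[str],
    env: Mapping[str, str] = os.environ,
) -> Tuple[str, Optional[str]]:
    """Prefer the admin-assigned user config, then env vars, then code defaults."""
    if user_provider is not None:
        return user_provider, user_model
    provider = env.get("DEFAULT_AI_PROVIDER") or CODE_DEFAULT_PROVIDER
    model = env.get("DEFAULT_AI_MODEL") or CODE_DEFAULT_MODEL
    return provider, model


def resolve_system_prompt(env: Mapping[str, str] = os.environ) -> str:
    """SYSTEM_PROMPT env var if set, otherwise the hardcoded default."""
    return env.get("SYSTEM_PROMPT") or FALLBACK_SYSTEM_PROMPT


no_config = resolve_ai_config(None, None, env={})
env_config = resolve_ai_config(
    None, None, env={"DEFAULT_AI_PROVIDER": "openai", "DEFAULT_AI_MODEL": "gpt-x"}
)
user_config = resolve_ai_config("anthropic", "claude", env={})
```

Passing `env` explicitly keeps the helper free of global state, in line with the service-layer rule of calling the classifier with explicit parameters.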

### Auth Guards

  • D-16: All /api/documents/* endpoints gain get_current_user dependency (Phase 2 D-07 fulfilled). Every handler asserts document.user_id == current_user.id before returning — 404 (not 403) for cross-user access to avoid information leakage. Admin role returns 403 on all document endpoints per Phase 3 SC4 (completing Phase 2 SC5 via D-07).
  • D-17: /api/topics/* gains get_current_user. Topic queries filter by user_id = current_user.id OR user_id IS NULL (an IN list containing NULL would never match the NULL rows), so the user sees their own topics plus system topics.
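The ownership rule in D-16 can be expressed as one helper that every document handler calls. The sketch below avoids FastAPI so it stays self-contained: `ApiError` stands in for an HTTP exception, and the `role` field, `load_owned_document` and `status_for` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Mapping


class ApiError(Exception):
    """Stand-in for an HTTP error response with a status code."""

    def __init__(self, status_code: int, detail: str):
        super().__init__(detail)
        self.status_code = status_code
        self.detail = detail


@dataclass(frozen=True)
class User:
    id: int
    role: str  # assumed values: "user" or "admin"


@dataclass(frozen=True)
class Document:
    id: int
    user_id: int


def load_owned_document(docs: Mapping[int, Document], doc_id: int, current_user: User):
    """Return the document only if the caller owns it."""
    if current_user.role == "admin":
        # Admins are rejected outright on document endpoints.
        raise ApiError(403, "Admins cannot access document endpoints")
    doc = docs.get(doc_id)
    # Missing and foreign documents look identical to the caller (404),
    # so the response does not reveal that another user's document exists.
    if doc is None or doc.user_id != current_user.id:
        raise ApiError(404, "Document not found")
    return doc


def status_for(docs, doc_id, user):
    """Small helper for the demo: 200 on success, else the error status."""
    try:
        load_owned_document(docs, doc_id, user)
        return 200
    except ApiError as err:
        return err.status_code


docs = {1: Document(id=1, user_id=10)}
owner, stranger, admin = User(10, "user"), User(11, "user"), User(1, "admin")
```

In this sketch the admin check runs first, so an admin receives 403 regardless of whether the document exists.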

<canonical_refs>

## Canonical References

Downstream agents MUST read these before planning or implementing.

### Requirements

  • .planning/REQUIREMENTS.md — STORE-03 (atomic quota enforce), STORE-04 (quota bar UI), STORE-05 (upload rejection error), STORE-06 (atomic quota decrement on delete), STORE-08 (Celery+Redis — done in Phase 1), SEC-04 (DB-lookup file access), DOC-03 (per-user AI provider), DOC-04 (system topics + per-user overrides), DOC-05 (classification uses user's assigned provider)

### Roadmap & Success Criteria

  • .planning/ROADMAP.md — Phase 3 goal and all 5 success criteria (especially SC2: concurrent quota race, SC4: 403 on cross-user access + admin 403, SC5: per-user AI classification)

### Architecture Constraints

  • CLAUDE.md — Key Architectural Rules: presigned MinIO URL flow (bytes never through API), MinIO key schema, atomic quota UPDATE pattern, SEC-04 enforcement, admin endpoints never return document content

### Prior Phase Decisions

  • .planning/phases/01-infrastructure-foundation/01-CONTEXT.md — D-03 (documents.user_id nullable in Phase 1), D-05 (storage service replaced), D-06 (MinIO key schema), D-08/D-09 (Celery+Redis wired)
  • .planning/phases/02-users-authentication/02-CONTEXT.md — D-07 (documents endpoints stay public in Phase 2, gain guards in Phase 3), D-08/D-09 (admin endpoints, CORS)

### Project Decisions

  • .planning/PROJECT.md — Core Value: per-user isolation; Key Decisions: PostgreSQL+MinIO rationale, atomic quota UPDATE, privacy-first admin model

</canonical_refs>

<code_context>

## Existing Code Insights

### Reusable Assets

  • backend/deps/auth.py: get_current_user and get_current_admin FastAPI dependencies, ready to inject into document/topic endpoints
  • backend/db/models.py: Document, Quota, Topic, DocumentTopic ORM models complete; documents.user_id is nullable (change to NOT NULL in Phase 3 migration); quotas.used_bytes and limit_bytes are in place
  • backend/storage/minio_backend.py: MinIOBackend.put_object() and delete_object() exist; extend with generate_presigned_put_url() for the Phase 3 upload flow, and add stat_object() to retrieve file size after upload
  • backend/storage/base.py: StorageBackend ABC; add generate_presigned_put_url(...) abstract method
  • backend/tasks/document_tasks.py: extract_and_classify task; update _run() to look up doc.user_id → user.ai_provider/ai_model and pass user config to classifier
  • backend/services/classifier.py — update to accept ai_provider and ai_model parameters instead of reading from load_settings()
  • backend/celery_app.py — Celery beat schedule: add periodic task for abandoned upload cleanup

### Established Patterns

  • Atomic quota UPDATE: UPDATE quotas SET used_bytes = used_bytes + $delta WHERE (used_bytes + $delta) <= limit_bytes RETURNING used_bytes. Use session.execute(text(...)) with bound params; check result.rowcount to detect quota exceeded
  • Service layer boundary: services/classifier.py is pure Python, no FastAPI coupling; call with explicit parameters rather than reading global config
  • get_current_user injection — Phase 2 pattern: current_user: User = Depends(get_current_user) in each handler; current_user: User = Depends(get_current_admin) for admin-only routes
  • asyncio.to_thread() for MinIO sync SDK calls (established in Phase 1 storage/minio_backend.py)
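The atomic quota pattern above can be exercised with the standard-library `sqlite3` module. This is an approximation, not the project's SQLAlchemy code: the `user_id` filter is added as an assumption (the pattern as written does not show it), SQLite's scalar `MAX(0, x)` stands in for PostgreSQL's `GREATEST`, and `cursor.rowcount` is used instead of `RETURNING` so the sketch runs on older SQLite builds.

```python
import sqlite3


def try_consume_quota(conn, user_id, delta):
    """Add delta to used_bytes only if the result stays within limit_bytes."""
    cur = conn.execute(
        "UPDATE quotas SET used_bytes = used_bytes + :delta"
        " WHERE user_id = :uid AND (used_bytes + :delta) <= limit_bytes",
        {"delta": delta, "uid": user_id},
    )
    # Zero affected rows means the guard in the WHERE clause failed.
    return cur.rowcount == 1


def release_quota(conn, user_id, delta):
    """Decrement on delete, clamped at zero."""
    conn.execute(
        "UPDATE quotas SET used_bytes = MAX(0, used_bytes - :delta)"
        " WHERE user_id = :uid",
        {"delta": delta, "uid": user_id},
    )


def quota_exceeded_detail(used_bytes, limit_bytes, rejected_bytes):
    """Body for the HTTP 413 response described under Specific Ideas."""
    return {
        "detail": {
            "used_bytes": used_bytes,
            "limit_bytes": limit_bytes,
            "rejected_bytes": rejected_bytes,
        }
    }


def used_bytes(conn, user_id):
    row = conn.execute(
        "SELECT used_bytes FROM quotas WHERE user_id = ?", (user_id,)
    ).fetchone()
    return row[0]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE quotas (user_id INTEGER PRIMARY KEY,"
    " used_bytes INTEGER, limit_bytes INTEGER)"
)
conn.execute("INSERT INTO quotas VALUES (1, 90, 100)")

too_big = try_consume_quota(conn, 1, 20)     # 110 > 100, rejected
fits = try_consume_quota(conn, 1, 10)        # exactly reaches the limit
after_consume = used_bytes(conn, 1)
release_quota(conn, 1, 500)                  # over-release clamps at zero
after_release = used_bytes(conn, 1)
```

Because the comparison and the increment happen in one statement, two concurrent confirms cannot both pass the check against the same stale `used_bytes` value.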

### Integration Points

  • backend/api/documents.py — replace existing upload handler with upload-url + confirm endpoints; add get_current_user to all handlers; add document.user_id == current_user.id ownership assertion
  • backend/api/topics.py: add get_current_user; filter all topic queries by user_id = current_user.id OR user_id IS NULL (an IN list containing NULL would not match the system-topic rows)
  • backend/services/storage.py — remove load_settings() / save_settings(); update save_upload() to accept user_id parameter; update delete_document() to decrement quota
  • backend/config.py — add SYSTEM_PROMPT, DEFAULT_AI_PROVIDER, DEFAULT_AI_MODEL optional env vars
  • frontend/src/stores/documents.js (or equivalent) — update upload flow from single multipart POST to two-step: get upload URL, PUT to MinIO, call confirm
  • frontend/src/components/layout/AppSidebar.vue — add quota bar (current/limit in MB, amber at 80%, red at 95%) — STORE-04

### Constraints from Prior Phases

  • MinIO key schema {user_id}/{document_id}/{uuid4()}{ext} is locked (Phase 1 D-06) — enforced in MinIOBackend.put_object()
  • documents.user_id is currently nullable — Phase 3 Alembic migration makes it NOT NULL after cleanup
  • Celery+Redis already wired and operational — no infrastructure changes needed
  • BackupCode model and backup_codes table exist from Phase 2 — no changes needed
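For reference, the locked key schema `{user_id}/{document_id}/{uuid4()}{ext}` could be produced by a helper like the one below. `build_object_key` is a hypothetical name, and taking the extension from the original filename's suffix is an assumption about how `ext` is derived.

```python
import uuid
from pathlib import PurePosixPath


def build_object_key(user_id, document_id, filename):
    """Build a MinIO object key following {user_id}/{document_id}/{uuid4()}{ext}."""
    ext = PurePosixPath(filename).suffix  # e.g. ".pdf"; empty if no extension
    return f"{user_id}/{document_id}/{uuid.uuid4()}{ext}"


key = build_object_key(7, 42, "report.final.pdf")
parts = key.split("/")
```

Since the user id is the first path segment, a prefix listing on `7/` would cover all of that user's objects.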

</code_context>

## Specific Ideas

  • Phase 3 Alembic migration is 0003_multi_user_isolation.py — cleanup + NOT NULL + topic cleanup + quota reconciliation in one migration
  • Presigned PUT URL TTL: 15 minutes (matches typical upload timeout for large documents)
  • Abandoned upload cleanup: Celery beat task running every 30 minutes, deletes pending Document rows older than 1 hour
  • stat_object() for MinIO: use MinIO SDK stat_object(bucket, key).size attribute to get authoritative file size at confirm time
  • Quota exceeded response: HTTP 413 with body {"detail": {"used_bytes": N, "limit_bytes": M, "rejected_bytes": K}}
  • Per-user topic query: WHERE (topics.user_id = :uid OR topics.user_id IS NULL) with an index on topics.user_id
  • Frontend quota bar: fetch from new GET /api/me/quota endpoint returning {used_bytes, limit_bytes} — add this endpoint to the auth API
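The abandoned-upload rule (pending rows older than 1 hour, swept every 30 minutes) can be sketched as a pure selection function plus a beat-schedule entry. The `created_at` field, the `DocRow` type and the task path are hypothetical; the schedule dict only shows the general shape of a Celery beat entry with an interval in seconds.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

PENDING_MAX_AGE = timedelta(hours=1)
CLEANUP_INTERVAL_SECONDS = 30 * 60


@dataclass
class DocRow:
    id: int
    status: str
    created_at: datetime  # assumed column name


def select_abandoned(rows, now):
    """Return ids of pending rows created more than PENDING_MAX_AGE ago."""
    cutoff = now - PENDING_MAX_AGE
    return [r.id for r in rows if r.status == "pending" and r.created_at < cutoff]


# Shape of a beat schedule entry; the task path is a hypothetical name.
beat_schedule = {
    "cleanup-abandoned-uploads": {
        "task": "tasks.document_tasks.cleanup_abandoned_uploads",
        "schedule": CLEANUP_INTERVAL_SECONDS,
    }
}

now = datetime(2026, 5, 23, 12, 0, tzinfo=timezone.utc)
rows = [
    DocRow(1, "pending", now - timedelta(hours=2)),     # abandoned
    DocRow(2, "pending", now - timedelta(minutes=10)),  # still within the window
    DocRow(3, "uploaded", now - timedelta(hours=2)),    # confirmed, never touched
]
abandoned = select_abandoned(rows, now)
```

Only row 1 is selected. Because quota is never reserved for pending rows, the cleanup has nothing to decrement.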

## Deferred Ideas

  • Presigned GET URLs for document downloads — Phase 4 (DOC-02: PDF preview proxied through app). Phase 3 does not expose presigned GET URLs to the browser.
  • Per-user system prompt overrides — out of scope for v1; system prompt is global via env var
  • Quota reservation at upload-url initiation with client-supplied size — decided against in favor of confirm-time enforcement
  • MinIO event notification webhook approach — deferred; two-step confirm is sufficient for Phase 3

Phase: 3-Document Migration & Multi-User Isolation
Context gathered: 2026-05-23