diff --git a/.planning/ROADMAP.md b/.planning/ROADMAP.md index ecb1485..66ab0e0 100644 --- a/.planning/ROADMAP.md +++ b/.planning/ROADMAP.md @@ -88,7 +88,30 @@ _Last updated: 2026-05-22_ 4. Requesting a document object key or presigned URL for a document owned by a different user returns 403 — no cross-user object access is possible through any request parameter manipulation; all /api/documents/* endpoints enforce get_current_user and return 403 when the requesting user's role is admin (completing SC5 from Phase 2) 5. AI classification for each document uses the provider and model assigned to that user by the admin, not any user-supplied or default value -**Plans**: TBD +**Plans**: 5 plans + +**Wave 1** — Migration + test scaffolds +- [ ] 03-01-PLAN.md — Wave 0 test scaffolds (auth_user/admin_user/MinIO mock fixtures + 16 xfail stubs) + Alembic migration 0003 (null-user cleanup, NOT NULL constraint, topic cleanup, quota reconciliation, ix_topics_user_id) + +**Wave 2** *(blocked on Wave 1)* +- [ ] 03-02-PLAN.md — Presigned upload backend: StorageBackend ABC + MinIOBackend dual client + generate_presigned_put_url/stat_object + /api/documents/upload-url + /api/documents/{id}/confirm with atomic quota UPDATE + GET /api/auth/me/quota + delete-with-quota + abandoned-upload Celery beat + docker-compose CORS/celery-beat + +**Wave 3** *(blocked on Wave 2)* +- [ ] 03-03-PLAN.md — Auth guards: get_regular_user dep + ownership assertions on every /api/documents/* handler (404 not 403) + admin 403 + real user_id in object_key + namespace-scoped /api/topics/* + POST /api/admin/topics + classifier topic-namespace plumbing + +**Wave 4** *(blocked on Wave 3)* +- [ ] 03-04-PLAN.md — Settings retirement + per-user AI: delete /api/settings + remove load_settings/save_settings + classifier accepts ai_provider/ai_model kwargs + Celery task resolves user.ai_provider via DB + frontend SettingsView placeholder + remove settings store/API + +**Wave 5** *(blocked on Wave 4)* +- [ ] 03-05-PLAN.md — Frontend upload flow + quota bar: 3-step upload action with XHR progress + UploadProgress.vue progress bar and quota rejection error block + QuotaBar.vue + AppSidebar embed + quota state in auth store + human checkpoint + +**Cross-cutting constraints:** +- Atomic quota UPDATE pattern only lives in Plan 02; never duplicate (CLAUDE.md) +- Every /api/documents/* handler injects get_regular_user (Plan 03) +- AI provider/model resolved only via Celery task DB lookup (Plan 04) +- Browser XHR PUT to MinIO sends NO Authorization header (Plan 05) + +**UI hint**: yes --- @@ -134,6 +157,6 @@ _Last updated: 2026-05-22_ |-------|----------------|--------|-----------| | 1. Infrastructure Foundation | 5/5 | Complete | 2026-05-22 | | 2. Users & Authentication | 5/5 | Complete | 2026-05-22 | -| 3. Document Migration & Multi-User Isolation | 0/? | Not started | - | +| 3. Document Migration & Multi-User Isolation | 0/5 | Not started | - | | 4. Folders, Sharing, Quotas & Document UX | 0/? | Not started | - | | 5. Cloud Storage Backends | 0/? | Not started | - | diff --git a/.planning/STATE.md b/.planning/STATE.md index ffc77f2..62901a2 100644 --- a/.planning/STATE.md +++ b/.planning/STATE.md @@ -2,13 +2,13 @@ gsd_state_version: 1.0 milestone: v1.0 milestone_name: milestone -current_phase: 2 -status: complete -last_updated: "2026-05-22T18:30:00Z" +current_phase: 3 +status: planned +last_updated: "2026-05-23T00:00:00Z" progress: total_phases: 5 completed_phases: 2 - total_plans: 10 + total_plans: 15 completed_plans: 10 percent: 40 --- @@ -16,9 +16,9 @@ progress: # Project State **Project:** DocuVault -**Status:** Phase 2 Complete — Phase 3 Ready -**Current Phase:** 2 -**Last Updated:** 2026-05-22 +**Status:** Phase 3 Planned — Ready to Execute +**Current Phase:** 3 +**Last Updated:** 2026-05-23 ## Phase Status @@ -26,7 +26,7 @@ progress: |---|---|---| | 1 | Infrastructure Foundation | ✓ Complete | | 2 | Users & Authentication | ✓ Complete (5/5 plans) | -| 3 | Document Migration & Multi-User Isolation | Not Started | +| 3 | Document Migration & Multi-User Isolation | Planned (5 plans, ready to execute) | | 4 | Folders, Sharing, Quotas & Document UX | Not Started | | 5 | Cloud Storage Backends | Not Started | @@ -87,9 +87,7 @@ progress: ### Open Questions -- Celery + Redis vs pgqueuer for Phase 3 (depends on Redis availability in deployment target) - Verify cloud SDK minor versions on PyPI before Phase 5 pinning -- Phase 2 SC5 (admin JWT → 403 on /api/documents/*) deferred to Phase 3 SC4 per D-07 ### Blockers @@ -101,6 +99,7 @@ _Updated at each phase transition._ | Field | Value | |---|---| -| Last session | 2026-05-22 — Executed Phase 2 Plan 05 (Admin panel frontend: AdminView, three tab components, AppSidebar update) | -| Next action | Run `/gsd:discuss-phase 3` or `/gsd:plan-phase 3` to begin Phase 3 (Document Migration & Multi-User Isolation) | -| Pending decisions | See Open Questions above | +| Last session | 2026-05-23 — Planned Phase 3 (5 plans, 5 waves; verification passed) | +| Next action | Run `/gsd:execute-phase 3` to begin execution | +| Pending decisions | None — all Phase 3 decisions locked in 03-CONTEXT.md | +| Resume file | `.planning/phases/03-document-migration-multi-user-isolation/03-01-PLAN.md` | diff --git a/.planning/phases/03-document-migration-multi-user-isolation/03-01-PLAN.md b/.planning/phases/03-document-migration-multi-user-isolation/03-01-PLAN.md new file mode 100644 index 0000000..ed1a42a --- /dev/null +++ b/.planning/phases/03-document-migration-multi-user-isolation/03-01-PLAN.md @@ -0,0 +1,297 @@ +--- +phase: 03-document-migration-multi-user-isolation +plan: 01 +type: execute +wave: 1 +depends_on: [] +files_modified: + - backend/tests/test_quota.py + - backend/tests/test_alembic.py + - backend/tests/test_classifier.py + - backend/tests/test_documents.py + - backend/tests/test_topics.py + - backend/tests/test_settings.py + - backend/tests/conftest.py + - backend/migrations/versions/0003_multi_user_isolation.py +autonomous: true +requirements: + - STORE-03 + - STORE-04 + - STORE-05 + - STORE-06 + - SEC-04 + - DOC-03 + - DOC-04 + - DOC-05 + +must_haves: + truths: + - "All xfail(strict=False) test stubs for Phase 3 requirements exist and are collectable by pytest" + - "Alembic migration 0003 deletes every documents row where user_id IS NULL and removes the corresponding MinIO objects before any schema change" + - "After migration 0003 runs, documents.user_id is NOT NULL" + - "After migration 0003 runs, the topics table is empty (all rows deleted) and ix_topics_user_id exists" + - "After migration 0003 runs, every quotas.used_bytes equals SUM(documents.size_bytes) for that user (or 0 if no documents)" + - "auth_user and admin_user pytest fixtures issue JWTs that get_current_user accepts and that resolve to active users with Quota rows" + artifacts: + - path: "backend/tests/test_quota.py" + provides: "Wave 0 xfail stubs for STORE-03 / STORE-05 / STORE-06 / SC2" + contains: "test_concurrent_quota_race" + - path: "backend/tests/test_classifier.py" + provides: "Wave 0 xfail stubs for DOC-03 / DOC-05 per-user provider resolution" + contains: "test_per_user_provider" + - path: "backend/tests/conftest.py" + provides: "auth_user, admin_user, mock_minio_presigned, mock_minio_stat fixtures" + exports: + - "auth_user" + - "admin_user" + - "mock_minio_presigned" + - "mock_minio_stat" + - path: "backend/migrations/versions/0003_multi_user_isolation.py" + provides: "Null-user document cleanup + topic cleanup + NOT NULL constraint + quota reconciliation + ix_topics_user_id" + contains: "revision = \"0003\"" + key_links: + - from: "backend/migrations/versions/0003_multi_user_isolation.py" + to: "MinIO (synchronous SDK)" + via: "from minio import Minio; client.remove_object(bucket, key) for each null-user object_key" + pattern: "remove_object" + - from: "backend/tests/conftest.py" + to: "backend/services/auth.py" + via: "create_access_token + User row + Quota row for fixture users" + pattern: "create_access_token" +--- + + +Lay the Wave 0 foundation for Phase 3. Two outputs: + +1. Test scaffolds (xfail stubs) covering every requirement in the VALIDATION.md Per-Task Verification Map plus the shared fixtures (`auth_user`, `admin_user`, MinIO mocks) every later plan needs. +2. Alembic migration `0003_multi_user_isolation.py` that performs the null-user document cleanup (D-01/D-02), all-topics cleanup (D-10), `documents.user_id` NOT NULL constraint, quota reconciliation (D-03), and the `ix_topics_user_id` index. + +Purpose: Wave 0 tests must exist before any later plan can run them. The migration is the precondition for adding `get_current_user` to `/api/documents/*` — once `documents.user_id` is NOT NULL, every Document has an owner and ownership assertions become meaningful. +Output: 6 test files (5 stubs + 1 fixture file), 1 Alembic migration file. + + + +@/Users/nik/.claude/get-shit-done/workflows/execute-plan.md +@/Users/nik/.claude/get-shit-done/templates/summary.md + + + +@.planning/ROADMAP.md +@.planning/STATE.md +@.planning/phases/03-document-migration-multi-user-isolation/03-CONTEXT.md +@.planning/phases/03-document-migration-multi-user-isolation/03-RESEARCH.md +@.planning/phases/03-document-migration-multi-user-isolation/03-PATTERNS.md +@.planning/phases/03-document-migration-multi-user-isolation/03-VALIDATION.md +@CLAUDE.md + +@backend/migrations/versions/0002_add_backup_codes_and_password_must_change.py +@backend/db/models.py +@backend/tests/conftest.py +@backend/tests/test_alembic.py +@backend/services/auth.py + + + + +From backend/db/models.py (Phase 1): +```python +class Document(Base): + __tablename__ = "documents" + id: Mapped[uuid.UUID] = mapped_column(UUID(as_uuid=True), primary_key=True) + user_id: Mapped[Optional[uuid.UUID]] # nullable in Phase 1 (D-03) — Phase 3 will flip to NOT NULL + object_key: Mapped[str] + size_bytes: Mapped[int] + status: Mapped[str] # 'pending' | 'uploaded' | 'classified' | 'classification_failed' + created_at: Mapped[datetime] + +class Quota(Base): + __tablename__ = "quotas" + user_id: Mapped[uuid.UUID] # PK; FK -> users.id + limit_bytes: Mapped[int] # default 104857600 (100 MB) + used_bytes: Mapped[int] + +class Topic(Base): + __tablename__ = "topics" + id: Mapped[uuid.UUID] + user_id: Mapped[Optional[uuid.UUID]] # NULL = system topic, UUID = per-user + name: Mapped[str] + # UniqueConstraint("user_id", "name", name="uq_topics_user_name") +``` + +From backend/services/auth.py (Phase 2): +```python +def hash_password(password: str) -> str +def create_access_token(user_id: str, role: str) -> str +``` + +From backend/tests/conftest.py (existing): +```python +@pytest_asyncio.fixture +async def db_session() -> AsyncSession # in-memory SQLite + +@pytest_asyncio.fixture +async def async_client(db_session) -> AsyncClient # ASGI httpx client with get_db override +``` + +Migration 0002 reference style (revision/down_revision header + numbered # ── 1. comments + op.execute("GRANT ...") + downgrade()). + + + + +## Trust Boundaries + +| Boundary | Description | +|----------|-------------| +| migration runtime → MinIO | DDL transaction holds DB rows; MinIO deletes happen outside any DB transaction | +| test fixtures → backend code | Fixtures fabricate JWTs that hit every guarded endpoint — must not leak across tests | + +## STRIDE Threat Register + +| Threat ID | Category | Component | Disposition | Mitigation Plan | +|-----------|----------|-----------|-------------|-----------------| +| T-03-01 | Tampering | Alembic migration 0003 upgrade() | mitigate | Collect object_keys via SELECT before DELETE; wrap MinIO remove_object in try/except so MinIO partial failure cannot leave DB+MinIO in worse state than orphans (RESEARCH.md Finding 1 + Risk 3) | +| T-03-02 | Denial of Service | Alembic migration when MinIO unreachable | accept | Migration is run only by docuvault_migrate after docker-compose health checks per RESEARCH.md Risk 3; retry on next deploy. Document in upgrade() docstring. | +| T-03-03 | Information Disclosure | xfail test fixtures (auth_user JWT) | mitigate | Fixture-issued JWTs use short-lived test secret_key; fixtures clean up via async_client teardown; never log token values | +| T-03-SC | Tampering | pip installs | mitigate | No new package-manager installs in this plan — pytest, pytest-asyncio, alembic, sqlalchemy, minio already present in requirements.txt from Phase 1/2 | + + + + + + Task 1: Create Wave 0 test scaffolds and shared fixtures + backend/tests/test_quota.py, backend/tests/test_alembic.py, backend/tests/test_classifier.py, backend/tests/test_documents.py, backend/tests/test_topics.py, backend/tests/test_settings.py, backend/tests/conftest.py + + - backend/tests/conftest.py — current async_client + db_session fixture definitions; must not break existing fixtures + - backend/tests/test_alembic.py — existing xfail style for migration tests (revision 0001/0002) + - backend/tests/test_documents.py — existing async test shape used by later plans + - backend/tests/test_topics.py — existing test shape + - backend/tests/test_settings.py — existing tests (will become obsolete after D-12, but kept here for the stub `test_settings_endpoint_removed`) + - backend/tests/test_classifier.py — existing mock-provider tests; new tests live in same file + - backend/services/auth.py — `hash_password`, `create_access_token` signatures used by new fixtures + - backend/db/models.py — `User`, `Quota`, `Document`, `Topic` columns referenced by fixtures + - .planning/phases/03-document-migration-multi-user-isolation/03-VALIDATION.md — Per-Task Verification Map names every test stub required + - .planning/phases/03-document-migration-multi-user-isolation/03-CONTEXT.md — D-01..D-17 referenced by stub docstrings + + + Add new pytest fixtures to `backend/tests/conftest.py` (append, do not rewrite existing fixtures). Each fixture is `@pytest_asyncio.fixture`: + + 1. `auth_user(db_session)` — creates a User row (role="user", is_active=True, password_must_change=False, password_hash via `services.auth.hash_password("Testpassword123!")`), creates a Quota row (limit_bytes=104857600, used_bytes=0), issues an access token via `services.auth.create_access_token(str(user.id), "user")`, returns dict `{"user": user, "token": token, "headers": {"Authorization": f"Bearer {token}"}}`. Persist with `db_session.commit()`. + 2. `admin_user(db_session)` — same as above but `role="admin"` and token created with `"admin"`. Returns identical dict shape. + 3. `mock_minio_presigned(monkeypatch)` — uses `monkeypatch.setattr` to replace `storage.minio_backend.MinIOBackend.generate_presigned_put_url` (method-level patch with `AsyncMock`) returning the literal string `"http://localhost:9000/docuvault/test-presigned-url"`. The fixture yields the AsyncMock so tests can assert call counts/args. + 4. `mock_minio_stat(monkeypatch)` — same pattern for `MinIOBackend.stat_object`, AsyncMock returning a configurable integer (default 1024). Yields the mock for per-test customization (e.g. `mock_minio_stat.return_value = 50_000_000`). + + The patched methods do not exist yet (added in Plan 03-02) — using `monkeypatch.setattr` with `raising=False` ensures the patch installs before the attribute exists. Document this with an inline comment referencing Plan 03-02. + + Create the following xfail test stub files. Every test is marked `@pytest.mark.xfail(strict=False, reason="implemented in plan 03-XX")` and asserts a single placeholder (`assert True` or `pytest.skip("scaffold")`). Stub file headers explain which requirement(s) they cover and which plan implements them. + + `backend/tests/test_quota.py` (NEW) — async tests under Plan 03-02: + - `test_quota_increment_atomic(async_client, auth_user, mock_minio_presigned, mock_minio_stat)` — STORE-03; after one /confirm with stat=50_000_000, GET /api/auth/me/quota returns used_bytes == 50_000_000 + - `test_concurrent_quota_race(async_client, auth_user, mock_minio_presigned, mock_minio_stat)` — STORE-03 SC2; calls asyncio.gather on two /confirm POSTs targeting two documents totaling 110 MB against a 100 MB quota; asserts statuses sorted == [200, 413] + - `test_quota_exceeded_response(async_client, auth_user, mock_minio_presigned, mock_minio_stat)` — STORE-05; /confirm returns 413 and body shape `{"detail": {"used_bytes", "limit_bytes", "rejected_bytes"}}` + - `test_delete_decrements_quota(async_client, auth_user, mock_minio_presigned, mock_minio_stat)` — STORE-06; upload + confirm then DELETE; GET /api/auth/me/quota returns used_bytes == 0 + + `backend/tests/test_alembic.py` (APPEND to existing) — new sync test `test_migration_0003`: + - Wraps Phase 1 fixture setup; runs `alembic upgrade head` against tmp SQLite; pre-seeds a `documents` row with `user_id=None` AND a row with `user_id=` via raw INSERT before running upgrade head past 0002, then runs `alembic upgrade head` to apply 0003 and asserts: the null-user row is gone, the populated-user row remains, PRAGMA table_info shows `documents.user_id` notnull=1, all `topics` rows are gone, `ix_topics_user_id` is in `sqlite_master`, and `quotas.used_bytes` for the populated user equals SUM(size_bytes). Mark `@pytest.mark.xfail(strict=False, reason="implemented in plan 03-01 migration step")`. + + `backend/tests/test_classifier.py` (APPEND to existing) — new async tests under Plan 03-04: + - `test_per_user_provider(db_session)` — DOC-03; with `user.ai_provider="openai"`, `user.ai_model="gpt-4o"`, the classifier.classify_document call resolves `_settings["active_provider"] == "openai"` (use AsyncMock for `ai.get_provider` and assert call args) + - `test_celery_task_uses_user_provider(db_session)` — DOC-05; calling `_run(document_id)` with a Document owned by `user.ai_provider="anthropic"` calls classifier with `ai_provider="anthropic"` + - `test_default_provider_fallback(db_session)` — D-15; when `user.ai_provider is None`, classifier receives `config.settings.default_ai_provider` + + `backend/tests/test_documents.py` (APPEND to existing) — new async tests under Plans 03-02 / 03-03: + - `test_upload_url_endpoint(async_client, auth_user, mock_minio_presigned)` — D-05; POST /api/documents/upload-url returns `{"upload_url": "...", "document_id": "..."}` and creates a Document row with status="pending" + - `test_confirm_endpoint(async_client, auth_user, mock_minio_presigned, mock_minio_stat)` — D-05; POST /api/documents/{id}/confirm calls stat_object once, updates Document.size_bytes from stat return, sets status="uploaded" + - `test_get_quota(async_client, auth_user)` — STORE-04; GET /api/auth/me/quota returns `{"used_bytes": 0, "limit_bytes": 104857600}` + - `test_cross_user_access_404(async_client, auth_user, db_session)` — SEC-04; user A creates a Document, user B's request for `GET /api/documents/{A_doc_id}` returns 404 + - `test_admin_cannot_access_documents(async_client, admin_user)` — SEC-04 SC4; GET /api/documents using admin_user.headers returns 403 + - `test_documents_require_auth(async_client)` — anonymous GET /api/documents returns 401 or 403 + + `backend/tests/test_topics.py` (APPEND to existing) — new async tests under Plan 03-03: + - `test_topic_namespace(async_client, auth_user, db_session)` — DOC-04; seed one system topic (user_id=NULL), one auth_user-owned topic, one different-user topic; GET /api/topics returns only the first two + - `test_admin_create_system_topic(async_client, admin_user)` — D-09; POST /api/admin/topics returns 201 and creates a Topic with user_id=NULL + - `test_regular_user_cannot_create_system_topic(async_client, auth_user)` — D-09; POST /api/admin/topics with auth_user.headers returns 403 + - `test_topics_require_auth(async_client)` — D-17; anonymous GET /api/topics returns 401 or 403 + + `backend/tests/test_settings.py` (APPEND new test, leave existing) — under Plan 03-04: + - `test_settings_endpoint_removed(async_client)` — D-12; GET /api/settings returns 404 + + Every NEW test must include `@pytest.mark.xfail(strict=False, reason="implemented in plan 03-XX")` with the XX matching its target plan. None of the new tests should pass today (xfail strict=False keeps unexpected passes from breaking CI per Phase 1 D-Wave0). + + + cd backend && pytest tests/test_quota.py tests/test_alembic.py::test_migration_0003 tests/test_classifier.py::test_per_user_provider tests/test_classifier.py::test_celery_task_uses_user_provider tests/test_classifier.py::test_default_provider_fallback tests/test_documents.py::test_upload_url_endpoint tests/test_documents.py::test_confirm_endpoint tests/test_documents.py::test_get_quota tests/test_documents.py::test_cross_user_access_404 tests/test_documents.py::test_admin_cannot_access_documents tests/test_documents.py::test_documents_require_auth tests/test_topics.py::test_topic_namespace tests/test_topics.py::test_admin_create_system_topic tests/test_topics.py::test_regular_user_cannot_create_system_topic tests/test_topics.py::test_topics_require_auth tests/test_settings.py::test_settings_endpoint_removed --collect-only -q + + + All 16 new test IDs are collected by pytest with no collection errors. `grep -c "xfail" backend/tests/test_quota.py` returns >= 4. `grep -c "xfail" backend/tests/test_documents.py` >= 6. `grep -c "xfail" backend/tests/test_topics.py` >= 4. `grep -c "auth_user" backend/tests/conftest.py` returns >= 1. + + + + + Task 2: Write Alembic migration 0003 (null-user cleanup + NOT NULL + topic cleanup + quota reconcile + index) + backend/migrations/versions/0003_multi_user_isolation.py + + - backend/migrations/versions/0002_add_backup_codes_and_password_must_change.py — header style, numbered comments, GRANT pattern, downgrade structure + - backend/migrations/versions/0001_initial_schema.py — column types, table names, baseline schema + - backend/db/models.py — Document, Quota, Topic columns + Index/UniqueConstraint definitions + - backend/config.py — `settings.minio_endpoint`, `settings.minio_access_key`, `settings.minio_secret_key`, `settings.minio_bucket` — but DO NOT import settings inside the migration (env vars only, see action) + - .planning/phases/03-document-migration-multi-user-isolation/03-RESEARCH.md — Finding 1 (sync MinIO SDK inside upgrade), Finding 4 (quota reconciliation SQL), Risk 3 (idempotency, MinIO unreachability) + - backend/tests/test_alembic.py — `test_migration_0003` (xfail stub from Task 1) defines the expected post-state + + + - Migration revision = "0003", down_revision = "0002", branch_labels = None, depends_on = None + - upgrade() must, in order: + 1. SELECT id, object_key FROM documents WHERE user_id IS NULL → list of (id, object_key) + 2. DELETE FROM document_topics WHERE document_id IN (those ids) [cascade safety] + 3. DELETE FROM documents WHERE user_id IS NULL + 4. For each collected object_key: call synchronous `minio.Minio(...).remove_object(bucket, key)` wrapped in try/except (log via op-style `op.execute("-- comment")` skipped; use `print()` to stderr or silent pass per RESEARCH.md Finding 1) + 5. DELETE FROM topics (all rows, per D-10) + 6. op.alter_column("documents", "user_id", existing_type=postgresql.UUID(as_uuid=True), nullable=False) + 7. op.create_index("ix_topics_user_id", "topics", ["user_id"]) + 8. op.execute(text("UPDATE quotas SET used_bytes = (SELECT COALESCE(SUM(size_bytes), 0) FROM documents WHERE documents.user_id = quotas.user_id)")) + - downgrade() must: + 1. op.drop_index("ix_topics_user_id", table_name="topics") + 2. op.alter_column("documents", "user_id", existing_type=postgresql.UUID(as_uuid=True), nullable=True) + 3. Docstring states that deleted rows + MinIO objects are NOT restorable (one-way data cleanup per D-01) + - MinIO connection details read from os.environ ONLY (MINIO_ENDPOINT, MINIO_ACCESS_KEY, MINIO_SECRET_KEY, MINIO_BUCKET) — never `from config import settings` (circular import + Alembic side effects) + - test_migration_0003 (Task 1) must transition from xfail → pass once this task lands (remove or keep the xfail marker; if strict=False it XPASSes silently) + + + Create `backend/migrations/versions/0003_multi_user_isolation.py` following the exact header style of `0002_add_backup_codes_and_password_must_change.py`: + + Header: revision = "0003", down_revision = "0002", branch_labels = None, depends_on = None. Docstring lists the 8 numbered steps and references D-01, D-02, D-03, D-10. Imports: `import os; from __future__ import annotations; import sqlalchemy as sa; from sqlalchemy import text; from sqlalchemy.dialects import postgresql; from alembic import op`. Conditionally `from minio import Minio` inside upgrade() (deferred import — minio not needed in downgrade or for test envs that skip MinIO). + + upgrade() body uses numbered section comments matching the 0002 style (`# ── 1. Collect null-user object keys ─────`). Section 1 uses `bind = op.get_bind(); result = bind.execute(text("SELECT id, object_key FROM documents WHERE user_id IS NULL"))` and materializes `null_user_objects = [(row[0], row[1]) for row in result]`. Section 2 uses `op.execute(text("DELETE FROM document_topics WHERE document_id IN (SELECT id FROM documents WHERE user_id IS NULL)"))`. Section 3 uses `op.execute(text("DELETE FROM documents WHERE user_id IS NULL"))`. Section 4 instantiates `Minio(os.environ.get("MINIO_ENDPOINT", "minio:9000"), access_key=os.environ.get("MINIO_ACCESS_KEY", ""), secret_key=os.environ.get("MINIO_SECRET_KEY", ""), secure=False)` and loops over `null_user_objects`, calling `client.remove_object(bucket, key)` inside `try/except Exception: pass` (skip silently — orphans are harmless per Finding 1). Bucket name from `os.environ.get("MINIO_BUCKET", "docuvault")`. Section 5 uses `op.execute(text("DELETE FROM topics"))`. Section 6 uses `op.alter_column("documents", "user_id", existing_type=postgresql.UUID(as_uuid=True), nullable=False)`. Section 7 uses `op.create_index("ix_topics_user_id", "topics", ["user_id"])`. Section 8 uses the quota reconciliation `UPDATE quotas SET used_bytes = (SELECT COALESCE(SUM(size_bytes), 0) FROM documents WHERE documents.user_id = quotas.user_id)` via `op.execute(text(...))`. + + SQLite test compatibility (Task 1's `test_migration_0003` runs against SQLite): wrap the MinIO step in `if os.environ.get("MINIO_ENDPOINT"): ...` so the test env without MinIO does not import minio at migration time. Also wrap `op.alter_column` with `with op.batch_alter_table("documents") as batch_op: batch_op.alter_column("user_id", existing_type=sa.dialects.postgresql.UUID(as_uuid=True), nullable=False)` — batch mode is required for SQLite ALTER COLUMN, and is a no-op pass-through on PostgreSQL. Section 7's `op.create_index` works on both dialects without batch mode. + + downgrade() body: section 1 `op.drop_index("ix_topics_user_id", table_name="topics")`; section 2 `with op.batch_alter_table("documents") as batch_op: batch_op.alter_column("user_id", existing_type=postgresql.UUID(as_uuid=True), nullable=True)`. Docstring on downgrade() states "WARNING: deleted null-user document rows and their MinIO objects are NOT restored — Phase 3 cleanup is one-way per CONTEXT.md D-01." NO GRANT statements are needed (no new tables per RESEARCH.md Finding 1). + + Do NOT introduce `pytest.mark.xfail` removal in `test_migration_0003`; leave xfail(strict=False) so it XPASSes silently (consistent with Wave 0 D-Wave0 from Phase 1). + + + cd backend && pytest tests/test_alembic.py::test_migration_0003 -x -q 2>&1 | tail -20 && grep -c "DELETE FROM topics" backend/migrations/versions/0003_multi_user_isolation.py && grep -c "ix_topics_user_id" backend/migrations/versions/0003_multi_user_isolation.py && grep -c "SUM(size_bytes)" backend/migrations/versions/0003_multi_user_isolation.py && grep -c "nullable=False" backend/migrations/versions/0003_multi_user_isolation.py + + + `test_migration_0003` either passes or xpasses (no FAILED). The migration file contains: `revision = "0003"`, `down_revision = "0002"`, `DELETE FROM topics`, `ix_topics_user_id`, `SUM(size_bytes)`, `nullable=False`, and a `Minio` import inside upgrade(). `grep -c "revision = \"0003\"" backend/migrations/versions/0003_multi_user_isolation.py` returns 1. + + + + + + +- All 16 new test IDs collect cleanly: `cd backend && pytest tests/ --collect-only -q 2>&1 | grep -E "(test_quota|test_migration_0003|test_per_user_provider|test_upload_url|test_confirm|test_get_quota|test_cross_user|test_admin_cannot|test_topic_namespace|test_settings_endpoint_removed)" | wc -l` >= 16 +- Migration applies cleanly against a pre-seeded SQLite DB (covered by `test_migration_0003`) +- Existing tests still pass: `cd backend && pytest tests/test_documents.py tests/test_topics.py tests/test_settings.py -x -q --ignore-glob="*test_quota*"` reports no regressions in previously-green tests + + + +- 6 test files exist with the 16 new test IDs from VALIDATION.md +- `backend/tests/conftest.py` exports `auth_user`, `admin_user`, `mock_minio_presigned`, `mock_minio_stat` +- `backend/migrations/versions/0003_multi_user_isolation.py` exists with revision "0003", down_revision "0002" +- Migration upgrade() deletes null-user documents (DB + MinIO), deletes all topics, alters documents.user_id to NOT NULL, creates ix_topics_user_id, reconciles quotas.used_bytes +- Migration downgrade() reverses schema-level changes only (documented in docstring) +- All Wave 0 stubs are xfail(strict=False) — none of them pass/fail unexpectedly + + + +Create `.planning/phases/03-document-migration-multi-user-isolation/03-01-SUMMARY.md` when done — list every new test ID and the migration revision identifier; flag the SQLite-vs-PostgreSQL alter_column batch-mode choice as a key downstream constraint. + diff --git a/.planning/phases/03-document-migration-multi-user-isolation/03-02-PLAN.md b/.planning/phases/03-document-migration-multi-user-isolation/03-02-PLAN.md new file mode 100644 index 0000000..93b5fb1 --- /dev/null +++ b/.planning/phases/03-document-migration-multi-user-isolation/03-02-PLAN.md @@ -0,0 +1,384 @@ +--- +phase: 03-document-migration-multi-user-isolation +plan: 02 +type: execute +wave: 2 +depends_on: + - 03-01 +files_modified: + - backend/storage/base.py + - backend/storage/minio_backend.py + - backend/storage/__init__.py + - backend/config.py + - backend/api/documents.py + - backend/api/auth.py + - backend/services/storage.py + - backend/tasks/document_tasks.py + - backend/celery_app.py + - docker-compose.yml +autonomous: true +requirements: + - STORE-03 + - STORE-04 + - STORE-05 + - STORE-06 + - SEC-04 + +must_haves: + truths: + - "Frontend can request a presigned PUT URL targeting a browser-resolvable hostname (not 'minio:9000') for a newly created pending Document row" + - "After the browser PUT, /confirm reads the authoritative file size from MinIO stat_object and atomically updates the user's quota, returning 413 with {used_bytes, limit_bytes, rejected_bytes} on overflow" + - "Two concurrent /confirm calls that would together exceed the quota result in exactly one 200 and one 413" + - "Document delete decrements quota using GREATEST(0, used_bytes - delta) atomically" + - "Authenticated user can read their own quota via GET /api/auth/me/quota" + - "Celery beat runs cleanup_abandoned_uploads every 30 minutes, deleting pending Document rows older than 1 hour and removing their MinIO objects" + - "Browser PUT requests succeed against MinIO because MINIO_API_CORS_ALLOW_ORIGIN is set to the configured CORS origin" + artifacts: + - path: "backend/storage/base.py" + provides: "StorageBackend ABC with generate_presigned_put_url and stat_object abstract methods" + contains: "generate_presigned_put_url" + - path: "backend/storage/minio_backend.py" + provides: "MinIOBackend dual-client (internal + public) + generate_presigned_put_url + stat_object methods" + contains: "self._public_client" + - path: "backend/storage/__init__.py" + provides: "get_storage_backend factory passes public_endpoint to MinIOBackend" + contains: "public_endpoint" + - path: "backend/api/documents.py" + provides: "POST /api/documents/upload-url + POST /api/documents/{id}/confirm endpoints; old /upload removed" + contains: "/upload-url" + - path: "backend/api/auth.py" + provides: "GET /api/auth/me/quota endpoint" + contains: "/me/quota" + - path: "backend/services/storage.py" + provides: "delete_document decrements quota atomically; save_upload removed" + contains: "GREATEST(0, used_bytes" + - path: "backend/tasks/document_tasks.py" + provides: "cleanup_abandoned_uploads Celery task + _cleanup_abandoned async body" + contains: "cleanup_abandoned_uploads" + - path: "backend/celery_app.py" + provides: "beat_schedule entry for cleanup_abandoned_uploads every 30 minutes" + contains: "beat_schedule" + - path: "docker-compose.yml" + provides: "MINIO_API_CORS_ALLOW_ORIGIN + MINIO_PUBLIC_ENDPOINT env vars; celery-beat service" + contains: "MINIO_API_CORS_ALLOW_ORIGIN" + key_links: + - from: "backend/api/documents.py" + to: "backend/storage/minio_backend.py" + via: "backend.generate_presigned_put_url(object_key) and backend.stat_object(object_key)" + pattern: "generate_presigned_put_url|stat_object" + - from: "backend/api/documents.py" + to: "quotas table" + via: "atomic UPDATE quotas SET used_bytes = used_bytes + :delta WHERE (used_bytes + :delta) <= limit_bytes RETURNING used_bytes" + pattern: "UPDATE quotas" + - from: "backend/api/documents.py" + to: "Celery (tasks.document_tasks.extract_and_classify)" + via: "extract_and_classify.delay(str(doc.id)) after successful confirm" + pattern: "extract_and_classify\\.delay" + - from: "backend/celery_app.py" + to: "backend/tasks/document_tasks.py" + via: "beat_schedule: tasks.document_tasks.cleanup_abandoned_uploads every 30m" + pattern: "cleanup_abandoned_uploads" +--- + + +Implement the presigned upload backend per CONTEXT.md D-04..D-07: replace multipart POST /upload with a 3-step flow (upload-url → browser PUT direct to MinIO → confirm) using the atomic quota UPDATE pattern from CLAUDE.md. Add GET /api/auth/me/quota for the sidebar quota bar. Wire the abandoned-upload cleanup Celery beat task per D-06. + +Purpose: This plan is the only place the atomic quota SQL lives. Plan 03-03's auth guards depend on the endpoint surface defined here. Plan 03-05's frontend depends on the API contract defined here. +Output: 6 backend code changes + 1 docker-compose change. After this plan, an authenticated user can perform the full presigned upload/delete/quota cycle from curl; auth guards still allow anonymous access (closed in Plan 03-03). + + + +@/Users/nik/.claude/get-shit-done/workflows/execute-plan.md +@/Users/nik/.claude/get-shit-done/templates/summary.md + + + +@.planning/ROADMAP.md +@.planning/STATE.md +@.planning/phases/03-document-migration-multi-user-isolation/03-CONTEXT.md +@.planning/phases/03-document-migration-multi-user-isolation/03-RESEARCH.md +@.planning/phases/03-document-migration-multi-user-isolation/03-PATTERNS.md +@.planning/phases/03-document-migration-multi-user-isolation/03-UI-SPEC.md +@.planning/phases/03-document-migration-multi-user-isolation/03-VALIDATION.md +@.planning/phases/03-document-migration-multi-user-isolation/03-01-SUMMARY.md +@CLAUDE.md + +@backend/storage/base.py +@backend/storage/minio_backend.py +@backend/storage/__init__.py +@backend/api/documents.py +@backend/api/auth.py +@backend/services/storage.py +@backend/tasks/document_tasks.py +@backend/celery_app.py +@backend/config.py +@docker-compose.yml + + + + +From backend/storage/base.py (current — Phase 1): +``` +class StorageBackend(ABC): + @abstractmethod + async def put_object(self, user_id, document_id, file_bytes, extension, content_type) -> str + @abstractmethod + async def get_object(self, object_key) -> bytes + @abstractmethod + async def delete_object(self, object_key) -> None + @abstractmethod + async def presigned_get_url(self, object_key, expires_minutes=60) -> str + @abstractmethod + async def health_check(self) -> bool +``` + +From backend/storage/minio_backend.py (current): +``` +class MinIOBackend(StorageBackend): + def __init__(self, endpoint, access_key, secret_key, bucket, secure=False) -> None + # Each method wraps the synchronous Minio SDK in asyncio.to_thread. +``` + +From CONTEXT.md decisions (D-05): + POST /api/documents/upload-url body: {"filename": str, "content_type": str} + Returns: {"upload_url": str, "document_id": str} + POST /api/documents/{id}/confirm body: empty + Returns: {"id": str, "size_bytes": int, "used_bytes": int, "status": "uploaded"} + Quota exceeded: HTTP 413 detail = {"used_bytes": int, "limit_bytes": int, "rejected_bytes": int} + +From backend/services/auth.py (Phase 2): + def create_access_token(user_id: str, role: str) -> str + +From backend/db/models.py: + Document.user_id — Plan 03-01 migration flips to NOT NULL + Document.status — 'pending' | 'uploaded' | 'classified' | 'classification_failed' + Document.object_key + Document.size_bytes + Quota.limit_bytes — default 104857600 (100 MB) + Quota.used_bytes + +From backend/celery_app.py (current): + celery_app.conf.task_routes = { + "tasks.document_tasks.*": {"queue": "documents"}, + "tasks.email_tasks.*": {"queue": "email"}, + } + + + + +## Trust Boundaries + +| Boundary | Description | +|----------|-------------| +| browser → MinIO (direct PUT) | Untrusted browser uses time-limited presigned URL; MinIO authenticates via HMAC signature; CORS preflight must succeed | +| browser → FastAPI /confirm | Authenticated user provides only document_id; FastAPI reads size_bytes from MinIO stat (never from client) | +| FastAPI /confirm → quotas table | Concurrent /confirm calls race against the same Quota row; PostgreSQL row-level lock + atomic UPDATE WHERE clause prevents overflow | +| Celery beat → DB+MinIO | Runs as docuvault_app; deletes its own rows (status='pending', age > 1h) and own MinIO objects only | + +## STRIDE Threat Register + +| Threat ID | Category | Component | Disposition | Mitigation Plan | +|-----------|----------|-----------|-------------|-----------------| +| T-03-04 | Spoofing | POST /api/documents/upload-url body filename/content_type | mitigate | object_key is computed server-side as {user_id}/{document_id}/{uuid4()}{ext} (STORE-02) — filename stored in DB column only; extension derived from Path(body.filename).suffix.lower() | +| T-03-05 | Tampering | Confirm endpoint quota delta | mitigate | size_bytes always comes from backend.stat_object(doc.object_key) (MinIO authoritative) — never from client body or request param | +| T-03-06 | Denial of Service | Concurrent /confirm uploads at quota boundary (SC2) | mitigate | Atomic SQL UPDATE quotas SET used_bytes = used_bytes + :delta WHERE (used_bytes + :delta) <= limit_bytes RETURNING used_bytes; fetchone() None → HTTP 413 (RESEARCH.md Finding 4 + Risk 2) | +| T-03-07 | Information Disclosure | Presigned URL leakage in logs | accept | 15-min TTL, single object key; leak risk acceptable for v1; do not log full URL — log only document_id | +| T-03-08 | Repudiation | Abandoned uploads pile up MinIO orphans | mitigate | Celery beat cleanup_abandoned_uploads every 30m deletes pending docs older than 1 hour and their MinIO objects (D-06) | +| T-03-09 | Information Disclosure | Browser→MinIO PUT CORS misconfiguration leaks origin | mitigate | MINIO_API_CORS_ALLOW_ORIGIN env var explicitly set to ${CORS_ORIGINS} (or http://localhost:5173 default) — not wildcard | +| T-03-10 | Tampering | Docker hostname in presigned URL (minio:9000 not browser-resolvable) | mitigate | Dual MinIO client: self._client (internal endpoint, used for stat/get/put/delete) and self._public_client (public endpoint, used for generate_presigned_put_url) per RESEARCH.md Finding 3 | +| T-03-SC | Tampering | pip installs | mitigate | No new package installs (minio + sqlalchemy already pinned in Phase 1 requirements.txt) | + + + + + + Task 1: Extend StorageBackend ABC + MinIOBackend with dual client, presigned PUT, stat_object; add config knobs and docker-compose env + backend/storage/base.py, backend/storage/minio_backend.py, backend/storage/__init__.py, backend/config.py, docker-compose.yml + + - backend/storage/base.py — existing ABC method signatures and docstring style + - backend/storage/minio_backend.py — existing constructor, asyncio.to_thread pattern, presigned_get_url example + - backend/storage/__init__.py — current get_storage_backend factory + - backend/config.py — Settings class shape, SettingsConfigDict usage, existing optional-string fields + - docker-compose.yml — current minio service environment block, current celery-worker shape (model for celery-beat) + - .planning/phases/03-document-migration-multi-user-isolation/03-RESEARCH.md — Finding 2 (presigned_put_object signature), Finding 3 (dual-client + CORS), Finding 5 (stat_object), Finding 10 (celery-beat service block) + + + - StorageBackend ABC gains two new abstract methods: generate_presigned_put_url(object_key, expires_minutes=15) -> str and stat_object(object_key) -> int + - MinIOBackend constructor accepts optional public_endpoint: str | None = None; stores self._public_client (Minio instance pointing at public_endpoint or falling back to endpoint) + - MinIOBackend.generate_presigned_put_url uses self._public_client.presigned_put_object via asyncio.to_thread with timedelta(minutes=expires_minutes) + - MinIOBackend.stat_object uses self._client.stat_object via asyncio.to_thread and returns .size (int) + - get_storage_backend() reads settings.minio_public_endpoint and passes it to MinIOBackend + - config.py adds minio_public_endpoint: str = "" (empty falls back to minio_endpoint inside MinIOBackend) + - docker-compose.yml minio service env adds MINIO_API_CORS_ALLOW_ORIGIN: ${CORS_ORIGINS:-http://localhost:5173}; backend service env adds MINIO_PUBLIC_ENDPOINT; new celery-beat service runs celery -A celery_app beat --loglevel=info + + + Modify `backend/storage/base.py`: append two `@abstractmethod async def` blocks after `presigned_get_url`, named `generate_presigned_put_url(self, object_key: str, expires_minutes: int = 15) -> str` and `stat_object(self, object_key: str) -> int`. Docstrings cite RESEARCH.md Finding 3 (public client requirement) and Finding 5 (returns authoritative size). + + Modify `backend/storage/minio_backend.py`: extend `__init__` signature with `public_endpoint: str | None = None`; after the existing `self._client = Minio(...)` block add `self._public_client = Minio(endpoint=(public_endpoint or endpoint), access_key=access_key, secret_key=secret_key, secure=secure)`. Append two new async methods (after `presigned_get_url`): + - `generate_presigned_put_url(self, object_key, expires_minutes=15)` — `return await asyncio.to_thread(self._public_client.presigned_put_object, self._bucket, object_key, timedelta(minutes=expires_minutes))` + - `stat_object(self, object_key)` — `result = await asyncio.to_thread(self._client.stat_object, self._bucket, object_key); return result.size` + + Modify `backend/storage/__init__.py`: extend `get_storage_backend()` to pass `public_endpoint=settings.minio_public_endpoint` to the `MinIOBackend(...)` constructor. + + Modify `backend/config.py`: add `minio_public_endpoint: str = ""` to the `Settings` class alongside the existing `minio_endpoint` field. Comment reference: `# RESEARCH.md Finding 3 — browser-resolvable hostname for presigned URLs`. + + Modify `docker-compose.yml`: + 1. In the `minio:` service `environment:` block, add `MINIO_API_CORS_ALLOW_ORIGIN: ${CORS_ORIGINS:-http://localhost:5173}` (RESEARCH.md Finding 3, T-03-09). + 2. In the `backend:` service `environment:` block, add `- MINIO_PUBLIC_ENDPOINT=${MINIO_PUBLIC_ENDPOINT:-localhost:9000}`. + 3. Append a new `celery-beat:` service mirroring `celery-worker` structure with `command: celery -A celery_app beat --loglevel=info` and the same `depends_on`, environment block, and build context (RESEARCH.md Finding 10). + + + cd backend && python -c "from storage.base import StorageBackend; assert 'generate_presigned_put_url' in dir(StorageBackend); assert 'stat_object' in dir(StorageBackend); from storage.minio_backend import MinIOBackend; import inspect; sig = inspect.signature(MinIOBackend.__init__); assert 'public_endpoint' in sig.parameters; print('OK')" && grep -c "MINIO_API_CORS_ALLOW_ORIGIN" docker-compose.yml && grep -c "celery-beat:" docker-compose.yml && grep -c "minio_public_endpoint" backend/config.py + + + StorageBackend ABC has 7 abstract methods (5 original + 2 new). MinIOBackend `__init__` accepts `public_endpoint`. docker-compose.yml contains `MINIO_API_CORS_ALLOW_ORIGIN` and `celery-beat:`. config.py contains `minio_public_endpoint`. All grep counts >= 1. + + + + + Task 2: Implement upload-url, confirm, delete-with-quota, me/quota endpoints + remove old /upload + abandoned-upload Celery beat + backend/api/documents.py, backend/api/auth.py, backend/services/storage.py, backend/tasks/document_tasks.py, backend/celery_app.py + + - backend/api/documents.py — existing handler structure (will be replaced) + - backend/api/auth.py — existing /me endpoint pattern, router prefix, `from db.models import BackupCode, Quota, RefreshToken, User` import line + - backend/services/storage.py — existing save_upload, delete_document, _backend() helper, _doc_to_dict, list_metadata + - backend/tasks/document_tasks.py — existing extract_and_classify task; pattern for sync entry + async body + - backend/tasks/email_tasks.py — _run_send_security_alert shape for async task body + - backend/celery_app.py — current task_routes block (model for beat_schedule append) + - backend/deps/auth.py — get_current_user signature (this plan does NOT wire auth into documents.py — Plan 03-03 does) + - .planning/phases/03-document-migration-multi-user-isolation/03-RESEARCH.md — Finding 4 (atomic quota SQL), Finding 5 (stat_object error handling), Finding 9 (/me/quota), Finding 10 (Celery beat schedule) + - .planning/phases/03-document-migration-multi-user-isolation/03-PATTERNS.md — atomic quota pattern, Celery task pattern + + + - POST /api/documents/upload-url body {filename: str, content_type: str} returns {upload_url: str, document_id: str} and inserts a Document row with user_id=None (Wave 2 placeholder — Plan 03-03 replaces with current_user.id), status="pending", object_key=f"null-user/{doc_id}/{uuid4()}{ext}", size_bytes=0 + - POST /api/documents/{id}/confirm: load Document by id; call backend.stat_object(doc.object_key); catch minio.error.S3Error code="NoSuchKey" → return 422; on success set doc.size_bytes; execute atomic quota UPDATE; on row None → DELETE pending Document + remove MinIO object, return 413 {used_bytes, limit_bytes, rejected_bytes}; on success set status="uploaded", commit, enqueue extract_and_classify.delay(str(doc.id)); return {id, size_bytes, used_bytes, status} + - GET /api/auth/me/quota: load Quota by current_user.id; return {used_bytes, limit_bytes}; if quota missing return 404 + - DELETE /api/documents/{id}: existing handler retained but services.storage.delete_document gains an atomic quota decrement (UPDATE quotas SET used_bytes = GREATEST(0, used_bytes - :delta) WHERE user_id = :uid) immediately before row delete + - Old POST /api/documents/upload (multipart) is removed entirely; services.storage.save_upload is removed + - tasks/document_tasks.py adds cleanup_abandoned_uploads task and async _cleanup_abandoned body that selects status="pending" AND created_at < now()-1h, deletes MinIO objects (try/except), then session.delete(doc), commits; returns {"cleaned": int} + - celery_app.py adds beat_schedule entry "cleanup-abandoned-uploads" → "tasks.document_tasks.cleanup_abandoned_uploads" with schedule timedelta(minutes=30) and timezone "UTC" + - test_upload_url_endpoint, test_confirm_endpoint, test_get_quota, test_quota_increment_atomic, test_concurrent_quota_race, test_quota_exceeded_response, test_delete_decrements_quota transition from xfail → pass (or xpass under strict=False) + + + Rewrite `backend/api/documents.py`. Module-level imports: `from __future__ import annotations`, `import uuid`, `from pathlib import Path`, `from typing import Optional`, `from fastapi import APIRouter, Depends, HTTPException, Query, status`, `from pydantic import BaseModel`, `from sqlalchemy import text`, `from sqlalchemy.ext.asyncio import AsyncSession`, `from db.models import Document, Quota`, `from deps.db import get_db`, `from services import classifier, storage`, `from tasks.document_tasks import extract_and_classify`, `from storage import get_storage_backend`, `from minio.error import S3Error`. Router: `router = APIRouter(prefix="/api/documents", tags=["documents"])`. + + Define request model `class UploadUrlRequest(BaseModel): filename: str; content_type: str`. No response models — return plain dicts matching the documented shape. + + Endpoint POST `/upload-url`: handler signature `async def request_upload_url(body: UploadUrlRequest, session: AsyncSession = Depends(get_db))`. Generate `doc_id = uuid.uuid4()`; `suffix = Path(body.filename).suffix.lower()`; `object_key = f"null-user/{doc_id}/{uuid.uuid4()}{suffix}"` (Plan 03-03 replaces `"null-user"` with `str(current_user.id)`); insert `Document(id=doc_id, user_id=None, filename=body.filename, content_type=body.content_type, size_bytes=0, storage_backend="minio", status="pending", object_key=object_key)`; `session.add(doc); await session.commit()`; call `upload_url = await get_storage_backend().generate_presigned_put_url(object_key, expires_minutes=15)`; return `{"upload_url": upload_url, "document_id": str(doc_id)}`. + + Endpoint POST `/{doc_id}/confirm`: handler signature `async def confirm_upload(doc_id: str, session: AsyncSession = Depends(get_db))`. Parse `uid = uuid.UUID(doc_id)` (catch ValueError → 404); `doc = await session.get(Document, uid)`; if None → HTTPException(404, "Document not found"). Try `size = await get_storage_backend().stat_object(doc.object_key)`; catch `S3Error` where `getattr(e, "code", "") == "NoSuchKey"` → HTTPException(422, "Upload not found — presigned URL may have expired"). Set `doc.size_bytes = size`; `await session.flush()`. Conditional atomic quota update — if `doc.user_id is not None`: execute `text("UPDATE quotas SET used_bytes = used_bytes + :delta WHERE user_id = :uid AND (used_bytes + :delta) <= limit_bytes RETURNING used_bytes, limit_bytes")` with `{"delta": size, "uid": str(doc.user_id)}`; `row = result.fetchone()`. If `row is None` (quota exceeded): execute SELECT `text("SELECT used_bytes, limit_bytes FROM quotas WHERE user_id = :uid")`; `q = quota_row.fetchone()`; delete the pending Document via `await session.delete(doc)`; best-effort `try: await get_storage_backend().delete_object(doc.object_key) except Exception: pass`; `await session.commit()`; raise `HTTPException(status_code=413, detail={"used_bytes": q.used_bytes if q else 0, "limit_bytes": q.limit_bytes if q else 0, "rejected_bytes": size})`. On success path: `doc.status = "uploaded"`; `await session.commit()`; `extract_and_classify.delay(str(doc.id))`; return `{"id": str(doc.id), "size_bytes": size, "used_bytes": (row.used_bytes if row else 0), "status": "uploaded"}`. If `doc.user_id is None` (Wave 2 only): skip quota update entirely, set `doc.status = "uploaded"`, commit, enqueue, return `{"id": str(doc.id), "size_bytes": size, "used_bytes": 0, "status": "uploaded"}`. + + Endpoints GET `""` (list), GET `/{doc_id}` (get), DELETE `/{doc_id}` (delete), POST `/{doc_id}/classify` (reclassify): preserve the current handler bodies verbatim from existing `backend/api/documents.py` (lines 70-118) but rebound to the new router. NOTE: this plan does NOT yet add `Depends(get_current_user)` — Plan 03-03 adds auth wiring. Anonymous handlers remain in Wave 2. + + Modify `backend/services/storage.py`: + - Delete `save_upload(session, file_bytes, original_name, mime_type)` function entirely (lines ~86-136) and its entry in `__all__`. + - At top imports add `from sqlalchemy import text` (current import is `from sqlalchemy import select, delete` — extend the line). + - Modify `delete_document(session, doc_id)`: after the existing `try: await _backend().delete_object(doc.object_key) ... except` block and BEFORE `await session.delete(doc)`, insert: + ``` + if doc.user_id is not None: + await session.execute( + text("UPDATE quotas SET used_bytes = GREATEST(0, used_bytes - :delta) WHERE user_id = :uid"), + {"delta": doc.size_bytes, "uid": str(doc.user_id)}, + ) + ``` + (Comment: "Atomic quota decrement (STORE-06, D-07). The user_id is None guard is removed in Plan 03-03.") + + Modify `backend/api/auth.py`: append a `GET /me/quota` handler after the existing `/me` handler. Handler: + ``` + @router.get("/me/quota") + async def get_my_quota( + current_user: User = Depends(get_current_user), + session: AsyncSession = Depends(get_db), + ): + q = await session.get(Quota, current_user.id) + if q is None: + raise HTTPException(status_code=404, detail="Quota not found") + return {"used_bytes": q.used_bytes, "limit_bytes": q.limit_bytes} + ``` + `Quota` is already imported on the existing `from db.models import BackupCode, Quota, RefreshToken, User` line — verify, no new import needed. + + Modify `backend/tasks/document_tasks.py`: append at end of file (after `_run`): + ``` + @celery_app.task(name="tasks.document_tasks.cleanup_abandoned_uploads") + def cleanup_abandoned_uploads() -> dict: + """Periodic Celery beat task — deletes Document rows with status='pending' + older than 1 hour and their MinIO objects (D-06). + """ + return asyncio.run(_cleanup_abandoned()) + + + async def _cleanup_abandoned() -> dict: + from datetime import datetime, timezone, timedelta + from sqlalchemy import select + + from db.session import AsyncSessionLocal + from db.models import Document + from storage import get_storage_backend + + cutoff = datetime.now(timezone.utc) - timedelta(hours=1) + async with AsyncSessionLocal() as session: + result = await session.execute( + select(Document).where( + Document.status == "pending", + Document.created_at < cutoff, + ) + ) + docs = result.scalars().all() + backend = get_storage_backend() + cleaned = 0 + for doc in docs: + try: + if doc.object_key: + await backend.delete_object(doc.object_key) + except Exception: + pass # MinIO object may not exist yet — safe to ignore + await session.delete(doc) + cleaned += 1 + await session.commit() + return {"cleaned": cleaned} + ``` + + Modify `backend/celery_app.py`: at module top add `from datetime import timedelta as _timedelta`. After the existing `celery_app.conf.task_routes = {...}` block (before `autodiscover_tasks`), append: + ``` + celery_app.conf.beat_schedule = { + "cleanup-abandoned-uploads": { + "task": "tasks.document_tasks.cleanup_abandoned_uploads", + "schedule": _timedelta(minutes=30), + }, + } + celery_app.conf.timezone = "UTC" + ``` + + + cd backend && pytest tests/test_documents.py::test_upload_url_endpoint tests/test_documents.py::test_confirm_endpoint tests/test_documents.py::test_get_quota tests/test_quota.py -x -q 2>&1 | tail -30 && grep -c "generate_presigned_put_url" backend/api/documents.py && grep -c "stat_object" backend/api/documents.py && grep -c "UPDATE quotas" backend/api/documents.py && grep -v '^[[:space:]]*#' backend/services/storage.py | grep -c "GREATEST(0, used_bytes" && grep -c "cleanup_abandoned_uploads" backend/tasks/document_tasks.py && grep -c "beat_schedule" backend/celery_app.py && grep -c "/me/quota" backend/api/auth.py + + + All 7 listed pytest IDs pass (or xpass under strict=False). `backend/api/documents.py` contains `/upload-url`, `/{doc_id}/confirm`, calls `generate_presigned_put_url` + `stat_object` + atomic `UPDATE quotas`. `backend/services/storage.py` contains `GREATEST(0, used_bytes`. `backend/api/auth.py` contains `/me/quota`. `backend/tasks/document_tasks.py` contains `cleanup_abandoned_uploads`. `backend/celery_app.py` contains `beat_schedule`. All grep counts >= 1. + + + + + + +- Atomic quota race test green: `cd backend && pytest tests/test_quota.py::test_concurrent_quota_race -x -q` (SC2 for Phase 3) +- Quota rejection shape green: `cd backend && pytest tests/test_quota.py::test_quota_exceeded_response -x -q` +- Quota decrement on delete green: `cd backend && pytest tests/test_quota.py::test_delete_decrements_quota -x -q` +- Presigned upload endpoints green: `cd backend && pytest tests/test_documents.py::test_upload_url_endpoint tests/test_documents.py::test_confirm_endpoint -x -q` +- Quota read endpoint green: `cd backend && pytest tests/test_documents.py::test_get_quota -x -q` +- No legacy /upload route remains: `cd backend && grep -c '"/upload"' backend/api/documents.py` returns 0 + + + +- StorageBackend ABC has `generate_presigned_put_url` and `stat_object` abstract methods +- MinIOBackend has dual MinIO client instances; presigned URL uses public client +- POST /api/documents/upload-url, POST /api/documents/{id}/confirm, GET /api/auth/me/quota exist and respond as documented +- DELETE /api/documents/{id} decrements quota atomically when doc.user_id is not None +- POST /api/documents/upload (multipart legacy) is removed; services.storage.save_upload is removed +- Celery beat schedule includes cleanup-abandoned-uploads every 30 minutes +- docker-compose.yml has MINIO_API_CORS_ALLOW_ORIGIN on minio service, MINIO_PUBLIC_ENDPOINT on backend service, and celery-beat service +- All Plan 03-01 stubs for STORE-03 / STORE-04 / STORE-05 / STORE-06 / D-05 pass or xpass + + + +Create `.planning/phases/03-document-migration-multi-user-isolation/03-02-SUMMARY.md` when done — include the exact request/response shapes for /upload-url and /confirm; note the temporary `doc.user_id is None` guard that Plan 03-03 must remove. + diff --git a/.planning/phases/03-document-migration-multi-user-isolation/03-03-PLAN.md b/.planning/phases/03-document-migration-multi-user-isolation/03-03-PLAN.md new file mode 100644 index 0000000..00d834c --- /dev/null +++ b/.planning/phases/03-document-migration-multi-user-isolation/03-03-PLAN.md @@ -0,0 +1,348 @@ +--- +phase: 03-document-migration-multi-user-isolation +plan: 03 +type: execute +wave: 3 +depends_on: + - 03-02 +files_modified: + - backend/deps/auth.py + - backend/api/documents.py + - backend/api/topics.py + - backend/api/admin.py + - backend/services/storage.py + - backend/services/classifier.py +autonomous: true +requirements: + - SEC-04 + - DOC-04 + +must_haves: + truths: + - "Every /api/documents/* handler requires a valid Bearer JWT and a non-admin user role; admins receive 403 on every document endpoint" + - "Cross-user access to a document (by manipulating the doc_id path parameter) returns 404 — never reveals that the document exists" + - "Newly uploaded Document rows have user_id = current_user.id; object_key embeds str(current_user.id) (no more 'null-user' sentinel)" + - "GET /api/topics returns the union of system topics (user_id IS NULL) and the requesting user's own topics — never another user's topics" + - "POST /api/admin/topics requires admin role; creates a Topic with user_id = NULL" + - "Regular POST /api/topics creates a Topic with user_id = current_user.id" + - "Atomic quota path in /confirm runs unconditionally (no more None-user fallback) for every authenticated upload" + artifacts: + - path: "backend/deps/auth.py" + provides: "get_regular_user dependency (current_user with admin-403)" + exports: + - "get_regular_user" + - path: "backend/api/documents.py" + provides: "All handlers wired with get_regular_user dep + ownership assertion (doc.user_id == current_user.id else 404)" + contains: "get_regular_user" + - path: "backend/api/topics.py" + provides: "All handlers wired with get_current_user dep + namespace-scoped queries" + contains: "or_" + - path: "backend/api/admin.py" + provides: "POST /api/admin/topics endpoint (admin-only system topic creation)" + contains: "/admin/topics" + - path: "backend/services/storage.py" + provides: "load_topics_for_user(session, user_id) + create_topic accepts user_id" + exports: + - "load_topics_for_user" + - path: "backend/services/classifier.py" + provides: "classify_document passes user_id through topic lookup and create_topic calls" + contains: "load_topics_for_user" + key_links: + - from: "backend/api/documents.py" + to: "backend/deps/auth.py" + via: "current_user: User = Depends(get_regular_user) on every handler" + pattern: "Depends\\(get_regular_user\\)" + - from: "backend/api/documents.py" + to: "documents table" + via: "ownership assertion: if doc is None or doc.user_id != current_user.id → 404" + pattern: "doc\\.user_id != current_user\\.id" + - from: "backend/api/topics.py" + to: "topics table" + via: "WHERE (Topic.user_id == current_user.id OR Topic.user_id IS NULL)" + pattern: "or_\\(Topic\\.user_id" + - from: "backend/services/classifier.py" + to: "backend/services/storage.py" + via: "create_topic(session, name, user_id=doc.user_id) to keep AI-suggested topics in the user namespace" + pattern: "create_topic.*user_id" +--- + + +Enforce per-user isolation across all `/api/documents/*` and `/api/topics/*` endpoints. Add the `get_regular_user` dependency (rejects admin), wire it into every documents handler with ownership assertions (404 on cross-user access per D-16), add `get_current_user` to topics with namespace-scoped queries (D-17), and create the admin-only `POST /api/admin/topics` endpoint (D-09). Remove the Wave 2 `null-user` placeholder from upload-url so newly created Documents carry the authenticated user_id. + +Purpose: Closes Phase 2 D-07 / Phase 3 SC4 — admins return 403, cross-user returns 404. The atomic quota path from Plan 03-02 finally runs for real authenticated traffic. +Output: 6 file modifications. After this plan, the backend enforces full multi-user isolation for documents and topics; only Plan 03-04 (settings retirement + per-user AI) remains for the backend. + + + +@/Users/nik/.claude/get-shit-done/workflows/execute-plan.md +@/Users/nik/.claude/get-shit-done/templates/summary.md + + + +@.planning/ROADMAP.md +@.planning/STATE.md +@.planning/phases/03-document-migration-multi-user-isolation/03-CONTEXT.md +@.planning/phases/03-document-migration-multi-user-isolation/03-RESEARCH.md +@.planning/phases/03-document-migration-multi-user-isolation/03-PATTERNS.md +@.planning/phases/03-document-migration-multi-user-isolation/03-VALIDATION.md +@.planning/phases/03-document-migration-multi-user-isolation/03-02-SUMMARY.md +@CLAUDE.md + +@backend/deps/auth.py +@backend/api/documents.py +@backend/api/topics.py +@backend/api/admin.py +@backend/services/storage.py +@backend/services/classifier.py + + + + +From backend/deps/auth.py (Phase 2): + async def get_current_user(credentials, session) -> User # 401 on invalid/expired + async def get_current_admin(user = Depends(get_current_user)) -> User # 403 on non-admin + +This plan adds: + async def get_regular_user(user = Depends(get_current_user)) -> User # 403 on admin + +From backend/db/models.py: + Document.user_id # NOT NULL after Plan 03-01 migration + Topic.user_id # nullable: NULL = system topic; UUID = per-user + Topic UniqueConstraint("user_id", "name", name="uq_topics_user_name") + +From backend/services/storage.py (current — Plan 03-02 has removed save_upload): + async def create_topic(session, name, description="", color="#6366f1") -> dict + async def load_topics(session) -> list[dict] + async def list_metadata(session, topic=None) -> list[dict] + async def get_metadata(session, doc_id) -> Optional[dict] + async def delete_document(session, doc_id) -> bool # already decrements quota for non-None user_id + +This plan changes: + create_topic gains user_id: uuid.UUID | None = None parameter (deduplication scoped to user namespace) + NEW: load_topics_for_user(session, user_id) -> list[dict] + +From backend/services/classifier.py (current — Plan 03-04 changes signature for AI provider; this plan only changes topic plumbing): + async def classify_document(session, doc_id, topic_names=None) -> list[str] + — uses storage.load_topics(session) for the topic list and storage.create_topic(session, name) for new suggestions + async def suggest_topics_for_document(session, doc_id) -> list[str] + + + + +## Trust Boundaries + +| Boundary | Description | +|----------|-------------| +| browser → /api/documents/* | Authenticated user supplies doc_id path param; ownership assertion gates every read/write | +| browser → /api/topics/* | Authenticated user supplies topic ops; namespace filter prevents cross-user topic enumeration | +| admin token → /api/documents/* | Admin role is explicitly rejected with 403 — admins cannot read document content per CLAUDE.md | + +## STRIDE Threat Register + +| Threat ID | Category | Component | Disposition | Mitigation Plan | +|-----------|----------|-----------|-------------|-----------------| +| T-03-11 | Information Disclosure | Cross-user document access via doc_id manipulation | mitigate | Every handler taking doc_id asserts `doc is None or doc.user_id != current_user.id` and raises 404 (not 403) per D-16 / SEC-04 — attacker cannot distinguish "not found" from "wrong owner" | +| T-03-12 | Elevation of Privilege | Admin reading user document content | mitigate | get_regular_user dependency rejects admin with 403; applied to every /api/documents/* handler; PATTERNS.md "Auth Guard Pattern" | +| T-03-13 | Information Disclosure | Cross-user topic enumeration | mitigate | All topic queries filter by `or_(Topic.user_id == current_user.id, Topic.user_id.is_(None))`; create_topic dedup scoped by user_id (RESEARCH.md Finding 6 Risk 5) | +| T-03-14 | Elevation of Privilege | Regular user creating system topic | mitigate | POST /api/admin/topics requires `Depends(get_current_admin)`; regular POST /api/topics forces user_id=current_user.id | +| T-03-15 | Tampering | object_key forged with another user's UUID prefix | mitigate | object_key is computed server-side using `str(current_user.id)` — never accepts user-supplied prefix; CLAUDE.md "Every document/folder endpoint asserts resource.user_id == current_user.id" | +| T-03-16 | Spoofing | Anonymous traffic to /api/documents/* after auth wiring | mitigate | HTTPBearer auto_error=True raises 403 when no Authorization header present; get_current_user raises 401 on invalid/expired token | +| T-03-SC | Tampering | pip installs | mitigate | No new package installs | + + + + + + Task 1: Add get_regular_user dependency; wire auth + ownership + real user_id into /api/documents/* + backend/deps/auth.py, backend/api/documents.py, backend/services/storage.py + + - backend/deps/auth.py — current get_current_user, get_current_admin signatures + docstrings (model for get_regular_user) + - backend/api/documents.py — current state after Plan 03-02 (upload-url, confirm, list, get, delete, classify); locate the `"null-user"` placeholder and the `if doc.user_id is not None:` Wave 2 fallback + - backend/services/storage.py — current list_metadata signature (must accept user_id filter); delete_document; get_metadata + - backend/db/models.py — Document.user_id (now NOT NULL post Plan 03-01) + - .planning/phases/03-document-migration-multi-user-isolation/03-CONTEXT.md — D-16 (404 not 403, admin 403) + - .planning/phases/03-document-migration-multi-user-isolation/03-PATTERNS.md — "Auth Guard Pattern", "Ownership Assertion (SEC-04)" + - .planning/phases/03-document-migration-multi-user-isolation/03-RESEARCH.md — Finding 7 (404 vs 403) + + + - get_regular_user is a FastAPI dependency that wraps get_current_user and raises 403 if user.role == "admin" + - Every handler in api/documents.py adds `current_user: User = Depends(get_regular_user)` parameter + - upload-url replaces `"null-user"` in object_key with `str(current_user.id)` and sets `Document.user_id = current_user.id` + - confirm endpoint removes the `if doc.user_id is not None:` Wave 2 fallback — the atomic quota path runs for every confirm + - confirm endpoint adds ownership assertion before the stat_object call: `if doc is None or doc.user_id != current_user.id: raise HTTPException(404, "Document not found")` + - list endpoint filters by user_id = current_user.id (services.storage.list_metadata accepts user_id param) + - get endpoint asserts `doc.user_id == current_user.id` else 404 + - delete endpoint asserts ownership then calls services.storage.delete_document (which decrements quota — Plan 03-02 already wired this; remove the `if doc.user_id is not None:` guard there too) + - classify endpoint asserts ownership + - services.storage.list_metadata adds required `user_id: uuid.UUID` parameter and `Document.user_id == user_id` filter + - services.storage.delete_document removes the `if doc.user_id is not None:` guard (now always non-None) + - Tests test_cross_user_access_404, test_admin_cannot_access_documents, test_documents_require_auth transition from xfail → pass + + + Modify `backend/deps/auth.py`: append a new dependency function: + ``` + async def get_regular_user(user: User = Depends(get_current_user)) -> User: + """Reject admin accounts on all /api/documents/* endpoints (D-16, SC4). + + Admin accounts cannot access document content (CLAUDE.md + SEC-04). + Returns 403 (not 404) — the admin knows document endpoints exist. + """ + if user.role == "admin": + raise HTTPException( + status_code=status.HTTP_403_FORBIDDEN, + detail="Admin accounts cannot access document content", + ) + return user + ``` + + Modify `backend/api/documents.py`: + - Add imports `from db.models import User` and `from deps.auth import get_regular_user`. + - In POST `/upload-url`: add `current_user: User = Depends(get_regular_user)` parameter; change `object_key = f"null-user/{doc_id}/...` to `object_key = f"{current_user.id}/{doc_id}/{uuid.uuid4()}{suffix}"`; change Document insert to `user_id=current_user.id`. + - In POST `/{doc_id}/confirm`: add `current_user: User = Depends(get_regular_user)` parameter. After loading doc by id, add `if doc is None or doc.user_id != current_user.id: raise HTTPException(status_code=404, detail="Document not found")` (this replaces the simpler 404 check). Remove the `if doc.user_id is not None:` Wave 2 branch entirely — the atomic quota update path runs unconditionally. The else-branch that returned `used_bytes=0` for None-user is deleted. + - In GET `""` (list): add `current_user: User = Depends(get_regular_user)` parameter; change `await storage.list_metadata(session, topic=topic)` to `await storage.list_metadata(session, user_id=current_user.id, topic=topic)`. + - In GET `/{doc_id}`: add `current_user: User = Depends(get_regular_user)` parameter; replace the existing `meta = await storage.get_metadata(...)` flow with `doc = await session.get(Document, uuid.UUID(doc_id))`; check `if doc is None or doc.user_id != current_user.id: raise HTTPException(404, "Document not found")`; then call `meta = await storage.get_metadata(session, doc_id)` for the response shape (or replace with `_doc_to_dict` if more direct). + - In DELETE `/{doc_id}`: add `current_user: User = Depends(get_regular_user)` parameter; load Document via `session.get(Document, uuid.UUID(doc_id))` and assert ownership before calling `storage.delete_document(session, doc_id)`; on `doc is None or doc.user_id != current_user.id` raise 404 (do not return success). + - In POST `/{doc_id}/classify`: add `current_user: User = Depends(get_regular_user)` parameter; assert ownership via `doc = await session.get(Document, uuid.UUID(doc_id)); if doc is None or doc.user_id != current_user.id: raise HTTPException(404, "Document not found")` before calling classifier. + + Modify `backend/services/storage.py`: + - Change `list_metadata` signature from `(session, topic=None)` to `(session, user_id: uuid.UUID, topic: Optional[str] = None)`. Update the `stmt = select(Document).order_by(...)` block to insert `.where(Document.user_id == user_id)` before any topic join. + - Remove the `if doc.user_id is not None:` guard from `delete_document` added in Plan 03-02 — the quota decrement now always runs (post-migration, user_id is NOT NULL). + + + cd backend && pytest tests/test_documents.py::test_cross_user_access_404 tests/test_documents.py::test_admin_cannot_access_documents tests/test_documents.py::test_documents_require_auth tests/test_documents.py::test_upload_url_endpoint tests/test_documents.py::test_confirm_endpoint tests/test_quota.py -x -q 2>&1 | tail -20 && grep -c "get_regular_user" backend/deps/auth.py && grep -c "Depends(get_regular_user)" backend/api/documents.py && grep -v '^[[:space:]]*#' backend/api/documents.py | grep -c "doc.user_id != current_user.id" && grep -c "null-user" backend/api/documents.py + + + `get_regular_user` defined in deps/auth.py. `Depends(get_regular_user)` present in every handler of api/documents.py (>= 6 occurrences expected: upload-url, confirm, list, get, delete, classify). `null-user` literal NOT present in api/documents.py (count == 0). Ownership assertion `doc.user_id != current_user.id` appears in at least 4 handlers. All 6 listed tests pass or xpass. + + + + + Task 2: Wire get_current_user + namespace filter into /api/topics/* + add POST /api/admin/topics + propagate user_id through classifier topic ops + backend/api/topics.py, backend/api/admin.py, backend/services/storage.py, backend/services/classifier.py + + - backend/api/topics.py — current handler structure (will be modified) + - backend/api/admin.py — pattern for `_admin: User = Depends(get_current_admin)` and Pydantic model definitions (UserCreate, AiConfigUpdate) + - backend/services/storage.py — current `create_topic` (lines ~328-355), `load_topics` (lines ~289-295), `update_topic`, `delete_topic` signatures + - backend/services/classifier.py — current `classify_document` body lines 33-58; `storage.load_topics` and `storage.create_topic` call sites + - .planning/phases/03-document-migration-multi-user-isolation/03-CONTEXT.md — D-08, D-09, D-10, D-11, D-17 + - .planning/phases/03-document-migration-multi-user-isolation/03-PATTERNS.md — Topic namespace query (lines ~401-409) + - .planning/phases/03-document-migration-multi-user-isolation/03-RESEARCH.md — Finding 6 (namespace query + dedup scoping + Risk 5) + + + - GET /api/topics requires get_current_user; query filters `or_(Topic.user_id == current_user.id, Topic.user_id.is_(None))` via new service `load_topics_for_user` + - POST /api/topics requires get_current_user; creates Topic with user_id = current_user.id (never NULL) + - PATCH /api/topics/{id} requires get_current_user; the topic must belong to the user (user_id == current_user.id) — otherwise 404 (NOT 403 — same info-leak rationale as docs) + - DELETE /api/topics/{id} requires get_current_user; same ownership rule. System topics (user_id IS NULL) cannot be edited or deleted via /api/topics/* + - POST /api/topics/suggest requires get_current_user; existing classifier path + - POST /api/admin/topics (NEW in api/admin.py) requires get_current_admin; creates Topic with user_id = NULL + - services.storage.create_topic gains `user_id: uuid.UUID | None = None` parameter; dedup query is scoped: `WHERE lower(name) = :name AND user_id IS [NOT DISTINCT FROM] :user_id` + - services.storage adds `load_topics_for_user(session, user_id) -> list[dict]` using the or_/is_ filter + - services.classifier.classify_document replaces `storage.load_topics(session)` with `storage.load_topics_for_user(session, doc.user_id)` and passes `user_id=doc.user_id` into `storage.create_topic` for AI-suggested topics + - Tests test_topic_namespace, test_admin_create_system_topic, test_regular_user_cannot_create_system_topic, test_topics_require_auth transition from xfail → pass + + + Modify `backend/api/topics.py`. Add imports `from db.models import User` and `from deps.auth import get_current_user`. For every existing handler add `current_user: User = Depends(get_current_user)` as a parameter: + + - GET `""` (list_topics): replace `await storage.load_topics(session)` with `await storage.load_topics_for_user(session, user_id=current_user.id)`. Existing topic_doc_counts call: the doc count must also be scoped to current user — pass `user_id=current_user.id` into a new `storage.topic_doc_counts(session, user_id=current_user.id)` (add user filter to that helper as part of this task — add `Document.user_id == user_id` join condition). + - POST `""` (create_topic): pass `user_id=current_user.id` into `storage.create_topic(...)`. + - PATCH `/{topic_id}` (update_topic): before calling `storage.update_topic`, load the Topic and assert `topic.user_id == current_user.id` (no editing of system topics or other users' topics) — else 404. Implementation: `t = await session.get(Topic, uuid.UUID(topic_id)); if t is None or t.user_id != current_user.id: raise HTTPException(404, "Topic not found")`. + - DELETE `/{topic_id}` (delete_topic): same ownership assertion before delete. + - POST `/suggest` (suggest_topics): just add the dep; classifier handles the namespace. + + Modify `backend/api/admin.py`. Add a new Pydantic model `class SystemTopicCreate(BaseModel): name: str; description: str = ""; color: str = "#6366f1"`. Add a new endpoint: + ``` + @router.post("/topics", status_code=status.HTTP_201_CREATED) + async def create_system_topic( + body: SystemTopicCreate, + session: AsyncSession = Depends(get_db), + _admin: User = Depends(get_current_admin), + ) -> dict: + """Admin creates a system topic visible to all users (D-09). + + System topics have user_id = NULL (per D-08 namespace model). + """ + from services import storage + topic = await storage.create_topic( + session, body.name, body.description, body.color, user_id=None + ) + return topic + ``` + Import additions: add `from db.models import Topic` if not already imported (Topic is needed for any future admin topic edit/delete; this plan only adds POST). + + Modify `backend/services/storage.py`: + 1. Add `from sqlalchemy import or_` to the existing `from sqlalchemy import select, delete` import line. + 2. Add `import uuid` if not already at top (it is — verify). + 3. Modify `create_topic` signature: `async def create_topic(session, name, description="", color="#6366f1", user_id: uuid.UUID | None = None) -> dict`. Replace the dedup query with a namespace-scoped lookup. Because PostgreSQL needs `IS NOT DISTINCT FROM` semantics to compare NULLs but SQLite does not implement `IS NOT DISTINCT FROM`, use a branching approach: + ``` + if user_id is None: + q = await session.execute( + select(Topic).where( + sql_func.lower(Topic.name) == name.lower(), + Topic.user_id.is_(None), + ) + ) + else: + q = await session.execute( + select(Topic).where( + sql_func.lower(Topic.name) == name.lower(), + Topic.user_id == user_id, + ) + ) + ``` + On insert, set `Topic(name=name, description=description, color=color, user_id=user_id)`. + 4. Add new function (after `load_topics`): + ``` + async def load_topics_for_user(session, user_id: uuid.UUID) -> list: + """Return system topics (user_id IS NULL) + user's own topics, ordered by name (D-17, DOC-04).""" + q = await session.execute( + select(Topic).where( + or_(Topic.user_id == user_id, Topic.user_id.is_(None)) + ).order_by(Topic.name) + ) + return [ + {"id": str(t.id), "name": t.name, "description": t.description, "color": t.color} + for t in q.scalars() + ] + ``` + Append `"load_topics_for_user"` to `__all__`. + 5. Modify `topic_doc_counts` to optionally scope by user: add `user_id: uuid.UUID | None = None` param; if non-None, join Document and filter by `Document.user_id == user_id` (so a user only sees the count of their own documents tagged to each topic). + + Modify `backend/services/classifier.py`: + 1. In `classify_document` (line ~39): replace `all_topics = await storage.load_topics(session)` with `all_topics = await storage.load_topics_for_user(session, user_id=doc.user_id)`. To access `doc.user_id`, fetch the Document at the top of the function: add `from db.models import Document; from sqlalchemy.ext.asyncio import AsyncSession; doc = await session.get(Document, uuid.UUID(doc_id))` (`uuid` is already imported via meta path; if not, add `import uuid`). + 2. Replace `await storage.create_topic(session, name.strip())` (line ~52) with `await storage.create_topic(session, name.strip(), user_id=doc.user_id)` to put AI-suggested topics in the user namespace (D-11). + 3. In `suggest_topics_for_document`: also replace `storage.load_topics(session)` if called (it is not, per current code) — only `provider.suggest_topics` is invoked, no change needed. + + + cd backend && pytest tests/test_topics.py::test_topic_namespace tests/test_topics.py::test_admin_create_system_topic tests/test_topics.py::test_regular_user_cannot_create_system_topic tests/test_topics.py::test_topics_require_auth tests/test_topics.py -x -q 2>&1 | tail -20 && grep -c "load_topics_for_user" backend/services/storage.py && grep -c "load_topics_for_user" backend/services/classifier.py && grep -c "or_(Topic.user_id" backend/services/storage.py && grep -c "/topics" backend/api/admin.py && grep -c "Depends(get_current_user)" backend/api/topics.py + + + All 4 topic-namespace tests pass or xpass. Existing topic tests (`test_list_topics_empty`, `test_create_topic`, etc.) still pass after auth wiring (they may need fixture updates — surface as a separate concern if they regress). `load_topics_for_user` is defined in storage.py and imported in classifier.py. `Depends(get_current_user)` appears in api/topics.py >= 5 times (list, create, update, delete, suggest). `POST /api/admin/topics` exists in api/admin.py. + + + + + + +- Cross-user access returns 404: `cd backend && pytest tests/test_documents.py::test_cross_user_access_404 -x -q` +- Admin gets 403 on documents: `cd backend && pytest tests/test_documents.py::test_admin_cannot_access_documents -x -q` +- Anonymous gets rejected: `cd backend && pytest tests/test_documents.py::test_documents_require_auth -x -q` +- Topic namespace isolation: `cd backend && pytest tests/test_topics.py::test_topic_namespace -x -q` +- Admin-only system topic creation: `cd backend && pytest tests/test_topics.py::test_admin_create_system_topic tests/test_topics.py::test_regular_user_cannot_create_system_topic -x -q` +- No null-user sentinel remains in api/documents.py: `cd backend && grep -c "null-user" backend/api/documents.py` returns 0 +- Atomic quota path unconditional: `cd backend && grep -v '^[[:space:]]*#' backend/api/documents.py | grep -c "if doc.user_id is not None" | head -1` returns 0 + + + +- get_regular_user dependency defined in deps/auth.py +- Every /api/documents/* handler injects Depends(get_regular_user) and asserts ownership (404 on cross-user) +- Every /api/topics/* handler injects Depends(get_current_user); queries are namespace-scoped +- POST /api/admin/topics exists and requires admin +- services.storage.create_topic accepts user_id; dedup scoped by namespace +- services.storage.load_topics_for_user implemented; consumed by classifier +- classify_document creates AI-suggested topics in the user namespace +- Phase 2 SC5 deferred item (admin JWT → 403 on documents) is fulfilled + + + +Create `.planning/phases/03-document-migration-multi-user-isolation/03-03-SUMMARY.md` when done — list every endpoint now requiring auth, the ownership-assertion pattern (404 not 403), and any existing topic/document tests that needed fixture updates to handle the new auth requirement. + diff --git a/.planning/phases/03-document-migration-multi-user-isolation/03-04-PLAN.md b/.planning/phases/03-document-migration-multi-user-isolation/03-04-PLAN.md new file mode 100644 index 0000000..3b38b47 --- /dev/null +++ b/.planning/phases/03-document-migration-multi-user-isolation/03-04-PLAN.md @@ -0,0 +1,377 @@ +--- +phase: 03-document-migration-multi-user-isolation +plan: 04 +type: execute +wave: 4 +depends_on: + - 03-03 +files_modified: + - backend/config.py + - backend/services/classifier.py + - backend/services/storage.py + - backend/tasks/document_tasks.py + - backend/api/settings.py + - backend/main.py + - backend/tests/test_settings.py + - frontend/src/views/SettingsView.vue + - frontend/src/stores/settings.js + - frontend/src/api/client.js +autonomous: true +requirements: + - DOC-03 + - DOC-05 + +must_haves: + truths: + - "AI provider and model are resolved per document from users.ai_provider / users.ai_model via the Celery task (no flat-file read)" + - "Falls back to settings.default_ai_provider / settings.default_ai_model env-var defaults when user.ai_provider is None" + - "/api/settings endpoint no longer exists; backend/main.py does not register api.settings router" + - "services.storage no longer exposes load_settings/save_settings/mask_api_key/settings_masked; settings.json is no longer read or written" + - "classifier.classify_document accepts ai_provider and ai_model parameters; no longer reads global config from disk" + - "SettingsView.vue displays an admin-managed message; the form is removed; stores/settings.js and api/client.js settings calls are removed" + artifacts: + - path: "backend/config.py" + provides: "SYSTEM_PROMPT + DEFAULT_AI_PROVIDER + DEFAULT_AI_MODEL settings; SETTINGS_FILE / DEFAULT_SETTINGS / DEFAULT_SYSTEM_PROMPT removed" + contains: "default_ai_provider" + - path: "backend/services/classifier.py" + provides: "classify_document accepts ai_provider, ai_model kwargs; _DEFAULT_SYSTEM_PROMPT module constant" + contains: "ai_provider" + - path: "backend/services/storage.py" + provides: "load_settings/save_settings/mask_api_key/settings_masked removed" + contains: "load_topics_for_user" + - path: "backend/tasks/document_tasks.py" + provides: "_run resolves user.ai_provider via session.get(User, doc.user_id) and passes to classifier" + contains: "user.ai_provider" + - path: "backend/main.py" + provides: "api.settings router import + include_router removed" + contains: "settings_router" + - path: "backend/api/settings.py" + provides: "File deleted" + - path: "frontend/src/views/SettingsView.vue" + provides: "Admin-managed placeholder card; form/store/api removed" + contains: "managed by" + - path: "frontend/src/stores/settings.js" + provides: "File deleted or gutted to empty placeholder" + - path: "frontend/src/api/client.js" + provides: "getSettings/patchSettings/testProvider/getDefaultPrompt removed" + contains: "getMyQuota" + key_links: + - from: "backend/tasks/document_tasks.py" + to: "backend/db/models.py User" + via: "user = await session.get(User, doc.user_id); ai_provider = user.ai_provider or settings.default_ai_provider" + pattern: "user\\.ai_provider" + - from: "backend/services/classifier.py" + to: "backend/ai/__init__.py get_provider" + via: "_settings = {'active_provider': ai_provider, 'providers': {ai_provider: {'model': ai_model}}}; provider = get_provider(_settings)" + pattern: "active_provider.*ai_provider" +--- + + +Retire the flat-file settings system end-to-end (D-12, D-13) and wire per-user AI classification via DB lookup (D-14, D-15, DOC-03, DOC-05). Move system prompt + defaults to env vars in config.py. Delete /api/settings; refactor classifier to receive ai_provider/ai_model as parameters; update Celery task to look up doc.user_id → user.ai_provider/ai_model and pass through to classifier. Replace the frontend Settings view with an admin-managed placeholder; remove the settings store and client calls. + +Purpose: Phase 3 SC5 (each user's AI classification uses the provider/model assigned to that user by the admin) cannot pass while load_settings() exists. This is the final backend plan for Phase 3. +Output: 10 file modifications (3 backend code, 1 backend route removal, 1 backend test update, 1 backend main.py, 3 frontend code, 1 backend config). After this plan, the entire Phase 3 backend is multi-user-isolated and admin-controlled. + + + +@/Users/nik/.claude/get-shit-done/workflows/execute-plan.md +@/Users/nik/.claude/get-shit-done/templates/summary.md + + + +@.planning/ROADMAP.md +@.planning/STATE.md +@.planning/phases/03-document-migration-multi-user-isolation/03-CONTEXT.md +@.planning/phases/03-document-migration-multi-user-isolation/03-RESEARCH.md +@.planning/phases/03-document-migration-multi-user-isolation/03-PATTERNS.md +@.planning/phases/03-document-migration-multi-user-isolation/03-VALIDATION.md +@.planning/phases/03-document-migration-multi-user-isolation/03-03-SUMMARY.md +@CLAUDE.md + +@backend/config.py +@backend/services/classifier.py +@backend/services/storage.py +@backend/tasks/document_tasks.py +@backend/api/settings.py +@backend/main.py +@backend/ai/__init__.py +@backend/tests/test_settings.py +@frontend/src/views/SettingsView.vue +@frontend/src/stores/settings.js +@frontend/src/api/client.js + + + + +From backend/db/models.py: + User.ai_provider: Mapped[Optional[str]] # nullable text — set by admin via PATCH /api/admin/users/{id}/ai-config + User.ai_model: Mapped[Optional[str]] # nullable text + +From backend/ai/__init__.py (current — RESEARCH.md Finding 12 / A4): + def get_provider(settings: dict) -> AIProvider + # Accepts dict with keys: "active_provider": str, "providers": {: {"model": str, "api_key"?: str, "base_url"?: str}} + # Branches on active_provider for anthropic / openai / ollama / lmstudio + +From backend/tasks/document_tasks.py (current): + async def _run(document_id: str) -> dict + — fetches Document by id + — calls `topics = await classifier.classify_document(session, document_id)` # no ai_provider/ai_model yet + +From backend/services/classifier.py (post Plan 03-03): + async def classify_document(session, doc_id, topic_names=None) -> list[str] + — currently calls `settings = storage.load_settings(); provider = get_provider(settings)` + — must change to `provider = get_provider({"active_provider": ai_provider, "providers": {ai_provider: {"model": ai_model}}})` + +From backend/config.py (current): + SETTINGS_FILE = Path(settings.data_dir) / "settings.json" # to be removed + DEFAULT_SYSTEM_PROMPT = "..." # to be removed (kept as module const in classifier.py) + DEFAULT_SETTINGS = {...} # to be removed + class Settings(BaseSettings): ... # extend with new fields + + + + +## Trust Boundaries + +| Boundary | Description | +|----------|-------------| +| user → /api/settings | Endpoint removed; previous attack surface eliminated | +| Celery task → users table | Task resolves AI config via doc.user_id → users row; no user-controlled provider/model input | +| admin → PATCH /api/admin/users/{id}/ai-config | Existing Phase 2 admin endpoint is the only write path to user.ai_provider / user.ai_model | + +## STRIDE Threat Register + +| Threat ID | Category | Component | Disposition | Mitigation Plan | +|-----------|----------|-----------|-------------|-----------------| +| T-03-17 | Elevation of Privilege | User changing their own AI provider | mitigate | /api/settings is removed (D-12); only PATCH /api/admin/users/{id}/ai-config (admin-only) writes user.ai_provider/ai_model | +| T-03-18 | Information Disclosure | settings.json flat file persisted to disk with API keys | mitigate | services.storage.load_settings/save_settings deleted; settings.json no longer read or written; API keys live in env vars only | +| T-03-19 | Tampering | Celery task accepting ai_provider in task signature | mitigate | Task signature unchanged (just document_id); ai_provider/ai_model resolved INSIDE the task via DB lookup so a malicious broker message cannot inject provider | +| T-03-20 | Information Disclosure | system_prompt env var in container logs | accept | SYSTEM_PROMPT is a static instruction string with no PII; documented as low-sensitivity | +| T-03-21 | Repudiation | Frontend SettingsView still attempts old API calls | mitigate | Remove getSettings/patchSettings/testProvider/getDefaultPrompt from api/client.js; gut stores/settings.js; SettingsView shows static message — no API calls | +| T-03-SC | Tampering | pip installs | mitigate | No new package installs | + + + + + + Task 1: Backend — retire settings flat-file; route classifier through per-user AI config + backend/config.py, backend/services/classifier.py, backend/services/storage.py, backend/tasks/document_tasks.py, backend/api/settings.py, backend/main.py, backend/tests/test_settings.py + + - backend/config.py — Settings class fields, SETTINGS_FILE definition, DEFAULT_SYSTEM_PROMPT string, DEFAULT_SETTINGS dict + - backend/services/storage.py — load_settings / save_settings / mask_api_key / settings_masked function bodies and __all__ entries (lines ~411-473) + - backend/services/classifier.py — current settings load + get_provider call sites (lines 33-35 and 67-69) + - backend/tasks/document_tasks.py — _run async body, particularly the classifier.classify_document call (line ~80) + - backend/api/settings.py — the entire file (to be deleted) + - backend/main.py — settings_router import + include_router call site + - backend/ai/__init__.py — get_provider factory signature (RESEARCH.md A4 — verify factory accepts {"active_provider", "providers"} dict) + - backend/tests/test_settings.py — existing test file with active tests (to be reduced to test_settings_endpoint_removed) + - .planning/phases/03-document-migration-multi-user-isolation/03-RESEARCH.md — Finding 8 (Celery refactor), Finding 11 (retirement scope), Finding 12 (get_provider factory) + - .planning/phases/03-document-migration-multi-user-isolation/03-PATTERNS.md — classifier signature (lines ~452-510), config.py additions (lines ~743-752) + + + - config.py removes `SETTINGS_FILE`, `DEFAULT_SYSTEM_PROMPT`, `DEFAULT_SETTINGS` module-level constants + - config.py Settings class gains `system_prompt: str = ""`, `default_ai_provider: str = "ollama"`, `default_ai_model: str = "llama3.2"` + - services/classifier.py defines a module-level `_DEFAULT_SYSTEM_PROMPT` constant (the hardcoded string from old config.DEFAULT_SYSTEM_PROMPT). classify_document signature gains `ai_provider: str | None = None, ai_model: str | None = None` keyword params; substitutes config defaults when None; constructs `_settings = {"active_provider": ai_provider, "providers": {ai_provider: {"model": ai_model}}}` and calls `get_provider(_settings)`. NEVER calls `storage.load_settings()` + - suggest_topics_for_document gets the same treatment (also gains ai_provider/ai_model kwargs and same fallback) + - services/storage.py: removes `load_settings`, `save_settings`, `mask_api_key`, `settings_masked`; removes these from `__all__`; removes `from config import DEFAULT_SETTINGS, SETTINGS_FILE` and any `import json`, `import copy` lines that are no longer used; adds `from config import settings as app_settings` if not already present (Plan 03-04 may not need it, but classifier does) + - tasks/document_tasks.py: in `_run` after fetching Document, fetch `user = await session.get(User, doc.user_id)`; compute `ai_provider = (user.ai_provider if user else None) or settings.default_ai_provider`, same for model; pass into `classifier.classify_document(session, document_id, ai_provider=ai_provider, ai_model=ai_model)`. Add `from config import settings` and `from db.models import User` deferred imports inside `_run` + - backend/api/settings.py: file deleted + - backend/main.py: removes `from api.settings import router as settings_router` and `app.include_router(settings_router)` + - backend/tests/test_settings.py: replaces all existing tests with the single `test_settings_endpoint_removed` (the xfail stub from Plan 03-01) — keep file path. Mark previous tests as xfail or simply delete them; the only required test is the 404 assertion + - GET /api/settings returns 404 (route no longer exists) + - The Plan 03-01 stub tests test_per_user_provider, test_celery_task_uses_user_provider, test_default_provider_fallback, test_settings_endpoint_removed transition from xfail → pass + + + Modify `backend/config.py`: + 1. Remove `SETTINGS_FILE = Path(settings.data_dir) / "settings.json"` (line 60). + 2. Remove the `DEFAULT_SYSTEM_PROMPT = """..."""` block (lines 62-67). + 3. Remove the `DEFAULT_SETTINGS = {...}` dict (lines 69-91). + 4. Inside the `Settings` class (alongside existing fields), add: + ``` + # AI classification defaults (Phase 3 — D-13, D-15) + system_prompt: str = "" # SYSTEM_PROMPT env var; hardcoded fallback lives in classifier.py + default_ai_provider: str = "ollama" # DEFAULT_AI_PROVIDER env var + default_ai_model: str = "llama3.2" # DEFAULT_AI_MODEL env var + ``` + 5. Verify no module-level imports of `Path` or anything else remain unused after the deletions (clean up `from pathlib import Path` if no other consumers). + + Modify `backend/services/classifier.py`: + 1. At module top, add `_DEFAULT_SYSTEM_PROMPT = """You are a document classification assistant..."""` with the verbatim string from the old `config.DEFAULT_SYSTEM_PROMPT`. + 2. Add imports `import uuid` and `from db.models import Document` and `from config import settings as app_settings`. + 3. Modify `classify_document` signature: + ``` + async def classify_document( + session: AsyncSession, + doc_id: str, + topic_names: list[str] | None = None, + ai_provider: str | None = None, + ai_model: str | None = None, + ) -> list[str]: + ``` + Replace `settings = storage.load_settings(); system_prompt = settings.get("system_prompt", ""); provider = get_provider(settings)` with: + ``` + _ai_provider = ai_provider or app_settings.default_ai_provider + _ai_model = ai_model or app_settings.default_ai_model + system_prompt = app_settings.system_prompt or _DEFAULT_SYSTEM_PROMPT + _settings = { + "active_provider": _ai_provider, + "providers": {_ai_provider: {"model": _ai_model}}, + } + provider = get_provider(_settings) + ``` + 4. Modify `suggest_topics_for_document` with the same signature change (`ai_provider`, `ai_model` kwargs) and same `_settings` construction in place of `storage.load_settings()`. + 5. Ensure no remaining `storage.load_settings()` call exists anywhere in this file. + + Modify `backend/services/storage.py`: + 1. Remove the import `from config import DEFAULT_SETTINGS, SETTINGS_FILE` (line 36) — replace with nothing (config import is no longer needed in this module). + 2. Remove `import copy` and `import json` from top imports (no longer needed). + 3. Remove `load_settings()`, `save_settings()`, `mask_api_key()`, `settings_masked()` function definitions (lines ~416-449). + 4. Remove these from `__all__`: `"load_settings", "save_settings", "mask_api_key", "settings_masked"`. + + Modify `backend/tasks/document_tasks.py` `_run` function: + 1. After `doc = await session.get(Document, doc_uuid)` and the None check, insert: + ``` + from db.models import User + from config import settings as app_settings + user = await session.get(User, doc.user_id) if doc.user_id else None + ai_provider = (user.ai_provider if user else None) or app_settings.default_ai_provider + ai_model = (user.ai_model if user else None) or app_settings.default_ai_model + ``` + 2. Replace `topics = await classifier.classify_document(session, document_id)` with `topics = await classifier.classify_document(session, document_id, ai_provider=ai_provider, ai_model=ai_model)`. + + Delete `backend/api/settings.py` entirely. Use the `rm` shell equivalent or write an empty stub — preferable: delete the file. (Executor: use `git rm backend/api/settings.py` or `os.remove` then commit.) + + Modify `backend/main.py`: + 1. Remove `from api.settings import router as settings_router` (line 18). + 2. Remove `app.include_router(settings_router)` (line 174). + + Modify `backend/tests/test_settings.py`: + 1. Delete all existing tests (they reference the removed endpoint and storage functions). + 2. Keep only the single test from Plan 03-01: + ``` + async def test_settings_endpoint_removed(async_client): + """D-12: /api/settings endpoint is removed in Phase 3.""" + resp = await async_client.get("/api/settings") + assert resp.status_code == 404 + ``` + Remove the `@pytest.mark.xfail` marker — this test should now pass green. + + + cd backend && pytest tests/test_settings.py tests/test_classifier.py::test_per_user_provider tests/test_classifier.py::test_celery_task_uses_user_provider tests/test_classifier.py::test_default_provider_fallback -x -q 2>&1 | tail -30 && test ! -f backend/api/settings.py && echo "settings.py deleted" && ! grep -q "load_settings" backend/services/storage.py && echo "load_settings removed" && grep -c "default_ai_provider" backend/config.py && grep -c "_DEFAULT_SYSTEM_PROMPT" backend/services/classifier.py && grep -c "user.ai_provider" backend/tasks/document_tasks.py + + + `backend/api/settings.py` file does not exist. `backend/services/storage.py` contains no `load_settings`, `save_settings`, `mask_api_key`, `settings_masked` (grep returns 0). `backend/config.py` contains `default_ai_provider` and `default_ai_model`. `backend/services/classifier.py` contains `_DEFAULT_SYSTEM_PROMPT` and accepts `ai_provider` kwarg. `backend/tasks/document_tasks.py` contains `user.ai_provider`. test_settings_endpoint_removed and the 3 classifier tests pass. + + + + + Task 2: Frontend — strip settings store/API and replace SettingsView with admin-managed placeholder + frontend/src/views/SettingsView.vue, frontend/src/stores/settings.js, frontend/src/api/client.js + + - frontend/src/views/SettingsView.vue — current form structure; replace entirely with placeholder card + - frontend/src/stores/settings.js — actions to remove (fetchSettings, save, testConnection, resetPrompt) + - frontend/src/api/client.js — getSettings, patchSettings, testProvider, getDefaultPrompt functions (lines ~110-132) + - frontend/src/router/index.js — verify /settings route still exists (do not remove — UI-SPEC Risk 6 says keep route with placeholder) + - .planning/phases/03-document-migration-multi-user-isolation/03-UI-SPEC.md — Phase 3 spec inherits Phase 2 design system; no specific SettingsView spec — use minimal card matching existing card style + - .planning/phases/03-document-migration-multi-user-isolation/03-RESEARCH.md — Finding 11 (frontend retirement scope), Risk 6 (UX regression mitigation) + + + - frontend/src/views/SettingsView.vue is replaced with a static placeholder: a heading "AI Configuration" and a card explaining "AI configuration is managed by your admin." Add a secondary link or text directing users to contact their admin. No form, no API calls, no useSettingsStore import + - frontend/src/stores/settings.js is reduced to an empty defineStore that exports no actions, OR deleted entirely. Plan: delete the file and update any imports (SettingsView no longer imports it; check if anything else imports it — grep). If other files import it, gut the file to a minimal noop store + - frontend/src/api/client.js removes the four functions: getSettings, patchSettings, testProvider, getDefaultPrompt (lines ~110-132 in current file). Adds a new export getMyQuota() that hits GET /api/auth/me/quota (used by QuotaBar in Plan 03-05 — pre-emptively add here so Plan 03-05 is a pure UI plan) + - Visiting /settings in the browser shows the admin-managed placeholder without any console errors + - SettingsView.vue uses only the Phase 2 design system (existing classes; text-sm / text-xl / bg-white / border-gray-200 / rounded-xl) — UI-SPEC inheritance + + + Rewrite `frontend/src/views/SettingsView.vue` with the placeholder content. Template structure: + ```vue + + + Settings + Account-level options for your DocuVault workspace. + + + AI configuration + + AI provider and model are managed by your administrator. Contact your admin + to request changes to which AI provider is used for your documents. + + + + + + + ``` + Remove any `
Account-level options for your DocuVault workspace.
+ AI provider and model are managed by your administrator. Contact your admin + to request changes to which AI provider is used for your documents. +