curo1305 bd17b4b22f docs(06.2): mark phase 6.2 complete — all gates passed
UAT complete (7/7 re-tests passed or skipped with reason), security gate
passed (threats_open: 0), 344 backend tests passing.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-01 21:09:04 +02:00


DocuVault — v1 Roadmap

Last updated: 2026-05-31

Mandatory Cross-Cutting Gates (every phase)

Before any phase is marked complete, all three gates must pass:

  1. Test gate — pytest -v passes with zero failures; every new function/endpoint has at least one test; all security invariant tests pass (wrong owner, admin block, token replay)
  2. Security gate — Security agent runs bandit -r backend/ (zero HIGH), pip audit (zero critical/high), npm audit --audit-level=high (zero high/critical); admin endpoints verified to never return password_hash, credentials_enc, or document content; no hardcoded secrets
  3. Bug fix rule — Any bug fix during execution must: (a) target the root cause, (b) change ≤50 lines, (c) include a regression test — no workarounds permitted

Phases

  • Phase 1: Infrastructure Foundation — PostgreSQL + MinIO wired into Docker Compose; Alembic migrations running; existing app still works
  • Phase 2: Users & Authentication — Full auth flow end-to-end (register, login, TOTP, backup codes, password reset, sign-out-all) with admin panel for user management
  • Phase 3: Document Migration & Multi-User Isolation — All documents in PostgreSQL + MinIO; per-user isolation enforced; existing UI still works
  • Phase 4: Folders, Sharing, Quotas & Document UX — Full document management UX (folders, sharing, quota bar, PDF preview, search, audit log)
  • Phase 5: Cloud Storage Backends — Users can connect OneDrive, Google Drive, Nextcloud, or WebDAV as a personal storage backend

Phase Details

Phase 1: Infrastructure Foundation

Goal: PostgreSQL + MinIO are wired into Docker Compose with a complete Alembic-managed schema; all services boot cleanly and the existing single-user document scanner continues to work exactly as before — no user-facing behavior change.
Mode: mvp
Depends on: Nothing (first phase)
Requirements: STORE-01, STORE-02, STORE-07

Success Criteria (what must be TRUE):

  1. docker compose up starts PostgreSQL, MinIO, and the FastAPI backend with no errors; health checks pass for all three services
  2. Running alembic upgrade head applies the initial migration cleanly against the fresh PostgreSQL instance with no errors
  3. The full existing document upload, text extraction, and AI classification workflow completes successfully — no regression in single-user behavior
  4. MinIO object key schema {user_id}/{document_id}/{uuid4()}{ext} is enforced in the model layer; human-readable filenames are stored in the DB column, not in the MinIO key
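
The key schema in criterion 4 could be built roughly as follows. This is a minimal sketch, not the project's model-layer code: the function name build_object_key and the extension-sanitising rule are assumptions made for illustration.

```python
import uuid
from pathlib import PurePosixPath


def build_object_key(user_id: uuid.UUID, document_id: uuid.UUID, filename: str) -> str:
    """Return a key shaped like {user_id}/{document_id}/{uuid4()}{ext}.

    Only the extension is taken from the user-supplied filename; the
    human-readable name is expected to be stored in a DB column instead.
    """
    ext = PurePosixPath(filename).suffix.lower()
    # Assumption: accept only plain alphanumeric extensions so nothing
    # path-like from the original filename can reach the object key.
    if not ext[1:].isalnum():
        ext = ""
    return f"{user_id}/{document_id}/{uuid.uuid4()}{ext}"
```

Because the last segment is a fresh uuid4, two uploads of the same filename never collide, and the key reveals nothing about the original name beyond its extension.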

Plans: 5 plans

  • 01-01-PLAN.md — Docker Compose service topology + Postgres init + Pydantic Settings + requirements
  • 01-02-PLAN.md — Wave 0 test scaffolds (xfail/skip stubs) + async pytest fixtures
  • 01-03-PLAN.md — SQLAlchemy ORM models + async engine + Alembic async migration (incl. alembic upgrade head)
  • 01-04-PLAN.md — StorageBackend ABC + MinIO backend + rewritten async services/storage.py
  • 01-05-PLAN.md — Lifespan + /health + API cutover + Celery worker + walking-skeleton e2e verify

Phase 2: Users & Authentication

Goal: Users can register, log in (with optional TOTP 2FA), reset their password, and sign out all active sessions; admins can manage user accounts and assign AI providers — all enforced by a complete FastAPI dependency chain.
Mode: mvp
Depends on: Phase 1
Requirements: AUTH-01, AUTH-02, AUTH-03, AUTH-04, AUTH-05, AUTH-06, AUTH-07, AUTH-08, SEC-01, SEC-02, SEC-03, SEC-05, SEC-06, SEC-07, ADMIN-01, ADMIN-02, ADMIN-03, ADMIN-04, ADMIN-05, ADMIN-07

Success Criteria (what must be TRUE):

  1. A new user can register with an email and password that passes strength validation; a password from the HaveIBeenPwned list is rejected with a clear error
  2. A logged-in user can enroll a TOTP authenticator app, receive 8-10 backup codes, explicitly acknowledge them, and thereafter be required to supply a TOTP code (or backup code) on every login — a backup code is invalidated on first use
  3. A user who forgets their password can receive a reset email, follow the link within 1 hour, set a new password, and is then returned to the TOTP login gate (not auto-logged in)
  4. A user can trigger "sign out all devices" from account settings; all other active sessions are immediately invalidated and any reuse of a rotated refresh token revokes the entire token family
  5. An admin user can create, deactivate, and reset a user account, and assign an AI provider and model to that user; admin API endpoints never return document content or credentials_enc (per-user document auth enforcement deferred to Phase 3 per D-07)
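
The token-family rule in criterion 4 (reuse of a rotated refresh token revokes the whole family) can be illustrated with an in-memory stand-in. Everything here is hypothetical: RefreshTokenStore, its fields, and the method names are not taken from the project, which would keep this state in PostgreSQL.

```python
import secrets
from dataclasses import dataclass, field
from typing import Dict, Optional, Set, Tuple


@dataclass
class RefreshTokenStore:
    """In-memory stand-in for a refresh-token table (illustrative only)."""

    # token -> (family_id, is_current)
    tokens: Dict[str, Tuple[str, bool]] = field(default_factory=dict)
    revoked_families: Set[str] = field(default_factory=set)

    def issue(self, family_id: Optional[str] = None) -> str:
        """Create a token, starting a new family unless one is given."""
        family_id = family_id or secrets.token_hex(8)
        token = secrets.token_urlsafe(32)
        self.tokens[token] = (family_id, True)
        return token

    def rotate(self, token: str) -> Optional[str]:
        """Exchange a token for its successor; return None if rejected."""
        entry = self.tokens.get(token)
        if entry is None:
            return None
        family_id, is_current = entry
        if family_id in self.revoked_families:
            return None
        if not is_current:
            # An already-rotated token was presented again: treat it as
            # theft and revoke every token descended from the same login.
            self.revoked_families.add(family_id)
            return None
        self.tokens[token] = (family_id, False)
        return self.issue(family_id)
```

In this sketch, "sign out all devices" would amount to adding every family belonging to the user to revoked_families.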

Plans: 5 plans

Wave 1 — Foundation

  • 02-01-PLAN.md — Auth service layer (Argon2, JWT, refresh tokens, TOTP, backup codes, HIBP, security alert), FastAPI deps, BackupCode model + password_must_change migration

Wave 2 (blocked on Wave 1 completion)

  • 02-02-PLAN.md — Register/login (TOTP + backup code paths) + refresh/logout/change-password endpoints + CSP/Origin validation/rate-limit (IP + per-account) + Vue auth store + router guard + Login/Register views

Wave 3 (blocked on Wave 2 completion)

  • 02-03-PLAN.md — TOTP enrollment + backup codes + password reset + sign-out-all endpoints + AccountView + TotpEnrollment + BackupCodesDisplay + PasswordReset views

Wave 4 (blocked on Wave 3 completion)

  • 02-04-PLAN.md — Admin backend: user CRUD, quota, AI config endpoints with get_current_admin enforced + tests

Wave 5 (blocked on Wave 4 completion)

  • 02-05-PLAN.md — Admin panel frontend: AdminView + three tab components + AppSidebar admin link and user identity footer

Cross-cutting constraints:

  • JWT access token in Pinia memory only — never localStorage (Plans 02, 03, 05)
  • Refresh token httpOnly SameSite=Strict cookie on all token issuance (Plans 02, 03)
  • Admin endpoints never return document content or credentials_enc (Plans 04, 05)
  • All auth endpoints rate-limited per-IP and per-account (Plans 02, 03)
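
As an illustration of the refresh-token cookie constraint above, the attributes can be assembled with the standard library. This is a sketch only: the cookie name refresh_token, the /api/auth path, and the Secure flag are assumptions not stated in this roadmap, and the real backend would set the cookie through FastAPI's response object.

```python
from http.cookies import SimpleCookie


def refresh_cookie_header(token: str, max_age_seconds: int) -> str:
    """Build a Set-Cookie value for an httpOnly, SameSite=Strict cookie."""
    cookie = SimpleCookie()
    cookie["refresh_token"] = token          # cookie name is an assumption
    morsel = cookie["refresh_token"]
    morsel["httponly"] = True                # not readable from JavaScript
    morsel["samesite"] = "Strict"            # never sent on cross-site requests
    morsel["secure"] = True                  # assumption: HTTPS-only deployment
    morsel["path"] = "/api/auth"             # assumption: limit to auth routes
    morsel["max-age"] = max_age_seconds
    return morsel.OutputString()
```

The access token is deliberately absent here, since the constraint above keeps it in Pinia memory rather than in any cookie or localStorage.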

UI hint: yes


Phase 3: Document Migration & Multi-User Isolation

Goal: All existing documents have been migrated from flat-file JSON + filesystem into PostgreSQL + MinIO; all new uploads use the presigned URL flow; per-user isolation is enforced at the DB level; the existing document UI works without regression; the backend is stateless and ready for horizontal scaling.
Mode: mvp
Depends on: Phase 2
Requirements: STORE-03, STORE-04, STORE-05, STORE-06, STORE-08, SEC-04, DOC-03, DOC-04, DOC-05

Success Criteria (what must be TRUE):

  1. Every document present before migration is accessible after migration with the same metadata and extracted text; a count reconciliation check confirms zero document loss
  2. Two concurrent uploads that would together exceed a user's 100 MB quota result in exactly one success and one 413 rejection — the quota never goes over limit
  3. A document delete atomically decrements the user's recorded quota usage; after deletion the quota reflects the freed bytes
  4. Requesting a document object key or presigned URL for a document owned by a different user returns 403 — no cross-user object access is possible through any request parameter manipulation; all /api/documents/* endpoints enforce get_current_user and return 403 when the requesting user's role is admin (completing SC5 from Phase 2)
  5. AI classification for each document uses the provider and model assigned to that user by the admin, not any user-supplied or default value
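
Criteria 2 and 3 both hinge on a single conditional UPDATE rather than a read-then-write. The sketch below shows the idea using sqlite3 purely so that it runs standalone; the project targets PostgreSQL, and the table and column names (users, used_bytes, quota_bytes) are assumptions.

```python
import sqlite3

QUOTA_BYTES = 100 * 1024 * 1024  # 100 MB, per criterion 2


def reserve_quota(conn: sqlite3.Connection, user_id: int, size: int) -> bool:
    """Add `size` to used_bytes only if the result still fits the quota.

    The check and the increment happen in one statement, so two racing
    uploads cannot both pass a stale check.
    """
    cur = conn.execute(
        "UPDATE users SET used_bytes = used_bytes + ? "
        "WHERE id = ? AND used_bytes + ? <= quota_bytes",
        (size, user_id, size),
    )
    conn.commit()
    return cur.rowcount == 1  # False would map to a 413 response


def release_quota(conn: sqlite3.Connection, user_id: int, size: int) -> None:
    """Give bytes back on delete, never dropping below zero."""
    conn.execute(
        "UPDATE users SET used_bytes = MAX(used_bytes - ?, 0) WHERE id = ?",
        (size, user_id),
    )
    conn.commit()
```

With two 60 MB uploads against a 100 MB quota, the first call returns True and the second returns False, which matches the "exactly one success and one 413" wording.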

Plans: 5 plans

Wave 1 — Migration + test scaffolds

  • 03-01-PLAN.md — Wave 0 test scaffolds (auth_user/admin_user/MinIO mock fixtures + 19 xfail stubs) + Alembic migration 0003 (null-user cleanup, NOT NULL constraint, topic cleanup, quota reconciliation, ix_topics_user_id) — Complete 2026-05-23

Wave 2 (blocked on Wave 1)

  • 03-02-PLAN.md — Presigned upload backend: StorageBackend ABC + MinIOBackend dual client + generate_presigned_put_url/stat_object + /api/documents/upload-url + /api/documents/{id}/confirm with atomic quota UPDATE + GET /api/auth/me/quota + delete-with-quota + abandoned-upload Celery beat + docker-compose CORS/celery-beat

Wave 3 (blocked on Wave 2)

  • 03-03-PLAN.md — Auth guards: get_regular_user dep + ownership assertions on every /api/documents/* handler (404 not 403) + admin 403 + real user_id in object_key + namespace-scoped /api/topics/* + POST /api/admin/topics + classifier topic-namespace plumbing

Wave 4 (blocked on Wave 3)

  • 03-04-PLAN.md — Settings retirement + per-user AI: delete /api/settings + remove load_settings/save_settings + classifier accepts ai_provider/ai_model kwargs + Celery task resolves user.ai_provider via DB + frontend SettingsView placeholder + remove settings store/API — Complete 2026-05-23

Wave 5 (blocked on Wave 4)

  • 03-05-PLAN.md — Frontend upload flow + quota bar: 3-step upload action with XHR progress + UploadProgress.vue progress bar and quota rejection error block + QuotaBar.vue + AppSidebar embed + quota state in auth store + human checkpoint

Cross-cutting constraints:

  • The atomic quota UPDATE pattern lives only in Plan 02; never duplicate it (CLAUDE.md)
  • Every /api/documents/* handler injects get_regular_user (Plan 03)
  • AI provider/model resolved only via Celery task DB lookup (Plan 04)
  • Browser XHR PUT to MinIO sends NO Authorization header (Plan 05)

Phase gates (must pass before Phase 3 is complete):

  • pytest -v — zero failures; presigned URL, quota enforcement, ownership isolation, and admin-403 all covered
  • Security agent: path traversal check on object key construction; cross-user IDOR tests; quota race condition test
  • Bandit + pip audit + npm audit all clean

UI hint: yes


Phase 4: Folders, Sharing, Quotas & Document UX

Goal: Users have a complete document management experience — organized with folders, shared by handle, warned before they hit quota, able to preview PDFs in-browser, and served by a searchable document list; admins can view the append-only audit log.
Mode: mvp
Depends on: Phase 3
Requirements: FOLD-01, FOLD-02, FOLD-03, FOLD-04, FOLD-05, SHARE-01, SHARE-02, SHARE-03, SHARE-04, SHARE-05, SEC-08, SEC-09, ADMIN-06, DOC-01, DOC-02

Success Criteria (what must be TRUE):

  1. A user can create, rename, and delete folders; moving a document between folders preserves its metadata and AI classification; deleting a non-empty folder prompts with the content count before proceeding
  2. A user can share a document with another user by handle; the recipient sees it appear in a "Shared with me" virtual folder with no storage quota charged against them; the owner can revoke access and the shared entry disappears immediately for the recipient
  3. The sidebar quota bar displays current usage in MB; it turns amber at 80% and red at 95%; an upload that would exceed the limit is rejected with an error showing current usage, the rejected file size, and a link to storage settings
  4. Any document in the user's library can be previewed in-browser as a PDF; document bytes are proxied through the app and no presigned URLs are exposed to the browser (native browser PDF rendering via Content-Type header)
  5. An admin can view the audit log filtered by date range, user, and action type; the log contains no document content, filenames, or extracted text; account deletion triggers cleanup of all user files before DB records are removed
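
The quota bar thresholds in criterion 3 reduce to a small pure function. The real component is a Vue file (QuotaBar.vue); the Python sketch below only illustrates the threshold logic. The state name "normal" and the choice to treat exactly 80% and 95% as already amber and red are assumptions.

```python
def quota_bar_state(used_bytes: int, limit_bytes: int) -> str:
    """Map usage to a bar colour: amber from 80%, red from 95%."""
    if limit_bytes <= 0:
        raise ValueError("limit_bytes must be positive")
    # Integer arithmetic avoids float rounding right at the thresholds.
    if used_bytes * 100 >= limit_bytes * 95:
        return "red"
    if used_bytes * 100 >= limit_bytes * 80:
        return "amber"
    return "normal"
```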

Plans: 9 plans

Wave 1 — Test scaffolds + DB migration (parallel)

  • 04-01-PLAN.md — Wave 0 test stubs: test_folders.py + test_shares.py + test_audit.py + proxy stubs in test_documents.py + SEC-08/SEC-09 stubs in test_security.py
  • 04-02-PLAN.md — Alembic migration 0004 (users.pdf_open_mode, GIN FTS index, audit-logs bucket) + MinIOBackend.put_object_raw()

Wave 2 (blocked on Wave 1)

  • 04-03-PLAN.md — Audit service (write_audit_log) + Folders API (FOLD-01..05): POST/GET/PATCH/DELETE /api/folders + PATCH /api/documents/{id}/folder + document list sort/search/is_shared extension
  • 04-04-PLAN.md — Shares API (SHARE-01..05): POST/GET /api/shares + GET /api/shares/received + DELETE /api/shares/{id} with IDOR protection

Wave 3 (blocked on Wave 2)

  • 04-05-PLAN.md — PDF streaming proxy GET /api/documents/{id}/content with Range header support + PATCH /api/auth/me/preferences (pdf_open_mode)
  • 04-06-PLAN.md — Admin audit log API (GET /api/admin/audit-log, CSV export) + Celery beat daily audit export task + celery_app.py beat schedule
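
For the Range header support mentioned in 04-05, a single-range parser might look like the sketch below. It is an assumption-laden illustration (the function name parse_range is invented, and multi-range requests are simply rejected), not the proxy's actual code.

```python
import re
from typing import Optional, Tuple

_RANGE_RE = re.compile(r"^bytes=(\d*)-(\d*)$")


def parse_range(header: str, size: int) -> Optional[Tuple[int, int]]:
    """Return an inclusive (start, end) byte range, or None if unsatisfiable.

    Handles "bytes=0-99", "bytes=900-" and the suffix form "bytes=-100".
    """
    match = _RANGE_RE.match(header.strip())
    if match is None:
        return None
    start_s, end_s = match.groups()
    if not start_s and not end_s:
        return None
    if not start_s:
        # Suffix form: the last N bytes of the object.
        suffix_len = int(end_s)
        if suffix_len == 0:
            return None
        start, end = max(size - suffix_len, 0), size - 1
    else:
        start = int(start_s)
        end = min(int(end_s), size - 1) if end_s else size - 1
    if start > end or start >= size:
        return None
    return start, end
```

A None result would typically map to a 416 response for a well-formed but unsatisfiable range, while a missing Range header means the whole document is streamed.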

Wave 4 (blocked on Wave 3)

  • 04-07-PLAN.md — SEC-08/SEC-09 hardening + audit log backfill into auth.py/admin.py/documents.py + CloudConnectionOut Pydantic model + delete-user file cleanup

Wave 5 (blocked on Wave 4)

  • 04-08-PLAN.md — Frontend data layer: API client functions + useFoldersStore + documents store extension + Vue Router routes (/folders/:folderId, /shared)

Wave 6 (blocked on Wave 5)

  • 04-09-PLAN.md — Frontend UI: all new components (FolderRow, FolderBreadcrumb, FolderDeleteModal, ShareModal, DocumentPreviewModal, SearchBar, SortControls, AuditLogTab) + view wiring (AppSidebar, DocumentCard, HomeView, FolderView, SharedView, SettingsView, AdminView) + human checkpoint

Phase gates (must pass before Phase 4 is complete):

  • pytest -v — zero failures; folder ownership, share revocation, quota bar, PDF proxy (no presigned URL exposure) all covered
  • Security agent: audit log verified to contain zero document content; sharing IDOR tests; PDF proxy verified to not leak presigned URLs or object keys
  • Bandit + pip audit + npm audit all clean

UI hint: yes


Phase 5: Cloud Storage Backends

Goal: Users can connect OneDrive, Google Drive, Nextcloud, or a generic WebDAV server as a personal storage backend; credentials are encrypted with a per-user HKDF-derived key; connection status is visible; local and cloud storage coexist; the StorageBackend ABC makes adding further backends straightforward.
Mode: mvp
Depends on: Phase 4
Requirements: CLOUD-01, CLOUD-02, CLOUD-03, CLOUD-04, CLOUD-05, CLOUD-06, CLOUD-07

Success Criteria (what must be TRUE):

  1. A user can connect OneDrive, Google Drive, Nextcloud, or a WebDAV endpoint through an OAuth or credential flow; the connection status is displayed as ACTIVE, REQUIRES_REAUTH, or ERROR — never shows raw credentials
  2. When an OAuth token is revoked externally (simulated invalid_grant response), the connection status transitions to REQUIRES_REAUTH without a 500 error; the user is shown a re-authentication prompt
  3. A user can select their connected cloud backend as the default storage destination for new uploads; local MinIO storage remains available as an alternative; existing local documents are unaffected
  4. A user can disconnect a cloud backend; credentials are permanently deleted from the DB and a subsequent attempt to use that backend returns an appropriate error — no orphaned data remains
  5. An admin API response for a user's cloud connections returns only provider, display_name, connected_at, status — the credentials_enc column is never present in any serialized response
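
Criteria 2 and 5 can be sketched together: an explicit output model that lists only the four permitted fields, and a status transition for a revoked token. The roadmap names a CloudConnectionOut Pydantic model; the dataclass below is a standard-library stand-in, and to_public and status_after_refresh_error are hypothetical helper names.

```python
from dataclasses import asdict, dataclass
from enum import Enum
from typing import Any, Dict


class ConnectionStatus(str, Enum):
    ACTIVE = "ACTIVE"
    REQUIRES_REAUTH = "REQUIRES_REAUTH"
    ERROR = "ERROR"


@dataclass(frozen=True)
class CloudConnectionOut:
    """Stand-in for the response model: an allowlist of fields."""

    provider: str
    display_name: str
    connected_at: str
    status: str


def to_public(row: Dict[str, Any]) -> Dict[str, Any]:
    """Copy only allowlisted fields; credentials_enc is never read."""
    return asdict(
        CloudConnectionOut(
            provider=row["provider"],
            display_name=row["display_name"],
            connected_at=row["connected_at"],
            status=row["status"],
        )
    )


def status_after_refresh_error(oauth_error: str) -> ConnectionStatus:
    """Map an OAuth token-refresh error code to a connection status."""
    if oauth_error == "invalid_grant":
        # Token revoked externally: ask the user to reconnect, no 500.
        return ConnectionStatus.REQUIRES_REAUTH
    return ConnectionStatus.ERROR
```

Building the response from an allowlist, rather than deleting sensitive keys from a full row, means a newly added column stays private by default.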

Plans: 12 plans (8 original + 3 UAT gap closure + 1 gap closure wave)

Wave 1 — Test scaffold + dependencies

  • 05-01-PLAN.md — Wave 0 xfail stubs, conftest cloud fixtures, requirements.txt packages, config.py settings

Wave 2 — Shared utilities

  • 05-02-PLAN.md — cloud_utils.py (SSRF + HKDF), cloud_cache.py (TTLCache), storage factory extension
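
The per-user key derivation that cloud_utils.py is responsible for could follow the HKDF construction from RFC 5869. The sketch below implements HKDF-SHA256 with the standard library; how the project actually chooses salt and info is not stated here, so derive_user_key and its "cloud-credentials:" label are assumptions.

```python
import hashlib
import hmac

_HASH_LEN = 32  # SHA-256 output size in bytes


def hkdf_sha256(ikm: bytes, salt: bytes, info: bytes, length: int = 32) -> bytes:
    """HKDF (extract-then-expand, RFC 5869) using HMAC-SHA256."""
    if length > 255 * _HASH_LEN:
        raise ValueError("requested key is too long for HKDF-SHA256")
    if not salt:
        salt = b"\x00" * _HASH_LEN
    # Extract: concentrate the input keying material into a fixed-size key.
    prk = hmac.new(salt, ikm, hashlib.sha256).digest()
    # Expand: chain HMAC blocks until enough output has been produced.
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def derive_user_key(master_key: bytes, user_id: str) -> bytes:
    """Derive a 32-byte key bound to one user (label is hypothetical)."""
    return hkdf_sha256(master_key, salt=b"", info=b"cloud-credentials:" + user_id.encode())
```

Binding the user ID into info means a ciphertext copied from one user's row cannot be decrypted with another user's derived key.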

Wave 3 — Cloud backends (parallel, both blocked on Wave 2 / Plan 05-02)

  • 05-03-PLAN.md — GoogleDriveBackend + OneDriveBackend (all 7 StorageBackend methods)
  • 05-04-PLAN.md — NextcloudBackend + WebDAVBackend (all 7 StorageBackend methods)

Wave 4 — Cloud API

  • 05-05-PLAN.md — All /api/cloud/* endpoints + /api/users/me/default-storage + main.py router registration

Wave 5 — Document routing + full test suite

  • 05-06-PLAN.md — Upload/content proxy cloud routing + all 15 tests promoted to passing

Wave 6 — Frontend settings UI

  • 05-07-PLAN.md — cloudConnections store + API client + SettingsView 3-tab + SettingsCloudTab + CloudCredentialModal

Wave 7 — Frontend sidebar (human checkpoint)

  • 05-08-PLAN.md — AppSidebar cloud section + CloudProviderTreeItem + CloudFolderTreeItem + human checkpoint

Wave 8 — UAT gap closure (parallel, all independent)

  • 05-09-PLAN.md — Cloud document open/re-analyze/edit: authenticated fetch+Blob URL, cloud-aware Celery task, PATCH /api/documents/{id}
  • 05-10-PLAN.md — OAuth initiate fix (JSON response), Nextcloud custom endpoint edit round-trip, Edit button on ERROR rows, confirmation text overflow
  • 05-11-PLAN.md — Admin hard-delete with password confirmation: UserDeleteConfirm backend model + inline frontend panel

Wave 9 — Post-UAT gap closure

  • 05-12-PLAN.md — OAuth 400 preflight (unconfigured creds), 502 cloud fallback, upload hint in CloudStorageView, celery-worker volume mount

Phase gates (must pass before Phase 5 is complete):

  • pytest -v — zero failures; SSRF prevention on WebDAV/Nextcloud user-supplied URLs; credential encryption/decryption round-trip; admin response never exposes credentials_enc; OAuth invalid_grant handling
  • Security agent: SSRF allowlist verification; credential key derivation correctness; connection status never leaks raw credential values
  • Bandit + pip audit + npm audit all clean
  • UAT gaps resolved and re-tested (05-09, 05-10, 05-11, 05-12)
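
The SSRF gate for user-supplied WebDAV/Nextcloud URLs could be approached as in the sketch below, which rejects URLs whose host resolves to a non-public address. This is one possible technique (a range blocklist on resolved addresses), not necessarily what cloud_utils.py does; check_user_url and the injectable resolve parameter are invented for illustration.

```python
import ipaddress
import socket
from typing import Callable, List, Optional
from urllib.parse import urlsplit


def is_public_ip(address: str) -> bool:
    """True only for addresses outside private and special-purpose ranges."""
    ip = ipaddress.ip_address(address)
    return not (
        ip.is_private or ip.is_loopback or ip.is_link_local
        or ip.is_multicast or ip.is_reserved or ip.is_unspecified
    )


def _resolve(host: str) -> List[str]:
    """Resolve a hostname to all of its addresses."""
    return [info[4][0] for info in socket.getaddrinfo(host, None)]


def check_user_url(url: str, resolve: Optional[Callable[[str], List[str]]] = None) -> bool:
    """Accept http(s) URLs whose host resolves only to public addresses."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.hostname:
        return False
    try:
        addresses = (resolve or _resolve)(parts.hostname)
    except OSError:
        return False
    # Every resolved address must be public, not just the first one.
    return bool(addresses) and all(is_public_ip(a) for a in addresses)
```

A check like this only covers the moment of validation; the address used for the actual request should be the one that was validated, otherwise a DNS change between check and use can bypass it.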

UI hint: yes


Phase 6: Performance & Production Hardening

Goal: The application is ready for production deployment — observable, load-tested, and hardened; response times meet SLA targets under concurrent load; all auth and document endpoints are rate-limited; structured logging and distributed tracing are in place; the Docker image runs as a non-root user with a read-only filesystem.
Mode: mvp
Depends on: Phase 5
Requirements: TBD

Success Criteria (what must be TRUE):

  1. All API endpoints respond within defined latency targets (p50/p95/p99) under a realistic load test (e.g., 50 concurrent users, 5-minute soak)
  2. Structured JSON logging (correlation IDs, user ID, request latency) is emitted to stdout; a local log aggregation stack (Loki or similar) captures and queries them
  3. All auth endpoints (login, register, password reset, TOTP) enforce per-IP and per-account rate limits that cannot be bypassed by header manipulation
  4. Container hardening is complete: non-root user, read-only root filesystem, dropped Linux capabilities; docker scout or equivalent reports zero critical CVEs
  5. A runbook documents all environment variables, startup/shutdown procedures, backup strategy, and on-call escalation path; the app can be stood up from scratch using only the runbook
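
Criterion 3 (per-IP and per-account limits that resist header manipulation) might be prototyped with a sliding-window counter like the one below. It is an in-process sketch with invented names (SlidingWindowLimiter, login_allowed); a multi-worker deployment would need shared state, and the limits themselves are not specified in this roadmap.

```python
import time
from collections import defaultdict, deque
from typing import Callable, Deque, Dict


class SlidingWindowLimiter:
    """Allow at most `limit` hits per key within `window_seconds`."""

    def __init__(self, limit: int, window_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._limit = limit
        self._window = window_seconds
        self._clock = clock
        self._hits: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, key: str) -> bool:
        now = self._clock()
        hits = self._hits[key]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self._window:
            hits.popleft()
        if len(hits) >= self._limit:
            return False
        hits.append(now)
        return True


def login_allowed(ip_limiter: SlidingWindowLimiter,
                  account_limiter: SlidingWindowLimiter,
                  peer_ip: str, email: str) -> bool:
    """Check both limits; evaluate both so each one records the attempt."""
    ip_ok = ip_limiter.allow(f"ip:{peer_ip}")
    account_ok = account_limiter.allow(f"acct:{email.lower()}")
    return ip_ok and account_ok
```

To resist header manipulation, peer_ip would be taken from the socket peer (or from a forwarded header only when it was set by a trusted proxy), never from a client-supplied header directly.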

Plans: TBD


Phase 6.1: Close v1.0 audit gaps: SHARE-02/STORE-06/ADMIN-06

Goal: Close three v1.0 requirements that remain unimplemented — atomic quota decrement on document delete (STORE-06), "Shared with me" virtual folder without recipient quota charge (SHARE-02), and admin audit log viewer with date/user/action type filters (ADMIN-06).
Mode: mvp
Depends on: Phase 6
Requirements: STORE-06, SHARE-02, ADMIN-06

Success Criteria (what must be TRUE):

  1. Deleting a document atomically decrements the owning user's quota; after deletion the quota reflects the freed bytes with no race condition under concurrent deletes
  2. A user who receives a shared document sees it appear in a "Shared with me" virtual folder; the recipient's quota usage is not charged for the shared document's storage
  3. An admin can view the audit log filtered independently by date range, user, and action type; filtered results contain no document content, filenames, or extracted text

Plans: 2 plans

Wave 1 — Test promotion (parallel)

  • 06.1-01-PLAN.md — Promote test_shares.py stubs to real tests + second_auth_user fixture (SHARE-01..05)
  • 06.1-02-PLAN.md — Promote test_audit.py stubs to real tests (ADMIN-06)

Phase gates (must pass before Phase 6.1 is complete):

  • pytest -v — zero failures; all 7 share tests + 4 audit log tests passing
  • Security agent: bandit + pip audit + npm audit all clean
  • STORE-06 confirmed: test_delete_decrements_quota passes under INTEGRATION=1

Phase 6.2: Close v1 sharing + cloud-delete + CSV export gaps

Goal: Close remaining v1 gaps — sharing edge cases (SHARE-03/SHARE-05), cloud document deletion propagation to the remote backend, and CSV export + daily export UI for the admin audit log (ADMIN-06).
Mode: mvp
Depends on: Phase 6.1
Requirements: SHARE-03, SHARE-05, ADMIN-06

Success Criteria (what must be TRUE):

  1. Documents shared with others display a "Shared" badge in the owner's list view (reads doc.is_shared, not doc.share_count)
  2. Owner can set permission to "view" or "edit" when creating a share and toggle it per-recipient afterward; PATCH /api/shares/{id} enforces IDOR protection (404 on wrong owner)
  3. Deleting a cloud document propagates the delete to the cloud provider; failure shows a warning modal with "Remove from app" fallback; ?remove_only=true removes only the DB record; cloud docs never affect quota on delete
  4. Admin can download filtered audit log CSV via fetch+Blob (not window.location.href); audit log entries show user handles instead of raw UUIDs; user filter accepts handles (not UUIDs)
  5. Admin can list and download Celery-generated daily audit export files from a new section in the Audit Log tab

Plans: 4 plans

Wave 0 — Test stubs

  • 06.2-01-PLAN.md — 11 xfail stubs across test_shares.py, test_documents.py, test_audit.py

Wave 1 — Feature slices (parallel)

  • 06.2-02-PLAN.md — SHARE-05 badge fix + SHARE-03 permission control (backend PATCH + frontend dropdown + toggle)
  • 06.2-03-PLAN.md — Cloud-delete propagation + structured error response + remove_only path + DocumentView warning modal

Wave 2 — Audit log enrichment

  • 06.2-04-PLAN.md — Audit handle JOIN + user_handle filter + CSV fetch+Blob fix + daily-export list + download endpoints + AuditLogTab UI

Phase gates (must pass before Phase 6.2 is complete):

  • pytest -v — 344 passed, 1 pre-existing unrelated failure (test_extract_docx missing module)
  • Security agent: bandit + pip audit + npm audit all clean (SECURITY.md threats_open: 0)
  • IDOR on PATCH /api/shares/{id}: test_share_patch_idor passes
  • Date regex validation confirmed: GET /api/admin/audit-log/daily-exports/invalid-date returns 404
  • window.location.href removed from AuditLogTab.vue confirmed by grep
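
The date validation gate for the daily-exports endpoint can be illustrated as below: the path segment is accepted only if it is a strict YYYY-MM-DD calendar date, and anything else is treated as not found. The helper names and the audit-{date}.csv object name are assumptions made for this sketch.

```python
import re
from datetime import date
from typing import Optional

_DATE_RE = re.compile(r"\d{4}-\d{2}-\d{2}")


def parse_export_date(raw: str) -> Optional[date]:
    """Return a date for a strict YYYY-MM-DD string, else None."""
    if not _DATE_RE.fullmatch(raw):
        return None
    try:
        return date.fromisoformat(raw)
    except ValueError:
        # Shape was right but the calendar date does not exist.
        return None


def export_object_name(raw: str) -> Optional[str]:
    """Map a validated date to an export file name (hypothetical scheme)."""
    parsed = parse_export_date(raw)
    if parsed is None:
        return None  # the endpoint would answer 404 here
    return f"audit-{parsed.isoformat()}.csv"
```

Because the object name is rebuilt from the parsed date rather than from the raw string, path-like input never reaches the storage layer.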

Status: ✓ Complete (2026-06-01)


Progress Table

Phase                                                  | Plans Complete | Status      | Completed
-------------------------------------------------------|----------------|-------------|-----------
1. Infrastructure Foundation                           | 5/5            | Complete    | 2026-05-22
2. Users & Authentication                              | 6/6            | Complete    | 2026-06-01
3. Document Migration & Multi-User Isolation           | 5/5            | Complete    | 2026-05-25
4. Folders, Sharing, Quotas & Document UX              | 9/9            | Complete    | 2026-05-28
5. Cloud Storage Backends                              | 12/12          | Complete    | 2026-05-30
6. Performance & Production Hardening                  | 0/TBD          | Not started | -
6.1. Close v1.0 audit gaps                             | 2/2            | Complete    | 2026-05-30
6.2. Close v1 sharing + cloud-delete + CSV export gaps | 5/5            | Complete    | 2026-05-31