DocuVault — v1 Roadmap
Last updated: 2026-05-31
Mandatory Cross-Cutting Gates (every phase)
Before any phase is marked complete, all three gates must pass:
- Test gate: `pytest -v` passes with zero failures; every new function/endpoint has at least one test; all security invariant tests pass (wrong owner, admin block, token replay)
- Security gate: Security agent runs `bandit -r backend/` (zero HIGH), `pip audit` (zero critical/high), `npm audit --audit-level=high` (zero high/critical); admin endpoints verified to never return `password_hash`, `credentials_enc`, or document content; no hardcoded secrets
- Bug fix rule: any bug fix during execution must (a) target the root cause, (b) change ≤50 lines, and (c) include a regression test; no workarounds permitted
Phases
- Phase 1: Infrastructure Foundation — PostgreSQL + MinIO wired into Docker Compose; Alembic migrations running; existing app still works
- Phase 2: Users & Authentication — Full auth flow end-to-end (register, login, TOTP, backup codes, password reset, sign-out-all) with admin panel for user management
- Phase 3: Document Migration & Multi-User Isolation — All documents in PostgreSQL + MinIO; per-user isolation enforced; existing UI still works
- Phase 4: Folders, Sharing, Quotas & Document UX — Full document management UX (folders, sharing, quota bar, PDF preview, search, audit log)
- Phase 5: Cloud Storage Backends — Users can connect OneDrive, Google Drive, Nextcloud, or WebDAV as a personal storage backend
Phase Details
Phase 1: Infrastructure Foundation
Goal: PostgreSQL + MinIO are wired into Docker Compose with a complete Alembic-managed schema; all services boot cleanly and the existing single-user document scanner continues to work exactly as before, with no user-facing behavior change.
Mode: mvp
Depends on: Nothing (first phase)
Requirements: STORE-01, STORE-02, STORE-07
Success Criteria (what must be TRUE):
- `docker compose up` starts PostgreSQL, MinIO, and the FastAPI backend with no errors; health checks pass for all three services
- Running `alembic upgrade head` applies the initial migration cleanly against the fresh PostgreSQL instance with no errors
- The full existing document upload, text extraction, and AI classification workflow completes successfully, with no regression in single-user behavior
- MinIO object key schema `{user_id}/{document_id}/{uuid4()}{ext}` is enforced in the model layer; human-readable filenames are stored in the DB column, not in the MinIO key
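The object key schema in the last criterion can be illustrated with a minimal sketch. The function name `build_object_key` and the extension-sanitizing policy are assumptions made for illustration; the roadmap only fixes the key layout and the rule that the human-readable filename stays out of the key.

```python
import uuid
from pathlib import PurePosixPath


def build_object_key(user_id: uuid.UUID, document_id: uuid.UUID, filename: str) -> str:
    """Return a key shaped like '{user_id}/{document_id}/{uuid4}{ext}'."""
    # Only the extension is taken from the user-supplied filename.
    ext = PurePosixPath(filename.replace("\\", "/")).suffix.lower()
    # Assumed policy: keep short alphanumeric extensions, drop anything else.
    if not (1 < len(ext) <= 11 and ext[1:].isalnum()):
        ext = ""
    # The random component means the key never reveals the original name.
    return f"{user_id}/{document_id}/{uuid.uuid4()}{ext}"
```

Because the filename contributes nothing but a sanitized extension, path-traversal strings such as `../../etc/passwd` cannot add extra path segments to the key.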
Plans: 5 plans
- 01-01-PLAN.md — Docker Compose service topology + Postgres init + Pydantic Settings + requirements
- 01-02-PLAN.md — Wave 0 test scaffolds (xfail/skip stubs) + async pytest fixtures
- 01-03-PLAN.md — SQLAlchemy ORM models + async engine + Alembic async migration (incl. alembic upgrade head)
- 01-04-PLAN.md — StorageBackend ABC + MinIO backend + rewritten async services/storage.py
- 01-05-PLAN.md — Lifespan + /health + API cutover + Celery worker + walking-skeleton e2e verify
Phase 2: Users & Authentication
Goal: Users can register, log in (with optional TOTP 2FA), reset their password, and sign out all active sessions; admins can manage user accounts and assign AI providers, all enforced by a complete FastAPI dependency chain.
Mode: mvp
Depends on: Phase 1
Requirements: AUTH-01, AUTH-02, AUTH-03, AUTH-04, AUTH-05, AUTH-06, AUTH-07, AUTH-08, SEC-01, SEC-02, SEC-03, SEC-05, SEC-06, SEC-07, ADMIN-01, ADMIN-02, ADMIN-03, ADMIN-04, ADMIN-05, ADMIN-07
Success Criteria (what must be TRUE):
- A new user can register with an email and password that passes strength validation; a password from the HaveIBeenPwned list is rejected with a clear error
- A logged-in user can enroll a TOTP authenticator app, receive 8–10 backup codes, explicitly acknowledge them, and thereafter be required to supply a TOTP code (or backup code) on every login — a backup code is invalidated on first use
- A user who forgets their password can receive a reset email, follow the link within 1 hour, set a new password, and is then returned to the TOTP login gate (not auto-logged in)
- A user can trigger "sign out all devices" from account settings; all other active sessions are immediately invalidated and any reuse of a rotated refresh token revokes the entire token family
- An admin user can create, deactivate, and reset a user account, and assign an AI provider and model to that user; admin API endpoints never return document content or credentials_enc (per-user document auth enforcement deferred to Phase 3 per D-07)
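The refresh-token criterion above (reuse of a rotated token revokes the entire family) can be sketched as follows. This is an in-memory illustration only: `RefreshTokenStore` and its fields are hypothetical names, and a real implementation would presumably persist hashed tokens in PostgreSQL rather than hold raw tokens in a dict.

```python
from __future__ import annotations

import secrets
from dataclasses import dataclass, field


@dataclass
class RefreshTokenStore:
    """In-memory stand-in for a refresh-token table (hypothetical)."""

    family_of: dict = field(default_factory=dict)        # token -> family id
    used: set = field(default_factory=set)               # tokens already rotated
    revoked_families: set = field(default_factory=set)   # families killed by replay

    def issue(self, family_id: str | None = None) -> str:
        """Issue a token, starting a new family unless one is given."""
        family_id = family_id or secrets.token_hex(8)
        token = secrets.token_urlsafe(32)
        self.family_of[token] = family_id
        return token

    def rotate(self, token: str) -> str | None:
        """Exchange a token for its successor, or return None if refused."""
        family_id = self.family_of.get(token)
        if family_id is None or family_id in self.revoked_families:
            return None
        if token in self.used:
            # Replay of an already-rotated token: revoke the whole family.
            self.revoked_families.add(family_id)
            return None
        self.used.add(token)
        return self.issue(family_id)
```

After a replay is detected, even the newest token in the family stops working, so the legitimate user and the attacker are both forced back through the login gate.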
Plans: 5 plans
Wave 1 — Foundation
- 02-01-PLAN.md — Auth service layer (Argon2, JWT, refresh tokens, TOTP, backup codes, HIBP, security alert), FastAPI deps, BackupCode model + password_must_change migration
Wave 2 (blocked on Wave 1 completion)
- 02-02-PLAN.md — Register/login (TOTP + backup code paths) + refresh/logout/change-password endpoints + CSP/Origin validation/rate-limit (IP + per-account) + Vue auth store + router guard + Login/Register views
Wave 3 (blocked on Wave 2 completion)
- 02-03-PLAN.md — TOTP enrollment + backup codes + password reset + sign-out-all endpoints + AccountView + TotpEnrollment + BackupCodesDisplay + PasswordReset views
Wave 4 (blocked on Wave 3 completion)
- 02-04-PLAN.md — Admin backend: user CRUD, quota, AI config endpoints with get_current_admin enforced + tests
Wave 5 (blocked on Wave 4 completion)
- 02-05-PLAN.md — Admin panel frontend: AdminView + three tab components + AppSidebar admin link and user identity footer
Cross-cutting constraints:
- JWT access token in Pinia memory only — never localStorage (Plans 02, 03, 05)
- Refresh token httpOnly SameSite=Strict cookie on all token issuance (Plans 02, 03)
- Admin endpoints never return document content or credentials_enc (Plans 04, 05)
- All auth endpoints rate-limited per-IP and per-account (Plans 02, 03)
UI hint: yes
Phase 3: Document Migration & Multi-User Isolation
Goal: All existing documents have been migrated from flat-file JSON + filesystem into PostgreSQL + MinIO; all new uploads use the presigned URL flow; per-user isolation is enforced at the DB level; the existing document UI works without regression; the backend is stateless and ready for horizontal scaling.
Mode: mvp
Depends on: Phase 2
Requirements: STORE-03, STORE-04, STORE-05, STORE-06, STORE-08, SEC-04, DOC-03, DOC-04, DOC-05
Success Criteria (what must be TRUE):
- Every document present before migration is accessible after migration with the same metadata and extracted text; a count reconciliation check confirms zero document loss
- Two concurrent uploads that would together exceed a user's 100 MB quota result in exactly one success and one 413 rejection — the quota never goes over limit
- A document delete atomically decrements the user's recorded quota usage; after deletion the quota reflects the freed bytes
- Requesting a document object key or presigned URL for a document owned by a different user returns 403 — no cross-user object access is possible through any request parameter manipulation; all /api/documents/* endpoints enforce get_current_user and return 403 when the requesting user's role is admin (completing SC5 from Phase 2)
- AI classification for each document uses the provider and model assigned to that user by the admin, not any user-supplied or default value
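The quota criteria above depend on a single conditional UPDATE, so that two racing uploads cannot both pass a separate read-then-write check. The sketch below shows the idea using SQLite from the standard library as a stand-in for PostgreSQL; the table layout and the function names `reserve_quota` and `release_quota` are assumptions, not the project's actual schema.

```python
import sqlite3

QUOTA_BYTES = 100 * 1024 * 1024  # the 100 MB per-user quota from the criteria


def make_db() -> sqlite3.Connection:
    """Create a throwaway database with one user (id 1) and an empty quota."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE users (id INTEGER PRIMARY KEY, "
        "used_bytes INTEGER NOT NULL, quota_bytes INTEGER NOT NULL)"
    )
    conn.execute("INSERT INTO users VALUES (1, 0, ?)", (QUOTA_BYTES,))
    conn.commit()
    return conn


def reserve_quota(conn: sqlite3.Connection, user_id: int, size_bytes: int) -> bool:
    """Add size_bytes to the user's usage only if the result fits the quota."""
    cur = conn.execute(
        "UPDATE users SET used_bytes = used_bytes + ? "
        "WHERE id = ? AND used_bytes + ? <= quota_bytes",
        (size_bytes, user_id, size_bytes),
    )
    conn.commit()
    # Zero affected rows means the quota check failed (caller would return 413).
    return cur.rowcount == 1


def release_quota(conn: sqlite3.Connection, user_id: int, size_bytes: int) -> None:
    """Give bytes back on delete, never dropping below zero."""
    conn.execute(
        "UPDATE users SET used_bytes = MAX(used_bytes - ?, 0) WHERE id = ?",
        (size_bytes, user_id),
    )
    conn.commit()
```

The check and the increment happen in one statement, so the database serializes competing writers and exactly one of two 60 MB uploads against a 100 MB quota succeeds.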
Plans: 5 plans
Wave 1 — Migration + test scaffolds
- 03-01-PLAN.md — Wave 0 test scaffolds (auth_user/admin_user/MinIO mock fixtures + 19 xfail stubs) + Alembic migration 0003 (null-user cleanup, NOT NULL constraint, topic cleanup, quota reconciliation, ix_topics_user_id) — Complete 2026-05-23
Wave 2 (blocked on Wave 1)
- 03-02-PLAN.md — Presigned upload backend: StorageBackend ABC + MinIOBackend dual client + generate_presigned_put_url/stat_object + /api/documents/upload-url + /api/documents/{id}/confirm with atomic quota UPDATE + GET /api/auth/me/quota + delete-with-quota + abandoned-upload Celery beat + docker-compose CORS/celery-beat
Wave 3 (blocked on Wave 2)
- 03-03-PLAN.md — Auth guards: get_regular_user dep + ownership assertions on every /api/documents/* handler (404 not 403) + admin 403 + real user_id in object_key + namespace-scoped /api/topics/* + POST /api/admin/topics + classifier topic-namespace plumbing
Wave 4 (blocked on Wave 3)
- 03-04-PLAN.md — Settings retirement + per-user AI: delete /api/settings + remove load_settings/save_settings + classifier accepts ai_provider/ai_model kwargs + Celery task resolves user.ai_provider via DB + frontend SettingsView placeholder + remove settings store/API — Complete 2026-05-23
Wave 5 (blocked on Wave 4)
- 03-05-PLAN.md — Frontend upload flow + quota bar: 3-step upload action with XHR progress + UploadProgress.vue progress bar and quota rejection error block + QuotaBar.vue + AppSidebar embed + quota state in auth store + human checkpoint
Cross-cutting constraints:
- Atomic quota UPDATE pattern only lives in Plan 02; never duplicate (CLAUDE.md)
- Every /api/documents/* handler injects get_regular_user (Plan 03)
- AI provider/model resolved only via Celery task DB lookup (Plan 04)
- Browser XHR PUT to MinIO sends NO Authorization header (Plan 05)
Phase gates (must pass before Phase 3 is complete):
- `pytest -v`: zero failures; presigned URL, quota enforcement, ownership isolation, and admin-403 all covered
- Security agent: path traversal check on object key construction; cross-user IDOR tests; quota race condition test
- Bandit + pip audit + npm audit all clean
UI hint: yes
Phase 4: Folders, Sharing, Quotas & Document UX
Goal: Users have a complete document management experience: organized with folders, shared by handle, warned before they hit quota, able to preview PDFs in-browser, and served by a searchable document list; admins can view the append-only audit log.
Mode: mvp
Depends on: Phase 3
Requirements: FOLD-01, FOLD-02, FOLD-03, FOLD-04, FOLD-05, SHARE-01, SHARE-02, SHARE-03, SHARE-04, SHARE-05, SEC-08, SEC-09, ADMIN-06, DOC-01, DOC-02
Success Criteria (what must be TRUE):
- A user can create, rename, and delete folders; moving a document between folders preserves its metadata and AI classification; deleting a non-empty folder prompts with the content count before proceeding
- A user can share a document with another user by handle; the recipient sees it appear in a "Shared with me" virtual folder with no storage quota charged against them; the owner can revoke access and the shared entry disappears immediately for the recipient
- The sidebar quota bar displays current usage in MB; it turns amber at 80% and red at 95%; an upload that would exceed the limit is rejected with an error showing current usage, the rejected file size, and a link to storage settings
- Any document in the user's library can be previewed in-browser as a PDF; document bytes are proxied through the app and no presigned URLs are exposed to the browser (native browser PDF rendering via Content-Type header)
- An admin can view the audit log filtered by date range, user, and action type; the log contains no document content, filenames, or extracted text; account deletion triggers cleanup of all user files before DB records are removed
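The quota bar thresholds in the criteria above (amber at 80%, red at 95%) reduce to a small pure function. The sketch below is written in Python for consistency with the backend even though the bar itself lives in the Vue frontend. The names `quota_level` and `rejection_message`, the treatment of the thresholds as inclusive, and the message wording are all assumptions; the link to storage settings is left out because its route is not stated in the roadmap.

```python
from typing import Optional

MB = 1024 * 1024


def quota_level(used_bytes: int, limit_bytes: int) -> str:
    """Map usage to a colour band: 'normal', 'amber' (>= 80%), or 'red' (>= 95%)."""
    if limit_bytes <= 0:
        raise ValueError("limit_bytes must be positive")
    # Integer arithmetic avoids float rounding right at the thresholds.
    if used_bytes * 100 >= limit_bytes * 95:
        return "red"
    if used_bytes * 100 >= limit_bytes * 80:
        return "amber"
    return "normal"


def rejection_message(used_bytes: int, limit_bytes: int, file_bytes: int) -> Optional[str]:
    """Return an error string if the upload would exceed the limit, else None."""
    if used_bytes + file_bytes <= limit_bytes:
        return None
    return (
        f"Upload rejected: {file_bytes / MB:.1f} MB file, "
        f"{used_bytes / MB:.1f} of {limit_bytes / MB:.0f} MB already used."
    )
```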
Plans: 9 plans
Wave 1 — Test scaffolds + DB migration (parallel)
- 04-01-PLAN.md — Wave 0 test stubs: test_folders.py + test_shares.py + test_audit.py + proxy stubs in test_documents.py + SEC-08/SEC-09 stubs in test_security.py
- 04-02-PLAN.md — Alembic migration 0004 (users.pdf_open_mode, GIN FTS index, audit-logs bucket) + MinIOBackend.put_object_raw()
Wave 2 (blocked on Wave 1)
- 04-03-PLAN.md — Audit service (write_audit_log) + Folders API (FOLD-01..05): POST/GET/PATCH/DELETE /api/folders + PATCH /api/documents/{id}/folder + document list sort/search/is_shared extension
- 04-04-PLAN.md — Shares API (SHARE-01..05): POST/GET /api/shares + GET /api/shares/received + DELETE /api/shares/{id} with IDOR protection
Wave 3 (blocked on Wave 2)
- 04-05-PLAN.md — PDF streaming proxy GET /api/documents/{id}/content with Range header support + PATCH /api/auth/me/preferences (pdf_open_mode)
- 04-06-PLAN.md — Admin audit log API (GET /api/admin/audit-log, CSV export) + Celery beat daily audit export task + celery_app.py beat schedule
Wave 4 (blocked on Wave 3)
- 04-07-PLAN.md — SEC-08/SEC-09 hardening + audit log backfill into auth.py/admin.py/documents.py + CloudConnectionOut Pydantic model + delete-user file cleanup
Wave 5 (blocked on Wave 4)
- 04-08-PLAN.md — Frontend data layer: API client functions + useFoldersStore + documents store extension + Vue Router routes (/folders/:folderId, /shared)
Wave 6 (blocked on Wave 5)
- 04-09-PLAN.md — Frontend UI: all new components (FolderRow, FolderBreadcrumb, FolderDeleteModal, ShareModal, DocumentPreviewModal, SearchBar, SortControls, AuditLogTab) + view wiring (AppSidebar, DocumentCard, HomeView, FolderView, SharedView, SettingsView, AdminView) + human checkpoint
Phase gates (must pass before Phase 4 is complete):
- `pytest -v`: zero failures; folder ownership, share revocation, quota bar, PDF proxy (no presigned URL exposure) all covered
- Security agent: audit log verified to contain zero document content; sharing IDOR tests; PDF proxy verified to not leak presigned URLs or object keys
- Bandit + pip audit + npm audit all clean
UI hint: yes
Phase 5: Cloud Storage Backends
Goal: Users can connect OneDrive, Google Drive, Nextcloud, or a generic WebDAV server as a personal storage backend; credentials are encrypted with a per-user HKDF-derived key; connection status is visible; local and cloud storage coexist; the StorageBackend ABC makes adding further backends straightforward.
Mode: mvp
Depends on: Phase 4
Requirements: CLOUD-01, CLOUD-02, CLOUD-03, CLOUD-04, CLOUD-05, CLOUD-06, CLOUD-07
Success Criteria (what must be TRUE):
- A user can connect OneDrive, Google Drive, Nextcloud, or a WebDAV endpoint through an OAuth or credential flow; the connection status is displayed as `ACTIVE`, `REQUIRES_REAUTH`, or `ERROR` and never shows raw credentials
- When an OAuth token is revoked externally (simulated `invalid_grant` response), the connection status transitions to `REQUIRES_REAUTH` without a 500 error; the user is shown a re-authentication prompt
- A user can select their connected cloud backend as the default storage destination for new uploads; local MinIO storage remains available as an alternative; existing local documents are unaffected
- A user can disconnect a cloud backend; credentials are permanently deleted from the DB and a subsequent attempt to use that backend returns an appropriate error; no orphaned data remains
- An admin API response for a user's cloud connections returns only `provider, display_name, connected_at, status`; the `credentials_enc` column is never present in any serialized response
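The phase goal mentions a per-user HKDF-derived key for credential encryption. The sketch below implements HKDF with SHA-256 as defined in RFC 5869 using only the standard library. How the project actually chooses the salt and `info` string is not stated in the roadmap, so `derive_user_key` and its `cloud-credentials:` label are illustrative assumptions.

```python
import hashlib
import hmac


def hkdf_sha256(ikm: bytes, salt: bytes, info: bytes, length: int = 32) -> bytes:
    """HKDF (RFC 5869) with SHA-256: extract a PRK, then expand it."""
    hash_len = hashlib.sha256().digest_size
    if length > 255 * hash_len:
        raise ValueError("requested key is too long")
    # Extract: an empty salt defaults to a string of hash_len zero bytes.
    prk = hmac.new(salt or b"\x00" * hash_len, ikm, hashlib.sha256).digest()
    # Expand: T(i) = HMAC(PRK, T(i-1) | info | i), concatenated until long enough.
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def derive_user_key(master_secret: bytes, user_id: str) -> bytes:
    """Derive a 32-byte key bound to one user (hypothetical info label)."""
    return hkdf_sha256(master_secret, salt=b"", info=b"cloud-credentials:" + user_id.encode())
```

Binding the user ID into `info` means a leaked per-user key does not expose other users' credentials, while the server still needs to store only one master secret.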
Plans: 12 plans (8 original + 3 UAT gap closure + 1 gap closure wave)
Wave 1 — Test scaffold + dependencies
- 05-01-PLAN.md — Wave 0 xfail stubs, conftest cloud fixtures, requirements.txt packages, config.py settings
Wave 2 — Shared utilities
- 05-02-PLAN.md — cloud_utils.py (SSRF + HKDF), cloud_cache.py (TTLCache), storage factory extension
Wave 3 — Cloud backends (parallel, both blocked on Wave 2 / Plan 05-02)
- 05-03-PLAN.md — GoogleDriveBackend + OneDriveBackend (all 7 StorageBackend methods)
- 05-04-PLAN.md — NextcloudBackend + WebDAVBackend (all 7 StorageBackend methods)
Wave 4 — Cloud API
- 05-05-PLAN.md — All /api/cloud/* endpoints + /api/users/me/default-storage + main.py router registration
Wave 5 — Document routing + full test suite
- 05-06-PLAN.md — Upload/content proxy cloud routing + all 15 tests promoted to passing
Wave 6 — Frontend settings UI
- 05-07-PLAN.md — cloudConnections store + API client + SettingsView 3-tab + SettingsCloudTab + CloudCredentialModal
Wave 7 — Frontend sidebar (human checkpoint)
- 05-08-PLAN.md — AppSidebar cloud section + CloudProviderTreeItem + CloudFolderTreeItem + human checkpoint
Wave 8 — UAT gap closure (parallel, all independent)
- 05-09-PLAN.md — Cloud document open/re-analyze/edit: authenticated fetch+Blob URL, cloud-aware Celery task, PATCH /api/documents/{id}
- 05-10-PLAN.md — OAuth initiate fix (JSON response), Nextcloud custom endpoint edit round-trip, Edit button on ERROR rows, confirmation text overflow
- 05-11-PLAN.md — Admin hard-delete with password confirmation: UserDeleteConfirm backend model + inline frontend panel
Wave 9 — Post-UAT gap closure
- 05-12-PLAN.md — OAuth 400 preflight (unconfigured creds), 502 cloud fallback, upload hint in CloudStorageView, celery-worker volume mount
Phase gates (must pass before Phase 5 is complete):
- `pytest -v`: zero failures; SSRF prevention on WebDAV/Nextcloud user-supplied URLs; credential encryption/decryption round-trip; admin response never exposes `credentials_enc`; OAuth invalid_grant handling
- Security agent: SSRF allowlist verification; credential key derivation correctness; connection status never leaks raw credential values
- Bandit + pip audit + npm audit all clean
- UAT gaps resolved and re-tested (05-09, 05-10, 05-11, 05-12)
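The SSRF gate for user-supplied WebDAV/Nextcloud URLs can be illustrated with a minimal check. Note that this sketch rejects non-public address classes rather than implementing an allowlist, and it assumes the caller has already resolved the hostname and passes in the resulting IPs. The function names and the https-only rule are assumptions for illustration.

```python
import ipaddress
from urllib.parse import urlsplit


def is_public_ip(address: str) -> bool:
    """Return True only for addresses outside private and special ranges."""
    ip = ipaddress.ip_address(address)
    # Unwrap IPv4-mapped IPv6 (::ffff:a.b.c.d) so the IPv4 rules apply.
    if ip.version == 6 and ip.ipv4_mapped is not None:
        ip = ip.ipv4_mapped
    return not (
        ip.is_private
        or ip.is_loopback
        or ip.is_link_local
        or ip.is_multicast
        or ip.is_reserved
        or ip.is_unspecified
    )


def check_backend_url(url: str, resolved_ips: list) -> None:
    """Raise ValueError if the URL or any resolved address looks unsafe."""
    parts = urlsplit(url)
    if parts.scheme != "https":
        raise ValueError("only https URLs are accepted")
    if not parts.hostname:
        raise ValueError("URL has no hostname")
    if not resolved_ips:
        raise ValueError("hostname did not resolve")
    for address in resolved_ips:
        if not is_public_ip(address):
            raise ValueError(f"address {address} is not publicly routable")
```

To stay safe against DNS rebinding, the outbound request would also need to connect to one of the validated addresses rather than resolving the hostname a second time.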
UI hint: yes
Phase 6: Performance & Production Hardening
Goal: The application is ready for production deployment: observable, load-tested, and hardened; response times meet SLA targets under concurrent load; all auth and document endpoints are rate-limited; structured logging and distributed tracing are in place; the Docker image runs as a non-root user with a read-only filesystem.
Mode: mvp
Depends on: Phase 5
Requirements: TBD
Success Criteria (what must be TRUE):
- All API endpoints respond within defined latency targets (p50/p95/p99) under a realistic load test (e.g., 50 concurrent users, 5-minute soak)
- Structured JSON logging (correlation IDs, user ID, request latency) is emitted to stdout; a local log aggregation stack (Loki or similar) captures and queries them
- All auth endpoints (login, register, password reset, TOTP) enforce per-IP and per-account rate limits that cannot be bypassed by header manipulation
- Container hardening is complete: non-root user, read-only root filesystem, dropped Linux capabilities; `docker scout` or equivalent reports zero critical CVEs
- A runbook documents all environment variables, startup/shutdown procedures, backup strategy, and on-call escalation path; the app can be stood up from scratch using only the runbook
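The structured-logging criterion above (JSON to stdout with correlation ID, user ID, and request latency) can be sketched with the standard `logging` module. Phase 6 is not yet planned, so this is only one possible shape: the field names and the `JsonFormatter` and `make_logger` helpers are assumptions.

```python
import json
import logging
import sys
import time


class JsonFormatter(logging.Formatter):
    """Render each record as a single JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "ts": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime(record.created)),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
            # These fields arrive via logger.info(..., extra={...}) when present.
            "correlation_id": getattr(record, "correlation_id", None),
            "user_id": getattr(record, "user_id", None),
            "latency_ms": getattr(record, "latency_ms", None),
        }
        return json.dumps(payload)


def make_logger(name: str = "docuvault", stream=None) -> logging.Logger:
    """Build a logger that writes JSON lines to stdout (or a given stream)."""
    handler = logging.StreamHandler(stream or sys.stdout)
    handler.setFormatter(JsonFormatter())
    logger = logging.getLogger(name)
    logger.handlers = [handler]
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger
```

A request middleware could then call `logger.info("request finished", extra={"correlation_id": ..., "user_id": ..., "latency_ms": ...})` once per request.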
Plans: TBD
Phase 6.1: Close v1.0 audit gaps: SHARE-02/STORE-06/ADMIN-06
Goal: Close three v1.0 requirements that remain unimplemented: atomic quota decrement on document delete (STORE-06), "Shared with me" virtual folder without recipient quota charge (SHARE-02), and admin audit log viewer with date/user/action type filters (ADMIN-06).
Mode: mvp
Depends on: Phase 6
Requirements: STORE-06, SHARE-02, ADMIN-06
Success Criteria (what must be TRUE):
- Deleting a document atomically decrements the owning user's quota; after deletion the quota reflects the freed bytes with no race condition under concurrent deletes
- A user who receives a shared document sees it appear in a "Shared with me" virtual folder; the recipient's quota usage is not charged for the shared document's storage
- An admin can view the audit log filtered independently by date range, user, and action type; filtered results contain no document content, filenames, or extracted text
Plans: 2 plans
Wave 1 — Test promotion (parallel)
- 06.1-01-PLAN.md — Promote test_shares.py stubs to real tests + second_auth_user fixture (SHARE-01..05)
- 06.1-02-PLAN.md — Promote test_audit.py stubs to real tests (ADMIN-06)
Phase gates (must pass before Phase 6.1 is complete):
- `pytest -v`: zero failures; all 7 share tests + 4 audit log tests passing
- Security agent: bandit + pip audit + npm audit all clean
- STORE-06 confirmed: `test_delete_decrements_quota` passes under `INTEGRATION=1`
Phase 6.2: Close v1 sharing + cloud-delete + CSV export gaps
Goal: Close remaining v1 gaps: sharing edge cases (SHARE-03/SHARE-05), cloud document deletion propagation to the remote backend, and CSV export + daily export UI for the admin audit log (ADMIN-06).
Mode: mvp
Depends on: Phase 6.1
Requirements: SHARE-03, SHARE-05, ADMIN-06
Success Criteria (what must be TRUE):
- Documents shared with others display a "Shared" badge in the owner's list view (reads doc.is_shared, not doc.share_count)
- Owner can set permission to "view" or "edit" when creating a share and toggle it per-recipient afterward; PATCH /api/shares/{id} enforces IDOR protection (404 on wrong owner)
- Deleting a cloud document propagates the delete to the cloud provider; failure shows a warning modal with "Remove from app" fallback; ?remove_only=true removes only the DB record; cloud docs never affect quota on delete
- Admin can download filtered audit log CSV via fetch+Blob (not window.location.href); audit log entries show user handles instead of raw UUIDs; user filter accepts handles (not UUIDs)
- Admin can list and download Celery-generated daily audit export files from a new section in the Audit Log tab
Plans: 4 plans
Wave 0 — Test stubs
- 06.2-01-PLAN.md — 11 xfail stubs across test_shares.py, test_documents.py, test_audit.py
Wave 1 — Feature slices (parallel)
- 06.2-02-PLAN.md — SHARE-05 badge fix + SHARE-03 permission control (backend PATCH + frontend dropdown + toggle)
- 06.2-03-PLAN.md — Cloud-delete propagation + structured error response + remove_only path + DocumentView warning modal
Wave 2 — Audit log enrichment
- 06.2-04-PLAN.md — Audit handle JOIN + user_handle filter + CSV fetch+Blob fix + daily-export list + download endpoints + AuditLogTab UI
Phase gates (must pass before Phase 6.2 is complete):
- `pytest -v`: 344 passed, 1 pre-existing unrelated failure (test_extract_docx missing module)
- Security agent: bandit + pip audit + npm audit all clean (SECURITY.md threats_open: 0)
- IDOR on PATCH /api/shares/{id}: test_share_patch_idor passes
- Date regex validation confirmed: GET /api/admin/audit-log/daily-exports/invalid-date returns 404
- window.location.href removed from AuditLogTab.vue confirmed by grep
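The date-validation gate above (an invalid date segment returns 404) can be sketched as a strict parser that the endpoint would consult before touching storage. The `YYYY-MM-DD` format, the name `parse_export_date`, and the mapping of `None` to a 404 response are assumptions; the roadmap only states that a date regex is applied.

```python
import re
from datetime import date
from typing import Optional

# ASCII digits only, so look-alike Unicode digits cannot slip through.
EXPORT_DATE_RE = re.compile(r"[0-9]{4}-[0-9]{2}-[0-9]{2}")


def parse_export_date(raw: str) -> Optional[date]:
    """Return a date for a well-formed YYYY-MM-DD string, else None."""
    if not EXPORT_DATE_RE.fullmatch(raw):
        return None
    try:
        # Rejects shape-valid but impossible dates such as 2026-02-30.
        return date.fromisoformat(raw)
    except ValueError:
        return None
```

Validating the path segment before building any file or object name keeps traversal strings out of the export lookup entirely.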
Status: ✓ Complete (2026-06-01)
Progress Table
| Phase | Plans Complete | Status | Completed |
|---|---|---|---|
| 1. Infrastructure Foundation | 5/5 | Complete | 2026-05-22 |
| 2. Users & Authentication | 6/6 | Complete | 2026-06-01 |
| 3. Document Migration & Multi-User Isolation | 5/5 | Complete | 2026-05-25 |
| 4. Folders, Sharing, Quotas & Document UX | 9/9 | Complete | 2026-05-28 |
| 5. Cloud Storage Backends | 12/12 | Complete | 2026-05-30 |
| 6. Performance & Production Hardening | 0/TBD | Not started | — |
| 6.1. Close v1.0 audit gaps | 2/2 | Complete | 2026-05-30 |
| 6.2. Close v1 sharing + cloud-delete + CSV export gaps | 5/5 | Complete | 2026-05-31 |