docs: create roadmap (5 phases)

curo1305
2026-05-21 20:53:28 +02:00
parent 7fac328a54
commit 3353387312
4 changed files with 313 additions and 2 deletions
+55 -2
@@ -113,8 +113,61 @@ _Last updated: 2026-05-21_
## Traceability
_Filled by roadmapper._
_Filled by roadmapper — 2026-05-21._
| REQ-ID | Phase | Notes |
|---|---|---|
| (pending) | | |
| STORE-01 | 1 | Dual-write migration script; schema and Alembic wiring |
| STORE-02 | 1 | Object key schema enforced in model layer |
| STORE-07 | 1 | Stateless backend; no per-instance file locks |
| AUTH-01 | 2 | Registration with Argon2 + HaveIBeenPwned check |
| AUTH-02 | 2 | JWT session; httpOnly refresh cookie; Pinia memory access token |
| AUTH-03 | 2 | TOTP enrollment with backup code acknowledgement flow |
| AUTH-04 | 2 | Login via TOTP code or single-use backup code |
| AUTH-05 | 2 | Password reset email; routes back to TOTP gate |
| AUTH-06 | 2 | Sign out all devices; revokes all refresh tokens |
| AUTH-07 | 2 | Refresh token family revocation on reuse; security alert |
| AUTH-08 | 2 | TOTP single-use enforcement within validity window |
| SEC-01 | 2 | CSRF protection on all state-changing endpoints |
| SEC-02 | 2 | Rate limiting on auth endpoints (per-IP and per-account) |
| SEC-03 | 2 | Parameterized queries / ORM enforced from first migration |
| SEC-05 | 2 | Security response headers on all responses |
| SEC-06 | 2 | Constant-time comparison for token/code verification |
| SEC-07 | 2 | Admin role dependency; admin blocked from document content |
| ADMIN-01 | 2 | Admin creates user with temporary password |
| ADMIN-02 | 2 | Admin deactivates user account |
| ADMIN-03 | 2 | Admin initiates password reset for user |
| ADMIN-04 | 2 | Admin views and adjusts user storage quotas |
| ADMIN-05 | 2 | Admin assigns AI provider and model per user |
| ADMIN-07 | 2 | Explicit architectural exclusion of admin impersonation |
| STORE-03 | 3 | Atomic quota enforcement at upload |
| STORE-04 | 3 | Quota usage bar with 80%/95% warnings |
| STORE-05 | 3 | Upload rejection at quota limit with detailed error |
| STORE-06 | 3 | Atomic quota decrement on document delete |
| STORE-08 | 3 | BackgroundTasks replaced with Celery+Redis or pgqueuer |
| SEC-04 | 3 | DB-lookup-only file access; no key reconstruction from params |
| DOC-03 | 3 | AI provider/model from DB per user; not user-supplied |
| DOC-04 | 3 | System default topics + per-user topic overrides preserved |
| DOC-05 | 3 | Classification uses user's assigned provider and model |
| FOLD-01 | 4 | Folder CRUD with content-count confirmation on delete |
| FOLD-02 | 4 | Document move between folders |
| FOLD-03 | 4 | Breadcrumb navigation with clickable path segments |
| FOLD-04 | 4 | Document list sort by name, date, and file size |
| FOLD-05 | 4 | Full-text search via PostgreSQL tsvector index |
| SHARE-01 | 4 | Share document by user handle |
| SHARE-02 | 4 | "Shared with me" virtual folder; no quota charged to recipient |
| SHARE-03 | 4 | View-only default sharing; owner controls permission level |
| SHARE-04 | 4 | Immediate share revocation |
| SHARE-05 | 4 | Shared indicator on documents in owner's list view |
| SEC-08 | 4 | credentials_enc excluded from all serializers |
| SEC-09 | 4 | Account deletion triggers delete_user_files() per cloud connection |
| ADMIN-06 | 4 | Admin audit log viewer filtered by date, user, action |
| DOC-01 | 4 | View document metadata and extracted text |
| DOC-02 | 4 | In-browser PDF preview via PDF.js; bytes proxied through app |
| CLOUD-01 | 5 | Connect OneDrive, Google Drive, Nextcloud, WebDAV |
| CLOUD-02 | 5 | HKDF per-user key derivation for credential encryption |
| CLOUD-03 | 5 | Local and cloud storage coexist; user selects default |
| CLOUD-04 | 5 | Connection status display: ACTIVE / REQUIRES_REAUTH / ERROR |
| CLOUD-05 | 5 | invalid_grant transitions to REQUIRES_REAUTH; surfaced to user |
| CLOUD-06 | 5 | Disconnect cloud backend; credentials permanently deleted |
| CLOUD-07 | 5 | StorageBackend ABC + factory in storage/ module |
+112
@@ -0,0 +1,112 @@
# DocuVault — v1 Roadmap
_Last updated: 2026-05-21_
## Phases
- [ ] **Phase 1: Infrastructure Foundation** — PostgreSQL + MinIO wired into Docker Compose; Alembic migrations running; existing app still works
- [ ] **Phase 2: Users & Authentication** — Full auth flow end-to-end (register, login, TOTP, backup codes, password reset, sign-out-all) with admin panel for user management
- [ ] **Phase 3: Document Migration & Multi-User Isolation** — All documents in PostgreSQL + MinIO; per-user isolation enforced; existing UI still works
- [ ] **Phase 4: Folders, Sharing, Quotas & Document UX** — Full document management UX (folders, sharing, quota bar, PDF preview, search, audit log)
- [ ] **Phase 5: Cloud Storage Backends** — Users can connect OneDrive, Google Drive, Nextcloud, or WebDAV as a personal storage backend
---
## Phase Details
### Phase 1: Infrastructure Foundation
**Goal**: PostgreSQL + MinIO are wired into Docker Compose with a complete Alembic-managed schema; all services boot cleanly and the existing single-user document scanner continues to work exactly as before — no user-facing behavior change.
**Mode:** mvp
**Depends on**: Nothing (first phase)
**Requirements**: STORE-01, STORE-02, STORE-07
**Success Criteria** (what must be TRUE):
1. `docker compose up` starts PostgreSQL, MinIO, and the FastAPI backend with no errors; health checks pass for all three services
2. Running `alembic upgrade head` applies the initial migration cleanly against the fresh PostgreSQL instance with no errors
3. The full existing document upload, text extraction, and AI classification workflow completes successfully — no regression in single-user behavior
4. MinIO object key schema `{user_id}/{document_id}/{uuid4()}{ext}` is enforced in the model layer; human-readable filenames are stored in the DB column, not in the MinIO key
**Plans**: TBD
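
A minimal sketch of how criterion 4 could be enforced in one place, assuming a hypothetical helper named `build_object_key` (not part of the existing codebase). It keeps only the file extension from the uploaded name, so the human-readable filename never reaches the MinIO key:

```python
import uuid
from pathlib import PurePosixPath


def build_object_key(user_id: str, document_id: str, original_filename: str) -> str:
    """Return a key of the form {user_id}/{document_id}/{uuid4()}{ext}.

    Hypothetical helper: only the extension of the uploaded filename is kept.
    The human-readable name is expected to be stored in a DB column instead.
    """
    ext = PurePosixPath(original_filename).suffix.lower()
    return f"{user_id}/{document_id}/{uuid.uuid4()}{ext}"


if __name__ == "__main__":
    print(build_object_key("user-1", "doc-1", "Tax Return 2025.PDF"))
```

Routing every write through a single builder like this would make it harder for a call site to construct keys from request parameters.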
---
### Phase 2: Users & Authentication
**Goal**: Users can register, log in (with optional TOTP 2FA), reset their password, and sign out all active sessions; admins can manage user accounts and assign AI providers — all enforced by a complete FastAPI dependency chain.
**Mode:** mvp
**Depends on**: Phase 1
**Requirements**: AUTH-01, AUTH-02, AUTH-03, AUTH-04, AUTH-05, AUTH-06, AUTH-07, AUTH-08, SEC-01, SEC-02, SEC-03, SEC-05, SEC-06, SEC-07, ADMIN-01, ADMIN-02, ADMIN-03, ADMIN-04, ADMIN-05, ADMIN-07
**Success Criteria** (what must be TRUE):
1. A new user can register with an email and password that passes strength validation; a password from the HaveIBeenPwned list is rejected with a clear error
2. A logged-in user can enroll a TOTP authenticator app, receive 8-10 backup codes, explicitly acknowledge them, and thereafter be required to supply a TOTP code (or backup code) on every login; a backup code is invalidated on first use
3. A user who forgets their password can receive a reset email, follow the link within 1 hour, and set a new password; they are then returned to the TOTP login gate (not auto-logged in)
4. A user can trigger "sign out all devices" from account settings; all other active sessions are immediately invalidated and any reuse of a rotated refresh token revokes the entire token family
5. An admin user can create, deactivate, and reset a user account, and assign an AI provider and model to that user; attempting to access document content via an admin JWT returns 403
**Plans**: TBD
**UI hint**: yes
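
The single-use TOTP rule (AUTH-08) and constant-time comparison (SEC-06) can be illustrated with the standard library alone. This is a sketch under assumptions, not the planned implementation: the roadmap names PyOTP, while the code below re-implements the RFC 6238 computation directly so the replay check is visible. The names `verify_totp` and `last_used_counter` are hypothetical.

```python
import hashlib
import hmac
import struct
from typing import Optional


def totp_code(secret: bytes, counter: int, digits: int = 6) -> str:
    """HMAC-SHA1 one-time code for a given time-step counter (RFC 4226/6238)."""
    digest = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = digest[-1] & 0x0F
    value = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(value % 10 ** digits).zfill(digits)


def verify_totp(secret: bytes, submitted: str, now: float,
                last_used_counter: int, window: int = 1,
                step: int = 30) -> Optional[int]:
    """Return the matched counter, or None if the code is wrong or replayed.

    The caller is assumed to persist the returned counter per user and pass
    it back as last_used_counter on the next attempt.
    """
    current = int(now) // step
    matched = None
    for counter in range(current - window, current + window + 1):
        expected = totp_code(secret, counter).encode()
        # Constant-time comparison; every candidate is checked, no early exit.
        if hmac.compare_digest(expected, submitted.encode()):
            matched = counter
    if matched is None or matched <= last_used_counter:
        return None  # wrong code, or a code that was already accepted
    return matched
```

With `window=1` the check accepts the previous, current, and next 30-second step, which corresponds to the ±30s drift tolerance raised under Open Questions.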
---
### Phase 3: Document Migration & Multi-User Isolation
**Goal**: All existing documents have been migrated from flat-file JSON + filesystem into PostgreSQL + MinIO; all new uploads use the presigned URL flow; per-user isolation is enforced at the DB level; the existing document UI works without regression; the backend is stateless and ready for horizontal scaling.
**Mode:** mvp
**Depends on**: Phase 2
**Requirements**: STORE-03, STORE-04, STORE-05, STORE-06, STORE-08, SEC-04, DOC-03, DOC-04, DOC-05
**Success Criteria** (what must be TRUE):
1. Every document present before migration is accessible after migration with the same metadata and extracted text; a count reconciliation check confirms zero document loss
2. Two concurrent uploads that would together exceed a user's 100 MB quota result in exactly one success and one 413 rejection — the quota never goes over limit
3. A document delete atomically decrements the user's recorded quota usage; after deletion the quota reflects the freed bytes
4. Requesting a document object key or presigned URL for a document owned by a different user returns 403 — no cross-user object access is possible through any request parameter manipulation
5. AI classification for each document uses the provider and model assigned to that user by the admin, not any user-supplied or default value
**Plans**: TBD
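
Criteria 2 and 3 rely on a single conditional `UPDATE` rather than a read followed by a write. The sketch below shows the shape of that statement using `sqlite3` as a stand-in for PostgreSQL so it runs anywhere; the table and column names (`users`, `used_bytes`, `quota_bytes`) are assumptions, and on PostgreSQL the scalar `MAX(...)` would be `GREATEST(...)`.

```python
import sqlite3

MB = 1024 * 1024


def make_demo_db(quota_bytes: int = 100 * MB) -> sqlite3.Connection:
    """Create an in-memory table with one user (id=1) for the demo."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE users (id INTEGER PRIMARY KEY, "
        "used_bytes INTEGER NOT NULL, quota_bytes INTEGER NOT NULL)"
    )
    conn.execute("INSERT INTO users VALUES (1, 0, ?)", (quota_bytes,))
    conn.commit()
    return conn


def reserve_quota(conn: sqlite3.Connection, user_id: int, size: int) -> bool:
    """Add `size` to the user's usage only if the result stays within quota.

    The check and the increment are one statement, so no arithmetic happens
    in Python between two DB round trips.
    """
    cur = conn.execute(
        "UPDATE users SET used_bytes = used_bytes + ? "
        "WHERE id = ? AND used_bytes + ? <= quota_bytes",
        (size, user_id, size),
    )
    conn.commit()
    return cur.rowcount == 1  # False would map to a 413 response


def release_quota(conn: sqlite3.Connection, user_id: int, size: int) -> None:
    """Give bytes back on delete, never dropping below zero."""
    conn.execute(
        "UPDATE users SET used_bytes = MAX(used_bytes - ?, 0) WHERE id = ?",
        (size, user_id),
    )
    conn.commit()
```

Two 60 MB reservations against a 100 MB quota yield one `True` and one `False`, matching the "exactly one success and one 413" criterion.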
---
### Phase 4: Folders, Sharing, Quotas & Document UX
**Goal**: Users have a complete document management experience — organized with folders, shared by handle, warned before they hit quota, able to preview PDFs in-browser, and served by a searchable document list; admins can view the append-only audit log.
**Mode:** mvp
**Depends on**: Phase 3
**Requirements**: FOLD-01, FOLD-02, FOLD-03, FOLD-04, FOLD-05, SHARE-01, SHARE-02, SHARE-03, SHARE-04, SHARE-05, SEC-08, SEC-09, ADMIN-06, DOC-01, DOC-02
**Success Criteria** (what must be TRUE):
1. A user can create, rename, and delete folders; moving a document between folders preserves its metadata and AI classification; deleting a non-empty folder prompts with the content count before proceeding
2. A user can share a document with another user by handle; the recipient sees it appear in a "Shared with me" virtual folder with no storage quota charged against them; the owner can revoke access and the shared entry disappears immediately for the recipient
3. The sidebar quota bar displays current usage in MB; it turns amber at 80% and red at 95%; an upload that would exceed the limit is rejected with an error showing current usage, the rejected file size, and a link to storage settings
4. Any document in the user's library can be previewed in-browser as a PDF via PDF.js; document bytes are proxied through the app and no presigned URLs are exposed to the browser
5. An admin can view the audit log filtered by date range, user, and action type; the log contains no document content, filenames, or extracted text; account deletion triggers cleanup of all user files before DB records are removed
**Plans**: TBD
**UI hint**: yes
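
The quota bar thresholds in criterion 3 reduce to a small pure function. The sketch below is illustrative only: the function names and the level labels are hypothetical, 1 MB is assumed to mean 1024 × 1024 bytes, and in the real app this logic might live in the frontend rather than in Python.

```python
MB = 1024 * 1024


def quota_level(used_bytes: int, quota_bytes: int) -> str:
    """Map usage to a bar color: amber at 80%, red at 95%."""
    if quota_bytes <= 0:
        raise ValueError("quota_bytes must be positive")
    # Integer arithmetic avoids float rounding exactly at the thresholds.
    if used_bytes * 100 >= quota_bytes * 95:
        return "red"
    if used_bytes * 100 >= quota_bytes * 80:
        return "amber"
    return "normal"


def usage_label(used_bytes: int, quota_bytes: int) -> str:
    """Human-readable usage in MB for the sidebar."""
    return f"{used_bytes / MB:.1f} MB of {quota_bytes / MB:.0f} MB"
```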
---
### Phase 5: Cloud Storage Backends
**Goal**: Users can connect OneDrive, Google Drive, Nextcloud, or a generic WebDAV server as a personal storage backend; credentials are encrypted with a per-user HKDF-derived key; connection status is visible; local and cloud storage coexist; the `StorageBackend` ABC makes adding further backends straightforward.
**Mode:** mvp
**Depends on**: Phase 4
**Requirements**: CLOUD-01, CLOUD-02, CLOUD-03, CLOUD-04, CLOUD-05, CLOUD-06, CLOUD-07
**Success Criteria** (what must be TRUE):
1. A user can connect OneDrive, Google Drive, Nextcloud, or a WebDAV endpoint through an OAuth or credential flow; the connection status is displayed as `ACTIVE`, `REQUIRES_REAUTH`, or `ERROR`, and raw credentials are never shown
2. When an OAuth token is revoked externally (simulated `invalid_grant` response), the connection status transitions to `REQUIRES_REAUTH` without a 500 error; the user is shown a re-authentication prompt
3. A user can select their connected cloud backend as the default storage destination for new uploads; local MinIO storage remains available as an alternative; existing local documents are unaffected
4. A user can disconnect a cloud backend; credentials are permanently deleted from the DB and a subsequent attempt to use that backend returns an appropriate error — no orphaned data remains
5. An admin API response for a user's cloud connections returns only `provider`, `display_name`, `connected_at`, and `status`; the `credentials_enc` column is never present in any serialized response
**Plans**: TBD
**UI hint**: yes
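
Per-user key derivation (CLOUD-02) could look roughly like the following. This is a standard-library sketch of HKDF-SHA256 as defined in RFC 5869, not the project's chosen implementation; a maintained library would normally be preferred. The `info` prefix `docuvault-credentials:` and the function `derive_user_key` are assumptions. The output is base64-encoded because a Fernet key is 32 url-safe base64-encoded bytes.

```python
import base64
import hashlib
import hmac

HASH_LEN = hashlib.sha256().digest_size  # 32 bytes


def hkdf_sha256(ikm: bytes, salt: bytes, info: bytes, length: int = 32) -> bytes:
    """HKDF extract-then-expand (RFC 5869) with SHA-256."""
    if length > 255 * HASH_LEN:
        raise ValueError("requested length too large for HKDF-SHA256")
    # Extract: an empty salt is treated as HASH_LEN zero bytes, per the RFC.
    prk = hmac.new(salt or b"\x00" * HASH_LEN, ikm, hashlib.sha256).digest()
    # Expand: chain HMAC blocks until enough output is produced.
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def derive_user_key(master_key: bytes, user_id: str) -> bytes:
    """Derive a distinct Fernet-format key for one user (hypothetical helper)."""
    info = b"docuvault-credentials:" + user_id.encode()
    raw = hkdf_sha256(master_key, salt=b"", info=info, length=32)
    return base64.urlsafe_b64encode(raw)
```

Because the user identifier is bound into `info`, one user's derived key does not decrypt another user's stored credentials. The master key itself still needs protection, since all per-user keys are derived from it.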
---
## Progress Table
| Phase | Plans Complete | Status | Completed |
|-------|----------------|--------|-----------|
| 1. Infrastructure Foundation | 0/? | Not started | - |
| 2. Users & Authentication | 0/? | Not started | - |
| 3. Document Migration & Multi-User Isolation | 0/? | Not started | - |
| 4. Folders, Sharing, Quotas & Document UX | 0/? | Not started | - |
| 5. Cloud Storage Backends | 0/? | Not started | - |
+67
@@ -0,0 +1,67 @@
# Project State
**Project:** DocuVault
**Status:** Planning
**Current Phase:**
**Last Updated:** 2026-05-21
## Phase Status
| Phase | Name | Status |
|---|---|---|
| 1 | Infrastructure Foundation | Not Started |
| 2 | Users & Authentication | Not Started |
| 3 | Document Migration & Multi-User Isolation | Not Started |
| 4 | Folders, Sharing, Quotas & Document UX | Not Started |
| 5 | Cloud Storage Backends | Not Started |
## Current Position
**Phase:**
**Plan:**
**Progress:** ░░░░░░░░░░ 0%
## Performance Metrics
| Metric | Value |
|---|---|
| Phases complete | 0 / 5 |
| Requirements mapped | 54 / 54 |
| Plans written | 0 |
| Plans complete | 0 |
## Accumulated Context
### Key Decisions
| Decision | Rationale |
|---|---|
| PostgreSQL + MinIO | Multi-user quotas and horizontal scaling require shared, consistent state |
| HKDF per-user key derivation | Single Fernet key would be catastrophic on leak — must be derived before first credential is stored |
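| Presigned MinIO URL flow | For uploads, FastAPI handles metadata only and file bytes go directly to MinIO, never through the API layer; PDF preview (Phase 4) is the exception and is proxied through the app |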
| Atomic PostgreSQL quota UPDATE | Never perform quota arithmetic in Python between two DB statements |
| JWT in httpOnly cookie | Refresh token in httpOnly cookie; access token in Pinia memory only — never localStorage |
| Refresh token family revocation | RFC 9700 — reuse of a rotated token revokes entire family and alerts user |
| BackgroundTasks replacement | FastAPI BackgroundTasks is per-instance; replace with Celery+Redis or pgqueuer before horizontal scale |
| Admin impersonation excluded | Explicit architectural exclusion — no endpoint or UI pathway; violates privacy-first core value |
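
The refresh token family decision can be sketched with an in-memory store. Everything here is illustrative: `TokenStore` and its methods are hypothetical names, a real store would live in PostgreSQL and keep token hashes rather than raw tokens, and the user alert is left out.

```python
import secrets
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class TokenStore:
    # token -> [family_id, is_current]
    tokens: Dict[str, List] = field(default_factory=dict)
    revoked_families: Set[str] = field(default_factory=set)

    def issue(self, family_id: Optional[str] = None) -> str:
        """Issue a refresh token, starting a new family if none is given."""
        family_id = family_id or secrets.token_hex(8)
        token = secrets.token_urlsafe(32)
        self.tokens[token] = [family_id, True]
        return token

    def rotate(self, token: str) -> Optional[str]:
        """Exchange a refresh token for a new one; None means rejected."""
        entry = self.tokens.get(token)
        if entry is None:
            return None
        family_id, is_current = entry
        if family_id in self.revoked_families:
            return None
        if not is_current:
            # A rotated token was presented again: revoke the whole family.
            self.revoked_families.add(family_id)
            return None
        entry[1] = False  # mark the old token as rotated
        return self.issue(family_id)
```

After one legitimate rotation, replaying the old token revokes the family, so the newer token stops working as well and the session has to be re-established through login.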
### Open Questions
- Celery + Redis vs pgqueuer for Phase 3 (depends on Redis availability in deployment target)
- Verify cloud SDK minor versions on PyPI before Phase 5 pinning
- Confirm PyOTP `valid_window` default in current docs (recommend `valid_window=1` for ±30s clock drift)
- Audit existing codebase for any bcrypt hashes before removing passlib in Phase 2
### Blockers
None.
## Session Continuity
_Updated at each phase transition._
| Field | Value |
|---|---|
| Last session | 2026-05-21 — Roadmap created |
| Next action | Run `/gsd:plan-phase 1` to begin Phase 1 planning |
| Pending decisions | See Open Questions above |