# DocuVault — v1 Roadmap
_Last updated: 2026-05-21_
## Phases
- [ ] **Phase 1: Infrastructure Foundation** — PostgreSQL + MinIO wired into Docker Compose; Alembic migrations running; existing app still works
- [ ] **Phase 2: Users & Authentication** — Full auth flow end-to-end (register, login, TOTP, backup codes, password reset, sign-out-all) with admin panel for user management
- [ ] **Phase 3: Document Migration & Multi-User Isolation** — All documents in PostgreSQL + MinIO; per-user isolation enforced; existing UI still works
- [ ] **Phase 4: Folders, Sharing, Quotas & Document UX** — Full document management UX (folders, sharing, quota bar, PDF preview, search, audit log)
- [ ] **Phase 5: Cloud Storage Backends** — Users can connect OneDrive, Google Drive, Nextcloud, or WebDAV as a personal storage backend
---
## Phase Details
### Phase 1: Infrastructure Foundation
**Goal**: PostgreSQL + MinIO are wired into Docker Compose with a complete Alembic-managed schema; all services boot cleanly and the existing single-user document scanner continues to work exactly as before — no user-facing behavior change.
**Mode:** mvp
**Depends on**: Nothing (first phase)
**Requirements**: STORE-01, STORE-02, STORE-07
**Success Criteria** (what must be TRUE):
1. `docker compose up` starts PostgreSQL, MinIO, and the FastAPI backend with no errors; health checks pass for all three services
2. Running `alembic upgrade head` applies the initial migration to a fresh PostgreSQL instance without errors
3. The full existing document upload, text extraction, and AI classification workflow completes successfully — no regression in single-user behavior
4. MinIO object key schema `{user_id}/{document_id}/{uuid4()}{ext}` is enforced in the model layer; human-readable filenames are stored in the DB column, not in the MinIO key
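
The key schema in criterion 4 can be sketched as below. This is a minimal illustration, not the project's model code: `build_object_key` is a hypothetical helper, and lowercasing the extension is an added assumption that the roadmap does not state.

```python
import uuid
from pathlib import PurePosixPath


def build_object_key(user_id: str, document_id: str, original_filename: str) -> str:
    """Return a key shaped like {user_id}/{document_id}/{uuid4()}{ext}."""
    # Only the extension of the uploaded name is reused; the human-readable
    # filename is expected to live in a DB column, never in the object key.
    ext = PurePosixPath(original_filename).suffix.lower()  # lowercasing is an assumption
    return f"{user_id}/{document_id}/{uuid.uuid4()}{ext}"


# Hypothetical IDs, for illustration only.
key = build_object_key("u-1", "d-42", "Tax Return 2025.PDF")
print(key)
```
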
**Plans**: 5 plans
- [x] 01-01-PLAN.md — Docker Compose service topology + Postgres init + Pydantic Settings + requirements
- [x] 01-02-PLAN.md — Wave 0 test scaffolds (xfail/skip stubs) + async pytest fixtures
- [x] 01-03-PLAN.md — SQLAlchemy ORM models + async engine + Alembic async migration (incl. alembic upgrade head)
- [x] 01-04-PLAN.md — StorageBackend ABC + MinIO backend + rewritten async services/storage.py
- [ ] 01-05-PLAN.md — Lifespan + /health + API cutover + Celery worker + walking-skeleton e2e verify
---
### Phase 2: Users & Authentication
**Goal**: Users can register, log in (with optional TOTP 2FA), reset their password, and sign out all active sessions; admins can manage user accounts and assign AI providers — all enforced by a complete FastAPI dependency chain.
**Mode:** mvp
**Depends on**: Phase 1
**Requirements**: AUTH-01, AUTH-02, AUTH-03, AUTH-04, AUTH-05, AUTH-06, AUTH-07, AUTH-08, SEC-01, SEC-02, SEC-03, SEC-05, SEC-06, SEC-07, ADMIN-01, ADMIN-02, ADMIN-03, ADMIN-04, ADMIN-05, ADMIN-07
**Success Criteria** (what must be TRUE):
1. A new user can register with an email and password that passes strength validation; a password from the HaveIBeenPwned list is rejected with a clear error
2. A logged-in user can enroll a TOTP authenticator app, receive 8–10 backup codes, explicitly acknowledge them, and thereafter be required to supply a TOTP code (or backup code) on every login; each backup code is invalidated on first use
3. A user who forgets their password can receive a reset email, follow the link within 1 hour, and set a new password; they are then returned to the TOTP login gate (not auto-logged in)
4. A user can trigger "sign out all devices" from account settings; all other active sessions are immediately invalidated and any reuse of a rotated refresh token revokes the entire token family
5. An admin user can create, deactivate, and reset a user account, and assign an AI provider and model to that user; attempting to access document content via an admin JWT returns 403
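
The reuse rule in criterion 4 can be sketched as follows. This is an in-memory illustration under stated assumptions, not the project's implementation: `TokenStore`, `RefreshToken`, and the `family_id` field are hypothetical names, and a real version would presumably persist hashed tokens in PostgreSQL.

```python
import secrets
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class RefreshToken:
    family_id: str
    used: bool = False     # set once the token has been rotated
    revoked: bool = False  # set when the whole family is revoked


class TokenStore:
    """In-memory stand-in for a refresh-token table (hypothetical)."""

    def __init__(self) -> None:
        self._tokens: Dict[str, RefreshToken] = {}

    def issue(self, family_id: Optional[str] = None) -> str:
        # A login starts a new family; a rotation stays in the same family.
        token = secrets.token_urlsafe(32)
        self._tokens[token] = RefreshToken(family_id or secrets.token_hex(8))
        return token

    def rotate(self, token: str) -> Optional[str]:
        record = self._tokens.get(token)
        if record is None or record.revoked:
            return None
        if record.used:
            # Reuse of an already-rotated token: revoke the entire family.
            self.revoke_family(record.family_id)
            return None
        record.used = True
        return self.issue(record.family_id)

    def revoke_family(self, family_id: str) -> None:
        for record in self._tokens.values():
            if record.family_id == family_id:
                record.revoked = True


store = TokenStore()
first = store.issue()
second = store.rotate(first)   # normal rotation yields a new token
replay = store.rotate(first)   # replaying the rotated token revokes the family
```

After the replay, `second` is no longer accepted either, because it belongs to the revoked family.
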
**Plans**: TBD
**UI hint**: yes
---
### Phase 3: Document Migration & Multi-User Isolation
**Goal**: All existing documents have been migrated from flat-file JSON + filesystem into PostgreSQL + MinIO; all new uploads use the presigned URL flow; per-user isolation is enforced at the DB level; the existing document UI works without regression; the backend is stateless and ready for horizontal scaling.
**Mode:** mvp
**Depends on**: Phase 2
**Requirements**: STORE-03, STORE-04, STORE-05, STORE-06, STORE-08, SEC-04, DOC-03, DOC-04, DOC-05
**Success Criteria** (what must be TRUE):
1. Every document present before migration is accessible after migration with the same metadata and extracted text; a count reconciliation check confirms zero document loss
2. Two concurrent uploads that would together exceed a user's 100 MB quota result in exactly one success and one 413 rejection, so recorded usage never exceeds the limit
3. A document delete atomically decrements the user's recorded quota usage; after deletion the quota reflects the freed bytes
4. Requesting a document object key or presigned URL for a document owned by a different user returns 403 — no cross-user object access is possible through any request parameter manipulation
5. AI classification for each document uses the provider and model assigned to that user by the admin, not any user-supplied or default value
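
One way to meet criteria 2 and 3 is a single conditional `UPDATE` that only succeeds while the new total still fits. The sketch below uses SQLite from the standard library purely as a stand-in for PostgreSQL; the `users` table, its `used_bytes` and `quota_bytes` columns, and both helper functions are hypothetical.

```python
import sqlite3

QUOTA_BYTES = 100 * 1024 * 1024  # 100 MB, taking 1 MB = 1024 * 1024 bytes (assumption)


def reserve_quota(conn: sqlite3.Connection, user_id: int, size: int) -> bool:
    """Atomically add `size` to the user's usage; False means reject with 413."""
    cur = conn.execute(
        "UPDATE users SET used_bytes = used_bytes + ? "
        "WHERE id = ? AND used_bytes + ? <= quota_bytes",
        (size, user_id, size),
    )
    conn.commit()
    # The row is updated only if the upload still fits within the quota.
    return cur.rowcount == 1


def release_quota(conn: sqlite3.Connection, user_id: int, size: int) -> None:
    """Give back the bytes of a deleted document, never dropping below zero."""
    conn.execute(
        "UPDATE users SET used_bytes = MAX(used_bytes - ?, 0) WHERE id = ?",
        (size, user_id),
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, "
    "used_bytes INTEGER NOT NULL, quota_bytes INTEGER NOT NULL)"
)
conn.execute("INSERT INTO users VALUES (1, 0, ?)", (QUOTA_BYTES,))
conn.commit()

sixty_mb = 60 * 1024 * 1024
first = reserve_quota(conn, 1, sixty_mb)   # fits within the quota
second = reserve_quota(conn, 1, sixty_mb)  # would exceed it, so it is refused
release_quota(conn, 1, sixty_mb)           # deleting the first document frees its bytes
used_after_delete = conn.execute(
    "SELECT used_bytes FROM users WHERE id = 1"
).fetchone()[0]
```

The same pattern should carry over to PostgreSQL, where `GREATEST` plays the role of SQLite's two-argument `MAX`.
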
**Plans**: TBD
---
### Phase 4: Folders, Sharing, Quotas & Document UX
**Goal**: Users have a complete document management experience — organized with folders, shared by handle, warned before they hit quota, able to preview PDFs in-browser, and served by a searchable document list; admins can view the append-only audit log.
**Mode:** mvp
**Depends on**: Phase 3
**Requirements**: FOLD-01, FOLD-02, FOLD-03, FOLD-04, FOLD-05, SHARE-01, SHARE-02, SHARE-03, SHARE-04, SHARE-05, SEC-08, SEC-09, ADMIN-06, DOC-01, DOC-02
**Success Criteria** (what must be TRUE):
1. A user can create, rename, and delete folders; moving a document between folders preserves its metadata and AI classification; deleting a non-empty folder prompts with the content count before proceeding
2. A user can share a document with another user by handle; the recipient sees it appear in a "Shared with me" virtual folder with no storage quota charged against them; the owner can revoke access and the shared entry disappears immediately for the recipient
3. The sidebar quota bar displays current usage in MB; it turns amber at 80% and red at 95%; an upload that would exceed the limit is rejected with an error showing current usage, the rejected file size, and a link to storage settings
4. Any document in the user's library can be previewed in-browser as a PDF via PDF.js; document bytes are proxied through the app and no presigned URLs are exposed to the browser
5. An admin can view the audit log filtered by date range, user, and action type; the log contains no document content, filenames, or extracted text; account deletion triggers cleanup of all user files before DB records are removed
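
The quota bar thresholds in criterion 3 reduce to a small pure function. The sketch below is illustrative only: `quota_bar_state` and the `"default"` color name are hypothetical, and treating the 80% and 95% boundaries as inclusive is an assumption.

```python
from typing import Tuple

MIB = 1024 * 1024  # bytes per MB as used here (assumption)


def quota_bar_state(used_bytes: int, quota_bytes: int) -> Tuple[float, str]:
    """Return (usage in MB rounded to one decimal, bar color)."""
    if quota_bytes <= 0:
        raise ValueError("quota_bytes must be positive")
    # Integer comparison avoids float rounding exactly at the thresholds.
    if used_bytes * 100 >= quota_bytes * 95:
        color = "red"
    elif used_bytes * 100 >= quota_bytes * 80:
        color = "amber"
    else:
        color = "default"
    return round(used_bytes / MIB, 1), color


# Usage at 50, 80, and 95 MB of a 100 MB quota.
states = [quota_bar_state(mb * MIB, 100 * MIB) for mb in (50, 80, 95)]
print(states)
```
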
**Plans**: TBD
**UI hint**: yes
---
### Phase 5: Cloud Storage Backends
**Goal**: Users can connect OneDrive, Google Drive, Nextcloud, or a generic WebDAV server as a personal storage backend; credentials are encrypted with a per-user HKDF-derived key; connection status is visible; local and cloud storage coexist; the `StorageBackend` ABC makes adding further backends straightforward.
**Mode:** mvp
**Depends on**: Phase 4
**Requirements**: CLOUD-01, CLOUD-02, CLOUD-03, CLOUD-04, CLOUD-05, CLOUD-06, CLOUD-07
**Success Criteria** (what must be TRUE):
1. A user can connect OneDrive, Google Drive, Nextcloud, or a WebDAV endpoint through an OAuth or credential flow; the connection status is displayed as `ACTIVE`, `REQUIRES_REAUTH`, or `ERROR`, and raw credentials are never shown
2. When an OAuth token is revoked externally (simulated `invalid_grant` response), the connection status transitions to `REQUIRES_REAUTH` without a 500 error; the user is shown a re-authentication prompt
3. A user can select their connected cloud backend as the default storage destination for new uploads; local MinIO storage remains available as an alternative; existing local documents are unaffected
4. A user can disconnect a cloud backend; credentials are permanently deleted from the DB and a subsequent attempt to use that backend returns an appropriate error — no orphaned data remains
5. An admin API response for a user's cloud connections returns only `provider`, `display_name`, `connected_at`, and `status`; the `credentials_enc` column is never present in any serialized response
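
Criterion 5 is easiest to guarantee with an explicit allowlist rather than by removing fields after the fact. The sketch below assumes a hypothetical `CloudConnection` record and a hypothetical `to_admin_response` helper; in a FastAPI project the same idea would more likely be expressed as a response model.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Any, Dict

# Only these fields may ever leave the API.
PUBLIC_FIELDS = ("provider", "display_name", "connected_at", "status")


@dataclass
class CloudConnection:
    provider: str
    display_name: str
    connected_at: datetime
    status: str
    credentials_enc: bytes  # encrypted credentials, never serialized


def to_admin_response(connection: CloudConnection) -> Dict[str, Any]:
    """Build the response from the allowlist, so new columns stay private by default."""
    payload = {name: getattr(connection, name) for name in PUBLIC_FIELDS}
    payload["connected_at"] = connection.connected_at.isoformat()
    return payload


# Hypothetical sample record.
example = CloudConnection(
    provider="nextcloud",
    display_name="Home server",
    connected_at=datetime(2026, 5, 21, tzinfo=timezone.utc),
    status="ACTIVE",
    credentials_enc=b"\x00\x01",
)
response = to_admin_response(example)
```
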
**Plans**: TBD
**UI hint**: yes
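
The per-user HKDF-derived key named in the Phase 5 goal can be sketched with the standard library alone. This follows the extract-then-expand construction of RFC 5869 with HMAC-SHA256. The `derive_user_key` helper, the `info` label, and the use of a single master key are assumptions; a production version would more likely rely on a vetted library such as `cryptography`.

```python
import hashlib
import hmac

HASH_LEN = 32  # SHA-256 output size in bytes


def hkdf_sha256(ikm: bytes, salt: bytes, info: bytes, length: int = 32) -> bytes:
    """HKDF with HMAC-SHA256 (RFC 5869): extract, then expand."""
    if length > 255 * HASH_LEN:
        raise ValueError("requested key is too long for HKDF-SHA256")
    # Extract: an empty salt is replaced by HASH_LEN zero bytes.
    prk = hmac.new(salt or b"\x00" * HASH_LEN, ikm, hashlib.sha256).digest()
    # Expand: chain HMAC blocks until enough output is produced.
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def derive_user_key(master_key: bytes, user_id: str) -> bytes:
    """Derive a 32-byte key bound to one user (hypothetical info label)."""
    info = b"cloud-credentials:" + user_id.encode("utf-8")
    return hkdf_sha256(master_key, salt=b"", info=info, length=32)


master = b"\x01" * 32  # placeholder; a real master key would come from configuration
key_a = derive_user_key(master, "user-a")
key_b = derive_user_key(master, "user-b")
```
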
---
## Progress Table
| Phase | Plans Complete | Status | Completed |
|-------|----------------|--------|-----------|
| 1. Infrastructure Foundation | 4/5 | In Progress | - |
| 2. Users & Authentication | 0/? | Not started | - |
| 3. Document Migration & Multi-User Isolation | 0/? | Not started | - |
| 4. Folders, Sharing, Quotas & Document UX | 0/? | Not started | - |
| 5. Cloud Storage Backends | 0/? | Not started | - |