curo1305 be6ff5a71f docs(05-05): complete cloud API endpoints plan — SUMMARY and STATE
- Created 05-05-SUMMARY.md: cloud.py (7 endpoints), main.py (router registration), admin.py (SEC-09 cleanup)
- Updated STATE.md: plan advanced to 5/8, session log updated, decisions recorded
- Updated ROADMAP.md: 05-03, 05-04, 05-05 marked complete
- Updated REQUIREMENTS.md: SEC-09 marked complete (cloud credential purge on account deletion)
2026-05-29 07:34:22 +02:00


DocuVault — v1 Requirements

Last updated: 2026-05-21

v1 Requirements

Authentication (AUTH)

  • AUTH-01: User can register with email and password (Argon2 hashing; strength enforced: ≥12 chars, uppercase, lowercase, number, special char; HaveIBeenPwned breach check)
  • AUTH-02: User can log in and maintain a session (JWT access token in Pinia memory only, never localStorage; refresh token in an httpOnly, Secure, SameSite=Strict cookie; 15-min access / 30-day refresh)
  • AUTH-03: User can enroll a TOTP authenticator app (RFC 6238; 8–10 single-use backup codes issued and explicitly acknowledged before TOTP is marked active)
  • AUTH-04: User can complete login using TOTP code or a one-time backup code (backup code invalidated on use)
  • AUTH-05: User can reset password via email (signed token, 1-hour expiry; reset does not auto-login — user must pass TOTP gate on next login)
  • AUTH-06: User can sign out all active sessions (revokes all refresh tokens in DB; "sign out all devices" control in account settings)
  • AUTH-07: Refresh token rotation with family revocation — reuse of a rotated token revokes the entire family and emits a security alert to the user
  • AUTH-08: TOTP codes are single-use (mark used in DB within the validity window; prevent replay attacks)
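
The rotation rule in AUTH-07 can be sketched as follows. This is a minimal in-memory illustration, not the project's implementation: the `RefreshTokenStore` class, its field names, and the use of `PermissionError` are all assumptions made for the example, and a real store would live in the DB and keep only token hashes.

```python
import secrets
from dataclasses import dataclass, field


@dataclass
class RefreshTokenStore:
    """Hypothetical in-memory stand-in for the refresh-token table."""

    # token -> [family_id, already_rotated]
    tokens: dict = field(default_factory=dict)
    revoked_families: set = field(default_factory=set)
    alerts: list = field(default_factory=list)

    def issue(self, family_id=None):
        """Issue a token; a new login starts a new family."""
        family_id = family_id or secrets.token_hex(8)
        token = secrets.token_urlsafe(32)
        self.tokens[token] = [family_id, False]
        return token

    def rotate(self, token):
        """Exchange a refresh token for its successor in the same family."""
        record = self.tokens.get(token)
        if record is None:
            raise PermissionError("unknown refresh token")
        family_id, already_rotated = record
        if family_id in self.revoked_families:
            raise PermissionError("token family revoked")
        if already_rotated:
            # Reuse of a rotated token: revoke the whole family and
            # record a security alert for the user (AUTH-07).
            self.revoked_families.add(family_id)
            self.alerts.append(family_id)
            raise PermissionError("refresh token reuse detected")
        record[1] = True
        return self.issue(family_id)
```

After a reuse is detected, even the newest token in the family is rejected, so an attacker holding a stolen token and the legitimate user are both forced back through a full login.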

Security (SEC) — Cross-Cutting

  • SEC-01: All state-changing endpoints are protected against CSRF (SameSite=Strict cookie + origin validation)
  • SEC-02: Auth endpoints (login, register, password reset, TOTP verify) are rate-limited (per-IP and per-account)
  • SEC-03: All DB queries use parameterized statements / ORM (zero raw string interpolation into queries)
  • SEC-04: All file/document access resolved through DB lookup — object keys are never reconstructed from request parameters (prevents path traversal and cross-user access)
  • SEC-05: Content-Security-Policy, X-Frame-Options, and X-Content-Type-Options headers set on all responses
  • SEC-06: Constant-time comparison used for all token and code verification (prevents timing attacks)
  • SEC-07: Admin role verified on every admin endpoint request; admin cannot access document content, extracted text, or cloud credentials in any response
  • SEC-08: Cloud credential ciphertext (credentials_enc) excluded from all API serializers by default — admin and user responses return only provider, display_name, connected_at, status
  • SEC-09: Account deletion triggers delete_user_files() on every active cloud connection before removing DB records (prevents orphaned cloud data and satisfies GDPR Article 17)
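
As an illustration of SEC-06, the sketch below compares a presented token against a stored hash with `hmac.compare_digest` from the Python standard library. The helper names `hash_token` and `verify_token`, and the choice to store a SHA-256 hash of the token, are assumptions for the example rather than details taken from the codebase.

```python
import hashlib
import hmac


def hash_token(token: str) -> bytes:
    """Hash a token so the raw value never needs to be stored (assumed scheme)."""
    return hashlib.sha256(token.encode("utf-8")).digest()


def verify_token(presented: str, stored_hash: bytes) -> bool:
    """Compare in constant time; a plain == could leak timing information."""
    return hmac.compare_digest(hash_token(presented), stored_hash)
```

The same pattern would apply to reset tokens, backup codes, and TOTP codes: normalize both sides to bytes, then compare with `hmac.compare_digest` instead of `==`.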

Users & Admin (ADMIN)

  • ADMIN-01: Admin can create user accounts (email, temporary password that must be changed on first login)
  • ADMIN-02: Admin can deactivate a user account (blocks all logins and API access; data preserved)
  • ADMIN-03: Admin can initiate password reset for a user (sends reset email; does not grant admin access to the account)
  • ADMIN-04: Admin can view and adjust individual user storage quotas (warns if new limit is below current usage)
  • ADMIN-05: Admin can assign AI provider and model per user (users cannot modify their own AI configuration)
  • ADMIN-06: Admin can view audit log filtered by date range, user, and action type (metadata only — no document content, filenames, or extracted text)
  • ADMIN-07: Admin impersonation ("log in as user") is explicitly excluded by architecture — no endpoint or UI pathway exists

Storage & Infrastructure (STORE)

  • STORE-01: Platform storage layer migrated from flat-file JSON + local filesystem to PostgreSQL (metadata) + MinIO (objects); existing documents preserved via dual-write migration script
  • STORE-02: Each user's MinIO objects use {user_id}/{document_id}/{uuid4()}{ext} keys — human-readable filenames stored in DB only
  • STORE-03: Each user has a 100 MB storage quota enforced atomically at upload using UPDATE quotas SET used_bytes = used_bytes + $delta WHERE (used_bytes + $delta) <= limit_bytes RETURNING used_bytes
  • STORE-04: User sees quota usage bar in sidebar (X MB of Y MB) with amber warning at 80% and red warning at 95%
  • STORE-05: Upload rejected at quota limit with a specific error showing current usage, rejected file size, and a link to storage settings
  • STORE-06: Document delete atomically decrements quota usage
  • STORE-07: Backend is stateless — no per-instance file locks; multiple instances can run behind a load balancer
  • STORE-08: FastAPI BackgroundTasks replaced with Celery + Redis or pgqueuer before horizontal scaling is enabled
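
The conditional UPDATE in STORE-03 can be tried out with a small sketch. Several things here are assumptions: SQLite stands in for PostgreSQL so the example is self-contained, the `user_id` column and the helper names are hypothetical, and the affected-row count is checked instead of `RETURNING used_bytes`.

```python
import sqlite3

MB = 1024 * 1024


def open_demo_db() -> sqlite3.Connection:
    """Create an in-memory quotas table with one user at the 100 MB limit."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE quotas ("
        "user_id INTEGER PRIMARY KEY, "
        "used_bytes INTEGER NOT NULL DEFAULT 0, "
        "limit_bytes INTEGER NOT NULL)"
    )
    conn.execute("INSERT INTO quotas (user_id, limit_bytes) VALUES (1, ?)", (100 * MB,))
    conn.commit()
    return conn


def reserve_quota(conn: sqlite3.Connection, user_id: int, delta: int) -> bool:
    """Add delta to used_bytes only if the result stays within the limit.

    The check and the increment happen in one statement, so two concurrent
    uploads cannot both pass a separate "is there room?" read.
    """
    cur = conn.execute(
        "UPDATE quotas SET used_bytes = used_bytes + ? "
        "WHERE user_id = ? AND used_bytes + ? <= limit_bytes",
        (delta, user_id, delta),
    )
    conn.commit()
    return cur.rowcount == 1  # zero rows updated means the upload is rejected


def used_bytes(conn: sqlite3.Connection, user_id: int) -> int:
    """Read the current usage for one user."""
    row = conn.execute(
        "SELECT used_bytes FROM quotas WHERE user_id = ?", (user_id,)
    ).fetchone()
    return row[0]
```

With a 100 MB limit, a 60 MB reservation succeeds, a following 50 MB reservation is rejected without changing usage, and a 40 MB reservation then fills the quota exactly.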

Folders & Organization (FOLD)

  • FOLD-01: User can create, rename, and delete folders (delete confirms content count before proceeding)
  • FOLD-02: User can move documents between folders
  • FOLD-03: Breadcrumb navigation renders current folder path; each segment is clickable to navigate up
  • FOLD-04: Document list supports sort by name, date uploaded, and file size
  • FOLD-05: Full-text search across user's documents (PostgreSQL tsvector index on extracted text)
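
One way the breadcrumb trail in FOLD-03 could be derived is by walking parent pointers from the current folder up to the root. The sketch assumes a hypothetical mapping of folder ID to `(name, parent_id)`; the real schema is not described in this document.

```python
def breadcrumb(folders: dict, folder_id):
    """Return [(id, name), ...] from the root folder down to folder_id.

    folders maps folder ID -> (name, parent_id); the root has parent_id None.
    """
    trail = []
    seen = set()
    current = folder_id
    while current is not None:
        if current in seen:
            # Guard against corrupted data that would loop forever.
            raise ValueError("cycle in folder hierarchy")
        seen.add(current)
        name, parent_id = folders[current]
        trail.append((current, name))
        current = parent_id
    trail.reverse()  # collected leaf-first, rendered root-first
    return trail
```

Each `(id, name)` pair gives the UI what it needs to render one clickable segment.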

Document Sharing (SHARE)

  • SHARE-01: User can share a document with another user by their unique handle (at-handle or user ID)
  • SHARE-02: Shared documents appear in a "Shared with me" virtual folder for the recipient (no storage quota counted against recipient)
  • SHARE-03: Shared access is view-only by default; owner controls permission level
  • SHARE-04: Owner can revoke share access; revocation is immediate
  • SHARE-05: Documents shared with others display a "shared" indicator in the owner's list view

Cloud Storage (CLOUD)

  • CLOUD-01: User can connect OneDrive (Microsoft Graph), Google Drive (v3 API), Nextcloud, or generic WebDAV as a personal storage backend
  • CLOUD-02: Cloud OAuth credentials encrypted using HKDF per-user key derivation (HKDF(master_key, salt=user_id_bytes, info=b"cloud-credentials")); master key in CLOUD_CREDS_KEY env var; never stored in DB
  • CLOUD-03: Local MinIO storage and connected cloud backends coexist; user can select their default storage destination
  • CLOUD-04: Each cloud connection displays status: ACTIVE | REQUIRES_REAUTH | ERROR
  • CLOUD-05: On OAuth revocation (invalid_grant), connection status transitions to REQUIRES_REAUTH — the error is surfaced to the user, not retried silently
  • CLOUD-06: User can disconnect a cloud backend; credentials are permanently deleted from the DB
  • CLOUD-07: Storage backend abstracted via StorageBackend ABC + factory in storage/ module (mirrors existing ai/ provider pattern)
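
The per-user derivation in CLOUD-02 can be sketched with the standard library alone. This follows the extract-and-expand construction of RFC 5869, but several details are assumptions not stated in the requirement: SHA-256 as the hash, a 32-byte output key, and a UUID user ID whose 16 raw bytes serve as `user_id_bytes`. The function names are illustrative.

```python
import hashlib
import hmac
import uuid

HASH_LEN = 32  # SHA-256 digest size in bytes


def hkdf_sha256(master_key: bytes, salt: bytes, info: bytes, length: int = 32) -> bytes:
    """HKDF (RFC 5869) with HMAC-SHA256: extract a PRK, then expand it."""
    if length > 255 * HASH_LEN:
        raise ValueError("requested key is too long for HKDF-SHA256")
    # Extract: an empty salt defaults to HASH_LEN zero bytes per the RFC.
    prk = hmac.new(salt or b"\x00" * HASH_LEN, master_key, hashlib.sha256).digest()
    # Expand: chain HMAC blocks until enough output is available.
    okm = b""
    block = b""
    counter = 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def derive_user_key(master_key: bytes, user_id: uuid.UUID) -> bytes:
    """Derive the per-user credential key; the master key itself is never stored."""
    return hkdf_sha256(master_key, salt=user_id.bytes, info=b"cloud-credentials")
```

Because the user ID is the salt, the same master key yields a different encryption key per user, so a key derived for one account is of no use against another account's `credentials_enc`.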

Documents & AI (DOC)

  • DOC-01: User can view document metadata and extracted text for any document in their library
  • DOC-02: In-browser PDF preview (PDF.js); document bytes proxied through the app — no presigned URLs exposed to the browser (privacy model)
  • DOC-03: AI provider and model assigned by admin per user; user cannot change AI configuration
  • DOC-04: System default topics + per-user topic overrides preserved from existing implementation
  • DOC-05: AI classification uses the user's assigned provider and model (from DB, not from user-supplied settings)

v2 Requirements (Deferred)

  • Subscription billing and payment processing (quota model designed to plug in)
  • SSO: Microsoft, Google, Apple (auth layer designed for extension)
  • Keycloak / SAML / OAuth2 enterprise federation
  • Group admin roles (groups table seeded in schema, unpopulated)
  • Share permission levels beyond view-only (edit, comment)
  • Document version history
  • Share expiry dates
  • Real-time collaboration or comments
  • Mobile app
  • GDPR data export (Article 20) — async background job, deferred to v2
  • Email notifications for sharing events
  • Public link sharing (unauthenticated)

Out of Scope

  • Admin impersonation / "log in as user" — violates privacy-first core value; explicit architectural exclusion
  • Document editing or annotation — not planned
  • Document viewer for non-PDF types beyond metadata (DOCX, image renders) — v2
  • AI-generated document summaries beyond topic classification — v2
  • Webhooks or API access for third parties — not planned for v1

Traceability

Filled by roadmapper — 2026-05-21.

REQ-ID    Phase  Notes
STORE-01  1      Dual-write migration script; schema and Alembic wiring
STORE-02  1      Object key schema enforced in model layer
STORE-07  1      Stateless backend; no per-instance file locks
AUTH-01   2      Registration with Argon2 + HaveIBeenPwned check
AUTH-02   2      JWT session; httpOnly refresh cookie; Pinia memory access token
AUTH-03   2      TOTP enrollment with backup code acknowledgement flow
AUTH-04   2      Login via TOTP code or single-use backup code
AUTH-05   2      Password reset email; routes back to TOTP gate
AUTH-06   2      Sign out all devices; revokes all refresh tokens
AUTH-07   2      Refresh token family revocation on reuse; security alert
AUTH-08   2      TOTP single-use enforcement within validity window
SEC-01    2      CSRF protection on all state-changing endpoints
SEC-02    2      Rate limiting on auth endpoints (per-IP and per-account)
SEC-03    2      Parameterized queries / ORM enforced from first migration
SEC-05    2      Security response headers on all responses
SEC-06    2      Constant-time comparison for token/code verification
SEC-07    2      Admin role dependency; admin blocked from document content
ADMIN-01  2      Admin creates user with temporary password
ADMIN-02  2      Admin deactivates user account
ADMIN-03  2      Admin initiates password reset for user
ADMIN-04  2      Admin views and adjusts user storage quotas
ADMIN-05  2      Admin assigns AI provider and model per user
ADMIN-07  2      Explicit architectural exclusion of admin impersonation
STORE-03  3      Atomic quota enforcement at upload
STORE-04  3      Quota usage bar with 80%/95% warnings
STORE-05  3      Upload rejection at quota limit with detailed error
STORE-06  3      Atomic quota decrement on document delete
STORE-08  3      BackgroundTasks replaced with Celery+Redis or pgqueuer
SEC-04    3      DB-lookup-only file access; no key reconstruction from params
DOC-03    3      AI provider/model from DB per user; not user-supplied
DOC-04    3      System default topics + per-user topic overrides preserved
DOC-05    3      Classification uses user's assigned provider and model
FOLD-01   4      Folder CRUD with content-count confirmation on delete
FOLD-02   4      Document move between folders
FOLD-03   4      Breadcrumb navigation with clickable path segments
FOLD-04   4      Document list sort by name, date, and file size
FOLD-05   4      Full-text search via PostgreSQL tsvector index
SHARE-01  4      Share document by user handle
SHARE-02  4      "Shared with me" virtual folder; no quota charged to recipient
SHARE-03  4      View-only default sharing; owner controls permission level
SHARE-04  4      Immediate share revocation
SHARE-05  4      Shared indicator on documents in owner's list view
SEC-08    4      credentials_enc excluded from all serializers
SEC-09    4      Account deletion triggers delete_user_files() per cloud connection
ADMIN-06  4      Admin audit log viewer filtered by date, user, action
DOC-01    4      View document metadata and extracted text
DOC-02    4      In-browser PDF preview via PDF.js; bytes proxied through app
CLOUD-01  5      Connect OneDrive, Google Drive, Nextcloud, WebDAV
CLOUD-02  5      HKDF per-user key derivation for credential encryption
CLOUD-03  5      Local and cloud storage coexist; user selects default
CLOUD-04  5      Connection status display: ACTIVE / REQUIRES_REAUTH / ERROR
CLOUD-05  5      invalid_grant transitions to REQUIRES_REAUTH; surfaced to user
CLOUD-06  5      Disconnect cloud backend; credentials permanently deleted
CLOUD-07  5      StorageBackend ABC + factory in storage/ module