curo1305 bd17b4b22f docs(06.2): mark phase 6.2 complete — all gates passed
UAT complete (7/7 re-tests passed or skipped with reason), security gate
passed (threats_open: 0), 344 backend tests passing.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-01 21:09:04 +02:00


gsd_state_version: 1.0
milestone: v1.0
milestone_name: audit gaps: SHARE-02/STORE-06/ADMIN-06
current_phase: 06.2
status: complete
last_updated: 2026-06-01T00:00:00.000Z
progress:
  total_phases: 2
  completed_phases: 2
  total_plans: 7
  completed_plans: 7
  percent: 100

Project State

Project: DocuVault
Status: Executing Phase 06.2
Current Phase: 06.2
Last Updated: 2026-05-28

Phase Status

Phase Name Status
1 Infrastructure Foundation ✓ Complete
2 Users & Authentication ✓ Complete (5/5 plans)
3 Document Migration & Multi-User Isolation ✓ Complete (5/5 plans, UAT passed, security gate passed)
4 Folders, Sharing, Quotas & Document UX ✓ Complete (9/9 plans, UAT 14/15 passed, 1 bug fixed)
5 Cloud Storage Backends ✓ Complete (12/12 plans, UAT 5/6 passed, 3 gaps closed by 05-12)
6 Performance & Production Hardening Not started
6.1 Close v1.0 audit gaps: SHARE-02/STORE-06/ADMIN-06 ✓ Complete (2/2 plans)
6.2 Close v1 sharing + cloud-delete + CSV export gaps ✓ Complete (5/5 plans, UAT passed, security gate passed)

Current Position

Phase: 06.2 (close-v1-sharing-cloud-delete-csv-export-gaps) — EXECUTING
Plan: 1 of 5
Phase: 05-cloud-storage-backends — Complete (12/12 plans, all UAT gaps resolved)
Plan: 05-12 — complete
Progress: [██████████] 100%

Performance Metrics

Metric Value
Phases complete 1 / 5
Requirements mapped 54 / 54
Plans written 5 (Phase 1)
Plans complete 10 (5 Phase 1 + 5 Phase 2)

Accumulated Context

Key Decisions

Decision Rationale
PostgreSQL + MinIO Multi-user quotas and horizontal scaling require shared, consistent state
HKDF per-user key derivation Single Fernet key would be catastrophic on leak — must be derived before first credential is stored
Presigned MinIO URL flow FastAPI handles metadata only; bytes never pass through the API layer
Atomic PostgreSQL quota UPDATE Never perform quota arithmetic in Python between two DB statements
JWT in httpOnly cookie Refresh token in httpOnly cookie; access token in Pinia memory only — never localStorage
Refresh token family revocation RFC 9700 — reuse of a rotated token revokes entire family and alerts user
BackgroundTasks replacement FastAPI BackgroundTasks is per-instance; replace with Celery+Redis or pgqueuer before horizontal scale
AuditLog metadata_ ORM attribute metadata is reserved on DeclarativeBase; ORM attribute is metadata_ with name="metadata" kwarg to avoid silent collision
documents.user_id nullable Phase 1 D-03 — no auth in Phase 1; Phase 2 migration adds NOT NULL after auth lands
groups stub table Phase 1 D-02 — groups is a v2 feature; table created now for schema completeness, no rows until Phase 2+
SEQUENCES grants in migration GRANT USAGE/SELECT on sequences required for audit_log.id autoincrement nextval() by docuvault_app
Admin impersonation excluded Explicit architectural exclusion — no endpoint or UI pathway; violates privacy-first core value
user_id as refresh token family proxy No separate family_id column; user_id serves as family per RFC 9700 — simpler schema
pwdlib over passlib pwdlib actively maintained with clean Argon2Hasher API; passlib unmaintained
TOTP replay TTL=90s valid_window=1 accepts the previous, current, and next 30s time steps (90s total) — TTL matches window
HIBP fail-open Network errors return False + log warning; auth never blocked by external service
Two-DSN PostgreSQL strategy DATABASE_URL (docuvault_app, DML only) + DATABASE_MIGRATE_URL (docuvault_migrate, DDL only); celery-worker gets only DATABASE_URL
MinIO healthcheck via mc ready local curl removed from MinIO Docker image since Oct 2023; mc is the correct in-container healthcheck tool
pydantic-settings v2 SettingsConfigDict SettingsConfigDict API used (not deprecated class Config form) for env var config
async_client fixture name Distinct from legacy sync client fixture to avoid collision; both coexist until Plan 05
xfail(strict=False) for Wave 0 All pre-implementation scaffolds use strict=False so unexpected passes don't break CI
StorageBackend ABC + factory mirrors ai/ pattern 5 abstract methods; get_storage_backend() factory; MinIOBackend wraps all sync Minio SDK calls in asyncio.to_thread()
Explicit localhost string block in validate_cloud_url hostname == "localhost" blocked before DNS resolution — OS-agnostic (getaddrinfo("localhost") behaviour varies by OS)
Fresh HKDF instance per _derive_fernet_key call cryptography library raises AlreadyFinalized on 2nd .derive() call; always create new HKDF(...) instance — never cache
Lazy import of cloud backends in get_storage_backend_for_document Avoids circular imports at module load time; backends imported inside function body with type: ignore[import] until Plans 05-03..05-05 create them
Fetch-outside-lock async cache pattern get_cloud_folders_cached acquires lock to check cache, releases lock, awaits fetch_fn, re-acquires lock to write — prevents event loop blocking on cache miss
STORE-02 key enforced in code MinIOBackend.put_object constructs {user_id}/{document_id}/{uuid4()}{ext}; no filename parameter — only extension passes through
null-user D-03 sentinel services/storage.save_upload uses user_id="null-user" in Phase 1 (no auth); Phase 2 replaces with str(current_user.id)
load_settings flat-file Phase 1 users.ai_provider/ai_model columns cannot be populated until Phase 2; settings remain flat-file JSON for Phase 1
Deferred Celery import in /password-reset send_reset_email.delay called via from tasks.email_tasks import send_reset_email inside handler body — same circular-import fix as document_tasks
TOTP QR code as otpauth:// link No QR library installed; plan permits manual secret display for MVP; functional flow complete without rendered QR image
ConfirmBlock no acknowledgment checkbox ConfirmBlock handles message + button pair; BackupCodesDisplay owns its separate acknowledgment checkbox — no overlap
ADMIN-07 enforced by omission No impersonation endpoint exists; AST check + test_admin_impersonation_not_found verify absence; violates privacy-first core value
_user_to_dict() whitelist for admin responses Explicit field whitelist prevents accidental password_hash/credentials_enc leakage from admin endpoints
Quota warning is 200 not 4xx Below-usage limit change is applied; warning=True advisory field returned — not a rejection
AdminQuotasTab fetches quotas per-user via Promise.allSettled adminListUsers() does not include quota fields; per-user endpoint parallelized; failed quotas filtered silently
Temp password via crypto.getRandomValues Browser-native CSPRNG; no external library; always satisfies AUTH-01 strength rules
batch_alter_table for NOT NULL in migration 0003 SQLite requires batch_alter_table for ALTER COLUMN; transparent passthrough on PostgreSQL — enables SQLite CI test runs
MinIO step in migration 0003 gated on MINIO_ENDPOINT Migration skips MinIO deletions when env var absent; enables safe SQLite test runs per T-03-02
raising=False for Phase 3 MinIO mock fixtures mock_minio_presigned + mock_minio_stat patch methods that don't exist until Plan 03-02; raising=False pre-installs them
Dual MinIO client (internal + public) Presigned URL HMAC signature must be computed with browser-visible hostname (localhost:9000); using internal Docker client (minio:9000) causes browser signature mismatch
Wave 2 user_id=None guard upload-url sets user_id=None + object_key "null-user/" prefix; confirm skips quota when user_id is None; Plan 03-03 removes both guards
SQLite quota xfail(strict=False) SQLite stores UUID as CHAR(32) without dashes; raw SQL WHERE user_id = :uid never matches str(uuid) dashed format — test-env limitation, not code defect
Celery mock required in /confirm tests extract_and_classify.delay() connects to Redis; monkeypatch blocks it in unit tests; MagicMock pattern established for all confirm endpoint tests
get_regular_user raises 403 for admin Admin is authenticated but must not access document content; 401 would falsely imply unauthenticated — 403 is correct for role rejection
Cross-user doc access returns 404 not 403 Combining "not found" and "wrong owner" into 404 prevents attacker from learning which doc IDs exist for other users (D-16, T-03-11)
CASE WHEN replaces GREATEST in quota decrement SQLite lacks GREATEST scalar function; CASE WHEN used_bytes > :delta THEN used_bytes - :delta ELSE 0 END is semantically equivalent and SQLite-compatible
load_topics_for_user uses or_(user_id == x, user_id.is_(None)) SQLAlchemy is_(None) not == None; or_() combines system topics and user's own topics for namespace-scoped query (D-17, DOC-04)
AI-suggested topics go in user namespace classifier passes user_id=doc.user_id to create_topic; AI-suggested topics are per-user not system-wide (D-11)
Celery task signature unchanged for ai_provider Task receives only document_id; ai_provider/ai_model resolved inside _run via session.get(User, doc.user_id) — prevents broker injection (T-03-19)
_DEFAULT_SYSTEM_PROMPT in classifier.py System prompt env var is optional; hardcoded fallback kept in classifier module not config.py (D-13)
Default AI provider is ollama/llama3.2 Code defaults; overridable via DEFAULT_AI_PROVIDER / DEFAULT_AI_MODEL env vars (D-15)
/settings route kept as static placeholder SettingsView shows admin-managed card; route not removed to avoid UX regression (Risk 6)
Plain anchor in quota rejection block Plain anchor element used in place of a link component that would need importing, to avoid an import dependency in the upload component
uploadProgress entries owned by parent Store does not clear uploadProgress map entries after upload; DropZone/parent clears on row dismiss
fetchQuota silent catch in auth store Silent catch keeps last-known values; QuotaBar owns loadFailed state and hides on error (UI-SPEC)
XHR PUT progress range 5-90 5 + Math.round(pct * 0.85) maps XHR 0-100 → visual 5-90; remaining 10% covers confirm + enqueue
FTS stubs carry both xfail and skipif(INTEGRATION) skipif fires first in non-INTEGRATION runs (tests appear SKIPPED); xfail catches failures when INTEGRATION=1 — both decorators required
Wave 0 stubs: single-line body only All Phase 4 stubs: body is only pytest.xfail("not implemented yet") — no assertion code; strict=False so xpass never breaks CI
GIN index via op.execute() raw SQL Alembic autogenerate cannot round-trip expression indexes; raw SQL with comment prevents re-creation on every --autogenerate run (issue #1390)
put_object_raw not in StorageBackend ABC audit-logs bucket is MinIO-only; local/WebDAV backends have no audit concept; MinIOBackend-only method
write_audit_log uses session.flush() D-14: caller owns the transaction; flush queues the audit entry without committing — commit remains caller's responsibility
Breadcrumb uses iterative Python parent-walk Not WITH RECURSIVE — ensures SQLite unit tests pass; cycle guard (visited set) prevents infinite loop on malformed data
document_move_router is a separate APIRouter PATCH /api/documents/{id}/folder placed in folders.py not documents.py; separate router with /api/documents prefix avoids circular import
FTS plainto_tsquery wrapped in try/except SQLite silently degrades to unfiltered results when plainto_tsquery unavailable; PostgreSQL works fully — no unit test breakage
Share IDOR: DELETE returns 404 not 403 Prevents share ID enumeration; attacker cannot learn which share IDs exist for other users (T-04-04-02)
/received before /{share_id} in router Path parameter conflict: FastAPI routes /received as /{share_id}="received" if DELETE is defined first — ordering enforced by comment
No quota touch in shares.py Recipient's quota is never modified by share operations (T-04-04-04); sharing is metadata-only from quota's perspective
login_failed audit metadata_=None No email, no hash, no PII in login failure audit events — T-04-07-01 threat mitigation
document audit metadata whitelist document.uploaded contains only size_bytes and storage_backend; document.deleted contains only size_bytes — no filename, no extracted_text
CloudConnectionOut whitelist pattern Pydantic model with exactly the safe fields; credentials_enc absent by omission — SEC-08 safe-by-default
admin.user_deleted flush before delete audit write flushed (session.flush()) while user FK still valid; session.delete(user) follows — preserves audit FK integrity
test_admin_impersonation 405 acceptable DELETE /users/{id} causes GET to return 405 not 422; both mean no GET impersonation endpoint; test updated to accept {404, 405, 422}
CloudConnectionError shared exception type Defined once in google_drive_backend.py; imported by onedrive_backend.py — single exception type across all cloud backends
cache_discovery=False on Drive build() Prevents /tmp discovery cache writes — directory traversal vector (T-05-03-05)
createUploadSession for all OneDrive uploads No 4 MB size gate; resumable sessions handle small and large files through same code path (Pitfall 6)
MSAL invalid_grant via result.get('error') MSAL returns dict (never raises); field-level check is correct — Assumption A3 confirmed
WebDAVBackend SSRF double guard pattern validate_cloud_url in __init__ (construct-time) AND before every asyncio.to_thread() call — mirrors D-17 requirement for DNS-rebinding mitigation
nextcloud/webdav dispatch to distinct classes NextcloudBackend for 'nextcloud' provider (has list_folder); WebDAVBackend for 'webdav' — identical constructor signatures
webdavclient3 upload_to/download_from confirmed A1 assumption in RESEARCH.md was correct; verified via runtime dir(Client) inspection before use
OAuth callback not authenticated via JWT OAuth redirect flow cannot carry Bearer header; state token (256 bits, TTL 1800s, single-use) provides equivalent security
Cloud cleanup added to admin delete_user only auth.py has no DELETE /api/users/me; admin-initiated deletion is the only account deletion code path
Cloud cleanup runs before MinIO cleanup credentials are still in the DB when get_storage_backend_for_document is called; session.flush() runs after the connection deletes
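
The "Atomic PostgreSQL quota UPDATE" and "CASE WHEN replaces GREATEST" decisions above can be illustrated with a minimal sketch. It assumes a hypothetical `quotas` table with `user_id` and `used_bytes` columns and runs against in-memory SQLite; the project's real schema and session handling may differ. The point is that the clamp-at-zero arithmetic happens inside one UPDATE statement rather than in Python between two statements.

```python
import sqlite3

# Hypothetical table and column names, used only for this illustration.
DECREMENT_SQL = """
UPDATE quotas
SET used_bytes = CASE WHEN used_bytes > :delta
                      THEN used_bytes - :delta
                      ELSE 0 END
WHERE user_id = :uid
"""


def decrement_quota(conn: sqlite3.Connection, user_id: str, delta: int) -> int:
    """Subtract delta from used_bytes in a single statement, clamping at zero."""
    with conn:  # commits on success, rolls back on error
        conn.execute(DECREMENT_SQL, {"delta": delta, "uid": user_id})
    row = conn.execute(
        "SELECT used_bytes FROM quotas WHERE user_id = ?", (user_id,)
    ).fetchone()
    return row[0]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE quotas (user_id TEXT PRIMARY KEY, used_bytes INTEGER NOT NULL)"
)
conn.execute("INSERT INTO quotas VALUES ('u1', 1000)")
conn.commit()

after_small = decrement_quota(conn, "u1", 400)   # ordinary decrement
after_large = decrement_quota(conn, "u1", 5000)  # clamps instead of going negative
```

The same CASE WHEN expression is valid on PostgreSQL, which is why it can stand in for GREATEST in code that must also run under SQLite tests.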

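The "Fetch-outside-lock async cache pattern" decision can be sketched as follows. This is an illustration under assumptions, not the project's `get_cloud_folders_cached`: the `FolderCache` class, its TTL handling, and the `fetch_folders` stand-in are all hypothetical. The lock is held only to read or write the dictionary, never across the awaited fetch.

```python
import asyncio
import time
from typing import Any, Awaitable, Callable, Dict, List, Tuple


class FolderCache:
    """Hypothetical TTL cache illustrating the fetch-outside-lock pattern."""

    def __init__(self, ttl_seconds: float = 60.0) -> None:
        self._ttl = ttl_seconds
        self._entries: Dict[str, Tuple[float, Any]] = {}
        self._lock = asyncio.Lock()

    async def get(self, key: str, fetch_fn: Callable[[], Awaitable[Any]]) -> Any:
        # 1. Hold the lock only long enough to check the cache.
        async with self._lock:
            hit = self._entries.get(key)
            if hit is not None and hit[0] > time.monotonic():
                return hit[1]
        # 2. Lock released: the slow fetch does not hold up other cache users.
        value = await fetch_fn()
        # 3. Re-acquire briefly to store the result with its expiry time.
        async with self._lock:
            self._entries[key] = (time.monotonic() + self._ttl, value)
        return value


async def demo() -> Tuple[List[str], List[str], int]:
    cache = FolderCache(ttl_seconds=60.0)
    calls = 0

    async def fetch_folders() -> List[str]:
        nonlocal calls
        calls += 1
        await asyncio.sleep(0.01)  # stands in for a cloud API round trip
        return ["Inbox", "Archive"]

    first = await cache.get("conn-1", fetch_folders)
    second = await cache.get("conn-1", fetch_folders)  # served from cache
    return first, second, calls


first, second, calls = asyncio.run(demo())
```

One trade-off of this shape is that two coroutines missing on the same key at the same time may both fetch; the last write wins. That cost is accepted in exchange for never holding the lock during network I/O.
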
Roadmap Evolution

  • Phase 6 added: Performance & Production Hardening (2026-05-30)
  • Phase 6.1 inserted: Close v1.0 audit gaps — SHARE-02/STORE-06/ADMIN-06 (2026-05-30)
  • Phase 6.2 inserted: Close v1 sharing + cloud-delete + CSV export gaps (2026-05-31)

Open Questions

  • Verify cloud SDK minor versions on PyPI before Phase 5 pinning

Workflow Changes (2026-05-25)

Two mandatory cross-cutting gates added to all phases going forward:

1. Test gate — every plan must leave pytest -v passing with zero failures. Every new function/endpoint/component requires at least one test. All security-invariant negative tests (wrong owner, admin block, token replay) must exist and pass.

2. Security gate — a security agent runs after every plan execution and is a blocking requirement before phase advancement. It:

  • Runs bandit -r backend/, pip-audit, npm audit --audit-level=high
  • Checks for path traversal, IDOR, SSRF, timing attacks, mass assignment, token replay
  • Verifies admin endpoints never return password_hash, credentials_enc, or document content
  • Fixes issues directly (full edit access) rather than deferring

3. Bug fix rule — all fixes: root cause only, ≤50 lines, regression test required, no workarounds.

See CLAUDE.md "Testing Protocol" and "Security Protocol" sections for full detail.
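
A security-invariant negative test of the kind the test gate requires (wrong owner) might look like the sketch below. It is framework-free so it runs on its own; `DocumentNotFound`, `get_document_for_user`, and the in-memory `DOCUMENTS` store are hypothetical stand-ins, and the real tests presumably exercise the HTTP endpoints instead. It follows the "Cross-user doc access returns 404 not 403" decision: a missing document and another user's document are indistinguishable to the caller.

```python
from dataclasses import dataclass
from typing import Dict, Optional


class DocumentNotFound(Exception):
    """Would map to HTTP 404 in an API layer (hypothetical exception name)."""


@dataclass(frozen=True)
class Document:
    id: str
    owner_id: str


# In-memory stand-in for the documents table.
DOCUMENTS: Dict[str, Document] = {"doc-1": Document(id="doc-1", owner_id="alice")}


def get_document_for_user(document_id: str, user_id: str) -> Document:
    doc: Optional[Document] = DOCUMENTS.get(document_id)
    # Missing and wrong-owner cases raise the same error, so a caller
    # cannot learn which document IDs exist for other users.
    if doc is None or doc.owner_id != user_id:
        raise DocumentNotFound(document_id)
    return doc


def test_owner_can_read() -> None:
    assert get_document_for_user("doc-1", "alice").owner_id == "alice"


def test_wrong_owner_looks_like_missing() -> None:
    outcomes = []
    for doc_id, user in [("doc-1", "mallory"), ("no-such-doc", "mallory")]:
        try:
            get_document_for_user(doc_id, user)
            outcomes.append("ok")
        except DocumentNotFound:
            outcomes.append("not_found")
    assert outcomes == ["not_found", "not_found"]


test_owner_can_read()
test_wrong_owner_looks_like_missing()
```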

Blockers

None.

Session Continuity

Updated at each phase transition.

Field Value
Last session 2026-05-25 — Phase 3 UAT complete (10/10); security gate passed (3 fixes: bandit B324, Referrer-Policy, IDOR on /topics/suggest); test fix for test_lmstudio.py import
Last session 2026-05-25 — Phase 4 context gathered (4 areas: folder nav, sharing, PDF proxy, audit log)
Last session 2026-05-25 — Phase 4 UI-SPEC approved (6 dimensions: 2 PASS clean, 3 FLAG non-blocking, 0 BLOCK)
Last session 2026-05-25 — Phase 4 plans created (9 plans, 7 waves) + verification passed (0 blockers, 2 warnings)
Last session 2026-05-25 — Plan 04-01 executed: 30 Wave 0 xfail stubs across 5 test files; 39 xfailed total, zero new failures
Last session 2026-05-25 — Plan 04-02 executed: migration 0004 (pdf_open_mode, GIN FTS index, audit-logs bucket) + MinIOBackend.put_object_raw(); 122 tests pass
Last session 2026-05-25 — Plan 04-03 executed: write_audit_log() helper (flush-not-commit, never-raises) + FOLD-01..05 folder API + document sort/FTS/move; 122 pass, 0 new failures
Last session 2026-05-25 — Plan 04-04 executed: Sharing API (SHARE-01..05) — grant/list/received/revoke with IDOR protection; 7 xfailed, zero new failures
Last session 2026-05-28 — Phase 4 UAT complete (14/15 passed, 1 bug found + fixed: duplicate folder on creation); sidebar collapsible folder tree added; Phase 4 marked complete
Last session 2026-05-28 — Phase 5 UI-SPEC approved (6/6 dimensions passed; 2 revision rounds: Cancel label → context-specific, text-lg → text-xl)
Last session 2026-05-28 — Phase 5 planned (8 plans, 7 waves); verification passed (4 blockers → resolved: D-05 API-layer refresh path, SEC-09 cloud cleanup, frontend_url config, RESEARCH resolved markers)
Last session 2026-05-28 — Plan 05-01 executed: Wave 0 Nyquist scaffold — 19 xfail stubs in test_cloud.py, 4 cloud fixtures in conftest.py, 6 package pins, 8 config settings; 172 passed / 43 xfailed
Last session 2026-05-28 — Plan 05-02 executed: cloud_utils.py (SSRF+HKDF), cloud_cache.py (TTLCache), storage factory extended; 199 passed / 43 xfailed / 1 pre-existing failure
Last session 2026-05-28 — Plan 05-03 executed: GoogleDriveBackend (Drive v3, cache_discovery=False, asyncio.to_thread) + OneDriveBackend (MSAL, resumable upload, CHUNK_SIZE=10MB); 262 passed / 43 xfailed / 1 pre-existing failure
Last session 2026-05-28 — Plan 05-04 executed: WebDAVBackend + NextcloudBackend (SSRF double-guard, asyncio.to_thread, list_folder); 262 passed / 43 xfailed / 1 pre-existing failure
Last session 2026-05-29 — Plan 05-05 executed: cloud.py (7 endpoints), main.py (routers registered), admin.py (SEC-09 cloud cleanup); 262 passed / 43 xfailed / 1 pre-existing failure
Last session 2026-05-29 — Plan 05-06 executed: documents.py cloud upload+content-proxy extension; all 15 xfail stubs promoted to 20 passing tests (CLOUD-03, CLOUD-05, CLOUD-07); 282 passed / 24 xfailed / 1 pre-existing failure
Last session 2026-05-29 — Plan 05-07 executed: useCloudConnectionsStore, 3-tab SettingsView, SettingsCloudTab (4 providers, status badges, OAuth callback), CloudCredentialModal; 61 tests passing, build exits 0
Last session 2026-05-29 — Phase 5 complete: 4 cloud backends (Google Drive, OneDrive, Nextcloud, WebDAV), HKDF credential encryption, SSRF prevention, OAuth flows, cloud API (7 endpoints), frontend Settings 3-tab + CloudCredentialModal, AppSidebar cloud section, all 20 Phase 5 tests passing, security gates passed
Last session 2026-05-30 — Phase 5 UAT: 5/6 tests passed; 3 gaps diagnosed (OneDrive unconfigured 500, cloud doc stream opaque 500, DropZone disappeared); gap-closure plan 05-12 created (3 tasks, wave 1)
Last session 2026-05-30 — Plan 05-12 executed: OAuth 400 preflight (unconfigured creds), 502 cloud fallback, celery-worker volume mount, upload hint in CloudStorageView; 293 passed / 24 xfailed / 1 pre-existing failure
Last session 2026-05-30 — Phase 6.1 executed: 7 share tests + 4 audit tests promoted from xfail stubs; second_auth_user fixture added; 309 passed / 0 failed
Last session 2026-05-31 — Phase 6.2 planned: 4 plans (3 waves); SHARE-03/SHARE-05 (Plan 02), cloud-delete (Plan 03), ADMIN-06 audit enrichment + CSV + daily exports (Plan 04); verification passed (0 blockers, 2 cosmetic warnings fixed)
Next action Milestone v1.0 complete — run /gsd:complete-milestone or start Phase 6 (Performance & Production Hardening)
Pending decisions None
Resume file None