DocuVault — Claude Code Guide
Project Overview
DocuVault is a multi-user SaaS document management platform built on FastAPI (Python) + Vue 3. It handles document upload, text extraction (PDF/DOCX/image/text), AI-based topic classification, per-user isolated storage, folder organization, document sharing, and pluggable cloud storage backends (OneDrive, Google Drive, Nextcloud, WebDAV).
Current state: Brownfield — single-user app is functional. Active milestone: migrating to multi-user, adding auth, PostgreSQL + MinIO, and cloud storage.
Stack
- Backend: Python 3.12, FastAPI 0.136+, SQLAlchemy 2.0 async, psycopg v3, Alembic, MinIO SDK
- Frontend: Vue 3 (Options API), Pinia, Vue Router 4, Vite, Tailwind CSS
- Infrastructure: Docker Compose, PostgreSQL, MinIO (S3-compatible)
- Auth: PyJWT 2.12+, pwdlib[argon2], pyotp (TOTP), cryptography (Fernet/HKDF)
Key Architectural Rules
- JWT access token lives in Pinia memory only — never localStorage or sessionStorage
- Refresh token is an httpOnly; Secure; SameSite=Strict cookie — never accessible to JavaScript
- MinIO object keys are UUID-based (`{user_id}/{document_id}/{uuid4()}{ext}`); human filenames in DB only
- Cloud credentials encrypted with HKDF per-user key derivation; master key in env var only
- Quota enforced atomically: `UPDATE quotas SET used_bytes = used_bytes + $delta WHERE (used_bytes + $delta) <= limit_bytes RETURNING used_bytes`
- Admin endpoints never return document content, extracted text, or `credentials_enc`
- Every document/folder endpoint asserts `resource.user_id == current_user.id`
- All DB queries via ORM / parameterized statements; zero raw string interpolation
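The atomic quota rule above can be sketched as follows. This is a minimal illustration, not the project's implementation: it uses the standard-library `sqlite3` module as a stand-in for PostgreSQL so it runs anywhere, checks `rowcount` instead of `RETURNING`, and assumes a hypothetical `user_id` column and a `reserve_bytes` helper that do not appear in the guide.

```python
import sqlite3


def reserve_bytes(conn: sqlite3.Connection, user_id: str, delta: int) -> bool:
    """Add `delta` bytes to a user's usage only if the limit still holds.

    The limit check lives in the WHERE clause of the same UPDATE, so there is
    no separate read step that another request could race against.
    """
    cur = conn.execute(
        "UPDATE quotas SET used_bytes = used_bytes + ? "
        "WHERE user_id = ? AND used_bytes + ? <= limit_bytes",
        (delta, user_id, delta),
    )
    conn.commit()
    # Zero rows updated means the upload would exceed the quota.
    return cur.rowcount == 1


# Hypothetical schema, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE quotas ("
    "user_id TEXT PRIMARY KEY, "
    "used_bytes INTEGER NOT NULL, "
    "limit_bytes INTEGER NOT NULL)"
)
conn.execute("INSERT INTO quotas VALUES ('u1', 0, 100)")

first = reserve_bytes(conn, "u1", 80)   # fits within the 100-byte limit
second = reserve_bytes(conn, "u1", 30)  # would reach 110, so it is rejected
used = conn.execute(
    "SELECT used_bytes FROM quotas WHERE user_id = 'u1'"
).fetchone()[0]
```

On PostgreSQL the same statement would end with `RETURNING used_bytes`, and an empty result set would signal the rejection.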
Code Standards (Non-Negotiable)
Core principle
Things that look the same to the user are the same in code. Local file navigation and cloud file navigation share one component. Sidebar folder trees and cloud trees share one component. Format helpers exist once. If you are about to write the same logic a second time, extract it first.
Backend: shared module map
Before adding a helper, check if it belongs in an existing shared module:
| Module | What lives here |
|---|---|
| `backend/deps/utils.py` | `get_client_ip(request)`, `parse_uuid(value)`: request-parsing helpers used across all routers |
| `backend/storage/exceptions.py` | `CloudConnectionError`: single canonical definition; all files import from here |
| `backend/ai/utils.py` | `strip_code_fences`, `parse_classification`, `parse_suggestions`: AI response parsing shared by all providers |
| `backend/services/auth.py` | `validate_password_strength(password)`: raises `ValueError`; routers catch and re-raise as `HTTPException` |
Rules:
- No router may define `_ip()`, `_get_ip()`, or any other local variant of `get_client_ip`. Import from `deps.utils`.
- No router may define its own `CloudConnectionError`. Import from `storage.exceptions`.
- No AI provider may define its own `_strip_code_fences` or `_parse_*`. Import from `ai.utils`.
- No API file may define `_validate_password_strength`. Import from `services.auth`.
- Service layer raises `ValueError` (or domain exceptions), never `HTTPException`. Only the router layer raises `HTTPException`.
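The last rule (services raise `ValueError`, routers translate) might look like the sketch below. To keep it runnable without FastAPI, `HTTPError` is a local stand-in for `fastapi.HTTPException`; the password policy (`MIN_LENGTH`, the character checks), the 422 status code, and `register_endpoint` are hypothetical and only illustrate the layering.

```python
class HTTPError(Exception):
    """Stand-in for fastapi.HTTPException so the sketch has no dependencies."""

    def __init__(self, status_code: int, detail: str) -> None:
        super().__init__(detail)
        self.status_code = status_code
        self.detail = detail


MIN_LENGTH = 12  # hypothetical policy value, not taken from the project


def validate_password_strength(password: str) -> None:
    """Service layer: knows nothing about HTTP, signals problems via ValueError."""
    if len(password) < MIN_LENGTH:
        raise ValueError(f"Password must be at least {MIN_LENGTH} characters")
    if password.isalpha() or password.isdigit():
        raise ValueError("Password must mix letters with digits or symbols")


def register_endpoint(password: str) -> dict:
    """Router layer: the only place where the error becomes an HTTP response."""
    try:
        validate_password_strength(password)
    except ValueError as exc:
        raise HTTPError(status_code=422, detail=str(exc)) from exc
    return {"status": "ok"}
```

Keeping the service free of HTTP types means the same validator can be reused from a CLI or a test without importing the web framework.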
Frontend: shared module map
| Module | What lives here |
|---|---|
| `src/utils/formatters.js` | `formatDate`, `formatSize`, `providerColor`, `providerBg`, `providerLabel` |
| `src/components/ui/TreeItem.vue` | Generic expand/collapse tree node; all sidebar tree items wrap this |
| `src/components/storage/StorageBrowser.vue` | Unified file browser grid; used by both `FileManagerView` and `CloudFolderView` |
Rules:
- No component may define its own `formatDate` or `formatSize`. Always import from `utils/formatters.js`.
- No component may define its own `providerColor` or `providerBg`. Always import from `utils/formatters.js`.
- No new tree sidebar component may implement its own expand/collapse state. It must wrap `TreeItem.vue`.
- `StorageBrowser.vue` is the single file browser. Do not create a parallel file grid anywhere.
- `FileManagerView` and `CloudFolderView` are thin data-providers: they feed props into `StorageBrowser` and handle emitted events. They contain no layout or grid logic of their own.
Component architecture
View (thin data-provider)
└── Smart component (StorageBrowser, AdminUsersTab, etc.)
└── Dumb/presentational components (DocumentCard, FolderTreeItem, etc.)
- Views own stores and route params. They pass data down as props and handle emitted events.
- Smart components own layout, interactions, and internal state. They emit events upward; they do not call stores directly (exception: read-only lookups like topic color).
- Presentational components receive everything as props and emit actions.
- Props that are passed from parent to child are never mutated with `v-model`; use `:model-value` + `@update:modelValue` and emit upward.
No dead code
- Files with no active route and no active import are deleted immediately: not commented out, not kept "just in case".
- `HomeView.vue` and `FolderView.vue` are deleted. Do not recreate them.
- Any file that becomes unreferenced after a refactor must be deleted in the same commit.
Duplication checklist (run before writing new code)
- Does a shared utility already exist for this logic? (Check the module map above.)
- Does this component already exist? (Search `components/` before creating.)
- Is this logic already in a Pinia store? (Check `stores/` before duplicating in a view.)
- If none of the above: create the shared module first, then use it everywhere that needs it.
GSD Workflow
This project uses the GSD (Get Shit Done) planning workflow. Planning artifacts live in .planning/.
Key files
| File | Purpose |
|---|---|
| `.planning/ROADMAP.md` | 5-phase plan with success criteria |
| `.planning/REQUIREMENTS.md` | 54 v1 requirements with REQ-IDs |
| `.planning/STATE.md` | Current phase and completion status |
| `.planning/PROJECT.md` | Project context and key decisions |
| `.planning/research/SUMMARY.md` | Domain research synthesis |
| `.planning/codebase/` | Codebase map (architecture, stack, concerns) |
Commands
/gsd:discuss-phase N — gather context before planning a phase
/gsd:plan-phase N — create execution plan for a phase
/gsd:execute-phase N — execute the plan
/gsd:verify-work N — verify phase deliverables against requirements
/gsd:progress — check status and advance workflow
Current phase: Not started — run /gsd:discuss-phase 1 to begin
Development Setup
# Start all services
docker compose up
# Backend only (local dev)
cd backend && uvicorn main:app --reload
# Frontend only (local dev)
cd frontend && npm run dev
# Run backend tests
cd backend && pytest -v
Testing Protocol (Non-Negotiable)
Every feature, function, and bug fix requires tests. No phase or plan may advance until all tests pass.
Rules
- Coverage: Every new function, endpoint, and UI component must have at least one test — unit for isolated logic, integration for DB/service boundaries, E2E for critical user flows
- Gate: `pytest -v` (backend) and frontend test suite must pass with zero failures before marking a plan complete or advancing to the next phase
- Bug fixes: Must fix the root cause, not work around it. Maximum 50 lines of changed code per fix. If a fix requires more, it is scope-creep and must be broken into a separate plan
- No workarounds: `# type: ignore`, `noqa`, skipping a test, or adding a `try/except` that silently swallows an error are prohibited as bug fixes
- Regression: Any time a bug is fixed, a test must be added that would have caught it
Test types per layer
| Layer | Required test type |
|---|---|
| Service / business logic | Unit tests with mocked dependencies |
| DB queries / ORM | Integration tests against real PostgreSQL (not SQLite for quota/UUID tests) |
| API endpoints | httpx.AsyncClient integration tests with real DB fixtures |
| Auth flows | Full round-trip tests (register → login → TOTP → refresh → revoke) |
| Security invariants | Dedicated negative tests (wrong owner → 403/404, admin → 403, replay → 401) |
| Frontend | Vitest unit tests for stores/composables; Playwright or Cypress for critical flows |
Security Protocol (Non-Negotiable)
A dedicated security agent runs after every plan execution and before any phase is marked complete. This agent has full read/write/edit access to the entire codebase and is the final gate before advancement.
Security agent mandate
The security agent must check — and fix — every class of vulnerability listed below. It may not flag and defer; it must resolve or escalate blocking issues.
OWASP Top 10 + auth-specific
| Threat | Required mitigation |
|---|---|
| SQL injection | All queries via ORM or parameterized statements — zero raw string interpolation |
| XSS | CSP headers, httpOnly cookies, no innerHTML with user data, Vue template auto-escaping never bypassed |
| CSRF | SameSite=Strict cookie + Origin/Referer header validation on all state-changing endpoints |
| Broken auth | Short-lived JWT (≤15 min), refresh rotation, family revocation on reuse, constant-time comparison |
| IDOR / broken access control | Every resource endpoint asserts resource.user_id == current_user.id; admin blocked from document content |
| Security misconfiguration | No debug mode in production, no stack traces in API responses, no default credentials |
| Sensitive data exposure | Passwords hashed Argon2id, PII fields encrypted at rest, credentials_enc never in API responses |
| Insecure deserialization | No pickle, no eval, no dynamic __import__; all user-supplied data validated via Pydantic |
| Vulnerable dependencies | pip-audit / npm audit run; critical/high CVEs blocked |
| Insufficient logging | All auth events, quota violations, and admin actions written to audit log without document content |
Advanced threats
- Path traversal: All file path construction uses `os.path.basename` / `pathlib`; never joins user-supplied strings directly
- SSRF: All outbound HTTP (HIBP, cloud OAuth) via an allowlisted client; user-supplied URLs for WebDAV/Nextcloud must pass hostname allowlist
- Timing attacks: `hmac.compare_digest` / `secrets.compare_digest` for all token, TOTP, and backup-code comparison; no `==`
- Race conditions / TOCTOU: Quota enforcement via single atomic `UPDATE … RETURNING`; never read-then-write in Python
- Mass assignment: Pydantic models explicitly declare every accepted field; no `**kwargs` passthrough from request body to ORM
- Privilege escalation: `get_regular_user` and `get_current_admin` deps checked on every endpoint; no role elevation path exists
- Token replay: JTI stored in DB; used TOTP codes invalidated within the 90 s window; refresh token family revocation on reuse
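As one illustration of the path-traversal rule, the sketch below builds a storage key in the `{user_id}/{document_id}/{uuid4()}{ext}` layout while taking nothing from the uploaded filename except a sanitized extension. The helper names `safe_extension` and `build_object_key` are hypothetical, and the "alphanumeric suffix only" policy is an assumption.

```python
import uuid
from pathlib import PureWindowsPath


def safe_extension(filename: str) -> str:
    """Return a lowercase extension such as '.pdf', or '' if none is usable.

    PureWindowsPath treats both '/' and '\\' as separators, so taking `.name`
    discards directory components written in either style.
    """
    name = PureWindowsPath(filename).name
    suffix = PureWindowsPath(name).suffix.lower()
    # Assumed policy: accept only plain alphanumeric extensions.
    return suffix if suffix[1:].isalnum() else ""


def build_object_key(user_id: uuid.UUID, document_id: uuid.UUID, filename: str) -> str:
    """Build the object key; the user-supplied name never becomes a path part."""
    return f"{user_id}/{document_id}/{uuid.uuid4()}{safe_extension(filename)}"


user_id = uuid.uuid4()
document_id = uuid.uuid4()
key = build_object_key(user_id, document_id, "..\\..\\etc/passwd.PDF")
```

The human-readable filename would still be stored in the database, as the architectural rules require; only the key is derived this way.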
Zero-day / defense-in-depth
- Minimal attack surface: Every endpoint that is not needed is absent; no commented-out code, no `TODO: remove` endpoints left alive
- Principle of least privilege: `docuvault_app` DB role has DML only; `docuvault_migrate` has DDL; MinIO bucket policy denies public access
- Secrets in env only: No credentials, API keys, or signing secrets in code, commits, or `.env` files checked in; `.gitignore` enforces this
- Dependency pinning: `requirements.txt` and `package-lock.json` pin exact versions; no floating `>=` for security-critical packages (PyJWT, pwdlib, cryptography)
- Container hardening: Non-root user in Dockerfile, read-only filesystem where possible, no `--privileged` containers
- Header hardening: `X-Content-Type-Options: nosniff`, `X-Frame-Options: DENY`, `Referrer-Policy: strict-origin-when-cross-origin` on every response
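The header-hardening item could be centralized in one place so no endpoint can forget it. The sketch below is framework-agnostic and works on a plain dict; `SECURITY_HEADERS` and `apply_security_headers` are hypothetical names, and in a FastAPI app the same merge would most likely run inside a response middleware.

```python
# The three headers listed in the hardening rule, defined exactly once.
SECURITY_HEADERS = {
    "X-Content-Type-Options": "nosniff",
    "X-Frame-Options": "DENY",
    "Referrer-Policy": "strict-origin-when-cross-origin",
}


def apply_security_headers(headers: dict) -> dict:
    """Return a copy of `headers` with the hardening headers enforced.

    The hardening values are applied last, so a handler cannot weaken them
    by setting its own value for the same header name.
    """
    merged = dict(headers)
    merged.update(SECURITY_HEADERS)
    return merged


response_headers = apply_security_headers(
    {"Content-Type": "application/json", "X-Frame-Options": "SAMEORIGIN"}
)
```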
Database user table encryption
Sensitive user PII (email, display name) must be encrypted at the application layer before storage:
- Encryption: AES-256-GCM via `cryptography` library, per-row nonce, master key from env var
- Key derivation: HKDF-SHA256 with `purpose=b"user-pii"` salt; same pattern as cloud credentials
- Admin queries: never return plaintext PII for users other than the requesting user
- Indexing: email lookup uses a deterministic HMAC-SHA256 index (`email_hmac` column); the encrypted column is never used for WHERE clauses
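The deterministic index from the last bullet can be sketched with the standard library alone. The function name `email_index`, the trim-and-lowercase normalization, and the demo key are assumptions; the guide only states that the `email_hmac` column holds an HMAC-SHA256 value.

```python
import hashlib
import hmac


def email_index(email: str, index_key: bytes) -> str:
    """Compute a deterministic lookup value for the `email_hmac` column.

    The same email and key always produce the same digest, so equality
    lookups work without decrypting the stored address. Normalization
    (trim + lowercase) is an assumed policy, not taken from the guide.
    """
    normalized = email.strip().lower()
    return hmac.new(index_key, normalized.encode("utf-8"), hashlib.sha256).hexdigest()


# Demo key only; a real key would be derived from the master key in the environment.
demo_key = b"\x01" * 32
stored = email_index("Alice@Example.com ", demo_key)
lookup = email_index("alice@example.com", demo_key)
```

A login query would then filter on `email_hmac = :lookup` and decrypt the AES-256-GCM column only for the matching row.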
Login token hardening (state of the art)
- Algorithm: ES256 (ECDSA P-256) — asymmetric; the private key signs, the public key verifies; a leaked public key cannot forge tokens
- Access token TTL: 15 minutes maximum
- Refresh token: 30-day httpOnly Strict cookie; rotated on every use; reuse of a rotated token revokes entire family and fires a security alert email
- JTI claim: Every token has a unique `jti`; revoked JTIs stored in Redis with TTL matching the token lifetime
- Token binding: Access token embeds a `fgp` (fingerprint) claim = HMAC of `User-Agent + Accept-Language`; backend validates on every request
- Rotation on privilege change: Password change, TOTP enroll/revoke, and account deactivation immediately revoke all active sessions
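The token-binding item might be implemented along these lines. This is a sketch under assumptions: the helper names, the newline separator between the two header values, and the hex encoding of the claim are all hypothetical; the guide only specifies an HMAC over `User-Agent` and `Accept-Language`.

```python
import hashlib
import hmac


def fingerprint(user_agent: str, accept_language: str, key: bytes) -> str:
    """Compute the value stored in the token's `fgp` claim."""
    # A separator keeps ("ab", "c") and ("a", "bc") from colliding.
    material = f"{user_agent}\n{accept_language}".encode("utf-8")
    return hmac.new(key, material, hashlib.sha256).hexdigest()


def fingerprint_matches(claim: str, user_agent: str, accept_language: str, key: bytes) -> bool:
    """Recompute the fingerprint from the request and compare in constant time."""
    expected = fingerprint(user_agent, accept_language, key)
    # Compare as bytes: compare_digest rejects non-ASCII str arguments.
    return hmac.compare_digest(claim.encode("utf-8"), expected.encode("utf-8"))


demo_key = b"\x07" * 32  # demo value only
claim = fingerprint("Mozilla/5.0", "en-US,en;q=0.9", demo_key)
same_client = fingerprint_matches(claim, "Mozilla/5.0", "en-US,en;q=0.9", demo_key)
other_client = fingerprint_matches(claim, "curl/8.5.0", "en-US,en;q=0.9", demo_key)
```

A stolen access token replayed from a client with different headers would then fail validation, although an attacker who can also copy the headers is not stopped by this check alone.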
Security gate checklist (must all pass before phase advances)
- `bandit -r backend/`: zero HIGH severity findings
- `pip-audit`: zero critical/high CVEs
- `npm audit --audit-level=high`: zero high/critical vulnerabilities
- All security-invariant tests pass (wrong owner, admin block, token replay, CSRF)
- No new `# noqa: S` suppressions without a documented justification comment
- Admin endpoints verified to never return `password_hash`, `credentials_enc`, or document content
- No hardcoded secrets detected by `git secrets` / `trufflehog`
Security Requirements (Non-Negotiable)
- Rate limiting on all auth endpoints (login, register, password reset, TOTP)
- Constant-time comparison for all token/code verification
- CSRF protection on all state-changing endpoints
- Content-Security-Policy headers on all responses
- HaveIBeenPwned API check on registration and password change
- TOTP replay prevention (mark used codes in DB within validity window)
- Refresh token family revocation on token reuse detection
- Admin impersonation is an explicit architectural exclusion — no endpoint or code path may exist
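The constant-time comparison requirement can be illustrated with backup codes. The sketch assumes codes are stored as SHA-256 digests and checked through a hypothetical `verify_backup_code` helper; the storage format is not specified in this guide.

```python
import hashlib
import hmac
import secrets


def hash_code(code: str) -> bytes:
    """Digest a backup code so only the hash needs to be stored (assumed format)."""
    return hashlib.sha256(code.encode("utf-8")).digest()


def verify_backup_code(submitted: str, stored_hashes: list[bytes]) -> bool:
    """Check a submitted code against every stored hash without using `==`."""
    candidate = hash_code(submitted)
    matched = False
    for stored in stored_hashes:
        # No early exit: every entry is compared, so the response time does
        # not depend on which position (if any) matched.
        matched |= hmac.compare_digest(candidate, stored)
    return matched


issued = [secrets.token_hex(5) for _ in range(3)]  # demo codes
stored_hashes = [hash_code(code) for code in issued]
accepted = verify_backup_code(issued[1], stored_hashes)
rejected = verify_backup_code("not-a-real-code", stored_hashes)
```

After a successful match the used code would also need to be removed or marked as consumed, in line with the replay-prevention requirement above.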