docs(codebase): refresh codebase map after Phase 06.2 completion
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
This commit is contained in:
+397
-69
@@ -1,87 +1,415 @@
|
||||
# CONCERNS — document-scanner
|
||||
# Codebase Concerns
|
||||
|
||||
_Last updated: 2026-05-21_
|
||||
|
||||
## Summary
|
||||
|
||||
The codebase is a well-structured local-first prototype. The main concerns are security issues that matter if exposed beyond localhost (open CORS, no file validation, plain-text key storage), several blocking I/O calls in async handlers, and a handful of code duplication issues in the AI provider layer. Overall health is good for a local dev tool; requires hardening before any networked deployment.
|
||||
**Analysis Date:** 2026-06-02
|
||||
|
||||
---
|
||||
|
||||
## Concerns by Severity
|
||||
## Security Concerns
|
||||
|
||||
### HIGH
|
||||
### JWT Algorithm Downgrade: HS256 Instead of ES256
|
||||
|
||||
**1. File type validation is defined but never enforced**
|
||||
`ALLOWED_MIME_TYPES` is defined in `backend/api/documents.py` but the upload handler never checks it — any file type is accepted. An attacker could upload executable files or crafted archives.
|
||||
|
||||
**2. No file size limit on uploads**
|
||||
The entire uploaded file is read before any cap is applied. A large file could exhaust memory or disk. No `MAX_UPLOAD_SIZE` check exists at the HTTP boundary.
|
||||
|
||||
**3. API keys stored in plain-text JSON**
|
||||
`backend/data/settings.json` stores API keys in plaintext. The volume mount in `docker-compose.yml` (`./backend/data:/app/data`) means any process with Docker access can read them. Masking only applies to API responses, not to disk.
|
||||
|
||||
**4. CORS fully open**
|
||||
`allow_origins=["*"]` in `main.py` means any website can make cross-origin requests to the API, including with credentials if ever added.
|
||||
|
||||
**5. Docker Compose mounts entire backend source as writable volume**
|
||||
`./backend:/app` gives the container write access to the host source tree. A path traversal or code execution bug in the app could overwrite source files.
|
||||
- **Risk:** CLAUDE.md specifies ES256 (asymmetric ECDSA P-256) as the required algorithm, but the implementation uses HS256 (symmetric HMAC-SHA256).
|
||||
- **Files:** `backend/services/auth.py` lines 99, 109, 132, 141
|
||||
- **Impact:** A leaked `SECRET_KEY` allows arbitrary token forgery. With HS256 any party that has the secret can forge access tokens, impersonate admin users, and bypass all auth checks. ES256 would require the private key for forgery while the public key could safely be distributed for verification.
|
||||
- **Fix approach:** Generate an ECDSA P-256 key pair, store the private key in an env var (`JWT_PRIVATE_KEY`), store the public key as `JWT_PUBLIC_KEY`. Update `create_access_token` to use `algorithm="ES256"` and `decode_access_token` / `decode_password_reset_token` to use the public key. Rotate all active refresh tokens after deploy.
|
||||
- **Priority:** HIGH
|
||||
|
||||
---
|
||||
|
||||
### MEDIUM
|
||||
### No JTI Claim and No JTI Revocation in Redis
|
||||
|
||||
**6. Blocking I/O in async FastAPI handlers**
|
||||
`storage.py` uses synchronous file reads/writes and `filelock` blocking calls inside `async def` endpoints. This blocks the uvicorn event loop during every request. Should use `asyncio.to_thread()` or `aiofiles` (which is already in requirements but unused).
|
||||
|
||||
**7. Topic rename does not cascade to documents**
|
||||
Deleting a topic removes it from document metadata, but renaming is not implemented — there is no rename endpoint. Users have no way to rename a topic without losing document associations.
|
||||
|
||||
**8. `list_metadata` loads all documents before filtering**
|
||||
`storage.list_metadata()` reads all metadata JSON files on every list request. No pagination at the storage layer — O(N) disk reads per page request as the document count grows.
|
||||
|
||||
**9. `topic_doc_counts()` scans all metadata on every topic request**
|
||||
Every `GET /api/topics` call triggers a full scan of all metadata files to count documents per topic. Not cached; will degrade linearly.
|
||||
|
||||
**10. `MAX_AI_CHARS` duplicated across 3 files**
|
||||
The character truncation limit for AI input is duplicated as a magic constant in multiple provider files. The provider-level truncation is effectively dead code since `extractor.py` already truncates to `MAX_STORED_CHARS` (50,000).
|
||||
|
||||
**11. `_parse_classification` / `_parse_suggestions` duplicated between providers**
|
||||
`anthropic_provider.py` and `openai_provider.py` each define their own JSON parsing helpers for AI responses. `test_classifier.py` only imports from `openai_provider`, meaning the Anthropic variants are untested.
|
||||
|
||||
**12. `health_check()` makes real billed API calls**
|
||||
The "Test Connection" UI action calls `provider.health_check()`, which makes a real API call to Anthropic/OpenAI — incurring cost and latency every time the user tests connectivity. Should use a cheaper probe (e.g., list models endpoint or a cached status).
|
||||
- **Risk:** CLAUDE.md mandates JTI (JWT ID) in every access token stored in Redis for revocation, but the `create_access_token` function emits no `jti` claim and there is no check in `get_current_user`.
|
||||
- **Files:** `backend/services/auth.py` (create_access_token), `backend/deps/auth.py` (get_current_user)
|
||||
- **Impact:** Deactivated users can continue using valid access tokens until TTL expiry (up to 15 minutes). Password changes and account deactivations do not immediately invalidate active sessions (only refresh tokens are revoked — not the live access token in the client's Pinia store).
|
||||
- **Fix approach:** Add `jti=str(uuid.uuid4())` to the access token payload. In `get_current_user`, after successful decode, check `await redis.get(f"jti_revoked:{jti}")` and raise 401 if set. Add a `revoke_access_token(jti, ttl)` helper called from account deactivation and password change.
|
||||
- **Priority:** HIGH
|
||||
|
||||
---
|
||||
|
||||
### LOW
|
||||
### No Token Fingerprint / Token Binding
|
||||
|
||||
**13. `uvicorn --reload` hardcoded in docker-compose.yml**
|
||||
Hot-reload is hardcoded in the production compose file. There is no separate `docker-compose.prod.yml` or build-arg to disable it.
|
||||
|
||||
**14. Unused `shutil` import in `storage.py`**
|
||||
`import shutil` appears in `storage.py` but is never used.
|
||||
|
||||
**15. Topic IDs are 8-character UUID prefixes**
|
||||
`str(uuid.uuid4())[:8]` generates IDs with ~4 billion combinations — low collision risk for personal use but not safe at scale or for security-sensitive identifiers.
|
||||
|
||||
**16. `classify_document` request body uses raw `dict`, not a Pydantic model**
|
||||
The reclassify endpoint accepts an unvalidated `dict` body. Invalid input causes an unformatted 500 rather than a clean 422 validation error.
|
||||
|
||||
**17. No global frontend error handling**
|
||||
There is no Vue error boundary or global `window.onerror` / `app.config.errorHandler`. Failed API calls in stores may surface as silent failures or unhandled promise rejections.
|
||||
|
||||
**18. No document download endpoint**
|
||||
Uploaded files are stored in `data/uploads/` but there is no `GET /api/documents/:id/file` endpoint to retrieve the original binary. Files are effectively write-only through the UI.
|
||||
|
||||
**19. `aiofiles` in requirements but never used**
|
||||
`aiofiles>=23.2` is listed in `requirements.txt` but no code imports it. The blocking I/O concern (item 6) should use it.
|
||||
- **Risk:** CLAUDE.md requires a `fgp` (fingerprint) claim = HMAC of `User-Agent + Accept-Language`, validated on every request. This is absent.
|
||||
- **Files:** `backend/services/auth.py`, `backend/deps/auth.py`
|
||||
- **Impact:** Stolen access tokens can be replayed from any device/browser. Token binding would limit the window of a stolen token attack.
|
||||
- **Fix approach:** On login, compute `fgp = hmac.new(key, (user_agent + accept_lang).encode(), sha256).hexdigest()[:16]`. Embed in JWT payload. In `get_current_user`, recompute and compare with `hmac.compare_digest`.
|
||||
- **Priority:** MEDIUM
|
||||
|
||||
---
|
||||
|
||||
## Gaps / Unknowns
|
||||
### Password Change Does Not Revoke Active Sessions
|
||||
|
||||
- Production deployment path is undefined (no nginx, no TLS, no auth)
|
||||
- OCR language support for pytesseract is not configured (defaults to English only)
|
||||
- `suggest_topics` method on all providers is untested — unclear if it is used in the current UI flow
|
||||
- No backup or recovery strategy for `data/` volume
|
||||
- **Risk:** `POST /api/auth/change-password` updates `password_hash` and writes an audit log but never calls `revoke_all_refresh_tokens`. CLAUDE.md mandates "Password change… immediately revoke all active sessions."
|
||||
- **Files:** `backend/api/auth.py` lines 446–495
|
||||
- **Impact:** An attacker who has a valid refresh cookie can continue rotating tokens even after the account owner changes their password.
|
||||
- **Fix approach:** Add `await auth_service.revoke_all_refresh_tokens(session, current_user.id)` after the password hash update, before `session.commit()`, and also invalidate all JTIs for that user in Redis (once JTI is implemented).
|
||||
- **Priority:** HIGH
|
||||
|
||||
---
|
||||
|
||||
### TOTP Disable Does Not Revoke Active Sessions
|
||||
|
||||
- **Risk:** `DELETE /api/auth/totp` clears the TOTP secret and disables TOTP but does not call `revoke_all_refresh_tokens`. CLAUDE.md mandates revocation on "TOTP enroll/revoke."
|
||||
- **Files:** `backend/api/auth.py` lines 587–616
|
||||
- **Impact:** An attacker who triggered TOTP removal (via CSRF or compromised session) and has a refresh token continues to operate as an authenticated user with no second factor.
|
||||
- **Fix approach:** Add `await auth_service.revoke_all_refresh_tokens(session, current_user.id)` in `disable_totp` before `session.commit()`.
|
||||
- **Priority:** HIGH
|
||||
|
||||
---
|
||||
|
||||
### Health Endpoint Exposes Internal Error Details Without Auth
|
||||
|
||||
- **Risk:** `GET /health` returns full Python exception class names and messages (e.g. `"error: OperationalError: (psycopg.OperationalError) …"`) with no authentication requirement. The comment at line 144 (T-01-05-03) acknowledges this but defers the fix to "Phase 2."
|
||||
- **Files:** `backend/main.py` lines 136–167
|
||||
- **Impact:** Exposes DB driver versions, hostnames, and connection string fragments to unauthenticated callers. Information useful for targeted attacks.
|
||||
- **Fix approach:** Replace `f"error: {type(e).__name__}: {e}"` with `"error"` in non-debug mode. Log the detail server-side only. Optionally require admin Bearer token for the detailed form.
|
||||
- **Priority:** MEDIUM
|
||||
|
||||
---
|
||||
|
||||
### Default Secrets Shipped in Code
|
||||
|
||||
- **Risk:** `backend/config.py` hardcodes `secret_key = "CHANGEME"`, `cloud_creds_key = "CHANGEME-32-bytes-padded!!"`, and `minio_secret_key = "changeme_minio_app"` as Pydantic field defaults.
|
||||
- **Files:** `backend/config.py` lines 31, 61, 21
|
||||
- **Impact:** If deployed without overriding env vars, production tokens are signed with the known `CHANGEME` key, all cloud credentials can be decrypted by anyone with the source code, and MinIO uses a known password. Critical misconfiguration vector.
|
||||
- **Fix approach:** Change defaults to `""` and add a startup validator (`@model_validator(mode="after")`) that raises `ValueError` when these fields equal their placeholder values in production (`DEBUG=false`). Log a WARNING in dev if the default is detected.
|
||||
- **Priority:** HIGH
|
||||
|
||||
---
|
||||
|
||||
### `default_storage_backend` Not Validated Against Allowlist
|
||||
|
||||
- **Risk:** `PATCH /api/users/me/default-storage` accepts `body.backend` as a free string and writes it directly to the DB with no allowlist validation.
|
||||
- **Files:** `backend/api/cloud.py` lines 927–946
|
||||
- **Impact:** A user can set `default_storage_backend` to any arbitrary string. A future code path using it as a routing key could allow bypassing the `_CLOUD_PROVIDERS` allowlist.
|
||||
- **Fix approach:** Validate `body.backend in {"minio", "google_drive", "onedrive", "nextcloud", "webdav"}` before the DB write. Use a `Literal` type or `@field_validator` on `DefaultStorageRequest`.
|
||||
- **Priority:** MEDIUM
|
||||
|
||||
---
|
||||
|
||||
### `X-Forwarded-For` Trusted for IP Rate Limiting Without Proxy Enforcement
|
||||
|
||||
- **Risk:** The IP-level rate limiter (`slowapi`) uses `get_remote_address` which reads `X-Forwarded-For`. Without a trusted reverse proxy normalizing this header, an attacker can bypass the IP rate limit.
|
||||
- **Files:** `backend/api/auth.py` line 44; `backend/deps/utils.py`
|
||||
- **Impact:** Attackers can bypass the 10 req/min IP-level limit on login, register, and TOTP endpoints by spoofing the forwarded IP on each request.
|
||||
- **Fix approach:** In Docker Compose, front the backend with nginx configured to set `X-Forwarded-For` from `$remote_addr`, stripping any client-supplied value. Document this as a mandatory production requirement.
|
||||
- **Priority:** HIGH
|
||||
|
||||
---
|
||||
|
||||
### Email HTML Body Uses Unsanitized Server-Supplied Link
|
||||
|
||||
- **Risk:** `send_password_reset_email` builds an HTML body via f-string with `reset_link` directly in an `<a href='…'>` attribute without HTML-escaping.
|
||||
- **Files:** `backend/services/email.py` line 47; line ~105 (security alert email)
|
||||
- **Impact:** If `reset_link` contains a single-quote (possible under certain URL encoding), the HTML attribute breaks. Low-severity HTML injection risk that violates defense-in-depth.
|
||||
- **Fix approach:** Use `html.escape(reset_link, quote=True)` when embedding the link in the HTML body.
|
||||
- **Priority:** LOW
|
||||
|
||||
---
|
||||
|
||||
### Audit Log Written After Commit in `delete_folder`
|
||||
|
||||
- **Risk:** In `DELETE /api/folders/{folder_id}`, `session.commit()` is called at line 424 and `write_audit_log()` is called at line 426 — after the commit, in a separate implicit transaction.
|
||||
- **Files:** `backend/api/folders.py` lines 424–435
|
||||
- **Impact:** If the audit log write fails (DB error, constraint violation), the folder is already deleted with no audit record. Inconsistent with the WR-08 pattern used by `delete_document` (`auto_commit=False`).
|
||||
- **Fix approach:** Move `write_audit_log()` before `session.commit()`, following the pattern used in `api/documents.py::delete_document`.
|
||||
- **Priority:** MEDIUM
|
||||
|
||||
---
|
||||
|
||||
### OAuth Callback Error Redirect May Leak Internal Exception Details
|
||||
|
||||
- **Risk:** In `oauth_callback`, the `except Exception as exc` block redirects to `frontend_url/settings?cloud_error={urllib.parse.quote(str(exc))}`. Exception strings from google-auth-oauthlib or msal may include OAuth client secrets, state values, or internal URL fragments.
|
||||
- **Files:** `backend/api/cloud.py` lines 541–546
|
||||
- **Impact:** Exception details appear in the browser URL bar, referrer headers, browser history, and server access logs.
|
||||
- **Fix approach:** Map exception types to user-safe generic messages (`"auth_failed"`, `"connection_error"`). Log the real exception server-side at ERROR level. Only pass an opaque error code in the redirect.
|
||||
- **Priority:** MEDIUM
|
||||
|
||||
---
|
||||
|
||||
## Performance Concerns
|
||||
|
||||
### N+1 Query Pattern in `list_metadata` / `list_documents`
|
||||
|
||||
- **Risk:** `services/storage.py::list_metadata` loads all documents then calls `_load_topic_names(session, doc.id)` in a Python loop — one DB round-trip per document. The same pattern repeats in the `list_documents` handler's non-legacy code path.
|
||||
- **Files:** `backend/services/storage.py` lines 136–139; `backend/api/documents.py` lines 501–506
|
||||
- **Impact:** For a user with 100 documents, a single list request issues 101 DB queries. At 1000 documents, 1001 queries. Response time degrades linearly as the library grows.
|
||||
- **Fix approach:** Replace with a single JOIN query using PostgreSQL's `array_agg(t.name)` grouped by document. Or use a subquery fetching all document-topic associations for the user in one query and merging in Python.
|
||||
- **Priority:** HIGH
|
||||
|
||||
---
|
||||
|
||||
### Entire File Loaded into Memory for Download and Task Processing
|
||||
|
||||
- **Risk:** `GET /api/documents/{id}/content` calls `await storage_backend.get_object(…)` which returns full `bytes`, loads them into a list, and returns `StreamingResponse(iter([file_bytes]))`. The Celery extraction task also buffers the full file.
|
||||
- **Files:** `backend/api/documents.py` lines 792, 827–831; `backend/tasks/document_tasks.py` line 74
|
||||
- **Impact:** A 100 MB file consumes 100 MB of heap per concurrent request. With 10 simultaneous downloads, the worker needs 1 GB just for file buffers. The 100 MB quota mitigates this today but does not scale.
|
||||
- **Fix approach:** For MinIO, return presigned GET URLs with short TTL instead of proxying through FastAPI. For cloud backends, pipe the provider HTTP response stream directly. For Celery extraction, stream text extraction from bytes in chunks.
|
||||
- **Priority:** MEDIUM
|
||||
|
||||
---
|
||||
|
||||
### No Upload Size Pre-Validation
|
||||
|
||||
- **Risk:** `POST /api/documents/upload` (cloud path) reads the entire file via `await file.read()` before any quota or size check. FastAPI has no global `max_upload_size` configured.
|
||||
- **Files:** `backend/api/documents.py` line 207
|
||||
- **Impact:** A malicious user can upload a multi-gigabyte file, exhausting FastAPI worker memory before the quota check fires.
|
||||
- **Fix approach:** Check `Content-Length` header at endpoint entry; reject with 413 if above a configurable `MAX_UPLOAD_BYTES` limit. Add a `--limit-max-requests` or body-size middleware at the uvicorn/nginx level.
|
||||
- **Priority:** MEDIUM
|
||||
|
||||
---
|
||||
|
||||
### `revoke_all_refresh_tokens` Issues One UPDATE per Token
|
||||
|
||||
- **Risk:** `services/auth.py::revoke_all_refresh_tokens` loads all active refresh token rows into Python, then marks each `revoked=True` individually via ORM, issuing one UPDATE statement per token.
|
||||
- **Files:** `backend/services/auth.py` lines 218–237
|
||||
- **Impact:** A user with many active sessions (e.g. 50 devices) causes 50 individual UPDATE statements on sign-out-all. Could be replaced with a single bulk UPDATE.
|
||||
- **Fix approach:** Replace with `UPDATE refresh_tokens SET revoked = true WHERE user_id = :uid AND revoked = false` and count affected rows via `result.rowcount`.
|
||||
- **Priority:** LOW
|
||||
|
||||
---
|
||||
|
||||
### FTS Falls Back Silently on Any Exception
|
||||
|
||||
- **Risk:** The FTS code path in `list_documents` wraps the FTS query in `except Exception:` and falls back to an unfiltered query.
|
||||
- **Files:** `backend/api/documents.py` lines 486–489
|
||||
- **Impact:** Any PostgreSQL error causes silent fallback — the user sees all their documents when they searched for a term, with no indication of failure.
|
||||
- **Fix approach:** Narrow the catch to `sqlalchemy.exc.OperationalError` (for SQLite compat in tests only) and log all other exceptions at ERROR level before re-raising.
|
||||
- **Priority:** MEDIUM
|
||||
|
||||
---
|
||||
|
||||
## Reliability Concerns
|
||||
|
||||
### Email Queue Worker Missing From Docker Compose
|
||||
|
||||
- **Risk:** Celery routes email tasks to the `email` queue but `docker-compose.yml` defines only one Celery worker consuming `-Q documents`. No worker processes the `email` queue.
|
||||
- **Files:** `backend/celery_app.py` line 36; `docker-compose.yml` line 96
|
||||
- **Impact:** Password reset emails, security alert emails (refresh token reuse detection), and backup code emails are silently enqueued but never delivered. Callers receive 202 but emails never arrive.
|
||||
- **Fix approach:** Add a `celery-worker-email` service in `docker-compose.yml` consuming `-Q email`, or update the existing worker command to `-Q documents,email`.
|
||||
- **Priority:** HIGH
|
||||
|
||||
---
|
||||
|
||||
### `documents.updated_at` Not Auto-Updated on Row Changes
|
||||
|
||||
- **Risk:** `Document.updated_at` is declared with `server_default=func.now()` but no `onupdate` trigger. When `extracted_text`, `status`, or `filename` is changed, `updated_at` stays as the creation timestamp.
|
||||
- **Files:** `backend/db/models.py` lines 192–194
|
||||
- **Impact:** `classified_at` in `_doc_to_dict` is computed from `doc.updated_at` when `status == "classified"` — if `updated_at` is stale, the displayed timestamp is incorrect. Sort-by-date after reclassification is also wrong.
|
||||
- **Fix approach:** Add a PostgreSQL `BEFORE UPDATE` trigger that sets `updated_at = now()`, or add `onupdate=func.now()` to the mapped column (requires SQLAlchemy ORM event to fire at update time).
|
||||
- **Priority:** MEDIUM
|
||||
|
||||
---
|
||||
|
||||
### Celery Task Result Backend Accumulates Without Expiry
|
||||
|
||||
- **Risk:** Celery is configured to use Redis as result backend with no `result_expires` setting. Task results accumulate in Redis indefinitely.
|
||||
- **Files:** `backend/celery_app.py` lines 23–24
|
||||
- **Impact:** Redis memory grows unboundedly over time, potentially causing OOM which would also break rate limiting and TOTP replay prevention.
|
||||
- **Fix approach:** Add `celery_app.conf.result_expires = 3600` or disable the result backend entirely since no code reads task results.
|
||||
- **Priority:** MEDIUM
|
||||
|
||||
---
|
||||
|
||||
### Breadcrumb Builder in `get_folder` Has No Depth Limit
|
||||
|
||||
- **Risk:** The breadcrumb builder in `GET /api/folders/{folder_id}` walks up the parent chain iteratively with a `visited` set but no maximum depth cap.
|
||||
- **Files:** `backend/api/folders.py` lines 234–247
|
||||
- **Impact:** With a deeply nested folder tree (e.g. 200 levels of nesting), the loop issues 200 sequential DB round-trips before terminating.
|
||||
- **Fix approach:** Add `if len(crumbs) >= 20: break` to cap at a reasonable depth.
|
||||
- **Priority:** LOW
|
||||
|
||||
---
|
||||
|
||||
## Code Quality Concerns
|
||||
|
||||
### Duplicate Inline IP Extraction (Not Using `get_client_ip`)
|
||||
|
||||
- **Risk:** Several endpoints extract the client IP inline instead of using `deps/utils.py::get_client_ip()`.
|
||||
- **Files:** `backend/api/documents.py` lines 269–271, 376; `backend/api/cloud.py` lines 624, 753
|
||||
- **Impact:** If the trusted-proxy logic changes, all inline copies must be updated individually.
|
||||
- **Fix approach:** Replace all inline `request.headers.get("X-Forwarded-For") or request.client.host` with `get_client_ip(request)`.
|
||||
- **Priority:** LOW
|
||||
|
||||
---
|
||||
|
||||
### `Document.status` Is an Unconstrained String Column
|
||||
|
||||
- **Risk:** `Document.status` is `String, nullable=False, default="pending"` with no DB-level CHECK constraint or Python enum. Values `"pending"`, `"uploaded"`, `"classified"`, `"classification_failed"` are used in code but not enforced.
|
||||
- **Files:** `backend/db/models.py` line 188
|
||||
- **Impact:** A typo in a task or direct DB write silently sets an invalid status, causing silent bugs in status-checking code (e.g. `classified_at` timestamp never shown).
|
||||
- **Fix approach:** Add a migration with `ALTER TABLE documents ADD CONSTRAINT ck_documents_status CHECK (status IN ('pending', 'uploaded', 'classified', 'classification_failed'))`.
|
||||
- **Priority:** LOW
|
||||
|
||||
---
|
||||
|
||||
### Stale Wave-2 Comment in `documents.py` Module Docstring
|
||||
|
||||
- **Risk:** `backend/api/documents.py` lines 19–20 contain `"NOTE (Wave 2): No auth guards on any endpoint yet — Plan 03-03 adds get_current_user…"` — this is false; all handlers use `get_regular_user`.
|
||||
- **Files:** `backend/api/documents.py` lines 19–20
|
||||
- **Impact:** Misleads reviewers into thinking auth is not applied, potentially causing incorrect security assessments.
|
||||
- **Fix approach:** Remove or replace the stale NOTE comment.
|
||||
- **Priority:** LOW
|
||||
|
||||
---
|
||||
|
||||
### `classify_document` Endpoint Uses Mutable Default and Unvalidated Dict Body
|
||||
|
||||
- **Risk:** `POST /api/documents/{doc_id}/classify` has `body: dict = {}` — mutable default argument antipattern and no Pydantic validation.
|
||||
- **Files:** `backend/api/documents.py` line 695
|
||||
- **Impact:** Static analysis confusion; unvalidated request body accepts arbitrary JSON keys.
|
||||
- **Fix approach:** Define `class ClassifyRequest(BaseModel): topics: Optional[list[str]] = None` and replace `body: dict = {}`.
|
||||
- **Priority:** LOW
|
||||
|
||||
---
|
||||
|
||||
## Missing Tests / Coverage Gaps
|
||||
|
||||
### No Tests for JWT Algorithm, JTI, or Token Binding
|
||||
|
||||
- **Risk:** No tests verify the JWT algorithm, JTI presence/validation, or token binding.
|
||||
- **Files:** `backend/tests/test_auth_deps.py`, `backend/tests/test_auth_api.py`
|
||||
- **Impact:** Algorithm or claim changes would not be caught. The security invariants are untested.
|
||||
- **Priority:** HIGH
|
||||
|
||||
---
|
||||
|
||||
### Password Change Has No Session-Revocation Test
|
||||
|
||||
- **Risk:** No test verifies that changing a password invalidates existing refresh tokens.
|
||||
- **Files:** `backend/tests/test_auth_api.py`
|
||||
- **Fix approach:** Add test: register → login (obtain refresh cookie) → change password → assert old refresh cookie returns 401.
|
||||
- **Priority:** HIGH
|
||||
|
||||
---
|
||||
|
||||
### No Frontend E2E Tests
|
||||
|
||||
- **Risk:** The frontend has Vitest unit tests for 3 stores and a small set of components, but no Playwright or Cypress E2E tests for critical user flows.
|
||||
- **Files:** `frontend/src/stores/__tests__/`, `frontend/src/views/__tests__/`
|
||||
- **Impact:** Breaking changes in API contract, router guards, or component interactions are not caught until manual testing. The upload flow, TOTP enrollment, and admin operations have no automated coverage.
|
||||
- **Fix approach:** Add Playwright E2E tests for: login → upload → view → share → recipient download.
|
||||
- **Priority:** MEDIUM
|
||||
|
||||
---
|
||||
|
||||
### No Regression Test for `delete_folder` Audit Log Ordering
|
||||
|
||||
- **Risk:** The audit log after-commit ordering issue in `delete_folder` has no test to prevent regression after fixing.
|
||||
- **Files:** `backend/tests/test_folders.py`
|
||||
- **Priority:** MEDIUM
|
||||
|
||||
---
|
||||
|
||||
### Quota Concurrency Tests Run Against SQLite, Not PostgreSQL
|
||||
|
||||
- **Risk:** Quota enforcement tests run against SQLite in the default test config. CLAUDE.md specifies "integration tests against real PostgreSQL (not SQLite for quota/UUID tests)."
|
||||
- **Files:** `backend/tests/test_quota.py`; `backend/tests/conftest.py`
|
||||
- **Impact:** A race condition in the atomic quota UPDATE would only be detectable with concurrent clients on real PostgreSQL.
|
||||
- **Fix approach:** Mark quota atomicity tests with `@pytest.mark.skipif(not live_services_available, ...)` and add a concurrent-upload test using `asyncio.gather`.
|
||||
- **Priority:** MEDIUM
|
||||
|
||||
---
|
||||
|
||||
### Pinia Stores `documents.js` and `topics.js` Have No Unit Tests
|
||||
|
||||
- **Risk:** The stores for documents and topics — which implement pagination, filtering, and topic assignment logic — have no tests in `frontend/src/stores/__tests__/`.
|
||||
- **Files:** `frontend/src/stores/documents.js`, `frontend/src/stores/topics.js`
|
||||
- **Priority:** LOW
|
||||
|
||||
---
|
||||
|
||||
## Dependency Risks
|
||||
|
||||
### All Backend Dependencies Use Floor `>=` Version Pins
|
||||
|
||||
- **Risk:** `backend/requirements.txt` uses `>=` for all packages including security-critical ones: `PyJWT>=2.8.0`, `pwdlib[argon2]>=0.2.1`, `cryptography>=41.0.0`, `fastapi>=0.111`.
|
||||
- **Files:** `backend/requirements.txt`
|
||||
- **Impact:** `pip install` resolves to the latest available version at build time. A breaking change or vulnerability in any dependency silently takes effect on the next Docker build. CLAUDE.md mandates exact version pinning for security-critical packages.
|
||||
- **Fix approach:** Run `pip freeze > requirements.lock` to generate an exact pinned lockfile. Use `pip-tools` or `uv lock` to manage upgrades. At minimum, pin `PyJWT`, `pwdlib`, `cryptography`, and `fastapi` to exact versions.
|
||||
- **Priority:** HIGH
|
||||
|
||||
---
|
||||
|
||||
### `minio/minio:latest` Tag in Docker Compose
|
||||
|
||||
- **Risk:** `docker-compose.yml` uses `image: minio/minio:latest` — a floating tag that pulls a new release on `docker compose pull`.
|
||||
- **Files:** `docker-compose.yml` line 19
|
||||
- **Impact:** Breaking MinIO API changes or security regressions in a new release could break file storage without warning.
|
||||
- **Fix approach:** Pin to a specific MinIO release tag (e.g. `minio/minio:RELEASE.2024-11-07T00-52-20Z`).
|
||||
- **Priority:** MEDIUM
|
||||
|
||||
---
|
||||
|
||||
## Infrastructure and Operational Concerns
|
||||
|
||||
### No Reverse Proxy / TLS Termination in Production Setup
|
||||
|
||||
- **Risk:** `docker-compose.yml` exposes the FastAPI backend on port 8000 and frontend on port 5173 directly, with no nginx or Caddy container for TLS termination or `X-Forwarded-For` normalization.
|
||||
- **Files:** `docker-compose.yml`
|
||||
- **Impact:** (1) The refresh cookie uses `secure=True` in code but travels over plain HTTP, making the `secure` flag ineffective. (2) IP rate limiting is spoofable. (3) Credentials and session cookies travel in cleartext.
|
||||
- **Fix approach:** Add an nginx service to `docker-compose.yml` that terminates TLS (Let's Encrypt or self-signed), proxies `/api/` to the backend, and sets `proxy_set_header X-Forwarded-For $remote_addr`.
|
||||
- **Priority:** HIGH
|
||||
|
||||
---
|
||||
|
||||
### MinIO Uses Plain HTTP Between Containers
|
||||
|
||||
- **Risk:** `Minio(…, secure=False)` — all object data travels over HTTP between FastAPI and MinIO containers.
|
||||
- **Files:** `backend/main.py` line 82; `backend/storage/__init__.py` line 48
|
||||
- **Impact:** An attacker with access to the Docker network can intercept document bytes in transit. Critical if containers share a host with untrusted workloads.
|
||||
- **Fix approach:** Enable TLS on MinIO (`secure=True`) or document the trust model explicitly. For shared-host deployments, configure mTLS between containers.
|
||||
- **Priority:** MEDIUM (acceptable on isolated Docker bridge; critical on shared host)
|
||||
|
||||
---
|
||||
|
||||
### No Backup Strategy for PostgreSQL or MinIO Data
|
||||
|
||||
- **Risk:** `docker-compose.yml` uses named volumes (`postgres_data`, `minio_data`) with no backup tooling, retention policy, or point-in-time recovery.
|
||||
- **Files:** `docker-compose.yml` lines 138–140
|
||||
- **Impact:** A disk failure, container wipe, or accidental `docker volume rm` causes permanent loss of all user documents, credentials, audit logs, and accounts.
|
||||
- **Fix approach:** Add a `backup` service running `pg_dump` on a schedule (e.g. via `ofelia` or a cron sidecar), compressing and shipping to an off-site store. Configure MinIO `mc mirror` to a second bucket or provider. Document RTO/RPO targets.
|
||||
- **Priority:** HIGH
|
||||
|
||||
---
|
||||
|
||||
### Redis Has No Persistence Configuration
|
||||
|
||||
- **Risk:** Redis is started with only `--requirepass`. No `--save` or `--appendonly yes` flags are set, making all Redis data ephemeral.
|
||||
- **Files:** `docker-compose.yml` line 42
|
||||
- **Impact:** A Redis restart clears all rate-limit counters (brief brute-force window on auth endpoints), TOTP replay prevention keys (30-second replay window reopens), and pending OAuth state tokens.
|
||||
- **Fix approach:** Add `--save 60 1 --appendonly yes` to the Redis command and mount a Redis data volume. Document that Redis restart is a brief security event requiring monitoring.
|
||||
- **Priority:** MEDIUM
|
||||
|
||||
---
|
||||
|
||||
### Docker Compose Mounts Source Code as Live Volume
|
||||
|
||||
- **Risk:** `docker-compose.yml` mounts `./backend:/app` and `./frontend/src:/app/src` as live volumes (appropriate for dev hot-reload but dangerous in production if the same file is used).
|
||||
- **Files:** `docker-compose.yml` lines 53–54, 131–132
|
||||
- **Impact:** In production, host filesystem modifications immediately affect the running container without a deploy cycle.
|
||||
- **Fix approach:** Create a `docker-compose.prod.yml` that omits the volume mounts and uses the Dockerfile `COPY . .` layer only. Document the two-file strategy clearly.
|
||||
- **Priority:** MEDIUM
|
||||
|
||||
---
|
||||
|
||||
### Dockerfile Runs Application as Root
|
||||
|
||||
- **Risk:** `backend/Dockerfile` uses `FROM python:3.12-slim` with no `USER` directive. FastAPI and Celery run as root inside the container.
|
||||
- **Files:** `backend/Dockerfile`
|
||||
- **Impact:** A container escape vulnerability or SSRF leading to RCE gives the attacker root-equivalent access to the container filesystem.
|
||||
- **Fix approach:** Add `RUN adduser --disabled-password --gecos "" appuser && chown -R appuser /app` and `USER appuser` before `EXPOSE 8000`.
|
||||
- **Priority:** MEDIUM
|
||||
|
||||
---
|
||||
|
||||
### No Structured Logging, Metrics, or Alerting
|
||||
|
||||
- **Risk:** All logging uses Python's stdlib logger with no structured format, no Prometheus/StatsD metrics endpoint, no error aggregation service, and no alerting on security events in the audit log.
|
||||
- **Files:** All backend files
|
||||
- **Impact:** Silent failures — email queue not processing, repeated TOTP replay attempts, brute-force login spikes — go undetected. Failed Celery tasks log to stderr with no aggregation. The security alert email on refresh token reuse is the only active notification mechanism.
|
||||
- **Fix approach:** Add `structlog` for JSON-formatted structured logs. Add a `/metrics` endpoint with `prometheus-fastapi-instrumentator`. Configure alerting on `auth.login_failed` count spikes in the audit log.
|
||||
- **Priority:** MEDIUM
|
||||
|
||||
---
|
||||
|
||||
*Concerns audit: 2026-06-02*
|
||||
|
||||
Reference in New Issue
Block a user