# CLAUDE.md

This file provides permanent, authoritative guidance to Claude Code for every session. All sections below reflect the actual codebase state and must be kept up-to-date as the project evolves.

## CLAUDE.md self-update checkpoint

**After every change to the codebase**, before committing, check whether CLAUDE.md needs updating:

- New route added → update **All API Endpoints** and **Frontend Routes** tables
- New DB model or column → update **Database Models**
- New migration → update **Migration chains**
- New file or directory → update **File & Folder Tree**
- New limit, or changed default value → update **Default Values & Limits**
- New dependency, auth mechanism, or security pattern → update **Security Standards**
- New Docker service, volume, network, or env var → update **Docker Infrastructure**
- Stack version changed → update **Stack**

This check is mandatory — treat it the same as updating STATUS.md.

---

## Stack

| Layer | Tech |
|---|---|
| Backend | FastAPI (async), SQLAlchemy 2 (async), Alembic, PostgreSQL 16 |
| Auth | JWT RS256 via `python-jose`, bcrypt via `bcrypt` (direct, 13 rounds) |
| Frontend | React 18, TypeScript, Vite, React Router v6, TanStack Query, Axios |
| UI Library | shadcn/ui (Radix primitives + Tailwind CSS v3) |
| Styling | Tailwind CSS v3, CSS custom properties for theme tokens |
| Containerisation | Docker Compose (5 services, non-root users, named volumes) |

---

## Commands

All test, build, and package-manager commands run **inside Docker** — never on the host. See the memory note: "Testing inside Docker only".

### Migrations (run in Docker)

```bash
docker compose exec backend alembic revision --autogenerate -m "describe change"
docker compose exec backend alembic upgrade head
docker compose exec backend alembic downgrade -1
```

### Lint (run in Docker)

```bash
# Wrap both ruff commands in one shell so that `ruff format` also runs inside the container
docker compose exec backend sh -c "ruff check . && ruff format ."
docker compose exec frontend npm run typecheck
docker compose exec frontend npm run lint
```

### Full stack

```bash
# Dev stack (hot-reload, Vite on :5173)
cp .env.example backend/.env
docker compose -f docker-compose.yml -f docker-compose.dev.yml up --build

# Prod stack
docker compose up --build -d
```

---

## File & Folder Tree

```
/
├── CLAUDE.md  ← This file — authoritative session context
├── README.md  ← Project overview, containers table, Current State
├── TODO.md  ← Task list
├── .env.example  ← Template for backend/.env
├── docker-compose.yml  ← Production (5 services, named volumes)
├── docker-compose.dev.yml  ← Dev overrides (hot-reload, host ports)
├── .githooks/pre-commit  ← Runs scripts/security_check.py before every commit
├── scripts/security_check.py  ← Static analysis: secrets, weak crypto, SQLi, JWT
├── changelog/YYYY-MM-DD_<slug>.md  ← Per-date change logs
│
├── backend/  ← FastAPI gateway (port 8000, internal)
│   ├── app/
│   │   ├── main.py  ← App factory, router registration, lifespan (health loop)
│   │   ├── database.py  ← AsyncEngine, AsyncSessionLocal, Base
│   │   ├── deps.py  ← get_current_user, get_current_admin
│   │   ├── core/
│   │   │   ├── config.py  ← All settings via pydantic-settings (reads .env)
│   │   │   ├── security.py  ← JWT sign/verify (RS256), bcrypt hash/verify
│   │   │   ├── sanitize.py  ← Input sanitization helpers (see Security Standards)
│   │   │   └── app_config.py  ← Per-service config load/save to /config volume
│   │   ├── models/
│   │   │   ├── __init__.py  ← Imports all models (required for Alembic autogenerate)
│   │   │   ├── user.py  ← User model (see Database Models)
│   │   │   ├── profile.py  ← Profile model
│   │   │   └── group.py  ← Group, GroupMembership models
│   │   ├── schemas/
│   │   │   ├── user.py  ← UserCreate/Out, Token, DashboardPrefsOut/Update
│   │   │   ├── profile.py  ← ProfileRead, ProfileUpdate
│   │   │   └── group.py  ← GroupCreate/Update/Out/DetailOut, GroupMemberOut
│   │   ├── routers/
│   │   │   ├── auth.py  ← POST /register, POST /login
│   │   │   ├── users.py  ← GET /me, GET+PATCH /me/preferences
│   │   │   ├── profile.py  ← GET+PUT /me (profile)
│   │   │   ├── admin.py  ← User admin CRUD (admin-only)
│   │   │   ├── groups.py  ← Group CRUD + member management (admin-only)
│   │   │   ├── settings.py  ← AI, doc limits, system prompts (admin-only)
│   │   │   ├── services.py  ← GET /services (health status)
│   │   │   ├── categories_proxy.py  ← Transparent proxy → doc-service /categories/*
│   │   │   └── documents_proxy.py  ← Transparent proxy → doc-service /documents/*
│   │   └── services/
│   │       └── service_health.py  ← Background 30s health-check loop
│   ├── alembic/
│   │   ├── env.py  ← Async migration runner
│   │   └── versions/  ← Migration chain (see Migration chains)
│   ├── scripts/seed.py  ← Seed test user
│   ├── Dockerfile  ← python:3.12-slim, non-root user 1001
│   └── STATUS.md
│
├── features/
│   ├── ai-service/  ← AI provider intermediary (port 8010, internal)
│   │   ├── app/
│   │   │   ├── main.py  ← FastAPI, queue worker lifespan
│   │   │   ├── routers/chat.py  ← POST /chat (sync, NORMAL priority queue)
│   │   │   ├── routers/health.py  ← GET /health
│   │   │   ├── routers/queue.py  ← GET /queue/status, /pause, /resume, /cancel/{id}
│   │   │   ├── providers/base.py  ← AIProvider abstract class
│   │   │   ├── providers/anthropic_provider.py
│   │   │   ├── providers/openai_compat.py  ← Ollama / LM Studio
│   │   │   └── services/queue.py  ← Priority queue (CRITICAL > HIGH > NORMAL)
│   │   ├── Dockerfile
│   │   └── STATUS.md
│   │
│   └── doc-service/  ← PDF extraction microservice (port 8001, internal)
│       ├── app/
│       │   ├── main.py
│       │   ├── database.py  ← Same PostgreSQL instance as backend
│       │   ├── deps.py  ← get_user_id (reads x-user-id header)
│       │   ├── models/
│       │   │   ├── document.py  ← Document model (see Database Models)
│       │   │   ├── category.py  ← DocumentCategory model
│       │   │   └── category_assignment.py  ← CategoryAssignment (composite PK)
│       │   ├── schemas/
│       │   │   ├── document.py  ← DocumentOut, DocumentPage, DocumentStatusOut, etc.
│       │   │   └── category.py  ← CategoryOut, CategoryCreate, CategoryUpdate
│       │   ├── routers/
│       │   │   ├── documents.py  ← Full document CRUD + file serving + reprocess
│       │   │   └── categories.py  ← Category CRUD
│       │   └── services/
│       │       ├── storage.py  ← File I/O
│       │       ├── ai_client.py  ← classify_document() → ai-service:8010/chat
│       │       └── config_reader.py
│       ├── alembic/versions/  ← Doc-service migration chain
│       ├── Dockerfile
│       └── STATUS.md
│
└── frontend/  ← React SPA (port 5173 dev / 80 prod)
    ├── src/
    │   ├── main.tsx  ← React root, QueryClientProvider, BrowserRouter
    │   ├── App.tsx  ← Route tree, PrivateRoute, AdminRoute
    │   ├── api/client.ts  ← Axios instance + ALL API functions (single source of truth)
    │   ├── hooks/
    │   │   ├── useAuth.ts  ← Token state (localStorage), login/logout
    │   │   └── useTheme.ts  ← Theme toggle
    │   ├── components/
    │   │   ├── AppShell.tsx  ← Layout: Sidebar + scrollable main
    │   │   ├── Sidebar.tsx  ← Collapsible nav (icons ↔ icons+labels)
    │   │   ├── ThemeToggle.tsx  ← Light/dark mode toggle
    │   │   └── ui/  ← shadcn/ui components (Button, Input, …)
    │   ├── pages/  ← One file per route (see Frontend Routes)
    │   ├── lib/utils.ts  ← cn() = clsx + tailwind-merge
    │   └── styles/theme.css  ← CSS custom properties, Tailwind setup
    ├── vite.config.ts  ← /api/* proxied to backend:8000
    ├── tailwind.config.ts
    ├── components.json  ← shadcn/ui config
    ├── Dockerfile  ← Multi-stage: Node build → nginx-unprivileged
    └── STATUS.md
```

---

## Architecture

### Request flow

```
Browser (:5173 dev / :80 prod)
    │
    └── Vite dev proxy / nginx
            │
            └── /api/* ──→ backend:8000 (FastAPI)
                                │
                ┌───────────────┼───────────────────┐
              /auth           /admin          /documents/*
              /users          /groups         /documents/categories/*
              /profile        /settings
              /services         │                   │
                           JSON volume    proxy (injects x-user-id)
                            (/config)               │
                                              doc-service:8001
                                                    │
                                              ai-service:8010
                                              (classify, chat)
```

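The proxy step in the diagram (the gateway injects `x-user-id` before forwarding to doc-service) can be sketched as below. This is a minimal illustration, not the code in `documents_proxy.py`: the helper name `build_upstream_headers` and the `HOP_HEADERS` set are hypothetical, and the sketch assumes the gateway discards any client-supplied `x-user-id` so that doc-service only ever sees the ID taken from the validated JWT.

```python
from typing import Mapping

# Hypothetical set of headers that should not be copied to the upstream request.
HOP_HEADERS = {"host", "content-length", "connection"}


def build_upstream_headers(incoming: Mapping[str, str], user_id: str) -> dict[str, str]:
    """Copy client headers, drop any spoofed x-user-id, then inject the trusted one."""
    headers = {
        name: value
        for name, value in incoming.items()
        if name.lower() not in HOP_HEADERS and name.lower() != "x-user-id"
    }
    # user_id is assumed to come from the `sub` claim of an already-validated token.
    headers["x-user-id"] = user_id
    return headers


forwarded = build_upstream_headers(
    {"Accept": "application/json", "X-User-Id": "attacker", "Host": "localhost"},
    user_id="3f2b8c1e-0000-4000-8000-000000000001",
)
```

Because doc-service trusts this header without a foreign key or a second token check, dropping the client's copy before injecting the real one is the part that matters most.
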
### Auth flow

1. `POST /api/auth/login` → RS256 JWT (8 h), stored in `localStorage`
2. Axios interceptor injects `Authorization: Bearer {token}` on every request
3. `get_current_user` dep validates token on every protected route
4. Admin routes additionally check `user.is_superuser`; return 404 (not 403) if not admin

---

## Database Models

### Backend (`users`, `profiles`, `groups`, `group_memberships`)

**`users`**

| Column | Type | Constraints | Notes |
|--------|------|-------------|-------|
| `id` | String | PK, UUID | auto-generated |
| `email` | String | UNIQUE, indexed, NOT NULL | lowercased before storing |
| `hashed_password` | String | NOT NULL | bcrypt 13 rounds |
| `full_name` | String | nullable | sanitized max 128 chars |
| `is_active` | Boolean | default=True | soft-delete flag |
| `is_superuser` | Boolean | default=False | admin role; never exposed as-is (serialised as `is_admin`) |
| `dashboard_app_ids` | JSON | NOT NULL, default=[] | list of pinned service IDs |

Relationship: `profile` (one-to-one, cascade all+delete-orphan)

**`profiles`**

| Column | Type | Constraints | Notes |
|--------|------|-------------|-------|
| `id` | String | PK, UUID | auto-generated |
| `user_id` | String | FK→users.id UNIQUE, cascade delete | one-to-one |
| `phone` | String(20) | nullable | validated format |
| `date_of_birth` | Date | nullable | 1900+ and not future |
| `position` | String(128) | nullable | job title |
| `address` | String(255) | nullable | |
| `updated_at` | DateTime(tz) | server_default=now(), onupdate=now() | |

**`groups`**

| Column | Type | Constraints |
|--------|------|-------------|
| `id` | String | PK, UUID |
| `name` | String(128) | UNIQUE indexed, NOT NULL |
| `description` | String(512) | nullable |
| `created_at` | DateTime(tz) | server_default=now() |

**`group_memberships`**

| Column | Type | Constraints |
|--------|------|-------------|
| `id` | String | PK, UUID |
| `group_id` | String | FK→groups.id, indexed, CASCADE |
| `user_id` | String | FK→users.id, indexed, CASCADE |
| `joined_at` | DateTime(tz) | server_default=now() |

Unique constraint: `(group_id, user_id)`

### Doc-service (`documents`, `document_categories`, `document_category_assignments`)

**`documents`**

| Column | Type | Constraints | Notes |
|--------|------|-------------|-------|
| `id` | String | PK, UUID | |
| `user_id` | String | indexed | not FK — trusts x-user-id header |
| `filename` | String | NOT NULL | |
| `file_path` | String | NOT NULL | absolute path under /data/documents |
| `file_size` | Integer | NOT NULL | bytes |
| `status` | String | default="pending" | pending / processing / done / failed |
| `title` | String(500) | nullable | AI-extracted |
| `document_type` | String | nullable | invoice / bill / receipt / order / expense / revenue / unknown |
| `raw_text` | Text | nullable | first 500 k chars |
| `extracted_data` | Text | nullable | JSON string |
| `tags` | Text | nullable | JSON array string |
| `error_message` | String(500) | nullable | |
| `created_at` | DateTime(tz) | server_default=now() | |
| `processed_at` | DateTime(tz) | nullable | |

**`document_categories`**

| Column | Type | Constraints |
|--------|------|-------------|
| `id` | String | PK, UUID |
| `user_id` | String | indexed |
| `name` | String(128) | NOT NULL |
| `created_at` | DateTime(tz) | server_default=now() |

**`document_category_assignments`** (composite PK)

| Column | Type | Constraints |
|--------|------|-------------|
| `document_id` | String | PK + FK→documents.id CASCADE |
| `category_id` | String | PK + FK→document_categories.id CASCADE |

### Migration chains

**Backend** (must be applied in order):

| Rev ID | Slug |
|--------|------|
| `38efeff7c45a` | `create_users_table` |
| `676084df61d1` | `add_profiles_table` |
| `a3f9c2d14e87` | `add_groups_and_group_memberships` |
| `c7e8f9a0b1d2` | `add_dashboard_app_ids_to_users` |

**Doc-service**:

| Rev ID | Slug |
|--------|------|
| `0001` | `create_doc_tables` |
| `0002` | `add_document_title` |

---

## All API Endpoints

### Auth (`/api/auth`) — public

| Method | Path | Auth | Description |
|--------|------|------|-------------|
| POST | `/api/auth/register` | — | Create account; returns `UserOut`; enforces password policy |
| POST | `/api/auth/login` | — | OAuth2 password flow; returns `{access_token, token_type}` |

### Users (`/api/users`) — authenticated

| Method | Path | Auth | Description |
|--------|------|------|-------------|
| GET | `/api/users/me` | user | Current user info → `UserOut` |
| GET | `/api/users/me/preferences` | user | Dashboard pinned app IDs → `{app_ids}` |
| PATCH | `/api/users/me/preferences` | user | Save pinned app IDs (max 50, slug-safe) |

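The "max 50, slug-safe" rule on `PATCH /api/users/me/preferences` could be checked roughly as follows. This is only a sketch: the limits (50 IDs, 64 characters, `[a-zA-Z0-9_\-]`) come from **Default Values & Limits**, while the function name `validate_app_ids` and the choice to raise `ValueError` are assumptions rather than the contents of `schemas/user.py`.

```python
import re

MAX_APP_IDS = 50
# 1 to 64 characters drawn from the documented slug-safe set.
APP_ID_PATTERN = re.compile(r"[a-zA-Z0-9_\-]{1,64}")


def validate_app_ids(app_ids: list[str]) -> list[str]:
    """Return the list unchanged if every ID is slug-safe and the list is short enough."""
    if len(app_ids) > MAX_APP_IDS:
        raise ValueError(f"at most {MAX_APP_IDS} pinned apps are allowed")
    for app_id in app_ids:
        if not APP_ID_PATTERN.fullmatch(app_id):
            raise ValueError(f"invalid app id: {app_id!r}")
    return app_ids


pinned = validate_app_ids(["documents", "ai-service"])
```

In a Pydantic schema the same check would typically sit inside a `@field_validator`, where a raised `ValueError` becomes a 422 response.
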
### Profile (`/api/profile`) — authenticated

| Method | Path | Auth | Description |
|--------|------|------|-------------|
| GET | `/api/profile/me` | user | Fetch profile; auto-creates if missing |
| PUT | `/api/profile/me` | user | Update profile fields |

### Admin — Users (`/api/admin`) — admin-only

| Method | Path | Description |
|--------|------|-------------|
| GET | `/api/admin/users` | List all users → `list[UserAdminOut]` |
| POST | `/api/admin/users` | Create user (with optional is_admin) |
| DELETE | `/api/admin/users/{user_id}` | Delete user (204) |
| PATCH | `/api/admin/users/{user_id}/active` | Toggle active status |

### Admin — Groups (`/api/admin/groups`) — admin-only

| Method | Path | Description |
|--------|------|-------------|
| GET | `/api/admin/groups` | List groups with member count |
| POST | `/api/admin/groups` | Create group |
| GET | `/api/admin/groups/{id}` | Group detail + members |
| PATCH | `/api/admin/groups/{id}` | Update name / description |
| DELETE | `/api/admin/groups/{id}` | Delete (cascades memberships) |
| POST | `/api/admin/groups/{id}/members/{user_id}` | Add member |
| DELETE | `/api/admin/groups/{id}/members/{user_id}` | Remove member |

### Settings (`/api/settings`) — admin-only

| Method | Path | Description |
|--------|------|-------------|
| GET | `/api/settings/ai` | AI config (keys masked) |
| PATCH | `/api/settings/ai` | Update AI provider / credentials |
| POST | `/api/settings/ai/test` | Test AI connection |
| GET | `/api/settings/documents/limits` | PDF upload limits |
| PATCH | `/api/settings/documents/limits` | Update max PDF size |
| GET | `/api/settings/system-prompts` | All editable system prompts |
| PATCH | `/api/settings/system-prompts/{service_id}` | Update system prompt |

### Services (`/api/services`) — authenticated

| Method | Path | Description |
|--------|------|-------------|
| GET | `/api/services` | Health status of all registered services → `list[ServiceStatus]` |

### Documents (`/api/documents/*`) — authenticated, proxied to doc-service

| Method | Path | Description |
|--------|------|-------------|
| POST | `/api/documents/upload` | Upload PDF (202, background processing) |
| GET | `/api/documents` | Paginated list (filterable: search, status, type, category, sort) |
| GET | `/api/documents/{id}` | Document detail |
| GET | `/api/documents/{id}/status` | Processing status only |
| PATCH | `/api/documents/{id}/type` | Update document type |
| PATCH | `/api/documents/{id}/tags` | Update tags |
| PATCH | `/api/documents/{id}/title` | Update title |
| POST | `/api/documents/{id}/reprocess` | Re-run AI extraction |
| DELETE | `/api/documents/{id}` | Delete document (204) |
| GET | `/api/documents/{id}/file` | Download PDF (streaming) |
| POST | `/api/documents/{id}/categories/{cat_id}` | Assign category |
| DELETE | `/api/documents/{id}/categories/{cat_id}` | Remove category |

### Categories (`/api/documents/categories/*`) — authenticated, proxied to doc-service

| Method | Path | Description |
|--------|------|-------------|
| GET | `/api/documents/categories` | List user's categories |
| POST | `/api/documents/categories` | Create category (triggers background AI reanalysis) |
| PATCH | `/api/documents/categories/{id}` | Rename |
| DELETE | `/api/documents/categories/{id}` | Delete (204) |

### AI-service (internal only — not exposed to browser)

| Method | Path | Description |
|--------|------|-------------|
| POST | `/chat` | Chat request (queued at NORMAL priority) |
| GET | `/health` | Health check |
| GET | `/queue/status` | Queue state |
| POST | `/queue/pause` | Pause queue |
| POST | `/queue/resume` | Resume queue |
| POST | `/queue/cancel/{job_id}` | Cancel job |

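The priority ordering used by the ai-service queue (CRITICAL > HIGH > NORMAL, with `/chat` enqueued at NORMAL) can be illustrated with a small heap-based sketch. The class and method names here (`Priority`, `JobQueue`, `put`, `pop`) are hypothetical and the real `services/queue.py` may be structured differently, for example around `asyncio`; the FIFO tie-break within one priority level is likewise an assumption.

```python
import heapq
import itertools
from enum import IntEnum


class Priority(IntEnum):
    # Lower number = served first.
    CRITICAL = 0
    HIGH = 1
    NORMAL = 2


class JobQueue:
    """Min-heap keyed on (priority, arrival order) so equal priorities stay FIFO."""

    def __init__(self) -> None:
        self._heap: list[tuple[int, int, str]] = []
        self._arrival = itertools.count()

    def put(self, job_id: str, priority: Priority = Priority.NORMAL) -> None:
        heapq.heappush(self._heap, (int(priority), next(self._arrival), job_id))

    def pop(self) -> str:
        return heapq.heappop(self._heap)[2]

    def __len__(self) -> int:
        return len(self._heap)


queue = JobQueue()
queue.put("chat-1")                       # NORMAL, like POST /chat
queue.put("urgent-1", Priority.CRITICAL)
queue.put("chat-2")
queue.put("reclassify-1", Priority.HIGH)
order = [queue.pop() for _ in range(len(queue))]
```
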
---

## Frontend Routes

| Path | Component | Guard |
|------|-----------|-------|
| `/login` | `LoginPage` | Public |
| `/` | `DashboardPage` | PrivateRoute |
| `/apps` | `AppsPage` | PrivateRoute |
| `/apps/documents` | `DocumentsPage` | PrivateRoute |
| `/apps/documents/settings/admin` | `DocumentAdminSettingsPage` | AdminRoute |
| `/apps/ai/settings/admin` | `AIAdminSettingsPage` | AdminRoute |
| `/profile` | `ProfilePage` | PrivateRoute |
| `/settings` | `SettingsPage` | PrivateRoute |
| `/admin` | `AdminPage` (→ `/admin/users`) | AdminRoute |
| `/admin/users` | `AdminUsersPage` | AdminRoute |
| `/admin/groups` | `AdminGroupsPage` | AdminRoute |
| `*` | redirect to `/` | — |

`PrivateRoute` — checks `token` from `useAuth`, redirects to `/login` if absent.

`AdminRoute` — checks token AND queries `GET /api/users/me` for `is_admin`; waits for query to avoid flash; redirects to `/login` (not `/`) if not admin.

---

## Security Standards

These standards are **non-negotiable**. Every change must comply.

### JWT

- **Algorithm**: RS256 (4096-bit RSA key pair, generated by `scripts/generate_jwt_keys.py`)
- **Keys**: PEM-encoded in `backend/.env` as `JWT_PRIVATE_KEY` / `JWT_PUBLIC_KEY` (gitignored)
- **Expiry**: 8 hours (`EXPIRE_MINUTES=480`) — never set longer; no refresh tokens
- **Claims**: `{sub: user_id, exp, iat}` — user_id is a UUID string
- **Validation**: `decode_access_token()` in `core/security.py`; called by `get_current_user`
- **Never**: set algorithm to `"none"`, disable `verify_exp`, or hardcode secrets in code

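The claim set described above can be assembled as in the sketch below. Only the payload is shown: the RS256 signature itself is produced by `python-jose` in `core/security.py` and is deliberately left out here. The function name `build_claims` and the injectable `now` parameter are illustrative assumptions.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

EXPIRE_MINUTES = 480  # 8 hours, the documented maximum


def build_claims(user_id: str, now: Optional[datetime] = None) -> dict:
    """Return the {sub, iat, exp} payload for a token issued at `now` (UTC)."""
    issued_at = now or datetime.now(timezone.utc)
    expires_at = issued_at + timedelta(minutes=EXPIRE_MINUTES)
    return {
        "sub": user_id,                      # UUID string of the user
        "iat": int(issued_at.timestamp()),   # issued-at, seconds since epoch
        "exp": int(expires_at.timestamp()),  # expiry, checked on every decode
    }


claims = build_claims(
    "3f2b8c1e-0000-4000-8000-000000000001",
    now=datetime(2025, 1, 15, 9, 0, tzinfo=timezone.utc),
)
```
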
### Password hashing

- **Algorithm**: bcrypt, **13 rounds** (`bcrypt.gensalt(rounds=13)`)
- **Timing**: ~300 ms per hash (intentional brute-force resistance)
- **Never** use MD5, SHA1, or plain SHA256 for password storage

### Password policy (enforced in `UserCreate` schema)

All of the following must pass:
- ≥ 8 characters
- ≥ 1 uppercase (A–Z)
- ≥ 1 lowercase (a–z)
- ≥ 1 digit (0–9)
- ≥ 1 special character: ``!@#$%^&*()\-_=+[]{}|;:'"<>?/\`~``
- No common words (password, secret, login, admin, test, qwerty, welcome, …)

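A minimal sketch of how these rules might be checked follows. It is not the `UserCreate` validator itself. Three things are assumptions: the backslashes in the documented character list are treated as escapes rather than allowed characters, the common-word check is a case-insensitive substring match, and only the words spelled out above are included (the real list is longer).

```python
import re

SPECIAL_CHARS = set("!@#$%^&*()-_=+[]{}|;:'\"<>?/`~")
# Only the words named in the policy; the project's full list is not shown here.
COMMON_WORDS = ("password", "secret", "login", "admin", "test", "qwerty", "welcome")


def password_problems(password: str) -> list[str]:
    """Return a list of unmet rules; an empty list means the password passes."""
    problems = []
    if len(password) < 8:
        problems.append("at least 8 characters")
    if not re.search(r"[A-Z]", password):
        problems.append("an uppercase letter")
    if not re.search(r"[a-z]", password):
        problems.append("a lowercase letter")
    if not re.search(r"[0-9]", password):
        problems.append("a digit")
    if not any(ch in SPECIAL_CHARS for ch in password):
        problems.append("a special character")
    lowered = password.lower()
    if any(word in lowered for word in COMMON_WORDS):
        problems.append("no common words")
    return problems


ok = password_problems("Tr1cky!Horse")
weak = password_problems("Password1!")
```

Returning every unmet rule at once, rather than failing on the first, lets the registration form show the user a complete list in one round trip.
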
### Input sanitization

Every user-supplied string stored in the database **must** pass through `core/sanitize.py`:

```python
sanitize_str(value, max_len=255)
# → strips whitespace; rejects null bytes (\x00); rejects control chars
#   (0x01–0x1F, 0x7F except \t \n \r); enforces max_len; returns None for ""

normalize_email(value)     # lowercase + strip
validate_phone(value)      # sanitize_str(max_len=20) + regex ^\+?[\d\s\-()\[\]]{7,20}$
validate_date_of_birth(v)  # must be ≥ 1900, not future
```

Apply via Pydantic `@field_validator` on all request schemas.

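For orientation, the helpers summarised above could look roughly like this. The sketch follows the documented behaviour but is not a copy of `core/sanitize.py`: in particular, it assumes an over-long value is rejected rather than truncated, and the `today` parameter exists only to make the date check testable.

```python
import re
from datetime import date
from typing import Optional

# Control characters 0x00-0x1F and 0x7F, excluding \t (0x09), \n (0x0A), \r (0x0D).
_CONTROL_CHARS = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f\x7f]")
_PHONE = re.compile(r"^\+?[\d\s\-()\[\]]{7,20}$")


def sanitize_str(value: Optional[str], max_len: int = 255) -> Optional[str]:
    """Strip, reject control characters, enforce max_len, map "" to None."""
    if value is None:
        return None
    value = value.strip()
    if value == "":
        return None
    if _CONTROL_CHARS.search(value):
        raise ValueError("control characters are not allowed")
    if len(value) > max_len:
        raise ValueError(f"must be at most {max_len} characters")
    return value


def validate_phone(value: Optional[str]) -> Optional[str]:
    cleaned = sanitize_str(value, max_len=20)
    if cleaned is not None and not _PHONE.match(cleaned):
        raise ValueError("invalid phone number")
    return cleaned


def validate_date_of_birth(value: Optional[date], today: Optional[date] = None) -> Optional[date]:
    if value is None:
        return None
    if value.year < 1900 or value > (today or date.today()):
        raise ValueError("date of birth must be 1900 or later and not in the future")
    return value
```
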
### XSS prevention

- React JSX text interpolation (`{value}`) is HTML-escaped by the DOM renderer — **never** use `dangerouslySetInnerHTML` with user-supplied content.
- Server-side `sanitize_str` provides defense-in-depth (control char rejection, max length).

### SQL injection prevention

- Use SQLAlchemy ORM (bound parameters) — **never** raw SQL strings.
- If `text()` is needed, use `bindparam()` for all user-supplied values.
- **Never** use f-strings, `.format()`, or `%`-formatting for SQL.

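Why bound parameters matter can be shown with a self-contained example. To keep it runnable without the project's dependencies, the sketch below uses the standard-library `sqlite3` driver instead of SQLAlchemy; the principle is the same one SQLAlchemy's ORM and `bindparam()` rely on, but the table layout and the function name `find_user_by_email` are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id TEXT PRIMARY KEY, email TEXT NOT NULL)")
conn.execute("INSERT INTO users VALUES (?, ?)", ("u1", "ada@example.com"))


def find_user_by_email(connection: sqlite3.Connection, email: str) -> list[tuple]:
    """Look up a user with a named bound parameter instead of string formatting."""
    # The value travels separately from the SQL text, so it is never parsed as SQL.
    cursor = connection.execute(
        "SELECT id, email FROM users WHERE email = :email",
        {"email": email},
    )
    return cursor.fetchall()


rows_ok = find_user_by_email(conn, "ada@example.com")
# A classic injection payload is compared as a plain string and matches nothing.
rows_injection = find_user_by_email(conn, "' OR '1'='1")
```
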
### Admin route security

- Use `get_current_admin` dependency (checks `is_superuser`).
- Return **404** (not 403) for unauthorized access — hides both endpoint existence and permission model.

### Network isolation

- `backend-net`: db, backend, ai-service, doc-service, and frontend; not reachable from host in prod.
- `frontend-net`: only frontend; single host port (80 prod / 5173 dev).
- DB, backend, doc-service, ai-service have **no** host port bindings in prod.

### Pre-commit security hook

`.githooks/pre-commit` runs `scripts/security_check.py` on every staged commit. It blocks commits that contain:

1. Hardcoded credentials / private keys / AWS creds
2. `eval()`, `exec()`, `shell=True`, `pickle.loads()`, `yaml.load()` without SafeLoader
3. MD5, SHA1, DES, `random.random()` / `random.randint()` for security use
4. SQL f-strings / format strings / concatenation passed to `execute()`/`query()`
5. JWT algorithm `"none"`, `verify_exp=False`, expiry > 9999 min, hardcoded secrets
6. `debug=True`, `print()` with passwords
7. `bandit` static analysis failures

**Never** bypass with `--no-verify` unless explicitly instructed by the user.

---

## Frontend Patterns & Conventions

### API client (`src/api/client.ts`)

Single Axios instance — **all** API calls live here, nowhere else:

```typescript
const api = axios.create({ baseURL: "/api" });
api.interceptors.request.use((config) => {
  const token = localStorage.getItem("token");
  if (token) config.headers.Authorization = `Bearer ${token}`;
  return config;
});
```

Adding a new API call:
1. Define a TypeScript interface for the response if it's new.
2. Add a named export function (`getX`, `createX`, `updateX`, `deleteX`).
3. Use `api.get<T>(...)`, `api.post<T>(...)`, etc.; always `.then((r) => r.data)`.

### TanStack Query conventions

**Query keys** (flat arrays, lowercase):
```typescript
["me"]                 // current user
["services"]           // service health list
["dashboard-prefs"]    // user dashboard preferences
["categories"]         // document categories
["documents", params]  // document list (params object for cache isolation)
["document", id]       // single document
```

**Mutation pattern**:
```typescript
const mutation = useMutation({
  mutationFn: apiFunction,
  onSuccess: () => {
    queryClient.invalidateQueries({ queryKey: ["affected-key"] });
    // additional side effects (close dialog, reset form, etc.)
  },
});
// Usage:
mutation.mutate(data);
mutation.isPending  // show spinner / disable button
mutation.isError    // show error message
```

**Polling**:
```typescript
useQuery({
  queryKey: ["services"],
  queryFn: getServices,
  refetchInterval: 30_000,
  refetchIntervalInBackground: true,
});
```

### Route guards

```typescript
// PrivateRoute — redirect to /login if no token
// AdminRoute  — redirect to /login if no token OR not admin
//               (waits for getMe() query to avoid flash; uses 404 semantics)
```

### Component patterns

- Functional components only.
- Local `useState` for UI-only state (edit mode, pending values, open/closed).
- Server state via `useQuery` / `useMutation` — no duplicated local copies.
- `cn()` from `lib/utils.ts` for conditional Tailwind classes.
- `lucide-react` for all icons.
- Never use `dangerouslySetInnerHTML` with user-supplied content.

---

## Naming & Code Conventions

### Database

- **Tables**: lowercase, plural, snake_case (`users`, `group_memberships`, `document_category_assignments`)
- **Columns**: lowercase, snake_case
- **ORM models**: PascalCase, singular (`User`, `Group`, `GroupMembership`, `Document`)
- Primary keys: `id` (String UUID, auto-generated)
- Timestamps: `created_at` / `updated_at` / `joined_at` / `processed_at` — always timezone-aware

### Pydantic schemas

| Suffix | Purpose |
|--------|---------|
| `Create` | POST request body (user-supplied input) |
| `Update` | PATCH request body (partial update) |
| `Out` | API response (safe subset of model) |
| `AdminOut` | Extended response for admin endpoints |
| `Read` | GET response (same as `Out`, used for profiles) |

Always set `model_config = {"from_attributes": True}` on response schemas.
Use `validation_alias` when the ORM field name differs from the JSON key (e.g., `is_superuser` → `is_admin`).

### HTTP status codes

| Code | Use |
|------|-----|
| 200 | Successful GET / PATCH / PUT |
| 201 | Successful POST that creates a resource |
| 202 | Accepted (async processing started, e.g., document upload) |
| 204 | Successful DELETE or action with no response body |
| 400 | Bad request (duplicates, invalid data beyond Pydantic) |
| 401 | Missing / invalid JWT |
| 404 | Not found **and** admin routes when not admin |
| 413 | Payload too large (file exceeds limit) |
| 415 | Unsupported media type (not a PDF) |
| 422 | Pydantic validation failure (FastAPI default) |
| 502 | Downstream service unreachable |
| 503 | Service unavailable (queue stopped, AI error) |
| 504 | Gateway timeout |

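As a worked example of the table, the three codes that apply to a document upload (202, 413, 415) can be chosen as shown below. This is a sketch under stated assumptions: the function `upload_status_code` is hypothetical, "20 MB" is read as 20 × 1024 × 1024 bytes, the media type is checked before the size, and the real service may inspect file contents rather than trust the declared content type.

```python
DEFAULT_MAX_PDF_BYTES = 20 * 1024 * 1024  # documented default of 20 MB (admin-configurable)


def upload_status_code(
    content_type: str,
    size_bytes: int,
    max_bytes: int = DEFAULT_MAX_PDF_BYTES,
) -> int:
    """Pick the HTTP status for an upload attempt, following the status-code table."""
    if content_type != "application/pdf":
        return 415  # unsupported media type: not a PDF
    if size_bytes > max_bytes:
        return 413  # payload too large: file exceeds the configured limit
    return 202      # accepted: processing continues in the background


accepted = upload_status_code("application/pdf", 1_024)
wrong_type = upload_status_code("image/png", 1_024)
too_large = upload_status_code("application/pdf", 21 * 1024 * 1024)
```
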
### Backend code style

- Async/await for **all** I/O (DB, HTTP, file).
- `raise HTTPException(status_code=..., detail="...")` for all errors.
- Response models always declared in route decorator: `@router.get("/path", response_model=XOut)`.
- Background tasks via `BackgroundTasks` param; tasks open their own `AsyncSessionLocal` session.
- Commit + refresh pattern after mutations:
  ```python
  await db.commit()
  await db.refresh(obj)
  ```

### Frontend code style

- TypeScript strict mode — no `any`.
- API response types inferred from interfaces in `client.ts` only.
- Error messages displayed inline (no alert); loading shown as disabled state or "…" text.
- All user-facing text is rendered through React JSX (never `innerHTML`).

---

## Default Values & Limits

| Parameter | Value | Location |
|-----------|-------|----------|
| JWT expiry | 480 min (8 h) | `core/security.py` |
| Bcrypt rounds | 13 | `core/security.py` |
| Token localStorage key | `"token"` | `useAuth.ts` |
| Health check interval | 30 s | `service_health.py` |
| Service poll (frontend) | 30 s | `AppsPage.tsx`, `DashboardPage.tsx` |
| Max dashboard pinned apps | 50 | `schemas/user.py` |
| App ID max length | 64 chars | `schemas/user.py` |
| App ID allowed chars | `[a-zA-Z0-9_\-]` | `schemas/user.py` |
| full_name max length | 128 chars | `schemas/user.py` |
| Group name max length | 128 chars | `schemas/group.py` |
| Group description max | 512 chars | `schemas/group.py` |
| Phone max length | 20 chars | `sanitize.py` |
| Position max length | 128 chars | `schemas/profile.py` |
| Address max length | 255 chars | `schemas/profile.py` |
| Document title max | 500 chars | `models/document.py` |
| Category name max | 128 chars | `models/category.py` |
| PDF max size (default) | 20 MB | admin settings (configurable) |
| Raw text cap | 500 k chars | `doc-service` AI client |
| Documents per_page | 1–100, default 20 | `routers/documents.py` |
| AI service timeout | 60 s | `ai_client.py` |
| AI service max retries | 2 | `ai_client.py` |

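The last two rows (60 s timeout, 2 retries) describe how doc-service calls ai-service. A synchronous sketch of that retry policy is shown here. It is illustrative only: the real `ai_client.py` is presumably async and HTTP-based, the name `call_with_retries` is invented, and "max retries 2" is interpreted as one initial attempt plus two retries.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")

AI_TIMEOUT_SECONDS = 60.0
AI_MAX_RETRIES = 2


def call_with_retries(
    request: Callable[[float], T],
    max_retries: int = AI_MAX_RETRIES,
    timeout: float = AI_TIMEOUT_SECONDS,
    backoff: float = 0.0,
) -> T:
    """Call request(timeout); on a timeout or connection error, retry up to max_retries times."""
    last_error: Optional[Exception] = None
    for attempt in range(max_retries + 1):  # first attempt + retries
        try:
            return request(timeout)
        except (TimeoutError, ConnectionError) as exc:
            last_error = exc
            if attempt < max_retries and backoff > 0:
                time.sleep(backoff)
    raise RuntimeError(f"AI service failed after {max_retries + 1} attempts") from last_error


# Usage with a stand-in for the HTTP call: fails twice, then succeeds.
calls: list[float] = []


def flaky_classify(timeout: float) -> str:
    calls.append(timeout)
    if len(calls) < 3:
        raise ConnectionError("ai-service unreachable")
    return "invoice"


result = call_with_retries(flaky_classify)
```
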
---

## Docker Infrastructure

### Services

| Service | Image base | Internal port | User | Volumes | Network |
|---------|-----------|---------------|------|---------|---------|
| `db` | postgres:16-alpine | 5432 | 70:70 | `postgres_data` | backend-net |
| `backend` | python:3.12-slim | 8000 | 1001:1001 | `app_config` | backend-net |
| `ai-service` | python:3.12-slim | 8010 | 1001:1001 | `app_config` | backend-net |
| `doc-service` | python:3.12-slim | 8001 | 1001:1001 | `doc_data`, `app_config` | backend-net |
| `frontend` | nginx-unprivileged:alpine | 8080 | 1001:1001 | — | backend-net, frontend-net |

### Volumes

| Volume | Mount path | Contains |
|--------|-----------|---------|
| `postgres_data` | `/var/lib/postgresql/data` | PostgreSQL data |
| `doc_data` | `/data/documents` | Uploaded PDF files |
| `app_config` | `/config` | Per-service runtime config JSON files |

### Networks

| Network | Host-accessible | Members |
|---------|----------------|---------|
| `backend-net` | No (no host ports in prod) | db, backend, ai-service, doc-service, frontend |
| `frontend-net` | Yes (port 80 → frontend:8080) | frontend |

### Environment variables (required in `backend/.env`)

```
DATABASE_URL=postgresql+asyncpg://<user>:<pass>@db:5432/destroying_sap
CORS_ORIGINS=["http://localhost:5173"]
JWT_PRIVATE_KEY=<PEM, newlines as \n>
JWT_PUBLIC_KEY=<PEM, newlines as \n>
```

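Because the PEM keys are stored on a single line with literal `\n` sequences, they have to be turned back into multi-line PEM text before a JWT library can parse them. One way to do that is sketched below. Whether the project does this in `core/config.py` or relies on its settings loader is not stated in this file, so the helper `decode_pem` is an assumption, and `BASE64DATA` is a placeholder rather than key material.

```python
def decode_pem(env_value: str) -> str:
    """Turn literal backslash-n sequences from a single-line .env value into real newlines."""
    return env_value.replace("\\n", "\n")


# Placeholder value shaped like the JWT_PUBLIC_KEY entry; BASE64DATA is not a real key.
raw_value = "-----BEGIN PUBLIC KEY-----\\nBASE64DATA\\n-----END PUBLIC KEY-----"
pem_text = decode_pem(raw_value)
pem_lines = pem_text.splitlines()
```
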
Injected by docker-compose (not in `.env`):
```
DOC_SERVICE_URL=http://doc-service:8001
AI_SERVICE_URL=http://ai-service:8010
```

---

## Workflows

### STATUS.md workflow

Every directory with runnable code has a `STATUS.md`. These are the canonical **resume point** for each session.

**At the start of every conversation:**
1. Read the `STATUS.md` for every directory you will touch.
2. If it does not exist for a directory you are working in, create it using the structure below.

This applies equally to subagents.

**After making changes**, update affected `STATUS.md` files:
- Add new endpoints / models / routes.
- Move completed items off the **Future work** checklist.
- Add new items to **Known limitations** or **Future work**.
- Keep the **What it is** summary accurate.

**Structure:**
```markdown
# <Service Name> — Status

## What it is
One paragraph: purpose, port, database/storage, how traffic arrives.

## Current functionality
Subsections per router / feature area. Tables for endpoints.

## Architecture
ASCII diagram of call graph / data flow.

## Known limitations / not implemented
Bullet list of known gaps.

## Future work
- [ ] Planned improvements
```

Maintained in: `backend/`, `features/ai-service/`, `features/doc-service/`, `frontend/`

---

### Changelog convention

Every time files are added or modified, append an entry to `changelog/YYYY-MM-DD_<slug>.md`. If today's file already exists, append to it; otherwise create it.

Each entry must include:
- A heading with date and short description
- `**Timestamp:**` in ISO-8601 format
- A **Summary** sentence
- A **Files Added / Modified / Deleted** list with one-line descriptions

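The naming rule and required fields can be sketched as two small helpers. The exact heading level, slug normalisation, and bullet layout are assumptions made for illustration; only the file-name pattern and the four required parts come from the convention above. The names `changelog_path` and `changelog_entry` are hypothetical.

```python
import re
from datetime import datetime, timezone
from pathlib import PurePosixPath


def changelog_path(slug: str, when: datetime) -> PurePosixPath:
    """Build changelog/YYYY-MM-DD_<slug>.md with a lowercase, hyphenated slug."""
    safe_slug = re.sub(r"[^a-z0-9]+", "-", slug.lower()).strip("-")
    return PurePosixPath("changelog") / f"{when:%Y-%m-%d}_{safe_slug}.md"


def changelog_entry(title: str, summary: str, files: list[tuple[str, str]], when: datetime) -> str:
    """Render one entry: heading, ISO-8601 timestamp, summary, and file list."""
    lines = [
        f"## {when:%Y-%m-%d}: {title}",
        "",
        f"**Timestamp:** {when.isoformat()}",
        "",
        f"**Summary:** {summary}",
        "",
        "**Files Added / Modified / Deleted**",
        "",
    ]
    lines += [f"- `{path}`: {note}" for path, note in files]
    return "\n".join(lines) + "\n"


stamp = datetime(2025, 1, 15, 9, 30, tzinfo=timezone.utc)
path = changelog_path("Add Groups API", stamp)
entry = changelog_entry(
    "Add groups API",
    "Adds admin-only group CRUD endpoints.",
    [("backend/app/routers/groups.py", "new router")],
    stamp,
)
```
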
---

### Adding a new resource (checklist)

1. Add ORM model in `backend/app/models/`, import it in `models/__init__.py`
2. Run migration: `docker compose exec backend alembic revision --autogenerate -m "add <resource>"` then `alembic upgrade head`
3. Add Pydantic schemas in `backend/app/schemas/`
4. Add router in `backend/app/routers/`, mount it in `main.py`
5. Add API function(s) to `frontend/src/api/client.ts`
6. Add page component in `frontend/src/pages/`, register route in `App.tsx`
7. Update `STATUS.md` for affected services
8. Add changelog entry

---

### Git convention

Always run `git push` immediately after every `git commit`.

---

### Infrastructure change protocol

After **any** change to Dockerfiles, `docker-compose*.yml`, `nginx.conf`, or setup scripts:

1. **Update `README.md`** — containers table, ports, image names, Current State section.
2. **Dev stack** — verify login and registration end-to-end:
   ```bash
   docker compose -f docker-compose.yml -f docker-compose.dev.yml up --build
   ```
3. **Prod stack** — run the same checks:
   ```bash
   docker compose up --build -d
   ```
4. Confirm non-root users:
   ```bash
   docker inspect <container> --format '{{.Config.User}}'
   ```
5. **Tear down** after testing:
   ```bash
   docker compose down --volumes --remove-orphans
   ```

---

### Security hook

`.githooks/pre-commit` (registered via `git config core.hooksPath .githooks`) runs `scripts/security_check.py` in Docker. New clones must run:
```bash
git config core.hooksPath .githooks
```

See **Security Standards → Pre-commit security hook** for the full list of checks.

**Never** bypass with `--no-verify` unless explicitly instructed by the user.