Business-Management/CLAUDE.md
curo1305 00466a9801 Add generic plugin architecture and watch-directory feature
Introduces a manifest contract so feature containers self-describe their
settings (JSON Schema + access rules). Backend and frontend gain generic
plugin proxy and dynamic Extensions UI with zero feature-specific code.

Doc-service is the first plugin consumer: exposes /plugin/manifest and
/plugin/settings, adds a watchdog-based file watcher that auto-ingests
PDFs from a mounted directory, maps subfolders to categories, supports
AI-suggested folder/filename (user-confirmed), and enforces a no-remove
policy. Access is gated by is_superuser or doc-service-admin group.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-18 02:09:50 +02:00

# CLAUDE.md
This file provides permanent, authoritative guidance to Claude Code for every session. All sections below reflect the actual codebase state and must be kept up-to-date as the project evolves.
## CLAUDE.md self-update checkpoint
**After every change to the codebase**, before committing, check whether CLAUDE.md needs updating:
- New route added → update **All API Endpoints** and **Frontend Routes** tables
- New DB model or column → update **Database Models**
- New migration → update **Migration chains**
- New file or directory → update **File & Folder Tree**
- New limit added or default value changed → update **Default Values & Limits**
- New dependency, auth mechanism, or security pattern → update **Security Standards**
- New Docker service, volume, network, or env var → update **Docker Infrastructure**
- Stack version changed → update **Stack**
This check is mandatory — treat it the same as updating STATUS.md.
---
## Stack
| Layer | Tech |
|---|---|
| Backend | FastAPI (async), SQLAlchemy 2 (async), Alembic, PostgreSQL 16 |
| Auth | JWT RS256 via `python-jose`, bcrypt via `bcrypt` (direct, 13 rounds) |
| Frontend | React 18, TypeScript, Vite, React Router v6, TanStack Query, Axios |
| UI Library | shadcn/ui (Radix primitives + Tailwind CSS v3) |
| Styling | Tailwind CSS v3, CSS custom properties for theme tokens |
| Containerisation | Docker Compose (5 services, non-root users, named volumes) |
---
## Commands
All test, build, and package-manager commands run **inside Docker** — never on the host. See the memory note: "Testing inside Docker only".
### Migrations (run in Docker)
```bash
docker compose exec backend alembic revision --autogenerate -m "describe change"
docker compose exec backend alembic upgrade head
docker compose exec backend alembic downgrade -1
```
### Lint (run in Docker)
```bash
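# Run each command through `docker compose exec`; with `&&`, the second command would run on the host
docker compose exec backend ruff check .
docker compose exec backend ruff format .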
docker compose exec frontend npm run typecheck
docker compose exec frontend npm run lint
```
### Full stack
```bash
# Dev stack (hot-reload, Vite on :5173)
cp .env.example backend/.env
docker compose -f docker-compose.yml -f docker-compose.dev.yml up --build
# Prod stack
docker compose up --build -d
```
---
## File & Folder Tree
```
/
├── CLAUDE.md  ← This file — authoritative session context
├── README.md  ← Project overview, containers table, Current State
├── TODO.md  ← Task list
├── .env.example  ← Template for backend/.env
├── docker-compose.yml  ← Production (5 services, named volumes)
├── docker-compose.dev.yml  ← Dev overrides (hot-reload, host ports)
├── .githooks/pre-commit  ← Runs scripts/security_check.py before every commit
├── scripts/security_check.py  ← Static analysis: secrets, weak crypto, SQLi, JWT
├── changelog/YYYY-MM-DD_<slug>.md  ← Per-date change logs
├── dev-watch/  ← Dev bind-mount for file watcher testing (.gitkeep only)
├── backend/  ← FastAPI gateway (port 8000, internal)
│   ├── app/
│   │   ├── main.py  ← App factory, router registration, lifespan (health loop)
│   │   ├── database.py  ← AsyncEngine, AsyncSessionLocal, Base
│   │   ├── deps.py  ← get_current_user, get_current_admin, check_plugin_access
│   │   ├── core/
│   │   │   ├── config.py  ← All settings via pydantic-settings (reads .env)
│   │   │   ├── security.py  ← JWT sign/verify (RS256), bcrypt hash/verify
│   │   │   ├── sanitize.py  ← Input sanitization helpers (see Security Standards)
│   │   │   └── app_config.py  ← Per-service config load/save to /config volume; theme files in /config/themes/
│   │   ├── models/
│   │   │   ├── __init__.py  ← Imports all models (required for Alembic autogenerate)
│   │   │   ├── user.py  ← User model (see Database Models)
│   │   │   ├── profile.py  ← Profile model
│   │   │   └── group.py  ← Group, GroupMembership models
│   │   ├── schemas/
│   │   │   ├── user.py  ← UserCreate/Out, Token, DashboardPrefsOut/Update
│   │   │   ├── profile.py  ← ProfileRead, ProfileUpdate
│   │   │   └── group.py  ← GroupCreate/Update/Out/DetailOut, GroupMemberOut
│   │   ├── routers/
│   │   │   ├── auth.py  ← POST /register, POST /login
│   │   │   ├── users.py  ← GET /me, GET+PATCH /me/preferences, PATCH /me/color-mode
│   │   │   ├── profile.py  ← GET+PUT /me (profile)
│   │   │   ├── admin.py  ← User admin CRUD (admin-only)
│   │   │   ├── groups.py  ← Group CRUD + member management (admin-only)
│   │   │   ├── settings.py  ← AI, doc limits, system prompts, appearance, themes (admin-only)
│   │   │   ├── services.py  ← GET /services (health status)
│   │   │   ├── plugins.py  ← Generic plugin proxy (GET/PATCH /api/plugins/*)
│   │   │   ├── categories_proxy.py  ← Transparent proxy → doc-service /categories/*
│   │   │   └── documents_proxy.py  ← Transparent proxy → doc-service /documents/*
│   │   └── services/
│   │       └── service_health.py  ← Background 30s health-check loop; caches /plugin/manifest per service
│   ├── alembic/
│   │   ├── env.py  ← Async migration runner
│   │   └── versions/  ← Migration chain (see Migrations section)
│   ├── scripts/seed.py  ← Seed test user
│   ├── Dockerfile  ← python:3.12-slim, non-root user 1001
│   └── STATUS.md
├── features/
│   ├── ai-service/  ← AI provider intermediary (port 8010, internal)
│   │   ├── app/
│   │   │   ├── main.py  ← FastAPI, queue worker lifespan
│   │   │   ├── routers/chat.py  ← POST /chat (sync, NORMAL priority queue)
│   │   │   ├── routers/health.py  ← GET /health
│   │   │   ├── routers/queue.py  ← GET /queue/status, /pause, /resume, /cancel/{id}
│   │   │   ├── providers/base.py  ← AIProvider abstract class
│   │   │   ├── providers/anthropic_provider.py
│   │   │   ├── providers/openai_compat.py  ← Ollama / LM Studio
│   │   │   └── services/queue.py  ← Priority queue (CRITICAL > HIGH > NORMAL)
│   │   ├── Dockerfile
│   │   └── STATUS.md
│   │
│   └── doc-service/  ← PDF extraction microservice (port 8001, internal)
│       ├── app/
│       │   ├── main.py  ← FastAPI, lifespan (file watcher start/stop)
│       │   ├── database.py  ← Same PostgreSQL instance as backend
│       │   ├── deps.py  ← get_user_id (reads x-user-id header)
│       │   ├── models/
│       │   │   ├── document.py  ← Document model (see Database Models)
│       │   │   ├── category.py  ← DocumentCategory model
│       │   │   └── category_assignment.py  ← CategoryAssignment (composite PK)
│       │   ├── schemas/
│       │   │   ├── document.py  ← DocumentOut, DocumentPage, DocumentStatusOut, etc.
│       │   │   └── category.py  ← CategoryOut, CategoryCreate, CategoryUpdate
│       │   ├── routers/
│       │   │   ├── documents.py  ← Full document CRUD + file serving + reprocess + suggestion endpoints
│       │   │   ├── categories.py  ← Category CRUD (includes watch-owned categories)
│       │   │   └── plugin.py  ← GET /plugin/manifest, GET+PATCH /plugin/settings
│       │   └── services/
│       │       ├── storage.py  ← File I/O
│       │       ├── ai_client.py  ← classify_document() → ai-service:8010/chat
│       │       ├── config_reader.py  ← Config load/save including storage/watch settings
│       │       └── file_watcher.py  ← watchdog-based PDF watcher + startup scan + ingestion
│       ├── alembic/versions/  ← Doc-service migration chain
│       │   └── 0003_add_watch_columns.py  ← source, watch_path, suggested_folder, suggested_filename
│       ├── Dockerfile
│       └── STATUS.md
└── frontend/  ← React SPA (port 5173 dev / 80 prod)
    ├── src/
    │   ├── main.tsx  ← React root, QueryClientProvider, BrowserRouter
    │   ├── App.tsx  ← Route tree, PrivateRoute, AdminRoute
    │   ├── api/client.ts  ← Axios instance + ALL API functions (single source of truth)
    │   ├── hooks/
    │   │   ├── useAuth.ts  ← Token state (localStorage), login/logout
    │   │   └── useTheme.ts  ← Theme toggle
    │   ├── components/
    │   │   ├── AppShell.tsx  ← Layout: Sidebar + scrollable main
    │   │   ├── Sidebar.tsx  ← Collapsible nav; "Extensions" section auto-populated from /api/plugins
    │   │   ├── ThemeToggle.tsx  ← Light/dark mode toggle
    │   │   ├── PluginSchemaForm.tsx  ← JSON Schema → React form (boolean/string/number/readOnly)
    │   │   └── ui/  ← shadcn/ui components (Button, Input, …)
    │   ├── pages/  ← One file per route (see Routes section)
    │   │   └── PluginSettingsPage.tsx  ← Generic plugin settings page driven by manifest
    │   ├── lib/utils.ts  ← cn() = clsx + tailwind-merge
    │   └── styles/theme.css  ← CSS custom properties, Tailwind setup
    ├── vite.config.ts  ← /api/* proxied to backend:8000
    ├── tailwind.config.ts
    ├── components.json  ← shadcn/ui config
    ├── Dockerfile  ← Multi-stage: Node build → nginx-unprivileged
    └── STATUS.md
```
---
## Architecture
### Request flow
```
Browser (:5173 dev / :80 prod)
  └── Vite dev proxy / nginx
        └── /api/* ──→ backend:8000 (FastAPI)
              ┌───────────────┼───────────────────┐
            /auth          /admin           /documents/*
            /users         /groups          /documents/categories/*
            /profile       /settings
            /services         │                   │
                         JSON volume  proxy (injects x-user-id)
                          (/config)               │
                                          doc-service:8001
                                           ai-service:8010
                                          (classify, chat)
```
### Auth flow
1. `POST /api/auth/login` → RS256 JWT (8 h), stored in `localStorage`
2. Axios interceptor injects `Authorization: Bearer {token}` on every request
3. `get_current_user` dep validates token on every protected route
4. Admin routes additionally check `user.is_superuser`; return 404 (not 403) if not admin
---
## Database Models
### Backend (`users`, `profiles`, `groups`, `group_memberships`)
**`users`**
| Column | Type | Constraints | Notes |
|--------|------|-------------|-------|
| `id` | String | PK, UUID | auto-generated |
| `email` | String | UNIQUE, indexed, NOT NULL | lowercased before storing |
| `hashed_password` | String | NOT NULL | bcrypt 13 rounds |
| `full_name` | String | nullable | sanitized max 128 chars |
| `is_active` | Boolean | default=True | soft-delete flag |
| `is_superuser` | Boolean | default=False | admin role; never exposed as-is (serialised as `is_admin`) |
| `dashboard_app_ids` | JSON | NOT NULL, default=[] | list of pinned service IDs |
| `color_mode` | String | nullable, default=NULL | user's preferred mode: "light" / "dark" / "system" / NULL (use admin default) |
Relationship: `profile` (one-to-one, cascade all+delete-orphan)
**`profiles`**
| Column | Type | Constraints | Notes |
|--------|------|-------------|-------|
| `id` | String | PK, UUID | auto-generated |
| `user_id` | String | FK→users.id UNIQUE, cascade delete | one-to-one |
| `phone` | String(20) | nullable | validated format |
| `date_of_birth` | Date | nullable | 1900+ and not future |
| `position` | String(128) | nullable | job title |
| `address` | String(255) | nullable | |
| `updated_at` | DateTime(tz) | server_default=now(), onupdate=now() | |
**`groups`**
| Column | Type | Constraints |
|--------|------|-------------|
| `id` | String | PK, UUID |
| `name` | String(128) | UNIQUE indexed, NOT NULL |
| `description` | String(512) | nullable |
| `created_at` | DateTime(tz) | server_default=now() |
**`group_memberships`**
| Column | Type | Constraints |
|--------|------|-------------|
| `id` | String | PK, UUID |
| `group_id` | String | FK→groups.id, indexed, CASCADE |
| `user_id` | String | FK→users.id, indexed, CASCADE |
| `joined_at` | DateTime(tz) | server_default=now() |
Unique constraint: `(group_id, user_id)`
### Doc-service (`documents`, `document_categories`, `document_category_assignments`)
**`documents`**
| Column | Type | Constraints | Notes |
|--------|------|-------------|-------|
| `id` | String | PK, UUID | |
| `user_id` | String | indexed | not FK — trusts x-user-id header |
| `filename` | String | NOT NULL | |
| `file_path` | String | NOT NULL | absolute path under /data/documents |
| `file_size` | Integer | NOT NULL | bytes |
| `status` | String | default="pending" | pending / processing / done / failed |
| `title` | String(500) | nullable | AI-extracted |
| `document_type` | String | nullable | invoice / bill / receipt / order / expense / revenue / unknown |
| `raw_text` | Text | nullable | first 500 k chars |
| `extracted_data` | Text | nullable | JSON string |
| `tags` | Text | nullable | JSON array string |
| `error_message` | String(500) | nullable | |
| `created_at` | DateTime(tz) | server_default=now() | |
| `processed_at` | DateTime(tz) | nullable | |
| `source` | String(16) | default="upload" | "upload" or "watch" |
| `watch_path` | String | nullable | original absolute path in watch directory |
| `suggested_folder` | String(128) | nullable | AI-suggested category (pending user confirm) |
| `suggested_filename` | String(500) | nullable | AI-suggested title/rename (pending user confirm) |
**`document_categories`**
| Column | Type | Constraints |
|--------|------|-------------|
| `id` | String | PK, UUID |
| `user_id` | String | indexed |
| `name` | String(128) | NOT NULL |
| `created_at` | DateTime(tz) | server_default=now() |
**`document_category_assignments`** (composite PK)
| Column | Type | Constraints |
|--------|------|-------------|
| `document_id` | String | PK + FK→documents.id CASCADE |
| `category_id` | String | PK + FK→document_categories.id CASCADE |
### Migration chains
**Backend** (must be applied in order):
| Rev ID | Slug |
|--------|------|
| `38efeff7c45a` | `create_users_table` |
| `676084df61d1` | `add_profiles_table` |
| `a3f9c2d14e87` | `add_groups_and_group_memberships` |
| `c7e8f9a0b1d2` | `add_dashboard_app_ids_to_users` |
| `dd6ad2f2c211` | `add_color_mode_to_users` |
**Doc-service**:
| Rev ID | Slug |
|--------|------|
| `0001` | `create_doc_tables` |
| `0002` | `add_document_title` |
| `0003` | `add_watch_columns` |
---
## All API Endpoints
### Auth (`/api/auth`) — public
| Method | Path | Auth | Description |
|--------|------|------|-------------|
| POST | `/api/auth/register` | — | Create account; returns `UserOut`; enforces password policy |
| POST | `/api/auth/login` | — | OAuth2 password flow; returns `{access_token, token_type}` |
### Users (`/api/users`) — authenticated
| Method | Path | Auth | Description |
|--------|------|------|-------------|
| GET | `/api/users/me` | user | Current user info → `UserOut` |
| GET | `/api/users/me/preferences` | user | Dashboard pinned app IDs → `{app_ids}` |
| PATCH | `/api/users/me/preferences` | user | Save pinned app IDs (max 50, slug-safe) |
| PATCH | `/api/users/me/color-mode` | user | Save colour mode preference ("light"/"dark"/"system") |
### Profile (`/api/profile`) — authenticated
| Method | Path | Auth | Description |
|--------|------|------|-------------|
| GET | `/api/profile/me` | user | Fetch profile; auto-creates if missing |
| PUT | `/api/profile/me` | user | Update profile fields |
### Admin — Users (`/api/admin`) — admin-only
| Method | Path | Description |
|--------|------|-------------|
| GET | `/api/admin/users` | List all users → `list[UserAdminOut]` |
| POST | `/api/admin/users` | Create user (with optional is_admin) |
| DELETE | `/api/admin/users/{user_id}` | Delete user (204) |
| PATCH | `/api/admin/users/{user_id}/active` | Toggle active status |
### Admin — Groups (`/api/admin/groups`) — admin-only
| Method | Path | Description |
|--------|------|-------------|
| GET | `/api/admin/groups` | List groups with member count |
| POST | `/api/admin/groups` | Create group |
| GET | `/api/admin/groups/{id}` | Group detail + members |
| PATCH | `/api/admin/groups/{id}` | Update name / description |
| DELETE | `/api/admin/groups/{id}` | Delete (cascades memberships) |
| POST | `/api/admin/groups/{id}/members/{user_id}` | Add member |
| DELETE | `/api/admin/groups/{id}/members/{user_id}` | Remove member |
### Settings (`/api/settings`) — admin-only
| Method | Path | Description |
|--------|------|-------------|
| GET | `/api/settings/ai` | AI config (keys masked) |
| PATCH | `/api/settings/ai` | Update AI provider / credentials |
| POST | `/api/settings/ai/test` | Test AI connection |
| GET | `/api/settings/documents/limits` | PDF upload limits |
| PATCH | `/api/settings/documents/limits` | Update max PDF size |
| GET | `/api/settings/system-prompts` | All editable system prompts |
| PATCH | `/api/settings/system-prompts/{service_id}` | Update system prompt |
| GET | `/api/settings/appearance` | Active theme + default mode (auth) |
| PATCH | `/api/settings/appearance` | Update active theme + default mode (admin) |
| GET | `/api/settings/themes` | List all themes — built-in + custom (auth) |
| POST | `/api/settings/themes` | Create custom theme (admin) |
| PATCH | `/api/settings/themes/{id}` | Update custom theme label/colours (admin) |
| DELETE | `/api/settings/themes/{id}` | Delete custom theme (admin, 204) |
### Services (`/api/services`) — authenticated
| Method | Path | Description |
|--------|------|-------------|
| GET | `/api/services` | Health status of all registered services → `list[ServiceStatus]` |
### Documents (`/api/documents/*`) — authenticated, proxied to doc-service
| Method | Path | Description |
|--------|------|-------------|
| POST | `/api/documents/upload` | Upload PDF (202, background processing) |
| GET | `/api/documents` | Paginated list (filterable: search, status, type, category, sort) |
| GET | `/api/documents/{id}` | Document detail |
| GET | `/api/documents/{id}/status` | Processing status only |
| PATCH | `/api/documents/{id}/type` | Update document type |
| PATCH | `/api/documents/{id}/tags` | Update tags |
| PATCH | `/api/documents/{id}/title` | Update title |
| POST | `/api/documents/{id}/reprocess` | Re-run AI extraction |
| DELETE | `/api/documents/{id}` | Delete document (204) |
| GET | `/api/documents/{id}/file` | Download PDF (streaming) |
| POST | `/api/documents/{id}/categories/{cat_id}` | Assign category |
| DELETE | `/api/documents/{id}/categories/{cat_id}` | Remove category |
| POST | `/api/documents/{id}/suggestions/folder/confirm` | Confirm AI folder suggestion |
| POST | `/api/documents/{id}/suggestions/folder/reject` | Reject AI folder suggestion |
| POST | `/api/documents/{id}/suggestions/filename/confirm` | Confirm AI filename suggestion |
| POST | `/api/documents/{id}/suggestions/filename/reject` | Reject AI filename suggestion |
### Categories (`/api/documents/categories/*`) — authenticated, proxied to doc-service
| Method | Path | Description |
|--------|------|-------------|
| GET | `/api/documents/categories` | List user's categories |
| POST | `/api/documents/categories` | Create category (triggers background AI reanalysis) |
| PATCH | `/api/documents/categories/{id}` | Rename |
| DELETE | `/api/documents/categories/{id}` | Delete (204) |
### Plugins (`/api/plugins`) — authenticated, auth-per-plugin
| Method | Path | Description |
|--------|------|-------------|
| GET | `/api/plugins` | List plugins accessible to current user |
| GET | `/api/plugins/{id}/manifest` | Plugin manifest with settings JSON Schema (auth-gated) |
| GET | `/api/plugins/{id}/settings` | Proxy to feature `/plugin/settings` (auth-gated) |
| PATCH | `/api/plugins/{id}/settings` | Proxy to feature `/plugin/settings` (auth-gated) |
Access requires `is_superuser` or membership in a group listed in the manifest's `required_groups`. Otherwise the backend returns 404 (not 403) to hide the plugin's existence.
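The access rule above can be sketched as a pure function. This is a minimal illustration, not the project's `check_plugin_access` dependency: the names `can_access_plugin` and `status_for` and the argument shapes are assumptions. Only the rule itself (superuser, or membership in a `required_groups` group, else 404) comes from this document.

```python
from collections.abc import Iterable


def can_access_plugin(
    is_superuser: bool,
    user_groups: Iterable[str],
    required_groups: Iterable[str],
) -> bool:
    """Return True if the user may see the plugin (hypothetical helper)."""
    if is_superuser:
        return True
    # Any overlap between the user's groups and the manifest's list grants access.
    return not set(user_groups).isdisjoint(required_groups)


def status_for(allowed: bool) -> int:
    """Denied requests answer 404, never 403, so the plugin stays hidden."""
    return 200 if allowed else 404
```

Keeping the decision in a function without framework imports makes it straightforward to unit-test separately from the proxy route.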
### AI-service (internal only — not exposed to browser)
| Method | Path | Description |
|--------|------|-------------|
| POST | `/chat` | Chat request (queued at NORMAL priority) |
| GET | `/health` | Health check |
| GET | `/queue/status` | Queue state |
| POST | `/queue/pause` | Pause queue |
| POST | `/queue/resume` | Resume queue |
| POST | `/queue/cancel/{job_id}` | Cancel job |
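The priority ordering used by the queue (CRITICAL > HIGH > NORMAL) can be illustrated with a small in-memory sketch. This is not the code in `services/queue.py`: the class names `Priority` and `JobQueue`, the use of `heapq`, and FIFO ordering within a priority level are all assumptions made for illustration, and pause, resume, and cancel are left out.

```python
import heapq
import itertools
from enum import IntEnum


class Priority(IntEnum):
    # Lower value is served first, so CRITICAL outranks HIGH, which outranks NORMAL.
    CRITICAL = 0
    HIGH = 1
    NORMAL = 2


class JobQueue:
    """In-memory priority queue with FIFO order inside each priority level."""

    def __init__(self) -> None:
        self._heap: list[tuple[int, int, str]] = []
        self._counter = itertools.count()  # tie-breaker that preserves insertion order

    def put(self, job_id: str, priority: Priority = Priority.NORMAL) -> None:
        heapq.heappush(self._heap, (int(priority), next(self._counter), job_id))

    def get(self) -> str:
        _, _, job_id = heapq.heappop(self._heap)
        return job_id

    def __len__(self) -> int:
        return len(self._heap)
```

In this sketch a `/chat` request would be enqueued with the default `Priority.NORMAL`, so any CRITICAL or HIGH job submitted later is still served first.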
---
## Frontend Routes
| Path | Component | Guard |
|------|-----------|-------|
| `/login` | `LoginPage` | Public |
| `/` | `DashboardPage` | PrivateRoute |
| `/apps` | `AppsPage` | PrivateRoute |
| `/apps/documents` | `DocumentsPage` | PrivateRoute |
| `/apps/documents/settings/admin` | `DocumentAdminSettingsPage` | AdminRoute |
| `/apps/ai/settings/admin` | `AIAdminSettingsPage` | AdminRoute |
| `/profile` | `ProfilePage` | PrivateRoute |
| `/settings` | `SettingsPage` | PrivateRoute |
| `/settings/plugins/:id` | `PluginSettingsPage` | PrivateRoute (auth enforced per-plugin by backend) |
| `/admin` | `AdminPage` (→ `/admin/users`) | AdminRoute |
| `/admin/users` | `AdminUsersPage` | AdminRoute |
| `/admin/groups` | `AdminGroupsPage` | AdminRoute |
| `/admin/appearance` | `AdminAppearancePage` | AdminRoute |
| `*` | redirect to `/` | — |
`PrivateRoute` — checks `token` from `useAuth`, redirects to `/login` if absent.
`AdminRoute` — checks token AND queries `GET /api/users/me` for `is_admin`; waits for query to avoid flash; redirects to `/login` (not `/`) if not admin.
---
## Security Standards
These standards are **non-negotiable**. Every change must comply.
### JWT
- **Algorithm**: RS256 (4096-bit RSA key pair, generated by `scripts/generate_jwt_keys.py`)
- **Keys**: PEM-encoded in `backend/.env` as `JWT_PRIVATE_KEY` / `JWT_PUBLIC_KEY` (gitignored)
- **Expiry**: 8 hours (`EXPIRE_MINUTES=480`) — never set longer; no refresh tokens
- **Claims**: `{sub: user_id, exp, iat}` — user_id is a UUID string
- **Validation**: `decode_access_token()` in `core/security.py`; called by `get_current_user`
- **Never**: set algorithm to `"none"`, disable `verify_exp`, or hardcode secrets in code
### Password hashing
- **Algorithm**: bcrypt, **13 rounds** (`bcrypt.gensalt(rounds=13)`)
- **Timing**: ~300 ms per hash (intentional brute-force resistance)
- **Never** use MD5, SHA1, or plain SHA256 for password storage
### Password policy (enforced in `UserCreate` schema)
All of the following must pass:
- ≥ 8 characters
- ≥ 1 uppercase (A-Z)
- ≥ 1 lowercase (a-z)
- ≥ 1 digit (0-9)
- ≥ 1 special character: `!@#$%^&*()\-_=+[]{}|;:'"<>?/\`~`
- No common words (password, secret, login, admin, test, qwerty, welcome, …)
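The policy above could be checked with a function along these lines. It is a sketch, not the validator in the `UserCreate` schema: the name `password_policy_errors` and the error messages are hypothetical, the special-character set assumes the backslash is part of the list, and `COMMON_WORDS` holds only the words named above (the real list is longer).

```python
import re

# Assumption: mirrors the special-character list above, backslash included.
SPECIAL_CHARS = set("!@#$%^&*()-_=+[]{}|;:'\"<>?/\\`~")
# Assumption: only the words this document names; the project's list has more.
COMMON_WORDS = ("password", "secret", "login", "admin", "test", "qwerty", "welcome")


def password_policy_errors(password: str) -> list[str]:
    """Return every rule the password violates; an empty list means it passes."""
    errors: list[str] = []
    if len(password) < 8:
        errors.append("at least 8 characters required")
    if not re.search(r"[A-Z]", password):
        errors.append("at least 1 uppercase letter required")
    if not re.search(r"[a-z]", password):
        errors.append("at least 1 lowercase letter required")
    if not re.search(r"[0-9]", password):
        errors.append("at least 1 digit required")
    if not any(ch in SPECIAL_CHARS for ch in password):
        errors.append("at least 1 special character required")
    lowered = password.lower()
    if any(word in lowered for word in COMMON_WORDS):
        errors.append("contains a common word")
    return errors
```

Collecting all violations rather than stopping at the first lets a Pydantic validator report them in a single error message.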
### Input sanitization
Every user-supplied string stored in the database **must** pass through `core/sanitize.py`:
```python
sanitize_str(value, max_len=255)
# → strips whitespace; rejects null bytes (\x00); rejects control chars
# (0x01-0x1F, 0x7F except \t \n \r); enforces max_len; returns None for ""
normalize_email(value) # lowercase + strip
validate_phone(value) # sanitize_str(max=20) + regex ^\+?[\d\s\-()\[\]]{7,20}$
validate_date_of_birth(v) # must be ≥ 1900, not future
```
Apply via Pydantic `@field_validator` on all request schemas.
### XSS prevention
- React JSX text interpolation (`{value}`) is HTML-escaped by the DOM renderer — **never** use `dangerouslySetInnerHTML` with user-supplied content.
- Server-side `sanitize_str` provides defense-in-depth (control-character rejection, max length).
### SQL injection prevention
- Use SQLAlchemy ORM (bound parameters) — **never** raw SQL strings.
- If `text()` is needed, use `bindparam()` for all user-supplied values.
- **Never** use f-strings, `.format()`, or `%`-formatting for SQL.
### Admin route security
- Use `get_current_admin` dependency (checks `is_superuser`).
- Return **404** (not 403) for unauthorized access — hides both endpoint existence and permission model.
### Network isolation
- `backend-net`: db, backend, ai-service, doc-service, and frontend (nginx proxies `/api/*` to the backend over this network); not reachable from host in prod.
- `frontend-net`: only frontend; single host port (80 prod / 5173 dev).
- DB, backend, doc-service, ai-service have **no** host port bindings in prod.
### Pre-commit security hook
`.githooks/pre-commit` runs `scripts/security_check.py` against the staged changes before every commit. It blocks commits that contain:
1. Hardcoded credentials / private keys / AWS creds
2. `eval()`, `exec()`, `shell=True`, `pickle.loads()`, `yaml.load()` without SafeLoader
3. MD5, SHA1, DES, `random.random()` / `random.randint()` for security use
4. SQL f-strings / format strings / concatenation passed to `execute()`/`query()`
5. JWT algorithm `"none"`, `verify_exp=False`, expiry > 9999 min, hardcoded secrets
6. `debug=True`, `print()` with passwords
7. `bandit` static analysis failures
**Never** bypass with `--no-verify` unless explicitly instructed by the user.
---
## Frontend Patterns & Conventions
### API client (`src/api/client.ts`)
Single Axios instance — **all** API calls live here, nowhere else:
```typescript
const api = axios.create({ baseURL: "/api" });
api.interceptors.request.use((config) => {
const token = localStorage.getItem("token");
if (token) config.headers.Authorization = `Bearer ${token}`;
return config;
});
```
Adding a new API call:
1. Define a TypeScript interface for the response if it's new.
2. Add a named export function (`getX`, `createX`, `updateX`, `deleteX`).
3. Use `api.get<T>(...)`, `api.post<T>(...)`, etc.; always `.then((r) => r.data)`.
### TanStack Query conventions
**Query keys** (flat arrays, lowercase):
```typescript
["me"] // current user
["services"] // service health list
["dashboard-prefs"] // user dashboard preferences
["categories"] // document categories
["documents", params] // document list (params object for cache isolation)
["document", id] // single document
["plugins"] // accessible plugin list (filtered by user access)
["plugin-manifest", id] // plugin manifest (cached)
["plugin-settings", id] // plugin current settings
```
**Mutation pattern**:
```typescript
const mutation = useMutation({
mutationFn: apiFunction,
onSuccess: () => {
queryClient.invalidateQueries({ queryKey: ["affected-key"] });
// additional side effects (close dialog, reset form, etc.)
},
});
// Usage:
mutation.mutate(data);
mutation.isPending // show spinner / disable button
mutation.isError // show error message
```
**Polling**:
```typescript
useQuery({ queryKey: ["services"], queryFn: getServices,
refetchInterval: 30_000, refetchIntervalInBackground: true });
```
### Route guards
```typescript
// PrivateRoute — redirect to /login if no token
// AdminRoute — redirect to /login if no token OR not admin
// (waits for getMe() query to avoid flash; uses 404 semantics)
```
### Component patterns
- Functional components only.
- Local `useState` for UI-only state (edit mode, pending values, open/closed).
- Server state via `useQuery` / `useMutation` — no duplicated local copies.
- `cn()` from `lib/utils.ts` for conditional Tailwind classes.
- `lucide-react` for all icons.
- Never use `dangerouslySetInnerHTML` with user-supplied content.
---
## Naming & Code Conventions
### Database
- **Tables**: lowercase, plural, snake_case (`users`, `group_memberships`, `document_category_assignments`)
- **Columns**: lowercase, snake_case
- **ORM models**: PascalCase, singular (`User`, `Group`, `GroupMembership`, `Document`)
- Primary keys: `id` (String UUID, auto-generated)
- Timestamps: `created_at` / `updated_at` / `joined_at` / `processed_at` — always timezone-aware
### Pydantic schemas
| Suffix | Purpose |
|--------|---------|
| `Create` | POST request body (user-supplied input) |
| `Update` | PATCH request body (partial update) |
| `Out` | API response (safe subset of model) |
| `AdminOut` | Extended response for admin endpoints |
| `Read` | GET response (same as `Out`, used for profiles) |
Always set `model_config = {"from_attributes": True}` on response schemas.
Use `validation_alias` when the ORM field name differs from the JSON key (e.g., `is_superuser` → `is_admin`).
### HTTP status codes
| Code | Use |
|------|-----|
| 200 | Successful GET / PATCH / PUT |
| 201 | Successful POST that creates a resource |
| 202 | Accepted (async processing started, e.g., document upload) |
| 204 | Successful DELETE or action with no response body |
| 400 | Bad request (duplicates, invalid data beyond Pydantic) |
| 401 | Missing / invalid JWT |
| 404 | Not found **and** admin routes when not admin |
| 413 | Payload too large (file exceeds limit) |
| 415 | Unsupported media type (not a PDF) |
| 422 | Pydantic validation failure (FastAPI default) |
| 502 | Downstream service unreachable |
| 503 | Service unavailable (queue stopped, AI error) |
| 504 | Gateway timeout |
### Backend code style
- Async/await for **all** I/O (DB, HTTP, file).
- `raise HTTPException(status_code=..., detail="...")` for all errors.
- Response models always declared in route decorator: `@router.get("/path", response_model=XOut)`.
- Background tasks via `BackgroundTasks` param; tasks open their own `AsyncSessionLocal` session.
- Commit + refresh pattern after mutations:
```python
await db.commit()
await db.refresh(obj)
```
### Frontend code style
- TypeScript strict mode — no `any`.
- API response types inferred from interfaces in `client.ts` only.
- Error messages displayed inline (no alert); loading shown as disabled state or "…" text.
- All user-facing text: safe via React JSX rendering (not innerHTML).
---
## Default Values & Limits
| Parameter | Value | Location |
|-----------|-------|----------|
| JWT expiry | 480 min (8 h) | `core/security.py` |
| Bcrypt rounds | 13 | `core/security.py` |
| Token localStorage key | `"token"` | `useAuth.ts` |
| Health check interval | 30 s | `service_health.py` |
| Service poll (frontend) | 30 s | `AppsPage.tsx`, `DashboardPage.tsx` |
| User `color_mode` default | NULL (falls back to admin default_mode, then system) | `models/user.py` |
| Max dashboard pinned apps | 50 | `schemas/user.py` |
| App ID max length | 64 chars | `schemas/user.py` |
| App ID allowed chars | `[a-zA-Z0-9_\-]` | `schemas/user.py` |
| full_name max length | 128 chars | `schemas/user.py` |
| Group name max length | 128 chars | `schemas/group.py` |
| Group description max | 512 chars | `schemas/group.py` |
| Phone max length | 20 chars | `sanitize.py` |
| Position max length | 128 chars | `schemas/profile.py` |
| Address max length | 255 chars | `schemas/profile.py` |
| Document title max | 500 chars | `models/document.py` |
| Category name max | 128 chars | `models/category.py` |
| PDF max size (default) | 20 MB | admin settings (configurable) |
| Raw text cap | 500 k chars | `doc-service` AI client |
| Documents per_page | 1-100, default 20 | `routers/documents.py` |
| AI service timeout | 60 s | `ai_client.py` |
| AI service max retries | 2 | `ai_client.py` |
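As an example of how the limits in this table translate into validation code, the pinned-app rules (max 50 IDs, max 64 chars, `[a-zA-Z0-9_\-]`) could be enforced as below. The function name `validate_app_ids` is hypothetical, and a minimum length of 1 is an assumption. The real check lives in `schemas/user.py`.

```python
import re

MAX_APP_IDS = 50
# Allowed characters and 64-char cap from the table; minimum length 1 is assumed.
APP_ID_PATTERN = re.compile(r"[a-zA-Z0-9_\-]{1,64}")


def validate_app_ids(app_ids: list[str]) -> list[str]:
    """Return the list unchanged if every rule holds, else raise ValueError."""
    if len(app_ids) > MAX_APP_IDS:
        raise ValueError(f"at most {MAX_APP_IDS} app IDs may be pinned")
    for app_id in app_ids:
        # fullmatch ensures the whole ID, not just a prefix, is slug-safe.
        if not APP_ID_PATTERN.fullmatch(app_id):
            raise ValueError(f"invalid app ID: {app_id!r}")
    return app_ids
```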
---
## Docker Infrastructure
### Services
| Service | Image base | Internal port | User | Volumes | Network |
|---------|-----------|---------------|------|---------|---------|
| `db` | postgres:16-alpine | 5432 | 70:70 | `postgres_data` | backend-net |
| `backend` | python:3.12-slim | 8000 | 1001:1001 | `app_config` | backend-net |
| `ai-service` | python:3.12-slim | 8010 | 1001:1001 | `app_config` | backend-net |
| `doc-service` | python:3.12-slim | 8001 | 1001:1001 | `doc_data`, `watch_data`, `app_config` | backend-net |
| `frontend` | nginx-unprivileged:alpine | 8080 | 1001:1001 | — | backend-net, frontend-net |
### Volumes
| Volume | Mount path | Contains |
|--------|-----------|---------|
| `postgres_data` | `/var/lib/postgresql/data` | PostgreSQL data |
| `doc_data` | `/data/documents` | Uploaded PDF files |
| `watch_data` | `/data/watch` | Watch directory (bind-mount NAS/Nextcloud via docker-compose.override.yml) |
| `app_config` | `/config` | Per-service runtime config JSON files |
### Networks
| Network | Host-accessible | Members |
|---------|----------------|---------|
| `backend-net` | No (no host ports in prod) | db, backend, ai-service, doc-service, frontend |
| `frontend-net` | Yes (port 80 → frontend:8080) | frontend |
### Environment variables (required in `backend/.env`)
```
DATABASE_URL=postgresql+asyncpg://<user>:<pass>@db:5432/destroying_sap
CORS_ORIGINS=["http://localhost:5173"]
JWT_PRIVATE_KEY=<PEM, newlines as \n>
JWT_PUBLIC_KEY=<PEM, newlines as \n>
```
Injected by docker-compose (not in `.env`):
```
DOC_SERVICE_URL=http://doc-service:8001
AI_SERVICE_URL=http://ai-service:8010
```
---
## Workflows
### STATUS.md workflow
Every directory with runnable code has a `STATUS.md`. These are the canonical **resume point** for each session.
**At the start of every conversation:**
1. Read the `STATUS.md` for every directory you will touch.
2. If it does not exist for a directory you are working in, create it using the structure below.
This applies equally to subagents.
**After making changes**, update affected `STATUS.md` files:
- Add new endpoints / models / routes.
- Move completed items off the **Future work** checklist.
- Add new items to **Known limitations** or **Future work**.
- Keep the **What it is** summary accurate.
**Structure:**
```markdown
# <Service Name> — Status
## What it is
One paragraph: purpose, port, database/storage, how traffic arrives.
## Current functionality
Subsections per router / feature area. Tables for endpoints.
## Architecture
ASCII diagram of call graph / data flow.
## Known limitations / not implemented
Bullet list of known gaps.
## Future work
- [ ] Planned improvements
```
Maintained in: `backend/`, `features/ai-service/`, `features/doc-service/`, `frontend/`
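
A quick way to check that a `STATUS.md` still follows the structure above is to compare its level-2 headings against the required set. The sketch below is illustrative only: `missing_status_sections` is a hypothetical helper and is not part of the repository.

```python
REQUIRED_SECTIONS = (
    "## What it is",
    "## Current functionality",
    "## Architecture",
    "## Known limitations / not implemented",
    "## Future work",
)


def missing_status_sections(text):
    """Return the required headings that do not appear in a STATUS.md body."""
    headings = {
        line.strip() for line in text.splitlines() if line.startswith("## ")
    }
    return [section for section in REQUIRED_SECTIONS if section not in headings]
```

An empty list means every required section is present; otherwise the result names the headings to add.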
---
### Changelog convention
Every time files are added or modified, record the change in `changelog/YYYY-MM-DD_<slug>.md`. If today's file already exists, append to it; otherwise create it.
Each entry must include:
- A heading with date and short description
- `**Timestamp:**` in ISO-8601 format
- A **Summary** sentence
- A **Files Added / Modified / Deleted** list with one-line descriptions
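
The convention above can be sketched as two small helpers, one for the file path and one for the entry body. Both function names and the exact heading layout are assumptions; only the required fields come from the list above.

```python
from datetime import datetime, timezone


def changelog_path(slug, now):
    """Build changelog/YYYY-MM-DD_<slug>.md for the given moment."""
    return f"changelog/{now:%Y-%m-%d}_{slug}.md"


def render_entry(description, summary, files, now):
    """Render one entry; `files` is a list of (path, one-line note) pairs."""
    lines = [
        f"## {now:%Y-%m-%d}: {description}",
        f"**Timestamp:** {now.isoformat(timespec='seconds')}",
        f"**Summary:** {summary}",
        "**Files Added / Modified / Deleted:**",
    ]
    lines += [f"- `{path}`: {note}" for path, note in files]
    return "\n".join(lines) + "\n"
```

Opening the path in append mode (`open(path, "a")`) covers both cases in the rule: an existing file is extended and a missing one is created.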
---
### Adding a new resource (checklist)
1. Add ORM model in `backend/app/models/`, import it in `models/__init__.py`
2. Generate the migration: `docker compose exec backend alembic revision --autogenerate -m "add <resource>"`, then apply it: `docker compose exec backend alembic upgrade head`
3. Add Pydantic schemas in `backend/app/schemas/`
4. Add router in `backend/app/routers/`, mount it in `main.py`
5. Add API function(s) to `frontend/src/api/client.ts`
6. Add page component in `frontend/src/pages/`, register route in `App.tsx`
7. Update `STATUS.md` for affected services
8. Add changelog entry
---
### Git convention
Always run `git push` immediately after every `git commit`.
---
### Feature branch & isolated test environment
Every non-trivial implementation (anything beyond a one-line fix or doc change) **must** follow this workflow:
#### 1 — Create a feature branch
After the planning phase is approved, branch off `main`:
```bash
git checkout main && git pull
git checkout -b feat/<slug> # e.g. feat/color-mode, feat/admin-appearance
```
#### 2 — Spin up an isolated Docker stack for the feature
A dedicated compose stack runs alongside the main dev stack so both can be tested independently.
**Find the next free port** (main dev stack owns 5173):
```bash
for port in $(seq 5174 5200); do
  lsof -iTCP:$port -sTCP:LISTEN -t &>/dev/null || { echo "$port"; break; }
done
```
Use the first free port returned (call it `$PORT`).
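
Where `lsof` is not available, the same scan can be approximated in Python by trying to bind each candidate port. This is a sketch under that assumption, and `first_free_port` is a hypothetical helper. Note that it tests whether the port can be bound on the loopback address, which is a slightly different check from listing listening sockets.

```python
import socket


def first_free_port(start=5174, end=5200, host="127.0.0.1"):
    """Return the first port in [start, end] that can be bound, or None."""
    for port in range(start, end + 1):
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            try:
                sock.bind((host, port))
            except OSError:
                # Port is in use (or not bindable); try the next one.
                continue
            return port
    return None
```

The probe socket is closed again before the function returns, so the port is free for Docker to publish afterwards.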
**Create a per-feature override file** at `docker-compose.feat-<slug>.yml` (gitignored):
```yaml
# docker-compose.feat-<slug>.yml: feature test stack, never committed to main
services:
  frontend:
    ports:
      - "$PORT:8080"   # e.g. 5174:8080
    container_name: frontend-<slug>
  backend:
    container_name: backend-<slug>
  doc-service:
    container_name: doc-service-<slug>
  ai-service:
    container_name: ai-service-<slug>
  db:
    container_name: db-<slug>
networks:
  backend-net:
    name: backend-net-<slug>
  frontend-net:
    name: frontend-net-<slug>
```
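
Because YAML indentation is significant, generating the override file is less error-prone than editing it by hand. The sketch below renders the same structure for a given slug and port. `render_feature_override` is a hypothetical helper, and the service list is taken from the Services table above.

```python
SERVICES = ("frontend", "backend", "doc-service", "ai-service", "db")
NETWORKS = ("backend-net", "frontend-net")


def render_feature_override(slug, port):
    """Return the text of docker-compose.feat-<slug>.yml for one feature stack."""
    lines = [
        f"# docker-compose.feat-{slug}.yml: feature test stack, never committed to main",
        "services:",
    ]
    for service in SERVICES:
        lines.append(f"  {service}:")
        if service == "frontend":
            # Only the frontend publishes a host port.
            lines.append("    ports:")
            lines.append(f'      - "{port}:8080"')
        lines.append(f"    container_name: {service}-{slug}")
    lines.append("networks:")
    for network in NETWORKS:
        lines.append(f"  {network}:")
        lines.append(f"    name: {network}-{slug}")
    return "\n".join(lines) + "\n"
```

Writing the result to `docker-compose.feat-<slug>.yml` bakes the port in directly, so `$PORT` does not need to be exported when the stack starts.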
**Start the feature stack**:
```bash
docker compose -f docker-compose.yml \
  -f docker-compose.dev.yml \
  -f docker-compose.feat-<slug>.yml \
  --project-name <slug> up --build
```
The feature frontend is now reachable at `http://localhost:$PORT`.
The main dev stack continues running unaffected on `:5173`.
#### 3 — Develop on the feature branch
All code changes happen on `feat/<slug>`. Commit and push normally:
```bash
git add <files>
git commit -m "feat: <description>"
git push -u origin feat/<slug>
```
#### 4 — Confirm functionality
Before merging, verify all of the following on `http://localhost:$PORT`:
- [ ] Login and registration work end-to-end
- [ ] The specific feature works as intended
- [ ] No regressions visible in the UI
- [ ] Backend logs show no unexpected errors: `docker compose -p <slug> logs backend`
- [ ] Migrations (if any) applied cleanly: `docker compose -p <slug> exec backend alembic upgrade head`
#### 5 — Merge to main
Once all checks pass:
```bash
git checkout main
git merge --no-ff feat/<slug> -m "Merge feat/<slug>: <description>"
git push
git branch -d feat/<slug>
git push origin --delete feat/<slug>
```
#### 6 — Tear down the feature stack
```bash
docker compose -f docker-compose.yml \
  -f docker-compose.dev.yml \
  -f docker-compose.feat-<slug>.yml \
  --project-name <slug> down --volumes --remove-orphans
rm docker-compose.feat-<slug>.yml
```
---
### Infrastructure change protocol
After **any** change to Dockerfiles, `docker-compose*.yml`, `nginx.conf`, or setup scripts:
1. **Update `README.md`** — containers table, ports, image names, Current State section.
2. **Dev stack** — verify login and registration end-to-end:
```bash
docker compose -f docker-compose.yml -f docker-compose.dev.yml up --build
```
3. **Prod stack** — run the same checks:
```bash
docker compose up --build -d
```
4. Confirm non-root users:
```bash
docker inspect <container> --format '{{.Config.User}}'
```
5. **Tear down** after testing:
```bash
docker compose down --volumes --remove-orphans
```
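
The non-root check in step 4 comes down to interpreting the `Config.User` string that `docker inspect` prints. A minimal sketch of that interpretation follows. `runs_as_root` is a hypothetical helper; it relies on the fact that an empty `Config.User` means the container runs with the image default, which is root unless the image sets a user.

```python
def runs_as_root(config_user):
    """Interpret a Config.User value such as "1001:1001", "root" or ""."""
    # Keep only the user part of "user:group".
    user = config_user.split(":", 1)[0].strip()
    return user in ("", "0", "root")
```

For the users listed in the Services table (`70:70` and `1001:1001`) this returns `False`; an empty string, `root`, or `0:0` returns `True` and should fail the check.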
---
### Security hook
The hook lives at `.githooks/pre-commit` and runs `scripts/security_check.py` in Docker. Git only picks it up once `core.hooksPath` points at `.githooks`, so every new clone must run:
```bash
git config core.hooksPath .githooks
```
See **Security Standards → Pre-commit security hook** for the full list of checks.
**Never** bypass with `--no-verify`.