diff --git a/CLAUDE.md b/CLAUDE.md index 61c0b75..576c04f 100644 --- a/CLAUDE.md +++ b/CLAUDE.md @@ -1,19 +1,26 @@ # CLAUDE.md -This file provides permanent, authoritative guidance to Claude Code for every session. All sections below reflect the actual codebase state and must be kept up-to-date as the project evolves. +This file provides permanent, authoritative guidance to Claude Code for every session. It covers project-wide concerns only. Service-specific details live in sub-files — read them only when working in that service: + +- `backend/CLAUDE.md` — auth/users/admin/settings/plugins endpoints; DB models; JWT/bcrypt/sanitization security; naming conventions +- `frontend/CLAUDE.md` — routes, components, API client patterns, XSS prevention +- `features/ai-service/CLAUDE.md` — /chat, /health, /queue endpoints; queue service +- `features/doc-service/CLAUDE.md` — document/category/share endpoints; DB models; PDF limits; file watcher + +--- ## CLAUDE.md self-update checkpoint -**After every change to the codebase**, before committing, check whether CLAUDE.md needs updating: +**After every change to the codebase**, before committing, check which CLAUDE.md files need updating: -- New route added → update **All API Endpoints** and **Frontend Routes** tables -- New DB model or column → update **Database Models** -- New migration → update **Migration chains** -- New file or directory → update **File & Folder Tree** -- New limit or default value changed → update **Default Values & Limits** -- New dependency, auth mechanism, or security pattern → update **Security Standards** -- New Docker service, volume, network, or env var → update **Docker Infrastructure** -- Stack version changed → update **Stack** +- New route added → update **API Endpoints** in `backend/CLAUDE.md`, `features/doc-service/CLAUDE.md`, or `features/ai-service/CLAUDE.md`; update **Frontend Routes** in `frontend/CLAUDE.md` +- New DB model or column → update **Database Models** in `backend/CLAUDE.md` or 
`features/doc-service/CLAUDE.md` +- New migration → update **Migration chain** table in `backend/CLAUDE.md` or `features/doc-service/CLAUDE.md` +- New file or directory → update **File & Folder Tree** in the relevant sub-file; update the high-level tree in this root file only if a top-level directory changes +- New limit or default value changed → update **Default Values & Limits** in the relevant sub-file +- New dependency, auth mechanism, or security pattern → update **Security Standards** in the relevant sub-file +- New Docker service, volume, network, or env var → update **Docker Infrastructure** in this file +- Stack version changed → update **Stack** in this file This check is mandatory — treat it the same as updating STATUS.md. @@ -36,22 +43,6 @@ This check is mandatory — treat it the same as updating STATUS.md. All test, build, and package-manager commands run **inside Docker** — never on the host. See the memory note: "Testing inside Docker only". -### Migrations (run in Docker) - -```bash -docker compose exec backend alembic revision --autogenerate -m "describe change" -docker compose exec backend alembic upgrade head -docker compose exec backend alembic downgrade -1 -``` - -### Lint (run in Docker) - -```bash -docker compose exec backend ruff check . && ruff format . -docker compose exec frontend npm run typecheck -docker compose exec frontend npm run lint -``` - ### Full stack ```bash @@ -63,13 +54,15 @@ docker compose -f docker-compose.yml -f docker-compose.dev.yml up --build docker compose up --build -d ``` +For service-specific commands (migrations, lint), see `backend/CLAUDE.md` and `frontend/CLAUDE.md`. 
+ --- ## File & Folder Tree ``` / -├── CLAUDE.md ← This file — authoritative session context +├── CLAUDE.md ← This file — project-wide context ├── README.md ← Project overview, containers table, Current State ├── TODO.md ← Task list ├── .env.example ← Template for backend/.env @@ -80,117 +73,11 @@ docker compose up --build -d ├── changelog/YYYY-MM-DD_.md ← Per-date change logs ├── dev-watch/ ← Dev bind-mount for file watcher testing (.gitkeep only) │ -├── backend/ ← FastAPI gateway (port 8000, internal) -│ ├── app/ -│ │ ├── main.py ← App factory, router registration, lifespan (health loop) -│ │ ├── database.py ← AsyncEngine, AsyncSessionLocal, Base -│ │ ├── deps.py ← get_current_user, get_current_admin, get_service_admin(id), check_plugin_access (also get_user_groups in doc-service) -│ │ ├── core/ -│ │ │ ├── config.py ← All settings via pydantic-settings (reads .env) -│ │ │ ├── security.py ← JWT sign/verify (RS256), bcrypt hash/verify -│ │ │ ├── sanitize.py ← Input sanitization helpers (see Security Standards) -│ │ │ └── app_config.py ← Per-service config load/save to /config volume; theme files in /config/themes/ -│ │ ├── models/ -│ │ │ ├── __init__.py ← Imports all models (required for Alembic autogenerate) -│ │ │ ├── user.py ← User model (see Database Models) -│ │ │ ├── profile.py ← Profile model -│ │ │ └── group.py ← Group, GroupMembership models -│ │ ├── schemas/ -│ │ │ ├── user.py ← UserCreate/Out, Token, DashboardPrefsOut/Update -│ │ │ ├── profile.py ← ProfileRead, ProfileUpdate -│ │ │ └── group.py ← GroupCreate/Update/Out/DetailOut, GroupMemberOut -│ │ ├── routers/ -│ │ │ ├── auth.py ← POST /register, POST /login -│ │ │ ├── users.py ← GET /me, GET+PATCH /me/preferences, PATCH /me/color-mode, GET /me/groups -│ │ │ ├── profile.py ← GET+PUT /me (profile) -│ │ │ ├── admin.py ← User admin CRUD (admin-only) -│ │ │ ├── groups.py ← Group CRUD + member management (admin-only) -│ │ │ ├── settings.py ← AI, doc limits, system prompts, appearance, themes (admin-only) -│ 
│ │ ├── services.py ← GET /services (health status) -│ │ │ ├── plugins.py ← Generic plugin proxy (GET/PATCH /api/plugins/*) -│ │ │ ├── categories_proxy.py ← Transparent proxy → doc-service /categories/* -│ │ │ └── documents_proxy.py ← Transparent proxy → doc-service /documents/* -│ │ └── services/ -│ │ ├── service_health.py ← Background 30s health-check loop; caches /plugin/manifest per service -│ │ └── group_bootstrap.py ← Ensures {service-id}-admin group exists for every registered service at startup -│ ├── alembic/ -│ │ ├── env.py ← Async migration runner -│ │ └── versions/ ← Migration chain (see Migrations section) -│ ├── scripts/seed.py ← Seed test user -│ ├── Dockerfile ← python:3.12-slim, non-root user 1001 -│ └── STATUS.md -│ +├── backend/ ← FastAPI gateway (port 8000, internal); see backend/CLAUDE.md ├── features/ -│ ├── ai-service/ ← AI provider intermediary (port 8010, internal) -│ │ ├── app/ -│ │ │ ├── main.py ← FastAPI, queue worker lifespan -│ │ │ ├── routers/chat.py ← POST /chat (sync, NORMAL priority queue) -│ │ │ ├── routers/health.py ← GET /health -│ │ │ ├── routers/queue.py ← GET /queue/status, /pause, /resume, /cancel/{id} -│ │ │ └── routers/plugin.py ← GET /plugin/manifest (access rules for ai-service-admin group) -│ │ │ ├── providers/base.py ← AIProvider abstract class -│ │ │ ├── providers/anthropic_provider.py -│ │ │ ├── providers/openai_compat.py ← Ollama / LM Studio -│ │ │ └── services/queue.py ← Priority queue (CRITICAL > HIGH > NORMAL) -│ │ ├── Dockerfile -│ │ └── STATUS.md -│ │ -│ └── doc-service/ ← PDF extraction microservice (port 8001, internal) -│ ├── app/ -│ │ ├── main.py ← FastAPI, lifespan (file watcher start/stop) -│ │ ├── database.py ← Same PostgreSQL instance as backend -│ │ ├── deps.py ← get_user_id (x-user-id), get_user_groups (x-user-groups) -│ │ ├── models/ -│ │ │ ├── document.py ← Document model (see Database Models) -│ │ │ ├── category.py ← DocumentCategory model -│ │ │ ├── category_assignment.py ← CategoryAssignment 
(composite PK) -│ │ │ └── document_share.py ← DocumentShare model (group-based sharing) -│ │ ├── schemas/ -│ │ │ ├── document.py ← DocumentOut, DocumentPage, DocumentStatusOut, etc. -│ │ │ ├── category.py ← CategoryOut, CategoryCreate, CategoryUpdate -│ │ │ └── share.py ← DocumentShareOut, DocumentShareCreate, SharedDocumentOut -│ │ ├── routers/ -│ │ │ ├── documents.py ← Full CRUD + file serving + reprocess + suggestions + sharing -│ │ │ ├── categories.py ← Category CRUD (includes watch-owned categories) -│ │ │ └── plugin.py ← GET /plugin/manifest, GET+PATCH /plugin/settings -│ │ └── services/ -│ │ ├── storage.py ← File I/O -│ │ ├── ai_client.py ← classify_document() → ai-service:8010/chat -│ │ ├── config_reader.py ← Config load/save including storage/watch settings -│ │ └── file_watcher.py ← watchdog-based PDF watcher + startup scan + ingestion -│ ├── alembic/versions/ ← Doc-service migration chain -│ │ ├── 0003_add_watch_columns.py ← source, watch_path, suggested_folder, suggested_filename -│ │ └── 0004_add_document_shares.py ← document_shares table (group-based sharing) -│ ├── Dockerfile -│ └── STATUS.md -│ -└── frontend/ ← React SPA (port 5173 dev / 80 prod) - ├── src/ - │ ├── main.tsx ← React root, QueryClientProvider, BrowserRouter - │ ├── App.tsx ← Route tree, PrivateRoute, AdminRoute - │ ├── api/client.ts ← Axios instance + ALL API functions (single source of truth) - │ ├── hooks/ - │ │ ├── useAuth.ts ← Token state (localStorage), login/logout - │ │ └── useTheme.ts ← Theme toggle - │ ├── components/ - │ │ ├── AppShell.tsx ← Layout: Sidebar + SourcePanel (on /apps/documents) + main - │ │ ├── Sidebar.tsx ← Collapsible nav (icons ↔ icons+labels) - │ │ ├── SourcePanel.tsx ← Views + searchable category tree (docs route only) - │ │ ├── ManageCategoriesDialog.tsx ← Category CRUD modal (rename, delete) - │ │ ├── DocumentSlideOver.tsx ← Right slide-over: detail, edit, share, AI suggestions - │ │ ├── ThemeToggle.tsx ← Light/dark mode toggle - │ │ ├── 
PluginSchemaForm.tsx ← JSON Schema → React form (boolean/string/number/readOnly) - │ │ └── ui/ ← shadcn/ui components (Button, Input, …) - │ ├── pages/ ← One file per route (see Routes section) - │ │ ├── DocServiceSettingsPage.tsx ← Combined doc-service settings: upload limits + watch directory - │ │ └── PluginSettingsPage.tsx ← Generic plugin settings page driven by manifest - │ ├── lib/utils.ts ← cn() = clsx + tailwind-merge - │ └── styles/theme.css ← CSS custom properties, Tailwind setup - ├── vite.config.ts ← /api/* proxied to backend:8000 - ├── tailwind.config.ts - ├── components.json ← shadcn/ui config - ├── Dockerfile ← Multi-stage: Node build → nginx-unprivileged - └── STATUS.md +│ ├── ai-service/ ← AI provider intermediary (port 8010, internal); see features/ai-service/CLAUDE.md +│ └── doc-service/ ← PDF extraction microservice (port 8001, internal); see features/doc-service/CLAUDE.md +└── frontend/ ← React SPA (port 5173 dev / 80 prod); see frontend/CLAUDE.md ``` --- @@ -228,346 +115,9 @@ Browser (:5173 dev / :80 prod) --- -## Database Models - -### Backend (`users`, `profiles`, `groups`, `group_memberships`) - -**`users`** - -| Column | Type | Constraints | Notes | -|--------|------|-------------|-------| -| `id` | String | PK, UUID | auto-generated | -| `email` | String | UNIQUE, indexed, NOT NULL | lowercased before storing | -| `hashed_password` | String | NOT NULL | bcrypt 13 rounds | -| `full_name` | String | nullable | sanitized max 128 chars | -| `is_active` | Boolean | default=True | soft-delete flag | -| `is_superuser` | Boolean | default=False | admin role; never exposed as-is (serialised as `is_admin`) | -| `dashboard_app_ids` | JSON | NOT NULL, default=[] | list of pinned service IDs | -| `color_mode` | String | nullable, default=NULL | user's preferred mode: "light" / "dark" / "system" / NULL (use admin default) | - -Relationship: `profile` (one-to-one, cascade all+delete-orphan) - -**`profiles`** - -| Column | Type | Constraints | Notes | 
-|--------|------|-------------|-------| -| `id` | String | PK, UUID | auto-generated | -| `user_id` | String | FK→users.id UNIQUE, cascade delete | one-to-one | -| `phone` | String(20) | nullable | validated format | -| `date_of_birth` | Date | nullable | 1900+ and not future | -| `position` | String(128) | nullable | job title | -| `address` | String(255) | nullable | | -| `updated_at` | DateTime(tz) | server_default=now(), onupdate=now() | | - -**`groups`** - -| Column | Type | Constraints | -|--------|------|-------------| -| `id` | String | PK, UUID | -| `name` | String(128) | UNIQUE indexed, NOT NULL | -| `description` | String(512) | nullable | -| `created_at` | DateTime(tz) | server_default=now() | - -**`group_memberships`** - -| Column | Type | Constraints | -|--------|------|-------------| -| `id` | String | PK, UUID | -| `group_id` | String | FK→groups.id, indexed, CASCADE | -| `user_id` | String | FK→users.id, indexed, CASCADE | -| `joined_at` | DateTime(tz) | server_default=now() | - -Unique constraint: `(group_id, user_id)` - -### Doc-service (`documents`, `document_categories`, `document_category_assignments`) - -**`documents`** - -| Column | Type | Constraints | Notes | -|--------|------|-------------|-------| -| `id` | String | PK, UUID | | -| `user_id` | String | indexed | not FK — trusts x-user-id header | -| `filename` | String | NOT NULL | | -| `file_path` | String | NOT NULL | absolute path under /data/documents | -| `file_size` | Integer | NOT NULL | bytes | -| `status` | String | default="pending" | pending / processing / done / failed | -| `title` | String(500) | nullable | AI-extracted | -| `document_type` | String | nullable | invoice / bill / receipt / order / expense / revenue / unknown | -| `raw_text` | Text | nullable | first 500 k chars | -| `extracted_data` | Text | nullable | JSON string | -| `tags` | Text | nullable | JSON array string | -| `error_message` | String(500) | nullable | | -| `created_at` | DateTime(tz) | 
server_default=now() | | -| `processed_at` | DateTime(tz) | nullable | | -| `source` | String(16) | default="upload" | "upload" or "watch" | -| `watch_path` | String | nullable | original absolute path in watch directory | -| `suggested_folder` | String(128) | nullable | AI-suggested category (pending user confirm) | -| `suggested_filename` | String(500) | nullable | AI-suggested title/rename (pending user confirm) | - -**`document_categories`** - -| Column | Type | Constraints | -|--------|------|-------------| -| `id` | String | PK, UUID | -| `user_id` | String | indexed | -| `name` | String(128) | NOT NULL | -| `created_at` | DateTime(tz) | server_default=now() | - -**`document_category_assignments`** (composite PK) - -| Column | Type | Constraints | -|--------|------|-------------| -| `document_id` | String | PK + FK→documents.id CASCADE | -| `category_id` | String | PK + FK→document_categories.id CASCADE | - -**`document_shares`** - -| Column | Type | Constraints | Notes | -|--------|------|-------------|-------| -| `id` | String | PK, UUID | | -| `document_id` | String | indexed, NOT NULL | not FK — trusts proxy | -| `group_id` | String | indexed, NOT NULL | group from backend | -| `shared_by_user_id` | String | NOT NULL | owner who shared | -| `created_at` | DateTime(tz) | server_default=now() | | - -Unique constraint: `(document_id, group_id)` - -### Migration chains - -**Backend** (must be applied in order): - -| Rev ID | Slug | -|--------|------| -| `38efeff7c45a` | `create_users_table` | -| `676084df61d1` | `add_profiles_table` | -| `a3f9c2d14e87` | `add_groups_and_group_memberships` | -| `c7e8f9a0b1d2` | `add_dashboard_app_ids_to_users` | -| `dd6ad2f2c211` | `add_color_mode_to_users` | - -**Doc-service**: - -| Rev ID | Slug | -|--------|------| -| `0001` | `create_doc_tables` | -| `0002` | `add_document_title` | -| `0003` | `add_watch_columns` | -| `0004` | `add_document_shares` | - ---- - -## All API Endpoints - -### Auth (`/api/auth`) — public - -| 
Method | Path | Auth | Description | -|--------|------|------|-------------| -| POST | `/api/auth/register` | — | Create account; returns `UserOut`; enforces password policy | -| POST | `/api/auth/login` | — | OAuth2 password flow; returns `{access_token, token_type}` | - -### Users (`/api/users`) — authenticated - -| Method | Path | Auth | Description | -|--------|------|------|-------------| -| GET | `/api/users/me` | user | Current user info → `UserOut` | -| GET | `/api/users/me/preferences` | user | Dashboard pinned app IDs → `{app_ids}` | -| PATCH | `/api/users/me/preferences` | user | Save pinned app IDs (max 50, slug-safe) | -| PATCH | `/api/users/me/color-mode` | user | Save colour mode preference ("light"/"dark"/"system") | -| GET | `/api/users/me/groups` | user | Groups current user belongs to → `list[UserGroupOut]` | - -### Profile (`/api/profile`) — authenticated - -| Method | Path | Auth | Description | -|--------|------|------|-------------| -| GET | `/api/profile/me` | user | Fetch profile; auto-creates if missing | -| PUT | `/api/profile/me` | user | Update profile fields | - -### Admin — Users (`/api/admin`) — admin-only - -| Method | Path | Description | -|--------|------|-------------| -| GET | `/api/admin/users` | List all users → `list[UserAdminOut]` | -| POST | `/api/admin/users` | Create user (with optional is_admin) | -| DELETE | `/api/admin/users/{user_id}` | Delete user (204) | -| PATCH | `/api/admin/users/{user_id}/active` | Toggle active status | - -### Admin — Groups (`/api/admin/groups`) — admin-only - -| Method | Path | Description | -|--------|------|-------------| -| GET | `/api/admin/groups` | List groups with member count | -| POST | `/api/admin/groups` | Create group | -| GET | `/api/admin/groups/{id}` | Group detail + members | -| PATCH | `/api/admin/groups/{id}` | Update name / description | -| DELETE | `/api/admin/groups/{id}` | Delete (cascades memberships) | -| POST | `/api/admin/groups/{id}/members/{user_id}` | Add member | 
-| DELETE | `/api/admin/groups/{id}/members/{user_id}` | Remove member | - -### Settings (`/api/settings`) — admin-only - -| Method | Path | Description | -|--------|------|-------------| -| GET | `/api/settings/ai` | AI config (keys masked) | -| PATCH | `/api/settings/ai` | Update AI provider / credentials | -| POST | `/api/settings/ai/test` | Test AI connection | -| GET | `/api/settings/documents/limits` | PDF upload limits | -| PATCH | `/api/settings/documents/limits` | Update max PDF size | -| GET | `/api/settings/system-prompts` | All editable system prompts | -| PATCH | `/api/settings/system-prompts/{service_id}` | Update system prompt | -| GET | `/api/settings/appearance` | Active theme + default mode (auth) | -| PATCH | `/api/settings/appearance` | Update active theme + default mode (admin) | -| GET | `/api/settings/themes` | List all themes — built-in + custom (auth) | -| POST | `/api/settings/themes` | Create custom theme (admin) | -| PATCH | `/api/settings/themes/{id}` | Update custom theme label/colours (admin) | -| DELETE | `/api/settings/themes/{id}` | Delete custom theme (admin, 204) | - -### Services (`/api/services`) — authenticated - -| Method | Path | Description | -|--------|------|-------------| -| GET | `/api/services` | Health status of all registered services → `list[ServiceStatus]` | - -### Documents (`/api/documents/*`) — authenticated, proxied to doc-service - -| Method | Path | Description | -|--------|------|-------------| -| POST | `/api/documents/upload` | Upload PDF (202, background processing) | -| GET | `/api/documents` | Paginated list (filterable: search, status, type, category, sort) | -| GET | `/api/documents/{id}` | Document detail | -| GET | `/api/documents/{id}/status` | Processing status only | -| PATCH | `/api/documents/{id}/type` | Update document type | -| PATCH | `/api/documents/{id}/tags` | Update tags | -| PATCH | `/api/documents/{id}/title` | Update title | -| POST | `/api/documents/{id}/reprocess` | Re-run AI 
extraction | -| DELETE | `/api/documents/{id}` | Delete document (204) | -| GET | `/api/documents/{id}/file` | Download PDF (streaming) | -| POST | `/api/documents/{id}/categories/{cat_id}` | Assign category | -| DELETE | `/api/documents/{id}/categories/{cat_id}` | Remove category | -| POST | `/api/documents/{id}/suggestions/folder/confirm` | Confirm AI folder suggestion | -| POST | `/api/documents/{id}/suggestions/folder/reject` | Reject AI folder suggestion | -| POST | `/api/documents/{id}/suggestions/filename/confirm` | Confirm AI filename suggestion | -| POST | `/api/documents/{id}/suggestions/filename/reject` | Reject AI filename suggestion | -| GET | `/api/documents/shared-with-me` | Documents shared with current user via their groups | -| GET | `/api/documents/{id}/shares` | List groups the document is shared with (owner only) | -| POST | `/api/documents/{id}/shares` | Share with a group (owner only; group must be in user's groups) | -| DELETE | `/api/documents/{id}/shares/{group_id}` | Stop sharing with a group (owner only) | - -### Categories (`/api/documents/categories/*`) — authenticated, proxied to doc-service - -| Method | Path | Description | -|--------|------|-------------| -| GET | `/api/documents/categories` | List user's categories | -| POST | `/api/documents/categories` | Create category (triggers background AI reanalysis) | -| PATCH | `/api/documents/categories/{id}` | Rename | -| DELETE | `/api/documents/categories/{id}` | Delete (204) | - -### Plugins (`/api/plugins`) — authenticated, auth-per-plugin - -| Method | Path | Description | -|--------|------|-------------| -| GET | `/api/plugins` | List plugins accessible to current user | -| GET | `/api/plugins/{id}/manifest` | Plugin manifest with settings JSON Schema (auth-gated) | -| GET | `/api/plugins/{id}/settings` | Proxy to feature `/plugin/settings` (auth-gated) | -| PATCH | `/api/plugins/{id}/settings` | Proxy to feature `/plugin/settings` (auth-gated) | - -Auth: is_superuser OR member of 
group listed in manifest `required_groups`. Returns 404 (not 403) to hide existence. - -### AI-service (internal only — not exposed to browser) - -| Method | Path | Description | -|--------|------|-------------| -| POST | `/chat` | Chat request (queued at NORMAL priority) | -| GET | `/health` | Health check | -| GET | `/queue/status` | Queue state | -| POST | `/queue/pause` | Pause queue | -| POST | `/queue/resume` | Resume queue | -| POST | `/queue/cancel/{job_id}` | Cancel job | - ---- - -## Frontend Routes - -| Path | Component | Guard | -|------|-----------|-------| -| `/login` | `LoginPage` | Public | -| `/` | `DashboardPage` | PrivateRoute | -| `/apps` | `AppsPage` | PrivateRoute | -| `/apps/documents` | `DocumentsPage` | PrivateRoute | -| `/apps/documents/settings` | `DocServiceSettingsPage` | ServiceAdminRoute (is_admin OR doc-service-admin member) | -| `/apps/ai/settings` | `AIAdminSettingsPage` | ServiceAdminRoute (is_admin OR ai-service-admin member) | -| `/profile` | `ProfilePage` | PrivateRoute | -| `/settings` | `SettingsPage` | PrivateRoute | -| `/settings/plugins/:id` | `PluginSettingsPage` | PrivateRoute (auth enforced per-plugin by backend) | -| `/admin` | `AdminPage` (→ `/admin/users`) | AdminRoute | -| `/admin/users` | `AdminUsersPage` | AdminRoute | -| `/admin/groups` | `AdminGroupsPage` | AdminRoute | -| `/admin/appearance` | `AdminAppearancePage` | AdminRoute | -| `*` | redirect to `/` | — | - -`PrivateRoute` — checks `token` from `useAuth`, redirects to `/login` if absent. -`AdminRoute` — checks token AND queries `GET /api/users/me` for `is_admin`; waits for query to avoid flash; redirects to `/login` (not `/`) if not admin. - ---- - ## Security Standards -These standards are **non-negotiable**. Every change must comply. 
- -### JWT - -- **Algorithm**: RS256 (4096-bit RSA key pair, generated by `scripts/generate_jwt_keys.py`) -- **Keys**: PEM-encoded in `backend/.env` as `JWT_PRIVATE_KEY` / `JWT_PUBLIC_KEY` (gitignored) -- **Expiry**: 8 hours (`EXPIRE_MINUTES=480`) — never set longer; no refresh tokens -- **Claims**: `{sub: user_id, exp, iat}` — user_id is a UUID string -- **Validation**: `decode_access_token()` in `core/security.py`; called by `get_current_user` -- **Never**: set algorithm to `"none"`, disable `verify_exp`, or hardcode secrets in code - -### Password hashing - -- **Algorithm**: bcrypt, **13 rounds** (`bcrypt.gensalt(rounds=13)`) -- **Timing**: ~300 ms per hash (intentional brute-force resistance) -- **Never** use MD5, SHA1, or plain SHA256 for password storage - -### Password policy (enforced in `UserCreate` schema) - -All of the following must pass: -- ≥ 8 characters -- ≥ 1 uppercase (A–Z) -- ≥ 1 lowercase (a–z) -- ≥ 1 digit (0–9) -- ≥ 1 special character: `!@#$%^&*()\-_=+[]{}|;:'"<>?/\`~` -- No common words (password, secret, login, admin, test, qwerty, welcome, …) - -### Input sanitization - -Every user-supplied string stored in the database **must** pass through `core/sanitize.py`: - -```python -sanitize_str(value, max_len=255) -# → strips whitespace; rejects null bytes (\x00); rejects control chars -# (0x01–0x1F, 0x7F except \t \n \r); enforces max_len; returns None for "" - -normalize_email(value) # lowercase + strip -validate_phone(value) # sanitize_str(max=20) + regex ^\+?[\d\s\-()\[\]]{7,20}$ -validate_date_of_birth(v) # must be ≥ 1900, not future -``` - -Apply via Pydantic `@field_validator` on all request schemas. - -### XSS prevention - -- React JSX text interpolation (`{value}`) is HTML-escaped by the DOM renderer — **never** use `dangerouslySetInnerHTML` with user-supplied content. -- Server-side `sanitize_str` provides defense-in-depth (control char stripping, max length). 
- -### SQL injection prevention - -- Use SQLAlchemy ORM (bound parameters) — **never** raw SQL strings. -- If `text()` is needed, use `bindparam()` for all user-supplied values. -- **Never** use f-strings, `.format()`, or `%`-formatting for SQL. - -### Admin route security - -- Use `get_current_admin` dependency (checks `is_superuser`). -- Return **404** (not 403) for unauthorized access — hides both endpoint existence and permission model. +These standards are **non-negotiable**. Every change must comply. Implementation-specific security rules (JWT, bcrypt, input sanitization, XSS, SQLi, admin routes) are in the relevant sub-CLAUDE.md files. ### Network isolation @@ -591,172 +141,14 @@ Apply via Pydantic `@field_validator` on all request schemas. --- -## Frontend Patterns & Conventions - -### API client (`src/api/client.ts`) - -Single Axios instance — **all** API calls live here, nowhere else: - -```typescript -const api = axios.create({ baseURL: "/api" }); -api.interceptors.request.use((config) => { - const token = localStorage.getItem("token"); - if (token) config.headers.Authorization = `Bearer ${token}`; - return config; -}); -``` - -Adding a new API call: -1. Define a TypeScript interface for the response if it's new. -2. Add a named export function (`getX`, `createX`, `updateX`, `deleteX`). -3. Use `api.get(...)`, `api.post(...)`, etc.; always `.then((r) => r.data)`. 
- -### TanStack Query conventions - -**Query keys** (flat arrays, lowercase): -```typescript -["me"] // current user -["services"] // service health list -["dashboard-prefs"] // user dashboard preferences -["categories"] // document categories -["documents", params] // document list (params object for cache isolation) -["documents-shared", params] // shared-with-me list -["document", id] // single document -["document-shares", id] // share list for a specific document -["my-groups"] // current user's group memberships (for share picker) -["plugins"] // accessible plugin list (filtered by user access) -["plugin-manifest", id] // plugin manifest (cached) -["plugin-settings", id] // plugin current settings -``` - -**Mutation pattern**: -```typescript -const mutation = useMutation({ - mutationFn: apiFunction, - onSuccess: () => { - queryClient.invalidateQueries({ queryKey: ["affected-key"] }); - // additional side effects (close dialog, reset form, etc.) - }, -}); -// Usage: -mutation.mutate(data); -mutation.isPending // show spinner / disable button -mutation.isError // show error message -``` - -**Polling**: -```typescript -useQuery({ queryKey: ["services"], queryFn: getServices, - refetchInterval: 30_000, refetchIntervalInBackground: true }); -``` - -### Route guards - -```typescript -// PrivateRoute — redirect to /login if no token -// AdminRoute — redirect to /login if no token OR not admin -// (waits for getMe() query to avoid flash; uses 404 semantics) -``` - -### Component patterns - -- Functional components only. -- Local `useState` for UI-only state (edit mode, pending values, open/closed). -- Server state via `useQuery` / `useMutation` — no duplicated local copies. -- `cn()` from `lib/utils.ts` for conditional Tailwind classes. -- `lucide-react` for all icons. -- Never use `dangerouslySetInnerHTML` with user-supplied content. 
- ---- - -## Naming & Code Conventions - -### Database - -- **Tables**: lowercase, plural, snake_case (`users`, `group_memberships`, `document_category_assignments`) -- **Columns**: lowercase, snake_case -- **ORM models**: PascalCase, singular (`User`, `Group`, `GroupMembership`, `Document`) -- Primary keys: `id` (String UUID, auto-generated) -- Timestamps: `created_at` / `updated_at` / `joined_at` / `processed_at` — always timezone-aware - -### Pydantic schemas - -| Suffix | Purpose | -|--------|---------| -| `Create` | POST request body (user-supplied input) | -| `Update` | PATCH request body (partial update) | -| `Out` | API response (safe subset of model) | -| `AdminOut` | Extended response for admin endpoints | -| `Read` | GET response (same as `Out`, used for profiles) | - -Always set `model_config = {"from_attributes": True}` on response schemas. -Use `validation_alias` when the ORM field name differs from the JSON key (e.g., `is_superuser` → `is_admin`). - -### HTTP status codes - -| Code | Use | -|------|-----| -| 200 | Successful GET / PATCH / PUT | -| 201 | Successful POST that creates a resource | -| 202 | Accepted (async processing started, e.g., document upload) | -| 204 | Successful DELETE or action with no response body | -| 400 | Bad request (duplicates, invalid data beyond Pydantic) | -| 401 | Missing / invalid JWT | -| 404 | Not found **and** admin routes when not admin | -| 413 | Payload too large (file exceeds limit) | -| 415 | Unsupported media type (not a PDF) | -| 422 | Pydantic validation failure (FastAPI default) | -| 502 | Downstream service unreachable | -| 503 | Service unavailable (queue stopped, AI error) | -| 504 | Gateway timeout | - -### Backend code style - -- Async/await for **all** I/O (DB, HTTP, file). -- `raise HTTPException(status_code=..., detail="...")` for all errors. -- Response models always declared in route decorator: `@router.get("/path", response_model=XOut)`. 
-- Background tasks via `BackgroundTasks` param; tasks open their own `AsyncSessionLocal` session. -- Commit + refresh pattern after mutations: - ```python - await db.commit() - await db.refresh(obj) - ``` - -### Frontend code style - -- TypeScript strict mode — no `any`. -- API response types inferred from interfaces in `client.ts` only. -- Error messages displayed inline (no alert); loading shown as disabled state or "…" text. -- All user-facing text: safe via React JSX rendering (not innerHTML). - ---- - -## Default Values & Limits +## Default Values & Limits (cross-cutting) | Parameter | Value | Location | |-----------|-------|----------| -| JWT expiry | 480 min (8 h) | `core/security.py` | -| Bcrypt rounds | 13 | `core/security.py` | -| Token localStorage key | `"token"` | `useAuth.ts` | | Health check interval | 30 s | `service_health.py` | | Service poll (frontend) | 30 s | `AppsPage.tsx`, `DashboardPage.tsx` | -| User `color_mode` default | NULL (falls back to admin default_mode, then system) | `models/user.py` | -| Max dashboard pinned apps | 50 | `schemas/user.py` | -| App ID max length | 64 chars | `schemas/user.py` | -| App ID allowed chars | `[a-zA-Z0-9_\-]` | `schemas/user.py` | -| full_name max length | 128 chars | `schemas/user.py` | -| Group name max length | 128 chars | `schemas/group.py` | -| Group description max | 512 chars | `schemas/group.py` | -| Phone max length | 20 chars | `sanitize.py` | -| Position max length | 128 chars | `schemas/profile.py` | -| Address max length | 255 chars | `schemas/profile.py` | -| Document title max | 500 chars | `models/document.py` | -| Category name max | 128 chars | `models/category.py` | -| PDF max size (default) | 20 MB | admin settings (configurable) | -| Raw text cap | 500 k chars | `doc-service` AI client | -| Documents per_page | 1–100, default 20 | `routers/documents.py` | -| AI service timeout | 60 s | `ai_client.py` | -| AI service max retries | 2 | `ai_client.py` | + +All other per-service 
defaults are in the relevant sub-CLAUDE.md file. --- diff --git a/backend/CLAUDE.md b/backend/CLAUDE.md new file mode 100644 index 0000000..4c85907 --- /dev/null +++ b/backend/CLAUDE.md @@ -0,0 +1,350 @@ +# backend — Claude context + +FastAPI async gateway, port 8000 (internal). Handles auth, user/group management, settings, and proxies document/category requests to `doc-service:8001`. See root `CLAUDE.md` for architecture, Docker, and project-wide workflows. + +--- + +## Commands + +All commands run inside Docker — never on the host. + +### Migrations + +```bash +docker compose exec backend alembic revision --autogenerate -m "describe change" +docker compose exec backend alembic upgrade head +docker compose exec backend alembic downgrade -1 +``` + +### Lint + +```bash +docker compose exec backend sh -c "ruff check . && ruff format ." +``` + +--- + +## File & Folder Tree + +``` +backend/ +├── app/ +│ ├── main.py ← App factory, router registration, lifespan (health loop) +│ ├── database.py ← AsyncEngine, AsyncSessionLocal, Base +│ ├── deps.py ← get_current_user, get_current_admin, get_service_admin(id), check_plugin_access (also get_user_groups in doc-service) +│ ├── core/ +│ │ ├── config.py ← All settings via pydantic-settings (reads .env) +│ │ ├── security.py ← JWT sign/verify (RS256), bcrypt hash/verify +│ │ ├── sanitize.py ← Input sanitization helpers (see Security Standards) +│ │ └── app_config.py ← Per-service config load/save to /config volume; theme files in /config/themes/ +│ ├── models/ +│ │ ├── __init__.py ← Imports all models (required for Alembic autogenerate) +│ │ ├── user.py ← User model +│ │ ├── profile.py ← Profile model +│ │ └── group.py ← Group, GroupMembership models +│ ├── schemas/ +│ │ ├── user.py ← UserCreate/Out, Token, DashboardPrefsOut/Update +│ │ ├── profile.py ← ProfileRead, ProfileUpdate +│ │ └── group.py ← GroupCreate/Update/Out/DetailOut, GroupMemberOut +│ ├── routers/ +│ │ ├── auth.py ← POST /register, POST /login +│ │ ├── users.py ← GET /me, 
GET+PATCH /me/preferences, PATCH /me/color-mode, GET /me/groups +│ │ ├── profile.py ← GET+PUT /me (profile) +│ │ ├── admin.py ← User admin CRUD (admin-only) +│ │ ├── groups.py ← Group CRUD + member management (admin-only) +│ │ ├── settings.py ← AI, doc limits, system prompts, appearance, themes (admin-only) +│ │ ├── services.py ← GET /services (health status) +│ │ ├── plugins.py ← Generic plugin proxy (GET/PATCH /api/plugins/*) +│ │ ├── categories_proxy.py ← Transparent proxy → doc-service /categories/* +│ │ └── documents_proxy.py ← Transparent proxy → doc-service /documents/* +│ └── services/ +│ ├── service_health.py ← Background 30s health-check loop; caches /plugin/manifest per service +│ └── group_bootstrap.py ← Ensures {service-id}-admin group exists for every registered service at startup +├── alembic/ +│ ├── env.py ← Async migration runner +│ └── versions/ ← Migration chain (see Database Models) +├── scripts/seed.py ← Seed test user +├── Dockerfile ← python:3.12-slim, non-root user 1001 +└── STATUS.md +``` + +--- + +## Database Models + +### `users` + +| Column | Type | Constraints | Notes | +|--------|------|-------------|-------| +| `id` | String | PK, UUID | auto-generated | +| `email` | String | UNIQUE, indexed, NOT NULL | lowercased before storing | +| `hashed_password` | String | NOT NULL | bcrypt 13 rounds | +| `full_name` | String | nullable | sanitized max 128 chars | +| `is_active` | Boolean | default=True | soft-delete flag | +| `is_superuser` | Boolean | default=False | admin role; never exposed as-is (serialised as `is_admin`) | +| `dashboard_app_ids` | JSON | NOT NULL, default=[] | list of pinned service IDs | +| `color_mode` | String | nullable, default=NULL | user's preferred mode: "light" / "dark" / "system" / NULL (use admin default) | + +Relationship: `profile` (one-to-one, cascade all+delete-orphan) + +### `profiles` + +| Column | Type | Constraints | Notes | +|--------|------|-------------|-------| +| `id` | String | PK, UUID | 
auto-generated | +| `user_id` | String | FK→users.id UNIQUE, cascade delete | one-to-one | +| `phone` | String(20) | nullable | validated format | +| `date_of_birth` | Date | nullable | 1900+ and not future | +| `position` | String(128) | nullable | job title | +| `address` | String(255) | nullable | | +| `updated_at` | DateTime(tz) | server_default=now(), onupdate=now() | | + +### `groups` + +| Column | Type | Constraints | +|--------|------|-------------| +| `id` | String | PK, UUID | +| `name` | String(128) | UNIQUE indexed, NOT NULL | +| `description` | String(512) | nullable | +| `created_at` | DateTime(tz) | server_default=now() | + +### `group_memberships` + +| Column | Type | Constraints | +|--------|------|-------------| +| `id` | String | PK, UUID | +| `group_id` | String | FK→groups.id, indexed, CASCADE | +| `user_id` | String | FK→users.id, indexed, CASCADE | +| `joined_at` | DateTime(tz) | server_default=now() | + +Unique constraint: `(group_id, user_id)` + +### Migration chain (must be applied in order) + +| Rev ID | Slug | +|--------|------| +| `38efeff7c45a` | `create_users_table` | +| `676084df61d1` | `add_profiles_table` | +| `a3f9c2d14e87` | `add_groups_and_group_memberships` | +| `c7e8f9a0b1d2` | `add_dashboard_app_ids_to_users` | +| `dd6ad2f2c211` | `add_color_mode_to_users` | + +--- + +## API Endpoints + +### Auth (`/api/auth`) — public + +| Method | Path | Auth | Description | +|--------|------|------|-------------| +| POST | `/api/auth/register` | — | Create account; returns `UserOut`; enforces password policy | +| POST | `/api/auth/login` | — | OAuth2 password flow; returns `{access_token, token_type}` | + +### Users (`/api/users`) — authenticated + +| Method | Path | Auth | Description | +|--------|------|------|-------------| +| GET | `/api/users/me` | user | Current user info → `UserOut` | +| GET | `/api/users/me/preferences` | user | Dashboard pinned app IDs → `{app_ids}` | +| PATCH | `/api/users/me/preferences` | user | Save pinned app 
IDs (max 50, slug-safe) | +| PATCH | `/api/users/me/color-mode` | user | Save colour mode preference ("light"/"dark"/"system") | +| GET | `/api/users/me/groups` | user | Groups current user belongs to → `list[UserGroupOut]` | + +### Profile (`/api/profile`) — authenticated + +| Method | Path | Auth | Description | +|--------|------|------|-------------| +| GET | `/api/profile/me` | user | Fetch profile; auto-creates if missing | +| PUT | `/api/profile/me` | user | Update profile fields | + +### Admin — Users (`/api/admin`) — admin-only + +| Method | Path | Description | +|--------|------|-------------| +| GET | `/api/admin/users` | List all users → `list[UserAdminOut]` | +| POST | `/api/admin/users` | Create user (with optional is_admin) | +| DELETE | `/api/admin/users/{user_id}` | Delete user (204) | +| PATCH | `/api/admin/users/{user_id}/active` | Toggle active status | + +### Admin — Groups (`/api/admin/groups`) — admin-only + +| Method | Path | Description | +|--------|------|-------------| +| GET | `/api/admin/groups` | List groups with member count | +| POST | `/api/admin/groups` | Create group | +| GET | `/api/admin/groups/{id}` | Group detail + members | +| PATCH | `/api/admin/groups/{id}` | Update name / description | +| DELETE | `/api/admin/groups/{id}` | Delete (cascades memberships) | +| POST | `/api/admin/groups/{id}/members/{user_id}` | Add member | +| DELETE | `/api/admin/groups/{id}/members/{user_id}` | Remove member | + +### Settings (`/api/settings`) — admin-only + +| Method | Path | Description | +|--------|------|-------------| +| GET | `/api/settings/ai` | AI config (keys masked) | +| PATCH | `/api/settings/ai` | Update AI provider / credentials | +| POST | `/api/settings/ai/test` | Test AI connection | +| GET | `/api/settings/documents/limits` | PDF upload limits | +| PATCH | `/api/settings/documents/limits` | Update max PDF size | +| GET | `/api/settings/system-prompts` | All editable system prompts | +| PATCH | 
`/api/settings/system-prompts/{service_id}` | Update system prompt | +| GET | `/api/settings/appearance` | Active theme + default mode (auth) | +| PATCH | `/api/settings/appearance` | Update active theme + default mode (admin) | +| GET | `/api/settings/themes` | List all themes — built-in + custom (auth) | +| POST | `/api/settings/themes` | Create custom theme (admin) | +| PATCH | `/api/settings/themes/{id}` | Update custom theme label/colours (admin) | +| DELETE | `/api/settings/themes/{id}` | Delete custom theme (admin, 204) | + +### Services (`/api/services`) — authenticated + +| Method | Path | Description | +|--------|------|-------------| +| GET | `/api/services` | Health status of all registered services → `list[ServiceStatus]` | + +### Plugins (`/api/plugins`) — authenticated, auth-per-plugin + +| Method | Path | Description | +|--------|------|-------------| +| GET | `/api/plugins` | List plugins accessible to current user | +| GET | `/api/plugins/{id}/manifest` | Plugin manifest with settings JSON Schema (auth-gated) | +| GET | `/api/plugins/{id}/settings` | Proxy to feature `/plugin/settings` (auth-gated) | +| PATCH | `/api/plugins/{id}/settings` | Proxy to feature `/plugin/settings` (auth-gated) | + +Auth: is_superuser OR member of group listed in manifest `required_groups`. Returns 404 (not 403) to hide existence. + +### Documents and Categories — proxied + +`/api/documents/*` and `/api/documents/categories/*` are transparently proxied to `doc-service:8001`. The backend injects `x-user-id` and `x-user-groups` headers. See `features/doc-service/CLAUDE.md` for the internal endpoint list. + +--- + +## Security Standards + +These standards are **non-negotiable**. Every change must comply. 
+ +### JWT + +- **Algorithm**: RS256 (4096-bit RSA key pair, generated by `scripts/generate_jwt_keys.py`) +- **Keys**: PEM-encoded in `backend/.env` as `JWT_PRIVATE_KEY` / `JWT_PUBLIC_KEY` (gitignored) +- **Expiry**: 8 hours (`EXPIRE_MINUTES=480`) — never set longer; no refresh tokens +- **Claims**: `{sub: user_id, exp, iat}` — user_id is a UUID string +- **Validation**: `decode_access_token()` in `core/security.py`; called by `get_current_user` +- **Never**: set algorithm to `"none"`, disable `verify_exp`, or hardcode secrets in code + +### Password hashing + +- **Algorithm**: bcrypt, **13 rounds** (`bcrypt.gensalt(rounds=13)`) +- **Timing**: ~300 ms per hash (intentional brute-force resistance) +- **Never** use MD5, SHA1, or plain SHA256 for password storage + +### Password policy (enforced in `UserCreate` schema) + +All of the following must pass: +- ≥ 8 characters +- ≥ 1 uppercase (A–Z) +- ≥ 1 lowercase (a–z) +- ≥ 1 digit (0–9) +- ≥ 1 special character: `!@#$%^&*()\-_=+[]{}|;:'"<>?/\`~` +- No common words (password, secret, login, admin, test, qwerty, welcome, …) + +### Input sanitization + +Every user-supplied string stored in the database **must** pass through `core/sanitize.py`: + +```python +sanitize_str(value, max_len=255) +# → strips whitespace; rejects null bytes (\x00); rejects control chars +# (0x01–0x1F, 0x7F except \t \n \r); enforces max_len; returns None for "" + +normalize_email(value) # lowercase + strip +validate_phone(value) # sanitize_str(max=20) + regex ^\+?[\d\s\-()\[\]]{7,20}$ +validate_date_of_birth(v) # must be ≥ 1900, not future +``` + +Apply via Pydantic `@field_validator` on all request schemas. + +### SQL injection prevention + +- Use SQLAlchemy ORM (bound parameters) — **never** raw SQL strings. +- If `text()` is needed, use `bindparam()` for all user-supplied values. +- **Never** use f-strings, `.format()`, or `%`-formatting for SQL. + +### Admin route security + +- Use `get_current_admin` dependency (checks `is_superuser`). 
+- Return **404** (not 403) for unauthorized access — hides both endpoint existence and permission model. + +--- + +## Naming & Code Conventions + +### Database + +- **Tables**: lowercase, plural, snake_case (`users`, `group_memberships`, `document_category_assignments`) +- **Columns**: lowercase, snake_case +- **ORM models**: PascalCase, singular (`User`, `Group`, `GroupMembership`, `Document`) +- Primary keys: `id` (String UUID, auto-generated) +- Timestamps: `created_at` / `updated_at` / `joined_at` / `processed_at` — always timezone-aware + +### Pydantic schemas + +| Suffix | Purpose | +|--------|---------| +| `Create` | POST request body (user-supplied input) | +| `Update` | PATCH request body (partial update) | +| `Out` | API response (safe subset of model) | +| `AdminOut` | Extended response for admin endpoints | +| `Read` | GET response (same as `Out`, used for profiles) | + +Always set `model_config = {"from_attributes": True}` on response schemas. +Use `validation_alias` when the ORM field name differs from the JSON key (e.g., `is_superuser` → `is_admin`). + +### HTTP status codes + +| Code | Use | +|------|-----| +| 200 | Successful GET / PATCH / PUT | +| 201 | Successful POST that creates a resource | +| 202 | Accepted (async processing started, e.g., document upload) | +| 204 | Successful DELETE or action with no response body | +| 400 | Bad request (duplicates, invalid data beyond Pydantic) | +| 401 | Missing / invalid JWT | +| 404 | Not found **and** admin routes when not admin | +| 413 | Payload too large (file exceeds limit) | +| 415 | Unsupported media type (not a PDF) | +| 422 | Pydantic validation failure (FastAPI default) | +| 502 | Downstream service unreachable | +| 503 | Service unavailable (queue stopped, AI error) | +| 504 | Gateway timeout | + +### Backend code style + +- Async/await for **all** I/O (DB, HTTP, file). +- `raise HTTPException(status_code=..., detail="...")` for all errors. 
+- Response models always declared in route decorator: `@router.get("/path", response_model=XOut)`. +- Background tasks via `BackgroundTasks` param; tasks open their own `AsyncSessionLocal` session. +- Commit + refresh pattern after mutations: + ```python + await db.commit() + await db.refresh(obj) + ``` + +--- + +## Default Values & Limits + +| Parameter | Value | Location | +|-----------|-------|----------| +| JWT expiry | 480 min (8 h) | `core/security.py` | +| Bcrypt rounds | 13 | `core/security.py` | +| User `color_mode` default | NULL (falls back to admin default_mode, then system) | `models/user.py` | +| Max dashboard pinned apps | 50 | `schemas/user.py` | +| App ID max length | 64 chars | `schemas/user.py` | +| App ID allowed chars | `[a-zA-Z0-9_\-]` | `schemas/user.py` | +| full_name max length | 128 chars | `schemas/user.py` | +| Group name max length | 128 chars | `schemas/group.py` | +| Group description max | 512 chars | `schemas/group.py` | +| Phone max length | 20 chars | `sanitize.py` | +| Position max length | 128 chars | `schemas/profile.py` | +| Address max length | 255 chars | `schemas/profile.py` | diff --git a/features/ai-service/CLAUDE.md b/features/ai-service/CLAUDE.md new file mode 100644 index 0000000..75582ed --- /dev/null +++ b/features/ai-service/CLAUDE.md @@ -0,0 +1,50 @@ +# ai-service — Claude context + +AI provider intermediary, port 8010 (internal only — never proxied to the browser). Accepts chat requests from `doc-service` (and potentially other callers). Manages a priority queue and abstracts over multiple AI providers (Anthropic, Ollama/LM Studio). See root `CLAUDE.md` for architecture, Docker, and project-wide workflows. 
+ +--- + +## File & Folder Tree + +``` +features/ai-service/ +├── app/ +│ ├── main.py ← FastAPI, queue worker lifespan +│ ├── core/ +│ │ └── config.py ← Settings via pydantic-settings +│ ├── providers/ +│ │ ├── base.py ← AIProvider abstract class +│ │ ├── anthropic_provider.py ← Anthropic API integration +│ │ └── openai_compat.py ← Ollama / LM Studio compatibility +│ ├── routers/ +│ │ ├── chat.py ← POST /chat (sync, NORMAL priority queue) +│ │ ├── health.py ← GET /health +│ │ ├── queue.py ← GET /queue/status, /pause, /resume, /cancel/{id} +│ │ └── plugin.py ← GET /plugin/manifest (access rules for ai-service-admin group) +│ └── services/ +│ └── queue.py ← Priority queue (CRITICAL > HIGH > NORMAL) +├── Dockerfile ← python:3.12-slim, non-root user 1001 +└── STATUS.md +``` + +--- + +## API Endpoints (internal only) + +| Method | Path | Description | +|--------|------|-------------| +| POST | `/chat` | Chat request (queued at NORMAL priority) | +| GET | `/health` | Health check | +| GET | `/queue/status` | Queue state | +| POST | `/queue/pause` | Pause queue | +| POST | `/queue/resume` | Resume queue | +| POST | `/queue/cancel/{job_id}` | Cancel job | +| GET | `/plugin/manifest` | Plugin manifest (access rules for ai-service-admin group) | + +These endpoints are only reachable on `backend-net`. The backend does not expose them to the browser. + +--- + +## Note on timeout and retry configuration + +Caller-side timeout and retry settings live in `features/doc-service/app/services/ai_client.py` — see `features/doc-service/CLAUDE.md` for the values. diff --git a/features/doc-service/CLAUDE.md b/features/doc-service/CLAUDE.md new file mode 100644 index 0000000..3068298 --- /dev/null +++ b/features/doc-service/CLAUDE.md @@ -0,0 +1,176 @@ +# doc-service — Claude context + +PDF extraction microservice, port 8001 (internal). Shares the same PostgreSQL instance as the backend. 
Receives proxied requests from `backend:8000`, which injects `x-user-id` and `x-user-groups` headers — doc-service trusts these headers directly. Calls `ai-service:8010` for document classification. See root `CLAUDE.md` for architecture, Docker, and project-wide workflows. + +--- + +## Commands + +All commands run inside Docker — never on the host. + +```bash +docker compose exec doc-service alembic revision --autogenerate -m "describe change" +docker compose exec doc-service alembic upgrade head +docker compose exec doc-service alembic downgrade -1 +``` + +--- + +## File & Folder Tree + +``` +features/doc-service/ +├── app/ +│ ├── main.py ← FastAPI, lifespan (file watcher start/stop) +│ ├── database.py ← Same PostgreSQL instance as backend +│ ├── deps.py ← get_user_id (x-user-id), get_user_groups (x-user-groups) +│ ├── models/ +│ │ ├── document.py ← Document model +│ │ ├── category.py ← DocumentCategory model +│ │ ├── category_assignment.py ← CategoryAssignment (composite PK) +│ │ └── document_share.py ← DocumentShare model (group-based sharing) +│ ├── schemas/ +│ │ ├── document.py ← DocumentOut, DocumentPage, DocumentStatusOut, etc. 
+│ │ ├── category.py ← CategoryOut, CategoryCreate, CategoryUpdate +│ │ └── share.py ← DocumentShareOut, DocumentShareCreate, SharedDocumentOut +│ ├── routers/ +│ │ ├── documents.py ← Full CRUD + file serving + reprocess + suggestions + sharing +│ │ ├── categories.py ← Category CRUD (includes watch-owned categories) +│ │ └── plugin.py ← GET /plugin/manifest, GET+PATCH /plugin/settings +│ └── services/ +│ ├── storage.py ← File I/O +│ ├── ai_client.py ← classify_document() → ai-service:8010/chat +│ ├── config_reader.py ← Config load/save including storage/watch settings +│ └── file_watcher.py ← watchdog-based PDF watcher + startup scan + ingestion +├── alembic/versions/ ← Migration chain +│ ├── 0003_add_watch_columns.py ← source, watch_path, suggested_folder, suggested_filename +│ └── 0004_add_document_shares.py ← document_shares table (group-based sharing) +├── Dockerfile ← python:3.12-slim, non-root user 1001 +└── STATUS.md +``` + +--- + +## Database Models + +### `documents` + +| Column | Type | Constraints | Notes | +|--------|------|-------------|-------| +| `id` | String | PK, UUID | | +| `user_id` | String | indexed | not FK — trusts x-user-id header | +| `filename` | String | NOT NULL | | +| `file_path` | String | NOT NULL | absolute path under /data/documents | +| `file_size` | Integer | NOT NULL | bytes | +| `status` | String | default="pending" | pending / processing / done / failed | +| `title` | String(500) | nullable | AI-extracted | +| `document_type` | String | nullable | invoice / bill / receipt / order / expense / revenue / unknown | +| `raw_text` | Text | nullable | first 500 k chars | +| `extracted_data` | Text | nullable | JSON string | +| `tags` | Text | nullable | JSON array string | +| `error_message` | String(500) | nullable | | +| `created_at` | DateTime(tz) | server_default=now() | | +| `processed_at` | DateTime(tz) | nullable | | +| `source` | String(16) | default="upload" | "upload" or "watch" | +| `watch_path` | String | nullable | 
original absolute path in watch directory | +| `suggested_folder` | String(128) | nullable | AI-suggested category (pending user confirm) | +| `suggested_filename` | String(500) | nullable | AI-suggested title/rename (pending user confirm) | + +### `document_categories` + +| Column | Type | Constraints | +|--------|------|-------------| +| `id` | String | PK, UUID | +| `user_id` | String | indexed | +| `name` | String(128) | NOT NULL | +| `created_at` | DateTime(tz) | server_default=now() | + +### `document_category_assignments` (composite PK) + +| Column | Type | Constraints | +|--------|------|-------------| +| `document_id` | String | PK + FK→documents.id CASCADE | +| `category_id` | String | PK + FK→document_categories.id CASCADE | + +### `document_shares` + +| Column | Type | Constraints | Notes | +|--------|------|-------------|-------| +| `id` | String | PK, UUID | | +| `document_id` | String | indexed, NOT NULL | not FK — trusts proxy | +| `group_id` | String | indexed, NOT NULL | group from backend | +| `shared_by_user_id` | String | NOT NULL | owner who shared | +| `created_at` | DateTime(tz) | server_default=now() | | + +Unique constraint: `(document_id, group_id)` + +### Migration chain + +| Rev ID | Slug | +|--------|------| +| `0001` | `create_doc_tables` | +| `0002` | `add_document_title` | +| `0003` | `add_watch_columns` | +| `0004` | `add_document_shares` | + +--- + +## API Endpoints (internal — reached via backend proxy) + +All these endpoints are proxied from `backend:8000`. The backend injects `x-user-id` and `x-user-groups` before forwarding. 
+ +### Documents + +| Method | Path | Description | +|--------|------|-------------| +| POST | `/documents/upload` | Upload PDF (202, background processing) | +| GET | `/documents` | Paginated list (filterable: search, status, type, category, sort) | +| GET | `/documents/{id}` | Document detail | +| GET | `/documents/{id}/status` | Processing status only | +| PATCH | `/documents/{id}/type` | Update document type | +| PATCH | `/documents/{id}/tags` | Update tags | +| PATCH | `/documents/{id}/title` | Update title | +| POST | `/documents/{id}/reprocess` | Re-run AI extraction | +| DELETE | `/documents/{id}` | Delete document (204) | +| GET | `/documents/{id}/file` | Download PDF (streaming) | +| POST | `/documents/{id}/categories/{cat_id}` | Assign category | +| DELETE | `/documents/{id}/categories/{cat_id}` | Remove category | +| POST | `/documents/{id}/suggestions/folder/confirm` | Confirm AI folder suggestion | +| POST | `/documents/{id}/suggestions/folder/reject` | Reject AI folder suggestion | +| POST | `/documents/{id}/suggestions/filename/confirm` | Confirm AI filename suggestion | +| POST | `/documents/{id}/suggestions/filename/reject` | Reject AI filename suggestion | +| GET | `/documents/shared-with-me` | Documents shared with current user via their groups | +| GET | `/documents/{id}/shares` | List groups the document is shared with (owner only) | +| POST | `/documents/{id}/shares` | Share with a group (owner only; group must be in user's groups) | +| DELETE | `/documents/{id}/shares/{group_id}` | Stop sharing with a group (owner only) | + +### Categories + +| Method | Path | Description | +|--------|------|-------------| +| GET | `/categories` | List user's categories | +| POST | `/categories` | Create category (triggers background AI reanalysis) | +| PATCH | `/categories/{id}` | Rename | +| DELETE | `/categories/{id}` | Delete (204) | + +### Plugin + +| Method | Path | Description | +|--------|------|-------------| +| GET | `/plugin/manifest` | Plugin 
manifest with settings JSON Schema | +| GET | `/plugin/settings` | Current plugin settings | +| PATCH | `/plugin/settings` | Update plugin settings | + +--- + +## Default Values & Limits + +| Parameter | Value | Location | +|-----------|-------|----------| +| Document title max | 500 chars | `models/document.py` | +| Category name max | 128 chars | `models/category.py` | +| PDF max size (default) | 20 MB | admin settings (configurable) | +| Raw text cap | 500 k chars | `services/ai_client.py` | +| Documents per_page | 1–100, default 20 | `routers/documents.py` | +| AI service timeout | 60 s | `services/ai_client.py` | +| AI service max retries | 2 | `services/ai_client.py` | diff --git a/frontend/CLAUDE.md b/frontend/CLAUDE.md new file mode 100644 index 0000000..7631aec --- /dev/null +++ b/frontend/CLAUDE.md @@ -0,0 +1,176 @@ +# frontend — Claude context + +React 18 SPA built with Vite, port 5173 dev / 80 prod, served by nginx-unprivileged in production. All `/api/*` requests are proxied to `backend:8000`. See root `CLAUDE.md` for architecture, Docker, and project-wide workflows. + +--- + +## Commands + +All commands run inside Docker — never on the host. 
+ +```bash +docker compose exec frontend npm run typecheck +docker compose exec frontend npm run lint +``` + +--- + +## File & Folder Tree + +``` +frontend/ +├── src/ +│ ├── main.tsx ← React root, QueryClientProvider, BrowserRouter +│ ├── App.tsx ← Route tree, PrivateRoute, AdminRoute +│ ├── api/client.ts ← Axios instance + ALL API functions (single source of truth) +│ ├── hooks/ +│ │ ├── useAuth.ts ← Token state (localStorage), login/logout +│ │ └── useTheme.ts ← Theme toggle +│ ├── components/ +│ │ ├── AppShell.tsx ← Layout: Sidebar + SourcePanel (on /apps/documents) + main +│ │ ├── Sidebar.tsx ← Collapsible nav (icons ↔ icons+labels) +│ │ ├── SourcePanel.tsx ← Views + searchable category tree (docs route only) +│ │ ├── ManageCategoriesDialog.tsx ← Category CRUD modal (rename, delete) +│ │ ├── DocumentSlideOver.tsx ← Right slide-over: detail, edit, share, AI suggestions +│ │ ├── ThemeToggle.tsx ← Light/dark mode toggle +│ │ ├── PluginSchemaForm.tsx ← JSON Schema → React form (boolean/string/number/readOnly) +│ │ └── ui/ ← shadcn/ui components (Button, Input, …) +│ ├── pages/ ← One file per route +│ │ ├── DocServiceSettingsPage.tsx ← Combined doc-service settings: upload limits + watch directory +│ │ └── PluginSettingsPage.tsx ← Generic plugin settings page driven by manifest +│ ├── lib/utils.ts ← cn() = clsx + tailwind-merge +│ └── styles/theme.css ← CSS custom properties, Tailwind setup +├── vite.config.ts ← /api/* proxied to backend:8000 +├── tailwind.config.ts +├── components.json ← shadcn/ui config +├── Dockerfile ← Multi-stage: Node build → nginx-unprivileged +└── STATUS.md +``` + +--- + +## Frontend Routes + +| Path | Component | Guard | +|------|-----------|-------| +| `/login` | `LoginPage` | Public | +| `/` | `DashboardPage` | PrivateRoute | +| `/apps` | `AppsPage` | PrivateRoute | +| `/apps/documents` | `DocumentsPage` | PrivateRoute | +| `/apps/documents/settings` | `DocServiceSettingsPage` | ServiceAdminRoute (is_admin OR doc-service-admin member) | 
+| `/apps/ai/settings` | `AIAdminSettingsPage` | ServiceAdminRoute (is_admin OR ai-service-admin member) | +| `/profile` | `ProfilePage` | PrivateRoute | +| `/settings` | `SettingsPage` | PrivateRoute | +| `/settings/plugins/:id` | `PluginSettingsPage` | PrivateRoute (auth enforced per-plugin by backend) | +| `/admin` | `AdminPage` (→ `/admin/users`) | AdminRoute | +| `/admin/users` | `AdminUsersPage` | AdminRoute | +| `/admin/groups` | `AdminGroupsPage` | AdminRoute | +| `/admin/appearance` | `AdminAppearancePage` | AdminRoute | +| `*` | redirect to `/` | — | + +`PrivateRoute` — checks `token` from `useAuth`, redirects to `/login` if absent. +`AdminRoute` — checks token AND queries `GET /api/users/me` for `is_admin`; waits for query to avoid flash; redirects to `/login` (not `/`) if not admin. + +--- + +## Security Standards + +### XSS prevention + +- React JSX text interpolation (`{value}`) is HTML-escaped by the DOM renderer — **never** use `dangerouslySetInnerHTML` with user-supplied content. +- Server-side `sanitize_str` provides defense-in-depth (control char stripping, max length). + +--- + +## Frontend Patterns & Conventions + +### API client (`src/api/client.ts`) + +Single Axios instance — **all** API calls live here, nowhere else: + +```typescript +const api = axios.create({ baseURL: "/api" }); +api.interceptors.request.use((config) => { + const token = localStorage.getItem("token"); + if (token) config.headers.Authorization = `Bearer ${token}`; + return config; +}); +``` + +Adding a new API call: +1. Define a TypeScript interface for the response if it's new. +2. Add a named export function (`getX`, `createX`, `updateX`, `deleteX`). +3. Use `api.get(...)`, `api.post(...)`, etc.; always `.then((r) => r.data)`. 
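The three steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the `ServiceStatus` fields, the `ApiLike` stand-in, the `makeGetServices` factory, and `fakeApi` are all hypothetical, introduced so the sketch runs without the axios package. In `client.ts` the function would presumably be a plain named export that uses the shared `api` instance directly.

```typescript
// Step 1: an interface for the response.
// Hypothetical fields; the real shape would be defined in client.ts.
interface ServiceStatus {
  id: string;
  healthy: boolean;
}

// Assumption: a stand-in for the small part of the Axios instance used here.
// In client.ts this role is played by `api`.
interface ApiLike {
  get<T>(url: string): Promise<{ data: T }>;
}

// Steps 2 and 3: a named function that calls api.get and unwraps r.data.
function makeGetServices(api: ApiLike): () => Promise<ServiceStatus[]> {
  return () => api.get<ServiceStatus[]>("/services").then((r) => r.data);
}

// Fake transport that echoes the requested URL, for illustration only.
const fakeApi: ApiLike = {
  get: <T>(url: string) =>
    Promise.resolve({ data: [{ id: url, healthy: true }] as unknown as T }),
};

const getServices = makeGetServices(fakeApi);
```

Passing the transport in as a parameter is only a convenience for this sketch; it keeps the example free of network access while preserving the `.then((r) => r.data)` unwrapping that the steps call for.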
+ +### TanStack Query conventions + +**Query keys** (flat arrays, lowercase): +```typescript +["me"] // current user +["services"] // service health list +["dashboard-prefs"] // user dashboard preferences +["categories"] // document categories +["documents", params] // document list (params object for cache isolation) +["documents-shared", params] // shared-with-me list +["document", id] // single document +["document-shares", id] // share list for a specific document +["my-groups"] // current user's group memberships (for share picker) +["plugins"] // accessible plugin list (filtered by user access) +["plugin-manifest", id] // plugin manifest (cached) +["plugin-settings", id] // plugin current settings +``` + +**Mutation pattern**: +```typescript +const mutation = useMutation({ + mutationFn: apiFunction, + onSuccess: () => { + queryClient.invalidateQueries({ queryKey: ["affected-key"] }); + // additional side effects (close dialog, reset form, etc.) + }, +}); +// Usage: +mutation.mutate(data); +mutation.isPending // show spinner / disable button +mutation.isError // show error message +``` + +**Polling**: +```typescript +useQuery({ queryKey: ["services"], queryFn: getServices, + refetchInterval: 30_000, refetchIntervalInBackground: true }); +``` + +### Route guards + +```typescript +// PrivateRoute — redirect to /login if no token +// AdminRoute — redirect to /login if no token OR not admin +// (waits for getMe() query to avoid flash; uses 404 semantics) +``` + +### Component patterns + +- Functional components only. +- Local `useState` for UI-only state (edit mode, pending values, open/closed). +- Server state via `useQuery` / `useMutation` — no duplicated local copies. +- `cn()` from `lib/utils.ts` for conditional Tailwind classes. +- `lucide-react` for all icons. +- Never use `dangerouslySetInnerHTML` with user-supplied content. + +--- + +## Naming & Code Conventions + +- TypeScript strict mode — no `any`. 
+- API response types inferred from interfaces in `client.ts` only. +- Error messages displayed inline (no alert); loading shown as disabled state or "…" text. +- All user-facing text: safe via React JSX rendering (not innerHTML). + +--- + +## Default Values & Limits + +| Parameter | Value | Location | +|-----------|-------|----------| +| Token localStorage key | `"token"` | `useAuth.ts` |