# CLAUDE.md

This file provides permanent, authoritative guidance to Claude Code for every session. It covers project-wide concerns only. Service-specific details live in sub-files — read them only when working in that service:

- `backend/CLAUDE.md` — auth/users/admin/settings/plugins endpoints; DB models; JWT/bcrypt/sanitization security; naming conventions
- `frontend/CLAUDE.md` — routes, components, API client patterns, XSS prevention
- `features/ai-service/CLAUDE.md` — /chat, /health, /queue endpoints; queue service
- `features/doc-service/CLAUDE.md` — document/category/share endpoints; DB models; PDF limits; file watcher
- `features/storage-service/CLAUDE.md` — storage API, pluggable backend drivers (local/S3/WebDAV), migration

---

## Merge checklist

Before merging any feature branch into `main`, every test relevant to the changed area in `tests/ALL_TESTS.md` (and the relevant service-specific file) must be marked passing. The test suite covers all 20 feature areas across five service files:

- `tests/backend_tests.md` — §1–9, §18
- `tests/frontend_tests.md` — §19
- `tests/doc-service_tests.md` — §10–16
- `tests/ai-service_tests.md` — §17
- `tests/storage-service_tests.md` — §20

Do not merge until those tests are marked passing.

---

## CLAUDE.md self-update checkpoint

**After every change to the codebase**, before committing, check which CLAUDE.md files need updating:

- New route added → update **API Endpoints** in `backend/CLAUDE.md`, `features/doc-service/CLAUDE.md`, or `features/ai-service/CLAUDE.md`; update **Frontend Routes** in `frontend/CLAUDE.md`
- New DB model or column → update **Database Models** in `backend/CLAUDE.md` or `features/doc-service/CLAUDE.md`
- New migration → update **Migration chain** table in `backend/CLAUDE.md` or `features/doc-service/CLAUDE.md`
- New file or directory → update **File & Folder Tree** in the relevant sub-file; update the high-level tree in this root file only if a top-level directory changes
- Limit or default value added or changed → update **Default Values & Limits** in the relevant sub-file
- New dependency, auth mechanism, or security pattern → update **Security Standards** in the relevant sub-file
- New Docker service, volume, network, or env var → update **Docker Infrastructure** in this file
- Stack version changed → update **Stack** in this file
- New feature or endpoint added → add test rows to **both** `tests/ALL_TESTS.md` (in the relevant section) **and** the matching service-specific file (`tests/backend_tests.md`, `tests/frontend_tests.md`, `tests/doc-service_tests.md`, `tests/ai-service_tests.md`, or `tests/storage-service_tests.md`). Use the same test number and format as existing rows.

This check is mandatory — treat it the same as updating STATUS.md.

---

## Stack

| Layer | Tech |
|---|---|
| Backend | FastAPI (async), SQLAlchemy 2 (async), Alembic, PostgreSQL 16 |
| Auth | JWT RS256 via `python-jose`, bcrypt via `bcrypt` (direct, 13 rounds) |
| Frontend | React 18, TypeScript, Vite, React Router v6, TanStack Query, Axios |
| UI Library | shadcn/ui (Radix primitives + Tailwind CSS v3) |
| Styling | Tailwind CSS v3, CSS custom properties for theme tokens |
| Containerisation | Docker Compose (6 services, non-root users, named volumes) |

---

## Commands

All test, build, and package-manager commands run **inside Docker** — never on the host. See the memory note: "Testing inside Docker only".

### Full stack

```bash
# Dev stack (hot-reload, Vite on :5173)
cp .env.example backend/.env
docker compose -f docker-compose.yml -f docker-compose.dev.yml up --build

# Prod stack
docker compose up --build -d
```

For service-specific commands (migrations, lint), see `backend/CLAUDE.md` and `frontend/CLAUDE.md`.

---

## File & Folder Tree

```
/
├── CLAUDE.md                          ← This file — project-wide context
├── README.md                          ← Project overview, containers table, Current State
├── TODO.md                            ← Task list
├── .env.example                       ← Template for backend/.env
├── docker-compose.yml                 ← Production (6 services, named volumes)
├── docker-compose.dev.yml             ← Dev overrides (hot-reload, host ports)
├── .githooks/pre-commit               ← Runs scripts/security_check.py before every commit
├── scripts/security_check.py          ← Static analysis: secrets, weak crypto, SQLi, JWT
├── changelog/YYYY-MM-DD_<slug>.md     ← Per-date change logs
├── tests/ALL_TESTS.md                 ← Full test suite (all 20 areas); must pass before merging to main
├── tests/backend_tests.md             ← Backend-only tests (§1–9, §18)
├── tests/frontend_tests.md            ← Frontend-only tests (§19)
├── tests/doc-service_tests.md         ← Doc-service tests (§10–16)
├── tests/ai-service_tests.md          ← AI-service tests (§17)
├── tests/storage-service_tests.md     ← Storage-service tests (§20)
├── dev-watch/                         ← Dev bind-mount for file watcher testing (.gitkeep only)
│
├── backend/                           ← FastAPI gateway (port 8000, internal); see backend/CLAUDE.md
├── features/
│   ├── ai-service/                    ← AI provider intermediary (port 8010, internal); see features/ai-service/CLAUDE.md
│   ├── doc-service/                   ← PDF extraction microservice (port 8001, internal); see features/doc-service/CLAUDE.md
│   └── storage-service/               ← Storage API with pluggable backend drivers (port 8020, internal); see features/storage-service/CLAUDE.md
└── frontend/                          ← React SPA (port 5173 dev / 80 prod); see frontend/CLAUDE.md
```

---

## Architecture

### Request flow

```
Browser (:5173 dev / :80 prod)
  │
  └── Vite dev proxy / nginx
        │
        └── /api/* ──→ backend:8000 (FastAPI)
                              │
              ┌───────────────┼───────────────────┐
           /auth           /admin            /documents/*
           /users          /groups           /documents/categories/*
           /profile        /settings
           /services          │                   │
                         JSON volume      proxy (injects x-user-id,
                         (/config)          x-user-groups) │
                                           doc-service:8001
                                                  │
                                           ai-service:8010
                                           (classify, chat)
```

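As an illustration of the proxy step in the diagram, here is a minimal sketch of how a gateway might attach the identity headers before forwarding a request to doc-service. The header names `x-user-id` and `x-user-groups` come from the diagram. The function name `build_proxy_headers`, the comma-separated group encoding, and the stripping of client-supplied copies are assumptions for illustration, not the backend's actual code.

```python
def build_proxy_headers(incoming: dict[str, str], user_id: int, groups: list[str]) -> dict[str, str]:
    """Return the headers to send upstream to doc-service (hypothetical helper)."""
    trusted = {"x-user-id", "x-user-groups"}
    # Drop any identity headers the client sent so they cannot be spoofed.
    headers = {k: v for k, v in incoming.items() if k.lower() not in trusted}
    headers["x-user-id"] = str(user_id)
    # Assumption: group names are joined into one comma-separated value.
    headers["x-user-groups"] = ",".join(groups)
    return headers
```

Removing client-supplied identity headers before setting them is the important design choice here: the downstream service can then trust the values because only the gateway can set them.
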
### Auth flow

1. `POST /api/auth/login` → RS256 JWT (8 h), stored in `localStorage`
2. Axios interceptor injects `Authorization: Bearer {token}` on every request
3. `get_current_user` dep validates token on every protected route
4. Admin routes additionally check `user.is_superuser`; return 404 (not 403) if not admin

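The status-code rule in step 4 can be sketched without any framework. The `User` dataclass and `admin_route_status` function below are hypothetical names, and returning 401 for a missing or invalid token is an assumption; only the 404-instead-of-403 behaviour for non-admins comes from the list above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class User:
    username: str
    is_superuser: bool = False


def admin_route_status(user: Optional[User]) -> int:
    """Return the HTTP status an admin route would answer with."""
    if user is None:
        # Assumption: a missing or invalid token is rejected with 401.
        return 401
    if not user.is_superuser:
        # 404 rather than 403, so non-admins cannot tell the route exists.
        return 404
    return 200
```

Answering 404 hides the existence of admin endpoints from ordinary users, which is why it is preferred over the more literal 403.
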

---

## Security Standards

These standards are **non-negotiable**. Every change must comply. Implementation-specific security rules (JWT, bcrypt, input sanitization, XSS, SQLi, admin routes) are in the relevant sub-CLAUDE.md files.

### Network isolation

- `backend-net`: all containers, including frontend (which proxies `/api/*` to the backend); not reachable from host in prod.
- `frontend-net`: only frontend; single host port (80 prod / 5173 dev).
- DB, backend, doc-service, ai-service, storage-service have **no** host port bindings in prod.

### Storage rule (non-negotiable)

**No service may write to a filesystem path for persistent data.** All file/blob storage must go through the storage-service HTTP API (`PUT/GET/DELETE /objects/{bucket}/{key}`). Config JSON files must be stored in the `config` bucket. Uploaded files must be stored in the `documents` bucket. Any violation is a security and architecture defect.

The only two persistent storage mechanisms in the project are:
1. **PostgreSQL** — structured/relational data
2. **storage-service** — all file/blob/config data (local filesystem by default; switchable to S3-compatible or WebDAV)

New services and features must follow this pattern. See `features/storage-service/CLAUDE.md` for the API reference.

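A minimal sketch of addressing objects through the storage rule's `PUT/GET/DELETE /objects/{bucket}/{key}` route, using only the standard library. The helper names `object_url` and `put_object_request` are hypothetical, and keeping `/` unescaped inside keys is an assumption; check `features/storage-service/CLAUDE.md` for the actual contract.

```python
from urllib.parse import quote
from urllib.request import Request

# Same value docker-compose injects as STORAGE_SERVICE_URL.
STORAGE_SERVICE_URL = "http://storage-service:8020"


def object_url(bucket: str, key: str) -> str:
    """Build the /objects/{bucket}/{key} URL for a stored object."""
    # Assumption: keys may contain "/" for nesting, so it is left unescaped.
    return f"{STORAGE_SERVICE_URL}/objects/{quote(bucket, safe='')}/{quote(key, safe='/')}"


def put_object_request(bucket: str, key: str, data: bytes) -> Request:
    """Build (but do not send) a PUT request that stores `data`."""
    return Request(object_url(bucket, key), data=data, method="PUT")
```

Passing the returned request to `urllib.request.urlopen` from inside a container on `backend-net` would send it; the sketch stops short of that so it can be read and tested without a running stack.
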
### Pre-commit security hook

`.githooks/pre-commit` runs `scripts/security_check.py` before every commit. It blocks commits that contain:

1. Hardcoded credentials / private keys / AWS creds
2. `eval()`, `exec()`, `shell=True`, `pickle.loads()`, `yaml.load()` without SafeLoader
3. MD5, SHA1, DES, `random.random()` / `random.randint()` for security use
4. SQL f-strings / format strings / concatenation passed to `execute()`/`query()`
5. JWT algorithm `"none"`, `verify_exp=False`, expiry > 9999 min, hardcoded secrets
6. `debug=True`, `print()` with passwords
7. `bandit` static analysis failures

**Never** bypass with `--no-verify` unless explicitly instructed by the user.

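To show the kind of line-based pattern check such a hook can perform, here is a small sketch covering three of the categories listed above. The rule set, the rule names, and the `scan` function are illustrative assumptions; the real `scripts/security_check.py` may be implemented quite differently.

```python
import re

# Hypothetical subset of rules, for illustration only.
RULES = {
    "shell=True": re.compile(r"shell\s*=\s*True"),
    "eval()/exec()": re.compile(r"\b(?:eval|exec)\("),
    "SQL f-string in execute()": re.compile(r"execute\(\s*f[\"']"),
}


def scan(source: str) -> list[tuple[int, str]]:
    """Return (line number, rule name) for every rule hit in `source`."""
    findings = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for name, pattern in RULES.items():
            if pattern.search(line):
                findings.append((lineno, name))
    return findings
```

A hook built this way would exit non-zero whenever `scan` returns any findings, which is what makes Git abort the commit.
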

---

## Default Values & Limits (cross-cutting)

| Parameter | Value | Location |
|-----------|-------|----------|
| Health check interval | 30 s | `service_health.py` |
| Service poll (frontend) | 30 s | `AppsPage.tsx`, `DashboardPage.tsx` |

All other per-service defaults are in the relevant sub-CLAUDE.md file.

---

## Docker Infrastructure

### Services

| Service | Image base | Internal port | User | Volumes | Network |
|---------|-----------|---------------|------|---------|---------|
| `db` | postgres:16-alpine | 5432 | 70:70 | `postgres_data` | backend-net |
| `backend` | python:3.12-slim | 8000 | 1001:1001 | — | backend-net |
| `ai-service` | python:3.12-slim | 8010 | 1001:1001 | — | backend-net |
| `doc-service` | python:3.12-slim | 8001 | 1001:1001 | `watch_data` | backend-net |
| `storage-service` | python:3.12-slim | 8020 | 1001:1001 | `storage_data` | backend-net |
| `frontend` | nginx-unprivileged:alpine | 8080 | 1001:1001 | — | backend-net, frontend-net |

### Volumes

| Volume | Mount path | Contains |
|--------|-----------|---------|
| `postgres_data` | `/var/lib/postgresql/data` | PostgreSQL data |
| `storage_data` | `/data/storage` | All file/blob storage: PDFs (`documents/`) and config JSONs (`config/`) |
| `watch_data` | `/data/watch` | Watch directory (bind-mount NAS/Nextcloud via docker-compose.override.yml) |

### Networks

| Network | Host-accessible | Members |
|---------|----------------|---------|
| `backend-net` | No (no host ports in prod) | db, backend, ai-service, doc-service, storage-service, frontend |
| `frontend-net` | Yes (port 80 → frontend:8080) | frontend |

### Environment variables (required in `backend/.env`)

```
DATABASE_URL=postgresql+asyncpg://<user>:<pass>@db:5432/destroying_sap
CORS_ORIGINS=["http://localhost:5173"]
JWT_PRIVATE_KEY=<PEM, newlines as \n>
JWT_PUBLIC_KEY=<PEM, newlines as \n>
```

Injected by docker-compose (not in `.env`):
```
DOC_SERVICE_URL=http://doc-service:8001
AI_SERVICE_URL=http://ai-service:8010
STORAGE_SERVICE_URL=http://storage-service:8020
```

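Because the JWT keys above are stored on a single line with newlines written as `\n`, whatever reads them has to restore real line breaks before handing the PEM to a JWT library. A minimal sketch follows; the helper name `load_pem_from_env` is hypothetical, and the backend's actual settings loader may handle this differently.

```python
import os
from collections.abc import Mapping


def load_pem_from_env(name: str, environ: Mapping[str, str] = os.environ) -> str:
    """Read a single-line PEM value and restore its real newlines."""
    raw = environ[name]
    # The .env file stores the two characters "\" and "n"; PEM parsers
    # expect actual line breaks between the header, body, and footer.
    return raw.replace("\\n", "\n")
```

Accepting the environment as a parameter keeps the function easy to exercise with a plain dictionary instead of real process variables.
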

---

## Workflows

### STATUS.md workflow

Every directory with runnable code has a `STATUS.md`. These are the canonical **resume point** for each session.

**At the start of every conversation:**
1. Read the `STATUS.md` for every directory you will touch.
2. If it does not exist for a directory you are working in, create it using the structure below.

This applies equally to subagents.

**After making changes**, update affected `STATUS.md` files:
- Add new endpoints / models / routes.
- Move completed items off the **Future work** checklist.
- Add new items to **Known limitations** or **Future work**.
- Keep the **What it is** summary accurate.

**Structure:**
```markdown
# <Service Name> — Status

## What it is
One paragraph: purpose, port, database/storage, how traffic arrives.

## Current functionality
Subsections per router / feature area. Tables for endpoints.

## Architecture
ASCII diagram of call graph / data flow.

## Known limitations / not implemented
Bullet list of known gaps.

## Future work
- [ ] Planned improvements
```

Maintained in: `backend/`, `features/ai-service/`, `features/doc-service/`, `frontend/`

---

### Changelog convention

Every time files are added or modified, append an entry to `changelog/YYYY-MM-DD_<slug>.md`. If today's file already exists, append to it; otherwise create it.

Each entry must include:
- A heading with date and short description
- `**Timestamp:**` in ISO-8601 format
- A **Summary** sentence
- A **Files Added / Modified / Deleted** list with one-line descriptions

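The convention above can be sketched as two small helpers: one that derives the file path and one that renders an entry with the required fields. The function names and the exact Markdown layout of the entry are assumptions; only the path pattern and the four required elements come from the convention itself.

```python
from datetime import datetime


def changelog_path(now: datetime, slug: str) -> str:
    """Return changelog/YYYY-MM-DD_<slug>.md for the given moment."""
    return f"changelog/{now:%Y-%m-%d}_{slug}.md"


def changelog_entry(now: datetime, title: str, summary: str, files: dict[str, list[str]]) -> str:
    """Render one entry; `files` maps Added/Modified/Deleted to descriptions."""
    lines = [
        f"## {now:%Y-%m-%d}: {title}",
        "",
        f"**Timestamp:** {now.isoformat()}",
        "",
        f"**Summary:** {summary}",
        "",
    ]
    for heading in ("Added", "Modified", "Deleted"):
        items = files.get(heading, [])
        if items:
            lines.append(f"**Files {heading}:**")
            lines.extend(f"- {item}" for item in items)
            lines.append("")
    return "\n".join(lines)
```

Opening the returned path in append mode covers both cases in the convention, since that mode creates the file when it does not exist yet.
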

---

### Adding a new resource (checklist)

1. Add ORM model in `backend/app/models/`, import it in `models/__init__.py`
2. Run migration: `docker compose exec backend alembic revision --autogenerate -m "add <resource>"` then `alembic upgrade head`
3. Add Pydantic schemas in `backend/app/schemas/`
4. Add router in `backend/app/routers/`, mount it in `main.py`
5. Add API function(s) to `frontend/src/api/client.ts`
6. Add page component in `frontend/src/pages/`, register route in `App.tsx`
7. If the resource involves file or blob data: store it via `PUT /objects/{bucket}/{key}` on `storage-service:8020`. Never write to the local filesystem. See `features/storage-service/CLAUDE.md` for the API.
8. Update `STATUS.md` for affected services
9. Add changelog entry

---

### Git convention

Always run `git push` immediately after every `git commit`.

---

### Feature branch & isolated test environment

Every non-trivial implementation (anything beyond a one-line fix or doc change) **must** follow this workflow:

#### 0 — Mandatory planning phase (REQUIRED before any code changes)

Before touching any code, present a written plan and **wait for explicit user approval**. Do not open files to edit, do not create branches, do not write code until the user says the plan is approved.

The plan must include:
- **What** is changing and **why**
- **Which files** will be created or modified (with paths)
- **Database / migration impact** (if any)
- **API contract changes** (new endpoints, changed schemas)
- **Frontend route / component changes**
- **Risks or non-obvious decisions**

Only proceed to step 1 after the user responds with explicit approval (e.g. "looks good", "go ahead", "approved").

#### 1 — Create a feature branch
After the plan is approved, branch off `main`. Name the branch after the title of the change: lowercase words separated by hyphens, descriptive enough to show at a glance what the branch does:
```bash
git checkout main && git pull
git checkout -b feat/<descriptive-title>   # e.g. feat/user-profile-avatar-upload, feat/document-bulk-delete
```

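The naming rule in step 1 can be expressed as a small helper that turns a change title into a branch name. The function `branch_name` is a hypothetical illustration and not part of the repository.

```python
import re


def branch_name(title: str) -> str:
    """Turn a change title into feat/<lowercase-words-joined-by-hyphens>."""
    # Keep only runs of letters and digits; punctuation and spaces split words.
    words = re.findall(r"[a-z0-9]+", title.lower())
    return "feat/" + "-".join(words)
```

For example, `branch_name("Document Bulk Delete")` yields `feat/document-bulk-delete`, matching the second example in the step above.
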
#### 2 — Spin up an isolated Docker stack for the feature
The feature stack always uses port `5173` (same as the main dev stack). Stop the main stack before starting a feature stack, and restart it when done.

**Stop the main dev stack first:**
```bash
docker compose -f docker-compose.yml -f docker-compose.dev.yml down
```

**Create a per-feature override file** at `docker-compose.feat-<slug>.yml` (gitignored):
```yaml
# docker-compose.feat-<slug>.yml — feature test stack, never committed to main
services:
  frontend:
    container_name: frontend-<slug>
  backend:
    container_name: backend-<slug>
  doc-service:
    container_name: doc-service-<slug>
  ai-service:
    container_name: ai-service-<slug>
  db:
    container_name: db-<slug>

networks:
  backend-net:
    name: backend-net-<slug>
  frontend-net:
    name: frontend-net-<slug>
```

**Start the feature stack**:
```bash
docker compose -f docker-compose.yml \
  -f docker-compose.dev.yml \
  -f docker-compose.feat-<slug>.yml \
  --project-name <slug> up --build
```

The feature frontend is now reachable at `http://localhost:5173`.

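Since the override file in step 2 differs between features only by its slug, it can be generated rather than typed by hand. The sketch below renders the same services and networks shown in that step; the function name `render_override` and the idea of generating the file at all are assumptions, not an existing project script.

```python
# Services and networks taken from the override template in step 2.
SERVICES = ["frontend", "backend", "doc-service", "ai-service", "db"]
NETWORKS = ["backend-net", "frontend-net"]


def render_override(slug: str) -> str:
    """Render the contents of docker-compose.feat-<slug>.yml."""
    lines = ["services:"]
    for service in SERVICES:
        lines.append(f"  {service}:")
        lines.append(f"    container_name: {service}-{slug}")
    lines.append("")
    lines.append("networks:")
    for network in NETWORKS:
        lines.append(f"  {network}:")
        lines.append(f"    name: {network}-{slug}")
    return "\n".join(lines) + "\n"
```

Writing the result to `docker-compose.feat-<slug>.yml` gives a file that can be passed to the start command in step 2 unchanged.
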
#### 3 — Develop on the feature branch
All code changes happen on `feat/<slug>`. Commit and push normally:
```bash
git add <files>
git commit -m "feat: <description>"
git push -u origin feat/<slug>
```

#### 4 — Confirm functionality
Before merging, verify all of the following on `http://localhost:5173`:
- [ ] Login and registration work end-to-end
- [ ] The specific feature works as intended
- [ ] No regressions visible in the UI
- [ ] Backend logs show no unexpected errors: `docker compose -p <slug> logs backend`
- [ ] Migrations (if any) applied cleanly: `docker compose -p <slug> exec backend alembic upgrade head`

#### 5 — Merge to main
Once all checks pass:
```bash
git checkout main
git merge --no-ff feat/<slug> -m "Merge feat/<slug>: <description>"
git push
git branch -d feat/<slug>
git push origin --delete feat/<slug>
```

#### 6 — Tear down the feature stack and restart main dev stack
```bash
docker compose -f docker-compose.yml \
  -f docker-compose.dev.yml \
  -f docker-compose.feat-<slug>.yml \
  --project-name <slug> down --volumes --remove-orphans
rm docker-compose.feat-<slug>.yml

# Restart the main dev stack on :5173
docker compose -f docker-compose.yml -f docker-compose.dev.yml up --build -d
```

---

### Infrastructure change protocol

After **any** change to Dockerfiles, `docker-compose*.yml`, `nginx.conf`, or setup scripts:

1. **Update `README.md`** — containers table, ports, image names, Current State section.
2. **Dev stack** — verify login and registration end-to-end:
   ```bash
   docker compose -f docker-compose.yml -f docker-compose.dev.yml up --build
   ```
3. **Prod stack** — run the same checks:
   ```bash
   docker compose up --build -d
   ```
4. Confirm non-root users:
   ```bash
   docker inspect <container> --format '{{.Config.User}}'
   ```
5. **Tear down** after testing:
   ```bash
   docker compose down --volumes --remove-orphans
   ```

---

### Security hook

`.githooks/pre-commit` is registered via `git config core.hooksPath .githooks` and runs `scripts/security_check.py` in Docker. After a fresh clone, run:
```bash
git config core.hooksPath .githooks
```

See **Security Standards → Pre-commit security hook** for the full list of checks.

**Never** bypass with `--no-verify` unless explicitly instructed by the user.