# doc-service — Claude context

PDF extraction microservice, port 8001 (internal). Shares the same PostgreSQL instance as the backend. Receives proxied requests from `backend:8000`, which injects the `x-user-id`, `x-user-groups`, `x-user-is-admin`, and `x-user-admin-groups` headers; doc-service trusts these headers directly. Calls `ai-service:8010` for document classification. See root `CLAUDE.md` for architecture, Docker, and project-wide workflows.

---

## Commands

All commands run inside Docker, never on the host.

```bash
# Generate a new migration from model changes
docker compose exec doc-service alembic revision --autogenerate -m "describe change"
# Apply all pending migrations
docker compose exec doc-service alembic upgrade head
# Roll back the most recent migration
docker compose exec doc-service alembic downgrade -1
```

---

## File & Folder Tree

```
features/doc-service/
├── app/
│   ├── main.py                       ← FastAPI, lifespan (file watcher start/stop)
│   ├── database.py                   ← Same PostgreSQL instance as backend
│   ├── deps.py                       ← get_user_id, get_user_groups, get_user_is_admin, get_user_admin_groups (injected headers)
│   ├── models/
│   │   ├── document.py               ← Document model
│   │   ├── category.py               ← DocumentCategory model
│   │   ├── category_assignment.py    ← CategoryAssignment (composite PK)
│   │   └── document_share.py         ← DocumentShare model (group-based sharing)
│   ├── schemas/
│   │   ├── document.py               ← DocumentOut, DocumentPage, DocumentStatusOut, etc.
│   │   ├── category.py               ← CategoryOut, CategoryCreate, CategoryUpdate
│   │   └── share.py                  ← DocumentShareOut, DocumentShareCreate, SharedDocumentOut
│   ├── routers/
│   │   ├── documents.py              ← Full CRUD + file serving + reprocess + suggestions + sharing
│   │   ├── categories.py             ← Category CRUD (includes watch-owned categories)
│   │   └── plugin.py                 ← GET /plugin/manifest, GET+PATCH /plugin/settings
│   └── services/
│       ├── storage.py                ← File I/O
│       ├── ai_client.py              ← classify_document() → ai-service:8010/chat
│       ├── config_reader.py          ← Config load/save including storage/watch settings
│       └── file_watcher.py           ← watchdog-based PDF watcher + startup scan + ingestion
├── alembic/versions/                 ← Migration chain
│   ├── 0003_add_watch_columns.py     ← source, watch_path, suggested_folder, suggested_filename
│   └── 0004_add_document_shares.py   ← document_shares table (group-based sharing)
├── Dockerfile                        ← python:3.12-slim, non-root user 1001
└── STATUS.md
```

---

## Database Models

### `documents`

| Column | Type | Constraints | Notes |
|--------|------|-------------|-------|
| `id` | String | PK, UUID | |
| `user_id` | String | indexed | not an FK; trusts the `x-user-id` header |
| `filename` | String | NOT NULL | |
| `file_path` | String | NOT NULL | absolute path under /data/documents |
| `file_size` | Integer | NOT NULL | bytes |
| `status` | String | default="pending" | pending / processing / done / failed |
| `title` | String(500) | nullable | AI-extracted |
| `document_type` | String | nullable | invoice / bill / receipt / order / expense / revenue / unknown |
| `raw_text` | Text | nullable | first 500k chars |
| `extracted_data` | Text | nullable | JSON string |
| `tags` | Text | nullable | JSON array string |
| `error_message` | String(500) | nullable | |
| `created_at` | DateTime(tz) | server_default=now() | |
| `processed_at` | DateTime(tz) | nullable | |
| `source` | String(16) | default="upload" | "upload" or "watch" |
| `watch_path` | String | nullable | original absolute path in watch directory |
| `suggested_folder` | String(128) | nullable | AI-suggested category (pending user confirm) |
| `suggested_filename` | String(500) | nullable | AI-suggested title/rename (pending user confirm) |
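
`tags` and `extracted_data` are stored as JSON strings in `Text` columns, and `raw_text` keeps only the first 500k characters. The sketch below shows one way values could be prepared before a row is written. The helper names `encode_tags`, `decode_tags`, and `cap_raw_text` are hypothetical and are not taken from the service code.

```python
import json

# "first 500k chars" per the raw_text note in the table above
RAW_TEXT_CAP = 500_000


def encode_tags(tags):
    """Serialize an iterable of tag strings into the JSON array string stored in `tags`."""
    return json.dumps(list(tags))


def decode_tags(raw):
    """Parse the stored JSON array string; NULL or an empty string means no tags."""
    return json.loads(raw) if raw else []


def cap_raw_text(text):
    """Keep only the first RAW_TEXT_CAP characters of the extracted text."""
    return text[:RAW_TEXT_CAP]
```

Storing JSON in a `Text` column means every reader has to decode it, so keeping the encode/decode pair in one place avoids drift between routers.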

### `document_categories`

| Column | Type | Constraints | Notes |
|--------|------|-------------|-------|
| `id` | String | PK, UUID | |
| `user_id` | String | indexed | owner; "watch" for system categories |
| `name` | String(128) | NOT NULL | PascalCase-with-dashes convention enforced on create/rename |
| `scope` | String(16) | NOT NULL, default="personal" | "personal" / "group" / "system" |
| `group_id` | String | nullable, indexed | set when scope="group" |
| `created_at` | DateTime(tz) | server_default=now() | |
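
The `name` column enforces a PascalCase-with-dashes convention, but the exact rule is not spelled out here. The sketch below assumes it means one or more segments joined by single dashes, each starting with an uppercase letter and followed by letters or digits (for example `Invoices` or `Tax-Returns`). The function name `is_valid_category_name` and the regular expression are illustrative assumptions, not the service's actual validator.

```python
import re

# Column limit from the table above: String(128)
NAME_MAX = 128

# Assumed rule: Segment(-Segment)*, where Segment = uppercase letter + letters/digits
_NAME_RE = re.compile(r"[A-Z][A-Za-z0-9]*(?:-[A-Z][A-Za-z0-9]*)*")


def is_valid_category_name(name: str) -> bool:
    """Return True if `name` fits the assumed PascalCase-with-dashes rule and length cap."""
    return len(name) <= NAME_MAX and _NAME_RE.fullmatch(name) is not None
```

Running the same check on create and on rename keeps both paths consistent with the constraint noted in the table.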

### `document_category_assignments` (composite PK)

| Column | Type | Constraints |
|--------|------|-------------|
| `document_id` | String | PK + FK→documents.id CASCADE |
| `category_id` | String | PK + FK→document_categories.id CASCADE |

### `document_shares`

| Column | Type | Constraints | Notes |
|--------|------|-------------|-------|
| `id` | String | PK, UUID | |
| `document_id` | String | indexed, NOT NULL | not an FK; trusts the proxy |
| `group_id` | String | indexed, NOT NULL | group from backend |
| `shared_by_user_id` | String | NOT NULL | owner who shared |
| `can_delete` | Boolean | NOT NULL, default=false | allows group members to delete the doc |
| `created_at` | DateTime(tz) | server_default=now() | |

Unique constraint: `(document_id, group_id)`
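
The `can_delete` flag interacts with ownership and the admin dependencies in `deps.py` when a delete request arrives. The sketch below is one plausible reading of that check, not the router's actual code: the `Share` type, the function name `can_delete_document`, and the rule that a group admin may delete any document shared with their group are all assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Share:
    """Minimal stand-in for a `document_shares` row."""
    group_id: str
    can_delete: bool = False


def can_delete_document(owner_id, shares, user_id, user_groups,
                        is_admin=False, admin_groups=()):
    """Decide whether `user_id` may delete a document owned by `owner_id`."""
    # Global admins and the owner can always delete.
    if is_admin or user_id == owner_id:
        return True
    for share in shares:
        # Assumption: admins of a group may delete docs shared with that group.
        if share.group_id in admin_groups:
            return True
        # Regular members need the share to carry can_delete=true.
        if share.can_delete and share.group_id in user_groups:
            return True
    return False
```

Because `(document_id, group_id)` is unique, at most one share row per group has to be inspected for a given document.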

### Migration chain

| Rev ID | Slug |
|--------|------|
| `0001` | `create_doc_tables` |
| `0002` | `add_document_title` |
| `0003` | `add_watch_columns` |
| `0004` | `add_document_shares` |
| `0005` | `add_share_can_delete` |
| `0006` | `add_category_scope` |

---

## API Endpoints (internal, reached via backend proxy)

All these endpoints are proxied from `backend:8000`. The backend injects `x-user-id`, `x-user-groups`, `x-user-is-admin`, and `x-user-admin-groups` before forwarding.
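
Since doc-service trusts the injected headers as-is, every request boils down to reading them into an identity. The sketch below uses plain Python rather than FastAPI dependencies so it stays self-contained. The `x-user-is-admin` and `x-user-admin-groups` names mirror `get_user_is_admin` and `get_user_admin_groups` in `deps.py`; the wire format (comma-separated group IDs, `"true"` for the admin flag) and the names `Identity` and `parse_identity` are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Identity:
    user_id: str
    groups: tuple
    is_admin: bool
    admin_groups: tuple


def _split_csv(value):
    """Split a comma-separated header value, dropping empty entries."""
    return tuple(part.strip() for part in value.split(",") if part.strip())


def parse_identity(headers):
    """Build an Identity from proxy-injected headers (header names are case-insensitive)."""
    h = {key.lower(): value for key, value in headers.items()}
    user_id = h.get("x-user-id", "").strip()
    if not user_id:
        # Without a user ID the request did not come through the backend proxy.
        raise ValueError("missing x-user-id header")
    return Identity(
        user_id=user_id,
        groups=_split_csv(h.get("x-user-groups", "")),
        # Assumed encoding: the literal string "true" marks an admin.
        is_admin=h.get("x-user-is-admin", "").strip().lower() == "true",
        admin_groups=_split_csv(h.get("x-user-admin-groups", "")),
    )
```

This trust model only holds while port 8001 stays internal; anything that can reach doc-service directly can set these headers itself.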

### Documents

| Method | Path | Description |
|--------|------|-------------|
| POST | `/documents/upload` | Upload PDF (202, background processing) |
| GET | `/documents` | Paginated list (filterable: search, status, type, category, sort) |
| GET | `/documents/{id}` | Document detail |
| GET | `/documents/{id}/status` | Processing status only |
| PATCH | `/documents/{id}/type` | Update document type |
| PATCH | `/documents/{id}/tags` | Update tags |
| PATCH | `/documents/{id}/title` | Update title |
| POST | `/documents/{id}/reprocess` | Re-run AI extraction |
| DELETE | `/documents/{id}` | Delete document (204) |
| GET | `/documents/{id}/file` | Download PDF (streaming) |
| POST | `/documents/{id}/categories/{cat_id}` | Assign category |
| DELETE | `/documents/{id}/categories/{cat_id}` | Remove category |
| POST | `/documents/{id}/suggestions/folder/confirm` | Confirm AI folder suggestion |
| POST | `/documents/{id}/suggestions/folder/reject` | Reject AI folder suggestion |
| POST | `/documents/{id}/suggestions/filename/confirm` | Confirm AI filename suggestion |
| POST | `/documents/{id}/suggestions/filename/reject` | Reject AI filename suggestion |
| GET | `/documents/shared-with-me` | Documents shared with current user via their groups |
| GET | `/documents/{id}/shares` | List groups the document is shared with (owner only) |
| POST | `/documents/{id}/shares` | Share with a group (owner only; group must be in user's groups) |
| DELETE | `/documents/{id}/shares/{group_id}` | Stop sharing with a group (owner only) |

### Categories

| Method | Path | Description |
|--------|------|-------------|
| GET | `/categories` | List user's categories |
| POST | `/categories` | Create category (triggers background AI reanalysis) |
| PATCH | `/categories/{id}` | Rename |
| DELETE | `/categories/{id}` | Delete (204) |
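
Because `document_categories` carries a `scope` column, listing a user's categories presumably has to combine three cases. The sketch below filters an in-memory list to show the idea. It assumes personal categories match on `user_id`, group categories match on membership in `group_id`, and system categories are visible to everyone; the `Category` type and the function name `visible_categories` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Category:
    """Minimal stand-in for a `document_categories` row."""
    id: str
    user_id: str
    name: str
    scope: str = "personal"
    group_id: Optional[str] = None


def visible_categories(categories, user_id, user_groups):
    """Return the categories a user can see under the assumed scope rules."""
    groups = set(user_groups)
    visible = []
    for cat in categories:
        if cat.scope == "system":
            # Assumption: system (watch-owned) categories are visible to all users.
            visible.append(cat)
        elif cat.scope == "group" and cat.group_id in groups:
            visible.append(cat)
        elif cat.scope == "personal" and cat.user_id == user_id:
            visible.append(cat)
    return visible
```

In the service this would more likely be a single SQL query with an `OR` across the three conditions; the loop form only makes the rules explicit.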

### Plugin

| Method | Path | Description |
|--------|------|-------------|
| GET | `/plugin/manifest` | Plugin manifest with settings JSON Schema |
| GET | `/plugin/settings` | Current plugin settings |
| PATCH | `/plugin/settings` | Update plugin settings |

---

## Default Values & Limits

| Parameter | Value | Location |
|-----------|-------|----------|
| Document title max | 500 chars | `models/document.py` |
| Category name max | 128 chars | `models/category.py` |
| PDF max size (default) | 20 MB | admin settings (configurable) |
| Raw text cap | 500k chars | `services/ai_client.py` |
| Documents per_page | 1–100, default 20 | `routers/documents.py` |
| AI service timeout | 60 s | `services/ai_client.py` |
| AI service max retries | 2 | `services/ai_client.py` |
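
The AI timeout and retry limits combine into a simple retry loop. The sketch below shows how that might look. It assumes "max retries 2" means up to three attempts in total, and it takes the transport as an injected `send` callable so no HTTP library is implied; `call_with_retries` and its parameters are hypothetical names, not the contents of `services/ai_client.py`.

```python
import time

AI_TIMEOUT_SECONDS = 60  # "AI service timeout" from the table above
AI_MAX_RETRIES = 2       # "AI service max retries" from the table above


def call_with_retries(send, payload, retries=AI_MAX_RETRIES,
                      timeout=AI_TIMEOUT_SECONDS, backoff=0.0):
    """Call send(payload, timeout), retrying on failure up to `retries` extra times."""
    last_error = None
    for attempt in range(retries + 1):  # first attempt + retries
        try:
            return send(payload, timeout)
        except Exception as exc:  # a real client would catch narrower error types
            last_error = exc
            if attempt < retries and backoff:
                # Linear backoff between attempts; disabled when backoff is 0.
                time.sleep(backoff * (attempt + 1))
    raise last_error
```

With these defaults a request that never succeeds can hold a worker for up to three timeouts, which is worth keeping in mind for background processing.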