Add PDF document service with AI extraction and per-app settings

- New `features/doc-service` FastAPI microservice: PDF upload, async
  text extraction (pdfplumber), AI classification via Anthropic/Ollama/
  LM Studio, per-user categories, file download
- Alembic migration isolated with `alembic_version_doc_service` table
- Main backend: httpx proxy routers for /api/documents/* and
  /api/documents/categories/*, admin settings API at /api/settings/*
- Runtime config in /config/doc_service_config.json (shared Docker
  volume); api_key masking on reads; atomic write with os.replace()
- Frontend: DocumentsPage, DocumentAdminSettingsPage, updated AppsPage
  launcher hub, simplified Nav (removed Settings link), new routes
- docker-compose: doc-service service, doc_data + app_config volumes,
  removed internal:true from backend-net to allow outbound AI API calls
- Fix pre-commit hook: probe Docker socket path so git subprocess picks
  up Docker Desktop on macOS
- Fix security_check.py: use sys.executable for bandit so venv python
  is used instead of system python
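
A minimal sketch of the atomic write and `api_key` masking described above, assuming the config is a flat JSON object. The helper names `write_config_atomic` and `read_config_masked` and the `MASK` placeholder are hypothetical and are not taken from this commit; only `os.replace()` and the `api_key` field are named in the message.

```python
import json
import os
import tempfile
from pathlib import Path

# Hypothetical placeholder; the real masking format is not shown in this commit.
MASK = "********"


def write_config_atomic(path: Path, config: dict) -> None:
    """Write to a temp file in the target directory, then swap it into place."""
    path.parent.mkdir(parents=True, exist_ok=True)
    # The temp file must live on the same filesystem as the target,
    # otherwise os.replace() cannot perform an atomic rename.
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            json.dump(config, tmp, indent=2)
            tmp.flush()
            os.fsync(tmp.fileno())
        # Readers see either the old file or the new one, never a partial write.
        os.replace(tmp_name, path)
    except BaseException:
        # Remove the temp file if anything failed before the swap.
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise


def read_config_masked(path: Path) -> dict:
    """Return the config with a non-empty api_key replaced by MASK."""
    config = json.loads(path.read_text(encoding="utf-8"))
    if config.get("api_key"):
        config["api_key"] = MASK
    return config
```

Writing the temp file next to the target, rather than in the system temp directory, keeps both paths on the shared volume so the rename stays atomic.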

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
curo1305
2026-04-14 05:28:11 +02:00
parent d423bea134
commit 0d34867a69
52 changed files with 2500 additions and 28 deletions
+12
@@ -19,6 +19,18 @@
- [ ] **Permissions registry** — admin-managed table that controls which apps each user can access. Schema: `user_app_permissions (user_id FK, app_key)`. Admin UI lets the admin grant/revoke per-app access per user. The Apps page only shows apps the current user has been granted access to.
## PDF Documents app (`features/doc-service`)
- [x] **doc-service container** — FastAPI microservice on `backend-net`; never exposed to host or frontend directly
- [x] **PDF upload + async extraction** — background task with pdfplumber + pluggable AI (Anthropic / Ollama / LM Studio)
- [x] **Per-app settings page** — `/apps/documents/settings/admin`; AI provider config, max file size; admin only
- [x] **Per-user categories** — create/rename/delete categories; assign multiple categories per document
- [x] **Alembic isolation** — `alembic_version_doc_service` version table; no collision with main backend migrations
- [x] **Runtime config file** — `/config/doc_service_config.json` on shared Docker volume; editable from frontend; 30s TTL cache in doc-service
- [ ] **Re-process document** — UI button to re-trigger AI extraction on an existing document (after changing AI provider/model)
- [ ] **Bulk category operations** — assign/remove a category from multiple documents at once
- [ ] **Search / filter documents** — filter by status, document type, category, date range
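
The 30s TTL cache on the runtime config file could look roughly like the sketch below. It is an illustration under assumptions, not the doc-service code: the `CachedConfig` class and its injectable `clock` parameter are hypothetical names introduced here.

```python
import json
import time
from pathlib import Path
from typing import Callable, Optional


class CachedConfig:
    """Re-read a JSON config file at most once per `ttl` seconds."""

    def __init__(
        self,
        path: Path,
        ttl: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._path = path
        self._ttl = ttl
        self._clock = clock  # injectable so the expiry logic can be tested
        self._value: Optional[dict] = None
        self._loaded_at = 0.0

    def get(self) -> dict:
        now = self._clock()
        # Reload on first use, or once the cached copy is older than the TTL.
        if self._value is None or now - self._loaded_at >= self._ttl:
            self._value = json.loads(self._path.read_text(encoding="utf-8"))
            self._loaded_at = now
        return self._value
```

With this shape, a settings change saved from the frontend would be picked up by the service within one TTL window without a restart.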
## Frontend features
- [x] **Logout button** — visible when logged in, clears token and redirects to `/login`