Business-Management/TODO.md
curo1305 0d34867a69 Add PDF document service with AI extraction and per-app settings
- New `features/doc-service` FastAPI microservice: PDF upload, async
  text extraction (pdfplumber), AI classification via Anthropic/Ollama/
  LM Studio, per-user categories, file download
- Alembic migration isolated with `alembic_version_doc_service` table
- Main backend: httpx proxy routers for /api/documents/* and
  /api/documents/categories/*, admin settings API at /api/settings/*
- Runtime config in /config/doc_service_config.json (shared Docker
  volume); api_key masking on reads; atomic write with os.replace()
- Frontend: DocumentsPage, DocumentAdminSettingsPage, updated AppsPage
  launcher hub, simplified Nav (removed Settings link), new routes
- docker-compose: doc-service service, doc_data + app_config volumes,
  removed internal:true from backend-net for outbound AI API calls
- Fix pre-commit hook: probe Docker socket path so git subprocess picks
  up Docker Desktop on macOS
- Fix security_check.py: use sys.executable for bandit so venv python
  is used instead of system python

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-14 05:28:11 +02:00


TODO

UX/UI — Penpot setup

  • Spin up Penpot LXC — separate LXC container on the server (~24 GB RAM), Docker Compose from https://github.com/penpot/penpot; expose via subdomain behind nginx proxy manager
  • Create Penpot project — register on the self-hosted instance, create project destroying_sap, create initial design file
  • Generate Penpot access token — Profile → Access tokens; used by the ux-designer agent via WebFetch REST API calls
  • Decide on UI component library — shadcn/ui (recommended: Tailwind-based, unstyled accessible primitives, white-label friendly) vs MUI vs other; decision affects both Penpot design system and frontend implementation
  • Connect ux-designer agent — confirm Penpot API reachable, provide instance URL + token to agent at session start
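
A minimal sketch of the "confirm Penpot API reachable" check, using only the standard library. It assumes the instance exposes Penpot's RPC endpoint `get-profile` under `/api/rpc/command/` and accepts access tokens as `Authorization: Token <token>`; verify both against the deployed Penpot version. The function names and the example host are hypothetical.

```python
import urllib.request


def build_profile_request(base_url: str, access_token: str) -> urllib.request.Request:
    """Build (but do not send) a request for the token owner's profile.

    Assumption: the endpoint path and the "Token" auth scheme match the
    self-hosted Penpot instance; adjust if the instance differs.
    """
    url = base_url.rstrip("/") + "/api/rpc/command/get-profile"
    return urllib.request.Request(
        url,
        headers={
            "Authorization": f"Token {access_token}",
            "Accept": "application/json",
        },
        method="GET",
    )


def penpot_reachable(base_url: str, access_token: str, timeout: float = 10.0) -> bool:
    """Return True if the instance answers the profile call with HTTP 200."""
    request = build_profile_request(base_url, access_token)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # Covers URLError/HTTPError (bad token, DNS failure) and timeouts.
        return False
```

At session start the agent could call `penpot_reachable("https://penpot.example.com", token)` (hypothetical host) and refuse to proceed when it returns `False`.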

Auth / session security

  • 8-hour JWT expiry — ACCESS_TOKEN_EXPIRE_MINUTES = 60 * 8; no permanent login
  • RS256 JWT signing — 4096-bit RSA asymmetric keys; iat claim included; generate keys with scripts/generate_jwt_keys.py
  • No refresh tokens — refresh token flow not implemented; if added later, must use httpOnly cookies and rotation
  • httpOnly cookie migration — currently storing JWT in localStorage (XSS-exposed); migrate to httpOnly cookie when hardening for production
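
The expiry rule and the planned cookie migration can be sketched with the standard library alone. This is an illustration under assumptions, not the backend's code: the claim layout (`sub`, `iat`, `exp`), the cookie name `access_token`, and the `SameSite=Strict` policy are hypothetical choices, and RS256 signing itself is left to whichever JWT library the backend uses.

```python
from datetime import datetime, timedelta, timezone
from http.cookies import SimpleCookie

ACCESS_TOKEN_EXPIRE_MINUTES = 60 * 8  # 8 hours, as listed in the TODO


def build_access_claims(user_id: str, now: datetime) -> dict:
    """Claims for a token that expires 8 hours after issue (includes iat)."""
    expires = now + timedelta(minutes=ACCESS_TOKEN_EXPIRE_MINUTES)
    return {
        "sub": user_id,
        "iat": int(now.timestamp()),
        "exp": int(expires.timestamp()),
    }


def is_expired(claims: dict, now: datetime) -> bool:
    """A token is expired once the current time reaches its exp claim."""
    return int(now.timestamp()) >= claims["exp"]


def session_cookie_header(token: str) -> str:
    """Value for a Set-Cookie header that keeps the JWT away from page scripts."""
    cookie = SimpleCookie()
    cookie["access_token"] = token  # cookie name is an assumption
    morsel = cookie["access_token"]
    morsel["httponly"] = True       # not readable via document.cookie
    morsel["secure"] = True         # only sent over HTTPS
    morsel["samesite"] = "Strict"   # assumed policy; "Lax" may suit some flows
    morsel["path"] = "/"
    morsel["max-age"] = ACCESS_TOKEN_EXPIRE_MINUTES * 60
    return morsel.OutputString()
```

Matching the cookie's `Max-Age` to the token lifetime keeps the browser from holding a cookie that the backend would reject anyway.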

App permissions

  • Permissions registry — admin-managed table that controls which apps each user can access. Schema: user_app_permissions (user_id FK, app_key). Admin UI lets the admin grant/revoke per-app access per user. The Apps page only shows apps the current user has been granted access to.
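
A minimal sketch of the registry described above. It uses an in-memory SQLite database as a stand-in for the project's PostgreSQL instance; the `users` table layout, the composite primary key, and the helper names are assumptions made for illustration.

```python
import sqlite3


def init_schema(conn: sqlite3.Connection) -> None:
    """Create a stand-in users table and the user_app_permissions registry."""
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(
        """
        CREATE TABLE users (
            id INTEGER PRIMARY KEY,
            email TEXT NOT NULL UNIQUE
        );
        CREATE TABLE user_app_permissions (
            user_id INTEGER NOT NULL REFERENCES users(id) ON DELETE CASCADE,
            app_key TEXT NOT NULL,
            PRIMARY KEY (user_id, app_key)  -- one row per grant (assumed)
        );
        """
    )


def grant(conn: sqlite3.Connection, user_id: int, app_key: str) -> None:
    """Grant access; granting twice is a no-op."""
    conn.execute(
        "INSERT OR IGNORE INTO user_app_permissions (user_id, app_key) VALUES (?, ?)",
        (user_id, app_key),
    )


def revoke(conn: sqlite3.Connection, user_id: int, app_key: str) -> None:
    """Remove a grant if it exists."""
    conn.execute(
        "DELETE FROM user_app_permissions WHERE user_id = ? AND app_key = ?",
        (user_id, app_key),
    )


def visible_apps(conn: sqlite3.Connection, user_id: int, all_apps: list[str]) -> list[str]:
    """Filter the launcher's app list down to what this user may open."""
    rows = conn.execute(
        "SELECT app_key FROM user_app_permissions WHERE user_id = ?", (user_id,)
    )
    granted = {row[0] for row in rows}
    return [app for app in all_apps if app in granted]
```

The Apps page would render only the result of `visible_apps`; the backend should still enforce the same check on every app route, since hiding a tile is not access control.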

PDF Documents app (features/doc-service)

  • doc-service container — FastAPI microservice on backend-net; never exposed to host or frontend directly
  • PDF upload + async extraction — background task with pdfplumber + pluggable AI (Anthropic / Ollama / LM Studio)
  • Per-app settings page — /apps/documents/settings/admin; AI provider config, max file size; admin only
  • Per-user categories — create/rename/delete categories; assign multiple categories per document
  • Alembic isolation — alembic_version_doc_service version table; no collision with main backend migrations
  • Runtime config file — /config/doc_service_config.json on shared Docker volume; editable from frontend; 30s TTL cache in doc-service
  • Re-process document — UI button to re-trigger AI extraction on an existing document (after changing AI provider/model)
  • Bulk category operations — assign/remove a category from multiple documents at once
  • Search / filter documents — filter by status, document type, category, date range
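
The runtime config handling (atomic write with os.replace() on the backend side, 30s TTL cache in doc-service, api_key masking on reads) can be sketched as follows. This is an illustration under assumptions: the class and function names, the injectable clock, and the "last four characters" masking format are hypothetical and not taken from the repository.

```python
import json
import os
import tempfile
import time

CONFIG_TTL_SECONDS = 30.0  # TTL from the TODO


def write_config_atomic(path: str, config: dict) -> None:
    """Write JSON to a temp file in the same directory, then swap it in.

    os.replace() is atomic on the same filesystem, so a reader sees either
    the old file or the new one, never a half-written file.
    """
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(config, handle, indent=2)
            handle.flush()
            os.fsync(handle.fileno())
        os.replace(tmp_path, path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise


class CachedConfig:
    """Re-read the config file at most once per TTL window."""

    def __init__(self, path: str, ttl: float = CONFIG_TTL_SECONDS, clock=time.monotonic):
        self._path = path
        self._ttl = ttl
        self._clock = clock  # injectable so tests need not sleep
        self._value = None
        self._loaded_at = 0.0

    def get(self) -> dict:
        now = self._clock()
        if self._value is None or now - self._loaded_at >= self._ttl:
            with open(self._path, encoding="utf-8") as handle:
                self._value = json.load(handle)
            self._loaded_at = now
        return self._value


def mask_api_key(config: dict) -> dict:
    """Return a copy that is safe to send to the frontend (format assumed)."""
    masked = dict(config)
    if masked.get("api_key"):
        masked["api_key"] = "****" + str(masked["api_key"])[-4:]
    return masked
```

With this shape, a settings change made in the frontend takes effect in doc-service within at most 30 seconds, without a restart and without the two containers coordinating.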

Frontend features

  • Logout button — visible when logged in, clears token and redirects to /login
  • Profile page (/profile) — shows personal information for the logged-in user
  • Edit & save profile — form to update personal details, stored in a dedicated profiles table (separate from users, same PostgreSQL container)

App container architecture (future)

Design decision: each installable app (billing, PDF, email, etc.) runs in its own isolated Docker/Podman container, spawned and managed by the backend via the Docker API. Key rules to implement:

  • Docker socket proxy — backend must never mount /var/run/docker.sock directly; use tecnativa/docker-socket-proxy on an internal-only network, with only the required API endpoints whitelisted (CONTAINERS, IMAGES, NETWORKS, POST). Raw socket access = root on the host.
  • Network isolation per app — each spawned app container gets its own Docker bridge network; app containers never talk to each other directly; only the backend can reach them
  • No privileged app containers — all spawned containers run without --privileged, without extra capabilities, with resource limits (CPU, memory)
  • Image allowlist — backend may only spawn containers from a pre-approved image list; never pull or build arbitrary images at runtime
  • Consider Podman — evaluate rootless Podman as a replacement for the Docker daemon; its daemonless model removes the root-owned daemon socket, and its Docker-compatible API service (podman system service) keeps the Docker SDK usable
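
The spawn rules above (image allowlist, no privileged containers, resource limits, one network per app) could be enforced in a single place before any container is started. The sketch below only builds and validates the arguments; it does not talk to Docker. The image names, default limits, naming scheme, and the choice to drop all capabilities are assumptions. The dictionary keys mirror keyword arguments of `containers.run` in the Docker SDK for Python.

```python
# Hypothetical allowlist; in practice this would come from admin-managed config.
ALLOWED_IMAGES = frozenset(
    {
        "registry.example.com/apps/billing:1.0",
        "registry.example.com/apps/doc-service:1.0",
    }
)


class ImageNotAllowedError(ValueError):
    """Raised when a spawn request names an image outside the allowlist."""


def build_run_kwargs(
    app_key: str,
    image: str,
    allowed_images: frozenset = ALLOWED_IMAGES,
    mem_limit: str = "256m",
    cpus: float = 0.5,
) -> dict:
    """Return locked-down container arguments or refuse the image."""
    if image not in allowed_images:
        raise ImageNotAllowedError(f"image not in allowlist: {image}")
    return {
        "image": image,
        "name": f"app-{app_key}",
        "detach": True,
        "privileged": False,                      # never --privileged
        "cap_drop": ["ALL"],                      # stricter than "no extra caps" (assumed)
        "security_opt": ["no-new-privileges"],
        "mem_limit": mem_limit,                   # memory limit
        "nano_cpus": int(cpus * 1_000_000_000),   # CPU limit in units of 1e-9 CPUs
        "network": f"app-net-{app_key}",          # dedicated bridge network per app
    }
```

The result would be passed as `client.containers.run(**kwargs)` by a client that connects through the socket proxy rather than the raw socket. Keeping the policy in one pure function makes it easy to unit-test without a Docker daemon.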

Infrastructure

  • Docker port hardening — only port 80 (prod) / 5173 (dev) exposed on the host via frontend-net; backend and db have no host port bindings and sit on internal: true backend-net

Infrastructure (existing)

  • Rootless containers — run backend and frontend containers as non-root users (add USER directive to Dockerfiles, map UID/GID appropriately)
  • Persistent storage — ensure database data, config files, and any uploaded assets survive container restarts and rebuilds (named volumes, bind mounts for config)
  • Docker development workflow — document and streamline the full dev loop: hot reload, one-command startup, migration handling, seed data, and how to attach a debugger