Add PDF document service with AI extraction and per-app settings
- New `features/doc-service` FastAPI microservice: PDF upload, async text extraction (pdfplumber), AI classification via Anthropic/Ollama/LM Studio, per-user categories, file download
- Alembic migration isolated with `alembic_version_doc_service` table
- Main backend: httpx proxy routers for /api/documents/* and /api/documents/categories/*, admin settings API at /api/settings/*
- Runtime config in /config/doc_service_config.json (shared Docker volume); api_key masking on reads; atomic write with os.replace()
- Frontend: DocumentsPage, DocumentAdminSettingsPage, updated AppsPage launcher hub, simplified Nav (removed Settings link), new routes
- docker-compose: doc-service service, doc_data + app_config volumes, removed internal:true from backend-net for outbound AI API calls
- Fix pre-commit hook: probe Docker socket path so git subprocess picks up Docker Desktop on macOS
- Fix security_check.py: use sys.executable for bandit so venv python is used instead of system python

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
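The commit message mentions api_key masking on reads and an atomic write with os.replace() for the runtime config. The admin settings code is not part of this hunk, so the following is only a minimal sketch of how those two ideas are commonly combined. The function names `write_config_atomic` and `mask_api_keys` and the `"****"` mask value are hypothetical and may not match what the commit actually does.

```python
import json
import os
import tempfile
from pathlib import Path


def write_config_atomic(path: Path, config: dict) -> None:
    """Write JSON to a temp file next to `path`, then swap it in with os.replace().

    Hypothetical helper: readers see either the old file or the new one,
    never a half-written file.
    """
    path.parent.mkdir(parents=True, exist_ok=True)
    # The temp file must live in the same directory (same filesystem) so the
    # final rename is atomic.
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            json.dump(config, f, indent=2)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp_name, path)
    except BaseException:
        # Remove the temp file if anything failed before the swap.
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise


def mask_api_keys(value):
    """Return a copy with every non-empty "api_key" replaced by a mask (assumed format)."""
    if isinstance(value, dict):
        return {
            k: ("****" if k == "api_key" and v else mask_api_keys(v))
            for k, v in value.items()
        }
    if isinstance(value, list):
        return [mask_api_keys(v) for v in value]
    return value
```

A settings endpoint could return `mask_api_keys(config)` on reads and call `write_config_atomic(...)` on updates, so the real key never leaves the backend and the doc service never reads a partially written file.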
@@ -0,0 +1,44 @@
"""
Reads doc_service_config.json from the shared config volume.
Caches the result for 30 seconds to avoid hitting the filesystem on every request.
Uses asyncio.to_thread so the synchronous file read doesn't block the event loop.
"""
import asyncio
import json
import time
from pathlib import Path

from app.core.config import settings

_DEFAULT_CONFIG: dict = {
    "ai": {
        "provider": "anthropic",
        "anthropic": {"api_key": "", "model": "claude-haiku-4-5-20251001"},
        "ollama": {"base_url": "http://localhost:11434/v1", "model": "llama3.2", "api_key": "ollama"},
        "lmstudio": {"base_url": "http://localhost:1234/v1", "model": "local-model", "api_key": ""},
    },
    "documents": {"max_pdf_bytes": 20 * 1024 * 1024},
}

_cache: dict | None = None
_cache_at: float = 0.0
_CACHE_TTL = 30.0
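

def _read_config_sync() -> dict:
    path = Path(settings.CONFIG_PATH)
    if not path.exists():
        # Shallow copy: the nested provider dicts are still shared with
        # _DEFAULT_CONFIG, so callers should treat the result as read-only.
        return _DEFAULT_CONFIG.copy()
    # JSON is UTF-8; don't depend on the container's locale default.
    with open(path, encoding="utf-8") as f:
        return json.load(f)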


async def load_doc_config() -> dict:
    global _cache, _cache_at
    now = time.monotonic()
    # Serve the cached config while it is younger than the TTL.
    if _cache is not None and (now - _cache_at) < _CACHE_TTL:
        return _cache
    # Cache miss or expired: read the file in a worker thread.
    data = await asyncio.to_thread(_read_config_sync)
    _cache = data
    _cache_at = now
    return data
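One property of the loader above is that several requests arriving while the cache is empty or expired will each pass the TTL check and each read the file. That is harmless for a small JSON file, but if it ever matters, the same TTL cache can be guarded with an asyncio.Lock. The following is a self-contained sketch of that variant, not the code in this commit: the `CachedJsonConfig` class, its `reads` counter, and the `demo` coroutine are hypothetical and exist only for illustration.

```python
import asyncio
import json
import tempfile
import time
from pathlib import Path

CACHE_TTL = 30.0  # seconds, same value as _CACHE_TTL in the commit


class CachedJsonConfig:
    """TTL-cached JSON file reader; the lock collapses concurrent misses into one read."""

    def __init__(self, path: Path, ttl: float = CACHE_TTL) -> None:
        self._path = Path(path)
        self._ttl = ttl
        self._cache: dict | None = None
        self._cache_at = 0.0
        self._lock = asyncio.Lock()
        self.reads = 0  # counts real file reads, for the demo only

    def _read_sync(self) -> dict:
        self.reads += 1
        with open(self._path, encoding="utf-8") as f:
            return json.load(f)

    async def load(self) -> dict:
        async with self._lock:
            now = time.monotonic()
            if self._cache is not None and (now - self._cache_at) < self._ttl:
                return self._cache
            # Only one coroutine at a time reaches the blocking read.
            self._cache = await asyncio.to_thread(self._read_sync)
            self._cache_at = now
            return self._cache


async def demo() -> tuple[int, str]:
    with tempfile.TemporaryDirectory() as d:
        path = Path(d) / "doc_service_config.json"
        path.write_text(json.dumps({"ai": {"provider": "ollama"}}), encoding="utf-8")
        cfg = CachedJsonConfig(path)
        # Five concurrent callers, but the file is read only once.
        results = await asyncio.gather(*(cfg.load() for _ in range(5)))
        return cfg.reads, results[0]["ai"]["provider"]
```

Running `asyncio.run(demo())` exercises five concurrent loads against a temporary file. The trade-off is that every call, including cache hits, briefly takes the lock, which is why the lock-free version in the commit is a reasonable choice for a config file that changes rarely.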