Add priority queue to ai-service and STATUS.md workflow

- Introduce async priority queue service in ai-service; all /chat calls now route through it
- Refactor chat router to separate execute_chat (core logic) from the HTTP handler
- Add /queue endpoints (status, pause, resume, cancel) for queue management
- Update ai-service config to use Pydantic v2 model_config style
- Add STATUS.md files for backend, ai-service, doc-service, and frontend
- Document STATUS.md workflow in CLAUDE.md
- Update doc-service documents router and schemas; frontend DocumentsPage and API client

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-14 22:58:10 +02:00
parent d2495190a9
commit c4f0c7ad49
18 changed files with 1253 additions and 35 deletions
@@ -88,6 +88,56 @@ Browser → Vite dev server (:5173)
 2. Token stored in `localStorage`, attached to every request by the Axios interceptor
 3. Protected routes call `GET /api/users/me`; `get_current_user` dep validates the token on the server

+## STATUS.md workflow
+
+Every directory that contains runnable code, a feature service, or significant logic has a `STATUS.md` file. These files are the canonical **resume point** for development — they describe what the component is, what it currently does, its limitations, and what is planned next.
+
+### At the start of every conversation
+
+1. Read the `STATUS.md` for every directory you will touch.
+2. If the file does not yet exist for a directory you are working in, create one using the structure below.
+
+This applies equally to subagents — always read the relevant `STATUS.md` before starting work.
+
+### After making changes
+
+Update any `STATUS.md` that is affected:
+- Add new endpoints / models / routes to the **Current functionality** tables.
+- Move completed items off the **Future work** checklist.
+- Add new items to **Known limitations** or **Future work** as appropriate.
+- Keep the **What it is** summary accurate (port, DB, storage, etc.).
+
+### STATUS.md structure
+
+Each file must contain these sections (add/remove sub-sections as needed):
+
+```markdown
+# <Service Name> — Status
+
+## What it is
+One paragraph: purpose, internal port, database/storage, how traffic arrives.
+
+## Current functionality
+Subsections per router / feature area. Use tables for endpoints.
+
+## Architecture
+ASCII diagram showing the call graph / data flow.
+
+## Known limitations / not implemented
+Bullet list of gaps that are known but not yet addressed.
+
+## Future work
+- [ ] Checklist of planned improvements
+```
+
+Maintain a `STATUS.md` in each of these service directories:
+- `backend/` — FastAPI gateway
+- `features/ai-service/` — AI intermediary
+- `features/doc-service/` — document microservice
+- `frontend/` — React SPA
+
+---
+
 ## Git convention

 Always run `git push` immediately after every `git commit`.
@@ -0,0 +1,121 @@
+# Backend — Status
+
+## What it is
+
+Central FastAPI gateway. Handles authentication, user management, admin settings, and proxies feature-service traffic. It is the only container that has host-level port exposure (`8000`, internal) — all browser traffic arrives via the Vite/nginx frontend proxy.
+
+Port: `8000` (on `backend-net`, no direct host binding in prod).
+Database: PostgreSQL 16 (`postgres_data` named volume).
+
+---
+
+## Current functionality
+
+### Auth (`/api/auth`)
+
+| Method | Path | Description |
+|--------|------|-------------|
+| `POST` | `/api/auth/register` | Create account; password policy enforced (uppercase, special char, no "test") |
+| `POST` | `/api/auth/login` | OAuth2 password flow; returns RS256 JWT (8-hour expiry) |
+
+JWT signing uses a 4096-bit RSA key pair (`RS256`). Keys are generated by `scripts/generate_jwt_keys.py` and stored in `backend/.env` (gitignored). Token stored in `localStorage` on the client.
+
+### Users (`/api/users`)
+
+| Method | Path | Description |
+|--------|------|-------------|
+| `GET` | `/api/users/me` | Current user info |
+
+### Profile (`/api/profile`)
+
+| Method | Path | Description |
+|--------|------|-------------|
+| `GET` | `/api/profile` | Fetch profile (separate `profiles` table) |
+| `PUT` | `/api/profile` | Update profile fields |
+
+### Admin (`/api/admin`)
+
+| Method | Path | Description |
+|--------|------|-------------|
+| `GET` | `/api/admin/users` | List all users (admin only) |
+| `PATCH` | `/api/admin/users/{id}` | Update user (role, active flag) |
+
+### Settings (`/api/settings`)
+
+| Method | Path | Description |
+|--------|------|-------------|
+| `GET` | `/api/settings/ai` | AI service config (masked — API keys redacted) |
+| `PATCH` | `/api/settings/ai` | Update AI provider / credentials |
+| `POST` | `/api/settings/ai/test` | Test AI connection (proxies a minimal /chat call) |
+| `GET` | `/api/settings/documents/limits` | Doc service upload limits |
+| `PATCH` | `/api/settings/documents/limits` | Update max PDF size |
+
+Settings are persisted to JSON files on the `app_config` Docker named volume and read by the respective feature services.
+
+### Feature proxies
+
+All `/api/documents/*` and `/api/documents/categories/*` requests are transparently proxied to `doc-service:8001` via `httpx.AsyncClient`. The proxy:
+- Validates the JWT (`get_current_user`)
+- Injects `x-user-id` header (UUID from `users.id`)
+- Strips hop-by-hop headers + `content-length`, `accept-encoding`, `content-type`
+- Returns `Response` (not `StreamingResponse`) to avoid content-length/chunked conflicts
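
As a rough illustration of the header handling listed above, a proxy of this kind might filter headers along the following lines. This is a sketch under assumptions, not the code in the backend proxy: the function name `filter_proxy_headers`, the exact hop-by-hop set, and the choice to discard any client-supplied `x-user-id` before injecting the trusted one are all hypothetical.

```python
# Hypothetical helper: the name and header sets are assumptions for illustration.
# Commonly cited hop-by-hop headers that a proxy should not forward.
HOP_BY_HOP = {
    "connection", "keep-alive", "proxy-authenticate", "proxy-authorization",
    "te", "trailer", "trailers", "transfer-encoding", "upgrade",
}
# Extra headers the status notes say are stripped.
EXTRA_STRIPPED = {"content-length", "accept-encoding", "content-type"}


def filter_proxy_headers(incoming: dict[str, str], user_id: str) -> dict[str, str]:
    """Drop headers that must not be forwarded, then inject the trusted user id."""
    # Assumption: a client-supplied x-user-id is discarded so it cannot be spoofed.
    blocked = HOP_BY_HOP | EXTRA_STRIPPED | {"x-user-id"}
    headers = {k: v for k, v in incoming.items() if k.lower() not in blocked}
    headers["x-user-id"] = user_id
    return headers


forwarded = filter_proxy_headers(
    {
        "Connection": "keep-alive",
        "Content-Length": "12",
        "Authorization": "Bearer t",
        "X-User-Id": "spoofed",
    },
    "u-1",
)
print(forwarded)
```

Header names are compared case-insensitively because HTTP header names are case-insensitive.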
+
+### Database models
+
+| Model | Table | Notes |
+|-------|-------|-------|
+| `User` | `users` | email, hashed_password, role (`user`\|`admin`), is_active |
+| `Profile` | `profiles` | one-to-one with User; full_name, phone, etc. |
+
+Alembic migrations in `backend/alembic/versions/` — version table: `alembic_version`.
+
+---
+
+## Architecture
+
+```
+Browser (port 5173 dev / 80 prod)
+    │
+    └── Vite dev proxy / nginx
+            │
+            └── /api/*  →  backend:8000  (FastAPI)
+                                │
+                    ┌───────────┼────────────┐
+                 /auth       /settings    /documents/*
+                 /users       (JSON        │
+                 /admin        volume)     └── proxy → doc-service:8001
+                 /profile
+```
+
+---
+
+## Security notes
+
+- JWT stored in `localStorage` — XSS risk. Migration to `httpOnly` cookie planned.
+- No refresh token — after 8h the user must log in again.
+- Admin routes use `get_current_admin` dependency (checks `role == "admin"`).
+- All backend routes require authentication except `/api/auth/*`.
+- `backend-net` is marked `internal: true` — containers on it cannot reach the internet directly.
+
+---
+
+## Known limitations / not implemented
+
+- **No refresh tokens** — 8h hard expiry; adding refresh requires `httpOnly` cookie + rotation
+- **No `httpOnly` cookie** — JWT in `localStorage` is XSS-exposed
+- **App permissions** — no per-user, per-app access control. Currently all authenticated users can use all apps. Planned: `user_app_permissions` table, admin UI to grant/revoke
+- **Groups / sharing** — no group model yet; blocks document sharing in doc-service
+- **Email verification** — accounts are active immediately after registration
+- **Password reset** — no flow implemented
+
+---
+
+## Future work
+
+- [ ] Groups + permissions system: `groups`, `group_memberships`, `group_app_permissions` tables; admin CRUD; doc sharing via group membership
+- [ ] App permissions registry: `user_app_permissions (user_id, app_key)`; AppsPage filtered by grants
+- [ ] `httpOnly` cookie migration for JWT
+- [ ] Refresh token flow (paired with cookie migration)
+- [ ] Email verification on registration
+- [ ] Password reset flow
+- [ ] Rate limiting on auth endpoints
@@ -56,3 +56,41 @@ Added `features/doc-service` — a FastAPI microservice that accepts PDF uploads
 ## Files Deleted

 - `frontend/src/pages/SettingsPage.tsx` — stub replaced by per-app settings pages
+
+---
+
+# 2026-04-14 — Server-side pagination and filter bar
+
+**Timestamp:** 2026-04-14T12:00:00+00:00
+
+## Summary
+
+Added server-side pagination and a filter bar to the Documents feature.
+
+## Files Added / Modified / Deleted
+
+- **Modified** `features/doc-service/app/schemas/document.py` — Added `DocumentPage` schema (`items`, `total`, `page`, `pages`)
+- **Modified** `features/doc-service/app/routers/documents.py` — `GET /documents` now accepts `page`, `per_page`, `sort`, `order`, `status`, `document_type`, `search` query params; returns `DocumentPage`
+- **Modified** `frontend/src/api/client.ts` — `listDocuments` accepts `DocumentListParams`; added `DocumentPage` and `DocumentListParams` interfaces
+- **Modified** `frontend/src/pages/DocumentsPage.tsx` — Added `FilterBar` (search, status, type, sort, order) and `Pagination` controls; query key includes params for cache isolation
+
+---
+
+# 2026-04-14 — AI Service priority queue + model config update
+
+**Timestamp:** 2026-04-14T15:00:00+00:00
+
+## Summary
+
+Added a priority queue system to ai-service with start/pause/resume/stop controls. Updated LM Studio model to gemma-4-e4b-it.
+
+## Files Added / Modified / Deleted
+
+- **Created** `features/ai-service/app/services/queue.py` — in-memory `asyncio.PriorityQueue` with HIGH/NORMAL/LOW priorities, FIFO within same level, single async worker with pause/resume/stop
+- **Created** `features/ai-service/app/schemas/queue.py` — `QueueRequest`, `JobStatus`, `QueueStatus` Pydantic models
+- **Created** `features/ai-service/app/routers/queue.py` — `POST /queue/jobs`, `GET /queue/jobs/{id}`, `DELETE /queue/jobs/{id}`, `GET /queue/status`, `POST /queue/pause|resume|start|stop`
+- **Modified** `features/ai-service/app/routers/chat.py` — extracted `execute_chat()` (called by queue worker); `POST /chat` now submits to queue at NORMAL priority and awaits result
+- **Modified** `features/ai-service/app/main.py` — start/stop queue worker in lifespan; mount queue router
+- **Modified** `features/ai-service/app/services/config_reader.py` — default model updated to `gemma-4-e4b-it`
+- **Modified** `features/ai-service/pyproject.toml` — `httpx` moved to runtime deps
+- **Modified** `features/ai-service/.env` — model updated to `gemma-4-e4b-it`
@@ -0,0 +1,112 @@
+# AI Service — Status
+
+## What it is
+
+Shared AI intermediary container. All feature containers (doc-service, future services) POST prompts here. It routes requests to the configured model (LM Studio / Ollama / Anthropic) and returns a normalised response. It is **stateless** — no database, no conversation history. History and context are the caller's responsibility.
+
+Port: `8010` (internal only, not exposed to host).
+
+---
+
+## Current functionality
+
+### Endpoints
+
+| Method | Path | Description |
+|--------|------|-------------|
+| `POST` | `/chat` | Synchronous chat: submits at NORMAL priority, blocks until done |
+| `GET` | `/health` | `{"status": "ok"}` |
+| `GET` | `/health/provider` | Active provider name, model, configured flag |
+| `POST` | `/queue/jobs` | Async enqueue — returns `job_id` immediately |
+| `GET` | `/queue/jobs/{id}` | Poll job: status, position, result, error |
+| `DELETE` | `/queue/jobs/{id}` | Cancel a pending job |
+| `GET` | `/queue/status` | Worker state: running, paused, queue_size, current_job_id |
+| `POST` | `/queue/pause` | Finish current job, stop picking new ones |
+| `POST` | `/queue/resume` | Unpause |
+| `POST` | `/queue/start` | Start (or restart) the worker task |
+| `POST` | `/queue/stop` | Stop worker (pending jobs stay queued) |
+
+### Priority queue
+
+- Three levels: `high` (1) > `normal` (3) > `low` (5)
+- FIFO within same priority level (monotonic sequence counter)
+- Single async worker — one LLM call at a time
+- Pause / resume / start / stop without restarting the container
+- `POST /chat` is a synchronous wrapper: enqueues at NORMAL, awaits the future
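
The ordering rules above can be sketched with a standalone `asyncio.PriorityQueue`. The `Priority` values mirror the ones in this commit; the helper `drain_in_order` and the string labels are hypothetical and stand in for real jobs.

```python
import asyncio
from enum import IntEnum
from itertools import count


class Priority(IntEnum):
    HIGH = 1
    NORMAL = 3
    LOW = 5


async def drain_in_order(submissions: list[tuple[Priority, str]]) -> list[str]:
    """Enqueue labelled items, then pop them in the order a worker would see them."""
    queue: asyncio.PriorityQueue = asyncio.PriorityQueue()
    seq = count(1)  # monotonic counter: breaks ties so equal priorities stay FIFO
    for priority, label in submissions:
        await queue.put((int(priority), next(seq), label))

    order: list[str] = []
    while not queue.empty():
        _, _, label = await queue.get()
        order.append(label)
    return order


order = asyncio.run(
    drain_in_order(
        [
            (Priority.LOW, "low-1"),
            (Priority.NORMAL, "normal-1"),
            (Priority.HIGH, "high-1"),
            (Priority.NORMAL, "normal-2"),
        ]
    )
)
print(order)  # ['high-1', 'normal-1', 'normal-2', 'low-1']
```

Because the sequence number is the second tuple element, the queue never has to compare the payloads themselves.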
+
+### Providers
+
+| Provider | Protocol | SDK |
+|----------|----------|-----|
+| LM Studio | OpenAI-compatible HTTP | openai |
+| Ollama | OpenAI-compatible HTTP | openai |
+| Anthropic | Anthropic API (HTTPS) | anthropic |
+
+Active provider is selected by `"provider"` key in `/config/ai_service_config.json` (shared Docker volume), with env var overrides for dev.
+
+### Configuration (env var overrides)
+
+```
+AI_PROVIDER          lmstudio | ollama | anthropic
+LMSTUDIO_BASE_URL    http://host.docker.internal:1234/v1
+LMSTUDIO_API_KEY     sk-lm-…
+LMSTUDIO_MODEL       gemma-4-e4b-it          ← current
+OLLAMA_BASE_URL / OLLAMA_MODEL / OLLAMA_API_KEY
+ANTHROPIC_API_KEY / ANTHROPIC_MODEL
+```
+
+Credentials live in `features/ai-service/.env` (gitignored).
+
+### Error codes
+
+| Code | Meaning |
+|------|---------|
+| 422 | Bad request (empty messages, unknown priority) |
+| 502 | Provider connection / API error |
+| 503 | Provider not configured / unknown provider |
+| 504 | Provider timeout |
+
+---
+
+## Architecture
+
+```
+Callers (doc-service, future services)
+    │
+    └─▶ POST /chat (sync)         ─┐
+    └─▶ POST /queue/jobs (async)  ─┤
+                                   ▼
+                        asyncio.PriorityQueue
+                        (HIGH=1, NORMAL=3, LOW=5)
+                                   │
+                        QueueWorker (single task)
+                                   │
+                        execute_chat(request)
+                                   │
+                        Provider SDK (openai / anthropic)
+                                   │
+                        LM Studio / Ollama / Anthropic API
+```
+
+---
+
+## Known limitations / not implemented
+
+- **TLS to LM Studio** — communication is plain HTTP (`http://host.docker.internal:1234`). Deferred until LM Studio HTTPS configuration is confirmed. When ready: set `LMSTUDIO_BASE_URL=https://...` and optionally add `ssl_verify` + `ca_bundle` config keys to the OpenAI-compat provider.
+- **True preemption** — a HIGH job arriving while a LOW job is processing will be next in queue but will not interrupt the running inference.
+- **Queue persistence** — the in-memory queue is lost on container restart. Pending jobs are not persisted to disk.
+- **Authentication on queue endpoints** — `/queue/*` management endpoints have no auth guard. Should be protected before any public/multi-tenant deployment (internal network is the only current protection).
+- **Streaming responses** — `/chat` returns the full response after generation. Streaming (Server-Sent Events) not implemented.
+- **Metrics / observability** — no Prometheus metrics, no structured request logging per job.
+
+---
+
+## Future work
+
+- [ ] TLS support for LM Studio / Ollama (`ssl_verify`, `ca_bundle` config)
+- [ ] Auth guard on queue management endpoints (admin token or internal-only route)
+- [ ] Streaming responses via SSE (`POST /chat/stream`)
+- [ ] Queue persistence (SQLite or Redis-backed) so jobs survive restarts
+- [ ] Job result TTL / cleanup (currently jobs accumulate in `_jobs` dict indefinitely)
+- [ ] Per-caller priority override (e.g. doc-service background jobs = LOW, user-triggered = NORMAL)
+- [ ] Metrics endpoint (`/metrics`) for queue depth, job latency, provider error rate
@@ -5,8 +5,7 @@ class Settings(BaseSettings):
    PROJECT_NAME: str = "ai-service"
    CONFIG_PATH: str = "/config/ai_service_config.json"

-    class Config:
-        env_file = ".env"
+    model_config = {"env_file": ".env", "extra": "ignore"}


 settings = Settings()
@@ -5,7 +5,9 @@ from fastapi import FastAPI

 from app.core.config import settings
 from app.routers import chat, health
+from app.routers import queue as queue_router
 from app.services.config_reader import load_ai_config
+from app.services.queue import queue_service

 logger = logging.getLogger("ai-service")

@@ -16,10 +18,18 @@ async def lifespan(app: FastAPI):
    provider = config.get("provider", "lmstudio")
    model = config.get(provider, {}).get("model", "unknown")
    logger.info("[ai-service] active provider: %s  model: %s", provider, model)
+
+    queue_service.start()
+    logger.info("[ai-service] queue worker started")
+
    yield

+    queue_service.stop()
+    logger.info("[ai-service] queue worker stopped")
+

 app = FastAPI(title=settings.PROJECT_NAME, lifespan=lifespan)

 app.include_router(chat.router, tags=["chat"])
 app.include_router(health.router, tags=["health"])
+app.include_router(queue_router.router)
@@ -1,3 +1,10 @@
+"""
+POST /chat — synchronous chat endpoint.
+
+All requests are submitted to the priority queue at NORMAL priority and the caller
+waits for the result. This keeps the contract identical to the original endpoint
+while ensuring all AI traffic flows through one ordered queue.
+"""
 import asyncio
 import re

@@ -21,8 +28,11 @@ def _strip_fences(text: str) -> str:
    return m.group(1).strip() if m else text.strip()


-@router.post("/chat", response_model=ChatResponse)
-async def chat(request: ChatRequest) -> ChatResponse:
+async def execute_chat(request: ChatRequest) -> ChatResponse:
+    """
+    Core provider call — invoked by the queue worker.
+    Raises HTTPException on provider errors so the queue worker stores the message.
+    """
    config = await load_ai_config()

    provider_name = config.get("provider", "lmstudio")
@@ -36,7 +46,6 @@ async def chat(request: ChatRequest) -> ChatResponse:

    timeout = config.get("timeout_seconds", 60)
    max_retries = config.get("max_retries", 2)
-    last_exc: Exception | None = None

    for attempt in range(max_retries + 1):
        try:
@@ -46,11 +55,8 @@ async def chat(request: ChatRequest) -> ChatResponse:
            )
            break
        except asyncio.TimeoutError as exc:
-            last_exc = exc
-            # Don't retry on timeout — the model is busy; fail fast
            raise HTTPException(status_code=504, detail="AI provider timed out") from exc
        except (AnthropicConnError, OpenAIConnError) as exc:
-            last_exc = exc
            if attempt < max_retries:
                await asyncio.sleep(0.5 * (attempt + 1))
                continue
@@ -68,3 +74,28 @@ async def chat(request: ChatRequest) -> ChatResponse:
        input_tokens=input_tokens,
        output_tokens=output_tokens,
    )
+
+
+@router.post("/chat", response_model=ChatResponse)
+async def chat(request: ChatRequest) -> ChatResponse:
+    """
+    Submit at NORMAL priority and block until the queue processes the job.
+    If the queue is paused or stopped, the call blocks until resumed (or times out).
+    """
+    from app.services.queue import Priority, queue_service  # deferred — avoids circular import
+
+    job = await queue_service.enqueue(request, Priority.NORMAL)
+    config = await load_ai_config()
+    timeout = float(config.get("timeout_seconds", 60)) + 5.0  # +5s buffer over provider timeout
+
+    try:
+        return await asyncio.wait_for(asyncio.shield(job.future), timeout=timeout)
+    except asyncio.TimeoutError:
+        queue_service.cancel_job(job.id)
+        raise HTTPException(status_code=504, detail="Timed out waiting for queue to process job")
+    except asyncio.CancelledError:
+        raise HTTPException(status_code=503, detail="Job was cancelled")
+    except Exception as exc:
+        if isinstance(exc, HTTPException):
+            raise
+        raise HTTPException(status_code=502, detail=str(exc)) from exc
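
The `wait_for(shield(...))` pattern in the handler above is worth a small standalone illustration. `asyncio.wait_for` cancels whatever it is awaiting when the timeout fires, so the sketch below compares a bare future with a shielded one. The function name `compare_timeouts` and the 10 ms timeout are hypothetical; nothing here touches the real queue.

```python
import asyncio


async def compare_timeouts() -> tuple[bool, bool]:
    """Return (shielded_cancelled, bare_cancelled) after both waits time out."""
    loop = asyncio.get_running_loop()
    shielded = loop.create_future()
    bare = loop.create_future()

    try:
        # Only the shield wrapper is cancelled on timeout; the inner future stays pending.
        await asyncio.wait_for(asyncio.shield(shielded), timeout=0.01)
    except asyncio.TimeoutError:
        pass

    try:
        # Without the shield, the timeout cancels the future itself.
        await asyncio.wait_for(bare, timeout=0.01)
    except asyncio.TimeoutError:
        pass

    return shielded.cancelled(), bare.cancelled()


shielded_cancelled, bare_cancelled = asyncio.run(compare_timeouts())
print(shielded_cancelled, bare_cancelled)
```

With the shield in place, a timed-out HTTP caller does not cancel the job's future directly, which presumably is why the handler calls `cancel_job` explicitly after a timeout.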
@@ -0,0 +1,104 @@
+"""
+Queue management router.
+
+POST  /queue/jobs          — enqueue a job, return immediately with job metadata
+GET   /queue/jobs/{id}     — poll job status / result
+DELETE /queue/jobs/{id}    — cancel a pending job
+
+GET   /queue/status        — worker state + queue depth
+POST  /queue/pause         — finish current job, stop picking new ones
+POST  /queue/resume        — resume from pause
+POST  /queue/start         — start (or restart) the worker
+POST  /queue/stop          — stop worker immediately (pending jobs stay queued)
+"""
+from fastapi import APIRouter, HTTPException
+
+from app.schemas.queue import JobStatus, QueueRequest, QueueStatus
+from app.services.queue import PRIORITY_MAP, Job, Priority, queue_service
+
+router = APIRouter(prefix="/queue", tags=["queue"])
+
+
+# ── Job endpoints ─────────────────────────────────────────────────────────────
+
+@router.post("/jobs", response_model=JobStatus, status_code=202)
+async def enqueue_job(request: QueueRequest) -> JobStatus:
+    priority = PRIORITY_MAP[request.priority]
+    job = await queue_service.enqueue(request, priority)
+    return _job_to_status(job)
+
+
+@router.get("/jobs/{job_id}", response_model=JobStatus)
+async def get_job(job_id: str) -> JobStatus:
+    job = queue_service.get_job(job_id)
+    if not job:
+        raise HTTPException(status_code=404, detail="Job not found")
+    return _job_to_status(job)
+
+
+@router.delete("/jobs/{job_id}", status_code=204)
+async def cancel_job(job_id: str) -> None:
+    if not queue_service.cancel_job(job_id):
+        raise HTTPException(status_code=404, detail="Job not found or already started")
+
+
+# ── Worker control endpoints ──────────────────────────────────────────────────
+
+@router.get("/status", response_model=QueueStatus)
+async def get_status() -> QueueStatus:
+    cur = queue_service.current_job
+    return QueueStatus(
+        running=queue_service._running,
+        paused=queue_service.is_paused,
+        queue_size=queue_service.queue_size,
+        current_job_id=cur.id if cur else None,
+    )
+
+
+@router.post("/pause", status_code=204)
+async def pause() -> None:
+    """Pause after the current job finishes."""
+    queue_service.pause()
+
+
+@router.post("/resume", status_code=204)
+async def resume() -> None:
+    """Resume from a paused state."""
+    queue_service.resume()
+
+
+@router.post("/start", status_code=204)
+async def start() -> None:
+    """Start (or restart) the worker task."""
+    queue_service.start()
+
+
+@router.post("/stop", status_code=204)
+async def stop() -> None:
+    """Stop the worker. Pending jobs remain in queue; POST /queue/start to resume."""
+    queue_service.stop()
+
+
+# ── Helper ────────────────────────────────────────────────────────────────────
+
+def _job_to_status(job: Job) -> JobStatus:
+    pos: int | None = None
+    if job.status == "pending":
+        # Count pending jobs ahead of this one: higher priority, or same priority with an earlier seq
+        pos = sum(
+            1
+            for j in queue_service._jobs.values()
+            if j.status == "pending"
+            and (int(j.priority), j.seq) < (int(job.priority), job.seq)
+        )
+    return JobStatus(
+        id=job.id,
+        status=job.status,
+        priority=Priority(job.priority).name.lower(),
+        position=pos,
+        created_at=job.created_at,
+        started_at=job.started_at,
+        finished_at=job.finished_at,
+        result=job.result,
+        error=job.error,
+    )
@@ -0,0 +1,40 @@
+from datetime import datetime
+from typing import Literal
+
+from pydantic import BaseModel, field_validator
+
+from app.schemas.chat import ChatMessage, ChatResponse
+
+
+class QueueRequest(BaseModel):
+    messages: list[ChatMessage]
+    max_tokens: int = 2048
+    temperature: float = 0.0
+    response_format: Literal["json", "text"] = "text"
+    priority: Literal["high", "normal", "low"] = "normal"
+
+    @field_validator("messages")
+    @classmethod
+    def messages_not_empty(cls, v: list) -> list:
+        if not v:
+            raise ValueError("messages must not be empty")
+        return v
+
+
+class JobStatus(BaseModel):
+    id: str
+    status: str
+    priority: str
+    position: int | None = None  # number of jobs ahead; None when not pending
+    created_at: datetime
+    started_at: datetime | None = None
+    finished_at: datetime | None = None
+    result: ChatResponse | None = None
+    error: str | None = None
+
+
+class QueueStatus(BaseModel):
+    running: bool
+    paused: bool
+    queue_size: int
+    current_job_id: str | None = None
@@ -23,7 +23,7 @@ _DEFAULT_CONFIG: dict = {
    "max_retries": 2,
    "anthropic": {"api_key": "", "model": "claude-haiku-4-5-20251001"},
    "ollama": {"base_url": "http://host.docker.internal:11434/v1", "model": "llama3.2", "api_key": "ollama"},
-    "lmstudio": {"base_url": "http://host.docker.internal:1234/v1", "model": "local-model", "api_key": "lm-studio"},
+    "lmstudio": {"base_url": "http://host.docker.internal:1234/v1", "model": "gemma-4-e4b-it", "api_key": "lm-studio"},
 }

 _cache: dict | None = None
@@ -0,0 +1,169 @@
+"""
+In-memory priority queue for AI requests.
+
+Jobs are ordered by (priority, sequence_number) so HIGH=1 jobs always run before
+NORMAL=3 and LOW=5 regardless of arrival order. Within the same priority level
+insertion order (FIFO) is preserved via the monotonically incrementing seq counter.
+
+The QueueService runs a single async worker task. It can be paused (current job
+finishes, no new jobs start), resumed, started, or stopped from outside.
+
+Module-level singleton `queue_service` is imported by routers and the app lifespan.
+"""
+import asyncio
+import uuid
+from dataclasses import dataclass, field
+from datetime import datetime, timezone
+from enum import IntEnum
+
+
+class Priority(IntEnum):
+    HIGH = 1
+    NORMAL = 3
+    LOW = 5
+
+
+PRIORITY_MAP: dict[str, Priority] = {
+    "high": Priority.HIGH,
+    "normal": Priority.NORMAL,
+    "low": Priority.LOW,
+}
+
+
+@dataclass
+class Job:
+    id: str
+    priority: Priority
+    seq: int
+    request: object  # ChatRequest — typed as object to avoid circular import
+    future: asyncio.Future
+    status: str = "pending"  # pending | processing | done | failed | cancelled
+    created_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))
+    started_at: datetime | None = None
+    finished_at: datetime | None = None
+    result: object = None
+    error: str | None = None
+
+    def __lt__(self, other: "Job") -> bool:
+        # asyncio.PriorityQueue requires items to be orderable
+        return (self.priority, self.seq) < (other.priority, other.seq)
+
+
+class QueueService:
+    def __init__(self) -> None:
+        self._queue: asyncio.PriorityQueue = asyncio.PriorityQueue()
+        self._jobs: dict[str, Job] = {}
+        self._seq: int = 0
+        self._worker_task: asyncio.Task | None = None
+        # Event: set = allowed to run; clear = paused
+        self._resume_event: asyncio.Event = asyncio.Event()
+        self._resume_event.set()
+        self._running: bool = False
+        self.current_job: Job | None = None
+
+    # ── Public API ────────────────────────────────────────────────────────────
+
+    async def enqueue(self, request: object, priority: Priority = Priority.NORMAL) -> Job:
+        self._seq += 1
+        job = Job(
+            id=str(uuid.uuid4()),
+            priority=priority,
+            seq=self._seq,
+            request=request,
+            future=asyncio.get_running_loop().create_future(),
+        )
+        self._jobs[job.id] = job
+        await self._queue.put((int(priority), self._seq, job))
+        return job
+
+    def get_job(self, job_id: str) -> Job | None:
+        return self._jobs.get(job_id)
+
+    def cancel_job(self, job_id: str) -> bool:
+        """Cancel a pending job. Returns False if not found or already started."""
+        job = self._jobs.get(job_id)
+        if job and job.status == "pending":
+            job.status = "cancelled"
+            if not job.future.done():
+                job.future.cancel()
+            return True
+        return False
+
+    def start(self) -> None:
+        """Start the worker. No-op if already running."""
+        if not self._running or (self._worker_task and self._worker_task.done()):
+            self._resume_event.set()
+            self._running = True
+            self._worker_task = asyncio.create_task(self._worker_loop())
+
+    def pause(self) -> None:
+        """Pause after the current job finishes. Does not cancel in-progress work."""
+        self._resume_event.clear()
+
+    def resume(self) -> None:
+        """Resume from a paused state."""
+        self._resume_event.set()
+
+    def stop(self) -> None:
+        """Stop the worker. Pending jobs remain in the queue; start() will resume them."""
+        self._running = False
+        self._resume_event.set()  # unblock the wait so the loop can exit
+        if self._worker_task and not self._worker_task.done():
+            self._worker_task.cancel()
+
+    @property
+    def is_paused(self) -> bool:
+        return not self._resume_event.is_set()
+
+    @property
+    def queue_size(self) -> int:
+        return self._queue.qsize()
+
+    # ── Internal ──────────────────────────────────────────────────────────────
+
+    async def _worker_loop(self) -> None:
+        while self._running:
+            # Block here while paused
+            await self._resume_event.wait()
+
+            try:
+                _, _, job = await asyncio.wait_for(self._queue.get(), timeout=1.0)
+            except asyncio.TimeoutError:
+                continue
+            except asyncio.CancelledError:
+                break
+
+            if job.status == "cancelled":
+                self._queue.task_done()
+                continue
+
+            try:
+                await self._process(job)
+            finally:
+                self._queue.task_done()
+
+    async def _process(self, job: Job) -> None:
+        # Deferred import — avoids circular dependency with chat router
+        from app.routers.chat import execute_chat  # noqa: PLC0415
+
+        job.status = "processing"
+        job.started_at = datetime.now(timezone.utc)
+        self.current_job = job
+        try:
+            result = await execute_chat(job.request)
+            job.status = "done"
+            job.result = result
+            if not job.future.done():
+                job.future.set_result(result)
+        except Exception as exc:
+            job.status = "failed"
+            job.error = str(exc)
+            if not job.future.done():
+                job.future.set_exception(exc)
+        finally:
+            job.finished_at = datetime.now(timezone.utc)
+            self.current_job = None
+
+
+# Singleton used throughout the app
+queue_service = QueueService()
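
The pause/resume mechanism in this file relies on an `asyncio.Event` acting as a gate. A stripped-down sketch of the same idea follows, assuming a plain `asyncio.Queue` and hypothetical names (`gated_worker`, `demo`). It illustrates the technique and is not a copy of `QueueService`.

```python
import asyncio


async def gated_worker(queue: asyncio.Queue, gate: asyncio.Event, done: list[str]) -> None:
    """Process items only while the gate is set; block at the gate while paused."""
    while True:
        await gate.wait()  # cleared event = paused, set event = allowed to run
        item = await queue.get()
        done.append(item)
        queue.task_done()


async def demo() -> tuple[list[str], list[str]]:
    queue: asyncio.Queue = asyncio.Queue()
    gate = asyncio.Event()  # starts cleared, so the worker begins paused
    done: list[str] = []
    for name in ("a", "b"):
        queue.put_nowait(name)

    task = asyncio.create_task(gated_worker(queue, gate, done))
    await asyncio.sleep(0.05)
    while_paused = list(done)  # snapshot taken before resuming

    gate.set()  # resume
    await queue.join()  # wait until both items are marked done
    task.cancel()
    await asyncio.gather(task, return_exceptions=True)
    return while_paused, done


while_paused, processed = asyncio.run(demo())
print(while_paused, processed)
```

One consequence of this design is that the gate is only checked between jobs. A worker that is already blocked in `queue.get()` when the event is cleared can still pick up one more item before it pauses.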
@@ -12,6 +12,7 @@ dependencies = [
    "pydantic-settings>=2.2",
    "anthropic>=0.28",
    "openai>=1.0",
+    "httpx>=0.27",
 ]

 [project.optional-dependencies]
@@ -0,0 +1,143 @@
+# Doc Service — Status
+
+## What it is
+
+PDF document management microservice. Handles upload, storage, async AI-powered extraction, tagging, categorisation, and retrieval of PDF documents on a per-user basis.
+
+Port: `8001` (internal only, not exposed to host). All traffic arrives via the backend proxy (`backend/app/routers/documents_proxy.py`), which injects the authenticated `x-user-id` header.
+
+Database: shared PostgreSQL instance, isolated via `alembic_version_doc_service` Alembic version table. Storage: `/data/documents/` (Docker named volume `doc_data`).
+
+---
+
+## Current functionality
+
+### Document lifecycle
+
+1. `POST /documents/upload` — validate PDF, persist file to `/data/documents/{user_id}/{doc_id}.pdf`, create DB row with `status=pending`, enqueue background extraction
+2. Background task: extract text with `pdfplumber` → POST to ai-service `/chat` → parse JSON result → update `status=done` (or `failed`)
+3. AI extracts: `title`, `document_type`, `tags`, `suggested_categories`, plus domain fields (vendor, customer, dates, amounts, etc.) into `extracted_data` (JSON string)
+
+### Endpoints
+
+| Method | Path | Description |
+|--------|------|-------------|
+| `POST` | `/documents/upload` | Upload PDF; returns 202 with initial doc row |
+| `GET` | `/documents` | Paginated list with filters and sort |
+| `GET` | `/documents/{id}` | Single document |
+| `GET` | `/documents/{id}/status` | Lightweight status poll |
+| `GET` | `/documents/{id}/download` | Stream file bytes |
+| `DELETE` | `/documents/{id}` | Delete document and file |
+| `PATCH` | `/documents/{id}/type` | Update document type |
+| `PATCH` | `/documents/{id}/tags` | Replace tag list (dedup, preserve order) |
+| `PATCH` | `/documents/{id}/title` | Update editable title |
+| `GET` | `/documents/categories` | List all categories for the user |
+| `POST` | `/documents/categories` | Create a category |
+| `POST` | `/documents/{id}/categories/{cat_id}` | Assign category to document |
+| `DELETE` | `/documents/{id}/categories/{cat_id}` | Remove category from document |
+
+### Pagination & filtering (`GET /documents`)
+
+Query params:
+
+| Param | Default | Notes |
+|-------|---------|-------|
+| `page` | 1 | ≥ 1 |
+| `per_page` | 20 | 1–100 |
+| `sort` | `created_at` | One of `created_at`, `processed_at`, `filename`, `title`, `file_size`, `status`, `document_type`; any other value falls back to `created_at` |
+| `order` | `desc` | `asc` \| `desc` |
+| `status` | — | filter by status string |
+| `document_type` | — | filter by document type |
+| `search` | — | case-insensitive substring match (ILIKE) on `title`, `filename`, `tags`, `document_type`; `tags` is matched against its stored JSON array string |
+
+Response: `{ items: [...], total: N, page: N, pages: N }`, where `pages` is at least 1 even when `total` is 0.
+
+### Document schema
+
+```
+id            UUID
+user_id       string (from x-user-id header)
+filename      original filename
+title         AI-suggested editable title (nullable)
+file_size     bytes
+status        pending | processing | done | failed
+document_type AI-classified type (nullable)
+extracted_data JSON string — all AI-extracted fields
+tags          JSON array string — editable tags
+error_message set if status=failed
+created_at    upload timestamp
+processed_at  when extraction finished
+categories    many-to-many via category_assignments
+```
+
+### AI extraction (via ai-service)
+
+Prompt sends the first 50 000 chars of extracted text. Expected JSON response includes:
+- `title` — suggested human-readable title
+- `document_type` — invoice / bill / receipt / order / expense / revenue / unknown
+- `tags` — list of keyword tags
+- `suggested_categories` — list of category names to suggest in the UI
+- Domain fields: `vendor`, `customer`, `invoice_number`, `due_date`, `total_amount`, `currency`, etc.
+
+### Config (runtime, persisted to shared volume)
+
+`/config/doc_service_config.json`:
+```json
+{ "documents": { "max_pdf_bytes": 20971520 } }
+```
+Env override: `DOC_MAX_PDF_MB`
+
+### Database migrations
+
+| Revision | Description |
+|----------|-------------|
+| 0001 | Initial schema (documents, categories, category_assignments) |
+| 0002 | Add `title` column to documents |
+
+Run automatically on container start via `alembic upgrade head`.
+
+---
+
+## Architecture
+
+```
+backend (proxy)  →  doc-service:8001
+                        │
+                   documents.py router
+                        │
+               ┌────────┴────────┐
+          upload              list/get/patch
+               │
+        save_upload()        pdfplumber extraction
+               │                    │
+         Document(status=pending)   ai_client.classify_document()
+               │                    │
+        BackgroundTask         ai-service:8010/chat
+               │                    │
+         process_document()   JSON result → update doc row
+```
+
+---
+
+## Known limitations / not implemented
+
+- **Re-process** — no endpoint to re-trigger AI extraction on an existing document (e.g. after changing the AI model or prompt)
+- **Advanced field-level search** — `search` param matches text fields via ILIKE but does not query into `extracted_data` JSON (e.g. filter by `vendor` or `due_date`)
+- **Bulk operations** — no bulk category assign/remove, no bulk delete
+- **Document sharing** — documents are strictly per-user; no group sharing yet
+- **Pagination in categories** — categories are returned as a full list (no pagination)
+- **File type** — only PDF supported
+- **Concurrent uploads** — no rate limiting per user
+
+---
+
+## Future work
+
+- [ ] `POST /documents/{id}/reprocess` — re-run AI extraction
+- [ ] Advanced filter: query `extracted_data` JSON fields (vendor, due_date, amount) — requires PostgreSQL `jsonb` column or indexed virtual columns
+- [ ] Bulk operations endpoint
+- [ ] Document sharing via groups (blocked on groups/permissions system in backend)
+- [ ] Support additional file types (images via OCR, DOCX)
+- [ ] Rate limiting on upload endpoint
+- [ ] Soft delete with restore
+- [ ] Category rename / delete with cascade handling
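The endpoint table above describes `PATCH /documents/{id}/tags` as replacing the tag list with de-duplication that preserves order. A minimal sketch of that behaviour, assuming exact-string comparison (whether the service also trims whitespace or ignores case is not stated here); `dedupe_tags` is a hypothetical name, not a function from the service:

```python
def dedupe_tags(tags: list[str]) -> list[str]:
    """Drop repeated tags, keeping each tag at its first position."""
    # dict preserves insertion order, so the first occurrence wins.
    return list(dict.fromkeys(tags))


cleaned = dedupe_tags(["invoice", "2024", "invoice", "acme", "2024"])
```

Using `dict.fromkeys` keeps the operation linear in the number of tags, which is one reason it may be preferable to a nested membership check.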
@@ -1,13 +1,14 @@
 import asyncio
 import json
+import math
 import uuid
 from datetime import datetime, timezone

 import aiofiles
 import pdfplumber
-from fastapi import APIRouter, BackgroundTasks, Depends, HTTPException, UploadFile
+from fastapi import APIRouter, BackgroundTasks, Depends, HTTPException, Query, UploadFile
 from fastapi.responses import StreamingResponse
-from sqlalchemy import select
+from sqlalchemy import func, or_, select
 from sqlalchemy.ext.asyncio import AsyncSession
 from sqlalchemy.orm import selectinload

@@ -16,7 +17,7 @@ from app.deps import get_user_id
 from app.models.category import DocumentCategory
 from app.models.category_assignment import CategoryAssignment
 from app.models.document import Document
-from app.schemas.document import DocumentOut, DocumentStatusOut, DocumentTypeUpdate, TagsUpdate, TitleUpdate
+from app.schemas.document import DocumentOut, DocumentPage, DocumentStatusOut, DocumentTypeUpdate, TagsUpdate, TitleUpdate
 from app.services.ai_client import AIServiceError, classify_document
 from app.services.config_reader import load_doc_config
 from app.services.storage import delete_file, get_upload_path, save_upload
@@ -50,6 +51,7 @@ def _doc_with_categories(doc: Document) -> DocumentOut:
        id=doc.id,
        user_id=doc.user_id,
        filename=doc.filename,
+        title=doc.title,
        file_size=doc.file_size,
        status=doc.status,
        document_type=doc.document_type,
@@ -143,28 +145,83 @@ async def upload_document(
    )
    db.add(doc)
    await db.commit()
-    await db.refresh(doc)

    background_tasks.add_task(process_document, doc_id)

+    # Re-query with selectinload so category_assignments is eagerly loaded.
+    # A new doc has no categories yet, but we need the relationship populated
+    # to avoid MissingGreenlet in the async session.
+    doc = await _get_user_doc(doc_id, user_id, db)
    return _doc_with_categories(doc)


-@router.get("", response_model=list[DocumentOut])
+_SORT_COLUMNS = {
+    "created_at": Document.created_at,
+    "processed_at": Document.processed_at,
+    "filename": Document.filename,
+    "title": Document.title,
+    "file_size": Document.file_size,
+    "status": Document.status,
+    "document_type": Document.document_type,
+}
+
+
+@router.get("", response_model=DocumentPage)
 async def list_documents(
+    page: int = Query(default=1, ge=1),
+    per_page: int = Query(default=20, ge=1, le=100),
+    sort: str = Query(default="created_at"),
+    order: str = Query(default="desc", pattern="^(asc|desc)$"),
+    status: str | None = Query(default=None),
+    document_type: str | None = Query(default=None),
+    search: str | None = Query(default=None),
    user_id: str = Depends(get_user_id),
    db: AsyncSession = Depends(get_db),
-) -> list[DocumentOut]:
-    result = await db.execute(
+) -> DocumentPage:
+    sort_col = _SORT_COLUMNS.get(sort, Document.created_at)
+    sort_expr = sort_col.desc() if order == "desc" else sort_col.asc()
+
+    # Build filter conditions once and reuse for both count + items queries.
+    conditions = [Document.user_id == user_id]
+    if status:
+        conditions.append(Document.status == status)
+    if document_type:
+        conditions.append(Document.document_type == document_type)
+    if search:
+        like = f"%{search}%"
+        conditions.append(
+            or_(
+                Document.title.ilike(like),
+                Document.filename.ilike(like),
+                Document.tags.ilike(like),
+                Document.document_type.ilike(like),
+            )
+        )
+
+    count_result = await db.execute(
+        select(func.count(Document.id)).where(*conditions)
+    )
+    total = count_result.scalar_one()
+
+    items_result = await db.execute(
        select(Document)
-        .where(Document.user_id == user_id)
+        .where(*conditions)
        .options(
            selectinload(Document.category_assignments)
            .selectinload(CategoryAssignment.category)
        )
-        .order_by(Document.created_at.desc())
+        .order_by(sort_expr)
+        .offset((page - 1) * per_page)
+        .limit(per_page)
+    )
+    items = [_doc_with_categories(d) for d in items_result.scalars().all()]
+
+    return DocumentPage(
+        items=items,
+        total=total,
+        page=page,
+        pages=max(1, math.ceil(total / per_page)),
    )
-    return [_doc_with_categories(d) for d in result.scalars().all()]


@router.get("/{doc_id}", response_model=DocumentOut)
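In `list_documents`, the `search` value is wrapped in `%` as-is, so a `%` or `_` typed by the user acts as a wildcard rather than a literal character. If literal matching is wanted, one possible approach is to escape those characters before building the pattern. This is a sketch only: `escape_like` is a hypothetical helper, and the commented usage assumes SQLAlchemy's `escape` argument to `ilike`.

```python
def escape_like(term: str, escape_char: str = "\\") -> str:
    """Escape LIKE/ILIKE wildcards so the term is matched literally."""
    return (
        term.replace(escape_char, escape_char * 2)  # escape the escape char first
        .replace("%", escape_char + "%")
        .replace("_", escape_char + "_")
    )


pattern = f"%{escape_like('100%_done')}%"
# Hypothetical use in a filter condition:
#   Document.title.ilike(pattern, escape="\\")
```

The escape character is replaced first so that the backslashes added for `%` and `_` are not themselves doubled.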
@@ -27,6 +27,13 @@ class DocumentOut(BaseModel):
    model_config = {"from_attributes": True}


+class DocumentPage(BaseModel):
+    items: list[DocumentOut]
+    total: int
+    page: int
+    pages: int
+
+
 class DocumentStatusOut(BaseModel):
    id: str
    status: str
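The `page` and `pages` fields of `DocumentPage` follow from simple arithmetic on `total` and `per_page`. The sketch below reproduces that arithmetic in isolation, assuming the same bounds as the `Query` validators in `list_documents`; `page_window` is a hypothetical helper name used only for illustration.

```python
import math


def page_window(total: int, page: int = 1, per_page: int = 20) -> dict:
    """Return offset, limit and page count for a 1-based page request."""
    if page < 1:
        raise ValueError("page must be >= 1")
    if not 1 <= per_page <= 100:
        raise ValueError("per_page must be between 1 and 100")
    return {
        "offset": (page - 1) * per_page,
        "limit": per_page,
        # An empty result set still reports one (empty) page.
        "pages": max(1, math.ceil(total / per_page)),
    }


window = page_window(total=41, page=3)
```

Note that a `page` beyond `pages` is not rejected here; as in the handler, it would simply produce an offset past the last row and an empty `items` list.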
@@ -0,0 +1,151 @@
+# Frontend — Status
+
+## What it is
+
+React 18 + TypeScript + Vite SPA. In dev it runs on port `5173` and proxies `/api/*` to `backend:8000`. In prod it is served by nginx on port `80`.
+
+All API calls go through `src/api/client.ts` (single Axios instance, JWT injected via request interceptor from `localStorage`).
+
+---
+
+## Routes
+
+| Path | Component | Auth |
+|------|-----------|------|
+| `/login` | `LoginPage` | Public |
+| `/` | `DashboardPage` | Required |
+| `/apps` | `AppsPage` | Required |
+| `/apps/documents` | `DocumentsPage` | Required |
+| `/apps/documents/settings/admin` | `DocumentAdminSettingsPage` | Admin only |
+| `/apps/ai/settings/admin` | `AIAdminSettingsPage` | Admin only |
+| `/admin` | `AdminPage` | Admin only |
+| `/profile` | `ProfilePage` | Required |
+
+`PrivateRoute` redirects to `/login` when no token. `AdminRoute` redirects to `/` when not admin.
+
+---
+
+## Current functionality
+
+### Auth
+
+- Login form (`POST /api/auth/login`) stores JWT in `localStorage`
+- Logout clears token and redirects to `/login`
+- `GET /api/users/me` verifies token on protected routes
+
+### Apps page (`/apps`)
+
+Cards for each installed app:
+- **Documents** — link to `/apps/documents`; admin gear icon → `/apps/documents/settings/admin`
+- **AI Service** — infrastructure card; admin gear icon → `/apps/ai/settings/admin`; no Open button (no user-facing UI)
+
+### Documents page (`/apps/documents`)
+
+**Upload:** PDF file input, 202 response, error display.
+
+**Filter bar:**
+- Search input (400ms debounce) — matches title, filename, tags, document_type
+- Status dropdown (all / pending / processing / done / failed)
+- Type dropdown (all / invoice / bill / receipt / order / expense / revenue / unknown)
+- Sort selector (upload date / processed date / title / filename / file size / type / status)
+- Asc/Desc toggle
+- "Clear filters" button (appears when any filter is active)
+
+**Pagination:** Prev/Next with "X–Y of Z" count. Only shown when total > per_page.
+
+**Document row (collapsed):**
+- Inline title editor (pencil icon, Enter to save, Esc to cancel; shows filename in italic when no title)
+- Status badge (colour-coded)
+- Document type label
+- File size
+- View button (opens PDF in new tab via blob URL — auth-gated)
+- Download button
+- Delete button (confirm dialog)
+
+**Document row (expanded):**
+- **Tag editor** — read mode shows chips + Edit button; edit mode has removable chips + input (Enter/comma to add) + Save/Cancel
+- **Extracted data table** — all AI-extracted JSON fields (excludes `tags`, `suggested_categories`)
+- **Error message** — shown if status=failed
+- **Categories** — assigned chips with remove; dropdown to assign existing; AI-suggested chips with Accept / Create & Assign / Dismiss
+- **Status polling** — auto-refetches every 3s while status is pending/processing; invalidates document list on done/failed
+
+### AI Admin Settings (`/apps/ai/settings/admin`)
+
+- Provider selector (lmstudio / ollama / anthropic)
+- Per-provider fields (base URL, model, API key)
+- Test Connection button (`POST /api/settings/ai/test`)
+- Save button
+
+### Document Admin Settings (`/apps/documents/settings/admin`)
+
+- Upload Limits section only (max PDF size in MB)
+- Save button
+
+### Admin page (`/admin`)
+
+- User list with role and active status
+- Inline role/status editing
+
+### Profile page (`/profile`)
+
+- Display and edit personal information
+
+---
+
+## API client (`src/api/client.ts`)
+
+Key functions:
+
+| Function | Description |
+|----------|-------------|
+| `listDocuments(params)` | `GET /documents` — returns `DocumentPage` |
+| `uploadDocument(file)` | `POST /documents/upload` |
+| `deleteDocument(id)` | `DELETE /documents/{id}` |
+| `downloadDocument(id, filename)` | Blob URL download |
+| `viewDocument(id)` | Blob URL → `window.open`, auto-revoke after 60s |
+| `getDocumentStatus(id)` | Poll endpoint |
+| `listCategories()` | All categories for user |
+| `createCategory(name)` | Create category |
+| `assignCategory(docId, catId)` | Assign |
+| `removeCategory(docId, catId)` | Remove |
+| `updateDocumentTags(id, tags)` | `PATCH /documents/{id}/tags` |
+| `updateDocumentTitle(id, title)` | `PATCH /documents/{id}/title` |
+| `getAISettings()` | `GET /settings/ai` (masked) |
+| `updateAISettings(data)` | `PATCH /settings/ai` |
+| `testAIConnection()` | `POST /settings/ai/test` |
+| `getDocumentLimits()` | `GET /settings/documents/limits` |
+| `updateDocumentLimits(data)` | `PATCH /settings/documents/limits` |
+
+---
+
+## State management
+
+- **TanStack Query** — all server state; `queryKey: ["documents", params]` for cache isolation per filter/page combination
+- **No global store** — local `useState` for UI-only state (editing mode, filter params, etc.)
+- **Token** — `localStorage`, read by `useAuth` hook, injected by Axios interceptor
+
+---
+
+## Known limitations / not implemented
+
+- **JWT in `localStorage`** — XSS risk; migrate to `httpOnly` cookie when backend supports it
+- **No toast / notification system** — errors shown inline; success is silent
+- **No loading skeletons** — "Loading…" text only
+- **No UI component library** — raw inline styles throughout; Penpot + shadcn/ui evaluation pending
+- **No group/sharing UI** — blocked on backend groups system
+- **No app permission UI** — all apps visible to all authenticated users
+
+---
+
+## Future work
+
+- [ ] UI component library decision (shadcn/ui recommended) + Penpot design system
+- [ ] Toast notification system (upload success, save feedback, errors)
+- [ ] Loading skeletons
+- [ ] `POST /queue/jobs` integration — show AI processing queue status / progress per document
+- [ ] Re-process document button (`POST /documents/{id}/reprocess` — needs backend endpoint first)
+- [ ] Advanced filter: extracted data fields (vendor, due date, amount) — needs backend support
+- [ ] Groups + document sharing UI — blocked on backend
+- [ ] App permissions UI in Admin page
+- [ ] `httpOnly` cookie auth (requires backend change)
+- [ ] Bulk document operations (select multiple, bulk delete / bulk categorise)
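The status polling described above (refetch every 3 s while a document is `pending` or `processing`, stop on `done` or `failed`) can be sketched outside React as a plain loop. This is an illustration under assumptions: `poll_until_terminal`, the `max_attempts` cap, and the injected `fetch_status` and `sleep` callables are all hypothetical, not part of the frontend or the API client.

```python
import time
from typing import Callable

TERMINAL_STATUSES = {"done", "failed"}


def poll_until_terminal(
    fetch_status: Callable[[], str],
    interval_s: float = 3.0,
    max_attempts: int = 100,  # assumed cap; the UI polls without one
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call fetch_status until it returns a terminal status, then return it."""
    for attempt in range(max_attempts):
        status = fetch_status()
        if status in TERMINAL_STATUSES:
            return status
        if attempt < max_attempts - 1:
            sleep(interval_s)  # wait before the next poll
    raise TimeoutError("document did not reach a terminal status")
```

Passing `sleep` as a parameter keeps the loop easy to exercise in tests without real delays; in practice `fetch_status` would wrap a call to `GET /documents/{id}/status`.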
@@ -98,6 +98,23 @@ export interface DocumentOut {
  categories: CategoryOut[];
 }

+export interface DocumentPage {
+  items: DocumentOut[];
+  total: number;
+  page: number;
+  pages: number;
+}
+
+export interface DocumentListParams {
+  page?: number;
+  per_page?: number;
+  sort?: string;
+  order?: "asc" | "desc";
+  status?: string;
+  document_type?: string;
+  search?: string;
+}
+
 export interface DocumentStatusOut {
  id: string;
  status: DocumentStatus;
@@ -106,8 +123,8 @@ export interface DocumentStatusOut {
  processed_at: string | null;
 }

-export const listDocuments = () =>
-  api.get<DocumentOut[]>("/documents").then((r) => r.data);
+export const listDocuments = (params: DocumentListParams = {}) =>
+  api.get<DocumentPage>("/documents", { params }).then((r) => r.data);

 export const getDocument = (id: string) =>
  api.get<DocumentOut>(`/documents/${id}`).then((r) => r.data);
@@ -1,4 +1,4 @@
-import { useRef, useState, useEffect } from "react";
+import { useRef, useState, useEffect, useCallback } from "react";
 import { useQuery, useMutation, useQueryClient } from "@tanstack/react-query";
 import Nav from "../components/Nav";
 import {
@@ -16,6 +16,7 @@ import {
  updateDocumentTitle,
  type DocumentOut,
  type CategoryOut,
+  type DocumentListParams,
 } from "../api/client";

 function StatusBadge({ status }: { status: DocumentOut["status"] }) {
@@ -504,6 +505,140 @@ function DocumentRow({
  );
 }

+// ── Filter bar ──────────────────────────────────────────────────────────────
+
+const SORT_OPTIONS = [
+  { value: "created_at", label: "Upload date" },
+  { value: "processed_at", label: "Processed date" },
+  { value: "title", label: "Title" },
+  { value: "filename", label: "Filename" },
+  { value: "file_size", label: "File size" },
+  { value: "document_type", label: "Type" },
+  { value: "status", label: "Status" },
+];
+
+const STATUS_OPTIONS = ["pending", "processing", "done", "failed"];
+const TYPE_OPTIONS = ["invoice", "bill", "receipt", "order", "expense", "revenue", "unknown"];
+
+function FilterBar({
+  params,
+  onChange,
+}: {
+  params: DocumentListParams;
+  onChange: (p: Partial<DocumentListParams>) => void;
+}) {
+  const [searchInput, setSearchInput] = useState(params.search ?? "");
+
+  // Debounce search: commit after 400 ms of no typing
+  useEffect(() => {
+    const id = setTimeout(() => onChange({ search: searchInput || undefined, page: 1 }), 400);
+    return () => clearTimeout(id);
+  // eslint-disable-next-line react-hooks/exhaustive-deps
+  }, [searchInput]);
+
+  return (
+    <div style={{ display: "flex", flexWrap: "wrap", gap: 8, marginBottom: 16, alignItems: "center" }}>
+      <input
+        value={searchInput}
+        onChange={(e) => setSearchInput(e.target.value)}
+        placeholder="Search title, filename, tags…"
+        style={{ padding: "6px 10px", fontSize: 13, border: "1px solid #ccc", borderRadius: 4, width: 220 }}
+      />
+
+      <select
+        value={params.status ?? ""}
+        onChange={(e) => onChange({ status: e.target.value || undefined, page: 1 })}
+        style={{ padding: "6px 8px", fontSize: 13, border: "1px solid #ccc", borderRadius: 4 }}
+      >
+        <option value="">All statuses</option>
+        {STATUS_OPTIONS.map((s) => (
+          <option key={s} value={s}>{s}</option>
+        ))}
+      </select>
+
+      <select
+        value={params.document_type ?? ""}
+        onChange={(e) => onChange({ document_type: e.target.value || undefined, page: 1 })}
+        style={{ padding: "6px 8px", fontSize: 13, border: "1px solid #ccc", borderRadius: 4 }}
+      >
+        <option value="">All types</option>
+        {TYPE_OPTIONS.map((t) => (
+          <option key={t} value={t}>{t}</option>
+        ))}
+      </select>
+
+      <select
+        value={params.sort ?? "created_at"}
+        onChange={(e) => onChange({ sort: e.target.value, page: 1 })}
+        style={{ padding: "6px 8px", fontSize: 13, border: "1px solid #ccc", borderRadius: 4 }}
+      >
+        {SORT_OPTIONS.map((o) => (
+          <option key={o.value} value={o.value}>{o.label}</option>
+        ))}
+      </select>
+
+      <button
+        onClick={() => onChange({ order: params.order === "asc" ? "desc" : "asc", page: 1 })}
+        title={params.order === "asc" ? "Ascending — click to reverse" : "Descending — click to reverse"}
+        style={{ padding: "6px 10px", fontSize: 13, border: "1px solid #ccc", borderRadius: 4, cursor: "pointer", background: "#fff" }}
+      >
+        {params.order === "asc" ? "↑ Asc" : "↓ Desc"}
+      </button>
+
+      {(params.search || params.status || params.document_type) && (
+        <button
+          onClick={() => { setSearchInput(""); onChange({ search: undefined, status: undefined, document_type: undefined, page: 1 }); }}
+          style={{ padding: "6px 10px", fontSize: 12, border: "1px solid #ddd", borderRadius: 4, cursor: "pointer", color: "#666", background: "#fafafa" }}
+        >
+          Clear filters
+        </button>
+      )}
+    </div>
+  );
+}
+
+// ── Pagination controls ──────────────────────────────────────────────────────
+
+function Pagination({
+  page,
+  pages,
+  total,
+  perPage,
+  onChange,
+}: {
+  page: number;
+  pages: number;
+  total: number;
+  perPage: number;
+  onChange: (p: number) => void;
+}) {
+  const start = (page - 1) * perPage + 1;
+  const end = Math.min(page * perPage, total);
+
+  return (
+    <div style={{ display: "flex", alignItems: "center", gap: 8, marginTop: 16, fontSize: 13, color: "#555" }}>
+      <button
+        onClick={() => onChange(page - 1)}
+        disabled={page <= 1}
+        style={{ padding: "4px 10px", cursor: page > 1 ? "pointer" : "default", borderRadius: 4, border: "1px solid #ccc", background: "#fff" }}
+      >
+        ‹ Prev
+      </button>
+      <span>
+        {start}–{end} of {total}
+      </span>
+      <button
+        onClick={() => onChange(page + 1)}
+        disabled={page >= pages}
+        style={{ padding: "4px 10px", cursor: page < pages ? "pointer" : "default", borderRadius: 4, border: "1px solid #ccc", background: "#fff" }}
+      >
+        Next ›
+      </button>
+      <span style={{ color: "#aaa" }}>Page {page} / {pages}</span>
+    </div>
+  );
+}
+
 // ── Page ────────────────────────────────────────────────────────────────────

 export default function DocumentsPage() {
@@ -512,11 +647,26 @@ export default function DocumentsPage() {
  const [newCatName, setNewCatName] = useState("");
  const [uploadError, setUploadError] = useState<string | null>(null);

-  const { data: documents = [], isLoading } = useQuery({
-    queryKey: ["documents"],
-    queryFn: listDocuments,
+  const [params, setParams] = useState<DocumentListParams>({
+    page: 1,
+    per_page: 20,
+    sort: "created_at",
+    order: "desc",
  });

+  const updateParams = useCallback((patch: Partial<DocumentListParams>) => {
+    setParams((prev) => ({ ...prev, ...patch }));
+  }, []);
+
+  const { data: docPage, isLoading } = useQuery({
+    queryKey: ["documents", params],
+    queryFn: () => listDocuments(params),
+  });
+
+  const documents = docPage?.items ?? [];
+  const total = docPage?.total ?? 0;
+  const pages = docPage?.pages ?? 1;
+
  const { data: categories = [] } = useQuery({
    queryKey: ["categories"],
    queryFn: listCategories,
@@ -558,7 +708,7 @@ export default function DocumentsPage() {
  return (
    <>
      <Nav />
-      <div style={{ padding: 32, maxWidth: 900, margin: "0 auto" }}>
+      <div style={{ padding: 32, maxWidth: 960, margin: "0 auto" }}>
        <h1>Documents</h1>

        {/* Upload */}
@@ -611,20 +761,38 @@ export default function DocumentsPage() {
          </form>
        </details>

+        {/* Filter bar */}
+        <FilterBar params={params} onChange={updateParams} />
+
        {/* Document list */}
        {isLoading ? (
          <p>Loading…</p>
        ) : documents.length === 0 ? (
-          <p style={{ color: "#666" }}>No documents yet. Upload a PDF to get started.</p>
+          <p style={{ color: "#666" }}>
+            {total === 0 && !params.search && !params.status && !params.document_type
+              ? "No documents yet. Upload a PDF to get started."
+              : "No documents match the current filters."}
+          </p>
        ) : (
-          documents.map((doc) => (
-            <DocumentRow
-              key={doc.id}
-              doc={doc}
-              categories={categories}
-              onDelete={(id) => deleteMut.mutate(id)}
-            />
-          ))
+          <>
+            {documents.map((doc) => (
+              <DocumentRow
+                key={doc.id}
+                doc={doc}
+                categories={categories}
+                onDelete={(id) => deleteMut.mutate(id)}
+              />
+            ))}
+            {pages > 1 && (
+              <Pagination
+                page={params.page ?? 1}
+                pages={pages}
+                total={total}
+                perPage={params.per_page ?? 20}
+                onChange={(p) => updateParams({ page: p })}
+              />
+            )}
+          </>
        )}
      </div>
    </>