docs(06): capture phase context — performance & production hardening
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@@ -0,0 +1,176 @@

# Phase 6: Performance & Production Hardening - Discussion Log

> **Audit trail only.** Do not use as input to planning, research, or execution agents.
> Decisions are captured in CONTEXT.md — this log preserves the alternatives considered.

**Date:** 2026-05-30
**Phase:** 6-performance-production-hardening
**Areas discussed:** Observability stack, Load testing & SLA targets, Container hardening depth, Rate limit header bypass prevention

---

## Observability Stack

### Structured Logging Library

| Option | Description | Selected |
|--------|-------------|----------|
| structlog | Purpose-built for structured logging; processor pipeline makes correlation IDs trivial; plays well with FastAPI middleware | ✓ |
| Standard logging + python-json-logger | Minimal change — configure stdlib root logger with a JSON formatter. Less powerful, but the only new dependency is the small python-json-logger package | |
| loguru | Simple API, good defaults, supports structured output via sink config | |

**User's choice:** structlog
**Notes:** No follow-up notes.

---
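
The structured-logging row above credits a processor pipeline with making correlation IDs trivial. As a rough illustration of that idea only, the sketch below attaches a per-request correlation ID to JSON log lines using the standard library (`contextvars` plus a custom `logging.Formatter`). It is a stand-in, not structlog and not the project's configuration; `correlation_id`, `JsonFormatter`, `bind_correlation_id`, `build_logger`, and the logger name `app.sketch` are all hypothetical names.

```python
import contextvars
import io
import json
import logging
import uuid

# Hypothetical context variable holding the current request's correlation ID.
correlation_id = contextvars.ContextVar("correlation_id", default=None)


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object that carries the correlation ID."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "level": record.levelname.lower(),
            "logger": record.name,
            "event": record.getMessage(),
            "correlation_id": correlation_id.get(),
        }
        return json.dumps(payload)


def bind_correlation_id(incoming=None) -> str:
    """Reuse an inbound ID when one is supplied, otherwise generate a new one."""
    value = incoming or uuid.uuid4().hex
    correlation_id.set(value)
    return value


def build_logger(stream) -> logging.Logger:
    """Create a logger that writes JSON lines to the given stream."""
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger = logging.getLogger("app.sketch")
    logger.handlers = [handler]
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger


# Usage: a middleware would call bind_correlation_id() once per request.
buffer = io.StringIO()
log = build_logger(buffer)
bind_correlation_id("req-123")
log.info("document listed")
print(buffer.getvalue().strip())
```

Because the ID lives in a context variable, every log call made while handling the same request picks it up without passing it through function arguments, which is what makes the lines easy to correlate later in a log viewer.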

### Log Aggregation

| Option | Description | Selected |
|--------|-------------|----------|
| Loki + Grafana in docker-compose | Matches success criteria literally. Adds 2 services; queries logs via Grafana UI at localhost | ✓ |
| stdout JSON only, no aggregation service | Simpler — just emit JSON to stdout, rely on `docker compose logs` | |
| Promtail + Loki + Grafana full stack | Full Grafana stack with Promtail log shipper. More production-realistic but heavier | |

**User's choice:** Loki + Grafana in docker-compose
**Notes:** No follow-up notes.

---

### Distributed Tracing

| Option | Description | Selected |
|--------|-------------|----------|
| Skip for now — correlation IDs in logs are enough | Simpler; stays in scope for v1 | ✓ |
| OpenTelemetry with Tempo (add to Grafana stack) | More complete observability but heavier setup | |
| OpenTelemetry spans to stdout only (no backend) | Lightweight but not queryable | |

**User's choice:** Skip — correlation IDs in logs are enough
**Notes:** No follow-up notes.

---

## Load Testing & SLA Targets

### Load Testing Tool

| Option | Description | Selected |
|--------|-------------|----------|
| Locust | Python-native, fits the existing stack. Test scenarios reuse auth helpers. Lives in backend/load_tests/ | ✓ |
| k6 | JavaScript-based, excellent HTML reports. Separate language from the rest of the stack | |
| pytest-benchmark + httpx | Minimal setup, reuses existing test infrastructure. Not realistic for concurrent load | |

**User's choice:** Locust
**Notes:** No follow-up notes.

---

### Latency Targets

| Option | Description | Selected |
|--------|-------------|----------|
| Strict: p95 < 200ms, p99 < 500ms | Reasonable for a local Docker stack. Clear pass/fail criteria | ✓ |
| Relaxed: p95 < 500ms, p99 < 1s | More lenient — appropriate if cloud backend latency is included in scope | |
| You decide based on profiling | Run a baseline first, then set targets at 2x observed p95 | |

**User's choice:** Strict — p95 < 200ms, p99 < 500ms
**Notes:** No follow-up notes.

---
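
To make the strict latency targets above concrete as a pass/fail check, here is a minimal sketch that computes p95 and p99 from a list of response times and compares them with the 200 ms and 500 ms thresholds. It is a standalone post-processing example and not part of Locust; the function names and the sample data are hypothetical, and the inclusive interpolation method is one reasonable choice among several.

```python
import statistics

P95_TARGET_MS = 200.0  # strict target from the decision above
P99_TARGET_MS = 500.0


def percentile(samples_ms, pct):
    """Return the pct-th percentile (1..99) of the samples, in milliseconds."""
    # quantiles(n=100) yields 99 cut points; index pct - 1 is the pct-th one.
    cuts = statistics.quantiles(samples_ms, n=100, method="inclusive")
    return cuts[pct - 1]


def check_latency(samples_ms):
    """Summarise p95/p99 and report whether both are under their targets."""
    p95 = percentile(samples_ms, 95)
    p99 = percentile(samples_ms, 99)
    return {
        "p95_ms": p95,
        "p99_ms": p99,
        "passed": p95 < P95_TARGET_MS and p99 < P99_TARGET_MS,
    }


# Hypothetical samples: one run with a small slow tail, one with a heavy one.
fast_run = [50.0] * 98 + [150.0, 400.0]
slow_run = [50.0] * 90 + [300.0] * 10

print(check_latency(fast_run)["passed"])
print(check_latency(slow_run)["passed"])
```

A check like this could gate a load-test run on the agreed numbers rather than on a visual read of a report, which is the "clear pass/fail criteria" the selected option calls for.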

### Load Test Endpoint Scope

| Option | Description | Selected |
|--------|-------------|----------|
| Auth + document list + document get + upload | Covers the critical read/write path. Excludes cloud backends | ✓ |
| Auth only | Focus on rate limiting under load. Misses the storage I/O path | |
| All endpoints including cloud proxy | Comprehensive but cloud latency makes p95 targets meaningless | |

**User's choice:** Auth + document list/get/upload (no cloud backends)
**Notes:** No follow-up notes.

---

## Container Hardening Depth

### Non-root User Setup

| Option | Description | Selected |
|--------|-------------|----------|
| Create appuser (uid 1000), chown /app, switch USER | Standard pattern. Works with read-only rootfs | |
| Multi-stage build: builder as root, runtime as appuser | Cleaner security boundary. pip install in builder, copy only packages to runtime. Reduces attack surface | ✓ |
| Distroless base image | Minimal image with no shell. Breaks pytesseract (needs system deps) | |

**User's choice:** Multi-stage build with appuser
**Notes:** No follow-up notes.

---

### Read-only Filesystem

| Option | Description | Selected |
|--------|-------------|----------|
| tmpfs for /tmp + named volume for /app/data in docker-compose | `read_only: true` + tmpfs for temp files + named volume for data. Correct pattern | ✓ |
| tmpfs for /tmp only, data paths via env var | Simpler but less strict | |
| Skip read-only filesystem for Celery worker | Read-only only on FastAPI service; worker stays writable | |

**User's choice:** tmpfs for /tmp + named volume for /app/data (full read-only rootfs on both services)
**Notes:** No follow-up notes.

---

### Linux Capability Dropping

| Option | Description | Selected |
|--------|-------------|----------|
| drop ALL capabilities, no cap_add | `cap_drop: [ALL]` with no cap_add. Port 8000 needs no capabilities | ✓ |
| drop ALL, add back CAP_NET_BIND_SERVICE | Only needed if binding to port 80/443 — unnecessary for port 8000 | |
| drop only dangerous caps (SYS_ADMIN, SYS_PTRACE, NET_RAW) | Less strict than CLAUDE.md mandate | |

**User's choice:** drop ALL, no cap_add
**Notes:** No follow-up notes.

---

## Rate Limit Header Bypass Prevention

### IP Extraction Strategy

| Option | Description | Selected |
|--------|-------------|----------|
| Custom key_func: trust X-Forwarded-For only from known proxy IPs | Replace get_remote_address with trusted-proxy check. Prevents header spoofing from external clients | ✓ |
| Never trust forwarded headers — always use request.client.host | Simplest and most secure for Docker Compose. Breaks if a proxy is added later | |
| Redis-backed rate limiter with per-account AND per-IP limits | More resilient for horizontal scaling but adds Redis dependency | |

**User's choice:** Custom key_func with trusted-proxy CIDR check
**Notes:** No follow-up notes.

---
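
The trusted-proxy CIDR check chosen above can be sketched with the standard `ipaddress` module. This is an illustration under assumptions, not the project's actual `key_func`: the CIDR `172.18.0.0/16` is a placeholder for whatever network the real proxy sits on, and `is_trusted` and `client_ip` are hypothetical helper names. The function takes the direct peer address and the raw `X-Forwarded-For` value so that it stays independent of any web framework.

```python
import ipaddress

# Placeholder range; the real trusted-proxy CIDRs depend on the deployment.
TRUSTED_PROXIES = [ipaddress.ip_network("172.18.0.0/16")]


def is_trusted(addr, trusted=TRUSTED_PROXIES):
    """Return True if addr parses as an IP inside one of the trusted networks."""
    try:
        ip = ipaddress.ip_address(addr)
    except ValueError:
        return False
    return any(ip in net for net in trusted)


def client_ip(peer, forwarded_for, trusted=TRUSTED_PROXIES):
    """Choose the IP to rate-limit on.

    The forwarded header is honoured only when the direct peer is a trusted
    proxy; otherwise the header could be spoofed, so the peer address is used.
    """
    if not forwarded_for or not is_trusted(peer, trusted):
        return peer
    hops = [h.strip() for h in forwarded_for.split(",") if h.strip()]
    # Walk right to left: the rightmost entries were appended by our proxies,
    # so the first untrusted hop from the right is the real client.
    for hop in reversed(hops):
        try:
            ipaddress.ip_address(hop)
        except ValueError:
            return peer  # malformed header: fall back to the peer address
        if not is_trusted(hop, trusted):
            return hop
    return peer


print(client_ip("203.0.113.9", "1.2.3.4"))
print(client_ip("172.18.0.5", "6.6.6.6, 198.51.100.7"))
```

In a limiter that accepts a `key_func`, a thin wrapper would presumably pass `request.client.host` as `peer` and the request's `X-Forwarded-For` header as `forwarded_for`. Reading the chain from the right rather than taking the leftmost entry matters because a client can prepend arbitrary values to the header before it reaches the proxy.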

### Per-Account Rate Limiting

| Option | Description | Selected |
|--------|-------------|----------|
| Yes — add per-account limits on authenticated endpoints | Second limiter keyed by user_id on document/cloud endpoints (100 req/min per user) | ✓ |
| No — per-IP is sufficient for now | Document endpoints don't need additional per-user limits | |
| Per-account on auth endpoints only | Match Phase 2 intent exactly | |

**User's choice:** Yes — per-account limits on authenticated document/cloud endpoints
**Notes:** No follow-up notes.

---
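
As a rough picture of what a second limiter keyed by `user_id` at 100 requests per minute means, the sketch below implements an in-memory sliding-window counter. It is an assumption-laden illustration, not the limiter library's mechanism: `PerAccountLimiter` and its parameters are hypothetical, and the state lives in one process only, which is consistent with Redis-backed counters being listed as a deferred idea.

```python
import time
from collections import defaultdict, deque


class PerAccountLimiter:
    """Sliding-window limiter: at most `limit` hits per `window_s` per user."""

    def __init__(self, limit=100, window_s=60.0, clock=time.monotonic):
        self.limit = limit
        self.window_s = window_s
        self.clock = clock  # injectable so tests can control time
        self._hits = defaultdict(deque)  # user_id -> timestamps of recent hits

    def allow(self, user_id):
        """Record a hit and return True, or return False if over the limit."""
        now = self.clock()
        hits = self._hits[user_id]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window_s:
            hits.popleft()
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True


# Usage with a fake clock and a small limit so the behaviour is easy to see.
fake_now = [0.0]
limiter = PerAccountLimiter(limit=3, window_s=60.0, clock=lambda: fake_now[0])
print([limiter.allow("user-1") for _ in range(4)])
```

Keying on the account rather than the IP means one user behind many addresses still shares a single budget, while different users behind one NAT address do not starve each other.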

## Claude's Discretion

- Exact structlog processor chain configuration
- Loki Docker Compose service version and loki-config.yaml — use official Grafana example as base
- Promtail vs. Docker log driver for shipping to Loki
- Locust user class structure and task weight distribution
- Grafana dashboard panel layout (basic request rate + latency + error rate panels)

## Deferred Ideas

- HTTPS/TLS termination (nginx + Let's Encrypt or Caddy) — out of scope; RUNBOOK.md documents how to add
- Horizontal scaling + Redis-backed rate limit counters — Phase 7+ concern
- GitHub Actions CI/CD pipeline for automated load tests and docker scout on every PR
- Automated backup cron job as a Docker service — RUNBOOK.md documents manual procedure