curo1305 5349f21752 feat: add storage-service container with pluggable backends (Phase 1)
New FastAPI microservice (port 8020) providing unified blob storage via
PUT/GET/DELETE/LIST HTTP API. Local filesystem backend is the default (zero
extra deps). S3-compatible and WebDAV backends are built in. Backend is
switchable at runtime via POST /migrate, which copies all objects to the new
backend, verifies each one, atomically switches, then cleans up the old backend.

WebDAV XML parsing uses defusedxml to prevent XXE attacks.

Wired into docker-compose (storage_data volume) and registered in the backend
service-health poller as 'storage-service'.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-20 15:50:31 +02:00

# storage-service — Claude context
Unified file/blob storage microservice, port 8020 (internal). All services must use this service's
HTTP API for any file persistence — no service may write to a Docker volume directly. See root
`CLAUDE.md` for architecture, Docker, and project-wide workflows.

---
## Architecture rule (enforced)
**No service may write to a filesystem path for persistent data.**
All file/blob storage must go through the storage-service HTTP API.
Violation is a security/architecture defect.

---
## File & Folder Tree
```
features/storage-service/
├── app/
│   ├── main.py                  ← FastAPI, lifespan (backend init)
│   ├── core/config.py           ← Settings (DATA_DIR, STORAGE_BACKEND, S3_*, WEBDAV_*)
│   ├── routers/
│   │   ├── health.py            ← GET /health
│   │   ├── objects.py           ← PUT/GET/DELETE /objects/{bucket}/{key:path}, GET /objects/{bucket}
│   │   └── migrate.py           ← POST /migrate, GET /migrate/status, DELETE /migrate, PATCH /backend-config
│   └── services/
│       ├── backend_manager.py   ← build_backend(), initialize_backend(), get_backend(), switch_backend()
│       ├── migration.py         ← run_migration(), get_status(), cancel(); KNOWN_BUCKETS
│       └── backends/
│           ├── base.py          ← AbstractStorageBackend (ABC)
│           ├── local.py         ← LocalFSBackend — /data/storage/{bucket}/{key}
│           ├── s3.py            ← S3Backend — aiobotocore, endpoint_url configurable
│           └── webdav.py        ← WebDAVBackend — aiohttp + WebDAV PROPFIND/PUT/GET/DELETE
├── scripts/
│   ├── start.sh                 ← prod start (uvicorn port 8020)
│   └── start_dev.sh             ← dev start (uvicorn --reload)
├── Dockerfile                   ← python:3.12-slim, non-root user 1001
└── STATUS.md
```
---
## HTTP API
### Objects
| Method | Path | Body | Response |
|--------|------|------|----------|
| PUT | `/objects/{bucket}/{key:path}` | Raw bytes | 204 |
| GET | `/objects/{bucket}/{key:path}` | — | 200 Raw bytes / 404 |
| DELETE | `/objects/{bucket}/{key:path}` | — | 204 |
| GET | `/objects/{bucket}` | — | `{"bucket": "...", "keys": [...]}` |

Keys may contain `/` (e.g. `user123/abc.pdf`). Path traversal (`..`) returns 400.
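A minimal client-side sketch of how a caller might build object URLs and reject traversal before sending a request. The base URL `http://storage-service:8020` and the helper names `validate_key` and `object_url` are assumptions for illustration; only the port and the route shape come from this file, and the service's own 400 check may differ in detail.

```python
from urllib.parse import quote

# Assumed internal base URL; only the port (8020) is documented here.
BASE_URL = "http://storage-service:8020"


def validate_key(key: str) -> str:
    """Reject empty keys and keys with a `..` path segment (hypothetical helper)."""
    if not key or ".." in key.split("/"):
        raise ValueError(f"invalid key: {key!r}")
    return key


def object_url(bucket: str, key: str) -> str:
    """Build the URL used for PUT/GET/DELETE on a single object."""
    # Keep `/` unescaped so nested keys such as user123/abc.pdf map onto {key:path}.
    safe_key = quote(validate_key(key), safe="/")
    return f"{BASE_URL}/objects/{quote(bucket, safe='')}/{safe_key}"
```

Validating on the client only saves a round trip; the server-side check remains the authority.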
### Migration
| Method | Path | Body | Response |
|--------|------|------|----------|
| POST | `/migrate` | `{"driver": "s3", "config": {...}}` | 202 / 400 / 409 |
| GET | `/migrate/status` | — | `{state, total, done, failed, errors[]}` |
| DELETE | `/migrate` | — | 204 / 409 |
| PATCH | `/backend-config` | `{"driver": "...", "config": {...}}` | 204 / 400 / 409 |

Migration states: `idle → validating → migrating → switching → cleaning → done` (or `failed`/`cancelled`)
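The state sequence above can be sketched as a small enum with a happy-path transition helper. This is an illustrative model, not the code in `migration.py`: the names `MigrationState`, `HAPPY_PATH`, `TERMINAL`, and `next_state` are hypothetical, and treating `failed` and `cancelled` as terminal is an assumption.

```python
from enum import Enum


class MigrationState(str, Enum):
    IDLE = "idle"
    VALIDATING = "validating"
    MIGRATING = "migrating"
    SWITCHING = "switching"
    CLEANING = "cleaning"
    DONE = "done"
    FAILED = "failed"
    CANCELLED = "cancelled"


# Order of states when nothing goes wrong, as listed in this file.
HAPPY_PATH = [
    MigrationState.IDLE,
    MigrationState.VALIDATING,
    MigrationState.MIGRATING,
    MigrationState.SWITCHING,
    MigrationState.CLEANING,
    MigrationState.DONE,
]

# Assumption: these states end a migration run.
TERMINAL = {MigrationState.DONE, MigrationState.FAILED, MigrationState.CANCELLED}


def next_state(current: MigrationState) -> MigrationState:
    """Return the following happy-path state; terminal states stay unchanged."""
    if current in TERMINAL:
        return current
    return HAPPY_PATH[HAPPY_PATH.index(current) + 1]
```

A poller of `GET /migrate/status` could compare the reported `state` against `TERMINAL` to decide when to stop polling.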
### Health
| Method | Path | Response |
|--------|------|----------|
| GET | `/health` | `{"status": "ok", "backend": "local"}` |
---
## Buckets
| Bucket | Contents | Key format |
|--------|----------|------------|
| `documents` | Uploaded PDFs | `{user_id}/{doc_id}.pdf` or `watch/{doc_id}.pdf` |
| `config` | JSON config files | `{service_name}_config.json` |

To add a new bucket: add it to `KNOWN_BUCKETS` in `services/migration.py` so it is included in migrations.

---
## Backend drivers
| Driver | Config fields | Notes |
|--------|---------------|-------|
| `local` | `data_dir` (optional) | Default. Files under `/data/storage/`. Zero external deps. |
| `s3` | `endpoint_url`, `access_key`, `secret_key`, `region` | Works with MinIO, AWS S3, Backblaze B2, Cloudflare R2. Set `endpoint_url=""` for real AWS. |
| `webdav` | `url`, `username`, `password`, `root_path` | Nextcloud: set `root_path` to `/remote.php/dav/files/{username}` |
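As a sketch of how the driver table maps onto a `POST /migrate` body, the helper below builds the JSON payload and rejects config fields the table does not list. The helper name `migrate_payload` and all values (endpoint, credentials) are placeholders assumed for illustration; which fields the service treats as required is not stated here, so the sketch only checks for unknown fields.

```python
import json

# Config fields per driver, copied from the table above.
DRIVER_FIELDS = {
    "local": {"data_dir"},
    "s3": {"endpoint_url", "access_key", "secret_key", "region"},
    "webdav": {"url", "username", "password", "root_path"},
}


def migrate_payload(driver: str, **config: str) -> str:
    """Return a JSON body for POST /migrate (hypothetical helper)."""
    if driver not in DRIVER_FIELDS:
        raise ValueError(f"unknown driver: {driver}")
    unknown = set(config) - DRIVER_FIELDS[driver]
    if unknown:
        raise ValueError(f"unexpected config fields for {driver}: {sorted(unknown)}")
    return json.dumps({"driver": driver, "config": config})


# Placeholder values only; real endpoints and credentials will differ.
s3_body = migrate_payload(
    "s3",
    endpoint_url="http://minio:9000",
    access_key="EXAMPLE_KEY",
    secret_key="EXAMPLE_SECRET",
    region="us-east-1",
)
```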
---
## Adding a new backend driver
1. Create `app/services/backends/your_driver.py` implementing `AbstractStorageBackend`
2. Add a branch in `build_backend()` in `backend_manager.py`
3. Add config fields to `app/core/config.py` if env-based config is needed
4. Document driver name + config fields in this file
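Steps 1 and 2 above might look roughly like the following, using an in-memory driver as a toy example. The real interface lives in `app/services/backends/base.py` and is not shown in this file, so the method names (`put`, `get`, `delete`, `list_keys`), the `memory` driver, and the `build_backend` signature are all assumptions.

```python
import asyncio
from abc import ABC, abstractmethod
from typing import Dict, List, Optional


class AbstractStorageBackend(ABC):
    """Stand-in for the real ABC; method names here are assumed."""

    @abstractmethod
    async def put(self, bucket: str, key: str, data: bytes) -> None: ...

    @abstractmethod
    async def get(self, bucket: str, key: str) -> Optional[bytes]: ...

    @abstractmethod
    async def delete(self, bucket: str, key: str) -> None: ...

    @abstractmethod
    async def list_keys(self, bucket: str) -> List[str]: ...


class MemoryBackend(AbstractStorageBackend):
    """Hypothetical driver that keeps objects in a dict (not persistent)."""

    def __init__(self) -> None:
        self._objects: Dict[str, Dict[str, bytes]] = {}

    async def put(self, bucket: str, key: str, data: bytes) -> None:
        self._objects.setdefault(bucket, {})[key] = data

    async def get(self, bucket: str, key: str) -> Optional[bytes]:
        return self._objects.get(bucket, {}).get(key)

    async def delete(self, bucket: str, key: str) -> None:
        # Deleting a missing key is a no-op in this sketch.
        self._objects.get(bucket, {}).pop(key, None)

    async def list_keys(self, bucket: str) -> List[str]:
        return sorted(self._objects.get(bucket, {}))


def build_backend(driver: str, config: dict) -> AbstractStorageBackend:
    """Sketch of the extra branch from step 2; existing drivers omitted."""
    if driver == "memory":
        return MemoryBackend()
    raise ValueError(f"unknown driver: {driver}")
```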
---
## Default Values & Limits
| Parameter | Value | Location |
|-----------|-------|----------|
| Default backend | `local` | `STORAGE_BACKEND` env var |
| Local data dir | `/data/storage` | `DATA_DIR` env var |
| S3 region default | `us-east-1` | `S3_REGION` env var |
| Migration error cap in response | 50 | `migration.py` |
| Port | 8020 | `scripts/start.sh` |
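The env-based defaults in the table can be sketched with the standard library alone. The actual `Settings` class in `app/core/config.py` is not shown here and may be built differently; the `load_settings` helper and the field names below are assumptions, while the env var names and default values are taken from the table.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    storage_backend: str
    data_dir: str
    s3_region: str


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Read settings from the environment, falling back to documented defaults."""
    env = os.environ if env is None else env
    return Settings(
        storage_backend=env.get("STORAGE_BACKEND", "local"),
        data_dir=env.get("DATA_DIR", "/data/storage"),
        s3_region=env.get("S3_REGION", "us-east-1"),
    )
```

Passing an explicit mapping, as in `load_settings({"STORAGE_BACKEND": "s3"})`, keeps the sketch easy to exercise without touching the process environment.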