| phase | plan | type | wave | depends_on | files_modified | autonomous | requirements | user_setup | tags | must_haves |
|---|---|---|---|---|---|---|---|---|---|---|
| 01-infrastructure-foundation | 01 | execute | 1 | | | true | | | | |
Purpose: This is the foundation layer of the walking skeleton. Until docker compose up boots all five services cleanly and Settings() can read every Phase 1 variable, no subsequent plan can run.
Output: A new docker-compose.yml, a new docker/postgres/initdb.d/01-init-users.sql, an extended .env.example, an updated backend/requirements.txt, and a rewritten backend/config.py.
<execution_context> @$HOME/.claude/get-shit-done/workflows/execute-plan.md @$HOME/.claude/get-shit-done/templates/summary.md </execution_context>
@CLAUDE.md @.planning/PROJECT.md @.planning/ROADMAP.md @.planning/STATE.md @.planning/phases/01-infrastructure-foundation/01-CONTEXT.md @.planning/phases/01-infrastructure-foundation/01-RESEARCH.md @.planning/phases/01-infrastructure-foundation/01-PATTERNS.md @.planning/phases/01-infrastructure-foundation/SKELETON.md
Key existing structures the executor must preserve:
From the current docker-compose.yml: only the backend and frontend services exist today; the backend volumes include ./backend/data:/app/data, which MUST be removed per D-04. The Compose file has no top-level volumes: block yet.
From the current backend/config.py: module-level constants DATA_DIR, UPLOADS_DIR, METADATA_DIR, TOPICS_FILE, SETTINGS_FILE, DEFAULT_SYSTEM_PROMPT, DEFAULT_SETTINGS, and a function ensure_data_dirs(). DEFAULT_SYSTEM_PROMPT and DEFAULT_SETTINGS must be preserved verbatim because services/storage.py, services/classifier.py, and api/settings.py still consume them; DATA_DIR/UPLOADS_DIR/METADATA_DIR/TOPICS_FILE/SETTINGS_FILE/ensure_data_dirs will be removed by Plan 05 once services/storage.py is rewritten — leave them in place for now to keep the rest of the app booting between waves.
From backend/requirements.txt: pydantic-settings>=2.2 is already declared (line 4) — no new install needed for that package.
Env var canon (sourced from RESEARCH.md Code Examples lines 914-937 and PATTERNS.md .env.example section):
- `DATABASE_URL`: `postgresql+psycopg://docuvault_app:<pw>@postgres:5432/docuvault`
- `DATABASE_MIGRATE_URL`: `postgresql+psycopg://docuvault_migrate:<pw>@postgres:5432/docuvault`
- `POSTGRES_PASSWORD`: superuser password for the init container (not used by app)
- `MINIO_ROOT_USER` / `MINIO_ROOT_PASSWORD`: MinIO root, init-time only
- `MINIO_ENDPOINT`: `minio:9000`
- `MINIO_ACCESS_KEY` / `MINIO_SECRET_KEY`: app-level access key pair
- `MINIO_BUCKET`: value `docuvault`
- `REDIS_PASSWORD`: used by Redis `--requirepass` and inside `REDIS_URL`
- `REDIS_URL`: `redis://:<pw>@redis:6379/0`
- `SECRET_KEY`: Phase 2 JWT/HKDF placeholder; documented now, not read by Phase 1 code paths
1. **PostgreSQL section**: `DATABASE_URL=postgresql+psycopg://docuvault_app:changeme_app@postgres:5432/docuvault` (with explanatory comment "App user — SELECT/INSERT/UPDATE/DELETE only, used by FastAPI + Celery"), `DATABASE_MIGRATE_URL=postgresql+psycopg://docuvault_migrate:changeme_migrate@postgres:5432/docuvault` (with comment "Migration user — DDL privileges, used ONLY by Alembic, never by the app at runtime"), `POSTGRES_PASSWORD=changeme_super` (with comment "Superuser password for the postgres init container — used only by initdb.d scripts").
2. **MinIO section**: `MINIO_ROOT_USER=minioadmin`, `MINIO_ROOT_PASSWORD=changeme_minio_root`, `MINIO_ENDPOINT=minio:9000`, `MINIO_ACCESS_KEY=docuvault_app` (comment: "App-level access key — minimal permissions on docuvault bucket only"), `MINIO_SECRET_KEY=changeme_minio_app`, `MINIO_BUCKET=docuvault`.
3. **Redis section**: `REDIS_PASSWORD=changeme_redis`, `REDIS_URL=redis://:changeme_redis@redis:6379/0` (comment noting it must match `REDIS_PASSWORD` and that the leading `:` is the no-username form for `requirepass`).
4. **Security (Phase 2) section**: `SECRET_KEY=CHANGEME-replace-with-64-char-random-hex` (comment: "Not read by the app in Phase 1 — documented here for Phase 2 JWT + HKDF use").

Every secret uses a `changeme_*`-style placeholder so it is obvious which fields require replacement. The DATABASE_URL password (`changeme_app`) and the DATABASE_MIGRATE_URL password (`changeme_migrate`) MUST match the hardcoded passwords in `docker/postgres/initdb.d/01-init-users.sql` from Task 1; re-read that file to confirm. End the file with a final newline.
grep -E '^(DATABASE_URL|DATABASE_MIGRATE_URL|POSTGRES_PASSWORD|MINIO_ROOT_USER|MINIO_ROOT_PASSWORD|MINIO_ENDPOINT|MINIO_ACCESS_KEY|MINIO_SECRET_KEY|MINIO_BUCKET|REDIS_PASSWORD|REDIS_URL|SECRET_KEY|ANTHROPIC_API_KEY|OPENAI_API_KEY)=' /Users/nik/Documents/Progamming/document_scanner/.env.example | sort -u | wc -l | awk '{exit ($1 == 14) ? 0 : 1}'
- `.env.example` defines exactly 14 named variables: `ANTHROPIC_API_KEY`, `OPENAI_API_KEY`, `DATABASE_URL`, `DATABASE_MIGRATE_URL`, `POSTGRES_PASSWORD`, `MINIO_ROOT_USER`, `MINIO_ROOT_PASSWORD`, `MINIO_ENDPOINT`, `MINIO_ACCESS_KEY`, `MINIO_SECRET_KEY`, `MINIO_BUCKET`, `REDIS_PASSWORD`, `REDIS_URL`, `SECRET_KEY` (verifiable via the Verify command)
- `DATABASE_URL` value starts with `postgresql+psycopg://docuvault_app:` (verifiable via `grep -c "^DATABASE_URL=postgresql+psycopg://docuvault_app:" .env.example` >= 1)
- `DATABASE_MIGRATE_URL` value starts with `postgresql+psycopg://docuvault_migrate:`
- `MINIO_BUCKET=docuvault` exactly (D-06)
- `REDIS_URL` value matches the form `redis://:<pw>@redis:6379/0` (verifiable via `grep -E "^REDIS_URL=redis://:[^@]+@redis:6379/0$" .env.example` exits 0)
- `MINIO_ENDPOINT=minio:9000` exactly
- Section headers are present: `grep -c "── PostgreSQL ──" .env.example` >= 1, similarly for MinIO, Redis, Security
- The password embedded in `DATABASE_URL` matches the password literal used in `docker/postgres/initdb.d/01-init-users.sql` for the `docuvault_app` user (re-read both files and confirm string equality of the password substring)
- The password embedded in `DATABASE_MIGRATE_URL` matches the password literal used in `docker/postgres/initdb.d/01-init-users.sql` for the `docuvault_migrate` user
- The password embedded in `REDIS_URL` matches `REDIS_PASSWORD` (verifiable: extract value of REDIS_PASSWORD and grep `redis://:${value}@` in REDIS_URL line)
`.env.example` documents every Phase 1 environment variable with safe placeholder values, grouped and commented by service; the DB and Redis password placeholders are consistent between `.env.example` and the Postgres init script.
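The password-consistency criteria above can also be checked mechanically. The following is a minimal sketch, not part of the plan: `parse_env` and `check_env_example` are hypothetical helper names, and the init script is assumed to declare passwords in the form `CREATE USER <name> WITH PASSWORD '<pw>'`, which may differ from the actual SQL written in Task 1.

```python
import re
from urllib.parse import urlsplit


def parse_env(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    env = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        env[key.strip()] = value.strip()
    return env


def check_env_example(env_text: str, init_sql: str) -> list[str]:
    """Return human-readable problems; an empty list means consistent."""
    env = parse_env(env_text)
    problems = []

    # REDIS_URL uses the no-username form, so the password follows the leading ':'.
    redis = urlsplit(env.get("REDIS_URL", ""))
    if redis.password is None or redis.password != env.get("REDIS_PASSWORD"):
        problems.append("REDIS_URL password differs from REDIS_PASSWORD")

    # ASSUMPTION: passwords appear as CREATE USER <name> WITH PASSWORD '<pw>'.
    sql_passwords = dict(
        re.findall(
            r"CREATE\s+USER\s+(\w+)\s+WITH\s+PASSWORD\s+'([^']*)'",
            init_sql,
            re.IGNORECASE,
        )
    )
    for var in ("DATABASE_URL", "DATABASE_MIGRATE_URL"):
        dsn = urlsplit(env.get(var, ""))
        if dsn.password is None or sql_passwords.get(dsn.username) != dsn.password:
            problems.append(f"{var} password differs from the init script")
    return problems
```

Calling `check_env_example(open(".env.example").read(), open("docker/postgres/initdb.d/01-init-users.sql").read())` from the project root would then return an empty list when the placeholders agree.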
Task 3: Replace backend/config.py with Pydantic Settings + extend backend/requirements.txt
backend/config.py, backend/requirements.txt
- backend/config.py (current — preserve `DEFAULT_SYSTEM_PROMPT`, `DEFAULT_SETTINGS`, and `ensure_data_dirs()` verbatim because they are still consumed by `services/storage.py`, `services/classifier.py`, `api/settings.py` until Plan 05; rewrite the rest as a Pydantic `Settings` class)
- backend/requirements.txt (current — `pydantic-settings>=2.2` and `httpx>=0.27` and `pytest-asyncio>=0.23` already present; add new deps, remove `filelock`)
- .planning/phases/01-infrastructure-foundation/01-RESEARCH.md (Standard Stack section: exact pinned versions; Config Extension code example lines 914-937)
- .planning/phases/01-infrastructure-foundation/01-PATTERNS.md (backend/config.py section — module-level Settings instance is the established consumer interface)
- .env.example (Task 2 output; each Settings field name must be the lower-cased form of the matching variable name)
Rewrite `backend/config.py` so it imports `from pydantic_settings import BaseSettings, SettingsConfigDict` and defines a `Settings(BaseSettings)` class with the fields below. The listed defaults apply only when the corresponding env var is unset, which keeps tests workable:

- `data_dir: str = "/app/data"`
- `database_url: str = "postgresql+psycopg://docuvault_app:changeme_app@postgres:5432/docuvault"`
- `database_migrate_url: str = "postgresql+psycopg://docuvault_migrate:changeme_migrate@postgres:5432/docuvault"`
- `minio_endpoint: str = "minio:9000"`
- `minio_access_key: str = "docuvault_app"`
- `minio_secret_key: str = "changeme_minio_app"`
- `minio_bucket: str = "docuvault"`
- `redis_url: str = "redis://:changeme_redis@redis:6379/0"`
- `secret_key: str = "CHANGEME"`

Inside the class, add `model_config = SettingsConfigDict(env_file=".env", env_file_encoding="utf-8", extra="ignore")` (Pydantic Settings v2 API; `class Config:` is deprecated). After the class, instantiate `settings = Settings()` at module level so all callers can use `from config import settings`.

**Preserve and keep at module level for backward compatibility through Wave 4:** the existing `DATA_DIR = Path(...)`, `UPLOADS_DIR`, `METADATA_DIR`, `TOPICS_FILE`, `SETTINGS_FILE` constants; the `DEFAULT_SYSTEM_PROMPT` string; the `DEFAULT_SETTINGS` dict; and the `ensure_data_dirs()` function. Plan 05 deletes these once `services/storage.py` is rewritten. The rewrite is additive in this plan: the existing app must still boot after this change.

Then update `backend/requirements.txt`:

- REMOVE the line `filelock>=3.14` (RESEARCH.md "State of the Art": replaced by PostgreSQL transactions).
- APPEND these new lines: `sqlalchemy[asyncio]>=2.0.49`, `psycopg[binary]>=3.3.4`, `alembic>=1.18.4`, `minio>=7.2.20`, `celery[redis]>=5.6.3`, `redis>=7.4.0`, `aiosqlite>=0.20.0` (needed by Plan 02's in-memory test engine).
- BUMP `pytest-asyncio` to `>=1.3.0` (the existing `>=0.23` no longer supports `asyncio_mode = auto` reliably with current pytest).
- Keep all other existing lines.
cd /Users/nik/Documents/Progamming/document_scanner/backend && python3 -c "import ast; tree = ast.parse(open('config.py').read()); names = {node.name for node in ast.walk(tree) if isinstance(node, ast.ClassDef)}; assert 'Settings' in names, 'Settings class missing'; print('settings-class-ok')"
- `backend/config.py` contains `class Settings(BaseSettings):` (verifiable via `grep -c "class Settings(BaseSettings)" backend/config.py` >= 1)
- `backend/config.py` contains the literal `settings = Settings()` at module level
- `backend/config.py` contains every field name `database_url`, `database_migrate_url`, `minio_endpoint`, `minio_access_key`, `minio_secret_key`, `minio_bucket`, `redis_url`, `secret_key` (each verifiable via grep)
- `backend/config.py` uses `SettingsConfigDict(env_file=".env"` (not the deprecated `class Config:` form) — verifiable via `grep -c "SettingsConfigDict" backend/config.py` >= 1
- `backend/config.py` still exports the `DEFAULT_SYSTEM_PROMPT` and `DEFAULT_SETTINGS` constants and the `ensure_data_dirs` function (verifiable: `grep -c "DEFAULT_SYSTEM_PROMPT\|DEFAULT_SETTINGS\|def ensure_data_dirs" backend/config.py` >= 3)
- `python3 -c "import sys; sys.path.insert(0, 'backend'); from config import settings; assert settings.minio_bucket == 'docuvault'; assert settings.database_url.startswith('postgresql+psycopg://'); print('config-import-ok')"` exits 0 (executed from the project root)
- `backend/requirements.txt` no longer contains `filelock` (verifiable: `grep -c "^filelock" backend/requirements.txt | grep -q "^0$"`)
- `backend/requirements.txt` contains each of: `sqlalchemy[asyncio]>=2.0`, `psycopg[binary]>=3.3`, `alembic>=1.18`, `minio>=7.2`, `celery[redis]>=5.6`, `redis>=7.4`, `aiosqlite>=0.20`, `pytest-asyncio>=1.3` (each verifiable via `grep -F` on the line prefix)
- Existing lines preserved: `fastapi>=0.111`, `uvicorn[standard]>=0.29`, `python-multipart`, `pydantic-settings>=2.2`, `anthropic>=0.26`, `openai>=1.30`, `PyMuPDF>=1.24`, `python-docx>=1.1`, `pytesseract>=0.3`, `Pillow>=10.3`, `aiofiles>=23.2`, `httpx>=0.27`, `pytest>=8.2` (each verifiable via `grep -F`)
`config.settings` is a Pydantic Settings instance reading every Phase 1 env var; the legacy data-dir constants remain available for the still-running flat-file code path; `requirements.txt` declares the full Phase 1 dependency set and removes `filelock`.
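The `requirements.txt` criteria above are prefix checks, so they can be bundled into one function instead of a series of `grep` calls. This is a hedged sketch: `check_requirements` is a hypothetical helper, and the prefixes are copied from the acceptance criteria rather than from the file itself.

```python
# Prefixes taken from the acceptance criteria above.
REMOVED = ("filelock",)
REQUIRED = (
    "sqlalchemy[asyncio]>=2.0",
    "psycopg[binary]>=3.3",
    "alembic>=1.18",
    "minio>=7.2",
    "celery[redis]>=5.6",
    "redis>=7.4",
    "aiosqlite>=0.20",
    "pytest-asyncio>=1.3",
)


def check_requirements(text: str) -> list[str]:
    """Return problems found in requirements.txt content; empty means OK."""
    # Ignore blank lines and comments, as pip does.
    lines = [
        ln.strip()
        for ln in text.splitlines()
        if ln.strip() and not ln.lstrip().startswith("#")
    ]
    problems = [
        f"still present: {name}"
        for name in REMOVED
        if any(ln.startswith(name) for ln in lines)
    ]
    problems += [
        f"missing: {prefix}"
        for prefix in REQUIRED
        if not any(ln.startswith(prefix) for ln in lines)
    ]
    return problems
```

Running it on the contents of `backend/requirements.txt` after this task should yield an empty list; the preserved lines listed above could be added to `REQUIRED` in the same way.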
<threat_model>
Trust Boundaries
| Boundary | Description |
|---|---|
| Docker host → containers | Compose orchestrates services; environment variables flow from .env or /etc/docuvault/env into containers |
| Backend container → PostgreSQL | App connects with restricted docuvault_app role; Alembic connects with privileged docuvault_migrate role |
| Backend container → MinIO | App connects with app-level access key (separate from MinIO root credentials) |
| Backend container → Redis | App + Celery worker connect with requirepass-protected URL |
STRIDE Threat Register
| Threat ID | Category | Component | Disposition | Mitigation Plan |
|---|---|---|---|---|
| T-01-01-01 | Elevation of Privilege | PostgreSQL connection from app | mitigate | Two-DSN pattern (D-13): DATABASE_URL uses docuvault_app (DML only, no DDL); DATABASE_MIGRATE_URL uses docuvault_migrate (DDL only, used by Alembic only). Init script in Task 1 hard-codes both grants; Plan 03 issues ALTER DEFAULT PRIVILEGES inside the migration. |
| T-01-01-02 | Elevation of Privilege | MinIO root credentials | mitigate | MINIO_ROOT_USER / MINIO_ROOT_PASSWORD used only for MinIO container init; app uses separate MINIO_ACCESS_KEY / MINIO_SECRET_KEY (D-15); the app never connects with root credentials. |
| T-01-01-03 | Information Disclosure | Redis unauthenticated access on Docker network | mitigate | Redis runs with --requirepass ${REDIS_PASSWORD} (Pattern 6); both app and worker connect via REDIS_URL containing the password; healthcheck passes -a $REDIS_PASSWORD per Pitfall 5. |
| T-01-01-04 | Information Disclosure | Secret leakage via committed .env file | mitigate | .env added to .gitignore in Task 1; only .env.example with placeholder changeme_* values is committed (D-11); production secrets stored outside the project at /etc/docuvault/env chmod 600 (D-12, documented in SKELETON.md). |
| T-01-01-05 | Tampering | Compose service starts before its dependencies are ready | mitigate | depends_on: condition: service_healthy on backend and celery-worker for postgres + minio + redis (Pattern 6); replaces race-prone sleep patterns. |
| T-01-01-SC | Tampering | npm/pip supply chain on dependency install | mitigate | RESEARCH.md Package Legitimacy Audit verified all 6 new packages on PyPI via pip3 index versions + slopcheck OK; no [ASSUMED] or [SUS] packages — no checkpoint required this plan. |
</threat_model>
<success_criteria>
- `docker-compose.yml`, `docker/postgres/initdb.d/01-init-users.sql`, `.env.example`, `backend/config.py`, and `backend/requirements.txt` are all updated according to the acceptance criteria above.
- `docker compose --env-file .env.example config -q` exits 0.
- `from config import settings` works in Python without raising; `settings.minio_bucket == "docuvault"`.
- `.env` is gitignored and `.env.example` documents every env var referenced by `docker-compose.yml`.
- No existing flat-file constants are deleted yet; the app must still boot after this plan in isolation (Plan 05 completes the cutover).
</success_criteria>
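The criterion that `.env.example` documents every env var referenced by `docker-compose.yml` can be approximated with a text scan. The sketch below is an assumption-laden illustration: `undocumented_vars` is a hypothetical helper, it only recognises `$NAME` and `${NAME...}` interpolation in the raw Compose text, it skips `$$`-escaped references, and it does not follow `env_file:` entries.

```python
import re

# Matches $NAME and ${NAME}, ${NAME:-default}, ${NAME?err}; skips $$-escaped forms.
_VAR = re.compile(r"(?<!\$)\$(?:\{(\w+)[^}]*\}|(\w+))")


def undocumented_vars(compose_text: str, env_text: str) -> set[str]:
    """Return variables interpolated in the Compose text but absent from the env text."""
    referenced = {m.group(1) or m.group(2) for m in _VAR.finditer(compose_text)}
    documented = {
        line.split("=", 1)[0].strip()
        for line in env_text.splitlines()
        if "=" in line and not line.lstrip().startswith("#")
    }
    return referenced - documented
```

An empty result from `undocumented_vars(open("docker-compose.yml").read(), open(".env.example").read())` would be consistent with that criterion, though `docker compose --env-file .env.example config -q` remains the check the plan itself relies on.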