6fed5ba531
Research, pattern mapping, and verification complete. Walking Skeleton mode active (MVP Phase 1). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
| phase | plan | type | wave | depends_on | files_modified | autonomous | requirements | user_setup | tags | must_haves |
|---|---|---|---|---|---|---|---|---|---|---|
| 01-infrastructure-foundation | 02 | execute | 1 | | | true | | | | |
Purpose: Wave 0 fills the validation gaps catalogued in 01-VALIDATION.md Section "Wave 0 Gaps" so that every later task has a meaningful <automated> verify command. Without this plan, later tasks would have no automated test target and would silently regress to "smoke check by hand."
Output: Five test files (one new, four updated) plus a refreshed conftest.py with async SQLAlchemy fixtures.
<execution_context> @$HOME/.claude/get-shit-done/workflows/execute-plan.md @$HOME/.claude/get-shit-done/templates/summary.md </execution_context>
@CLAUDE.md @.planning/PROJECT.md @.planning/ROADMAP.md @.planning/STATE.md @.planning/phases/01-infrastructure-foundation/01-CONTEXT.md @.planning/phases/01-infrastructure-foundation/01-RESEARCH.md @.planning/phases/01-infrastructure-foundation/01-PATTERNS.md @.planning/phases/01-infrastructure-foundation/01-VALIDATION.md
The tests in this plan describe the interface that Plans 03, 04, and 05 must build to. They are intentionally written BEFORE the implementation.
Interfaces under test (will exist after Plans 03-05):
# backend/db/models.py — created by Plan 03
class Base(DeclarativeBase): ...
class User(Base): __tablename__ = "users"; id, handle, email, ...
class Document(Base): __tablename__ = "documents"; id, user_id (NULLABLE in Phase 1 per D-03), filename, object_key, ...
class Topic(Base): __tablename__ = "topics"; id, user_id, name, description, color
class CloudConnection(Base): __tablename__ = "cloud_connections"; ...
class Group(Base): __tablename__ = "groups" # D-02 stub
# Full table list: users, quotas, refresh_tokens, folders, documents, topics, document_topics, shares, audit_log, cloud_connections, groups
# backend/deps/db.py — created by Plan 03
async def get_db() -> AsyncGenerator[AsyncSession, None]: ...
# backend/storage/base.py — created by Plan 04
class StorageBackend(ABC):
    async def put_object(self, user_id, document_id, file_bytes, extension, content_type) -> str: ...  # returns object_key
    async def get_object(self, object_key) -> bytes: ...
    async def delete_object(self, object_key) -> None: ...
    async def presigned_get_url(self, object_key, expires_minutes=60) -> str: ...
    async def health_check(self) -> bool: ...
# backend/storage/minio_backend.py — created by Plan 04
class MinIOBackend(StorageBackend): ...
# backend/main.py — modified by Plan 05; /health response shape:
# {"status": "ok"|"degraded", "checks": {"postgres": "ok"|"error: ...", "minio": "ok"|"error: ..."}}
Existing files referenced by tests:
- `backend/main.py` exports `app: FastAPI` (current top-level binding at line 16)
- `backend/api/documents.py` exposes `POST /api/documents/upload`, `GET /api/documents`, `GET /api/documents/{doc_id}`, `DELETE /api/documents/{doc_id}` (existing route table)
- `backend/pytest.ini` already sets `asyncio_mode = auto` and `testpaths = tests`
1. `test_object_key_schema(db_session)`: try `from storage.minio_backend import MinIOBackend`; build an instance with stubbed `Minio` client (use `unittest.mock.MagicMock` for `self._client`); call `await backend.put_object(user_id="11111111-1111-1111-1111-111111111111", document_id="22222222-2222-2222-2222-222222222222", file_bytes=b"x", extension=".pdf", content_type="application/pdf")`; assert returned key matches the regex `r'^[^/]+/[^/]+/[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}(\.[a-zA-Z0-9]+)?$'` and assert the final UUID segment is NOT equal to the user_id or document_id; assert the file extension `.pdf` is preserved at the tail.
2. `test_filename_not_in_object_key()`: same setup; pass extension `.pdf` but record that the human filename `invoice_Q3_2025_secret.pdf` is never passed into the SDK at all — assert `"invoice"` not in returned key, `"Q3"` not in returned key, `"secret"` not in returned key.
3. `test_storage_backend_abc_methods()`: try `from storage.base import StorageBackend`; define a local class `Stub(StorageBackend): pass`; assert `pytest.raises(TypeError)` when `Stub()` is instantiated (because all 5 abstract methods are unimplemented).
4. `test_get_storage_backend_returns_minio()`: try `from storage import get_storage_backend; from storage.minio_backend import MinIOBackend`; assert `isinstance(get_storage_backend(), MinIOBackend)`.
5. `test_put_object_uses_asyncio_to_thread(monkeypatch)`: assert that `MinIOBackend.put_object` does NOT call `self._client.put_object` directly inside the async function — it must wrap with `asyncio.to_thread` (verifiable by monkeypatching `asyncio.to_thread` to a tracking mock and asserting it was called with `self._client.put_object` as the first arg). RESEARCH.md Pattern 3.
6. `test_minio_backend_health_check_returns_bool()`: stub `self._client.bucket_exists` to return `True`; await `health_check()`; assert return is exactly `True`. Then stub it to raise `Exception("boom")`; assert `health_check()` returns `False`.
Each test wraps its imports in `try/except ImportError as e: pytest.skip(f"{e}")` so they collect cleanly before Plan 04 lands.
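The `asyncio.to_thread` check in test 5 can be sketched without the real backend. The example below is a minimal illustration, assuming a hypothetical `SketchBackend` class in place of `MinIOBackend` and a simplified two-argument `put_object`; it uses `unittest.mock.patch.object` rather than the pytest `monkeypatch` fixture so that it runs standalone.

```python
import asyncio
from unittest.mock import MagicMock, patch


class SketchBackend:
    """Hypothetical stand-in for MinIOBackend; not the project's real class."""

    def __init__(self, client):
        self._client = client

    async def put_object(self, key: str, data: bytes) -> str:
        # The blocking SDK call is handed to a worker thread, not called inline.
        await asyncio.to_thread(self._client.put_object, key, data)
        return key


async def run_check():
    calls = []
    real_to_thread = asyncio.to_thread

    async def tracking_to_thread(func, *args, **kwargs):
        # Record which callable was offloaded, then delegate to the real helper.
        calls.append(func)
        return await real_to_thread(func, *args, **kwargs)

    backend = SketchBackend(MagicMock())
    with patch.object(asyncio, "to_thread", tracking_to_thread):
        await backend.put_object("u/d/k.pdf", b"x")
    return calls, backend


calls, backend = asyncio.run(run_check())
# The first positional argument passed to to_thread is the client's put_object.
assert calls[0] is backend._client.put_object
```

Because `MagicMock` returns the same child mock on repeated attribute access, the identity comparison against `backend._client.put_object` is stable. In the real test the same idea would apply to the five-argument `put_object` signature listed under the interfaces.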
Create `backend/tests/test_alembic.py` containing two tests, each marked `@pytest.mark.xfail(strict=False, reason="implemented in plan 03")`:
1. `test_migration_creates_all_tables(tmp_path, monkeypatch)`: create a fresh aiosqlite DB file under tmp_path; set `DATABASE_MIGRATE_URL` env var to `sqlite+aiosqlite:///<tmp file>`; invoke `alembic.command.upgrade(Config("backend/alembic.ini"), "head")` (use the python API not subprocess); connect with an async engine; query `sqlite_master` for table names; assert the set `{"users","quotas","refresh_tokens","folders","documents","topics","document_topics","shares","audit_log","cloud_connections","groups"}` is a subset of the materialized tables. NOTE: Alembic on aiosqlite is acceptable for this test only — production uses PostgreSQL.
2. `test_documents_user_id_nullable(tmp_path)`: after running upgrade, run `PRAGMA table_info(documents)` (SQLite) or `INFORMATION_SCHEMA.COLUMNS` query for PostgreSQL targets; assert the `user_id` column's `notnull` flag is `0` / `is_nullable == 'YES'` (D-03).
Wrap Alembic imports in `try/except ImportError: pytest.skip(...)`.
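The two SQLite inspection queries these tests rely on can be tried in isolation. The sketch below is only an illustration: it uses the synchronous stdlib `sqlite3` module instead of Alembic plus aiosqlite, and it creates a hypothetical two-table schema by hand rather than running the real migration.

```python
import sqlite3

# Hypothetical, trimmed-down schema: only two of the eleven tables, few columns.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE users (id TEXT PRIMARY KEY, handle TEXT NOT NULL);
    CREATE TABLE documents (
        id TEXT PRIMARY KEY,
        user_id TEXT,              -- nullable, mirroring D-03
        filename TEXT NOT NULL
    );
    """
)


def table_names(connection: sqlite3.Connection) -> set:
    """Return the names of all tables recorded in sqlite_master."""
    rows = connection.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'"
    ).fetchall()
    return {row[0] for row in rows}


def notnull_flags(connection: sqlite3.Connection, table: str) -> dict:
    """Map column name to its notnull flag (0 or 1) from PRAGMA table_info."""
    # Row layout: (cid, name, type, notnull, dflt_value, pk)
    rows = connection.execute(f"PRAGMA table_info({table})").fetchall()
    return {row[1]: row[3] for row in rows}


# Subset check, as in test_migration_creates_all_tables.
assert {"users", "documents"} <= table_names(conn)
# Nullability check, as in test_documents_user_id_nullable.
assert notnull_flags(conn, "documents")["user_id"] == 0
```

The subset comparison (`<=`) tolerates extra tables such as Alembic's own version table, which is why the plan asks for a subset rather than set equality.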
cd /Users/nik/Documents/Progamming/document_scanner/backend && python3 -m pytest tests/test_storage.py tests/test_alembic.py -v 2>&1 | tail -30
- File `backend/tests/test_storage.py` exists and contains all six test function names: `test_object_key_schema`, `test_filename_not_in_object_key`, `test_storage_backend_abc_methods`, `test_get_storage_backend_returns_minio`, `test_put_object_uses_asyncio_to_thread`, `test_minio_backend_health_check_returns_bool` (verifiable via `grep -c "^async def test_\|^def test_" backend/tests/test_storage.py >= 6`)
- Each test in `test_storage.py` carries `@pytest.mark.xfail(strict=False` (verifiable via `grep -c "@pytest.mark.xfail" backend/tests/test_storage.py >= 6`)
- File `backend/tests/test_alembic.py` exists and contains both test function names `test_migration_creates_all_tables`, `test_documents_user_id_nullable`
- `cd backend && python3 -m pytest tests/test_storage.py tests/test_alembic.py -v` exits 0 (xfail/skip both count as non-failing)
- Output of the same pytest run mentions `xfailed` or `skipped` at least 8 times in total (6 + 2) — verifiable via `python3 -m pytest tests/test_storage.py tests/test_alembic.py -v | grep -E "xfail|XFAIL|skipped|SKIPPED" | wc -l >= 8`
- The regex literal used to match object keys in `test_object_key_schema` is exactly `r'^[^/]+/[^/]+/[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}(\.[a-zA-Z0-9]+)?$'` (verifiable via grep with the literal pattern)
- `test_alembic.py` references all 11 table names from the schema: `users`, `quotas`, `refresh_tokens`, `folders`, `documents`, `topics`, `document_topics`, `shares`, `audit_log`, `cloud_connections`, `groups` (verifiable via `for t in users quotas refresh_tokens folders documents topics document_topics shares audit_log cloud_connections groups; do grep -q "\"$t\"\|'$t'" backend/tests/test_alembic.py || echo MISSING:$t; done` produces no MISSING lines)
Wave 0 unit tests for Plan 04 (storage) and Plan 03 (migration) exist as xfail-marked tests; the suite collects and passes; the tests define the contract that Plans 03 + 04 must satisfy.
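To make the object-key contract concrete, the following sketch applies the regex from `test_object_key_schema` to a generated key. The `make_object_key` helper and the `<user_id>/<document_id>/<uuid><extension>` layout are assumptions inferred from the regex and the `put_object` arguments, not the actual Plan 04 implementation.

```python
import os
import re
import uuid

OBJECT_KEY_RE = re.compile(
    r'^[^/]+/[^/]+/[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}(\.[a-zA-Z0-9]+)?$'
)


def make_object_key(user_id: str, document_id: str, extension: str) -> str:
    """Hypothetical key builder; the human filename is never an input."""
    return f"{user_id}/{document_id}/{uuid.uuid4()}{extension}"


user_id = "11111111-1111-1111-1111-111111111111"
document_id = "22222222-2222-2222-2222-222222222222"
key = make_object_key(user_id, document_id, ".pdf")

# Split the last path segment into its random UUID and the extension.
last_segment = key.rsplit("/", 1)[1]
random_part, ext = os.path.splitext(last_segment)

assert OBJECT_KEY_RE.match(key)
assert random_part not in (user_id, document_id)
assert ext == ".pdf"
# No fragment of a name like invoice_Q3_2025_secret.pdf can appear in the key.
assert all(word not in key for word in ("invoice", "Q3", "secret"))
```

Since the helper never receives the original filename, the leakage assertions hold by construction, which is the property `test_filename_not_in_object_key` is meant to protect.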
Task 3: Extend tests/test_health.py + port tests/test_documents.py to the async client
backend/tests/test_health.py, backend/tests/test_documents.py
- `test_health_status_ok`: `GET /health` returns 200 and `data["status"]` is the string `"ok"` (unchanged behavior — keeps Plan 01 green)
- `test_health_checks_postgres_and_minio` (xfail until Plan 05): response JSON has a `checks` dict with keys `postgres` and `minio` both equal to `"ok"`
- Existing document upload/list/get/delete tests are PORTED to the async client (`def` → `async def`, `client.X(...)` → `await async_client.X(...)`) — every existing assertion is preserved verbatim; the new async-port tests are xfail until Plan 05 lands the storage rewrite
- The current sync versions are NOT deleted in this plan — Plan 05 deletes them as part of the cutover so the existing flat-file code stays validated until then
- One new test `test_upload_persists_to_postgres_and_minio_async(async_client, sample_txt)` (xfail until Plan 05) asserts that after a successful upload the response includes an `id` (UUID string) and that the document is queryable via `GET /api/documents/{id}`, returning the same metadata
- backend/tests/test_health.py (current 5-line file)
- backend/tests/test_documents.py (current 108-line file — all existing test functions must be preserved during this plan)
- .planning/phases/01-infrastructure-foundation/01-VALIDATION.md (Wave 0 Requirements: extend test_health.py and test_documents.py)
- .planning/phases/01-infrastructure-foundation/01-PATTERNS.md (backend/tests/test_health.py section — `test_health_checks_postgres_and_minio` source pattern; backend/tests/test_documents.py section — sync→async port pattern)
- .planning/phases/01-infrastructure-foundation/01-RESEARCH.md (Phase Requirements → Test Map; STORE-07 health checks)
Edit `backend/tests/test_health.py`: KEEP the existing `test_health(client)` test exactly as-is (renaming is not required; it documents the current behavior). APPEND a new test `test_health_checks_postgres_and_minio(async_client)` that issues `await async_client.get("/health")` and asserts: `resp.status_code == 200`, `data := resp.json()`, `"checks" in data`, `"postgres" in data["checks"]`, `"minio" in data["checks"]`, `data["checks"]["postgres"] == "ok"`, `data["checks"]["minio"] == "ok"`, and `data["status"] == "ok"`. Mark this new test `@pytest.mark.xfail(strict=False, reason="extended health probe implemented in plan 05")`.
Edit `backend/tests/test_documents.py`: KEEP all existing sync tests verbatim. APPEND a new section commented `# ── Async port (Plan 05 cutover) ─────────────────────────` containing async versions of each existing test under names with the suffix `_async`: `test_upload_txt_no_classify_async`, `test_upload_pdf_no_classify_async`, `test_list_documents_async`, `test_list_documents_filter_by_topic_async`, `test_get_document_async`, `test_get_document_not_found_async`, `test_delete_document_async`, `test_delete_document_not_found_async`, `test_upload_empty_file_async`. Each `_async` test is `async def`, takes `async_client` instead of `client`, uses `await async_client.post(...)`/`await async_client.get(...)`/`await async_client.delete(...)`, preserves every assertion from its sync counterpart, and is marked `@pytest.mark.xfail(strict=False, reason="async storage layer implemented in plan 05")`.
Additionally, ADD one new test `test_upload_persists_to_postgres_and_minio_async(async_client, sample_txt)` (same xfail marker) that uploads `sample_txt`, parses the returned JSON, asserts `data["id"]` matches the uuid regex `r'^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$'`, then issues `GET /api/documents/{id}` and asserts the metadata round-trips with `data["original_name"] == "sample.txt"`.
For the topic-filter port (`test_list_documents_filter_by_topic_async`), replace the `import services.storage as st; st.update_document_topics(...)` step with a direct SQL update against `db_session` (e.g., `await db_session.execute(update(Document).where(...).values(...))`); wrap this in `try/except ImportError: pytest.skip(...)` because the model imports may not exist yet.
cd /Users/nik/Documents/Progamming/document_scanner/backend && python3 -m pytest tests/test_health.py tests/test_documents.py -v 2>&1 | tail -30
- `backend/tests/test_health.py` contains BOTH `def test_health(` (existing sync test, unchanged) AND `async def test_health_checks_postgres_and_minio(` (new async test)
- The new `test_health_checks_postgres_and_minio` is marked `@pytest.mark.xfail`
- `cd backend && python3 -m pytest tests/test_health.py::test_health -v` exits 0 with `passed` status (the existing test still passes)
- `backend/tests/test_documents.py` contains every existing sync test name (`test_upload_txt_no_classify`, `test_upload_pdf_no_classify`, `test_list_documents`, `test_list_documents_filter_by_topic`, `test_get_document`, `test_get_document_not_found`, `test_delete_document`, `test_delete_document_not_found`, `test_upload_empty_file`) — verifiable via grep for each
- `backend/tests/test_documents.py` contains all nine `_async` counterparts plus `test_upload_persists_to_postgres_and_minio_async` — verifiable via `grep -c "^async def test_.*_async\b" backend/tests/test_documents.py >= 9`
- At least 10 `@pytest.mark.xfail` markers are present in `test_documents.py` (9 ports + 1 persistence test)
- `cd backend && python3 -m pytest tests/test_documents.py -v` exits 0 (sync tests pass, async tests xfail) — verify in output that the sync tests `test_upload_txt_no_classify` etc. report `PASSED`
- Total Wave-0 xfail count across the suite: `cd backend && python3 -m pytest tests/ -v 2>&1 | grep -cE "XFAIL|xfail"` >= 18 (6 storage + 2 alembic + 1 health + 9 async-port + 1 persistence = 19 expected, so 18 is a lower bound)
Health and document test files now contain both the legacy sync tests (still passing) and the new async/PostgreSQL/MinIO-backed tests (xfail until Plan 05); the full suite is green; Wave 0 gaps from VALIDATION.md are filled with executable scaffolds.
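The `/health` response shape that `test_health_checks_postgres_and_minio` targets can be illustrated with a small pure function. This is a sketch under assumptions: `build_health_payload` is a hypothetical helper, and the rule that any non-`"ok"` check yields `"degraded"` is inferred from the documented shape rather than stated by the plan.

```python
from typing import Dict


def build_health_payload(checks: Dict[str, str]) -> dict:
    """Hypothetical aggregator for per-service check results.

    Assumption: overall status is "ok" only when every check is "ok";
    otherwise it is "degraded".
    """
    status = "ok" if all(value == "ok" for value in checks.values()) else "degraded"
    return {"status": status, "checks": dict(checks)}


healthy = build_health_payload({"postgres": "ok", "minio": "ok"})
broken = build_health_payload({"postgres": "ok", "minio": "error: boom"})

# The assertions mirror what the xfail health test expects once Plan 05 lands.
assert healthy["status"] == "ok"
assert healthy["checks"]["postgres"] == "ok"
assert healthy["checks"]["minio"] == "ok"
assert broken["status"] == "degraded"
```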
<threat_model>
Trust Boundaries
| Boundary | Description |
|---|---|
| Test harness → backend code | Tests instantiate MinIOBackend and ORM models in isolation; in-memory aiosqlite engine prevents test pollution of real services |
STRIDE Threat Register
| Threat ID | Category | Component | Disposition | Mitigation Plan |
|---|---|---|---|---|
| T-01-02-01 | Information Disclosure | Test fixtures leaking sensitive data into committed test files | mitigate | All test data is synthetic (invoice_Q3_2025_secret.pdf is a literal string, no real PII); aiosqlite DB lives in tmp_path and :memory: only |
| T-01-02-02 | Tampering | Object key schema regression introduces filename leakage | mitigate | test_filename_not_in_object_key asserts the human filename is never present in the returned object key (STORE-02); regression would xfail-flip to FAILED and break CI |
| T-01-02-03 | Tampering | Migration creates documents.user_id as NOT NULL (violates D-03) | mitigate | test_documents_user_id_nullable asserts the column's notnull flag is 0; regression breaks the test |
| T-01-02-SC | Tampering | npm/pip/cargo installs | N/A | No new package installs in this plan; tests reuse Plan 01's dependency set |
</threat_model>
<success_criteria>
- `tests/conftest.py` provides both legacy sync fixtures (unchanged) and new async `db_session` + `async_client` fixtures.
- `tests/test_storage.py` and `tests/test_alembic.py` exist and collect cleanly.
- `tests/test_health.py` carries the extended-health-probe scaffold; `tests/test_documents.py` carries an async port of every existing test plus a new persistence test.
- Every test that depends on Plan 03+ code is `xfail(strict=False)` so the suite stays green between waves.
- Total xfail count >= 18 across the new scaffolds.
</success_criteria>