docs(01-05): complete walking-skeleton plan — SUMMARY, STATE, ROADMAP
Phase 1 complete: all 5/5 plans executed, walking-skeleton e2e verified live against Docker stack (postgres + minio + redis + backend + celery-worker).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@@ -4,7 +4,7 @@ _Last updated: 2026-05-21_
 
 ## Phases
 
-- [ ] **Phase 1: Infrastructure Foundation** — PostgreSQL + MinIO wired into Docker Compose; Alembic migrations running; existing app still works
+- [x] **Phase 1: Infrastructure Foundation** — PostgreSQL + MinIO wired into Docker Compose; Alembic migrations running; existing app still works
 - [ ] **Phase 2: Users & Authentication** — Full auth flow end-to-end (register, login, TOTP, backup codes, password reset, sign-out-all) with admin panel for user management
 - [ ] **Phase 3: Document Migration & Multi-User Isolation** — All documents in PostgreSQL + MinIO; per-user isolation enforced; existing UI still works
 - [ ] **Phase 4: Folders, Sharing, Quotas & Document UX** — Full document management UX (folders, sharing, quota bar, PDF preview, search, audit log)
@@ -31,7 +31,7 @@ _Last updated: 2026-05-21_
 - [x] 01-02-PLAN.md — Wave 0 test scaffolds (xfail/skip stubs) + async pytest fixtures
 - [x] 01-03-PLAN.md — SQLAlchemy ORM models + async engine + Alembic async migration (incl. alembic upgrade head)
 - [x] 01-04-PLAN.md — StorageBackend ABC + MinIO backend + rewritten async services/storage.py
-- [ ] 01-05-PLAN.md — Lifespan + /health + API cutover + Celery worker + walking-skeleton e2e verify
+- [x] 01-05-PLAN.md — Lifespan + /health + API cutover + Celery worker + walking-skeleton e2e verify
 
 ---
 
+16 -18
@@ -2,29 +2,29 @@
 gsd_state_version: 1.0
 milestone: v1.0
 milestone_name: milestone
-current_phase: 1
-status: executing
-last_updated: "2026-05-22T07:40:00Z"
+current_phase: 2
+status: ready
+last_updated: "2026-05-22T12:00:00Z"
 progress:
 total_phases: 5
-completed_phases: 0
+completed_phases: 1
 total_plans: 5
-completed_plans: 4
-percent: 80
+completed_plans: 5
+percent: 20
 ---
 
 # Project State
 
 **Project:** DocuVault
-**Status:** Executing Phase 1
-**Current Phase:** 1
+**Status:** Phase 1 Complete — Ready for Phase 2
+**Current Phase:** 2
 **Last Updated:** 2026-05-22
 
 ## Phase Status
 
 | Phase | Name | Status |
 |---|---|---|
-| 1 | Infrastructure Foundation | In Progress (4/5 plans) |
+| 1 | Infrastructure Foundation | ✓ Complete |
 | 2 | Users & Authentication | Not Started |
 | 3 | Document Migration & Multi-User Isolation | Not Started |
 | 4 | Folders, Sharing, Quotas & Document UX | Not Started |
@@ -32,20 +32,18 @@ progress:
 
 ## Current Position
 
-Phase: 1 (Infrastructure Foundation) — EXECUTING
-Plan: 5 of 5
-**Phase:** 01-infrastructure-foundation
-**Plan:** 01-04 COMPLETE → advancing to 01-05
-**Progress:** ████████░░ 80%
+**Phase:** 01-infrastructure-foundation — COMPLETE ✓
+**Plan:** 5/5 complete
+**Progress:** ██░░░░░░░░ 20% (1/5 phases)
 
 ## Performance Metrics
 
 | Metric | Value |
 |---|---|
-| Phases complete | 0 / 5 |
+| Phases complete | 1 / 5 |
 | Requirements mapped | 54 / 54 |
 | Plans written | 5 (Phase 1) |
-| Plans complete | 4 |
+| Plans complete | 5 |
 
 ## Accumulated Context
 
@@ -92,6 +90,6 @@ _Updated at each phase transition._
 
 | Field | Value |
 |---|---|
-| Last session | 2026-05-22 — Executed 01-04-PLAN.md (StorageBackend ABC + MinIOBackend + async services/storage.py; all 6 test_storage.py xfail tests PASSED) |
-| Next action | Execute 01-05-PLAN.md (Lifespan + /health + API cutover + Celery worker + walking-skeleton e2e verify) |
+| Last session | 2026-05-22 — Executed Phase 1 (all 5 plans complete); walking-skeleton e2e verified live against Docker stack |
+| Next action | Run `/gsd:discuss-phase 2` to begin Phase 2 (Users & Authentication) |
 | Pending decisions | See Open Questions above |
@@ -0,0 +1,100 @@
---
plan: 01-05
phase: 01-infrastructure-foundation
status: complete
completed: "2026-05-22"
tasks_total: 4
tasks_complete: 4
requirements_satisfied:
- STORE-01
- STORE-07
self_check: PASSED
---

# Plan 01-05 Summary — Lifespan + /health + API Cutover + Celery + Walking Skeleton

## What Was Built

### Task 1 — Celery app + task + session-aware classifier (commit 32d67de)

**`backend/celery_app.py`** — Minimal Celery instance (`celery_app = Celery("docuvault")`). Reads `REDIS_URL` directly from `os.environ` (no config import — Pitfall 7). JSON serialization, `documents` queue route for `tasks.document_tasks.*`, `autodiscover_tasks(["tasks"])`.

**`backend/tasks/__init__.py`** — Empty package file.

**`backend/tasks/document_tasks.py`** — Sync `def extract_and_classify(document_id: str)` Celery task (NOT async). Uses `asyncio.run(_run(document_id))` to drive the async body. Opens a fresh `AsyncSessionLocal` session, fetches the Document ORM row, pulls bytes from MinIO via `MinIOBackend.get_object`, calls `extractor.extract_text_from_bytes`, persists extracted text, then calls `classifier.classify_document(session, doc_id)`. Non-fatal classification failures set `status = "classification_failed"`.
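
The sync-task-over-async-body pattern described for `document_tasks.py` can be sketched with the standard library alone. This is a minimal illustration, not the project's code: the Celery decorator, the database session, and the MinIO fetch are left out, the `classify` parameter is a hypothetical injection point added for the example, and the `"processed"` status name is an assumption (only `"classification_failed"` appears in the summary).

```python
import asyncio
from typing import Awaitable, Callable, Dict


async def _classify_ok(document_id: str) -> None:
    """Hypothetical stand-in for the real classifier call."""
    await asyncio.sleep(0)


async def _run(
    document_id: str,
    classify: Callable[[str], Awaitable[None]] = _classify_ok,
) -> Dict[str, str]:
    """Async body: the real task would fetch, extract, and persist here."""
    status = "processed"  # assumed success status, not taken from the summary
    try:
        await classify(document_id)
    except Exception:
        # Classification errors are treated as non-fatal: record and move on.
        status = "classification_failed"
    return {"id": document_id, "status": status}


def extract_and_classify(
    document_id: str,
    classify: Callable[[str], Awaitable[None]] = _classify_ok,
) -> Dict[str, str]:
    """Sync entry point; in the project this function is the Celery task."""
    # asyncio.run creates a fresh event loop per call and closes it afterwards,
    # so no loop state leaks between task invocations in the same worker.
    return asyncio.run(_run(document_id, classify))
```

Keeping the task function synchronous lets a default prefork worker call it directly, while `asyncio.run` gives the async session and storage code a short-lived event loop of its own.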

**`backend/services/extractor.py`** — Added `extract_text_from_bytes(file_bytes, content_type)` helper that writes to a `NamedTemporaryFile` and delegates to the existing `extract_text(path, mime)` function.
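
A bytes-to-path bridge of the kind described for `extract_text_from_bytes` might look like the sketch below. The `extract_text` function here is a plain-text stand-in for the existing extractor, and the suffix lookup through `mimetypes` is an assumption about how the temporary file could be named.

```python
import mimetypes
import os
import tempfile


def extract_text(path: str, mime: str) -> str:
    """Stand-in for the existing path-based extractor (plain text only)."""
    with open(path, "r", encoding="utf-8", errors="replace") as fh:
        return fh.read()


def extract_text_from_bytes(file_bytes: bytes, content_type: str) -> str:
    """Write the bytes to a temporary file and delegate to extract_text."""
    suffix = mimetypes.guess_extension(content_type) or ""
    # delete=False so the file can be reopened by path on every platform;
    # cleanup is done explicitly in the finally block.
    tmp = tempfile.NamedTemporaryFile(suffix=suffix, delete=False)
    try:
        tmp.write(file_bytes)
        tmp.close()  # flush to disk before the extractor opens the path
        return extract_text(tmp.name, content_type)
    finally:
        tmp.close()  # harmless if already closed
        os.unlink(tmp.name)
```

For example, `extract_text_from_bytes(b"hello", "text/plain")` round-trips the payload through disk and returns the decoded text.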

**`backend/services/classifier.py`** — Added `session: AsyncSession` as first parameter to `classify_document` and `suggest_topics_for_document`. All internal `storage.*` calls updated to pass session.

### Task 2 — Lifespan + /health + async API wiring (commit c1931fd)

**`backend/main.py`** — Lifespan creates `Minio` client, auto-creates `docuvault` bucket if absent (via `asyncio.to_thread`), attaches to `app.state.minio`, disposes `engine` on shutdown. No longer calls `ensure_data_dirs()`. `/health` endpoint probes PostgreSQL (`SELECT 1` via `AsyncSessionLocal`) and MinIO (`bucket_exists` via `asyncio.to_thread`); returns `{"status": "ok"|"degraded", "checks": {"postgres": ..., "minio": ...}}`.
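
The aggregation logic behind a `/health` response of this shape can be sketched without FastAPI or live services. The two probe functions below are hypothetical stand-ins for the `SELECT 1` query and the blocking `bucket_exists` call, and the `"error"` value for a failed check is an assumption, since the summary only shows the `"ok"` case.

```python
import asyncio
from typing import Awaitable, Callable, Dict


async def probe_postgres() -> None:
    """Stand-in for running SELECT 1 through an async session."""
    await asyncio.sleep(0)


def probe_minio() -> None:
    """Stand-in for the blocking bucket_exists call on the MinIO client."""
    return None


async def _check(probe: Awaitable[object]) -> str:
    """Await one probe and map success or failure to a status string."""
    try:
        await probe
    except Exception:
        return "error"  # assumed failure label
    return "ok"


async def health(
    pg_probe: Callable[[], Awaitable[None]] = probe_postgres,
    minio_probe: Callable[[], None] = probe_minio,
) -> Dict[str, object]:
    checks = {
        "postgres": await _check(pg_probe()),
        # The MinIO client is synchronous, so it runs in a worker thread
        # to avoid blocking the event loop.
        "minio": await _check(asyncio.to_thread(minio_probe)),
    }
    status = "ok" if all(v == "ok" for v in checks.values()) else "degraded"
    return {"status": status, "checks": checks}
```

Reporting `degraded` with per-dependency detail, rather than failing the whole request, lets an operator see which backing service is down from a single call.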

**`backend/api/documents.py`** — All 5 route handlers inject `session: AsyncSession = Depends(get_db)`. Upload handler: calls `await storage.save_upload(session, ...)`, uses in-memory `content` bytes for extraction (no filesystem path needed), enqueues `extract_and_classify.delay(saved["id"])` for async classification, returns `topics: []` immediately. `/classify` endpoint retains synchronous `await classifier.classify_document(session, doc_id)` for backward compatibility.

**`backend/api/topics.py`** — All 5 route handlers inject session dependency; all `storage.*` calls are async with session.

### Task 3 — Final cutover (commit 970c8e4)

- `backend/data/` — All tracked files removed via `git rm -rf`; `backend/data/` added to `.gitignore`
- `backend/config.py` — Removed `DATA_DIR`, `UPLOADS_DIR`, `METADATA_DIR`, `TOPICS_FILE`, `ensure_data_dirs()`, `import os`. Retained `DEFAULT_SETTINGS`, `DEFAULT_SYSTEM_PROMPT`, `class Settings(BaseSettings)`, `settings = Settings()`. `SETTINGS_FILE` rebased as `Path(settings.data_dir) / "settings.json"` after `settings = Settings()`.
- `backend/tests/conftest.py` — Removed `isolated_data_dir` fixture and sync `TestClient` `client` fixture. Promoted `db_session` and `async_client` fixtures (removed `try/except ImportError` wrappers — deps now exist). Added `live_services_available` session fixture that probes localhost:5432/9000/6379 via socket.
- `backend/tests/test_documents.py` — Deleted 9 legacy sync tests. Removed all `@pytest.mark.xfail` markers from async ports.
- `backend/tests/test_health.py` — Removed `@pytest.mark.xfail` from `test_health_checks_postgres_and_minio`. Deleted legacy `test_health(client)` sync test.
- `backend/tests/test_settings.py`, `backend/tests/test_topics.py` — Updated to remove any remaining sync client references.
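
The socket probe behind `live_services_available` could be written roughly as follows. This sketch is a plain function rather than a pytest session fixture, and the helper name `port_open`, the timeout value, and the parameter defaults are assumptions; the ports and the `INTEGRATION=1` fallback come from the summary.

```python
import os
import socket
from typing import Iterable

# postgres, minio, redis: the ports named in the summary
SERVICE_PORTS = (5432, 9000, 6379)


def port_open(host: str, port: int, timeout: float = 0.5) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def live_services_available(
    host: str = "localhost",
    ports: Iterable[int] = SERVICE_PORTS,
) -> bool:
    """Socket probe first; INTEGRATION=1 acts as the secondary signal."""
    if all(port_open(host, port) for port in ports):
        return True
    return os.environ.get("INTEGRATION") == "1"
```

In a test suite the result would typically gate integration tests, for instance by calling `pytest.skip` when it returns `False`.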

### Task 4 — Walking-skeleton e2e verification (human-approved ✓)

All 12 verification steps passed:

1. `.env` created from `.env.example`
2. `docker compose down -v` — clean state
3. `docker compose up --build -d` — all 5 services booted
4. `docker compose ps` — `postgres`, `minio`, `redis`, `backend`, `celery-worker` all `Up (healthy)`
5. `alembic upgrade head` — exit 0, `Running upgrade -> 0001`
6. `/health` response:
```json
{"status": "ok", "checks": {"postgres": "ok", "minio": "ok"}}
```
7. Upload `test.txt` — returned `{"id": "<uuid>", "original_name": "test.txt", "topics": [], ...}`
8. PostgreSQL confirmed: one row, `object_key` starts with `null-user/`
9. MinIO confirmed: object present in `docuvault` bucket
10. Celery confirmed: `Task tasks.document_tasks.extract_and_classify[...] succeeded`
11. Delete confirmed: `{"success": true}`, MinIO object removed
12. Integration tests: zero FAILED, zero XFAIL

## ROADMAP.md Phase 1 Success Criteria — All Met

| # | Criterion | Status |
|---|-----------|--------|
| 1 | `docker compose up` starts all services healthy | ✓ Verified (Task 4, step 4) |
| 2 | `alembic upgrade head` applies cleanly | ✓ Verified (Plan 03 Task 3 + Task 4 step 5) |
| 3 | Full upload/extract/classify workflow works — no regression | ✓ Verified (Task 4, steps 7-10) |
| 4 | MinIO object key schema `{user_id}/{document_id}/{uuid4()}{ext}` enforced | ✓ Verified (Plan 04 + Task 4 step 8) |
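
The object key schema in criterion 4 can be illustrated with a small builder. The function name `build_object_key` is hypothetical, and the `null-user` placeholder for a missing owner is inferred from verification step 8 rather than from the storage code itself.

```python
import os
import uuid
from typing import Optional


def build_object_key(user_id: Optional[str], document_id: str, filename: str) -> str:
    """Build a key of the form {user_id}/{document_id}/{uuid4()}{ext}."""
    # Before authentication exists there is no owner, so a placeholder
    # prefix is used (assumed from the `null-user/` prefix seen in step 8).
    owner = user_id if user_id is not None else "null-user"
    ext = os.path.splitext(filename)[1]  # includes the leading dot, or ""
    return f"{owner}/{document_id}/{uuid.uuid4()}{ext}"
```

Because the final segment is a fresh UUID, re-uploading a file with the same name under the same document never overwrites an existing object.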

## Deviations

| Rule | Deviation | Resolution |
|------|-----------|------------|
| 1 (Bug) | `extract_text_from_bytes` helper did not exist in extractor.py | Added to `services/extractor.py` as specified in Plan 05 Task 1 action block |
| 2 (Enhancement) | `live_services_available` uses env var `INTEGRATION=1` as fallback | Socket probe primary, env var secondary — matches plan intent |

## Key Files Created/Modified

| File | Status | Notes |
|------|--------|-------|
| backend/celery_app.py | Created | Minimal Celery — no config import |
| backend/tasks/document_tasks.py | Created | Sync task wrapping asyncio.run |
| backend/tasks/__init__.py | Created | Package marker |
| backend/main.py | Rewritten | Lifespan + /health |
| backend/api/documents.py | Rewritten | Async session injection |
| backend/api/topics.py | Rewritten | Async session injection |
| backend/services/classifier.py | Updated | Session-aware |
| backend/services/extractor.py | Updated | Added bytes helper |
| backend/config.py | Pruned | Flat-file constants removed |
| backend/tests/conftest.py | Pruned | Async-only fixtures |
| backend/tests/test_documents.py | Pruned | Async-only tests |
| backend/data/ | Deleted | D-04 complete |

## Self-Check: PASSED