Commit Graph

212 Commits

Author SHA1 Message Date
curo1305 5950a3f5c2 feat(03-03): wire get_current_user into /api/topics/*; add load_topics_for_user; POST /api/admin/topics
- api/topics.py: add get_current_user dep to all 5 handlers (list, create, update, delete, suggest)
- list_topics: uses load_topics_for_user (system topics + user's own) with user-scoped doc counts
- create_topic: passes user_id=current_user.id (never creates system topics via regular endpoint)
- update_topic/delete_topic: ownership assertion — system topics and other users' topics return 404
- api/admin.py: add SystemTopicCreate model + POST /api/admin/topics (user_id=NULL, admin-only)
- services/storage.py: add or_ import; load_topics_for_user (D-17); create_topic gains user_id param with namespace-scoped dedup; topic_doc_counts gains optional user_id for user-scoped counts; add load_topics_for_user to __all__
- services/classifier.py: replace load_topics with load_topics_for_user(doc.user_id); pass user_id=doc.user_id to create_topic for AI-suggested topics (D-11)
- Tests: update all topic tests to pass auth headers; implement test_topic_namespace, test_admin_create_system_topic, test_regular_user_cannot_create_system_topic, test_topics_require_auth
2026-05-23 20:15:44 +02:00
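The topic namespace rule in this commit (system topics plus the user's own are visible; system topics and other users' topics answer 404 on update/delete) can be sketched in plain Python. This is only an illustration: `Topic`, `visible_topics`, `assert_can_modify`, and `TopicNotFound` are hypothetical names, and the repository's `load_topics_for_user` presumably does the filtering in SQL rather than in memory.

```python
from dataclasses import dataclass
from typing import List, Optional
from uuid import UUID


class TopicNotFound(Exception):
    """Hypothetical stand-in for the HTTP 404 the API returns."""


@dataclass
class Topic:
    id: int
    name: str
    user_id: Optional[UUID]  # None marks a system topic (user_id=NULL)


def visible_topics(topics: List[Topic], user_id: UUID) -> List[Topic]:
    # System topics plus the caller's own topics; nothing from other users.
    return [t for t in topics if t.user_id is None or t.user_id == user_id]


def assert_can_modify(topic: Topic, user_id: UUID) -> Topic:
    # System topics and foreign topics are both reported as "not found",
    # so a caller cannot probe for topics they do not own.
    if topic.user_id is None or topic.user_id != user_id:
        raise TopicNotFound(topic.id)
    return topic
```

Raising the same "not found" error for both cases mirrors the 404-instead-of-403 choice described in the commit body.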
curo1305 b28bb01995 feat(03-03): add get_regular_user dep; wire auth + ownership into /api/documents/*
- Add get_regular_user FastAPI dep (rejects admin with 403) to deps/auth.py
- Wire Depends(get_regular_user) into all 6 /api/documents/* handlers
- upload-url: replace null-user/... object_key with str(current_user.id)/...; set user_id=current_user.id
- confirm: remove Wave 2 doc.user_id is None guard — quota runs unconditionally; add ownership assertion (404 on cross-user)
- list: filter by user_id=current_user.id via storage.list_metadata(user_id=...)
- get/delete/classify: ownership assertion (doc.user_id != current_user.id → 404)
- storage.list_metadata: add required user_id param + Document.user_id == user_id filter
- storage.delete_document: remove if doc.user_id is not None guard; use CASE WHEN for SQLite-compat quota decrement
- Tests: update existing tests to pass auth headers; implement test_cross_user_access_404, test_admin_cannot_access_documents, test_documents_require_auth; mark test_confirm_endpoint xfail(strict=False) for SQLite UUID mismatch
2026-05-23 20:05:34 +02:00
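The `CASE WHEN` quota decrement mentioned for `storage.delete_document` replaces the PostgreSQL-only `GREATEST(0, ...)` form with portable SQL. A minimal sketch against an in-memory SQLite table follows; the `quotas` columns and the helper name `release_quota` are assumptions for illustration, not the repository's actual code.

```python
import sqlite3

# Clamp at zero without GREATEST(), which SQLite does not provide.
DECREMENT_SQL = """
UPDATE quotas
SET used_bytes = CASE
    WHEN used_bytes - :delta < 0 THEN 0
    ELSE used_bytes - :delta
END
WHERE user_id = :user_id
"""


def release_quota(conn: sqlite3.Connection, user_id: str, delta: int) -> None:
    # Hypothetical helper: give back `delta` bytes, never going below zero.
    conn.execute(DECREMENT_SQL, {"delta": delta, "user_id": user_id})
    conn.commit()
```

Because the clamp lives inside a single UPDATE, no read-modify-write round trip is needed and the same statement should also run on PostgreSQL.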
curo1305 0d51d023ce feat(03-02): implement presigned upload flow, quota enforcement, cleanup task
- Replace POST /api/documents/upload with POST /api/documents/upload-url + /{id}/confirm
- upload-url: create pending Document row with user_id=None (Wave 2), return presigned PUT URL
- confirm: stat MinIO for authoritative size (T-03-05), atomic quota UPDATE (T-03-06, STORE-03)
- Confirm returns 413 with {used_bytes, limit_bytes, rejected_bytes} on quota exceeded (STORE-05)
- Wave 2 guard: skip quota UPDATE when doc.user_id is None (Plan 03-03 removes this)
- Add GET /api/auth/me/quota to api/auth.py (STORE-04)
- services/storage.py: remove save_upload (D-04); add GREATEST(0, used_bytes-delta) quota decrement to delete_document (STORE-06)
- tasks/document_tasks.py: add cleanup_abandoned_uploads Celery beat task (D-06)
- celery_app.py: add beat_schedule for cleanup-abandoned-uploads every 30 minutes
- tests/test_documents.py: replace legacy /upload tests with xfail; add real test logic for upload-url/confirm/get-quota
- tests/test_quota.py: implement real test logic with xfail for PostgreSQL-specific SQL
2026-05-23 14:32:12 +02:00
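The atomic quota UPDATE in the confirm step can be sketched as a single conditional statement: the row is only modified when the new total still fits under the limit, so two concurrent confirms cannot both pass a stale check. The table layout and the name `try_reserve` are assumptions for illustration; the payload keys come from the 413 body listed in the commit.

```python
import sqlite3
from typing import Optional

RESERVE_SQL = """
UPDATE quotas
SET used_bytes = used_bytes + :size
WHERE user_id = :user_id
  AND used_bytes + :size <= limit_bytes
"""


def try_reserve(conn: sqlite3.Connection, user_id: str, size: int) -> Optional[dict]:
    """Return None on success, or the 413-style payload when over quota."""
    cur = conn.execute(RESERVE_SQL, {"size": size, "user_id": user_id})
    conn.commit()
    if cur.rowcount == 1:
        return None  # the check and the increment happened in one statement
    row = conn.execute(
        "SELECT used_bytes, limit_bytes FROM quotas WHERE user_id = ?", (user_id,)
    ).fetchone()
    if row is None:
        raise LookupError(f"no quota row for user {user_id!r}")
    used, limit = row
    return {"used_bytes": used, "limit_bytes": limit, "rejected_bytes": size}
```

A caller would translate a non-None result into the HTTP 413 response; `size` is meant to be the authoritative object size from the storage backend, not a client-supplied value.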
curo1305 3ed6dd494f feat(03-02): extend StorageBackend ABC and MinIOBackend with presigned PUT and stat_object
- Add generate_presigned_put_url and stat_object abstract methods to StorageBackend ABC
- Extend MinIOBackend with dual client (self._client internal + self._public_client public)
- MinIOBackend.__init__ accepts optional public_endpoint param (RESEARCH.md Finding 3)
- generate_presigned_put_url uses self._public_client for browser-resolvable URLs
- stat_object uses self._client.stat_object and returns .size (authoritative, T-03-05)
- get_storage_backend() passes public_endpoint=settings.minio_public_endpoint
- config.py adds minio_public_endpoint field (RESEARCH.md Finding 3)
- docker-compose.yml: MINIO_API_CORS_ALLOW_ORIGIN on minio service (T-03-09)
- docker-compose.yml: MINIO_PUBLIC_ENDPOINT on backend service
- docker-compose.yml: new celery-beat service (RESEARCH.md Finding 10)
2026-05-23 13:52:16 +02:00
curo1305 4e9b586ec4 docs(03-01): complete Wave 0 scaffolding plan — migration 0003 + xfail stubs
- Create 03-01-SUMMARY.md with all 19 new test IDs, task commits, and decisions
- Update STATE.md: phase 3 in progress, plan 1/5 complete, 3 new key decisions
- Update ROADMAP.md: mark 03-01-PLAN.md as complete (2026-05-23)
2026-05-23 13:48:07 +02:00
curo1305 807a1b3e67 feat(03-01): create Alembic migration 0003 for multi-user isolation
- revision="0003", down_revision="0002"
- upgrade(): collects null-user object_keys, deletes document_topics cascade,
  deletes null-user documents, removes MinIO objects (skip if MINIO_ENDPOINT unset),
  deletes all topics (D-10), alters documents.user_id NOT NULL via batch_alter_table,
  creates ix_topics_user_id index, reconciles quotas.used_bytes from SUM(size_bytes)
- downgrade(): drops ix_topics_user_id, reverts user_id to nullable; documents not restored
- batch_alter_table ensures SQLite compatibility for test suite
- MinIO step gated on MINIO_ENDPOINT env var for safe SQLite test runs
2026-05-23 13:44:22 +02:00
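The last upgrade step, reconciling `quotas.used_bytes` from `SUM(size_bytes)`, can be sketched as one correlated UPDATE. The snippet below runs it against SQLite for illustration only; the exact statement in migration 0003 is not shown in this log, and the helper name `reconcile_quotas` plus the minimal table layout are assumptions.

```python
import sqlite3

# Recompute each user's usage from the documents that remain after cleanup.
# COALESCE covers users with no documents, where SUM() yields NULL.
RECONCILE_SQL = """
UPDATE quotas
SET used_bytes = COALESCE(
    (SELECT SUM(size_bytes)
     FROM documents
     WHERE documents.user_id = quotas.user_id),
    0
)
"""


def reconcile_quotas(conn: sqlite3.Connection) -> None:
    conn.execute(RECONCILE_SQL)
    conn.commit()
```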
curo1305 21ec9cb4c3 test(03-01): add Wave 0 xfail stubs and shared fixtures for Phase 3
- Add auth_user, admin_user, mock_minio_presigned, mock_minio_stat fixtures to conftest.py
- Create test_quota.py with 4 xfail stubs (STORE-03, STORE-05, STORE-06, SC2 race)
- Append test_migration_0003 to test_alembic.py (full pre-seed + post-migration assertions)
- Append 3 classifier xfail stubs (DOC-03, DOC-05, D-15)
- Append 6 document xfail stubs (D-05, STORE-04, SEC-04, D-16)
- Append 4 topic xfail stubs (DOC-04, D-09, D-17)
- Append test_settings_endpoint_removed stub (D-12)
- All 19 new test IDs collect cleanly with xfail(strict=False)
2026-05-23 13:42:37 +02:00
curo1305 fdc32d431d docs(03): create Phase 3 execution plan — document migration & multi-user isolation
5 plans across 5 sequential waves covering: Alembic migration 0003 (null-user
cleanup, NOT NULL constraint, quota reconciliation), presigned MinIO PUT upload
flow with atomic quota enforcement, auth guards on all document/topic endpoints,
flat-file settings retirement + per-user AI classification, and frontend quota bar
with 3-step XHR upload progress.

Verification passed across all 12 dimensions. All 8 phase requirements covered
(STORE-03/04/05/06, SEC-04, DOC-03/04/05).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-23 13:36:28 +02:00
curo1305 1ba578c7f6 docs(03): UI design contract for Phase 3 document migration
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-23 10:21:05 +02:00
curo1305 5905642a31 docs(03): UI design contract for document migration and quota UI
Phase 3 UI-SPEC covering two-step presigned upload progress bar,
quota usage bar (amber 80% / red 95%), and 413 quota rejection
inline error block — all inheriting Phase 2 design system tokens.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-23 10:16:28 +02:00
curo1305 f261c1a53b docs(02): defer SC5 admin-JWT/document-403 to Phase 3 per D-07; clean STATE.md
SC5 admin JWT on /api/documents/* returning 403 is explicitly deferred to
Phase 3 SC4 (D-07: existing doc endpoints stay public until Phase 3 auth
enforcement). ROADMAP updated. Duplicate Open Questions removed from STATE.md.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-22 20:28:55 +02:00
curo1305 80eb280233 docs(02): phase 2 verification report
4/5 success criteria verified; 1 blocker gap identified: admin JWT
does not return 403 on document content endpoints because api/documents.py
has no auth enforcement (Phase 1 legacy state, deferred to Phase 3 per D-03).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-22 20:21:01 +02:00
curo1305 858be6260e docs(02-05): execution summary and state update
- 02-05-SUMMARY.md: admin panel frontend complete — AdminView, three tab components, AppSidebar update
- STATE.md: Phase 2 complete (5/5 plans), progress 40%, decisions added
- ROADMAP.md: Phase 2 marked complete, all 5 plans checked
- REQUIREMENTS.md: ADMIN-01 through ADMIN-05 and ADMIN-07 marked complete
2026-05-22 20:12:05 +02:00
curo1305 92e3d755d0 feat(02-05): AppSidebar admin link and user identity footer
- Add conditional Admin nav link (v-if authStore.user?.role === 'admin') with shield SVG icon
- Add user identity footer: initials avatar (bg-indigo-100), email (truncate flex-1), sign-out icon button (aria-label="Sign out")
- Import useAuthStore alongside existing topicsStore; add useRouter for post-logout redirect
- All existing nav links, topicsStore reference, and scoped styles preserved unchanged
2026-05-22 20:09:16 +02:00
curo1305 9137f41537 feat(02-05): admin tab components and AdminView
- AdminView.vue: tabbed layout (Users | Quotas | AI Config) with UI-SPEC tab strip classes
- AdminUsersTab.vue: user table with create form (crypto.getRandomValues password), inline deactivation confirmation, reactivate, reset-password, row-level spinner, empty state
- AdminQuotasTab.vue: quota inline edit with MB display, usage %, warning when limit < usage
- AdminAiConfigTab.vue: AI provider/model per-user with 1.5s "Saved" confirmation
- client.js: fix adminDeactivateUser/adminReactivateUser to use PATCH /status endpoint, fix adminResetUserPassword to /password-reset, fix adminUpdateAiConfig to send ai_provider/ai_model, add adminGetUserQuota
- No impersonation UI in any admin component (T-02-31)
2026-05-22 20:09:05 +02:00
curo1305 bcb63bf8aa docs(02-04): execution summary and state update
- 02-04-SUMMARY.md: admin API plan complete (18 tests, 7 endpoints, all security checks pass)
- STATE.md: advanced to plan 4/5, updated metrics and session continuity
2026-05-22 20:03:34 +02:00
curo1305 f94e8d8b4a feat(02-04): implement admin API endpoints — user CRUD, quota management, AI config
- GET /api/admin/users: list users (safe fields only, ordered by created_at)
- POST /api/admin/users: create user (password_must_change=True, quota init)
- PATCH /api/admin/users/{id}/status: deactivate/reactivate with sole-admin guard
- POST /api/admin/users/{id}/password-reset: Celery email dispatch (no token returned)
- GET /api/admin/users/{id}/quota: quota view with MB helpers
- PATCH /api/admin/users/{id}/quota: quota adjust with below-usage warning
- PATCH /api/admin/users/{id}/ai-config: assign AI provider/model per user
- _user_to_dict() whitelist helper prevents password_hash/credentials_enc leakage
- No impersonation endpoint (ADMIN-07 enforced by omission)
- get_current_admin Depends() on every handler (SEC-07)
- Updated backend/main.py to include admin_router
- Fixed test: mock send_reset_email.delay to avoid Redis in unit tests
2026-05-22 20:01:37 +02:00
curo1305 cbad9acac1 test(02-04): RED phase: admin API test suite (11 tests, expected to fail until admin.py exists)
2026-05-22 19:59:16 +02:00
curo1305 833f869a48 docs(02-03): execution summary and state update
- 02-03-SUMMARY.md: TOTP enrollment endpoints, password reset, account management UI
- STATE.md: advanced to Plan 3/5 complete, added key decisions
2026-05-22 19:57:09 +02:00
curo1305 d73e2f6112 feat(02-03): TOTP enrollment flow, backup codes, AccountView, ConfirmBlock
- TotpEnrollment.vue: three-step enrollment (setup → verify → backup-codes); emits 'enrolled'
- BackupCodesDisplay.vue: 2-column grid, copy-all clipboard, acknowledgment checkbox
- ConfirmBlock.vue: reusable inline confirmation block with 'confirmed'/'cancelled' emits
- AccountView.vue: TOTP section (enrollment or disable), change-password with breach/wrong-pw error handling, sign-out-all with ConfirmBlock
- npm run build exits 0
2026-05-22 19:54:53 +02:00
curo1305 43e1d0145e feat(02-03): add TOTP setup/enable/disable, password reset, and frontend_url to config
- GET /api/auth/totp/setup: returns provisioning_uri + secret (400 if already enabled)
- POST /api/auth/totp/enable: rate-limited 10/min, verifies TOTP code with Redis replay prevention, returns 10 backup codes
- DELETE /api/auth/totp: disables TOTP, clears secret, deletes backup codes
- POST /api/auth/password-reset: always returns 202 (anti-enumeration), enqueues Celery email task
- POST /api/auth/password-reset/confirm: validates token, strength, HIBP; updates password; no auto-login (AUTH-05)
- config.py: added frontend_url setting for password reset link construction
- test_auth_totp.py: all 11 tests passing (GREEN)
2026-05-22 19:52:36 +02:00
curo1305 d7831e9382 test(02-03): add failing tests for TOTP endpoints, password reset, logout-all
- test_totp_setup_returns_uri: GET /api/auth/totp/setup returns provisioning_uri + secret
- test_totp_setup_already_enabled: returns 400 when totp_enabled=True
- test_totp_setup_requires_auth: returns 401/403 without Bearer
- test_password_reset_always_202_nonexistent: anti-enumeration for non-existent email
- test_password_reset_always_202_existing: anti-enumeration for existing email
- test_password_reset_confirm_invalid_token: returns 400 for bad token
- test_password_reset_confirm_weak_password: returns 422 for weak password
- test_password_reset_confirm_valid_no_autologin: returns 200 with no access_token (AUTH-05)
- test_logout_all_revokes_tokens: returns 200 with revoked message
- test_logout_all_requires_auth: returns 401/403 without Bearer
- test_totp_enable_rate_limit: 11th call returns 429
2026-05-22 19:50:51 +02:00
curo1305 3d487b82ef docs(02-02): execution summary — auth API endpoints + frontend auth wall complete
Requirements completed: AUTH-01, AUTH-02, AUTH-04, SEC-01, SEC-02, SEC-03, SEC-05

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-22 19:48:33 +02:00
curo1305 3b7d362600 feat(02-02): frontend auth store, router guard, Login/Register views
- frontend/src/stores/auth.js: useAuthStore with accessToken in memory
  only (never browser storage); login() accepts options.backupCode
- frontend/src/api/client.js: extended with Bearer token injection,
  401 auto-refresh retry, all auth/admin API functions, changePassword
- frontend/src/router/index.js: auth routes added (/login, /register,
  /password-reset, /account, /admin); beforeEach guard redirects
  unauthenticated users to /login with redirect param
- frontend/src/layouts/AuthLayout.vue: centered bare layout for auth pages
- frontend/src/views/auth/LoginView.vue: three-step flow (password, TOTP,
  backup code); "Use a backup code instead" link; UI-SPEC copywriting
- frontend/src/views/auth/RegisterView.vue: registration with
  PasswordStrengthBar; HIBP error display; UI-SPEC copywriting
- frontend/src/components/auth/PasswordStrengthBar.vue: 4-segment bar
- frontend/src/components/ui/AppSpinner.vue: animate-spin SVG spinner
- Stub views: PasswordResetView, NewPasswordView, AccountView, AdminView
- .gitignore: exclude frontend/node_modules, dist, package-lock.json

npm run build exits 0. All acceptance criteria verified.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-22 19:45:21 +02:00
curo1305 1882edfff6 feat(02-02): auth API endpoints + security hardening + Python 3.9 compat
- backend/api/auth.py: register, login (TOTP+backup), refresh, logout,
  me, change-password; per-account Redis rate limit; HIBP check
- backend/main.py: Origin validation middleware, CSP headers middleware,
  CORS locked to settings.cors_origins, Redis lifespan (app.state.redis),
  admin bootstrap, auth router included, slowapi SlowAPIMiddleware
- backend/services/email.py: already created in Plan 01 (verified exists)
- Python 3.9 compat: fixed match statement in ai/__init__.py,
  str|None union syntax in openai_provider.py, api/documents.py,
  api/topics.py, api/settings.py, services/classifier.py

All 17 tests in test_auth_api.py pass.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-22 19:35:38 +02:00
curo1305 1d425d4392 test(02-02): add failing tests for auth API endpoints
RED phase - 17 tests covering register, login, TOTP, backup codes,
per-account rate limiting, Origin validation, change-password, and
password_must_change flow.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-22 19:35:31 +02:00
curo1305 479b72ef9a docs(02-01): execution summary — auth service layer, deps, migration complete
- 02-01-SUMMARY.md: 3 tasks complete, 31 tests passing, all verification checks passed
- STATE.md: Phase 2 plan 1/5 complete, decisions added, open questions resolved
2026-05-22 19:27:29 +02:00
curo1305 c4613b6b87 feat(02-01): implement deps/auth.py FastAPI dependency chain with tests
- get_current_user: validates Bearer JWT via decode_access_token, loads User from DB
  raises HTTP 401 on invalid/expired token, missing user, or deactivated account
- get_current_admin: wraps get_current_user, raises HTTP 403 on role != 'admin' (T-02-07)
- Admin impersonation architecturally excluded (ADMIN-07, T-02-08) — no code path bypasses role check
- tests/test_auth_deps.py: 7 tests covering happy path, tampered token, inactive user, 403 non-admin, 200 admin
2026-05-22 19:25:16 +02:00
curo1305 9fc820d893 feat(02-01): implement services/auth.py full auth service layer and email_tasks.py
- services/auth.py: Argon2 password hashing (pwdlib), constant-time verify (SEC-06)
- JWT create/decode for access tokens and password-reset tokens (typ claim validation, T-02-01)
- Refresh token lifecycle: create, rotate, revoke-all with family revocation (AUTH-07, RFC 9700)
- Family revocation enqueues send_security_alert_email.delay on token reuse (T-02-02)
- TOTP provisioning (pyotp) and verification with Redis replay prevention, valid_window=1 (AUTH-08)
- Backup code generation (8-char hex uppercase), storage (Argon2 hashed), constant-time verify (T-02-03)
- HIBP k-anonymity check via SHA-1 prefix (T-02-05), fail-open on network error (T-02-06)
- Admin bootstrap: idempotent, logs WARNING if env vars missing (D-04/D-05/D-06)
- services/email.py: SMTP send + dev stdout fallback (D-01/D-02)
- tasks/email_tasks.py: send_reset_email and send_security_alert_email Celery tasks
- celery_app.py: add email queue route for tasks.email_tasks.*
- TDD tests: 17 tests covering all auth primitives and family revocation
2026-05-22 19:23:42 +02:00
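The HIBP k-anonymity check via SHA-1 prefix works by sending only the first five hex characters of the password hash and matching the remainder locally. A minimal sketch of the two local steps follows, with no network call. The function names are hypothetical, and the `SUFFIX:COUNT` line format is that of the public Pwned Passwords range API; how `services/auth.py` structures this is not shown in the log.

```python
import hashlib
from typing import Tuple


def hibp_prefix_suffix(password: str) -> Tuple[str, str]:
    # Only the 5-character prefix would ever leave the process.
    digest = hashlib.sha1(password.encode("utf-8")).hexdigest().upper()
    return digest[:5], digest[5:]


def breach_count(suffix: str, range_body: str) -> int:
    # range_body is the text returned for the prefix: one "SUFFIX:COUNT" per line.
    for line in range_body.splitlines():
        candidate, _, count = line.strip().partition(":")
        if candidate.upper() == suffix and count:
            return int(count)
    return 0  # suffix not listed: no known breach for this password
```

The fail-open behaviour described in the commit would sit in the caller: if the range request itself fails, the password is accepted rather than rejected.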
curo1305 12c6487855 feat(02-01): add BackupCode ORM model, password_must_change field, Alembic migration, extend Settings
- Add BackupCode model to db/models.py with user_id FK, code_hash (Argon2), used_at (nullable)
- Add ix_backup_codes_user_id index on backup_codes.user_id
- Add password_must_change BOOLEAN NOT NULL DEFAULT false to User model (ADMIN-01)
- Extend config.py Settings with JWT, SMTP, admin bootstrap, and CORS fields (D-01, D-04, D-09)
- Add env_list_separator=',' for cors_origins env var parsing
- Append PyJWT, pwdlib[argon2], pyotp, aioredis, slowapi to requirements.txt
- Add .env.example entries for SECRET_KEY, ADMIN_EMAIL, SMTP_*, CORS_ORIGINS
- Create migration 0002 adding backup_codes table and password_must_change column
- Add TDD tests for all Task 1 acceptance criteria (7 tests pass)
2026-05-22 19:19:52 +02:00
curo1305 16584ade00 docs(02): create phase 2 plan — Users & Authentication
5 plans across 5 waves covering AUTH-01..08, SEC-01..03/05..07,
ADMIN-01..05/07. Includes security hardening (Origin validation,
per-account rate limiting, TOTP replay prevention, refresh token
family revocation with security alert), TOTP + backup code login,
and admin panel frontend.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-22 19:13:44 +02:00
curo1305 333978d7cb docs(02): UI design contract for Phase 2 — Users & Authentication
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-22 15:12:02 +02:00
curo1305 9e28de8c15 docs(02): UI design contract for Users & Authentication phase
Specifies form field states, password strength indicator, TOTP enrollment
and backup codes patterns, loading states, error placement, admin table
row states, copywriting (anti-enumeration copy), and full component
inventory for Phase 2 frontend work.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-22 14:51:28 +02:00
curo1305 5c010587f6 docs(state): record phase 2 context session
2026-05-22 14:33:25 +02:00
curo1305 e0341348f0 docs(02): capture phase context
2026-05-22 14:33:20 +02:00
curo1305 16bb31eb6d docs(01-05): complete walking-skeleton plan — SUMMARY, STATE, ROADMAP
Phase 1 complete: all 5/5 plans executed, walking-skeleton e2e verified
live against Docker stack (postgres + minio + redis + backend + celery-worker).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-22 14:19:41 +02:00
curo1305 970c8e4e44 feat(01-05): final cutover — delete data/, prune config.py, async-only tests
- Delete backend/data/ tracked files (D-04): flat-file metadata, settings.json,
  topics.json, and uploaded files removed from git; backend/data/ added to
  .gitignore (empty dir remains on macOS due to ACL — no tracked files remain)
- Prune backend/config.py: remove DATA_DIR, UPLOADS_DIR, METADATA_DIR,
  TOPICS_FILE, ensure_data_dirs(); rebase SETTINGS_FILE as derived path from
  settings.data_dir (Phase 1 flat-file settings kept per plan decision)
- Prune backend/tests/conftest.py: remove isolated_data_dir autouse fixture
  and sync TestClient client fixture; add SQLite type compatibility shim
  (visit_INET/JSONB) so in-memory db_session can create tables with
  PostgreSQL-specific column types; add live_services_available fixture
- Rewrite backend/tests/test_documents.py: delete all legacy sync tests,
  remove all @pytest.mark.xfail markers; async-only document tests now
  use async_client + storage service directly for topic wiring
- Rewrite backend/tests/test_health.py: delete legacy sync test_health(client);
  remove @pytest.mark.xfail from test_health_checks_postgres_and_minio
- Port backend/tests/test_topics.py to async_client (sync client removed)
- Port backend/tests/test_settings.py to async_client with monkeypatch for
  SETTINGS_FILE isolation (settings remain flat-file in Phase 1)
2026-05-22 09:53:39 +02:00
curo1305 c1931fd566 feat(01-05): wire main.py lifespan+health and rewrite documents+topics to async session
- Rewrite main.py lifespan: MinIO client created at startup, docuvault bucket
  auto-created if missing, stored on app.state.minio; engine.dispose() on shutdown
- Extend /health endpoint: probes PostgreSQL (SELECT 1) and MinIO (bucket_exists)
  returning {"status": "ok"|"degraded", "checks": {"postgres": ..., "minio": ...}}
- Rewrite api/documents.py: all routes inject session: AsyncSession = Depends(get_db);
  save_upload/save_metadata/list_metadata/get_metadata/delete_document all async;
  upload handler queues extract_and_classify.delay() instead of inline classification;
  /classify endpoint retains synchronous await classifier.classify_document() for
  backward-compatible immediate response
- Rewrite api/topics.py: all routes inject session dependency; all storage calls
  are async with session parameter; Pydantic models TopicCreate/TopicUpdate/
  SuggestRequest preserved verbatim
2026-05-22 09:47:00 +02:00
curo1305 32d67de1ca feat(01-05): introduce celery_app + tasks/document_tasks + session-aware classifier
- Add backend/celery_app.py: Celery("docuvault") with Redis broker, JSON
  serialization, and tasks.document_tasks.* routed to documents queue;
  reads REDIS_URL directly from os.environ (no config import — Pitfall 7)
- Add backend/tasks/__init__.py: empty package marker
- Add backend/tasks/document_tasks.py: sync extract_and_classify Celery task
  that calls asyncio.run(_run()) to retrieve bytes from MinIO, extract text
  via extractor, and classify via classifier; classification failure is non-fatal
- Update backend/services/classifier.py: classify_document and
  suggest_topics_for_document now accept session: AsyncSession as first arg;
  all storage.* calls updated to async session-injection pattern
- Add extract_text_from_bytes helper to services/extractor.py for bytes-based
  extraction (used by Celery worker, which retrieves bytes from MinIO)
2026-05-22 09:45:33 +02:00
curo1305 5d21c6f588 docs(01-04): complete StorageBackend + MinIO + async storage plan: SUMMARY, STATE, ROADMAP
2026-05-22 09:41:43 +02:00
curo1305 3e4b1f1f91 feat(01-04): rewrite services/storage.py as async SQLAlchemy + MinIO orchestrator
- Replaced entire flat-file + filelock implementation with async ORM + MinIO
- All 14 DB-touching functions are async def accepting AsyncSession as first param
- load_settings/save_settings/mask_api_key/settings_masked remain sync (flat-file, Phase 2 will migrate)
- save_upload uses null-user D-03 sentinel; object_key via MinIO put_object
- update_document_topics auto-creates missing topics via create_topic deduplication
- No filelock, no METADATA_DIR/UPLOADS_DIR/TOPICS_FILE references remain
- Added __all__ listing all 18 public functions
- Updated conftest.py: removed filelock patching no longer needed
- Fixed test_object_key_schema: removed unused db_session param (SQLite INET type conflict)
2026-05-22 09:39:32 +02:00
curo1305 eaf86a832a feat(01-04): add StorageBackend ABC + MinIOBackend + factory
- backend/storage/base.py: StorageBackend ABC with 5 abstract methods mirroring ai/base.py
- backend/storage/minio_backend.py: MinIOBackend wrapping all sync Minio SDK calls in asyncio.to_thread(); STORE-02 key schema: {user_id}/{document_id}/{uuid4()}{ext}
- backend/storage/__init__.py: get_storage_backend() factory mirroring ai/__init__.py
- backend/tests/test_storage.py: remove xfail markers (plan 04 implements the module)
2026-05-22 09:36:24 +02:00
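The STORE-02 key schema `{user_id}/{document_id}/{uuid4()}{ext}` keeps the human-readable filename out of the object key and carries over only its extension. A minimal sketch, assuming a hypothetical helper `make_object_key` (the actual method name in `MinIOBackend` is not given in this log):

```python
from pathlib import PurePosixPath
from uuid import UUID, uuid4


def make_object_key(user_id: UUID, document_id: UUID, filename: str) -> str:
    # Keep only the extension; the random UUID replaces the original name.
    ext = PurePosixPath(filename).suffix
    return f"{user_id}/{document_id}/{uuid4()}{ext}"
```

Prefixing with `user_id` means all of a user's objects share one key prefix, which is what the later commits rely on when they replace the `null-user/...` prefix with `str(current_user.id)/...`.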
curo1305 e822a8f4b1 docs(01-03): complete SQLAlchemy ORM + Alembic plan — SUMMARY, STATE, ROADMAP
- SUMMARY.md: all 11 tables documented, privilege grants, verification results, deviations
- STATE.md: plan counter advanced to 3/5, decisions added, session continuity updated
- ROADMAP.md: 01-03-PLAN.md marked complete, progress table updated to 3/5
2026-05-22 09:33:24 +02:00
curo1305 75ea7ef106 feat(01-03): scaffold Alembic async config and author 0001_initial_schema migration
- backend/alembic.ini: script_location=migrations, sqlalchemy.url=%(DATABASE_MIGRATE_URL)s
- backend/migrations/env.py: async_engine_from_config + Base.metadata wiring;
  runtime os.environ.get("DATABASE_MIGRATE_URL") injection (alembic.ini interpolation
  does not read OS env directly)
- backend/migrations/versions/0001_initial_schema.py: creates all 11 tables in
  dependency order with correct FKs, indexes, and named constraints
- documents.user_id is nullable=True per D-03; Phase 2 adds NOT NULL
- Ends with GRANT + ALTER DEFAULT PRIVILEGES for docuvault_app (Pitfall 4)
- Also grants USAGE/SELECT on sequences (audit_log.id autoincrement)
- downgrade() drops all tables in reverse dependency order
2026-05-22 09:20:49 +02:00
curo1305 3e1fcd69b5 feat(01-03): add full v1 ORM schema, async session factory, and DB dependency
- backend/db/models.py: 11 SQLAlchemy 2.0 ORM models (User, Quota, RefreshToken,
  Folder, Document, Topic, DocumentTopic, Share, AuditLog, CloudConnection, Group)
- Document.user_id declared nullable=True per D-03 (Phase 2 adds NOT NULL)
- AuditLog.metadata_ uses mapped_column("metadata", JSONB) to avoid DeclarativeBase
  reserved-attribute conflict
- Group table stub for D-02 (v2 feature, seeded per PROJECT.md)
- Uses Optional[X] instead of X | None for Python < 3.10 compatibility
- backend/db/session.py: async engine (pool_pre_ping=True, expire_on_commit=False)
- backend/deps/db.py: async get_db() FastAPI dependency yielding AsyncSession
2026-05-22 09:16:21 +02:00
curo1305 213afec6b3 docs(01-02): complete Wave 0 test scaffolds plan — SUMMARY, STATE, ROADMAP
- Create 01-02-SUMMARY.md: 19 total xfail tests across 5 files, 3 task
  commits documented, no deviations
- STATE.md: advance to plan 3/5, update progress to 40%, record decisions
  for async_client naming and xfail(strict=False) pattern
- ROADMAP.md: mark 01-02-PLAN.md complete, update progress table to 2/5
2026-05-22 09:10:27 +02:00
curo1305 d856a2eaa9 test(01-02): extend test_health.py and port test_documents.py to async client
test_health.py:
  - Keep existing test_health(client) sync test unchanged (Plan 01 baseline)
  - Add test_health_checks_postgres_and_minio(async_client) xfail scaffold
    for extended /health response with postgres+minio checks (Plan 05, D-07)

test_documents.py:
  - Keep all 9 existing sync tests verbatim
  - Add async ports (_async suffix) for each: 9 xfail tests using async_client
  - Add test_upload_persists_to_postgres_and_minio_async (UUID id + GET
    round-trip assertion) — xfail until Plan 05 storage rewrite
  - Total: 10 new xfail async tests, 9 sync tests unchanged
2026-05-22 09:08:05 +02:00
curo1305 27fa0d4631 test(01-02): add Wave 0 scaffolds test_storage.py and test_alembic.py
test_storage.py (6 xfail tests, STORE-02):
  - test_object_key_schema: regex {user_id}/{doc_id}/{uuid4}{ext}
  - test_filename_not_in_object_key: human filename never in MinIO key
  - test_storage_backend_abc_methods: incomplete subclass raises TypeError
  - test_get_storage_backend_returns_minio: factory returns MinIOBackend
  - test_put_object_uses_asyncio_to_thread: SDK call wrapped in to_thread
  - test_minio_backend_health_check_returns_bool: True/False on ok/error

test_alembic.py (2 xfail tests, STORE-01 / D-02 / D-03):
  - test_migration_creates_all_tables: all 11 v1 tables after upgrade head
  - test_documents_user_id_nullable: user_id notnull=0 per D-03
2026-05-22 09:06:55 +02:00
curo1305 1f675fcf1a feat(01-02): add async db_session and async_client fixtures to conftest.py
- Add @pytest_asyncio.fixture db_session: in-memory SQLite via aiosqlite,
  expire_on_commit=False, skips gracefully (ImportError) before Plan 03
- Add @pytest_asyncio.fixture async_client: httpx.AsyncClient with
  ASGITransport, overrides deps.db.get_db, skips before Plan 03
- Retain all legacy sync fixtures (isolated_data_dir, client, sample_txt,
  sample_pdf) unchanged for backward compatibility through Plan 04
2026-05-22 09:05:36 +02:00
curo1305 f9b8a0d1ca docs(01-01): complete Compose + Config Foundation plan — SUMMARY, STATE, ROADMAP
- Create 01-01-SUMMARY.md with full execution record (3 tasks, 6 files)
- Update STATE.md: advance to plan 2 of 5, record key decisions, update session
- Update ROADMAP.md: mark 01-01 complete, update progress table (1/5 plans)
2026-05-22 09:01:16 +02:00