curo1305
0d51d023ce
feat(03-02): implement presigned upload flow, quota enforcement, cleanup task
...
- Replace POST /api/documents/upload with POST /api/documents/upload-url + /{id}/confirm
- upload-url: create pending Document row with user_id=None (Wave 2), return presigned PUT URL
- confirm: stat MinIO for authoritative size (T-03-05), atomic quota UPDATE (T-03-06, STORE-03)
- Confirm returns 413 with {used_bytes, limit_bytes, rejected_bytes} on quota exceeded (STORE-05)
- Wave 2 guard: skip quota UPDATE when doc.user_id is None (Plan 03-03 removes this)
- Add GET /api/auth/me/quota to api/auth.py (STORE-04)
- services/storage.py: remove save_upload (D-04); add GREATEST(0, used_bytes-delta) quota decrement to delete_document (STORE-06)
- tasks/document_tasks.py: add cleanup_abandoned_uploads Celery beat task (D-06)
- celery_app.py: add beat_schedule for cleanup-abandoned-uploads every 30 minutes
- tests/test_documents.py: replace legacy /upload tests with xfail; add real test logic for upload-url/confirm/get-quota
- tests/test_quota.py: implement real test logic with xfail for PostgreSQL-specific SQL
2026-05-23 14:32:12 +02:00
curo1305
1882edfff6
feat(02-02): auth API endpoints + security hardening + Python 3.9 compat
...
- backend/api/auth.py: register, login (TOTP+backup), refresh, logout,
me, change-password; per-account Redis rate limit; HIBP check
- backend/main.py: Origin validation middleware, CSP headers middleware,
CORS locked to settings.cors_origins, Redis lifespan (app.state.redis),
admin bootstrap, auth router included, slowapi SlowAPIMiddleware
- backend/services/email.py: already created in Plan 01 (verified exists)
- Python 3.9 compat: fixed match statement in ai/__init__.py,
str|None union syntax in openai_provider.py, api/documents.py,
api/topics.py, api/settings.py, services/classifier.py
All 17 tests in test_auth_api.py pass.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com >
2026-05-22 19:35:38 +02:00
curo1305
9fc820d893
feat(02-01): implement services/auth.py full auth service layer and email_tasks.py
...
- services/auth.py: Argon2 password hashing (pwdlib), constant-time verify (SEC-06)
- JWT create/decode for access tokens and password-reset tokens (typ claim validation, T-02-01)
- Refresh token lifecycle: create, rotate, revoke-all with family revocation (AUTH-07, RFC 9700)
- Family revocation enqueues send_security_alert_email.delay on token reuse (T-02-02)
- TOTP provisioning (pyotp) and verification with Redis replay prevention, valid_window=1 (AUTH-08)
- Backup code generation (8-char hex uppercase), storage (Argon2 hashed), constant-time verify (T-02-03)
- HIBP k-anonymity check via SHA-1 prefix (T-02-05), fail-open on network error (T-02-06)
- Admin bootstrap: idempotent, logs WARNING if env vars missing (D-04/D-05/D-06)
- services/email.py: SMTP send + dev stdout fallback (D-01/D-02)
- tasks/email_tasks.py: send_reset_email and send_security_alert_email Celery tasks
- celery_app.py: add email queue route for tasks.email_tasks.*
- TDD tests: 17 tests covering all auth primitives and family revocation
2026-05-22 19:23:42 +02:00
curo1305
32d67de1ca
feat(01-05): introduce celery_app + tasks/document_tasks + session-aware classifier
...
- Add backend/celery_app.py: Celery("docuvault") with Redis broker, JSON
serialization, and tasks.document_tasks.* routed to documents queue;
reads REDIS_URL directly from os.environ (no config import — Pitfall 7)
- Add backend/tasks/__init__.py: empty package marker
- Add backend/tasks/document_tasks.py: sync extract_and_classify Celery task
that calls asyncio.run(_run()) to retrieve bytes from MinIO, extract text
via extractor, and classify via classifier; classification failure is non-fatal
- Update backend/services/classifier.py: classify_document and
suggest_topics_for_document now accept session: AsyncSession as first arg;
all storage.* calls updated to async session-injection pattern
- Add extract_text_from_bytes helper to services/extractor.py for bytes-based
extraction (used by Celery worker, which retrieves bytes from MinIO)
2026-05-22 09:45:33 +02:00
curo1305
3e4b1f1f91
feat(01-04): rewrite services/storage.py as async SQLAlchemy + MinIO orchestrator
...
- Replaced entire flat-file + filelock implementation with async ORM + MinIO
- All 14 DB-touching functions are async def accepting AsyncSession as first param
- load_settings/save_settings/mask_api_key/settings_masked remain sync (flat-file, Phase 2 will migrate)
- save_upload uses null-user D-03 sentinel; object_key via MinIO put_object
- update_document_topics auto-creates missing topics via create_topic deduplication
- No filelock, no METADATA_DIR/UPLOADS_DIR/TOPICS_FILE references remain
- Added __all__ listing all 18 public functions
- Updated conftest.py: removed filelock patching no longer needed
- Fixed test_object_key_schema: removed unused db_session param (SQLite INET type conflict)
2026-05-22 09:39:32 +02:00
curo1305
7a34807fa0
chore: initial commit — existing single-user document scanner codebase
...
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com >
2026-05-22 08:53:28 +02:00