docs(codebase): refresh codebase map after Phase 06.2 completion

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-02 15:32:06 +02:00
parent bd17b4b22f
commit 89f8d5a654
7 changed files with 1829 additions and 621 deletions
@@ -1,87 +1,331 @@
-# TESTING — document-scanner
+# Testing Patterns

-_Last updated: 2026-05-21_
+**Analysis Date:** 2026-06-02

-## Summary
+## Test Framework

-The backend has solid integration test coverage across all API surfaces and services using pytest + FastAPI TestClient. Each test runs in a fully isolated temporary data directory, so there is no shared state between tests. The frontend has no test framework configured at all.
+**Backend Runner:**
+- pytest 8.2+ with pytest-asyncio
+- Config: `backend/pytest.ini` — `asyncio_mode = auto`, `testpaths = tests`
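+- `asyncio_mode = auto` means every `async def test_*` function is collected and run as an asyncio test automatically, so the explicit `@pytest.mark.asyncio` markers in the examples below are redundant but harmless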

---
+**Backend Assertion Library:**
+- pytest built-in `assert`
+- `unittest.mock` for `AsyncMock`, `MagicMock`, `patch`

-## Backend Testing
-
-### Framework
- **pytest** + **pytest-asyncio** (`asyncio_mode = auto` in `pytest.ini`)
- **FastAPI TestClient** (synchronous ASGI test client from `httpx`)
- No mocking library — AI calls are either tested with real parsing logic or the AI layer is swapped via provider mocking
-
-### Test Isolation Strategy (conftest.py)
- `isolated_data_dir` fixture is `autouse=True` — every test automatically gets:
-  - A fresh `tmp_path/data/` directory with `uploads/`, `metadata/`
-  - Clean `topics.json` and `settings.json` initialized from `DEFAULT_SETTINGS`
-  - Monkeypatched `DATA_DIR` env var and all module-level path constants in `config` and `services.storage`
-  - New `FileLock` instances pointing to the tmp dir
- `client` fixture wraps FastAPI `TestClient` with the isolated data dir active
-
-### Test Files
-
-| File | What it covers |
-|---|---|
-| `test_health.py` | `GET /health` returns `{"status": "ok"}` |
-| `test_documents.py` | Upload TXT/PDF (no-classify), list, get, delete; extracts text correctly |
-| `test_topics.py` | Create, list, delete topics via API |
-| `test_settings.py` | Read default settings, update provider config |
-| `test_extractor.py` | Unit tests for `extract_text()` on TXT, PDF, DOCX, image paths |
-| `test_classifier.py` | Unit tests for JSON parsing helpers (`_parse_classification`, `_parse_suggestions`, `_strip_code_fences`) — no real AI calls |
-| `test_lmstudio.py` | LMStudio provider-specific behaviour (likely mocked or uses a local endpoint) |
-
-### Fixtures Available
-
-| Fixture | Provides |
-|---|---|
-| `isolated_data_dir` | Autouse — clean tmp data dir |
-| `client` | FastAPI TestClient with isolated data |
-| `sample_txt` | A `.txt` file with test content |
-| `sample_pdf` | A minimal valid PDF created with PyMuPDF |
-
-### What Is NOT Tested
-
- Auto-classification flow end-to-end (requires a live AI provider)
- Document reclassify endpoint
- Anthropic, OpenAI, Ollama provider implementations directly
- Any concurrent write / filelock contention scenarios
- File size / type validation edge cases
- Frontend — no tests exist
-
---
-
-## Frontend Testing
-
- **No test framework installed** — `package.json` has no `vitest`, `jest`, or `@testing-library/vue`
- No test files found under `frontend/src/`
- No Cypress or Playwright configuration
-
---
-
-## Running Tests
+**Frontend Runner:**
+- Vitest 4.1.7
+- Config: `frontend/vitest.config.js` — `environment: 'happy-dom'`, `globals: true`
+- `@vue/test-utils` 2.4.10 for component mounting

+**Run Commands:**
 ```bash
-# From backend/
-pytest
+# Backend — from backend/ directory
+pytest -v                          # Run all tests
+pytest tests/test_auth_api.py      # Single file
+INTEGRATION=1 pytest -v            # Run with live Docker services (PostgreSQL + MinIO + Redis)

-# With verbose output
-pytest -v
+# Frontend — from frontend/ directory
+npm test                           # vitest run (one-shot)
+npx vitest                         # watch mode
+```

-# Single file
-pytest tests/test_documents.py
+## Test File Organization
+
+**Backend location:** All tests in `backend/tests/`; flat structure, one file per concern.
+
+**Naming:**
+- `test_<area>.py` — `test_auth_api.py`, `test_documents.py`, `test_shares.py`
+- `test_<layer>_<area>.py` for unit tests: `test_task2_auth_service.py`, `test_cloud_backends.py`
+
+**Frontend location:** Co-located in `__tests__/` subdirectories next to the code they test:
+- `frontend/src/stores/__tests__/auth.test.js`
+- `frontend/src/components/folders/__tests__/FolderTreeItem.test.js`
+- `frontend/src/views/__tests__/FileManagerView.test.js`
+- `frontend/src/router/__tests__/router.guard.test.js`
+
+## Backend Test Structure
+
+**Standard async test (most common pattern):**
+```python
+@pytest.mark.asyncio
+async def test_register_success(authed_client):
+    """POST /api/auth/register with valid data returns 201 with id and handle."""
+    resp = await _register(authed_client)
+    assert resp.status_code == 201, resp.text
+    data = resp.json()
+    assert "id" in data
+    assert data["handle"] == "testuser"
+```
+
+**Module-level async mark (newer pattern, avoids per-function decorator):**
+```python
+pytestmark = pytest.mark.asyncio  # at module top — used in test_shares.py, test_audit.py
+```
+
+**Per-file helper functions:** Each test file defines its own async helpers (plain functions, not fixtures) for setup operations:
+```python
+async def _register(async_client, handle="testuser", email="t@example.com", password="ValidPass12!"):
+    return await async_client.post("/api/auth/register", json={...})
+```
+
+**ORM-direct test data creation:** Tests often insert rows through the ORM rather than the API to set up specific states:
+```python
+doc = Document(id=doc_id, user_id=auth_user["user"].id, ...)
+db_session.add(doc)
+await db_session.commit()
+```
+
+## Backend Fixtures (conftest.py)
+
+All fixtures are async (`@pytest_asyncio.fixture`) unless purely synchronous.
+
+**Session fixture:**
+```python
+@pytest_asyncio.fixture
+async def db_session():
+    # In-memory SQLite with PostgreSQL type shims (INET, JSONB patched to TEXT)
+    # Used for all unit/integration tests without live services
+```
+
+**HTTP client fixtures:**
+```python
+@pytest_asyncio.fixture
+async def async_client(db_session):
+    # httpx.AsyncClient + ASGITransport wrapping the real FastAPI app
+    # DB dependency overridden via app.dependency_overrides[get_db]
+```
+
+**Auth fixtures (shared across all API tests):**
+```python
+@pytest_asyncio.fixture
+async def auth_user(db_session):
+    # Creates User + Quota, issues JWT, returns:
+    # { "user": User, "token": str, "headers": {"Authorization": "Bearer ..."} }
+
+@pytest_asyncio.fixture
+async def second_auth_user(db_session):
+    # Same shape as auth_user — used for sharing tests (owner + recipient)
+
+@pytest_asyncio.fixture
+async def admin_user(db_session):
+    # Same shape, role="admin"
+```
+
+**Infrastructure mocks:**
+```python
+@pytest.fixture
+def mock_minio_presigned(monkeypatch):
+    # Patches MinIOBackend.generate_presigned_put_url with AsyncMock
+
+@pytest.fixture
+def mock_minio_stat(monkeypatch):
+    # Patches MinIOBackend.stat_object with AsyncMock returning 1024 bytes
+    # Override per-test: mock_minio_stat.return_value = 50_000_000
+```
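The per-test override noted for `mock_minio_stat` can be sketched with the standard library alone. This is a minimal illustration, not the project's fixture: the `MinIOBackend` class and `object_size` function below are hypothetical stand-ins, and `patch.object` plays the role that `monkeypatch.setattr` plays in `conftest.py`.

```python
import asyncio
from unittest.mock import AsyncMock, patch


class MinIOBackend:
    """Hypothetical stand-in for the real storage backend."""

    async def stat_object(self, key: str) -> int:
        raise RuntimeError("would contact MinIO")


async def object_size(backend: MinIOBackend, key: str) -> int:
    # Hypothetical code under test: awaits the backend for the object size.
    return await backend.stat_object(key)


def run_demo() -> tuple[int, int]:
    stat_mock = AsyncMock(return_value=1024)  # default size, as in the fixture
    with patch.object(MinIOBackend, "stat_object", stat_mock):
        backend = MinIOBackend()
        default_size = asyncio.run(object_size(backend, "a.txt"))
        # Per-test override: pretend the stored object is 50 MB.
        stat_mock.return_value = 50_000_000
        big_size = asyncio.run(object_size(backend, "a.txt"))
    return default_size, big_size
```

Because `return_value` is read on each call, a test can change it after the patch is in place without touching the fixture itself.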
+
+**Cloud fixtures:**
+```python
+@pytest.fixture
+def mock_google_drive_creds():   # Fake OAuth credential dict
+
+@pytest.fixture
+def mock_onedrive_creds():       # Fake MSAL credential dict
+
+@pytest.fixture
+async def cloud_connection_factory(db_session):
+    # Factory: creates CloudConnection ORM rows
+    # Usage: conn = await cloud_connection_factory(session, user_id, provider="google_drive")
+```
+
+**File fixtures:**
+```python
+@pytest.fixture
+def sample_txt(tmp_path):    # Creates "sample.txt" in tmp_path
+
+@pytest.fixture
+def sample_pdf(tmp_path):    # Creates minimal PDF via PyMuPDF
+```
+
+## Service Availability and Integration Mode
+
+Tests default to **in-memory SQLite** (no live services required):
+- PostgreSQL-specific types (UUID, INET, JSONB) are patched via `SQLiteTypeCompiler` monkey-patching
+- Tests that require PostgreSQL row-level locking semantics are marked `@pytest.mark.xfail(strict=False)`
+
+To test against **live services**, set `INTEGRATION=1` or start the Docker services on their default ports (PostgreSQL 5432, MinIO 9000, Redis 6379). The `live_services_available()` fixture detects either case.
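One way such detection could work is sketched below. The project's actual implementation is not shown here, so the function body, the `port_open` helper, and the injectable `env` and `probe` parameters are all assumptions made for illustration; only the `INTEGRATION=1` switch and the three default ports come from the text above.

```python
import os
import socket
from typing import Callable, Mapping

# Default ports named in the section above.
DEFAULT_PORTS = {"postgresql": 5432, "minio": 9000, "redis": 6379}


def port_open(host: str, port: int, timeout: float = 0.25) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def live_services_available(
    env: Mapping[str, str] = os.environ,
    probe: Callable[[str, int], bool] = port_open,
    host: str = "localhost",
) -> bool:
    # An explicit opt-in wins; otherwise every service must answer its port.
    if env.get("INTEGRATION") == "1":
        return True
    return all(probe(host, port) for port in DEFAULT_PORTS.values())
```

Passing `env` and `probe` as parameters keeps the check itself testable without opening sockets.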
+
+## Mocking
+
+**Backend mocking:**
+- `unittest.mock.patch` for external service calls: `patch("services.auth.check_hibp", return_value=True)`
+- `AsyncMock` for async methods: `monkeypatch.setattr(MinIOBackend, "stat_object", mock, raising=False)`
+- `FakeRedis` class defined inline in test files that need it (test_auth_api.py, test_security_headers.py, test_totp_replay.py) — in-memory dict with TTL support, mirrors Redis get/set/incr/expire interface
+- Celery tasks mocked with `MagicMock`: `monkeypatch.setattr("api.documents.extract_and_classify.delay", MagicMock())`
+- `app.dependency_overrides[get_db] = lambda: db_session` for DB substitution
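The `FakeRedis` helper described in the list above might look roughly like the following. This is a sketch under assumptions, not the class from the test files: it assumes async methods, string values, and an injectable clock so that TTL expiry can be tested without sleeping.

```python
import asyncio
import time
from typing import Callable, Optional


class FakeRedis:
    """In-memory stand-in covering get/set/incr/expire with TTL support."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._data: dict[str, str] = {}
        self._expires_at: dict[str, float] = {}

    def _purge(self, key: str) -> None:
        # Drop the key lazily once its deadline has passed.
        deadline = self._expires_at.get(key)
        if deadline is not None and self._clock() >= deadline:
            self._data.pop(key, None)
            self._expires_at.pop(key, None)

    async def get(self, key: str) -> Optional[str]:
        self._purge(key)
        return self._data.get(key)

    async def set(self, key: str, value: object, ex: Optional[int] = None) -> bool:
        self._data[key] = str(value)
        self._expires_at.pop(key, None)  # a plain SET clears any old TTL
        if ex is not None:
            self._expires_at[key] = self._clock() + ex
        return True

    async def incr(self, key: str) -> int:
        self._purge(key)
        value = int(self._data.get(key, "0")) + 1
        self._data[key] = str(value)
        return value

    async def expire(self, key: str, seconds: int) -> bool:
        self._purge(key)
        if key not in self._data:
            return False
        self._expires_at[key] = self._clock() + seconds
        return True


async def demo() -> tuple[int, int, Optional[str]]:
    now = [0.0]  # fake clock, advanced by hand
    redis = FakeRedis(clock=lambda: now[0])
    first = await redis.incr("attempts:alice")
    second = await redis.incr("attempts:alice")
    await redis.expire("attempts:alice", 60)
    now[0] = 61.0  # jump past the TTL
    return first, second, await redis.get("attempts:alice")


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

The counter-then-expire sequence in `demo()` is the shape a rate-limit test would exercise; the key name is made up for the example.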
+
+**Frontend mocking:**
+- `vi.mock('../../api/client.js', () => ({ login: vi.fn(), ... }))` — mock entire API module
+- Individual function mocks: `const mockListFolders = vi.fn()` then `vi.mock(...)` referencing the mock
+- Store mocks for component tests: `vi.mock('../../stores/auth.js', () => ({ useAuthStore: () => ({ user: {...} }) }))`
+- Heavy child component stubs: `vi.mock('../../components/X.vue', () => ({ default: { template: '<div/>' } }))`
+- Browser storage stubs: `Object.defineProperty(globalThis, 'localStorage', { value: fakeLocalStorage })`
+
+## Frontend Test Structure
+
+**Store tests (primary coverage):**
+```javascript
+import { describe, it, expect, vi, beforeEach } from 'vitest'
+import { setActivePinia, createPinia } from 'pinia'
+
+beforeEach(() => {
+  setActivePinia(createPinia())  // fresh Pinia before each test
+  vi.clearAllMocks()
+})
+
+describe('useAuthStore — behavior group', () => {
+  it('describes exactly one assertion', async () => {
+    api.login.mockResolvedValue({ access_token: 'tok', user: {...} })
+    const store = useAuthStore()
+    await store.login('u@x.com', 'pass')
+    expect(store.accessToken).toBe('tok')
+  })
+})
+```
+
+**Component tests (mount-based):**
+```javascript
+import { mount, flushPromises } from '@vue/test-utils'
+// ...
+const wrapper = mount(ComponentName, {
+  props: { item: makeItem() },
+  global: { plugins: [router] }
+})
+await flushPromises()
+expect(wrapper.find('button').exists()).toBe(false)
+```
+
+## Coverage by Area
+
+### Backend Coverage (329 test functions across 26 test files)
+
+| Area | Test file(s) | Coverage |
+|------|-------------|----------|
+| Auth API (register, login, TOTP, backup codes, refresh, logout, change-password) | `test_auth_api.py` (498 lines) | High |
+| Auth service unit tests (JWT, password, TOTP, backup codes) | `test_task2_auth_service.py` | High |
+| Auth dependencies (get_current_user, get_current_admin) | `test_auth_deps.py` | High |
+| TOTP replay prevention (AUTH-08) | `test_totp_replay.py` (239 lines) | High |
+| Per-account rate limiting (SEC-02) | `test_auth_api.py` | High |
+| Documents API (list, filter, confirm, delete, PATCH, content) | `test_documents.py` (925 lines) | High |
+| Quota enforcement (atomic increment, concurrent race, delete decrement) | `test_quota.py` (239 lines) | Medium — concurrent race xfail on SQLite |
+| Folder API (CRUD, breadcrumb, IDOR) | `test_folders.py` (494 lines) | High |
+| Sharing API (SHARE-01 through SHARE-05) | `test_shares.py` (454 lines) | High |
+| Admin API (users, quotas, AI config, ADMIN-07 no-impersonation) | `test_admin_api.py` (431 lines) | High |
+| Audit log (SHARE events, AUTH events, CSV export) | `test_audit.py` (355 lines) | High |
+| Security headers (CSP, X-Frame-Options, nosniff) | `test_security_headers.py` | High |
+| Security invariants (credentials_enc not exposed, IDOR) | `test_security.py` | High |
+| Constant-time comparisons (SEC-03, hmac.compare_digest) | `test_constant_time_auth.py` | High |
+| Cloud storage (CLOUD-01 through CLOUD-07, SSRF, IDOR) | `test_cloud.py` (855 lines) | High |
+| Cloud backends (Google Drive, OneDrive, WebDAV, Nextcloud) | `test_cloud_backends.py`, `test_webdav_backend.py` | Medium |
+| Cloud credential encryption/decryption | `test_cloud_utils.py` (273 lines) | High |
+| AI classifier JSON parsing | `test_classifier.py` (266 lines) | High |
+| Text extraction | `test_extractor.py` | High |
+| MinIO object key schema | `test_storage.py` (277 lines) | Medium |
+| Settings API | `test_settings.py` | Medium |
+| Topics API | `test_topics.py` (204 lines) | High |
+| Health endpoint | `test_health.py` | Low (smoke test) |
+| Alembic migrations | `test_alembic.py` (246 lines) | Medium |
+| LM Studio provider | `test_lmstudio.py` | Conditional — `@pytest.mark.skipif` unless reachable |
+
+### Frontend Coverage (14 test files, ~163 test cases)
+
+| Area | Test file | Coverage |
+|------|-----------|----------|
+| Auth store (login, logout, TOTP, no-browser-storage invariant) | `stores/__tests__/auth.test.js` | High |
+| Folders store (fetchFolders, createFolder, rename, delete) | `stores/__tests__/folders.test.js` | High |
+| Cloud connections store | `stores/__tests__/cloudConnections.test.js` | Medium |
+| Router guards (meta.public, meta.layout, redirect on unauthenticated) | `router/__tests__/router.guard.test.js` | High |
+| FileManagerView (folder navigation, search, sort, move, delete) | `views/__tests__/FileManagerView.test.js` | Medium |
+| FolderTreeItem (expand arrow, active state) | `components/folders/__tests__/FolderTreeItem.test.js` | Medium |
+| FolderBreadcrumb | `components/folders/__tests__/FolderBreadcrumb.test.js` | Medium |
+| TotpEnrollment component | `components/auth/__tests__/TotpEnrollment.test.js` | Medium |
+| PasswordStrengthBar component | `components/auth/__tests__/PasswordStrengthBar.test.js` | Medium |
+| AdminUsersTab component | `components/admin/__tests__/AdminUsersTab.test.js` | Medium |
+| AdminQuotasTab component | `components/admin/__tests__/AdminQuotasTab.test.js` | Medium |
+| AdminAiConfigTab component | `components/admin/__tests__/AdminAiConfigTab.test.js` | Medium |
+| SettingsAccountTab component | `components/settings/__tests__/SettingsAccountTab.test.js` | Medium |
+| SettingsCloudTab component | `components/settings/__tests__/SettingsCloudTab.test.js` | Medium |
+
+## Test Gaps
+
+**Backend gaps:**
+- `test_storage.py` — MinIO object key tests are largely `xfail(strict=False)` waiting for module implementation
+- Concurrent quota race (`test_concurrent_quota_race`) is `xfail(strict=False)` — requires PostgreSQL row-level locking
+- Delete quota decrement (`test_delete_decrements_quota`) is `xfail(strict=False)` on SQLite
+- No `pytest-cov` — no coverage measurement enforced
+- No CI configuration (no GitHub Actions yaml)
+
+**Frontend gaps:**
+- `src/components/documents/` — `DocumentCard.vue`, `DocumentPreviewModal.vue`, `SearchBar.vue`, `SortControls.vue` have **no tests**
+- `src/components/cloud/` — `CloudFolderTreeItem.vue`, `CloudProviderTreeItem.vue`, `CloudCredentialModal.vue` have **no tests**
+- `src/components/sharing/` — `ShareModal.vue` has **no tests**
+- `src/components/upload/` — `DropZone.vue`, `UploadProgress.vue` have **no tests**
+- `src/components/layout/` — `AppSidebar.vue`, `QuotaBar.vue` have **no tests**
+- `src/stores/documents.js` — documents store has **no tests**
+- No E2E tests (no Playwright or Cypress)
+
+## Security-Specific Tests
+
+These test files exist specifically to enforce security invariants:
+
+- `test_constant_time_auth.py` — asserts `hmac.compare_digest` used (source inspection + behavioral)
+- `test_security.py` — asserts `credentials_enc` never appears in API responses (SEC-08); asserts admin DELETE calls `storage.delete_object` (SEC-09)
+- `test_security_headers.py` — asserts CSP, X-Frame-Options, X-Content-Type-Options on every response (SEC-05)
+- `test_totp_replay.py` — asserts same TOTP code rejected on second use (AUTH-08)
+- `test_auth_api.py` — includes `test_origin_rejected` (CSRF), `test_per_account_rate_limit` (SEC-02)
+- `test_auth_deps.py` — includes wrong-owner 403, deactivated user 401, admin-blocked 403
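The behavioral half of the constant-time check can be illustrated as follows. The `tokens_match` helper is hypothetical and stands in for whichever project function compares secrets; the source-inspection half mentioned above is not shown.

```python
import hmac
from unittest.mock import patch


def tokens_match(supplied: str, expected: str) -> bool:
    """Hypothetical helper: compare secrets without an early-exit `==`."""
    return hmac.compare_digest(supplied.encode(), expected.encode())


def uses_compare_digest() -> bool:
    # Wrap the real function with a spy and confirm the helper calls it.
    with patch("hmac.compare_digest", wraps=hmac.compare_digest) as spy:
        result = tokens_match("s3cret", "s3cret")
    return result is True and spy.call_count == 1
```

Using `wraps=` keeps the real comparison in place, so the spy records the call without changing the result.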
+
+## Common Patterns
+
+**Async testing:**
+```python
+# Option 1 — per-test decorator
+@pytest.mark.asyncio
+async def test_something(async_client, auth_user):
+    resp = await async_client.get("/api/documents", headers=auth_user["headers"])
+    assert resp.status_code == 200
+
+# Option 2 — module-level mark
+pytestmark = pytest.mark.asyncio
+
+async def test_something(async_client, auth_user):
+    ...
+```
+
+**Security negative tests (wrong owner → 403/404):**
+```python
+async def test_cannot_access_other_users_document(async_client, auth_user, second_auth_user, db_session):
+    doc_id = await _make_doc(db_session, auth_user)
+    resp = await async_client.get(f"/api/documents/{doc_id}", headers=second_auth_user["headers"])
+    assert resp.status_code in (403, 404)
+```
+
+**Patching external calls:**
+```python
+with patch("services.auth.check_hibp", return_value=True) as mock_hibp:
+    resp = await authed_client.post("/api/auth/change-password", ...)
+assert resp.status_code == 422
+```
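A self-contained variant of the patching pattern above is sketched here. The `auth` namespace and `change_password` function are hypothetical stand-ins for the project's `services.auth` module and endpoint; the point is that `patch` substitutes an `AsyncMock` on its own when the target is an `async def`, and restores the original when the `with` block exits.

```python
import asyncio
import types
from unittest.mock import AsyncMock, patch


async def _real_check_hibp(password: str) -> bool:
    raise RuntimeError("would call the breach-lookup service")


# Hypothetical stand-in for the `services.auth` module.
auth = types.SimpleNamespace(check_hibp=_real_check_hibp)


async def change_password(new_password: str) -> int:
    # Reject breached passwords with 422, mirroring the test above.
    return 422 if await auth.check_hibp(new_password) else 200


def run_demo() -> tuple[int, bool, bool]:
    with patch.object(auth, "check_hibp", return_value=True) as mock_hibp:
        status = asyncio.run(change_password("hunter2"))
        was_async_mock = isinstance(mock_hibp, AsyncMock)
        mock_hibp.assert_awaited_once_with("hunter2")
    # Outside the block the original coroutine function is back in place.
    restored = auth.check_hibp is _real_check_hibp
    return status, was_async_mock, restored
```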
+
+**Frontend security invariant testing:**
+```javascript
+it('login() never writes accessToken to localStorage', async () => {
+  api.login.mockResolvedValue({ access_token: 'tok', user: {...} })
+  const store = useAuthStore()
+  await store.login('alice@example.com', 'password')
+  expect(fakeLocalStorage.setItem).not.toHaveBeenCalled()
+})
 ```

 ---

-## Gaps / Unknowns
-
- No test coverage measurement (no `pytest-cov` in `requirements.txt`)
- `test_lmstudio.py` content not inspected — unclear if it hits a real local endpoint
- No CI configuration (no GitHub Actions, no Dockerfile for test runner)
- No snapshot or contract tests for API response shapes
- Frontend is completely untested
+*Testing analysis: 2026-06-02*