docs(codebase): refresh codebase map after Phase 06.2 completion

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
curo1305
2026-06-02 15:32:06 +02:00
parent bd17b4b22f
commit 89f8d5a654
7 changed files with 1829 additions and 621 deletions
+318 -74
@@ -1,87 +1,331 @@
# TESTING — document-scanner
# Testing Patterns
_Last updated: 2026-05-21_
**Analysis Date:** 2026-06-02
## Summary
## Test Framework
The backend has solid integration test coverage across all API surfaces and services using pytest + FastAPI TestClient. Each test runs in a fully isolated temporary data directory, so there is no shared state between tests. The frontend has no test framework configured at all.
**Backend Runner:**
- pytest 8.2+ with pytest-asyncio
- Config: `backend/pytest.ini` (`asyncio_mode = auto`, `testpaths = tests`)
- `asyncio_mode = auto` means all `async def test_*` functions run as coroutines automatically, so an explicit `@pytest.mark.asyncio` mark is optional
---
**Backend Assertion Library:**
- pytest built-in `assert`
- `unittest.mock` for `AsyncMock`, `MagicMock`, `patch`
## Backend Testing
### Framework
- **pytest** + **pytest-asyncio** (`asyncio_mode = auto` in `pytest.ini`)
- **FastAPI TestClient** (synchronous test client provided by Starlette, built on `httpx`)
- No mocking library — AI calls are either tested with real parsing logic or the AI layer is swapped via provider mocking
### Test Isolation Strategy (conftest.py)
- `isolated_data_dir` fixture is `autouse=True` — every test automatically gets:
- A fresh `tmp_path/data/` directory with `uploads/`, `metadata/`
- Clean `topics.json` and `settings.json` initialized from `DEFAULT_SETTINGS`
- Monkeypatched `DATA_DIR` env var and all module-level path constants in `config` and `services.storage`
- New `FileLock` instances pointing to the tmp dir
- `client` fixture wraps FastAPI `TestClient` with the isolated data dir active
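
The directory layout that `isolated_data_dir` prepares can be sketched with a plain helper. This is only an illustration under stated assumptions: `make_isolated_data_dir` is a hypothetical name, the `DEFAULT_SETTINGS` contents are a placeholder, and the empty-list initial value for `topics.json` is assumed rather than taken from the project.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical stand-in for the project's DEFAULT_SETTINGS; the real keys are not shown here.
DEFAULT_SETTINGS = {"provider": "example"}


def make_isolated_data_dir(root: Path) -> Path:
    """Create a fresh data/ tree under root and return its path."""
    data_dir = root / "data"
    for sub in ("uploads", "metadata"):
        (data_dir / sub).mkdir(parents=True, exist_ok=True)
    # Assumption: a clean topics file is an empty JSON list.
    (data_dir / "topics.json").write_text(json.dumps([]), encoding="utf-8")
    (data_dir / "settings.json").write_text(
        json.dumps(DEFAULT_SETTINGS), encoding="utf-8"
    )
    return data_dir


# Usage: build the tree in a throwaway directory and list what was created.
with tempfile.TemporaryDirectory() as tmp:
    created = make_isolated_data_dir(Path(tmp))
    print(sorted(p.name for p in created.iterdir()))
```

In a real `conftest.py` the same helper would typically be called from an `autouse=True` fixture with pytest's `tmp_path` as the root, followed by `monkeypatch.setenv("DATA_DIR", ...)` so that the application reads the temporary tree.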
### Test Files
| File | What it covers |
|---|---|
| `test_health.py` | `GET /health` returns `{"status": "ok"}` |
| `test_documents.py` | Upload TXT/PDF (no-classify), list, get, delete; extracts text correctly |
| `test_topics.py` | Create, list, delete topics via API |
| `test_settings.py` | Read default settings, update provider config |
| `test_extractor.py` | Unit tests for `extract_text()` on TXT, PDF, DOCX, image paths |
| `test_classifier.py` | Unit tests for JSON parsing helpers (`_parse_classification`, `_parse_suggestions`, `_strip_code_fences`) — no real AI calls |
| `test_lmstudio.py` | LMStudio provider-specific behaviour (likely mocked or uses a local endpoint) |
### Fixtures Available
| Fixture | Provides |
|---|---|
| `isolated_data_dir` | Autouse — clean tmp data dir |
| `client` | FastAPI TestClient with isolated data |
| `sample_txt` | A `.txt` file with test content |
| `sample_pdf` | A minimal valid PDF created with PyMuPDF |
### What Is NOT Tested
- Auto-classification flow end-to-end (requires a live AI provider)
- Document reclassify endpoint
- Anthropic, OpenAI, Ollama provider implementations directly
- Any concurrent write / filelock contention scenarios
- File size / type validation edge cases
- Frontend — no tests exist
---
## Frontend Testing
- **No test framework installed** — `package.json` has no `vitest`, `jest`, or `@testing-library/vue`
- No test files found under `frontend/src/`
- No Cypress or Playwright configuration
---
## Running Tests
**Frontend Runner:**
- Vitest 4.1.7
- Config: `frontend/vitest.config.js` (`environment: 'happy-dom'`, `globals: true`)
- `@vue/test-utils` 2.4.10 for component mounting
**Run Commands:**
```bash
# From backend/
pytest
# Backend — from backend/ directory
pytest -v # Run all tests
pytest tests/test_auth_api.py # Single file
INTEGRATION=1 pytest -v # Run with live Docker services (PostgreSQL + MinIO + Redis)
# With verbose output
pytest -v
# Frontend — from frontend/ directory
npm test # vitest run (one-shot)
npx vitest # watch mode
```
# Single file
pytest tests/test_documents.py
## Test File Organization
**Backend location:** All tests in `backend/tests/`; flat structure, one file per concern.
**Naming:**
- `test_<area>.py`: `test_auth_api.py`, `test_documents.py`, `test_shares.py`
- `test_<layer>_<area>.py` for unit tests: `test_task2_auth_service.py`, `test_cloud_backends.py`
**Frontend location:** Co-located in `__tests__/` subdirectories next to the code they test:
- `frontend/src/stores/__tests__/auth.test.js`
- `frontend/src/components/folders/__tests__/FolderTreeItem.test.js`
- `frontend/src/views/__tests__/FileManagerView.test.js`
- `frontend/src/router/__tests__/router.guard.test.js`
## Backend Test Structure
**Standard async test (most common pattern):**
```python
@pytest.mark.asyncio
async def test_register_success(authed_client):
    """POST /api/auth/register with valid data returns 201 with id and handle."""
    resp = await _register(authed_client)
    assert resp.status_code == 201, resp.text
    data = resp.json()
    assert "id" in data
    assert data["handle"] == "testuser"
```
**Module-level async mark (newer pattern, avoids per-function decorator):**
```python
pytestmark = pytest.mark.asyncio # at module top — used in test_shares.py, test_audit.py
```
**Shared helper functions:** Each test file defines async helper functions (not fixtures) for setup operations:
```python
async def _register(async_client, handle="testuser", email="t@example.com", password="ValidPass12!"):
    return await async_client.post("/api/auth/register", json={...})
```
**ORM-direct test data creation:** Tests often insert data via ORM rather than API to test specific states:
```python
doc = Document(id=doc_id, user_id=auth_user["user"].id, ...)
db_session.add(doc)
await db_session.commit()
```
## Backend Fixtures (conftest.py)
All fixtures are async (`@pytest_asyncio.fixture`) unless purely synchronous.
**Session fixture:**
```python
@pytest_asyncio.fixture
async def db_session():
    # In-memory SQLite with PostgreSQL type shims (INET, JSONB patched to TEXT)
    # Used for all unit/integration tests without live services
    ...
```
**HTTP client fixtures:**
```python
@pytest_asyncio.fixture
async def async_client(db_session):
    # httpx.AsyncClient + ASGITransport wrapping the real FastAPI app
    # DB dependency overridden via app.dependency_overrides[get_db]
    ...
```
**Auth fixtures (shared across all API tests):**
```python
@pytest_asyncio.fixture
async def auth_user(db_session):
    # Creates User + Quota, issues JWT, returns:
    # { "user": User, "token": str, "headers": {"Authorization": "Bearer ..."} }
    ...

@pytest_asyncio.fixture
async def second_auth_user(db_session):
    # Same shape as auth_user; used for sharing tests (owner + recipient)
    ...

@pytest_asyncio.fixture
async def admin_user(db_session):
    # Same shape, role="admin"
    ...
```
**Infrastructure mocks:**
```python
@pytest.fixture
def mock_minio_presigned(monkeypatch):
    # Patches MinIOBackend.generate_presigned_put_url with AsyncMock
    ...

@pytest.fixture
def mock_minio_stat(monkeypatch):
    # Patches MinIOBackend.stat_object with AsyncMock returning 1024 bytes
    # Override per-test: mock_minio_stat.return_value = 50_000_000
    ...
```
**Cloud fixtures:**
```python
@pytest.fixture
def mock_google_drive_creds():  # Fake OAuth credential dict
    ...

@pytest.fixture
def mock_onedrive_creds():  # Fake MSAL credential dict
    ...

@pytest.fixture
async def cloud_connection_factory(db_session):
    # Factory: creates CloudConnection ORM rows
    # Usage: conn = await cloud_connection_factory(session, user_id, provider="google_drive")
    ...
```
**File fixtures:**
```python
@pytest.fixture
def sample_txt(tmp_path):  # Creates "sample.txt" in tmp_path
    ...

@pytest.fixture
def sample_pdf(tmp_path):  # Creates minimal PDF via PyMuPDF
    ...
```
## Service Availability and Integration Mode
Tests default to **in-memory SQLite** (no live services required):
- PostgreSQL-specific types (UUID, INET, JSONB) are patched via `SQLiteTypeCompiler` monkey-patching
- Tests that require PostgreSQL row-level locking semantics are marked `@pytest.mark.xfail(strict=False)`
For **live service testing**, set `INTEGRATION=1` or have Docker services running on their default ports (PostgreSQL:5432, MinIO:9000, Redis:6379). The `live_services_available()` fixture detects this.
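
The detection rule described above (explicit `INTEGRATION=1`, or all three default ports reachable) could look roughly like the sketch below. It is a plain-function approximation, not the project's fixture: `port_open`, `DEFAULT_PORTS`, the `127.0.0.1` host, and the 0.5 s timeout are all assumptions made for illustration.

```python
import os
import socket

# Default ports named in the paragraph above.
DEFAULT_PORTS = {"postgresql": 5432, "minio": 9000, "redis": 6379}


def port_open(host: str, port: int, timeout: float = 0.5) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def live_services_available(host: str = "127.0.0.1", ports=None) -> bool:
    """True when INTEGRATION=1 is set or every listed port accepts connections."""
    if os.environ.get("INTEGRATION") == "1":
        return True
    ports = DEFAULT_PORTS if ports is None else ports
    return all(port_open(host, port) for port in ports.values())


if __name__ == "__main__":
    print("live services available:", live_services_available())
```

A TCP probe only shows that something is listening on the port; it does not confirm that the listener is actually PostgreSQL, MinIO, or Redis, so a stricter check would issue a protocol-level ping.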
## Mocking
**Backend mocking:**
- `unittest.mock.patch` for external service calls: `patch("services.auth.check_hibp", return_value=True)`
- `AsyncMock` for async methods: `monkeypatch.setattr(MinIOBackend, "stat_object", mock, raising=False)`
- `FakeRedis` class defined inline in test files that need it (test_auth_api.py, test_security_headers.py, test_totp_replay.py) — in-memory dict with TTL support, mirrors Redis get/set/incr/expire interface
- Celery tasks mocked with `MagicMock`: `monkeypatch.setattr("api.documents.extract_and_classify.delay", MagicMock())`
- `app.dependency_overrides[get_db] = lambda: db_session` for DB substitution
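
As an illustration of the `FakeRedis` idea described above, the following is a minimal sketch of an in-memory dict with TTL support. It is not the project's class: the async method signatures, the injectable `clock` parameter, the string-typed return values, and the example key name are assumptions chosen to keep the example deterministic.

```python
import asyncio
import time


class FakeRedis:
    """In-memory stand-in exposing get/set/incr/expire with TTL support."""

    def __init__(self, clock=time.monotonic):
        self._data = {}      # key -> stored value (kept as str)
        self._expires = {}   # key -> absolute deadline on the clock
        self._clock = clock  # injectable so tests can control time

    def _purge(self, key):
        # Drop the key if its deadline has passed.
        deadline = self._expires.get(key)
        if deadline is not None and self._clock() >= deadline:
            self._data.pop(key, None)
            self._expires.pop(key, None)

    async def get(self, key):
        self._purge(key)
        return self._data.get(key)

    async def set(self, key, value, ex=None):
        self._data[key] = str(value)
        self._expires.pop(key, None)  # a plain set clears any previous TTL
        if ex is not None:
            self._expires[key] = self._clock() + ex
        return True

    async def incr(self, key):
        self._purge(key)
        value = int(self._data.get(key, 0)) + 1
        self._data[key] = str(value)
        return value

    async def expire(self, key, seconds):
        self._purge(key)
        if key not in self._data:
            return False
        self._expires[key] = self._clock() + seconds
        return True


async def demo():
    now = [0.0]
    redis = FakeRedis(clock=lambda: now[0])
    await redis.set("totp:alice:123456", "used", ex=30)  # hypothetical key
    before = await redis.get("totp:alice:123456")
    now[0] = 31.0  # advance the fake clock past the TTL
    after = await redis.get("totp:alice:123456")
    return before, after


print(asyncio.run(demo()))
```

Injecting the clock avoids `sleep` calls in TTL tests; a test can advance time by assigning to the list cell instead of waiting.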
**Frontend mocking:**
- `vi.mock('../../api/client.js', () => ({ login: vi.fn(), ... }))` — mock entire API module
- Individual function mocks: `const mockListFolders = vi.fn()` then `vi.mock(...)` referencing the mock
- Store mocks for component tests: `vi.mock('../../stores/auth.js', () => ({ useAuthStore: () => ({ user: {...} }) }))`
- Heavy child component stubs: `vi.mock('../../components/X.vue', () => ({ default: { template: '<div/>' } }))`
- Browser storage stubs: `Object.defineProperty(globalThis, 'localStorage', { value: fakeLocalStorage })`
## Frontend Test Structure
**Store tests (primary coverage):**
```javascript
import { describe, it, expect, vi, beforeEach } from 'vitest'
import { setActivePinia, createPinia } from 'pinia'
beforeEach(() => {
  setActivePinia(createPinia()) // fresh Pinia before each test
  vi.clearAllMocks()
})
describe('useAuthStore — behavior group', () => {
  it('describes exactly one assertion', async () => {
    api.login.mockResolvedValue({ access_token: 'tok', user: {...} })
    const store = useAuthStore()
    await store.login('u@x.com', 'pass')
    expect(store.accessToken).toBe('tok')
  })
})
```
**Component tests (mount-based):**
```javascript
import { mount, flushPromises } from '@vue/test-utils'
// ...
const wrapper = mount(ComponentName, {
  props: { item: makeItem() },
  global: { plugins: [router] }
})
await flushPromises()
expect(wrapper.find('button').exists()).toBe(false)
```
## Coverage by Area
### Backend Coverage (329 test functions across 26 test files)
| Area | Test file(s) | Coverage |
|------|-------------|----------|
| Auth API (register, login, TOTP, backup codes, refresh, logout, change-password) | `test_auth_api.py` (498 lines) | High |
| Auth service unit tests (JWT, password, TOTP, backup codes) | `test_task2_auth_service.py` | High |
| Auth dependencies (get_current_user, get_current_admin) | `test_auth_deps.py` | High |
| TOTP replay prevention (AUTH-08) | `test_totp_replay.py` (239 lines) | High |
| Per-account rate limiting (SEC-02) | `test_auth_api.py` | High |
| Documents API (list, filter, confirm, delete, PATCH, content) | `test_documents.py` (925 lines) | High |
| Quota enforcement (atomic increment, concurrent race, delete decrement) | `test_quota.py` (239 lines) | Medium — concurrent race xfail on SQLite |
| Folder API (CRUD, breadcrumb, IDOR) | `test_folders.py` (494 lines) | High |
| Sharing API (SHARE-01 through SHARE-05) | `test_shares.py` (454 lines) | High |
| Admin API (users, quotas, AI config, ADMIN-07 no-impersonation) | `test_admin_api.py` (431 lines) | High |
| Audit log (SHARE events, AUTH events, CSV export) | `test_audit.py` (355 lines) | High |
| Security headers (CSP, X-Frame-Options, nosniff) | `test_security_headers.py` | High |
| Security invariants (credentials_enc not exposed, IDOR) | `test_security.py` | High |
| Constant-time comparisons (SEC-03, hmac.compare_digest) | `test_constant_time_auth.py` | High |
| Cloud storage (CLOUD-01 through CLOUD-07, SSRF, IDOR) | `test_cloud.py` (855 lines) | High |
| Cloud backends (Google Drive, OneDrive, WebDAV, Nextcloud) | `test_cloud_backends.py`, `test_webdav_backend.py` | Medium |
| Cloud credential encryption/decryption | `test_cloud_utils.py` (273 lines) | High |
| AI classifier JSON parsing | `test_classifier.py` (266 lines) | High |
| Text extraction | `test_extractor.py` | High |
| MinIO object key schema | `test_storage.py` (277 lines) | Medium |
| Settings API | `test_settings.py` | Medium |
| Topics API | `test_topics.py` (204 lines) | High |
| Health endpoint | `test_health.py` | Low (smoke test) |
| Alembic migrations | `test_alembic.py` (246 lines) | Medium |
| LM Studio provider | `test_lmstudio.py` | Conditional — `@pytest.mark.skipif` unless reachable |
### Frontend Coverage (14 test files, ~163 test cases)
| Area | Test file | Coverage |
|------|-----------|----------|
| Auth store (login, logout, TOTP, no-browser-storage invariant) | `stores/__tests__/auth.test.js` | High |
| Folders store (fetchFolders, createFolder, rename, delete) | `stores/__tests__/folders.test.js` | High |
| Cloud connections store | `stores/__tests__/cloudConnections.test.js` | Medium |
| Router guards (meta.public, meta.layout, redirect on unauthenticated) | `router/__tests__/router.guard.test.js` | High |
| FileManagerView (folder navigation, search, sort, move, delete) | `views/__tests__/FileManagerView.test.js` | Medium |
| FolderTreeItem (expand arrow, active state) | `components/folders/__tests__/FolderTreeItem.test.js` | Medium |
| FolderBreadcrumb | `components/folders/__tests__/FolderBreadcrumb.test.js` | Medium |
| TotpEnrollment component | `components/auth/__tests__/TotpEnrollment.test.js` | Medium |
| PasswordStrengthBar component | `components/auth/__tests__/PasswordStrengthBar.test.js` | Medium |
| AdminUsersTab component | `components/admin/__tests__/AdminUsersTab.test.js` | Medium |
| AdminQuotasTab component | `components/admin/__tests__/AdminQuotasTab.test.js` | Medium |
| AdminAiConfigTab component | `components/admin/__tests__/AdminAiConfigTab.test.js` | Medium |
| SettingsAccountTab component | `components/settings/__tests__/SettingsAccountTab.test.js` | Medium |
| SettingsCloudTab component | `components/settings/__tests__/SettingsCloudTab.test.js` | Medium |
## Test Gaps
**Backend gaps:**
- `test_storage.py` — MinIO object key tests are largely `xfail(strict=False)` waiting for module implementation
- Concurrent quota race (`test_concurrent_quota_race`) is `xfail(strict=False)` — requires PostgreSQL row-level locking
- Delete quota decrement (`test_delete_decrements_quota`) is `xfail(strict=False)` on SQLite
- No `pytest-cov` — no coverage measurement enforced
- No CI configuration (no GitHub Actions yaml)
**Frontend gaps:**
- `src/components/documents/`: `DocumentCard.vue`, `DocumentPreviewModal.vue`, `SearchBar.vue`, `SortControls.vue` have **no tests**
- `src/components/cloud/`: `CloudFolderTreeItem.vue`, `CloudProviderTreeItem.vue`, `CloudCredentialModal.vue` have **no tests**
- `src/components/sharing/`: `ShareModal.vue` has **no tests**
- `src/components/upload/`: `DropZone.vue`, `UploadProgress.vue` have **no tests**
- `src/components/layout/`: `AppSidebar.vue`, `QuotaBar.vue` have **no tests**
- `src/stores/documents.js` — documents store has **no tests**
- No E2E tests (no Playwright or Cypress)
## Security-Specific Tests
These test files exist specifically to enforce security invariants:
- `test_constant_time_auth.py` — asserts `hmac.compare_digest` used (source inspection + behavioral)
- `test_security.py` — asserts `credentials_enc` never appears in API responses (SEC-08); asserts admin DELETE calls `storage.delete_object` (SEC-09)
- `test_security_headers.py` — asserts CSP, X-Frame-Options, X-Content-Type-Options on every response (SEC-05)
- `test_totp_replay.py` — asserts same TOTP code rejected on second use (AUTH-08)
- `test_auth_api.py` — includes `test_origin_rejected` (CSRF), `test_per_account_rate_limit` (SEC-02)
- `test_auth_deps.py` — includes wrong-owner 403, deactivated user 401, admin-blocked 403
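
The "source inspection + behavioral" approach mentioned for `test_constant_time_auth.py` can be sketched as follows. `tokens_match` is a hypothetical helper standing in for whatever comparison function the real tests inspect; only `hmac.compare_digest` and `inspect.getsource` are standard-library calls.

```python
import hmac
import inspect


def tokens_match(supplied: str, expected: str) -> bool:
    """Hypothetical comparison helper that avoids early-exit string equality."""
    return hmac.compare_digest(supplied.encode("utf-8"), expected.encode("utf-8"))


def source_uses_compare_digest(func) -> bool:
    """Source inspection: the function body must mention hmac.compare_digest."""
    return "hmac.compare_digest" in inspect.getsource(func)


def behaves_correctly(func) -> bool:
    """Behavioral check: equal inputs match, different inputs do not."""
    return (
        func("secret", "secret")
        and not func("secret", "secreT")
        and not func("secret", "secret-longer")
    )


if __name__ == "__main__":
    print(source_uses_compare_digest(tokens_match), behaves_correctly(tokens_match))
```

The source check guards against a later refactor silently replacing the call with `==`, which the behavioral check alone cannot detect because both comparisons return the same results.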
## Common Patterns
**Async testing:**
```python
# Option 1 — per-test decorator
@pytest.mark.asyncio
async def test_something(async_client, auth_user):
    resp = await async_client.get("/api/documents", headers=auth_user["headers"])
    assert resp.status_code == 200
# Option 2 — module-level mark
pytestmark = pytest.mark.asyncio
async def test_something(async_client, auth_user):
    ...
```
**Security negative tests (wrong owner → 403/404):**
```python
async def test_cannot_access_other_users_document(async_client, auth_user, second_auth_user, db_session):
    doc_id = await _make_doc(db_session, auth_user)
    resp = await async_client.get(f"/api/documents/{doc_id}", headers=second_auth_user["headers"])
    assert resp.status_code in (403, 404)
```
**Patching external calls:**
```python
with patch("services.auth.check_hibp", return_value=True) as mock_hibp:
    resp = await authed_client.post("/api/auth/change-password", ...)
    assert resp.status_code == 422
```
**Frontend security invariant testing:**
```javascript
it('login() never writes accessToken to localStorage', async () => {
  api.login.mockResolvedValue({ access_token: 'tok', user: {...} })
  const store = useAuthStore()
  await store.login('alice@example.com', 'password')
  expect(fakeLocalStorage.setItem).not.toHaveBeenCalled()
})
```
---
## Gaps / Unknowns
- No test coverage measurement (no `pytest-cov` in `requirements.txt`)
- `test_lmstudio.py` content not inspected — unclear if it hits a real local endpoint
- No CI configuration (no GitHub Actions, no Dockerfile for test runner)
- No snapshot or contract tests for API response shapes
- Frontend is completely untested
*Testing analysis: 2026-06-02*