docs(codebase): refresh codebase map after Phase 06.2 completion

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
curo1305
2026-06-02 15:32:06 +02:00
parent bd17b4b22f
commit 89f8d5a654
7 changed files with 1829 additions and 621 deletions
+210 -88
@@ -1,94 +1,216 @@
# CONVENTIONS — document-scanner
# Coding Conventions
_Last updated: 2026-05-21_
**Analysis Date:** 2026-06-02
## Summary
## Naming Patterns
The codebase follows standard Python and Vue 3 conventions without heavy tooling enforcement. Backend uses async/await throughout with type hints on public interfaces. Frontend uses Vue Options API with Pinia stores as the data layer. No linter or formatter configuration is committed.
**Python files:**
- `snake_case` throughout — `auth.py`, `cloud_utils.py`, `document_tasks.py`
- Modules named for their responsibility, not their layer (e.g., `services/auth.py`, `services/audit.py`)
**Python functions:**
- `snake_case` for all functions and methods: `hash_password`, `verify_password`, `create_access_token`, `write_audit_log`
- Private helpers prefixed with underscore: `_set_refresh_cookie`, `_port_open`, `_set_doc_user_id`
- Async functions use same convention — no `async_` prefix
**Python classes:**
- `PascalCase` for ORM models and Pydantic models: `User`, `Document`, `RegisterRequest`, `DocumentPatch`
- Request/response models end in `Request` or `Response`: `RegisterRequest`, `LoginRequest`, `ChangePasswordRequest`
**Python variables:**
- `snake_case`: `user_id`, `access_token`, `used_bytes`, `credentials_enc`
- Constants use `UPPER_SNAKE_CASE`: `_PASSWORD_DETAIL` (underscore prefix when module-private)
- Module-level singletons prefixed underscore: `_pwd`, `_CLOUD_PROVIDERS`
**DB column naming:**
- `snake_case` for all columns: `user_id`, `password_hash`, `is_active`, `created_at`
- Exception: ORM attribute `metadata_` maps to DB column `metadata` (reserved SQLAlchemy name)
- Timestamp columns use `_at` suffix: `created_at`, `used_at`
- Boolean columns use `is_` or no prefix: `is_active`, `totp_enabled`, `password_must_change`
**Frontend files:**
- Vue components: `PascalCase`, e.g. `DocumentCard.vue`, `FolderTreeItem.vue`, `StorageBrowser.vue`
- Stores: `camelCase.js`, e.g. `auth.js`, `documents.js`, `cloudConnections.js`
- Utilities: `camelCase.js`, e.g. `formatters.js`
- API client: single file `src/api/client.js`
- Test files: `ComponentName.test.js` or `storeName.test.js` inside `__tests__/` subdirectory
**Frontend functions and variables:**
- `camelCase`: `formatDate`, `formatSize`, `providerColor`, `fetchDocuments`, `uploadToMinIO`
- Store composables use `use` prefix: `useAuthStore`, `useFoldersStore`, `useDocumentsStore`
- Private helpers prefixed with an underscore: `_refreshInFlight`
- Event names emitted from components: `kebab-case`, e.g. `'breadcrumb-navigate'`, `'folder-create'`, `'file-open'`
## Code Style
**Formatting:**
- No Prettier, ESLint, Black, or Ruff config committed — style maintained by convention only
- Backend follows PEP 8 organically; 4-space indentation
- Tailwind CSS utility classes applied inline in Vue templates; no scoped `<style>` blocks used
**Python style specifics:**
- `from __future__ import annotations` at top of all `api/` and `services/` files (all 8 api/ files confirmed)
- `Optional[X]` used instead of `X | None` union syntax — maintained for Python < 3.10 compatibility even though runtime is 3.12
- Type annotations on all function signatures and ORM `Mapped[...]` column declarations
- Docstrings present on all public functions and modules; module docstrings explain invariants and phase context
**Vue/JS style specifics:**
- `<script setup>` Composition API used for ALL Vue components — no Options API exists (all 30+ components confirmed)
- Pinia stores use setup function syntax (not options syntax): `defineStore('name', () => { ... })`
- `ref()` for all reactive state; `computed()` for derived values; `watch()` for side effects
- Props always explicitly typed: `{ type: Object, required: true }`
- `emits` declared on components that emit events
## Import Organization
**Python imports (consistent order across all api/ and services/ files):**
1. `from __future__ import annotations` (first line, when present)
2. Standard library (`import uuid`, `import hashlib`, `import logging`)
3. Third-party (`from fastapi import ...`, `from sqlalchemy import ...`, `from pydantic import ...`)
4. Internal (`from config import settings`, `from db.models import ...`, `from deps.auth import ...`, `from services import ...`)
Example from `backend/api/auth.py`:
```python
from __future__ import annotations

import uuid
from typing import Literal, Optional

from fastapi import APIRouter, Depends, HTTPException, Request, Response, status
from pydantic import BaseModel, EmailStr
from sqlalchemy import select
from sqlalchemy.ext.asyncio import AsyncSession

from config import settings
from db.models import BackupCode, Quota, RefreshToken, User
from deps.auth import get_current_user
from deps.db import get_db
from services import auth as auth_service
```
**Frontend imports (consistent order):**
1. `import { ... } from 'vue'` — Vue composables
2. `import { ... } from 'vue-router'` — router composables
3. `import { useXStore } from '../stores/x.js'` — Pinia stores
4. `import * as api from '../../api/client.js'` — API client (namespace import)
5. `import ChildComponent from './ChildComponent.vue'` — child components
6. `import { formatDate } from '../../utils/formatters.js'` — shared utilities
**Path resolution:** Relative paths throughout — no `@/` alias configured.
## Error Handling
**Backend — service vs API layer separation (strict pattern):**
- `services/` functions raise `ValueError` with descriptive messages — NEVER `HTTPException`
- `api/` handlers catch `ValueError` and map to HTTP status codes
- Pattern from `api/auth.py`:
```python
try:
    auth_service.validate_password_strength(body.new_password)
except ValueError as exc:
    raise HTTPException(status_code=status.HTTP_422_UNPROCESSABLE_ENTITY, detail=str(exc))
```
**HTTP status codes used:**
- `201` — resource created (register, share, folder)
- `401` — unauthenticated or wrong credentials
- `403` — forbidden (wrong role, wrong owner, admin blocked from document content)
- `404` — not found
- `409` — conflict (duplicate email/handle)
- `413` — quota exceeded
- `422` — validation failure (weak password, invalid field value)
- `429` — rate limited
**Audit log exceptions:**
- `services/audit.py` `write_audit_log()` catches all exceptions and calls `logger.warning()`
- Audit failure MUST NOT abort the primary operation — no re-raise under any circumstance
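The swallow-and-warn rule above can be sketched as follows. This is a minimal, synchronous illustration under stated assumptions: the `writer` callable is a hypothetical stand-in for the DB insert, and the boolean return value is added only to make the behaviour observable. The project's real `write_audit_log()` signature is not shown in this document.

```python
import logging

logger = logging.getLogger(__name__)


def write_audit_log(writer, event: str, **details) -> bool:
    """Record an audit event; never let a failure reach the caller.

    `writer` is a hypothetical callable standing in for the DB insert.
    """
    try:
        writer(event, details)
        return True
    except Exception as exc:  # deliberately broad: audit must not abort the request
        # %-style arguments, so formatting is left to the logging module
        logger.warning("audit log write failed: %s", exc)
        return False


def broken_writer(event, details):
    """Simulate an unavailable audit backend."""
    raise RuntimeError("database unavailable")


# The primary operation continues even though the audit write failed.
ok = write_audit_log(broken_writer, "login", user_id="u-1")
```

The caller never sees the `RuntimeError`; only a warning is logged.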
**Frontend error handling:**
- Stores catch errors and set `error.value = e.message`; `loading.value` always reset in `finally`
- `api/client.js` `request()` throws `Error` with `.status` and optional `.payload` properties
- On 401: automatic single-retry after `authStore.refresh()`; on refresh failure throws `'Session expired'`
## Logging
**Framework:** Python `logging` module with `logger = logging.getLogger(__name__)` per module.
**Patterns:**
- `%`-style format strings (never f-strings in log calls): `logger.warning("audit log write failed: %s", exc)`
- `logger.info` for successful notable operations; `logger.warning` for non-fatal failures; `logger.error` for operation failures
- Never log secrets, tokens, passwords, or PII
- Auth events, quota violations, and admin actions are written to the `AuditLog` DB table via `write_audit_log()` — not the Python logger
**Frontend:** No logging framework — `console.*` not used in production code.
## Comments
**Module docstrings — every backend module has:**
- Summary of what it implements (with HTTP endpoint paths)
- Security invariants it enforces (with REQ-IDs: `SEC-02`, `AUTH-07`, `D-04`)
- Plan/phase traceability note
**Inline comments:**
- Security-sensitive lines carry rationale: `# CLAUDE.md constraint`, `# SEC-06`, `# T-03-22`
- SQLAlchemy quirks explained inline where non-obvious
- `# ── Section Name ──────` horizontal rules separate logical sections within long files
**Test docstrings:**
- Every test function has a one-line docstring describing what it asserts: `"""POST /api/auth/register with valid data returns 201 with id and handle."""`
## Function Design
**Backend:**
- Single responsibility per function — auth service functions do exactly one thing
- DB-touching functions are `async` and take `AsyncSession` as a parameter
- Pydantic `@field_validator` used for complex field constraints (e.g., `filename_no_path_separators`)
**Frontend:**
- Store actions are `async` functions defined inside `defineStore` setup
- Utility functions in `src/utils/formatters.js` are pure — no side effects, no imports
- Test factory helpers follow `makeFolder(overrides = {})` pattern — spread overrides over defaults
## Module Design
**Backend:**
- All routers named `router`: `router = APIRouter(prefix="/api/...", tags=[...])`
- Settings singleton: `settings = Settings()` at bottom of `config.py`; imported as `from config import settings`
- No `__all__` declarations — convention limits what callers import
**Frontend:**
- Named exports from stores: `export const useAuthStore = defineStore(...)`
- Named exports from utilities: `export function formatDate(iso) { ... }`
- Default exports from Vue components (implicit via `<script setup>`)
- `src/api/client.js`: named exports only; `request()` is unexported internal helper
## Backend Dependency Injection
FastAPI `Depends()` is used for all cross-cutting concerns. Four standard dependencies in `backend/deps/`:
- `get_db` (`deps/db.py`) — yields `AsyncSession`; overridden in tests with in-memory SQLite session
- `get_current_user` (`deps/auth.py`) — validates Bearer JWT, returns `User`; raises 401
- `get_current_admin` (`deps/auth.py`) — delegates to `get_current_user`, checks `role == 'admin'`; raises 403
- `get_regular_user` (`deps/auth.py`) — delegates to `get_current_user`, blocks `role == 'admin'`; raises 403
Usage pattern in route handlers:
```python
@router.get("/protected")
async def protected_endpoint(
    current_user: User = Depends(get_regular_user),
    session: AsyncSession = Depends(get_db),
):
    ...
```
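The two role guards described above might look roughly like this. The sketch is framework-free so it runs on its own: `HTTPError` is a hypothetical stand-in for FastAPI's `HTTPException`, the `User` dataclass replaces the ORM model, the `"user"` role value is assumed, and the already-authenticated user is passed in directly instead of being resolved through `Depends(get_current_user)`.

```python
from dataclasses import dataclass


class HTTPError(Exception):
    """Hypothetical stand-in for fastapi.HTTPException."""

    def __init__(self, status_code: int, detail: str):
        super().__init__(detail)
        self.status_code = status_code
        self.detail = detail


@dataclass
class User:
    id: str
    role: str  # "admin" or "user" (assumed role values)


def get_current_admin(current_user: User) -> User:
    """Allow only admins; everyone else gets 403."""
    if current_user.role != "admin":
        raise HTTPError(403, "Admin access required")
    return current_user


def get_regular_user(current_user: User) -> User:
    """Block admins from document endpoints; everyone else passes."""
    if current_user.role == "admin":
        raise HTTPError(403, "Admins cannot access document content")
    return current_user
```

Keeping the role check in a dependency means a route handler only declares which guard it needs and never repeats the comparison itself.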
## Security-Enforced Invariants in Code
The following patterns are mandatory and must not be deviated from:
- **Token storage:** `accessToken` lives only in Pinia `ref()` — never `localStorage`, never `sessionStorage`
- **Refresh cookie:** `httponly=True, secure=True, samesite="strict"` on every `set_cookie` call
- **Ownership check:** every document/folder/share endpoint asserts `resource.user_id == current_user.id`
- **Object keys:** `{user_id}/{document_id}/{uuid4()}{ext}` — human filename stored in DB only
- **Quota:** atomic `UPDATE quotas SET used_bytes = used_bytes + $delta WHERE (used_bytes + $delta) <= limit_bytes RETURNING used_bytes` — never read-then-write
- **Admin exclusion:** admin accounts blocked from all `/api/documents/*` endpoints via `get_regular_user`
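To illustrate why the quota rule insists on a single guarded `UPDATE`, here is a small sketch using the standard-library `sqlite3` module instead of the project's async database session. Assumptions: the table layout, the `user_id` filter, and the `reserve_bytes` helper are hypothetical, and `cursor.rowcount` is used in place of `RETURNING` so the example also runs on older SQLite builds.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE quotas (user_id TEXT PRIMARY KEY, used_bytes INTEGER, limit_bytes INTEGER)"
)
conn.execute("INSERT INTO quotas VALUES ('u-1', 900, 1000)")


def reserve_bytes(conn: sqlite3.Connection, user_id: str, delta: int) -> bool:
    """Reserve `delta` bytes in one statement; False means quota exceeded."""
    cur = conn.execute(
        "UPDATE quotas SET used_bytes = used_bytes + ? "
        "WHERE user_id = ? AND (used_bytes + ?) <= limit_bytes",
        (delta, user_id, delta),
    )
    conn.commit()
    # rowcount is 1 only if the WHERE guard passed, so there is no
    # separate read that a concurrent request could race against.
    return cur.rowcount == 1


first = reserve_bytes(conn, "u-1", 100)  # fits exactly: 900 + 100 <= 1000
second = reserve_bytes(conn, "u-1", 1)   # would exceed the limit, row untouched
```

A `False` result here corresponds to the `413` quota-exceeded response listed under the HTTP status codes.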
---
## Python Conventions (Backend)
### Naming
- Files: `snake_case.py`
- Classes: `PascalCase` (e.g., `AnthropicProvider`, `ClassificationResult`)
- Functions/variables: `snake_case`
- Constants: `UPPER_SNAKE_CASE` (e.g., `MAX_STORED_CHARS`, `DATA_DIR`)
- Private helpers: leading underscore (e.g., `_extract_pdf`, `_parse_classification`)
### Async
- All API endpoint functions are `async def`
- All `AIProvider` methods are `async def`
- `pytest-asyncio` with `asyncio_mode=auto` (set in `pytest.ini`)
### Type Hints
- Used on public function signatures in `ai/` layer and `services/`
- Dataclass used for `ClassificationResult` (`@dataclass` with `field(default_factory=...)`)
- Not used consistently in `api/` routers (rely on FastAPI/Pydantic implicit validation)
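As a reminder of why `field(default_factory=...)` matters for a result type like `ClassificationResult`, consider the sketch below. The field names are illustrative assumptions, not the project's actual schema.

```python
from dataclasses import dataclass, field


@dataclass
class ClassificationResult:
    # Field names here are illustrative assumptions, not the project's schema.
    summary: str = ""
    # default_factory gives every instance its own list; a bare `= []`
    # default is rejected by @dataclass because it would be shared.
    topics: list[str] = field(default_factory=list)


a = ClassificationResult()
b = ClassificationResult()
a.topics.append("invoices")  # mutating one instance leaves the other untouched
```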
### Error Handling
- `extractor.py` wraps all extraction in `try/except Exception` and returns error strings (never raises)
- AI providers raise on hard failures; caller (`classifier.py`) is responsible for propagating
- No global exception handler registered in `main.py`
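The "returns error strings, never raises" contract of `extractor.py` could be sketched like this. The function name, the plain-text read, and the error string format are all hypothetical; the real extractor handles PDFs, images, and DOCX files.

```python
from pathlib import Path


def extract_text(path: str) -> str:
    """Return extracted text, or an error string if anything goes wrong."""
    try:
        return Path(path).read_text(encoding="utf-8")
    except Exception as exc:
        # Callers receive a string either way and must inspect it themselves.
        return f"[extraction error: {exc}]"


result = extract_text("/nonexistent/file.txt")
```

The trade-off is that callers cannot tell success from failure by type alone, so they need to check the returned string.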
### Imports
- Standard library first, then third-party, then local — not enforced by isort
- Heavy library imports (`fitz`, `pytesseract`, `docx`) are deferred inside functions to avoid import-time cost when unused
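A deferred import, as described for the heavy libraries above, moves the `import` statement into the function that needs it. The sketch below uses the standard-library `csv` module purely as a stand-in for `fitz`, `pytesseract`, or `docx`; the function itself is hypothetical.

```python
def parse_csv_line(line: str) -> list[str]:
    """Split one CSV line; `csv` is loaded on first call, not at module import."""
    import csv  # deferred: stands in for a heavy dependency such as fitz

    return next(csv.reader([line]))


fields = parse_csv_line('a,"b,c",d')
```

Python caches modules in `sys.modules`, so repeated calls pay the import cost only once.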
### Module Docstrings
- Present on `extractor.py` and `test_classifier.py`; absent elsewhere
---
## JavaScript / Vue Conventions (Frontend)
### Naming
- Vue files: `PascalCase.vue` (e.g., `DocumentCard.vue`, `AppSidebar.vue`)
- Pinia stores: `camelCase` filename matching store ID (e.g., `documents.js` → `useDocumentsStore`)
- Views: `<Name>View.vue` suffix
- Components grouped by domain in subdirectories: `documents/`, `topics/`, `upload/`, `layout/`
### Vue Style
- Options API used throughout (not Composition API)
- Props defined with type and default; no `defineProps` (Options API syntax)
- `v-model`, `v-for`, `v-if` used directly in templates
### Pinia Pattern
- Each store encapsulates `state`, `getters`, and `actions`
- Actions call `src/api/client.js` — components never import `client.js` directly
- Stores are the single source of truth; views read from store state
### API Client
- `src/api/client.js` is the sole HTTP adapter
- All paths are prefixed `/api/` (proxied to backend in dev via Vite config)
### Styling
- Tailwind CSS utility classes used directly in templates
- No scoped `<style>` blocks observed in component list
- Global styles in `src/style.css`
---
## API Design Conventions (Backend)
- All endpoints prefixed `/api/` (set per router)
- JSON responses; multipart for file upload
- HTTP verbs follow REST: GET list, GET by ID, POST create, PUT/PATCH update, DELETE remove
- No versioning (`/api/v1/`) — flat namespace
---
## Configuration
- Runtime paths controlled entirely by `DATA_DIR` env var (defaults to `/app/data`)
- AI settings persisted in `data/settings.json` — no env var overrides at runtime for provider config (except `ANTHROPIC_API_KEY` / `OPENAI_API_KEY` noted in `.env.example`)
- No `.env` loading in backend code — env vars passed via Docker Compose `environment:` block
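Resolving the data directory from the environment, with the documented `/app/data` default, might look like the following. The helper name and the exact location of `settings.json` under the data directory are assumptions made for illustration.

```python
import os
from pathlib import Path


def resolve_data_dir(env=os.environ) -> Path:
    """Return DATA_DIR from the given environment mapping, or the default."""
    return Path(env.get("DATA_DIR", "/app/data"))


# With no DATA_DIR set, everything lives under the container default.
# Placing settings.json directly under the data dir is an assumption.
settings_path = resolve_data_dir({}) / "settings.json"
```

Passing the environment as a parameter keeps the helper easy to test without touching `os.environ`.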
---
## Gaps / Unknowns
- No ESLint, Prettier, Black, or Ruff configuration committed
- No pre-commit hooks
- No consistent JSDoc or Python docstring coverage
*Convention analysis: 2026-06-02*