Architecture Research

Domain: Multi-user SaaS document management platform (FastAPI + Vue 3 brownfield migration)
Researched: 2026-05-21
Confidence: HIGH (auth DI pattern confirmed via official FastAPI docs; storage/DB patterns are well-established S3/PostgreSQL engineering standards cross-verified against official MinIO and SQLAlchemy docs)


Standard Architecture

System Overview

┌──────────────────────────────────────────────────────────────────────┐
│                          Browser (Vue 3 SPA)                          │
│   ┌───────────┐  ┌──────────────┐  ┌───────────┐  ┌──────────────┐  │
│   │ auth store│  │ docs store   │  │quota store│  │settings store│  │
│   └─────┬─────┘  └──────┬───────┘  └─────┬─────┘  └──────┬───────┘  │
│         └───────────────┴────────────────┴────────────────┘          │
│                         api/client.js (Bearer token injected)         │
└───────────────────────────────────┬──────────────────────────────────┘
                                    │ HTTPS/JSON + multipart
                          ┌─────────▼─────────┐
                          │   Load Balancer    │ (future; optional now)
                          └────────┬──────────┘
               ┌───────────────────┼───────────────────┐
               │                   │                   │
    ┌──────────▼──────┐  ┌─────────▼──────┐  ┌────────▼───────┐
    │  FastAPI inst 1  │  │ FastAPI inst 2  │  │ FastAPI inst N │
    │  (stateless)     │  │  (stateless)    │  │  (stateless)   │
    └──────────┬───────┘  └────────┬────────┘  └────────┬───────┘
               └───────────────────┼───────────────────┘
                         ┌─────────▼──────────┐
                         │   Shared Services   │
              ┌──────────┴──────────────────────┴─────────┐
              │                                            │
   ┌──────────▼──────────┐               ┌────────────────▼──────┐
   │     PostgreSQL       │               │        MinIO           │
   │  (users, docs, meta, │               │  (object storage,      │
   │   quotas, audit)     │               │   single bucket,       │
   └──────────────────────┘               │   prefix-per-user)     │
                                          └───────────────────────┘
                                                    │
                         ┌──────────────────────────┼─────────────────┐
                         │                          │                 │
              ┌──────────▼────────┐    ┌────────────▼───────┐   ┌────▼──────┐
              │ Cloud Storage     │    │   OneDrive Adapter │   │  WebDAV   │
              │ Adapter (base)    │    │   Google Drive     │   │  Adapter  │
              └───────────────────┘    └────────────────────┘   └───────────┘

Component Boundaries

| Component | Responsibility | Communicates With |
|---|---|---|
| api/auth.py | Registration, login, token refresh, TOTP enroll/verify | services/user_service.py, DB |
| api/documents.py | Upload, list, get, delete, reclassify, share | services/document_service.py, quota dep |
| api/folders.py | Folder CRUD, move | services/folder_service.py |
| api/storage_backends.py | Connect/disconnect cloud accounts, list/browse | services/cloud_service.py |
| api/admin.py | User CRUD, quota adjustments, audit log, AI config | services/admin_service.py |
| deps/auth.py | get_current_user: verifies JWT, returns User model | DB, jose/PyJWT |
| deps/quota.py | check_upload_quota: reads user's usage, raises 413 if exceeded | DB |
| deps/db.py | get_db: yields async SQLAlchemy session | PostgreSQL |
| services/document_service.py | Orchestrates extract → classify → store flow | extractor, classifier, storage_service |
| services/storage_service.py | Routes to MinIO or cloud adapter; enforces object key namespacing | MinIO, cloud adapters |
| services/user_service.py | Password hashing, TOTP provisioning, breach check | DB, bcrypt, pyotp |
| services/quota_service.py | Compute used bytes from DB, update after upload/delete | DB |
| services/audit_service.py | Append-only audit log writes | DB |
| services/cloud_service.py | Manage encrypted cloud credentials, proxy operations | Cloud adapters, DB |
| storage/base.py | StorageBackend ABC (mirrors ai/base.py pattern) | |
| storage/minio_backend.py | MinIO S3 implementation | MinIO |
| storage/onedrive_backend.py | OneDrive Graph API implementation | Microsoft Graph |
| storage/gdrive_backend.py | Google Drive API implementation | Google Drive API |
| storage/nextcloud_backend.py | Nextcloud WebDAV implementation | WebDAV |
| db/models.py | SQLAlchemy ORM models | PostgreSQL |
| db/migrations/ | Alembic migration history | |

backend/
├── main.py                     # FastAPI app factory, middleware, router registration
├── config.py                   # pydantic-settings: DB URL, MinIO creds, secret keys
├── deps/
│   ├── auth.py                 # get_current_user, get_current_admin
│   ├── db.py                   # get_db (async session dependency)
│   └── quota.py                # check_upload_quota (raises 413 if exceeded)
├── api/
│   ├── auth.py                 # /auth/register, /auth/login, /auth/refresh, /auth/totp/*
│   ├── documents.py            # /documents/* (existing routes, now user-scoped)
│   ├── folders.py              # /folders/*
│   ├── storage_backends.py     # /storage-backends/* (cloud account management)
│   └── admin.py                # /admin/* (users, quotas, audit, AI config)
├── services/
│   ├── document_service.py     # upload orchestration (extract → classify → store → quota)
│   ├── storage_service.py      # routes uploads to correct StorageBackend
│   ├── quota_service.py        # read/write quota usage
│   ├── user_service.py         # user creation, password, TOTP
│   ├── audit_service.py        # audit log writes
│   └── cloud_service.py        # cloud backend credential management
├── storage/                    # cloud storage adapter layer (mirrors ai/)
│   ├── base.py                 # StorageBackend ABC
│   ├── __init__.py             # get_storage_backend() factory
│   ├── minio_backend.py        # default local-S3 backend
│   ├── onedrive_backend.py
│   ├── gdrive_backend.py
│   └── nextcloud_backend.py    # WebDAV-based
├── ai/                         # unchanged — existing provider abstraction
│   └── ...
├── db/
│   ├── models.py               # all SQLAlchemy ORM models
│   ├── session.py              # async engine + sessionmaker
│   └── migrations/             # Alembic env + version scripts
└── tests/

Structure Rationale

  • deps/: FastAPI dependency functions isolated from service logic. Auth, DB session, and quota are injected independently — routes compose them without coupling.
  • storage/: Direct mirror of ai/ module. Same ABC + factory pattern. Existing team mental model applies immediately.
  • db/: ORM models and session config separated from services, ensuring migrations can be run independently of app startup.

Architectural Patterns

Pattern 1: JWT Verification via Dependency Injection (not middleware)

What: JWT parsing and user lookup happens in deps/auth.py::get_current_user, injected via Depends() per route.

When to use: All authenticated routes. Admin routes additionally inject get_current_admin which calls get_current_user then checks user.role == "admin".

Trade-offs: Unauthenticated routes (health check, login, register) require no special exclusion logic. Middleware-based auth forces you to maintain an allowlist of public routes — that list inevitably drifts. DI is opt-in per route, which is safer.

Example:

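# deps/auth.py
from typing import Annotated

from fastapi import Depends, HTTPException, status
from fastapi.security import OAuth2PasswordBearer
from jose import JWTError, jwt
from sqlalchemy.ext.asyncio import AsyncSession

# settings, get_db, and User come from config.py, deps/db.py, and db/models.py.

oauth2_scheme = OAuth2PasswordBearer(tokenUrl="/auth/login")

# One shared 401 so every failure path looks identical to the client.
credentials_exception = HTTPException(
    status_code=status.HTTP_401_UNAUTHORIZED,
    detail="Could not validate credentials",
    headers={"WWW-Authenticate": "Bearer"},
)

async def get_current_user(
    token: Annotated[str, Depends(oauth2_scheme)],
    db: Annotated[AsyncSession, Depends(get_db)],
) -> User:
    try:
        payload = jwt.decode(token, settings.jwt_secret, algorithms=["HS256"])
        user_id: str | None = payload.get("sub")
        if user_id is None:
            raise credentials_exception
    except JWTError:
        raise credentials_exception
    # One primary-key lookup per request, so deactivated users are rejected.
    user = await db.get(User, user_id)
    if user is None or not user.is_active:
        raise credentials_exception
    return user

# api/documents.py
@router.get("/documents")
async def list_documents(
    current_user: Annotated[User, Depends(get_current_user)],
    db: Annotated[AsyncSession, Depends(get_db)],
):
    ...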

Confirmed: HIGH confidence. FastAPI's official security tutorial implements authentication this way, as a Depends() dependency rather than as middleware.


Pattern 2: Refresh Token Rotation

What: Short-lived access tokens (15 min) + long-lived refresh tokens (30 days) stored in refresh_tokens table. On every /auth/refresh call, the old token is invalidated and a new pair is issued.

When to use: Always, for multi-user SaaS. Prevents a stolen token from granting indefinite access.

Trade-offs: Requires refresh_tokens DB table and one extra DB write per refresh. The alternative (long-lived JWTs) cannot be revoked without a blocklist, which has the same cost.

Implementation notes:

  • Refresh token = opaque random UUID (not JWT) — store hashed in DB alongside user_id, expires_at, revoked
  • Access token = JWT with sub=user_id, exp=now+15m, jti=uuid (for an optional future blocklist)
  • On logout or password change: set revoked=true on all user's refresh tokens
  • On TOTP failure after password success: do not issue any token; log failed_mfa audit event
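
The rotation rules above can be sketched without a database. This is a minimal illustration, not the planned implementation: `InMemoryRefreshStore` and `RefreshRow` are hypothetical stand-ins for the refresh_tokens table, and only the standard library is used.

```python
import hashlib
import uuid
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

REFRESH_TTL = timedelta(days=30)


@dataclass
class RefreshRow:
    """Stand-in for one row of the refresh_tokens table."""
    user_id: str
    expires_at: datetime
    revoked: bool = False


def hash_token(token: str) -> str:
    # Only the SHA-256 digest is stored; the raw token is never persisted.
    return hashlib.sha256(token.encode()).hexdigest()


class InMemoryRefreshStore:
    """Hypothetical store keyed by token hash; the real one would be SQL."""

    def __init__(self) -> None:
        self.rows = {}

    def issue(self, user_id: str, now: datetime) -> str:
        token = str(uuid.uuid4())  # opaque random UUID, not a JWT
        self.rows[hash_token(token)] = RefreshRow(user_id, now + REFRESH_TTL)
        return token

    def rotate(self, token: str, now: datetime) -> str:
        row = self.rows.get(hash_token(token))
        if row is None or row.revoked or row.expires_at <= now:
            raise PermissionError("invalid refresh token")
        row.revoked = True  # the old token can never be replayed
        return self.issue(row.user_id, now)

    def revoke_all(self, user_id: str) -> None:
        # Logout or password change: revoke every token the user holds.
        for row in self.rows.values():
            if row.user_id == user_id:
                row.revoked = True
```

Replaying an already-rotated token fails the `revoked` check, which is the property that limits the lifetime of a stolen refresh token.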

Pattern 3: MinIO Presigned URL Flow (preferred over streaming proxy)

What: FastAPI generates a short-lived presigned PUT URL from MinIO; the browser uploads directly to MinIO. For downloads, FastAPI generates a presigned GET URL and redirects.

When to use: All document uploads and downloads where the client is a browser on the same network as MinIO (typical Docker Compose deployment). Use streaming proxy only when MinIO is not reachable from the browser (e.g., MinIO is behind an internal network).

Trade-offs:

  • Presigned URL avoids buffering the file through FastAPI — reduces memory pressure and latency significantly for large files.
  • The FastAPI instance must be able to reach MinIO to generate the URL, but does not need to handle the byte stream.
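  • For Docker Compose: FastAPI talks to MinIO over the internal Docker network, but the browser needs a published MinIO port. The presigned URL must carry the host and port the browser will use, because the signature covers the Host header; a URL signed for the internal service name stops validating if the host is rewritten afterwards.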

Flow:

1. POST /documents/upload-url  (FastAPI, authenticated)
   → quota check
   → generate presigned PUT URL (expires 5 min)
   → return { upload_url, object_key, document_id }

2. PUT <upload_url>  (browser → MinIO directly)
   → no FastAPI involvement

3. POST /documents/{document_id}/confirm  (FastAPI, authenticated)
   → verify object exists in MinIO
   → trigger text extraction + classification (background task)
   → update document status to "processing"
   → return document record

Object key namespace: {user_id}/{document_id}/{filename} — ensures per-user isolation without separate buckets. One bucket (docuvault-documents) is sufficient; IAM policies or object key prefix checks enforce isolation in code.
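
As a sketch of the prefix checks mentioned above: the helper names `build_object_key` and `key_belongs_to` are hypothetical, and the filename handling (keep only the final path component) is an assumption about how client-supplied names would be sanitized.

```python
import posixpath
import uuid


def build_object_key(user_id: uuid.UUID, document_id: uuid.UUID, filename: str) -> str:
    """Build {user_id}/{document_id}/{filename} from a client-supplied filename."""
    # Keep only the final path component so "../../x" cannot climb out of the prefix.
    safe_name = posixpath.basename(filename.replace("\\", "/"))
    if safe_name in ("", ".", ".."):
        raise ValueError("filename has no usable final component")
    return f"{user_id}/{document_id}/{safe_name}"


def key_belongs_to(user_id: uuid.UUID, object_key: str) -> bool:
    """True if the key, after resolving any '..' segments, sits under the user's prefix."""
    normalized = posixpath.normpath(object_key)
    return normalized.startswith(f"{user_id}/")
```

Normalizing before the prefix comparison matters: a key such as `{user_id}/../{other}/x` begins with the right prefix as a raw string but resolves outside it.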

Presigned GET for downloads:

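from datetime import timedelta

from fastapi.responses import RedirectResponse

# Inside the download route; minio_client is a configured minio.Minio instance.
url = minio_client.presigned_get_object(
    bucket_name="docuvault-documents",
    object_name=f"{user_id}/{document_id}/{filename}",
    expires=timedelta(minutes=30),
)
return RedirectResponse(url)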

Confidence: HIGH for the S3 presigned URL pattern (standard across all S3-compatible stores). MinIO Python SDK presigned_put_object and presigned_get_object methods confirmed as stable API.


Pattern 4: Cloud Storage Adapter (StorageBackend ABC)

What: A StorageBackend ABC in storage/base.py defines the interface. Each cloud integration implements it. storage_service.py routes to the correct backend based on the user's default_storage_backend setting.

When to use: Any operation that reads or writes document bytes. The service layer never calls MinIO or Google Drive directly — always via the adapter.

Interface:

# storage/base.py
from abc import ABC, abstractmethod
from typing import AsyncIterator

class StorageBackend(ABC):
    @abstractmethod
    async def put_object(self, key: str, data: bytes, content_type: str) -> str:
        """Store object, return canonical reference (URL or key)."""

    @abstractmethod
    async def get_object(self, key: str) -> bytes:
        """Retrieve object bytes."""

    @abstractmethod
    async def delete_object(self, key: str) -> None:
        """Delete object."""

    @abstractmethod
    async def get_presigned_url(self, key: str, expires_seconds: int = 3600) -> str | None:
        """Return a time-limited direct URL, or None if backend doesn't support it."""

    @abstractmethod
    async def list_objects(self, prefix: str) -> list[str]:
        """List keys under prefix."""

    @abstractmethod
    async def health_check(self) -> bool:
        """Verify connectivity."""
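
For unit tests, the interface above could be satisfied by an in-memory double. The sketch below repeats the ABC so that it runs standalone; `InMemoryBackend` and `demo` are hypothetical names and not part of the planned storage/ package.

```python
import asyncio
from abc import ABC, abstractmethod
from typing import Optional


class StorageBackend(ABC):
    """Copy of the interface above, repeated so the sketch is self-contained."""

    @abstractmethod
    async def put_object(self, key: str, data: bytes, content_type: str) -> str: ...

    @abstractmethod
    async def get_object(self, key: str) -> bytes: ...

    @abstractmethod
    async def delete_object(self, key: str) -> None: ...

    @abstractmethod
    async def get_presigned_url(self, key: str, expires_seconds: int = 3600) -> Optional[str]: ...

    @abstractmethod
    async def list_objects(self, prefix: str) -> list: ...

    @abstractmethod
    async def health_check(self) -> bool: ...


class InMemoryBackend(StorageBackend):
    """Hypothetical test double that keeps objects in a dict."""

    def __init__(self) -> None:
        self._objects = {}

    async def put_object(self, key: str, data: bytes, content_type: str) -> str:
        self._objects[key] = data
        return key  # the key doubles as the canonical reference

    async def get_object(self, key: str) -> bytes:
        return self._objects[key]  # raises KeyError if the key is unknown

    async def delete_object(self, key: str) -> None:
        self._objects.pop(key, None)

    async def get_presigned_url(self, key: str, expires_seconds: int = 3600) -> Optional[str]:
        return None  # no direct URLs, so callers must fall back to proxying

    async def list_objects(self, prefix: str) -> list:
        return sorted(k for k in self._objects if k.startswith(prefix))

    async def health_check(self) -> bool:
        return True


async def demo() -> list:
    backend = InMemoryBackend()
    await backend.put_object("u1/d1/a.txt", b"hello", "text/plain")
    await backend.put_object("u2/d2/b.txt", b"other", "text/plain")
    return await backend.list_objects("u1/")
```

Returning None from `get_presigned_url` exercises the fallback path that any backend without direct URLs would trigger.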

Factory:

# storage/__init__.py
def get_storage_backend(user: User, credentials: dict | None) -> StorageBackend:
    backend_type = user.default_storage_backend  # "minio" | "onedrive" | "gdrive" | ...
    if backend_type == "minio":
        return MinIOBackend(settings.minio_endpoint, ...)
    elif backend_type == "onedrive":
        return OneDriveBackend(credentials)  # decrypted before passing in
    ...

Credentials encryption: Cloud OAuth tokens and refresh tokens are stored encrypted with Fernet symmetric encryption. The key is in CLOUD_CREDS_KEY env var. Encryption/decryption happens in cloud_service.py before the credentials are passed to the backend constructor — the adapter itself always receives plaintext credentials and never touches the DB.


Pattern 5: Storage Quota Enforcement via Service Layer (not middleware, not DB constraint)

What: Quota is checked in deps/quota.py::check_upload_quota — a FastAPI dependency injected on upload routes. After successful upload, quota_service.increment_usage(user_id, bytes) is called.

Where NOT to enforce:

  • Not in middleware: Middleware runs before routing and dependency resolution, so it cannot know the user's identity without re-implementing auth, and the only size signal it has is the Content-Length header, which is absent on chunked uploads.
  • Not as a DB constraint: CHECK (used_bytes <= limit_bytes) would make the DB reject the commit only after the object has already been uploaded to MinIO, leaving a stored object with no committed metadata row.

Correct sequence:

1. Pre-upload: deps/quota.py reads quotas.used_bytes plus the declared upload size (Content-Length header, or the size field of the upload-url request)
   → if (used + incoming) > limit_bytes: raise HTTP 413 with quota detail

2. Upload proceeds to MinIO (presigned URL or proxy)

3. Post-upload: quota_service.increment_usage atomically:
   UPDATE quotas SET used_bytes = used_bytes + $delta
   WHERE user_id = $uid AND (used_bytes + $delta) <= limit_bytes
   RETURNING used_bytes
   → if no rows returned: another concurrent upload exceeded quota; delete from MinIO + 413

Why atomic update with check: Two simultaneous uploads can both pass the pre-check. The atomic UPDATE with WHERE guard prevents double-spend. This is the correct pattern for optimistic quota enforcement under concurrency.
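
The guarded UPDATE can be tried out locally. In this sketch SQLite (from the standard library) stands in for PostgreSQL, and `cursor.rowcount` plays the role of the "no rows returned" check that RETURNING gives in PostgreSQL. `make_db` and `increment_usage` are illustrative names.

```python
import sqlite3


def make_db(limit_bytes: int) -> sqlite3.Connection:
    """Create an in-memory quotas table with a single user 'u1'."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE quotas ("
        "user_id TEXT PRIMARY KEY, "
        "limit_bytes INTEGER NOT NULL, "
        "used_bytes INTEGER NOT NULL DEFAULT 0)"
    )
    conn.execute(
        "INSERT INTO quotas (user_id, limit_bytes) VALUES ('u1', ?)",
        (limit_bytes,),
    )
    conn.commit()
    return conn


def increment_usage(conn: sqlite3.Connection, user_id: str, delta: int) -> bool:
    """Apply the guarded UPDATE; False means the quota would be exceeded."""
    cur = conn.execute(
        "UPDATE quotas SET used_bytes = used_bytes + ? "
        "WHERE user_id = ? AND used_bytes + ? <= limit_bytes",
        (delta, user_id, delta),
    )
    conn.commit()
    # Zero affected rows: the WHERE guard rejected the increment.
    return cur.rowcount == 1
```

With a 100-byte limit, two 60-byte uploads that both passed the pre-check resolve to one success and one rejection, and the rejected caller is the one that must delete its object from MinIO.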


PostgreSQL Schema Design

Core Tables

-- Users
CREATE TABLE users (
    id          UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    handle      TEXT UNIQUE NOT NULL,          -- @username for sharing
    email       TEXT UNIQUE NOT NULL,
    password_hash TEXT NOT NULL,               -- bcrypt
    totp_secret TEXT,                          -- NULL = TOTP not enabled
    totp_enabled BOOLEAN NOT NULL DEFAULT FALSE,
    role        TEXT NOT NULL DEFAULT 'user',  -- 'user' | 'admin'
    is_active   BOOLEAN NOT NULL DEFAULT TRUE,
    ai_provider TEXT,                          -- NULL = use system default
    ai_model    TEXT,
    created_at  TIMESTAMPTZ NOT NULL DEFAULT now()
);

-- Quotas (1:1 with users; separate for clean admin queries)
CREATE TABLE quotas (
    user_id     UUID PRIMARY KEY REFERENCES users(id) ON DELETE CASCADE,
    limit_bytes BIGINT NOT NULL DEFAULT 104857600,  -- 100 MB
    used_bytes  BIGINT NOT NULL DEFAULT 0,
    CONSTRAINT no_negative_usage CHECK (used_bytes >= 0)
);

-- Refresh tokens
CREATE TABLE refresh_tokens (
    id          UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    user_id     UUID NOT NULL REFERENCES users(id) ON DELETE CASCADE,
    token_hash  TEXT NOT NULL UNIQUE,           -- SHA-256 of the opaque token
    expires_at  TIMESTAMPTZ NOT NULL,
    revoked     BOOLEAN NOT NULL DEFAULT FALSE,
    created_at  TIMESTAMPTZ NOT NULL DEFAULT now()
);
CREATE INDEX ON refresh_tokens(user_id, revoked);

-- Folders
CREATE TABLE folders (
    id          UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    user_id     UUID NOT NULL REFERENCES users(id) ON DELETE CASCADE,
    parent_id   UUID REFERENCES folders(id) ON DELETE CASCADE,  -- NULL = root
    name        TEXT NOT NULL,
    created_at  TIMESTAMPTZ NOT NULL DEFAULT now(),
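    UNIQUE (user_id, parent_id, name)  -- NOTE: NULL parent_id values compare as distinct, so this does not block duplicate root-folder names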
);

-- Documents
CREATE TABLE documents (
    id              UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    user_id         UUID NOT NULL REFERENCES users(id) ON DELETE CASCADE,
    folder_id       UUID REFERENCES folders(id) ON DELETE SET NULL,
    filename        TEXT NOT NULL,
    content_type    TEXT NOT NULL,
    size_bytes      BIGINT NOT NULL DEFAULT 0,
    storage_backend TEXT NOT NULL DEFAULT 'minio',  -- 'minio' | 'onedrive' | ...
    object_key      TEXT NOT NULL,                  -- backend-specific reference
    extracted_text  TEXT,                           -- NULL until extraction complete
    status          TEXT NOT NULL DEFAULT 'pending', -- pending | processing | ready | error
    created_at      TIMESTAMPTZ NOT NULL DEFAULT now(),
    updated_at      TIMESTAMPTZ NOT NULL DEFAULT now()
);
CREATE INDEX ON documents(user_id, folder_id);
CREATE INDEX ON documents(user_id, created_at DESC);

-- Topics (per-user; admin sets defaults)
-- Created before document_topics, which references topics(id).
CREATE TABLE topics (
    id          UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    user_id     UUID REFERENCES users(id) ON DELETE CASCADE,  -- NULL = system default
    name        TEXT NOT NULL,
    UNIQUE (user_id, name)
);

-- Document topics (M:N)
CREATE TABLE document_topics (
    document_id UUID NOT NULL REFERENCES documents(id) ON DELETE CASCADE,
    topic_id    UUID NOT NULL REFERENCES topics(id) ON DELETE CASCADE,
    PRIMARY KEY (document_id, topic_id)
);

-- Document shares
CREATE TABLE document_shares (
    id              UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    document_id     UUID NOT NULL REFERENCES documents(id) ON DELETE CASCADE,
    owner_id        UUID NOT NULL REFERENCES users(id) ON DELETE CASCADE,
    recipient_id    UUID NOT NULL REFERENCES users(id) ON DELETE CASCADE,
    permission      TEXT NOT NULL DEFAULT 'view',  -- 'view' | 'download' (future)
    created_at      TIMESTAMPTZ NOT NULL DEFAULT now(),
    UNIQUE (document_id, recipient_id)
);
CREATE INDEX ON document_shares(recipient_id);

-- Cloud storage backends per user
CREATE TABLE cloud_backends (
    id                  UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    user_id             UUID NOT NULL REFERENCES users(id) ON DELETE CASCADE,
    backend_type        TEXT NOT NULL,              -- 'onedrive' | 'gdrive' | 'nextcloud' | 'webdav'
    display_name        TEXT NOT NULL,
    credentials_enc     TEXT NOT NULL,              -- Fernet-encrypted JSON blob
    is_default          BOOLEAN NOT NULL DEFAULT FALSE,
    created_at          TIMESTAMPTZ NOT NULL DEFAULT now()
);
CREATE INDEX ON cloud_backends(user_id);

-- Audit log (append-only)
CREATE TABLE audit_log (
    id          BIGSERIAL PRIMARY KEY,
    user_id     UUID REFERENCES users(id) ON DELETE SET NULL,
    actor_id    UUID REFERENCES users(id) ON DELETE SET NULL, -- admin acting on behalf
    event_type  TEXT NOT NULL,  -- login | login_failed | upload | delete | share | quota_change | ...
    resource_id UUID,           -- document_id / folder_id / user_id depending on context
    ip_address  INET,
    metadata    JSONB,          -- event-specific extra fields (no document content)
    created_at  TIMESTAMPTZ NOT NULL DEFAULT now()
);
CREATE INDEX ON audit_log(user_id, created_at DESC);
CREATE INDEX ON audit_log(event_type, created_at DESC);
-- NOTE: no UPDATE or DELETE grants on audit_log for app user; only INSERT + SELECT

Schema design notes:

  • topics.user_id IS NULL = system-wide default topics visible to all users; per-user topics shadow them.
  • documents.object_key stores the backend-relative reference — for MinIO it is {user_id}/{document_id}/{filename}; for OneDrive it is the Drive item ID. The storage_backend column tells the service which adapter to use.
  • cloud_backends.credentials_enc is never returned in any API response; cloud_service.py decrypts it server-side before handing plaintext credentials to the adapter factory.
  • Audit log uses BIGSERIAL (not UUID) for append-ordered natural scan and to discourage random access patterns.
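
One way to read "per-user topics shadow them" is shadowing by name, which is an assumption here. Under that reading, the topic list shown to a user could be resolved as in this sketch; `Topic` and `visible_topics` are hypothetical names.

```python
from typing import NamedTuple, Optional


class Topic(NamedTuple):
    id: str
    user_id: Optional[str]  # None marks a system-wide default topic
    name: str


def visible_topics(all_topics: list, user_id: str) -> list:
    """System defaults plus the user's own topics; own topics win on a name clash."""
    by_name = {}
    for topic in all_topics:
        if topic.user_id is None:
            by_name.setdefault(topic.name, topic)
    for topic in all_topics:
        if topic.user_id == user_id:
            by_name[topic.name] = topic  # shadows a default with the same name
    return sorted(by_name.values(), key=lambda t: t.name)
```

Topics owned by other users never enter the result, so the same function also enforces per-user isolation of topic names.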

Data Flow

Document Upload Flow (MinIO presigned URL path)

Browser
  │
  ├─[1] POST /documents/upload-url  {filename, size, content_type, folder_id?}
  │       → get_current_user dep (JWT verify → load User from DB)
  │       → check_upload_quota dep (reads quotas table, compares size)
  │       → document_service.prepare_upload()
  │           → INSERT documents row (status='pending')
  │           → minio_backend.generate_presigned_put(object_key, expires=300s)
  │       ← {upload_url, object_key, document_id}
  │
  ├─[2] PUT <upload_url>  (browser → MinIO, no FastAPI)
  │
  ├─[3] POST /documents/{id}/confirm
  │       → get_current_user dep
  │       → document_service.confirm_upload()
  │           → verify object exists in MinIO (HEAD request)
  │           → quota_service.increment_usage(user_id, size_bytes) [atomic]
  │           → UPDATE documents SET status='processing'
  │           → enqueue background task: extract_and_classify(document_id)
  │       ← {document_id, status: "processing"}
  │
  └─[4] Background: extract_and_classify(document_id)
          → extractor.extract_text(object bytes from MinIO)
          → classifier.classify(text, user_topics)
          → UPDATE documents SET extracted_text=..., status='ready'
          → INSERT document_topics rows
          → audit_service.log(event='upload', ...)
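
The status values written in this flow form a small state machine. The sketch below encodes the transitions the flow shows. Treating ready and error as terminal is an assumption, since the flow does not say whether reclassification reopens a document.

```python
# Allowed moves for documents.status, taken from the upload flow above.
ALLOWED_TRANSITIONS = {
    "pending": frozenset({"processing"}),         # confirm_upload()
    "processing": frozenset({"ready", "error"}),  # extract_and_classify()
    "ready": frozenset(),                         # assumed terminal
    "error": frozenset(),                         # assumed terminal
}


def advance(current: str, new: str) -> str:
    """Return the new status, or raise if the flow never makes that move."""
    if new not in ALLOWED_TRANSITIONS.get(current, frozenset()):
        raise ValueError(f"illegal status transition: {current} -> {new}")
    return new
```

A guard like this makes a skipped confirm step visible: a document cannot reach ready without first passing through processing.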

Authentication Flow

Browser
  │
  ├─[1] POST /auth/login  {email, password, totp_code?}
  │       → user_service.verify_password(email, password)
  │       → if totp_enabled: pyotp.TOTP(secret).verify(totp_code)
  │       → issue access_token (JWT, 15 min) + refresh_token (opaque UUID)
  │       → store hash(refresh_token) in refresh_tokens table
  │       ← {access_token, refresh_token, expires_in}
  │
  ├─[2] Any authenticated request
  │       Authorization: Bearer <access_token>
  │       → get_current_user dep verifies the JWT signature locally, then loads the User row by primary key (is_active check)
  │
  └─[3] POST /auth/refresh  {refresh_token}
          → look up hash(refresh_token) in refresh_tokens table
          → verify not revoked, not expired
          → set revoked=true on old token
          → issue new access_token + new refresh_token (rotation)
          ← {access_token, refresh_token, expires_in}
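
To make the access-token claims concrete, here is a standard-library-only illustration of an HS256 JWT carrying sub, exp, and jti. It is for explanation only. The application would use PyJWT or python-jose as listed under Integration Points, and `issue_access_token` and `verify_access_token` are hypothetical names.

```python
import base64
import hashlib
import hmac
import json
import uuid


def _b64url(raw: bytes) -> str:
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def issue_access_token(user_id: str, secret: bytes, now: int, ttl: int = 900) -> str:
    """Build header.claims.signature with a 15-minute default lifetime."""
    header = {"alg": "HS256", "typ": "JWT"}
    claims = {"sub": user_id, "exp": now + ttl, "jti": str(uuid.uuid4())}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode())
        for part in (header, claims)
    )
    signature = hmac.new(secret, signing_input.encode(), hashlib.sha256).digest()
    return f"{signing_input}.{_b64url(signature)}"


def verify_access_token(token: str, secret: bytes, now: int) -> dict:
    """Check signature first, then expiry; return the claims on success."""
    try:
        header_b64, claims_b64, sig_b64 = token.split(".")
        signature = _b64url_decode(sig_b64)
    except ValueError:
        raise PermissionError("malformed token")
    expected = hmac.new(
        secret, f"{header_b64}.{claims_b64}".encode(), hashlib.sha256
    ).digest()
    if not hmac.compare_digest(expected, signature):
        raise PermissionError("bad signature")
    claims = json.loads(_b64url_decode(claims_b64))
    if claims["exp"] <= now:
        raise PermissionError("token expired")
    return claims
```

Verification needs only the shared secret and the clock, which is why any FastAPI instance can validate a token without coordinating with the instance that issued it.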

Shared Document Access Flow

Recipient accesses "Shared with me"
  │
  ├─ GET /documents/shared-with-me
  │     → SELECT d.* FROM documents d
  │       JOIN document_shares s ON s.document_id = d.id
  │       WHERE s.recipient_id = :current_user_id
  │     ← list of document records (owner's documents, recipient has view access)
  │
  └─ GET /documents/{id}/download  (recipient, shared document)
        → verify document_shares row exists for (document_id, current_user_id)
        → generate presigned GET URL using owner's object_key
        ← 302 redirect to presigned URL
        (file bytes flow from MinIO → browser, never through FastAPI)
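
The download authorization above reduces to a single check: the caller either owns the document or holds a share row for it. A minimal sketch follows, with `Document` and `resolve_download_key` as hypothetical names and the share table modelled as a set of (document_id, recipient_id) pairs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Document:
    id: str
    user_id: str     # owner
    object_key: str  # {owner_id}/{document_id}/{filename} for MinIO


def resolve_download_key(doc: Document, current_user_id: str, shares: set) -> str:
    """Return the owner's object key if the caller may download, else raise."""
    is_owner = doc.user_id == current_user_id
    is_recipient = (doc.id, current_user_id) in shares
    if not (is_owner or is_recipient):
        raise PermissionError("no access to this document")
    # The key always lives under the owner's prefix, even for recipients.
    return doc.object_key
```

Because the returned key belongs to the owner's prefix, any prefix check in the storage layer has to compare against the document's owner rather than the caller.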

Migration Path: Flat-File → PostgreSQL + MinIO

Principle: parallel-run, not flag-day cutover

The safest approach is to keep the existing flat-file code running and introduce the new stack incrementally, in a sequence that never breaks the existing API contract from the Vue frontend's perspective.

Phase 1 — Infrastructure, no behavior change

  1. Add PostgreSQL and MinIO services to docker-compose.yml
  2. Create db/models.py with initial schema (users, documents, quotas — no auth yet)
  3. Add Alembic, run initial migration
  4. Add deps/db.py with async session dependency
  5. No API changes. Existing flat-file code still runs.

Validation: docker-compose up boots all services without errors. Alembic migrations apply cleanly.

Phase 2 — Auth layer (new endpoints, existing endpoints temporarily open)

  1. Add users table, refresh_tokens table
  2. Implement /auth/register, /auth/login, /auth/refresh
  3. Add get_current_user dependency to deps/auth.py
  4. Add a single test authenticated route (GET /auth/me)
  5. Existing document endpoints remain unauthenticated (guarded by feature flag or separate router prefix)

Validation: Auth endpoints work independently. Existing UI still calls existing routes without tokens.

Phase 3 — Document storage migration (dual-write period)

  1. Add storage/minio_backend.py and integrate it into storage_service.py
  2. Create a one-time migration script:
    • Reads each data/metadata/<id>.json
    • Inserts a documents row with user_id = SYSTEM_USER_ID (a single placeholder user)
    • Uploads data/uploads/<id>.<ext> to MinIO
  3. New uploads go to MinIO + PostgreSQL; old flat-file data has been migrated
  4. Update document API routes to read from PostgreSQL + MinIO, guarded behind get_current_user

Critical: Run migration script in a transaction; if any file fails, roll back DB inserts (MinIO objects can be cleaned up separately). Do not delete flat-file data until validation is complete.
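
Before running the real migration, a dry-run inventory can flag metadata files whose upload is missing. This sketch relies only on the directory layout named above (data/metadata/<id>.json and data/uploads/<id>.<ext>). It does not read the JSON, since the metadata fields are not specified here, and `plan_migration` is a hypothetical helper.

```python
from pathlib import Path


def plan_migration(data_dir: Path) -> tuple:
    """Pair each metadata/<id>.json with uploads/<id>.<ext>.

    Returns (ready, missing): ready maps document id to its upload path,
    missing lists ids that have metadata but no uploaded file.
    """
    # Index uploads by filename stem; assumes ids contain no dots.
    uploads = {p.stem: p for p in (data_dir / "uploads").iterdir() if p.is_file()}
    ready = {}
    missing = []
    for meta in sorted((data_dir / "metadata").glob("*.json")):
        doc_id = meta.stem
        if doc_id in uploads:
            ready[doc_id] = uploads[doc_id]
        else:
            missing.append(doc_id)
    return ready, missing
```

An empty `missing` list is a reasonable precondition for starting the transactional run, since any missing file would otherwise force a rollback partway through.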

Phase 4 — Multi-user isolation

  1. Add per-user quotas rows, folders table
  2. Enforce user-scoped queries: all document queries include WHERE user_id = :current_user_id
  3. Add quota enforcement dependency on upload routes
  4. Add document_shares, cloud_backends tables

Phase 5 — Cloud storage backends

  1. Extract the StorageBackend ABC and make the Phase 3 MinIOBackend implement it
  2. Implement first cloud adapter (OneDrive or Google Drive)
  3. Add /storage-backends/* API endpoints
  4. Add frontend UI for connecting cloud accounts

Frontend changes during migration

  • Add Authorization: Bearer <token> header injection in src/api/client.js (single change point — all API calls go through this module already)
  • Add login/register views and auth Pinia store
  • Redirect to /login on 401 responses
  • No other frontend changes required until Phase 4 (user-scoped UI)

Horizontal Scaling Concerns

| Concern | What to Share | What Can Be Instance-Local |
|---|---|---|
| DB connections | PostgreSQL (shared); use connection pooling (asyncpg pool size 10-20 per instance) | None |
| Object storage | MinIO (shared); all instances use same endpoint | None |
| Refresh token state | PostgreSQL refresh_tokens table (shared) | JWT validation (CPU-only, no shared state needed) |
| Quota state | PostgreSQL quotas table with atomic UPDATE (shared) | Pre-flight Content-Length check (instance-local read, final write shared) |
| Background tasks | Cannot use BackgroundTasks across instances; use Celery + Redis or a Postgres-backed queue (e.g. pgqueuer) | Single-instance: BackgroundTasks is fine for Phase 1 |
| File upload temp buffers | If streaming proxy pattern used: RAM per instance | Use presigned URLs to avoid this entirely |
| AI provider instances | Re-instantiated per request already; no shared state | Per-instance re-instantiation is fine |
| CORS / session | Stateless JWT; no sticky sessions needed | |

First bottleneck: Background task queue. FastAPI BackgroundTasks runs in the same process. When classification is slow or multiple uploads arrive simultaneously, workers block. Introduce a task queue (Celery + Redis, or pgqueuer) before scaling to N instances — otherwise each instance has its own queue and tasks are not distributed.

Second bottleneck: DB connection count. N instances with 20 pooled connections each hold up to N×20 PostgreSQL connections. Add PgBouncer in transaction mode in front of PostgreSQL before N gets large.
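
The connection arithmetic is easy to check against a server limit. The sketch assumes the stock PostgreSQL default of max_connections = 100 in its examples; `total_db_connections` and `max_instances` are illustrative helpers.

```python
def total_db_connections(instances: int, pool_size: int) -> int:
    """Upper bound on connections when every instance fills its pool."""
    return instances * pool_size


def max_instances(max_connections: int, pool_size: int, reserved: int = 0) -> int:
    """How many instances fit before the server limit is hit.

    reserved covers connections kept back for migrations, admin tools, etc.
    """
    return (max_connections - reserved) // pool_size
```

With a pool of 20 and a limit of 100, five instances already exhaust the server, and a sixth would need 120 connections. That is the point at which a pooler such as PgBouncer is needed.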


Anti-Patterns

Anti-Pattern 1: Per-Instance File Locks for Quota

What people do: Carry forward the filelock pattern into the multi-instance world, using a lock file on a shared volume.

Why it's wrong: Shared NFS/volume file locking has undefined behavior under Docker Compose networking, requires a shared filesystem mount (kills stateless instances), and is slower than a DB atomic update.

Do this instead: Atomic UPDATE quotas SET used_bytes = used_bytes + $delta WHERE user_id = $uid AND (used_bytes + $delta) <= limit_bytes in PostgreSQL. Single round-trip, correct under concurrency, no shared filesystem required.


Anti-Pattern 2: Streaming All File Traffic Through FastAPI

What people do: POST /upload receives the multipart body into memory, then uploads it to MinIO from FastAPI.

Why it's wrong: Doubles memory usage (once in FastAPI, once in MinIO client buffer). Saturates FastAPI worker threads during large uploads. Introduces FastAPI as a bottleneck for byte transfer.

Do this instead: Two-step presigned URL flow (Pattern 3 above). FastAPI only handles metadata; bytes flow browser → MinIO directly.


Anti-Pattern 3: Auth in Middleware Instead of Dependencies

What people do: Write a custom ASGI middleware that reads the Authorization header and either passes or rejects requests.

Why it's wrong: Middleware runs before FastAPI routing. To allow public routes (login, register, health), you must maintain an exclusion list in the middleware. This list inevitably goes stale when new public routes are added. Middleware cannot easily populate request.state.user in a way that's type-safe for path operations.

Do this instead: Depends(get_current_user) on each protected router. Optional auth uses Depends(get_optional_user) returning User | None. Explicit, type-safe, co-located with the route it protects. Confirmed as FastAPI's recommended pattern.


Anti-Pattern 4: Storing Cloud Credentials Unencrypted (or Relying on DB-Level Encryption Alone)

What people do: Store OAuth tokens in plaintext DB columns, assuming DB-level TLS or disk encryption is sufficient.

Why it's wrong: Any user with DB read access (admin, compromised migration, backup leak) can extract all users' cloud tokens. Violates the privacy-first admin model requirement.

Do this instead: Fernet-encrypt the credential JSON blob in cloud_service.py before writing to cloud_backends.credentials_enc. The Fernet key lives in CLOUD_CREDS_KEY env var only — never in the DB. Admin queries on cloud_backends return only id, backend_type, display_name, is_default — the credentials_enc column is excluded from all admin-facing serializers.


Anti-Pattern 5: One MinIO Bucket Per User

What people do: Create a new MinIO bucket for each registered user to enforce isolation.

Why it's wrong: MinIO is not designed for millions of buckets. Bucket creation is a management operation. IAM policies per bucket become complex to manage at scale.

Do this instead: Single bucket, key prefix isolation: {user_id}/{document_id}/{filename}. Enforce prefix scoping in storage_service.py and never let a user-supplied key escape its {user_id}/ prefix. Verify in every get_object and delete_object call that the resolved key starts with the ID of the user who owns the document row: the authenticated user, or, for a download granted through document_shares, the document's owner.


Integration Points

External Services

| Service | Integration Pattern | Notes |
|---|---|---|
| PostgreSQL | SQLAlchemy 2.0 async (asyncpg driver), sessions via Depends(get_db) | Use asyncpg pool, not per-request connections |
| MinIO | minio Python SDK (sync) wrapped in asyncio.to_thread(), or aiobotocore for async S3 | Presigned URL generation is CPU-bound, not I/O-bound; to_thread is fine |
| OneDrive | Microsoft Graph API via httpx async client + OAuth2 PKCE flow | Refresh tokens stored encrypted in cloud_backends |
| Google Drive | Google Drive API v3 via httpx or google-auth library | Same credential model as OneDrive |
| Nextcloud | WebDAV via httpx (PUT/GET/DELETE) or webdavclient3 library | Basic auth or app password; simpler than OAuth |
| PyOTP | TOTP generation/verification (pyotp.TOTP(secret).verify(code)) | verify() defaults to valid_window=0 (current 30-second period only); pass valid_window=1 to tolerate ±1 period (±30 sec) of clock skew |
| python-jose or PyJWT | JWT encode/decode | Use HS256 with a 256-bit secret. python-jose has broader algorithm support; PyJWT is simpler and more actively maintained |
| cryptography (Fernet) | Cloud credential encryption/decryption | Fernet.generate_key() at setup; store in CLOUD_CREDS_KEY env var |
| passlib[bcrypt] | Password hashing | bcrypt work factor 12 minimum |

Internal Boundaries

| Boundary | Communication | Notes |
|---|---|---|
| api/ → services/ | Direct async function calls | Services never import from api/; dependency is one-directional |
| services/ → storage/ | StorageBackend ABC interface | Services import from storage/__init__.py factory only |
| services/ → ai/ | Existing get_provider() factory, unchanged | AI provider is still re-instantiated per call |
| deps/ → services/ | Services can be called from deps (e.g., quota_service from quota dep) | Keep deps thin; prefer passing a DB session to the dep and calling service functions |
| db/models.py ↔ everywhere | Import models directly | No repository pattern needed at this scale; SQLAlchemy session + models is sufficient |

Scaling Considerations

| Scale | Architecture | Notes |
|---|---|---|
| 1-100 users | Single FastAPI instance, BackgroundTasks, no queue | This milestone's target; simplest path |
| 100-10k users | Add Celery + Redis for background tasks; add PgBouncer; scale FastAPI to 2-4 instances | Background task queue is the first change needed |
| 10k-100k users | Read replica for PostgreSQL (document listing queries), MinIO multi-node cluster | Document metadata reads dominate; separate read/write paths |
| 100k+ users | Consider separate microservice for classification (GPU workers); CDN in front of MinIO presigned URLs | Classification latency becomes user-facing bottleneck |

Sources

  • FastAPI official docs — Security / OAuth2 with JWT: https://fastapi.tiangolo.com/tutorial/security/oauth2-jwt/ (HIGH confidence — directly confirmed pattern)
  • FastAPI official docs — Advanced Middleware: https://fastapi.tiangolo.com/advanced/middleware/ (HIGH confidence — confirms DI > middleware for auth)
  • FastAPI official docs — SQL Databases: https://fastapi.tiangolo.com/tutorial/sql-databases/ (HIGH confidence — session-per-request via Depends confirmed)
  • MinIO S3 presigned URL pattern: S3-compatible standard, documented in AWS S3 and MinIO docs (HIGH confidence — industry-standard pattern)
  • PostgreSQL atomic UPDATE for quota enforcement: standard optimistic concurrency pattern (HIGH confidence)
  • Fernet symmetric encryption (cryptography library): well-documented Python standard for symmetric key encryption (HIGH confidence)
  • Refresh token rotation pattern: IETF OAuth 2.0 Security BCP (RFC 9700 / draft-ietf-oauth-security-topics) (HIGH confidence)

Architecture research for: DocuVault multi-user SaaS document management (FastAPI + Vue 3 brownfield) Researched: 2026-05-21