---
phase: 05-cloud-storage-backends
plan: 05
type: execute
wave: 4
depends_on:
  - "05-03"
  - "05-04"
files_modified:
  - backend/api/cloud.py
  - backend/main.py
autonomous: true
requirements:
  - CLOUD-01
  - CLOUD-02
  - CLOUD-03
  - CLOUD-04
  - CLOUD-05
  - CLOUD-06

must_haves:
  truths:
    - "GET /api/cloud/oauth/initiate/{provider} redirects to provider OAuth URL; state token in Redis with 30-min TTL"
    - "GET /api/cloud/oauth/callback/{provider} validates state, exchanges code, encrypts credentials, saves CloudConnection, redirects to /settings?cloud_connected={provider}"
    - "POST /api/cloud/connections/webdav validates URL (SSRF), tests connection (PROPFIND), encrypts + saves credentials"
    - "GET /api/cloud/connections returns CloudConnectionOut list — no credentials_enc"
    - "DELETE /api/cloud/connections/{id} deletes credentials_enc row; subsequent use returns 503"
    - "GET /api/cloud/folders/{provider}/{folder_id} returns lazy-loaded folder listing (TTL-cached)"
    - "PATCH /api/users/me/default-storage updates users.default_storage_backend"
    - "All endpoints use get_regular_user dep — admin blocked (403)"
    - "OAuth callback invalid state returns 400; invalid provider returns 400"
    - "write_audit_log called on connect, disconnect, and REQUIRES_REAUTH transitions"
  artifacts:
    - path: "backend/api/cloud.py"
      provides: "All /api/cloud/* endpoints + /api/users/me/default-storage"
      contains: "router = APIRouter"
    - path: "backend/main.py"
      provides: "cloud router registered"
      contains: "cloud_router"
  key_links:
    - from: "backend/api/cloud.py"
      to: "backend/storage/cloud_utils.py"
      via: "encrypt_credentials / decrypt_credentials"
      pattern: "encrypt_credentials"
    - from: "backend/api/cloud.py"
      to: "backend/api/admin.py"
      via: "CloudConnectionOut Pydantic model import"
      pattern: "CloudConnectionOut"
    - from: "backend/api/cloud.py"
      to: "backend/services/audit.py"
      via: "write_audit_log on connect/disconnect"
      pattern: "write_audit_log"
---

<objective>
Create backend/api/cloud.py with all cloud connection management endpoints and register it in main.py.

Purpose: This plan implements the complete cloud backend API surface: OAuth initiation, OAuth callback, WebDAV connect, list connections, disconnect, folder listing, and default-storage selection.
Output: backend/api/cloud.py with 6 cloud endpoints + 1 PATCH /api/users/me/default-storage endpoint (7 total); main.py updated to register both routers.
</objective>

<execution_context>
@/Users/nik/.claude/get-shit-done/workflows/execute-plan.md
@/Users/nik/.claude/get-shit-done/templates/summary.md
</execution_context>

<context>
@.planning/PROJECT.md
@.planning/ROADMAP.md
@.planning/phases/05-cloud-storage-backends/05-CONTEXT.md
@.planning/phases/05-cloud-storage-backends/05-RESEARCH.md
@.planning/phases/05-cloud-storage-backends/05-03-SUMMARY.md
@.planning/phases/05-cloud-storage-backends/05-04-SUMMARY.md
</context>

<interfaces>
<!-- From backend/api/admin.py -->
From backend/api/admin.py:
class CloudConnectionOut(BaseModel):
    id: str
    provider: str
    display_name: str
    status: str
    connected_at: datetime
    model_config = {"from_attributes": True}

<!-- From backend/deps/auth.py -->
From backend/deps/auth.py:
async def get_regular_user(credentials, session) -> User: -- raises 403 for admin, 401 for invalid token

From backend/services/audit.py:
async def write_audit_log(session, event_type, user_id, actor_id, resource_id, ip_address, metadata_=None) -> None

From backend/db/models.py:
CloudConnection: id (UUID), user_id (UUID), provider (String), display_name (Text),
    credentials_enc (Text), status (String, default="ACTIVE"), connected_at (TIMESTAMP)
User: id (UUID), default_storage_backend (String, default="minio")

From backend/config.py (after Plan 01):
settings.cloud_creds_key: str
settings.google_client_id, google_client_secret: str
settings.onedrive_client_id, onedrive_client_secret, onedrive_tenant_id: str
settings.backend_url: str (used in OAuth callback redirect_uri)

From backend/storage/cloud_utils.py:
def encrypt_credentials(master_key: bytes, user_id: str, credentials: dict) -> str
def decrypt_credentials(master_key: bytes, user_id: str, credentials_enc: str) -> dict
def validate_cloud_url(url: str) -> None

From RESEARCH.md Pattern 3: Google Drive OAuth — Flow.from_client_config, access_type="offline", prompt="consent"
From RESEARCH.md Pattern 4: OneDrive OAuth — msal.ConfidentialClientApplication, acquire_token_by_authorization_code
From RESEARCH.md Pattern 7: OAuth state in Redis — key "oauth_state:{state_token}", TTL 1800, single-use delete

From backend/storage/nextcloud_backend.py: NextcloudBackend.list_folder() -> list[dict]
From backend/services/cloud_cache.py: get_cloud_folders_cached(user_id, provider, folder_id, fetch_fn)
</interfaces>

<tasks>

<task type="auto" tdd="true">
<name>Task 1: Create cloud.py with OAuth + WebDAV connect + connection management endpoints</name>
<files>backend/api/cloud.py</files>
<read_first>
- backend/api/admin.py — CloudConnectionOut pattern, _user_to_dict whitelist style, write_audit_log usage
- backend/api/auth.py — Redis state pattern (oauth_state-like keys), rate limiting pattern
- backend/deps/auth.py — get_regular_user signature
- backend/db/models.py — CloudConnection, User model fields
- backend/config.py — new Phase 5 settings fields
- .planning/phases/05-cloud-storage-backends/05-CONTEXT.md — D-03, D-04, D-06, D-17, D-18, D-19 decisions
- .planning/phases/05-cloud-storage-backends/05-RESEARCH.md — Pattern 3 (Google OAuth), Pattern 4 (MSAL), Pattern 7 (Redis state)
</read_first>
<behavior>
- GET /api/cloud/oauth/initiate/{provider}: accepts provider in {"google_drive", "onedrive"}; generates state_token = secrets.token_urlsafe(32); stores "oauth_state:{state_token}" in Redis with value str(current_user.id), TTL 1800; builds authorization_url; returns HTTP 302 redirect to authorization_url
- GET /api/cloud/oauth/callback/{provider}: returns 400 if provider is not a valid OAuth provider; reads state and code query params; looks up Redis key "oauth_state:{state}"; if missing, returns 400; deletes Redis key (single-use); exchanges code for tokens; encrypts credentials; upserts CloudConnection (match on user_id + provider); sets status=ACTIVE; calls write_audit_log(event_type="cloud.connected"); returns 302 redirect to {settings.frontend_url}/settings?cloud_connected={provider}
- On any exception in callback: returns 302 redirect to {settings.frontend_url}/settings?cloud_error={url-encoded error message}
- POST /api/cloud/connections/webdav: Pydantic body with server_url (HttpUrl), username (str), password (str), provider (Literal["nextcloud", "webdav"]); calls validate_cloud_url(str(server_url)) → 422 on ValueError; instantiates WebDAVBackend/NextcloudBackend; calls backend.health_check() wrapped in try/except → 422 if False/exception; encrypts credentials; upserts CloudConnection; calls write_audit_log(event_type="cloud.connected"); returns CloudConnectionOut
- GET /api/cloud/connections: selects all CloudConnection where user_id=current_user.id; returns {"items": [CloudConnectionOut, ...]}; credentials_enc never in response
- DELETE /api/cloud/connections/{id}: loads CloudConnection; asserts connection.user_id == current_user.id (returns 404 if mismatch — prevents ID enumeration per D-19); deletes row; calls write_audit_log(event_type="cloud.disconnected"); returns 204
- PATCH /api/users/me/default-storage: body {"backend": str}; updates User.default_storage_backend; returns {"default_storage_backend": new_value}
- ALL endpoints: Depends(get_regular_user) — admin blocked (D-18, D-19)
- ALL endpoints: cross-user access returns 404 not 403 (prevents ID enumeration)
</behavior>
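
The state-token lifecycle described in the initiate and callback bullets can be sketched as follows. This is a minimal, storage-agnostic illustration and not the project's implementation: `InMemoryStateStore`, `issue_state`, and `consume_state` are hypothetical names, and the in-memory store only stands in for Redis so the example runs on its own. The real handlers would presumably use the project's Redis client with the same key format and TTL.

```python
import secrets
import time

STATE_TTL_SECONDS = 1800  # 30 minutes, as stated in the plan


class InMemoryStateStore:
    """Hypothetical stand-in for Redis: set-with-expiry plus get-and-delete."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._data = {}  # key -> (value, expires_at)

    def setex(self, key, ttl, value):
        self._data[key] = (value, self._clock() + ttl)

    def getdel(self, key):
        # pop() makes the token single-use: a second lookup finds nothing.
        entry = self._data.pop(key, None)
        if entry is None:
            return None
        value, expires_at = entry
        return value if self._clock() < expires_at else None


def issue_state(store, user_id):
    """Create a state token and bind it to the initiating user."""
    state_token = secrets.token_urlsafe(32)
    store.setex(f"oauth_state:{state_token}", STATE_TTL_SECONDS, str(user_id))
    return state_token


def consume_state(store, state):
    """Return the bound user id, or None if the state is unknown, used, or expired."""
    return store.getdel(f"oauth_state:{state}")
```

In the callback handler, a `None` result from `consume_state` would map to the 400 response. Reading and deleting in one step avoids a window in which the same state could be replayed.
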
<action>
Create backend/api/cloud.py with module docstring listing all endpoints and security invariants.

Imports: asyncio, secrets, uuid, urllib.parse; from typing import Literal; from fastapi import APIRouter, Depends, HTTPException, Request, status; from fastapi.responses import RedirectResponse; from pydantic import BaseModel, HttpUrl; from sqlalchemy import select; from sqlalchemy.ext.asyncio import AsyncSession

From project modules:
from api.admin import CloudConnectionOut
from config import settings
from db.models import CloudConnection, User
from deps.auth import get_regular_user
from deps.db import get_db
from services.audit import write_audit_log
from storage.cloud_utils import encrypt_credentials, decrypt_credentials, validate_cloud_url

VALID_OAUTH_PROVIDERS = {"google_drive", "onedrive"}
VALID_WEBDAV_PROVIDERS = {"nextcloud", "webdav"}

router = APIRouter(prefix="/api/cloud", tags=["cloud"])
users_router = APIRouter(prefix="/api/users", tags=["users"])

Pydantic request models (field types match the behavior spec above):
class WebDAVConnectRequest(BaseModel): server_url: HttpUrl; username: str; password: str; provider: Literal["nextcloud", "webdav"]
class DefaultStorageRequest(BaseModel): backend: str

Implement all 6 cloud endpoints + 1 users/me/default-storage endpoint per the behavior spec above.

For Google Drive OAuth initiate/callback:
from google_auth_oauthlib.flow import Flow (lazy import inside handler)
Flow.from_client_config with client_id=settings.google_client_id, client_secret=settings.google_client_secret
Scopes: ["https://www.googleapis.com/auth/drive.file"]
redirect_uri = f"{settings.backend_url}/api/cloud/oauth/callback/google_drive"
flow.authorization_url(access_type="offline", prompt="consent", state=state_token)
At callback: flow.fetch_token(code=code); store access_token, refresh_token, expiry, token_uri, client_id, client_secret

For OneDrive OAuth initiate/callback:
import msal (lazy import inside handler)
msal.ConfidentialClientApplication(settings.onedrive_client_id, client_credential=settings.onedrive_client_secret, authority=f"https://login.microsoftonline.com/{settings.onedrive_tenant_id}")
app.get_authorization_request_url(scopes=["Files.ReadWrite","offline_access"], redirect_uri=..., state=state_token)
At callback: app.acquire_token_by_authorization_code(code, scopes=..., redirect_uri=...)
Wrap msal calls in asyncio.to_thread()

For WebDAV/Nextcloud connect:
from storage.webdav_backend import WebDAVBackend
from storage.nextcloud_backend import NextcloudBackend
Instantiate with try/except ValueError → HTTPException(422)
health_check() in asyncio.to_thread context; on False → HTTPException(422, "Connection test failed — check server URL and credentials")

Upsert logic for CloudConnection:
SELECT where user_id=current_user.id AND provider=provider
If exists: update credentials_enc + status=ACTIVE; if not exists: INSERT
display_name = human-readable from provider: {"google_drive": "Google Drive", "onedrive": "OneDrive", "nextcloud": "Nextcloud", "webdav": "WebDAV server"}

write_audit_log calls:
cloud.connected: user_id=current_user.id, actor_id=current_user.id, resource_id=conn.id, metadata_={"provider": provider}
cloud.disconnected: same pattern
</action>
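
The upsert rule from the action above (match on user_id + provider, refresh credentials and reset status if a row exists, insert otherwise) can be sketched in plain Python. This is only an illustration under stated assumptions: `ConnectionRecord` and `upsert_connection` are hypothetical names, and a dict keyed by `(user_id, provider)` stands in for the CloudConnection table, so the example runs without a database. The real handler would presumably do the same with a SQLAlchemy `select` on the async session.

```python
import uuid
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Display names as listed in the plan.
DISPLAY_NAMES = {
    "google_drive": "Google Drive",
    "onedrive": "OneDrive",
    "nextcloud": "Nextcloud",
    "webdav": "WebDAV server",
}


@dataclass
class ConnectionRecord:
    """Hypothetical in-memory mirror of the CloudConnection columns."""

    user_id: str
    provider: str
    display_name: str
    credentials_enc: str
    status: str = "ACTIVE"
    id: str = field(default_factory=lambda: str(uuid.uuid4()))
    connected_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )


def upsert_connection(table, user_id, provider, credentials_enc):
    """Update the existing (user_id, provider) row or insert a new one."""
    key = (str(user_id), provider)
    existing = table.get(key)
    if existing is not None:
        # Reconnect: replace the stored credentials and clear any
        # REQUIRES_REAUTH state, keeping the same row id.
        existing.credentials_enc = credentials_enc
        existing.status = "ACTIVE"
        return existing
    record = ConnectionRecord(
        user_id=str(user_id),
        provider=provider,
        display_name=DISPLAY_NAMES[provider],
        credentials_enc=credentials_enc,
    )
    table[key] = record
    return record
```

Keeping the row id stable on reconnect means the `resource_id` written to the audit log refers to the same connection across reauthorizations.
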
<verify>
<automated>cd /Users/nik/Documents/Progamming/document_scanner/backend && python -c "
from api.cloud import router, users_router
print('cloud router imports OK')
print('Routes:')
for route in router.routes:
    print(f' {route.methods} {route.path}')
for route in users_router.routes:
    print(f' {route.methods} {route.path}')
"</automated>
</verify>
<acceptance_criteria>
- backend/api/cloud.py exists and imports without error
- router has routes: GET /oauth/initiate/{provider}, GET /oauth/callback/{provider}, POST /connections/webdav, GET /connections, DELETE /connections/{id}, GET /folders/{provider}/{folder_id}
- users_router has route: PATCH /me/default-storage
- All handlers have `Depends(get_regular_user)` in their signature
- CloudConnectionOut imported from api.admin — not redefined
- credentials_enc column never referenced in any response serialization (only in CloudConnection ORM SELECT for encrypt/decrypt ops)
- `pytest -v --tb=short` exits 0
</acceptance_criteria>
<done>cloud.py created with all 7 endpoints; all use get_regular_user dep; CloudConnectionOut from admin module; pytest passes</done>
</task>

<task type="auto" tdd="true">
<name>Task 2: Register cloud router in main.py + add folder listing endpoint</name>
<files>backend/main.py, backend/api/cloud.py</files>
<read_first>
- backend/main.py — existing router registrations pattern (app.include_router)
- backend/api/cloud.py — router and users_router objects created in Task 1
- backend/services/cloud_cache.py — get_cloud_folders_cached signature
- backend/storage/nextcloud_backend.py — NextcloudBackend.list_folder signature
- backend/storage/webdav_backend.py — WebDAVBackend.list_folder (may not exist — use generic approach)
</read_first>
<behavior>
- GET /api/cloud/folders/{provider}/{folder_id} endpoint added to cloud.py router: loads CloudConnection, decrypts credentials, instantiates backend, calls backend-specific list method via get_cloud_folders_cached; returns {"items": [...]} where each item has id, name, is_dir, size
- main.py includes both cloud router and users_router from api.cloud
- Router registrations added after the existing router includes; the existing registration order is left unchanged
- Existing test suite passes after router registration
</behavior>
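
The "standard format" for folder items (id, name, is_dir, size) can be illustrated with two small converters. This is a sketch under assumptions: the function names are hypothetical, and the input shapes follow the public Google Drive v3 `files` resource (folders carry the MIME type `application/vnd.google-apps.folder`, and `size` arrives as a string and is absent for folders) and the Microsoft Graph `driveItem` resource (folders carry a `folder` facet).

```python
GOOGLE_FOLDER_MIME = "application/vnd.google-apps.folder"


def normalize_google_item(item):
    """Convert a Drive v3 file dict to the listing format used by the endpoint."""
    return {
        "id": item["id"],
        "name": item["name"],
        "is_dir": item.get("mimeType") == GOOGLE_FOLDER_MIME,
        # Drive serializes size as a string and omits it for folders.
        "size": int(item.get("size", 0)),
    }


def normalize_onedrive_item(item):
    """Convert a Graph driveItem dict to the listing format used by the endpoint."""
    return {
        "id": item["id"],
        "name": item["name"],
        # A driveItem is a folder when it carries the "folder" facet.
        "is_dir": "folder" in item,
        "size": int(item.get("size", 0)),
    }
```

For example, `normalize_google_item({"id": "1", "name": "Scans", "mimeType": GOOGLE_FOLDER_MIME})` yields an item with `is_dir` set to `True` and `size` set to `0`.
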
<action>
In backend/api/cloud.py, add the folder listing endpoint to the router (if not already added in Task 1):

GET /api/cloud/folders/{provider}/{folder_id} implementation:
- Load CloudConnection for current_user.id + provider; 404 if not found or status != ACTIVE
- Decrypt credentials
- Build an async fetch_fn closure (a nested async def, since Python has no async lambda) that calls backend.list_folder(folder_id or root path)
- For provider "google_drive": use Drive service.files().list(q=f"'{folder_id}' in parents", fields="files(id,name,mimeType,size)"); convert to standard format
- For provider "onedrive": GET /me/drive/items/{folder_id}/children; convert to standard format
- For provider in ("nextcloud", "webdav"): instantiate NextcloudBackend; call list_folder(folder_id)
- Wrap in get_cloud_folders_cached(str(current_user.id), provider, folder_id, fetch_fn)
- Return {"items": [{"id":..., "name":..., "is_dir":bool, "size":int}, ...]}

In backend/main.py:
- Add imports: from api.cloud import router as cloud_router, users_router as cloud_users_router
- Add app.include_router(cloud_router) and app.include_router(cloud_users_router) after the existing router includes
- The existing routers (documents, topics, auth, admin, folders, audit, shares) must remain unchanged
</action>
<verify>
<automated>cd /Users/nik/Documents/Progamming/document_scanner/backend && python -c "
from main import app
cloud_routes = [r.path for r in app.routes if hasattr(r, 'path') and '/api/cloud' in r.path]
default_storage = [r.path for r in app.routes if hasattr(r, 'path') and 'default-storage' in r.path]
print('Cloud routes registered:', cloud_routes)
print('Default storage route:', default_storage)
assert len(cloud_routes) >= 6, f'Expected 6+ cloud routes, got {len(cloud_routes)}'
" && python -m pytest -v --tb=short 2>&1 | tail -5</automated>
</verify>
<acceptance_criteria>
- main.py imports and includes cloud_router and cloud_users_router
- `from main import app; [r.path for r in app.routes]` includes paths matching /api/cloud/ and /api/users/me/default-storage
- At least 6 cloud routes registered (initiate, callback, webdav, connections GET, connections DELETE, folders)
- `pytest -v --tb=short` exits 0, 0 failures
- Existing routes (documents, auth, admin, folders, shares) still reachable
</acceptance_criteria>
<done>Both cloud routers registered in main.py; all cloud routes visible in app.routes; full pytest suite passes</done>
</task>

</tasks>

<threat_model>
## Trust Boundaries

| Boundary | Description |
|----------|-------------|
| OAuth callback → user session | state parameter validates callback belongs to the initiating user |
| API request → CloudConnection row | connection.user_id == current_user.id assertion prevents IDOR |
| WebDAV credentials → validation | credentials only stored after successful health_check() |
| API response → CloudConnectionOut | credentials_enc excluded by CloudConnectionOut whitelist |

## STRIDE Threat Register

| Threat ID | Category | Component | Disposition | Mitigation Plan |
|-----------|----------|-----------|-------------|-----------------|
| T-05-05-01 | Tampering | OAuth callback CSRF | mitigate | secrets.token_urlsafe(32) state token stored in Redis; validated at callback; single-use deletion after validation (D-04) |
| T-05-05-02 | Elevation of Privilege | OAuth callback state token leak | mitigate | Redis TTL 1800s (30 min); key deleted after single use; state token appears only in the OAuth redirect URLs, never in an API response body |
| T-05-05-03 | Information Disclosure | CloudConnectionOut in API responses | mitigate | CloudConnectionOut imported from admin.py — exact same whitelist; credentials_enc absent by omission (SEC-08) |
| T-05-05-04 | Information Disclosure | Cloud connection ID enumeration | mitigate | DELETE /connections/{id} returns 404 for wrong-owner connections — same pattern as documents and shares (T-04-04-02) |
| T-05-05-05 | Tampering | WebDAV server_url SSRF | mitigate | validate_cloud_url called before WebDAVBackend/NextcloudBackend instantiation; also called in __init__ and before each request (D-17 defense-in-depth) |
| T-05-05-06 | Spoofing | Admin access to cloud endpoints | mitigate | get_regular_user raises 403 for admin role on all cloud endpoints (D-18) |
| T-05-05-07 | Information Disclosure | OAuth error message in redirect URL | accept | Error message in ?cloud_error= is URL-encoded and displayed to the authenticated user only; no PII or secret value included |
| T-05-05-08 | Information Disclosure | write_audit_log metadata for cloud.connected | mitigate | Audit metadata_ = {"provider": provider} only — no credentials, no tokens, no plaintext password (aligns with document audit whitelist pattern) |
</threat_model>
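
The two callback redirects (success with `cloud_connected`, failure with a URL-encoded `cloud_error`, see T-05-05-07) can be built with the standard library. This is a minimal sketch: `settings_redirect_url` is a hypothetical helper name, and the frontend base URL is passed in as a plain argument rather than read from the project's settings.

```python
from urllib.parse import urlencode


def settings_redirect_url(frontend_url, *, provider=None, error=None):
    """Build the /settings redirect target for the OAuth callback.

    Exactly one of provider (success) or error (failure) is expected.
    """
    if error is not None:
        # urlencode escapes '&', '=', ':' and spaces, so the message cannot
        # inject extra query parameters into the redirect.
        query = urlencode({"cloud_error": error})
    else:
        query = urlencode({"cloud_connected": provider})
    return f"{frontend_url.rstrip('/')}/settings?{query}"
```

Passing a short, fixed message for each failure class rather than the raw exception text would be one way to keep the "no PII or secret value" condition of T-05-05-07 true by construction.
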

<verification>
cd /Users/nik/Documents/Progamming/document_scanner/backend && python -m pytest tests/test_cloud.py -v && python -m pytest -v --tb=short 2>&1 | tail -10
</verification>

<success_criteria>
- cloud.py: all 7 endpoints implemented; all use get_regular_user dep; cross-user returns 404; write_audit_log on connect/disconnect
- main.py: both routers registered; all routes visible in app.routes
- pytest -v exits 0, 0 failures
- test_cloud.py stubs transition from xfail to green for test_credentials_enc_not_exposed, test_connection_status_display, test_disconnect_deletes_credentials, test_ssrf_validation, test_cross_user_idor, test_admin_cannot_see_credentials
</success_criteria>

<output>
Create `.planning/phases/05-cloud-storage-backends/05-05-SUMMARY.md` when done
</output>