---
phase: 05-cloud-storage-backends
plan: 04
type: execute
wave: 3
depends_on:
  - "05-02"
files_modified:
  - backend/storage/nextcloud_backend.py
  - backend/storage/webdav_backend.py
autonomous: true
requirements:
  - CLOUD-01
  - CLOUD-07
must_haves:
  truths:
    - "NextcloudBackend implements all 7 StorageBackend abstract methods"
    - "WebDAVBackend implements all 7 StorageBackend abstract methods"
    - "validate_cloud_url() called inside WebDAVBackend and NextcloudBackend before every outbound WebDAV request"
    - "All sync webdavclient3 calls wrapped in asyncio.to_thread()"
    - "generate_presigned_put_url and presigned_get_url raise NotImplementedError on both WebDAV backends"
    - "health_check uses lightweight PROPFIND or check() call to validate connectivity without storing unverified credentials"
  artifacts:
    - path: "backend/storage/nextcloud_backend.py"
      provides: "Nextcloud WebDAV StorageBackend"
      contains: "class NextcloudBackend"
    - path: "backend/storage/webdav_backend.py"
      provides: "Generic WebDAV StorageBackend"
      contains: "class WebDAVBackend"
  key_links:
    - from: "backend/storage/nextcloud_backend.py"
      to: "backend/storage/cloud_utils.py"
      via: "validate_cloud_url called before every outbound request"
      pattern: "validate_cloud_url"
    - from: "backend/storage/webdav_backend.py"
      to: "backend/storage/cloud_utils.py"
      via: "validate_cloud_url called before every outbound request"
      pattern: "validate_cloud_url"
---

Implement NextcloudBackend and WebDAVBackend, the two credential-based (non-OAuth) cloud StorageBackend concrete classes.

Purpose: These backends handle Nextcloud and generic WebDAV servers using HTTP Basic Auth. SSRF prevention via validate_cloud_url() is mandatory before every outbound request. All sync webdavclient3 calls are wrapped in asyncio.to_thread() per the MinIOBackend pattern.

Output: nextcloud_backend.py and webdav_backend.py, each implementing all 7 StorageBackend methods.
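For orientation, the SSRF guard that both backends depend on could look roughly like the sketch below. It is not the actual `validate_cloud_url` from `backend/storage/cloud_utils.py` (that comes from Plan 02 and is not reproduced here); the function name `validate_cloud_url_sketch` and the exact rejection rules are assumptions chosen to match the stated contract: raise ValueError when the URL targets a private or internal address.

```python
from __future__ import annotations

import ipaddress
import socket
from urllib.parse import urlsplit


def validate_cloud_url_sketch(url: str) -> None:
    """Hypothetical stand-in for cloud_utils.validate_cloud_url.

    Raises ValueError when the URL is not http(s), or when its host is,
    or resolves to, a non-public address.
    """
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https"):
        raise ValueError(f"unsupported scheme: {parts.scheme!r}")
    host = parts.hostname
    if not host:
        raise ValueError("URL has no host")

    try:
        # IP literals are checked directly, without touching DNS.
        candidates = [ipaddress.ip_address(host)]
    except ValueError:
        # Hostnames are resolved; every returned address must be public.
        try:
            infos = socket.getaddrinfo(
                host, parts.port or 443, type=socket.SOCK_STREAM
            )
        except socket.gaierror as exc:
            raise ValueError(f"cannot resolve host {host!r}") from exc
        candidates = [ipaddress.ip_address(info[4][0]) for info in infos]

    for addr in candidates:
        # is_global is False for loopback, RFC 1918, link-local, etc.
        if not addr.is_global:
            raise ValueError(f"{url!r} targets non-public address {addr}")
```

Because a hostname can resolve differently between two calls, a check of this kind only narrows the DNS-rebinding window; that is why the plan repeats it before every request rather than only at construction time.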
@/Users/nik/.claude/get-shit-done/workflows/execute-plan.md
@/Users/nik/.claude/get-shit-done/templates/summary.md

@.planning/PROJECT.md
@.planning/ROADMAP.md
@.planning/phases/05-cloud-storage-backends/05-CONTEXT.md
@.planning/phases/05-cloud-storage-backends/05-RESEARCH.md
@.planning/phases/05-cloud-storage-backends/05-02-SUMMARY.md

From backend/storage/base.py:

    class StorageBackend(ABC):
        async def put_object(user_id, document_id, file_bytes, extension, content_type) -> str
        async def get_object(object_key: str) -> bytes
        async def delete_object(object_key: str) -> None
        async def presigned_get_url(object_key: str, expires_minutes: int = 60) -> str
        async def health_check() -> bool
        async def generate_presigned_put_url(object_key: str, expires_minutes: int = 15) -> str
        async def stat_object(object_key: str) -> int

webdavclient3:

- Client options: {"webdav_hostname": server_url, "webdav_login": username, "webdav_password": password}
- All webdavclient3 calls are synchronous: MUST wrap in asyncio.to_thread()
- Method names to verify:
  - client.upload_to(buf, remote_path), client.download_from(buf, remote_path)
  - client.list(remote_dir), client.info(remote_path) returns dict with "size" key
  - client.check(remote_path) returns bool (used for health_check)
  - client.clean(remote_path) (delete)
- ASSUMPTION A1: verify upload_to/download_from method names against installed package during implementation

validate_cloud_url:

- validate_cloud_url(url: str) -> None raises ValueError if URL targets private/internal address
- Must be called: (1) at connect-time, (2) before every outbound WebDAV request

Paths and credentials:

- Use urllib.parse.quote() on path segments for Nextcloud compatibility with non-ASCII filenames
- object_key = WebDAV path: "docuvault/{user_id}/{document_id}{extension}"
- CloudConnection credentials dict: {"server_url": str, "username": str, "password": str}

Task 1: Implement WebDAVBackend

Files: backend/storage/webdav_backend.py

Read first:

- backend/storage/base.py: all 7 method signatures
- backend/storage/minio_backend.py: asyncio.to_thread() wrapping pattern
- backend/storage/cloud_utils.py: validate_cloud_url signature
- .planning/phases/05-cloud-storage-backends/05-RESEARCH.md: Pattern 5 (webdavclient3), Pitfall 2 (path encoding), A1 (assumed method names)

Behavior:

- WebDAVBackend.__init__(self, server_url: str, username: str, password: str) creates webdavclient3 Client
- validate_cloud_url(server_url) called in __init__ before constructing the client (SSRF guard at construct time)
- put_object: constructs object_key = f"docuvault/{user_id}/{document_id}{extension}"; percent-encodes path segments; uploads via asyncio.to_thread; returns object_key
- get_object: downloads to BytesIO via asyncio.to_thread; returns bytes
- delete_object: deletes via asyncio.to_thread; catches FileNotFoundError / WebDavException for missing file (no-op)
- presigned_get_url: raises NotImplementedError
- generate_presigned_put_url: raises NotImplementedError
- stat_object: calls asyncio.to_thread for client.info(object_key); returns int(info.get("size", 0))
- health_check: calls asyncio.to_thread for client.check("/"); returns True/False
- SSRF validation called before every asyncio.to_thread call: validate_cloud_url(self._server_url)
- Uses urllib.parse.quote on non-docuvault path segments (Pitfall 2)

Create backend/storage/webdav_backend.py with:

Module docstring explaining WebDAV backend, SSRF validation requirement per D-17, and Pitfall 2 (path encoding).

    from __future__ import annotations

    import asyncio
    import io
    import urllib.parse

    from webdav3.client import Client
    from webdav3.exceptions import WebDavException

    from storage.base import StorageBackend
    from storage.cloud_utils import validate_cloud_url


    class WebDAVBackend(StorageBackend):
        def __init__(self, server_url: str, username: str, password: str) -> None:
            validate_cloud_url(server_url)  # SSRF guard at construct time
            self._server_url = server_url
            options = {
                "webdav_hostname": server_url,
                "webdav_login": username,
                "webdav_password": password,
            }
            self._client = Client(options)

        def _make_path(self, user_id: str, document_id: str, extension: str) -> str:
            # Construct path with percent-encoding for Nextcloud/WebDAV compatibility (Pitfall 2)
            encoded_uid = urllib.parse.quote(str(user_id), safe="")
            encoded_did = urllib.parse.quote(str(document_id), safe="")
            return f"docuvault/{encoded_uid}/{encoded_did}{extension}"

        async def put_object(self, user_id, document_id, file_bytes, extension, content_type) -> str:
            validate_cloud_url(self._server_url)  # re-validate before every request (D-17)
            object_key = self._make_path(user_id, document_id, extension)
            buf = io.BytesIO(file_bytes)
            # Ensure parent directory exists: client.mkdir("docuvault/{user_id}/") wrapped in asyncio.to_thread
            # Then: await asyncio.to_thread(self._client.upload_to, buf, object_key)
            # If upload_to method name incorrect, verify against webdavclient3 docs and use correct name
            return object_key

        async def get_object(self, object_key: str) -> bytes:
            validate_cloud_url(self._server_url)
            buf = io.BytesIO()
            await asyncio.to_thread(self._client.download_from, buf, object_key)
            return buf.getvalue()

        async def delete_object(self, object_key: str) -> None:
            validate_cloud_url(self._server_url)
            try:
                await asyncio.to_thread(self._client.clean, object_key)
            except (FileNotFoundError, WebDavException):
                pass  # No-op if file not found

        async def presigned_get_url(self, object_key: str, expires_minutes: int = 60) -> str:
            raise NotImplementedError("WebDAV backend does not support presigned URLs")

        async def generate_presigned_put_url(self, object_key: str, expires_minutes: int = 15) -> str:
            raise NotImplementedError("WebDAV backend does not support presigned put URLs")

        async def stat_object(self, object_key: str) -> int:
            validate_cloud_url(self._server_url)
            info = await asyncio.to_thread(self._client.info, object_key)
            return int(info.get("size", 0))

        async def health_check(self) -> bool:
            try:
                validate_cloud_url(self._server_url)
                result = await asyncio.to_thread(self._client.check, "/")
                return bool(result)
            except Exception:
                return False

IMPORTANT: During implementation, verify the webdavclient3 method names by running:

    python -c "from webdav3.client import Client; print([m for m in dir(Client) if not m.startswith('_')])"

Use the method names that command reports. RESEARCH.md marks upload_to/download_from as [ASSUMED]; correct them if the installed package differs (e.g., may be upload_sync, download_sync, or upload/download).

Verify:

    cd /Users/nik/Documents/Progamming/document_scanner/backend && python -c "
    from storage.webdav_backend import WebDAVBackend
    import inspect
    for method in ['put_object','get_object','delete_object','presigned_get_url','health_check','generate_presigned_put_url','stat_object']:
        assert inspect.iscoroutinefunction(getattr(WebDAVBackend, method)), f'{method} not async'
    # SSRF guard: connecting to localhost should raise ValueError
    try:
        WebDAVBackend('http://localhost/dav', 'user', 'pass')
        print('FAIL: should raise ValueError for localhost')
    except ValueError:
        print('OK: SSRF blocked in __init__')
    print('All methods async: OK')
    import asyncio
    backend = WebDAVBackend.__new__(WebDAVBackend)
    backend._server_url = 'https://example.com/dav'  # bypass __init__ for method check
    async def check():
        try:
            await backend.presigned_get_url('k')
        except NotImplementedError:
            print('presigned_get_url NotImplementedError: OK')
        try:
            await backend.generate_presigned_put_url('k')
        except NotImplementedError:
            print('generate_presigned_put_url NotImplementedError: OK')
    asyncio.run(check())
    "

Acceptance criteria:

- backend/storage/webdav_backend.py exists with class WebDAVBackend
- All 7 methods are async coroutines
- WebDAVBackend("http://127.0.0.1/dav", "u", "p") raises ValueError (SSRF guard in __init__)
- presigned_get_url and generate_presigned_put_url raise NotImplementedError
- validate_cloud_url imported and called in __init__ and before every asyncio.to_thread call
- `pytest -v --tb=short` exits 0

Done: WebDAVBackend created; SSRF validation in __init__ and before each request; all 7 methods async; pytest passes

Task 2: Implement NextcloudBackend

Files: backend/storage/nextcloud_backend.py

Read first:

- backend/storage/webdav_backend.py: WebDAVBackend implementation (NextcloudBackend extends it)
- .planning/phases/05-cloud-storage-backends/05-RESEARCH.md: Open Question 2 (Nextcloud folder listing path convention), Pitfall 2 (path encoding)
- backend/storage/cloud_utils.py: validate_cloud_url

Behavior:

- NextcloudBackend subclasses WebDAVBackend: inherits all 7 methods; only overrides what differs
- NextcloudBackend stores the username for folder listing path construction (Nextcloud WebDAV path: /remote.php/dav/files/{username}/)
- SSRF validation inherited from WebDAVBackend parent class
- list_folder(folder_path: str) -> list[dict] method added for cloud folder listing via PROPFIND (used by API)
- list_folder returns list of dicts with keys: id (str path), name (str), is_dir (bool), size (int)
- get_object and put_object inherited from WebDAVBackend
- health_check overrides parent to use PROPFIND on the Nextcloud root path

Create backend/storage/nextcloud_backend.py with:

Module docstring explaining Nextcloud extends WebDAVBackend; Nextcloud WebDAV base path convention.

    from __future__ import annotations

    import asyncio
    import urllib.parse

    from storage.webdav_backend import WebDAVBackend
    from storage.cloud_utils import validate_cloud_url


    class NextcloudBackend(WebDAVBackend):
        """Nextcloud storage backend: extends WebDAVBackend with Nextcloud-specific path handling.

        The server_url should be the full WebDAV root:
        https://nc.example.com/remote.php/dav/files/{username}/
        """

        def __init__(self, server_url: str, username: str, password: str) -> None:
            super().__init__(server_url, username, password)
            self._username = username

        async def list_folder(self, folder_path: str = "") -> list[dict]:
            """List folder contents at folder_path relative to WebDAV root.

            Returns a list of dicts: [{"id": str, "name": str, "is_dir": bool, "size": int}, ...]
            Used by GET /api/cloud/folders/nextcloud/{folder_id} endpoint.
            """
            validate_cloud_url(self._server_url)
            # List the folder using client.list() which returns a list of file names
            # For each item, call client.info() to get size and type
            # Wrap each client call in asyncio.to_thread
            # Return structured list

        async def health_check(self) -> bool:
            try:
                validate_cloud_url(self._server_url)
                # Use client.check("") or client.list("") to verify connectivity to root
                result = await asyncio.to_thread(self._client.check, "")
                return bool(result)
            except Exception:
                return False

NextcloudBackend inherits put_object, get_object, delete_object, presigned_get_url, generate_presigned_put_url, and stat_object from WebDAVBackend. The list_folder method is extra (not in ABC) and used exclusively by the cloud folder listing API endpoint.
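The commented steps in list_folder could be filled in along the lines of the sketch below. To stay runnable without a server it uses `FakeWebDAVClient`, a hypothetical in-memory stand-in for the sync webdavclient3 Client, and a free function `list_folder_sketch` with an injectable `validate` callable in place of validate_cloud_url. It also assumes that `client.list()` marks directories with a trailing slash and that `client.info()` returns a dict with a "size" key; both should be verified against the installed package, like assumption A1.

```python
from __future__ import annotations

import asyncio
import posixpath
from typing import Callable


class FakeWebDAVClient:
    """Hypothetical in-memory stand-in for the sync webdavclient3 Client."""

    def __init__(self, tree: dict[str, dict[str, int]]) -> None:
        # Maps folder path -> {entry name: size}; names ending in "/" are folders.
        self._tree = tree

    def list(self, remote_dir: str) -> list[str]:
        return list(self._tree[remote_dir])

    def info(self, remote_path: str) -> dict[str, str]:
        parent, name = posixpath.split(remote_path)
        return {"size": str(self._tree[parent][name])}


async def list_folder_sketch(
    client,
    folder_path: str = "",
    validate: Callable[[], None] = lambda: None,
) -> list[dict]:
    """Return [{"id", "name", "is_dir", "size"}, ...] for one folder."""
    validate()  # stands in for validate_cloud_url(self._server_url)
    names = await asyncio.to_thread(client.list, folder_path)

    entries: list[dict] = []
    for raw in names:
        is_dir = raw.endswith("/")  # assumed directory marker
        name = raw.rstrip("/")
        path = posixpath.join(folder_path, name)
        size = 0
        if not is_dir:
            validate()  # re-validate before each outbound call
            info = await asyncio.to_thread(client.info, path)
            size = int(info.get("size") or 0)
        entries.append({"id": path, "name": name, "is_dir": is_dir, "size": size})
    return entries


client = FakeWebDAVClient({"": {"scans/": 0, "a.pdf": 120}, "scans": {"b.pdf": 7}})
print(asyncio.run(list_folder_sketch(client, "scans")))
# prints [{'id': 'scans/b.pdf', 'name': 'b.pdf', 'is_dir': False, 'size': 7}]
```

Skipping the info call for directories is a choice made in this sketch to reduce round trips; the plan's comments describe calling client.info() for every item, which is the per-item overhead accepted under T-05-04-04.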
Verify:

    cd /Users/nik/Documents/Progamming/document_scanner/backend && python -c "
    from storage.nextcloud_backend import NextcloudBackend
    from storage.webdav_backend import WebDAVBackend
    import inspect
    # Verify subclass
    assert issubclass(NextcloudBackend, WebDAVBackend), 'NextcloudBackend must subclass WebDAVBackend'
    # Verify all 7 methods async
    for method in ['put_object','get_object','delete_object','presigned_get_url','health_check','generate_presigned_put_url','stat_object']:
        assert inspect.iscoroutinefunction(getattr(NextcloudBackend, method)), f'{method} not async'
    # Verify list_folder added
    assert hasattr(NextcloudBackend, 'list_folder'), 'list_folder missing'
    assert inspect.iscoroutinefunction(NextcloudBackend.list_folder), 'list_folder not async'
    print('NextcloudBackend is WebDAVBackend subclass: OK')
    print('All 7 StorageBackend methods async: OK')
    print('list_folder method present and async: OK')
    # SSRF guard inherited
    try:
        NextcloudBackend('http://10.0.0.1/dav', 'user', 'pass')
        print('FAIL: SSRF should be blocked')
    except ValueError:
        print('SSRF guard inherited: OK')
    "

Acceptance criteria:

- backend/storage/nextcloud_backend.py exists with class NextcloudBackend
- issubclass(NextcloudBackend, WebDAVBackend) is True
- All 7 StorageBackend methods are async (inherited or overridden)
- list_folder async method added beyond the ABC contract
- SSRF guard inherited from WebDAVBackend.__init__: NextcloudBackend("http://10.0.0.1/dav", ...) raises ValueError
- `pytest -v --tb=short` exits 0

Done: NextcloudBackend created as WebDAVBackend subclass; list_folder added; SSRF guard inherited; pytest passes

## Trust Boundaries

| Boundary | Description |
|----------|-------------|
| user-supplied server_url → WebDAV client | Server URL must be validated for SSRF before Client construction and before each request |
| webdavclient3 sync calls → event loop | All sync SDK calls must be in asyncio.to_thread() to prevent event loop blocking |
| WebDAV credentials → encrypted storage | Credentials flow from encrypted DB via factory into backend constructor; never logged |

## STRIDE Threat Register

| Threat ID | Category | Component | Disposition | Mitigation Plan |
|-----------|----------|-----------|-------------|-----------------|
| T-05-04-01 | Tampering | WebDAVBackend: SSRF via server_url | mitigate | validate_cloud_url(server_url) in __init__ AND before every asyncio.to_thread call; D-17 requires both points |
| T-05-04-02 | Tampering | DNS rebinding on WebDAV requests | mitigate | validate_cloud_url called before each request (not only at connect-time); documented defense-in-depth via network egress firewall (RESEARCH.md Pitfall 5) |
| T-05-04-03 | Information Disclosure | WebDAV path includes user_id/document_id | accept | object_key = "docuvault/{user_id}/{document_id}{ext}"; no human filename; acceptable for single-user WebDAV servers |
| T-05-04-04 | Denial of Service | Nextcloud list_folder fetching info per item | accept | TTLCache (Plan 02) prevents repeated list_folder calls within 60s; per-item info call is provider overhead only |
| T-05-04-05 | Tampering | webdavclient3 path traversal via object_key | mitigate | put_object constructs object_key from user_id and document_id (both UUID values); get_object/delete_object receive object_key from DB (not from user input directly); no raw user path injection |

Verification:

    cd /Users/nik/Documents/Progamming/document_scanner/backend && python -m pytest tests/test_cloud.py -v && python -m pytest -v --tb=short 2>&1 | tail -10

Success criteria:

- WebDAVBackend: all 7 methods async; validate_cloud_url in __init__ and before each request; presigned methods raise NotImplementedError
- NextcloudBackend: subclass of WebDAVBackend; list_folder method added; SSRF guard inherited
- pytest -v exits 0, 0 failures; test_cloud.py still all xfailed

Create `.planning/phases/05-cloud-storage-backends/05-04-SUMMARY.md` when done
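
As a quick sanity check on Pitfall 2 and T-05-04-05, the path construction from `_make_path` can be exercised in isolation. The sketch below copies that logic into a free function, `make_path` (a name used only here), and feeds it made-up IDs. Real IDs are UUIDs, so the second call is a deliberately hostile input to show what the encoding does to separators and non-ASCII characters.

```python
from urllib.parse import quote


def make_path(user_id: str, document_id: str, extension: str) -> str:
    """Standalone copy of the plan's _make_path logic, for illustration."""
    # safe="" also encodes "/", so each ID stays a single path segment.
    encoded_uid = quote(str(user_id), safe="")
    encoded_did = quote(str(document_id), safe="")
    return f"docuvault/{encoded_uid}/{encoded_did}{extension}"


print(make_path("3f2b", "a1c9", ".pdf"))
# prints docuvault/3f2b/a1c9.pdf

print(make_path("../other", "döc 1", ".pdf"))
# prints docuvault/..%2Fother/d%C3%B6c%201.pdf
```

The extension is passed through unencoded, as in the plan. Whether a particular WebDAV server decodes `%2F` back into a separator is not something this sketch tests, and webdavclient3 may apply its own quoting to remote paths, so double encoding is worth checking during implementation.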