docs(05-03): complete GoogleDriveBackend + OneDriveBackend plan
- SUMMARY.md created for Plan 05-03
- STATE.md updated: completed_plans 26→27, progress 81→84%
- Session continuity updated with pytest results (262 passed / 43 xfailed / 1 pre-existing)
- Key decisions added: shared CloudConnectionError, cache_discovery=False, createUploadSession
+14
-9
@@ -4,13 +4,13 @@ milestone: v1.0
 milestone_name: milestone
 current_phase: 5
 status: executing
-last_updated: "2026-05-28T19:30:00.000Z"
+last_updated: "2026-05-28T19:15:00.000Z"
 progress:
   total_phases: 5
   completed_phases: 4
   total_plans: 32
-  completed_plans: 26
-  percent: 81
+  completed_plans: 27
+  percent: 84
 ---
 
 # Project State
@@ -28,13 +28,13 @@ progress:
 | 2 | Users & Authentication | ✓ Complete (5/5 plans) |
 | 3 | Document Migration & Multi-User Isolation | ✓ Complete (5/5 plans, UAT passed, security gate passed) |
 | 4 | Folders, Sharing, Quotas & Document UX | ✓ Complete (9/9 plans, UAT 14/15 passed, 1 bug fixed) |
-| 5 | Cloud Storage Backends | Not Started |
+| 5 | Cloud Storage Backends | In Progress (3/8 plans complete) |
 
 ## Current Position
 
-**Phase:** 05-cloud-storage-backends — Not started
-**Plan:** 0/TBD
-**Progress:** [████████░░] 78%
+**Phase:** 05-cloud-storage-backends — In Progress
+**Plan:** 3/8
+**Progress:** [████████░░] 84%
 
 ## Performance Metrics
 
@@ -124,6 +124,10 @@ progress:
 | CloudConnectionOut whitelist pattern | Pydantic model with exactly the safe fields; credentials_enc absent by omission — SEC-08 safe-by-default |
 | admin.user_deleted flush before delete | audit write flushed (session.flush()) while user FK still valid; session.delete(user) follows — preserves audit FK integrity |
 | test_admin_impersonation 405 acceptable | DELETE /users/{id} causes GET to return 405 not 422; both mean no GET impersonation endpoint; test updated to accept {404, 405, 422} |
+| CloudConnectionError shared exception type | Defined once in google_drive_backend.py; imported by onedrive_backend.py — single exception type across all cloud backends |
+| cache_discovery=False on Drive build() | Prevents /tmp discovery cache writes — directory traversal vector (T-05-03-05) |
+| createUploadSession for all OneDrive uploads | No 4 MB size gate; resumable sessions handle small and large files through same code path (Pitfall 6) |
+| MSAL invalid_grant via result.get('error') | MSAL returns dict (never raises); field-level check is correct — Assumption A3 confirmed |
 
 ### Open Questions
 
@@ -169,6 +173,7 @@ _Updated at each phase transition._
 | Last session | 2026-05-28 — Phase 5 planned (8 plans, 7 waves); verification passed (4 blockers → resolved: D-05 API-layer refresh path, SEC-09 cloud cleanup, frontend_url config, RESEARCH resolved markers) |
 | Last session | 2026-05-28 — Plan 05-01 executed: Wave 0 Nyquist scaffold — 19 xfail stubs in test_cloud.py, 4 cloud fixtures in conftest.py, 6 package pins, 8 config settings; 172 passed / 43 xfailed |
 | Last session | 2026-05-28 — Plan 05-02 executed: cloud_utils.py (SSRF+HKDF), cloud_cache.py (TTLCache), storage factory extended; 199 passed / 43 xfailed / 1 pre-existing failure |
-| Next action | Execute Plan 05-03: GoogleDriveBackend + OneDriveBackend (all 7 StorageBackend methods) |
+| Last session | 2026-05-28 — Plan 05-03 executed: GoogleDriveBackend (Drive v3, cache_discovery=False, asyncio.to_thread) + OneDriveBackend (MSAL, resumable upload, CHUNK_SIZE=10MB); 262 passed / 43 xfailed / 1 pre-existing failure |
+| Next action | Execute Plan 05-04: WebDAVBackend + NextcloudBackend |
 | Pending decisions | None |
-| Resume file | `.planning/phases/05-cloud-storage-backends/05-03-PLAN.md` |
+| Resume file | `.planning/phases/05-cloud-storage-backends/05-04-PLAN.md` |
@@ -0,0 +1,148 @@
+---
+phase: 05-cloud-storage-backends
+plan: 03
+subsystem: api
+tags: [google-drive, onedrive, microsoft-graph, msal, google-api-python-client, oauth2, asyncio, cloud-storage]
+
+# Dependency graph
+requires:
+  - phase: 05-cloud-storage-backends
+    plan: 02
+    provides: "CloudConnectionError (shared exception), StorageBackend ABC, asyncio.to_thread pattern reference (MinIOBackend)"
+provides:
+  - "backend/storage/google_drive_backend.py: GoogleDriveBackend + CloudConnectionError exception class"
+  - "backend/storage/onedrive_backend.py: OneDriveBackend with resumable upload and MSAL token refresh"
+  - "backend/tests/test_cloud_backends.py: 32 green TDD tests for both backends"
+affects: [05-05, 05-06, 05-07, 05-08]
+
+# Tech tracking
+tech-stack:
+  added:
+    - google-api-python-client 2.196.0 (Google Drive v3 API — files.create, get_media, delete, list)
+    - google-auth-oauthlib 1.3.1 (google.oauth2.credentials.Credentials)
+    - msal 1.36.0 (ConfidentialClientApplication.acquire_token_by_refresh_token)
+  patterns:
+    - "Shared exception class: CloudConnectionError(reason=) defined once in google_drive_backend.py, imported by onedrive_backend.py"
+    - "All sync SDK calls wrapped in asyncio.to_thread() — identical pattern to MinIOBackend"
+    - "cache_discovery=False on googleapiclient.discovery.build() — prevents /tmp discovery doc writes"
+    - "B2 design: backends are stateless signal-raisers — raise CloudConnectionError, never update DB"
+    - "OneDrive resumable upload: createUploadSession for ALL files (no 4 MB size gate)"
+    - "CHUNK_SIZE = 10 MB — above Graph's 4 MB simple upload limit (Pitfall 6 prevention)"
+
+key-files:
+  created:
+    - backend/storage/google_drive_backend.py
+    - backend/storage/onedrive_backend.py
+    - backend/tests/test_cloud_backends.py
+  modified: []
+
+key-decisions:
+  - "CloudConnectionError defined in google_drive_backend.py and imported by onedrive_backend.py — single shared exception type keeps error handling uniform in the API layer (cloud.py, Plan 05-05)"
+  - "cache_discovery=False on Drive build() — prevents googleapiclient from writing /tmp discovery cache, avoiding /tmp traversal vector (T-05-03-05)"
+  - "Resumable upload sessions used for ALL OneDrive uploads regardless of file size — simpler than a size gate and eliminates the 4 MB limit (Pitfall 6, RESEARCH.md Open Question 3)"
+  - "MSAL invalid_grant detection via result.get('error') == 'invalid_grant' — confirmed as the correct Assumption A3 from RESEARCH.md"
+  - "_ensure_valid_token() uses 60-second buffer before expiry — reduces race conditions between expiry check and actual API call"
+
+patterns-established:
+  - "Backend statelessness: cloud backends raise CloudConnectionError(reason=) and never call session.commit()"
+  - "Google Drive 401 → token_expired; 400 + invalid_grant body → invalid_grant"
+  - "OneDrive: _ensure_valid_token() + _refresh_token() called before every operation"
+
+requirements-completed:
+  - CLOUD-01
+  - CLOUD-05
+  - CLOUD-07
+
+# Metrics
+duration: 6min
+completed: 2026-05-28
+---
+
+# Phase 5 Plan 03: Google Drive and OneDrive StorageBackend Implementations Summary
+
+**Stateless GoogleDriveBackend (Drive v3 with asyncio.to_thread, cache_discovery=False) and OneDriveBackend (MSAL token refresh, 10 MB resumable upload sessions via createUploadSession) implementing all 7 StorageBackend methods**
+
+## Performance
+
+- **Duration:** 6 min
+- **Started:** 2026-05-28T19:05:18Z
+- **Completed:** 2026-05-28T19:11:00Z
+- **Tasks:** 2
+- **Files modified:** 3
+
+## Accomplishments
+
+- Created `google_drive_backend.py` with `CloudConnectionError(reason=)` exception class and `GoogleDriveBackend` implementing all 7 StorageBackend methods. Every sync `googleapiclient` call is wrapped in `asyncio.to_thread()`. `cache_discovery=False` prevents /tmp traversal (T-05-03-05). HttpError 401 raises `CloudConnectionError(reason="token_expired")`; HttpError 400 with "invalid_grant" body raises `CloudConnectionError(reason="invalid_grant")`. `presigned_get_url` and `generate_presigned_put_url` raise `NotImplementedError` (D-14).
+- Created `onedrive_backend.py` with `OneDriveBackend` importing the shared `CloudConnectionError` from `google_drive_backend`. `CHUNK_SIZE = 10 * 1024 * 1024` (10 MB). Uses Microsoft Graph `createUploadSession` for all uploads (no 4 MB size gate). `_ensure_valid_token()` checks expiry with 60s buffer; `_refresh_token()` wraps MSAL in `asyncio.to_thread()` and returns `None` on `invalid_grant` to trigger `CloudConnectionError(reason="invalid_grant")`. Both `presigned_*` methods raise `NotImplementedError`.
+- Created `tests/test_cloud_backends.py` with 32 TDD tests (RED → GREEN) covering imports, all 7 methods being async, `CHUNK_SIZE`, shared `CloudConnectionError`, `presigned_*` raising `NotImplementedError`, `__init__` correctness, and `_ensure_valid_token` behavior for expired/non-expired tokens.
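The error mapping and thread offloading described in the Accomplishments list can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code from `google_drive_backend.py`: `FakeHttpError`, `map_http_error`, and `call_sdk` are hypothetical names, and `FakeHttpError` stands in for `googleapiclient.errors.HttpError` so the block runs with the standard library alone. Only the `reason=` values are taken from the summary.

```python
import asyncio


class CloudConnectionError(Exception):
    """Stand-in for the shared exception; only `reason` comes from the summary."""

    def __init__(self, reason: str) -> None:
        super().__init__(reason)
        self.reason = reason


class FakeHttpError(Exception):
    """Hypothetical stand-in for googleapiclient.errors.HttpError."""

    def __init__(self, status: int, body: str = "") -> None:
        super().__init__(f"HTTP {status}")
        self.status = status
        self.body = body


def map_http_error(err: FakeHttpError) -> Exception:
    """Translate an HTTP failure into the reason codes named in the summary."""
    if err.status == 401:
        return CloudConnectionError(reason="token_expired")
    if err.status == 400 and "invalid_grant" in err.body:
        return CloudConnectionError(reason="invalid_grant")
    return err  # any other status is passed through unchanged


async def call_sdk(sync_fn, *args):
    """Run a blocking SDK call in a worker thread and map its errors."""
    try:
        return await asyncio.to_thread(sync_fn, *args)
    except FakeHttpError as err:
        mapped = map_http_error(err)
        if mapped is err:
            raise
        raise mapped from err
```

Keeping the mapping in one small function means the backend never touches the database on failure: it only raises, and the caller decides what to record.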
+
+## Task Commits
+
+Each task was committed atomically following the TDD RED → GREEN cycle:
+
+1. **RED phase tests — both backends** - `4efe7c1` (test)
+2. **Task 1: GoogleDriveBackend** - `337ee8e` (feat)
+3. **Task 2: OneDriveBackend** - `bcb887e` (feat)
+
+## Files Created/Modified
+
+- `/Users/nik/Documents/Progamming/document_scanner/backend/storage/google_drive_backend.py` — GoogleDriveBackend (all 7 methods) + CloudConnectionError exception class
+- `/Users/nik/Documents/Progamming/document_scanner/backend/storage/onedrive_backend.py` — OneDriveBackend (all 7 methods), CHUNK_SIZE, MSAL token refresh, resumable upload
+- `/Users/nik/Documents/Progamming/document_scanner/backend/tests/test_cloud_backends.py` — 32 green TDD tests for both backends
+
+## Decisions Made
+
+- `CloudConnectionError` is defined once in `google_drive_backend.py` and imported by `onedrive_backend.py`. This keeps the exception type unified — the API layer in `cloud.py` (Plan 05-05) will catch one exception type regardless of which backend raised it.
+- `cache_discovery=False` is explicitly set on `googleapiclient.discovery.build()`. Without this flag, the client writes a JSON discovery document to `/tmp` on first call — this was identified as Threat T-05-03-05 in the plan's threat model.
+- `createUploadSession` is used for ALL OneDrive uploads (not only files > 4 MB). This matches RESEARCH.md's resolution of Open Question 3: simpler code (no size branch), avoids the 4 MB limit entirely, and handles both small and large files through the same path.
+- MSAL's `invalid_grant` is detected via `result.get("error") == "invalid_grant"` — consistent with Assumption A3 in RESEARCH.md. The MSAL library returns a dict (never raises), so field-level checking is the correct approach.
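To make the single-path upload decision concrete, here is a hedged sketch of how a payload might be cut into chunks for an upload session. `iter_chunks` and `GRAPH_FRAGMENT_ALIGNMENT` are hypothetical names, and the real `onedrive_backend.py` may differ. The `CHUNK_SIZE` value comes from the summary. The 320 KiB alignment check reflects Microsoft's Graph guidance on fragment sizes and is an addition here, not something the summary states.

```python
from typing import Iterator, Tuple

CHUNK_SIZE = 10 * 1024 * 1024          # 10 MiB, value quoted in the summary
GRAPH_FRAGMENT_ALIGNMENT = 320 * 1024  # assumption: fragments are multiples of 320 KiB


def iter_chunks(data: bytes, chunk_size: int = CHUNK_SIZE) -> Iterator[Tuple[bytes, str]]:
    """Yield (chunk, Content-Range value) pairs for one upload session.

    Small and large payloads take the same path; a small file simply
    produces a single chunk. An empty payload yields nothing, and
    zero-byte files are not handled in this sketch.
    """
    if chunk_size <= 0 or chunk_size % GRAPH_FRAGMENT_ALIGNMENT != 0:
        raise ValueError("chunk_size must be a positive multiple of 320 KiB")
    total = len(data)
    for start in range(0, total, chunk_size):
        chunk = data[start:start + chunk_size]
        end = start + len(chunk) - 1  # the end offset in Content-Range is inclusive
        yield chunk, f"bytes {start}-{end}/{total}"
```

Each yielded header value would accompany one `PUT` to the session's upload URL. With the default 10 MiB chunk size, a 1 KiB file and a 1 GiB file differ only in how many iterations the loop runs.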
+
+## Deviations from Plan
+
+None — plan executed exactly as written. Both backends implemented per the action specifications, all acceptance criteria met.
+
+## Issues Encountered
+
+`google-api-python-client`, `google-auth-oauthlib`, and `msal` were not installed in the local Python 3.9.6 environment (they were added to `requirements.txt` in Plan 05-01 but not installed locally). Installed all three via `pip3 install` to enable local test execution. This is consistent with the Plan 05-02 SUMMARY's note about running tests locally vs. Docker.
+
+FutureWarnings from `google.auth` about Python 3.9 end-of-life appeared in pytest output but do not affect test results — they are informational warnings from the library, not from our code.
+
+## Known Stubs
+
+None. Both backends are fully implemented with real method bodies. No placeholder returns or TODO comments in production code paths.
+
+## Threat Surface Scan
+
+No new network endpoints introduced. Both backends are pure library classes:
+- `GoogleDriveBackend` makes outbound calls to `googleapis.com` using OAuth tokens from the decrypted credentials dict. Credentials are not logged.
+- `OneDriveBackend` makes outbound calls to `graph.microsoft.com` and `login.microsoftonline.com` (via MSAL). Credentials are not logged.
+
+No new trust boundaries beyond those already documented in the plan's `<threat_model>`. All STRIDE mitigations listed are implemented:
+- T-05-03-01: Credentials dict never logged; only in memory during request lifecycle
+- T-05-03-02: `invalid_grant` detection implemented; `CloudConnectionError(reason="invalid_grant")` propagated to API layer
+- T-05-03-05: `cache_discovery=False` implemented on Drive `build()` call
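The T-05-03-02 mitigation rests on two small checks: deciding when a token is close enough to expiry to refresh, and reading the dict that MSAL hands back. The sketch below illustrates both under stated assumptions. `needs_refresh` and `access_token_from_msal_result` are hypothetical names, and the block raises directly where the summary says `_refresh_token()` returns `None` first, so treat it as a simplification. The 60-second buffer and the `result.get("error")` check come from the summary. The fallback reason is invented for illustration. Network-level failures may still surface as exceptions and are not covered here.

```python
import time
from typing import Optional

EXPIRY_BUFFER_SECONDS = 60  # buffer quoted in the summary


class CloudConnectionError(Exception):
    """Stand-in for the shared exception; only `reason` comes from the summary."""

    def __init__(self, reason: str) -> None:
        super().__init__(reason)
        self.reason = reason


def needs_refresh(expires_at: float, now: Optional[float] = None) -> bool:
    """Return True once the token is inside the buffer window before expiry."""
    current = time.time() if now is None else now
    return current >= expires_at - EXPIRY_BUFFER_SECONDS


def access_token_from_msal_result(result: dict) -> str:
    """Interpret a token-response dict of the kind MSAL returns."""
    if result.get("error") == "invalid_grant":
        raise CloudConnectionError(reason="invalid_grant")
    if "access_token" not in result:
        # Hypothetical catch-all; the summary does not name a reason for this case.
        raise CloudConnectionError(reason=result.get("error", "unknown_error"))
    return result["access_token"]
```

Passing `now` explicitly keeps the expiry check deterministic in tests, which fits the summary's note that `_ensure_valid_token` is tested for both expired and non-expired tokens.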
+
+No threat flags raised.
+
+## Next Phase Readiness
+
+- Both OAuth cloud backends are complete and importable. Plan 05-05 (`cloud.py` API layer) can import `GoogleDriveBackend`, `OneDriveBackend`, and `CloudConnectionError` directly.
+- The `get_storage_backend_for_document()` factory in `storage/__init__.py` (Plan 05-02) already has lazy imports for both backends; the `# type: ignore[import]` comments can be resolved once Plan 05-05 adds the actual cloud router.
+- 32 new tests in `test_cloud_backends.py` are all green.
+- Full suite: 262 passed / 43 xfailed / 1 pre-existing failure (`test_extract_docx` — python-docx not installed locally).
+
+## Self-Check: PASSED
+
+Files verified present:
+- `backend/storage/google_drive_backend.py`: FOUND
+- `backend/storage/onedrive_backend.py`: FOUND
+- `backend/tests/test_cloud_backends.py`: FOUND
+
+Commits verified:
+- 4efe7c1: test(05-03): add RED phase tests — FOUND
+- 337ee8e: feat(05-03): implement GoogleDriveBackend — FOUND
+- bcb887e: feat(05-03): implement OneDriveBackend — FOUND
+
+---
+*Phase: 05-cloud-storage-backends*
+*Completed: 2026-05-28*