docs(04): create phase 4 plan (9 plans, 7 waves)
Folders, Sharing, Quotas & Document UX: plans verified (0 blockers, 2 non-blocking warnings).
Covers FOLD-01..05, SHARE-01..05, SEC-08/09, ADMIN-06, DOC-01/02.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@@ -0,0 +1,216 @@
---
phase: 04-folders-sharing-quotas-document-ux
plan: 02
type: execute
wave: 1
depends_on: []
files_modified:
  - backend/migrations/versions/0004_phase4_pdf_open_mode_tsvector.py
  - backend/storage/minio_backend.py
autonomous: true
requirements:
  - FOLD-05
  - DOC-02
  - ADMIN-06

must_haves:
  truths:
    - "Alembic migration 0004 upgrades cleanly; downgrade reverses all DDL changes"
    - "users.pdf_open_mode column exists with server_default 'in_app'"
    - "GIN expression index ix_documents_fts on documents.extracted_text exists in PostgreSQL"
    - "MinIOBackend.put_object_raw() is callable for audit-logs bucket writes"
    - "audit-logs MinIO bucket is created when MINIO_ENDPOINT is set"
  artifacts:
    - path: "backend/migrations/versions/0004_phase4_pdf_open_mode_tsvector.py"
      provides: "Alembic migration: pdf_open_mode column, GIN FTS index, audit-logs bucket creation"
    - path: "backend/storage/minio_backend.py"
      provides: "put_object_raw(bucket, key, data, length, content_type) method added"
  key_links:
    - from: "backend/migrations/versions/0004_phase4_pdf_open_mode_tsvector.py"
      to: "backend/db/models.py"
      via: "Alembic op.execute GIN index must not collide with ORM model table_args"
      pattern: "ix_documents_fts"
    - from: "backend/tasks/audit_tasks.py"
      to: "backend/storage/minio_backend.py"
      via: "put_object_raw called by daily export task"
      pattern: "put_object_raw"
---

<objective>
Create Alembic migration 0004 and add MinIOBackend.put_object_raw(). This plan has no
dependency on the test scaffolds plan (04-01) and can run in parallel.

Purpose: Establish the database schema additions (pdf_open_mode, FTS index) and storage
method (put_object_raw for audit CSV upload) that later plans depend on.
Output: Migration file 0004 + extended MinIOBackend.
</objective>

<execution_context>
@$HOME/.claude/get-shit-done/workflows/execute-plan.md
@$HOME/.claude/get-shit-done/templates/summary.md
</execution_context>

<context>
@.planning/phases/04-folders-sharing-quotas-document-ux/04-CONTEXT.md
@.planning/phases/04-folders-sharing-quotas-document-ux/04-PATTERNS.md
@backend/migrations/versions/0003_multi_user_isolation.py
@backend/storage/minio_backend.py
@backend/db/models.py
</context>

<interfaces>
<!-- Key interfaces the executor needs. Extracted from codebase. -->
<!-- From backend/storage/minio_backend.py: existing put_object signature for reference: -->
<!--
async def put_object(
    self,
    file_bytes: bytes,
    content_type: str,
    user_id: str,
    document_id: str,
    ext: str,
) -> str:  # returns object_key
    ...
    await asyncio.to_thread(
        self._client.put_object,
        self._bucket,  # hardcoded documents bucket
        object_key,  # {user_id}/{document_id}/{uuid4()}{ext}
        io.BytesIO(file_bytes),
        length=len(file_bytes),
        content_type=content_type,
    )
    return object_key
-->
<!-- New put_object_raw signature (per D-17, PATTERNS.md):
async def put_object_raw(
    self,
    bucket: str,  # caller supplies bucket name (e.g., "audit-logs")
    key: str,  # caller supplies complete key (e.g., "audit-logs/2026-05-25.csv")
    data: io.BytesIO,
    length: int,
    content_type: str,
) -> None
-->
<!-- From backend/migrations/versions/0003_multi_user_isolation.py: batch_alter_table pattern:
with op.batch_alter_table("users") as batch_op:
    batch_op.add_column(sa.Column("col_name", sa.String(), nullable=False, server_default="value"))
-->
</interfaces>

<tasks>

<task type="auto">
<name>Task 1: Alembic migration 0004 (pdf_open_mode column + GIN FTS index + audit-logs bucket)</name>
<files>backend/migrations/versions/0004_phase4_pdf_open_mode_tsvector.py</files>
<read_first>
backend/migrations/versions/0003_multi_user_isolation.py: read the entire file; extract the batch_alter_table pattern for adding columns, the MinIO bucket creation pattern gated on MINIO_ENDPOINT env var, the module docstring format, and the downgrade() function pattern
backend/db/models.py: read lines 44-70 (User model) to confirm pdf_open_mode does NOT already exist before adding it
</read_first>
<action>
Create backend/migrations/versions/0004_phase4_pdf_open_mode_tsvector.py.

Module docstring (follow 0003 format) lists three changes in order: (1) users.pdf_open_mode column, (2) GIN expression index on documents.extracted_text, (3) audit-logs MinIO bucket creation.

Set: revision = "0004", down_revision = "0003", branch_labels = None, depends_on = None.

upgrade() function, three steps in this exact order:

Step 1: Add users.pdf_open_mode column using batch_alter_table (per D-10, for SQLite compat):
`with op.batch_alter_table("users") as batch_op: batch_op.add_column(sa.Column("pdf_open_mode", sa.String(), nullable=False, server_default="in_app"))`
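
Why Step 1 pairs `nullable=False` with `server_default` can be seen in a small standalone sketch. It uses the standard-library `sqlite3` module and a hypothetical two-column `users` table (the real table has more columns), and issues plain DDL that only approximates what the Alembic call emits:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical minimal users table; stands in for the real schema.
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT NOT NULL)")
conn.execute("INSERT INTO users (email) VALUES ('existing@example.com')")

# Without a default, a NOT NULL column cannot be added to a populated table.
try:
    conn.execute("ALTER TABLE users ADD COLUMN broken_col VARCHAR NOT NULL")
    rejected_without_default = False
except sqlite3.OperationalError:
    rejected_without_default = True

# With a server-side default, existing rows are backfilled with 'in_app'.
conn.execute(
    "ALTER TABLE users ADD COLUMN pdf_open_mode VARCHAR NOT NULL DEFAULT 'in_app'"
)
existing_mode = conn.execute(
    "SELECT pdf_open_mode FROM users WHERE id = 1"
).fetchone()[0]

# New rows that omit the column also receive the default.
conn.execute("INSERT INTO users (email) VALUES ('new@example.com')")
new_mode = conn.execute(
    "SELECT pdf_open_mode FROM users WHERE id = 2"
).fetchone()[0]

print(rejected_without_default, existing_mode, new_mode)
```

The same reasoning applies on PostgreSQL: the server default is what lets the migration run against a table that already contains users.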

Step 2: Create GIN expression index MANUALLY. Do NOT use Column Computed() or Index(); use op.execute with raw SQL (per D-11, Pitfall 2, PATTERNS.md). The exact SQL is:
`CREATE INDEX ix_documents_fts ON documents USING GIN (to_tsvector('english', coalesce(extracted_text, '')))`
Add a comment above this line: `# managed manually — do not autogenerate (Alembic issue #1390)`.
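
PostgreSQL only uses an expression index when a query repeats the indexed expression, including the explicit `'english'` configuration. The sketch below keeps the expression in one constant so the DDL and a search query cannot drift apart. The helper name `build_search_sql`, the selected `id` column, and the `:query` bind parameter are illustrative assumptions, not part of this plan:

```python
# Single source of truth for the indexed expression (copied from Step 2).
FTS_EXPRESSION = "to_tsvector('english', coalesce(extracted_text, ''))"

# DDL as it would be passed to op.execute() in the migration.
CREATE_INDEX_SQL = (
    f"CREATE INDEX ix_documents_fts ON documents USING GIN ({FTS_EXPRESSION})"
)


def build_search_sql() -> str:
    """Return a search query whose WHERE clause reuses the indexed expression.

    Hypothetical helper: the 'id' column and ':query' parameter are assumed.
    """
    return (
        "SELECT id FROM documents "
        f"WHERE {FTS_EXPRESSION} @@ plainto_tsquery('english', :query)"
    )


print(CREATE_INDEX_SQL)
print(build_search_sql())
```

A query written with a different expression, for example `to_tsvector(extracted_text)` without the configuration argument, would still return correct results but could not use `ix_documents_fts`.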

Step 3: Create the audit-logs MinIO bucket gated on MINIO_ENDPOINT env var (per Pitfall 8, PATTERNS.md). Follow the exact same guard pattern from migration 0003: `if os.environ.get("MINIO_ENDPOINT"):` then instantiate Minio client using env vars MINIO_ENDPOINT, MINIO_ACCESS_KEY, MINIO_SECRET_KEY with secure=False. Check `client.bucket_exists("audit-logs")` and call `client.make_bucket("audit-logs")` only if it does not exist.
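
The guard and the exists-then-create check in Step 3 can be sketched without a running MinIO server. The `FakeClient` class and the helpers `minio_configured` and `ensure_bucket` below are hypothetical stand-ins written for illustration. Only `bucket_exists` and `make_bucket` correspond to the MinIO SDK methods named above, and migration 0003 may structure this differently:

```python
import os
from typing import Mapping, Protocol


class BucketClient(Protocol):
    """The two client methods Step 3 relies on."""

    def bucket_exists(self, bucket_name: str) -> bool: ...
    def make_bucket(self, bucket_name: str) -> None: ...


def minio_configured(env: Mapping[str, str] = os.environ) -> bool:
    """Mirror of the `if os.environ.get("MINIO_ENDPOINT"):` gate."""
    return bool(env.get("MINIO_ENDPOINT"))


def ensure_bucket(client: BucketClient, bucket_name: str) -> bool:
    """Create the bucket only if it is missing; return True if it was created."""
    if client.bucket_exists(bucket_name):
        return False
    client.make_bucket(bucket_name)
    return True


class FakeClient:
    """In-memory stand-in for the SDK client (illustration only)."""

    def __init__(self) -> None:
        self.buckets: set[str] = set()

    def bucket_exists(self, bucket_name: str) -> bool:
        return bucket_name in self.buckets

    def make_bucket(self, bucket_name: str) -> None:
        self.buckets.add(bucket_name)


client = FakeClient()
created_first = ensure_bucket(client, "audit-logs")   # bucket missing: created
created_second = ensure_bucket(client, "audit-logs")  # already present: no-op
print(created_first, created_second)
```

Because the check makes the step idempotent, re-running the upgrade against an environment where the bucket already exists should not fail on the bucket step.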

downgrade() function: (1) `op.execute("DROP INDEX IF EXISTS ix_documents_fts")`, (2) `with op.batch_alter_table("users") as batch_op: batch_op.drop_column("pdf_open_mode")`. Add a comment: `# MinIO bucket NOT reversed — bucket may contain audit data`.

Imports needed at module top: `from __future__ import annotations`, `import os`, `import sqlalchemy as sa`, `from alembic import op`.
</action>
<verify>
<automated>cd /Users/nik/Documents/Progamming/document_scanner/backend && python -c "import importlib.util; spec = importlib.util.spec_from_file_location('m', 'migrations/versions/0004_phase4_pdf_open_mode_tsvector.py'); m = importlib.util.module_from_spec(spec); spec.loader.exec_module(m); print('OK')"</automated>
</verify>
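
The verify step loads the migration by file path because a module whose name starts with a digit cannot be named in an `import` statement. A minimal sketch of that technique follows, using a throwaway file in a temporary directory rather than the real migration. The helper name `load_module_from_path` and the demo file name are assumptions for illustration:

```python
import importlib.util
import tempfile
from pathlib import Path


def load_module_from_path(path: Path, name: str = "migration_under_test"):
    """Execute a Python file as a module regardless of its file name."""
    spec = importlib.util.spec_from_file_location(name, path)
    if spec is None or spec.loader is None:
        raise ImportError(f"cannot load {path}")
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


# An import statement naming such a module does not even parse.
try:
    compile("import 0004_demo_migration", "<check>", "exec")
    plain_import_parses = True
except SyntaxError:
    plain_import_parses = False

with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "0004_demo_migration.py"
    demo.write_text('revision = "0004"\ndown_revision = "0003"\n')
    module = load_module_from_path(demo)
    loaded_revision = module.revision
    loaded_down_revision = module.down_revision

print(plain_import_parses, loaded_revision, loaded_down_revision)
```

Loading by path executes the module body, so it catches import errors as well as syntax errors, which a compile-only check would miss.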
<acceptance_criteria>
- File exists at backend/migrations/versions/0004_phase4_pdf_open_mode_tsvector.py
- `revision = "0004"` and `down_revision = "0003"` present in file
- File contains `batch_alter_table("users")` with pdf_open_mode column, server_default="in_app"
- File contains `CREATE INDEX ix_documents_fts ON documents USING GIN` (grep confirms)
- File contains `# managed manually — do not autogenerate` comment
- File contains `if os.environ.get("MINIO_ENDPOINT")` MinIO gate for audit-logs bucket
- File contains `audit-logs` bucket name (grep confirms)
- File contains `DROP INDEX IF EXISTS ix_documents_fts` in downgrade()
- `python -c "import py_compile; py_compile.compile('migrations/versions/0004_phase4_pdf_open_mode_tsvector.py', doraise=True)"` exits 0 (no syntax errors; without `doraise=True`, py_compile only prints the error and still exits 0)
</acceptance_criteria>
<done>Migration file is syntactically valid and contains all three upgrade steps + downgrade reversal.</done>
</task>

<task type="auto">
<name>Task 2: Add put_object_raw() to MinIOBackend</name>
<files>backend/storage/minio_backend.py</files>
<read_first>
backend/storage/minio_backend.py: read the entire file; identify the existing put_object() method signature and asyncio.to_thread() call pattern; identify the class definition and existing imports including io import
backend/storage/base.py: read to see if StorageBackend ABC needs a corresponding abstract method added (it does NOT: put_object_raw is concrete only on MinIOBackend, not on the ABC, because only MinIOBackend has an audit-logs bucket)
</read_first>
<action>
Add the put_object_raw() async method to the MinIOBackend class in backend/storage/minio_backend.py.

Place it immediately after the existing put_object() method.

Method signature: `async def put_object_raw(self, bucket: str, key: str, data: io.BytesIO, length: int, content_type: str) -> None:`

Docstring: "Upload bytes to an arbitrary bucket+key (used for audit-logs CSV export). Unlike put_object(), does NOT apply the document key schema — the caller supplies the complete key. The main documents bucket is NOT used."

Implementation: call `await asyncio.to_thread(self._client.put_object, bucket, key, data, length=length, content_type=content_type)`. This mirrors the exact pattern of the existing put_object() method (which uses self._client.put_object via asyncio.to_thread).

Do NOT add io import if it already exists. Do NOT add this method to StorageBackend ABC (base.py); this is MinIOBackend-only.
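
As a rough illustration of the method described above, the sketch below wires the same `asyncio.to_thread` call into a stripped-down class and exercises it against a recording stub instead of a real MinIO client. `MinIOBackendSketch`, `RecordingClient`, and the CSV header are hypothetical. The real class has more state and methods, and `asyncio.to_thread` requires Python 3.9 or later:

```python
import asyncio
import io


class RecordingClient:
    """Stand-in for the MinIO SDK client: records calls instead of uploading."""

    def __init__(self) -> None:
        self.calls = []

    def put_object(self, bucket_name, object_name, data, length,
                   content_type="application/octet-stream"):
        self.calls.append(
            (bucket_name, object_name, data.read(), length, content_type)
        )


class MinIOBackendSketch:
    """Reduced stand-in for MinIOBackend; only the new method is shown."""

    def __init__(self, client) -> None:
        self._client = client

    async def put_object_raw(
        self,
        bucket: str,
        key: str,
        data: io.BytesIO,
        length: int,
        content_type: str,
    ) -> None:
        """Upload bytes to a caller-supplied bucket and key, unmodified."""
        # The SDK call is blocking, so it runs in a worker thread.
        await asyncio.to_thread(
            self._client.put_object,
            bucket,
            key,
            data,
            length=length,
            content_type=content_type,
        )


payload = b"timestamp,user_id,action\n"  # hypothetical CSV header
client = RecordingClient()
backend = MinIOBackendSketch(client)
asyncio.run(
    backend.put_object_raw(
        "audit-logs",
        "audit-logs/2026-05-25.csv",
        io.BytesIO(payload),
        len(payload),
        "text/csv",
    )
)
print(client.calls)
```

Unlike `put_object()`, nothing here derives the key from a user or document ID, so the caller is responsible for passing a bucket and key that make sense together.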
</action>
<verify>
<automated>cd /Users/nik/Documents/Progamming/document_scanner/backend && python -c "from storage.minio_backend import MinIOBackend; import inspect; src = inspect.getsource(MinIOBackend.put_object_raw); print('OK')"</automated>
</verify>
<acceptance_criteria>
- MinIOBackend.put_object_raw exists as an async method (grep: `async def put_object_raw`)
- Method accepts: bucket: str, key: str, data: io.BytesIO, length: int, content_type: str
- Method body calls `asyncio.to_thread(self._client.put_object, bucket, key, data, length=length, content_type=content_type)`
- `python -c "from storage.minio_backend import MinIOBackend"` exits 0 (no import errors)
- StorageBackend ABC (base.py) is NOT modified (grep: `put_object_raw` absent from base.py)
</acceptance_criteria>
<done>MinIOBackend.put_object_raw() is importable and callable; base.py unchanged.</done>
</task>

</tasks>

<threat_model>
## Trust Boundaries

| Boundary | Description |
|----------|-------------|
| Migration → MinIO | Bucket creation uses env var credentials; no credentials in code |
| put_object_raw → MinIO SDK | Caller supplies bucket name; method does not validate bucket against allowlist |

## STRIDE Threat Register

| Threat ID | Category | Component | Disposition | Mitigation Plan |
|-----------|----------|-----------|-------------|-----------------|
| T-04-02-01 | Tampering | migration 0004 GIN index | mitigate | Index created via raw SQL (not autogenerate) to prevent Alembic repeat-generation bug; comment documents the decision |
| T-04-02-02 | Information Disclosure | audit-logs MinIO bucket | mitigate | Bucket creation gated on MINIO_ENDPOINT env var (no hardcoded credentials); bucket policy is private-by-default (MinIO default) |
| T-04-02-03 | Tampering | put_object_raw caller-supplied key | accept | put_object_raw is called only from trusted application code (Celery task); key is constructed from application logic, not user input |
| T-04-SC | Tampering | npm/pip/cargo installs | accept | No new packages installed in this plan |
</threat_model>

<verification>
1. Verify migration syntax: `python -c "import py_compile; py_compile.compile('backend/migrations/versions/0004_phase4_pdf_open_mode_tsvector.py', doraise=True)"`
2. Verify MinIOBackend import: `cd backend && python -c "from storage.minio_backend import MinIOBackend; print(hasattr(MinIOBackend, 'put_object_raw'))"`
3. Grep checks: `grep -n "ix_documents_fts\|audit-logs\|pdf_open_mode\|put_object_raw" backend/migrations/versions/0004_phase4_pdf_open_mode_tsvector.py backend/storage/minio_backend.py`
</verification>

<success_criteria>
- Migration 0004 file is syntactically valid Python with revision="0004" and down_revision="0003"
- Migration contains all three upgrade steps: pdf_open_mode column, GIN FTS index, audit-logs bucket creation
- MinIOBackend.put_object_raw() is an async method that delegates to asyncio.to_thread
- No existing tests regress: `cd backend && pytest -v --no-header 2>&1 | grep -E "FAILED|ERROR"` returns nothing
</success_criteria>

<output>
Create `.planning/phases/04-folders-sharing-quotas-document-ux/04-02-SUMMARY.md` when done.
</output>