fix: capitalize watch-folder names to PascalCase-with-dashes on ingest

Folder names like "invoices" and "vendor-invoices" are now converted to
"Invoices" and "Vendor-Invoices" when the watcher auto-creates categories,
matching the naming convention enforced on user-created categories.
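
A minimal sketch of the conversion described above, assuming the same split-and-capitalize expression that appears in the diff. The helper name `normalize_folder_name` is hypothetical and used only for illustration; the commit inlines the expression in `ingest_file`.

```python
import re


def normalize_folder_name(name: str) -> str:
    """Hypothetical helper: convert a folder name to PascalCase-with-dashes.

    Splits on runs of dashes, underscores, or whitespace, capitalizes each
    piece, and rejoins the pieces with single dashes.
    """
    pieces = re.split(r"[-_\s]+", name)
    # Empty pieces (from leading/trailing separators) are dropped.
    return "-".join(p.capitalize() for p in pieces if p)


print(normalize_folder_name("invoices"))         # Invoices
print(normalize_folder_name("vendor-invoices"))  # Vendor-Invoices
print(normalize_folder_name("vendor_invoices"))  # Vendor-Invoices
```

Note that `str.capitalize()` lowercases everything after the first character, so a mixed-case folder such as "vendorInvoices" would become "Vendorinvoices" under this sketch.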

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
curo1305
2026-04-18 22:26:24 +02:00
parent ebf97b6f4a
commit 1c8b35399c
@@ -11,7 +11,8 @@ Key design decisions:
 - Watch documents use user_id="watch" as a sentinel so they are visible to
   all authenticated users in the document list.
 - Subfolder names map to categories: a file at invoices/bill.pdf is assigned
-  to a "invoices" category (auto-created if needed).
+  to an "Invoices" category (auto-created if needed; folder name is converted
+  to PascalCase-with-dashes: "vendor-invoices" → "Vendor-Invoices").
 - Suggestions: if ai_folder_suggestion or ai_rename_suggestion are enabled,
   the relevant fields are set on the document after AI processing so users
   can confirm/reject from the UI.
@@ -21,6 +22,7 @@ Key design decisions:
 import asyncio
 import json
 import logging
+import re
 import uuid
 from pathlib import Path
@@ -66,7 +68,10 @@ async def ingest_file(path_str: str, watch_root: Path, config: dict) -> None:
     # Determine category from the first subfolder component
     try:
         rel = path.relative_to(watch_root)
-        folder_name = rel.parts[0] if len(rel.parts) > 1 else None
+        folder_name = (
+            "-".join(p.capitalize() for p in re.split(r"[-_\s]+", rel.parts[0]) if p)
+            if len(rel.parts) > 1 else None
+        )
     except ValueError:
         folder_name = None
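
The category lookup in this hunk can be read as the standalone sketch below. It is an illustration under assumptions, not the project's code: the function name `category_for` is hypothetical, and `PurePosixPath` stands in for the `Path` objects used in `ingest_file` so the example behaves the same on any platform.

```python
import re
from pathlib import PurePosixPath
from typing import Optional


def category_for(path: PurePosixPath, watch_root: PurePosixPath) -> Optional[str]:
    """Hypothetical helper: derive a category name from the first subfolder."""
    try:
        rel = path.relative_to(watch_root)
    except ValueError:
        # The file is not located under the watch root.
        return None
    if len(rel.parts) <= 1:
        # The file sits directly in the watch root, so there is no subfolder.
        return None
    # Only the first path component determines the category.
    return "-".join(p.capitalize() for p in re.split(r"[-_\s]+", rel.parts[0]) if p)


root = PurePosixPath("/watch")
print(category_for(PurePosixPath("/watch/vendor-invoices/bill.pdf"), root))
print(category_for(PurePosixPath("/watch/bill.pdf"), root))
```

Under these assumptions the first call prints Vendor-Invoices and the second prints None, because a file placed directly in the watch root has no subfolder to map to a category.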