1cdc532fff
- pytest suite for doc-service: 20+ tests covering category CRUD, document upload/get/delete/patch, ownership isolation, category assignment, AI processing (mock), and live PDF tests (auto-skipped when tests/pdfs/ is empty)
- Minimal in-memory PDF builder in conftest so tests run without any fixture files; real PDFs can be dropped into tests/pdfs/ to activate live extraction tests
- AI prompt updated to return suggested_categories (2–5 short names)
- Frontend: SuggestionChip component in DocumentRow shows AI-suggested categories after processing; "Assign" links to an existing category, "Create & Assign" creates it first, ✕ dismisses locally
- Default AI provider changed to LM Studio at http://host.docker.internal:1234/v1 (host.docker.internal resolves to the macOS host from inside Docker Desktop)
- tests/pdfs/ directory tracked via .gitkeep; *.pdf excluded by .gitignore

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
33 lines
1.2 KiB
Python
from abc import ABC, abstractmethod

SYSTEM_PROMPT = (
    "You are a financial document analysis assistant. "
    "Given the text extracted from a PDF document, return ONLY a JSON object "
    "with no markdown, no code fences, and no explanation."
)

USER_PROMPT_TEMPLATE = """Analyze the following document text and return a JSON object with exactly these keys:
document_type (one of: invoice, bill, receipt, order, expense, revenue, unknown),
total_amount (string or null),
currency (string or null),
vendor_name (string or null),
customer_name (string or null),
billing_address (string or null),
customer_address (string or null),
invoice_number (string or null),
invoice_date (string or null),
due_date (string or null),
tags (array of short keyword strings describing the document),
line_items (array of objects, each with keys: description, amount),
suggested_categories (array of 2 to 5 short category name strings a user might want to file this document under, e.g. "Utilities", "Travel", "Software Subscriptions", "Client Invoices").

Document text:
{text}"""


class AIProvider(ABC):
    @abstractmethod
    async def classify_document(self, text: str) -> dict:
        """Return structured extraction dict from document text."""
        ...
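
The prompt above asks the model for a JSON object with a fixed set of keys, but the file does not show how a reply is validated. The following is a minimal sketch of one way a concrete provider's `classify_document` might normalize the raw reply before returning it. The helper name `parse_extraction` and the constants `EXPECTED_KEYS`, `LIST_KEYS`, and `DOCUMENT_TYPES` are hypothetical and not part of this repository; the fence-stripping step assumes that some models wrap JSON in code fences despite the system prompt.

```python
import json

# Keys requested by USER_PROMPT_TEMPLATE (names copied from the prompt text).
EXPECTED_KEYS = (
    "document_type", "total_amount", "currency", "vendor_name",
    "customer_name", "billing_address", "customer_address",
    "invoice_number", "invoice_date", "due_date",
    "tags", "line_items", "suggested_categories",
)
# Keys the prompt describes as arrays; a missing or malformed value becomes [].
LIST_KEYS = {"tags", "line_items", "suggested_categories"}
DOCUMENT_TYPES = {
    "invoice", "bill", "receipt", "order", "expense", "revenue", "unknown",
}
FENCE = "`" * 3  # Markdown code-fence marker


def parse_extraction(raw: str) -> dict:
    """Hypothetical helper: turn a raw model reply into a normalized dict."""
    text = raw.strip()
    # Assumption: the reply may be fenced even though the prompt forbids it.
    if text.startswith(FENCE):
        text = text.strip("`").strip()
        if text.lower().startswith("json"):
            text = text[4:]
    data = json.loads(text)  # raises json.JSONDecodeError on invalid JSON
    if not isinstance(data, dict):
        raise ValueError("model reply is not a JSON object")

    result = {}
    for key in EXPECTED_KEYS:
        value = data.get(key)
        if key in LIST_KEYS:
            result[key] = value if isinstance(value, list) else []
        else:
            result[key] = value  # scalar keys stay None when absent

    # Fall back to "unknown" for any type outside the prompt's list.
    doc_type = result["document_type"]
    if not isinstance(doc_type, str) or doc_type not in DOCUMENT_TYPES:
        result["document_type"] = "unknown"

    # The prompt asks for 2 to 5 category names; keep at most 5 strings.
    result["suggested_categories"] = [
        c for c in result["suggested_categories"] if isinstance(c, str)
    ][:5]
    return result
```

Under these assumptions, a provider would send `SYSTEM_PROMPT` and `USER_PROMPT_TEMPLATE.format(text=...)` to its model and pass the reply string through this helper, so callers always receive the same keys whether or not the model omitted some.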