Commit Graph

1 Commit

Author SHA1 Message Date
Christian
c6d310e96d feat: analyze PDF attachments for invoice extraction v2.2.18
- email_analysis_service: extract PDF text from attachments as PRIMARY source
  - _build_invoice_extraction_context: reads PDF bytes (in-memory or DB)
  - _extract_pdf_texts_from_attachments: pdfplumber on in-memory bytes
  - _get_attachment_texts_from_db: fallback to content_data/file_path
  - _build_extraction_prompt: comprehensive schema (vendor, CVR, lines, dates)
  - model request: num_predict 300→3000, timeout 30s→120s, format=json
- email_processor_service: _update_extracted_fields saves vendor_name, CVR, invoice_date
- migration 140: extracted_vendor_name, extracted_vendor_cvr, extracted_invoice_date columns
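
A minimal sketch of the attachment flow described in the bullets above, not the committed code. Assumptions: an attachment is a plain dict, the key "content" (hypothetical) holds bytes still in memory, and the request is an Ollama-style payload. Only pdfplumber, content_data, file_path, num_predict, the 120s timeout and format=json come from the commit message.

```python
import io
from pathlib import Path
from typing import Optional

# Timeout named in the commit (30s -> 120s); the constant name is hypothetical.
REQUEST_TIMEOUT_SECONDS = 120


def resolve_pdf_bytes(attachment: dict) -> Optional[bytes]:
    """Return PDF bytes: in-memory first, then DB blob, then file on disk."""
    # "content" is a hypothetical key for bytes that are still in memory.
    if attachment.get("content"):
        return bytes(attachment["content"])
    # content_data / file_path are the fallbacks named in the commit message.
    if attachment.get("content_data"):
        return bytes(attachment["content_data"])
    path = attachment.get("file_path")
    if path and Path(path).is_file():
        return Path(path).read_bytes()
    return None


def extract_pdf_text(pdf_bytes: bytes) -> str:
    """Extract text from in-memory PDF bytes with pdfplumber."""
    import pdfplumber  # third-party; imported lazily so the rest runs without it

    with pdfplumber.open(io.BytesIO(pdf_bytes)) as pdf:
        # extract_text() can return None for pages without a text layer.
        return "\n".join(page.extract_text() or "" for page in pdf.pages)


def build_request(prompt: str) -> dict:
    """Build an assumed Ollama-style payload with the limits from the commit."""
    return {
        "prompt": prompt,
        "format": "json",
        "stream": False,
        "options": {"num_predict": 3000},
    }
```

The larger num_predict matters because a JSON answer with invoice lines is easily truncated at 300 tokens, and truncated JSON does not parse.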

The email sender (often a forwarder or external bookkeeper) is no longer used for vendor detection.
Vendor, amounts, and line items are taken from the attached invoice PDF itself.
2026-03-02 00:17:41 +01:00
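
The vendor rule in the commit message (the PDF decides, the sender does not) could be sketched as follows. This is an illustration under assumptions, not the code in email_processor_service: the JSON keys vendor_name, vendor_cvr and invoice_date and the function name are hypothetical, and only the three extracted_* column names come from the commit. The sender address is deliberately not an input.

```python
import json
import re
from datetime import date
from typing import Any, Dict


def fields_from_extraction(raw_json: str) -> Dict[str, Any]:
    """Map the model's JSON answer to the three columns from migration 140."""
    data = json.loads(raw_json)

    # A Danish CVR number is 8 digits; drop prefixes such as "DK" and spaces.
    cvr = re.sub(r"\D", "", str(data.get("vendor_cvr") or ""))

    # Accept only ISO dates (YYYY-MM-DD); anything else is stored as NULL.
    try:
        invoice_date = date.fromisoformat(str(data.get("invoice_date")))
    except ValueError:
        invoice_date = None

    return {
        "extracted_vendor_name": (data.get("vendor_name") or "").strip() or None,
        "extracted_vendor_cvr": cvr if len(cvr) == 8 else None,
        "extracted_invoice_date": invoice_date,
    }
```

Returning None for anything that fails validation keeps a bad model answer from overwriting the columns with a guess.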