Christian
|
c6d310e96d
|
feat: analyze PDF attachments for invoice extraction v2.2.18
- email_analysis_service: extract PDF text from attachments as PRIMARY source
  - _build_invoice_extraction_context: reads PDF bytes (in-memory or DB)
  - _extract_pdf_texts_from_attachments: pdfplumber on in-memory bytes
  - _get_attachment_texts_from_db: fallback to content_data/file_path
  - _build_extraction_prompt: comprehensive schema (vendor, CVR, lines, dates)
  - num_predict 300→3000, timeout 30s→120s, format=json
- email_processor_service: _update_extracted_fields saves vendor_name, CVR, invoice_date
- migration 140: extracted_vendor_name, extracted_vendor_cvr, extracted_invoice_date columns
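
The generation settings in the list above (num_predict, timeout, format=json) could be assembled roughly as follows. This is a minimal sketch, not the code from this commit: the option names match Ollama's generate API, but the commit does not name the backend, and `build_generate_request`, the constant names, and the model name are hypothetical.

```python
import json

# Values taken from the commit message; the names are illustrative only.
NUM_PREDICT = 3000      # was 300
TIMEOUT_SECONDS = 120   # was 30; applied by the HTTP client, not sent in the body


def build_generate_request(prompt: str, model: str = "example-model") -> dict:
    """Build an Ollama-style /api/generate request body (assumed backend)."""
    return {
        "model": model,       # hypothetical placeholder, not from the commit
        "prompt": prompt,
        "format": "json",     # ask the model for JSON-only output
        "stream": False,      # one complete response instead of chunks
        "options": {"num_predict": NUM_PREDICT},
    }


if __name__ == "__main__":
    body = build_generate_request("Extract the invoice fields as JSON.")
    print(json.dumps(body, indent=2))
```

The timeout is kept outside the body because it would normally be passed to whatever HTTP client sends the request.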
Sender (forwarder/external bookkeeper) is now ignored for vendor detection.
The actual invoice PDF determines vendor/amounts/lines.
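
A minimal sketch of the PDF-first flow described above: in-memory bytes are tried first, then a DB fallback. The function names and dict keys used here (`pdf_bytes_to_text`, `collect_pdf_texts`, `load_from_db`, `filename`, `content`) are hypothetical and not the service's real signatures; only `pdfplumber.open` and `page.extract_text` are actual pdfplumber calls.

```python
import io
from typing import Callable, Iterable, List, Optional


def pdf_bytes_to_text(data: bytes) -> str:
    """Extract text from PDF bytes with pdfplumber (third-party package)."""
    import pdfplumber  # imported lazily so the rest of the sketch runs without it

    with pdfplumber.open(io.BytesIO(data)) as pdf:
        # extract_text() can return None for pages without a text layer
        pages = [(page.extract_text() or "") for page in pdf.pages]
    return "\n".join(pages).strip()


def collect_pdf_texts(
    attachments: Iterable[dict],
    load_from_db: Callable[[dict], Optional[bytes]],
    extract: Callable[[bytes], str] = pdf_bytes_to_text,
) -> List[str]:
    """Return one text per PDF attachment: in-memory bytes first, DB fallback second."""
    texts = []
    for att in attachments:
        if not att.get("filename", "").lower().endswith(".pdf"):
            continue  # non-PDF attachments are skipped
        data = att.get("content") or load_from_db(att)
        if not data:
            continue  # no bytes available from either source
        text = extract(data)
        if text:
            texts.append(text)
    return texts
```

The extractor is injectable so the fallback logic can be exercised without real PDF files; note that the sender address plays no part in this step.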
|
2026-03-02 00:17:41 +01:00 |
|