• v2.2.18 c6d310e96d

    feat: analyze PDF attachments for invoice extraction v2.2.18

    Ghost released this 2026-03-02 00:17:41 +01:00 | 54 commits to main since this release

    • email_analysis_service: extract PDF text from attachments as PRIMARY source
      • _build_invoice_extraction_context: reads PDF bytes (in-memory or DB)
      • _extract_pdf_texts_from_attachments: pdfplumber on in-memory bytes
      • _get_attachment_texts_from_db: fallback to content_data/file_path
      • _build_extraction_prompt: comprehensive schema (vendor, CVR, lines, dates)
      • LLM request: num_predict raised 300→3000, timeout raised 30s→120s, format=json
    • email_processor_service: _update_extracted_fields saves vendor_name, CVR, invoice_date
    • migration 140: extracted_vendor_name, extracted_vendor_cvr, extracted_invoice_date columns
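
A minimal sketch of the in-memory PDF text extraction step described above, not the project's actual code. The function names, the attachment dict keys (`filename`, `content`), and the injectable `to_text` parameter are assumptions made for illustration; only the use of pdfplumber on in-memory bytes comes from the notes.

```python
import io
from typing import Callable, Iterable, List, Tuple


def pdf_bytes_to_text(data: bytes) -> str:
    """Extract text from PDF bytes with pdfplumber (imported lazily)."""
    import pdfplumber  # third-party dependency, only needed for real PDFs

    with pdfplumber.open(io.BytesIO(data)) as pdf:
        # extract_text() can yield nothing for image-only pages
        return "\n".join((page.extract_text() or "") for page in pdf.pages)


def extract_pdf_texts(
    attachments: Iterable[dict],
    to_text: Callable[[bytes], str] = pdf_bytes_to_text,
) -> List[Tuple[str, str]]:
    """Return (filename, text) pairs for readable PDF attachments.

    The dict keys "filename" and "content" are hypothetical; adapt them
    to the real attachment model.
    """
    results = []
    for att in attachments:
        name = att.get("filename") or ""
        data = att.get("content")
        # Skip non-PDF files and attachments without in-memory bytes
        if not data or not name.lower().endswith(".pdf"):
            continue
        try:
            text = to_text(data).strip()
        except Exception:
            # A corrupt PDF should not abort analysis of the whole email
            continue
        if text:
            results.append((name, text))
    return results
```

Attachments skipped here for lack of in-memory bytes are where a database fallback such as `_get_attachment_texts_from_db` would take over.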

    The email sender (e.g. a forwarder or external bookkeeper) is now ignored for vendor detection.
    The vendor, amounts, and line items are taken from the actual invoice PDF.
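
One way to read this change, as a hedged sketch: the extraction context puts the PDF text first and leaves the sender address out entirely, and the request uses the generation settings listed above. The function names, section labels, and model name are hypothetical. The payload shape follows Ollama's generate API, where `num_predict` and `format` are real options; that the service talks to Ollama is an assumption, not something the notes state.

```python
from typing import List, Tuple

# Value taken from the release notes (timeout 30 -> 120 s)
REQUEST_TIMEOUT_SECONDS = 120


def build_extraction_context(
    subject: str, body: str, pdf_texts: List[Tuple[str, str]]
) -> str:
    """Assemble prompt context with PDF text as the primary source.

    The sender address is deliberately not a parameter, so a forwarder
    or bookkeeper cannot be mistaken for the vendor.
    """
    parts = [
        f"=== PDF ATTACHMENT (primary): {name} ===\n{text}"
        for name, text in pdf_texts
    ]
    parts.append(f"=== EMAIL SUBJECT (secondary) ===\n{subject}")
    parts.append(f"=== EMAIL BODY (secondary) ===\n{body}")
    return "\n\n".join(parts)


def build_request(prompt: str, model: str = "example-model") -> dict:
    """Build an Ollama-style generate payload (assumed backend)."""
    return {
        "model": model,  # hypothetical model name
        "prompt": prompt,
        "format": "json",  # ask for a JSON-only response
        "stream": False,
        "options": {"num_predict": 3000},  # raised from 300
    }
```

Such a payload would be posted with the 120 s timeout, for example `requests.post(url, json=payload, timeout=REQUEST_TIMEOUT_SECONDS)`, where `url` is whatever endpoint the deployment uses.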
