Email Activity Logging System

Overview

A complete audit trail that logs everything that happens to emails in BMC Hub. Every action, change, and event is logged automatically with timestamps, metadata, and context.

🎯 What Gets Logged?

System Events

  • fetched: Email fetched from the mail server
  • classified: Email classified by the AI/keyword system
  • workflow_executed: Workflow run on the email
  • rule_matched: Email rule matched
  • status_changed: Email status changed
  • error: Error occurred during processing

User Events

  • read: Email read by a user
  • attachment_downloaded: Attachment downloaded
  • attachment_uploaded: Attachment uploaded

Integration Events

  • linked: Email linked to vendor/customer/case
  • invoice_extracted: Invoice data extracted from PDF
  • ticket_created: Support ticket created
  • notification_sent: Notification sent
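
The grouping above can be captured as a small lookup table, which may be handy for validating event names before they are logged. This is a minimal sketch, not code from the project: `EVENT_GROUPS` and `group_for` are hypothetical names, and the groups simply mirror the three lists above.

```python
from typing import Optional

# Hypothetical lookup table mirroring the three documented lists.
EVENT_GROUPS = {
    "system": ["fetched", "classified", "workflow_executed",
               "rule_matched", "status_changed", "error"],
    "user": ["read", "attachment_downloaded", "attachment_uploaded"],
    "integration": ["linked", "invoice_extracted",
                    "ticket_created", "notification_sent"],
}


def group_for(event_type: str) -> Optional[str]:
    """Return the documented group of an event type, or None if it is unknown."""
    for group, event_types in EVENT_GROUPS.items():
        if event_type in event_types:
            return group
    return None
```

These groups are documentation headings only. The stored `event_category` can be more specific; the statistics example in this document reports `workflow_executed` under the `workflow` category.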

📊 Database Schema

email_activity_log Table

CREATE TABLE email_activity_log (
    id SERIAL PRIMARY KEY,
    email_id INTEGER NOT NULL,              -- Which email
    event_type VARCHAR(50) NOT NULL,        -- What happened
    event_category VARCHAR(30) NOT NULL,    -- Category (system/user/workflow/etc.)
    description TEXT NOT NULL,              -- Human-readable description
    metadata JSONB,                         -- Extra data as JSON
    user_id INTEGER,                        -- User, if user-triggered
    created_at TIMESTAMP,                   -- When
    created_by VARCHAR(255)                 -- Who/what
);

email_timeline View

Pre-built view with joins to users and email_messages:

SELECT * FROM email_timeline WHERE email_id = 123;

🔧 How Is It Used?

In Python Code

from app.services.email_activity_logger import email_activity_logger

# Log email fetch
await email_activity_logger.log_fetched(
    email_id=123,
    source='imap',
    message_id='msg-abc-123'
)

# Log classification
await email_activity_logger.log_classified(
    email_id=123,
    classification='invoice',
    confidence=0.85,
    method='ai'
)

# Log workflow execution
await email_activity_logger.log_workflow_executed(
    email_id=123,
    workflow_id=5,
    workflow_name='Invoice Processing',
    status='completed',
    steps_completed=3,
    execution_time_ms=1250
)

# Log status change
await email_activity_logger.log_status_changed(
    email_id=123,
    old_status='active',
    new_status='processed',
    reason='workflow completed'
)

# Log entity linking
await email_activity_logger.log_linked(
    email_id=123,
    entity_type='vendor',
    entity_id=42,
    entity_name='Acme Corp'
)

# Log invoice extraction
await email_activity_logger.log_invoice_extracted(
    email_id=123,
    invoice_number='INV-2025-001',
    amount=1234.56,
    success=True
)

# Log error
await email_activity_logger.log_error(
    email_id=123,
    error_type='extraction_failed',
    error_message='PDF corrupted',
    context={'file': 'invoice.pdf', 'size': 0}
)

# Generic log (for custom events)
await email_activity_logger.log(
    email_id=123,
    event_type='custom_event',
    category='integration',
    description='Custom event happened',
    metadata={'key': 'value'}
)
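
To make the shape of these calls concrete, here is a minimal in-memory stand-in for the logger. It is a sketch under assumptions: the class name `InMemoryActivityLogger` is hypothetical, only two methods are shown, and the real service presumably writes to `email_activity_log` rather than a list. The description format follows the API response example in this document.

```python
import asyncio
from datetime import datetime, timezone
from typing import Any, Dict, List, Optional


class InMemoryActivityLogger:
    """Hypothetical stand-in: keeps events in a list instead of the database."""

    def __init__(self) -> None:
        self.events: List[Dict[str, Any]] = []

    async def log(self, email_id: int, event_type: str, category: str,
                  description: str, metadata: Optional[Dict[str, Any]] = None,
                  user_id: Optional[int] = None,
                  created_by: str = "system") -> Dict[str, Any]:
        """Generic entry point; the typed helper below delegates to it."""
        event = {
            "id": len(self.events) + 1,
            "email_id": email_id,
            "event_type": event_type,
            "event_category": category,
            "description": description,
            "metadata": metadata or {},
            "user_id": user_id,
            "created_at": datetime.now(timezone.utc).isoformat(),
            "created_by": created_by,
        }
        self.events.append(event)
        return event

    async def log_classified(self, email_id: int, classification: str,
                             confidence: float, method: str) -> Dict[str, Any]:
        """Build the description and metadata for a classification event."""
        return await self.log(
            email_id=email_id,
            event_type="classified",
            category="system",
            description=f"Classified as {classification} (confidence: {confidence:.0%})",
            metadata={"classification": classification,
                      "confidence": confidence, "method": method},
        )
```

A stand-in like this can be useful in unit tests, for example `asyncio.run(InMemoryActivityLogger().log_classified(123, 'invoice', 0.85, 'ai'))`, where hitting the database is not desirable.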

Via SQL

-- Log event directly via function
SELECT log_email_event(
    123,                            -- email_id
    'custom_event',                 -- event_type
    'system',                       -- event_category
    'Something happened',           -- description
    '{"foo": "bar"}'::jsonb,       -- metadata (optional)
    NULL,                           -- user_id (optional)
    'system'                        -- created_by
);

-- Query logs for specific email
SELECT * FROM email_activity_log 
WHERE email_id = 123 
ORDER BY created_at DESC;

-- Use the view for nicer output
SELECT * FROM email_timeline 
WHERE email_id = 123;
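
When `log_email_event` is called from application code, the arguments are best passed as bound parameters rather than interpolated into the SQL string. The helper below is a minimal sketch: `build_log_event_call` is a hypothetical name, the `%s` placeholders assume a psycopg-style driver, and the argument order is taken from the SQL example above.

```python
import json
from typing import Any, Dict, Optional, Tuple

# Seven positional arguments, in the order shown in the SQL example.
LOG_EVENT_SQL = "SELECT log_email_event(%s, %s, %s, %s, %s::jsonb, %s, %s)"


def build_log_event_call(email_id: int, event_type: str, event_category: str,
                         description: str,
                         metadata: Optional[Dict[str, Any]] = None,
                         user_id: Optional[int] = None,
                         created_by: str = "system") -> Tuple[str, tuple]:
    """Return the SQL text and a parameter tuple for a driver's execute()."""
    # Serialize metadata to a JSON string; None is sent as SQL NULL.
    metadata_json = json.dumps(metadata) if metadata is not None else None
    params = (email_id, event_type, event_category, description,
              metadata_json, user_id, created_by)
    return LOG_EVENT_SQL, params
```

With a psycopg-style cursor, the result could be used as `cursor.execute(*build_log_event_call(123, 'custom_event', 'system', 'Something happened', {'foo': 'bar'}))`.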

Via API

GET /api/v1/emails/123/activity

Response:

[
  {
    "id": 1,
    "email_id": 123,
    "event_type": "fetched",
    "event_category": "system",
    "description": "Email fetched from email server",
    "metadata": {
      "source": "imap",
      "message_id": "msg-abc-123"
    },
    "user_id": null,
    "user_name": null,
    "created_at": "2025-12-15T10:30:00",
    "created_by": "system"
  },
  {
    "id": 2,
    "email_id": 123,
    "event_type": "classified",
    "event_category": "system",
    "description": "Classified as invoice (confidence: 85%)",
    "metadata": {
      "classification": "invoice",
      "confidence": 0.85,
      "method": "ai"
    },
    "created_at": "2025-12-15T10:30:02",
    "created_by": "system"
  }
]
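
In the response above, the second entry omits `user_id` and `user_name`, so client code should not assume every key is present. The following sketch parses such a payload defensively; `ActivityEntry` and `parse_activity` are hypothetical names, and the field list is taken from the example response only.

```python
import json
from dataclasses import dataclass, field
from datetime import datetime
from typing import Any, Dict, List, Optional


@dataclass
class ActivityEntry:
    id: int
    email_id: int
    event_type: str
    event_category: str
    description: str
    created_at: datetime
    created_by: str
    metadata: Dict[str, Any] = field(default_factory=dict)
    user_id: Optional[int] = None
    user_name: Optional[str] = None


def parse_activity(payload: str) -> List[ActivityEntry]:
    """Parse the JSON array returned by the activity endpoint."""
    entries = []
    for item in json.loads(payload):
        entries.append(ActivityEntry(
            id=item["id"],
            email_id=item["email_id"],
            event_type=item["event_type"],
            event_category=item["event_category"],
            description=item["description"],
            created_at=datetime.fromisoformat(item["created_at"]),
            created_by=item["created_by"],
            # Optional keys fall back to defaults when missing or null.
            metadata=item.get("metadata") or {},
            user_id=item.get("user_id"),
            user_name=item.get("user_name"),
        ))
    return entries
```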

🎨 UI Integration

Email Detail View

When you select an email in the email UI:

  1. Click the "Log" tab in the right sidebar
  2. See the complete timeline of all events
  3. Expand metadata for details

Timeline Features

  • Chronological view: Newest first
  • Color-coded icons: Based on event category
    • 🔵 System events (blue)
    • 🟢 User events (green)
    • 🔷 Workflow events (cyan)
    • 🟡 Rule events (yellow)
    • Integration events (gray)
  • Expandable metadata: Click to see JSON details
  • User attribution: Shows who performed the action
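
The ordering and color rules above can be sketched in a few lines. This is an illustration only: `CATEGORY_COLORS` and `build_timeline` are hypothetical names, the color names come from the list above, and falling back to gray for unknown categories is an assumption.

```python
from typing import Any, Dict, List

# Color per event category, as listed in the timeline features.
CATEGORY_COLORS = {
    "system": "blue",
    "user": "green",
    "workflow": "cyan",
    "rule": "yellow",
    "integration": "gray",
}


def build_timeline(events: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Sort events newest first and attach a display color to each."""
    # ISO 8601 timestamps in the same format sort correctly as plain strings.
    ordered = sorted(events, key=lambda e: e["created_at"], reverse=True)
    return [
        {**e, "color": CATEGORY_COLORS.get(e["event_category"], "gray")}
        for e in ordered
    ]
```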

📈 Analytics & Monitoring

Recent Activity Across All Emails

GET /api/v1/emails/activity/recent?limit=50&event_type=error

Activity Statistics

GET /api/v1/emails/activity/stats

Response:

[
  {
    "event_type": "classified",
    "event_category": "system",
    "count": 1523,
    "last_occurrence": "2025-12-15T12:45:00"
  },
  {
    "event_type": "workflow_executed",
    "event_category": "workflow",
    "count": 892,
    "last_occurrence": "2025-12-15T12:44:30"
  }
]
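
The statistics response groups events by type and category and reports a count plus the most recent occurrence. How the endpoint computes this is not shown in this document; the sketch below reproduces the same shape in memory, with `activity_stats` as a hypothetical name and ordering by count as an assumption.

```python
from typing import Any, Dict, List, Tuple


def activity_stats(events: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Aggregate events into count and last_occurrence per (type, category)."""
    grouped: Dict[Tuple[str, str], Dict[str, Any]] = {}
    for event in events:
        key = (event["event_type"], event["event_category"])
        entry = grouped.setdefault(key, {
            "event_type": event["event_type"],
            "event_category": event["event_category"],
            "count": 0,
            "last_occurrence": event["created_at"],
        })
        entry["count"] += 1
        # ISO 8601 strings in the same format compare chronologically.
        entry["last_occurrence"] = max(entry["last_occurrence"], event["created_at"])
    return sorted(grouped.values(), key=lambda s: s["count"], reverse=True)
```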

🔍 Use Cases

1. Debugging Email Processing

-- See complete flow for problematic email
SELECT 
    event_type,
    description,
    created_at
FROM email_activity_log
WHERE email_id = 123
ORDER BY created_at;

2. Performance Monitoring

-- Find slow workflow executions
SELECT 
    email_id,
    description,
    (metadata->>'execution_time_ms')::int as exec_time
FROM email_activity_log
WHERE event_type = 'workflow_executed'
ORDER BY exec_time DESC NULLS LAST  -- rows without execution_time_ms would otherwise sort first
LIMIT 10;

3. User Activity Audit

-- See what user did
SELECT 
    e.subject,
    a.event_type,
    a.description,
    a.created_at
FROM email_activity_log a
JOIN email_messages e ON a.email_id = e.id
WHERE a.user_id = 5
ORDER BY a.created_at DESC;

4. Error Analysis

-- Find common errors
SELECT 
    metadata->>'error_type' as error_type,
    COUNT(*) as count
FROM email_activity_log
WHERE event_type = 'error'
GROUP BY error_type
ORDER BY count DESC;

5. Workflow Success Rate

-- Calculate workflow success rate
SELECT 
    metadata->>'workflow_name' as workflow,
    COUNT(*) FILTER (WHERE metadata->>'status' = 'completed') as success,
    COUNT(*) FILTER (WHERE metadata->>'status' = 'failed') as failed,
    ROUND(
        100.0 * COUNT(*) FILTER (WHERE metadata->>'status' = 'completed') / COUNT(*),
        2
    ) as success_rate
FROM email_activity_log
WHERE event_type = 'workflow_executed'
GROUP BY workflow
ORDER BY success_rate DESC;
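
The same calculation can be done in application code, for example on events already fetched through the API. This is a minimal sketch that mirrors the query above: `workflow_success_rates` is a hypothetical name, and events without a `workflow_name` are grouped under `"unknown"` (the SQL version would group them under NULL).

```python
from typing import Any, Dict, List


def workflow_success_rates(events: List[Dict[str, Any]]) -> Dict[str, float]:
    """Return the percentage of completed executions per workflow name."""
    totals: Dict[str, Dict[str, int]] = {}
    for event in events:
        if event["event_type"] != "workflow_executed":
            continue
        meta = event.get("metadata") or {}
        name = meta.get("workflow_name", "unknown")
        counts = totals.setdefault(name, {"success": 0, "total": 0})
        counts["total"] += 1
        if meta.get("status") == "completed":
            counts["success"] += 1
    # total is always at least 1 for every key, so the division is safe.
    return {
        name: round(100.0 * c["success"] / c["total"], 2)
        for name, c in totals.items()
    }
```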

🚀 Auto-Logging

The following is already implemented and logs automatically:

Email Fetching - Logged when emails are fetched
Classification - Logged when the AI classifies an email
Workflow Execution - Logged at start and completion
Status Changes - Logged when the email status changes

Upcoming Auto-Logging

Rule matching (to be added soon)
User read events (when a user opens an email)
Attachment actions (download/upload)
Entity linking (vendor/customer association)

💡 Best Practices

1. Always Include Metadata

from datetime import datetime  # needed for sent_at in the second example

# ❌ Bad - No context
await email_activity_logger.log(
    email_id=123,
    event_type='action_performed',
    category='system',
    description='Something happened'
)

# ✅ Good - Rich context
await email_activity_logger.log(
    email_id=123,
    event_type='invoice_sent',
    category='integration',
    description='Invoice sent to e-conomic',
    metadata={
        'invoice_number': 'INV-2025-001',
        'economic_id': 12345,
        'amount': 1234.56,
        'sent_at': datetime.now().isoformat()
    }
)

2. Use Descriptive Event Types

# ❌ Bad - Generic
event_type='action'

# ✅ Good - Specific
event_type='invoice_sent_to_economic'

3. Choose Correct Category

  • system: Automated system actions
  • user: User-triggered actions
  • workflow: Workflow executions
  • rule: Rule-based actions
  • integration: External system integrations
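
One way to keep category values consistent is to validate them against a fixed set before logging. This is a sketch, not the project's implementation: `EventCategory` and `parse_category` are hypothetical names, and the five values are the ones listed above.

```python
from enum import Enum


class EventCategory(str, Enum):
    """The five categories listed above."""
    SYSTEM = "system"
    USER = "user"
    WORKFLOW = "workflow"
    RULE = "rule"
    INTEGRATION = "integration"


def parse_category(value: str) -> EventCategory:
    """Normalize a category string, rejecting values outside the list."""
    try:
        return EventCategory(value.strip().lower())
    except ValueError:
        raise ValueError(f"Unknown event category: {value!r}") from None
```

Because `EventCategory` subclasses `str`, a member compares equal to its plain string value, so `EventCategory.USER == 'user'` holds.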

4. Log Errors with Context

import os
import traceback

try:
    result = extract_invoice_data(pdf_path)
except Exception as e:
    await email_activity_logger.log_error(
        email_id=email_id,
        error_type='extraction_failed',
        error_message=str(e),
        context={
            'pdf_path': pdf_path,
            # Guard the size lookup so a missing file cannot raise inside the handler
            'file_size': os.path.getsize(pdf_path) if os.path.exists(pdf_path) else None,
            'traceback': traceback.format_exc()
        }
    )

🔒 Data Retention

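Activity logs can grow quickly. Implement a cleanup strategy, for example one of the following:

-- Option 1: Delete logs older than 90 days
DELETE FROM email_activity_log 
WHERE created_at < NOW() - INTERVAL '90 days';

-- Option 2: Archive logs older than 30 days to a separate table, then delete them.
-- Running both statements in one transaction keeps NOW() fixed, so the DELETE
-- removes exactly the rows that were archived.
BEGIN;

INSERT INTO email_activity_log_archive
SELECT * FROM email_activity_log
WHERE created_at < NOW() - INTERVAL '30 days';

DELETE FROM email_activity_log
WHERE created_at < NOW() - INTERVAL '30 days';

COMMIT;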
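
The archive step hinges on computing a single cutoff and applying it to both the copy and the delete. The sketch below shows that logic on in-memory rows; `split_for_archive` is a hypothetical helper, and the 30-day default is taken from the SQL above.

```python
from datetime import datetime, timedelta
from typing import Any, Dict, List, Tuple


def split_for_archive(rows: List[Dict[str, Any]], now: datetime,
                      archive_after_days: int = 30
                      ) -> Tuple[List[Dict[str, Any]], List[Dict[str, Any]]]:
    """Split rows into (keep, archive) using one fixed cutoff timestamp."""
    # Compute the cutoff once so both partitions agree on the boundary.
    cutoff = now - timedelta(days=archive_after_days)
    keep: List[Dict[str, Any]] = []
    archive: List[Dict[str, Any]] = []
    for row in rows:
        # Strictly older than the cutoff, matching the SQL "<" comparison.
        if row["created_at"] < cutoff:
            archive.append(row)
        else:
            keep.append(row)
    return keep, archive
```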

📊 Performance Considerations

With indexes on email_id, event_type, created_at, and event_category, lookups on those columns stay fast, so the system can handle millions of log entries without performance issues.

Index Usage

-- Fast: Uses idx_email_activity_log_email_id
SELECT * FROM email_activity_log WHERE email_id = 123;

-- Fast: Uses idx_email_activity_log_event_type
SELECT * FROM email_activity_log WHERE event_type = 'workflow_executed';

-- Fast: Uses idx_email_activity_log_created_at
SELECT * FROM email_activity_log WHERE created_at > NOW() - INTERVAL '1 day';

🎓 Examples

Complete Email Lifecycle Log

# 1. Email arrives
await email_activity_logger.log_fetched(email_id, 'imap', message_id)

# 2. AI classifies it
await email_activity_logger.log_classified(email_id, 'invoice', 0.92, 'ai')

# 3. Workflow processes it
await email_activity_logger.log_workflow_executed(
    email_id, workflow_id, 'Invoice Processing', 'completed', 3, 1100
)

# 4. Links to vendor
await email_activity_logger.log_linked(email_id, 'vendor', 42, 'Acme Corp')

# 5. Extracts invoice
await email_activity_logger.log_invoice_extracted(
    email_id, 'INV-001', 1234.56, True
)

# 6. Status changes
await email_activity_logger.log_status_changed(
    email_id, 'active', 'processed', 'workflow completed'
)

Result: A complete audit trail of the email from fetch to processed!


Version: 1.0
Last Updated: 15 December 2025
Status: Production Ready