# Email Activity Logging System
## Overview
A complete audit trail system that logs **everything** that happens to emails in BMC Hub. Every action, change, and event is logged automatically with timestamps, metadata, and context.
## 🎯 What Gets Logged?
### System Events
- **fetched**: Email fetched from the mail server
- **classified**: Email classified by the AI/keyword system
- **workflow_executed**: Workflow run on the email
- **rule_matched**: Email rule matched
- **status_changed**: Email status changed
- **error**: Error occurred during processing
### User Events
- **read**: Email read by a user
- **attachment_downloaded**: Attachment downloaded
- **attachment_uploaded**: Attachment uploaded
### Integration Events
- **linked**: Email linked to a vendor/customer/case
- **invoice_extracted**: Invoice data extracted from a PDF
- **ticket_created**: Support ticket created
- **notification_sent**: Notification sent
## 📊 Database Schema
### email_activity_log Table
```sql
CREATE TABLE email_activity_log (
    id SERIAL PRIMARY KEY,
    email_id INTEGER NOT NULL,            -- Which email
    event_type VARCHAR(50) NOT NULL,      -- What happened
    event_category VARCHAR(30) NOT NULL,  -- Category (system/user/workflow/etc)
    description TEXT NOT NULL,            -- Human-readable description
    metadata JSONB,                       -- Extra data as JSON
    user_id INTEGER,                      -- User, if user-triggered
    created_at TIMESTAMP,                 -- When
    created_by VARCHAR(255)               -- Who/what
);
```
### email_timeline View
Pre-built view with joins to `users` and `email_messages`:
```sql
SELECT * FROM email_timeline WHERE email_id = 123;
```
## 🔧 How Is It Used?
### In Python Code
```python
from app.services.email_activity_logger import email_activity_logger

# Log email fetch
await email_activity_logger.log_fetched(
    email_id=123,
    source='imap',
    message_id='msg-abc-123'
)

# Log classification
await email_activity_logger.log_classified(
    email_id=123,
    classification='invoice',
    confidence=0.85,
    method='ai'
)

# Log workflow execution
await email_activity_logger.log_workflow_executed(
    email_id=123,
    workflow_id=5,
    workflow_name='Invoice Processing',
    status='completed',
    steps_completed=3,
    execution_time_ms=1250
)

# Log status change
await email_activity_logger.log_status_changed(
    email_id=123,
    old_status='active',
    new_status='processed',
    reason='workflow completed'
)

# Log entity linking
await email_activity_logger.log_linked(
    email_id=123,
    entity_type='vendor',
    entity_id=42,
    entity_name='Acme Corp'
)

# Log invoice extraction
await email_activity_logger.log_invoice_extracted(
    email_id=123,
    invoice_number='INV-2025-001',
    amount=1234.56,
    success=True
)

# Log error
await email_activity_logger.log_error(
    email_id=123,
    error_type='extraction_failed',
    error_message='PDF corrupted',
    context={'file': 'invoice.pdf', 'size': 0}
)

# Generic log (for custom events)
await email_activity_logger.log(
    email_id=123,
    event_type='custom_event',
    category='integration',
    description='Custom event happened',
    metadata={'key': 'value'}
)
```
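The convenience methods above presumably build a description and a metadata dictionary and then delegate to the generic `log()` call. The following is a minimal, self-contained sketch of that pattern. It is not the actual `email_activity_logger` implementation: the `InMemoryActivityLogger` class and its list-based storage are hypothetical stand-ins for the database-backed service, and the description format is borrowed from the API response example in this document.

```python
import asyncio
from datetime import datetime, timezone
from typing import Any, Optional


class InMemoryActivityLogger:
    """Hypothetical stand-in: keeps events in a list instead of PostgreSQL."""

    def __init__(self) -> None:
        self.events: list[dict[str, Any]] = []

    async def log(
        self,
        email_id: int,
        event_type: str,
        category: str,
        description: str,
        metadata: Optional[dict[str, Any]] = None,
        user_id: Optional[int] = None,
        created_by: str = "system",
    ) -> dict[str, Any]:
        # Build a record with the same columns as email_activity_log.
        event = {
            "id": len(self.events) + 1,
            "email_id": email_id,
            "event_type": event_type,
            "event_category": category,
            "description": description,
            "metadata": metadata or {},
            "user_id": user_id,
            "created_at": datetime.now(timezone.utc).isoformat(),
            "created_by": created_by,
        }
        self.events.append(event)
        return event

    async def log_classified(
        self, email_id: int, classification: str, confidence: float, method: str
    ) -> dict[str, Any]:
        # Convenience wrapper: derive description and metadata, then delegate.
        return await self.log(
            email_id=email_id,
            event_type="classified",
            category="system",
            description=f"Classified as {classification} (confidence: {confidence:.0%})",
            metadata={
                "classification": classification,
                "confidence": confidence,
                "method": method,
            },
        )


async def demo() -> dict[str, Any]:
    logger = InMemoryActivityLogger()
    return await logger.log_classified(123, "invoice", 0.85, "ai")


event = asyncio.run(demo())
print(event["description"])
```

Keeping all formatting in the wrapper and all persistence in `log()` means a new event type only needs a new wrapper, not a new storage path.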
### Via SQL
```sql
-- Log event directly via function
SELECT log_email_event(
    123,                      -- email_id
    'custom_event',           -- event_type
    'system',                 -- event_category
    'Something happened',     -- description
    '{"foo": "bar"}'::jsonb,  -- metadata (optional)
    NULL,                     -- user_id (optional)
    'system'                  -- created_by
);

-- Query logs for specific email
SELECT * FROM email_activity_log
WHERE email_id = 123
ORDER BY created_at DESC;

-- Use the view for nicer output
SELECT * FROM email_timeline
WHERE email_id = 123;
```
### Via API
```http
GET /api/v1/emails/123/activity
```
Response:
```json
[
  {
    "id": 1,
    "email_id": 123,
    "event_type": "fetched",
    "event_category": "system",
    "description": "Email fetched from email server",
    "metadata": {
      "source": "imap",
      "message_id": "msg-abc-123"
    },
    "user_id": null,
    "user_name": null,
    "created_at": "2025-12-15T10:30:00",
    "created_by": "system"
  },
  {
    "id": 2,
    "email_id": 123,
    "event_type": "classified",
    "event_category": "system",
    "description": "Classified as invoice (confidence: 85%)",
    "metadata": {
      "classification": "invoice",
      "confidence": 0.85,
      "method": "ai"
    },
    "created_at": "2025-12-15T10:30:02",
    "created_by": "system"
  }
]
```
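A client consuming this endpoint may want typed objects rather than raw dictionaries. The sketch below parses a payload shaped like the response above into a dataclass. The `ActivityEntry` class and `parse_activity` function are hypothetical names introduced for illustration, and the sketch assumes that `user_id`, `user_name`, and `metadata` may be absent from an entry, as in the second item of the example.

```python
import json
from dataclasses import dataclass, field
from datetime import datetime
from typing import Any, Optional


@dataclass
class ActivityEntry:
    id: int
    email_id: int
    event_type: str
    event_category: str
    description: str
    created_at: datetime
    created_by: str
    metadata: dict[str, Any] = field(default_factory=dict)
    user_id: Optional[int] = None
    user_name: Optional[str] = None


def parse_activity(payload: str) -> list[ActivityEntry]:
    """Turn the JSON array returned by the activity endpoint into entries."""
    entries = []
    for raw in json.loads(payload):
        entries.append(
            ActivityEntry(
                id=raw["id"],
                email_id=raw["email_id"],
                event_type=raw["event_type"],
                event_category=raw["event_category"],
                description=raw["description"],
                created_at=datetime.fromisoformat(raw["created_at"]),
                created_by=raw["created_by"],
                metadata=raw.get("metadata") or {},  # tolerate missing/null
                user_id=raw.get("user_id"),
                user_name=raw.get("user_name"),
            )
        )
    return entries


SAMPLE = """
[
  {"id": 1, "email_id": 123, "event_type": "fetched",
   "event_category": "system",
   "description": "Email fetched from email server",
   "metadata": {"source": "imap", "message_id": "msg-abc-123"},
   "user_id": null, "user_name": null,
   "created_at": "2025-12-15T10:30:00", "created_by": "system"},
  {"id": 2, "email_id": 123, "event_type": "classified",
   "event_category": "system",
   "description": "Classified as invoice (confidence: 85%)",
   "metadata": {"classification": "invoice", "confidence": 0.85, "method": "ai"},
   "created_at": "2025-12-15T10:30:02", "created_by": "system"}
]
"""

entries = parse_activity(SAMPLE)
for entry in entries:
    print(entry.created_at.isoformat(), entry.event_type)
```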
## 🎨 UI Integration
### Email Detail View
When you select an email in the email UI:
1. Click the **"Log"** tab in the right sidebar
2. View the complete timeline of all events
3. Expand the metadata for details
### Timeline Features
- **Chronological view**: Newest first
- **Color-coded icons**: Based on event category
  - 🔵 System events (blue)
  - 🟢 User events (green)
  - 🔷 Workflow events (cyan)
  - 🟡 Rule events (yellow)
  - ⚫ Integration events (gray)
- **Expandable metadata**: Click to see the JSON details
- **User attribution**: Shows who performed the action
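
The timeline behavior listed above (newest first, one color per category, user attribution) can be sketched in a few lines. This is an illustrative model only, not the UI code: `CATEGORY_COLORS` and `build_timeline` are hypothetical names, and falling back to gray for an unknown category and to `created_by` when no user name is present are assumptions.

```python
from datetime import datetime
from typing import Any

# Colors as listed in this document, keyed by event_category.
CATEGORY_COLORS = {
    "system": "blue",
    "user": "green",
    "workflow": "cyan",
    "rule": "yellow",
    "integration": "gray",
}


def build_timeline(events: list[dict[str, Any]]) -> list[dict[str, str]]:
    """Order events newest first and attach a display color and an actor."""
    ordered = sorted(
        events,
        key=lambda e: datetime.fromisoformat(e["created_at"]),
        reverse=True,
    )
    return [
        {
            # Assumption: unknown categories fall back to gray.
            "color": CATEGORY_COLORS.get(e["event_category"], "gray"),
            "label": e["description"],
            # Assumption: show the user name if present, else created_by.
            "who": e.get("user_name") or e["created_by"],
        }
        for e in ordered
    ]


timeline = build_timeline([
    {"event_category": "system", "description": "Email fetched from email server",
     "created_at": "2025-12-15T10:30:00", "created_by": "system"},
    {"event_category": "user", "description": "Email read", "user_name": "Anna",
     "created_at": "2025-12-15T11:00:00", "created_by": "user"},
])
for row in timeline:
    print(row["color"], row["label"], row["who"])
```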
## 📈 Analytics & Monitoring
### Recent Activity Across All Emails
```http
GET /api/v1/emails/activity/recent?limit=50&event_type=error
```
### Activity Statistics
```http
GET /api/v1/emails/activity/stats
```
Response:
```json
[
  {
    "event_type": "classified",
    "event_category": "system",
    "count": 1523,
    "last_occurrence": "2025-12-15T12:45:00"
  },
  {
    "event_type": "workflow_executed",
    "event_category": "workflow",
    "count": 892,
    "last_occurrence": "2025-12-15T12:44:30"
  }
]
```
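The statistics response groups events by type and category and reports a count and the most recent occurrence. As a rough illustration of that aggregation, the sketch below computes the same shape from a list of event dictionaries in memory. The `activity_stats` function is a hypothetical helper, and sorting by descending count is an assumption; the endpoint's actual ordering is not documented here.

```python
from typing import Any


def activity_stats(events: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Group events by (event_type, event_category); count and track latest."""
    stats: dict[tuple[str, str], dict[str, Any]] = {}
    for event in events:
        key = (event["event_type"], event["event_category"])
        entry = stats.setdefault(
            key,
            {
                "event_type": event["event_type"],
                "event_category": event["event_category"],
                "count": 0,
                "last_occurrence": event["created_at"],
            },
        )
        entry["count"] += 1
        # ISO-8601 strings in one consistent format compare chronologically.
        entry["last_occurrence"] = max(entry["last_occurrence"], event["created_at"])
    # Assumption: most frequent event types first.
    return sorted(stats.values(), key=lambda s: s["count"], reverse=True)


stats = activity_stats([
    {"event_type": "classified", "event_category": "system",
     "created_at": "2025-12-15T12:45:00"},
    {"event_type": "workflow_executed", "event_category": "workflow",
     "created_at": "2025-12-15T12:44:30"},
    {"event_type": "classified", "event_category": "system",
     "created_at": "2025-12-15T12:40:00"},
])
for row in stats:
    print(row)
```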
## 🔍 Use Cases
### 1. Debugging Email Processing
```sql
-- See complete flow for problematic email
SELECT
event_type,
description,
created_at
FROM email_activity_log
WHERE email_id = 123
ORDER BY created_at;
```
### 2. Performance Monitoring
```sql
-- Find slow workflow executions
-- NULLS LAST keeps rows without execution_time_ms out of the top results
SELECT
    email_id,
    description,
    (metadata->>'execution_time_ms')::int AS exec_time
FROM email_activity_log
WHERE event_type = 'workflow_executed'
ORDER BY exec_time DESC NULLS LAST
LIMIT 10;
```
### 3. User Activity Audit
```sql
-- See what user did
SELECT
e.subject,
a.event_type,
a.description,
a.created_at
FROM email_activity_log a
JOIN email_messages e ON a.email_id = e.id
WHERE a.user_id = 5
ORDER BY a.created_at DESC;
```
### 4. Error Analysis
```sql
-- Find common errors
SELECT
metadata->>'error_type' as error_type,
COUNT(*) as count
FROM email_activity_log
WHERE event_type = 'error'
GROUP BY error_type
ORDER BY count DESC;
```
### 5. Workflow Success Rate
```sql
-- Calculate workflow success rate
SELECT
    metadata->>'workflow_name' AS workflow,
    COUNT(*) FILTER (WHERE metadata->>'status' = 'completed') AS success,
    COUNT(*) FILTER (WHERE metadata->>'status' = 'failed') AS failed,
    ROUND(
        100.0 * COUNT(*) FILTER (WHERE metadata->>'status' = 'completed') / COUNT(*),
        2
    ) AS success_rate
FROM email_activity_log
WHERE event_type = 'workflow_executed'
GROUP BY workflow
ORDER BY success_rate DESC;
```
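To make the arithmetic in the query concrete, here is the same calculation sketched over in-memory event dictionaries. The `workflow_success_rates` function and the workflow name `Ticket Routing` are hypothetical. As in the SQL, the denominator is every `workflow_executed` row for the workflow, so rows whose status is neither `completed` nor `failed` lower the rate.

```python
from collections import defaultdict
from typing import Any


def workflow_success_rates(events: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Per-workflow success rate: 100 * completed / all rows, 2 decimals."""
    buckets: dict[Any, dict[str, int]] = defaultdict(
        lambda: {"success": 0, "failed": 0, "total": 0}
    )
    for event in events:
        if event["event_type"] != "workflow_executed":
            continue
        meta = event.get("metadata") or {}
        bucket = buckets[meta.get("workflow_name")]
        bucket["total"] += 1  # mirrors COUNT(*)
        if meta.get("status") == "completed":
            bucket["success"] += 1
        elif meta.get("status") == "failed":
            bucket["failed"] += 1
    rows = [
        {
            "workflow": name,
            "success": b["success"],
            "failed": b["failed"],
            "success_rate": round(100.0 * b["success"] / b["total"], 2),
        }
        for name, b in buckets.items()
    ]
    return sorted(rows, key=lambda r: r["success_rate"], reverse=True)


def _run(name: str, status: str) -> dict[str, Any]:
    # Small helper to build sample workflow_executed events.
    return {"event_type": "workflow_executed",
            "metadata": {"workflow_name": name, "status": status}}


rates = workflow_success_rates([
    _run("Invoice Processing", "completed"),
    _run("Invoice Processing", "completed"),
    _run("Invoice Processing", "completed"),
    _run("Invoice Processing", "failed"),
    _run("Ticket Routing", "completed"),  # hypothetical workflow name
])
for row in rates:
    print(row)
```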
## 🚀 Auto-Logging
The following is already implemented and logs automatically:
- **Email Fetching** - Logged when emails are fetched
- **Classification** - Logged when the AI classifies an email
- **Workflow Execution** - Logged at start and completion
- **Status Changes** - Logged when the email status changes
### Upcoming Auto-Logging
- ⏳ Rule matching (coming soon)
- ⏳ User read events (when a user opens an email)
- ⏳ Attachment actions (download/upload)
- ⏳ Entity linking (vendor/customer association)
## 💡 Best Practices
### 1. Always Include Metadata
```python
from datetime import datetime

# ❌ Bad - No context
await email_activity_logger.log(
    email_id=123,
    event_type='action_performed',
    category='system',
    description='Something happened'
)

# ✅ Good - Rich context
await email_activity_logger.log(
    email_id=123,
    event_type='invoice_sent',
    category='integration',
    description='Invoice sent to e-conomic',
    metadata={
        'invoice_number': 'INV-2025-001',
        'economic_id': 12345,
        'amount': 1234.56,
        'sent_at': datetime.now().isoformat()
    }
)
```
### 2. Use Descriptive Event Types
```python
# ❌ Bad - Generic
event_type='action'
# ✅ Good - Specific
event_type='invoice_sent_to_economic'
```
### 3. Choose Correct Category
- **system**: Automated system actions
- **user**: User-triggered actions
- **workflow**: Workflow executions
- **rule**: Rule-based actions
- **integration**: External system integrations
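
One way to keep categories consistent is to validate them before logging. The sketch below is an assumption about how such a check could look, not part of the existing logger: `EventCategory` and `normalize_category` are hypothetical names, and only the five categories listed above are accepted.

```python
from enum import Enum


class EventCategory(str, Enum):
    """The five categories documented above."""

    SYSTEM = "system"
    USER = "user"
    WORKFLOW = "workflow"
    RULE = "rule"
    INTEGRATION = "integration"


def normalize_category(value: str) -> str:
    """Return the canonical category string, or raise ValueError if unknown."""
    try:
        return EventCategory(value.strip().lower()).value
    except ValueError:
        allowed = ", ".join(c.value for c in EventCategory)
        raise ValueError(
            f"Unknown event category {value!r}; expected one of: {allowed}"
        ) from None


print(normalize_category(" Integration "))
```

Calling `normalize_category(...)` on the `category` argument before passing it to the logger would reject typos such as `'sytem'` early instead of storing them in `event_category`.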
### 4. Log Errors with Context
```python
import os
import traceback

try:
    result = extract_invoice_data(pdf_path)
except Exception as e:
    await email_activity_logger.log_error(
        email_id=email_id,
        error_type='extraction_failed',
        error_message=str(e),
        context={
            'pdf_path': pdf_path,
            # Guard against a missing file so error logging itself cannot fail
            'file_size': os.path.getsize(pdf_path) if os.path.exists(pdf_path) else None,
            'traceback': traceback.format_exc()
        }
    )
```
## 🔒 Data Retention
Activity logs can grow quickly. Implement a cleanup strategy:
```sql
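-- Option A: delete logs older than 90 days
DELETE FROM email_activity_log
WHERE created_at < NOW() - INTERVAL '90 days';

-- Option B: archive old logs to a separate table, then delete them.
-- Run both statements in one transaction: NOW() is fixed at transaction
-- start, so the INSERT and the DELETE use the same cutoff.
BEGIN;
INSERT INTO email_activity_log_archive
SELECT * FROM email_activity_log
WHERE created_at < NOW() - INTERVAL '30 days';
DELETE FROM email_activity_log
WHERE created_at < NOW() - INTERVAL '30 days';
COMMIT;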
```
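When archiving is driven from application code, computing the cutoff once and running the copy and the delete in a single transaction keeps the two statements in agreement. The sketch below demonstrates the idea with Python's built-in `sqlite3` module as a stand-in for PostgreSQL, using simplified tables. The `archive_older_than` helper and the reduced column set are assumptions for illustration only.

```python
import sqlite3


def archive_older_than(conn: sqlite3.Connection, cutoff_iso: str) -> int:
    """Copy rows older than the cutoff to the archive, then delete them.

    Both statements share one cutoff value and one transaction, so a row
    is never deleted without having been archived first.
    """
    with conn:  # commits on success, rolls back if an exception is raised
        conn.execute(
            "INSERT INTO email_activity_log_archive "
            "SELECT * FROM email_activity_log WHERE created_at < ?",
            (cutoff_iso,),
        )
        cursor = conn.execute(
            "DELETE FROM email_activity_log WHERE created_at < ?",
            (cutoff_iso,),
        )
    return cursor.rowcount


# Simplified stand-in schema (assumption: only a subset of the real columns).
COLUMNS = (
    "(id INTEGER PRIMARY KEY, email_id INTEGER NOT NULL, "
    "event_type TEXT NOT NULL, created_at TEXT NOT NULL)"
)
conn = sqlite3.connect(":memory:")
conn.execute(f"CREATE TABLE email_activity_log {COLUMNS}")
conn.execute(f"CREATE TABLE email_activity_log_archive {COLUMNS}")
conn.executemany(
    "INSERT INTO email_activity_log (email_id, event_type, created_at) "
    "VALUES (?, ?, ?)",
    [
        (123, "fetched", "2025-10-01T08:00:00"),
        (123, "classified", "2025-12-14T09:00:00"),
    ],
)
conn.commit()

moved = archive_older_than(conn, "2025-11-15T00:00:00")
print(f"archived {moved} row(s)")
```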
## 📊 Performance Considerations
With indexes on `email_id`, `event_type`, `created_at`, and `event_category`, queries that filter on these columns stay fast, and the system can handle millions of log entries without performance issues.
### Index Usage
```sql
-- Fast: Uses idx_email_activity_log_email_id
SELECT * FROM email_activity_log WHERE email_id = 123;
-- Fast: Uses idx_email_activity_log_event_type
SELECT * FROM email_activity_log WHERE event_type = 'workflow_executed';
-- Fast: Uses idx_email_activity_log_created_at
SELECT * FROM email_activity_log WHERE created_at > NOW() - INTERVAL '1 day';
```
## 🎓 Examples
### Complete Email Lifecycle Log
```python
# 1. Email arrives
await email_activity_logger.log_fetched(email_id, 'imap', message_id)

# 2. AI classifies it
await email_activity_logger.log_classified(email_id, 'invoice', 0.92, 'ai')

# 3. Workflow processes it
await email_activity_logger.log_workflow_executed(
    email_id, workflow_id, 'Invoice Processing', 'completed', 3, 1100
)

# 4. Links to vendor
await email_activity_logger.log_linked(email_id, 'vendor', 42, 'Acme Corp')

# 5. Extracts invoice
await email_activity_logger.log_invoice_extracted(
    email_id, 'INV-001', 1234.56, True
)

# 6. Status changes
await email_activity_logger.log_status_changed(
    email_id, 'active', 'processed', 'workflow completed'
)
```
Result: **A complete audit trail of the email, from fetch to processed!**
---
**Version**: 1.0
**Last Updated**: 15 December 2025
**Status**: ✅ Production Ready