
Overview

This proposal explores native compression, delta updates for thread replies, and efficient binary encoding for SMTP.

Problem Statement

SMTP is inefficient:

  • Base64 encoding adds about 33% overhead
  • No compression (email threads repeat quoted content)
  • Binary attachments are bloated
  • No deduplication for forwarded messages
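
The Base64 figure follows from the encoding itself: every 3 raw bytes become 4 ASCII characters. A minimal Python sketch to check it is shown here. The helper names `base64_overhead` and `mime_wire_overhead` are illustrative, not part of this proposal, and the second helper assumes MIME-style 76-character lines terminated by CRLF.

```python
import base64
import os


def base64_overhead(payload: bytes) -> float:
    """Fractional size increase from plain Base64 encoding (no line breaks)."""
    return len(base64.b64encode(payload)) / len(payload) - 1.0


def mime_wire_overhead(payload: bytes) -> float:
    """Same, but with 76-character lines ending in CRLF (assumed wire format)."""
    wrapped = base64.encodebytes(payload).replace(b"\n", b"\r\n")
    return len(wrapped) / len(payload) - 1.0


sample = os.urandom(300_000)  # stand-in for a binary attachment
print(f"plain Base64: {base64_overhead(sample):.1%}")
print(f"with CRLF line breaks: {mime_wire_overhead(sample):.1%}")
```

Line wrapping pushes the on-the-wire overhead a few points above the bare 4/3 ratio.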

Goals

  1. Native SMTP compression (gzip/brotli)
  2. Delta encoding for thread replies (only send new content)
  3. Deduplication for attachments and forwarded messages
  4. Backward compatible with legacy MTAs

Proposed Solutions

1. SMTP Extension: COMPRESS

EHLO msgs.global
250-msgs.global
250-SIZE 52428800
250-COMPRESS GZIP BROTLI
250 HELP

COMPRESS BROTLI
220 Compression active

[All subsequent SMTP commands/data compressed]
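
The negotiation above could be handled on the client side roughly as follows. This is a sketch under assumptions: COMPRESS is the extension proposed here, not a registered SMTP extension, and the helper names (`parse_compress_capability`, `choose_algorithm`, `send_compressed`) are hypothetical. Brotli is not in the Python standard library, so the stream part uses gzip framing via `zlib`. A sync flush after each write lets the peer decode every command as it arrives.

```python
import zlib

EHLO_REPLY = [
    "250-msgs.global",
    "250-SIZE 52428800",
    "250-COMPRESS GZIP BROTLI",
    "250 HELP",
]


def parse_compress_capability(ehlo_lines):
    """Return the algorithms listed on the COMPRESS line, or [] if absent."""
    for line in ehlo_lines:
        parts = line[4:].split()  # drop the "250-" / "250 " reply prefix
        if parts and parts[0].upper() == "COMPRESS":
            return [p.upper() for p in parts[1:]]
    return []


def choose_algorithm(advertised, supported=("GZIP",)):
    """Pick the first locally supported algorithm; None means stay uncompressed."""
    for name in supported:
        if name in advertised:
            return name
    return None


def send_compressed(compressor, data: bytes) -> bytes:
    """Compress one write and sync-flush so the receiver can decode it at once."""
    return compressor.compress(data) + compressor.flush(zlib.Z_SYNC_FLUSH)


algorithm = choose_algorithm(parse_compress_capability(EHLO_REPLY))
if algorithm == "GZIP":
    compressor = zlib.compressobj(wbits=31)      # wbits=31 selects gzip framing
    decompressor = zlib.decompressobj(wbits=31)  # stands in for the server side
    wire = send_compressed(compressor, b"MAIL FROM:<alice@example.org>\r\n")
    print(decompressor.decompress(wire))
```

Returning `None` when the server does not advertise COMPRESS is the fallback path for legacy MTAs (goal 4).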

2. Delta Encoding for Threads

X-Thread-Delta: true
X-Thread-Base-Message-ID: <original@msgs.global>
Content-Transfer-Encoding: delta

[Only new content, not quoted text]
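
One way the delta body could be computed is sketched here. It assumes replies quote the base message with a "> " prefix, and it uses the standard-library `difflib` to find the quoted runs. The instruction tuples (`"copy"`, `"insert"`) and the function names are hypothetical; the proposal does not define a wire format for the delta.

```python
from difflib import SequenceMatcher


def quote(lines):
    """Render base lines the way a reply would quote them (assumed "> " style)."""
    return ["> " + line for line in lines]


def encode_delta(base_lines, reply_lines):
    """Describe the reply as copies from the quoted base plus new lines."""
    reference = quote(base_lines)
    matcher = SequenceMatcher(a=reference, b=reply_lines, autojunk=False)
    delta = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            delta.append(("copy", i1, i2))  # range of quoted base lines
        elif tag in ("replace", "insert"):
            delta.append(("insert", reply_lines[j1:j2]))  # genuinely new text
        # "delete": base lines the reply did not quote need no instruction
    return delta


def decode_delta(base_lines, delta):
    """Rebuild the full reply from the base message and the delta."""
    reference = quote(base_lines)
    out = []
    for op in delta:
        if op[0] == "copy":
            out.extend(reference[op[1]:op[2]])
        else:
            out.extend(op[1])
    return out


base = ["Can we ship on Friday?", "The build is green."]
reply = ["Yes, Friday works.", "", "> Can we ship on Friday?", "> The build is green."]
delta = encode_delta(base, reply)
print(delta)
print(decode_delta(base, delta) == reply)
```

Only the `"insert"` entries carry message text, so the transmitted size grows with the new content rather than with the length of the thread. The receiver needs the base message named in X-Thread-Base-Message-ID to decode.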

3. Attachment Deduplication

Content-Disposition: attachment; filename="report.pdf"
Content-SHA256: abc123...
Content-Location: https://msgs.global/attachments/abc123

[If recipient already has this hash, skip transfer]
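
The skip decision can be sketched as follows. How the sender learns which hashes the recipient already holds is not specified in this proposal, so `recipient_hashes` is a stand-in for that lookup, and `plan_transfer` is a hypothetical helper name.

```python
import hashlib


def content_sha256(data: bytes) -> str:
    """Hex SHA-256 digest, as carried in the proposed Content-SHA256 header."""
    return hashlib.sha256(data).hexdigest()


def plan_transfer(attachment: bytes, recipient_hashes: set) -> dict:
    """Send only the hash when the recipient is assumed to hold the content."""
    digest = content_sha256(attachment)
    if digest in recipient_hashes:
        return {"Content-SHA256": digest, "body": b"", "skipped": True}
    return {"Content-SHA256": digest, "body": attachment, "skipped": False}


report = b"%PDF-1.7 example bytes" * 1000
known = set()

first = plan_transfer(report, known)
known.add(first["Content-SHA256"])  # recipient stores it after the first delivery
second = plan_transfer(report, known)

print(first["skipped"], len(first["body"]))
print(second["skipped"], len(second["body"]))
```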

Expected Savings

Content Type                 Current Size   Optimized Size   Savings
Text thread (10 messages)    500 KB         120 KB           76%
Forwarded message            100 KB         5 KB             95%
Attachment (duplicate)       2 MB           50 bytes         99.9%
Average email                150 KB         60 KB            60%
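
The savings column is 1 minus the ratio of optimized to current size. A short check of the arithmetic is given here, assuming binary units (1 KB = 1024 bytes), which the source does not state. The sizes are the estimates from the table, not measurements.

```python
KB = 1024          # assumed unit size
MB = 1024 * KB


def savings(current_bytes: float, optimized_bytes: float) -> float:
    """Fraction of bytes saved relative to the current size."""
    return 1 - optimized_bytes / current_bytes


estimates = {
    "Text thread (10 messages)": (500 * KB, 120 * KB),
    "Forwarded message": (100 * KB, 5 * KB),
    "Attachment (duplicate)": (2 * MB, 50),
    "Average email": (150 * KB, 60 * KB),
}

for name, (current, optimized) in estimates.items():
    print(f"{name}: {savings(current, optimized):.2%}")
```

For the duplicate attachment, the computed value is slightly above the 99.9% listed, since 50 bytes is a very small fraction of 2 MB.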

Integration with msgs.global

Postfix

  • SMTP compression plugin
  • Attachment deduplication database
  • Delta encoder for common thread patterns

Storage

  • Content-addressed storage (hash-based)
  • Shared attachment pool
  • Reference counting
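
The three storage ideas fit together as in this in-memory sketch. The class name `AttachmentPool` and its methods are hypothetical, and a real deployment would need persistent storage and concurrency control, which are out of scope here.

```python
import hashlib


class AttachmentPool:
    """Content-addressed blob store shared across messages, with ref counts."""

    def __init__(self):
        self._blobs = {}  # sha256 hex digest -> bytes
        self._refs = {}   # sha256 hex digest -> number of referencing messages

    def put(self, data: bytes) -> str:
        """Store data once per unique content and count the new reference."""
        key = hashlib.sha256(data).hexdigest()
        if key not in self._blobs:
            self._blobs[key] = data
        self._refs[key] = self._refs.get(key, 0) + 1
        return key

    def get(self, key: str) -> bytes:
        return self._blobs[key]

    def release(self, key: str) -> None:
        """Drop one reference; delete the blob when nothing points to it."""
        self._refs[key] -= 1
        if self._refs[key] == 0:
            del self._refs[key]
            del self._blobs[key]

    def stored_bytes(self) -> int:
        return sum(len(blob) for blob in self._blobs.values())


pool = AttachmentPool()
key_a = pool.put(b"x" * 1000)  # first message carrying the attachment
key_b = pool.put(b"x" * 1000)  # forwarded copy, same content
print(key_a == key_b, pool.stored_bytes())
```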

API

POST /api/v1/messages/optimize - Analyze optimization potential
GET  /api/v1/stats/bandwidth   - Bandwidth savings dashboard

Challenges

  • Backward compatibility (fallback to uncompressed)
  • CPU overhead for compression
  • Storage for deduplication database
  • Thread detection accuracy

Status

📋 Research Phase

Next Steps

  1. Benchmark compression ratios on real email corpus
  2. Prototype SMTP COMPRESS extension
  3. Design attachment deduplication schema
  4. Measure CPU vs bandwidth tradeoff
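
Steps 1 and 4 could start from a harness like the one below. It is a sketch only: the payload is a synthetic quoted thread rather than a real email corpus, and it covers `zlib` (the DEFLATE implementation behind gzip) because Brotli is not in the Python standard library. The function name `benchmark` and the result fields are illustrative.

```python
import time
import zlib


def benchmark(payload: bytes, levels=(1, 6, 9)):
    """Measure compressed/original size ratio and wall time per zlib level."""
    results = []
    for level in levels:
        start = time.perf_counter()
        compressed = zlib.compress(payload, level)
        elapsed = time.perf_counter() - start
        results.append({
            "level": level,
            "ratio": len(compressed) / len(payload),
            "seconds": elapsed,
        })
    return results


# Synthetic stand-in for a thread: the same line quoted at increasing depth.
thread = b"".join(
    b"> " * depth + b"Status update for the weekly sync.\r\n"
    for depth in range(10)
) * 200

for row in benchmark(thread):
    print(f"level {row['level']}: ratio {row['ratio']:.3f}, {row['seconds'] * 1000:.2f} ms")
```

Timings from a single run are noisy. Repeating each level several times and reporting the median would give a fairer CPU-versus-bandwidth comparison.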