Overview
End-to-end encrypted email with metadata protection: even msgs.global cannot see who sends to whom, what the subject line says, or exactly when a message was sent.
Problem Statement
Current encrypted email (S/MIME, PGP) protects content but exposes:
- Sender and recipient addresses (To/From headers)
- Subject lines
- Message timestamps
- Message size
- Social graph (who communicates with whom)
- Access patterns (when you check email)
This metadata is often more valuable than content for surveillance.
Vision
Traditional E2EE:
✅ Content encrypted
❌ Metadata visible: alice@msgs.global -> bob@company.com, "Q4 Acquisition Plan"
Zero-Knowledge:
✅ Content encrypted
✅ Recipient hidden (onion routing)
✅ Subject encrypted
✅ Timing obfuscated (mix network)
✅ No social graph leakage
Architecture
1. Onion Routing for Delivery
Alice -> [Relay 1] -> [Relay 2] -> [Relay 3] -> Bob
Each hop only knows:
- Relay 1: Received from Alice, forward to Relay 2 (doesn't know final destination)
- Relay 2: Received from Relay 1, forward to Relay 3 (doesn't know source or final dest)
- Relay 3: Received from Relay 2, deliver to Bob (doesn't know source)
2. Encrypted Message Envelope
{
  "version": "zk-delivery-1.0",
  "layers": [
    {
      "recipient": "relay1@msgs.global",
      "encrypted_for": "relay1-public-key",
      "payload": "encrypted-layer-1..."   // Contains next hop
    }
  ],
  "decoy_traffic": true,                  // Mix with dummy messages
  "timestamp_obfuscation": "+/- 5min",    // Fuzzy timing
  "size_padding": 1024                    // Pad to fixed size
}
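The `size_padding` field can be read as padding every message up to a fixed bucket before it is encrypted, so that ciphertext lengths do not distinguish messages. A minimal sketch, assuming the 1024-byte bucket from the envelope and a hypothetical 4-byte length prefix so the recipient can strip the padding; `pad_to_bucket` and `unpad` are illustrative names, not an existing msgs.global API.

```python
import struct

BUCKET_SIZE = 1024  # bytes; taken from "size_padding" in the envelope


def pad_to_bucket(plaintext: bytes, bucket: int = BUCKET_SIZE) -> bytes:
    """Prefix the length, then zero-pad up to the next multiple of `bucket`."""
    framed = struct.pack(">I", len(plaintext)) + plaintext
    padded_len = -(-len(framed) // bucket) * bucket  # ceiling to a bucket multiple
    return framed.ljust(padded_len, b"\x00")


def unpad(padded: bytes) -> bytes:
    """Recover the original bytes using the 4-byte big-endian length prefix."""
    (length,) = struct.unpack(">I", padded[:4])
    return padded[4:4 + length]
```

Messages larger than one bucket spill into the next multiple, so an observer still learns the bucket count; a smaller set of coarse buckets leaks less at the cost of more bandwidth.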
3. Private Information Retrieval (PIR)
Bob checks for messages without revealing which mailbox he's checking:
Traditional: "Give me messages for bob@msgs.global" [server knows Bob is checking]
PIR: Bob downloads encrypted database chunk containing many mailboxes,
decrypts only his messages locally [server doesn't know which mailbox]
Technical Components
1. Mix Network
import random
import time


class MixNode:
    """Anonymizing relay node"""

    def __init__(self, private_key):
        self.private_key = private_key
        self.message_queue = []
        self.batch_size = 20        # Mix with 20 messages before forwarding
        self.batch_delay = 5 * 60   # 5 minute batching window (seconds)
        self.batch_started = None   # Arrival time of the oldest queued message

    def receive_message(self, encrypted_layer):
        """Queue a message; the node can open only its own layer, never the inner payload"""
        if not self.message_queue:
            self.batch_started = time.monotonic()
        self.message_queue.append(encrypted_layer)
        # Trigger batch when queue is full OR the batching window has elapsed
        # (a periodic timer is still needed to flush a queue that goes quiet)
        window_elapsed = time.monotonic() - self.batch_started >= self.batch_delay
        if len(self.message_queue) >= self.batch_size or window_elapsed:
            self.mix_and_forward()

    def mix_and_forward(self):
        """Mix messages and forward in random order"""
        # 1. Decrypt outer layer (only this hop's layer)
        decrypted = [self.decrypt_layer(msg) for msg in self.message_queue]
        # 2. Add dummy messages (pad the batch to a fixed size)
        while len(decrypted) < self.batch_size:
            decrypted.append(self.generate_dummy())
        # 3. Shuffle (hide message order); SystemRandom draws from the OS CSPRNG
        random.SystemRandom().shuffle(decrypted)
        # 4. Forward to next hops
        for msg in decrypted:
            self.forward_to_next_hop(msg)
        self.message_queue = []
        self.batch_started = None

    def decrypt_layer(self, encrypted_layer):
        """Decrypt outer layer to reveal next hop"""
        decrypted = self.private_key.decrypt(encrypted_layer)
        # Field names match the layers built by create_onion_message
        return {
            'next_hop': decrypted['next_hop'],
            'encrypted_payload': decrypted['encrypted_payload']
        }
2. Onion Encryption
def create_onion_message(message, route):
    """
    Encrypt message in layers (like Tor)
    route = [relay1, relay2, relay3, recipient]
    """
    # The recipient is the last route entry and is assumed to expose
    # public_key in the same way a relay does
    recipient = route[-1]
    payload = encrypt(message, recipient.public_key)  # Innermost: actual message
    # Wrap in layers from the inside out: the last relay's layer is built first
    # and the first relay's layer last, so each relay can open only its own layer
    for i in reversed(range(len(route) - 1)):
        layer = {
            'next_hop': route[i + 1],
            'encrypted_payload': payload
        }
        payload = encrypt(layer, route[i].public_key)
    return payload


def send_onion_message(message, recipient):
    """Send message through mix network"""
    # 1. Select random route (3 hops)
    route = select_random_route(num_hops=3)
    route.append(recipient)
    # 2. Create onion
    onion = create_onion_message(message, route)
    # 3. Send to first hop
    first_hop = route[0]
    send_to_mix_node(first_hop, onion)
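To show the layering end to end with only the standard library, the sketch below swaps the public-key `encrypt` for a toy XOR stream cipher keyed per hop. The cipher is not secure and the shared symmetric keys are a stand-in for each hop's key pair; `toy_encrypt`, `wrap`, and `peel` are hypothetical names used only to demonstrate that each relay learns the next hop and nothing else.

```python
import hashlib
import json


def _keystream(key: bytes, length: int) -> bytes:
    """Toy keystream: SHA-256 in counter mode. NOT secure; illustration only."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]


def toy_encrypt(data: bytes, key: bytes) -> bytes:
    """XOR with the keystream; applying it twice with the same key decrypts."""
    return bytes(a ^ b for a, b in zip(data, _keystream(key, len(data))))


toy_decrypt = toy_encrypt  # an XOR stream cipher is its own inverse


def wrap(message: bytes, hops: list) -> bytes:
    """hops = [(name, key), ...] ending with the recipient; returns the outer onion."""
    payload = toy_encrypt(message, hops[-1][1])  # innermost layer: recipient only
    for i in reversed(range(len(hops) - 1)):
        layer = json.dumps({"next_hop": hops[i + 1][0], "payload": payload.hex()})
        payload = toy_encrypt(layer.encode(), hops[i][1])
    return payload


def peel(onion: bytes, key: bytes) -> tuple:
    """What one relay does: open its own layer and learn only the next hop."""
    layer = json.loads(toy_decrypt(onion, key))
    return layer["next_hop"], bytes.fromhex(layer["payload"])


hops = [("relay1", b"k1"), ("relay2", b"k2"), ("relay3", b"k3"), ("bob", b"kb")]
onion = wrap(b"hello bob", hops)
for name, key in hops[:-1]:
    next_hop, onion = peel(onion, key)
    print(name, "forwards to", next_hop)
print(toy_decrypt(onion, hops[-1][1]))
```

Each layer here grows as it is wrapped, so onion size reveals the position in the route; a production format would pad every layer to a constant length.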
3. Private Mailbox Retrieval
class PrivateMailbox:
    """PIR-based mailbox access"""

    def __init__(self, server):
        self.server = server  # Handle to the PIR server

    def retrieve_messages_pir(self, mailbox_index):
        """
        Retrieve messages without revealing which mailbox
        Uses computational PIR (simpler than information-theoretic PIR)
        """
        # 1. Get database size
        db_size = self.get_database_size()
        # 2. Generate PIR query (hides which index)
        query = generate_pir_query(mailbox_index, db_size)
        # 3. Send query to server
        response = self.server.process_pir_query(query)
        # 4. Extract messages from response
        messages = decode_pir_response(response, mailbox_index)
        return messages


def generate_pir_query(target_index, db_size):
    """
    Generate query that hides which mailbox we're accessing
    Simplified computational PIR using homomorphic encryption
    """
    # Encrypt a selection vector: 1 at the target index, 0 everywhere else.
    # The server combines it with the database homomorphically and never
    # learns which entry holds the 1
    selection = [1 if i == target_index else 0 for i in range(db_size)]
    encrypted_query = [homomorphic_encrypt(bit) for bit in selection]
    return encrypted_query
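To make the PIR idea concrete without a homomorphic-encryption library, here is a toy two-server XOR scheme. Note that this is information-theoretic PIR, a different technique from the computational PIR named above, and it assumes two non-colluding servers holding identical copies of equally sized mailbox records. All function names are illustrative.

```python
import secrets


def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def make_queries(target_index: int, db_size: int):
    """Two index sets that differ only at target_index; each alone looks random."""
    q1 = {i for i in range(db_size) if secrets.randbits(1)}
    q2 = q1 ^ {target_index}  # symmetric difference flips the target's membership
    return q1, q2


def answer(database: list, query: set) -> bytes:
    """Server side: XOR together every record the query selects."""
    result = bytes(len(database[0]))
    for i in query:
        result = xor_bytes(result, database[i])
    return result


def retrieve(database: list, target_index: int) -> bytes:
    """Client side: XOR of the two answers leaves only the target record."""
    q1, q2 = make_queries(target_index, len(database))
    # In a real deployment each query goes to a different, non-colluding server
    return xor_bytes(answer(database, q1), answer(database, q2))
```

Every record selected by both queries cancels under XOR, so only the target survives. Each server sees a uniformly random subset of indices and learns nothing about the target unless the two servers compare queries.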
4. Metadata-Hiding Headers
Traditional headers:
From: alice@msgs.global
To: bob@company.com
Subject: Q4 Acquisition Discussion
Date: 2026-03-07 20:00:00
Zero-knowledge headers:
From: [encrypted-sender]
To: [encrypted-recipient]
Subject: [encrypted-subject]
Date: [obfuscated-timestamp] +/- 5min
Message-ID: [random-id]
Size: [padded-to-fixed-bucket]
All metadata encrypted, only visible to final recipient
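The `+/- 5min` obfuscation on the Date header can be sketched as adding a uniformly random offset before the timestamp is written. The function name and the choice of `secrets` as the randomness source are assumptions for illustration.

```python
import secrets
from datetime import datetime, timedelta, timezone

JITTER_SECONDS = 5 * 60  # the +/- 5 minute window from the header example


def obfuscate_timestamp(ts: datetime, jitter: int = JITTER_SECONDS) -> datetime:
    """Shift ts by a uniform random offset in [-jitter, +jitter] seconds."""
    offset = secrets.randbelow(2 * jitter + 1) - jitter
    return ts + timedelta(seconds=offset)


sent = datetime(2026, 3, 7, 20, 0, 0, tzinfo=timezone.utc)
print(obfuscate_timestamp(sent))
```

Fuzzing the header only hides the time the sender claims; the time a message actually crosses the network is what batching in the mix nodes addresses.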
Integration with msgs.global
Database Schema
-- Mix network nodes
CREATE TABLE mix_nodes (
    id SERIAL PRIMARY KEY,
    node_id VARCHAR(64) UNIQUE,
    public_key TEXT,
    address VARCHAR(255),
    reputation_score FLOAT,
    active BOOLEAN DEFAULT TRUE
);

-- Anonymous message queue
CREATE TABLE anon_message_queue (
    id SERIAL PRIMARY KEY,
    encrypted_envelope BYTEA,              -- Onion-encrypted
    received_at TIMESTAMP DEFAULT NOW(),
    forward_after TIMESTAMP,               -- Batching delay
    processed BOOLEAN DEFAULT FALSE
);

-- Private mailboxes (PIR database)
CREATE TABLE private_mailboxes (
    mailbox_index INTEGER PRIMARY KEY,     -- Public index (doesn't reveal identity)
    encrypted_messages BYTEA[],            -- Array of encrypted messages
    last_updated TIMESTAMP DEFAULT NOW()
);

-- User mailbox mapping (only user knows)
CREATE TABLE user_mailbox_secrets (
    user_id INTEGER REFERENCES users(id),
    mailbox_index INTEGER,                 -- Only user knows this mapping
    mailbox_key BYTEA,                     -- Key to decrypt mailbox
    UNIQUE(user_id, mailbox_index)
);
Sending Flow
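import random
import time

from flask import Flask, request

app = Flask(__name__)


@app.route('/api/v1/zk-send', methods=['POST'])
def send_zk_message():
    """Send zero-knowledge message"""
    data = request.json
    # NOTE: for msgs.global to stay blind to content and recipient, steps 1 to 3
    # have to run on the sender's device; they are shown server-side here only
    # to keep the whole flow in one place
    # 1. Encrypt message content (E2EE)
    encrypted_content = encrypt_message(
        data['message'],
        data['recipient_public_key']
    )
    # 2. Select random route through mix network; the recipient is the final
    #    entry, as in send_onion_message ('recipient' is an assumed request field)
    route = select_random_route(num_hops=3)
    route.append(data['recipient'])
    # 3. Create onion layers
    onion = create_onion_message(encrypted_content, route)
    # 4. Add to mix queue (batched delivery)
    add_to_mix_queue(onion, first_hop=route[0])
    return {'status': 'queued', 'estimated_delivery': '5-15 minutes'}


def process_mix_queue():
    """Background job: Process mix queue in batches"""
    while True:
        # Wait for batch size or timeout
        batch = wait_for_batch(size=20, timeout=300)  # 5 min
        # Shuffle batch (SystemRandom draws from the OS CSPRNG)
        random.SystemRandom().shuffle(batch)
        # Forward each message
        for msg in batch:
            forward_message(msg)
        time.sleep(60)  # Pause one minute before collecting the next batch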
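`wait_for_batch` is not defined in the flow above. One possible shape, assuming incoming messages arrive on a standard `queue.Queue`; the extra `source` parameter and the queue itself are assumptions, not part of the original call signature.

```python
import queue
import time


def wait_for_batch(source: queue.Queue, size: int = 20, timeout: float = 300.0) -> list:
    """Collect up to `size` items, returning early once `timeout` seconds have passed."""
    batch = []
    deadline = time.monotonic() + timeout
    while len(batch) < size:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break
        try:
            batch.append(source.get(timeout=remaining))
        except queue.Empty:
            break  # window closed before the batch filled
    return batch
```

A batch that comes back short would then be topped up with dummy messages, as `MixNode.mix_and_forward` does, so that the forwarded batch size does not reveal how much real traffic arrived.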
Receiving Flow
@app.route('/api/v1/zk-retrieve', methods=['POST'])
def retrieve_zk_messages():
    """Retrieve messages using PIR"""
    # NOTE: steps 1 and 2 have to run on the client in a real deployment; a server
    # that looks up the index or builds the query learns which mailbox is read
    user = request.current_user  # Assumed to be set by auth middleware (not built into Flask)
    # 1. Get user's private mailbox index
    mailbox_index = get_user_mailbox_index(user.id)
    # 2. Generate PIR query (generate_pir_query also needs the database size;
    #    get_database_size is an assumed helper)
    query = generate_pir_query(mailbox_index, get_database_size())
    # 3. Server processes query (doesn't learn which mailbox)
    response = process_pir_query(query)
    # 4. Client decrypts response locally
    # (happens client-side to preserve privacy)
    return response  # Encrypted database chunk


@app.route('/api/v1/zk-mailbox-db', methods=['GET'])
def get_mailbox_database():
    """
    Alternative to PIR: Download entire database (smaller systems)
    User downloads all mailboxes, decrypts only theirs locally
    """
    # Only feasible for small number of mailboxes (<10,000)
    mailboxes = db.query("SELECT * FROM private_mailboxes")
    return {
        'mailboxes': [
            {'index': m.mailbox_index, 'data': m.encrypted_messages}
            for m in mailboxes
        ]
    }
Privacy Guarantees
| Metadata | Visible to Sender | Visible to Recipient | Visible to msgs.global | Visible to Relay |
|---|---|---|---|---|
| Content | ✅ | ✅ | ❌ | ❌ |
| Recipient | ✅ | ✅ | ❌ | ❌ |
| Sender | ✅ | ✅ | ❌ | ❌ |
| Subject | ✅ | ✅ | ❌ | ❌ |
| Timestamp | ✅ | ✅ | ~Fuzzy (±5min) | ~Fuzzy |
| Message Size | ✅ | ✅ | ~Padded bucket | ~Padded |
| Social Graph | ✅ | ❌ | ❌ | ❌ |
Performance Characteristics
| Metric | Traditional | Zero-Knowledge |
|---|---|---|
| Delivery time | <1 second | 5-15 minutes |
| Bandwidth overhead | 1x | 3-5x (padding, dummy traffic) |
| Server CPU | Low | High (mix processing) |
| Client CPU | Low | Medium (PIR decryption) |
| Anonymity set | N/A | 20-100 messages per batch |
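The upper end of the delivery-time row can be related to the batching parameters used earlier (batch of 20, 5-minute window, 3 hops). A rough back-of-the-envelope sketch, assuming messages arrive at a steady rate and ignoring network transit and the extra pause in `process_mix_queue`; the function names are illustrative.

```python
def per_hop_delay(arrival_rate: float, batch_size: int = 20, window: float = 300.0) -> float:
    """Worst-case seconds a message waits at one mix node.

    A batch flushes when it fills (batch_size / arrival_rate seconds at a steady
    rate) or when the window expires, whichever comes first.
    """
    if arrival_rate <= 0:
        return window
    return min(batch_size / arrival_rate, window)


def route_delay(arrival_rate: float, hops: int = 3, **kwargs) -> float:
    """Worst-case end-to-end batching delay across all hops."""
    return hops * per_hop_delay(arrival_rate, **kwargs)


print(route_delay(0.01) / 60)  # quiet network: every hop waits out the window
print(route_delay(1.0))        # busy network: batches fill before the window ends
```

On a quiet network each hop waits out the full window, giving 3 × 5 min = 15 min, which matches the upper bound in the table. On a busy network batches fill sooner, so latency drops while the anonymity set per batch stays at the batch size.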
Threat Model
Protects Against:
- ✅ Passive surveillance (ISP, msgs.global)
- ✅ Traffic analysis by a local or partial observer
- ✅ Social graph mapping
- ✅ Timing attacks (batching)
- ✅ Size correlation (padding)
Does NOT Protect Against:
- ❌ Global passive adversary (can correlate all network traffic)
- ❌ Compromised recipient revealing sender
- ❌ Endpoint compromise (keylogger, etc.)
- ❌ Very long-term traffic analysis (statistical attacks)
Migration Path
Phase 1: Opt-In ZK Mailboxes (6 months)
- [ ] Build mix network infrastructure
- [ ] Create PIR mailbox system
- [ ] Beta test with privacy-focused users
Phase 2: Hybrid Mode (12 months)
- [ ] Users can choose per-message (normal vs ZK)
- [ ] Transparent fallback to traditional delivery
Phase 3: ZK by Default (18+ months)
- [ ] Zero-knowledge default for all messages
- [ ] Traditional SMTP for external domains only
Use Cases
- Whistleblowers: Confidential source communication
- Journalists: Protect sources
- Healthcare: HIPAA-compliant metadata protection
- Legal: Attorney-client privilege (metadata too)
- Activists: Censorship-resistant communication
- Privacy-conscious users: Everyone deserves privacy
Related Technologies
- Tor: Onion routing (inspiration)
- Nym Mixnet: Privacy infrastructure
- Signal Sealed Sender: Metadata protection in messaging
- Loopix: Mix network design
- PIR libraries: SealPIR, XPIR
Status
🔬 Research Phase (Advanced Cryptography Required)
Next Steps
- Literature review: Mix network designs, PIR protocols
- Threat model refinement
- Performance analysis (latency vs anonymity tradeoff)
- Prototype mix node
- PIR library integration testing
- User experience design (managing latency expectations)