Key Rotation Runbook

This runbook covers the complete procedure for rotating the Datafly Signal encryption key. Follow these steps in order.

Warning: Key rotation requires coordination across all Signal services. Plan a maintenance window if you are not comfortable with rolling deployments.

Prerequisites

  • Admin access to the Management API (OrgAdmin role)
  • Ability to set environment variables on all Signal services
  • Access to perform rolling restarts (Helm upgrade, Coolify redeploy, etc.)

Generate a New Key

Generate a cryptographically secure 32-byte AES-256 key:

# Using OpenSSL
openssl rand -hex 32
 
# Or using Signal's built-in generator (if you have the binary)
signal-cli generate-key

Save the output — this is your new ENCRYPTION_KEY.
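Before storing the key, it can help to confirm its format. The following is a minimal sketch, assuming Signal expects exactly 64 lowercase hexadecimal characters (the form that openssl rand -hex 32 produces). The helper name is_valid_key is hypothetical and not part of Signal.

```shell
# is_valid_key <key>: succeed only if the argument is exactly 64 characters
# long and contains nothing but 0-9 and a-f.
# Assumption: lowercase hex only, matching the output of openssl rand -hex 32.
is_valid_key() {
  [ "${#1}" -eq 64 ] || return 1
  case "$1" in
    *[!0-9a-f]*) return 1 ;;
  esac
  return 0
}

# Example usage:
#   NEW_KEY="$(openssl rand -hex 32)"
#   is_valid_key "$NEW_KEY" && echo "key format OK"
```

The check uses only shell built-ins, so the key never appears in another process's argument list.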

Rotation Steps

Deploy dual-key configuration

Set both the new and old keys on all Signal services:

ENCRYPTION_KEY=<new-64-char-hex-key>
ENCRYPTION_KEY_PREVIOUS=<old-64-char-hex-key>

Rolling restart

Perform a rolling restart of all services. The order does not matter — each service handles dual-key mode independently.

Helm:

helm upgrade datafly ./charts/datafly \
  --set encryption.key="$NEW_KEY" \
  --set encryption.previousKey="$OLD_KEY"

Coolify: Update the environment variables in each resource and click Redeploy.

Verify dual-key mode

Check the crypto health endpoint on each service:

curl https://collect.example.com/health/crypto
curl https://api.example.com/health/crypto

Verify the response shows:

{
  "status": "healthy",
  "dual_key_mode": true,
  "checks": {
    "key_valid": true,
    "round_trip": true,
    "previous_key_valid": true
  }
}
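The check above can be scripted across services. This is a rough sketch that assumes the response has the shape shown above and uses plain text matching rather than a JSON parser. The helper name crypto_health_ok is hypothetical.

```shell
# crypto_health_ok <true|false>: read a /health/crypto response body on stdin
# and succeed only if status is "healthy" and dual_key_mode equals the argument.
# Naive text matching; swap in a JSON parser if one is available.
crypto_health_ok() {
  _body="$(cat)"
  printf '%s\n' "$_body" | grep -Eq '"status"[[:space:]]*:[[:space:]]*"healthy"' &&
  printf '%s\n' "$_body" | grep -Eq "\"dual_key_mode\"[[:space:]]*:[[:space:]]*$1([^[:alnum:]_]|\$)"
}

# Example usage (hosts taken from the curl commands above):
#   for host in collect.example.com api.example.com; do
#     curl -fsS "https://$host/health/crypto" | crypto_health_ok true ||
#       echo "$host: not healthy or not in dual-key mode" >&2
#   done
```

Passing false instead of true gives the equivalent check for single-key mode after the old key has been removed.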

Re-encrypt existing data

Trigger background re-encryption of all pipeline secrets and identity traits:

curl -X POST https://api.example.com/v1/admin/crypto/re-encrypt \
  -H "Authorization: Bearer <admin-token>"

Monitor re-encryption progress

Poll the status endpoint until completion:

curl https://api.example.com/v1/admin/crypto/re-encrypt \
  -H "Authorization: Bearer <admin-token>"

Expected response when complete:

{
  "status": "completed",
  "total_records": 150,
  "processed_records": 150,
  "failed_records": 0,
  "started_at": "2026-04-06T14:30:00Z",
  "completed_at": "2026-04-06T14:30:12Z"
}

Important: If failed_records is greater than 0, investigate the management-api logs before proceeding. Failed records remain encrypted with the old key.
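Polling can be automated with a small loop. The sketch below assumes the status response has the fields shown above and that "completed" is the terminal status; any other terminal status is not documented here, so the loop also gives up after a fixed number of attempts. The names json_field and wait_for_reencrypt are hypothetical helpers.

```shell
# json_field <name>: print a top-level string or number field from a small
# JSON document on stdin. Naive text matching, not a full JSON parser.
json_field() {
  sed -n "s/.*\"$1\"[[:space:]]*:[[:space:]]*\"\{0,1\}\([^\",}]*\)\"\{0,1\}.*/\1/p" | head -n 1
}

# wait_for_reencrypt <base-url> <admin-token> [seconds-between-polls]
# Returns 0 when the job completed with zero failures, 1 when it completed
# with failures, 2 when a request failed, 3 when it gave up after 120 polls.
wait_for_reencrypt() {
  attempts=0
  while [ "$attempts" -lt 120 ]; do
    body="$(curl -fsS "$1/v1/admin/crypto/re-encrypt" \
      -H "Authorization: Bearer $2")" || return 2
    status="$(printf '%s\n' "$body" | json_field status)"
    failed="$(printf '%s\n' "$body" | json_field failed_records)"
    echo "status=$status failed_records=$failed"
    if [ "$status" = "completed" ]; then
      [ "$failed" = "0" ] && return 0
      return 1
    fi
    attempts=$((attempts + 1))
    sleep "${3:-5}"
  done
  return 3
}

# Example usage:
#   wait_for_reencrypt https://api.example.com "<admin-token>" 5 ||
#     echo "do not remove ENCRYPTION_KEY_PREVIOUS yet" >&2
```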

Remove the old key

Once re-encryption is complete with zero failures, remove ENCRYPTION_KEY_PREVIOUS from all services:

ENCRYPTION_KEY=<new-key>
# ENCRYPTION_KEY_PREVIOUS removed

Final rolling restart

Perform a final rolling restart. Services return to single-key mode.

Verify completion

Confirm that the crypto health endpoint on each service now reports dual_key_mode: false:

curl https://collect.example.com/health/crypto
curl https://api.example.com/health/crypto

Cloud KMS Key Rotation

If you use a cloud KMS provider (GCP, AWS, Azure), key rotation follows the same rotate-then-re-encrypt procedure, but the key wrapping happens in the cloud:

GCP Cloud KMS

# Create a new key version
gcloud kms keys versions create \
  --key=signal-dek \
  --keyring=signal \
  --location=global
 
# Set the new version as primary
gcloud kms keys update signal-dek \
  --keyring=signal \
  --location=global \
  --primary-version=<new-version-number>

GCP KMS automatically handles decryption with any key version, so ENCRYPTION_KEY_PREVIOUS is not needed. However, you should still run the re-encryption job to ensure all data is encrypted under the latest key version.
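Before running the re-encryption job, you may want to confirm which version is primary. The sketch below reuses the key, keyring, and location names from the commands above and reads the primary version's resource name from gcloud kms keys describe. The helper version_from_name is a hypothetical name for illustration.

```shell
# version_from_name <resource-name>: print the text after the last "/" of a
# key version resource name, for example
#   projects/P/locations/global/keyRings/signal/cryptoKeys/signal-dek/cryptoKeyVersions/3
# yields 3.
version_from_name() {
  printf '%s\n' "${1##*/}"
}

# Example usage (requires gcloud and read access to the key):
#   name="$(gcloud kms keys describe signal-dek \
#     --keyring=signal --location=global --format='value(primary.name)')"
#   echo "primary version: $(version_from_name "$name")"
```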

AWS KMS

AWS KMS supports automatic annual key rotation:

aws kms enable-key-rotation --key-id <key-arn>

With automatic rotation enabled, AWS creates new key material annually. Decryption of old data is handled transparently. Run the re-encryption job periodically to migrate data to the latest key material.

Troubleshooting

Service fails to start after key change

The most common cause is a typo in the new key. Verify:

  • The key is exactly 64 hexadecimal characters
  • The key is the same across all services
  • ENCRYPTION_KEY_PREVIOUS is set to the old key (not the new one)
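The last two checks can be done without printing the keys themselves. This is a sketch only: it compares short SHA-256 fingerprints and assumes sha256sum (or shasum as a fallback) is installed. The names key_fingerprint and keys_distinct are hypothetical helpers.

```shell
# key_fingerprint <key>: print the first 12 hex characters of the key's
# SHA-256 digest, so values can be compared across services safely.
key_fingerprint() {
  if command -v sha256sum >/dev/null 2>&1; then
    printf '%s' "$1" | sha256sum | cut -c1-12
  else
    printf '%s' "$1" | shasum -a 256 | cut -c1-12
  fi
}

# keys_distinct <new> <previous>: fail if both arguments hold the same key.
keys_distinct() {
  [ "$1" != "$2" ]
}

# Example usage, run in each service's environment:
#   key_fingerprint "$ENCRYPTION_KEY"
#   keys_distinct "$ENCRYPTION_KEY" "$ENCRYPTION_KEY_PREVIOUS" ||
#     echo "ENCRYPTION_KEY_PREVIOUS is the same as ENCRYPTION_KEY" >&2
```

If every service prints the same fingerprint for ENCRYPTION_KEY, the key is consistent across the deployment.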

Re-encryption reports failures

Check the management-api logs for the specific error. Common causes:

  • Corrupted ciphertext: A database record was modified outside of Signal
  • Missing key: The ENCRYPTION_KEY_PREVIOUS doesn’t match the key that encrypted the record
  • Database timeout: Increase DB_MAX_CONNS if re-encrypting a large number of records

Crypto health check fails

Verify the key is valid:

# This should print 64
printf '%s' "$ENCRYPTION_KEY" | wc -c
# This should print nothing if every character is hex (0-9, a-f)
printf '%s' "$ENCRYPTION_KEY" | tr -d '0-9a-f'

If using a KMS provider, check that the service has network access to the KMS endpoint and valid credentials.

Compliance Reference

Standard                  | Requirement                                      | How Signal Addresses It
PCI-DSS 4.0 (Req 3.6-3.7) | Rotate crypto keys annually; document procedures | This runbook + annual rotation schedule
SOC 2 (CC6.1)             | Logical access controls for crypto keys          | KMS integration + RBAC on re-encrypt endpoint
ISO 27001 (A.10.1.2)      | Key management policy                            | KMS providers + audit logging of all key operations
FCA/PRA                   | Follow NCSC guidance on key management           | AES-256-GCM + envelope encryption + key rotation