AWS Deployment
This guide walks through deploying Datafly Signal on Amazon Web Services using EKS, MSK, ElastiCache, and RDS.
Prerequisites
Install the following tools on your workstation:
| Tool | Version | Purpose |
|---|---|---|
| AWS CLI | v2+ | AWS account access |
| eksctl | v0.170+ | EKS cluster creation |
| kubectl | v1.28+ | Kubernetes management |
| Helm | v3.14+ | Chart installation |
Ensure your AWS CLI is configured with credentials that have permissions to create EKS clusters, MSK clusters, ElastiCache, RDS instances, and IAM roles.
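Before starting, it can help to confirm that all four tools are on your PATH. The following is a minimal sketch: the function name `missing_tools` is illustrative, and it checks only that each command exists, not that it meets the minimum version in the table.

```shell
#!/bin/sh
# Print the name of every given command that is not found on PATH.
# missing_tools is a hypothetical helper name, not part of any Datafly tooling.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Check the four tools from the prerequisites table.
missing=$(missing_tools aws eksctl kubectl helm)
if [ -n "$missing" ]; then
  # $missing is left unquoted on purpose so the names print on one line.
  echo "Missing tools:" $missing
else
  echo "All required tools found."
fi
```

Version checks are left out deliberately, because each tool reports its version in a different format.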
Architecture
                Internet
                    │
             ┌──────▼──────┐
             │   AWS ALB   │
             │  (Ingress)  │
             └──────┬──────┘
                    │
           ┌────────▼────────┐
           │   Amazon EKS    │
           │ ┌────────────┐  │
           │ │  Datafly   │  │
           │ │  Services  │  │
           │ └──────┬─────┘  │
           └────────┼────────┘
       ┌────────────┼───────────┐
       │            │           │
┌──────▼──────┐ ┌───▼───┐ ┌─────▼─────┐
│ Amazon MSK  │ │  RDS  │ │ElastiCache│
│   (Kafka)   │ │(PgSQL)│ │  (Redis)  │
└─────────────┘ └───────┘ └───────────┘
Step 1: Create the EKS Cluster
Choose a node instance type based on your sizing tier:
| Tier | Instance Type | Node Count | vCPU/Node | Memory/Node |
|---|---|---|---|---|
| Small | t3.xlarge | 3 | 4 | 16 GB |
| Medium | m5.xlarge | 3 | 4 | 16 GB |
| Large | m5.2xlarge | 5 | 8 | 32 GB |
| XL | m5.4xlarge | 8 | 16 | 64 GB |
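The tier table can be encoded as a small lookup so that the instance type and node count stay consistent with your chosen tier. This is a sketch only: `tier_flags` is a hypothetical helper, and the values are copied from the table above.

```shell
#!/bin/sh
# Map a sizing tier to eksctl node flags, using the values from the tier table.
# tier_flags is a hypothetical helper; adjust it if your sizing differs.
tier_flags() {
  case "$1" in
    small)  echo "--node-type t3.xlarge --nodes 3" ;;
    medium) echo "--node-type m5.xlarge --nodes 3" ;;
    large)  echo "--node-type m5.2xlarge --nodes 5" ;;
    xl)     echo "--node-type m5.4xlarge --nodes 8" ;;
    *)      echo "unknown tier: $1" >&2; return 1 ;;
  esac
}

tier_flags medium
```

The output can be spliced into the cluster creation command with an unquoted `$(tier_flags medium)`. The `--nodes-min` and `--nodes-max` values would still need to be chosen to match the tier.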
eksctl create cluster \
--name datafly-cluster \
--region eu-west-1 \
--version 1.30 \
--nodegroup-name datafly-nodes \
--node-type m5.xlarge \
--nodes 3 \
--nodes-min 3 \
--nodes-max 6 \
--managed
Verify the cluster is ready:
kubectl get nodes
Install the AWS Load Balancer Controller
# Install the AWS Load Balancer Controller
helm repo add eks https://aws.github.io/eks-charts
helm install aws-load-balancer-controller eks/aws-load-balancer-controller \
--namespace kube-system \
--set clusterName=datafly-cluster \
--set serviceAccount.create=true \
--set serviceAccount.name=aws-load-balancer-controller
The ALB Controller requires an IAM policy. Follow the AWS documentation for the full IAM setup.
Step 2: Provision Managed Services
Amazon MSK (Kafka)
Create an MSK cluster in the same VPC as your EKS cluster:
aws kafka create-cluster \
--cluster-name datafly-kafka \
--kafka-version 3.6.0 \
--number-of-broker-nodes 2 \
--broker-node-group-info \
InstanceType=kafka.m5.large,\
ClientSubnets=subnet-xxx,subnet-yyy,\
SecurityGroups=sg-kafka \
--encryption-info '{"EncryptionInTransit":{"ClientBroker":"TLS"}}'
Record the bootstrap brokers endpoint:
aws kafka get-bootstrap-brokers --cluster-arn arn:aws:kafka:...
Amazon ElastiCache (Redis)
aws elasticache create-replication-group \
--replication-group-id datafly-redis \
--replication-group-description "Datafly Signal Redis" \
--engine redis \
--engine-version 7.1 \
--cache-node-type cache.t3.medium \
--num-cache-clusters 1 \
--security-group-ids sg-redis \
--cache-subnet-group-name datafly-subnet-group \
--transit-encryption-enabled \
--auth-token "your-redis-auth-token"
Amazon RDS (PostgreSQL)
aws rds create-db-instance \
--db-instance-identifier datafly-postgres \
--db-instance-class db.t3.medium \
--engine postgres \
--engine-version 16.4 \
--master-username datafly \
--master-user-password "your-db-password" \
--allocated-storage 20 \
--storage-type gp3 \
--vpc-security-group-ids sg-postgres \
--db-subnet-group-name datafly-subnet-group \
--storage-encrypted \
--backup-retention-period 7 \
--db-name datafly
Ensure the MSK, ElastiCache, and RDS security groups allow inbound traffic from the EKS node security group. All services must be in the same VPC (or peered VPCs).
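One way to confirm the security group rules is to attempt a TCP connection to each managed service from inside the VPC, for example from a debug pod on the cluster. The sketch below assumes `nc` is available where it runs and that your variant supports `-z` and `-w`. The helper names are hypothetical, and the hostnames in the example call are the placeholders used elsewhere in this guide.

```shell
#!/bin/sh
# Split a "host:port" endpoint into its two parts (POSIX parameter expansion).
endpoint_host() { printf '%s\n' "${1%:*}"; }
endpoint_port() { printf '%s\n' "${1##*:}"; }

# Try a TCP connection to each endpoint and report the ones that fail.
# Assumes `nc` is installed where this runs (e.g. a debug pod in the cluster).
check_endpoints() {
  failed=0
  for ep in "$@"; do
    if nc -z -w 3 "$(endpoint_host "$ep")" "$(endpoint_port "$ep")"; then
      echo "ok: $ep"
    else
      echo "unreachable: $ep"
      failed=1
    fi
  done
  return "$failed"
}

# Example call (placeholder hostnames; substitute your own endpoints):
# check_endpoints \
#   b-1.datafly-kafka.xxx.kafka.eu-west-1.amazonaws.com:9094 \
#   datafly-redis.xxx.cache.amazonaws.com:6379 \
#   datafly-postgres.xxx.rds.amazonaws.com:5432
```

A successful connection shows only that the port is reachable; it does not validate TLS or credentials.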
Step 3: Configure Secrets
Option A: AWS Secrets Manager (Recommended)
Store connection strings in AWS Secrets Manager:
aws secretsmanager create-secret \
--name datafly/prod/database-url \
--secret-string "postgresql://datafly:password@datafly-postgres.xxx.rds.amazonaws.com:5432/datafly?sslmode=require"
aws secretsmanager create-secret \
--name datafly/prod/jwt-secret \
--secret-string "$(openssl rand -hex 32)"
aws secretsmanager create-secret \
--name datafly/prod/encryption-key \
--secret-string "$(openssl rand -hex 32)"
aws secretsmanager create-secret \
--name datafly/prod/hmac-secret \
--secret-string "$(openssl rand -hex 32)"
aws secretsmanager create-secret \
--name datafly/prod/licence-key \
--secret-string "lic_your_licence_key"
Install the External Secrets Operator:
helm repo add external-secrets https://charts.external-secrets.io
helm install external-secrets external-secrets/external-secrets \
--namespace external-secrets --create-namespace
Then create a ClusterSecretStore for AWS:
apiVersion: external-secrets.io/v1beta1
kind: ClusterSecretStore
metadata:
  name: aws-secrets-manager
spec:
  provider:
    aws:
      service: SecretsManager
      region: eu-west-1
      auth:
        jwt:
          serviceAccountRef:
            name: external-secrets-sa
            namespace: external-secrets
Option B: Kubernetes Secrets (Simpler)
kubectl create namespace datafly
kubectl create secret generic datafly-secrets \
--namespace datafly \
--from-literal=DATABASE_URL="postgresql://datafly:password@datafly-postgres.xxx.rds.amazonaws.com:5432/datafly?sslmode=require" \
--from-literal=REDIS_URL="rediss://:auth-token@datafly-redis.xxx.cache.amazonaws.com:6379" \
--from-literal=KAFKA_BROKERS="b-1.datafly-kafka.xxx.kafka.eu-west-1.amazonaws.com:9094" \
--from-literal=JWT_SECRET="$(openssl rand -hex 32)" \
--from-literal=ENCRYPTION_KEY="$(openssl rand -hex 32)" \
--from-literal=HMAC_SECRET="$(openssl rand -hex 32)" \
--from-literal=DATAFLY_LICENCE_KEY="lic_your_licence_key"
kubectl create configmap datafly-config \
--namespace datafly \
--from-literal=ENVIRONMENT="prod" \
--from-literal=LOG_LEVEL="info"
Step 4: Configure DNS and TLS
Route 53 DNS
Create DNS records pointing to the ALB:
# After Helm install, get the ALB address:
kubectl get ingress -n datafly
# Create Route 53 records (A record alias to ALB):
# data.yourdomain.com → ALB
# app.yourdomain.com → ALB
# api.yourdomain.com → ALB
TLS with cert-manager
Install cert-manager:
helm repo add jetstack https://charts.jetstack.io
helm install cert-manager jetstack/cert-manager \
--namespace cert-manager --create-namespace \
--set crds.enabled=true
Step 5: Install Datafly Signal
Use the AWS example values file as a starting point:
# Download the example values file
curl -O https://raw.githubusercontent.com/datafly/signal/main/deployments/helm/datafly/values-aws.yaml
# Edit values-aws.yaml to set your domain, secrets, and sizing tier
# Then install:
helm install datafly oci://ghcr.io/datafly/charts/datafly \
--namespace datafly --create-namespace \
--values values-aws.yaml \
--set licenceKey=lic_your_licence_key
Key values to customise in values-aws.yaml:
ingress:
  hosts:
    - host: data.yourdomain.com  # Your data collection subdomain
      paths:
        - path: /v1
          pathType: Prefix
          service: ingestion-gateway
          port: 8080
        - path: /d.js
          pathType: Exact
          service: ingestion-gateway
          port: 8080
    - host: app.yourdomain.com  # Management UI
      paths:
        - path: /
          pathType: Prefix
          service: management-ui
          port: 3000
    - host: api.yourdomain.com  # Management API
      paths:
        - path: /
          pathType: Prefix
          service: management-api
          port: 8083
  tls:
    - secretName: datafly-tls
      hosts:
        - data.yourdomain.com
        - app.yourdomain.com
        - api.yourdomain.com
externalSecrets:
  enabled: true
  provider: aws
  secretStore: aws-secrets-manager
  keys:
    databaseUrl: datafly/prod/database-url
    jwtSecret: datafly/prod/jwt-secret
    encryptionKey: datafly/prod/encryption-key
    hmacSecret: datafly/prod/hmac-secret
    licenceKey: datafly/prod/licence-key
Step 6: Verify the Deployment
Check pod status
kubectl get pods -n datafly
All pods should show Running with 1/1 ready.
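With many pods, scanning the list by eye is error-prone. The sketch below filters the pod listing down to the pods that are not yet healthy. It assumes the default `kubectl get pods` column layout (NAME, READY, STATUS, ...). The function name `not_ready_pods` and the sample pod names are made up for illustration.

```shell
#!/bin/sh
# Read `kubectl get pods --no-headers` output on stdin and print the name of
# every pod that is not Running or whose READY count is incomplete.
# Columns assumed: NAME READY STATUS RESTARTS AGE (the default kubectl layout).
not_ready_pods() {
  awk '{
    split($2, ready, "/")
    if ($3 != "Running" || ready[1] != ready[2]) print $1
  }'
}

# Sample input for illustration; prints: event-processor-def
not_ready_pods <<'EOF'
ingestion-gateway-abc   1/1   Running            0   5m
event-processor-def     0/1   CrashLoopBackOff   3   5m
EOF
```

Against a live cluster, the same filter would be fed with `kubectl get pods -n datafly --no-headers | not_ready_pods`. Empty output means every pod is Running and fully ready.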
Check the ingress
kubectl get ingress -n datafly
The ADDRESS column should show the ALB DNS name.
Test event ingestion
curl -X POST https://data.yourdomain.com/v1/t \
-H "Content-Type: application/json" \
-d '{"type":"track","event":"Test Event","properties":{"source":"deployment-test"}}'
Access the Management UI
Open https://app.yourdomain.com in your browser. Log in with the default admin credentials from the seed data, or create a new admin user via the Management API.
Check logs
kubectl logs -n datafly -l app.kubernetes.io/name=ingestion-gateway --tail=50
kubectl logs -n datafly -l app.kubernetes.io/name=event-processor --tail=50
Cost Estimate (Small Tier)
| Service | Instance | Monthly Cost |
|---|---|---|
| EKS Control Plane | — | ~$73 |
| EC2 Nodes (3x m5.xlarge) | On-Demand | ~$420 |
| MSK (2 brokers, kafka.m5.large) | — | ~$260 |
| ElastiCache (cache.t3.medium) | — | ~$50 |
| RDS (db.t3.medium, 20 GB) | — | ~$55 |
| ALB | — | ~$25 |
| Total | | ~$883/mo |
These are approximate on-demand costs for eu-west-1. Reserved instances and savings plans can reduce costs by 30-60%. Use the Sizing Calculator for detailed estimates by tier.
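The total in the table is the plain sum of the line items, which is easy to re-check when you substitute your own figures. The sketch below uses hypothetical helper names. The discount helper is a rough illustration only, since reserved instances and savings plans apply per service rather than to the whole bill.

```shell
#!/bin/sh
# Sum a list of whole-dollar monthly costs.
sum_costs() {
  total=0
  for cost in "$@"; do
    total=$((total + cost))
  done
  echo "$total"
}

# Apply a percentage discount using integer arithmetic (rounds down).
apply_discount() {
  echo $(( $1 * (100 - $2) / 100 ))
}

# Line items from the table: EKS, EC2, MSK, ElastiCache, RDS, ALB.
total=$(sum_costs 73 420 260 50 55 25)
echo "On-demand total: ~\$${total}/mo"
```

For example, `apply_discount 420 30` prints 294, the EC2 line item with a 30% saving applied.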
Next Steps
- Set up Observability for Prometheus metrics and Grafana dashboards
- Review Upgrades for the upgrade process
- Configure Backup & DR for disaster recovery
- Check Troubleshooting if you encounter issues