# GCP Deployment
This guide walks through deploying Datafly Signal on Google Cloud Platform using GKE, Cloud SQL, Memorystore, and Confluent Cloud (or self-managed Kafka).
## Prerequisites
| Tool | Version | Purpose |
|---|---|---|
| gcloud CLI | Latest | GCP account access |
| kubectl | v1.28+ | Kubernetes management |
| Helm | v3.14+ | Chart installation |
Authenticate and set your default project and region:

```bash
gcloud auth login
gcloud config set project YOUR_PROJECT_ID
gcloud config set compute/region europe-west1
```

## Architecture
```
                 Internet
                    │
             ┌──────▼──────┐
             │  GCE/nginx  │
             │  (Ingress)  │
             └──────┬──────┘
                    │
           ┌────────▼────────┐
           │   Google GKE    │
           │  ┌────────────┐ │
           │  │  Datafly   │ │
           │  │  Services  │ │
           │  └──────┬─────┘ │
           └─────────┼───────┘
       ┌─────────────┼────────────┐
       │             │            │
┌──────▼──────┐  ┌───▼───┐  ┌─────▼─────┐
│  Confluent  │  │ Cloud │  │Memorystore│
│   Cloud /   │  │  SQL  │  │  (Redis)  │
│   Kafka     │  │(PgSQL)│  │           │
└─────────────┘  └───────┘  └───────────┘
```

## Step 1: Create the GKE Cluster
Choose a machine type based on your sizing tier:
| Tier | Machine Type | Node Count |
|---|---|---|
| Small | e2-standard-4 | 3 |
| Medium | e2-standard-4 | 3 |
| Large | e2-standard-8 | 5 |
| XL | e2-standard-16 | 8 |
```bash
gcloud container clusters create datafly-cluster \
  --region europe-west1 \
  --num-nodes 3 \
  --machine-type e2-standard-4 \
  --enable-autorepair \
  --enable-autoupgrade \
  --release-channel regular \
  --workload-pool=YOUR_PROJECT_ID.svc.id.goog
```

Note that for a regional cluster, `--num-nodes` is the node count *per zone*, so a three-zone region yields nine nodes in total. Use `--node-locations` to restrict zones if you want the tier's node count exactly.

Get credentials:

```bash
gcloud container clusters get-credentials datafly-cluster --region europe-west1
```

## Step 2: Provision Managed Services
### Cloud SQL for PostgreSQL
```bash
gcloud sql instances create datafly-postgres \
  --database-version=POSTGRES_16 \
  --tier=db-custom-2-4096 \
  --region=europe-west1 \
  --storage-size=20GB \
  --storage-type=SSD \
  --backup-start-time=02:00 \
  --availability-type=zonal

gcloud sql databases create datafly \
  --instance=datafly-postgres

gcloud sql users create datafly \
  --instance=datafly-postgres \
  --password="your-db-password"
```

For GKE to connect to Cloud SQL, use the Cloud SQL Auth Proxy as a sidecar or enable Private IP on the Cloud SQL instance.
### Memorystore for Redis
```bash
gcloud redis instances create datafly-redis \
  --size=1 \
  --region=europe-west1 \
  --redis-version=redis_7_2 \
  --tier=basic \
  --transit-encryption-mode=SERVER_AUTHENTICATION
```

Get the Redis host:

```bash
gcloud redis instances describe datafly-redis --region=europe-west1 --format="value(host)"
```

Note: with transit encryption enabled, Memorystore serves TLS on port 6378, so clients need a `rediss://` URL and the instance's CA certificate. Omit `--transit-encryption-mode` if you intend to connect with a plain `redis://HOST:6379` URL.

### Kafka (Confluent Cloud or Self-Managed)
**Option A: Confluent Cloud** — Create a dedicated Kafka cluster in Confluent Cloud in the same region. Record the bootstrap servers, API key, and API secret.

**Option B: In-cluster Kafka** — Set `kafka.enabled: true` in the Helm values to deploy Kafka within the GKE cluster. This is simpler but requires managing Kafka yourself.
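For Option B, the change is a single values override (`kafka.enabled` is the only key named by this guide; any further Kafka tuning keys depend on the chart):

```yaml
# values-gcp.yaml fragment for in-cluster Kafka (Option B).
kafka:
  enabled: true   # deploy Kafka inside the GKE cluster
```

For Option A, leave in-cluster Kafka disabled and set `KAFKA_BROKERS` (Step 3) to the Confluent Cloud bootstrap servers.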
## Step 3: Configure Secrets
### Option A: GCP Secret Manager (Recommended)
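Three of the secret values below are generated with `openssl rand -hex 32`: 32 random bytes encoded as 64 hexadecimal characters. A quick local sanity check of that shape:

```shell
# 32 random bytes, hex-encoded, is always exactly 64 characters.
JWT_SECRET="$(openssl rand -hex 32)"
echo "${#JWT_SECRET}"   # prints 64
```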
```bash
echo -n "postgresql://datafly:password@CLOUD_SQL_IP:5432/datafly?sslmode=require" | \
  gcloud secrets create datafly-prod-database-url --data-file=-

echo -n "$(openssl rand -hex 32)" | \
  gcloud secrets create datafly-prod-jwt-secret --data-file=-

echo -n "$(openssl rand -hex 32)" | \
  gcloud secrets create datafly-prod-encryption-key --data-file=-

echo -n "$(openssl rand -hex 32)" | \
  gcloud secrets create datafly-prod-hmac-secret --data-file=-

echo -n "lic_your_licence_key" | \
  gcloud secrets create datafly-prod-licence-key --data-file=-
```

Install the External Secrets Operator and configure a ClusterSecretStore for GCP:
```bash
helm repo add external-secrets https://charts.external-secrets.io
helm install external-secrets external-secrets/external-secrets \
  --namespace external-secrets --create-namespace
```

```yaml
apiVersion: external-secrets.io/v1beta1
kind: ClusterSecretStore
metadata:
  name: gcp-secret-manager
spec:
  provider:
    gcpsm:
      projectID: YOUR_PROJECT_ID
      auth:
        workloadIdentity:
          clusterLocation: europe-west1
          clusterName: datafly-cluster
          clusterProjectID: YOUR_PROJECT_ID
          serviceAccountRef:
            name: external-secrets-sa
            namespace: external-secrets
```

### Option B: Kubernetes Secrets
```bash
kubectl create namespace datafly

kubectl create secret generic datafly-secrets \
  --namespace datafly \
  --from-literal=DATABASE_URL="postgresql://datafly:password@CLOUD_SQL_IP:5432/datafly?sslmode=require" \
  --from-literal=REDIS_URL="redis://MEMORYSTORE_HOST:6379" \
  --from-literal=KAFKA_BROKERS="kafka-bootstrap:9092" \
  --from-literal=JWT_SECRET="$(openssl rand -hex 32)" \
  --from-literal=ENCRYPTION_KEY="$(openssl rand -hex 32)" \
  --from-literal=HMAC_SECRET="$(openssl rand -hex 32)" \
  --from-literal=DATAFLY_LICENCE_KEY="lic_your_licence_key"

kubectl create configmap datafly-config \
  --namespace datafly \
  --from-literal=ENVIRONMENT="prod" \
  --from-literal=LOG_LEVEL="info"
```

## Step 4: Configure DNS and TLS
### Cloud DNS
Reserve a static IP and create DNS records:
```bash
gcloud compute addresses create datafly-ip --global

# Get the IP address
gcloud compute addresses describe datafly-ip --global --format="value(address)"
```

Create DNS records in Cloud DNS (or your DNS provider):

- `data.yourdomain.com` → static IP
- `app.yourdomain.com` → static IP
- `api.yourdomain.com` → static IP

### TLS with cert-manager
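The cert-manager install below deploys the controller only; no certificates are issued until an issuer exists. A typical Let's Encrypt `ClusterIssuer` sketch to apply afterwards (the issuer name, contact email, and solver ingress class are placeholders, not values prescribed by this guide):

```yaml
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod          # reference this name from ingress annotations
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: ops@yourdomain.com     # placeholder contact for expiry notices
    privateKeySecretRef:
      name: letsencrypt-account-key
    solvers:
      - http01:
          ingress:
            ingressClassName: gce   # match the ingress className in your values
```

Apply it with `kubectl apply -f` once the cert-manager pods are ready.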
```bash
helm repo add jetstack https://charts.jetstack.io
helm install cert-manager jetstack/cert-manager \
  --namespace cert-manager --create-namespace \
  --set crds.enabled=true
```

## Step 5: Install Datafly Signal
```bash
helm install datafly oci://ghcr.io/datafly/charts/datafly \
  --namespace datafly --create-namespace \
  --values values-gcp.yaml \
  --set licenceKey=lic_your_licence_key
```

Key values to customise in `values-gcp.yaml`:
```yaml
ingress:
  className: gce  # or "nginx" if using nginx-ingress
  annotations:
    kubernetes.io/ingress.global-static-ip-name: datafly-ip
  hosts:
    - host: data.yourdomain.com
      paths:
        - path: /v1
          pathType: Prefix
          service: ingestion-gateway
          port: 8080
        - path: /d.js
          pathType: Exact
          service: ingestion-gateway
          port: 8080
    - host: app.yourdomain.com
      paths:
        - path: /
          pathType: Prefix
          service: management-ui
          port: 3000
    - host: api.yourdomain.com
      paths:
        - path: /
          pathType: Prefix
          service: management-api
          port: 8083

externalSecrets:
  enabled: true
  provider: gcp
  secretStore: gcp-secret-manager
  keys:
    databaseUrl: datafly-prod-database-url
    jwtSecret: datafly-prod-jwt-secret
    encryptionKey: datafly-prod-encryption-key
    hmacSecret: datafly-prod-hmac-secret
    licenceKey: datafly-prod-licence-key
```

## Step 6: Verify the Deployment
### Check pod status

```bash
kubectl get pods -n datafly
```

### Check the ingress

```bash
kubectl get ingress -n datafly
```

GCE Ingress can take 5-10 minutes to provision the load balancer and health checks. The `ADDRESS` field will be empty until provisioning completes.
### Test event ingestion
```bash
curl -X POST https://data.yourdomain.com/v1/t \
  -H "Content-Type: application/json" \
  -d '{"type":"track","event":"Test Event","properties":{"source":"deployment-test"}}'
```

### Access the Management UI
Open https://app.yourdomain.com in your browser.
## Cost Estimate (Small Tier)
| Service | Spec | Monthly Cost |
|---|---|---|
| GKE Management Fee | — | ~$73 |
| Compute (3x e2-standard-4) | On-Demand | ~$310 |
| Cloud SQL (db-custom-2-4096) | — | ~$65 |
| Memorystore (1 GB, Basic) | — | ~$35 |
| Confluent Cloud (Basic) | — | ~$250 |
| Load Balancer | — | ~$20 |
| **Total** | — | **~$753/mo** |
GCP pricing varies by region. Committed use discounts can reduce compute costs by 20-57%. Use the Sizing Calculator for detailed estimates.
## Next Steps
- Set up Observability for monitoring
- Review Upgrades for the upgrade process
- Configure Backup & DR for disaster recovery
- Check Troubleshooting if you encounter issues