GCP Deployment

This guide walks through deploying Datafly Signal on Google Cloud Platform using GKE, Cloud SQL, Memorystore, and Confluent Cloud (or self-managed Kafka).

Prerequisites

Tool        Version   Purpose
gcloud CLI  Latest    GCP account access
kubectl     v1.28+    Kubernetes management
Helm        v3.14+    Chart installation

Authenticate and set the project and region defaults:

gcloud auth login
gcloud config set project YOUR_PROJECT_ID
gcloud config set compute/region europe-west1

Architecture

                    Internet

                ┌──────▼──────┐
                │  GCE/nginx  │
                │  (Ingress)  │
                └──────┬──────┘

              ┌────────▼────────┐
              │   Google GKE    │
              │  ┌────────────┐ │
              │  │ Datafly    │ │
              │  │ Services   │ │
              │  └──────┬─────┘ │
              └─────────┼───────┘
           ┌────────────┼────────────┐
           │            │            │
    ┌──────▼──────┐ ┌───▼───┐ ┌─────▼─────┐
    │  Confluent  │ │Cloud  │ │Memorystore│
    │  Cloud /    │ │ SQL   │ │  (Redis)  │
    │  Kafka      │ │(PgSQL)│ │           │
    └─────────────┘ └───────┘ └───────────┘

Step 1: Create the GKE Cluster

Choose a machine type based on your sizing tier:

TierMachine TypeNode Count
Smalle2-standard-43
Mediume2-standard-43
Largee2-standard-85
XLe2-standard-168
gcloud container clusters create datafly-cluster \
  --region europe-west1 \
  --num-nodes 3 \
  --machine-type e2-standard-4 \
  --enable-autorepair \
  --enable-autoupgrade \
  --release-channel regular \
  --workload-pool=YOUR_PROJECT_ID.svc.id.goog

Get credentials:

gcloud container clusters get-credentials datafly-cluster --region europe-west1
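
Before provisioning anything else, it is worth confirming that kubectl is pointed at the new cluster and the nodes registered correctly:

```shell
# All nodes should report Ready before you continue.
kubectl get nodes -o wide

# Confirm the active kubeconfig context is the new cluster.
kubectl config current-context
```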

Step 2: Provision Managed Services

Cloud SQL for PostgreSQL

gcloud sql instances create datafly-postgres \
  --database-version=POSTGRES_16 \
  --tier=db-custom-2-4096 \
  --region=europe-west1 \
  --storage-size=20GB \
  --storage-type=SSD \
  --backup-start-time=02:00 \
  --availability-type=zonal
 
gcloud sql databases create datafly \
  --instance=datafly-postgres
 
gcloud sql users create datafly \
  --instance=datafly-postgres \
  --password="your-db-password"

For GKE to connect to Cloud SQL, use the Cloud SQL Auth Proxy as a sidecar or enable Private IP on the Cloud SQL instance.
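
As a sketch of the Private IP route, assuming the cluster and Cloud SQL share the default VPC (the private services access range only needs to be allocated once per network):

```shell
# One-time: allocate an IP range for private services access on the VPC.
gcloud compute addresses create google-managed-services-default \
  --global --purpose=VPC_PEERING --prefix-length=16 \
  --network=default

# One-time: peer the VPC with Google's service producer network.
gcloud services vpc-peerings connect \
  --service=servicenetworking.googleapis.com \
  --ranges=google-managed-services-default \
  --network=default

# Give the instance a private IP and drop its public one.
gcloud sql instances patch datafly-postgres \
  --network=default --no-assign-ip
```

The range name `google-managed-services-default` is conventional, not required; after the patch, use the instance's private IP in `DATABASE_URL`.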

Memorystore for Redis

gcloud redis instances create datafly-redis \
  --size=1 \
  --region=europe-west1 \
  --redis-version=redis_7_2 \
  --tier=basic \
  --transit-encryption-mode=SERVER_AUTHENTICATION

Get the Redis host (with SERVER_AUTHENTICATION transit encryption enabled, clients connect over TLS on port 6378 rather than 6379):

gcloud redis instances describe datafly-redis --region=europe-west1 --format="value(host)"

Kafka (Confluent Cloud or Self-Managed)

Option A: Confluent Cloud — Create a Kafka cluster in Confluent Cloud in the same region (the cost estimate below assumes the Basic tier). Record the bootstrap servers, API key, and API secret.

Option B: In-cluster Kafka — Set kafka.enabled: true in the Helm values to deploy Kafka within the GKE cluster. This is simpler but requires managing Kafka yourself.

Step 3: Configure Secrets

Option A: GCP Secret Manager

Store each secret in Secret Manager:

echo -n "postgresql://datafly:password@CLOUD_SQL_IP:5432/datafly?sslmode=require" | \
  gcloud secrets create datafly-prod-database-url --data-file=-
 
echo -n "$(openssl rand -hex 32)" | \
  gcloud secrets create datafly-prod-jwt-secret --data-file=-
 
echo -n "$(openssl rand -hex 32)" | \
  gcloud secrets create datafly-prod-encryption-key --data-file=-
 
echo -n "$(openssl rand -hex 32)" | \
  gcloud secrets create datafly-prod-hmac-secret --data-file=-
 
echo -n "lic_your_licence_key" | \
  gcloud secrets create datafly-prod-licence-key --data-file=-

Install the External Secrets Operator and configure a ClusterSecretStore for GCP:

helm repo add external-secrets https://charts.external-secrets.io
helm install external-secrets external-secrets/external-secrets \
  --namespace external-secrets --create-namespace

Then apply a ClusterSecretStore that authenticates via Workload Identity:

apiVersion: external-secrets.io/v1beta1
kind: ClusterSecretStore
metadata:
  name: gcp-secret-manager
spec:
  provider:
    gcpsm:
      projectID: YOUR_PROJECT_ID
      auth:
        workloadIdentity:
          clusterLocation: europe-west1
          clusterName: datafly-cluster
          clusterProjectID: YOUR_PROJECT_ID
          serviceAccountRef:
            name: external-secrets-sa
            namespace: external-secrets
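
The store above assumes the `external-secrets-sa` Kubernetes service account is bound to a GCP service account that can read the secrets. One way to set up that binding (the GCP service account name `datafly-external-secrets` is illustrative):

```shell
# GCP service account that ESO will impersonate.
gcloud iam service-accounts create datafly-external-secrets

# Allow it to read secret payloads.
gcloud projects add-iam-policy-binding YOUR_PROJECT_ID \
  --member="serviceAccount:datafly-external-secrets@YOUR_PROJECT_ID.iam.gserviceaccount.com" \
  --role="roles/secretmanager.secretAccessor"

# Bind the Kubernetes SA to the GCP SA via Workload Identity.
gcloud iam service-accounts add-iam-policy-binding \
  datafly-external-secrets@YOUR_PROJECT_ID.iam.gserviceaccount.com \
  --member="serviceAccount:YOUR_PROJECT_ID.svc.id.goog[external-secrets/external-secrets-sa]" \
  --role="roles/iam.workloadIdentityUser"

# Annotate the Kubernetes SA so GKE knows which GCP SA to impersonate.
kubectl annotate serviceaccount external-secrets-sa \
  --namespace external-secrets \
  iam.gke.io/gcp-service-account=datafly-external-secrets@YOUR_PROJECT_ID.iam.gserviceaccount.com
```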

Option B: Kubernetes Secrets

kubectl create namespace datafly
 
kubectl create secret generic datafly-secrets \
  --namespace datafly \
  --from-literal=DATABASE_URL="postgresql://datafly:password@CLOUD_SQL_IP:5432/datafly?sslmode=require" \
  --from-literal=REDIS_URL="rediss://MEMORYSTORE_HOST:6378" \
  --from-literal=KAFKA_BROKERS="kafka-bootstrap:9092" \
  --from-literal=JWT_SECRET="$(openssl rand -hex 32)" \
  --from-literal=ENCRYPTION_KEY="$(openssl rand -hex 32)" \
  --from-literal=HMAC_SECRET="$(openssl rand -hex 32)" \
  --from-literal=DATAFLY_LICENCE_KEY="lic_your_licence_key"
 
kubectl create configmap datafly-config \
  --namespace datafly \
  --from-literal=ENVIRONMENT="prod" \
  --from-literal=LOG_LEVEL="info"
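
A quick way to confirm the secret keys landed, without printing their values:

```shell
# describe lists key names and byte counts only; values stay hidden.
kubectl describe secret datafly-secrets --namespace datafly
```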

Step 4: Configure DNS and TLS

Cloud DNS

Reserve a static IP and create DNS records:

gcloud compute addresses create datafly-ip --global
 
# Get the IP address
gcloud compute addresses describe datafly-ip --global --format="value(address)"
 
# Create DNS records in Cloud DNS (or your DNS provider):
# data.yourdomain.com  → static IP
# app.yourdomain.com   → static IP
# api.yourdomain.com   → static IP

TLS with cert-manager

helm repo add jetstack https://charts.jetstack.io
helm install cert-manager jetstack/cert-manager \
  --namespace cert-manager --create-namespace \
  --set crds.enabled=true
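
cert-manager still needs an issuer before it can request certificates. A minimal Let's Encrypt HTTP-01 ClusterIssuer, assuming nginx as the ingress class (adjust the email, issuer name, and solver class to match your setup; use `class: gce` with GCE Ingress):

```shell
kubectl apply -f - <<'EOF'
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: ops@yourdomain.com
    privateKeySecretRef:
      name: letsencrypt-prod-account-key
    solvers:
      - http01:
          ingress:
            class: nginx
EOF
```

Reference the issuer from the chart's ingress annotations (e.g. `cert-manager.io/cluster-issuer: letsencrypt-prod`) so certificates are issued for each host.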

Step 5: Install Datafly Signal

helm install datafly oci://ghcr.io/datafly/charts/datafly \
  --namespace datafly --create-namespace \
  --values values-gcp.yaml \
  --set licenceKey=lic_your_licence_key

Key values to customise in values-gcp.yaml:

ingress:
  className: gce              # or "nginx" if using nginx-ingress
  annotations:
    kubernetes.io/ingress.global-static-ip-name: datafly-ip
  hosts:
    - host: data.yourdomain.com
      paths:
        - path: /v1
          pathType: Prefix
          service: ingestion-gateway
          port: 8080
        - path: /d.js
          pathType: Exact
          service: ingestion-gateway
          port: 8080
    - host: app.yourdomain.com
      paths:
        - path: /
          pathType: Prefix
          service: management-ui
          port: 3000
    - host: api.yourdomain.com
      paths:
        - path: /
          pathType: Prefix
          service: management-api
          port: 8083
 
externalSecrets:
  enabled: true
  provider: gcp
  secretStore: gcp-secret-manager
  keys:
    databaseUrl: datafly-prod-database-url
    jwtSecret: datafly-prod-jwt-secret
    encryptionKey: datafly-prod-encryption-key
    hmacSecret: datafly-prod-hmac-secret
    licenceKey: datafly-prod-licence-key

Step 6: Verify the Deployment

Check pod status

kubectl get pods -n datafly

Check the ingress

kubectl get ingress -n datafly

GCE Ingress can take 5-10 minutes to provision the load balancer and health checks. The ADDRESS field will be empty until provisioning completes.

Test event ingestion

curl -X POST https://data.yourdomain.com/v1/t \
  -H "Content-Type: application/json" \
  -d '{"type":"track","event":"Test Event","properties":{"source":"deployment-test"}}'

Access the Management UI

Open https://app.yourdomain.com in your browser.

Cost Estimate (Small Tier)

Service                       Spec            Monthly Cost
GKE                           Management fee  ~$73
Compute (3x e2-standard-4)    On-demand       ~$310
Cloud SQL (db-custom-2-4096)                  ~$65
Memorystore (1 GB, Basic)                     ~$35
Confluent Cloud (Basic)                       ~$250
Load Balancer                                 ~$20
Total                                         ~$753/mo

GCP pricing varies by region. Committed use discounts can reduce compute costs by 20-57%. Use the Sizing Calculator for detailed estimates.

Next Steps