
Customer-Hosted Deployment

Datafly Signal supports deployment into a customer’s own cloud account (VPC). This model is designed for organisations with strict compliance requirements where event data must never leave their own infrastructure.

Overview

In a customer-hosted deployment, the entire Datafly Signal stack runs within the customer’s cloud account:

Customer's VPC
├── Kubernetes Cluster
│   ├── Ingestion Gateway
│   ├── Event Processor
│   ├── Delivery Workers
│   ├── Identity Hub
│   ├── Management API
│   └── Management UI
├── Managed Kafka (MSK / Confluent)
├── Managed Redis (ElastiCache / Memorystore)
└── Managed PostgreSQL (RDS / Cloud SQL)

Benefits

  • Data sovereignty — event data never leaves the customer’s VPC. All processing happens within their cloud account.
  • Compliance — meets requirements for GDPR, HIPAA, SOC 2, and industry-specific regulations that mandate data residency.
  • Network control — the customer controls all firewall rules, VPC peering, and network policies.
  • Audit — the customer has full visibility into infrastructure logs, network traffic, and resource usage.

Requirements

Kubernetes Cluster

| Requirement | Minimum | Recommended |
|---|---|---|
| Kubernetes version | 1.28+ | 1.30+ |
| Worker nodes | 3 | 5+ |
| Node size | 4 vCPU / 8 GB | 8 vCPU / 16 GB |
| Ingress controller | Any (nginx, ALB, Traefik) | nginx-ingress or AWS ALB |

Managed Services

| Service | Requirement |
|---|---|
| Apache Kafka | AWS MSK, Confluent Cloud, or self-managed Kafka 3.6+ |
| Redis | AWS ElastiCache, GCP Memorystore, or self-managed Redis 7+ |
| PostgreSQL | AWS RDS, GCP Cloud SQL, or self-managed PostgreSQL 16+ |

All managed services must be accessible from the Kubernetes cluster within the same VPC or via VPC peering.
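The recorded connection endpoints are then supplied to the Helm chart through the values file. A minimal sketch of the relevant section of values-customer.yaml — the key names (kafka.brokers, redis.url, postgresql.url) and example hostnames are illustrative assumptions, not the chart's actual schema; consult the chart's default values.yaml for the real keys:

```shell
# Sketch of a values-customer.yaml snippet wiring up the managed services.
# NOTE: key names and hostnames below are assumptions for illustration --
# check the chart's default values.yaml for the real schema.
cat > values-customer.yaml <<'EOF'
kafka:
  brokers: "b-1.msk.example.internal:9092,b-2.msk.example.internal:9092"
redis:
  url: "redis://elasticache.example.internal:6379"
postgresql:
  url: "postgresql://datafly:CHANGE_ME@rds.example.internal:5432/datafly"
EOF
```

Keeping all endpoints in one values file makes later `helm upgrade` runs reproducible.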

TLS Certificates

A valid TLS certificate is required for the customer’s data collection subdomain (e.g., data.customer.com). This can be provisioned via:

  • AWS Certificate Manager (ACM)
  • Let’s Encrypt (cert-manager in Kubernetes)
  • Customer’s existing certificate authority
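If Let's Encrypt via cert-manager is chosen, a ClusterIssuer resource handles certificate provisioning. A minimal sketch, assuming cert-manager is already installed in the cluster and an nginx ingress class — the email address is a placeholder:

```shell
# Write a cert-manager ClusterIssuer manifest for Let's Encrypt (ACME HTTP-01).
# Assumes cert-manager is installed; email and ingress class are placeholders.
cat > letsencrypt-issuer.yaml <<'EOF'
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: ops@customer.com
    privateKeySecretRef:
      name: letsencrypt-prod-key
    solvers:
      - http01:
          ingress:
            class: nginx
EOF
# Apply once reviewed:
# kubectl apply -f letsencrypt-issuer.yaml
```

cert-manager will then issue and renew certificates for any Ingress annotated to use this issuer.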

Network Configuration

DNS

The customer creates DNS A records (or CNAMEs) pointing the data collection, app, and API subdomains to the Kubernetes ingress controller’s external IP or load balancer:

data.customer.com  →  A    →  <ingress-controller-external-ip>
app.customer.com   →  A    →  <ingress-controller-external-ip>
api.customer.com   →  A    →  <ingress-controller-external-ip>
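Once the records exist, propagation can be checked from any workstation. A small sketch using `getent hosts` (which consults the system resolver); the hostnames match the examples above:

```shell
# Check that each hostname resolves before switching traffic over.
check_dns() {
  if getent hosts "$1" > /dev/null; then
    echo "ok: $1"
  else
    echo "MISSING: $1"
  fi
}

for host in data.customer.com app.customer.com api.customer.com; do
  check_dns "$host"
done
```

Any `MISSING` line means the record has not propagated (or was created in the wrong zone).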

Firewall Rules

| Direction | Source | Destination | Port | Purpose |
|---|---|---|---|---|
| Inbound | Internet | Ingress Controller | 443 | Browser event collection |
| Inbound | Admin IPs | Ingress Controller | 443 | Management UI and API access |
| Outbound | Delivery Workers | Vendor APIs | 443 | Server-to-server delivery |
| Internal | Kubernetes pods | Kafka | 9092 | Event streaming |
| Internal | Kubernetes pods | Redis | 6379 | Caching |
| Internal | Kubernetes pods | PostgreSQL | 5432 | Configuration storage |
⚠️ Delivery Workers must have outbound HTTPS access to vendor API endpoints (e.g., www.google-analytics.com, graph.facebook.com). Ensure your firewall or NAT gateway allows outbound traffic on port 443.
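Outbound access can be verified from inside the cluster (for example, via `kubectl exec` into a Delivery Worker pod). A sketch that probes vendor endpoints — the endpoint list is illustrative; substitute whichever vendors are actually configured:

```shell
# Probe outbound HTTPS reachability to vendor APIs from inside the VPC.
# Endpoint list is illustrative; add whichever vendors are configured.
check_egress() {
  for endpoint in "$@"; do
    echo "checking ${endpoint}..."
    curl --max-time 5 -sS -o /dev/null -w '%{http_code}\n' "https://${endpoint}/" \
      || echo "UNREACHABLE: ${endpoint}"
  done
}

check_egress www.google-analytics.com graph.facebook.com
```

An HTTP status code (even 4xx) means the path is open; `UNREACHABLE` points at a firewall or NAT gateway rule.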

Installation

1. Provision infrastructure

Create the managed Kafka, Redis, and PostgreSQL instances within the customer’s VPC. Record the connection endpoints.

2. Install the Helm chart

helm install datafly ./deployments/helm/datafly \
  --namespace datafly \
  --create-namespace \
  --values values-customer.yaml

3. Run database migrations

kubectl run migrations --rm -it \
  --namespace datafly \
  --image=ghcr.io/datafly/migrations:v1.2.0 \
  --env="DATABASE_URL=postgresql://..." \
  -- migrate-up

4. Configure DNS

Point the customer’s subdomain to the ingress controller’s external IP.

5. Verify

# Check all pods are running
kubectl get pods -n datafly
 
# Test the ingestion endpoint
curl -X POST https://data.customer.com/v1/t \
  -H "Authorization: Bearer dk_live_..." \
  -H "Content-Type: application/json" \
  -d '{"type":"track","event":"Test Event","properties":{}}'

Updates

Updates are delivered as new Helm chart versions. The customer (or Datafly operations team with access) runs a Helm upgrade:

helm upgrade datafly ./deployments/helm/datafly \
  --namespace datafly \
  --values values-customer.yaml

Helm chart upgrades follow semantic versioning. Minor and patch versions are backward-compatible. Major versions may require migration steps documented in the release notes.

Update Process

  1. Review the release notes for the new version.
  2. Run database migrations if required (make migrate-up or the migration job).
  3. Run helm upgrade with the new chart version.
  4. Verify all pods are healthy: kubectl get pods -n datafly.
  5. Run a test event to confirm end-to-end delivery.
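If an upgrade misbehaves, Helm can roll back to the previous release revision. A sketch of the recovery path, wrapped as a function for reuse (release and namespace names match the install commands above):

```shell
# Inspect the release history, then roll back to the last known-good revision.
rollback_datafly() {
  helm history datafly --namespace datafly
  # With no revision argument, helm rolls back to the previous revision;
  # pass a revision number to target an older release.
  helm rollback datafly --namespace datafly
  kubectl get pods -n datafly
}
```

Note that `helm rollback` reverts the chart and values, not the database schema — consult the release notes before rolling back across a version that ran migrations.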

Monitoring

The customer is responsible for monitoring their own infrastructure. Datafly Signal services expose:

| Endpoint | Port | Description |
|---|---|---|
| /health | Service port | Health check (returns 200 OK when healthy) |
| /metrics | Service port | Prometheus-compatible metrics endpoint |
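The /health endpoint plugs naturally into Kubernetes probes. A sketch of a probe snippet for a container spec — the port name is an assumption, and the Helm chart may already configure these:

```shell
# Sketch of Kubernetes probe configuration against the /health endpoint.
# The port name ("http") is an assumption; the chart may already wire this up.
cat > probes-snippet.yaml <<'EOF'
livenessProbe:
  httpGet:
    path: /health
    port: http
  initialDelaySeconds: 10
  periodSeconds: 15
readinessProbe:
  httpGet:
    path: /health
    port: http
  periodSeconds: 10
EOF
```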

Recommended monitoring stack:

  • Prometheus for metrics collection
  • Grafana for dashboards
  • AlertManager for alerting on service health, Kafka consumer lag, and delivery failures
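A minimal Prometheus scrape job for the /metrics endpoints, using Kubernetes pod discovery. The pod label used to select Datafly pods is an assumption about how the chart labels its workloads — adjust it to match the labels the chart actually applies:

```shell
# Sketch of a Prometheus scrape config for Datafly pods via Kubernetes SD.
# The pod label (app.kubernetes.io/part-of=datafly) is an assumption; adjust
# to match the labels the Helm chart actually applies.
cat > prometheus-datafly.yaml <<'EOF'
scrape_configs:
  - job_name: datafly
    kubernetes_sd_configs:
      - role: pod
        namespaces:
          names: [datafly]
    relabel_configs:
      - source_labels: [__meta_kubernetes_pod_label_app_kubernetes_io_part_of]
        regex: datafly
        action: keep
    metrics_path: /metrics
EOF
```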

Datafly provides sample Grafana dashboards as part of the Helm chart. Enable them with monitoring.grafana.dashboards.enabled: true in the values file.