Coolify (Production)
Deploy Datafly Signal to a VPS using Coolify as the deployment platform. This guide splits the platform into individually deployable resources for zero-downtime updates.
Architecture
The deployment uses 7 individual Coolify resources:
| Resource | Type | Source | Purpose |
|---|---|---|---|
| Infrastructure | Docker Compose Empty | Paste YAML | PostgreSQL, Redis, Kafka, Zookeeper |
| migrate | Dockerfile (GitHub) | database/Dockerfile | Database migrations + admin seed |
| ingestion-gateway | Dockerfile (GitHub) | ingestion-gateway/Dockerfile | Receives events from Datafly.js |
| event-processor | Dockerfile (GitHub) | event-processor/Dockerfile | Processes events, routes to delivery topics |
| delivery-workers | Dockerfile (GitHub) | delivery-workers/Dockerfile | Delivers events to vendor APIs |
| management-api | Dockerfile (GitHub) | management-api/Dockerfile | Admin API for the management UI |
| management-ui | Dockerfile (GitHub) | management-ui/Dockerfile | Next.js dashboard |
This means you can redeploy a single service (e.g. management-api) without touching infrastructure or other services.
All resources must have “Connect to Predefined Network” enabled in Coolify so containers can communicate by hostname.
Infrastructure hostnames are prefixed with dfsignal- (e.g. dfsignal-redis, dfsignal-kafka) to avoid conflicts with Coolify’s internal services (coolify-redis, coolify-db). All application env vars must use these prefixed hostnames.
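For example, an application service's Redis address must point at the prefixed name, not the generic one (illustrative fragment; the full env var lists appear in Step 3):

```
REDIS_ADDR=dfsignal-redis:6379   # correct: resolves to this stack's Redis
REDIS_ADDR=redis:6379            # wrong: may resolve to coolify-redis
```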
Prerequisites
| Requirement | Details |
|---|---|
| VPS | 8+ vCPU, 16+ GB RAM (IONOS VPS XXL recommended) |
| OS | Ubuntu 24.04 |
| Coolify | v4.x installed |
| GitHub | Repository connected to Coolify via GitHub App |
| DNS | A records pointing to VPS IP for each domain |
Step 1: Infrastructure (Docker Compose Empty)
The infrastructure stack uses only pre-built images — no Git source needed.
Create a new resource in Coolify
- Click + New Resource in your project/environment
- Type: Docker Compose Empty
Paste the compose file
Paste the following into the Docker Compose editor:
```yaml
services:
  zookeeper:
    image: confluentinc/cp-zookeeper:7.6.0
    hostname: dfsignal-zookeeper
    environment:
      ZOOKEEPER_CLIENT_PORT: 2181
      ZOOKEEPER_TICK_TIME: 2000
      KAFKA_OPTS: "-Dzookeeper.4lw.commands.whitelist=ruok,srvr,stat"
    volumes:
      - zookeeper-data:/var/lib/zookeeper/data
      - zookeeper-logs:/var/lib/zookeeper/log
    restart: unless-stopped
    deploy:
      resources:
        limits:
          memory: 512M

  kafka:
    image: confluentinc/cp-kafka:7.6.0
    hostname: dfsignal-kafka
    depends_on:
      - zookeeper
    environment:
      KAFKA_BROKER_ID: 1
      KAFKA_ZOOKEEPER_CONNECT: dfsignal-zookeeper:2181
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://dfsignal-kafka:29092
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: PLAINTEXT:PLAINTEXT
      KAFKA_INTER_BROKER_LISTENER_NAME: PLAINTEXT
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
      KAFKA_TRANSACTION_STATE_LOG_MIN_ISR: 1
      KAFKA_TRANSACTION_STATE_LOG_REPLICATION_FACTOR: 1
      KAFKA_AUTO_CREATE_TOPICS_ENABLE: "true"
      KAFKA_NUM_PARTITIONS: 3
      KAFKA_DEFAULT_REPLICATION_FACTOR: 1
      KAFKA_LOG_RETENTION_HOURS: 168
      KAFKA_LOG_RETENTION_BYTES: 1073741824
    volumes:
      - kafka-data:/var/lib/kafka/data
    healthcheck:
      test: ["CMD-SHELL", "kafka-topics --bootstrap-server localhost:29092 --list"]
      interval: 15s
      timeout: 10s
      retries: 10
      start_period: 30s
    restart: unless-stopped
    deploy:
      resources:
        limits:
          memory: 2G

  redis:
    image: redis:7-alpine
    hostname: dfsignal-redis
    command: >
      redis-server
      --appendonly yes
      --maxmemory 256mb
      --maxmemory-policy allkeys-lru
      --requirepass ${REDIS_PASSWORD}
    volumes:
      - redis-data:/data
    healthcheck:
      test: ["CMD", "redis-cli", "-a", "${REDIS_PASSWORD}", "ping"]
      interval: 10s
      timeout: 5s
      retries: 5
      start_period: 5s
    restart: unless-stopped
    deploy:
      resources:
        limits:
          memory: 512M

  postgresql:
    image: postgres:16-alpine
    hostname: dfsignal-postgresql
    environment:
      POSTGRES_DB: ${DB_NAME}
      POSTGRES_USER: ${DB_USER}
      POSTGRES_PASSWORD: ${DB_PASSWORD}
      PGDATA: /var/lib/postgresql/data/pgdata
    volumes:
      - postgres-data:/var/lib/postgresql/data
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U ${DB_USER} -d ${DB_NAME}"]
      interval: 10s
      timeout: 5s
      retries: 5
      start_period: 10s
    restart: unless-stopped
    deploy:
      resources:
        limits:
          memory: 2G

volumes:
  postgres-data:
  redis-data:
  kafka-data:
  zookeeper-data:
  zookeeper-logs:
```

Enable networking
In the resource settings, enable “Connect to Predefined Network”.
Set environment variables
```
DB_NAME=datafly
DB_USER=datafly
DB_PASSWORD=<strong-password>
REDIS_PASSWORD=<strong-password>
```

Deploy
Click Deploy. Wait for all services to show healthy (PostgreSQL, Redis, Kafka).
Step 2: Database Migrations
The migrate resource runs schema migrations and seeds the admin user. Deploy this after infrastructure is healthy, and redeploy whenever migration files change.
Create a new resource in Coolify
- Type: Dockerfile (select your GitHub source and repository)
- Base Directory: `/Application/database`
- Dockerfile Location: `Dockerfile`
Enable networking
Enable “Connect to Predefined Network” so it can reach PostgreSQL.
Set environment variables
```
DATABASE_URL=postgres://datafly:<db-password>@dfsignal-postgresql:5432/datafly?sslmode=disable
ADMIN_EMAIL=admin@customer.com
ADMIN_PASSWORD=<strong-password>
ADMIN_ORG_NAME=Customer Name
```

Replace `<db-password>` with the actual DB_PASSWORD from Step 1.
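One general caveat (not specific to this project): if the password contains URL-reserved characters such as `@`, `:`, `/`, or `#`, it must be percent-encoded before being embedded in DATABASE_URL. A quick way to encode it, assuming `python3` is available on the VPS (`p@ss/word#1` is a stand-in value, not a real credential):

```shell
# Percent-encode a password for safe use inside a connection URL.
ENCODED=$(python3 -c 'import urllib.parse, sys; print(urllib.parse.quote(sys.argv[1], safe=""))' 'p@ss/word#1')
echo "postgres://datafly:${ENCODED}@dfsignal-postgresql:5432/datafly?sslmode=disable"
```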
The migrate container runs once and exits. Coolify will show it as “Exited” — this is expected. Stop the resource after it completes to prevent repeated restarts. It does not need a domain or health check.
Deploy
Click Deploy. Check the logs to confirm migrations ran successfully and the admin user was seeded.
Verify
In the Coolify terminal (select the dfsignal-postgresql container from the infrastructure resource):
```shell
psql -U datafly -d datafly -c "SELECT email, role FROM users;"
```

You should see the admin user with role `org_admin`.
Step 3: Application Services
Create each application service as an individual Dockerfile resource in Coolify, all pointing to the same GitHub repository.
Coolify settings per service
| Service | Base Directory | Dockerfile Location | Port | Domain |
|---|---|---|---|---|
| ingestion-gateway | /Application | ingestion-gateway/Dockerfile | 8080 | collect.customer.com |
| event-processor | /Application | event-processor/Dockerfile | 8080 | (none — internal only) |
| delivery-workers | /Application | delivery-workers/Dockerfile | 8080 | (none — internal only) |
| management-api | /Application | management-api/Dockerfile | 8080 | api.customer.com |
| management-ui | /Application/management-ui | Dockerfile | 3000 | app.customer.com |
Go services (ingestion-gateway, event-processor, delivery-workers, management-api) need Base Directory set to /Application because their Dockerfiles copy the shared/ module from the parent directory. The management-ui is self-contained and uses /Application/management-ui.
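The repository layout implied by the table is roughly the following (an approximation sketched from the base directories above, not an authoritative listing):

```
Application/
├── shared/                      # Go module copied into each service image
├── database/Dockerfile
├── ingestion-gateway/Dockerfile
├── event-processor/Dockerfile
├── delivery-workers/Dockerfile
├── management-api/Dockerfile
└── management-ui/               # self-contained Next.js app with its own Dockerfile
```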
For each service:
- Create a new Coolify resource with type Dockerfile
- Select your GitHub source and repository
- Set Base Directory and Dockerfile Location per the table above
- Enable “Connect to Predefined Network”
- Set the port number
- Assign a domain (for public-facing services)
- Add environment variables (see below)
- Set the health check path to `/healthz` (Go services) or `/` (management-ui)
Environment Variables
Shared across all Go services
```
DATAFLY_ENV=production
LOG_LEVEL=info
PORT=8080
KAFKA_BROKERS=dfsignal-kafka:29092
REDIS_ADDR=dfsignal-redis:6379
REDIS_PASSWORD=<same-as-infrastructure>
DATABASE_URL=postgres://datafly:<db-password>@dfsignal-postgresql:5432/datafly?sslmode=disable
HMAC_SECRET=<64-char-hex>
```

Additional per service
ingestion-gateway — no additional vars needed.
event-processor:
```
ENCRYPTION_KEY=<64-char-hex>
```

delivery-workers:

```
ENCRYPTION_KEY=<64-char-hex>
VENDOR_TYPE=webhook
WEBHOOK_URL=http://localhost:9999/noop
```

The WEBHOOK_URL default is a placeholder. Delivery workers idle on the Kafka topic until pipelines with integrations are configured. When adding a real webhook integration, set the actual URL here.
management-api:
```
ENCRYPTION_KEY=<64-char-hex>
JWT_SECRET=<64-char-hex>
SERVICE_URL_INGESTION_GATEWAY=https://collect.customer.com
```

management-ui:

```
NEXT_PUBLIC_API_URL=https://api.customer.com/v1
```

NEXT_PUBLIC_API_URL is baked into the UI at build time. If you change the API domain, you must redeploy the management-ui for the change to take effect. Do not set NODE_ENV=production as a build-time variable — the Dockerfile handles this in the runtime stage. Setting it at build time causes npm install to skip devDependencies (including TypeScript), breaking the build.
Generating Secrets
Generate secure random hex strings for JWT_SECRET, HMAC_SECRET, and ENCRYPTION_KEY:
```shell
openssl rand -hex 32
```

Each secret should be a unique 64-character hex string.
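To generate all three at once, a convenience sketch (the variable names match the env vars used above):

```shell
# Each call emits 32 random bytes as 64 lowercase hex characters.
JWT_SECRET=$(openssl rand -hex 32)
HMAC_SECRET=$(openssl rand -hex 32)
ENCRYPTION_KEY=$(openssl rand -hex 32)
printf '%s\n' "JWT_SECRET=${JWT_SECRET}" "HMAC_SECRET=${HMAC_SECRET}" "ENCRYPTION_KEY=${ENCRYPTION_KEY}"
```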
Deploy Order
For a fresh deployment:
- Deploy Infrastructure — wait for PostgreSQL and Kafka to be healthy
- Deploy migrate — wait for it to complete (check logs), then stop the resource
- Deploy management-api and ingestion-gateway
- Deploy management-ui
- Deploy event-processor and delivery-workers
For updates, deploy only the resource that changed. For schema changes, redeploy migrate first, then the affected services.
DNS Configuration
Point these A records to your VPS IP:
| Domain | Service |
|---|---|
| app.customer.com | management-ui |
| api.customer.com | management-api |
| collect.customer.com | ingestion-gateway |
If using Cloudflare, set DNS records to DNS only (grey cloud, not proxied) so that Coolify/Traefik can issue SSL certificates via Let’s Encrypt.
Verifying the Deployment
Check all services are running
In the Coolify terminal or via SSH:
```shell
docker ps --format "table {{.Names}}\t{{.Status}}" | grep -v coolify
```

All application containers should show Up with (healthy).
Test the ingestion gateway
```shell
curl -s https://collect.customer.com/healthz
```

Should return `{"status":"ok"}`.
Test the management API
```shell
curl -s https://api.customer.com/v1/health
```

Log in to the management UI
Navigate to https://app.customer.com and log in with the admin email and password from the migrate resource’s environment variables.
Troubleshooting
Service can’t reach Kafka/Redis/PostgreSQL
Ensure “Connect to Predefined Network” is enabled on both the infrastructure compose and the individual service resource. All containers must be on the same Coolify network.
Verify hostname resolution from inside a service container:
```shell
# From the Coolify terminal, select the service container
nslookup dfsignal-kafka
nslookup dfsignal-redis
nslookup dfsignal-postgresql
```

Do not use generic hostnames like redis, kafka, or postgresql — these can resolve to Coolify’s own internal services (e.g. coolify-redis). Always use the dfsignal- prefixed hostnames.
Management UI shows API errors
Check that NEXT_PUBLIC_API_URL was set correctly at build time. This is baked into the JavaScript bundle — changing the env var alone won’t work; you must redeploy the management-ui.
Management UI build fails (Cannot find module ‘typescript’)
Ensure NODE_ENV=production is not set as a build-time variable. Coolify’s “Inject Build Args” feature passes env vars at build time, which causes npm install to skip devDependencies. The Dockerfile sets NODE_ENV=production only in the runtime stage.
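The multi-stage pattern the Dockerfile relies on looks roughly like this (an illustrative sketch, not the repository's actual file; base image tags and commands are assumptions):

```
# Build stage: NODE_ENV is deliberately left unset here, so
# `npm install` also installs devDependencies (TypeScript, etc.).
FROM node:20-alpine AS build
WORKDIR /app
COPY package*.json ./
RUN npm install
COPY . .
RUN npm run build

# Runtime stage: NODE_ENV=production is set only here, after the build.
FROM node:20-alpine AS runtime
ENV NODE_ENV=production
WORKDIR /app
COPY --from=build /app ./
CMD ["npm", "start"]
```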
Delivery workers crash looping
Check the logs for missing environment variables. Each vendor type requires its own config (e.g. WEBHOOK_URL for webhook, GA4_MEASUREMENT_ID for GA4). If no integrations are configured yet, set VENDOR_TYPE=webhook with the default placeholder URL.
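For instance, a GA4 worker configuration might look like the fragment below. Only GA4_MEASUREMENT_ID is confirmed above; any further GA4-specific variables would come from the delivery-workers documentation:

```
VENDOR_TYPE=ga4
ENCRYPTION_KEY=<64-char-hex>
GA4_MEASUREMENT_ID=G-XXXXXXXXXX
```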
Migration stuck in dirty state
If the migrate container fails with a dirty migration, connect to PostgreSQL and check the state:
```sql
SELECT * FROM schema_migrations;
```

The migrate container includes automatic dirty state recovery — it forces the dirty version clean and retries. If this still fails, manually fix the state:
```sql
UPDATE schema_migrations SET dirty = false;
```

Then redeploy the migrate resource.
Migrate container keeps restarting
This is expected. The migrate container runs once and exits. Coolify’s default restart policy keeps restarting it. After confirming migrations ran successfully in the logs, click Stop on the migrate resource in Coolify.