Ingestion
The Ingestion Gateway is the entry point for all event data flowing into Datafly Signal. It is a Go service (port 8080) that receives events from multiple sources, validates authentication, enriches the payload with identity data, and publishes each event to the Kafka raw-events topic for downstream processing.
How It Works
Sources (browser, server, pixel, webhook)
→ Ingestion Gateway (authentication, identity, enrichment)
→ Kafka raw-events topic
→ Event Processor
Every event, regardless of how it arrives, is normalised into a canonical event envelope before being published. This means the processing layer and delivery workers never need to care about how the event was collected.
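To make the idea of a canonical envelope concrete, here is a minimal sketch in Go. The source does not document the envelope schema, so the `Envelope` type, its field names, and the `Normalise` helper are all hypothetical and only illustrate how differently collected events could end up in one shape.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// Envelope is a hypothetical canonical event envelope. The field names are
// illustrative assumptions, not the actual Datafly Signal schema.
type Envelope struct {
	Source      string         `json:"source"`       // "browser", "server", "pixel" or "webhook"
	Type        string         `json:"type"`         // event type as reported by the source
	AnonymousID string         `json:"anonymous_id"` // visitor identifier, if one is known
	ReceivedAt  time.Time      `json:"received_at"`  // gateway receive time, in UTC
	Properties  map[string]any `json:"properties,omitempty"`
}

// Normalise wraps source-specific input in the common envelope so that
// downstream consumers see one shape regardless of the collection method.
func Normalise(source, eventType, anonymousID string, props map[string]any) Envelope {
	return Envelope{
		Source:      source,
		Type:        eventType,
		AnonymousID: anonymousID,
		ReceivedAt:  time.Now().UTC(),
		Properties:  props,
	}
}

func main() {
	env := Normalise("pixel", "email_open", "anon-123", map[string]any{"campaign": "spring"})
	out, err := json.Marshal(env)
	if err != nil {
		panic(err)
	}
	// This JSON stands in for the message that would be published to Kafka.
	fmt.Println(string(out))
}
```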
Input Methods
Datafly Signal supports five methods of ingesting events:
| Method | Endpoint | Authentication | Use Case |
|---|---|---|---|
| Browser (Datafly.js) | POST /v1/t, /v1/p, /v1/i, /v1/g | Pipeline key | Standard website/app tracking |
| Server-Side | POST /v1/events | HMAC-SHA256 | Backend event submission |
| Batch | POST /v1/batch | HMAC or pipeline key | Bulk imports, historical backfills |
| Tracking Pixel | GET /v1/pixel/{type} | Pipeline key (query param) | Email opens, no-JS environments |
| Webhook | POST /v1/webhook | Per-source signature | Third-party service events |
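As an illustration of the HMAC-SHA256 authentication used by the server-side endpoint, the sketch below signs a request body and verifies it with Go's standard library. The source does not specify which bytes are signed, how the signature is encoded, or which header carries it, so signing the raw body and hex-encoding the result are assumptions.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// Sign returns the hex-encoded HMAC-SHA256 of body under secret.
// Hex encoding is an assumption; the real gateway may expect another format.
func Sign(secret, body []byte) string {
	mac := hmac.New(sha256.New, secret)
	mac.Write(body)
	return hex.EncodeToString(mac.Sum(nil))
}

// Verify recomputes the MAC and compares it in constant time, which avoids
// leaking how many leading bytes of a forged signature were correct.
func Verify(secret, body []byte, signatureHex string) bool {
	got, err := hex.DecodeString(signatureHex)
	if err != nil {
		return false
	}
	mac := hmac.New(sha256.New, secret)
	mac.Write(body)
	return hmac.Equal(got, mac.Sum(nil))
}

func main() {
	secret := []byte("example-secret") // placeholder, not a real credential
	body := []byte(`{"type":"track","event":"order_completed"}`)

	sig := Sign(secret, body)
	fmt.Println("signature:", sig)
	fmt.Println("valid:", Verify(secret, body, sig))
}
```

A client would compute `Sign` over the exact bytes it sends and attach the result to the request; the receiving side runs `Verify` before trusting the payload.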
Key Capabilities
Pipeline Key Validation
Every request is authenticated via a pipeline key (dk_...). The gateway validates pipeline keys against PostgreSQL, with a Redis cache layer to avoid per-request database lookups. Invalid or revoked keys are rejected with a 401 Unauthorized response.
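The cache-then-database lookup described above can be sketched as follows. The `KeyStore` and `Cache` interfaces and the in-memory `mapStore` are hypothetical stand-ins for Redis and PostgreSQL, chosen so the example runs without external services. The real gateway's schema and cache policy are not documented here.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// ErrInvalidKey would map to a 401 Unauthorized response in an HTTP handler.
var ErrInvalidKey = errors.New("invalid or revoked pipeline key")

// KeyStore is a hypothetical lookup backend (PostgreSQL in the real service).
type KeyStore interface {
	// Lookup reports whether the key exists and, if so, whether it is active.
	Lookup(key string) (active bool, found bool)
}

// Cache is a hypothetical cache layer (Redis in the real service).
type Cache interface {
	KeyStore
	Store(key string, active bool)
}

// mapStore is an in-memory stand-in used only for this example.
type mapStore map[string]bool

func (m mapStore) Lookup(key string) (bool, bool) { a, ok := m[key]; return a, ok }
func (m mapStore) Store(key string, active bool)  { m[key] = active }

// Validate checks the cache first and falls back to the database on a miss,
// storing the result so later requests skip the database.
func Validate(cache Cache, db KeyStore, key string) error {
	if !strings.HasPrefix(key, "dk_") {
		return ErrInvalidKey
	}
	if active, found := cache.Lookup(key); found {
		if active {
			return nil
		}
		return ErrInvalidKey
	}
	active, found := db.Lookup(key)
	if !found {
		return ErrInvalidKey
	}
	cache.Store(key, active)
	if !active {
		return ErrInvalidKey
	}
	return nil
}

func main() {
	cache := mapStore{}
	db := mapStore{"dk_live": true, "dk_revoked": false}
	for _, key := range []string{"dk_live", "dk_revoked", "dk_unknown"} {
		fmt.Println(key, "->", Validate(cache, db, key))
	}
}
```

A production cache would also need an expiry or invalidation step so that a revoked key stops validating promptly; that is omitted here for brevity.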
Anonymous Identity
On browser requests, the gateway sets a first-party _dfid cookie:
- httpOnly — not accessible to client-side JavaScript
- Secure — only sent over HTTPS
- SameSite=Lax — prevents CSRF while allowing top-level navigations
- 2-year TTL — persistent anonymous identity across sessions
This cookie serves as the anonymous identifier for the visitor, providing consistent identity without relying on third-party cookies.
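The cookie attributes listed above map directly onto Go's `http.Cookie` type. The sketch below is one way to set such a cookie. The identifier format (32 hex characters), the `/` path, and the choice to reuse an existing cookie value are assumptions, since the source only specifies the cookie name and attributes.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// twoYears is the cookie lifetime in seconds (2 x 365 days).
const twoYears = 2 * 365 * 24 * 60 * 60

// NewDFIDCookie builds a _dfid cookie with the documented attributes.
func NewDFIDCookie(id string) *http.Cookie {
	return &http.Cookie{
		Name:     "_dfid",
		Value:    id,
		Path:     "/", // assumption: the cookie applies site-wide
		MaxAge:   twoYears,
		HttpOnly: true,
		Secure:   true,
		SameSite: http.SameSiteLaxMode,
	}
}

// EnsureDFID returns the visitor's existing identifier, or generates a new
// random one and sets it on the response.
func EnsureDFID(w http.ResponseWriter, r *http.Request) (string, error) {
	if c, err := r.Cookie("_dfid"); err == nil && c.Value != "" {
		return c.Value, nil
	}
	b := make([]byte, 16)
	if _, err := rand.Read(b); err != nil {
		return "", err
	}
	id := hex.EncodeToString(b) // assumption: the real ID format is not documented
	http.SetCookie(w, NewDFIDCookie(id))
	return id, nil
}

func main() {
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodPost, "/v1/t", nil)
	id, err := EnsureDFID(rec, req)
	if err != nil {
		panic(err)
	}
	fmt.Println("anonymous id:", id)
	fmt.Println("Set-Cookie:", rec.Header().Get("Set-Cookie"))
}
```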
Vendor ID Generation
The gateway self-generates vendor-specific identifiers and sets them as first-party cookies on the customer’s subdomain:
| Vendor | Cookie | Format |
|---|---|---|
| Google Analytics 4 | _ga | GA1.1.{random}.{timestamp} |
| Meta / Facebook | _fbp | fb.1.{timestamp}.{random} |
| TikTok | _ttp | UUID v4 |
Because these are first-party cookies on the customer’s own domain, they persist through ITP restrictions and ad blocker rules.
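The formats in the table can be produced with a few lines of Go. This is a sketch under assumptions: the source gives only the overall patterns, so the size of the random component (a 32-bit integer here) and the timestamp units (seconds for `_ga`, milliseconds for `_fbp`) are guesses, and the function names are hypothetical.

```go
package main

import (
	"crypto/rand"
	"encoding/binary"
	"fmt"
	"time"
)

// randomUint32 returns a random 32-bit integer from the system CSPRNG.
func randomUint32() uint32 {
	var b [4]byte
	if _, err := rand.Read(b[:]); err != nil {
		panic(err)
	}
	return binary.BigEndian.Uint32(b[:])
}

// NewGA builds a value shaped like GA1.1.{random}.{timestamp}.
// Using Unix seconds for the timestamp is an assumption.
func NewGA() string {
	return fmt.Sprintf("GA1.1.%d.%d", randomUint32(), time.Now().Unix())
}

// NewFBP builds a value shaped like fb.1.{timestamp}.{random}.
// Using Unix milliseconds for the timestamp is an assumption.
func NewFBP() string {
	return fmt.Sprintf("fb.1.%d.%d", time.Now().UnixMilli(), randomUint32())
}

// NewTTP builds a random (version 4) UUID as described in RFC 4122.
func NewTTP() string {
	var b [16]byte
	if _, err := rand.Read(b[:]); err != nil {
		panic(err)
	}
	b[6] = (b[6] & 0x0f) | 0x40 // set version 4
	b[8] = (b[8] & 0x3f) | 0x80 // set the RFC 4122 variant
	return fmt.Sprintf("%x-%x-%x-%x-%x", b[0:4], b[4:6], b[6:8], b[8:10], b[10:16])
}

func main() {
	fmt.Println("_ga: ", NewGA())
	fmt.Println("_fbp:", NewFBP())
	fmt.Println("_ttp:", NewTTP())
}
```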
Click ID Capture
The gateway automatically captures advertising click IDs from URL query parameters and includes them in the event payload:
| Parameter | Vendor |
|---|---|
| gclid | Google Ads |
| fbclid | Meta / Facebook |
| ttclid | TikTok |
| li_fat_id | LinkedIn |
| ScCid | Snapchat |
| epik | Pinterest |
| tduid | The Trade Desk |
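Capturing the parameters in the table above amounts to parsing the page URL and keeping the known keys. The sketch below shows one way to do it with `net/url`. The `CaptureClickIDs` function and the exact-case matching of parameter names are assumptions, as the source does not say how the gateway matches them.

```go
package main

import (
	"fmt"
	"net/url"
)

// clickIDParams lists the query parameters named in the table above.
var clickIDParams = []string{"gclid", "fbclid", "ttclid", "li_fat_id", "ScCid", "epik", "tduid"}

// CaptureClickIDs returns the click IDs present in rawURL, keyed by parameter
// name. Matching is case-sensitive here, which is an assumption.
func CaptureClickIDs(rawURL string) (map[string]string, error) {
	u, err := url.Parse(rawURL)
	if err != nil {
		return nil, err
	}
	query := u.Query()
	found := make(map[string]string)
	for _, name := range clickIDParams {
		if v := query.Get(name); v != "" {
			found[name] = v
		}
	}
	return found, nil
}

func main() {
	ids, err := CaptureClickIDs("https://shop.example/landing?utm_source=ads&gclid=abc123&ttclid=xyz")
	if err != nil {
		panic(err)
	}
	// Unrelated parameters such as utm_source are ignored.
	fmt.Println(ids)
}
```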
CORS Handling
Browser-based endpoints respond to OPTIONS preflight requests and include the appropriate Access-Control-Allow-* headers. The allowed origins are derived from the source configuration for each pipeline key.
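A preflight handler of this kind can be sketched as Go middleware. The example assumes the allowed origins for a pipeline key have already been resolved into a set. The specific methods and headers advertised, and the use of `Access-Control-Allow-Credentials`, are assumptions not confirmed by the source.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// CORSHeaders returns the headers to add for origin, or nil when the origin
// is not in the allowed set. The header values are illustrative assumptions.
func CORSHeaders(allowed map[string]bool, origin string) map[string]string {
	if origin == "" || !allowed[origin] {
		return nil
	}
	return map[string]string{
		"Access-Control-Allow-Origin":      origin,
		"Access-Control-Allow-Credentials": "true", // assumed, so cookies can be sent
		"Access-Control-Allow-Methods":     "POST, OPTIONS",
		"Access-Control-Allow-Headers":     "Content-Type",
		"Vary":                             "Origin",
	}
}

// CORS wraps next, answering OPTIONS preflights directly and adding CORS
// headers to responses for allowed origins.
func CORS(allowed map[string]bool, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		for name, value := range CORSHeaders(allowed, r.Header.Get("Origin")) {
			w.Header().Set(name, value)
		}
		if r.Method == http.MethodOptions {
			w.WriteHeader(http.StatusNoContent)
			return
		}
		next.ServeHTTP(w, r)
	})
}

func main() {
	allowed := map[string]bool{"https://shop.example": true}
	handler := CORS(allowed, http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusAccepted)
	}))

	req := httptest.NewRequest(http.MethodOptions, "/v1/t", nil)
	req.Header.Set("Origin", "https://shop.example")
	rec := httptest.NewRecorder()
	handler.ServeHTTP(rec, req)

	fmt.Println(rec.Code, rec.Header().Get("Access-Control-Allow-Origin"))
}
```

Echoing back the specific origin rather than `*` is what permits credentialed requests, and the `Vary: Origin` header keeps shared caches from serving one origin's response to another.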
Sections
- HTTP Endpoints — Browser-facing event collection endpoints used by Datafly.js
- Server-Side Events — Server-to-server event submission with HMAC authentication
- Batch API — Submit multiple events in a single request
- Tracking Pixel — 1x1 transparent GIF for email and no-JS tracking
- Webhooks — Accept events from third-party services