Core Concepts
Before you start configuring the platform, it helps to understand the key building blocks and how they fit together.
Pipelines
A pipeline is the central concept in Datafly Signal. It represents one data collection endpoint — typically one website or app — and controls what happens to the events collected from it.
Each pipeline has:
- A pipeline key (`dk_...`) — a unique identifier used by the JS collector to authenticate events
- One or more integrations — vendor destinations where events will be delivered
- Parameters — shared configuration values (like a GA4 Measurement ID) that are injected into integration configs at processing time
Think of a pipeline as the answer to: “For this website, which vendors should receive events, and how should those events be transformed?”
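The shape of a pipeline can be pictured as a plain object. This is an illustrative sketch only; the field names are assumptions, not the Signal API:

```javascript
// Conceptual sketch of what a pipeline ties together (field names assumed).
const pipeline = {
  key: 'dk_example_key',          // authenticates events sent by the JS collector
  name: 'acme.com production',
  parameters: {                   // shared values injected into integration configs
    ga4_measurement_id: 'G-XXXXXXX',
  },
  integrations: ['ga4', 'meta-capi'], // vendor destinations for this pipeline's events
};
```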
When to Create Multiple Pipelines
In most cases, you’ll create one pipeline per website or app. You might create separate pipelines when:
- You have distinct websites with different vendor requirements
- You want completely separate configuration for staging vs production
- You manage multiple brands with different data destinations
Integrations
An integration connects your pipeline to a specific vendor destination. When you add an integration, you’re telling Signal: “Send events from this pipeline to this vendor’s API.”
Each integration includes:
- Vendor type — which vendor API to deliver to (GA4, Meta CAPI, TikTok, BigQuery, etc.)
- Credentials — API keys, access tokens, or measurement IDs required by the vendor
- Field mappings — how to transform your canonical events into the vendor’s expected format
Datafly Signal ships with 120+ pre-built integrations in the Integration Library, covering advertising platforms, analytics tools, CDPs, data warehouses, and more.
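As a rough mental model, an integration bundles the three pieces listed above. The object shape below is a hypothetical sketch, not the actual configuration format:

```javascript
// Illustrative shape of an integration (field names are assumptions).
const ga4Integration = {
  vendor: 'ga4',                               // which vendor API to deliver to
  credentials: {
    measurement_id: '{{ga4_measurement_id}}',  // injected from pipeline parameters
    api_secret: 'YOUR_API_SECRET',
  },
  fieldMappings: {                             // canonical event -> vendor format
    purchase: { transaction_id: 'order_id', value: 'value', currency: 'currency' },
  },
};
```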
Templates and Revisions
Integrations use a template and revision model:
- A template is the integration itself (e.g. “Our GA4 Integration”)
- A revision is a versioned snapshot of its configuration
- You can create new revisions to update mappings without affecting live traffic until you’re ready to publish
This gives you a safe way to iterate on configuration changes.
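The draft-then-publish behaviour can be sketched as follows. This is a conceptual illustration of the model, assuming hypothetical helper names, not the Signal API:

```javascript
// Minimal sketch of the template/revision model: new revisions are drafts
// until published, so live traffic keeps using the published revision.
function createIntegration(name) {
  return { name, revisions: [], publishedRevision: null };
}

function addRevision(integration, config) {
  const rev = { id: integration.revisions.length + 1, config };
  integration.revisions.push(rev); // a draft: live traffic is unaffected
  return rev;
}

function publish(integration, rev) {
  integration.publishedRevision = rev.id; // now serving live traffic
}
```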
Blueprints
A blueprint is a pre-built configuration for a specific vendor and industry vertical. Instead of mapping every field from scratch, you can select a blueprint when installing an integration to get a working configuration immediately.
For example, the GA4 Retail Blueprint pre-maps common e-commerce events (purchase, add_to_cart, view_item, etc.) to GA4’s Measurement Protocol format, including all the standard parameters GA4 expects.
Available blueprint verticals include Retail, Travel, and Media. Blueprints are fully editable after install — they’re a starting point, not a constraint.
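Conceptually, a blueprint is just a pre-filled mapping configuration that you can then edit. The sketch below imagines what the GA4 Retail Blueprint pre-maps; the structure and names are assumptions for illustration:

```javascript
// Hypothetical sketch of a blueprint: pre-mapped events you can edit after install.
const ga4RetailBlueprint = {
  vendor: 'ga4',
  vertical: 'retail',
  eventMappings: {
    purchase:    { event: 'purchase',    params: ['transaction_id', 'value', 'currency', 'items'] },
    add_to_cart: { event: 'add_to_cart', params: ['value', 'currency', 'items'] },
    view_item:   { event: 'view_item',   params: ['value', 'currency', 'items'] },
  },
};
```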
The V2 Schema-Mapping Builder
When you configure an integration (either from a blueprint or from scratch), you use the V2 schema-mapping builder in the Management UI. This is where you define:
- Parameters — Connection credentials and shared values (e.g. `measurement_id`, `api_secret`)
- Global mappings — Fields that apply to every event sent to this vendor (e.g. `client_id`, `user_id`)
- Event mappings — Per-event-type field mappings (e.g. map `purchase` events to GA4’s `purchase` event with `transaction_id`, `value`, `currency`, `items[]`)
- Defaults — Fallback values applied when source data is missing
Each field mapping specifies a source (where the value comes from in the canonical event) and a mode (direct mapping, static value, expression, or computed).
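One way to picture how a mapping's source and mode combine is the resolver sketched below. The mode names mirror the list above; the resolution logic itself is an assumption:

```javascript
// Conceptual sketch of resolving one field mapping against a canonical event.
function resolveField(mapping, event) {
  switch (mapping.mode) {
    case 'direct':     return event[mapping.source]; // copy from the canonical event
    case 'static':     return mapping.value;         // fixed configured value
    case 'expression': return mapping.fn(event);     // computed from the event
    default:           return mapping.default;       // fallback when source is missing
  }
}

resolveField({ mode: 'direct', source: 'order_id' }, { order_id: 'T-1001' }); // 'T-1001'
```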
Event Processing
Events flow through two processing layers before being delivered to vendors:
Layer 1: Organisation Data Layer
This runs on every event regardless of which integrations will receive it. It handles tenant-wide governance:
| Step | What it does |
|---|---|
| Consent enforcement | Checks consent state and strips non-consented data |
| Bot filtering | Removes non-human traffic (known bots, data centre IPs, behavioural signals) |
| PII detection | Detects and handles personal data (hash, redact, or pass through per your policy) |
| Geolocation | Enriches events with country, region, and city from the IP address |
| Device parsing | Structures user agent into device type, browser, and OS |
| Session stitching | Groups events into user sessions |
| Identity resolution | Attaches known user IDs and vendor IDs to the event |
| Deduplication | Prevents the same event from being processed twice |
You configure these settings in Settings > Organisation Data Layer in the Management UI.
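The steps in the table above run in order on every event, which can be sketched as a simple function chain. The two step implementations here are stand-ins (the real ones are configurable):

```javascript
// Stand-in steps (illustrative only; not the actual implementations):
const enforceConsent = (ev) =>
  ev.consent && ev.consent.marketing ? ev : { ...ev, email: undefined }; // strip non-consented data
const geolocate = (ev) => ({ ...ev, geo: { country: 'GB' } });           // derived from IP in reality

// The organisation data layer as an ordered list of steps applied to every event.
const orgDataLayer = [enforceConsent, geolocate];
const process = (event) => orgDataLayer.reduce((ev, step) => step(ev), event);
```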
Layer 2: Pipeline Transformation Engine
This runs once per integration the event is routed to. It transforms the canonical event into the exact format each vendor expects, using the field mappings you defined in the integration configuration.
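For example, a canonical purchase event bound for GA4 would be reshaped into a Measurement Protocol payload. The canonical field names below are assumptions; the GA4 payload shape follows the public Measurement Protocol format:

```javascript
// Sketch: canonical purchase event -> GA4 Measurement Protocol payload.
// Canonical field names (anonymous_id, order_id, ...) are assumed for illustration.
function toGa4(canonical) {
  return {
    client_id: canonical.anonymous_id,
    events: [{
      name: 'purchase',
      params: {
        transaction_id: canonical.order_id,
        value: canonical.value,
        currency: canonical.currency,
        items: canonical.items,
      },
    }],
  };
}
```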
The Datafly.js Collector
Datafly.js is a lightweight JavaScript SDK (under 8KB gzipped) that you add to your website. It replaces all vendor-specific tags and sends events to your Datafly Signal endpoint.
It provides four core methods:
| Method | Purpose | Example |
|---|---|---|
| `page()` | Track a page view | Called on every page load |
| `track()` | Track a custom event | `_df.track('Purchase', { value: 99.99 })` |
| `identify()` | Set the user’s identity | `_df.identify('user-123', { email: '...' })` |
| `group()` | Associate the user with a group | `_df.group('company-456', { name: '...' })` |
The collector also automatically captures:
- Ad click IDs from URL parameters (`gclid`, `fbclid`, `ttclid`, etc.)
- Consent state from your consent management platform
- Page context (URL, title, referrer)
- The anonymous ID (`_dfid`) for identity stitching
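Putting the four methods together, a typical page might call them like this. The `_df` stub below just records calls so the example is self-contained; on a real page `_df` is provided by the Datafly.js snippet:

```javascript
// Stand-in for the real collector: records each call so the sketch runs anywhere.
const calls = [];
const _df = new Proxy({}, { get: (_, method) => (...args) => calls.push([method, ...args]) });

_df.page();                                               // track a page view on load
_df.track('Purchase', { value: 99.99, currency: 'USD' }); // track a custom event
_df.identify('user-123', { email: 'jane@example.com' });  // set the user's identity
_df.group('company-456', { name: 'Acme Ltd' });           // associate with a group
```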
Team & Roles
Datafly Signal uses role-based access control to manage who can do what:
| Role | What they can do |
|---|---|
| Org Admin | Full access — manage team, sources, pipelines, integrations, settings |
| Source Admin | Manage sources, integrations, pipelines, and brands |
| Source Editor | Create and edit sources, integrations, and transformations |
| Source Viewer | Read-only access to all resources and the real-time debugger |
| Data Governance Admin | Manage data layer, transformations, and consent settings |
Team management is available under Settings > RBAC in the Management UI.
Next Steps
Now that you understand the building blocks, head to Your First Pipeline to set everything up.