Core Concepts

Before you start configuring the platform, it helps to understand the key building blocks and how they fit together.

Pipelines

A pipeline is the central concept in Datafly Signal. It represents one data collection endpoint — typically one website or app — and controls what happens to the events collected from it.

Each pipeline has:

  • A pipeline key (dk_...) — a unique identifier used by the JS collector to authenticate events
  • One or more integrations — vendor destinations where events will be delivered
  • Parameters — shared configuration values (like a GA4 Measurement ID) that are injected into integration configs at processing time

Think of a pipeline as the answer to: “For this website, which vendors should receive events, and how should those events be transformed?”
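To make parameter injection concrete, here is a minimal sketch of how shared pipeline parameters could be substituted into an integration config at processing time. The placeholder syntax and object shapes are assumptions for illustration, not Signal's actual data model.

```javascript
// Illustrative sketch: substitute shared pipeline parameters into an
// integration's config at processing time. The "{{param:NAME}}"
// placeholder syntax is an assumption, not Signal's actual format.
function injectParameters(integrationConfig, pipelineParameters) {
  const resolved = {};
  for (const [field, value] of Object.entries(integrationConfig)) {
    const match =
      typeof value === 'string' && value.match(/^\{\{param:(\w+)\}\}$/);
    resolved[field] = match ? pipelineParameters[match[1]] : value;
  }
  return resolved;
}

// A GA4 integration config referencing a shared pipeline parameter:
const config = injectParameters(
  {
    endpoint: 'https://www.google-analytics.com/mp/collect',
    measurement_id: '{{param:ga4_measurement_id}}',
  },
  { ga4_measurement_id: 'G-12345ABC' }
);
// config.measurement_id → 'G-12345ABC'
```

Because the parameter lives on the pipeline, several integrations can reference the same value without duplicating it in each config.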

When to Create Multiple Pipelines

In most cases, you’ll create one pipeline per website or app. You might create separate pipelines when:

  • You have distinct websites with different vendor requirements
  • You want completely separate configuration for staging vs production
  • You manage multiple brands with different data destinations

Integrations

An integration connects your pipeline to a specific vendor destination. When you add an integration, you’re telling Signal: “Send events from this pipeline to this vendor’s API.”

Each integration includes:

  • Vendor type — which vendor API to deliver to (GA4, Meta CAPI, TikTok, BigQuery, etc.)
  • Credentials — API keys, access tokens, or measurement IDs required by the vendor
  • Field mappings — how to transform your canonical events into the vendor’s expected format

Datafly Signal ships with 120+ pre-built integrations in the Integration Library, covering advertising platforms, analytics tools, CDPs, data warehouses, and more.

Templates and Revisions

Integrations use a template and revision model:

  • A template is the integration itself (e.g. “Our GA4 Integration”)
  • A revision is a versioned snapshot of its configuration
  • You can create new revisions to update mappings without affecting live traffic until you’re ready to publish

This gives you a safe way to iterate on configuration changes.
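A minimal sketch of that model, with hypothetical function names and shapes: the key property is that a new revision starts as a draft, so live traffic keeps using the previously published revision until you publish.

```javascript
// Sketch of the template/revision model: shapes and function names are
// hypothetical. A new revision is a draft until published, so live
// traffic is unaffected by edits in progress.
function createRevision(template, config) {
  const revision = { id: template.revisions.length + 1, config };
  template.revisions.push(revision);
  return revision; // a draft: live traffic keeps using liveRevisionId
}

function publishRevision(template, revisionId) {
  template.liveRevisionId = revisionId; // switch live traffic over
}

const template = {
  name: 'Our GA4 Integration',
  revisions: [],
  liveRevisionId: null,
};
const v1 = createRevision(template, { send_items: false });
publishRevision(template, v1.id);
createRevision(template, { send_items: true }); // draft: live traffic still on v1
```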

Blueprints

A blueprint is a pre-built configuration for a specific vendor and industry vertical. Instead of mapping every field from scratch, you can select a blueprint when installing an integration to get a working configuration immediately.

For example, the GA4 Retail Blueprint pre-maps common e-commerce events (purchase, add_to_cart, view_item, etc.) to GA4’s Measurement Protocol format, including all the standard parameters GA4 expects.

Available blueprint verticals include Retail, Travel, and Media. Blueprints are fully editable after install — they’re a starting point, not a constraint.
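To make the blueprint idea concrete, here is a hand-written sketch of roughly what the GA4 Retail Blueprint's purchase mapping produces. The canonical event field names (orderId, total, items, etc.) are assumptions; the output shape follows GA4's Measurement Protocol purchase event.

```javascript
// Sketch only: the canonical event field names are assumptions; the
// output follows GA4's Measurement Protocol "purchase" event shape.
function mapPurchaseToGa4(event) {
  return {
    client_id: event.anonymousId,
    events: [
      {
        name: 'purchase',
        params: {
          transaction_id: event.orderId,
          value: event.total,
          currency: event.currency,
          items: event.items.map((item) => ({
            item_id: item.sku,
            item_name: item.name,
            price: item.price,
            quantity: item.quantity,
          })),
        },
      },
    ],
  };
}

const payload = mapPurchaseToGa4({
  anonymousId: 'df-anon-1',
  orderId: 'ord-1001',
  total: 59.98,
  currency: 'USD',
  items: [{ sku: 'SKU-1', name: 'T-shirt', price: 29.99, quantity: 2 }],
});
```

A blueprint ships mappings like this pre-built for each standard event, which is what lets you skip the field-by-field setup.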

The V2 Schema-Mapping Builder

When you configure an integration (either from a blueprint or from scratch), you use the V2 schema-mapping builder in the Management UI. This is where you define:

  • Parameters — Connection credentials and shared values (e.g. measurement_id, api_secret)
  • Global mappings — Fields that apply to every event sent to this vendor (e.g. client_id, user_id)
  • Event mappings — Per-event-type field mappings (e.g. map purchase events to GA4’s purchase event with transaction_id, value, currency, items[])
  • Defaults — Fallback values applied when source data is missing

Each field mapping specifies a source (where the value comes from in the canonical event) and a mode (direct mapping, static value, expression, or computed).
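The four modes can be sketched as follows. The mode names mirror the builder's options; the evaluation details here are assumptions, not the engine's actual semantics.

```javascript
// Sketch of resolving a single field mapping by mode. Evaluation
// details are illustrative assumptions.
function resolveField(mapping, event) {
  switch (mapping.mode) {
    case 'direct': // copy the value straight from the canonical event
      return event[mapping.source];
    case 'static': // always send a fixed value
      return mapping.value;
    case 'expression': // derive the value from the event
      return mapping.fn(event);
    case 'computed': // engine-provided values, e.g. a send timestamp
      return mapping.compute === 'timestamp_micros'
        ? Date.now() * 1000
        : undefined;
  }
}

const event = { order_id: 'ord-1', total: 99.99 };
resolveField({ mode: 'direct', source: 'order_id' }, event); // 'ord-1'
resolveField({ mode: 'static', value: 'web' }, event); // 'web'
resolveField({ mode: 'expression', fn: (e) => e.total.toFixed(2) }, event); // '99.99'
```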

Event Processing

Events flow through two processing layers before being delivered to vendors:

Layer 1: Organisation Data Layer

This runs on every event regardless of which integrations will receive it. It handles tenant-wide governance:

| Step | What it does |
| --- | --- |
| Consent enforcement | Checks consent state and strips non-consented data |
| Bot filtering | Removes non-human traffic (known bots, data centre IPs, behavioural signals) |
| PII detection | Detects and handles personal data (hash, redact, or pass through per your policy) |
| Geolocation | Enriches events with country, region, and city from the IP address |
| Device parsing | Structures the user agent into device type, browser, and OS |
| Session stitching | Groups events into user sessions |
| Identity resolution | Attaches known user IDs and vendor IDs to the event |
| Deduplication | Prevents the same event from being processed twice |

You configure these settings in Settings > Organisation Data Layer in the Management UI.
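The steps above can be sketched as a chain where each step either enriches the event or drops it. The in-memory Set used for deduplication is purely illustrative; a real deployment would need a shared, time-bounded store.

```javascript
// Illustrative sketch of Layer 1 as a chain of steps. Each step returns
// the (possibly enriched) event, or null to drop it.
const seen = new Set();
const steps = [
  function deduplicate(event) {
    if (seen.has(event.id)) return null; // already processed: drop it
    seen.add(event.id);
    return event;
  },
  function botFilter(event) {
    // Crude stand-in for real bot detection.
    return event.userAgent && event.userAgent.includes('bot') ? null : event;
  },
];

function runDataLayer(event) {
  for (const step of steps) {
    event = step(event);
    if (event === null) return null; // dropped by a step
  }
  return event;
}

runDataLayer({ id: 'e1', userAgent: 'Mozilla/5.0' }); // passes
runDataLayer({ id: 'e1', userAgent: 'Mozilla/5.0' }); // null (duplicate)
```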

Layer 2: Pipeline Transformation Engine

This runs once per integration the event is routed to. It transforms the canonical event into the exact format each vendor expects, using the field mappings you defined in the integration configuration.
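The fan-out can be sketched like this, with each transform function standing in for the field mappings defined in that integration's configuration:

```javascript
// Sketch of the Layer 2 fan-out: the canonical event is transformed
// once per routed integration. The transform functions are stand-ins
// for real mapping configurations.
function fanOut(event, integrations) {
  return integrations.map(({ vendor, transform }) => ({
    vendor,
    payload: transform(event),
  }));
}

const deliveries = fanOut(
  { name: 'purchase', value: 50 },
  [
    {
      vendor: 'ga4',
      transform: (e) => ({ events: [{ name: e.name, params: { value: e.value } }] }),
    },
    {
      vendor: 'meta',
      transform: (e) => ({ event_name: e.name, custom_data: { value: e.value } }),
    },
  ]
);
```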

The Datafly.js Collector

Datafly.js is a lightweight JavaScript SDK (under 8KB gzipped) that you add to your website. It replaces all vendor-specific tags and sends events to your Datafly Signal endpoint.

It provides four core methods:

| Method | Purpose | Example |
| --- | --- | --- |
| page() | Track a page view | Called on every page load |
| track() | Track a custom event | _df.track('Purchase', { value: 99.99 }) |
| identify() | Set the user's identity | _df.identify('user-123', { email: '...' }) |
| group() | Associate the user with a group | _df.group('company-456', { name: '...' }) |
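Put together, a simple tracking plan using these methods might look like the snippet below. The _df object here is a buffering stub so the example is self-contained; it is a common tag-loader pattern and a stand-in, not the real SDK.

```javascript
// Stand-in stub for the collector: buffers calls so the example runs
// anywhere. The real Datafly.js SDK sends these to your Signal endpoint.
const _df = { queue: [] };
for (const method of ['page', 'track', 'identify', 'group']) {
  _df[method] = (...args) => _df.queue.push([method, ...args]);
}

_df.page();
_df.identify('user-123', { plan: 'pro' });
_df.track('Purchase', { value: 99.99, currency: 'USD' });
_df.group('company-456', { name: 'Acme' });
```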

The collector also automatically captures:

  • Ad click IDs from URL parameters (gclid, fbclid, ttclid, etc.)
  • Consent state from your consent management platform
  • Page context (URL, title, referrer)
  • The anonymous ID (_dfid) for identity stitching
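The click-ID capture, for example, amounts to reading known query parameters from the landing-page URL. The parameter names come from the list above; the function itself is an illustrative sketch, not the collector's code.

```javascript
// Sketch of automatic ad-click-ID capture from the landing-page URL.
const CLICK_ID_PARAMS = ['gclid', 'fbclid', 'ttclid'];

function captureClickIds(url) {
  const query = new URL(url).searchParams;
  const ids = {};
  for (const name of CLICK_ID_PARAMS) {
    const value = query.get(name);
    if (value) ids[name] = value;
  }
  return ids;
}

captureClickIds('https://shop.example.com/?gclid=abc123&utm_source=google');
// → { gclid: 'abc123' }
```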

Team & Roles

Datafly Signal uses role-based access control to manage who can do what:

| Role | What they can do |
| --- | --- |
| Org Admin | Full access — manage team, sources, pipelines, integrations, settings |
| Source Admin | Manage sources, integrations, pipelines, and brands |
| Source Editor | Create and edit sources, integrations, and transformations |
| Source Viewer | Read-only access to all resources and the real-time debugger |
| Data Governance Admin | Manage data layer, transformations, and consent settings |

Team management is available under Settings > RBAC in the Management UI.

Next Steps

Now that you understand the building blocks, head to Your First Pipeline to set everything up.