Azure Synapse Analytics

Datafly Signal writes events directly into a Synapse dedicated SQL pool table over the SQL Server wire protocol, so you can analyse first-party behaviour with the rest of your Azure data estate.

Prerequisites

Before configuring Azure Synapse Analytics in Signal, you need an Azure account with a Synapse workspace, a dedicated SQL pool, and a target table.

Create an Azure Account

If you don’t already have one, sign up at azure.microsoft.com.

Create a Synapse Workspace

  1. In the Azure portal, search for Azure Synapse Analytics and click Create.
  2. Select your Subscription and Resource group.
  3. Enter a Workspace name (e.g. datafly-synapse).
  4. Select or create a Data Lake Storage Gen2 account for the workspace.
  5. Set the SQL administrator login and Password (for the built-in serverless SQL pool).
  6. Click Review + Create > Create.

Create a Dedicated SQL Pool

  1. Open your Synapse workspace in the Azure portal.
  2. Go to SQL pools > + New.
  3. Enter a Dedicated SQL pool name (e.g. datafly_pool).
  4. Choose a Performance level (start with DW100c for development, scale up for production).
  5. Click Review + Create > Create.

Dedicated SQL pools are billed while running. Pause the pool when not in use to save costs. Serverless SQL pools are billed per query and do not need to be paused.

Create a Table

Connect to the dedicated SQL pool using Synapse Studio, Azure Data Studio, or SSMS:

CREATE TABLE dbo.events (
  event_id NVARCHAR(64) NOT NULL,
  type NVARCHAR(20),
  event NVARCHAR(256),
  anonymous_id NVARCHAR(64),
  user_id NVARCHAR(256),
  timestamp DATETIME2,
  received_at DATETIME2,
  sent_at DATETIME2,
  context NVARCHAR(MAX),
  properties NVARCHAR(MAX),
  traits NVARCHAR(MAX),
  source_id NVARCHAR(64),
  integration_id NVARCHAR(64)
)
WITH (
  DISTRIBUTION = HASH(anonymous_id),
  CLUSTERED COLUMNSTORE INDEX
);

Hash distribution on anonymous_id optimises queries that filter or join by user. Clustered columnstore index provides excellent compression and analytics query performance.

Get the Server Endpoint

In the Synapse workspace overview, copy the Dedicated SQL endpoint (e.g. datafly-synapse.sql.azuresynapse.net). Synapse listens on TCP port 1433 and requires Encrypt=True.

Configure Firewall Rules

  1. In the Synapse workspace, go to Security > Firewalls.
  2. Add the IP addresses of your Signal deployment.
  3. Click Save.

Configuration

FieldTypeRequiredDescription
serverstringYesThe Synapse dedicated SQL endpoint hostname (e.g. datafly-synapse.sql.azuresynapse.net).
portstringYesThe SQL Server port. Defaults to 1433.
databasestringYesThe target database / SQL pool name.
schema_namestringYesThe schema to write to. Defaults to dbo.
table_namestringYesThe target table name to insert rows into.
usernamestringYesSQL authentication username.
passwordsecretYesSQL authentication password.
encryptstringYesSet to true to require TLS. Synapse requires encryption.

Signal Setup

Quick Setup

  1. Navigate to Integrations in the sidebar.
  2. Open the Integration Library tab.
  3. Find Azure Synapse Analytics or filter by Warehouse.
  4. Click Install, select a variant if available, and fill in the required fields.
  5. Click Install Integration to create the integration with a ready-to-use default blueprint.

API Setup

curl -X POST http://localhost:8084/v1/admin/integration-catalog/azure_synapse/install \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Azure Synapse Analytics",
    "variant": "default",
    "config": {
      "server": "datafly-synapse.sql.azuresynapse.net",
      "port": "1433",
      "database": "datafly_pool",
      "schema_name": "dbo",
      "table_name": "events",
      "username": "sqladmin",
      "password": "YOUR_PASSWORD",
      "encrypt": "true"
    },
    "delivery_mode": "server_side"
  }'

Schema

Signal writes the standard event envelope. The table definition shown in Prerequisites uses:

ColumnSynapse typeNotes
event_idNVARCHAR(64) NOT NULLUnique per event.
typeNVARCHAR(20)Event type.
eventNVARCHAR(256)Event name (snake_case).
anonymous_idNVARCHAR(64)First-party visitor identifier. Good HASH distribution column.
user_idNVARCHAR(256)Logged-in user identifier (nullable).
timestampDATETIME2Client event time.
received_atDATETIME2Time Signal received the event.
sent_atDATETIME2Time the row was delivered.
contextNVARCHAR(MAX)JSON document — page, device, consent metadata.
propertiesNVARCHAR(MAX)JSON document — custom event properties.
traitsNVARCHAR(MAX)JSON document — user traits.
source_idNVARCHAR(64)Pipeline source identifier.
integration_idNVARCHAR(64)Signal integration identifier.

Query JSON columns with JSON_VALUE() / OPENJSON().

Synapse is a first-party destination under your control. The default blueprint forwards all events. If you need consent filtering, apply it via pipeline transforms or downstream views over context.

Testing

  1. Enable the integration in Signal and trigger a test event on your website.
  2. Open Synapse Studio and navigate to the dedicated SQL pool.
  3. Run a query to check for events:
SELECT TOP 10 * FROM dbo.events ORDER BY timestamp DESC;
  1. Verify that event rows are appearing with correct data.
  2. In Signal, check the Live Events view to confirm delivery status shows as successful.

Troubleshooting

ProblemSolution
Events not appearing in the tableVerify the connection string, database, schema, and table name are correct. Ensure the dedicated SQL pool is running (not paused).
Connection refused / timeoutCheck the Synapse firewall rules. Add Signal’s IP addresses.
Login failedVerify the username and password in the connection string.
INSERT permission deniedThe user lacks INSERT permission. Grant it with GRANT INSERT ON dbo.events TO your_user;.
SQL pool is pausedDedicated SQL pools must be running to accept connections. Resume the pool in the Azure portal.
Invalid object nameThe table does not exist or the schema is incorrect. Verify with SELECT * FROM INFORMATION_SCHEMA.TABLES;.
Performance issuesCheck the SQL pool performance level (DWU). Scale up if insert throughput is insufficient.

Visit Azure Synapse Analytics documentation for full T-SQL reference, distribution strategies, and performance tuning guides.

See also