Amazon Redshift

Datafly Signal delivers events to Amazon Redshift for petabyte-scale data warehousing, analytics, and BI reporting.

⚠️

This integration is currently in beta. Configuration and behaviour may change.

Prerequisites

Before configuring Amazon Redshift in Signal, you need an AWS account with a Redshift cluster, a target schema and table, and a database user with INSERT privileges.

Create an AWS Account

If you don’t already have one, sign up at aws.amazon.com.

Create a Redshift Cluster

  1. Open the Redshift console.
  2. Click Create cluster.
  3. Enter a Cluster identifier (e.g. datafly-analytics).
  4. Choose a node type and number of nodes based on your expected data volume.
  5. Set an Admin user name and Admin user password.
  6. Under Network and security, configure the VPC and subnet group.
  7. Click Create cluster.
  8. Once the cluster is available, note the Endpoint (e.g. datafly-analytics.abc123.us-east-1.redshift.amazonaws.com) and Port (default 5439).

Configure Network Access

  1. In the Redshift cluster details, click the VPC security group link.
  2. Add an Inbound rule allowing TCP traffic on port 5439 from the IP addresses or CIDR range of your Signal deployment.
  3. If Signal runs outside AWS, enable Publicly accessible on the cluster (under Modify > Network and security).

⚠️

If your cluster is publicly accessible, restrict the security group inbound rules to only the IP addresses of your Signal infrastructure. Never allow 0.0.0.0/0 access to a production cluster.
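
To confirm the inbound rule works, a plain TCP probe from the machine where Signal runs is usually enough. The sketch below uses only the Python standard library. The function name `can_connect` is illustrative, not part of Signal, and the probe only shows that the port is reachable. It does not test credentials or SSL.

```python
import socket


def can_connect(host: str, port: int = 5439, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        # create_connection resolves the hostname and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False


# Example (replace the placeholder endpoint with your own cluster endpoint):
# can_connect("datafly-analytics.abc123.us-east-1.redshift.amazonaws.com", 5439)
```

A `False` result from the Signal host typically points at the security group rule or the Publicly accessible setting rather than at the database user.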

Create a Schema and Table

Connect to the cluster using the Redshift Query Editor or a SQL client (e.g. psql, DBeaver):

CREATE SCHEMA datafly;
 
CREATE TABLE datafly.events (
  event_id VARCHAR(64) NOT NULL,
  type VARCHAR(20),
  event VARCHAR(256),
  anonymous_id VARCHAR(64),
  user_id VARCHAR(256),
  timestamp TIMESTAMP,
  received_at TIMESTAMP,
  sent_at TIMESTAMP,
  context SUPER,
  properties SUPER,
  traits SUPER,
  source_id VARCHAR(64),
  integration_id VARCHAR(64)
)
DISTKEY(anonymous_id)
SORTKEY(timestamp);

Using anonymous_id as the distribution key keeps each visitor's rows together on one slice, which speeds up joins and aggregations by user. Using timestamp as the sort key lets Redshift skip blocks that fall outside a queried time range.
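
If you want to seed the table with a hand-written row before connecting Signal, the following sketch builds a parameterised INSERT for the columns defined above. It is an assumption-laden illustration, not a description of how Signal writes rows: `build_insert` is a hypothetical helper, and the `%s` placeholders follow the style used by DB-API drivers such as psycopg2. The SUPER columns are sent as JSON text and wrapped in Redshift's JSON_PARSE function.

```python
import json

# Columns of datafly.events, in the order used by the CREATE TABLE statement.
COLUMNS = [
    "event_id", "type", "event", "anonymous_id", "user_id",
    "timestamp", "received_at", "sent_at",
    "context", "properties", "traits",
    "source_id", "integration_id",
]
SUPER_COLUMNS = {"context", "properties", "traits"}


def build_insert(event: dict) -> tuple[str, list]:
    """Build a parameterised INSERT statement and its parameter list for one event."""
    placeholders = []
    params = []
    for col in COLUMNS:
        value = event.get(col)
        if col in SUPER_COLUMNS and value is not None:
            # Serialise to JSON text; JSON_PARSE converts it to a SUPER value.
            placeholders.append("JSON_PARSE(%s)")
            params.append(json.dumps(value))
        else:
            # Missing keys become None, which drivers send as NULL.
            placeholders.append("%s")
            params.append(value)
    sql = (
        f"INSERT INTO datafly.events ({', '.join(COLUMNS)}) "
        f"VALUES ({', '.join(placeholders)})"
    )
    return sql, params
```

The returned pair can be passed to a cursor's `execute(sql, params)`. Note that the datafly_signal user created below has INSERT only, so it can run this statement but cannot SELECT the row back.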

Create a Database User for Signal

Create a dedicated user with limited privileges:

CREATE USER datafly_signal PASSWORD 'your_secure_password';
GRANT USAGE ON SCHEMA datafly TO datafly_signal;
GRANT INSERT ON TABLE datafly.events TO datafly_signal;

Configuration

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| host | string | Yes | The Redshift cluster endpoint hostname. |
| port | string | Yes | The Redshift port. Defaults to 5439. |
| database | string | Yes | The database name. |
| schema | string | Yes | The schema containing the target table. |
| username | string | Yes | The database username for authentication. |
| password | secret | Yes | The database password for authentication. |
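
A quick pre-flight check can catch a missing or mistyped field before the configuration is submitted. The sketch below is illustrative only: `validate_config` is a hypothetical helper, and it assumes every field, including the secret password, is supplied as a non-empty string and that port is a numeric string, as in the table above.

```python
REQUIRED_FIELDS = ("host", "port", "database", "schema", "username", "password")


def validate_config(config: dict) -> list[str]:
    """Return a list of problems found in a Redshift integration config (empty if none)."""
    problems = []
    for field in REQUIRED_FIELDS:
        value = config.get(field)
        if not isinstance(value, str) or not value.strip():
            problems.append(f"{field}: required, must be a non-empty string")

    port = config.get("port")
    if isinstance(port, str) and port.strip():
        try:
            number = int(port)
        except ValueError:
            number = 0  # Non-numeric text falls outside the valid range below.
        if not 1 <= number <= 65535:
            problems.append("port: must be a number between 1 and 65535")
    return problems
```

For example, passing port as the integer 5439 instead of the string "5439" is reported as a problem, because the field type is string.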

Signal Setup

Quick Setup

  1. Navigate to Integrations in the sidebar.
  2. Open the Integration Library tab.
  3. Find Amazon Redshift or filter by Cloud Storage.
  4. Click Install, select a variant if available, and fill in the required fields.
  5. Click Install Integration to create the integration with a ready-to-use default blueprint.

API Setup

curl -X POST http://localhost:8084/v1/admin/integration-catalog/amazon_redshift/install \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Amazon Redshift",
    "variant": "default",
    "config": {
      "host": "datafly-analytics.abc123.us-east-1.redshift.amazonaws.com",
      "port": "5439",
      "database": "analytics",
      "schema": "datafly",
      "username": "datafly_signal",
      "password": "your_secure_password"
    },
    "delivery_mode": "server_side"
  }'
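
The same request can be built from a script. The sketch below mirrors the curl call using only the Python standard library. `build_install_request` is a hypothetical helper name, and the base URL and token are placeholders. The shape of the response is not documented here, so the commented send step simply prints whatever comes back.

```python
import json
import urllib.request


def build_install_request(base_url: str, token: str, config: dict) -> urllib.request.Request:
    """Build (but do not send) the install request shown in the curl example."""
    body = {
        "name": "Amazon Redshift",
        "variant": "default",
        "config": config,
        "delivery_mode": "server_side",
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/admin/integration-catalog/amazon_redshift/install",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# To send the request:
# req = build_install_request("http://localhost:8084", "YOUR_TOKEN", config)
# with urllib.request.urlopen(req, timeout=10) as resp:
#     print(resp.status, resp.read().decode("utf-8"))
```

Separating the build step from the send step makes it easy to inspect the payload before anything reaches the Signal API.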

Testing

  1. Enable the integration in Signal and trigger a test event on your website.
  2. Connect to the Redshift cluster and query the target table:
SELECT * FROM datafly.events ORDER BY timestamp DESC LIMIT 10;
  3. Verify event rows are appearing with correct data.
  4. In Signal, check the Live Events view to confirm delivery status shows as successful.

Troubleshooting

| Problem | Solution |
| --- | --- |
| Events not appearing in Redshift | Verify the host, port, database, schema, and table name are correct. |
| Connection refused / timeout | Check that the security group allows inbound traffic on port 5439 from Signal’s IP addresses. Verify the cluster is publicly accessible if Signal runs outside the VPC. |
| permission denied for schema | The database user lacks USAGE on the schema. Run GRANT USAGE ON SCHEMA datafly TO datafly_signal;. |
| permission denied for relation | The database user lacks INSERT on the table. Run GRANT INSERT ON TABLE datafly.events TO datafly_signal;. |
| SSL connection errors | Redshift requires SSL by default. Ensure your Signal deployment trusts the Amazon root CA certificates. |
| Slow inserts | Consider increasing the batch size in the integration settings. Verify the sort key and dist key are configured appropriately. |
| Credential errors | Verify the username and password are correct. Check that the user has not been locked out. |

See the Amazon Redshift documentation for the full SQL reference and cluster management guides.