
Amazon Data Firehose

Datafly Signal delivers events to Amazon Data Firehose (formerly Kinesis Data Firehose) for automatic batching, transformation, and loading into S3, Redshift, Elasticsearch, Splunk, and other downstream destinations.

Prerequisites

Before configuring Amazon Data Firehose in Signal, you need an AWS account with a Firehose delivery stream and IAM credentials.

Create an AWS Account

If you don’t already have one, sign up at aws.amazon.com. Ensure billing is configured and you have console access.

Create a Firehose Delivery Stream

  1. Open the Amazon Data Firehose console.
  2. Click Create Firehose stream.
  3. For Source, select Direct PUT (Signal writes directly to Firehose).
  4. Enter a Firehose stream name (e.g. datafly-events-stream).
  5. Choose your Destination (e.g. Amazon S3, Amazon Redshift, Elasticsearch, Splunk, or HTTP endpoint).
  6. Configure the destination settings:
    • For S3: specify the bucket, prefix, and compression settings.
    • For Redshift: specify the cluster, database, table, and COPY options.
  7. Under Buffer settings, configure buffer size (1-128 MB) and buffer interval (60-900 seconds).
  8. Click Create Firehose stream.

The buffer settings control how Firehose batches data before delivering to the destination. Smaller buffers reduce latency; larger buffers improve throughput and reduce costs.
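To build intuition for the tradeoff, here is a back-of-the-envelope sketch (hypothetical traffic numbers, not a Firehose API call) estimating which buffer condition flushes first at a given event rate:

```python
# Rough sketch: which Firehose buffer condition triggers first?
# The traffic numbers below are hypothetical illustrations.

def first_flush_seconds(events_per_sec: float, avg_event_bytes: int,
                        buffer_mb: int = 5, buffer_interval_s: int = 300) -> float:
    """Estimate seconds until Firehose flushes the buffer:
    whichever comes first, the size threshold or the interval."""
    bytes_per_sec = events_per_sec * avg_event_bytes
    seconds_to_fill = (buffer_mb * 1024 * 1024) / bytes_per_sec
    return min(seconds_to_fill, buffer_interval_s)

# At 200 events/s of ~1 KB each, a 5 MB buffer fills in ~26 s,
# so the size threshold flushes long before a 300 s interval.
print(round(first_flush_seconds(200, 1024)))  # -> 26
```

At low event rates the interval dominates, which is why sparse streams can take up to the full buffer interval to appear at the destination.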

Create an IAM User for Signal

  1. Open the IAM console.
  2. Go to Users > Create user.
  3. Enter a username (e.g. datafly-signal-firehose).
  4. Select Attach policies directly and create a custom policy:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "firehose:PutRecord",
        "firehose:PutRecordBatch"
      ],
      "Resource": "arn:aws:firehose:us-east-1:123456789012:deliverystream/datafly-events-stream"
    }
  ]
}
  5. Replace the region, account ID, and stream name with your values.
  6. Attach the policy to the user.
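If you template this policy across environments, a small helper can substitute the region, account ID, and stream name (the values below are the placeholders from the example policy above):

```python
import json

def firehose_put_policy(region: str, account_id: str, stream_name: str) -> str:
    """Build the minimal IAM policy document Signal needs:
    PutRecord and PutRecordBatch on a single delivery stream."""
    policy = {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Action": ["firehose:PutRecord", "firehose:PutRecordBatch"],
            "Resource": f"arn:aws:firehose:{region}:{account_id}"
                        f":deliverystream/{stream_name}",
        }],
    }
    return json.dumps(policy, indent=2)

print(firehose_put_policy("us-east-1", "123456789012", "datafly-events-stream"))
```

Scoping the Resource to one stream ARN (rather than `*`) keeps the credentials useless for any other Firehose stream in the account.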

Generate Access Keys

  1. On the IAM user detail page, go to Security credentials.
  2. Under Access keys, click Create access key.
  3. Select Application running outside AWS.
  4. Copy the Access Key ID and Secret Access Key immediately — the secret is only shown once.
⚠️ Store these credentials securely. If you lose the secret access key, you must create a new key pair.

Configuration

Field | Type | Required | Description
delivery_stream_name | string | Yes | The name of the Firehose delivery stream to send records to.
region | select | Yes | The AWS region where your Firehose delivery stream is configured.
access_key_id | secret | Yes | The AWS access key ID with firehose:PutRecord and firehose:PutRecordBatch permissions.
secret_access_key | secret | Yes | The AWS secret access key associated with the access key ID.

Signal Setup

Quick Setup

  1. Navigate to Integrations in the sidebar.
  2. Open the Integration Library tab.
  3. Find Amazon Data Firehose or filter by Cloud Storage.
  4. Click Install, select a variant if available, and fill in the required fields.
  5. Click Install Integration to create the integration with a ready-to-use default blueprint.

API Setup

curl -X POST http://localhost:8084/v1/admin/integration-catalog/amazon_firehose/install \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Amazon Data Firehose",
    "variant": "default",
    "config": {
      "delivery_stream_name": "datafly-events-stream",
      "region": "us-east-1",
      "access_key_id": "AKIAIOSFODNN7EXAMPLE",
      "secret_access_key": "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY"
    },
    "delivery_mode": "server_side"
  }'

Testing

  1. Enable the integration in Signal and trigger a test event on your website.
  2. Open the Firehose console and select your delivery stream.
  3. Check the Monitoring tab for incoming record counts and delivery success metrics.
  4. Verify data is arriving at the configured destination (e.g. check the S3 bucket for new files, or query the Redshift table).
  5. In Signal, check the Live Events view to confirm delivery status shows as successful.

Troubleshooting

Problem | Solution
Events not appearing in the destination | Check the Firehose Monitoring tab for errors. Verify the delivery stream name and region are correct.
AccessDeniedException | The IAM user lacks firehose:PutRecord or firehose:PutRecordBatch permission. Update the IAM policy.
ResourceNotFoundException | The delivery stream does not exist in the specified region. Verify the stream name and region.
Data delayed in destination | Firehose buffers data before delivery. Check the buffer size and interval settings on the stream. Minimum interval is 60 seconds.
ServiceUnavailableException | Firehose is temporarily unavailable. Signal will automatically retry. Check the AWS Service Health Dashboard.
Destination delivery failures | Check the Firehose error log (S3 error prefix or CloudWatch logs) for destination-specific errors.
Credential errors | Verify the access key ID and secret access key are correct and the IAM user has not been deactivated.
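For context on the retry behavior above: the Firehose PutRecordBatch API accepts at most 500 records per call, so a producer like Signal has to split larger bursts into multiple calls. A simplified sketch of record-count chunking (illustrative only, not Signal's actual implementation):

```python
def chunk_records(records: list, max_records: int = 500) -> list:
    """Split a burst of records into PutRecordBatch-sized chunks
    (Firehose caps each call at 500 records)."""
    return [records[i:i + max_records]
            for i in range(0, len(records), max_records)]

batches = chunk_records(list(range(1200)))
print([len(b) for b in batches])  # -> [500, 500, 200]
```

A production client would also respect the per-call payload size limit and re-enqueue any records the API reports as failed within an otherwise successful batch.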

Visit the Amazon Data Firehose documentation for the full API reference and destination configuration guides.