Amazon S3

Datafly Signal writes batches of first-party events as objects into an S3 bucket — ready for query with Athena, Glue, Spark, Presto, or any downstream lakehouse tooling.

Prerequisites

Before configuring Amazon S3 in Signal, you need an AWS account with an S3 bucket and an IAM user with write permissions.

Create an AWS Account

If you don’t already have one, sign up at aws.amazon.com.

Create an S3 Bucket

  1. Open the S3 console.
  2. Click Create bucket.
  3. Enter a Bucket name (e.g. datafly-events-production). Bucket names must be globally unique.
  4. Select the AWS Region closest to your Signal deployment.
  5. Leave Block Public Access settings enabled (recommended).
  6. Optionally enable Bucket Versioning for data protection.
  7. Click Create bucket.

Choose a region close to your Signal infrastructure to minimise latency and data transfer costs.
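Because bucket names must be globally unique and follow S3's naming rules, it can help to check a candidate name locally before opening the console. The sketch below covers only a subset of the rules for general-purpose buckets (length, allowed characters, no adjacent dots, not shaped like an IPv4 address). The function name is hypothetical, and the check cannot tell you whether the name is already taken.

```python
import re

# Subset of the S3 general-purpose bucket naming rules: 3-63 characters,
# lowercase letters, digits, dots and hyphens, starting and ending with a
# letter or digit. This is a local sanity check, not the full rule set.
_BUCKET_RE = re.compile(r"[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]")
_IPV4_RE = re.compile(r"\d{1,3}(\.\d{1,3}){3}")


def is_plausible_bucket_name(name: str) -> bool:
    """Return True if `name` passes the local checks described above."""
    if _BUCKET_RE.fullmatch(name) is None:
        return False
    if ".." in name:
        return False  # adjacent dots are not allowed
    if _IPV4_RE.fullmatch(name) is not None:
        return False  # names must not look like an IP address
    return True
```

A name that passes can still be rejected by AWS, for example when another account already owns it.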

Create an IAM User for Signal

  1. Open the IAM console.
  2. Go to Users > Create user.
  3. Enter a username (e.g. datafly-signal-s3).
  4. Attach a custom policy:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:PutObject"
      ],
      "Resource": "arn:aws:s3:::datafly-events-production/*"
    },
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetBucketLocation"
      ],
      "Resource": "arn:aws:s3:::datafly-events-production"
    }
  ]
}
  5. Replace datafly-events-production in both Resource ARNs with your own bucket name.
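If you manage several buckets, generating the policy document from the bucket name avoids editing the two Resource ARNs by hand. This is a minimal sketch that reproduces the policy shown above; the function name build_signal_s3_policy is an assumption for illustration.

```python
import json


def build_signal_s3_policy(bucket: str) -> dict:
    """Build the two-statement policy from this guide for the given bucket."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Object-level permission: note the trailing /* on the ARN.
                "Effect": "Allow",
                "Action": ["s3:PutObject"],
                "Resource": f"arn:aws:s3:::{bucket}/*",
            },
            {
                # Bucket-level permission: the ARN has no object suffix.
                "Effect": "Allow",
                "Action": ["s3:GetBucketLocation"],
                "Resource": f"arn:aws:s3:::{bucket}",
            },
        ],
    }


if __name__ == "__main__":
    # Print JSON that can be pasted into the IAM policy editor.
    print(json.dumps(build_signal_s3_policy("datafly-events-production"), indent=2))
```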

Generate Access Keys

  1. On the IAM user detail page, go to Security credentials.
  2. Under Access keys, click Create access key.
  3. Select Application running outside AWS.
  4. Copy the Access Key ID and Secret Access Key immediately — the secret is only shown once.
⚠️ Store these credentials securely. If you lose the secret access key, you must create a new key pair.

Configuration

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| bucket | string | Yes | The S3 bucket name. Also accepts bucket_name. |
| region | select | Yes | The AWS region of the bucket (e.g. us-east-1). |
| access_key_id | secret | Yes | AWS access key ID with s3:PutObject on the bucket. |
| secret_access_key | secret | Yes | Secret access key matching the access key ID. |
| session_token | secret | No | Optional STS session token for temporary credentials. |
| prefix | string | No | Optional prefix (folder path) prepended to all object keys. Include a trailing slash (e.g. events/). |
| storage_class | select | No | Optional S3 storage class (e.g. STANDARD, STANDARD_IA, INTELLIGENT_TIERING). Defaults to STANDARD. |
| server_side_encryption | select | No | Optional SSE algorithm: AES256 or aws:kms. |
| kms_key_id | string | No | KMS key ARN. Required when server_side_encryption is aws:kms. |
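The field rules above can be checked before a config is submitted. The following is a sketch of such a pre-flight check, not Signal's own validation: the function name and the error messages are assumptions, and only the constraints stated in the table are enforced.

```python
REQUIRED_FIELDS = ("bucket", "region", "access_key_id", "secret_access_key")


def validate_s3_config(config: dict) -> list[str]:
    """Return a list of problems found; an empty list means none were found."""
    cfg = dict(config)
    # The table says bucket_name is accepted as an alias for bucket.
    if "bucket" not in cfg and "bucket_name" in cfg:
        cfg["bucket"] = cfg["bucket_name"]

    problems = []
    for field in REQUIRED_FIELDS:
        if not cfg.get(field):
            problems.append(f"missing required field: {field}")

    sse = cfg.get("server_side_encryption")
    if sse not in (None, "AES256", "aws:kms"):
        problems.append(f"unsupported server_side_encryption: {sse}")
    if sse == "aws:kms" and not cfg.get("kms_key_id"):
        problems.append("kms_key_id is required when server_side_encryption is aws:kms")

    prefix = cfg.get("prefix")
    if prefix and not prefix.endswith("/"):
        problems.append("prefix should end with a trailing slash, e.g. events/")
    return problems
```

Returning a list rather than raising on the first error lets a caller report every problem in one pass.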

Signal Setup

Quick Setup

  1. Navigate to Integrations in the sidebar.
  2. Open the Integration Library tab.
  3. Find Amazon S3 or filter by Cloud Storage.
  4. Click Install, select a variant if available, and fill in the required fields.
  5. Click Install Integration to create the integration with a ready-to-use default blueprint.

API Setup

curl -X POST http://localhost:8084/v1/admin/integration-catalog/amazon_s3/install \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Amazon S3",
    "variant": "default",
    "config": {
      "bucket": "datafly-events-production",
      "region": "us-east-1",
      "access_key_id": "AKIAIOSFODNN7EXAMPLE",
      "secret_access_key": "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY",
      "prefix": "events/",
      "storage_class": "STANDARD",
      "server_side_encryption": "AES256"
    },
    "delivery_mode": "server_side"
  }'
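The same call can be prepared from Python using only the standard library. This sketch mirrors the curl command above (same path, headers, and body fields); the helper name build_install_request is hypothetical, and the response format is not documented here, so the example only constructs the request.

```python
import json
import urllib.request


def build_install_request(base_url: str, token: str, config: dict) -> urllib.request.Request:
    """Build the POST request that installs the Amazon S3 integration."""
    body = {
        "name": "Amazon S3",
        "variant": "default",
        "config": config,
        "delivery_mode": "server_side",
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/admin/integration-catalog/amazon_s3/install",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

To send it, pass the returned object to urllib.request.urlopen. Load the credentials from a secret store or environment variables rather than hard-coding them in the script.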

Schema

Signal writes each batch of events as a gzip-compressed, newline-delimited JSON (NDJSON) object. Each line is one event using the canonical envelope (wrapped here for readability):

{"event_id":"...", "type":"track", "event":"order_completed",
 "anonymous_id":"...", "user_id":"...",
 "timestamp":"2026-05-12T10:00:00Z", "received_at":"...", "sent_at":"...",
 "context":{...}, "properties":{...}, "traits":{...},
 "source_id":"...", "integration_id":"..."}

Object keys follow the pattern <prefix>YYYY/MM/DD/HH/<batch-uuid>.json.gz, suitable for Athena partition projection on date components.
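To make the key layout concrete, the sketch below assembles a key in the documented pattern. It assumes the date and hour components are expressed in UTC and does not say which event or batch timestamp Signal actually uses; both points are assumptions, as is the function name.

```python
import uuid
from datetime import datetime, timezone


def build_object_key(prefix: str, batch_time: datetime, batch_id: uuid.UUID) -> str:
    """Return <prefix>YYYY/MM/DD/HH/<batch-uuid>.json.gz for a batch."""
    # Assumption: partitions are in UTC. Pass a timezone-aware datetime.
    ts = batch_time.astimezone(timezone.utc)
    return f"{prefix}{ts:%Y/%m/%d/%H}/{batch_id}.json.gz"
```

Because the prefix is concatenated as-is, a prefix without a trailing slash would run into the year component, which is why the config asks for one.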

S3 is a first-party destination in your own AWS account. The default blueprint forwards all events. If you need consent-aware partitioning, branch on context.consent in your pipeline transforms before delivery.
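The branching logic for consent-aware partitioning might look like the following. This is plain Python used to illustrate the idea, not Signal's transform syntax. It assumes context.consent is a mapping from category name to a boolean; the category name analytics and the two partition names are hypothetical.

```python
def consent_partition(event: dict, category: str = "analytics") -> str:
    """Pick a key prefix segment based on the event's consent state."""
    context = event.get("context")
    consent = context.get("consent") if isinstance(context, dict) else None
    # Only an explicit True counts as consent; missing or malformed
    # consent data falls through to the restrictive partition.
    if isinstance(consent, dict) and consent.get(category) is True:
        return "consented/"
    return "unconsented/"
```

Defaulting to the restrictive partition means events with no consent information are never mixed in with consented data.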

Testing

  1. Enable the integration in Signal and trigger a test event on your website.
  2. Open the S3 console and navigate to your bucket.
  3. Browse to the prefix path and verify that event files are appearing.
  4. Download a file, decompress it (objects are gzip-compressed, as the .json.gz suffix indicates), and inspect the contents to confirm the event data is correct.
  5. In Signal, check the Live Events view to confirm delivery status shows as successful.
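For step 4, a downloaded batch file can be decompressed and parsed with the Python standard library. This sketch assumes the file is gzip-compressed NDJSON as described in the Schema section; the function name and the local file path are illustrative.

```python
import gzip
import json


def read_events(path: str) -> list[dict]:
    """Read a downloaded .json.gz batch and return one dict per event line."""
    events = []
    with gzip.open(path, "rt", encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if line:  # skip blank lines, e.g. a trailing newline
                events.append(json.loads(line))
    return events


if __name__ == "__main__":
    # Hypothetical local path to a file downloaded from the bucket.
    for event in read_events("batch.json.gz"):
        print(event.get("event_id"), event.get("type"), event.get("event"))
```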

Troubleshooting

| Problem | Solution |
| --- | --- |
| Events not appearing in S3 | Verify that the bucket name, region, and prefix are correct, and that the bucket exists. |
| AccessDenied | The IAM user lacks s3:PutObject permission on objects in the bucket. Update the IAM policy. |
| NoSuchBucket | The bucket does not exist. Verify the bucket name (names are globally unique and may not contain uppercase letters). |
| InvalidAccessKeyId | The access key ID is incorrect, or the IAM user has been deleted. Verify the credentials. |
| SignatureDoesNotMatch | The secret access key is incorrect. Re-enter it, or create a new access key pair if the secret has been lost. |
| Files appearing but empty | Check the batch settings. Events are buffered before flushing to files. |
| Wrong region error | The region config field must match the bucket's actual region. Check the bucket's region in the S3 console. |

See the Amazon S3 documentation for the full API reference, lifecycle policies, and cost optimisation guides.
