Google Cloud Storage

Datafly Signal delivers events as objects to Google Cloud Storage (GCS) buckets for data lake, archival, and batch processing use cases.

⚠️ This integration is currently in beta. Configuration and behaviour may change.

Prerequisites

Before configuring Google Cloud Storage in Signal, you need a GCP project with a GCS bucket and a service account with the Storage Object Creator role.

Create a GCP Account and Project

  1. Sign up at cloud.google.com if you don’t already have an account.
  2. Create a new project or select an existing one in the GCP Console.
  3. Note the Project ID.

Enable the Cloud Storage API

  1. Go to APIs & Services > Library.
  2. Search for Cloud Storage JSON API.
  3. Click Enable (it may already be enabled by default).

Create a GCS Bucket

  1. Go to the Cloud Storage console.
  2. Click Create bucket.
  3. Enter a Bucket name (e.g. datafly-events-production). Names must be globally unique.
  4. Choose a Location type:
    • Region — lowest latency, single region (recommended for Signal).
    • Multi-region — higher availability, replicated across regions.
  5. Choose a Storage class: Standard (recommended for frequently accessed data), Nearline, Coldline, or Archive.
  6. Leave Public access prevention enforced (recommended).
  7. Click Create.

Choose a region close to your Signal infrastructure to minimise latency and egress costs. Standard storage class is recommended for event data that will be queried regularly.
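The console steps above can also be scripted. The following is a minimal sketch using the gcloud CLI, assuming it is installed and authenticated against your project. The project ID, bucket name, and region (`europe-west2`) are placeholder values, and `create_bucket` is a hypothetical helper name, not part of Signal.

```shell
# Placeholder values: replace with your own project, bucket name, and region.
PROJECT_ID="datafly-analytics"
BUCKET_NAME="datafly-events-production"
LOCATION="europe-west2"   # assumption: a region close to your Signal deployment

# Hypothetical helper: creates a single-region bucket with the Standard
# storage class and public access prevention enforced.
create_bucket() {
  gcloud storage buckets create "gs://${BUCKET_NAME}" \
    --project="${PROJECT_ID}" \
    --location="${LOCATION}" \
    --default-storage-class=STANDARD \
    --public-access-prevention
}
```

Run `create_bucket` once you have reviewed the values. The command fails if the bucket name is already taken, since names are globally unique.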

Create a Service Account

  1. Go to IAM & Admin > Service Accounts > Create Service Account.
  2. Enter a name (e.g. datafly-signal-gcs).
  3. Grant the Storage Object Creator role (roles/storage.objectCreator).
  4. Click Done.
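As a rough command-line equivalent of the steps above, the sketch below creates the service account and grants it the role with the gcloud CLI. Note one deliberate difference: it binds the role on the bucket itself rather than on the whole project, which is one way to narrow the account's access. The names and IDs are placeholders, and `create_service_account` is a hypothetical helper name.

```shell
# Placeholder values: replace with your own project, bucket, and account name.
PROJECT_ID="datafly-analytics"
BUCKET_NAME="datafly-events-production"
SA_NAME="datafly-signal-gcs"
SA_EMAIL="${SA_NAME}@${PROJECT_ID}.iam.gserviceaccount.com"

# Hypothetical helper: creates the service account, then grants it
# Storage Object Creator on the target bucket only.
create_service_account() {
  gcloud iam service-accounts create "${SA_NAME}" \
    --project="${PROJECT_ID}" \
    --display-name="Datafly Signal GCS writer"

  gcloud storage buckets add-iam-policy-binding "gs://${BUCKET_NAME}" \
    --member="serviceAccount:${SA_EMAIL}" \
    --role="roles/storage.objectCreator"
}
```

Run `create_service_account` after the bucket exists, because the IAM binding is attached to the bucket.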

Generate a Service Account Key

  1. Click on the service account.
  2. Go to Keys > Add Key > Create new key > JSON.
  3. The key file downloads automatically.

⚠️ Store the JSON key file securely and do not commit it to version control. You will paste the entire JSON content into the Signal configuration.
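If you prefer the command line, a key can be generated with the gcloud CLI as sketched below. The service account email and the key file name are placeholders, and `create_key` is a hypothetical helper name. Restricting the file to owner-only permissions is a suggested precaution, not a Signal requirement.

```shell
# Placeholder values: replace with your own service account and file name.
SA_EMAIL="datafly-signal-gcs@datafly-analytics.iam.gserviceaccount.com"
KEY_FILE="datafly-signal-gcs-key.json"

# Hypothetical helper: writes a new JSON key to KEY_FILE, then makes the
# file readable and writable by the current user only.
create_key() {
  gcloud iam service-accounts keys create "${KEY_FILE}" \
    --iam-account="${SA_EMAIL}" &&
    chmod 600 "${KEY_FILE}"
}
```

Run `create_key` from a directory that is not tracked by version control.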

Configuration

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| bucket_name | string | Yes | The name of the GCS bucket where event files will be written. |
| project_id | string | Yes | The Google Cloud project ID that owns the bucket. |
| service_account_json | secret | Yes | The full JSON key file content for a GCP service account with Storage Object Creator permissions. |
| prefix | string | No | Optional prefix (folder path) prepended to all object names. Include a trailing slash (e.g. events/). |
| file_format | select | Yes | The output file format: json (newline-delimited JSON) or parquet. |

Signal Setup

Quick Setup

  1. Navigate to Integrations in the sidebar.
  2. Open the Integration Library tab.
  3. Find Google Cloud Storage or filter by Cloud Storage.
  4. Click Install, select a variant if available, and fill in the required fields.
  5. Click Install Integration to create the integration with a ready-to-use default blueprint.

API Setup

curl -X POST http://localhost:8084/v1/admin/integration-catalog/google_cloud_storage/install \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Google Cloud Storage",
    "variant": "default",
    "config": {
      "bucket_name": "datafly-events-production",
      "project_id": "datafly-analytics",
      "service_account_json": "{\"type\": \"service_account\", ...}",
      "prefix": "events/",
      "file_format": "json"
    },
    "delivery_mode": "server_side"
  }'
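Escaping the key file by hand for the service_account_json field is error-prone, because the key contains quotes and newline sequences. One way to avoid this, sketched below, is to let jq build the request body from the key file. This assumes jq 1.6 or later (for `--rawfile`); `KEY_FILE`, `SIGNAL_TOKEN`, `build_install_payload`, and `install_integration` are hypothetical names, and the endpoint and field values are copied from the curl request above.

```shell
# Placeholder: path to the downloaded service account key.
KEY_FILE="datafly-signal-gcs-key.json"

# Hypothetical helper: prints the install request body. --rawfile loads the
# key file as a string, so jq takes care of the JSON string escaping.
build_install_payload() {
  jq -n --rawfile sa "${KEY_FILE}" '{
    name: "Google Cloud Storage",
    variant: "default",
    config: {
      bucket_name: "datafly-events-production",
      project_id: "datafly-analytics",
      service_account_json: $sa,
      prefix: "events/",
      file_format: "json"
    },
    delivery_mode: "server_side"
  }'
}

# Hypothetical helper: sends the body to the install endpoint.
# SIGNAL_TOKEN is an assumed environment variable holding your API token.
install_integration() {
  build_install_payload | curl -X POST \
    "http://localhost:8084/v1/admin/integration-catalog/google_cloud_storage/install" \
    -H "Authorization: Bearer ${SIGNAL_TOKEN}" \
    -H "Content-Type: application/json" \
    --data-binary @-
}
```

Running `build_install_payload` on its own prints the body, which is a convenient way to review it before calling `install_integration`.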

Testing

  1. Enable the integration in Signal and trigger a test event on your website.
  2. Open the Cloud Storage console and navigate to your bucket.
  3. Browse to the prefix path and verify that event files are appearing.
  4. Click on a file to download and inspect the event data.
  5. In Signal, check the Live Events view to confirm delivery status shows as successful.
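The bucket checks in steps 2 to 4 can also be done from a terminal. The sketch below assumes the gcloud CLI is authenticated as a user who can read the bucket (the Storage Object Creator role alone does not allow listing or reading objects). The bucket name and prefix are placeholders, and both function names are hypothetical.

```shell
# Placeholder values: use the bucket_name and prefix from your configuration.
BUCKET_NAME="datafly-events-production"
PREFIX="events/"

# Hypothetical helper: lists objects under the prefix with size and timestamp.
list_event_files() {
  gcloud storage ls --long --recursive "gs://${BUCKET_NAME}/${PREFIX}"
}

# Hypothetical helper: prints the first five lines of one object.
# Takes the full gs:// URL of the object as its only argument.
show_first_events() {
  gcloud storage cat "$1" | head -n 5
}
```

For files written with the json format, each printed line should be one event, since the output is newline-delimited JSON. Parquet files are binary and need a Parquet reader instead.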

Troubleshooting

| Problem | Solution |
| --- | --- |
| Events not appearing in the bucket | Verify the bucket name, project ID, and prefix are correct. |
| Permission denied (403) | The service account lacks the Storage Object Creator role on the bucket. Add it in IAM & Admin > IAM. |
| Bucket not found (404) | The bucket does not exist. Verify the bucket name (globally unique, case-sensitive). |
| Invalid service account JSON | Ensure you pasted the complete JSON key file content, including all fields. |
| Files appearing but empty | Check the batch settings. Events are buffered before flushing to files. |
| Bucket location mismatch | The bucket location does not affect connectivity but may impact latency. Choose a location close to your Signal deployment. |
| Uniform bucket-level access errors | If the bucket uses uniform bucket-level access, ensure the service account has the role at the bucket level via IAM, not legacy ACLs. |
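For the permission and key-file problems above, two quick checks from a terminal can narrow things down. This is a sketch, assuming jq and an authenticated gcloud CLI are available. The function names are hypothetical, and the key check only looks at three fields that every service account key contains, so it is not a full validation.

```shell
# Placeholder: the bucket configured in Signal.
BUCKET_NAME="datafly-events-production"

# Hypothetical helper: succeeds (exit status 0) only if the file is valid JSON
# that looks like a service account key. Takes the key file path as argument.
check_key_file() {
  jq -e '.type == "service_account"
         and (.client_email | type == "string")
         and (.private_key | type == "string")' "$1" > /dev/null
}

# Hypothetical helper: prints the bucket-level IAM policy so you can confirm
# the service account is bound to roles/storage.objectCreator.
show_bucket_iam() {
  gcloud storage buckets get-iam-policy "gs://${BUCKET_NAME}"
}
```

For example, `check_key_file datafly-signal-gcs-key.json && echo "key looks complete"` prints the message only when the three fields are present.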

See the Google Cloud Storage documentation for the full API reference, lifecycle management, and access control guides.