Google Cloud Storage
Datafly Signal delivers events as objects to Google Cloud Storage (GCS) buckets for data lake, archival, and batch processing use cases.
This integration is currently in beta. Configuration and behaviour may change.
Prerequisites
Before configuring Google Cloud Storage in Signal, you need a GCP project with a GCS bucket and a service account with the Storage Object Creator role.
Create a GCP Account and Project
- Sign up at cloud.google.com if you don’t already have an account.
- Create a new project or select an existing one in the GCP Console.
- Note the Project ID.
Enable the Cloud Storage API
- Go to APIs & Services > Library.
- Search for Cloud Storage JSON API.
- Click Enable (it may already be enabled by default).
Create a GCS Bucket
- Go to the Cloud Storage console.
- Click Create bucket.
- Enter a Bucket name (e.g. datafly-events-production). Names must be globally unique.
- Choose a Location type:
  - Region: lowest latency, single region (recommended for Signal).
  - Multi-region: higher availability, replicated across regions.
- Choose a Storage class: Standard (recommended for frequently accessed data), Nearline, Coldline, or Archive.
- Leave Public access prevention enforced (recommended).
- Click Create.
Choose a region close to your Signal infrastructure to minimise latency and egress costs. Standard storage class is recommended for event data that will be queried regularly.
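Because bucket names are global, it can help to check a candidate name locally before creating the bucket. The sketch below is a simplified check, not Google's full rule set: it assumes a name without dots and covers only the length, character, and reserved-word rules. The function name `check_bucket_name` is hypothetical, and global uniqueness can only be confirmed by GCS itself.

```python
import re

# Simplified rule for names without dots: 3-63 characters, lowercase letters,
# digits, dashes and underscores, starting and ending with a letter or digit.
_SIMPLE_NAME = re.compile(r"[a-z0-9][a-z0-9_-]{1,61}[a-z0-9]")


def check_bucket_name(name):
    """Return a list of problems found; an empty list means the name looks valid."""
    problems = []
    if not _SIMPLE_NAME.fullmatch(name):
        problems.append(
            "must be 3-63 characters of a-z, 0-9, '-' or '_', "
            "starting and ending with a letter or digit"
        )
    if name.startswith("goog"):
        problems.append("must not start with 'goog'")
    if "google" in name:
        problems.append("must not contain 'google'")
    return problems


print(check_bucket_name("datafly-events-production"))  # prints []
print(check_bucket_name("Datafly-Events"))  # uppercase letters are reported
```

A name that passes this check can still be rejected at creation time if another project already owns it.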
Create a Service Account
- Go to IAM & Admin > Service Accounts > Create Service Account.
- Enter a name (e.g. datafly-signal-gcs).
- Grant the Storage Object Creator role (roles/storage.objectCreator).
- Click Done.
Generate a Service Account Key
- Click on the service account.
- Go to Keys > Add Key > Create new key > JSON.
- The key file downloads automatically.
Store the JSON key file securely. Do not commit it to version control. The entire JSON content is what you will paste into the Signal configuration.
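Since the whole key file is pasted as one value, a truncated or partial paste is an easy mistake to make. The following sketch checks that a pasted string parses as JSON and carries the fields a service account key normally contains. The helper name `check_service_account_key` and the choice of required fields are assumptions for illustration; which fields Signal actually reads is not stated here. The sample key uses placeholder values only.

```python
import json

# Fields assumed necessary for authentication (an illustrative choice).
REQUIRED_FIELDS = ("type", "project_id", "private_key", "client_email", "token_uri")


def check_service_account_key(raw):
    """Return a list of problems with the pasted key; empty means it looks complete."""
    try:
        key = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(key, dict):
        return ["top-level value must be a JSON object"]
    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS if not key.get(name)]
    if key.get("type") not in (None, "", "service_account"):
        problems.append("type must be 'service_account'")
    return problems


# Placeholder key content; a real key file holds an actual private key.
sample = json.dumps({
    "type": "service_account",
    "project_id": "datafly-analytics",
    "private_key": "-----BEGIN PRIVATE KEY-----\n(placeholder)\n-----END PRIVATE KEY-----\n",
    "client_email": "datafly-signal-gcs@datafly-analytics.iam.gserviceaccount.com",
    "token_uri": "https://oauth2.googleapis.com/token",
})
print(check_service_account_key(sample))  # prints []
```

Running a check like this locally avoids pasting the key into third-party validators.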
Configuration
| Field | Type | Required | Description |
|---|---|---|---|
| bucket_name | string | Yes | The name of the GCS bucket where event files will be written. |
| project_id | string | Yes | The Google Cloud project ID that owns the bucket. |
| service_account_json | secret | Yes | The full JSON key file content for a GCP service account with Storage Object Creator permissions. |
| prefix | string | No | Optional prefix (folder path) prepended to all object names. Include a trailing slash (e.g. events/). |
| file_format | select | Yes | The output file format: json (newline-delimited JSON) or parquet. |
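As a rough illustration of how these fields fit together, the sketch below assembles a config object, adds the trailing slash to the prefix if it is missing, and rejects unknown file formats. The helper `build_config` is hypothetical and not part of Signal. The outer fields (`name`, `variant`, `delivery_mode`) mirror the install request shown under API Setup. Note that `service_account_json` is a string containing JSON, so serialising the request escapes its inner quotes.

```python
import json

ALLOWED_FORMATS = {"json", "parquet"}


def build_config(bucket_name, project_id, service_account_json, file_format, prefix=""):
    """Assemble the integration config, normalising the optional prefix."""
    if file_format not in ALLOWED_FORMATS:
        raise ValueError(f"file_format must be one of {sorted(ALLOWED_FORMATS)}")
    if prefix and not prefix.endswith("/"):
        prefix += "/"  # the prefix is expected to carry a trailing slash
    config = {
        "bucket_name": bucket_name,
        "project_id": project_id,
        "service_account_json": service_account_json,
        "file_format": file_format,
    }
    if prefix:
        config["prefix"] = prefix
    return config


# Stand-in for the real key file content, which would be read from disk.
key_text = json.dumps({"type": "service_account"})
config = build_config(
    "datafly-events-production", "datafly-analytics", key_text, "json", prefix="events"
)
body = json.dumps(
    {
        "name": "Google Cloud Storage",
        "variant": "default",
        "config": config,
        "delivery_mode": "server_side",
    },
    indent=2,
)
print(body)
```

Generating the request body this way avoids hand-escaping the key file inside a shell string.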
Signal Setup
Quick Setup
- Navigate to Integrations in the sidebar.
- Open the Integration Library tab.
- Find Google Cloud Storage or filter by Cloud Storage.
- Click Install, select a variant if available, and fill in the required fields.
- Click Install Integration to create the integration with a ready-to-use default blueprint.
API Setup
curl -X POST http://localhost:8084/v1/admin/integration-catalog/google_cloud_storage/install \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Google Cloud Storage",
    "variant": "default",
    "config": {
      "bucket_name": "datafly-events-production",
      "project_id": "datafly-analytics",
      "service_account_json": "{\"type\": \"service_account\", ...}",
      "prefix": "events/",
      "file_format": "json"
    },
    "delivery_mode": "server_side"
  }'
Testing
- Enable the integration in Signal and trigger a test event on your website.
- Open the Cloud Storage console and navigate to your bucket.
- Browse to the prefix path and verify that event files are appearing.
- Click on a file to download and inspect the event data.
- In Signal, check the Live Events view to confirm delivery status shows as successful.
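When file_format is json, a downloaded file is newline-delimited JSON, so it can be inspected one line at a time. The sketch below parses such content into a list of events. The helper `parse_ndjson` and the sample event fields (`event`, `id`) are hypothetical; the actual event schema depends on your blueprint.

```python
import json


def parse_ndjson(text):
    """Parse newline-delimited JSON, skipping blank lines."""
    events = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue
        try:
            events.append(json.loads(line))
        except json.JSONDecodeError as exc:
            raise ValueError(f"line {line_number} is not valid JSON: {exc.msg}") from exc
    return events


# Hypothetical sample content standing in for a downloaded event file.
sample = '{"event": "page_view", "id": 1}\n{"event": "purchase", "id": 2}\n'
events = parse_ndjson(sample)
print(len(events), [e["event"] for e in events])  # prints: 2 ['page_view', 'purchase']
```

An empty result for a file that exists in the bucket points at buffering or batch settings rather than at permissions.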
Troubleshooting
| Problem | Solution |
|---|---|
| Events not appearing in the bucket | Verify the bucket name, project ID, and prefix are correct. |
| Permission denied (403) | The service account lacks the Storage Object Creator role on the bucket. Add it in IAM & Admin > IAM. |
| Bucket not found (404) | The bucket does not exist. Verify the bucket name exactly as created (bucket names are globally unique and lowercase). |
| Invalid service account JSON | Ensure you pasted the complete JSON key file content, including all fields. |
| Files appearing but empty | Check the batch settings. Events are buffered before flushing to files. |
| Bucket location mismatch | The bucket location does not affect connectivity but may impact latency. Choose a location close to your Signal deployment. |
| Uniform bucket-level access errors | If the bucket uses uniform bucket-level access, ensure the service account has the role at the bucket level via IAM, not legacy ACLs. |
Visit the Google Cloud Storage documentation for the full API reference and for guides on lifecycle management and access control.