Google Cloud Spanner

Datafly Signal delivers events to Google Cloud Spanner for globally distributed, strongly consistent, horizontally scalable relational database storage.

Prerequisites

Before configuring Google Cloud Spanner in Signal, you need a GCP project with a Spanner instance, a database with a target table, and a service account.

Create a GCP Account and Project

  1. Sign up at cloud.google.com if you don’t already have an account.
  2. Create a new project or select an existing one in the GCP Console.
  3. Note the Project ID.

Enable the Cloud Spanner API

  1. Go to APIs & Services > Library.
  2. Search for Cloud Spanner API.
  3. Click Enable.

Create a Spanner Instance

  1. Go to the Spanner console.
  2. Click Create instance.
  3. Enter an Instance name (e.g. datafly-events) and Instance ID.
  4. Choose a Configuration:
    • Regional: data in a single region (lower latency, lower cost).
    • Multi-region: data replicated across regions (higher availability).
  5. Set the Compute capacity (processing units or nodes). 1 node = 1000 processing units.
  6. Click Create.

For development and testing, you can use the free trial instance (1 node, limited to specific regions). For production, size the instance based on your expected write throughput.
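
The sizing rule from step 5 (1 node = 1000 processing units) can be sketched as a small conversion helper. The function names are hypothetical and exist only for illustration; the sketch does not check which capacity values Spanner actually accepts for a given instance.

```python
PROCESSING_UNITS_PER_NODE = 1000  # sizing rule stated in step 5 above


def nodes_to_processing_units(nodes: float) -> int:
    """Convert a node count to processing units (1 node = 1000 PU)."""
    if nodes <= 0:
        raise ValueError("nodes must be positive")
    return round(nodes * PROCESSING_UNITS_PER_NODE)


def processing_units_to_nodes(processing_units: int) -> float:
    """Convert processing units back to a (possibly fractional) node count."""
    if processing_units <= 0:
        raise ValueError("processing_units must be positive")
    return processing_units / PROCESSING_UNITS_PER_NODE
```

For example, `nodes_to_processing_units(3)` gives 3000 processing units, and `processing_units_to_nodes(500)` gives half a node.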

Create a Database

  1. In the Spanner console, click on your instance.
  2. Click Create database.
  3. Enter a Database name (e.g. events_db).
  4. Click Create.

Create a Table

In the Spanner console, open the database and run the following DDL:

CREATE TABLE Events (
  event_id STRING(64) NOT NULL,
  type STRING(20),
  event STRING(256),
  anonymous_id STRING(64),
  user_id STRING(256),
  timestamp TIMESTAMP,
  received_at TIMESTAMP,
  context JSON,
  properties JSON,
  traits JSON,
  source_id STRING(64),
  integration_id STRING(64),
) PRIMARY KEY (event_id);

Spanner distributes rows across splits by primary key. Using event_id (a UUID) as the primary key spreads writes evenly across splits. Avoid monotonically increasing primary keys such as timestamps: consecutive writes land on the same split and create hotspots.
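
To illustrate why a UUID key avoids hotspots, the sketch below generates random (version 4) UUIDs with Python's standard library. How Signal itself generates event_id is not documented here, so `new_event_id` is a hypothetical helper for illustration only.

```python
import uuid


def new_event_id() -> str:
    """Return a random (version 4) UUID as a 36-character string."""
    return str(uuid.uuid4())


# Consecutive IDs share no common prefix, so neighbouring writes are
# unlikely to target the same key range. A timestamp key would instead
# always sort at the end of the table.
example_ids = [new_event_id() for _ in range(3)]
```

The 36-character string form fits within the STRING(64) column defined for event_id.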

Create a Service Account

  1. Go to IAM & Admin > Service Accounts > Create Service Account.
  2. Enter a name (e.g. datafly-signal-spanner).
  3. Grant the Cloud Spanner Database User role (roles/spanner.databaseUser).
  4. Click Done.

Generate a Service Account Key

  1. Click on the service account.
  2. Go to Keys > Add Key > Create new key > JSON.
  3. The key file downloads automatically.

⚠️ Store the JSON key file securely. Do not commit it to version control.

Configuration

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| project_id | string | Yes | The Google Cloud project ID that contains the Spanner instance. |
| instance_id | string | Yes | The Spanner instance ID. |
| database_id | string | Yes | The Spanner database ID. |
| table_name | string | Yes | The target table name to insert rows into. |
| service_account_json | secret | Yes | The full JSON key file content for a GCP service account with Spanner Database User permissions. |
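
Before installing the integration, it can help to check the configuration locally. The following is a minimal sketch, not part of Signal: `validate_config` is a hypothetical helper that checks the five required fields listed above and confirms that service_account_json parses as JSON with "type": "service_account".

```python
import json

# Required fields, taken from the configuration table above.
REQUIRED_FIELDS = (
    "project_id",
    "instance_id",
    "database_id",
    "table_name",
    "service_account_json",
)


def validate_config(config: dict) -> list:
    """Return a list of problems; an empty list means no problem was found."""
    problems = []
    for field in REQUIRED_FIELDS:
        value = config.get(field)
        if not isinstance(value, str) or not value.strip():
            problems.append(f"missing or empty field: {field}")

    raw_key = config.get("service_account_json")
    if isinstance(raw_key, str) and raw_key.strip():
        try:
            key = json.loads(raw_key)
        except json.JSONDecodeError:
            problems.append("service_account_json is not valid JSON")
        else:
            if not isinstance(key, dict) or key.get("type") != "service_account":
                problems.append(
                    'service_account_json lacks "type": "service_account"'
                )
    return problems
```

A truncated key file, a common copy-and-paste mistake, shows up as "service_account_json is not valid JSON". The check does not contact Google Cloud, so it cannot confirm that the IDs exist or that the key is still active.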

Signal Setup

Quick Setup

  1. Navigate to Integrations in the sidebar.
  2. Open the Integration Library tab.
  3. Find Google Cloud Spanner or filter by Databases.
  4. Click Install, select a variant if available, and fill in the required fields.
  5. Click Install Integration to create the integration with a ready-to-use default blueprint.

API Setup

curl -X POST http://localhost:8084/v1/admin/integration-catalog/google_spanner/install \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Google Cloud Spanner",
    "variant": "default",
    "config": {
      "project_id": "datafly-analytics",
      "instance_id": "datafly-events",
      "database_id": "events_db",
      "table_name": "Events",
      "service_account_json": "{\"type\": \"service_account\", ...}"
    },
    "delivery_mode": "server_side"
  }'
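
Because service_account_json is a JSON document embedded as a string inside another JSON document, escaping it by hand is error-prone. The sketch below builds the same request as the curl example and lets `json.dumps` handle the escaping. `build_install_request` is a hypothetical helper; the endpoint, headers, and payload fields are copied from the curl example, and the key shown is a placeholder rather than a real key file.

```python
import json
import urllib.request


def build_install_request(base_url: str, token: str, config: dict) -> urllib.request.Request:
    """Build (but do not send) the install request from the curl example."""
    payload = {
        "name": "Google Cloud Spanner",
        "variant": "default",
        "config": config,
        "delivery_mode": "server_side",
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/admin/integration-catalog/google_spanner/install",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder key: a real key file contains more fields than "type".
service_account_key = {"type": "service_account"}

config = {
    "project_id": "datafly-analytics",
    "instance_id": "datafly-events",
    "database_id": "events_db",
    "table_name": "Events",
    # json.dumps turns the key into a correctly escaped string value.
    "service_account_json": json.dumps(service_account_key),
}

request = build_install_request("http://localhost:8084", "YOUR_TOKEN", config)
# Passing `request` to urllib.request.urlopen would send it; that call is
# left out so the sketch has no side effects.
```

In practice the key would be read from the downloaded key file rather than written inline.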

Testing

  1. Enable the integration in Signal and trigger a test event on your website.
  2. Open the Spanner console and navigate to your database.
  3. Go to Query and run:
     SELECT * FROM Events ORDER BY timestamp DESC LIMIT 10;
  4. Verify that event rows are appearing with correct data.
  5. In Signal, check the Live Events view to confirm delivery status shows as successful.

Troubleshooting

| Problem | Solution |
| --- | --- |
| Events not appearing in the table | Verify the project ID, instance ID, database ID, and table name are correct. |
| Permission denied (403) | The service account lacks the Cloud Spanner Database User role. Add it in IAM & Admin > IAM. |
| NOT_FOUND: Database not found | The database does not exist. Verify the database ID in the Spanner console. |
| NOT_FOUND: Table not found | The table does not exist in the database. Verify the table name (case-sensitive in Spanner). |
| Invalid service account JSON | Ensure you pasted the complete JSON key file content. |
| RESOURCE_EXHAUSTED | The instance is at capacity. Increase the number of processing units or nodes. |
| Write hotspots | Avoid sequential primary keys. Use UUIDs or add a hash prefix to distribute writes evenly across splits. |
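
The hash-prefix remedy for write hotspots can be sketched as follows, assuming the table must keep a sequential value (such as a timestamp) in its key. `hash_prefixed_key` and the shard count of 16 are hypothetical choices for illustration, not a Signal or Spanner API.

```python
import hashlib


def hash_prefixed_key(sequential_key: str, num_shards: int = 16) -> str:
    """Prefix a sequential key with a stable shard number derived from its hash."""
    if num_shards < 1:
        raise ValueError("num_shards must be at least 1")
    digest = hashlib.sha256(sequential_key.encode("utf-8")).digest()
    shard = int.from_bytes(digest[:4], "big") % num_shards
    # Zero-pad so every prefix has the same width and sorts consistently.
    width = len(str(num_shards - 1))
    return f"{shard:0{width}d}_{sequential_key}"
```

The prefix depends only on the key, so the same input always yields the same stored key, while neighbouring inputs are scattered across up to `num_shards` key ranges. The trade-off is that range scans over the original key order must query each shard prefix separately.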

See the Google Cloud Spanner documentation for the full SQL reference, schema design best practices, and performance tuning guides.