
Google Bigtable

Datafly Signal writes first-party events as rows into a Google Cloud Bigtable table — high-throughput, low-latency NoSQL storage suited to time-series workloads.

Prerequisites

Before configuring Google Bigtable in Signal, you need a GCP project with a Bigtable instance, a table with a column family, and a service account.

Create a GCP Account and Project

  1. Sign up at cloud.google.com if you don’t already have an account.
  2. Create a new project or select an existing one in the GCP Console.
  3. Note the Project ID.

Enable the Bigtable API

  1. Go to APIs & Services > Library.
  2. Search for Cloud Bigtable API and Cloud Bigtable Admin API.
  3. Click Enable for both.

Create a Bigtable Instance

  1. Go to the Bigtable console.
  2. Click Create instance.
  3. Enter an Instance name (e.g. datafly-events) and Instance ID.
  4. Choose the Storage type:
    • SSD — lowest latency (recommended for real-time workloads).
    • HDD — lower cost for large-volume storage.
  5. Configure clusters — select a region and the number of nodes (minimum 1 for development, 3+ for production).
  6. Click Create.

Bigtable pricing is based on the number of nodes and storage. A single-node development instance is suitable for testing but should not be used in production.

Create a Table with Column Family

  1. In the Bigtable console, click on your instance.
  2. Go to Tables > Create table.
  3. Enter a Table ID (e.g. events).
  4. Add a Column family named events (or your preferred name).
  5. Click Create.

Alternatively, use the cbt CLI tool:

cbt -project your-project -instance datafly-events createtable events
cbt -project your-project -instance datafly-events createfamily events events

Create a Service Account

  1. Go to IAM & Admin > Service Accounts > Create Service Account.
  2. Enter a name (e.g. datafly-signal-bigtable).
  3. Grant the Bigtable User role (roles/bigtable.user).
  4. Click Done.

Generate a Service Account Key

  1. Click on the service account.
  2. Go to Keys > Add Key > Create new key > JSON.
  3. The key file will download. Store it securely.
⚠️ Store the JSON key file securely. Do not commit it to version control.

Configuration

| Field | Type | Required | Description |
|---|---|---|---|
| project_id | string | Yes | The Google Cloud project ID that contains the Bigtable instance. |
| instance_id | string | Yes | The Bigtable instance ID. |
| table | string | Yes | The Bigtable table ID to write rows into. Also accepts table_id. |
| column_family | string | Yes | The column family to write event columns to. |
| service_account_json | secret | Yes | The full JSON key file content for a service account with roles/bigtable.user. |
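
Before installing, it can help to check the configuration locally. The following is a minimal sketch, not part of Signal: the helper name `validate_config` is hypothetical, and the only checks it performs are that the required fields from the table are present and that the key content parses as JSON with `"type": "service_account"`.

```python
import json

# Required fields, taken from the configuration table above.
REQUIRED_FIELDS = ("project_id", "instance_id", "table", "column_family", "service_account_json")


def validate_config(config: dict) -> list[str]:
    """Return a list of problems; an empty list means the config looks complete."""
    cfg = dict(config)
    # The table field also accepts the alias table_id.
    if "table" not in cfg and "table_id" in cfg:
        cfg["table"] = cfg["table_id"]

    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS if not cfg.get(name)]

    raw_key = cfg.get("service_account_json")
    if raw_key:
        try:
            key = json.loads(raw_key)
        except json.JSONDecodeError:
            problems.append("service_account_json is not valid JSON")
        else:
            # Google service account key files carry "type": "service_account".
            if not isinstance(key, dict) or key.get("type") != "service_account":
                problems.append('service_account_json is missing "type": "service_account"')
    return problems
```

An empty return value only means the fields are well-formed locally. It does not confirm that the instance, table, or column family exist in GCP.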

Signal Setup

Quick Setup

  1. Navigate to Integrations in the sidebar.
  2. Open the Integration Library tab.
  3. Find Google Bigtable or filter by Database.
  4. Click Install, select a variant if available, and fill in the required fields.
  5. Click Install Integration to create the integration with a ready-to-use default blueprint.

API Setup

curl -X POST http://localhost:8084/v1/admin/integration-catalog/google_bigtable/install \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Google Bigtable",
    "variant": "default",
    "config": {
      "project_id": "datafly-analytics",
      "instance_id": "datafly-events",
      "table": "events",
      "column_family": "events",
      "service_account_json": "{\"type\": \"service_account\", ...}"
    },
    "delivery_mode": "server_side"
  }'
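
The `service_account_json` value is a JSON document embedded as a string, so its quotes must be escaped by hand in the curl example. As a sketch, the same request can be built in Python, where `json.dumps` handles the escaping. The helper name `build_install_request` and its parameters are illustrative. The endpoint, headers, and payload fields are copied from the curl command above.

```python
import json
import urllib.request


def build_install_request(base_url, token, key_file_text, *, project_id,
                          instance_id, table, column_family):
    """Build (but do not send) the install request shown in the curl example."""
    payload = {
        "name": "Google Bigtable",
        "variant": "default",
        "config": {
            "project_id": project_id,
            "instance_id": instance_id,
            "table": table,
            "column_family": column_family,
            # Pass the key file content as a plain string; json.dumps escapes it.
            "service_account_json": key_file_text,
        },
        "delivery_mode": "server_side",
    }
    return urllib.request.Request(
        f"{base_url}/v1/admin/integration-catalog/google_bigtable/install",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

To send it, read the key file into a string, build the request, and pass it to `urllib.request.urlopen`.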

Schema

Each event becomes one Bigtable row. The row key is <reversed-timestamp>#<event_id> (reversed timestamp prefix prevents hotspotting on monotonically increasing keys). The columns are written under the configured column family, with one column per envelope field:

| Column | Type | Notes |
|---|---|---|
| event_id | bytes | Unique per event. |
| type | bytes | Event type. |
| event | bytes | Event name. |
| anonymous_id | bytes | First-party visitor identifier. |
| user_id | bytes | Logged-in user identifier (optional). |
| timestamp | bytes | ISO-8601 client event time. |
| received_at | bytes | ISO-8601 time Signal received the event. |
| sent_at | bytes | ISO-8601 time the row was written. |
| context | bytes | JSON document. |
| properties | bytes | JSON document. |
| traits | bytes | JSON document. |
| source_id | bytes | Pipeline source identifier. |
| integration_id | bytes | Signal integration identifier. |
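
The exact encoding of the reversed timestamp in the row key is not specified here. One common convention subtracts the epoch-millisecond timestamp from a fixed ceiling and zero-pads the result, so that newer events sort first. The sketch below illustrates that convention only. The ceiling `MAX_MILLIS`, the 13-digit width, and the function name `row_key` are assumptions, so check real keys in your table before relying on it.

```python
from datetime import datetime, timezone

# Assumed ceiling: the largest 13-digit millisecond value. Not confirmed by Signal.
MAX_MILLIS = 10**13 - 1


def row_key(event_time: datetime, event_id: str) -> str:
    """Build '<reversed-timestamp>#<event_id>' under the assumed convention."""
    millis = int(event_time.timestamp() * 1000)
    reversed_ts = MAX_MILLIS - millis
    # Zero-pad so that lexicographic order matches numeric order.
    return f"{reversed_ts:013d}#{event_id}"
```

Under this convention, a key for a later event compares lexicographically smaller than a key for an earlier one, so a plain table scan returns the most recent rows first.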

Bigtable stores all values as bytes — apply type conversion in your reader.
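
As an illustration of reader-side conversion, the sketch below decodes one row into typed Python values. It assumes that cell values are UTF-8 encoded and that your Bigtable client code has already flattened the row into a mapping of column name to the latest cell value. Both are assumptions, and `decode_row` is a hypothetical helper name.

```python
import json
from datetime import datetime

# Column groups taken from the schema table above.
JSON_COLUMNS = {"context", "properties", "traits"}
TIME_COLUMNS = {"timestamp", "received_at", "sent_at"}


def decode_row(cells: dict[str, bytes]) -> dict:
    """Convert raw cell bytes into str, dict, or datetime values."""
    event = {}
    for column, raw in cells.items():
        text = raw.decode("utf-8")  # assumes UTF-8 encoded values
        if not text:
            event[column] = None
        elif column in JSON_COLUMNS:
            event[column] = json.loads(text)
        elif column in TIME_COLUMNS:
            # Older Python versions reject a trailing "Z", so normalise it first.
            event[column] = datetime.fromisoformat(text.replace("Z", "+00:00"))
        else:
            event[column] = text
    return event
```

Columns outside the two groups, such as event_id or user_id, are returned as plain strings.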

Bigtable is a first-party destination in your own GCP project, and the default blueprint forwards all events. If you need consent filtering, apply it with pipeline transforms, or filter on read at the application layer using the context column.
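
A filter-on-read step could look like the sketch below. The layout of the context document is not described here, so the consent check is supplied by the caller. The `consent.analytics` key used in `analytics_allowed` is purely hypothetical and should be replaced with whatever your context documents actually contain.

```python
import json
from typing import Callable, Iterable, Iterator


def filter_on_read(rows: Iterable[dict],
                   is_allowed: Callable[[dict], bool]) -> Iterator[dict]:
    """Yield only rows whose parsed context document passes is_allowed."""
    for cells in rows:
        raw = cells.get("context", b"")
        try:
            context = json.loads(raw.decode("utf-8")) if raw else {}
        except (UnicodeDecodeError, json.JSONDecodeError):
            continue  # skip rows whose context cannot be parsed
        if isinstance(context, dict) and is_allowed(context):
            yield cells


def analytics_allowed(context: dict) -> bool:
    # Hypothetical layout: {"consent": {"analytics": true}}. Adjust to your data.
    consent = context.get("consent")
    return isinstance(consent, dict) and consent.get("analytics") is True
```

Rows with a missing or unparseable context are dropped in this sketch, which errs on the side of excluding events when consent cannot be determined.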

Testing

  1. Enable the integration in Signal and trigger a test event on your website.
  2. Use the cbt CLI to read recent rows:

cbt -project your-project -instance datafly-events read events count=10

  3. Or use the Bigtable console’s Query tab to inspect rows.
  4. In Signal, check the Live Events view to confirm delivery status shows as successful.

Troubleshooting

| Problem | Solution |
|---|---|
| Events not appearing in the table | Verify the project ID, instance ID, table ID, and column family are correct. |
| Permission denied (403) | The service account lacks the Bigtable User role. Add it in IAM & Admin > IAM. |
| Table not found | The table does not exist in the specified instance. Verify the table ID. |
| Column family not found | The column family name does not match. List column families with cbt -project ... -instance ... ls <table>. |
| Invalid service account JSON | Ensure you pasted the complete JSON key file content, including all fields. |
| High latency | Check the number of nodes in your Bigtable cluster. Add more nodes for higher throughput. |
| Row key hotspotting | If all writes go to the same node, consider using a row key with better distribution (e.g. reversed timestamp prefix). |

See the Google Bigtable documentation for the full API reference, schema design best practices, and monitoring guides.
