Apache Kafka

Datafly Signal publishes events to Apache Kafka topics for real-time stream processing, event sourcing, and integration with downstream consumers and microservices.

Prerequisites

Before configuring Apache Kafka in Signal, you need a Kafka cluster with a topic and optional SASL credentials for authentication.

Set Up a Kafka Cluster

You have several options:

Option A: Self-Hosted Kafka

  1. Install Apache Kafka using the official quickstart.
  2. Start ZooKeeper (or use KRaft mode for ZooKeeper-less setups).
  3. Start one or more Kafka brokers.
  4. Note the bootstrap server addresses (e.g. broker1:9092,broker2:9092).

Option B: Managed Kafka

Use a managed Kafka service such as Amazon MSK, Aiven, or Redpanda Cloud, or run Kafka on Kubernetes with the Strimzi operator. Follow the provider’s setup guide and note the bootstrap servers.

Create a Topic

Create the topic that Signal will produce to:

kafka-topics.sh --create \
  --topic datafly-events \
  --partitions 6 \
  --replication-factor 3 \
  --bootstrap-server broker1:9092

Or if using a managed service, create the topic through the provider’s console.

Choose the number of partitions based on expected throughput and consumer parallelism; 6 partitions is a reasonable starting point for moderate workloads. Increasing the partition count later changes which partition each key hashes to, so plan capacity up front.
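The reason partition growth needs planning: Kafka's default partitioner hashes the message key (murmur2 in the Java client) modulo the partition count, so adding partitions silently remaps keys. The sketch below is an illustrative Python port of that algorithm, not code you need for Signal itself:

```python
def murmur2(data: bytes) -> int:
    """Illustrative port of the murmur2 hash used by the Kafka Java client."""
    length = len(data)
    m, r = 0x5BD1E995, 24
    h = (0x9747B28C ^ length) & 0xFFFFFFFF
    # Process the input four bytes at a time (little-endian).
    for i in range(0, length - length % 4, 4):
        k = data[i] | (data[i + 1] << 8) | (data[i + 2] << 16) | (data[i + 3] << 24)
        k = (k * m) & 0xFFFFFFFF
        k ^= k >> r
        k = (k * m) & 0xFFFFFFFF
        h = (h * m) & 0xFFFFFFFF
        h ^= k
    # Handle the trailing 1-3 bytes.
    base, extra = length & ~3, length % 4
    if extra >= 3:
        h ^= data[base + 2] << 16
    if extra >= 2:
        h ^= data[base + 1] << 8
    if extra >= 1:
        h ^= data[base]
        h = (h * m) & 0xFFFFFFFF
    h ^= h >> 13
    h = (h * m) & 0xFFFFFFFF
    h ^= h >> 15
    return h

def partition_for(key: bytes, num_partitions: int) -> int:
    # Changing num_partitions changes the result for most keys,
    # which is why growing a topic later breaks per-key ordering.
    return (murmur2(key) & 0x7FFFFFFF) % num_partitions
```

Keys whose relative ordering matters should therefore keep the same partition count for the topic's lifetime.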

Configure SASL Authentication (If Required)

If your Kafka cluster requires authentication:

  1. Create SASL credentials for Signal. The exact process depends on your Kafka deployment:
    • SASL/PLAIN: Create a username and password in the JAAS configuration.
    • SASL/SCRAM: Use kafka-configs.sh to create SCRAM credentials.
    • SASL/OAUTHBEARER: Configure an OAuth provider and obtain client credentials.
  2. Note the security protocol (SASL_SSL for encrypted and authenticated connections, SASL_PLAINTEXT for authentication without encryption).
  3. Note the SASL mechanism (e.g. PLAIN, SCRAM-SHA-256, SCRAM-SHA-512).

Example SCRAM credential creation:

kafka-configs.sh --alter \
  --add-config 'SCRAM-SHA-256=[password=your_password]' \
  --entity-type users \
  --entity-name datafly-signal \
  --bootstrap-server broker1:9092

⚠️ Always use SASL_SSL in production to encrypt data in transit. SASL_PLAINTEXT transmits credentials in cleartext and should only be used in development environments.
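For reference, these SASL choices map onto standard producer options. A hypothetical standalone config using librdkafka/confluent-kafka option names (the values are placeholders, not working credentials, and Signal manages this internally):

```python
# Placeholder SASL_SSL producer configuration, using librdkafka-style
# option names for illustration only.
conf = {
    "bootstrap.servers": "broker1:9092,broker2:9092",
    "security.protocol": "SASL_SSL",       # TLS encryption + SASL authentication
    "sasl.mechanism": "SCRAM-SHA-256",     # must match the broker's credentials
    "sasl.username": "datafly-signal",
    "sasl.password": "your_password",      # placeholder
}
```

With SASL_PLAINTEXT the mechanism and credentials stay the same but traffic is unencrypted, which is why it belongs only in development.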

Verify Network Access

Ensure your Signal deployment can reach the Kafka bootstrap servers on the configured port (default 9092 for plaintext, 9093 for SSL). Check firewall rules, security groups, and DNS resolution.
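A quick reachability check from the Signal host, without any Kafka tooling, is a plain TCP connect to each bootstrap address. This is a connectivity sketch only: it proves the port is open but validates neither TLS nor SASL nor the Kafka protocol (IPv6 literal addresses are not handled):

```python
import socket

def reachable(bootstrap_servers: str, timeout: float = 5.0) -> dict:
    """TCP-connect to each host:port in a comma-separated bootstrap list."""
    results = {}
    for entry in bootstrap_servers.split(","):
        entry = entry.strip()
        host, _, port = entry.rpartition(":")
        try:
            # Succeeds only if DNS resolves and the port accepts connections.
            with socket.create_connection((host, int(port)), timeout=timeout):
                results[entry] = True
        except (OSError, ValueError):
            results[entry] = False
    return results
```

A `False` entry here usually points at DNS, a firewall rule, or a security group rather than Kafka configuration.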

Configuration

| Field | Type | Required | Description |
|-------|------|----------|-------------|
| bootstrap_servers | string | Yes | Comma-separated list of Kafka broker addresses in host:port format. |
| topic | string | Yes | The Kafka topic to produce messages to. The topic must already exist or auto-creation must be enabled. |
| security_protocol | select | Yes | The protocol for broker communication: PLAINTEXT, SSL, SASL_PLAINTEXT, or SASL_SSL. |
| sasl_mechanism | select | No | The SASL mechanism: PLAIN, SCRAM-SHA-256, or SCRAM-SHA-512. Required when using SASL protocols. |
| sasl_username | string | No | The SASL username. Required when a SASL mechanism is selected. |
| sasl_password | secret | No | The SASL password. Required when a SASL mechanism is selected. |

Signal Setup

Quick Setup

  1. Navigate to Integrations in the sidebar.
  2. Open the Integration Library tab.
  3. Find Apache Kafka, or narrow the list with the category filter.
  4. Click Install, select a variant if available, and fill in the required fields.
  5. Click Install Integration to create the integration with a ready-to-use default blueprint.

API Setup

curl -X POST http://localhost:8084/v1/admin/integration-catalog/kafka/install \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Apache Kafka",
    "variant": "default",
    "config": {
      "bootstrap_servers": "broker1:9092,broker2:9092",
      "topic": "datafly-events",
      "security_protocol": "SASL_SSL",
      "sasl_mechanism": "SCRAM-SHA-256",
      "sasl_username": "datafly-signal",
      "sasl_password": "your_password"
    },
    "delivery_mode": "server_side"
  }'
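The same install call can be issued from Python's standard library. The endpoint, token placeholder, and field values below mirror the curl example (YOUR_TOKEN and the credentials are placeholders; point the URL at your own Signal instance):

```python
import json
import urllib.request

payload = {
    "name": "Apache Kafka",
    "variant": "default",
    "config": {
        "bootstrap_servers": "broker1:9092,broker2:9092",
        "topic": "datafly-events",
        "security_protocol": "SASL_SSL",
        "sasl_mechanism": "SCRAM-SHA-256",
        "sasl_username": "datafly-signal",
        "sasl_password": "your_password",  # placeholder
    },
    "delivery_mode": "server_side",
}

req = urllib.request.Request(
    "http://localhost:8084/v1/admin/integration-catalog/kafka/install",
    data=json.dumps(payload).encode(),
    headers={
        "Authorization": "Bearer YOUR_TOKEN",  # placeholder
        "Content-Type": "application/json",
    },
    method="POST",
)
# urllib.request.urlopen(req)  # uncomment to send; requires a running Signal instance
```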

Testing

  1. Enable the integration in Signal and trigger a test event on your website.
  2. Consume messages from the topic to verify:
kafka-console-consumer.sh \
  --topic datafly-events \
  --from-beginning \
  --max-messages 10 \
  --bootstrap-server broker1:9092
  3. Inspect the message values to verify the event data.
  4. In Signal, check the Live Events view to confirm the delivery status shows as successful.
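If you inspect consumed messages programmatically rather than with the console consumer, the values should decode as JSON objects. A minimal check (the `event` field in the sample payload is hypothetical; inspect your own messages for the actual schema):

```python
import json

def parse_event(raw: bytes) -> dict:
    """Decode a consumed record value and require a JSON object."""
    event = json.loads(raw.decode("utf-8"))
    if not isinstance(event, dict):
        raise ValueError("expected a JSON object")
    return event

# Hypothetical sample value, for illustration only.
sample = b'{"event": "page_view", "url": "https://example.com"}'
```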

Troubleshooting

| Problem | Solution |
|---------|----------|
| Events not appearing in the topic | Verify the bootstrap servers and topic name are correct. |
| Connection timeout | Ensure Signal can reach the Kafka brokers. Check DNS, firewalls, and security groups. Verify the port matches the security protocol. |
| SASL authentication failed | The SASL username or password is incorrect. Verify the credentials and SASL mechanism. |
| TopicAuthorizationException | The SASL user lacks produce permission on the topic. Configure ACLs with kafka-acls.sh. |
| UnknownTopicOrPartitionException | The topic does not exist and auto-creation is disabled. Create the topic manually. |
| MessageSizeTooLarge | The event payload exceeds message.max.bytes on the broker or topic. Increase the limit or reduce the payload size. |
| SSL handshake failure | If using SSL or SASL_SSL, ensure the broker’s SSL certificate is trusted. Check the CA certificate configuration. |
| Wrong security protocol | The security protocol must match the broker’s listener configuration. PLAINTEXT on 9092 and SSL on 9093 is a common pattern, but it varies by deployment. |

See the Apache Kafka documentation for the full producer configuration reference, ACL setup, and security configuration.