Estimated reading time: 5 minutes

A sample project demonstrating how to move data from Kafka into Tableau

Here we demonstrate the most practical approach for connecting Tableau to Kafka data: using a relational database as a sink via Kafka Connect, and then connecting Tableau to that database.

Here’s a breakdown with conceptual configuration and code snippets:

Scenario: We’ll stream data from a Kafka topic (user_activity) into a PostgreSQL database table (user_activity_table) using Kafka Connect. Then, we’ll connect Tableau to this PostgreSQL database.

Part 1: Kafka Data (Conceptual)

Assume your Kafka topic user_activity contains JSON messages like this:

JSON

{
  "user_id": "user123",
  "event_type": "page_view",
  "page_url": "/products",
  "timestamp": "2025-04-23T14:30:00Z"
}

Part 2: PostgreSQL Database Setup

  1. Install PostgreSQL: If you don’t have it already, install PostgreSQL.
  2. Create a Database and Table: Create a database (e.g., kafka_data) and a table (user_activity_table) to store the Kafka data (a scripted alternative in Python is sketched just after this list):
      • CREATE DATABASE kafka_data;
      • CREATE TABLE user_activity_table ( user_id VARCHAR(255), event_type VARCHAR(255), page_url TEXT, timestamp TIMESTAMP WITH TIME ZONE );
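
If you prefer to script this step, here is a minimal sketch using the psycopg2 library (an assumption; any PostgreSQL client works). The connection details are the same placeholders used throughout this post, and the kafka_data database is assumed to exist already:

Python

import psycopg2

# Placeholder connection details -- replace with your own.
conn = psycopg2.connect(
    host="your_postgres_host",
    port=5432,
    dbname="kafka_data",
    user="your_postgres_user",
    password="your_postgres_password",
)

with conn, conn.cursor() as cur:
    # Same DDL as above; IF NOT EXISTS makes the script safe to re-run.
    cur.execute("""
        CREATE TABLE IF NOT EXISTS user_activity_table (
            user_id    VARCHAR(255),
            event_type VARCHAR(255),
            page_url   TEXT,
            timestamp  TIMESTAMP WITH TIME ZONE
        );
    """)

conn.close()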

Part 3: Kafka Connect Setup and Configuration

  1. Install Kafka Connect: Kafka Connect is usually included with your Kafka distribution.
  2. Download PostgreSQL JDBC Driver: Download the PostgreSQL JDBC driver (postgresql-*.jar) and place it in the Kafka Connect plugin path.
  3. Configure a JDBC Sink Connector: Create a configuration file (e.g., postgres_sink.properties) for the JDBC Sink Connector:
    • Properties
      • name=postgres-sink-connector
        connector.class=io.confluent.connect.jdbc.JdbcSinkConnector
        tasks.max=1
        topics=user_activity
        connection.url=jdbc:postgresql://your_postgres_host:5432/kafka_data
        connection.user=your_postgres_user
        connection.password=your_postgres_password
        table.name.format=user_activity_table
        insert.mode=insert
        pk.mode=none
        value.converter=org.apache.kafka.connect.json.JsonConverter
        value.converter.schemas.enable=false
        • Replace your_postgres_host, your_postgres_user, and your_postgres_password with your PostgreSQL connection details.
        • topics: Specifies the Kafka topic to consume from.
        • connection.url: JDBC connection string for PostgreSQL.
        • table.name.format: The name of the table to write to.
        • value.converter: Specifies how to convert the Kafka message value (we assume JSON).
  4. Start Kafka Connect: Run the Kafka Connect worker, pointing it to your connector configuration:
  • Bash
    • ./bin/connect-standalone.sh config/connect-standalone.properties config/postgres_sink.properties
    • config/connect-standalone.properties would contain the basic Kafka Connect worker configuration (broker list, plugin paths, etc.). Once the worker is up, the status-check sketch below confirms that the connector is running.
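
You can confirm that the connector and its task reached the RUNNING state via Kafka Connect's REST API, which listens on port 8083 by default. A minimal sketch using the requests library (the connector name must match the name= value in postgres_sink.properties):

Python

import requests

CONNECT_URL = "http://localhost:8083"       # Kafka Connect REST endpoint (default port)
CONNECTOR_NAME = "postgres-sink-connector"  # Must match name= in postgres_sink.properties

# List every connector known to this worker.
print(requests.get(f"{CONNECT_URL}/connectors").json())

# Show the state of the sink connector and its tasks.
status = requests.get(f"{CONNECT_URL}/connectors/{CONNECTOR_NAME}/status").json()
print(status["connector"]["state"])         # e.g. "RUNNING"
for task in status["tasks"]:
    print(task["id"], task["state"])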

Part 4: Producing Sample Data to Kafka (Python)

Here’s a simple Python script using the kafka-python library to produce sample JSON data to the user_activity topic:

Python

from kafka import KafkaProducer
import json
import datetime
import time

KAFKA_BROKER = 'your_kafka_broker:9092'  
# Replace with your Kafka broker address
KAFKA_TOPIC = 'user_activity'

producer = KafkaProducer(
    bootstrap_servers=[KAFKA_BROKER],
    value_serializer=lambda x: json.dumps(x).encode('utf-8')
)

try:
    for i in range(5):
        timestamp = datetime.datetime.utcnow().isoformat() + 'Z'
        user_activity_data = {
            "user_id": f"user{100 + i}",
            "event_type": "click",
            "page_url": f"/item/{i}",
            "timestamp": timestamp
        }
        producer.send(KAFKA_TOPIC, value=user_activity_data)
        print(f"Sent: {user_activity_data}")
        time.sleep(1)

except Exception as e:
    print(f"Error sending data: {e}")
finally:
    producer.close()
  • Replace your_kafka_broker:9092 with the actual address of your Kafka broker.
  • This script sends a few sample JSON messages to the user_activity topic; the consumer sketch below can be used to verify they arrived.
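
Before relying on the sink, it can help to confirm the messages actually landed on the topic. Here is a minimal consumer sketch with the same kafka-python library (the broker address is again a placeholder):

Python

from kafka import KafkaConsumer
import json

consumer = KafkaConsumer(
    'user_activity',
    bootstrap_servers=['your_kafka_broker:9092'],  # Replace with your Kafka broker address
    auto_offset_reset='earliest',                  # Read the topic from the beginning
    consumer_timeout_ms=5000,                      # Stop iterating after 5s without new messages
    value_deserializer=lambda m: json.loads(m.decode('utf-8'))
)

for message in consumer:
    print(f"offset {message.offset}: {message.value}")

consumer.close()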

Part 5: Connecting Tableau to PostgreSQL

  1. Open Tableau Desktop.
  2. Under “Connect,” select “PostgreSQL.”
  3. Enter the connection details:
    • Server: your_postgres_host
    • Database: kafka_data
    • User: your_postgres_user
    • Password: your_postgres_password
    • Port: 5432 (default)
  4. Click “Connect.”
  5. Select the public schema (or the schema where user_activity_table resides).
  6. Drag the user_activity_table to the canvas.
  7. You can now start building visualizations in Tableau using the data from the user_activity_table, which is being populated in near real-time by Kafka Connect. The query sketch after this list shows how to spot-check the same data from Python.
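
To double-check what Tableau will see, you can query the table directly. A minimal sketch, again assuming psycopg2 and the placeholder connection details from Part 2:

Python

import psycopg2

conn = psycopg2.connect(
    host="your_postgres_host",
    dbname="kafka_data",
    user="your_postgres_user",
    password="your_postgres_password",
)

with conn, conn.cursor() as cur:
    # Row count plus the five most recent events written by the sink connector.
    cur.execute("SELECT COUNT(*) FROM user_activity_table;")
    print("rows:", cur.fetchone()[0])

    cur.execute("""
        SELECT user_id, event_type, page_url, timestamp
        FROM user_activity_table
        ORDER BY timestamp DESC
        LIMIT 5;
    """)
    for row in cur.fetchall():
        print(row)

conn.close()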

Limitations and Considerations:

  • Not True Real-time in Tableau: Tableau will query the PostgreSQL database based on its refresh settings (live connection or scheduled extract). It won’t have a direct, push-based real-time stream from Kafka.
  • Complexity: Setting up Kafka Connect and a database adds complexity compared to a direct connector.
  • Data Transformation: You might need to perform more complex transformations within PostgreSQL or Tableau.
  • Error Handling: Robust error handling is crucial in a production Kafka Connect setup; one common pattern, routing bad records to a dead-letter topic, is sketched below.
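
In standalone mode, the usual approach is to append Kafka Connect's standard error-handling properties (errors.tolerance, errors.deadletterqueue.*) to postgres_sink.properties. If you run Connect in distributed mode instead, the same settings can be pushed over the REST API; the sketch below assumes the worker URL and connector name used earlier:

Python

import requests

CONNECT_URL = "http://localhost:8083"       # Connect REST endpoint (assumption: default port)
CONNECTOR_NAME = "postgres-sink-connector"

config = {
    "connector.class": "io.confluent.connect.jdbc.JdbcSinkConnector",
    "tasks.max": "1",
    "topics": "user_activity",
    "connection.url": "jdbc:postgresql://your_postgres_host:5432/kafka_data",
    "connection.user": "your_postgres_user",
    "connection.password": "your_postgres_password",
    "table.name.format": "user_activity_table",
    "insert.mode": "insert",
    "pk.mode": "none",
    "value.converter": "org.apache.kafka.connect.json.JsonConverter",
    "value.converter.schemas.enable": "false",
    # Standard Kafka Connect error-handling options:
    "errors.tolerance": "all",                                  # Skip bad records instead of failing the task
    "errors.log.enable": "true",                                # Log each failure
    "errors.deadletterqueue.topic.name": "user_activity_dlq",   # Route bad records to this topic
    "errors.deadletterqueue.topic.replication.factor": "1",
}

# PUT creates the connector if it does not exist, or updates its configuration if it does.
resp = requests.put(f"{CONNECT_URL}/connectors/{CONNECTOR_NAME}/config", json=config)
print(resp.status_code, resp.json())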

Alternative (Conceptual, No Simple Code): Using a Real-time Data Platform (e.g., Rockset)

While providing a full, runnable code example for a platform like Rockset is beyond a simple snippet, the concept involves:

  1. Rockset Kafka Integration: Configuring Rockset to connect to your Kafka cluster and continuously ingest data from the user_activity topic. Rockset handles schema discovery and indexing automatically.
  2. Tableau Rockset Connector: Using Tableau’s native Rockset connector (you’d need a Rockset account and API key) to directly query the real-time data in Rockset.

This approach offers lower latency for real-time analytics in Tableau compared to the database sink method but involves using a third-party service.

In conclusion, while direct Kafka connectivity in Tableau is limited, using Kafka Connect to pipe data into a Tableau-supported database (like PostgreSQL) provides a practical way to visualize near real-time data using nothing more than connector configuration and standard database connections. For true low-latency real-time visualization, exploring dedicated real-time data platforms with Tableau connectors is the more suitable direction.
