
Monitoring Apache Kafka infrastructure using New Relic

You can monitor Kafka infrastructure with New Relic through several methods:

1. Kafka On-Host Integration (Recommended for most self-managed Kafka deployments):

  • How it works: This integration runs on the same host as your Kafka brokers and directly collects metrics and inventory data specific to Kafka. It leverages JMX (Java Management Extensions) to gather detailed metrics about brokers, topics, producers, and consumers. It also pulls inventory data from Zookeeper.
  • Key Features:
    • Monitors key metrics for clusters, brokers, producers, consumers, and topics.
    • Automatic discovery of brokers using Bootstrap or Zookeeper.
    • Provides pre-built dashboards for Kafka overview.
    • Allows you to create custom dashboards and set up alerts.
  • Installation: Requires installing the New Relic Infrastructure agent on your Kafka hosts first. Then, you install and configure the Kafka integration. Configuration involves specifying your cluster name, Kafka version, broker discovery strategy (Zookeeper or Bootstrap), and JMX connection details.
  • Metrics Collected (Examples):
    • broker.bytesWrittenToTopicPerSecond
    • broker.IOInPerSecond / broker.IOOutPerSecond
    • broker.messagesInPerSecond
    • replication.isrExpandsPerSecond / replication.isrShrinksPerSecond
    • replication.unreplicatedPartitions
    • consumer.bytesInPerSecond
    • consumer.maxLag
  • Documentation: https://docs.newrelic.com/install/kafka/
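
  As a rough sketch, a minimal kafka-config.yml for the on-host integration might look like the following. The cluster name, broker host, and ports are placeholders, and option names can vary between integration versions, so confirm them against the linked documentation:

  ```yaml
  integrations:
    - name: nri-kafka
      env:
        CLUSTER_NAME: my-kafka-cluster          # placeholder cluster name
        KAFKA_VERSION: "3.6.0"                  # your broker version
        AUTODISCOVER_STRATEGY: bootstrap        # or "zookeeper"
        BOOTSTRAP_BROKER_HOST: broker-1.example.com
        BOOTSTRAP_BROKER_KAFKA_PORT: 9092
        BOOTSTRAP_BROKER_JMX_PORT: 9999         # JMX must be enabled on the broker
        METRICS: "true"
        INVENTORY: "true"
      interval: 30s
  ```

  With bootstrap discovery, the integration contacts the broker you list and discovers the rest of the cluster from its metadata, which avoids hard-coding every broker.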

2. Java Agent (for Java-based Producers and Consumers):

  • How it works: If your Kafka producers and consumers are written in Java, you can use the New Relic Java agent to automatically instrument them. This provides insights into the performance of your applications interacting with Kafka.
  • Key Features:
    • Collects metrics about messaging rates, latency, and lag for Java-based clients.
    • Automatic instrumentation of Kafka’s Java client library.
    • Option to collect event data for more flexible querying with NRQL.
    • Supports distributed tracing to track messages across your application and Kafka.
  • Installation: Requires installing the New Relic Java agent in your Java application. You might need to enable specific Kafka-related configurations in the newrelic.yml file.
  • Metrics Collected (Examples):
    • MessageBroker/Kafka/Internal/consumer-metrics/records-consumed-rate
    • MessageBroker/Kafka/Internal/producer-metrics/record-send-total
    • Kafka/Streams/stream-thread-metrics/poll-latency-avg (for Kafka Streams)
  • Documentation: https://docs.newrelic.com/docs/apm/agents/java-agent/instrumentation/java-agent-instrument-kafka-message-queues/
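
  To illustrate, a newrelic.yml excerpt that enables Kafka client instrumentation and reports the metrics as queryable events might look like this. Treat it as a sketch: the exact option names can differ across agent versions, so verify them in the Java agent documentation:

  ```yaml
  # Excerpt from newrelic.yml -- option names may vary by agent version
  common: &default_settings
    license_key: '<your license key>'
    app_name: my-kafka-app              # placeholder application name

    class_transformer:
      # Kafka client metric instrumentation (enabled by default in recent agents)
      kafka-clients-metrics:
        enabled: true

    kafka:
      metrics:
        as_events:
          enabled: true                 # report Kafka metrics as events for NRQL queries
  ```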

3. OpenTelemetry (for a vendor-agnostic approach):

  • How it works: OpenTelemetry is an open-source observability framework that provides a standardized way to collect telemetry data (metrics, traces, and logs). You can use OpenTelemetry to instrument your Kafka brokers and client applications and then export the data to New Relic.
  • Key Features:
    • Vendor-agnostic instrumentation.
    • Flexibility in collecting various telemetry data.
    • Integration with the New Relic OTLP (OpenTelemetry Protocol) endpoint.
    • Allows for correlating infrastructure and application performance data.
  • Installation: Involves setting up OpenTelemetry SDKs and Collectors for your Kafka components and configuring the OTLP exporter to send data to your New Relic account.
  • Documentation: https://docs.newrelic.com/docs/opentelemetry/ and https://github.com/newrelic/newrelic-opentelemetry-examples have examples for various technologies.
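
  A minimal OpenTelemetry Collector pipeline that forwards metrics to New Relic's OTLP endpoint might look like the sketch below. The endpoint shown is New Relic's US-region OTLP endpoint and the license key is a placeholder; EU accounts use a different endpoint, so check the OpenTelemetry docs linked above:

  ```yaml
  receivers:
    otlp:
      protocols:
        grpc:
          endpoint: 0.0.0.0:4317              # receives OTLP data from instrumented Kafka clients

  exporters:
    otlp:
      endpoint: https://otlp.nr-data.net:4317 # New Relic OTLP endpoint (US region)
      headers:
        api-key: <YOUR_NEW_RELIC_LICENSE_KEY> # placeholder

  service:
    pipelines:
      metrics:
        receivers: [otlp]
        exporters: [otlp]
  ```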

4. Kafka Connect New Relic Connector (for sending data from Kafka Connect to New Relic):

  • How it works: If you are using Kafka Connect, you can use the New Relic Kafka Connect Sink Connector to stream data from Kafka topics directly into New Relic as events.
  • Key Features:
    • Directly ingest data from Kafka topics into New Relic.
    • Allows you to analyze Kafka data within the New Relic platform.
  • Installation: Requires configuring and deploying the New Relic Kafka Connect Sink Connector in your Kafka Connect environment.
  • Documentation: Search for “New Relic Kafka Connect Sink Connector” in the New Relic documentation or on sites like Confluent Hub.
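
  A sink connector configuration typically follows the shape below. The connector class and property names here are illustrative only, so confirm the exact values against the connector's own documentation before deploying:

  ```json
  {
    "name": "newrelic-events-sink",
    "config": {
      "connector.class": "com.newrelic.telemetry.events.EventsSinkConnector",
      "topics": "orders",
      "api.key": "<YOUR_NEW_RELIC_INSERT_KEY>",
      "tasks.max": "1",
      "value.converter": "org.apache.kafka.connect.json.JsonConverter",
      "value.converter.schemas.enable": "false"
    }
  }
  ```

  You would POST this JSON to the Kafka Connect REST API (or place the equivalent properties file in your Connect worker's config directory) to start the sink.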

Choosing the Right Method:

  • For comprehensive monitoring of your Kafka brokers and overall cluster health, the Kafka On-Host Integration is generally the recommended approach.
  • If you need detailed performance insights into your Java-based producers and consumers, use the New Relic Java Agent.
  • If you prefer a vendor-agnostic approach or have a complex environment with various technologies, OpenTelemetry offers flexibility.
  • If you want to directly ingest data from Kafka Connect pipelines into New Relic, use the Kafka Connect New Relic Connector.

You can even use a combination of these methods for a holistic view of your Kafka ecosystem within New Relic. Remember to consult the official New Relic documentation for the most up-to-date instructions and configuration details.
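
Whichever method you choose, the resulting data can be queried and alerted on with NRQL. For instance, assuming the on-host integration's default KafkaConsumerSample event type (attribute names may differ in your account, so adjust to match your data), a query to watch consumer lag might look like:

```sql
-- Maximum consumer lag per consumer group over the last 30 minutes
SELECT max(consumer.maxLag)
FROM KafkaConsumerSample
FACET consumerGroup
SINCE 30 minutes ago TIMESERIES
```

A query like this can back both a dashboard widget and an alert condition, so the same signal drives visualization and notification.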
