2.1.0 Metrics and Tracing Changes

Author:   Dave Marion
Date:   22 Jun 2022

Metrics and Tracing changed in 2.1.0. This post explains the new implementations and provides examples on how to configure them.

Metrics

Accumulo was modified in version 1.7.0 (2015) to use the Hadoop Metrics2 framework for capturing and emitting internal Accumulo metrics. Micrometer, a newer metrics framework, supports sending metrics to many popular monitoring systems. In Accumulo 2.1.0 support for the Hadoop Metrics2 framework has been removed in favor of using Micrometer. Metrics are disabled by default.

Micrometer has the concept of a MeterRegistry, which is used to create and emit metrics to the supported monitoring systems. Additionally, Micrometer supports sending metrics to multiple monitoring systems concurrently. Configuring Micrometer in Accumulo will require you to write a small peice of code to provide the MeterRegistry configuration. Specifically, you will need to create a class that implements MeterRegistryFactory. Your implementation will need to create and configure the appropriate MeterRegistry. Additionally, you will need to add the MeterRegistry jar file and the jar file containing your MeterRegistryFactory implementation to Accumulo’s classpath. The page for each monitoring system that Micrometer supports contains instructions on how to configure the registry and which jar file is required.

Accumulo’s metrics integration test uses a TestStatsDRegistryFactory to create and configure a StatsD Meter Registry. The instructions below provide an example of how to use this class to emit Accumulo’s metrics to a Telegraf - InfluxDB - Grafana monitoring stack.

Metrics Example

This example uses a Docker container that contains Telegraf-InfluxDB-Grafana system. We will configure Accumulo to send metrics to the Telegraf component running in the Docker image. Telegraf will persist the metrics in InfluxDB and then we will visualize the metrics using Grafana. This example assumes that you have installed Docker (or equivalent engine) and have an Accumulo database already installed and initialized. We will be installing some things, modifying the Accumulo configuration, and starting Accumulo.

  1. Download the Telegraf-Influx-Grafana (TIG) Docker image
    docker pull artlov/docker-telegraf-influxdb-grafana:latest
    
  2. Create directories for the Docker container
    mkdir -p /tmp/metrics/influxdb
    chmod 777 /tmp/metrics/influxdb
    mkdir /tmp/metrics/grafana
    mkdir /tmp/metrics/grafana-dashboards
    mkdir -p /tmp/metrics/telegraf/conf
    
  3. Download Telegraf configuration and Grafana dashboard
    cd /tmp/metrics/telegraf/conf
    wget https://raw.githubusercontent.com/apache/accumulo-testing/main/contrib/terraform-testing-infrastructure/modules/config-files/templates/telegraf.conf.tftpl
    cat telegraf.conf.tftpl | sed "s/\${manager_ip}/localhost/" > telegraf.conf
    cd /tmp/metrics/grafana-dashboards
    wget https://raw.githubusercontent.com/apache/accumulo-testing/main/contrib/terraform-testing-infrastructure/modules/config-files/files/grafana_dashboards/accumulo-dashboard.json
    wget https://raw.githubusercontent.com/apache/accumulo-testing/main/contrib/terraform-testing-infrastructure/modules/config-files/files/grafana_dashboards/accumulo-dashboard.yaml
    
  4. Start the TIG Docker container
    docker run --ulimit nofile=66000:66000 -d --rm \
     --name tig-stack \
     -p 3003:3003 \
     -p 3004:8888 \
     -p 8086:8086 \
     -p 22022:22 \
     -p 8125:8125/udp \
     -v /tmp/metrics/influxdb:/var/lib/influxdb \
     -v /tmp/metrics/grafana:/var/lib/grafana \
     -v /tmp/metrics/telegraf/conf:/etc/telegraf \
     -v /tmp/metrics/grafana-dashboards:/etc/grafana/provisioning/dashboards \
     artlov/docker-telegraf-influxdb-grafana:latest
    
  5. Download Micrometer StatsD Meter Registry jar
    wget -O micrometer-registry-statsd-1.9.1.jar https://search.maven.org/remotecontent?filepath=io/micrometer/micrometer-registry-statsd/1.9.1/micrometer-registry-statsd-1.9.1.jar
    
  6. At a mininum you need to enable the metrics using the property general.micrometer.enabled and supply the name of the MeterRegistryFactory class using the property general.micrometer.factory. To enable JVM metrics, use the property general.micrometer.jvm.metrics.enabled. Modify the accumulo.properties configuration file by adding the properties below.
    # Micrometer settings
    general.micrometer.enabled=true
    general.micrometer.jvm.metrics.enabled=true
    general.micrometer.factory=org.apache.accumulo.test.metrics.TestStatsDRegistryFactory
    
  7. Copy the micrometer-registry-statsd-1.9.1.jar and accumulo-test.jar into the Accumulo lib directory

  8. The TestStatsDRegistryFactory uses system properties to determine the host and port of the StatsD server. In this example the Telegraf component started in step 4 above contains a StatsD server listening on localhost:8125. Configure the TestStatsDRegistryFactory by adding the following system properties to the JAVA_OPTS variable in accumulo-env.sh.
    "-Dtest.meter.registry.host=127.0.0.1"
    "-Dtest.meter.registry.port=8125"
    
  9. Start Accumulo. You should see the following statement in the server log files
    [metrics.MetricsUtil] INFO : initializing metrics, enabled:true, class:org.apache.accumulo.test.metrics.TestStatsDRegistryFactory
    
  10. Log into Grafana (http://localhost:3003/) using the default credentials (root/root). Click the Home icon at the top, then click the Accumulo Micrometer Test Dashboard. If everything is working correctly, then you should see something like the image below.

Grafana Screenshot

Tracing

With the retirement of HTrace, Accumulo has selected to replace it’s tracing functionality with OpenTelemetry in version 2.1.0. Hadoop appears to be on the same path which, when finished, should provide better insight into Accumulo’s use of HDFS. OpenTelemetry supports exporting Trace information to several different systems, to include Jaeger, Zipkin, and others. The HTrace trace spans in the Accumulo source code have been updated to use OpenTelemetry trace spans. If tracing is enabled, then Accumulo will use the OpenTelemetry implementation registered with the GlobalOpenTelemetry object. Tracing is disabled by default and a no-op OpenTelemetry implementation is used.

Tracing Example

This example uses the OpenTelemetry Java Agent jar file to configure and export trace information to Jaeger. The OpenTelemetry Java Agent jar file bundles together the supported Java exporters, provides a way to configure them, and registers them with the GlobalOpenTelemetry singleton that is used by Accumulo. An alternate method to supplying the OpenTelemetry dependencies, without using the Java Agent jar file, is to create a shaded jar with the OpenTelemetry autoconfigure module and it’s runtime dependencies and place the resulting shaded jar on the classpath. An example Maven pom.xml file to create the shaded jar is here. When using this alternate method you can skip step 2 and the uncommenting of the java agent in step 5 below.

  1. Download Jaeger all-in-one Docker image
      docker pull jaegertracing/all-in-one:1.35
    
  2. Download OpenTelemetry Java Agent (https://github.com/open-telemetry/opentelemetry-java/tree/main/sdk-extensions/autoconfigure)
      wget -O opentelemetry-javaagent-1.15.0.jar https://search.maven.org/remotecontent?filepath=io/opentelemetry/javaagent/opentelemetry-javaagent/1.15.0/opentelemetry-javaagent-1.15.0.jar
    
  3. To enable tracing, you need to set the general.opentelemetry.enabled property. Modify the accumulo.properties configuration file and add the following property.
    # OpenTelemetry settings
    general.opentelemetry.enabled=true
    
  4. To enable tracing in the shell, set the general.opentelemetry.enabled property in the accumulo-client.properties configuration file.
    # OpenTelemetry settings
    general.opentelemetry.enabled=true
    
  5. Configure the OpenTelemetry JavaAgent in accumulo-env.sh by uncommenting the following and updating the path to the java agent jar:
      ## Optionally setup OpenTelemetry SDK AutoConfigure
      ## See https://github.com/open-telemetry/opentelemetry-java/tree/main/sdk-extensions/autoconfigure
      #JAVA_OPTS=('-Dotel.traces.exporter=jaeger' '-Dotel.metrics.exporter=none' '-Dotel.logs.exporter=none' "${JAVA_OPTS[@]}")
      ## Optionally setup OpenTelemetry Java Agent
      ## See https://github.com/open-telemetry/opentelemetry-java-instrumentation for more options
      #JAVA_OPTS=('-javaagent:path/to/opentelemetry-javaagent.jar' "${JAVA_OPTS[@]}")
    
  6. Start Jaeger Docker container
    docker run -d --rm --name jaeger \
     -e COLLECTOR_ZIPKIN_HOST_PORT=:9411 \
     -p 5775:5775/udp \
     -p 6831:6831/udp \
     -p 6832:6832/udp \
     -p 5778:5778 \
     -p 16686:16686 \
     -p 14268:14268 \
     -p 14250:14250 \
     -p 9411:9411 jaegertracing/all-in-one:1.35
    
  7. Start Accumulo. You should see the following statement in the server log files
    [trace.TraceUtil] INFO : Trace enabled in Accumulo: yes, OpenTelemetry instance: class io.opentelemetry.javaagent.instrumentation.opentelemetryapi.v1_10.ApplicationOpenTelemetry110, Tracer instance: class io.opentelemetry.javaagent.instrumentation.opentelemetryapi.trace.ApplicationTracer
    
  8. View traces in Jaeger UI at http://localhost:16686. You can select the service name on the left panel and click Find Traces to view the trace information. If everything is working correctly, then you should see something like the image below.

Jaeger Screenshot

View all posts in the news archive