Simpler scripts and configuration coming in Accumulo 2.0.0

Author:   Mike Walch
Date:   16 Nov 2016

For the upcoming 2.0.0 release, Accumulo’s scripts and configuration were refactored to make Accumulo easier to use. While Accumulo’s documentation (i.e. the user manual and INSTALL.md) were updated with any changes that were made, this blog post provides a summary of the changes.

Fewer scripts

Before 2.0.0, the bin/ directory of Accumulo’s binary tarball contained about 20 scripts:

$ ls accumulo-1.8.0/bin/
accumulo             build_native_library.sh  generate_monitor_certificate.sh  start-here.sh    stop-server.sh
accumulo_watcher.sh  check-slaves             LogForwarder.sh                  start-server.sh  tdown.sh
bootstrap_config.sh  config-server.sh         start-all.sh                     stop-all.sh      tool.sh
bootstrap_hdfs.sh    config.sh                start-daemon.sh                  stop-here.sh     tup.sh

The number of scripts made it difficult to know which scripts to use. If you added the bin/ directory to your PATH, it could add unecessary commands to your PATH or cause commands to be overriden due generic names (like ‘start-all.sh’). The number of scripts were reduced by removing scripts that are no longer used and combining scripts with similiar functionality.

Starting with 2.0.0, Accumulo will only have 4 scripts in its bin/ directory:

$ ls accumulo-2.0.0/bin/
accumulo  accumulo-cluster  accumulo-service  accumulo-util

Below are some notes on this change:

  • The ‘accumulo’ script was mostly left alone except for improved usage.
  • The ‘accumulo-service’ script was created to manage Accumulo processes as services
  • The ‘accumulo-cluster’ command was created to manage Accumulo on cluster and replaces ‘start-all.sh’ and ‘stop-all.sh’.
  • The ‘accumulo-util’ command combines many utility scripts such as ‘build_native_library.sh’, ‘tool.sh’, etc into one script.

Less configuration

Before 2.0.0, Accumulo’s conf/ directory looked like the following (after creating initial config files using ‘bootstrap_config.sh’):

$ ls accumulo-1.8.0/conf/
accumulo-env.sh          auditLog.xml  generic_logger.properties            masters                    slaves
accumulo-metrics.xml     client.conf   generic_logger.xml                   monitor                    templates
accumulo.policy.example  examples      hadoop-metrics2-accumulo.properties  monitor_logger.properties  tracers
accumulo-site.xml        gc            log4j.properties                     monitor_logger.xml

While all of these files have a purpose, many are only used in rare situations. For Accumulo 2.0, the ‘conf/’ directory now only contains a minimum set of configuration files needed to run Accumulo.

$ ls accumulo-2.0.0/conf/
accumulo-env.sh  accumulo-site.xml  client.conf  log4j-monitor.properties  log4j.properties  log4j-service.properties  templates

The Accumulo tarball does contain host files (i.e ‘tservers’, ‘monitor’, etc) by default as these files are only required by the ‘accumulo-cluster’ command. However, the script has a command to generate them.

$ ./bin/accumulo-cluster create-config

Any less common configuration files can still be found in conf/templates.

Better usage

Before 2.0.0, the ‘accumulo’ command had a limited usage:

$ ./accumulo-1.8.0/bin/accumulo
accumulo admin | check-server-config | classpath | create-token | gc | help | info | init | jar <jar> [<main class>] args |
  login-info | master | minicluster | monitor | proxy | rfile-info | shell | tracer | tserver | version | zookeeper | <accumulo class> args

For 2.0.0, all ‘accumulo’ commands were given a short description and organized into the groups. Below is the full usage. It should be noted that usage is limited until the ‘accumulo-env.sh’ configuration file is created in conf/ by the accumulo create-config command.

$ ./accumulo-2.0.0/bin/accumulo help

Usage: accumulo <command> [-h] (<argument> ...)

  -h   Prints usage for specified command

Core Commands:
  init                           Initializes Accumulo
  shell                          Runs Accumulo shell
  classpath                      Prints Accumulo classpath
  version                        Prints Accumulo version
  admin                          Executes administrative commands
  info                           Prints Accumulo cluster info
  help                           Prints usage
  <main class> args              Runs Java <main class> located on Accumulo classpath

Process Commands:
  gc                             Starts Accumulo garbage collector
  master                         Starts Accumulo master
  monitor                        Starts Accumulo monitor
  minicluster                    Starts Accumulo minicluster
  proxy                          Starts Accumulo proxy
  tserver                        Starts Accumulo tablet server
  tracer                         Starts Accumulo tracer
  zookeeper                      Starts Apache Zookeeper instance

Advanced Commands:
  check-server-config            Checks server config
  create-token                   Creates authentication token
  login-info                     Prints Accumulo login info
  rfile-info                     Prints rfile info

The new ‘accumulo-service’ and ‘accumulo-cluster’ commands also have informative usage.

$ ./accumulo-2.0.0/bin/accumulo-service 

Usage: accumulo-service <service> <command>

Services:
  gc          Accumulo garbage collector
  monitor     Accumulo monitor
  master      Accumulo master
  tserver     Accumulo tserver
  tracer      Accumulo tracter

Commands:
  start       Starts service
  stop        Stops service
  kill        Kills service

$ ./accumulo-2.0.0/bin/accumulo-cluster 

Usage: accumulo-cluster <command> (<argument> ...)

Commands:
  create-config       Creates cluster config
  start               Starts Accumulo cluster
  stop                Stops Accumulo cluster
  start-non-tservers  Starts all services except tservers
  start-tservers      Starts all tservers on cluster
  stop-tservers       Stops all tservers on cluster
  start-here          Starts all services on this node
  stop-here           Stops all services on this node

This post was updated on March 24, 2017 to reflect changes to Accumulo 2.0

View all posts in the news archive