Simpler scripts and configuration coming in Accumulo 2.0.0
|Date||16 Nov 2016|
|Reviewer(s)||Josh Elser, Chistopher Tubbs|
For the upcoming 2.0.0 release, Accumulo’s scripts and configuration were refactored to make Accumulo easier to use. While Accumulo’s documentation (i.e. the user manual and INSTALL.md) were updated with any changes that were made, this blog post provides a summary of the changes.
Before 2.0.0, the
bin/ directory of Accumulo’s binary tarball contained about 20 scripts:
$ ls accumulo-1.8.0/bin/ accumulo build_native_library.sh generate_monitor_certificate.sh start-here.sh stop-server.sh accumulo_watcher.sh check-slaves LogForwarder.sh start-server.sh tdown.sh bootstrap_config.sh config-server.sh start-all.sh stop-all.sh tool.sh bootstrap_hdfs.sh config.sh start-daemon.sh stop-here.sh tup.sh
The number of scripts made it difficult to know which scripts to use. If you added the
bin/ directory to your
PATH, it could add unecessary commands to your PATH or cause commands to be overriden due generic names
(like ‘start-all.sh’). The number of scripts were reduced using the following methods:
- Scripts that are only called by other scripts were moved to a new
libexec/directory in the Accumulo binary tarball
- Scripts with similiar functionality were combined
- Extra/optional scripts were move to a new
contrib/directory in the binary tarball
Starting with 2.0.0, Accumulo will only have 3 scripts in its
$ ls accumulo-2.0.0/bin/ accumulo accumulo-cluster accumulo-service
Below are some notes on this change:
- The ‘accumulo’ script was left alone except for improved usage and the addition of ‘create-config’ and ‘build-native’ commands to replace ‘bootstrap_config.sh’ and ‘build_native_library.sh’.
- The ‘accumulo-service’ script was created to manage Accumulo processes as services
- The ‘accumulo-cluster’ command was created to manage Accumulo on cluster and replaces ‘start-all.sh’ and ‘stop-all.sh’.
All optional scripts in
bin/were moved to
$ ls accumulo-2.0.0/contrib/ bootstrap-hdfs.sh check-tservers gen-monitor-cert.sh tool.sh
Before 2.0.0, Accumulo’s
conf/ directory looked like the following (after creating initial config files
$ ls accumulo-1.8.0/conf/ accumulo-env.sh auditLog.xml generic_logger.properties masters slaves accumulo-metrics.xml client.conf generic_logger.xml monitor templates accumulo.policy.example examples hadoop-metrics2-accumulo.properties monitor_logger.properties tracers accumulo-site.xml gc log4j.properties monitor_logger.xml
While all of these files have a purpose, many are only used in rare situations. Therefore, the ‘accumulo create-config’ (which replaces ‘bootstrap_config.sh’) now only generates a minimum set of configuration files needed to run Accumulo.
$ cd accumulo-2.0.0 $ ./bin/accumulo create-config $ ls conf/ accumulo-env.sh accumulo-site.xml client.conf examples
The ‘accumulo create-config’ command does not generate host files (i.e ‘tservers’, ‘monitor’, etc) to run processes locally. These files are only required by the ‘accumulo-cluster’ command which has a command to generate them.
$ cd accumulo-2.0.0/ $ ./bin/accumulo-cluster create-config $ ls conf/ accumulo-env.sh accumulo-site.xml client.conf examples gc masters monitor tracers tservers
Any less common configuration files that were not generated above can still be found in
Before 2.0.0, the ‘accumulo’ command had a limited usage:
$ ./accumulo-1.8.0/bin/accumulo accumulo admin | check-server-config | classpath | create-token | gc | help | info | init | jar <jar> [<main class>] args | login-info | master | minicluster | monitor | proxy | rfile-info | shell | tracer | tserver | version | zookeeper | <accumulo class> args
For 2.0.0, all ‘accumulo’ commands were given a short description and organized into the groups. Below is
the full usage. It should be noted that usage is limited until the ‘accumulo-env.sh’ configuration file is
conf/ by the
accumulo create-config command.
$ ./accumulo-2.0.0/bin/accumulo help Usage: accumulo <command> (<argument> ...) Core Commands: create-config Creates Accumulo configuration build-native Builds Accumulo native libraries init Initializes Accumulo shell Runs Accumulo shell classpath Prints Accumulo classpath version Prints Accumulo version admin Executes administrative commands info Prints Accumulo cluster info help Prints usage jar <jar> [<main class>] args Runs Java <main class> in <jar> using Accumulo classpath <main class> args Runs Java <main class> located on Accumulo classpath Process Commands: gc Starts Accumulo garbage collector master Starts Accumulo master monitor Starts Accumulo monitor minicluster Starts Accumulo minicluster proxy Starts Accumulo proxy tserver Starts Accumulo tablet server tracer Starts Accumulo tracer zookeeper Starts Apache Zookeeper instance Advanced Commands: check-server-config Checks server config create-token Creates authentication token login-info Prints Accumulo login info rfile-info Prints rfile info
The new ‘accumulo-service’ and ‘accumulo-cluster’ commands also have informative usage.
$ ./accumulo-2.0.0/bin/accumulo-service Usage: accumulo-service <service> <command> Services: gc Accumulo garbage collector monitor Accumulo monitor master Accumulo master tserver Accumulo tserver tracer Accumulo tracter Commands: start Starts service stop Stops service kill Kills service $ ./accumulo-2.0.0/bin/accumulo-cluster Usage: accumulo-cluster <command> (<argument> ...) Commands: create-config Creates cluster config start Starts Accumulo cluster stop Stops Accumulo cluster
View all posts in the news archive