Apache Accumulo 1.6.5 Release Notes

Apache Accumulo 1.6.5 is a maintenance release on the 1.6 version branch. This release contains changes from 55 issues, comprising bug fixes, performance improvements, build quality improvements, and more. See JIRA for a complete list.

Users of any previous 1.6.x release are strongly encouraged to update as soon as possible to benefit from the improvements, with very little risk of changes in underlying functionality. Users of 1.4 or 1.5 who are seeking to upgrade to 1.6 should consider 1.6.5 as a starting point.

Outstanding Known Issues

Be aware that a small documentation bug exists with the compact command in the shell (ACCUMULO-4138). The documentation should describe the begin row as exclusive and the end row as inclusive; it currently describes both as inclusive, which is incorrect.
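The corrected range semantics can be illustrated with a small Java sketch. The class and method names below are hypothetical and are not part of the Accumulo API. The sketch also compares rows as Java strings for brevity, and treats a null bound as unbounded, both of which are assumptions of this example.

```java
// Illustrative only: models the (begin, end] row range described above.
// CompactRangeSketch and inRange are hypothetical names, not Accumulo API.
class CompactRangeSketch {
    // A null bound is treated as "unbounded" on that side (an assumption of this sketch).
    static boolean inRange(String row, String beginRow, String endRow) {
        boolean afterBegin = beginRow == null || row.compareTo(beginRow) > 0;   // begin row is exclusive
        boolean atOrBeforeEnd = endRow == null || row.compareTo(endRow) <= 0;   // end row is inclusive
        return afterBegin && atOrBeforeEnd;
    }

    public static void main(String[] args) {
        System.out.println(inRange("row_a", "row_a", "row_c")); // the begin row itself is excluded
        System.out.println(inRange("row_b", "row_a", "row_c")); // a row strictly between the bounds
        System.out.println(inRange("row_c", "row_a", "row_c")); // the end row itself is included
    }
}
```

In other words, a row equal to the begin row falls outside the compacted range, while a row equal to the end row falls inside it.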

Highlights

Queued Compactions Not Running

Found and fixed a bug (ACCUMULO-4016) in which some queued compactions would never run if the number of files changed while the tablet was queued.

Faster Processing of Conditional Mutations

Improved ConditionalMutation processing time by a factor of 3. (ACCUMULO-4066)

Slow GC While Bulk Importing

Found and worked around an issue where lots of bulk imports creating many new files would significantly impair the Accumulo GC service, and possibly prevent it from running to completion entirely. (ACCUMULO-4021)

Improvements in Locating Client Configuration File

Fixed some unexpected error messages related to setting ACCUMULO_CLIENT_CONF_PATH, and improved the detection of the client.conf file if ACCUMULO_CLIENT_CONF_PATH was set to a directory containing client.conf. (ACCUMULO-4026, ACCUMULO-4027)
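The directory handling described above can be sketched roughly as follows. This is a minimal illustration and not Accumulo's actual lookup code: ClientConfLocator and resolve are hypothetical names, and the rule of returning null when nothing usable is found is an assumption of the sketch.

```java
import java.io.File;

// Hypothetical sketch; ClientConfLocator and resolve are illustrative names, not Accumulo code.
class ClientConfLocator {
    // Returns the file to load for a configured path value, or null if nothing usable is found.
    static File resolve(String configuredPath) {
        if (configuredPath == null || configuredPath.isEmpty()) {
            return null;
        }
        File candidate = new File(configuredPath);
        if (candidate.isDirectory()) {
            // The value names a directory: look for client.conf inside it.
            candidate = new File(candidate, "client.conf");
        }
        return candidate.isFile() ? candidate : null;
    }

    public static void main(String[] args) {
        // Reads the environment variable named in the text above.
        File conf = resolve(System.getenv("ACCUMULO_CLIENT_CONF_PATH"));
        System.out.println(conf == null ? "no client.conf found" : "using " + conf);
    }
}
```

With this approach, the variable can point either at the file itself or at the directory that holds it.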

Transient ZooKeeper disconnect causes FATE threads to exit

ZooKeeper clients are expected to handle the situation where they become disconnected from the ZooKeeper server and must wait to be reconnected before continuing ZooKeeper operations.

The dedicated threads running inside the Accumulo Master process for FATE actions had the potential to exit unexpectedly in this disconnected state. This caused a scenario where all future FATE-based operations would be blocked until the Accumulo Master process was restarted. (ACCUMULO-4060)
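The general pattern of waiting out a transient disconnect, rather than letting it end a worker thread, can be sketched as below. This is not the actual fix applied in ACCUMULO-4060. TransientDisconnectException is a hypothetical stand-in for a client library's connection-loss error, and the fixed sleep between attempts is an assumption of this example.

```java
import java.util.concurrent.Callable;

// Illustrative only; none of these names come from Accumulo or ZooKeeper.
class ReconnectRetrySketch {
    // Hypothetical stand-in for a "connection lost" error raised by a ZooKeeper client.
    static class TransientDisconnectException extends Exception {}

    // Retries op until it succeeds, sleeping between attempts, so that a
    // transient disconnect does not propagate and end the calling thread.
    static <T> T retryWhileDisconnected(Callable<T> op, long waitMillis) throws Exception {
        while (true) {
            try {
                return op.call();
            } catch (TransientDisconnectException e) {
                // Disconnected: wait for the client to reconnect, then try again.
                Thread.sleep(waitMillis);
            }
        }
    }
}
```

Only the disconnect case is retried here; any other exception still propagates to the caller, so genuine failures are not hidden.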

Incorrect management of certain Apache Thrift RPCs

Accumulo relies on Apache Thrift to implement remote procedure calls between Accumulo services. Accumulo’s use of Thrift uncovered an unfortunate situation where a special RPC (a “oneway” call) would leave unwanted data on the underlying Thrift connection. After this extra data was left on the connection, all subsequent RPCs re-using that connection would fail with “out of sequence response” error messages. Accumulo would be left in a bad state until the mishandled connections were released or Accumulo services were restarted. (ACCUMULO-4065)
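A toy model may help show why one stray reply breaks every later call on the same connection. The sketch below does not use Thrift or Accumulo code, and all names in it are hypothetical. It assumes only that each request carries an increasing sequence id and that the client expects the next reply it reads to carry the id of the request it just sent.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Toy model only; none of these names are Thrift or Accumulo API.
class SeqIdConnectionSketch {
    // Sequence ids of replies that have been written but not yet read by the client.
    private final Deque<Integer> unreadReplies = new ArrayDeque<>();
    private int nextSeqId = 0;

    // Normal call: the server replies with the request's id and the client
    // reads whatever reply is first on the connection.
    int call() {
        int seqId = ++nextSeqId;
        unreadReplies.addLast(seqId);
        int received = unreadReplies.removeFirst();
        if (received != seqId) {
            throw new IllegalStateException(
                "out of sequence response: expected " + seqId + ", got " + received);
        }
        return received;
    }

    // Oneway call: the client never reads a reply. If the server writes one
    // anyway, it stays on the connection for the next caller to trip over.
    void oneway(boolean serverWritesReply) {
        int seqId = ++nextSeqId;
        if (serverWritesReply) {
            unreadReplies.addLast(seqId);
        }
    }
}
```

In this model, once `oneway(true)` leaves a reply behind, every later `call()` reads the previous request's reply and fails, which mirrors why the connection stays unusable until it is discarded.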

Other Notable Changes

Testing

Each unit and functional test runs on a single node only, while the RandomWalk and Continuous Ingest tests run on any number of nodes. Agitation refers to randomly restarting Accumulo processes and Hadoop DataNode processes, and, in HDFS High-Availability instances, forcing NameNode failover.

| OS | Hadoop | Nodes | ZooKeeper | HDFS HA | Tests |
|----|--------|-------|-----------|---------|-------|
| CentOS 7.1 | 2.6.3 | 9 | 3.4.6 | No | Random walk (All.xml) 18-hour run (2 failures, both conflicting operations on same table in Concurrent test) |
| CentOS 7.1 | 2.6.3 | 6 | 3.4.6 | No | Continuous ingest with agitation (2B entries) |
| CentOS 6.7 | 2.2.0 and 1.2.1 | 1 | 3.3.6 | No | All unit and integration tests |
| CentOS 7.1 (Oracle JDK8) | 2.6.3 | 9 | 3.4.6 | No | Continuous ingest with agitation (24hrs, 32B entries verified) on EC2 (1 m3.xlarge leader; 8 d2.xlarge workers) |