Apache Accumulo 1.6.2 Release Notes

Apache Accumulo 1.6.2 is a maintenance release on the 1.6 version branch. This release contains changes from over 150 issues, comprised of bug-fixes, performance improvements and better test cases. Apache Accumulo 1.6.2 is the first release since the community has adopted Semantic Versioning which means that all changes to the public API are guaranteed to be made without adding to or removing from the public API. This ensures that client code that runs against 1.6.1 is guaranteed to run against 1.6.2 and vice versa.

Users of 1.6.0 or 1.6.1 are strongly encouraged to update as soon as possible to benefit from the improvements with very little concern in change of underlying functionality. Users of 1.4 or 1.6 are seeking to upgrade to 1.6 should consider 1.6.2 the starting point over 1.6.0 or 1.6.1. For information about improvements since Accumulo 1.5, see the 1.6.0 and 1.6.1 release notes.

Notable Bug Fixes

Only first ZooKeeper server is used

In constructing a ZooKeeperInstance, the user provides a comma-separated list of addresses for ZooKeeper servers. 1.6.0 and 1.6.1 incorrectly truncated the provided list of ZooKeeper servers used to the first. This would cause clients to fail when the first ZooKeeper server in the list became unavailable and not properly load balance requests to all available servers in the quorum. ACCUMULO-3218 fixes the parsing of the ZooKeeper quorum list to use all servers, not just the first.

Incorrectly handled ZooKeeper exception

Use of ZooKeeper’s API requires very careful exception handling as some thrown exceptions from the ZooKeeper API are considered “normal” and must be retried by the client. In 1.6.1, Accumulo improved its handling of these “expected failures” to better insulate calls to ZooKeeper; however, the wrapper which sets data to a ZNode incorrectly handled all cases. ACCUMULO-3448 fixed the implementation of ZooUtil.putData(...) to handle the expected error conditions correctly.

scanId is not set in ActiveScan

The ActiveScan class is the returned object by InstanceOperations.listScans. This class represents a “scan” running on Accumulo servers, either from a Scanner or BatchScanner. The ActiveScan class is meant to represent all of the information that represents the scan and can be useful to administrators or DevOps-types to observe and act on scans which are running for excessive periods of time. ACCUMULO-2641 fixes ActiveScan to ensure that the internal identifier scanId is properly set.

Table state change doesn’t wait when requested

An Accumulo table has two states: ONLINE and OFFLINE. An offline table in Accumulo consumes no TabletServer resources, only HDFS resources, which makes it useful to save infrequently used data. The Accumulo methods provided to transition a state from ONLINE to OFFLINE and vice versa did not respect the wait=true parameter when set. ACCUMULO-3301 fixes the underlying implementation to ensure that when wait=true is provided, the method will not return until the table’s state transition has fully completed.

KeyValue doesn’t implement hashCode() or equals()

The KeyValue class is an implementation of Entry<Key,Value> which is returned by the classes like Scanner and BatchScanner. ACCUMULO-3217 adds these methods which ensure that the returned Entry<Key,Value> operates as expected with HashMaps and HashSets.

Potential deadlock in TabletServer

Internal to the TabletServer, there are methods to construct instances of configuration objects for tables and namespaces. The locking on these methods was not correctly implemented which created the possibility to have concurrent requests to a TabletServer to deadlock. ACCUMULO-3372 found this problem while performing bulk imports of RFiles into Accumulo. Additional synchronization was added server-side to prevent this deadlock from happening in the future.

The DateLexicoder incorrectly serialized Dates prior 1970

The DateLexicode, a part of the Lexicoders classes which implement methods to convert common type primitives into lexicographically sorting Strings/bytes, incorrectly converted Date objects for dates prior to 1970. ACCUMULO-3385 fixed the DateLexicoder to correctly (de)serialize data Date objects. For users with data stored in Accumulo using the broken implementation, the following can be performed to read the old data.

  Lexicoder lex = new ULongLexicoder();
  for (Entry<Key, Value> e : scanner) {
    Date d = new Date(lex.decode(TextUtil.getBytes(e.getKey().getRow())));
    // ...
  }

Reduce MiniAccumuloCluster failures due to random port allocations

MiniAccumuloCluster has had issues where it fails to properly start due to the way it attempts to choose a random, unbound port on the local machine to start the ZooKeeper and Accumulo processes. Improvements have been made, including retry logic, to withstand a few failed port choices. The changes made by ACCUMULO-3233 and the related issues should eliminate sporadic failures users of MiniAccumuloCluster might have observed.

Tracer doesn’t handle trace table state transition

The Tracer is an optional Accumulo server process that serializes Spans, elements of a distributed trace, to the trace table for later inspection and correlation with other Spans. By default, the Tracer writes to a “trace” table. In earlier versions of Accumulo, if this table was put offline, the Tracer would fail to write new Spans to the table when it came back online. ACCUMULO-3351 ensures that the Tracer process will resume writing Spans to the trace table when it transitions to online after being offline.

Tablet not major compacting

It was noticed that a system performing many bulk imports, there was a tablet with hundreds of files which was not major compacting nor was scheduled to be major compacted. ACCUMULO-3462 identified as fix server-side which would prevent this from happening in the future.

YARN job submission fails with Hadoop-2.6.0

Hadoop 2.6.0 introduced a new component, the TimelineServer, which is a centralized metrics service designed for other Hadoop components to leverage. MapReduce jobs submitted via accumulo and tool.sh failed to run the job because it attempted to contact the TimelineServer and Accumulo was missing a dependency on the classpath to communicate with the TimelineServer. ACCUMULO-3230 updates the classpath in the example configuration files to include the necessary dependencies for the TimelineServer to ensure that YARN job submission operates as previously.

Performance Improvements

User scans can block root and metadata table scans

The TabletServer provides a feature to limit the number of open files as a resource management configuration. To perform a scan against a normal table, the metadata and root table, when not cached, need to be consulted first. With a sufficient number of concurrent scans against normal tables, adding to the open file count, scans against the metadata and root tables could be blocked from running because no files can be opened. This prevents other system operations from happening as expected. ACCUMULO-3297 fixes the internal semaphore used to implement this resource management to ensure that root and metadata table scans can proceed.

Other improvements

Limit available ciphers for SSL/TLS

Since Apache Accumulo 1.5.2 and 1.6.1, the POODLE man-in-the-middle attack was found which exploits a client’s ability to fallback to the SSLv3.0 protocol. The main mitigation strategy was to prevent the use of old ciphers/protocols when using SSL connectors. In Accumulo, both the Apache Thrift RPC servers and Jetty server for the Accumulo monitor have the ability to enable SSL. ACCUMULO-3316 is the parent issue which provides new configuration properties in accumulo-site.xml which can limit the accepted ciphers/protocols. By default, insecure or out-dated protocols have been removed from the default set in order to protect users by default.

Documentation

Documentation was added to the Administration chapter for moving from a Non-HA Namenode setup to an HA Namenode setup. New chapters were added for the configuration of SSL and for summaries of Implementation Details (initially describing FATE operations). A section was added to the Configuration chapter for describing how to arrive at optimal settings for configuring an instance with native maps.

Testing

Each unit and functional test only runs on a single node, while the RandomWalk and Continuous Ingest tests run on any number of nodes. Agitation refers to randomly restarting Accumulo processes and Hadoop Datanode processes, and, in HDFS High-Availability instances, forcing NameNode failover.

OS Hadoop Nodes ZooKeeper HDFS HA Tests
Gentoo N/A 1 N/A No Unit and Integration Tests
Mac OSX N/A 1 N/A No Unit and Integration Tests
Fedora 21 N/A 1 N/A No Unit and Integration Tests
CentOS 6 2.6 20 3.4.5 No ContinuousIngest w/ verification w/ and w/o agitation (31B and 21B entries, respectively)