Apache Accumulo 1.5.3

25 Jun 2015

Apache Accumulo 1.5.3 is a bug-fix release for the 1.5 series. It is likely to be the last 1.5 release, with development shifting towards newer release lines. We recommend upgrading to a newer version to continue to get bug fixes and new features.


In the context of Accumulo’s Semantic Versioning guidelines, this is a “patch version”. This means that there should be no public API changes. Any changes which were made were done in a backwards-compatible manner. Code that runs against 1.5.2 should run against 1.5.3.

We’d like to thank all of the committers and contributors who had a part in making this release, from code contributions to testing. Everyone’s efforts are greatly appreciated.

Security Changes

SSLv3 disabled (POODLE)

Many Accumulo services can enable wire encryption using SSL connectors. Because SSLv3 is potentially susceptible to the POODLE man-in-the-middle attack, ACCUMULO-3316 disables it by default. ACCUMULO-3317 also disables SSLv3 in the monitor, so the monitor will not accept SSLv3 client connections when running with https.
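As an illustration of what disabling SSLv3 looks like at the JSSE level, the following minimal sketch removes SSLv3 from the enabled protocols of an SSLEngine. It is not the code from ACCUMULO-3316 or ACCUMULO-3317; the class name SslProtocolFilter and the helper withoutSslv3 are hypothetical names chosen for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import javax.net.ssl.SSLContext;
import javax.net.ssl.SSLEngine;

// Hypothetical sketch; not taken from the Accumulo source.
class SslProtocolFilter {

    /** Returns the given protocol names with every "SSLv3" entry removed. */
    static String[] withoutSslv3(String[] protocols) {
        List<String> kept = new ArrayList<>();
        for (String protocol : protocols) {
            if (!"SSLv3".equalsIgnoreCase(protocol)) {
                kept.add(protocol);
            }
        }
        return kept.toArray(new String[0]);
    }

    public static void main(String[] args) throws Exception {
        // Use the JVM's default SSL context and create an engine from it.
        SSLEngine engine = SSLContext.getDefault().createSSLEngine();

        // Keep whatever the JVM enables by default, minus SSLv3.
        engine.setEnabledProtocols(withoutSslv3(engine.getEnabledProtocols()));

        System.out.println(Arrays.toString(engine.getEnabledProtocols()));
    }
}
```

The same filtering applies to SSLSocket and SSLServerSocket, which expose the same getEnabledProtocols and setEnabledProtocols methods. The protocols printed depend on the JVM in use.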

Notable Bug Fixes

SourceSwitchingIterator Deadlock

An instance of SourceSwitchingIterator, the Accumulo iterator which transparently manages whether data for a tablet is read from memory (the in-memory map) or disk (HDFS after a minor compaction), was found deadlocked in a production system.

This deadlock prevented the scan and the minor compaction from ever completing until the tablet server was restarted. ACCUMULO-3745 fixes the inconsistent synchronization inside of the SourceSwitchingIterator to prevent this deadlock from happening in the future.

The only mitigation for this bug was to restart the deadlocked tablet server.

Table flush blocked indefinitely

While running the Accumulo RandomWalk distributed test, it was observed that all activity in Accumulo had stopped and there was an offline Accumulo metadata table tablet. The system first tried to flush a user tablet, but the metadata table was not online (likely due to the agitation process which stops and starts Accumulo processes during the test). After this call, a call to load the metadata tablet was queued but could not complete until the previous flush call completed. Thus, a deadlock occurred.

This deadlock happened because the synchronous flush call could not complete before the load tablet call completed, but the load tablet call couldn’t run because of the connection caching that Accumulo’s RPC layer performs to reduce the number of sockets it needs to create to send data. ACCUMULO-3597 prevents this deadlock by forcing the use of a non-cached connection for the RPC message requesting a metadata tablet to be loaded.

While this fix does use additional network resources, the concern is minimal because the number of metadata tablets is typically very small with respect to the total number of tablets in the system.

The only mitigation for this bug was to restart the hung tablet server.
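The idea behind the ACCUMULO-3597 fix, requesting a dedicated connection for one kind of RPC while every other request shares a cached one, can be pictured with the minimal sketch below. It is not Accumulo's RPC code; ConnectionPoolSketch, Connection, and the server address are hypothetical names used only for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; these names are illustrative and are not Accumulo classes.
class ConnectionPoolSketch {

    /** Stand-in for an open RPC transport to one server. */
    static final class Connection {
        final String server;
        final boolean fromCache;

        Connection(String server, boolean fromCache) {
            this.server = server;
            this.fromCache = fromCache;
        }
    }

    private final Map<String, Connection> cache = new HashMap<>();
    private int opened = 0;

    /**
     * Returns a connection to the given server. With useCache=true every
     * caller shares one connection per server. With useCache=false a new,
     * dedicated connection is opened, so the request does not depend on a
     * connection that another call may be holding.
     */
    synchronized Connection get(String server, boolean useCache) {
        if (!useCache) {
            opened++;
            return new Connection(server, false);
        }
        Connection connection = cache.get(server);
        if (connection == null) {
            opened++;
            connection = new Connection(server, true);
            cache.put(server, connection);
        }
        return connection;
    }

    /** Number of connections opened so far, cached or not. */
    synchronized int openedCount() {
        return opened;
    }

    public static void main(String[] args) {
        ConnectionPoolSketch pool = new ConnectionPoolSketch();
        String tserver = "tserver1:9997"; // hypothetical address

        Connection flush = pool.get(tserver, true);          // ordinary request, cached
        Connection loadMetadata = pool.get(tserver, false);  // metadata load, dedicated

        System.out.println("separate connections: " + (flush != loadMetadata));
        System.out.println("connections opened: " + pool.openedCount());
    }
}
```

The cost of the dedicated path is one extra connection per such request, which matches the trade-off described above: acceptable when the requests that bypass the cache are rare.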

RPC Connections not cached

It was observed that the underlying connections for invoking RPCs were not actually being cached, despite caching being requested. While this did not result in a noticeable performance impact, it was a deficiency. ACCUMULO-3574 ensures that connections are cached when caching is requested.

Deletes on Apache Thrift Proxy API ignored

A user noted that when trying to specify a delete using the Accumulo Thrift Proxy, the delete was treated as an update. ACCUMULO-3474 fixes the Proxy server such that deletes are properly respected as specified by the client.

Other Changes

Other changes for this version can be found in JIRA.

Testing

Each unit and functional test only runs on a single node, while the RandomWalk and Continuous Ingest tests run on any number of nodes. Agitation refers to randomly restarting Accumulo processes and Hadoop DataNode processes, and, in HDFS High-Availability instances, forcing NameNode fail-over.

During testing, multiple Accumulo developers noticed some stability issues with HDFS using Apache Hadoop 2.6.0 when restarting Accumulo processes and HDFS datanodes. The developers investigated these issues as a part of the normal release testing procedures, but were unable to find a definitive cause of these failures. Users who wish to track any future developments are encouraged to follow ACCUMULO-2388. One possible workaround is to increase general.rpc.timeout in the Accumulo configuration from 120s to 240s.
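A minimal sketch of that workaround, assuming the property is set in the site configuration file (accumulo-site.xml) and that the servers are restarted afterwards to pick it up:

```xml
<!-- Raise the RPC timeout from 120s to 240s -->
<property>
  <name>general.rpc.timeout</name>
  <value>240s</value>
</property>
```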

OS         | Hadoop | Nodes | ZooKeeper | HDFS High-Availability | Tests
Gentoo     | 2.6.0  | 1     | 3.4.5     | No                     | Unit and Integration Tests
CentOS 6.5 | 2.7.1  | 6     | 3.4.5     | No                     | Continuous Ingest and Verify
