Apache Accumulo 1.9.0

23 Apr 2018

Apache Accumulo 1.9.0 is a minor release on the 1.x branch. It would have been a maintenance release on the 1.8 branch, except that it includes API additions, which required a new minor release. Users of 1.8.x versions of Accumulo should upgrade to 1.9.0. There will be no more bug fix releases on the 1.8 branch. This release contains changes for nearly a hundred issues. See GitHub and JIRA for a list of changes.

Below are resources for this release:

  • User Manual - In-depth developer and administrator documentation.
  • Javadocs - Accumulo 1.9.0 API
  • Examples - Code with corresponding readme files that give step by step instructions for running example code.

Notable Changes

Deprecated ClientConfiguration API using commons config

In ACCUMULO-4611, the public ClientConfiguration API that uses commons config types was deprecated to better support Hadoop 3 in the future. New methods were added to replace the deprecated ones, and these API additions are the reason this release is 1.9.0. These changes allow removal of commons config from Accumulo’s API in 2.0.0. If you use ClientConfiguration, switching from the existing constructors to the new static methods create(), fromFile(), or fromMap() will ensure your code works in 2.0.0.

Performance Improvements

Accumulo was profiled while running many concurrent small scans. During this exercise, the following performance bugs were found and fixed: #379, ACCUMULO-4778, ACCUMULO-4779, ACCUMULO-4781, ACCUMULO-4782, ACCUMULO-4788, ACCUMULO-4789, ACCUMULO-4790, ACCUMULO-4797, ACCUMULO-4798, ACCUMULO-4799, ACCUMULO-4800, ACCUMULO-4801, ACCUMULO-4805, and ACCUMULO-4809.

Below are other significant performance improvements in 1.9.0:

  • ACCUMULO-4636 - System iterator performance improvements
  • ACCUMULO-4657 - Avoided expensive BulkImport logging
  • ACCUMULO-4667 - Avoided unnecessary recomputation in LocalityGroupIterator
  • #410 - Fixed inefficient auths check

Fixed upgrade process to set version on all volumes

During upgrades, only one volume in a multi-volume HDFS configuration was updated with the correct version. This would cause all tablet servers to complain and ultimately fail. ACCUMULO-4686 fixes this by setting the version on all volumes.

Updated Accumulo to work with new releases of Guava

In ACCUMULO-4702, dependencies on Beta-annotated Guava classes and methods were removed. While Accumulo still includes Guava 14 in its tarball, it will work with newer versions of Guava in client code. It has been tested to work with Guava 23.

Updated RFile to prevent very large blocks

RFiles now use windowed statistics (ACCUMULO-4669) to prevent very large blocks. In 1.8.0 a bug was introduced that caused RFile data block sizes to grow very large in the case where key sizes slowly increased. This could lead to degraded query performance or out of memory exceptions on tablet servers.
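To illustrate why a windowed statistic behaves differently when key sizes creep upward, here is a minimal sketch that compares an all-time mean with a mean over only the most recent values. It is not RFile's code: the class name WindowedMean, the window size of 10, and the synthetic key lengths are all assumptions made for illustration, and how RFile uses its statistics to decide block boundaries is not modeled.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Illustrative only: a hypothetical helper, not RFile's actual statistics code.
final class WindowedMean {
  private final int windowSize;
  private final Deque<Integer> window = new ArrayDeque<>();
  private long windowSum;   // sum of the values currently in the window
  private long totalSum;    // sum of every value ever added
  private long totalCount;  // number of values ever added

  WindowedMean(int windowSize) {
    if (windowSize < 1) {
      throw new IllegalArgumentException("windowSize must be >= 1");
    }
    this.windowSize = windowSize;
  }

  void add(int value) {
    window.addLast(value);
    windowSum += value;
    // Drop the oldest value once the window is over capacity.
    if (window.size() > windowSize) {
      windowSum -= window.removeFirst();
    }
    totalSum += value;
    totalCount++;
  }

  double windowedMean() {
    return window.isEmpty() ? 0.0 : (double) windowSum / window.size();
  }

  double cumulativeMean() {
    return totalCount == 0 ? 0.0 : (double) totalSum / totalCount;
  }

  public static void main(String[] args) {
    WindowedMean stats = new WindowedMean(10); // window size is an arbitrary choice
    // Synthetic key lengths that slowly increase from 1 to 1000.
    for (int keyLen = 1; keyLen <= 1000; keyLen++) {
      stats.add(keyLen);
    }
    System.out.println("cumulative mean: " + stats.cumulativeMean()); // 500.5
    System.out.println("windowed mean:   " + stats.windowedMean());   // 995.5
  }
}
```

With steadily growing inputs, the all-time mean lags far behind the current values, while the windowed mean stays close to them.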

Allow iterators to yield

ACCUMULO-4643 added the capability for an iterator to yield control in a seek or next call before it has found a key/value. Yielding avoids starvation of other scans when iterators take a long time to return a key/value. To use this feature, implement YieldingKeyValueIterator.
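The idea of yielding can be sketched without any Accumulo classes. In the following self-contained example, a filtering scan gives up control after examining a fixed number of non-matching keys, records where it stopped, and is later resumed just past that position. The names YieldSignal, YieldingFilter, and YieldDemo, the integer keys, and the "budget" parameter are all hypothetical stand-ins; consult the Javadocs for the actual YieldingKeyValueIterator interface and its method signatures.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a yield callback; not an Accumulo class.
final class YieldSignal {
  private Integer position; // last key examined before control was given up

  void yieldAt(int key) { position = key; }

  boolean hasYielded() { return position != null; }

  int positionAndReset() {
    int p = position;
    position = null;
    return p;
  }
}

// Scans sorted keys for multiples of `divisor`, yielding after `budget` misses.
final class YieldingFilter {
  private final int[] keys;
  private final int divisor;
  private final int budget;
  private final YieldSignal signal;
  private int index;
  private Integer top;

  YieldingFilter(int[] sortedKeys, int divisor, int budget, YieldSignal signal) {
    this.keys = sortedKeys;
    this.divisor = divisor;
    this.budget = budget;
    this.signal = signal;
  }

  void seekToStart() { index = 0; findTop(); }

  // Positions the scan at the first key strictly greater than afterKey.
  void seekAfter(int afterKey) {
    index = 0;
    while (index < keys.length && keys[index] <= afterKey) { index++; }
    findTop();
  }

  boolean hasTop() { return top != null; }

  int getTop() { return top; }

  void next() { findTop(); }

  private void findTop() {
    top = null;
    int examined = 0;
    while (index < keys.length) {
      int key = keys[index++];
      if (key % divisor == 0) { top = key; return; }
      // Too much work without a result: record the position and return early.
      if (++examined >= budget) { signal.yieldAt(key); return; }
    }
  }
}

final class YieldDemo {
  static List<Integer> scan(int[] sortedKeys, int divisor, int budget) {
    YieldSignal signal = new YieldSignal();
    YieldingFilter it = new YieldingFilter(sortedKeys, divisor, budget, signal);
    List<Integer> out = new ArrayList<>();
    it.seekToStart();
    while (true) {
      if (signal.hasYielded()) {
        // A real server could schedule other scans here before resuming.
        it.seekAfter(signal.positionAndReset());
      } else if (it.hasTop()) {
        out.add(it.getTop());
        it.next();
      } else {
        break;
      }
    }
    return out;
  }

  public static void main(String[] args) {
    int[] keys = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
    System.out.println(scan(keys, 5, 2)); // prints [5, 10]
  }
}
```

The caller always checks for a yield before checking for a result, because a yielded call returns without a key/value even though the data has not been exhausted.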

Disallow dots (.) in iterator names

In ACCUMULO-3389, a check was added to prevent the API from creating iterators whose names contain the dot (.) character. In some cases, a dot in a name could be parsed incorrectly as part of an iterator option rather than as part of the name, which caused unexpected problems. Iterator names are no longer allowed to contain dots, and any user code that uses such names will fail with an IllegalArgumentException.
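Code that builds iterator names dynamically may want to validate or rewrite them before handing them to Accumulo. The following is a minimal sketch of such a pre-check; IteratorNameCheck and its methods are hypothetical helpers, not part of Accumulo's API, and replacing dots with underscores is only one possible renaming choice.

```java
import java.util.Objects;

// Hypothetical helper for user code; not part of Accumulo's API.
final class IteratorNameCheck {
  private IteratorNameCheck() {}

  // Returns the name unchanged if it has no dots, otherwise throws.
  static String requireNoDots(String name) {
    Objects.requireNonNull(name, "iterator name");
    if (name.indexOf('.') >= 0) {
      throw new IllegalArgumentException("Iterator name cannot contain a dot: " + name);
    }
    return name;
  }

  // Rewrites a legacy name such as "age.off" to "age_off".
  // The underscore replacement is an arbitrary choice for this sketch.
  static String sanitize(String name) {
    return name.replace('.', '_');
  }

  public static void main(String[] args) {
    System.out.println(requireNoDots(sanitize("age.off"))); // prints age_off
  }
}
```

Renaming an iterator that is already configured on a table is a separate step that this sketch does not cover.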

Other Notable Changes

Upgrading

View the Upgrading Accumulo documentation for guidance.

Testing

Continuous ingest, random walk, and all integration tests were run against RC1. Random walk was run for 2 days with 7 walkers. Continuous ingest was run with 9 nodes for 24 hours followed by a successful verification.

