Apache Accumulo 1.9.0
18 Apr 2018
Please check our release archive for a newer version.
Apache Accumulo 1.9.0 is a minor release on the 1.x branch. It would have been a maintenance release on the 1.8 branch, but it includes API additions that require a new minor version. Users of 1.8.x versions of Accumulo should upgrade to 1.9.0. There will be no more bug fix releases on the 1.8 branch. This release contains changes for nearly a hundred issues. See GitHub and JIRA for a list of changes.
Below are resources for this release:
- User Manual - In-depth developer and administrator documentation.
- Javadocs - Accumulo 1.9.0 API
- Examples - Code with corresponding readme files that give step by step instructions for running example code.
Notable Changes
Deprecated ClientConfiguration API using commons config
In ACCUMULO-4611, the public API methods in ClientConfiguration that use commons config types were deprecated to better support Hadoop 3 in the future. New methods were added to replace them, which is why this release is 1.9.0 rather than another 1.8.x release. These changes allow removal of commons config from Accumulo’s API in 2.0.0. If you use ClientConfiguration, switching from the existing constructors to the new static methods create(), fromFile(), or fromMap() will ensure your code works in 2.0.0.
Performance Improvements
Accumulo was profiled while running many concurrent small scans. During this exercise the following performance bugs were found and fixed: #379, ACCUMULO-4778, ACCUMULO-4779, ACCUMULO-4781, ACCUMULO-4782, ACCUMULO-4788, ACCUMULO-4789, ACCUMULO-4790, ACCUMULO-4797, ACCUMULO-4798, ACCUMULO-4799, ACCUMULO-4800, ACCUMULO-4801, ACCUMULO-4805, and ACCUMULO-4809.
Below are other significant performance improvements in 1.9.0:
- ACCUMULO-4636 - System iterator performance improvements
- ACCUMULO-4657 - Avoided expensive BulkImport logging
- ACCUMULO-4667 - Avoided unnecessary recomputation in LocalityGroupIterator
- #410 - Fixed inefficient auths check
Fixed upgrade process to set version on all volumes
During upgrades, only one volume in a configuration with multiple HDFS volumes was updated with the correct version. This would cause all tablet servers to complain and ultimately fail. ACCUMULO-4686 fixes this by setting the version on all volumes.
Updated Accumulo to work with new releases of Guava
In ACCUMULO-4702, dependencies on Beta-annotated Guava classes and methods were removed. While Accumulo still includes Guava 14 in its tarball, it will work with newer versions of Guava in client code. It has been tested to work with Guava 23.
Updated RFile to prevent very large blocks
RFiles now use windowed statistics (ACCUMULO-4669) to prevent very large blocks. In 1.8.0 a bug was introduced that caused RFile data block sizes to grow very large in the case where key sizes slowly increased. This could lead to degraded query performance or out of memory exceptions on tablet servers.
Allow iterators to yield
ACCUMULO-4643 added the capability for an iterator to yield control in a seek or next call before finding a key/value. Yielding avoids starvation of other scans when iterators take a long time to return a key/value. To use this feature, implement YieldingKeyValueIterator.
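The idea of yielding can be illustrated with a small toy model. The sketch below does not use Accumulo classes: YieldingScanSketch, YieldSignal, and the per-call budget are hypothetical names invented for illustration, and the assumption that a scan resumes from the position where it yielded is part of the model, not a statement about Accumulo's internals.

```java
// Toy model of cooperative yielding. All types here are hypothetical and are
// not Accumulo classes.
class YieldingScanSketch {

  // Records the position at which a scan gave up control.
  static final class YieldSignal {
    private Integer position; // null means "has not yielded"

    void yieldAt(int pos) {
      position = pos;
    }

    boolean hasYielded() {
      return position != null;
    }

    // Only call after hasYielded() returned true.
    int positionAndReset() {
      int p = position;
      position = null;
      return p;
    }
  }

  // Looks for the first even value starting at 'start', examining at most
  // 'budget' entries per call. Returns the index of the match, or -1 if the
  // call yielded or ran out of data. On yield, 'signal' holds the resume point.
  static int seek(int[] data, int start, int budget, YieldSignal signal) {
    for (int i = start; i < data.length; i++) {
      if (i - start >= budget) {
        signal.yieldAt(i); // give up control before finding a match
        return -1;
      }
      if (data[i] % 2 == 0) {
        return i;
      }
    }
    return -1;
  }

  // Driver: re-seeks from the yielded position until the scan finishes.
  // Other scans could be scheduled between the calls.
  // Returns {index of match or -1, number of seek calls made}.
  static int[] scanWithYields(int[] data, int budget) {
    YieldSignal signal = new YieldSignal();
    int start = 0;
    int calls = 0;
    while (true) {
      calls++;
      int found = seek(data, start, budget, signal);
      if (signal.hasYielded()) {
        start = signal.positionAndReset();
        continue;
      }
      return new int[] {found, calls};
    }
  }
}
```

In this model a long search is split into several short calls, so no single call holds its thread for more than the budget allows.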
Disallow dots (.) in iterator names
In ACCUMULO-3389, a check was added to prevent our API from creating iterators whose names contain the dot (.) character. In some cases, a dot in the name could be parsed incorrectly as the start of an iterator option rather than as part of the name, which caused unexpected problems. Iterator names are no longer allowed to contain dots. Any user code that uses such names will fail with an IllegalArgumentException.
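A minimal sketch of why a dot in a name is ambiguous, and of the kind of check described above. It assumes iterator settings are stored under property keys shaped like table.iterator.<scope>.<name> with options under a .opt.<option> suffix. The class IteratorNameCheck and both of its methods are hypothetical helpers written for illustration, not Accumulo's implementation.

```java
// Hypothetical helper for illustration; not Accumulo's actual implementation.
class IteratorNameCheck {

  // Assumed property layout: table.iterator.<scope>.<name>[.opt.<option>]
  static final String PREFIX = "table.iterator.";

  // Rejects names containing a dot, mirroring the behavior described above.
  static String requireNoDots(String name) {
    if (name == null || name.isEmpty()) {
      throw new IllegalArgumentException("iterator name must not be empty");
    }
    if (name.contains(".")) {
      throw new IllegalArgumentException("iterator name contains a dot: " + name);
    }
    return name;
  }

  // Naive parser: splits the part after the prefix on dots and treats the
  // second token as the iterator name. A dotted name is silently truncated.
  static String parsedName(String propertyKey) {
    if (!propertyKey.startsWith(PREFIX)) {
      throw new IllegalArgumentException("not an iterator property: " + propertyKey);
    }
    String[] parts = propertyKey.substring(PREFIX.length()).split("\\.");
    if (parts.length < 2) {
      throw new IllegalArgumentException("missing scope or name: " + propertyKey);
    }
    return parts[1];
  }
}
```

With this parser, an iterator registered as my.iter under the scan scope would be read back as my, with iter treated as trailing data. Rejecting the name up front with requireNoDots avoids that ambiguity.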
Various security-related improvements
- #417 - Make TLSv1.2 the default for TLS RPC connections
- ACCUMULO-2806 - accumulo init sets the correct permissions on /accumulo to 700
- ACCUMULO-4587 - use a newer version of JQuery (3.2.1)
- ACCUMULO-4660 - sanitized incoming values from HTTP parameters
- ACCUMULO-4665 and ACCUMULO-4666 - Kerberos improvements
- ACCUMULO-4676 - set the HttpOnly flag for JSESSIONID in monitor
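For client code that opens its own TLS connections and wants to match the TLSv1.2 default from #417, the standard Java JSSE API can restrict an engine to that protocol. This is a generic sketch and not Accumulo's RPC code. The class name Tls12Sketch and its method are hypothetical.

```java
import java.security.GeneralSecurityException;
import javax.net.ssl.SSLContext;
import javax.net.ssl.SSLEngine;

// Generic JSSE sketch; the class and method names are hypothetical and this
// is not Accumulo's RPC code.
class Tls12Sketch {

  // Builds an SSLEngine that will only negotiate TLSv1.2.
  static SSLEngine newTls12Engine() {
    try {
      SSLContext context = SSLContext.getInstance("TLSv1.2");
      // null arguments select the JVM's default key managers, trust
      // managers, and secure random source.
      context.init(null, null, null);
      SSLEngine engine = context.createSSLEngine();
      engine.setEnabledProtocols(new String[] {"TLSv1.2"});
      return engine;
    } catch (GeneralSecurityException e) {
      throw new IllegalStateException("TLSv1.2 is not available in this JVM", e);
    }
  }
}
```

Setting the enabled protocols explicitly keeps the engine from falling back to an older protocol version, even if the JVM would otherwise allow one.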
Other Notable Changes
- #403 - Enabled more metrics reporting
- ACCUMULO-4528 - Add import/export table info to docs
- ACCUMULO-4655 - Added a Response Time column to the monitor
- ACCUMULO-4693 - Add process name to metrics
- ACCUMULO-4721 - Document rfile-info in the user manual
Upgrading
View the Upgrading Accumulo documentation for guidance.
Testing
Continuous ingest, random walk, and all integration tests were run against RC1. Random walk was run for 2 days with 7 walkers. Continuous ingest was run with 9 nodes for 24 hours followed by a successful verification.
Useful Links
View all releases in the archive