Apache Accumulo 1.8.1

26 Feb 2017

Apache Accumulo 1.8. is a maintenance release on the 1.8 version branch. This release contains changes from more then 40 issues, comprised of bug-fixes, performance improvements, build quality improvements, and more. See JIRA for a complete list.

Below are resources for this release:

In the context of Accumulo’s Semantic Versioning guidelines, this is a “minor version”. This means that new APIs have been created, some deprecations may have been added, but no deprecated APIs have been removed. Code written against 1.7.x should work against 1.8.0 – binary compatibility has been preserved with one exception of an already-deprecated Mock Accumulo utility class. As always, the Accumulo developers take API compatibility very seriously and have invested much time to ensure that we meet the promises set forth to our users.

Major Changes

Problem with scans right after minor compaction

A bug was found when 2 or more concurrent scans run on a tablet that has just undergone minor compaction. The minor compaction thread writes the in-memory map to a local temporary rfile and tries to switch the current iterators to use it instead of the native map. The iterator code in the scan thread may also switch itself to use the local temporary rfile it if notices it before the minor compaction threads performs the switch. The bug happened shortly after the switch when one of the iterator threads will get a NegativeArraySizeException. See ACCUMULO-4483 for more info.

Tablet Server Performance Improvement

ACCUMULO-4458 mitigated some contention on the Hadoop configuration instance backing the XML configs read for SiteConfiguration.
This should improve overall Tablet Server performance.

Synchronization issue with deep copies of sources

Deep copies of iterator sources were not thread safe and threw exceptions, mostly down in the ZlibDecompressor library. The real bug was in the BoundedRangeFileInputStream. The read() method synchronizes on the underlying FSDataInputStream, however the available() method did not. See ACCUMULO-4391.

System permission bug in Thrift Proxy

The Accumulo Proxy lacked support for the following system permissions:

  • System.CREATE_NAMESPACE
  • System.DROP_NAMESPACE
  • System.ALTER_NAMESPACE
  • System.OBTAIN_DELEGATION_TOKEN

Ticket is ACCUMULO-4519.

Shell compaction file selection options can block

The block happens when the tablet lock is held. The tablet lock is meant to protect changes to the tablets internal metadata, and blocking operations should not occur while this lock is held. The compaction command has options to select files based on some criteria, some of which required blocking operations. This issue is fixed in ACCUMULO-4572.

HostRegexTableLoadBalancer used stale information

The HostRegexTableLoadBalander maintains an internal mapping of tablet server pools and tablet server status. It was updated at a configurable interval initially as an optimization. Unfortunately it had the negative side effect of providing the assignment and balance operations with stale information. This lead to a constant shuffling of tablets. The configuration property was removed so that assign/balance methods get updated information every time. See ACCUMULO-4576.

Modify TableOperations online to check for table state

The table operations online operation executes as a fate operation. If a transaction lock for the table is currently held, this operation will block even if no action is needed. ACCUMULO-4574 changes the behavior of the online operation to a NOOP if the table is already in the requested state. This returns immediately without queuing a fate operation.

Other Notable Changes

  • ACCUMULO-4488 Fix gap in user manual on Kerberos for clients
  • ACCUMULO-2724 CollectTabletStats had multiple -t parameter
  • ACCUMULO-4431 Log what random is chosen for a tserver.
  • ACCUMULO-4494 Include column family seeks in the Iterator Test Harness
  • ACCUMULO-4549 Remove duplicate init functions in TabletBalancer
  • ACCUMULO-4467 Random Walk broken because of unmet dependency on commons-math
  • ACCUMULO-4578 Cancel compaction FATE operation does not release namespace lock
  • ACCUMULO-4505 Shell still reads accumulo-site.xml when using Zookeeper CLI options
  • ACCUMULO-4535 HostRegexTableLoadBalancer fails with NullPointerException
  • ACCUMULO-4575 Concurrent table delete operations leave orphan fate transaction locks

Upgrading

Upgrades from 1.7 to 1.8 are possible with little effort as no changes were made at the data layer and RPC changes were made in a backwards-compatible way. The recommended way is to stop Accumulo 1.7, perform the Accumulo upgrade to 1.8, and then start 1.8. Like previous versions, after 1.8 is started on a 1.7 instance, a one-time upgrade will happen by the Master which will prevent a downgrade back to 1.7. Upgrades are still one way. Upgrades from versions prior to 1.7 to 1.8 should follow the below path to 1.7 and then perform the upgrade to 1.8 – direct upgrades to 1.8 for versions other than 1.7 are untested.

Existing configuration files from 1.7 should be compared against the examples provided in 1.8. The 1.7 configuration files should all function with 1.8 code, but you will likely want to include changes found in the 1.8.0 release notes and these release notes for 1.8.1.

For upgrades from prior to 1.7, follow the upgrade instructions to 1.7 first.

Testing

Each unit and functional test only runs on a single node, while the RandomWalk and Continuous Ingest tests run on any number of nodes. Agitation refers to randomly restarting Accumulo processes and Hadoop Datanode processes, and, in HDFS High-Availability instances, forcing NameNode failover.

OS/Environment Hadoop Nodes ZooKeeper HDFS HA Tests
CentOS7/openJDK1.8.0_121/EC2; 1 m3.xlarge leader, 8 d2.xlarge workers 2.7.3 9 3.4.9 No 24 HR Continuous Ingest without Agitation.
CentOS7/openJDK1.8.0_121/EC2; 1 m3.xlarge leader, 8 d2.xlarge workers 2.7.3 9 3.4.9 No 24 HR Continuous Ingest with Agitation.

View all releases in the archive