Apache Accumulo 2.1.0
01 Nov 2022
Please check our release archive for a newer version.
About
Apache Accumulo 2.1.0 brings many new features and updates since 1.10 and 2.0. The 2.1 release series is an LTM series, and as such, is expected to receive stability-improving bugfixes, as needed. This makes this series suitable for production environments where stability is preferable over new features that might appear in subsequent non-LTM releases.
This release has received more than 1200 commits from over 50 contributors, including numerous bugfixes, updates, and features.
Minimum Requirements
This version of Accumulo requires at least Java 11 to run. Various Java 11 versions from different distributors were used throughout its testing and development, so we expect it to work with any standard OpenJDK-based Java distribution.
At least Hadoop 3 is required, though it is recommended to use a more recent version. Version 3.3 was used extensively during testing, but we have no specific knowledge that an earlier version of Hadoop 3 will not work. Whichever major/minor version you use, it is recommended to use the latest bugfix/patch version available. By default, our POM depends on 3.3.4.
During much of this release’s development, ZooKeeper 3.5 was used as a minimum. However, that version reached its end-of-life during development, and we do not recommend using end-of-life versions of ZooKeeper. The latest bugfix version of 3.6, 3.7, or 3.8 should also work fine. By default, our POM depends on 3.8.0.
Binary Incompatibility
This release is known to be incompatible with prior versions of the client libraries. That is, the 2.0.0 or 2.0.1 version of the client libraries will not be able to communicate with a 2.1.0 or later installation of Accumulo, nor will the 2.1.0 or later version of the client libraries communicate with a 2.0.1 or earlier installation.
Major New Features
Overhaul of Table Compactions
Significant changes were made to how Accumulo compacts files in this release. See compaction for details; below are some highlights.
- Multiple concurrent compactions per tablet on disjoint files are now supported. Previously only a single compaction could run on a tablet. This allows tablets that are running long compactions on large files to concurrently compact new smaller files that arrive.
- Multiple compaction thread pools per tablet server are now supported. Previously only a single thread pool existed within a tablet server for compactions. With a single thread pool, if all threads are working on long compactions it can starve quick compactions. Now compactions with little data can be processed by dedicated thread pools.
- Accumulo’s default algorithm for selecting files to compact was modified to select the smallest set of files that meet the compaction ratio criteria instead of the largest set. This change makes tablets more aggressive about reducing their number of files while still doing logarithmic compaction work. This change also enables efficiently compacting new small files that arrive during a long running compaction.
- Having dedicated compaction thread pools for tables is now supported through configuration. The default configuration for Accumulo sets up dedicated thread pools for compacting the Accumulo metadata table.
- Merging minor compactions were dropped. These were added to Accumulo to address the problem of new files arriving while a long running compaction was running. Merging minor compactions could cause O(N^2) compaction work. The new compaction changes in this release can satisfy this use case while doing a logarithmic amount of work.
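The "smallest set" selection described in the list above can be illustrated with a simplified sketch. It is not Accumulo's planner code. It assumes the ratio criterion is that the summed size of a set is at least the ratio multiplied by the largest file in the set, and the class name, method name, and example ratio are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Illustrative only: a simplified stand-in for a compaction file selector.
class SmallestSetSelector {

    // Returns the smallest group of files (taken in ascending size order) whose
    // total size is at least ratio * (largest file in the group).
    // Returns an empty list when no group of two or more files qualifies.
    static List<Long> select(List<Long> fileSizes, double ratio) {
        List<Long> sorted = new ArrayList<>(fileSizes);
        Collections.sort(sorted);
        long sum = 0;
        for (int i = 0; i < sorted.size(); i++) {
            sum += sorted.get(i);
            // sorted.get(i) is the largest file in the current prefix
            if (i >= 1 && sum >= ratio * sorted.get(i)) {
                return new ArrayList<>(sorted.subList(0, i + 1));
            }
        }
        return Collections.emptyList();
    }
}
```

With file sizes 1000, 10, 10, 10, and 900 and a ratio of 3, this sketch selects the three small files and leaves the two large files alone, which is the behavior that lets new small files be compacted while a long compaction of large files is still running.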
CompactionStrategy was deprecated in favor of new public APIs. CompactionStrategy was never public API, as it used internal types, and one of these types, FileRef, was removed in 2.1. Users who have written a CompactionStrategy can replace FileRef with its replacement internal type, StoredTabletFile, but this is not recommended. Since it is very likely that CompactionStrategy will be removed in a future release, any work put into rewriting a CompactionStrategy will be lost. It is recommended that users implement CompactionSelector, CompactionConfigurer, and CompactionPlanner instead. The new compaction changes in 2.1 introduce new algorithms for optimally scheduling compactions across multiple thread pools; configuring a deprecated compaction strategy may result in missing out on the benefits of these new algorithms.
See the javadoc for more information.
GitHub tickets related to these changes: #564 #1605 #1609 #1649
External Compactions (experimental)
This feature includes two new optional server components, CompactionCoordinator and Compactor, that enable the user to run major compactions outside of the TabletServer. See design, compaction, and the External Compaction blog post for more information. This work was completed over many tickets; see the GitHub project for the related issues. #2096
Scan Servers (experimental)
This feature includes a new optional server component, Scan Server, that enables the user to run scans outside of the TabletServer. See design, #2411, and #2665 for more information. Importantly, users can utilize this feature to avoid bogging down the TabletServer with long-running scans, slow iterators, etc., provided they are willing to tolerate eventual consistency.
New Per-Table On-Disk Encryption (experimental)
On-disk encryption can now be configured on a per-table basis as well as for the entire instance (all tables). See on-disk-encryption for more information.
New jshell entry point
Created a new “jshell” convenience entry point. Run bin/accumulo jshell to start up jshell, preloaded with Accumulo classes imported and with an instance of AccumuloClient already created for you to connect to Accumulo (assuming you have a client properties file on the class path). #1870 #1910
Major Improvements
Fixed GC Metadata hotspots
Prior to this release, Accumulo stored GC file candidates in the metadata table using rows of the form ~del<URI>. This row schema led to uneven load on the metadata table and to metadata tablets that were eventually never used. In #1043 / #1344, the row format was changed to ~del<hash(URI)><URI>, resulting in even load on the metadata table and even data spread in the tablets. After upgrading, there may still be splits in the metadata table using the old row format. These splits can be merged away as shown in the example below, which starts off with splits generated from the old and new row schema. The old splits with the prefix ~delhdfs are merged away.
root@uno> getsplits -t accumulo.metadata
2<
~
~del55
~dela7
~delhdfs://localhost:8020/accumulo/tables/2/default_tablet/F00000a0.rf
~delhdfs://localhost:8020/accumulo/tables/2/default_tablet/F00000kb.rf
root@uno> merge -t accumulo.metadata -b ~delhdfs -e ~delhdfs~
root@uno> getsplits -t accumulo.metadata
2<
~
~del55
~dela7
Master Renamed to Manager
In order to use more inclusive language in our code, the Accumulo team has renamed all references to the word “master” to “manager” (with the exception of deprecated classes and packages retained for compatibility). This change includes the master server process, configuration properties with master in the name, utilities with master in the name, and packages/classes in the code base. Where these changes affect the public API, the deprecated “master” name will still be supported until Accumulo 3.0.
Important: One particular change to be aware of is that certain state for the manager process is stored in ZooKeeper, previously under a directory named masters. This directory has been renamed to managers, and the upgrade will happen automatically if you launch Accumulo using the provided scripts. However, if you do not use the built-in scripts (e.g., accumulo-cluster or accumulo-service), then you will need to perform a one-time upgrade of the ZooKeeper state by executing the RenameMasterDirInZK utility:
${ACCUMULO_HOME}/bin/accumulo org.apache.accumulo.manager.upgrade.RenameMasterDirInZK
Some other specific examples of these changes include:
- All configuration properties starting with master. have been renamed to start with manager. instead. The master.* property names in the site configuration file (or passed on the command-line) are converted internally to the new name, and a warning is printed. However, the old name can still be used until at least the 3.0 release of Accumulo. Any master.* properties that have been set in ZooKeeper will be automatically converted to the new manager.* name when Accumulo is upgraded. The old property names can still be used by the config shell command or via the methods accessible via AccumuloClient, but a warning will be generated when the old names are used. You are encouraged to update all references to master in your site configuration files to manager when installing Accumulo 2.1.
- The tablet balancers in the org.apache.accumulo.server.master.balancer package have all been relocated to org.apache.accumulo.server.manager.balancer. DefaultLoadBalancer has also been renamed to SimpleLoadBalancer along with the move. The default balancer has been updated from org.apache.accumulo.server.master.balancer.TableLoadBalancer to org.apache.accumulo.server.manager.balancer.TableLoadBalancer, and the default per-table balancer has been updated from org.apache.accumulo.server.master.balancer.DefaultLoadBalancer to org.apache.accumulo.server.manager.balancer.SimpleLoadBalancer. If you have customized the tablet balancer configuration, you are strongly encouraged to update your configuration to reference the updated balancer names. If you have written a custom tablet balancer, it should be updated to implement the new interface org.apache.accumulo.server.manager.balancer.TabletBalancer rather than extending the deprecated abstract class org.apache.accumulo.server.master.balancer.TabletBalancer.
- The configuration file masters for identifying the manager host(s) has been deprecated. If this file is found, a warning will be printed. The replacement file managers should be used (i.e., rename your masters file to managers) instead.
- The master argument to the accumulo-service script has been deprecated, and the replacement manager argument should be used instead.
- The -master argument to the org.apache.accumulo.server.util.ZooZap utility has been deprecated and the replacement -manager argument should be used instead.
- The GetMasterStats utility has been renamed to GetManagerStats.
- org.apache.accumulo.master.state.SetGoalState is deprecated, and any custom scripts that invoke this utility should be updated to call org.apache.accumulo.manager.state.SetGoalState instead.
- masterMemory in minicluster.properties has been deprecated and managerMemory should be used instead in any minicluster.properties files you have configured.
- See also #1640 #1642 #1703 #1704 #1873 #1907
New Tracing Facility
HTrace support was removed in this release and has been replaced with OpenTelemetry. Trace information will not be shown in the monitor. See comments in #2259 for an example of how to configure Accumulo to emit traces to supported OpenTelemetry sinks. #2257
New Metrics Implementation
The Hadoop Metrics2 framework is no longer being used to emit metrics from Accumulo. Accumulo is now using the Micrometer framework. Metric name and type changes have been documented in org.apache.accumulo.core.metrics.MetricsProducer; see the javadoc for more information. See comments in #2305 for an example of how to configure Accumulo to emit metrics to supported Micrometer sinks. #1134
New SPI Package
A new Service Plugin Interface (SPI) package was created in the accumulo-core jar, at org.apache.accumulo.core.spi, under which exist interfaces for the various pluggable components. See #1900 #1905 #1880 #1891 #1426
Minor Improvements
New listtablets Shell Command
A new debugging command called listtablets was created. It shows detailed tablet information on a single line, aggregating data about a tablet such as status, location, size, number of entries, and HDFS directory name. It even shows the start and end rows of tablets, displaying them in the same sorted order they are stored in the metadata. See example command output below. #1317 #1821
root@uno> listtablets -t test_ingest -h
2021-01-04T15:12:47,663 [Shell.audit] INFO : root@uno> listtablets -t test_ingest -h
NUM TABLET_DIR FILES WALS ENTRIES SIZE STATUS LOCATION ID START (Exclusive) END
TABLE: test_ingest
1 t-0000007 1 0 60 552 HOSTED CURRENT:ip-10-113-12-25:9997 2 -INF row_0000000005
2 t-0000006 1 0 500 2.71K HOSTED CURRENT:ip-10-113-12-25:9997 2 row_0000000005 row_0000000055
3 t-0000008 1 0 5.00K 24.74K HOSTED CURRENT:ip-10-113-12-25:9997 2 row_0000000055 row_0000000555
4 default_tablet 1 0 4.44K 22.01K HOSTED CURRENT:ip-10-113-12-25:9997 2 row_0000000555 +INF
root@uno> listtablets -t accumulo.metadata
2021-01-04T15:13:21,750 [Shell.audit] INFO : root@uno> listtablets -t accumulo.metadata
NUM TABLET_DIR FILES WALS ENTRIES SIZE STATUS LOCATION ID START (Exclusive) END
TABLE: accumulo.metadata
1 table_info 2 0 7 524 HOSTED CURRENT:ip-10-113-12-25:9997 !0 -INF ~
2 default_tablet 0 0 0 0 HOSTED CURRENT:ip-10-113-12-25:9997 !0 ~ +INF
New Utility for Generating Splits
A new command line utility was created to generate split points from 1 or more rfiles. One or more HDFS directories can be given as well. The utility will iterate over all the files provided and determine the proper split points based on either the size or number given. It uses Apache Datasketches to get the split points from the data. #2361 #2368
New Option for Cloning Offline
Added option to leave cloned tables offline #1474 #1475
New Max Tablets Option in Bulk Import
The property table.bulk.max.tablets was added for the new bulk import technique. This property acts as a cluster performance failsafe to prevent a single ingested file from being distributed across too much of a cluster. The value is enforced by the new bulk import technique and is the maximum number of tablets allowed for one bulk import file. When this property is set, an error will be thrown when the value is exceeded during a bulk import. #1614
New Health Check Thread in TabletServer
A new thread was added to the tablet server to periodically verify tablet metadata. #2320 This thread also prints to the debug log how long it takes the tserver to scan the metadata table. The property tserver.health.check.interval was added to control the frequency at which this health check takes place. #2583
New ability for user to define context classloaders
Deprecated the existing VFS ClassLoader for eventual removal and created a new mechanism for users to load their own classloader implementations. The new VFS classloader and VFS context classloaders are in a new repo and can now be specified using Java’s own system properties. Alternatively, one can set their own classloader (they always could do this). #1747 #1715
Please reference the Known Issues section of the 2.0.1 release notes for an issue affecting the VFSClassLoader.
Change in uncaught Exception/Error handling in server-side threads
Consolidated and normalized thread pool and thread creation. All threads created through this code path will have an UncaughtExceptionHandler attached to it that will log the fact that the Thread encountered an uncaught Exception and is now dead. When an Error is encountered in a server process, it will attempt to print a message to stderr then terminate the VM using Runtime.halt. On the client side, the default UncaughtExceptionHandler will only log the Exception/Error in the client and does not terminate the VM. Additionally, the user has the ability to set their own UncaughtExceptionHandler implementation on the client. #1808 #1818 #2554
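The handling described above can be sketched with the standard Thread.UncaughtExceptionHandler interface. This is a minimal illustration, not Accumulo's implementation: the class name, the log message, and the pluggable onError action are hypothetical. A server-style caller could pass an action that calls Runtime.getRuntime().halt(...), while a client-style caller could pass a no-op.

```java
// Illustrative only: names and message format are assumptions.
class UncaughtHandlerSketch {

    // Builds a handler that always logs the failure and runs onError
    // only when the uncaught Throwable is an Error.
    static Thread.UncaughtExceptionHandler handler(Runnable onError) {
        return (thread, throwable) -> {
            System.err.println("Thread " + thread.getName()
                + " died from uncaught " + throwable);
            if (throwable instanceof Error) {
                onError.run();
            }
        };
    }

    // Runs the task on a new thread with the handler attached and waits
    // for that thread to finish.
    static void runAndWait(Runnable task, Runnable onError) {
        Thread t = new Thread(task, "sketch-worker");
        t.setUncaughtExceptionHandler(handler(onError));
        t.start();
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

Keeping the reaction to an Error as a parameter makes the difference between the server and client behavior explicit: the logging is shared, and only the follow-up action changes.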
Updated hash algorithm
With the default password Authenticator, Accumulo used to store password hashes using SHA-256 and using custom code to add a salt. In this release, we now use Apache commons-codec to store password hashes in the crypt(3) standard format. With this change, we’ve also defaulted to using the stronger SHA-512. Existing stored password hashes (if upgrading from an earlier version of Accumulo) will automatically be upgraded when users authenticate or change their passwords, and Accumulo will log a warning if it detects any passwords have not been upgraded. #1787 #1788 #1798 #1810
Various Performance improvements when deleting tables
- Make delete table operations cancel user compactions #2030 #2169.
- Prevent compactions from starting when delete table is called #2182 #2240.
- Added check to not flush when table is being deleted #1887.
- Added log message before waiting for deletes to finish #1881.
- Added code to stop user flush if table is being deleted #1931
New Monitor Pages, Improvements & Features
- A page was added to the Monitor that lists the active compactions and the longest running active compaction. As an optimization, this page will only fetch data if a user loads the page and will only do so a maximum of once a minute. This optimization was also added for the Active Scans page, along with the addition of a “Fetched” column indicating when the data was retrieved.
- A new feature was added to the TabletServer page to help users identify which tservers are in recovery mode. When a tserver is recovering, its corresponding row in the TabletServer Status table will be highlighted.
- A new page was also created for External Compactions that allows users to see the progress of compactions and other details about ongoing compactions.
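The fetch-on-demand optimization mentioned in the list above (fetch only when a page is loaded, and at most once a minute) can be sketched as a small time-bounded cache. This is a generic illustration under stated assumptions, not the Monitor's code: the class name, the injected clock, and the fetchedAt accessor (standing in for the “Fetched” column) are hypothetical.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

// Illustrative only: caches a fetched value and refreshes it lazily.
class ThrottledFetch<T> {
    private final Supplier<T> fetcher;
    private final long maxAgeMillis;
    private final LongSupplier clock; // injected so the sketch is testable
    private T cached;
    private long fetchedAt;
    private boolean loaded = false;

    ThrottledFetch(Supplier<T> fetcher, long maxAgeMillis, LongSupplier clock) {
        this.fetcher = fetcher;
        this.maxAgeMillis = maxAgeMillis;
        this.clock = clock;
    }

    // Called when a user loads the page. Nothing is fetched before the
    // first call, and later calls reuse the cached value until it is
    // at least maxAgeMillis old.
    synchronized T get() {
        long now = clock.getAsLong();
        if (!loaded || now - fetchedAt >= maxAgeMillis) {
            cached = fetcher.get();
            fetchedAt = now;
            loaded = true;
        }
        return cached;
    }

    // Time of the last real fetch, as reported by the injected clock.
    synchronized long fetchedAt() {
        return fetchedAt;
    }
}
```

In real use the clock argument would typically be System::currentTimeMillis and maxAgeMillis would be 60000 for a once-a-minute limit.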
New tserver scan timeout property
The new property tserver.scan.results.max.timeout was added to allow configuration of the timeout. A bug was discovered where tservers were running out of memory, partially due to this timeout being so short. The default value is 1 second, but now it can be increased. It is the max time for the thrift client handler to wait for scan results before timing out. #2599 #2598
Always choose volumes for new tablet files
In #1389, we changed the behavior of the VolumeChooser. It now runs any time a new file is created. This means VolumeChooser decisions are no longer “sticky” for tablets. This allows tablets to balance their files across multiple HDFS volumes, instead of the first selected. Now, only the directory name is “sticky” for a tablet, but the volume is not. So, new files will appear in a directory named the same on different volumes that the VolumeChooser selects.
Iterators package is now public API
#1390 #1400 #1411 We declared that the core.iterators package is public API, so it will now follow the semver rules for public API.
Better accumulo-gc memory usage
#1543 #1650 Switched from batching file candidates to delete based on the amount of available memory to a fixed-size batching strategy. This allows the accumulo-gc to run consistently using a batch size that is configurable by the user. The user is responsible for ensuring the process is given enough memory to accommodate the batch size they configure, but this makes the process much more consistent and predictable.
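A fixed-size batching strategy of the kind described above can be sketched as follows. This is a generic illustration, not the accumulo-gc code: the class and method names are hypothetical, and the handler stands in for whatever work is done on each batch of candidates.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;

// Illustrative only: processes a stream of candidates in fixed-size batches.
class FixedSizeBatcher {

    // Reads candidates one at a time, hands each full batch to the handler,
    // and returns the number of batches processed. At most batchSize
    // candidates are held in memory at once.
    static <T> int processInBatches(Iterator<T> candidates, int batchSize,
                                    Consumer<List<T>> handler) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int batches = 0;
        List<T> batch = new ArrayList<>(batchSize);
        while (candidates.hasNext()) {
            batch.add(candidates.next());
            if (batch.size() == batchSize) {
                handler.accept(batch);
                batches++;
                batch = new ArrayList<>(batchSize);
            }
        }
        // Handle the final, possibly smaller, batch.
        if (!batch.isEmpty()) {
            handler.accept(batch);
            batches++;
        }
        return batches;
    }
}
```

Because the batch size does not depend on how much memory happens to be free, the memory needed per batch is predictable, which matches the trade-off described above.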
Log4j2
#1528 #1514 #1515 #1516 While we still use slf4j, we have upgraded the default logger binding to log4j2, which comes with a bunch of features, such as dynamic reconfiguration, colorized console logging, and more.
Added forEach method to Scanner
#1742 #1765 We added a forEach method to Scanner objects, so you can easily iterate over the results using a lambda / BiConsumer that accepts a key-value pair.
New public API to set multiple properties atomically
#2692 We added a new public API to support setting multiple properties at once atomically using a read-modify-write pattern. This is available for table, namespace, and system properties, and is called modifyProperties(). This builds off a related change that allows us to more efficiently store properties in ZooKeeper, which also results in fewer ZooKeeper watches.
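The read-modify-write pattern named above can be illustrated with a small in-memory sketch. It shows only the general technique (take a snapshot, apply all changes to a copy, swap the copy in only if nothing changed in the meantime, otherwise retry). It is not Accumulo's implementation and does not use ZooKeeper; the class and method names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Consumer;

// Illustrative only: an in-memory property store with atomic multi-property updates.
class PropertyStoreSketch {
    private final AtomicReference<Map<String, String>> current =
        new AtomicReference<>(Map.of());

    // Applies every change made by the mutator as a single atomic update.
    // If another writer replaced the map first, the mutator is re-run
    // against the newer snapshot.
    Map<String, String> modify(Consumer<Map<String, String>> mutator) {
        while (true) {
            Map<String, String> snapshot = current.get();          // read
            Map<String, String> working = new HashMap<>(snapshot);
            mutator.accept(working);                               // modify
            Map<String, String> updated = Map.copyOf(working);
            if (current.compareAndSet(snapshot, updated)) {        // write
                return updated;
            }
        }
    }

    Map<String, String> get() {
        return current.get();
    }
}
```

Readers of such a store see either none or all of the changes from one modify call, which is the property that setting values one at a time cannot give.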
Simplified cluster configuration
#2138 #2903 Modified the accumulo-cluster script to read the server locations from a single file, cluster.yaml, in the conf directory instead of multiple files (tserver, manager, gc, etc.). Starting the new scan server and compactor server types is supported using this new file. It also contains options for starting multiple Tablet and Scan Servers per host.
Other notable changes
- #1174 #816 Abstract metadata and change root metadata schema
- #1309 Explicitly prevent cloning metadata table to prevent poor user experience
- #1313 #936 Store Root Tablet list of files in Zookeeper
- #1294 #1299 Add optional -t tablename to importdirectory shell command.
- #1332 Disable FileSystemMonitor checks of /proc by default (to be removed in future)
- #1345 #1352 Optionally disable gc-initiated compactions/flushes
- #1397 #1461 Replace relative paths in the metadata tables on upgrade.
- #1456 #1457 Prevent catastrophic tserver shutdown by rate limiting the shutdown
- #1053 #1060 #1576 Support multiple volumes in import table
- #1568 Support multiple tservers / node in accumulo-service
- #1644 #1645 Fix issue with minor compaction not retrying
- #1660 Dropped unused MemoryManager property
- #1764 #1783 Parallelize listcompactions in shell
- #1797 Add table option to shell delete command.
- #2039 #2045 Add bulk import option to ignore empty dirs
- #2117 #2236 Make sorted recovery write to RFiles. New tserver.wal.sort.file. property to configure
- #2076 Sorted recovery files can now be encrypted
- #2441 Upgraded to Junit 5
- #2462 Added SUBMITTED FaTE status to differentiate between things submitted vs. running
- #2467 Added fate shell command option to cancel FaTE operations that are NEW or SUBMITTED
- #2807 Added several troubleshooting utilities to the accumulo admin command.
- #2820 #2900 du command performance improved by using the metadata table for computation instead of HDFS
- #2966 Upgrade Thrift to 0.17.0
Upgrading
View the Upgrading Accumulo documentation for guidance.
Known Issues
At the time of release, the following issues were known:
- #3045 - External compactions may appear stuck until the coordinator is restarted
- #3048 - The monitor may not show times in the correct format for the user’s locale
- #3053 - ThreadPool creation is a bit spammy by default in the debug logs
- #3057 - The monitor may have an annoying popup on the external compactions page if the coordinator is offline
Useful Links
This release also includes bug fixes from 1.10.2 and earlier, which were released after 2.0.0 and 2.0.1.
View all releases in the archive