Apache Accumulo 2.1.3
12 Aug 2024
About
Apache Accumulo 2.1.3 is a patch release of the 2.1 LTM line. It contains bug fixes and minor enhancements. This version supersedes 2.1.2. Users upgrading to 2.1 should upgrade directly to this version instead of 2.1.2.
Notable Changes
The full set of changes is too large to list usefully here. Below are highlights of the most significant bug fixes and features added in 2.1.3. For the full set of changes, please see the commit history or issue tracker milestone links below.
Configurable Improvements
Many notable improvements in this release can be configured by users, via new or existing properties, to improve system performance or behavior. These include:
- #3722 Adds properties general.file.name.allocation.batch.size.min and general.file.name.allocation.batch.size.max that allow the batch size for unique filename allocation in ZooKeeper to be configurable. In a system that requires large numbers of unique names, larger batch sizes can reduce ZooKeeper contention because more file names can be reserved with a single ZooKeeper call.
- #3738 Adds an experimental property, gc.remove.in.use.candidates, that enables the Garbage Collector to remove candidates that have active tablet file references, with the expectation that a new deletion candidate for the file will reappear when the reference is no longer in use. This is expected to speed up subsequent GC runs by skipping checks for candidates that were previously known to still be in use. This experimental property defaults to false, but is expected to become the default behavior in future releases.
- #4544 Made scan servers refresh their cached list of tablet files before expiration using a new property, sserver.cache.metadata.refresh.percent, to control when the refresh happens.
- #4536 Created a ScanServerSelector implementation, org.apache.accumulo.core.spi.scan.ConfigurableScanServerHostSelector, available to users via the scan.server.selector.impl client property, that tries to use scan servers on the same host to leverage shared off-heap-cache usage.
- #3737, #3783 Addressed some common transport layer errors about max message sizes being exceeded. The max message size configuration was greatly simplified by removing any experimental max message size properties, and deprecating the other non-experimental ones, all in favor of a single new common property, rpc.message.size.max, for all server types. The default value is now very large to avoid users experiencing errors, but can be constrained by setting a smaller value. If the new property is set, it overrides any values set for the deprecated properties. However, if it is not set, the deprecated property values may still be in effect. It is recommended that users remove the use of the deprecated properties, so that their servers will use the new property, whose default value should be sufficient for most users. The deprecated properties will be removed in version 3.1. A related bug was also fixed where the configured max message size was not being used in some cases.
- #3751 Added property rpc.backlog to configure backlog size for Thrift server sockets.
- #4468 GrepIterator received several new options to control which portions of the Key and/or Value are matched. The default behavior preserves the previous behavior of matching anywhere except in the ColumnVisibility, so existing uses of the GrepIterator will not see any change in behavior. A related change was made to the shell’s grep command, which uses this iterator internally, so that it will also match on ColumnVisibility, which it didn’t do before. Users should be aware that any scripted uses of the shell’s grep command may see additional results that were not previously shown if the matched term is found only in the ColumnVisibility.
- #4348 The generate-splits utility now detects non-printable characters and throws an error if the base64 encoding option was not used. Previously, it would have attempted to use a custom hex-encoding for non-printable characters, which could be unreliable and hide binary data in split points without the user noticing. Users are advised to always use the base64 option whenever there is a chance of a non-printable character appearing in a split point.
- #4223 Added properties compactor.wait.time.job.min and compactor.wait.time.job.max to control compactor polling times.
- #3725 Changed the gc batch size option, gc.candidate.batch.size, to support memory percentages in addition to fixed memory sizes, and updated the default value accordingly.
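To illustrate the generate-splits change (#4348) in the list above, here is a minimal Python sketch of the described behavior. It is not the utility's actual code: the helper names (is_printable, encode_splits, decode_splits) are hypothetical, and treating "printable" as ASCII bytes 0x20 through 0x7E is an assumption made for illustration.

```python
import base64


def is_printable(split: bytes) -> bool:
    """Return True if every byte is printable ASCII (assumed range 0x20-0x7E)."""
    return all(0x20 <= b <= 0x7E for b in split)


def encode_splits(splits, use_base64=False):
    """Render split points as text lines, one per split point.

    Hypothetical helper that mirrors the behavior described in the notes:
    binary split points are rejected unless base64 encoding is requested.
    """
    lines = []
    for split in splits:
        if use_base64:
            # Base64 represents arbitrary bytes using only printable characters.
            lines.append(base64.b64encode(split).decode("ascii"))
        elif is_printable(split):
            lines.append(split.decode("ascii"))
        else:
            raise ValueError(
                f"non-printable bytes in split point {split!r}; use base64 encoding"
            )
    return lines


def decode_splits(lines):
    """Recover the original bytes from base64-encoded lines."""
    return [base64.b64decode(line) for line in lines]


if __name__ == "__main__":
    splits = [b"row_100", b"row\x00\x01", b"\xff\xfe"]
    encoded = encode_splits(splits, use_base64=True)
    # The round trip is lossless, so no binary data is silently altered.
    print(decode_splits(encoded) == splits)
```

Encoding every split point with base64, rather than only the binary ones, keeps the output file in a single unambiguous format, which is the reason the notes recommend the base64 option whenever binary data is possible.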
Some notable performance-related improvements may affect the behavior of the system in noticeable ways, but are not directly configurable by the user. These include:
- #4309 Optimized the logic for getting a random TabletServer connection which substantially improved Shell startup times, as well as startup times of any other Accumulo client.
- #3813 Reduced the load on ZooKeeper when running many compactors.
- #4709 Modified Manager balancer code such that the root, metadata, and user tables will be balanced separately, and in that order. For example, balancing for user tables will not occur while the metadata table is unbalanced.
- #4682 Changed the ScanServer file reference format in the metadata table to sort by UUID to increase performance.
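The balancing order introduced by #4709 can be modeled with a short Python sketch. This is an illustrative model only, not the Manager's balancer code: the tier names and the next_tier_to_balance function are hypothetical, and the "is balanced" flags stand in for whatever checks the Manager actually performs.

```python
from typing import Mapping, Optional

# Tiers in the order given in the notes: root, then metadata, then user tables.
BALANCE_ORDER = ("root", "metadata", "user")


def next_tier_to_balance(is_balanced: Mapping[str, bool]) -> Optional[str]:
    """Return the first tier that still needs balancing, or None if all are balanced.

    A later tier is never returned while an earlier tier is unbalanced.
    Tiers missing from the mapping are treated as unbalanced.
    """
    for tier in BALANCE_ORDER:
        if not is_balanced.get(tier, False):
            return tier
    return None


if __name__ == "__main__":
    # The metadata table is unbalanced, so user tables must wait.
    state = {"root": True, "metadata": False, "user": False}
    print(next_tier_to_balance(state))
```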
Notable Bug Fixes
- #4283 Fixed a bug with client side iterators that prevented them from accessing the plugin environment.
- #3721 Fixed an issue with writes happening in a retry after batch writer was closed. This strengthens metadata consistency.
- #3749, #3750 Fixed an issue where deleting a compaction pool with running compactions would leave the tserver in a bad state.
- #3748 Fixed a bug where a wal could remain locked if an exception occurred.
- #608, #3755 Added validation to the GC that checks that the scanner it uses to determine deletion candidates returned a complete row. This is a mitigation for #608, in which referenced files were observed being removed while still in use.
- #3744 Fixed a bug regarding improperly created GCRun logger name.
- #3737 Adds a custom Transport Factory to set transport message and frame size to avoid infinite loops as described in #3731.
- #4117 Fixed a bug in compaction properties where the replacement maxOpen property was being ignored in favor of the deprecated open.max property.
- #4681 Stopped listing all compactors in each compactor to reduce load on ZooKeeper.
- #3966 Changed the default value of the property table.majc.compaction.strategy to an empty string to fix a compatibility bug with old and new compaction plugins.
- #4554 Fixed a race condition that could cause duplicate compactions to run. While harmless in terms of data, the duplicate compactions could waste significant compute resources.
- #4127 Updated new compaction plugins to honor table.file.max property using a much more efficient algorithm than old compaction plugins had for this property.
- #4485 Interrupt compactions on tablet unload. This prevents long-running compactions from blocking tablet migration.
- #3512 Fixed an issue with improperly cleaned up scans preventing metadata tablet unload.
- #4456 Fixed an issue where setting an empty property value deleted the property.
- #4000 Fixed a bug that could cause bulk import to lose files when errors happened in the tablet server.
- #4462 Fixed bug that prevented listing Fate operations in some situations.
- #4573 Modified CredentialProviderToken serialization so that it no longer stores the resolved password in serialized form if the user were to serialize it.
- #4684 Fixed an issue that was causing table creation to get progressively slower when creating a lot of tables.
Metrics Improvements
- #4461, #4522, #4577, #4492, #4740, #4470 Added new metrics. See MetricsProducer for a full list.
- #4459 Added the ability to use multiple instances of MeterRegistryFactory to export metrics to several destinations at the same time. See general.micrometer.factory.
- #4622, #4716 Added metric to report when a server is idle or busy and added a property, general.metrics.process.idle, to make the threshold configurable.
- #3998 Added instance name tag to metrics
Other Improvements
- #4532 Added an option, timeToWaitForScanServers, to the ConfigurableScanServerSelector implementation of the scan.server.selector.impl client property to cause scans to wait for scan servers to be online before scanning.
- #3697 Support and document the ACCUMULO_JAVA_PREFIX option in accumulo-env.sh as either an array or as a scalar, to better support things like numactl in calling scripts.
- #3745 Adds a prefix to gc deletion log messages to make it easier to isolate the deletion actions of the garbage collector for analysis.
- #3684 Consolidated y/n prompts in the shell. Users can now exit out of multi-table delete operations without accepting prompts for each one.
- #3726 Adjusted reauthentication messages from the shell to assist with troubleshooting.
- #3927 Added validation of JSON property types
- #4763 Improved the accumulo-cluster script and cluster.yaml file for the use case of starting and stopping specific groups of compactors and scan servers.
- #4487 Scan server properties can now be set in the system configuration in ZooKeeper (via the API or the shell)
- #4768 Better thread names to make it easier to correlate thread dumps with their related source code
- #4558 Added a log message in the Manager when it has been waiting over 15 minutes for a tablet to unload.
- #4495 Added accumulo admin serviceStatus command to quickly get system process status from the command line.
Requirements
Accumulo 2.1.3 now requires JDK 17 to build, but still supports a Java 11 runtime.
Upgrading
View the Upgrading Accumulo documentation for guidance.
Useful Links
This release also contains bug fixes from 1.10.4, which was released after 2.1.2.
View all releases in the archive