Class ConfigurableScanServerSelector
- All Implemented Interfaces:
ScanServerSelector
- Direct Known Subclasses:
ConfigurableScanServerHostSelector
- Hash each tablet to a per-attempt configurable number of scan servers and then randomly choose one of those scan servers. Using hashing allows different clients to select the same scan servers for a given tablet.
- Use a per-attempt configurable busy timeout.
This class accepts a single configuration that has a JSON value. To configure this class set
scan.server.selector.opts.profiles=<json>
in the Accumulo client configuration along with
the config for the class. The following is the default configuration value.
- isDefault : A boolean that specifies whether this is the default profile. One and only one profile must set this to true.
- maxBusyTimeout : The maximum busy timeout to use. Once the attempt plans are exhausted, the busy timeout from the last attempt plan grows exponentially, multiplied by busyTimeoutMultiplier on each further attempt, up to this maximum.
- scanTypeActivations : A list of scan types that will activate this profile. Scan
types are specified by setting
scan_type=<scan_type>
as an execution hint on the scanner. See ScannerBase.setExecutionHints(Map)
- group : Scan servers can be started with an optional group. If specified, this option will limit the scan servers used to those that were started with this group name. If not specified, the set of scan servers that did not specify a group will be used. Grouping scan servers supports at least two use cases. First groups can be used to dedicate resources for certain scans. Second groups can be used to have different hardware/VM types for scans, for example could have some scans use expensive high memory VMs and others use cheaper burstable VMs.
- timeToWaitForScanServers : When there are no scan servers, this setting determines
how long to wait for scan servers to become available before falling back to tablet servers.
Falling back to tablet servers may cause tablets to be loaded that are not currently loaded. When
this setting is given a wait time and there are no scan servers, the selector will wait for scan
servers to become available. This setting avoids loading tablets on tablet servers when scan servers
are temporarily unavailable, which could be caused by normal cluster activity. The wait time can be
specified in any of the following units:
- "d" for days
- "h" for hours
- "m" for minutes
- "s" for seconds
- "ms" for milliseconds
ScanServerSelector.SelectorParameters.waitUntil(Supplier, Duration, String)
- attemptPlans : A list of configuration to use for each scan attempt. Each list object
has the following fields:
- servers : The number of servers to randomly choose from for this attempt.
- busyTimeout : The busy timeout to use for this attempt.
- salt : An optional string to append when hashing the tablet. When the salt differs between attempts, the sets of servers those attempts choose from may be disjoint. When the salt is not set, or is the same for each attempt, the smaller set of servers is always a subset of the larger one.
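The interaction between servers and salt described above can be illustrated with a minimal sketch. This is not the class's actual code: the class name HashThenRandomSketch, its method names, the use of CRC32 as the hash, and the "consecutive window in an ordered server list" strategy are all assumptions made for illustration. It assumes a non-empty, consistently ordered list of scan servers.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import java.util.zip.CRC32;

// Hypothetical sketch of "hash to N servers, then pick one at random".
class HashThenRandomSketch {
  private static final Random RANDOM = new Random();

  // Maps the tablet id (plus optional salt) to a start index in the ordered server list.
  // CRC32 is a stand-in hash chosen only because it ships with the JDK.
  static int startIndex(String tabletId, String salt, int totalServers) {
    CRC32 crc = new CRC32();
    String key = tabletId + (salt == null ? "" : salt);
    crc.update(key.getBytes(StandardCharsets.UTF_8));
    return (int) (crc.getValue() % totalServers);
  }

  // Candidate set: `servers` consecutive entries (wrapping around) starting at the hashed index.
  // With the same salt, the set for a smaller `servers` value is a prefix of the larger set.
  static List<String> candidates(List<String> ordered, String tabletId, String salt, int servers) {
    int count = Math.min(servers, ordered.size());
    int start = startIndex(tabletId, salt, ordered.size());
    List<String> result = new ArrayList<>();
    for (int i = 0; i < count; i++) {
      result.add(ordered.get((start + i) % ordered.size()));
    }
    return result;
  }

  // Every client computes the same candidate set, then picks one member at random.
  static String choose(List<String> ordered, String tabletId, String salt, int servers) {
    List<String> set = candidates(ordered, tabletId, salt, servers);
    return set.get(RANDOM.nextInt(set.size()));
  }

  public static void main(String[] args) {
    List<String> ordered = List.of("ss1", "ss2", "ss3", "ss4", "ss5", "ss6");
    System.out.println("no salt, 1 server : " + candidates(ordered, "tablet-a", null, 1));
    System.out.println("no salt, 3 servers: " + candidates(ordered, "tablet-a", null, 3));
    System.out.println("salt 42, 3 servers: " + candidates(ordered, "tablet-a", "42", 3));
  }
}
```

In this sketch, changing the salt moves the start of the window, which is why differently salted attempts may end up with candidate sets that do not overlap.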
Below is an example configuration with two profiles, one is the default and the other is used
when the scan execution hint scan_type=slow
is set.
[
  {
    "isDefault": true,
    "maxBusyTimeout": "5m",
    "busyTimeoutMultiplier": 4,
    "attemptPlans": [
      {"servers": "3", "busyTimeout": "33ms"},
      {"servers": "100%", "busyTimeout": "100ms"}
    ]
  },
  {
    "scanTypeActivations": ["slow"],
    "maxBusyTimeout": "20m",
    "busyTimeoutMultiplier": 8,
    "group": "lowcost",
    "timeToWaitForScanServers": "120s",
    "attemptPlans": [
      {"servers": "1", "busyTimeout": "10s"},
      {"servers": "3", "busyTimeout": "30s", "salt": "42"},
      {"servers": "9", "busyTimeout": "60s", "salt": "84"}
    ]
  }
]
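As a rough illustration of how such a value and the scan_type hint might be prepared on the client side, the following sketch uses only JDK types. The class name SelectorConfigSketch and its methods are hypothetical; the property key is the one given above, and the hint map is assumed to be a Map<String,String> of the kind passed to ScannerBase.setExecutionHints(Map). The property that selects the selector class itself is not shown.

```java
import java.util.Map;
import java.util.Properties;

// Hypothetical helper that assembles the client property and the scan hint.
class SelectorConfigSketch {
  // Property key taken from the class description above.
  static final String PROFILES_KEY = "scan.server.selector.opts.profiles";

  // Returns client properties carrying the profiles JSON as a single string value.
  static Properties clientProperties(String profilesJson) {
    Properties props = new Properties();
    props.setProperty(PROFILES_KEY, profilesJson);
    return props;
  }

  // Execution hints that would activate a profile whose scanTypeActivations contains "slow".
  static Map<String, String> slowScanHints() {
    return Map.of("scan_type", "slow");
  }

  public static void main(String[] args) {
    // Only the default profile from the example, written as a compact JSON string.
    String json = "[{\"isDefault\":true,\"maxBusyTimeout\":\"5m\",\"busyTimeoutMultiplier\":4,"
        + "\"attemptPlans\":[{\"servers\":\"3\",\"busyTimeout\":\"33ms\"},"
        + "{\"servers\":\"100%\",\"busyTimeout\":\"100ms\"}]}]";
    System.out.println(clientProperties(json).getProperty(PROFILES_KEY));
    System.out.println(slowScanHints());
  }
}
```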
For the default profile in the example, the first attempt chooses randomly from 3 scan servers based on a hash of the tablet with no salt and uses a busy timeout of 33 milliseconds. If the first attempt returns busy, the second attempt chooses randomly from 100% of the servers (that is, all of them) and uses a busy timeout of 100ms. Subsequent attempts keep choosing from all servers and multiply the busy timeout by 4 each time until the max busy timeout of 5 minutes is reached.
For the profile activated by scan_type=slow
the first attempt chooses randomly from 1
scan server based on a hash of the tablet with no salt and uses a busy timeout of 10s. The second
attempt chooses from 3 scan servers based on a hash of the tablet plus the salt
42. Without the salt, the single scan server from the first attempt would always be
included in the set of 3. With the salt, the single scan server from the first attempt may not be
included. The third attempt chooses a scan server from 9 using the salt 84 and a
busy timeout of 60s. The different salt means the sets of servers that attempts 2 and 3 choose
from may be disjoint. Attempt 4 and later continue to choose from the same 9 servers as
attempt 3 and keep increasing the busy timeout by multiplying it by 8 until the maximum of 20
minutes is reached. This profile chooses only from scan servers in the group
lowcost. This profile also does not fall back to tablet servers immediately when there are
currently no scan servers; instead it waits for scan servers to become available.
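The busy timeout progression for the scan_type=slow profile can be worked through with a small sketch. The class and method names are hypothetical, attempts are numbered from 0, and the assumption that the first attempt after the plans run out already applies the multiplier once is an interpretation of the description above, not a statement about the actual implementation.

```java
import java.util.List;

// Hypothetical sketch of the per-attempt busy timeout schedule.
class BusyTimeoutSketch {

  // Returns the busy timeout in milliseconds for a zero-based attempt number.
  // Attempts covered by a plan use that plan's timeout; later attempts multiply
  // the last plan's timeout by `multiplier` once per extra attempt, capped at maxMillis.
  static long busyTimeoutMillis(List<Long> planTimeoutsMillis, long multiplier, long maxMillis,
      int attempt) {
    if (attempt < planTimeoutsMillis.size()) {
      return planTimeoutsMillis.get(attempt);
    }
    long timeout = planTimeoutsMillis.get(planTimeoutsMillis.size() - 1);
    for (int i = planTimeoutsMillis.size(); i <= attempt && timeout < maxMillis; i++) {
      timeout = Math.min(maxMillis, timeout * multiplier);
    }
    return timeout;
  }

  public static void main(String[] args) {
    // Values from the scan_type=slow profile: 10s, 30s, 60s plans, multiplier 8, max 20m.
    List<Long> slowPlans = List.of(10_000L, 30_000L, 60_000L);
    long maxMillis = 20L * 60 * 1000;
    for (int attempt = 0; attempt < 6; attempt++) {
      System.out.println("attempt " + (attempt + 1) + ": "
          + busyTimeoutMillis(slowPlans, 8, maxMillis, attempt) + " ms");
    }
  }
}
```

Under these assumptions the fourth attempt uses 60s multiplied by 8, which is 480s, and the fifth attempt would reach 3840s but is capped at the 20 minute maximum.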
- Since:
- 2.1.0
Nested Class Summary
Nested classes/interfaces inherited from interface org.apache.accumulo.core.spi.scan.ScanServerSelector
ScanServerSelector.InitParameters, ScanServerSelector.SelectorParameters
Field Summary
Fields inherited from interface org.apache.accumulo.core.spi.scan.ScanServerSelector
DEFAULT_SCAN_SERVER_GROUP_NAME
Constructor Summary
ConfigurableScanServerSelector()
Method Summary
- void init : This method is called once after a ScanServerSelector is instantiated.
- selectServers : Uses the ScanServerSelector.SelectorParameters to determine which, if any, ScanServer should be used for scanning a tablet.
- protected int selectServers(ScanServerSelector.SelectorParameters params, ConfigurableScanServerSelector.Profile profile, List<String> orderedScanServers, Map<TabletId, String> serversToUse)
Field Details
RANDOM
PROFILES_DEFAULT
Constructor Details
ConfigurableScanServerSelector
public ConfigurableScanServerSelector()
Method Details
init
Description copied from interface: ScanServerSelector
This method is called once after a ScanServerSelector is instantiated.
- Specified by:
init in interface ScanServerSelector
selectServers
Description copied from interface: ScanServerSelector
Uses the ScanServerSelector.SelectorParameters to determine which, if any, ScanServer should be used for scanning a tablet.
In the case where there are zero scan servers available and an implementation does not want to fall back to tablet servers, it is OK to wait and poll for scan servers. When waiting, it is best to use ScanServerSelector.SelectorParameters.waitUntil(Supplier, Duration, String), as this allows Accumulo to know about the wait and cancel it via exceptions when it no longer makes sense to wait.
- Specified by:
selectServers in interface ScanServerSelector
- Parameters:
params - parameters for the calculation
- Returns:
results
selectServers
protected int selectServers(ScanServerSelector.SelectorParameters params, ConfigurableScanServerSelector.Profile profile, List<String> orderedScanServers, Map<TabletId, String> serversToUse)