Class CountingSummarizer<K>
- Type Parameters:
K - The counter key type. This type must have good implementations of Object.hashCode() and Object.equals(Object).
- All Implemented Interfaces:
Summarizer
- Direct Known Subclasses:
AuthorizationSummarizer, FamilySummarizer, VisibilitySummarizer
During collection and summarization this class will use the functions from converter() and encoder(). For each key/value the function from converter() will be called to create zero or more counter objects. A counter associated with each counter object will be incremented, as long as there are not too many counters and the counter object is not too long.
When Summarizer.Collector.summarize(Summarizer.StatisticConsumer) is called, the function from encoder() will be used to convert counter objects to strings. These strings will be used to emit statistics. Overriding encoder() is optional. One reason to override is if the counter object contains binary or special data. For example, a function that base64 encodes counter objects could be created.
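The base64 idea mentioned above could look roughly like the following sketch. It uses only java.util.Base64 and is independent of Accumulo; the class name Base64EncoderSketch and the use of ByteBuffer as the counter object type are assumptions made for illustration. In a real summarizer, a function like this is what an overridden encoder() would return.

```java
import java.nio.ByteBuffer;
import java.util.Base64;
import java.util.function.Function;

// Hypothetical stand-alone example; not part of the Accumulo API.
class Base64EncoderSketch {

  // Returns a function that turns a binary counter object into a printable string.
  static Function<ByteBuffer,String> encoder() {
    return counter -> {
      ByteBuffer view = counter.duplicate(); // leave the caller's position untouched
      byte[] bytes = new byte[view.remaining()];
      view.get(bytes);
      return Base64.getEncoder().encodeToString(bytes);
    };
  }

  public static void main(String[] args) {
    ByteBuffer counter = ByteBuffer.wrap(new byte[] {0x00, (byte) 0xff, 0x10});
    System.out.println(encoder().apply(counter));
  }
}
```

ByteBuffer is used here because it is compared by content, which matches the hashCode()/equals() requirement on the key type; a raw byte[] would not.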
If the counter key type is mutable, then consider overriding copier().
The function returned by converter() will be called frequently and should be very efficient. The function returned by encoder() will be called less frequently and can be more expensive. The reason these two functions exist is to avoid the conversion to string for each key/value, if that conversion is unnecessary.
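The counting flow described above can be modeled with a short stand-alone sketch. This is not the Accumulo implementation: the class CountingSketch, the order of the limit checks, measuring length on the encoded string, and the literal statistic names ("c:" prefix, "emitted", "tooMany", "tooLong") are all assumptions made for illustration. The real names are held in the constants documented under Field Details.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;

// Simplified stand-in for the collection flow; not the Accumulo implementation.
class CountingSketch<K> {
  private final Map<K,Long> counters = new HashMap<>();
  private final Function<K,String> encoder;
  private final int maxCounters;
  private final int maxCounterLen;
  private long emitted;
  private long tooMany;
  private long tooLong;

  CountingSketch(Function<K,String> encoder, int maxCounters, int maxCounterLen) {
    this.encoder = encoder;
    this.maxCounters = maxCounters;
    this.maxCounterLen = maxCounterLen;
  }

  // Called once for every counter object the converter produces.
  void increment(K counter) {
    emitted++;
    Long current = counters.get(counter);
    if (current != null) {
      counters.put(counter, current + 1); // known counter: no string conversion
    } else if (counters.size() >= maxCounters) {
      tooMany++; // limit on unique counters reached
    } else if (encoder.apply(counter).length() > maxCounterLen) {
      tooLong++; // assumed: length is measured on the encoded string
    } else {
      counters.put(counter, 1L);
    }
  }

  // Counter objects are converted to strings when statistics are emitted.
  Map<String,Long> summarize() {
    Map<String,Long> stats = new TreeMap<>();
    counters.forEach((counter, count) -> stats.put("c:" + encoder.apply(counter), count));
    stats.put("emitted", emitted);
    stats.put("tooMany", tooMany);
    stats.put("tooLong", tooLong);
    return stats;
  }

  public static void main(String[] args) {
    CountingSketch<String> sketch = new CountingSketch<String>(Object::toString, 3, 5);
    for (String c : new String[] {"A", "B", "A", "toolong", "C", "D"}) {
      sketch.increment(c);
    }
    System.out.println(sketch.summarize());
  }
}
```

In this model the common path, incrementing a counter that has already been seen, never calls the encoder, which is the cost the converter()/encoder() split is meant to avoid.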
Below is an example implementation that counts column visibilities. This example avoids converting column visibility to string for each key/value. This example shows the source code for VisibilitySummarizer.
public class VisibilitySummarizer extends CountingSummarizer<ByteSequence> {
  @Override
  protected UnaryOperator<ByteSequence> copier() {
    // ByteSequences are mutable, so override and provide a copy function
    return ArrayByteSequence::new;
  }

  @Override
  protected Converter<ByteSequence> converter() {
    return (key, val, consumer) -> consumer.accept(key.getColumnVisibilityData());
  }
}
- Since:
- 2.0.0
-
Nested Class Summary
Modifier and Type | Class | Description
static interface | CountingSummarizer.Converter<K> | A function that converts key values to zero or more counter objects.
Nested classes/interfaces inherited from interface org.apache.accumulo.core.client.summary.Summarizer: Summarizer.Collector, Summarizer.Combiner, Summarizer.StatisticConsumer
-
Field Summary
Modifier and Type | Field | Description
static final String | COUNTER_STAT_PREFIX | This prefixes all counters when emitting statistics in Summarizer.Collector.summarize(Summarizer.StatisticConsumer).
static final String | DELETES_IGNORED_STAT | This is the name of the statistic that tracks the total number of deleted keys seen.
static final String | EMITTED_STAT | This is the name of the statistic that tracks the total number of counter objects emitted by the CountingSummarizer.Converter.
static final String | INGNORE_DELETES_DEFAULT |
static final String | INGNORE_DELETES_OPT | A configuration option to determine if delete keys should be counted.
static final String | MAX_CKL_DEFAULT |
static final String | MAX_COUNTER_DEFAULT |
static final String | MAX_COUNTER_LEN_OPT | A configuration option for specifying the maximum length of an individual counter key.
static final String | MAX_COUNTERS_OPT | A configuration option for specifying the maximum number of unique counters an instance of this summarizer should track.
static final String | SEEN_STAT | This tracks the total number of key/values seen by the Summarizer.Collector.
static final String | TOO_LONG_STAT | This is the name of the statistic that tracks how many counter objects were ignored because they were too long.
static final String | TOO_MANY_STAT | This is the name of the statistic that tracks how many counter objects were ignored because the number of unique counters was exceeded.
-
Constructor Summary
-
Method Summary
Modifier and Type | Method | Description
Summarizer.Collector | collector | Factory method that creates a Summarizer.Collector based on configuration.
Summarizer.Combiner | combiner | Factory method that creates a Summarizer.Combiner.
protected abstract CountingSummarizer.Converter<K> | converter() |
protected UnaryOperator<K> | copier() | Override this if your key type is mutable and subject to change.
protected Function<K,String> | encoder() |
-
Field Details
-
MAX_COUNTERS_OPT
A configuration option for specifying the maximum number of unique counters an instance of this summarizer should track. If not specified, a default of "1024" will be used.
-
MAX_COUNTER_LEN_OPT
A configuration option for specifying the maximum length of an individual counter key. If not specified, a default of "128" will be used.
-
INGNORE_DELETES_OPT
A configuration option to determine if delete keys should be counted. If set to true then delete keys will not be passed to the CountingSummarizer.Converter and the statistic "deletesIgnored" will track the number of deletes ignored. This option defaults to "true".
-
COUNTER_STAT_PREFIX
This prefixes all counters when emitting statistics in Summarizer.Collector.summarize(Summarizer.StatisticConsumer).
-
TOO_MANY_STAT
This is the name of the statistic that tracks how many counter objects were ignored because the number of unique counters was exceeded. The maximum number of unique counters is specified by MAX_COUNTERS_OPT.
-
TOO_LONG_STAT
This is the name of the statistic that tracks how many counter objects were ignored because they were too long. The maximum length is specified by MAX_COUNTER_LEN_OPT.
-
EMITTED_STAT
This is the name of the statistic that tracks the total number of counter objects emitted by the CountingSummarizer.Converter. This includes emitted counter objects that were ignored.
-
DELETES_IGNORED_STAT
This is the name of the statistic that tracks the total number of deleted keys seen. This statistic is only incremented when the "ignoreDeletes" option is set to true.
-
SEEN_STAT
This tracks the total number of key/values seen by the Summarizer.Collector.
-
MAX_COUNTER_DEFAULT
-
MAX_CKL_DEFAULT
-
INGNORE_DELETES_DEFAULT
-
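The defaults stated in the field details above ("1024", "128", and "true") can be illustrated with a small option-parsing sketch. This is plain Java and does not use the Accumulo configuration API. The class OptionsSketch is hypothetical, and of the option keys only "ignoreDeletes" is named in the text; "maxCounters" and "maxCounterLen" are assumed values for the corresponding constants.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of option defaults; not the Accumulo implementation.
class OptionsSketch {
  // "ignoreDeletes" is named in the documentation; the other two keys are assumed.
  static final String MAX_COUNTERS_OPT = "maxCounters";
  static final String MAX_COUNTER_LEN_OPT = "maxCounterLen";
  static final String INGNORE_DELETES_OPT = "ignoreDeletes";

  final int maxCounters;
  final int maxCounterLen;
  final boolean ignoreDeletes;

  OptionsSketch(Map<String,String> options) {
    // Fall back to the documented defaults when an option is not specified.
    maxCounters = Integer.parseInt(options.getOrDefault(MAX_COUNTERS_OPT, "1024"));
    maxCounterLen = Integer.parseInt(options.getOrDefault(MAX_COUNTER_LEN_OPT, "128"));
    ignoreDeletes = Boolean.parseBoolean(options.getOrDefault(INGNORE_DELETES_OPT, "true"));
  }

  public static void main(String[] args) {
    Map<String,String> options = new HashMap<>();
    options.put(MAX_COUNTERS_OPT, "10");
    OptionsSketch parsed = new OptionsSketch(options);
    System.out.println(parsed.maxCounters + " " + parsed.maxCounterLen + " " + parsed.ignoreDeletes);
  }
}
```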
-
Constructor Details
-
CountingSummarizer
public CountingSummarizer()
-
-
Method Details
-
converter
- Returns:
- A function that is used to convert each key value to zero or more counter objects. Each function returned should be independent.
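To illustrate "zero or more counter objects", here is a stand-alone sketch of a converter-style function. The nested Converter interface below is a hypothetical simplification that takes plain strings in place of Accumulo key and value objects; only the three-argument shape (key, value, consumer) is taken from the VisibilitySummarizer example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical stand-in; uses String where Accumulo would pass its key and value types.
class ConverterSketch {

  @FunctionalInterface
  interface Converter<K> {
    void convert(String key, String value, Consumer<K> consumer);
  }

  // Emits one counter object per non-empty, comma-separated tag in the value.
  static Converter<String> tagConverter() {
    return (key, value, consumer) -> {
      for (String tag : value.split(",")) {
        if (!tag.isEmpty()) {
          consumer.accept(tag);
        }
      }
    };
  }

  // Helper that collects everything the converter emits for one key/value.
  static List<String> emitted(String value) {
    List<String> out = new ArrayList<>();
    tagConverter().convert("row1", value, out::add);
    return out;
  }

  public static void main(String[] args) {
    System.out.println(emitted("red,blue,red")); // three counter objects
    System.out.println(emitted(""));             // no counter objects
  }
}
```

The function holds no state of its own, so separate instances cannot interfere with each other.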
-
encoder
- Returns:
- A function that is used to convert counter objects to String. The default function calls Object.toString() on the counter object.
-
copier
Override this if your key type is mutable and subject to change.
- Returns:
- A function that is used to copy the counter object. This function is only used when the collector has never seen the counter object before. In this case the collector may need to copy the counter object before using it as a map key. The default implementation is the UnaryOperator.identity() function.
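The reason a copy may be needed can be shown with a stand-alone sketch in which the caller reuses one mutable object for every counter. The class CopierSketch and its methods are hypothetical, and ByteBuffer stands in for a mutable key type such as ByteSequence. As in the description above, the copy function is applied only the first time a counter object is seen.

```java
import java.nio.ByteBuffer;
import java.util.HashMap;
import java.util.Map;
import java.util.function.UnaryOperator;

// Hypothetical illustration of why mutable map keys need a copy function.
class CopierSketch {

  // Makes an independent copy of the buffer's remaining content.
  static ByteBuffer copy(ByteBuffer original) {
    ByteBuffer result = ByteBuffer.allocate(original.remaining());
    result.put(original.duplicate());
    result.flip();
    return result;
  }

  // Counts single-byte inputs that all arrive through one reused, mutable buffer.
  static Map<ByteBuffer,Long> count(byte[][] inputs, UnaryOperator<ByteBuffer> copier) {
    Map<ByteBuffer,Long> counters = new HashMap<>();
    ByteBuffer reused = ByteBuffer.allocate(1);
    for (byte[] input : inputs) {
      reused.put(0, input[0]); // the caller overwrites the same object each time
      Long current = counters.get(reused);
      if (current == null) {
        counters.put(copier.apply(reused), 1L); // first sighting: copy before use as key
      } else {
        counters.put(reused, current + 1); // existing entry keeps its original key object
      }
    }
    return counters;
  }

  public static void main(String[] args) {
    byte[][] inputs = {{1}, {2}, {1}};
    Map<ByteBuffer,Long> counts = count(inputs, CopierSketch::copy);
    System.out.println(counts.get(ByteBuffer.wrap(new byte[] {1})));
    System.out.println(counts.get(ByteBuffer.wrap(new byte[] {2})));
  }
}
```

Passing UnaryOperator.identity() instead of CopierSketch::copy stores the reused buffer itself as the key, so later writes to it silently change keys that are already in the map.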
-
collector
Description copied from interface: Summarizer
Factory method that creates a Summarizer.Collector based on configuration. Each Summarizer.Collector created by this method should be independent and have its own internal state. Accumulo uses a Collector to generate summary statistics about a sequence of key values written to a file.
- Specified by:
collector in interface Summarizer
-
combiner
Description copied from interface: Summarizer
Factory method that creates a Summarizer.Combiner. Accumulo will only use the created Combiner to merge data from Summarizer.Collectors created using the same SummarizerConfiguration.
- Specified by:
combiner in interface Summarizer
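Merging data from two collectors can be pictured as summing statistics that share a name. The sketch below is a deliberately simplified assumption: the class CombinerSketch and its merge helper are hypothetical, the statistic names are illustrative, and no limit on the number of unique counters is enforced during the merge.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of combining statistics; not the Accumulo implementation.
class CombinerSketch {

  // Adds every statistic in 'from' to the statistic of the same name in 'into'.
  static void merge(Map<String,Long> into, Map<String,Long> from) {
    from.forEach((name, count) -> into.merge(name, count, Long::sum));
  }

  public static void main(String[] args) {
    Map<String,Long> first = new HashMap<>();
    first.put("c:A", 2L);
    first.put("seen", 3L);

    Map<String,Long> second = new HashMap<>();
    second.put("c:A", 1L);
    second.put("c:B", 4L);
    second.put("seen", 5L);

    merge(first, second);
    System.out.println(first);
  }
}
```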
-