java.lang.Object

org.apache.accumulo.core.client.mapreduce.InputTableConfig

All Implemented Interfaces: org.apache.hadoop.io.Writable

public class InputTableConfig extends Object implements org.apache.hadoop.io.Writable

This class holds a batch scan configuration for a table. It contains all the properties needed to specify how rows should be returned from the table.

Constructor Summary

Constructors

| Constructor | Description |
| --- | --- |
| InputTableConfig() | |
| InputTableConfig(DataInput input) | Creates a batch scan config object out of a previously serialized batch scan config object. |

Method Summary

| Modifier and Type | Method | Description |
| --- | --- | --- |
| boolean | equals(Object o) | |
| InputTableConfig | fetchColumns(Collection<org.apache.accumulo.core.util.Pair<org.apache.hadoop.io.Text,org.apache.hadoop.io.Text>> columns) | Restricts the columns that will be mapped over for this job for the default input table. |
| Collection<org.apache.accumulo.core.util.Pair<org.apache.hadoop.io.Text,org.apache.hadoop.io.Text>> | getFetchedColumns() | Returns the columns to be fetched for this configuration. |
| List<IteratorSetting> | getIterators() | Returns the iterators to be set on this configuration. |
| List<Range> | getRanges() | Returns the ranges to be queried in the configuration. |
| SamplerConfiguration | getSamplerConfiguration() | |
| int | hashCode() | |
| boolean | isOfflineScan() | Determines whether a configuration has the offline table scan feature enabled. |
| void | readFields(DataInput dataInput) | |
| InputTableConfig | setAutoAdjustRanges(boolean autoAdjustRanges) | Controls the automatic adjustment of ranges for this job. |
| InputTableConfig | setIterators(List<IteratorSetting> iterators) | Sets the iterators to be used in the query. |
| InputTableConfig | setOfflineScan(boolean offlineScan) | Enables reading offline tables. |
| InputTableConfig | setRanges(List<Range> ranges) | Sets the input ranges to scan for all tables associated with this job. |
| void | setSamplerConfiguration(SamplerConfiguration samplerConfiguration) | Sets the sampler configuration to use when reading from the data. |
| InputTableConfig | setUseIsolatedScanners(boolean useIsolatedScanners) | Controls the use of the IsolatedScanner in this job. |
| InputTableConfig | setUseLocalIterators(boolean useLocalIterators) | Controls the use of the ClientSideIteratorScanner in this job. |
| boolean | shouldAutoAdjustRanges() | Determines whether a configuration has auto-adjust ranges enabled. |
| boolean | shouldUseIsolatedScanners() | Determines whether a configuration has isolation enabled. |
| boolean | shouldUseLocalIterators() | Determines whether a configuration uses local iterators. |
| void | write(DataOutput dataOutput) | |

Methods inherited from class java.lang.Object
clone, finalize, getClass, notify, notifyAll, toString, wait, wait, wait

Constructor Details
- InputTableConfig
  
  public InputTableConfig()
- InputTableConfig
  
  public InputTableConfig(DataInput input) throws IOException
  
  Creates a batch scan config object out of a previously serialized batch scan config object.
  
  Parameters:
  
  input - the data input of the serialized batch scan config
  
  Throws:
  
  IOException
Method Details
- setRanges
  
  public InputTableConfig setRanges(List<Range> ranges)
  
  Sets the input ranges to scan for all tables associated with this job. This will be added to any per-table ranges that have been set using
  
  Parameters:
  
  ranges - the ranges that will be mapped over
  
  Since:
  
  1.6.0
- getRanges
  
  public List<Range> getRanges()
  
  Returns the ranges to be queried in the configuration
- fetchColumns
  
  public InputTableConfig fetchColumns(Collection<org.apache.accumulo.core.util.Pair<org.apache.hadoop.io.Text,org.apache.hadoop.io.Text>> columns)
  
  Restricts the columns that will be mapped over for this job for the default input table.
  
  Parameters:
  
  columns - pairs of Text objects, each corresponding to a column family and column qualifier. If the column qualifier is null, the entire column family is selected. An empty set is the default and is equivalent to scanning all columns.
  
  Since:
  
  1.6.0
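The selection rule described for fetchColumns (a null qualifier selects the whole family, and an empty set selects everything) can be illustrated with a small JDK-only sketch. The names `ColumnFilterSketch`, `ColumnPair`, and `isFetched` are hypothetical stand-ins invented for this illustration; they are not part of Accumulo, and plain strings are used in place of Text objects.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;

/** Hypothetical model of the fetchColumns selection rule; not Accumulo code. */
final class ColumnFilterSketch {

    /** Stand-in for a (column family, column qualifier) pair. */
    static final class ColumnPair {
        final String family;
        final String qualifier; // null means "every qualifier in this family"

        ColumnPair(String family, String qualifier) {
            this.family = family;
            this.qualifier = qualifier;
        }
    }

    /** Returns true if the given family/qualifier would be selected. */
    static boolean isFetched(Collection<ColumnPair> fetched, String family, String qualifier) {
        if (fetched.isEmpty()) {
            return true; // an empty set is equivalent to scanning all columns
        }
        for (ColumnPair pair : fetched) {
            boolean familyMatches = pair.family.equals(family);
            boolean qualifierMatches = pair.qualifier == null || pair.qualifier.equals(qualifier);
            if (familyMatches && qualifierMatches) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Collection<ColumnPair> fetched = Arrays.asList(
            new ColumnPair("meta", null),     // whole "meta" family
            new ColumnPair("attr", "size"));  // only attr:size
        System.out.println(isFetched(fetched, "meta", "created"));
        System.out.println(isFetched(fetched, "attr", "color"));
        System.out.println(isFetched(Collections.<ColumnPair>emptyList(), "any", "thing"));
    }
}
```

The sketch only models which columns are selected; it says nothing about how the real scanner applies the restriction.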
- getFetchedColumns
  
  public Collection<org.apache.accumulo.core.util.Pair<org.apache.hadoop.io.Text,org.apache.hadoop.io.Text>> getFetchedColumns()
  
  Returns the columns to be fetched for this configuration
- setIterators
  
  public InputTableConfig setIterators(List<IteratorSetting> iterators)
  
  Sets the iterators to be used in the query.
  
  Parameters:
  
  iterators - the configurations for the iterators
  
  Since:
  
  1.6.0
- getIterators
  
  public List<IteratorSetting> getIterators()
  
  Returns the iterators to be set on this configuration
- setAutoAdjustRanges
  
  public InputTableConfig setAutoAdjustRanges(boolean autoAdjustRanges)
  
  Controls the automatic adjustment of ranges for this job. This feature merges overlapping ranges, then splits them to align with tablet boundaries. Disabling this feature will cause exactly one Map task to be created for each specified range.
  By default, this feature is enabled.
  
  Parameters:
  
  autoAdjustRanges - the feature is enabled if true, disabled otherwise
  
  Since:
  
  1.6.0
  
  See Also:
  
  setRanges(java.util.List)
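To make the "merges overlapping ranges" step of setAutoAdjustRanges concrete, here is a minimal JDK-only sketch. It assumes ranges can be modelled as closed integer intervals `[start, end]`, which is a simplification: real Accumulo ranges are defined over keys. The class `RangeMergeSketch` and its method are hypothetical names for this illustration, and the subsequent split along tablet boundaries is not modelled.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

/** Hypothetical model of merging overlapping ranges; not Accumulo's Range class. */
final class RangeMergeSketch {

    /**
     * Merges overlapping closed intervals. Each element is a two-element
     * array {start, end}. The input list and its arrays are left unmodified.
     */
    static List<int[]> mergeOverlapping(List<int[]> ranges) {
        List<int[]> sorted = new ArrayList<>(ranges);
        sorted.sort(Comparator.comparingInt((int[] r) -> r[0]));

        List<int[]> merged = new ArrayList<>();
        for (int[] range : sorted) {
            if (!merged.isEmpty() && range[0] <= merged.get(merged.size() - 1)[1]) {
                // Overlaps the previous merged interval: extend its end if needed.
                int[] last = merged.get(merged.size() - 1);
                last[1] = Math.max(last[1], range[1]);
            } else {
                // Disjoint from everything so far: start a new merged interval.
                merged.add(new int[] {range[0], range[1]});
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        List<int[]> input = Arrays.asList(
            new int[] {5, 9}, new int[] {1, 4}, new int[] {3, 6}, new int[] {20, 25});
        for (int[] range : mergeOverlapping(input)) {
            System.out.println(range[0] + ".." + range[1]);
        }
    }
}
```

In this model, four input intervals collapse to two, which suggests why leaving the feature enabled can change the number of Map tasks relative to the number of ranges supplied.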
- shouldAutoAdjustRanges
  
  public boolean shouldAutoAdjustRanges()
  
  Determines whether a configuration has auto-adjust ranges enabled.
  Returns:
  
  false if the feature is disabled, true otherwise
  
  Since:
  
  1.6.0
  
  See Also:
  
  setAutoAdjustRanges(boolean)
- setUseLocalIterators
  
  public InputTableConfig setUseLocalIterators(boolean useLocalIterators)
  
  Controls the use of the ClientSideIteratorScanner in this job. Enabling this feature will cause the iterator stack to be constructed within the Map task, rather than within the Accumulo TServer. To use this feature, all classes needed for those iterators must be available on the classpath for the task.
  By default, this feature is disabled.
  
  Parameters:
  
  useLocalIterators - the feature is enabled if true, disabled otherwise
  
  Since:
  
  1.6.0
- shouldUseLocalIterators
  
  public boolean shouldUseLocalIterators()
  
  Determines whether a configuration uses local iterators.
  Returns:
  
  true if the feature is enabled, false otherwise
  
  Since:
  
  1.6.0
  
  See Also:
  
  setUseLocalIterators(boolean)
- setOfflineScan
  
  public InputTableConfig setOfflineScan(boolean offlineScan)
  
  Enables reading offline tables. By default, this feature is disabled and only online tables are scanned. When enabled, the MapReduce job reads the table's files directly. If the table is not offline, the job will fail. If the table comes online during the MapReduce job, the job will likely fail.
  To use this option, the MapReduce user needs read access to the Accumulo directory in HDFS.
  Reading the offline table creates the scan-time iterator stack in the map process, so any iterators that are configured for the table must be on the mapper's classpath. The accumulo-site.xml may also need to be on the mapper's classpath if HDFS or the Accumulo directory in HDFS is non-standard.
  One way to use this feature is to clone a table, take the clone offline, and use the clone as the input table for a MapReduce job. If you plan to run MapReduce over the data many times, it may be better to compact the table, clone it, take the clone offline, and use the clone for all MapReduce jobs. Compaction reduces each tablet in the table to one file, and it is faster to read from one file.
  There are two possible advantages to reading a table's files directly out of HDFS. First, you may see better read performance. Second, it supports speculative execution better: when reading an online table, speculative execution can put more load on an already slow tablet server.
  
  Parameters:
  
  offlineScan - the feature is enabled if true, disabled otherwise
  
  Since:
  
  1.6.0
- isOfflineScan
  
  public boolean isOfflineScan()
  
  Determines whether a configuration has the offline table scan feature enabled.
  Returns:
  
  true if the feature is enabled, false otherwise
  
  Since:
  
  1.6.0
  
  See Also:
  
  setOfflineScan(boolean)
- setUseIsolatedScanners
  
  public InputTableConfig setUseIsolatedScanners(boolean useIsolatedScanners)
  
  Controls the use of the IsolatedScanner in this job.
  By default, this feature is disabled.
  
  Parameters:
  
  useIsolatedScanners - the feature is enabled if true, disabled otherwise
  
  Since:
  
  1.6.0
- shouldUseIsolatedScanners
  
  public boolean shouldUseIsolatedScanners()
  
  Determines whether a configuration has isolation enabled.
  Returns:
  
  true if the feature is enabled, false otherwise
  
  Since:
  
  1.6.0
  
  See Also:
  
  setUseIsolatedScanners(boolean)
- setSamplerConfiguration
  
  public void setSamplerConfiguration(SamplerConfiguration samplerConfiguration)
  
  Set the sampler configuration to use when reading from the data.
  Since:
  
  1.8.0
  
  See Also:
  
  ScannerBase.setSamplerConfiguration(SamplerConfiguration)
  
  InputFormatBase.setSamplerConfiguration(org.apache.hadoop.mapreduce.Job, SamplerConfiguration)
- getSamplerConfiguration
  
  public SamplerConfiguration getSamplerConfiguration()
  
  Since:
  
  1.8.0
- write
  
  public void write(DataOutput dataOutput) throws IOException
  
  Specified by:
  
  write in interface org.apache.hadoop.io.Writable
  
  Throws:
  
  IOException
- readFields
  
  public void readFields(DataInput dataInput) throws IOException
  
  Specified by:
  
  readFields in interface org.apache.hadoop.io.Writable
  
  Throws:
  
  IOException
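The write and readFields pair, together with the InputTableConfig(DataInput) constructor, follows the usual Writable round-trip pattern: serialize to a DataOutput, then rebuild from a DataInput. The sketch below shows that pattern using only java.io. The class `ScanFlagsSketch`, its four boolean fields, and the field order on the wire are assumptions made for illustration; they do not describe InputTableConfig's actual serialized format.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

/** Hypothetical Writable-style holder for four scan flags; not Accumulo code. */
final class ScanFlagsSketch {
    boolean autoAdjustRanges = true;     // documented default: enabled
    boolean useLocalIterators = false;   // documented default: disabled
    boolean useIsolatedScanners = false; // documented default: disabled
    boolean offlineScan = false;         // documented default: disabled

    ScanFlagsSketch() {}

    /** Rebuilds an instance from previously serialized bytes. */
    ScanFlagsSketch(DataInput in) throws IOException {
        readFields(in);
    }

    /** Writes the flags in a fixed order (an assumed layout for this sketch). */
    void write(DataOutput out) throws IOException {
        out.writeBoolean(autoAdjustRanges);
        out.writeBoolean(useLocalIterators);
        out.writeBoolean(useIsolatedScanners);
        out.writeBoolean(offlineScan);
    }

    /** Reads the flags back in the same order that write() emitted them. */
    void readFields(DataInput in) throws IOException {
        autoAdjustRanges = in.readBoolean();
        useLocalIterators = in.readBoolean();
        useIsolatedScanners = in.readBoolean();
        offlineScan = in.readBoolean();
    }

    /** Serializes to a byte array and deserializes into a fresh instance. */
    static ScanFlagsSketch roundTrip(ScanFlagsSketch original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (DataOutputStream out = new DataOutputStream(bytes)) {
                original.write(out);
            }
            try (DataInputStream in =
                     new DataInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return new ScanFlagsSketch(in);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        ScanFlagsSketch original = new ScanFlagsSketch();
        original.offlineScan = true;
        ScanFlagsSketch copy = roundTrip(original);
        System.out.println("offlineScan survived round trip: " + (copy.offlineScan == original.offlineScan));
    }
}
```

The key property is that readFields consumes fields in exactly the order write produced them; any mismatch between the two would corrupt the deserialized configuration.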
- equals
  
  public boolean equals(Object o)
  
  Overrides:
  
  equals in class Object
- hashCode
  
  public int hashCode()
  
  Overrides:
  
  hashCode in class Object
