Class AccumuloInputFormat
This class allows MapReduce jobs to use Accumulo as the source of data. This InputFormat provides keys and values of type Key and Value to the Map function. Configure the job using the configure() method, which provides a fluent API. For example:
    AccumuloInputFormat.configure().clientProperties(props).table(name) // required
        .auths(auths).addIterator(iter1).ranges(ranges).fetchColumns(columns).executionHints(hints)
        .samplerConfiguration(sampleConf).autoAdjustRanges(false) // enabled by default
        .scanIsolation(true) // not available with batchScan()
        .offlineScan(true) // not available with batchScan()
        .store(job);

Multiple tables can be set by configuring clientProperties once and then calling table() for each table. The methods following a call to table() apply only to that table. For example:
    AccumuloInputFormat.configure().clientProperties(props) // set client props once
        .table(table1).auths(auths1).fetchColumns(cols1).batchScan(true) // options for table1
        .table(table2).ranges(range2).auths(auths2).addIterator(iter2) // options for table2
        .table(table3).ranges(range3).auths(auths3).addIterator(iter3) // options for table3
        .store(job); // store all tables in the job when finished

For descriptions of all options see InputFormatBuilder.InputFormatOptions.
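As a hedged end-to-end sketch of how this InputFormat plugs into a job driver: the mapper class, table name, output path, and client.properties location below are all illustrative placeholders, not part of this class's API; only the configure() builder calls shown above are taken from the documentation.

```java
import java.util.Properties;

import org.apache.accumulo.core.client.Accumulo;
import org.apache.accumulo.core.data.Key;
import org.apache.accumulo.core.data.Value;
import org.apache.accumulo.hadoop.mapreduce.AccumuloInputFormat;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

public class RowDumpDriver {

  // Illustrative mapper: receives the Key/Value pairs this InputFormat provides
  // and writes one line per entry.
  public static class RowMapper extends Mapper<Key,Value,Text,Text> {
    @Override
    protected void map(Key key, Value value, Context context)
        throws java.io.IOException, InterruptedException {
      context.write(new Text(key.getRow().toString()), new Text(value.toString()));
    }
  }

  public static void main(String[] args) throws Exception {
    // Assumed: a client.properties file describing the Accumulo instance.
    Properties props = Accumulo.newClientProperties()
        .from("/path/to/client.properties").build();

    Job job = Job.getInstance();
    job.setJarByClass(RowDumpDriver.class);
    job.setMapperClass(RowMapper.class);
    job.setInputFormatClass(AccumuloInputFormat.class);
    job.setOutputFormatClass(TextOutputFormat.class);
    FileOutputFormat.setOutputPath(job, new Path(args[0]));

    // Minimal required configuration: client properties plus a table name.
    AccumuloInputFormat.configure()
        .clientProperties(props)
        .table("mytable") // placeholder table name
        .store(job);

    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
```

The store(job) call must come last: it writes the accumulated configuration into the job before submission.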
Since:
    2.0
Constructor Summary
Method Summary
static InputFormatBuilder.ClientParams<org.apache.hadoop.mapreduce.Job> configure()
    Sets all the information required for this map reduce job.

org.apache.hadoop.mapreduce.RecordReader<Key,Value> createRecordReader(org.apache.hadoop.mapreduce.InputSplit split, org.apache.hadoop.mapreduce.TaskAttemptContext context)

List<org.apache.hadoop.mapreduce.InputSplit> getSplits(org.apache.hadoop.mapreduce.JobContext context)
    Gets the splits of the tables that have been set on the job by reading the metadata table for the specified ranges.
Constructor Details

AccumuloInputFormat
public AccumuloInputFormat()

Method Details
getSplits
public List<org.apache.hadoop.mapreduce.InputSplit> getSplits(org.apache.hadoop.mapreduce.JobContext context) throws IOException

Gets the splits of the tables that have been set on the job by reading the metadata table for the specified ranges.

Specified by:
    getSplits in class org.apache.hadoop.mapreduce.InputFormat<Key,Value>
Returns:
    the splits from the tables based on the ranges.
Throws:
    IOException - if a table set on the job doesn't exist or an error occurs initializing the tablet locator
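Although the framework normally calls getSplits itself, it can also be invoked directly to inspect how many mapper tasks a configured job would launch. A sketch under assumed setup (the table name is a placeholder and props is assumed to hold valid client properties):

```java
import java.io.IOException;
import java.util.List;
import java.util.Properties;

import org.apache.accumulo.hadoop.mapreduce.AccumuloInputFormat;
import org.apache.hadoop.mapreduce.InputSplit;
import org.apache.hadoop.mapreduce.Job;

public class SplitInspector {
  public static void main(String[] args) throws Exception {
    Properties props = new Properties(); // assumed: populated Accumulo client properties

    Job job = Job.getInstance();
    AccumuloInputFormat.configure()
        .clientProperties(props)
        .table("mytable") // placeholder table name
        .store(job);

    try {
      // Job implements JobContext; one InputSplit generally becomes one mapper task.
      List<InputSplit> splits = new AccumuloInputFormat().getSplits(job);
      System.out.println("Job would launch " + splits.size() + " mappers");
    } catch (IOException e) {
      // Thrown if a table set on the job doesn't exist or the tablet locator
      // fails to initialize.
      System.err.println("Could not compute splits: " + e.getMessage());
    }
  }
}
```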
createRecordReader
public org.apache.hadoop.mapreduce.RecordReader<Key,Value> createRecordReader(org.apache.hadoop.mapreduce.InputSplit split, org.apache.hadoop.mapreduce.TaskAttemptContext context)

configure
public static InputFormatBuilder.ClientParams<org.apache.hadoop.mapreduce.Job> configure()

Sets all the information required for this map reduce job.
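To illustrate configure() with some of the optional settings beyond the required clientProperties and table: the sketch below adds scan authorizations and a server-side iterator. The table name, authorization label, iterator name, and priority are all illustrative choices, not defaults.

```java
import java.util.Properties;

import org.apache.accumulo.core.client.IteratorSetting;
import org.apache.accumulo.core.iterators.user.RegExFilter;
import org.apache.accumulo.core.security.Authorizations;
import org.apache.accumulo.hadoop.mapreduce.AccumuloInputFormat;
import org.apache.hadoop.mapreduce.Job;

public class ConfigureExample {
  public static void main(String[] args) throws Exception {
    Properties props = new Properties(); // assumed: populated Accumulo client properties
    Job job = Job.getInstance();

    // Server-side filter applied during the scan; name and priority are illustrative.
    IteratorSetting regex = new IteratorSetting(30, "rowFilter", RegExFilter.class);
    RegExFilter.setRegexs(regex, "row.*", null, null, null, false);

    AccumuloInputFormat.configure()
        .clientProperties(props)               // required
        .table("mytable")                      // required; placeholder name
        .auths(new Authorizations("public"))   // scan authorizations
        .addIterator(regex)                    // push filtering to the tablet servers
        .store(job);                           // write the configuration into the job
  }
}
```

Applying iterators via addIterator() filters data on the tablet servers, so only matching entries cross the network to the mappers.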