Class AccumuloFileOutputFormat


  • @Deprecated
    public class AccumuloFileOutputFormat
    extends org.apache.hadoop.mapreduce.lib.output.FileOutputFormat<Key,​Value>
    Deprecated.
    since 2.0.0; use the org.apache.accumulo.hadoop.mapreduce package from accumulo-hadoop-mapreduce.jar instead
    This class allows MapReduce jobs to write output in the Accumulo data file format.
    Write only sorted data (sorted by Key); this is a requirement of Accumulo data files.

    The output path to be created must be specified via setOutputPath(Job, Path), which is inherited from FileOutputFormat. Other methods inherited from FileOutputFormat are not supported and may be ignored or cause failures. Setting other Hadoop configuration options that affect the behavior of the underlying files directly in the Job's configuration may work, but is not directly supported at this time.
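As a hedged illustration of the sorted-data requirement, the sketch below checks that a sequence of records is in non-decreasing order before it would be handed to a writer. It uses only the JDK: `SortedOrderCheck` and `isSorted` are hypothetical names, plain strings stand in for Accumulo Key objects, and treating equal neighbours as acceptable is an assumption rather than documented behaviour.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;

// Hypothetical helper, not part of Accumulo or Hadoop.
final class SortedOrderCheck {

    /** Returns true if no element compares lower than its predecessor. */
    static <T> boolean isSorted(Iterable<T> items, Comparator<? super T> cmp) {
        Iterator<T> it = items.iterator();
        if (!it.hasNext()) {
            return true; // an empty sequence is trivially sorted
        }
        T prev = it.next();
        while (it.hasNext()) {
            T next = it.next();
            if (cmp.compare(prev, next) > 0) {
                return false; // found an element that sorts before its predecessor
            }
            prev = next;
        }
        return true;
    }

    public static void main(String[] args) {
        // Strings stand in for Key objects in this sketch.
        List<String> rows = Arrays.asList("row1", "row2", "row3");
        System.out.println(isSorted(rows, Comparator.<String>naturalOrder()));
    }
}
```

In a real job the same check could be applied to Key objects using their natural ordering, since Key is comparable.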

    • Nested Class Summary

      • Nested classes/interfaces inherited from class org.apache.hadoop.mapreduce.lib.output.FileOutputFormat

        org.apache.hadoop.mapreduce.lib.output.FileOutputFormat.Counter
    • Field Summary

      Fields 
      Modifier and Type Field Description
      protected static org.apache.log4j.Logger log
      Deprecated.
       
      • Fields inherited from class org.apache.hadoop.mapreduce.lib.output.FileOutputFormat

        BASE_OUTPUT_NAME, COMPRESS, COMPRESS_CODEC, COMPRESS_TYPE, OUTDIR, PART
    • Method Summary

      Modifier and Type Method Description
      org.apache.hadoop.mapreduce.RecordWriter<Key,​Value> getRecordWriter​(org.apache.hadoop.mapreduce.TaskAttemptContext context)
      Deprecated.
       
      static void setCompressionType​(org.apache.hadoop.mapreduce.Job job, String compressionType)
      Deprecated.
      Sets the compression type to use for data blocks.
      static void setDataBlockSize​(org.apache.hadoop.mapreduce.Job job, long dataBlockSize)
      Deprecated.
      Sets the size for data blocks within each file.
      A data block is a span of key/value pairs stored in the file that is compressed and indexed as a group.
      static void setFileBlockSize​(org.apache.hadoop.mapreduce.Job job, long fileBlockSize)
      Deprecated.
      Sets the size for file blocks in the file system; file blocks are managed, and replicated, by the underlying file system.
      static void setIndexBlockSize​(org.apache.hadoop.mapreduce.Job job, long indexBlockSize)
      Deprecated.
      Sets the size for index blocks within each file; smaller blocks mean a deeper index hierarchy within the file, while larger blocks mean a shallower one.
      static void setReplication​(org.apache.hadoop.mapreduce.Job job, int replication)
      Deprecated.
      Sets the file system replication factor for the resulting file, overriding the file system default.
      static void setSampler​(org.apache.hadoop.mapreduce.Job job, SamplerConfiguration samplerConfig)
      Deprecated.
      Specify a sampler to be used when writing out data.
      • Methods inherited from class org.apache.hadoop.mapreduce.lib.output.FileOutputFormat

        checkOutputSpecs, getCompressOutput, getDefaultWorkFile, getOutputCommitter, getOutputCompressorClass, getOutputName, getOutputPath, getPathForWorkFile, getUniqueFile, getWorkOutputPath, setCompressOutput, setOutputCompressorClass, setOutputName, setOutputPath
    • Field Detail

      • log

        protected static final org.apache.log4j.Logger log
        Deprecated.
    • Constructor Detail

      • AccumuloFileOutputFormat

        public AccumuloFileOutputFormat()
        Deprecated.
    • Method Detail

      • setCompressionType

        public static void setCompressionType​(org.apache.hadoop.mapreduce.Job job,
                                              String compressionType)
        Deprecated.
        Sets the compression type to use for data blocks. Specifying a compression type may require additional libraries to be available to your Job.
        Parameters:
        job - the Hadoop job instance to be configured
        compressionType - one of "none", "gz", "lzo", or "snappy"
        Since:
        1.5.0
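Because the parameter is a plain string, a typo would only surface later. The sketch below shows one possible guard that checks a value against the four names listed in the parameter description before it is passed on. `CompressionTypeCheck` and `requireDocumented` are hypothetical names, and the set is copied from this page only; a given installation may accept other values.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper, not part of Accumulo.
final class CompressionTypeCheck {

    // Values taken from the compressionType parameter description on this page.
    private static final Set<String> DOCUMENTED = Collections.unmodifiableSet(
            new HashSet<>(Arrays.asList("none", "gz", "lzo", "snappy")));

    /** Returns the value unchanged if it is one of the documented names. */
    static String requireDocumented(String compressionType) {
        if (compressionType == null || !DOCUMENTED.contains(compressionType)) {
            throw new IllegalArgumentException(
                    "Unexpected compression type: " + compressionType);
        }
        return compressionType;
    }

    public static void main(String[] args) {
        String type = requireDocumented("gz");
        // With a Job in hand, the checked value would then be passed along:
        // AccumuloFileOutputFormat.setCompressionType(job, type);
        System.out.println(type);
    }
}
```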
      • setDataBlockSize

        public static void setDataBlockSize​(org.apache.hadoop.mapreduce.Job job,
                                            long dataBlockSize)
        Deprecated.
        Sets the size for data blocks within each file.
        A data block is a span of key/value pairs stored in the file that is compressed and indexed as a group.

        Making this value smaller may increase seek performance, but at the cost of increasing the size of the indexes (which can also affect seek performance).

        Parameters:
        job - the Hadoop job instance to be configured
        dataBlockSize - the block size, in bytes
        Since:
        1.5.0
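The trade-off described above can be made concrete with rough arithmetic: if each data block gets one index entry, halving the block size roughly doubles the number of entries the index must hold. The sketch below is a simplified model under that assumption, not a description of the actual file layout; it ignores compression and the fact that blocks end on key/value boundaries. `DataBlockEstimate` and `estimatedBlocks` are hypothetical names.

```java
// Hypothetical back-of-the-envelope model, not part of Accumulo.
final class DataBlockEstimate {

    /** Approximate number of data blocks needed for the given amount of data. */
    static long estimatedBlocks(long dataBytes, long dataBlockSize) {
        if (dataBytes < 0 || dataBlockSize <= 0) {
            throw new IllegalArgumentException("sizes must be non-negative / positive");
        }
        // Ceiling division written to avoid overflow on large inputs.
        return dataBytes / dataBlockSize + (dataBytes % dataBlockSize == 0 ? 0 : 1);
    }

    public static void main(String[] args) {
        long oneGiB = 1L << 30;
        // Compare two candidate block sizes for the same amount of data.
        System.out.println("100 KiB blocks: " + estimatedBlocks(oneGiB, 100L * 1024));
        System.out.println(" 10 KiB blocks: " + estimatedBlocks(oneGiB, 10L * 1024));
        // A chosen size would then be applied with:
        // AccumuloFileOutputFormat.setDataBlockSize(job, 100L * 1024);
    }
}
```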
      • setFileBlockSize

        public static void setFileBlockSize​(org.apache.hadoop.mapreduce.Job job,
                                            long fileBlockSize)
        Deprecated.
        Sets the size for file blocks in the file system; file blocks are managed, and replicated, by the underlying file system.
        Parameters:
        job - the Hadoop job instance to be configured
        fileBlockSize - the block size, in bytes
        Since:
        1.5.0
      • setIndexBlockSize

        public static void setIndexBlockSize​(org.apache.hadoop.mapreduce.Job job,
                                             long indexBlockSize)
        Deprecated.
        Sets the size for index blocks within each file; smaller blocks mean a deeper index hierarchy within the file, while larger blocks mean a shallower one. This can affect the performance of queries.
        Parameters:
        job - the Hadoop job instance to be configured
        indexBlockSize - the block size, in bytes
        Since:
        1.5.0
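To see why smaller index blocks give a deeper hierarchy, consider a generic multi-level index in which each index block holds as many entries as fit in it, and each level points at the blocks of the level below. The sketch below estimates the number of levels under that model. It is an assumption-laden illustration, not the actual on-disk algorithm: `IndexDepthEstimate`, `estimatedLevels`, and the `avgEntryBytes` parameter are all hypothetical.

```java
// Hypothetical model of a generic multi-level index, not part of Accumulo.
final class IndexDepthEstimate {

    private static long ceilDiv(long a, long b) {
        return a / b + (a % b == 0 ? 0 : 1);
    }

    /** Estimated number of index levels for the given number of index entries. */
    static int estimatedLevels(long indexEntries, long indexBlockSize, long avgEntryBytes) {
        if (indexEntries < 0 || indexBlockSize <= 0 || avgEntryBytes <= 0) {
            throw new IllegalArgumentException("arguments out of range");
        }
        // Entries per index block; at least 2 so that each level shrinks.
        long fanout = Math.max(2, indexBlockSize / avgEntryBytes);
        int levels = 1;
        long blocks = ceilDiv(indexEntries, fanout);
        // Add a level until everything fits under a single root block.
        while (blocks > 1) {
            blocks = ceilDiv(blocks, fanout);
            levels++;
        }
        return levels;
    }

    public static void main(String[] args) {
        long entries = 1_000_000L;
        System.out.println("10 KB index blocks:  " + estimatedLevels(entries, 10_000, 100));
        System.out.println("100 KB index blocks: " + estimatedLevels(entries, 100_000, 100));
        // A chosen size would then be applied with:
        // AccumuloFileOutputFormat.setIndexBlockSize(job, 100_000L);
    }
}
```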
      • setReplication

        public static void setReplication​(org.apache.hadoop.mapreduce.Job job,
                                          int replication)
        Deprecated.
        Sets the file system replication factor for the resulting file, overriding the file system default.
        Parameters:
        job - the Hadoop job instance to be configured
        replication - the number of replicas for produced files
        Since:
        1.5.0
      • setSampler

        public static void setSampler​(org.apache.hadoop.mapreduce.Job job,
                                      SamplerConfiguration samplerConfig)
        Deprecated.
        Specifies a sampler to be used when writing out data. The resulting output file will contain sample data.
        Parameters:
        job - the Hadoop job instance to be configured
        samplerConfig - the configuration for creating sample data in the output file
        Since:
        1.8.0
      • getRecordWriter

        public org.apache.hadoop.mapreduce.RecordWriter<Key,​Value> getRecordWriter​(org.apache.hadoop.mapreduce.TaskAttemptContext context)
                                                                                  throws IOException
        Deprecated.
        Specified by:
        getRecordWriter in class org.apache.hadoop.mapreduce.lib.output.FileOutputFormat<Key,​Value>
        Throws:
        IOException