Class AccumuloOutputFormat

java.lang.Object
org.apache.hadoop.mapreduce.OutputFormat<org.apache.hadoop.io.Text,Mutation>
org.apache.accumulo.core.client.mapreduce.AccumuloOutputFormat

@Deprecated(since="2.0.0") public class AccumuloOutputFormat extends org.apache.hadoop.mapreduce.OutputFormat<org.apache.hadoop.io.Text,Mutation>
Deprecated.
since 2.0.0; use the org.apache.accumulo.hadoop.mapreduce package from accumulo-hadoop-mapreduce.jar instead.
This class allows MapReduce jobs to use Accumulo as the sink for data. This OutputFormat accepts keys and values of type Text (for a table name) and Mutation from the Map and Reduce functions. The user must specify the connector information (setConnectorInfo) and the ZooKeeper instance (setZooKeeperInstance) via static configurator methods. Other static methods are optional.
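The description above can be illustrated with a minimal sketch of the output side of a job. It uses only the static configurator methods listed on this page; the user name, password, instance name, ZooKeeper hosts, table name, column names, and batch writer values are placeholders chosen for illustration, and the input format and mapper are deliberately left out. It assumes Accumulo 1.6 or later and Hadoop on the classpath, and it will produce deprecation warnings because the class itself is deprecated.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.TimeUnit;

import org.apache.accumulo.core.client.AccumuloSecurityException;
import org.apache.accumulo.core.client.BatchWriterConfig;
import org.apache.accumulo.core.client.ClientConfiguration;
import org.apache.accumulo.core.client.mapreduce.AccumuloOutputFormat;
import org.apache.accumulo.core.client.security.tokens.PasswordToken;
import org.apache.accumulo.core.data.Mutation;
import org.apache.accumulo.core.data.Value;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Reducer;

@SuppressWarnings("deprecation")
public class AccumuloSinkJobSketch {

  /** Emits one Mutation per key; the Text output key names the destination table. */
  public static class CountReducer extends Reducer<Text, Text, Text, Mutation> {
    @Override
    protected void reduce(Text key, Iterable<Text> values, Context context)
        throws IOException, InterruptedException {
      long count = 0;
      for (Text ignored : values) {
        count++;
      }
      // Row = input key; column family and qualifier are placeholder names.
      Mutation m = new Mutation(key);
      m.put(new Text("stats"), new Text("count"),
          new Value(Long.toString(count).getBytes(StandardCharsets.UTF_8)));
      context.write(new Text("word_counts"), m);
    }
  }

  /** Configures only the Accumulo output side of an existing job. */
  public static void configureOutput(Job job) throws AccumuloSecurityException {
    job.setReducerClass(CountReducer.class);
    job.setOutputFormatClass(AccumuloOutputFormat.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(Mutation.class);

    // Required: connector information and ZooKeeper instance (placeholder values).
    AccumuloOutputFormat.setConnectorInfo(job, "mapreduce_user",
        new PasswordToken("placeholder-password"));
    AccumuloOutputFormat.setZooKeeperInstance(job,
        ClientConfiguration.loadDefault()
            .withInstance("myInstance")
            .withZkHosts("zk1:2181,zk2:2181"));

    // Optional settings (placeholder values).
    AccumuloOutputFormat.setDefaultTableName(job, "word_counts");
    AccumuloOutputFormat.setCreateTables(job, true);
    AccumuloOutputFormat.setBatchWriterOptions(job,
        new BatchWriterConfig()
            .setMaxMemory(50L * 1024 * 1024)
            .setMaxLatency(30, TimeUnit.SECONDS)
            .setMaxWriteThreads(4));
  }
}
```

A PasswordToken is used here only for brevity; see the warning on setConnectorInfo(Job, String, AuthenticationToken) about how such tokens are serialized into the configuration.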
  • Nested Class Summary

    Nested Classes
    Modifier and Type
    Class
    Description
    protected static class 
    Deprecated.
    A base class to be used to create RecordWriter instances that write to Accumulo.
  • Field Summary

    Fields
    Modifier and Type
    Field
    Description
    protected static final org.apache.log4j.Logger
    log
    Deprecated.
     
  • Constructor Summary

    Constructors
    Constructor
    Description
    AccumuloOutputFormat()
    Deprecated.
     
  • Method Summary

    Modifier and Type
    Method
    Description
    protected static Boolean
    canCreateTables(org.apache.hadoop.mapreduce.JobContext context)
    Deprecated.
    Determines whether tables are permitted to be created as needed.
    void
    checkOutputSpecs(org.apache.hadoop.mapreduce.JobContext job)
    Deprecated.
     
    protected static AuthenticationToken
    getAuthenticationToken(org.apache.hadoop.mapreduce.JobContext context)
    Deprecated.
    Gets the authentication token from either the specified token file or directly from the configuration, whichever was used when the job was configured.
    protected static BatchWriterConfig
    getBatchWriterOptions(org.apache.hadoop.mapreduce.JobContext context)
    Deprecated.
    Gets the BatchWriterConfig settings.
    protected static String
    getDefaultTableName(org.apache.hadoop.mapreduce.JobContext context)
    Deprecated.
    Gets the default table name from the configuration.
    protected static Instance
    getInstance(org.apache.hadoop.mapreduce.JobContext context)
    Deprecated.
    Initializes an Accumulo Instance based on the configuration.
    protected static org.apache.log4j.Level
    getLogLevel(org.apache.hadoop.mapreduce.JobContext context)
    Deprecated.
    Gets the log level from this configuration.
    org.apache.hadoop.mapreduce.OutputCommitter
    getOutputCommitter(org.apache.hadoop.mapreduce.TaskAttemptContext context)
    Deprecated.
     
    protected static String
    getPrincipal(org.apache.hadoop.mapreduce.JobContext context)
    Deprecated.
    Gets the user name from the configuration.
    org.apache.hadoop.mapreduce.RecordWriter<org.apache.hadoop.io.Text,Mutation>
    getRecordWriter(org.apache.hadoop.mapreduce.TaskAttemptContext attempt)
    Deprecated.
     
    protected static Boolean
    getSimulationMode(org.apache.hadoop.mapreduce.JobContext context)
    Deprecated.
    Determines whether simulation mode is enabled.
    protected static byte[]
    getToken(org.apache.hadoop.mapreduce.JobContext context)
    Deprecated.
    since 1.6.0; Use getAuthenticationToken(JobContext) instead.
    protected static String
    getTokenClass(org.apache.hadoop.mapreduce.JobContext context)
    Deprecated.
    since 1.6.0; Use getAuthenticationToken(JobContext) instead.
    protected static Boolean
    isConnectorInfoSet(org.apache.hadoop.mapreduce.JobContext context)
    Deprecated.
    Determines if the connector has been configured.
    static void
    setBatchWriterOptions(org.apache.hadoop.mapreduce.Job job, BatchWriterConfig bwConfig)
    Deprecated.
    Sets the configuration for the job's BatchWriter instances.
    static void
    setConnectorInfo(org.apache.hadoop.mapreduce.Job job, String principal, String tokenFile)
    Deprecated.
    Sets the connector information needed to communicate with Accumulo in this job.
    static void
    setConnectorInfo(org.apache.hadoop.mapreduce.Job job, String principal, AuthenticationToken token)
    Deprecated.
    Sets the connector information needed to communicate with Accumulo in this job.
    static void
    setCreateTables(org.apache.hadoop.mapreduce.Job job, boolean enableFeature)
    Deprecated.
    Sets the directive to create new tables, as necessary.
    static void
    setDefaultTableName(org.apache.hadoop.mapreduce.Job job, String tableName)
    Deprecated.
    Sets the default table name to use when a null is emitted in place of a table name for a given mutation.
    static void
    setLogLevel(org.apache.hadoop.mapreduce.Job job, org.apache.log4j.Level level)
    Deprecated.
    Sets the log level for this job.
    static void
    setSimulationMode(org.apache.hadoop.mapreduce.Job job, boolean enableFeature)
    Deprecated.
    Sets the directive to use simulation mode for this job.
    static void
    setZooKeeperInstance(org.apache.hadoop.mapreduce.Job job, String instanceName, String zooKeepers)
    Deprecated.
    since 1.6.0; Use setZooKeeperInstance(Job, ClientConfiguration) instead.
    static void
    setZooKeeperInstance(org.apache.hadoop.mapreduce.Job job, ClientConfiguration clientConfig)
    Deprecated.
    Configures a ZooKeeperInstance for this job.

    Methods inherited from class java.lang.Object

    clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
  • Field Details

    • log

      protected static final org.apache.log4j.Logger log
      Deprecated.
  • Constructor Details

    • AccumuloOutputFormat

      public AccumuloOutputFormat()
      Deprecated.
  • Method Details

    • setConnectorInfo

      public static void setConnectorInfo(org.apache.hadoop.mapreduce.Job job, String principal, AuthenticationToken token) throws AccumuloSecurityException
      Deprecated.
      Sets the connector information needed to communicate with Accumulo in this job.

      WARNING: Some tokens, when serialized, divulge sensitive information in the configuration as a means to pass the token to MapReduce tasks. This information is BASE64 encoded to provide a charset-safe conversion to a string, but this conversion is not intended to be secure. PasswordToken is one example that is insecure in this way; however, DelegationTokens, acquired using SecurityOperations.getDelegationToken(DelegationTokenConfig), are not subject to this concern.

      Parameters:
      job - the Hadoop job instance to be configured
      principal - a valid Accumulo user name (user must have Table.CREATE permission if setCreateTables(Job, boolean) is set to true)
      token - the user's authentication token (for example, a PasswordToken wrapping the user's password)
      Throws:
      AccumuloSecurityException
      Since:
      1.5.0
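The warning above rests on the fact that Base64 is an encoding, not encryption. The following minimal sketch uses only java.util.Base64 from the JDK; the class name and the sample secret are made up for illustration, and it does not reproduce how Accumulo itself serializes tokens.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

final class Base64IsNotEncryption {

  /** Encodes raw bytes into a charset-safe string, as a configuration value might carry them. */
  static String encode(byte[] raw) {
    return Base64.getEncoder().encodeToString(raw);
  }

  /** Recovers the original bytes; no key or secret is needed to do so. */
  static byte[] decode(String encoded) {
    return Base64.getDecoder().decode(encoded);
  }

  public static void main(String[] args) {
    // Hypothetical secret, used only to show the round trip.
    byte[] secret = "hypothetical-password".getBytes(StandardCharsets.UTF_8);
    String stored = encode(secret);
    String recovered = new String(decode(stored), StandardCharsets.UTF_8);
    System.out.println("stored in configuration: " + stored);
    System.out.println("recovered by any reader: " + recovered);
  }
}
```

Anyone who can read the job configuration can perform the same decoding, which is why a token file or a delegation token is preferable when that is a concern.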
    • setConnectorInfo

      public static void setConnectorInfo(org.apache.hadoop.mapreduce.Job job, String principal, String tokenFile) throws AccumuloSecurityException
      Deprecated.
      Sets the connector information needed to communicate with Accumulo in this job.

      Stores the password in a file in HDFS and pulls that into the Distributed Cache in an attempt to be more secure than storing it in the Configuration.

      Parameters:
      job - the Hadoop job instance to be configured
      principal - a valid Accumulo user name (user must have Table.CREATE permission if setCreateTables(Job, boolean) is set to true)
      tokenFile - the path to the token file
      Throws:
      AccumuloSecurityException
      Since:
      1.6.0
    • isConnectorInfoSet

      protected static Boolean isConnectorInfoSet(org.apache.hadoop.mapreduce.JobContext context)
      Deprecated.
      Determines if the connector has been configured.
      Parameters:
      context - the Hadoop context for the configured job
      Returns:
      true if the connector has been configured, false otherwise
      Since:
      1.5.0
      See Also:
      setConnectorInfo(Job, String, AuthenticationToken)
    • getPrincipal

      protected static String getPrincipal(org.apache.hadoop.mapreduce.JobContext context)
      Deprecated.
      Gets the user name from the configuration.
      Parameters:
      context - the Hadoop context for the configured job
      Returns:
      the user name
      Since:
      1.5.0
      See Also:
      setConnectorInfo(Job, String, AuthenticationToken)
    • getTokenClass

      @Deprecated(since="1.6.0") protected static String getTokenClass(org.apache.hadoop.mapreduce.JobContext context)
      Deprecated.
      since 1.6.0; Use getAuthenticationToken(JobContext) instead.
      Gets the serialized token class from either the configuration or the token file.
      Since:
      1.5.0
    • getToken

      @Deprecated(since="1.6.0") protected static byte[] getToken(org.apache.hadoop.mapreduce.JobContext context)
      Deprecated.
      since 1.6.0; Use getAuthenticationToken(JobContext) instead.
      Gets the serialized token from either the configuration or the token file.
      Since:
      1.5.0
    • getAuthenticationToken

      protected static AuthenticationToken getAuthenticationToken(org.apache.hadoop.mapreduce.JobContext context)
      Deprecated.
      Gets the authentication token from either the specified token file or directly from the configuration, whichever was used when the job was configured.
      Parameters:
      context - the Hadoop context for the configured job
      Returns:
      the principal's authentication token
      Since:
      1.6.0
      See Also:
      setConnectorInfo(Job, String, AuthenticationToken)
      setConnectorInfo(Job, String, String)
    • setZooKeeperInstance

      @Deprecated(since="1.6.0") public static void setZooKeeperInstance(org.apache.hadoop.mapreduce.Job job, String instanceName, String zooKeepers)
      Deprecated.
      since 1.6.0; Use setZooKeeperInstance(Job, ClientConfiguration) instead.
      Configures a ZooKeeperInstance for this job.
      Parameters:
      job - the Hadoop job instance to be configured
      instanceName - the Accumulo instance name
      zooKeepers - a comma-separated list of ZooKeeper servers
      Since:
      1.5.0
    • setZooKeeperInstance

      public static void setZooKeeperInstance(org.apache.hadoop.mapreduce.Job job, ClientConfiguration clientConfig)
      Deprecated.
      Configures a ZooKeeperInstance for this job.
      Parameters:
      job - the Hadoop job instance to be configured
      clientConfig - client configuration for specifying connection timeouts, SSL connection options, etc.
      Since:
      1.6.0
    • getInstance

      protected static Instance getInstance(org.apache.hadoop.mapreduce.JobContext context)
      Deprecated.
      Initializes an Accumulo Instance based on the configuration.
      Parameters:
      context - the Hadoop context for the configured job
      Returns:
      an Accumulo instance
      Since:
      1.5.0
    • setLogLevel

      public static void setLogLevel(org.apache.hadoop.mapreduce.Job job, org.apache.log4j.Level level)
      Deprecated.
      Sets the log level for this job.
      Parameters:
      job - the Hadoop job instance to be configured
      level - the logging level
      Since:
      1.5.0
    • getLogLevel

      protected static org.apache.log4j.Level getLogLevel(org.apache.hadoop.mapreduce.JobContext context)
      Deprecated.
      Gets the log level from this configuration.
      Parameters:
      context - the Hadoop context for the configured job
      Returns:
      the log level
      Since:
      1.5.0
      See Also:
      setLogLevel(Job, Level)
    • setDefaultTableName

      public static void setDefaultTableName(org.apache.hadoop.mapreduce.Job job, String tableName)
      Deprecated.
      Sets the default table name to use when a null is emitted in place of a table name for a given mutation. Table names may contain only alphanumeric characters and underscores.
      Parameters:
      job - the Hadoop job instance to be configured
      tableName - the table to use when the table name is null in the write call
      Since:
      1.5.0
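The fallback and naming rule described for setDefaultTableName can be sketched in plain Java. This is an illustration of the documented behavior only: the class, method names, and regular expression are hypothetical, and Accumulo's own validation and record writer may differ in detail.

```java
import java.util.regex.Pattern;

final class TableNameSketch {

  // Illustrative pattern for "alphanumeric characters and underscores";
  // this is not Accumulo's own validator.
  private static final Pattern VALID = Pattern.compile("[A-Za-z0-9_]+");

  /** Returns true if the name is non-null and matches the illustrative pattern. */
  static boolean isValid(String name) {
    return name != null && VALID.matcher(name).matches();
  }

  /**
   * Picks the emitted table name when one was given, otherwise the default.
   * Throws if the chosen name is missing or does not match the pattern.
   */
  static String resolve(String emitted, String defaultTable) {
    String chosen = (emitted != null) ? emitted : defaultTable;
    if (!isValid(chosen)) {
      throw new IllegalArgumentException("No usable table name: " + chosen);
    }
    return chosen;
  }

  public static void main(String[] args) {
    System.out.println(resolve(null, "word_counts"));
    System.out.println(resolve("events_2024", "word_counts"));
  }
}
```

In a real job the key would be an org.apache.hadoop.io.Text rather than a String; String is used here so the sketch runs without Hadoop on the classpath.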
    • getDefaultTableName

      protected static String getDefaultTableName(org.apache.hadoop.mapreduce.JobContext context)
      Deprecated.
      Gets the default table name from the configuration.
      Parameters:
      context - the Hadoop context for the configured job
      Returns:
      the default table name
      Since:
      1.5.0
      See Also:
      setDefaultTableName(Job, String)
    • setBatchWriterOptions

      public static void setBatchWriterOptions(org.apache.hadoop.mapreduce.Job job, BatchWriterConfig bwConfig)
      Deprecated.
      Sets the configuration for the job's BatchWriter instances. If not set, a new BatchWriterConfig with sensible built-in defaults is used. Setting the configuration multiple times overwrites any previous configuration.
      Parameters:
      job - the Hadoop job instance to be configured
      bwConfig - the configuration for the BatchWriter
      Since:
      1.5.0
    • getBatchWriterOptions

      protected static BatchWriterConfig getBatchWriterOptions(org.apache.hadoop.mapreduce.JobContext context)
      Deprecated.
      Gets the BatchWriterConfig settings.
      Parameters:
      context - the Hadoop context for the configured job
      Returns:
      the configuration object
      Since:
      1.5.0
      See Also:
      setBatchWriterOptions(Job, BatchWriterConfig)
    • setCreateTables

      public static void setCreateTables(org.apache.hadoop.mapreduce.Job job, boolean enableFeature)
      Deprecated.
      Sets the directive to create new tables, as necessary. Table names may contain only alphanumeric characters and underscores.

      By default, this feature is disabled.

      Parameters:
      job - the Hadoop job instance to be configured
      enableFeature - the feature is enabled if true, disabled otherwise
      Since:
      1.5.0
    • canCreateTables

      protected static Boolean canCreateTables(org.apache.hadoop.mapreduce.JobContext context)
      Deprecated.
      Determines whether tables are permitted to be created as needed.
      Parameters:
      context - the Hadoop context for the configured job
      Returns:
      true if the feature is enabled, false otherwise
      Since:
      1.5.0
      See Also:
      setCreateTables(Job, boolean)
    • setSimulationMode

      public static void setSimulationMode(org.apache.hadoop.mapreduce.Job job, boolean enableFeature)
      Deprecated.
      Sets the directive to use simulation mode for this job. In simulation mode, no output is produced. This is useful for testing.

      By default, this feature is disabled.

      Parameters:
      job - the Hadoop job instance to be configured
      enableFeature - the feature is enabled if true, disabled otherwise
      Since:
      1.5.0
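The idea behind simulation mode, a dry run in which records are accepted but nothing is written, can be sketched generically. Everything below is hypothetical: the class, its in-memory list standing in for a table, and the counter are illustration only and say nothing about how the Accumulo record writer is implemented.

```java
import java.util.ArrayList;
import java.util.List;

final class SimulationModeSketch {

  private final boolean simulate;
  // Stands in for the real destination; a list keeps the sketch self-contained.
  private final List<String> sink = new ArrayList<>();
  private long seen = 0;

  SimulationModeSketch(boolean simulate) {
    this.simulate = simulate;
  }

  /** Counts every record, but only stores it when simulation mode is off. */
  void write(String record) {
    seen++;
    if (!simulate) {
      sink.add(record);
    }
  }

  /** Number of records passed to write, regardless of mode. */
  long seen() {
    return seen;
  }

  /** Records that were actually stored. */
  List<String> written() {
    return sink;
  }

  public static void main(String[] args) {
    SimulationModeSketch dryRun = new SimulationModeSketch(true);
    dryRun.write("row1");
    dryRun.write("row2");
    System.out.println("seen=" + dryRun.seen() + " written=" + dryRun.written().size());
  }
}
```

A dry run of this kind lets the rest of the job logic be exercised in tests without touching the destination.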
    • getSimulationMode

      protected static Boolean getSimulationMode(org.apache.hadoop.mapreduce.JobContext context)
      Deprecated.
      Determines whether simulation mode is enabled.
      Parameters:
      context - the Hadoop context for the configured job
      Returns:
      true if the feature is enabled, false otherwise
      Since:
      1.5.0
      See Also:
      setSimulationMode(Job, boolean)
    • checkOutputSpecs

      public void checkOutputSpecs(org.apache.hadoop.mapreduce.JobContext job) throws IOException
      Deprecated.
      Specified by:
      checkOutputSpecs in class org.apache.hadoop.mapreduce.OutputFormat<org.apache.hadoop.io.Text,Mutation>
      Throws:
      IOException
    • getOutputCommitter

      public org.apache.hadoop.mapreduce.OutputCommitter getOutputCommitter(org.apache.hadoop.mapreduce.TaskAttemptContext context)
      Deprecated.
      Specified by:
      getOutputCommitter in class org.apache.hadoop.mapreduce.OutputFormat<org.apache.hadoop.io.Text,Mutation>
    • getRecordWriter

      public org.apache.hadoop.mapreduce.RecordWriter<org.apache.hadoop.io.Text,Mutation> getRecordWriter(org.apache.hadoop.mapreduce.TaskAttemptContext attempt) throws IOException
      Deprecated.
      Specified by:
      getRecordWriter in class org.apache.hadoop.mapreduce.OutputFormat<org.apache.hadoop.io.Text,Mutation>
      Throws:
      IOException