Interface LoadPlan.SplitResolver

All Superinterfaces:
Function<org.apache.hadoop.io.Text,LoadPlan.TableSplits>
Enclosing class:
LoadPlan

public static interface LoadPlan.SplitResolver extends Function<org.apache.hadoop.io.Text,LoadPlan.TableSplits>
A function that maps a row to two table split points that bracket the row. Both splits must exist in the table being bulk imported to, but there is no requirement that they are contiguous. For example, if a table has splits C,D,E,M and the resolver is asked for the splits containing row H, it is OK to return D,M, although that could map the file to more tablets than needed (E,M is the tightest answer). For a row that falls before the first table split or after the last one, use null to represent -inf and +inf respectively. For example, with table splits C,D,E,M, resolving row B may return null,C and resolving row Q may return M,null. For a table with zero splits, the resolver should return null,null for every row.
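The rules above can be sketched in plain Java. This is only an illustration under stated assumptions, not Accumulo's implementation: String stands in for org.apache.hadoop.io.Text, and the SplitResolverSketch class, its Splits holder (a stand-in for LoadPlan.TableSplits), and the resolve method are hypothetical names introduced here.

```java
import java.util.Arrays;
import java.util.TreeSet;

public class SplitResolverSketch {

  // Hypothetical stand-in for LoadPlan.TableSplits; a null field means -inf or +inf.
  public static final class Splits {
    final String prevRow;
    final String endRow;

    Splits(String prevRow, String endRow) {
      this.prevRow = prevRow;
      this.endRow = endRow;
    }

    @Override
    public String toString() {
      return prevRow + "," + endRow;
    }
  }

  // Finds the tightest pair S1, S2 of existing splits with S1 < row <= S2.
  public static Splits resolve(TreeSet<String> tableSplits, String row) {
    String prevRow = tableSplits.lower(row);  // greatest split strictly below row, or null
    String endRow = tableSplits.ceiling(row); // least split at or above row, or null
    return new Splits(prevRow, endRow);
  }

  public static void main(String[] args) {
    TreeSet<String> splits = new TreeSet<>(Arrays.asList("C", "D", "E", "M"));
    System.out.println(resolve(splits, "H")); // E,M
    System.out.println(resolve(splits, "B")); // null,C
    System.out.println(resolve(splits, "Q")); // M,null
    System.out.println(resolve(splits, "D")); // C,D
    System.out.println(resolve(new TreeSet<String>(), "H")); // null,null
  }
}
```

A resolver backed by a sorted set always returns the tightest pair. A resolver that works from partial or cached split information may return a wider pair such as D,M, which the text above also permits.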
Since:
2.1.4
  • Method Summary

    Modifier and Type
    Method
    Description
    LoadPlan.TableSplits
    apply(org.apache.hadoop.io.Text row)
    For a given row, R, this function should find two split points, S1 and S2, that exist in the table being bulk imported to, such that S1 < R <= S2.
    static LoadPlan.SplitResolver
    from(SortedSet<org.apache.hadoop.io.Text> splits)
     

    Methods inherited from interface java.util.function.Function

    andThen, compose
  • Method Details

    • from

      static LoadPlan.SplitResolver from(SortedSet<org.apache.hadoop.io.Text> splits)
    • apply

      LoadPlan.TableSplits apply(org.apache.hadoop.io.Text row)
      For a given row, R, this function should find two split points, S1 and S2, that exist in the table being bulk imported to, such that S1 < R <= S2. The closer S1 and S2 are to each other, the better, because a wider pair can map the file to more tablets than needed.
      Specified by:
      apply in interface Function<org.apache.hadoop.io.Text,LoadPlan.TableSplits>
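
The contract S1 < R <= S2, and the reason a tighter pair is preferred, can be checked with a small sketch. It rests on the same assumptions as any plain-Java illustration of this interface: String replaces org.apache.hadoop.io.Text, and the SplitContractCheck class with its contains and tabletsSpanned helpers is hypothetical, not part of the Accumulo API.

```java
import java.util.Arrays;
import java.util.NavigableSet;
import java.util.TreeSet;

public class SplitContractCheck {

  // True when s1 < row <= s2, reading a null s1 as -inf and a null s2 as +inf.
  public static boolean contains(String s1, String row, String s2) {
    boolean aboveLower = s1 == null || s1.compareTo(row) < 0;
    boolean atOrBelowUpper = s2 == null || row.compareTo(s2) <= 0;
    return aboveLower && atOrBelowUpper;
  }

  // Counts the tablets covered by the range (s1, s2] in a table with the given splits.
  // Every table split strictly between s1 and s2 adds one more tablet to the range.
  public static int tabletsSpanned(TreeSet<String> tableSplits, String s1, String s2) {
    NavigableSet<String> inner;
    if (s1 == null && s2 == null) {
      inner = tableSplits;
    } else if (s1 == null) {
      inner = tableSplits.headSet(s2, false);
    } else if (s2 == null) {
      inner = tableSplits.tailSet(s1, false);
    } else {
      inner = tableSplits.subSet(s1, false, s2, false);
    }
    return inner.size() + 1;
  }

  public static void main(String[] args) {
    TreeSet<String> splits = new TreeSet<>(Arrays.asList("C", "D", "E", "M"));

    // Both pairs satisfy the contract for row H.
    System.out.println(contains("D", "H", "M")); // true
    System.out.println(contains("E", "H", "M")); // true

    // The lower bound is exclusive and the upper bound is inclusive.
    System.out.println(contains("C", "C", "D")); // false
    System.out.println(contains("C", "D", "D")); // true

    // The wider pair covers two tablets, the tighter pair covers one.
    System.out.println(tabletsSpanned(splits, "D", "M")); // 2
    System.out.println(tabletsSpanned(splits, "E", "M")); // 1
  }
}
```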