Batch Scanner Code

Accumulo Tour: Batch Scanner Code

Tour page 10 of 13

Below is a solution to the exercise.

Create a table called “GothamBatch”.

jshell> client.tableOperations().create("GothamBatch");

Generate 10,000 rows of villain data

jshell> try (BatchWriter writer = client.createBatchWriter("GothamBatch")) {
   ...>   for (int i = 0; i < 10_000; i++) {
   ...>     Mutation m = new Mutation(String.format("id%04d", i));
   ...>     m.put("villain", "alias", "henchman" + i);
   ...>     m.put("villain", "yearsOfService", "" + (new Random().nextInt(50)));
   ...>     m.put("villain", "wearsCape?", "false");
   ...>    writer.addMutation(m);
   ...>   }
   ...> }

Create a BatchScanner with 5 query threads

jshell> try (BatchScanner batchScanner = client.createBatchScanner("GothamBatch", Authorizations.EMPTY, 5)) {
   ...>
   ...>   // Create a collection of 2 sample ranges and set it to the batchScanner
   ...>   List<Range> ranges = new ArrayList<Range>();
   ...>
   ...>   // Create a collection of 2 sample ranges and set it to the batchScanner
   ...>   ranges.add(new Range("id1000", "id1999"));
   ...>   ranges.add(new Range("id9000", "id9999"));
   ...>   batchScanner.setRanges(ranges);
   ...>
   ...>   // Fetch just the columns we want
   ...>   batchScanner.fetchColumn(new Text("villain"), new Text("yearsOfService"));
   ...>
   ...>   // Calculate average years of service
   --->   long villianCount = batchScanner.stream().count();
   --->   Double average = batchScanner.stream().map(Map.Entry::getValue).map(Value::toString).mapToLong(Long::valueOf).average().getAsDouble();
   --->   System.out.println("The average years of service of " + villianCount + " villians is " + average);
   ...> }

Running the solution above should print output similar to below:

The average years of service of 2000 villains is 24.8125

< 10 / 13 >