Apache Accumulo RowHash Example

This example shows a simple map/reduce job that reads from an Accumulo table and writes its results back into the same table.

To run this example you will need some data in a table. The following session uses the Accumulo shell to create a table named input and insert two rows into it:

$ ./bin/accumulo shell -u username -p password
Shell - Apache Accumulo Interactive Shell
- version: 1.5.0
- instance name: instance
- instance id: 00000000-0000-0000-0000-000000000000
-
- type 'help' for a list of available commands
-
username@instance> createtable input
username@instance> insert a-row cf cq value
username@instance> insert b-row cf cq value
username@instance> quit

The RowHash class will insert a hash for each row in the table that contains the specified column. Here’s how you run the map/reduce job:

$ bin/tool.sh lib/accumulo-examples-simple.jar org.apache.accumulo.examples.simple.mapreduce.RowHash -u user -p passwd -i instance -t input --column cf:cq

Now we can scan the table and see the hashes:

$ ./bin/accumulo shell -u username -p password
Shell - Apache Accumulo Interactive Shell
- version: 1.5.0
- instance name: instance
- instance id: 00000000-0000-0000-0000-000000000000
-
- type 'help' for a list of available commands
-
username@instance> scan -t input
a-row cf:cq []    value
a-row cf-HASHTYPE:cq-MD5BASE64 []    IGPBYI1uC6+AJJxC4r5YBA==
b-row cf:cq []    value
b-row cf-HASHTYPE:cq-MD5BASE64 []    IGPBYI1uC6+AJJxC4r5YBA==
username@instance>
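
Both rows show the same hash because both hold the same value. Judging from the column name cf-HASHTYPE:cq-MD5BASE64, each hash looks like the Base64 encoding of the MD5 digest of the cell's value. The following is a minimal standalone sketch for checking that by hand. It is not the RowHash source: the class name RowHashSketch and the method md5Base64 are hypothetical, and it assumes the value is hashed as UTF-8 bytes. It uses only the JDK, so no Accumulo instance is needed.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;

// Hypothetical helper for this README; not part of the Accumulo examples jar.
public class RowHashSketch {

    // Returns the Base64 encoding of the MD5 digest of the value's UTF-8 bytes.
    // Assumption: this mirrors how the cq-MD5BASE64 entries are derived.
    public static String md5Base64(String value) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            byte[] digest = md5.digest(value.getBytes(StandardCharsets.UTF_8));
            return Base64.getEncoder().encodeToString(digest);
        } catch (NoSuchAlgorithmException e) {
            // MD5 is required on every Java platform, so this should not happen.
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    public static void main(String[] args) {
        // "value" is the cell value inserted in the shell session above.
        System.out.println(md5Base64("value"));
    }
}
```

Running main should print IGPBYI1uC6+AJJxC4r5YBA==, the same string that the scan shows for a-row and b-row.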