Toolkit com.teracloud.streams.hbase 5.0.0
General Information
The HBase toolkit provides support for interacting with Apache HBase from Streams.
HBase is a Hadoop database, a distributed, scalable, big data store. Tables are partitioned by rows across clusters. A cell value in an HBase table is accessed by its row, columnFamily, columnQualifier, and timestamp. Usually the timestamp is left out, and only the latest value is returned. The HBase toolkit currently does not provide support for timestamps.
- The columnFamilies must be defined when the table is created, and their number might be limited.
- New columnQualifiers can be added at run time, and there is no limit to their number.
- Tuples can be added to an HBase table by using the HBASEPut operator (which includes a checkAndPut condition), and values can be incremented with the HBASEIncrement operator.
- Tuples can be retrieved with the HBASEGet operator from an HBase table.
- The HBASEScan operator can output all tuples from an HBase table, or all tuples in a particular row range.
- The HBASEDelete operator enables tuples to be deleted from an HBase table.
For operators that take a columnFamily and columnQualifier, these values can be specified in one of two ways:
- as an attribute of the input tuple (using columnFamilyAttrName and columnQualifierAttrName parameters), or
- as a single string that is used for all tuples (using staticColumnFamily and staticColumnQualifier parameters).
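The two ways of supplying a column can be pictured with a small Python sketch. It is only an illustrative model, not toolkit code: the `resolve_column` helper and the sample tuple are hypothetical, and only the parameter names in the comments come from the list above. Rejecting the case where both forms are given is an assumption of this sketch.

```python
def resolve_column(tuple_attrs, attr_name=None, static_value=None):
    """Return the columnFamily or columnQualifier to use for one input tuple.

    attr_name models columnFamilyAttrName / columnQualifierAttrName.
    static_value models staticColumnFamily / staticColumnQualifier.
    """
    if attr_name is not None and static_value is not None:
        # Assumption of this sketch: the two forms are alternatives.
        raise ValueError("give an attribute name or a static value, not both")
    if attr_name is not None:
        # Per-tuple form: the value is read from the named tuple attribute.
        return tuple_attrs[attr_name]
    # Static form: the same string is used for every tuple (None if not given).
    return static_value


# Hypothetical input tuple with one attribute holding the columnFamily.
tup = {"character": "Gandalf", "colF": "appearance"}
family = resolve_column(tup, attr_name="colF")
qualifier = resolve_column(tup, static_value="beard")
```

Here `family` is taken from the tuple, so it can differ from tuple to tuple, while `qualifier` is the same for every tuple.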
While the HBASEPut operator requires all fields to be provided, the HBASEGet and HBASEDelete operators do not, and their behavior changes depending on which items are provided. For example, HBASEDelete deletes the whole row if columnFamily and columnQualifier are not specified, but deletes only the cell value if they are.
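The HBASEDelete behavior described above can be sketched with a toy in-memory table. This is a hedged model of the semantics only, not the operator's implementation: the `delete` function and the nested-dictionary layout are hypothetical, and the case where only one of columnFamily or columnQualifier is given is not described in this text, so the sketch declines to guess at it.

```python
# Toy table layout: {row: {(columnFamily, columnQualifier): value}}
def delete(table, row, column_family=None, column_qualifier=None):
    """Model of the two delete cases described in the text."""
    if column_family is None and column_qualifier is None:
        # No column given: the whole row is removed.
        table.pop(row, None)
    elif column_family is not None and column_qualifier is not None:
        # Full column given: only that cell is removed.
        cells = table.get(row)
        if cells is not None:
            cells.pop((column_family, column_qualifier), None)
            if not cells:
                # A row with no remaining cells no longer exists.
                del table[row]
    else:
        raise NotImplementedError("case not covered by this sketch")


table = {
    "row1": {("cf", "a"): "1", ("cf", "b"): "2"},
    "row2": {("cf", "a"): "3"},
}
delete(table, "row1", "cf", "a")  # removes one cell from row1
delete(table, "row2")             # removes all of row2
```

After these two calls the toy table holds only the `("cf", "b")` cell of `row1`.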
Except for HBASEIncrement and HBASEGet, the operators currently support only the rstring data type. HBASEGet also supports getting a value of type long.
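The distinction between rstring and long matters because HBase stores every cell value as raw bytes. The sketch below assumes the usual HBase convention, in which a long (for example, a counter maintained by an increment) is held as an 8-byte big-endian signed integer. How the toolkit converts values internally is not stated here, so treat this as background, not as a description of the operators; the helper names are hypothetical.

```python
import struct


def encode_long(n):
    """Pack an integer as an 8-byte big-endian signed value."""
    return struct.pack(">q", n)


def decode_long(raw):
    """Unpack an 8-byte big-endian signed value."""
    (n,) = struct.unpack(">q", raw)
    return n


as_text = b"42"            # the number written as a 2-byte string
as_long = encode_long(42)  # the same number as an 8-byte long
```

The two byte sequences differ, so a value written as a string cannot simply be read back as a long, and vice versa.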
The toolkit uses the same configuration information from the hbase-site.xml file that HBase does. For more information about HBase, see http://hbase.apache.org/.
Check and put/delete operations
The HBASEPut and HBASEDelete operators can check a condition before they change the table. The check is specified as either:
- a full entry (row, columnFamily, columnQualifier, value). If this entry exists with the given value, HBase makes the put or delete.
- a partial entry (row, columnFamily, columnQualifier). If this entry has no value, HBase makes the put or delete.
In this mode, the operator can have an output port with a success attribute to indicate whether the put or delete happened.
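The check-then-put behavior can be modeled in a few lines of Python. This is a minimal sketch of the semantics described above against a toy in-memory table, not HBase or toolkit code: the `check_and_put` function, the tuple shapes, and the sample values are hypothetical, and the returned flag stands in for the success attribute.

```python
# Toy table layout: {row: {(columnFamily, columnQualifier): value}}
def check_and_put(table, check, new_cell):
    """Apply new_cell only if the check passes; return the success flag.

    check is (row, family, qualifier, value) for a full entry,
    or (row, family, qualifier) for a partial entry.
    new_cell is (row, family, qualifier, value).
    """
    row, family, qualifier = check[:3]
    current = table.get(row, {}).get((family, qualifier))
    if len(check) == 4:
        passed = current == check[3]  # full entry: stored value must match
    else:
        passed = current is None      # partial entry: cell must be absent
    if passed:
        p_row, p_family, p_qualifier, p_value = new_cell
        table.setdefault(p_row, {})[(p_family, p_qualifier)] = p_value
    return passed


table = {}
# Partial check on an empty table: the cell is absent, so the put happens.
first = check_and_put(table, ("r1", "cf", "state"), ("r1", "cf", "state", "new"))
# Same partial check again: the cell now exists, so the put is skipped.
second = check_and_put(table, ("r1", "cf", "state"), ("r1", "cf", "state", "again"))
# Full check with the matching value: the put happens.
third = check_and_put(table, ("r1", "cf", "state", "new"), ("r1", "cf", "state", "done"))
```

A check-and-delete would follow the same pattern, with the cell removed instead of written when the check passes.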
- Version: 5.0.0
- Required Product Version: 7.2.0.0