
Rows with identical keys

Tags: hbase, bigdata

When I need to create an HBase row, I have to call the Put(row_key) method. What happens if I call Put() again with the same row_key value? Will the existing row be updated, or will HBase create a new row?

Is it possible to create two rows with identical keys?

VeLKerr asked May 25 '15 20:05


3 Answers

Row keys uniquely identify a row in HBase. If you want two rows to have identical keys, then you are missing something. Please add more information about your requirement, or revisit the basics of HBase architecture.

Ramzy answered Sep 22 '22 12:09


Your question should include the column family and column qualifier as well. Together with the row key, these three uniquely identify a value in an HBase table.

You can also enable versioning for that column family and store multiple values under the same "row key + column family + column qualifier". In that case each version (value) is identified by "row key + column family + column qualifier + timestamp".
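To make the coordinates above concrete, here is a minimal sketch of a toy in-memory model. It is not the HBase client API: the `ToyTable` class, its `max_versions` parameter, the fake clock, and the sample row and column names are all assumptions made for illustration.

```python
from collections import defaultdict


class ToyTable:
    """In-memory stand-in for one HBase table; not the HBase client API."""

    def __init__(self, max_versions=1):
        self.max_versions = max_versions
        # (row_key, family, qualifier) -> {timestamp: value}
        self.cells = defaultdict(dict)
        self._clock = 0  # fake clock, used when no timestamp is given

    def put(self, row_key, family, qualifier, value, timestamp=None):
        if timestamp is None:
            self._clock += 1
            timestamp = self._clock
        versions = self.cells[(row_key, family, qualifier)]
        versions[timestamp] = value
        # Keep only the newest max_versions timestamps.
        for old in sorted(versions, reverse=True)[self.max_versions:]:
            del versions[old]

    def get(self, row_key, family, qualifier):
        """Return all retained versions of one cell, newest first."""
        versions = self.cells.get((row_key, family, qualifier), {})
        return [versions[ts] for ts in sorted(versions, reverse=True)]

    def row_keys(self):
        """Return the distinct row keys that hold at least one cell."""
        return sorted({key[0] for key in self.cells})


table = ToyTable(max_versions=3)
table.put("user1", "info", "email", "old@example.com")
table.put("user1", "info", "email", "new@example.com")  # same row key again
table.put("user1", "info", "phone", "555-0100")         # same row, new column
```

In this model the second put to `user1` does not create a second row. It adds a newer version of the same cell, and the put to a different qualifier adds another cell to the same row.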

halil answered Sep 21 '22 12:09


You cannot have rows with the same key, but you can have multiple versions of a Put, distinguished by timestamps. You can use these built-in timestamps for auditing or for tracking when a value was written.

If you issue multiple Puts without specifying a version (timestamp), the latest version of the KV will prevail. If you issue multiple Puts with the same explicitly set timestamp, one of those values will be returned, but HBase provides no guarantees about the order or about which KV will survive compaction (scheduled cleanup). Inserting multiple Puts with negative timestamps is a serious problem: earlier HBase versions will produce unpredictable scan results, while later HBase versions will throw an exception.
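As a rough illustration of "the latest version prevails", the sketch below models a single cell as a plain dict from timestamp to value. The `put` and `latest` helpers are hypothetical and are not HBase calls. The model deliberately does not cover same-timestamp collisions, since HBase gives no guarantee there, and its `ValueError` only stands in for the exception that later HBase versions are said to throw.

```python
def put(cell, value, timestamp):
    """Store value in cell, a dict mapping timestamp -> value."""
    if timestamp < 0:
        # Stand-in for the rejection of negative timestamps described above.
        raise ValueError("timestamp must be non-negative")
    cell[timestamp] = value


def latest(cell):
    """Return the value stored under the highest timestamp."""
    return cell[max(cell)]


cell = {}
put(cell, "written first", timestamp=200)
put(cell, "written second", timestamp=100)  # explicit, older timestamp
```

Here `latest(cell)` returns "written first": what a read sees is decided by the timestamp, not by the order in which the writes arrived.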

Sergei Rodionov answered Sep 21 '22 12:09