Big data: HBase Architecture

HBase Tables and Regions

A table is made up of any number of regions.

A region is specified by its startKey and endKey: it holds the rows from startKey (inclusive) up to endKey (exclusive).

Each region may live on a different node and is made up of several HDFS files and blocks, each of which is replicated by Hadoop.
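To illustrate how a row key maps to a region through its startKey and endKey, here is a minimal sketch in Python. It is not HBase client code: the region boundaries, node names, and the find_region helper are all hypothetical, and it assumes the usual convention that an empty start or end key means "open-ended".

```python
from bisect import bisect_right

# Hypothetical region boundaries for one table. Each region covers
# [start_key, end_key). An empty start key means "from the beginning of the
# table"; an empty end key means "to the end of the table".
REGIONS = [
    (b"", b"g", "node-1"),
    (b"g", b"p", "node-2"),
    (b"p", b"", "node-3"),
]

def find_region(row_key: bytes):
    """Return the (start_key, end_key, node) tuple whose range contains row_key."""
    start_keys = [start for start, _end, _node in REGIONS]
    # The last region whose start key is <= row_key owns the row.
    index = bisect_right(start_keys, row_key) - 1
    return REGIONS[index]
```

For example, find_region(b"apple") falls into the first region, while find_region(b"g") falls into the second one because the start key is inclusive.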

HBase Tables:-

HBase uses HDFS as its reliable storage layer. HDFS handles checksums, replication, and failover.

HBase consists of,

Data is stored in memory and flushed to disk at regular intervals or based on size.

MemStores:-

After data is written to the WAL (write-ahead log), the RegionServer saves the KeyValues in the MemStore.

The flush to disk is triggered by size, controlled by hbase.hregion.memstore.flush.size.
The default size is 64 MB (128 MB in more recent HBase releases).
A snapshot mechanism is used to write the flush to disk while still serving reads from the MemStore and accepting new data at the same time.
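The size-triggered flush and the snapshot swap can be sketched roughly as follows. This is a toy, single-threaded Python model, not HBase's actual implementation: the MemStore class, its fields, and the byte counting are all assumptions made for illustration, and flush_size merely stands in for hbase.hregion.memstore.flush.size.

```python
class MemStore:
    """Toy in-memory store that flushes once it grows past a size threshold."""

    def __init__(self, flush_size: int):
        self.flush_size = flush_size  # stands in for hbase.hregion.memstore.flush.size
        self.active = {}              # buffer that accepts new writes
        self.snapshot = {}            # frozen buffer that is being flushed
        self.active_bytes = 0         # rough size: every put is counted
        self.flushed_files = []       # stands in for store files on disk

    def put(self, key: str, value: str) -> None:
        self.active[key] = value
        self.active_bytes += len(key) + len(value)
        if self.active_bytes >= self.flush_size:
            self.flush()

    def flush(self) -> None:
        # Swap: freeze the current data as the snapshot and start a fresh
        # active buffer, so new writes are not blocked by the flush.
        self.snapshot, self.active = self.active, {}
        self.active_bytes = 0
        # Write the snapshot out as one sorted, immutable "file".
        self.flushed_files.append(sorted(self.snapshot.items()))
        self.snapshot = {}

    def get(self, key: str):
        # Reads check the active buffer, then the snapshot, then flushed
        # files from newest to oldest.
        for source in (self.active, self.snapshot):
            if key in source:
                return source[key]
        for store_file in reversed(self.flushed_files):
            for k, v in store_file:
                if k == key:
                    return v
        return None
```

In a real RegionServer the flush runs concurrently with reads and writes, which is why the snapshot has to stay readable until the file is on disk; the sketch only shows the order of the steps.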

Compactions:-
Two types: Minor and Major Compactions

Minor Compactions

Major Compactions

Key Cardinality:-

The best performance is gained from selecting data by row key.

Fold, Store, and Shift:-

All values are stored with their full coordinates, including: Row Key, Column Family, Column Qualifier, and Timestamp.
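As a rough illustration of what "full coordinates" means, the Python sketch below models each stored value as a tuple carrying its row key, column family, column qualifier, and timestamp. The KeyValue type and sort_key helper here are hypothetical stand-ins, not HBase classes; the sort order (ascending by row, family, and qualifier, newest timestamp first) reflects how HBase is generally described as ordering cells.

```python
from typing import NamedTuple

class KeyValue(NamedTuple):
    """One stored cell together with its full coordinates."""
    row: bytes
    family: bytes
    qualifier: bytes
    timestamp: int
    value: bytes

def sort_key(kv: KeyValue):
    # Ascending by row, family, qualifier; descending by timestamp so the
    # newest version of a cell comes first.
    return (kv.row, kv.family, kv.qualifier, -kv.timestamp)

cells = [
    KeyValue(b"row1", b"cf", b"name", 100, b"old"),
    KeyValue(b"row1", b"cf", b"name", 200, b"new"),
    KeyValue(b"row1", b"cf", b"age", 100, b"30"),
]
ordered = sorted(cells, key=sort_key)
```

Because every value repeats its coordinates, long row keys, family names, and qualifiers are paid for on every cell, which is one reason short names are usually recommended.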

DDI:-

DDI stands for Denormalization, Duplication, and Intelligent Keys.
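One common reading of "intelligent keys" is a composite row key built so that related rows sort next to each other. The Python sketch below is only an illustration under that assumption: the make_row_key helper, the "-" separator, and the reversed-timestamp layout are hypothetical choices, not something prescribed by HBase.

```python
import struct

MAX_LONG = 2**63 - 1

def make_row_key(user_id: str, timestamp_ms: int) -> bytes:
    """Build a composite key so a user's newest rows sort first."""
    # Subtracting from MAX_LONG reverses the order: newer timestamps give
    # smaller numbers.
    reversed_ts = MAX_LONG - timestamp_ms
    # Big-endian 8-byte encoding keeps byte order equal to numeric order
    # for non-negative values.
    return user_id.encode("utf-8") + b"-" + struct.pack(">q", reversed_ts)
```

With this layout, make_row_key("u1", 2000) sorts before make_row_key("u1", 1000), so a scan starting at the prefix for "u1" would see the most recent entry first.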

Block Cache

Region Splits
