HBase Working Principle: A Part of Hadoop Architecture

A brief summary of how HBase works in Hadoop

Sahil Dhankhad, May 21

1. Introduction

HBase is a highly reliable, high-performance, column-oriented, scalable distributed storage system. HBase technology can be used to build large-scale structured storage clusters on inexpensive PC servers.

The goal of HBase is to store and process large amounts of data, specifically to handle very large tables (on the order of billions of rows and millions of columns) using only standard hardware configurations.

Unlike MapReduce, which is an offline batch computing framework, HBase is a platform for random-access storage and retrieval, which makes up for the fact that HDFS cannot access data randomly.

It is suitable for business scenarios where real-time requirements are not very high. HBase stores byte arrays and does not care about data types, which allows dynamic, flexible data models.

Hadoop Ecosystem (Credit: Edureka.com)

The figure above depicts the various layers of the Hadoop 2.0 ecosystem, with HBase located on the structured storage layer.

HDFS provides high-reliability low-level storage support for HBase.

MapReduce provides high-performance batch processing capability for HBase.

ZooKeeper provides stable services and failover mechanism for HBase.

Pig and Hive provide high-level language support for statistical processing of HBase data. Sqoop provides an RDBMS data import function for HBase, which makes it very convenient to migrate business data from a traditional database to HBase.


2. HBase Architecture

2.1 Design Idea

HBase is a distributed database that uses ZooKeeper to manage clusters and HDFS as the underlying storage.

At the architectural level, it consists of HMaster (Leader elected by Zookeeper) and multiple HRegionServers.

The underlying architecture is shown in the following figure:

In HBase's model, an HRegionServer corresponds to one node in the cluster, one HRegionServer is responsible for managing multiple HRegions, and one HRegion represents a part of the data of a table.

In HBase, a table may need many HRegions to store its data, and the data in each HRegion is ordered rather than arbitrary.

When HBase manages HRegions, it defines a range of RowKeys for each HRegion.

Data that falls within a given range is handed to the corresponding HRegion, which spreads the load across multiple nodes and takes advantage of the distributed architecture.

HBase also adjusts the location of HRegions automatically.

If an HRegionServer is overloaded, that is, a large number of requests fall on the HRegions it manages, HBase moves HRegions to other nodes that are relatively idle, so that the cluster is fully utilized.
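
The routing idea described above can be sketched in a few lines of Python. This is only a toy model, not HBase code: the `RegionMap` class, the server names (`rs1`, `rs2`, `rs3`) and the start keys are all invented for illustration. It assumes each region owns the row keys from its start key up to the next region's start key.

```python
from bisect import bisect_right


class RegionMap:
    """Toy routing table: each region covers [start_key, next_start_key)."""

    def __init__(self, assignments):
        # assignments: list of (start_key, server_name); names are hypothetical
        self.assignments = sorted(assignments)
        self.start_keys = [start for start, _ in self.assignments]

    def locate(self, row_key):
        # The owning region is the last one whose start key is <= row_key.
        index = bisect_right(self.start_keys, row_key) - 1
        return self.assignments[index]

    def move(self, start_key, new_server):
        # Reassign one region to another server, as a load balancer might.
        index = self.start_keys.index(start_key)
        self.assignments[index] = (start_key, new_server)


regions = RegionMap([("", "rs1"), ("g", "rs2"), ("p", "rs3")])
print(regions.locate("apple"))  # region starting at "" on rs1
regions.move("g", "rs3")        # pretend rs2 was overloaded
print(regions.locate("grape"))  # same region, now served by rs3
```

Because the ranges are sorted, finding the owner of a row key is a binary search, and moving a region only changes which server is listed for it; the key range itself stays the same.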


2.2 Basic Architecture

HBase consists of HMaster and HRegionServer and follows a master-slave server architecture.

HBase divides a logical table into multiple data blocks, called HRegions, and stores them on HRegionServers.

HMaster is responsible for managing all HRegionServers.

It does not store any data itself, but only stores the mappings (metadata) of data to HRegionServer.

ZooKeeper coordinates all nodes in the cluster and handles various issues that may arise during HBase operation.

The basic architecture of HBase is shown below:

Client: Uses HBase’s RPC mechanism to communicate with HMaster and HRegionServer, submitting requests and getting results.

For management operations, the client performs RPC with HMaster.

For data read and write operations, the client performs RPC with HRegionServer.

ZooKeeper: Each node in the cluster registers its status information with ZooKeeper, so HMaster can sense the health of each HRegionServer at any time. ZooKeeper also removes the single point of failure of HMaster.

HMaster: Manages all HRegionServers, tells them which HRegions they need to maintain, and monitors their health.

When a new HRegionServer registers with HMaster, HMaster tells it to wait for data to be allocated.

When an HRegionServer dies, HMaster marks all HRegions it was responsible for as unallocated and then assigns them to other HRegionServers.

HMaster is not a single point of failure.

HBase can start multiple HMasters.

Through ZooKeeper’s election mechanism, there is always exactly one active HMaster in the cluster, which improves the availability of the cluster.

HRegion: When the size of a table exceeds a preset value, HBase automatically divides the table into different regions, each of which contains a subset of the rows in the table.

For the user, each table is a collection of data, distinguished by a primary key (RowKey).

Physically, a table is split into multiple blocks, each of which is an HRegion.

We use the table name + start/end primary key to distinguish each HRegion.

One HRegion holds a contiguous range of rows from a table.

The complete data of a table is stored across multiple HRegions.

HRegionServer: At the lowest layer, all HBase data is stored in HDFS.

Users obtain this data through HRegionServers.

Generally, only one HRegionServer runs on each node of the cluster, and each HRegion is maintained by only one HRegionServer.

HRegionServer is mainly responsible for reading and writing data to the HDFS file system in response to user I/O requests.

It is the core module in HBase.

HRegionServer internally manages a series of HRegion objects, each HRegion corresponding to a continuous data segment in the logical table.

HRegion is composed of multiple HStores.

Each HStore corresponds to the storage of one column family in the logical table.

It can be seen that each column family is a centralized storage unit.

Therefore, to improve operational efficiency, it is preferable to place columns with common I/O characteristics in one column family.

HStore: The core of HBase storage, consisting of a MemStore and StoreFiles.

MemStore is a memory buffer.

Data written by the user is first put into the MemStore.

When the MemStore is full, it is flushed to a StoreFile (the underlying implementation is HFile).

When the number of StoreFiles grows to a certain threshold, a Compact (merge) operation is triggered, which merges multiple StoreFiles into one StoreFile and performs version merging and data deletion during the merge.

HBase therefore only ever appends data; updates and deletes are actually applied in the subsequent Compact process, so a user’s write can return as soon as it reaches memory, which keeps HBase I/O performance high.

As StoreFiles are compacted, they gradually form larger and larger StoreFiles.

When the size of a single StoreFile exceeds a certain threshold, a Split operation is triggered.

The current HRegion is split into two HRegions, and the parent HRegion goes offline.

HMaster assigns the two child HRegions to HRegionServers, so that the load of the original HRegion is spread over the two new HRegions.
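
To make the flush and compact cycle concrete, here is a small Python sketch. It is a simplified model under stated assumptions, not HBase's implementation: `ToyStore`, `flush_size` and `compact_at` are invented names and thresholds, a StoreFile is modelled as an immutable sorted tuple, and the compaction shown merges all files at once (closest to a major compaction), which is why it can safely drop delete markers.

```python
class ToyStore:
    """Toy HStore: a MemStore plus a list of immutable StoreFiles."""

    TOMBSTONE = object()  # marker written by delete()

    def __init__(self, flush_size=3, compact_at=3):
        # flush_size and compact_at are made-up thresholds for this sketch
        self.flush_size = flush_size
        self.compact_at = compact_at
        self.memstore = {}
        self.store_files = []  # oldest first; each is a sorted tuple of pairs

    def put(self, key, value):
        self.memstore[key] = value
        if len(self.memstore) >= self.flush_size:
            self.flush()

    def delete(self, key):
        # Deletes are appended as markers, not applied in place.
        self.put(key, self.TOMBSTONE)

    def flush(self):
        if not self.memstore:
            return
        self.store_files.append(tuple(sorted(self.memstore.items())))
        self.memstore = {}
        if len(self.store_files) >= self.compact_at:
            self.compact()

    def compact(self):
        merged = {}
        for store_file in self.store_files:  # newer files overwrite older ones
            merged.update(store_file)
        live = {k: v for k, v in merged.items() if v is not self.TOMBSTONE}
        self.store_files = [tuple(sorted(live.items()))] if live else []


store = ToyStore(flush_size=2, compact_at=2)
store.put("a", 1)
store.put("b", 2)    # MemStore full: first flush
store.put("a", 10)   # an "update" is just another append
store.delete("b")    # second flush, which triggers the compaction
print(store.store_files)
```

Nothing is modified in place: the old value of `a` and the delete marker for `b` only disappear when the files are merged.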

HLog: Each HRegionServer has an HLog object, a class that implements the Write Ahead Log (WAL).

Each time a user writes data to the MemStore, a copy of the data is also written to the HLog file.

The HLog file is rolled periodically, and old files (whose data has already been persisted to StoreFiles) are deleted.

When HMaster learns through ZooKeeper that an HRegionServer has terminated unexpectedly, it first processes the leftover HLog file: it splits the HLog data by HRegion, puts each part into the corresponding HRegion directory, and then reassigns the orphaned HRegions.

While loading these HRegions, the HRegionServers that receive them find that there is a historical HLog to process, so they replay the HLog data into the MemStore and then flush it to StoreFiles to complete data recovery.
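
The recovery logic can be illustrated with a minimal Python sketch. It assumes a single region and models the HLog as an in-memory list (in real HBase it is a file on HDFS); `ToyRegionServer` and its methods are hypothetical names, and the per-region log splitting done by HMaster is left out.

```python
class ToyRegionServer:
    """Toy model of the HLog (write-ahead log) protecting the MemStore."""

    def __init__(self):
        self.hlog = []        # durable in real HBase; a plain list here
        self.memstore = {}    # volatile: lost on crash
        self.store_file = {}  # persisted data

    def put(self, key, value):
        self.hlog.append((key, value))  # log first
        self.memstore[key] = value      # then apply in memory

    def flush(self):
        self.store_file.update(self.memstore)
        self.memstore.clear()
        self.hlog.clear()  # logged entries are persisted, so they can go

    def crash(self):
        self.memstore.clear()  # memory is gone, the log survives

    def recover(self):
        for key, value in self.hlog:  # replay in the original write order
            self.memstore[key] = value
        self.flush()


server = ToyRegionServer()
server.put("x", 1)
server.flush()
server.put("y", 2)
server.put("x", 3)
server.crash()       # the two unflushed writes exist only in the HLog
server.recover()
print(server.store_file)
```

Replaying in write order matters: the later write to `x` must win, just as it would have without the crash.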


2.3 Root and Meta

All HRegion metadata of HBase is stored in the .META. table.

As the number of HRegions increases, the data in the .META. table also grows and is split into multiple new HRegions.

To locate each HRegion of the .META. table, the metadata of all HRegions in the .META. table is stored in the -ROOT- table, and finally the location of the -ROOT- table is recorded in ZooKeeper.

Before accessing user data, every client first accesses ZooKeeper to obtain the location of -ROOT-, then accesses the -ROOT- table to get the location of the .META. table, and finally determines the location of the user data from the information in the .META. table, as shown in the figure.

The -ROOT- table is never split.

It has only one HRegion, which guarantees that any HRegion can be located with only three jumps.

To speed up access, all regions of the .META. table are kept in memory.

The client caches the location information it has queried, and the cache does not expire on its own.

If the client cannot reach the data using the cached information, it asks the Region server that holds the relevant .META. table to try to obtain the location of the data.

If that still fails, it asks the -ROOT- table where the relevant .META. table is.

Finally, if all the previous information is invalid, the HRegion is located again starting from ZooKeeper.

So if the cache on the client is entirely invalid, it takes six round trips to get the correct HRegion.
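
The three-jump lookup and the client-side cache can be sketched as follows. Everything here is a toy: the dictionaries stand in for ZooKeeper, the -ROOT- table and the .META. table, and the server and region names are invented. The sketch only shows the happy path (a cache miss followed by a cache hit), not the retry sequence for stale entries.

```python
class ToyCluster:
    """Toy lookup chain: ZooKeeper -> -ROOT- -> .META. -> region server."""

    def __init__(self):
        # All server names and keys below are invented for this sketch.
        self.zookeeper = {"root_location": "rs-root"}
        self.root_table = {"rs-root": "rs-meta"}  # where .META. lives
        self.meta_table = {"rs-meta": {"user_table,row-000": "rs7"}}
        self.hops = 0

    def locate(self, region_name):
        self.hops += 1
        root_server = self.zookeeper["root_location"]     # jump 1: ZooKeeper
        self.hops += 1
        meta_server = self.root_table[root_server]        # jump 2: -ROOT-
        self.hops += 1
        return self.meta_table[meta_server][region_name]  # jump 3: .META.


class ToyClient:
    def __init__(self, cluster):
        self.cluster = cluster
        self.cache = {}

    def locate(self, region_name):
        if region_name not in self.cache:  # only a cache miss costs jumps
            self.cache[region_name] = self.cluster.locate(region_name)
        return self.cache[region_name]


cluster = ToyCluster()
client = ToyClient(cluster)
first = client.locate("user_table,row-000")
second = client.locate("user_table,row-000")  # served from the cache
print(first, second, cluster.hops)
```

The second lookup costs no jumps at all, which is why the client cache matters so much in practice.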


3. HBase data model

HBase is a distributed database similar to BigTable.

It is a sparse, persistent (stored on HDFS), multi-dimensional, sorted map.

The table is indexed by row key, column key, and timestamp.

HBase data is stored as untyped strings (byte arrays).

Think of a table as a large mapping.

You can locate specific data by row key, row key + timestamp or row key + column (column family: column modifier).

Since HBase stores data sparsely, some columns can be blank.

The above table gives the logical view of the com.cnn.www website.

There is only one row of data in the table.

The unique identifier of the row is “com.cnn.www”, and each logical modification of this row has a corresponding timestamp.

There are four columns in the table: contents:html, anchor:cnnsi.com, anchor:my.look.ca, and mime:type. Each column name gives the column family to which it belongs.

The row key (RowKey) is the unique identifier of the data row in the table and serves as the primary key for retrieving records.

There are only three ways to access rows in a table in HBase: access via a row key, range access for a given row key, and full table scan.

The row key can be any string (maximum length 64 KB), and rows are stored in lexicographical order of their row keys.

For rows that are often read together, the row keys need to be carefully designed so that the rows are stored together.
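
One way to picture this data model is as a nested map: row key, then column (family:qualifier), then timestamp, then value. The Python sketch below is an illustration only; the helper functions `put`, `get` and `scan` and the row keys are invented for the example and are not the HBase client API.

```python
# table[row_key][column][timestamp] = value; absent cells take no space
table = {}


def put(row, column, timestamp, value):
    table.setdefault(row, {}).setdefault(column, {})[timestamp] = value


def get(row, column, timestamp=None):
    """Return the newest version at or before timestamp (newest if None)."""
    versions = table.get(row, {}).get(column, {})
    candidates = [ts for ts in versions if timestamp is None or ts <= timestamp]
    return versions[max(candidates)] if candidates else None


def scan(start_row, stop_row):
    """Return row keys in lexicographic order within [start_row, stop_row)."""
    return [row for row in sorted(table) if start_row <= row < stop_row]


# Row keys below are hypothetical; values are raw bytes, as in HBase.
put("com.example.www", "contents:html", 3, b"<html>v1")
put("com.example.www", "contents:html", 6, b"<html>v2")
put("com.example.www", "mime:type", 6, b"text/html")
put("com.example.mail", "contents:html", 5, b"<html>mail")

print(get("com.example.www", "contents:html"))               # newest version
print(get("com.example.www", "contents:html", timestamp=4))  # as of time 4
print(scan("com.example.", "com.example.n"))                 # range access
```

Reversed domain names as row keys keep pages from the same site next to each other in sort order, which is the kind of row key design the paragraph above refers to.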


4. HBase read and write process

The figure below is the HRegionServer data storage relationship diagram.

As mentioned above, HBase uses MemStore and StoreFile to store updates to the table.

When data is updated, it is first written to the HLog and the MemStore.

The data in the MemStore is sorted.

When the MemStore accumulates to a certain threshold, a new MemStore is created, the old MemStore is added to the flush queue, and a separate thread flushes it to disk as a StoreFile.

At the same time, the system records a CheckPoint in ZooKeeper, indicating that the data changes before this point have been persisted.

If the system fails unexpectedly, the data in the MemStore may be lost.

In this case, the HLog is used to recover the data written after the CheckPoint.

StoreFiles are read-only and cannot be modified once created.

Therefore, updates in HBase are really append operations.

When the number of StoreFiles in a Store reaches a certain threshold, a merge operation is performed, and the modifications to the same key are merged into one large StoreFile.

When the size of a StoreFile reaches a certain threshold, the StoreFile is split into two StoreFiles.


4.1 Write operation flow

Step 1: The client, coordinated through ZooKeeper, sends a write request to the HRegionServer, which writes the data to the HRegion.

Step 2: The data is written to the MemStore of the HRegion until the MemStore reaches the preset threshold.

Step 3: The data in the MemStore is flushed into a StoreFile.

Step 4: When the number of StoreFiles grows to a certain threshold, the Compact merge operation is triggered: multiple StoreFiles are merged into one StoreFile, and version merging and data deletion are performed at the same time.

Step 5: Through continuous Compact operations, the StoreFiles gradually form larger and larger StoreFiles.

Step 6: After the size of a single StoreFile exceeds a certain threshold, the Split operation is triggered and splits the current HRegion into two new HRegions.

The parent HRegion goes offline, and HMaster assigns the two new child HRegions to HRegionServers, so that the load of the original HRegion is spread over the two HRegions.
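
Step 6 can be sketched as a function that cuts a region's sorted key range in two. This is a simplified model: `split_region` and `max_rows` are invented for illustration, the threshold counts rows rather than StoreFile bytes, and choosing the middle row key as the split point is an assumption of the sketch.

```python
def split_region(region, max_rows=4):
    """Split a sorted region in two at its middle row key once it is too big.

    region is a dict with 'start', 'end' and a sorted list 'rows'.
    max_rows stands in for HBase's size threshold (an assumption here).
    """
    rows = region["rows"]
    if len(rows) <= max_rows:
        return [region]
    split_key = rows[len(rows) // 2]
    left = {"start": region["start"], "end": split_key,
            "rows": [r for r in rows if r < split_key]}
    right = {"start": split_key, "end": region["end"],
             "rows": [r for r in rows if r >= split_key]}
    return [left, right]  # the parent is retired; the children get reassigned


parent = {"start": "", "end": "", "rows": ["a", "c", "e", "g", "i", "k"]}
children = split_region(parent)
for child in children:
    print(child["start"], child["end"], child["rows"])
```

The two children together cover exactly the parent's key range, so routing by row key keeps working after the split.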


4.2 Read Operation Flow

Step 1: The client accesses ZooKeeper, finds the -ROOT- table, and obtains the .META. table information.

Step 2: The client searches the .META. table to obtain the HRegion information of the target data and so finds the corresponding HRegionServer.

Step 3: The client obtains the data it needs through the HRegionServer.

Step 4: The memory of the HRegionServer is divided into two parts: MemStore and BlockCache.

MemStore is mainly used for writing data, and BlockCache is mainly used for reading data.

A read request first checks the MemStore, then the BlockCache, and then the StoreFile, and the result read from the StoreFile is put into the BlockCache.
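
The lookup order in Step 4 can be sketched like this. It is a deliberately simplified model: `ToyReadPath` is a hypothetical class, the cache stores individual values (the real BlockCache holds file blocks), and there is a single StoreFile represented by a dictionary.

```python
class ToyReadPath:
    """Toy read path: MemStore first, then BlockCache, then the StoreFile."""

    def __init__(self, store_file):
        self.memstore = {}
        self.block_cache = {}
        self.store_file = dict(store_file)
        self.disk_reads = 0  # counts how often we fall through to "disk"

    def put(self, key, value):
        self.memstore[key] = value

    def get(self, key):
        if key in self.memstore:     # 1. most recent writes
            return self.memstore[key]
        if key in self.block_cache:  # 2. recently read data
            return self.block_cache[key]
        self.disk_reads += 1         # 3. fall back to the StoreFile
        value = self.store_file.get(key)
        if value is not None:
            self.block_cache[key] = value  # keep it for the next reader
        return value


path = ToyReadPath({"row1": "old"})
print(path.get("row1"))  # read from the StoreFile, then cached
print(path.get("row1"))  # served from the BlockCache
path.put("row1", "new")
print(path.get("row1"))  # the MemStore wins over the cached value
print(path.disk_reads)
```

Checking the MemStore first is what makes a fresh write visible immediately, even though the older value is still sitting in the cache and in the StoreFile.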


5. HBase usage scenarios

Semi-structured or unstructured data: HBase suits data whose structure and fields are not well defined or are messy, and which is therefore difficult to extract into a fixed schema.

If more fields need to be stored as the business grows, an RDBMS has to be taken down for maintenance to change the table structure, whereas HBase supports adding columns dynamically.

Very sparse records: The number of columns in an RDBMS row is fixed, and empty columns waste storage space.

HBase does not store empty columns, which saves space and improves read performance.

Multi-version data: A value located by RowKey and column identifier can have any number of versions (with different timestamps), so HBase is very convenient for data whose change history needs to be stored.

A large amount of data: When the amount of data keeps growing and the RDBMS can no longer cope, a read-write separation strategy is introduced.

One master is responsible for writes and multiple slaves are responsible for reads, and the server cost doubles.

As the pressure increases, the master can no longer cope either.

At this point the database is partitioned, and data with little correlation is deployed separately.

Some join queries can no longer be used, and a middle layer is needed.

As the amount of data increases further, the number of records in a single table keeps growing and queries become very slow.

It then becomes necessary to shard the table, for example by taking the ID modulo some number to spread records over multiple tables and reduce the number of records in each.

Anyone who has been through this knows how painful the process is.

With HBase it is simple: just add new nodes to the cluster and HBase splits the data horizontally on its own, while seamless integration with Hadoop ensures data reliability (HDFS) and high-performance analysis of massive data (MapReduce).


6. HBase Map Reduce

The relationship between Table and Region in HBase is somewhat similar to the relationship between File and Block in HDFS.

Since HBase provides APIs for interacting with MapReduce, such as TableInputFormat and TableOutputFormat, HBase tables can be used directly as the input and output of Hadoop MapReduce jobs. This simplifies the development of MapReduce applications, which do not need to deal with the internal details of the HBase system itself.
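
The Table/Region and File/Block analogy suggests how a job is parallelized: each region can feed its own map task, and a reduce step merges the partial results. The Python sketch below only mimics that shape with plain functions. It does not use Hadoop or the HBase API, the row data is invented, and the one-map-task-per-region layout is an assumption of the sketch.

```python
from collections import Counter

# Each region is a contiguous, sorted slice of the table (invented data).
regions = [
    [("row-a", "error"), ("row-b", "ok")],
    [("row-m", "ok"), ("row-n", "ok")],
    [("row-x", "error")],
]


def map_region(rows):
    """Map task: count status values within a single region."""
    return Counter(status for _, status in rows)


def reduce_counts(partials):
    """Reduce task: merge the per-region counts."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


totals = reduce_counts(map_region(rows) for rows in regions)
print(dict(totals))
```

Since each map call touches only one region, the calls are independent of each other and could run on the nodes that already hold the data.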

If you liked this topic, you can have a look at a few other topics that I have written about Hadoop.

If you find any mistakes or have any suggestions, please feel free to contact me on LinkedIn.

A Brief Summary of Apache Hadoop: A Solution of Big Data Problem and Hint comes from Google (towardsdatascience.com)

New in Hadoop: You should know the Various File Format in Hadoop. A Beginners’ Guide to Hadoop File Formats (towardsdatascience.com)

