What is a region normalizer

A region normalizer is a strategy to control the regions’ sizes when the table changes dynamically. Once in awhile the HBase master asks the normalizer to calculate normalization plans and the normalizer responds with the list of regions which have to be splitted and/or merged. The HBase master then executes these splits and merges when it has opportunity to do so.

Reference: SimpleRegionNormalizer

HBase comes with one example of region normalizer, called SimpleRegionNormalizer. The goal of this normalizer is to calculate an average region size per table, and then to find outliers. Regions which are more than twice larger than the average are planned to be split and the small regions are merged if their merged size is still less than the average. By this method, the regions’ sizes become closer to the average. The code of this normalizer is available here:

https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/SimpleRegionNormalizer.java

Goals of our normalizer

In our case, the regions already had approximately equal size already. So the goal was to decrease the number of them, starting from the least used ones. Fortunately, the HBase provides the request count for each region. When I checked, I found that many regions on our servers had this count on zero, and many other had low numbers. This parameter is configurable, but first of all we could start with regions which had zero request count. The idea was to merge regions which are not used or almost not used until they become too big or until the table has too few regions. As I told earlier, we started with 200-300 regions per table. So the limit of 20-30 regions per table will mean 10 times less regions.

This is the full list of configuration parameters:

minRegionsPerTable (default 20)
maxRegionSize (in megabytes, default 3072 for 3GB)
maxResults (default 10)
maxRequestCount (default 0)

Name		Name	Last commit message	Last commit date
Latest commit History 3 Commits
src		src
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
pom.xml		pom.xml

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

What is a region normalizer

Reference: SimpleRegionNormalizer

Goals of our normalizer

About

Uh oh!

Releases

Packages

Uh oh!

Languages

License

similarweb/silent-normalizer

Folders and files

Latest commit

History

Repository files navigation

What is a region normalizer

Reference: SimpleRegionNormalizer

Goals of our normalizer

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Languages

Packages