Containerized Apache Hive Metastore for horizontally scalable Hive Metastore deployments backed by a PostgreSQL-compatible database.
The hive-metastore image is stored on Docker Hub in the rtdl/hive-metastore repository.
This is a sub-project of rtdl – the real-time data lake. Please go to rtdl's repo and give it a star.
To get a persistent Apache Hive Metastore instance running in a container backed by a
PostgreSQL-compatible database (all files stored in storage/ folder):
- Run `docker compose -f docker-compose.init.yml up -d`.
  - Note: This configuration should be fault-tolerant, but if any containers or
    processes fail when running this, run
    `docker compose -f docker-compose.init.yml down` and retry.
- After containers `rtdl_catalog-db-init` and `rtdl_catalog-init` exit and complete with
  `EXITED (0)`, kill and delete the rtdl container set by running
  `docker compose -f docker-compose.init.yml down`.
- Run `docker compose up -d` every time after. Run `docker compose down` to stop.
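
The wait for the two init containers can be scripted rather than watched by hand. The following is a minimal sketch, not part of this repo: `wait_for_clean_exit` is a hypothetical helper that polls `docker inspect` until the named container has exited, and the retry count and delay are arbitrary defaults.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not shipped with this repo): wait until a one-shot
# container has exited, and succeed only if its exit code is 0.
# Args: <container-name> [max-tries, default 60] [delay-seconds, default 5]
wait_for_clean_exit() {
  local name="$1" tries="${2:-60}" delay="${3:-5}"
  local i state status code
  for i in $(seq "$tries"); do
    # Prints e.g. "exited 0" or "running 0"; empty if the container is missing.
    state="$(docker inspect -f '{{.State.Status}} {{.State.ExitCode}}' "$name" 2>/dev/null)" || state=""
    status="${state%% *}"
    code="${state##* }"
    if [ "$status" = "exited" ]; then
      if [ "$code" = "0" ]; then
        return 0
      fi
      echo "$name exited with code $code" >&2
      return 1
    fi
    sleep "$delay"
  done
  echo "timed out waiting for $name" >&2
  return 2
}

# Usage, run from the repo's root folder:
#   docker compose -f docker-compose.init.yml up -d
#   wait_for_clean_exit rtdl_catalog-db-init && wait_for_clean_exit rtdl_catalog-init
#   docker compose -f docker-compose.init.yml down
#   docker compose up -d
```

If the helper returns non-zero, run `docker compose -f docker-compose.init.yml down` and retry, as described in the note above.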
Note: To start from scratch, first run the command below from the repo's root folder. It deletes everything persisted in the storage/ folder.
% rm -rf storage/
- This image is not interactive and has no default ENTRYPOINT. You must use the
  `entrypoint` option along with a corresponding shell script and `volumes` to load scripts to execute.
- A PostgreSQL-compatible database is required to run the container. Define your connection
  credentials in `conf/metastore-site.xml` and use the `volumes` option to load your
  `metastore-site.xml` file to `/opt/apache-hive-metastore-{version}-bin/conf/metastore-site.xml`.
- This image opens port `9083`. Use this port to connect to your Hive Metastore instance.
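
The three requirements above (an explicit entrypoint, a mounted `metastore-site.xml`, and port `9083`) can also be met with plain `docker run` instead of Compose. This is a rough sketch under stated assumptions, not the repo's own setup: `run_metastore`, the container name `hive-metastore`, the `./scripts` folder, and `entrypoint.sh` are all hypothetical, and `HIVE_VERSION` must be set by you to match the `{version}` in the image's install path. It also assumes the image provides `/bin/sh` and that your PostgreSQL-compatible database is reachable from the container.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper (not shipped with this repo) around `docker run`.
# Assumes ./conf/metastore-site.xml holds your database credentials and
# ./scripts/entrypoint.sh is a script you wrote to start the metastore.
run_metastore() {
  # Abort with a message if HIVE_VERSION is unset; it is not guessed here.
  : "${HIVE_VERSION:?set HIVE_VERSION to the version in the image install path}"
  docker run -d \
    --name hive-metastore \
    -p 9083:9083 \
    -v "$PWD/conf/metastore-site.xml:/opt/apache-hive-metastore-${HIVE_VERSION}-bin/conf/metastore-site.xml" \
    -v "$PWD/scripts:/scripts" \
    --entrypoint /bin/sh \
    rtdl/hive-metastore /scripts/entrypoint.sh
}
```

The `--entrypoint`, `-v`, and `-p` flags correspond to the `entrypoint`, `volumes`, and `ports` options in a Compose file, so the same values can be carried over to either form.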