Building HDP on Docker

Docker is a great tool that automates the deployment of software in containers on Linux. While the fundamental idea behind Docker is to stack specialized pieces of software together to form a complex system, there is no particular rule for how big or small the software in a container can or should be. The complete HDP stack can run in a single container, just as each HDP service can run in its own container.

Docker allows you to run applications inside containers. Running an application inside a container takes a single command: docker run. Containers are based on images, which define software packages and configurations. hkropp/hdp-basic is such an image, with the HDP services running inside it. The image was built using Ambari Blueprints, orchestrated by a Dockerfile. The hostname was set to n1.hdp throughout the build process and therefore also needs to be specified when running the image. The Dockerfile for this image is located here. This post describes how to build HDP on top of Docker.

Prerequisite Setup

Before getting started, a Docker environment needs to be installed. A quick way to get one is Boot2Docker, a VirtualBox image based on Tiny Core Linux with Docker installed. It can be used with Mac OS X or Windows. Other ways to install Docker can be found here.

Boot2Docker

Once installed, Boot2Docker is controlled through the boot2docker command line tool. With it we can initialize the VM, boot it up, and prepare our shell for docker.
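The three steps (initialize, boot, prepare the shell) can be sketched as a small helper. This is a minimal sketch that assumes the boot2docker CLI is installed and on the PATH; the function name start_boot2docker is made up for illustration.

```shell
# Sketch: bring up the Boot2Docker VM and point the docker client at it.
# Assumes the boot2docker command line tool is installed and on the PATH.
start_boot2docker() {
    # Create the VM (only has an effect the first time).
    boot2docker init &&
    # Boot the VM.
    boot2docker up &&
    # shellinit prints export statements (DOCKER_HOST and friends);
    # eval applies them to the current shell.
    eval "$(boot2docker shellinit)"
}
```

Calling start_boot2docker in an interactive shell and then running docker version is a quick way to check that the client can reach the VM.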

Running hdp-basic

With the Docker environment set up, the image can be started with a single docker run command.

If the image is not already available locally, this fetches it from Docker Hub. The image is then run in daemon mode, as the -d flag indicates. The -p flag tells Docker to publish a container port to the host VM, in this case port 8080 for Ambari. Ambari can then be reached at the address returned by boot2docker ip on port 8080: http://$(boot2docker ip):8080. The hostname is set to n1.hdp because the image was configured with this hostname. Executing the /start-server script at boot time starts the Ambari server together with all installed services.
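Putting the described flags together, the invocation would look roughly like the following. This is a reconstruction from the description above, not a verbatim copy of the original command: the 8080:8080 port mapping and the helper names run_hdp_basic and ambari_url are assumptions.

```shell
# Sketch of the run command, reconstructed from the flags described above.
run_hdp_basic() {
    # -d          run detached (daemon mode)
    # -p 8080:8080  publish Ambari's port to the host VM (mapping is assumed)
    # -h n1.hdp   the hostname the image was built with
    # /start-server is the startup script inside the image
    docker run -d -p 8080:8080 -h n1.hdp hkropp/hdp-basic /start-server
}

# Print the URL under which Ambari should be reachable from the host.
ambari_url() {
    echo "http://$(boot2docker ip):8080"
}
```

After run_hdp_basic returns the container ID, opening the address printed by ambari_url in a browser should show the Ambari login page once the server has finished starting.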

The Dockerfile

Building this image was achieved using this Dockerfile, while the installation of HDP was done using Ambari Shell with Blueprints. A helpful feature of Ambari Shell is that a blueprint install can be executed in blocking mode, halting further processing until the install has finished (--exitOnFinish true). The install-cluster.sh script relies on this.
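To illustrate the blocking install, here is a hedged sketch that generates a command list for Ambari Shell. It is not the original install-cluster.sh. Apart from --exitOnFinish true, which the text above names, the Ambari Shell commands shown (blueprint add, cluster build, cluster autoAssign, cluster create) are given from memory of the Ambari Shell documentation and should be treated as assumptions; the function name, blueprint path, and blueprint name are hypothetical.

```shell
# Sketch: emit the Ambari Shell commands for a blocking blueprint install.
# $1: path to the blueprint JSON (hypothetical), $2: blueprint name (hypothetical)
write_install_commands() {
    blueprint_file="$1"
    blueprint_name="$2"
    cat <<EOF
blueprint add --file ${blueprint_file}
cluster build --blueprint ${blueprint_name}
cluster autoAssign
cluster create --exitOnFinish true
EOF
}
```

The last command is the relevant one: with --exitOnFinish true, Ambari Shell only returns once the install has completed, so a Docker build step that runs it does not finish before the cluster is installed.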

The image is based on a centos:6.6 image. Throughout the build a consistent hostname is used for configuration and installation. This is actually not easy to achieve with Docker builds. By design, Docker tries to keep the context a container can run in as unrestrictive as possible, and assigning a fixed hostname to an image restricts that context. In addition, every build step creates a new image with a new hostname, and setting the hostname before each step requires root privileges, which are not available. To work around this, the ENV instruction was used to set HOSTNAME, and before any command that required the hostname a script was executed to add it to the /etc/hosts file so that it could be resolved.

Part of the Dockerfile:

Part of the set_host.sh:
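As a rough sketch of the idea (not the original script), a helper that makes a fixed hostname resolvable could look like this. The function name set_host, its arguments, and the example IP are illustrative assumptions; how the original script determined the IP address is not shown here.

```shell
# Sketch: make a fixed hostname resolvable by adding it to a hosts file.
# $1: hostname, $2: IP address, $3: hosts file (defaults to /etc/hosts)
set_host() {
    name="$1"
    ip="$2"
    hosts_file="${3:-/etc/hosts}"
    # Only append when there is no entry for this name yet,
    # so repeated calls across build steps do not pile up duplicates.
    if ! grep -qwF -- "$name" "$hosts_file" 2>/dev/null; then
        printf '%s\t%s\n' "$ip" "$name" >> "$hosts_file"
    fi
}
```

With HOSTNAME set through ENV, a build step that needs the name would first call something like set_host "$HOSTNAME" 172.17.0.2 (the IP is an example value) and then run the actual command in the same step, since each step starts from a fresh container.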

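The Ambari agent supports dynamic hostname configuration through a script: when a hostname_script is configured in the agent's ambari-agent.ini, the agent reports whatever that script prints as its hostname.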

Dockerfile:

hostname.sh:
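The original listings are not reproduced here; the following is a minimal sketch of the mechanism. The hostname_script key in the [agent] section of ambari-agent.ini is the agent's documented setting; the function names and file locations are assumptions for illustration.

```shell
# Sketch: create a script that prints a fixed hostname.
# $1: where to write the script, $2: hostname to report
write_hostname_script() {
    script_path="$1"
    name="$2"
    printf '#!/bin/sh\necho %s\n' "$name" > "$script_path"
    chmod +x "$script_path"
}

# Sketch: register the script under the [agent] section of ambari-agent.ini.
# $1: path to ambari-agent.ini, $2: path to the hostname script
register_hostname_script() {
    ini_file="$1"
    script_path="$2"
    # Copy every line and add the hostname_script entry right after [agent].
    awk -v line="hostname_script=${script_path}" '
        { print }
        /^\[agent\]/ { print line }
    ' "$ini_file" > "${ini_file}.tmp" && mv "${ini_file}.tmp" "$ini_file"
}
```

In a build this would amount to one call of each function, for example with /etc/ambari-agent/conf/ambari-agent.ini as the ini file and n1.hdp as the hostname, before the agent is started for the first time.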

Starting HDP

start-server is the script that is executed when the container starts. It starts the Ambari server and agent. Ambari Shell is then used again to start all installed HDP services.
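The first half of such a startup script can be sketched as follows. This is an assumption-laden outline rather than the original start-server: it only covers the server and agent, and the function name start_server is made up. The step that starts the HDP services through Ambari Shell is left out because the original listing is not available here.

```shell
# Sketch: start the Ambari server, then the agent.
# Assumes ambari-server and ambari-agent are installed in the image.
start_server() {
    # The agent registers with the server, so the server goes first.
    ambari-server start &&
    ambari-agent start
    # The original script then uses Ambari Shell to start all installed
    # HDP services; that part is intentionally omitted from this sketch.
}
```

Because the container was launched with /start-server as its command, a real script would also need to keep running after these daemons are up, since the container stops once its main process exits.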
