Kerberos Ambari Blueprint Installs

Apache Ambari is rapidly improving its support for secure installations and for managing security in Hadoop. It is already fairly convenient to create kerberized clusters with automated procedures or the Ambari wizard.

With the latest release of Ambari, Kerberos setups get baked into Blueprint installations, making separate steps such as additional API calls unnecessary. In this post I would like to briefly discuss the new option in Ambari to use pure Blueprint installs for secure cluster setups, and to explain some of the prerequisites for a sandbox-like demo install.

Prerequisites

Making this work from scratch requires a few prerequisites. We need an existing Kerberos KDC or Active Directory for our realm. In addition, to be able to create principals and keytabs we need an administrative user with the right permissions.

Here this principal is going to be hdp/admin for our realm HDP.CORP. We configure the correct ACLs during KDC creation to allow any principal with the instance /admin to administer the realm.

Due to import regulations, the Oracle JDK distribution does not ship with the policy files Java needs to support strong encryption. With Oracle JDK these need to be installed separately; with OpenJDK they are already included. Here we use OpenJDK.

I. A working KDC/AD

You will need a Microsoft Active Directory or an MIT Kerberos KDC (Key Distribution Center) to work with. Setting up a working KDC on a Unix system can be achieved with a few simple steps:

Create the Kerberos database with the password hadoop:
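As a rough sketch, this step could be scripted as shown here. It assumes the MIT Kerberos server packages are already installed on the KDC host; the helper names and the dry-run default are my own illustrative choices, not part of the original setup.

```python
import subprocess

REALM = "HDP.CORP"          # realm used throughout this post
MASTER_PASSWORD = "hadoop"  # demo password from this post, not for production


def kdb5_create_command(realm, master_password):
    """Build the kdb5_util call that creates the Kerberos database.

    -r selects the realm, -P supplies the master password without a prompt,
    and "create -s" creates the database together with a stash file.
    """
    return ["kdb5_util", "-r", realm, "-P", master_password, "create", "-s"]


def create_kdc_database(dry_run=True):
    """Print the command and only execute it when dry_run is False."""
    command = kdb5_create_command(REALM, MASTER_PASSWORD)
    print(" ".join(command))
    if not dry_run:
        # Needs root privileges on the KDC host.
        subprocess.run(command, check=True)
    return command


if __name__ == "__main__":
    create_kdc_database(dry_run=True)
```

Keeping the dry run as the default means the script can be inspected on any machine before it is pointed at a real KDC host.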

If you are using a virtual environment you might not have enough entropy for the random generation of the encryption key of the Kerberos database. In that case use the rngd service from rng-tools, which helps to create enough entropy. In some cases you might need to add extra options in /etc/sysconfig/rngd before starting the rngd service.

II. Admin Principal

Ambari needs an admin principal that is allowed to create other principals and keytabs. Here we are using hdp/admin@HDP.CORP as the admin principal.

We configured the ACLs of the KDC to allow any principal with the admin instance to have admin rights.
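To make the ACL and the admin principal concrete, here is a minimal sketch. It only renders the kadm5.acl entry and the kadmin.local command line rather than running anything; the location of kadm5.acl depends on your distribution, and the password shown is a placeholder I made up for illustration.

```python
REALM = "HDP.CORP"
ADMIN_PRINCIPAL = "hdp/admin"
ADMIN_PASSWORD = "changeme"  # hypothetical placeholder, pick your own


def admin_acl_line(realm):
    """kadm5.acl entry granting all privileges to every */admin principal."""
    return "*/admin@{0}\t*".format(realm)


def add_admin_principal_command(principal, password):
    """kadmin.local call that creates the principal without prompting."""
    query = "addprinc -pw {0} {1}".format(password, principal)
    return ["kadmin.local", "-q", query]


if __name__ == "__main__":
    print(admin_acl_line(REALM))
    print(add_admin_principal_command(ADMIN_PRINCIPAL, ADMIN_PASSWORD))
```

The wildcard entry is what lets hdp/admin, and any other principal with the /admin instance, create the service principals and keytabs Ambari asks for.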

III. A Word about JCE

As mentioned earlier, import regulations (or some other reason) prevent Oracle from shipping the security policies Java requires to support strong encryption types, which we need for Kerberos to work. You have to make sure your setup has them installed.

Here we are going to install OpenJDK which already ships with the required JCE policies.

IV. Install and Configure Ambari Server & Agents

Ambari server needs to be up and running with all Ambari agents registered. Registering an Ambari agent manually with the cluster is fairly easy. Setting the Ambari server host name in ambari-agent.ini is generally enough to register the agent with the server.
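The agent configuration change can be sketched like this. I am assuming the usual layout of the agent configuration, with a [server] section holding a hostname key; the sample content and the host name ambari.example.com are made up for illustration.

```python
import configparser
import io

# Hypothetical excerpt of an agent configuration file.
SAMPLE_INI = """[server]
hostname=localhost
url_port=8440
secured_url_port=8441
"""


def set_server_hostname(ini_text, server_host):
    """Return the ini text with [server] hostname pointing at server_host."""
    # RawConfigParser avoids '%' interpolation surprises in other values.
    parser = configparser.RawConfigParser()
    parser.read_string(ini_text)
    if not parser.has_section("server"):
        parser.add_section("server")
    parser.set("server", "hostname", server_host)
    out = io.StringIO()
    parser.write(out)
    return out.getvalue()


if __name__ == "__main__":
    print(set_server_hostname(SAMPLE_INI, "ambari.example.com"))
```

After writing the result back to the agent's configuration file, the agent needs a restart before it tries to register with the server.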

Kerberos Blueprint Install

An Ambari Blueprint install consists of two steps. In the first step a cluster description is created that can be used as a blueprint to install multiple clusters. The Blueprint holds the relevant components, their configuration, and how they are distributed across different hosts (host_groups).

In the second step a cluster is created by referencing an existing blueprint and mapping existing hosts to the host groups of the blueprint, for example host node01.example.com -> host_group: worker-node. This can also happen in a declarative way, by saying that all registered hosts with a CPU count of 4 should belong to host_group: master-node.
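Both mapping styles can be sketched as plain dictionaries that serialize to the JSON Ambari expects. The blueprint name hdp-kerberos and the host count are assumptions for illustration; the host_predicate uses the Hosts/cpu_count property to express the CPU rule.

```python
import json


def explicit_mapping(blueprint, assignments):
    """Map concrete hosts to host groups.

    assignments: dict of host group name -> list of fully qualified host names.
    """
    return {
        "blueprint": blueprint,
        "host_groups": [
            {"name": group, "hosts": [{"fqdn": fqdn} for fqdn in fqdns]}
            for group, fqdns in assignments.items()
        ],
    }


def predicate_mapping(blueprint, group, host_count, predicate):
    """Let Ambari pick host_count registered hosts that match the predicate."""
    return {
        "blueprint": blueprint,
        "host_groups": [
            {"name": group, "host_count": host_count, "host_predicate": predicate}
        ],
    }


if __name__ == "__main__":
    explicit = explicit_mapping("hdp-kerberos", {"worker-node": ["node01.example.com"]})
    declarative = predicate_mapping("hdp-kerberos", "master-node", 1, "Hosts/cpu_count=4")
    print(json.dumps(explicit, indent=2))
    print(json.dumps(declarative, indent=2))
```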

The cluster creation script and the Blueprint are described in JSON format and sent as the payload of REST API calls to the Ambari server. Commonly these are stored in files named blueprint.json and hostmapping.json. They can then be sent to the Ambari server as the payloads of two POST requests.

The first request creates the blueprint description of the cluster layout. The second call creates a cluster based on the blueprint named in the cluster install script (hostmapping.json).
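A minimal sketch of the two requests, using only the Python standard library. The server URL, the blueprint and cluster names, and the admin/admin login are assumptions for a demo setup. The requests are only built here; passing them to urllib.request.urlopen would actually send them.

```python
import base64
import json
import urllib.request


def ambari_post(base_url, path, payload, user, password):
    """Build (but do not send) a POST request against the Ambari REST API."""
    credentials = "{0}:{1}".format(user, password).encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    return urllib.request.Request(
        url=base_url.rstrip("/") + path,
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": "Basic " + token,
            # Ambari rejects modifying requests without this header.
            "X-Requested-By": "ambari",
        },
    )


def blueprint_request(base_url, name, blueprint, user, password):
    """First call: register the blueprint under the given name."""
    return ambari_post(base_url, "/api/v1/blueprints/" + name, blueprint, user, password)


def cluster_request(base_url, name, hostmapping, user, password):
    """Second call: create a cluster from the host mapping."""
    return ambari_post(base_url, "/api/v1/clusters/" + name, hostmapping, user, password)


if __name__ == "__main__":
    base = "http://ambari.example.com:8080"  # hypothetical server address
    first = blueprint_request(base, "hdp-kerberos", {"host_groups": []}, "admin", "admin")
    second = cluster_request(base, "hdp", {"blueprint": "hdp-kerberos"}, "admin", "admin")
    print(first.get_method(), first.full_url)
    print(second.get_method(), second.full_url)
```

In practice the payloads would be loaded from blueprint.json and hostmapping.json with json.load instead of being written inline.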

Kerberos Blueprint

For a kerberized install the following parameters need to be set in the configuration section kerberos-env of the cluster blueprint:
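As a hedged sketch, a kerberos-env section for an MIT KDC might be assembled like this. The property names realm, kdc_type, kdc_host and admin_server_host reflect my understanding of the kerberos-env configuration at the time of writing and can differ between Ambari versions, and the host name kdc.example.com is made up.

```python
import json


def kerberos_env(realm, kdc_host, admin_server_host, kdc_type="mit-kdc"):
    """Build the kerberos-env entry for the blueprint's configurations list."""
    return {
        "kerberos-env": {
            "properties": {
                "realm": realm,
                "kdc_type": kdc_type,  # "mit-kdc" here, since we run an MIT KDC
                "kdc_host": kdc_host,
                "admin_server_host": admin_server_host,
            }
        }
    }


if __name__ == "__main__":
    # Assumption: KDC and kadmin server run on the same (hypothetical) host.
    section = kerberos_env("HDP.CORP", "kdc.example.com", "kdc.example.com")
    print(json.dumps({"configurations": [section]}, indent=2))
```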

These are the basic parameters needed for a sample install. See the kerberos-env documentation for an exhaustive list of possible parameters and their meaning.

During cluster creation with the host-mapping file, a security and a credential configuration are required for the installation to succeed:
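A sketch of how these two blocks could be added to the host mapping. The alias kdc.admin.credential is the name Ambari uses for the KDC admin credential; the blueprint name, the host and the password are placeholders chosen for illustration.

```python
import json


def secure_cluster_template(blueprint, host_groups, principal, password,
                            credential_type="TEMPORARY"):
    """Host mapping with Kerberos security and the KDC admin credential."""
    return {
        "blueprint": blueprint,
        "host_groups": host_groups,
        "credentials": [
            {
                "alias": "kdc.admin.credential",
                "principal": principal,
                "key": password,          # password of the admin principal
                "type": credential_type,  # how Ambari stores the credential
            }
        ],
        "security": {"type": "KERBEROS"},
    }


if __name__ == "__main__":
    groups = [{"name": "worker-node", "hosts": [{"fqdn": "node01.example.com"}]}]
    template = secure_cluster_template("hdp-kerberos", groups, "hdp/admin", "changeme")
    print(json.dumps(template, indent=2))
```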

Here the principal is our previously created KDC admin principal, with its password held in the key parameter. The password is only stored temporarily, which is configured by the type TEMPORARY. The credentials can also be persisted with the type PERSISTED, but Ambari should then also be configured for encrypted password storage.

Sample Blueprint:

Sample Host Mapping:
