Kerberos Ambari Blueprint Installs

Apache Ambari rapidly improves support for secure installations and managing security in Hadoop. Already now it is fairly convenient to create kerberized clusters in a snap with automated procedures or the Ambari wizard.

With the latest release of Ambari kerberos setups get baked into blueprint installations making separate methods like API calls unnecessary. In this post I would like to briefly discuss the new option in Ambari to use pure Blueprint installs for secure cluster setups. Additionally explaining some of the prerequisites for a sandbox demo like install. Continue reading “Kerberos Ambari Blueprint Installs”

Automated Kerberos Install for HDP w/ Ambari + Puppet

With the release of Ambari 2.x kerberizing a HDP install improved quite a bit. Looking back at Kerberized Hadoop Cluster – A Sandbox Example compared to today most of the there described steps are much easier by now and can be automated. For long I was looking to include it into my existing Vagrant project for an end to end setup of a kerberized cluster. With the writing of this post I finally had the opportunity to do so.

In this post I would like to describe the parts added to the Vagrant setting needed to accomplish an end to end setup of a kerberized HDP cluster. Before the final step of the cluster setup by using the Ambari REST API, a KDC with credentials needs to be created. A Puppet module was created and included to achieve the installation of a MIT Kerberos install. Continue reading “Automated Kerberos Install for HDP w/ Ambari + Puppet”

Automated Blueprint Install with Ambari Shell

Ambari Shell is an interactive command line tool to administrate Ambari manged HDP clusters. It supports all available functionality provided by the UI of the Ambari web application. Written as a Java application based on a Groovy REST client it further provides tab completion and a context aware commands. In a previous post we already discussed various contexts like service and state will using REST calls to alter them. Ambari Shell is a convenient tool for managing most of the complex aspects discussed there.

With that it can also be used for automated cluster installs based on Ambari Blueprints. While it is fairly simple to use two curl request to do a blueprint based install, Ambari Shell gives the advantage of monitoring the process. In scripted setups and with the use of provisioning tools like Puppet, Chef, or Ansible it gives the possibility to time setup steps after a complete cluster install. Executing a cluster install with --exitOnFinish true will halt the execution of the script until the install has finished.

An example of this is used as part of this Dockerfile where a parameterized script install_cluster.sh. The below example is being used as part of a Puppet install triggered with Vagrant:

Further Readings

Install HDP with Red Hat Satellite

As part of the installation of HDP with Ambari two repositories get generated with the URLs defined as user input during the first steps of the install wizard and distributed to the cluster hosts. In cases where you are using Red Hat Satellite to manage your Linux infrastructure, you need to disable the repositories defined to leverage Red Hat Satellite. The same is also true for SUSE’s Manager (Spacewalk).

Prior to the install and prior to starting the Ambari server you need to disable the repositories by altering the template responsible for generating them.

Prior to Ambari 2.x you would need to change  repo_suse_rhel.j2 template to disable the generated repositories. In that file simply change the enabled=1 to enabled=0. To find the template file do $ find /var/lib/ambari-server -name repo_suse_rhel.j2 .

Starting with Ambari 2.x the configuration for the repositories can be found in the cluster-evn.xml under /var/lib/ambari-server/resources/stacks/HDP/2.0.6/configuration. Also here change the value of enbaled to . In that file look for the <name>repo_suse_rhel_template</name> .

Save the changes and start you install. Continue reading “Install HDP with Red Hat Satellite”

Install Apache Zepplin via REST & Ambari

The Ambari server offers a comprehensive REST API to install a complete HDP cluster or manage all parts of it. In this post we are going to explore the possibilities of installing a new service. With Ambari it is fairly easy to define custom services for management. With YARN being a general purpose execution engine Ambari can be seen as the general purpose management service. For this example we are using the Apache Zeppelin service provided here. A more general documentation of how to install a new service to an existing cluster can be found here. Continue reading “Install Apache Zepplin via REST & Ambari”

Services and State with Ambari REST API

The Ambari management tool for Hadoop offers among other handy tools a comprehensive REST API for cluster administration. Logically a cluster is divided into hosts, services and service components. While the UI might not always has support for all needed scenarios sure the REST API can be used to achieve it. For example moving a master component of a service from one host to another.

In this post we are going to look a little closer at the way the Ambari API can be used to manage Hadoop services. At the end of this post you will find a list of all the currently supported Hadoop services with all the needed master, slave and client components that can be manged and administrated within your HDP stack. Also this posts contains the possible states and state transitions a component might have which could become useful when facing problems like Host config is in invalid state. Continue reading “Services and State with Ambari REST API”

Ambari Augeas Module and Configuration

Operating large to small size clusters needs automation; as well does installation. Going repeatedly through installation instructions, typing the same commands over and over again is really no fun. Worse it’s a waste of resources. The same holds true for keeping the configuration of 10 or 100 nodes in sync. A first step is to use a distributed shell and the next to turn to tools like Puppet, Cobblerd, Ansible, or Chef, which promise full automation of cluster operations.

As with most automation processes this tools tend to make the common parts easy and the not so common steps hard. The fact that every setup is different requires concepts of  adaptability that are easy to use. One of this tools for adapting configuration files could be seen in Augeas. Although hard to understand at time Augeas is easy to use, as it turns arbitrary configuration files into trees on which typical tree operations can be performed.

This post highlights the possibilities to us Augeas in a Puppet setup to install and configure a HDP Ambari cluster. Ambari agents are part of every node in a cluster, so when adding new nodes to a cluster in an automated fashion you would want to be prepared. Continue reading “Ambari Augeas Module and Configuration”

Automated Ambari Install with Existing Database

If not specified differently Amabari server will be install together with a Postgres database for it’s meta-data among a MySQL database for the Hive Metastore. A Derby database will also be installed for storing scheduling information of Oozie workflows. This post covers the installation of Ambari server with an existing database. In support of an automated install with Ambari Blueprints configuring the Hive Metastore and Oozie will also be provided. Continue reading “Automated Ambari Install with Existing Database”