Automated Kerberos Install for HDP w/ Ambari + Puppet

With the release of Ambari 2.x, kerberizing a HDP install has improved quite a bit. Compared to Kerberized Hadoop Cluster – A Sandbox Example, most of the steps described there are now much easier and can be automated. For a long time I had wanted to include this in my existing Vagrant project for an end-to-end setup of a kerberized cluster. Writing this post finally gave me the opportunity to do so.

In this post I would like to describe the parts added to the Vagrant setup that are needed for an end-to-end setup of a kerberized HDP cluster. Before the final step, setting up the cluster through the Ambari REST API, a KDC with credentials needs to be created. A Puppet module was created and included to install MIT Kerberos.

Azure VNet for Your HDP Cluster

In a series of blog posts I demonstrated how to create a custom OS image for automatic provisioning of HDP with Vagrant on the Azure cloud. On GitHub I share the result of a first layout among other provisioning scripts. Until now the setup lacked the proper network configuration required for the individual components of the cluster to communicate.

With the new release of the vagrant-azure plugin it will be possible to set up the cluster in a dedicated VNet. This should have been the last missing piece in the series of work I published to allow the automated provisioning of HDP in Azure. Unfortunately that is not quite true, as the current Azure Ruby SDK does not allow IP addresses to be passed to the machines. We therefore currently have to create the host entries by hand. It might be possible to set up DNS or to use Puppet to handle the host mapping in an automated fashion, but I was not able to do so as part of this work.
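The manual host mapping mentioned above could be scripted roughly as follows. This is only a minimal sketch, assuming the private IP address of each machine is already known. The function name, host names, domain, and addresses are made up for illustration and are not taken from the project.

```python
def render_hosts_entries(hosts, domain="example.com"):
    """Return /etc/hosts lines for a mapping of short host name -> IP address.

    Hypothetical helper: names, domain, and addresses are illustrative only.
    """
    lines = []
    for name in sorted(hosts):
        fqdn = f"{name}.{domain}"
        # Format: <ip> <fully qualified name> <short alias>
        lines.append(f"{hosts[name]}\t{fqdn}\t{name}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Assumed addresses inside the VNet subnet
    cluster = {"one": "10.0.0.4", "two": "10.0.0.5", "three": "10.0.0.6"}
    print(render_hosts_entries(cluster))
```

The printed lines would still have to be appended to /etc/hosts on every node, for example by a shell provisioner.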

HDP Ansible Playbook Example

I always try to extend my existing collection of automated install scripts for HDP with further examples of different provisioners, providers, and settings. Recently I added hdp22-n1-centos6-ansible, an example Ansible environment for preparing a HDP 2.2 installation on one node.

Ansible differs from other provisioners like Puppet or Chef through its simplified approach, which relies on SSH. It behaves almost like a distributed shell and places few requirements on existing hosts. Where Puppet, for example, makes strong assumptions about the current state of a system with one or multiple nodes, Ansible more or less reflects a collection of tasks a system has gone through to reach its current state. While some celebrate Ansible for its simplicity, others abandon it for its lack of strong integrity.

In this post I would like to share a sample Ansible Playbook to prepare a HDP 2.2 Ambari installation using Vagrant with VirtualBox. You can download and view the example discussed in this post here.

Creating a HDP Ready CentOS Image for Azure

There are already some collections of VM images in the Windows Azure gallery. Additionally, the community can share custom provisioned images on VM Depot. Looking for a HDP ready image based on CentOS, I could not find one suitable for my needs. In this post I would like to describe how I created my first HDP ready image based on CentOS for Windows Azure. The image will be created within Azure itself, as I don’t have access to Hyper-V and all attempts to create a CentOS based VHD with VirtualBox failed for unknown reasons.

Provisioning a Cluster Using Vagrant and OpenStack

Vagrant has become very popular for provisioning virtual machines for development. Usually it’s used in combination with VirtualBox on a local machine. But Vagrant supports multiple other virtualization providers; in fact, one can build a custom provider as needed. If the local machine is not sufficient for the needs of development, moving to the cloud using AWS, Rackspace, or OpenStack seems like a reasonable thing to do.

Provisioning a HDP Dev Cluster with Vagrant

Setting up a production or development Hadoop cluster used to be much more tedious than it is today with tools like Puppet, Chef, and Vagrant. Additionally, the Hadoop community has kept busy investing in ease of deployment, listening to the demands of experienced system administrators. The latest of these investments is Ambari Blueprints.

With Ambari Blueprints, DevOps teams can configure an automated setup of the individual components on each node across a cluster. The same blueprint can then be reused to replicate the setup on different clusters for development, integration, or production.
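To give a rough idea of what such a blueprint looks like, here is a minimal sketch that builds the JSON document in Python. The helper function, the blueprint name, and the host group names are my own illustrative assumptions; the component names follow the usual Ambari naming, but the grouping is not the layout used in the post.

```python
import json


def build_blueprint(name, stack_version, host_groups):
    """Build a blueprint document from a mapping of host group -> component names.

    Hypothetical helper for illustration; every group gets cardinality "1".
    """
    return {
        "Blueprints": {
            "blueprint_name": name,
            "stack_name": "HDP",
            "stack_version": stack_version,
        },
        "host_groups": [
            {
                "name": group,
                "cardinality": "1",
                "components": [{"name": component} for component in components],
            }
            for group, components in host_groups.items()
        ],
    }


if __name__ == "__main__":
    # Assumed example layout: one master group and one worker group
    blueprint = build_blueprint(
        "hdp-dev",
        "2.1",
        {
            "master": ["NAMENODE", "RESOURCEMANAGER", "ZOOKEEPER_SERVER"],
            "worker": ["DATANODE", "NODEMANAGER"],
        },
    )
    print(json.dumps(blueprint, indent=2))
```

As far as I know, a document of this shape is registered with Ambari through its blueprints REST resource, and a second document then maps the host groups to actual hosts when the cluster is created.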

In this post we are going to set up a three-node HDP 2.1 cluster for development on a local machine by using Vagrant and Ambari.
Most of what will be presented here builds upon previous work published by various authors, who are referenced at the end of this post.