Automated Ambari Install with Existing Database

If not specified differently Amabari server will be install together with a Postgres database for it’s meta-data among a MySQL database for the Hive Metastore. A Derby database will also be installed for storing scheduling information of Oozie workflows. This post covers the installation of Ambari server with an existing database. In support of an automated install with Ambari Blueprints configuring the Hive Metastore and Oozie will also be provided.

Databases and Schema

Ambari

Ambari server supports Embedded, Oracle, MySQL, and Postgres databases. Except for two thing to watch out for there is nothing special about the setup of this databases:

  1. MySQL with InnoDB
    Ambari server needs InnoDB, so you can either set the default_storage_engine=innodb in your my.cnf or set the session variable SET storage_engine=InnoDB during creation of the schema.
  2. Oracle SID or Service?
    With Oracle database you can either setup a a reference by system ID (SID) or by Service. In either way pay attention to set the –sidorsname=sid|sname parameter during setup.

Schema information needs to be created prior to the installation of Ambari server. The required DDL with needed DROP And CREATE statements can for example be found under /var/lib/ambari-server/resources/ after installation:

$ ll /var/lib/ambari-server/resources/
total 135036
-rw-r--r-- 1 root root     26066 May 21  2014 Ambari-DDL-MySQL-CREATE.sql
-rw-r--r-- 1 root root      1192 May 21  2014 Ambari-DDL-MySQL-DROP.sql
-rw-r--r-- 1 root root     28097 May 21  2014 Ambari-DDL-Oracle-CREATE.sql
-rw-r--r-- 1 root root      2160 May 21  2014 Ambari-DDL-Oracle-DROP.sql
-rw-r--r-- 1 root root     25963 May 21  2014 Ambari-DDL-Postgres-CREATE.sql
-rw-r--r-- 1 root root      1337 May 21  2014 Ambari-DDL-Postgres-DROP.sql
-rw-r--r-- 1 root root     31588 May 21  2014 Ambari-DDL-Postgres-EMBEDDED-CREATE.sql
-rw-r--r-- 1 root root       853 May 21  2014 Ambari-DDL-Postgres-EMBEDDED-DROP.sql

Hive Metastore

The Hive Metastore supports MySQL, Postgres, Oracle, and MS SQL Server databases. Nothing particular is required during the installation of this databases. Ambari server beginning with stack HDP 2.2 will automatically ensure the schema are created before installation of the Metastore. For any stack prior to that the DDL scripts can be found under var/lib/ambari-server/resources/stacks/HDP/${VERSION}/services/HIVE/etc/.

Oozie

The default database for Oozie is Derby. Other than that Oozie support MySQL, Postgres, and Oracle databases. The schema can be created automatically. Automatic schema creation can be achieved by providing the oozie.service.JPAService.create.db.schema=true parameter.

Installing Ambari

Ambari can best be installed through the package repository for your operating system:

  • RHEL/CentOS/Oracle 5 (Deprecated)

    wget -nv http://public-repo-1.hortonworks.com/ambari/centos5/1.x/updates/1.7.0/ambari.repo -O /etc/yum.repos.d/ambari.repo
  • SLES 11
    wget -nv http://public-repo-1.hortonworks.com/ambari/suse11/1.x/updates/1.7.0/ambari.repo -O /etc/zypp/repos.d/ambari.repo
  • Ubuntu 12
    wget -nv http://public-repo-1.hortonworks.com/ambari/ubuntu12/1.x/updates/1.7.0/ambari.list -O /etc/apt/sources.list.d/ambari.list
    apt-key adv --recv-keys --keyserver keyserver.ubuntu.com B9733A7A07513CAD
    apt-get update

Setup Ambari Server

After installation Amabari server can be configured using setup. For details you can execute $ambari-server setup –help. To configure Ambari with an existing database we can setup Ambari for example like this:

ambari-server setup -s 
--database=embedded|oracle|mysql|postgres
--databasehost=Hostname of database server
--databaseport=Database port
--databasename=Database/Schema/Service name or ServiceID
--databaseusername=Database user login
--databasepassword=Database user password

The configurations are stored under /etc/ambari-server/conf/ambari-server.properties. The password for the database connection is stored under /etc/ambari-server/conf/password.dat and can therefor be protected separately from the other configurations.

Blueprint for Automated Install

During installation Hive and Oozie would also need to be configured for any existing database. A possible Blueprint could look like this:

{
    "Blueprints" : {
        "blueprint_name" : "blueprint",
        "stack_name" : "HDP",
        "stack_version" : "2.2"
    },
    "configurations": [
        {
            "nagios-env": {
                    "nagios_contact": "root@localhost",
                    "nagios_group" : "nagios",
            "nagios_user" : "nagios",
                    "nagios_web_login" : "nagiosadmin",
                    "nagios_web_password" : "admin"
            }
        },{
            "hive-site": {
                "javax.jdo.option.ConnectionDriverName" : "oracle.jdbc.driver.OracleDriver",
                "javax.jdo.option.ConnectionPassword" : "pw",
                "javax.jdo.option.ConnectionURL" : "jdbc:oracle:thin:@host:1521/HIVE.SID",
                "javax.jdo.option.ConnectionUserName" : "hive"
        }
        },{
            "hive-env": {
                "hive_ambari_database" : "Oracle",
                "hive_database" : "Existing Oracle Database",
                "hive_database_type" : "oracle"
        }
        },{
            "oozie-site": {
                "oozie.db.schema.name" : "oozie",
                "oozie.service.JPAService.create.db.schema" : "true",
                "oozie.service.JPAService.jdbc.driver" : "com.mysql.jdbc.Driver",
                "oozie.service.JPAService.jdbc.username" : "oozie",
                "oozie.service.JPAService.jdbc.password" : "pw",
                "oozie.service.JPAService.jdbc.url" : "jdbc:mysql://host:3306/oozie"
            }
        },{
            "oozie-env": {
                "oozie_database" : "Existing MySQL Database",
                "oozie_existing_oracle_database" : "MySQL",
                "oozie_existing_oracle_host" : "db_host"
            }
        }
    ],
    "host_groups" : [{ 
        "name" : "001",
        "cardinality" : "1",
        "components" : [...]
    }] 
}

Supported Database Matrix (Source)

For completeness a reference matrix of supported databases:

PostgreSQL 8.x PostgreSQL 9.x MySQL 5.x Oracle 11gr2 Other
Hive / HCatalog Supported. For instructions on configuring this database for Hive metastore, see Instructions for Manual Install. Supported. For instructions on configuring this database for Hive metastore, see Instructions for Manual Install . Default. For instructions on configuring this database for Hive metastore, either see Instructions for Manual Install or see Using Ambari with MySQL. Default. For instructions on configuring this database for Hive metastore, either see Instructions for Manual Install or see Using Ambari with Oracle.
Oozie Supported. For instructions on configuring this database for Oozie metastore, see Instructions for Manual Install. Supported. For instructions on configuring this database for Oozie metastore, see Instructions for Manual Install. Supported. For instructions on configuring this database for Oozie metastore, either see Instructions for Manual Install or see Using Oozie with MySQL. Supported. For instructions on configuring this database for Oozie metastore, either see Instructions for Manual Install or see Instructions for Oozie. Derby (default).
Hue [a] Supported. For instructions on configuring this database for Hue, see Instructions for Manual Install. Supported. For instructions on configuring this database for Hue, see Instructions for Manual Install . Supported. For instructions on configuring this database for Hue, see Instructions for Manual Install. Supported. For instructions on configuring this database for Hue, see Instructions for Manual Install. SQLite (default)
Ambari [b] Default. For more information, see Database Requirements. Supported. For more information, see Using Ambari with PostgreSQL. Supported. For more information, see Using Ambari with MySQL. Supported. For more information, see Using Ambari with Oracle.

[a] Hue does not currently support Ubuntu 12.

[b] Ambari does not currently support Ubuntu 12.

Further Readings

Advertisement

4 thoughts on “Automated Ambari Install with Existing Database

  1. Hi

    Just wondering if you have any idea on how to setup HIVE and Oozie on MS SQL Server 2008 R2. So how can I setup Ambari server to show option for connecting to an existing MS SQL 2008 R2 during the customize services option.

    Secondly if I setup a MySQL server on the same machine running Ambari Sever will that be fine for production use.

    I would appreciate your help.

    Thanks
    Musa

    Like

    1. Hi, thanks for your comment. I have no experience with MS SQL in Hadoop and unfortunately it also seems not to be supported.

      Running Ambari + DB on the same node for production sounds fine to me. Often the question is rather if you have Ambari installed on one of the master nodes in your cluster, because at some point it can make sense to have a separate managing/monitoring node in the cluster that only has Ambari Server, Ambari Metrics, Ganglia Server, Nagios Server, etc. installed without Hadoop components.

      Like

Leave a Reply to jonbros Cancel reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s