If not specified differently Amabari server will be install together with a Postgres database for it’s meta-data among a MySQL database for the Hive Metastore. A Derby database will also be installed for storing scheduling information of Oozie workflows. This post covers the installation of Ambari server with an existing database. In support of an automated install with Ambari Blueprints configuring the Hive Metastore and Oozie will also be provided.
Databases and Schema
Ambari
Ambari server supports Embedded, Oracle, MySQL, and Postgres databases. Except for two thing to watch out for there is nothing special about the setup of this databases:
- MySQL with InnoDB
Ambari server needs InnoDB, so you can either set the default_storage_engine=innodb in your my.cnf or set the session variable SET storage_engine=InnoDB during creation of the schema. - Oracle SID or Service?
With Oracle database you can either setup a a reference by system ID (SID) or by Service. In either way pay attention to set the –sidorsname=sid|sname parameter during setup.
Schema information needs to be created prior to the installation of Ambari server. The required DDL with needed DROP And CREATE statements can for example be found under /var/lib/ambari-server/resources/ after installation:
$ ll /var/lib/ambari-server/resources/ total 135036 -rw-r--r-- 1 root root 26066 May 21 2014 Ambari-DDL-MySQL-CREATE.sql -rw-r--r-- 1 root root 1192 May 21 2014 Ambari-DDL-MySQL-DROP.sql -rw-r--r-- 1 root root 28097 May 21 2014 Ambari-DDL-Oracle-CREATE.sql -rw-r--r-- 1 root root 2160 May 21 2014 Ambari-DDL-Oracle-DROP.sql -rw-r--r-- 1 root root 25963 May 21 2014 Ambari-DDL-Postgres-CREATE.sql -rw-r--r-- 1 root root 1337 May 21 2014 Ambari-DDL-Postgres-DROP.sql -rw-r--r-- 1 root root 31588 May 21 2014 Ambari-DDL-Postgres-EMBEDDED-CREATE.sql -rw-r--r-- 1 root root 853 May 21 2014 Ambari-DDL-Postgres-EMBEDDED-DROP.sql
Hive Metastore
The Hive Metastore supports MySQL, Postgres, Oracle, and MS SQL Server databases. Nothing particular is required during the installation of this databases. Ambari server beginning with stack HDP 2.2 will automatically ensure the schema are created before installation of the Metastore. For any stack prior to that the DDL scripts can be found under var/lib/ambari-server/resources/stacks/HDP/${VERSION}/services/HIVE/etc/.Oozie
The default database for Oozie is Derby. Other than that Oozie support MySQL, Postgres, and Oracle databases. The schema can be created automatically. Automatic schema creation can be achieved by providing the oozie.service.JPAService.create.db.schema=true parameter.
Installing Ambari
Ambari can best be installed through the package repository for your operating system:
Setup Ambari Server
After installation Amabari server can be configured using setup. For details you can execute $ambari-server setup –help. To configure Ambari with an existing database we can setup Ambari for example like this:
ambari-server setup -s --database=embedded|oracle|mysql|postgres --databasehost=Hostname of database server --databaseport=Database port --databasename=Database/Schema/Service name or ServiceID --databaseusername=Database user login --databasepassword=Database user password
The configurations are stored under /etc/ambari-server/conf/ambari-server.properties. The password for the database connection is stored under /etc/ambari-server/conf/password.dat and can therefor be protected separately from the other configurations.
Blueprint for Automated Install
During installation Hive and Oozie would also need to be configured for any existing database. A possible Blueprint could look like this:
{ "Blueprints" : { "blueprint_name" : "blueprint", "stack_name" : "HDP", "stack_version" : "2.2" }, "configurations": [ { "nagios-env": { "nagios_contact": "root@localhost", "nagios_group" : "nagios", "nagios_user" : "nagios", "nagios_web_login" : "nagiosadmin", "nagios_web_password" : "admin" } },{ "hive-site": { "javax.jdo.option.ConnectionDriverName" : "oracle.jdbc.driver.OracleDriver", "javax.jdo.option.ConnectionPassword" : "pw", "javax.jdo.option.ConnectionURL" : "jdbc:oracle:thin:@host:1521/HIVE.SID", "javax.jdo.option.ConnectionUserName" : "hive" } },{ "hive-env": { "hive_ambari_database" : "Oracle", "hive_database" : "Existing Oracle Database", "hive_database_type" : "oracle" } },{ "oozie-site": { "oozie.db.schema.name" : "oozie", "oozie.service.JPAService.create.db.schema" : "true", "oozie.service.JPAService.jdbc.driver" : "com.mysql.jdbc.Driver", "oozie.service.JPAService.jdbc.username" : "oozie", "oozie.service.JPAService.jdbc.password" : "pw", "oozie.service.JPAService.jdbc.url" : "jdbc:mysql://host:3306/oozie" } },{ "oozie-env": { "oozie_database" : "Existing MySQL Database", "oozie_existing_oracle_database" : "MySQL", "oozie_existing_oracle_host" : "db_host" } } ], "host_groups" : [{ "name" : "001", "cardinality" : "1", "components" : [...] }] }
Supported Database Matrix (Source)
For completeness a reference matrix of supported databases:
PostgreSQL 8.x | PostgreSQL 9.x | MySQL 5.x | Oracle 11gr2 | Other | |||||||||||||||||||||||||||||||||||||||||||||
Hive / HCatalog | Supported. For instructions on configuring this database for Hive metastore, see Instructions for Manual Install. | Supported. For instructions on configuring this database for Hive metastore, see Instructions for Manual Install . | Default. For instructions on configuring this database for Hive metastore, either see Instructions for Manual Install or see Using Ambari with MySQL. | Default. For instructions on configuring this database for Hive metastore, either see Instructions for Manual Install or see Using Ambari with Oracle. | |||||||||||||||||||||||||||||||||||||||||||||
Oozie | Supported. For instructions on configuring this database for Oozie metastore, see Instructions for Manual Install. | Supported. For instructions on configuring this database for Oozie metastore, see Instructions for Manual Install. | Supported. For instructions on configuring this database for Oozie metastore, either see Instructions for Manual Install or see Using Oozie with MySQL. | Supported. For instructions on configuring this database for Oozie metastore, either see Instructions for Manual Install or see Instructions for Oozie. | Derby (default). | ||||||||||||||||||||||||||||||||||||||||||||
Hue [a] | Supported. For instructions on configuring this database for Hue, see Instructions for Manual Install. | Supported. For instructions on configuring this database for Hue, see Instructions for Manual Install . | Supported. For instructions on configuring this database for Hue, see Instructions for Manual Install. | Supported. For instructions on configuring this database for Hue, see Instructions for Manual Install. | SQLite (default) | ||||||||||||||||||||||||||||||||||||||||||||
Ambari [b] | Default. For more information, see Database Requirements. | Supported. For more information, see Using Ambari with PostgreSQL. | Supported. For more information, see Using Ambari with MySQL. | Supported. For more information, see Using Ambari with Oracle. | |||||||||||||||||||||||||||||||||||||||||||||
[a] Hue does not currently support Ubuntu 12. [b] Ambari does not currently support Ubuntu 12. |
Automated Ambari Install with Existing Database http://t.co/Uc35Awn0o1
LikeLike
Henning saves the day!
LikeLike
Hi
Just wondering if you have any idea on how to setup HIVE and Oozie on MS SQL Server 2008 R2. So how can I setup Ambari server to show option for connecting to an existing MS SQL 2008 R2 during the customize services option.
Secondly if I setup a MySQL server on the same machine running Ambari Sever will that be fine for production use.
I would appreciate your help.
Thanks
Musa
LikeLike
Hi, thanks for your comment. I have no experience with MS SQL in Hadoop and unfortunately it also seems not to be supported.
Running Ambari + DB on the same node for production sounds fine to me. Often the question is rather if you have Ambari installed on one of the master nodes in your cluster, because at some point it can make sense to have a separate managing/monitoring node in the cluster that only has Ambari Server, Ambari Metrics, Ganglia Server, Nagios Server, etc. installed without Hadoop components.
LikeLike