YARN is changing the face of Big Data as we know it today. Breaking with by-now well-established patterns like MapReduce, YARN gives clients the ability to run diverse distributed algorithms on one cluster under a single resource manager. In addition, Apache Slider makes it possible to ‘slide’ existing long-running services onto the same cluster under the same resource provider.
With the upcoming release of HDP 2.2, HBase services will run under the management of YARN. The same will be true for Storm, getting us one step closer to the vision of an Enterprise Hadoop Data Lake. One important aspect of this scenario is yet to come: YARN-796, a.k.a. YARN Labels.
Recently, the technical preview of Hortonworks Data Platform 2.2 was released with a Sandbox image for download, giving you the possibility to try out the concepts of tomorrow’s Big Data platform today with these tutorials. You can also try some of the virtual environments I’ve put together here: hdp22-n1-centos6-puppet or hdp22-n3-centos6-puppet.
We are experiencing the dawn of a new era in Hadoop, Hadoop v2. Are you YARN ready? Here are some resources to get you going:
- Apache Hadoop YARN: Moving Beyond MapReduce and Batch Processing with Apache Hadoop 2 (Amazon)
- Simple YARN Application (GitHub)
- YARN Word Count Example: The Distributed Shell (GitHub)
- YARN Ready Webinars:
  - Integrating to YARN natively (part 1) ( video / slides )
  - Integrating to YARN using Slider (part 2) ( video / slides )
  - Integrating to YARN with Tez (part 3) ( video / slides )
  - Using Ambari for Management ( video / slides )
  - Developing Applications on Hadoop with Scalding ( video / slides )
  - Using Spark to Integrate to YARN ( video / slides )