Get Started with Hadoop – Now!!

Looking back, it is remarkable how mature Hadoop has become. Not only the maturity itself but also the pace at which it arrived is impressive. Early projects jumped right onto the Hadoop wagon with big but unclear expectations. What was great about those times was that it felt like a gold rush, and Hadoop’s simple and inherently scalable paradigm made sure the path was lined with successful projects. In his recent book, Arun Murthy identifies four phases Hadoop has gone through so far:

  • Phase 0: The Era of Ad Hoc Hadoop
  • Phase 1: Hadoop on Demand
  • Phase 2: Dawn of the shared Cluster
  • Phase 3: Emergence of YARN

In the first chapter of his book he describes the characteristics of each phase in detail, deriving the need for today’s resource layer, YARN. He also shares some nice insights into what was going on at Yahoo! around that time.

In summary, and from my point of view, Hadoop has grown from a side project and R&D effort into the center of enterprise computing. From many failed projects we have learned the key aspects of Big Data projects. Enterprises are picking up on these successes and are shaping the future of Hadoop with high intensity. Shared resources throughout the company, with high utilization and scalability, are the driving forces, always combined with a central place to store the data (HDFS). This includes enterprise-grade security and data governance.

The Forces are Strong

There are currently huge investments in and around Hadoop, combined with a high adoption rate. Forbes predicts that the market will grow 6 times faster than the average IT market in 2014 and reach $16.1 billion. Venture capital is investing strongly in the Hadoop and NoSQL market; by 2013 it had surpassed the $1 billion mark.

Source: http://wikibon.org/wiki/v/Data_Warehouse_Vendors_Moving_to_Contain_the_Hadoop_Threat
Source: http://wikibon.org/wiki/v/Big_Data_Vendor_Revenue_and_Market_Forecast_2013-2017

Get Started Now

If you are an IT professional, a student, or in a similar position, now is a great time to get started. If you are just beginning your career or are still in school, it is an advantage to excel in distributed computing, databases, and statistics.

1. Out-Of-The-Box Hadoop

It has never been easier to get started with Hadoop. Many vendors offer out-of-the-box solutions that get you started quickly. They provide simple virtual machine images that you can download and run on your local machine.

  1. Hortonworks Sandbox
  2. Cloudera QuickStart VMs
  3. MapR Sandbox for Hadoop

2. Become a HUGger

To learn from other users, it is a good idea to meet them. Hadoop user groups are called HUGs, and you can probably find one near you on, for example, MeetUp. Go there and get in touch with others. You can also join the communities on G+: Hadoop, BigData, Data Data Data, or Data Science.

If you have the chance to attend conferences, you would probably be interested in Hadoop Summit, Berlin Buzzwords, or Strata Conf.

3. Read

Some recommended books:

Some recommended blogs:
