Hive Join Strategies

Hive joins are executed by MapReduce jobs through different execution engines like for example Tez, Spark or MapReduce. Joins even of multiple tables can be achieved by one job only. Since it’s first release many optimizations have been added to Hive giving users various options for query improvements of joins.

Understanding how joins are implemented with MapReduce helps to recognize the different optimization techniques in Hive today. Continue reading “Hive Join Strategies”