There is widespread discussion of the new MapReduce release, called YARN (Yet Another Resource Negotiator), which ships with Hadoop 2.0. Curt Monash tries to give a clear perspective on the multitude of releases in his post: Hadoop YARN – beyond MapReduce.
YARN promises significant improvements in reliability, availability, scalability, backward (and forward) compatibility, predictable latency and cluster utilization. Achieving these goals required architectural and design changes, as depicted in Arun C Murthy’s YARN architecture diagram.
The major difference is that the JobTracker’s responsibilities are divided between:
- A ResourceManager that manages the global assignment of compute resources to applications.
- An ApplicationMaster, one per application, that manages the application’s scheduling and coordination.
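The division of labour described above can be illustrated with a small toy model. This is only a conceptual sketch and not the YARN API: the class names mirror the two roles, but every method, field and node name (`allocate`, `release`, `run`, `free_slots`, `node1`, and so on) is made up for illustration, and "running" a task is reduced to recording where it was placed.

```python
from dataclasses import dataclass, field


@dataclass
class ResourceManager:
    """Toy global arbiter: only tracks free container slots per node."""
    free_slots: dict                                 # node name -> free containers
    granted_to: dict = field(default_factory=dict)   # app id -> containers granted so far

    def allocate(self, app_id, wanted):
        """Grant up to `wanted` containers; return the node of each container."""
        granted = []
        for node in sorted(self.free_slots):
            while self.free_slots[node] > 0 and len(granted) < wanted:
                self.free_slots[node] -= 1
                granted.append(node)
        self.granted_to[app_id] = self.granted_to.get(app_id, 0) + len(granted)
        return granted

    def release(self, nodes):
        """Hand containers back to the cluster-wide pool."""
        for node in nodes:
            self.free_slots[node] += 1


@dataclass
class ApplicationMaster:
    """Toy per-application coordinator: decides which task goes in which container."""
    app_id: str
    tasks: list

    def run(self, rm):
        """Place every task, asking the ResourceManager for containers in rounds."""
        placement = {}
        pending = list(self.tasks)
        while pending:
            nodes = rm.allocate(self.app_id, len(pending))
            if not nodes:
                raise RuntimeError("cluster has no free containers")
            for node in nodes:
                placement[pending.pop(0)] = node
            rm.release(nodes)  # pretend the tasks finished; return the containers
        return placement


# Hypothetical cluster with three container slots and one four-task application.
rm = ResourceManager(free_slots={"node1": 2, "node2": 1})
am = ApplicationMaster(app_id="app-1", tasks=["t1", "t2", "t3", "t4"])
placement = am.run(rm)
print(placement)
```

The point of the sketch is that the ResourceManager knows nothing about tasks, and the ApplicationMaster knows nothing about other applications. Each application brings its own coordinator, which is why the global component has less work to do than the old JobTracker.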
Also, the communication between the different nodes is simplified, which allows greater scalability. A prototype built on YARN that clearly demonstrates its advantages is described in detail in PaaS on Hadoop Yarn – Idea and Prototype. Despite many issues and failures in the current implementation, the framework opens up new fields of application that were not possible with the old version.