autoscale: true

#[fit] Data Engineering - Part II


BEYOND MAPREDUCE WITH HADOOP V2


Apache YARN


In a YARN cluster, there are two types of hosts:

ResourceManager

Master daemon that communicates with the client, tracks resources on the cluster, and orchestrates work by assigning tasks to NodeManagers.

NodeManager

Worker daemon that launches and tracks processes spawned on worker hosts.




YARN defines two resources:

  • vcores - think of each vcore as a “usage share of a CPU core.”
  • memory
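
To make the two resources concrete, here is a minimal Python sketch of the arithmetic: how many identical containers fit on one worker host when each container needs a fixed number of vcores and a fixed amount of memory. The function name `max_containers`, the use of megabytes, and the host sizes are illustrative assumptions, not values from a real cluster.

```python
def max_containers(node_vcores, node_memory_mb, req_vcores, req_memory_mb):
    """Return how many identical containers fit on one host.

    Both resources must be satisfied, so the scarcer one sets the limit.
    """
    by_cpu = node_vcores // req_vcores        # limit imposed by vcores
    by_mem = node_memory_mb // req_memory_mb  # limit imposed by memory
    return min(by_cpu, by_mem)


# Hypothetical host: 16 vcores, 64 GB of memory.
# Hypothetical container: 2 vcores, 4 GB of memory.
print(max_containers(16, 65536, 2, 4096))  # 8 (vcores run out first)
```

In this example memory alone would allow 16 containers, but vcores allow only 8, so the host is CPU-bound for this container size.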



YARN Containers


Container:

A request to hold resources (vcores and memory) on the YARN cluster.

Once a hold has been granted on a host, the NodeManager launches a process called a task.
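
The request, hold, and launch sequence can be sketched as a toy model in Python. This is not the YARN API: the class names mirror the daemons only for readability, the first-fit placement rule is an assumption made for brevity, and real YARN scheduling is considerably more involved.

```python
from dataclasses import dataclass, field


@dataclass
class NodeManager:
    """Toy worker daemon: tracks free resources and the tasks it launched."""
    host: str
    free_vcores: int
    free_memory_mb: int
    tasks: list = field(default_factory=list)

    def launch(self, task_name, vcores, memory_mb):
        # The hold is granted: reserve the resources and start the task.
        self.free_vcores -= vcores
        self.free_memory_mb -= memory_mb
        self.tasks.append(task_name)


class ResourceManager:
    """Toy master daemon: knows every worker and places container requests."""

    def __init__(self, nodes):
        self.nodes = nodes

    def request_container(self, task_name, vcores, memory_mb):
        # Assumption: first-fit placement on the first host with enough room.
        for node in self.nodes:
            if node.free_vcores >= vcores and node.free_memory_mb >= memory_mb:
                node.launch(task_name, vcores, memory_mb)
                return node.host
        return None  # no host can currently hold this container


# Hypothetical two-worker cluster.
rm = ResourceManager([
    NodeManager("worker1", free_vcores=4, free_memory_mb=8192),
    NodeManager("worker2", free_vcores=8, free_memory_mb=16384),
])
print(rm.request_container("map-0", 2, 4096))  # worker1
print(rm.request_container("map-1", 4, 8192))  # worker2
print(rm.request_container("map-2", 8, 4096))  # None
```

The third request is refused because no single host has 8 free vcores left, even though the cluster as a whole still has spare memory.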
