autoscale: true
#[fit] Data Engineering - Part II
In a YARN cluster, there are two types of hosts:
ResourceManager
Master daemon that communicates with the client, tracks resources on the cluster, and orchestrates work by assigning tasks to NodeManagers.
NodeManager
Worker daemon that launches and tracks processes spawned on worker hosts.
YARN defines two resources:
- vcores: think of a vcore as a “usage share of a CPU core.”
- memory
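
Because YARN tracks both resources, whichever one runs out first limits how many requests a host can accept. A minimal sketch of that arithmetic follows; the function name and the example sizes are hypothetical and are not YARN settings or APIs.

```python
def max_requests_per_host(host_vcores: int, host_memory_mb: int,
                          req_vcores: int, req_memory_mb: int) -> int:
    """Return how many identical requests fit on one host.

    Illustrative only: assumes every request asks for the same
    amount of vcores and memory, and ignores scheduler overhead.
    """
    by_vcores = host_vcores // req_vcores        # limit imposed by CPU share
    by_memory = host_memory_mb // req_memory_mb  # limit imposed by memory
    return min(by_vcores, by_memory)             # the scarcer resource wins


# Hypothetical host with 8 vcores and 16 GB of memory.
print(max_requests_per_host(8, 16384, req_vcores=1, req_memory_mb=4096))
print(max_requests_per_host(8, 16384, req_vcores=4, req_memory_mb=1024))
```

In the first call memory is the limiting resource; in the second, vcores are.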
Container:
A request to hold resources (vcores and memory) on the YARN cluster.
Once a hold has been granted on a host, the NodeManager launches a process called a task.
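
The flow above (request a hold, grant it on a host, launch a task) can be sketched as a toy model. This is not YARN's API: the class names mirror the daemons for readability, the method names are hypothetical, and the first-fit placement is an assumption made for brevity rather than how YARN's schedulers actually choose a host.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Container:
    """A granted hold on vcores and memory on one host."""
    host: str
    vcores: int
    memory_mb: int


@dataclass
class NodeManager:
    """Toy worker daemon: tracks free resources and tasks on its host."""
    host: str
    free_vcores: int
    free_memory_mb: int
    tasks: List[str] = field(default_factory=list)

    def can_hold(self, vcores: int, memory_mb: int) -> bool:
        return self.free_vcores >= vcores and self.free_memory_mb >= memory_mb

    def launch_task(self, container: Container, name: str) -> None:
        # Reserve the held resources, then record the launched task.
        self.free_vcores -= container.vcores
        self.free_memory_mb -= container.memory_mb
        self.tasks.append(name)


@dataclass
class ResourceManager:
    """Toy master daemon: grants holds and assigns tasks to NodeManagers."""
    node_managers: List[NodeManager]

    def request_container(self, vcores: int, memory_mb: int) -> Optional[Container]:
        # First-fit placement (an assumption for this sketch).
        for nm in self.node_managers:
            if nm.can_hold(vcores, memory_mb):
                return Container(nm.host, vcores, memory_mb)
        return None  # no host can satisfy the request right now

    def run(self, name: str, vcores: int, memory_mb: int) -> Optional[str]:
        container = self.request_container(vcores, memory_mb)
        if container is None:
            return None
        nm = next(n for n in self.node_managers if n.host == container.host)
        nm.launch_task(container, name)
        return nm.host


# Hypothetical two-worker cluster.
rm = ResourceManager([
    NodeManager("worker-1", free_vcores=4, free_memory_mb=8192),
    NodeManager("worker-2", free_vcores=8, free_memory_mb=16384),
])
print(rm.run("task-a", vcores=2, memory_mb=4096))
print(rm.run("task-b", vcores=4, memory_mb=8192))
print(rm.run("task-c", vcores=16, memory_mb=1024))
```

The last request asks for more vcores than any host has free, so no hold is granted and no task is launched.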