- The user runs the launch command in the Apex CLI.
- The CLI obtains a YARN application ID.
- The CLI copies resources (JARs, configuration files) to HDFS.
- The CLI asks YARN to start the master process.
- On launch, the master constructs the physicalPlan and the runtimePlan.
- The master requests slave containers from the ResourceManager (RM).
- The master launches the slave containers.
- Once a slave container is launched, it sends its first heartbeat request to the master.
- The master records the buffer server address of the new container and submits operator deploy requests to the slave.
- The master keeps monitoring the slaves using the heartbeat protocol.
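
The last three steps can be sketched as a small in-memory model. This is only an illustration under stated assumptions: `LaunchSketch`, `Master`, `assign` and `onHeartbeat` are hypothetical names that do not come from the Apex code base, and the real heartbeat protocol carries far more state than this.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative only: none of these types exist in Apex. */
public class LaunchSketch {

    /** Hypothetical stand-in for the master's bookkeeping. */
    static class Master {
        // container id -> buffer server address reported with the first heartbeat
        final Map<String, String> bufferServers = new LinkedHashMap<>();
        // container id -> number of heartbeats received so far
        final Map<String, Integer> heartbeats = new LinkedHashMap<>();
        // container id -> operators still waiting to be deployed there
        final Map<String, List<String>> pendingDeploys = new LinkedHashMap<>();

        /** Records which operators the plan assigned to a container. */
        void assign(String containerId, List<String> operators) {
            pendingDeploys.put(containerId, new ArrayList<>(operators));
        }

        /**
         * Handles one heartbeat. On the first heartbeat the buffer server
         * address is recorded and the pending operators are returned as
         * deploy requests; later heartbeats only update the counter.
         */
        List<String> onHeartbeat(String containerId, String bufferServerAddress) {
            int seen = heartbeats.merge(containerId, 1, Integer::sum);
            if (seen == 1) {
                bufferServers.put(containerId, bufferServerAddress);
                List<String> deploy = pendingDeploys.remove(containerId);
                return deploy == null ? new ArrayList<>() : deploy;
            }
            return new ArrayList<>();
        }
    }

    public static void main(String[] args) {
        Master master = new Master();
        master.assign("container-1", Arrays.asList("input", "parser"));
        // First heartbeat: the master answers with deploy requests.
        System.out.println(master.onHeartbeat("container-1", "host1:9000"));
        // Later heartbeats: monitoring only, nothing left to deploy.
        System.out.println(master.onHeartbeat("container-1", "host1:9000"));
    }
}
```

Keeping the deploy requests in the heartbeat response mirrors the order of the steps above: the master cannot deploy operators until it has heard from the container at least once.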

An application package is simply a JAR file with the following structure:

- app/*.jar
- lib/*.jar
- conf/*.xml
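
As a rough illustration of this layout, the sketch below groups a list of entry names by their top-level directory. The class and method names (`PackageLayoutSketch`, `groupEntries`) are hypothetical, and the rule that `app` and `lib` hold `.jar` files while `conf` holds `.xml` files is taken directly from the structure above, not from the Apex sources.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative only: not part of Apex. */
public class PackageLayoutSketch {

    /** Groups entry names under app, lib and conf; everything else is ignored. */
    static Map<String, List<String>> groupEntries(List<String> entryNames) {
        Map<String, List<String>> groups = new LinkedHashMap<>();
        for (String dir : Arrays.asList("app", "lib", "conf")) {
            groups.put(dir, new ArrayList<>());
        }
        for (String name : entryNames) {
            int slash = name.indexOf('/');
            if (slash <= 0) {
                continue; // file at the package root, no directory part
            }
            String dir = name.substring(0, slash);
            // conf holds XML files, app and lib hold JARs
            String expectedSuffix = dir.equals("conf") ? ".xml" : ".jar";
            List<String> bucket = groups.get(dir);
            if (bucket != null && name.endsWith(expectedSuffix)) {
                bucket.add(name);
            }
        }
        return groups;
    }

    public static void main(String[] args) {
        // Hypothetical entry names, made up for this example.
        List<String> entries = Arrays.asList(
                "app/my-app.jar", "lib/dependency.jar",
                "conf/my-app-conf.xml", "META-INF/MANIFEST.MF");
        System.out.println(groupEntries(entries));
    }
}
```

For a real package, the entry names could be read with `java.util.jar.JarFile`, whose `entries()` method enumerates the `JarEntry` objects in the archive.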

The Apex CLI opens the application package and looks for applications defined in one of the following formats:

- Java application (a class implementing StreamingApplication)
- JSON file
- Properties file

An application factory is used to prepare a DAG for submission to YARN.
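
One way to picture how the three formats map to a factory is sketched below. Everything here is an assumption made for illustration: `AppFormatSketch`, `AppFormat`, `AppFactory`, `detect` and `factoryFor` are hypothetical names, and detecting the format from the file extension is a simplification (in particular, `.class` merely stands in for finding a class that implements StreamingApplication).

```java
import java.util.Optional;

/** Illustrative only: not the Apex implementation. */
public class AppFormatSketch {

    /** The three definition formats named in the text. */
    enum AppFormat { JAVA_CLASS, JSON, PROPERTIES }

    /** Hypothetical factory: would build the DAG for one application definition. */
    interface AppFactory {
        String describe();
    }

    /** Guesses the format from the file extension (an assumption of this sketch). */
    static Optional<AppFormat> detect(String entryName) {
        if (entryName.endsWith(".json")) {
            return Optional.of(AppFormat.JSON);
        }
        if (entryName.endsWith(".properties")) {
            return Optional.of(AppFormat.PROPERTIES);
        }
        if (entryName.endsWith(".class")) {
            return Optional.of(AppFormat.JAVA_CLASS);
        }
        return Optional.empty();
    }

    /** Returns a factory for the entry, or empty if it is not an application definition. */
    static Optional<AppFactory> factoryFor(String entryName) {
        Optional<AppFormat> format = detect(entryName);
        if (!format.isPresent()) {
            return Optional.empty();
        }
        AppFactory factory = () -> format.get() + " definition at " + entryName;
        return Optional.of(factory);
    }

    public static void main(String[] args) {
        // Hypothetical file names, made up for this example.
        for (String name : new String[] {"MyApp.class", "myapp.json", "myapp.properties", "README.txt"}) {
            System.out.println(name + " -> " + factoryFor(name).map(AppFactory::describe).orElse("not an application"));
        }
    }
}
```

Routing every format through one factory interface means the submission code only has to deal with a DAG, regardless of how the application was originally written.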