Skip to content

Latest commit

 

History

History
140 lines (102 loc) · 7.68 KB

quick_start.md

File metadata and controls

140 lines (102 loc) · 7.68 KB

Quick Start

This tutorial provides a quick introduction to using remote shuffle service for Flink. This short guide will show you how to download the latest stable version of remote shuffle service, install and run it. You will also run an example Flink job using remote shuffle service and observe it in the Flink web UI.

Introduction

Following the below steps in this quick start guide, you will run a simple Flink batch job which uses the remote shuffle service:

  1. Download remote shuffle service and Flink binary releases.
  2. Start a standalone remote shuffle service cluster.
  3. Start a Flink local cluster that uses the remote shuffle service for data shuffling.
  4. Submit a simple Flink batch job to the started cluster.

Build & Download

Remote shuffler service runs on all UNIX-like environments, i.e. Linux, Mac OS X. You need to have Java 8 installed. To check the Java version installed, type in your terminal:

java -version
  1. Download the latest binary release or build remote shuffle service yourself.
  2. Next, download the latest binary release of Flink, then extract the archive:
 tar -xzf flink-*.tgz

For the steps of downloading or installing Flink, you can also refer to Flink first steps.

Browsing the Project Directory

Navigate to the extracted shuffle service directory and list the contents by issuing:

cd flink-remote-shuffle* && ls -l

You should see some directories as follows.

Directory Meaning
bin/ Directory containing several bash scripts of the remote shuffle service that manage ShuffleMananger or ShuffleWorker.
conf/ Directory containing configuration files, including remote-shuffle-conf.yaml, log4j2.properties, etc.
lib/ Directory containing the remote shuffle service JARs compiled, including shuffle-dist-*.jar, log4j JARs, etc.
log/ Log directory should be empty. When running standalone shuffle service cluster, the logs of ShuffleMananger or ShuffleWorkers will be stored in this directory by default.
opt/ Directory containing the optional JARs used in some special environments, for example, shuffle-kubernetes-operator-*.jar is used when deploying on Kubernetes.
examples/ Directory containing several demo example JARs.

Starting Clusters

Starting a Standalone Remote Shuffle Cluster

Please refer to how to start a standalone remote shuffle cluster.

Starting a Flink Cluster

Before starting a Flink local cluster,

  1. Make sure that a valid remote shuffle service cluster has been successfully started.
  2. You need to copy the shuffle plugin JAR from the remote shuffle lib directory (for example, lib/shuffle-plugin-*.jar) to the Flink lib directory.

For different startup modes of remote shuffle service, Flink job configurations are different, the details are as follows.

  • For standalone remote shuffle service, please add the following configurations to conf/flink-conf.yaml in the extracted Flink directory to use remote shuffle service when running a Flink batch job. The argument manager-ip-address is the ip address of ShuffleManager (for local remote shuffle cluster, it should be 127.0.0.1).
shuffle-service-factory.class: com.alibaba.flink.shuffle.plugin.RemoteShuffleServiceFactory
remote-shuffle.manager.rpc-address: <manager-ip-address>
  • For remote shuffle service on YARN or Kubernetes, please add the following configurations to conf/flink-conf.yaml in the extracted Flink directory to use remote shuffle service when running a Flink batch job. remote-shuffle.ha.zookeeper.quorum is the Zookeeper address of the ShuffleManager when high availability is enabled.
shuffle-service-factory.class: com.alibaba.flink.shuffle.plugin.RemoteShuffleServiceFactory
remote-shuffle.high-availability.mode: ZOOKEEPER
remote-shuffle.ha.zookeeper.quorum: zk1.host:2181,zk2.host:2181,zk3.host:2181

Please refer the following links for different deployment mode of Flink:

Usually, starting a local Flink cluster by running the following command is enough for this quick start guide:

# We assume to be in the root directory of the Flink extracted distribution

./bin/start-cluster.sh

You should be able to navigate to the web UI at http://<job manager ip address>:8081 to view the Flink dashboard and see that the cluster is up and running.

Because the configurations related to shuffle have been modified, all jobs submitted to the Flink cluster will use remote shuffle service to shuffle data.

Submitting a Flink Job

After starting the Flink cluster successfully, you can submit a simple Flink batch demo job.

The example source code is in the shuffle-examples module. BatchJobDemo is a simple Flink batch job. And you need to copy the compiled demo JAR examples/BatchJobDemo.jar to the extracted Flink directory. Please run the following command to submit the example batch job.

# Firstly, copy the example JAR
# cp examples/BatchJobDemo.jar <Flink directory>

# We assume to be in the root directory of the Flink extracted distribution

./bin/flink run ./BatchJobDemo.jar

You have successfully ran a Flink batch job using remote shuffle service.

Where to Go from Here

Congratulations on running your first Flink application using remote shuffle service!