Skip to content

Commit

Permalink
docs: 优化5.x版本的RocketMQ Connect 快速接入、实战4、实战5 英文文档,修复错误的docker命令等内容,添加详…
Browse files Browse the repository at this point in the history
…细的操作步骤
  • Loading branch information
yuanzhi committed Nov 23, 2023
1 parent 6302f37 commit e98f940
Show file tree
Hide file tree
Showing 6 changed files with 1,216 additions and 400 deletions.
Original file line number Diff line number Diff line change
Expand Up @@ -2,163 +2,242 @@

# Quick Start

In standalone mode, [rocketmq-connect-sample] serves as a demo.
This tutorial will start a RocketMQ Connector example project "rocketmq-connect-sample" in standalone mode to help you understand the working principle of connectors.
The example project provides a source connector that reads data from source files and sends it to the RocketMQ cluster.
It also provides a sink connector that reads messages from the RocketMQ cluster and writes them to destination files.

The main purpose of rocketmq-connect-sample is to read data from a source file and send it to a RocketMQ cluster, and then read messages from the Topic and write them to a target file.

## 1. Prepare
## 1. Preparation: Start RocketMQ

1. Linux/Unix/Mac
2. 64bit JDK 1.8+;
3. Maven 3.2.x+;
4. Start [RocketMQ](https://rocketmq.apache.org/docs/quick-start/);
5. Create test Topic
4. Start RocketMQ. Either [RocketMQ 4.x](https://rocketmq.apache.org/docs/4.x/) or
[RocketMQ 5.x](https://rocketmq.apache.org/docs/quickStart/01quickstart/) 5.x version can be used;
5. Test RocketMQ message sending and receiving using the tool.

> sh ${ROCKETMQ_HOME}/bin/mqadmin updateTopic -t fileTopic -n localhost:9876 -c DefaultCluster -r 8 -w 8
Here, use the environment variable NAMESRV_ADDR to inform the tool client of the NameServer address of RocketMQ as localhost:9876.

**tips** : ${ROCKETMQ_HOME} locational instructions
```shell
#$ cd distribution/target/rocketmq-4.9.7/rocketmq-4.9.7
$ cd distribution/target/rocketmq-5.1.4/rocketmq-5.1.4

>bin-release.zip version:/rocketmq-all-4.9.4-bin-release
>
>source-release.zip version:/rocketmq-all-4.9.4-source-release/distribution
$ export NAMESRV_ADDR=localhost:9876
$ sh bin/tools.sh org.apache.rocketmq.example.quickstart.Producer
SendResult [sendStatus=SEND_OK, msgId= ...

$ sh bin/tools.sh org.apache.rocketmq.example.quickstart.Consumer
ConsumeMessageThread_%d Receive New Messages: [MessageExt...
```
## 2. Build Connect
**Note**: RocketMQ has the feature of automatically creating Topic and Group. When sending or subscribing to messages,
if the corresponding Topic or Group does not exist, RocketMQ will automatically create them. Therefore,
there is no need to create Topic and Group in advance.
```
## 2. Build Connector Runtime
```shell
git clone https://github.com/apache/rocketmq-connect.git
cd rocketmq-connect
mvn -Prelease-connect -DskipTests clean install -U
export RMQ_CONNECT_HOME=`pwd`
mvn -Prelease-connect -Dmaven.test.skip=true clean install -U
```
## 3. Run Worker
**Note**: The project already includes the code for rocketmq-connect-sample by default,
so there is no need to build the rocketmq-connect-sample plugin separately.
## 3. Run Connector Worker in Standalone Mode
### Modify Configuration
Modify the `connect-standalone.conf` file to configure the RocketMQ connection
address and other information. Please refer to [9. Configuration File Instructions](#9-configuration-file-instructions) for details.
```
cd distribution/target/rocketmq-connect-0.0.1-SNAPSHOT/rocketmq-connect-0.0.1-SNAPSHOT
cd $RMQ_CONNECT_HOME/distribution/target/rocketmq-connect-0.0.1-SNAPSHOT/rocketmq-connect-0.0.1-SNAPSHOT
sh bin/connect-standalone.sh -c conf/connect-standalone.conf &
vim conf/connect-standalone.conf
```
In standalone mode, RocketMQ Connect persists the synchronization checkpoint information
to the local file directory storePathRootDir.
>storePathRootDir=/Users/YourUsername/rocketmqconnect/storeRoot
If you want to reset the synchronization checkpoint, you need to delete the persisted
checkpoint file.
```shell
rm -rf /Users/YourUsername/rocketmqconnect/storeRoot/*
```
**tips**: The JVM Parameters Configuration can be adjusted in /bin/runconnect.sh as needed.
### Start Connector Worker in Standalone Mode
```shell
sh bin/connect-standalone.sh -c conf/connect-standalone.conf &
```
**tips**: You can modify `docker/connect/bin/runconnect.sh` to adjust JVM startup
parameters as needed.
>JAVA_OPT="${JAVA_OPT} -server -Xms256m -Xmx256m"
runtime start successful:
To view the startup log file:
>The standalone worker boot success.
```shell
tail -100f ~/logs/rocketmqconnect/connect_runtime.log
```
View the startup log files.
If the runtime starts successfully, you will see the following print in the log file:
>tail -100f ~/logs/rocketmqconnect/connect_runtime.log
>The standalone worker boot success.
`ctrl + c` exit log
To exit the log tracking mode of `tail -f` command, you can press the `Ctrl + C` key combination.
## 4. Start source connector
## 4. Start Source Connector
Create a test file named test-source-file.txt in the current directory.
### Create Source File and Write Test Data
```
```shell
mkdir -p /Users/YourUsername/rocketmqconnect/
cd /Users/YourUsername/rocketmqconnect/
touch test-source-file.txt
echo "Hello \r\nRocketMQ\r\n Connect" >> test-source-file.txt
```
**Note**: There should be no empty lines (the demo program will throw an error if it
encounters empty lines). The source connector will continuously read the source file
and convert each line of data into a message body to be sent to RocketMQ for consumption
by the sink connector.
### Start Source Connector
```shell
curl -X POST -H "Content-Type: application/json" http://127.0.0.1:8082/connectors/fileSourceConnector -d '{
"connector.class": "org.apache.rocketmq.connect.file.FileSourceConnector",
"filename": "/Users/YourUsername/rocketmqconnect/test-source-file.txt",
"connect.topicname": "fileTopic"
}'
```
If the curl request returns status 200, it indicates successful creation. Example response:
>{"status":200,"body":{"connector.class":"org.apache.rocketmq.connect.file.FileSourceConnector","filename":"/Users/YourUsername/rocketmqconnect/test-source-file.txt","connect.topicname":"fileTopic"}}
curl -X POST -H "Content-Type: application/json" http://127.0.0.1:8082/connectors/fileSourceConnector -d '{"connector.class":"org.apache.rocketmq.connect.file.FileSourceConnector","filename":"test-source-file.txt","connect.topicname":"fileTopic"}'
View the log file:
```shell
tail -100f ~/logs/rocketmqconnect/connect_runtime.log
```
If you see the following log message, it means the file source connector has started successfully.
If you see the following log, it means the file source connector has started successfully:
>Start connector fileSourceConnector and set target state STARTED successed!!
>tail -100f ~/logs/rocketmqconnect/connect_runtime.log
>
>2019-07-16 11:18:39 INFO pool-7-thread-1 - **Source task start**, config:{"properties":{"source-record-...
#### source connector configuration instructions
#### Source Connector Configuration Instructions
| key | nullable | default | description |
| ----------------- | -------- | ------- | ------------------------------------------------------------ |
| connector.class | false | | The class name (including the package name) that implements the Connector interface |
| filename | false | | source file name |
| filename | false | | The name of the source file (recommended to use absolute path) |
| connect.topicname | false | | Topic required for synchronizing file data |
## 5. Start sink connector
```shell
curl -X POST -H "Content-Type: application/json" http://127.0.0.1:8082/connectors/fileSinkConnector -d '{
"connector.class": "org.apache.rocketmq.connect.file.FileSinkConnector",
"filename": "/Users/YourUsername/rocketmqconnect/test-sink-file.txt",
"connect.topicnames": "fileTopic"
}'
```
curl -X POST -H "Content-Type: application/json" http://127.0.0.1:8082/connectors/fileSinkConnector -d '{"connector.class":"org.apache.rocketmq.connect.file.FileSinkConnector","filename":"test-sink-file.txt","connect.topicnames":"fileTopic"}'
cat test-sink-file.txt
If the curl request returns status 200, it indicates successful creation. Example response:
>{"status":200,"body":{"connector.class":"org.apache.rocketmq.connect.file.FileSinkConnector","filename":"/Users/YourUsername/rocketmqconnect/test-sink-file.txt","connect.topicnames":"fileTopic"}}
View the log file:
```shell
tail -100f ~/logs/rocketmqconnect/connect_runtime.log
```
If you see the following log, it means the file sink connector has started successfully:
> Start connector fileSinkConnector and set target state STARTED successed!!
> tail -100f ~/logs/rocketmqconnect/connect_runtime.log
Check if the sink connector has written data to the destination file:
```shell
cat /Users/YourUsername/rocketmqconnect/test-sink-file.txt
```
If you see the following log message, it means the file sink connector has started successfully.
If the test-sink-file.txt file is generated and its content is the same as the
test-source-file.txt, it means the entire process is running correctly.
> 2019-07-16 11:24:58 INFO pool-7-thread-2 - **Sink task start**, config:{"properties":{"source-record-...
Continue writing test data to the source file test-source-file.txt:
```shell
cd /Users/YourUsername/rocketmqconnect/
If test-sink-file.txt is generated and its content is the same as source-file.txt, it means that the entire process is running normally.
echo "Say Hi to\r\nRMQ Connector\r\nAgain" >> test-source-file.txt
The file contents may be in a different order, which is normal because the order of messages received from different queues in RocketMQ may also be inconsistent.
# Wait a few seconds, check if rocketmq-connect replicate data to sink file succeed
sleep 10
cat /Users/YourUsername/rocketmqconnect/test-sink-file.txt
```
**Note**: The order of file contents may vary because the `rocketmq-connect-sample` uses `normal message` when
sending and receiving messages to/from a RocketMQ topic. This is different from `ordered message`, and consuming
`normal messages` does not guarantee the order.
#### sink connector configuration instructions
| key | nullable | default | description |
| ------------------ | -------- | ------- | ------------------------------------------------------------ |
| connector.class | false | | The class name (including the package name) that implements the Connector interface |
| filename | false | | The sink pulls data and saves it to a file. |
| connect.topicnames | false | | The topics of the data messages that the sink needs to process. |
| filename | false | | The sink pulls data and saves it to a file(recommended to use absolute path) |
| connect.topicnames | false | | The topics of the data messages that the sink needs to process |
```
Tips:The configuration file instructions for the sample rocketmq-connect-sample are for reference only, different source/sink connectors have different configurations, please refer to the specific source/sink connector.
```
**Tips**:The configuration file instructions for the sample rocketmq-connect-sample are for reference only, different source/sink connectors have different configurations, please refer to the specific source/sink connector.
## 6. Stop connector
The RESTful command format for stopping connectors is
`http://(your worker ip):(port)/connectors/(connector name)/stop`
To stop the two connectors in the demo, you can use the following commands:
```shell
#GET request
http://(your worker ip):(port)/connectors/(connector name)/stop

#Stopping the two connectors in the demo
curl http://127.0.0.1:8082/connectors/fileSinkConnector/stop
curl http://127.0.0.1:8082/connectors/fileSourceConnector/stop

curl http://127.0.0.1:8082/connectors/fileSinkConnector/stop
curl http://127.0.0.1:8082/connectors/fileSourceConnector/stop
```
Seeing the following log message indicates that the connector has been successfully stopped.
If the curl request returns a status of 200, it indicates successful stopping of the connectors.
Example response:
>{"status":200,"body":"Connector [fileSinkConnector] deleted successfully"}
>**Source task stop**, config:{"properties":{"source-record-converter":"org.apache.rocketmq.connect.runtime.converter.JsonConverter","filename":"/home/zhoubo/IdeaProjects/my-new3-rocketmq-externals/rocketmq-connect/rocketmq-connect-runtime/source-file.txt","task-class":"org.apache.rocketmq.connect.file.FileSourceTask","topic":"fileTopic","connector-class":"org.apache.rocketmq.connect.file.FileSourceConnector","update-timestamp":"1564765189322"}}
## 7. Stopping the Worker process
If you see the following log message, it means the file sink connector has been
successfully shut down:
```shell
tail -100f ~/logs/rocketmqconnect/connect_default.log
```
> Completed shutdown for connectorName:fileSinkConnector
## 7. Stop the Worker process
```shell
cd $RMQ_CONNECT_HOME/distribution/target/rocketmq-connect-0.0.1-SNAPSHOT/rocketmq-connect-0.0.1-SNAPSHOT
sh bin/connectshutdown.sh
```
## 8. Log directory
>${user.home}/logs/rocketmqconnect
## 9. Configuration file

The default directory for persistent configuration files is /tmp/storeRoot.
You can use the following commands to view the log directory:
| key | description |
| -------------------- | --------------------------------------------------------- |
| connectorConfig.json | Connector configuration persistence files |
| position.json | Source connect data processing progress persistence files |
| taskConfig.json | Task configuration persistence files |
| offset.json | Sink connect data consumption progress persistence files |
| connectorStatus.json | Connector status persistence files |
| taskStatus.json | Task status persistence files |
```shell
ls $HOME/logs/rocketmqconnect
ls ~/logs/rocketmqconnect
```
## 10. Configuration Instructions
## 9. Configuration File Instructions
Modify the RESTful port, storeRoot path, Nameserver address, and other information based on your usage.
The file location is in the conf/connect-standalone.conf under the work startup directory.
Here is an example of a configuration file:
```shell
#current cluster node uniquely identifies
Expand All @@ -168,16 +247,29 @@ workerId=DEFAULT_WORKER_1
httpPort=8082
# Local file dir for config store
storePathRootDir=/home/connect/storeRoot
storePathRootDir=/Users/YourUsername/rocketmqconnect/storeRoot
#You need to modify it to your own rocketmq nameserver endpoint.
# RocketMQ namesrvAddr
namesrvAddr=127.0.0.1:9876
#This is used for loading Connector plugins, similar to how JVM loads jar packages or classes at startup. This directory is used for placing Connector-related implementation plugins and supports both files and directories.
# Source or sink connector jar file dir
pluginPaths=rocketmq-connect-sample/target/rocketmq-connect-sample-0.0.1-SNAPSHOT.jar
# Plugin path for loading Source/Sink Connectors
# The rocketmq-connect project already includes the rocketmq-connect-sample module by default, so no configuration is needed here.
pluginPaths=
```
Explanation of storePathRootDir configuration:
In standalone mode, RocketMQ Connect persists the synchronization checkpoint information
to the local file directory specified by storePathRootDir. The persistent files include:
| key | description |
| -------------------- | --------------------------------------------------------- |
| connectorConfig.json | Connector configuration persistence files |
| position.json | Source connect data processing progress persistence files |
| taskConfig.json | Task configuration persistence files |
| offset.json | Sink connect data consumption progress persistence files |
| connectorStatus.json | Connector status persistence files |
| taskStatus.json | Task status persistence files |
# Addition : put the Connector-related implementation plugins in the specified folder.
# pluginPaths=/usr/local/connector-plugins/*
```
Loading

0 comments on commit e98f940

Please sign in to comment.