diff --git a/docs/10-connect/03RocketMQ Connect Quick Start.md b/docs/10-connect/03RocketMQ Connect Quick Start.md index 31c9589538..82e88257aa 100644 --- a/docs/10-connect/03RocketMQ Connect Quick Start.md +++ b/docs/10-connect/03RocketMQ Connect Quick Start.md @@ -4,159 +4,217 @@ # 快速开始 -单机模式下[rocketmq-connect-sample]作为 demo +本教程将采用单机模式启动一个RocketMQ Connector示例工程rocketmq-connect-sample,来帮助你了解连接器的工作原理。 +示例工程中提供了源端连接器,作用是从源文件中读取数据然后发送到RocketMQ集群。 +同时提供了目的端连接器,作用是从RocketMQ集群中读取消息然后写入目的端文件。 -rocketmq-connect-sample的主要作用是从源文件中读取数据发送到RocketMQ集群 然后从Topic中读取消息,写入到目标文件 - -## 1.准备 +## 1.准备:启动RocketMQ 1. Linux/Unix/Mac 2. 64bit JDK 1.8+; 3. Maven 3.2.x或以上版本; -4. 启动 [RocketMQ](https://rocketmq.apache.org/docs/quick-start/); -5. 创建测试Topic -> sh ${ROCKETMQ_HOME}/bin/mqadmin updateTopic -t fileTopic -n localhost:9876 -c DefaultCluster -r 8 -w 8 +4. 启动 RocketMQ。使用[RocketMQ 4.x](https://rocketmq.apache.org/docs/4.x/) 或 + [RocketMQ 5.x](https://rocketmq.apache.org/docs/quickStart/01quickstart/)版本均可; +5. 工具测试 RocketMQ 消息收发是否正常。详见[RocketMQ 4.x](https://rocketmq.apache.org/docs/4.x/) 或 + [RocketMQ 5.x](https://rocketmq.apache.org/docs/quickStart/01quickstart/)文档。 + +这里利用环境变量NAMESRV_ADDR来告诉工具客户端RocketMQ的NameServer地址为localhost:9876 +```shell +#$ cd distribution/target/rocketmq-4.9.7/rocketmq-4.9.7 +$ cd distribution/target/rocketmq-5.1.4/rocketmq-5.1.4 -**tips** : ${ROCKETMQ_HOME} 位置说明 +$ export NAMESRV_ADDR=localhost:9876 +$ sh bin/tools.sh org.apache.rocketmq.example.quickstart.Producer + SendResult [sendStatus=SEND_OK, msgId= ... ->bin-release.zip 版本:/rocketmq-all-4.9.4-bin-release -> ->source-release.zip 版本:/rocketmq-all-4.9.4-source-release/distribution +$ sh bin/tools.sh org.apache.rocketmq.example.quickstart.Consumer + ConsumeMessageThread_%d Receive New Messages: [MessageExt... +``` +**说明**:RocketMQ具备自动创建Topic和Group的功能,在发送消息或订阅消息时,如果相应的Topic或Group不存在,RocketMQ会自动创建它们。因此不需要提前创建Topic和Group。 -## 2.构建Connect +## 2.构建Connector Runtime -``` +```shell git clone https://github.com/apache/rocketmq-connect.git cd rocketmq-connect -mvn -Prelease-connect -DskipTests clean install -U +export RMQ_CONNECT_HOME=`pwd` +mvn -Prelease-connect -Dmaven.test.skip=true clean install -U ``` -## 3.运行Worker +**注意**:本工程已默认包含 rocketmq-connect-sample 的代码,因此无需单独构建 rocketmq-connect-sample 插件。 + +## 3.单机模式运行 Connector Worker + +### 修改配置 +`connect-standalone.conf`中配置了RocketMQ连接地址等信息,需要根据使用情况进行修改,具体参见[9.配置文件说明](#9配置文件说明)。 ``` -cd distribution/target/rocketmq-connect-0.0.1-SNAPSHOT/rocketmq-connect-0.0.1-SNAPSHOT +cd $RMQ_CONNECT_HOME/distribution/target/rocketmq-connect-0.0.1-SNAPSHOT/rocketmq-connect-0.0.1-SNAPSHOT -sh bin/connect-standalone.sh -c conf/connect-standalone.conf & +vim conf/connect-standalone.conf +``` + +单机模式(standalone)下,RocketMQ Connect 会把同步位点信息持久化到本地文件目录 storePathRootDir +>storePathRootDir=/Users/YourUsername/rocketmqconnect/storeRoot +如果想重置同步位点,则需要删除持久化的位点信息文件 +```shell +rm -rf /Users/YourUsername/rocketmqconnect/storeRoot/* ``` -**tips**: 可修改 /bin/runconnect.sh 适当调整 JVM Parameters Configuration ->JAVA_OPT="${JAVA_OPT} -server -Xms256m -Xmx256m" +### 采用单机模式启动Connector Worker -runtime启动成功: +```shell +sh bin/connect-standalone.sh -c conf/connect-standalone.conf & +``` ->The standalone worker boot success. +**tips**: 可修改 docker/connect/bin/runconnect.sh 适当调整 JVM 启动参数 + +>JAVA_OPT="${JAVA_OPT} -server -Xms256m -Xmx256m" 查看启动日志文件: +```shell +tail -100f ~/logs/rocketmqconnect/connect_runtime.log +``` ->tail -100f ~/logs/rocketmqconnect/connect_runtime.log +runtime若启动成功则日志文件中能看到如下打印内容: +>The standalone worker boot success. 
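+
+Besides the log line above, an optional sanity check is to confirm that the worker's REST port is accepting connections before creating any connectors. This is only a sketch and assumes the default `httpPort=8082` from `conf/connect-standalone.conf`:
+
+```shell
+# Confirm the standalone worker's REST listener is up (default httpPort=8082)
+lsof -nP -iTCP:8082 -sTCP:LISTEN
+
+# Or, if lsof is not available:
+nc -z 127.0.0.1 8082 && echo "connect worker REST port is listening"
+```
+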
-ctrl + c 退出日志 +要退出tail -f命令的日志追踪模式,您可以按下 Ctrl + C 组合键。 ## 4.启动source connector -当前目录创建测试文件 test-source-file.txt -``` +### 创建源端文件并写入测试数据 + +```shell +mkdir -p /Users/YourUsername/rocketmqconnect/ +cd /Users/YourUsername/rocketmqconnect/ touch test-source-file.txt echo "Hello \r\nRocketMQ\r\n Connect" >> test-source-file.txt +``` +**注意**:不能有空行(demo程序遇到空行会报错)。source connector会持续读取源端文件,每读取到一行数据就会转换为消息体发送到RocketMQ,供sink connector消费。 -curl -X POST -H "Content-Type: application/json" http://127.0.0.1:8082/connectors/fileSourceConnector -d '{"connector.class":"org.apache.rocketmq.connect.file.FileSourceConnector","filename":"test-source-file.txt","connect.topicname":"fileTopic"}' +### 启动Source Connector +```shell +curl -X POST -H "Content-Type: application/json" http://127.0.0.1:8082/connectors/fileSourceConnector -d '{ + "connector.class": "org.apache.rocketmq.connect.file.FileSourceConnector", + "filename": "/Users/YourUsername/rocketmqconnect/test-source-file.txt", + "connect.topicname": "fileTopic" +}' ``` +curl请求返回status:200则表示创建成功,返回样例: +>{"status":200,"body":{"connector.class":"org.apache.rocketmq.connect.file.FileSourceConnector","filename":"/Users/YourUsername/rocketmqconnect/test-source-file.txt","connect.topicname":"fileTopic"}} + 看到以下日志说明 file source connector 启动成功了 +```shell +tail -100f ~/logs/rocketmqconnect/connect_runtime.log +``` ->tail -100f ~/logs/rocketmqconnect/connect_runtime.log -> ->2019-07-16 11:18:39 INFO pool-7-thread-1 - **Source task start**, config:{"properties":{"source-record-... +>Start connector fileSourceConnector and set target state STARTED successed!! #### source connector配置说明 | key | nullable | default | description | |-------------------| -------- | ---------------------|--------------------------| | connector.class | false | | 实现 Connector接口的类名称(包含包名) | -| filename | false | | 数据源文件名称 | -| connect.topicname | false | | 同步文件数据所需topic | +| filename | false | | 数据源端文件名称(建议使用绝对路径) | +| connect.topicname | false | | 同步文件数据所使用的RocketMQ topic | ## 5.启动sink connector +```shell +curl -X POST -H "Content-Type: application/json" http://127.0.0.1:8082/connectors/fileSinkConnector -d '{ + "connector.class": "org.apache.rocketmq.connect.file.FileSinkConnector", + "filename": "/Users/YourUsername/rocketmqconnect/test-sink-file.txt", + "connect.topicnames": "fileTopic" +}' ``` -curl -X POST -H "Content-Type: application/json" http://127.0.0.1:8082/connectors/fileSinkConnector -d '{"connector.class":"org.apache.rocketmq.connect.file.FileSinkConnector","filename":"test-sink-file.txt","connect.topicnames":"fileTopic"}' -cat test-sink-file.txt +curl请求返回status:200则表示创建成功,返回样例: +>{"status":200,"body":{"connector.class":"org.apache.rocketmq.connect.file.FileSinkConnector","filename":"/Users/YourUsername/rocketmqconnect/test-sink-file.txt","connect.topicnames":"fileTopic"}} + +看到以下日志说明file sink connector 启动成功了 +```shell +tail -100f ~/logs/rocketmqconnect/connect_runtime.log ``` +> Start connector fileSinkConnector and set target state STARTED successed!! +查看sink connector是否将数据写入了目的端文件: +```shell +cat /Users/YourUsername/rocketmqconnect/test-sink-file.txt +``` -> tail -100f ~/logs/rocketmqconnect/connect_runtime.log +如果生成了 test-sink-file.txt 文件,并且与 source-file.txt 内容一样则说明整个流程正常运行。 -看到以下日志说明file sink connector 启动成功了 +继续向源端文件 test-source-file.txt 中写入测试数据, +```shell +cd /Users/YourUsername/rocketmqconnect/ -> 2019-07-16 11:24:58 INFO pool-7-thread-2 - **Sink task start**, config:{"properties":{"source-record-... 
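+# Append more test lines (keep them non-empty; the sample source connector errors on blank lines)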
+echo "Say Hi to\r\nRMQ Connector\r\nAgain" >> test-source-file.txt + +# Wait a few seconds, check if rocketmq-connect replicate data to sink file succeed +sleep 10 +cat /Users/YourUsername/rocketmqconnect/test-sink-file.txt +``` + +**注意**:文件内容可能顺序不一样,这是因为 `rocketmq-connect-sample` 向RocketMQ Topic中收发消息时,使用的消息类型是普通消息,区别于顺序消息,消费普通消息时是不保证顺序的。 -如果 test-sink-file.txt 生成并且与 source-file.txt 内容一样,说明整个流程正常运行。 -文件内容可能顺序不一样,这主要是因为RocketMQ发到不同queue时,接收不同queue消息顺序可能也不一致导致的,是正常的。 #### sink connector配置说明 -| key | nullable | default | description | -|--------------------| -------- | ------- | -------------------------------------------------------------------------------------- | -| connector.class | false | | 实现Connector接口的类名称(包含包名) | -| filename | false | | sink拉去的数据保存到文件 | -| connect.topicnames | false | | sink需要处理数据消息topics | +| key | nullable | default | description | +|--------------------| -------- | ------- |----------------------------------------| +| connector.class | false | | 实现Connector接口的类名称(包含包名) | +| filename | false | | sink消费RocketMQ数据后保存到的目的端文件名称(建议使用绝对路径) | +| connect.topicnames | false | | sink需要处理数据消息topics | -``` -注:source/sink配置文件说明是以rocketmq-connect-sample为demo,不同source/sink connector配置有差异,请以具体sourc/sink connector 为准 -``` +**注意**:source/sink配置文件说明是以rocketmq-connect-sample为demo,不同source/sink connector配置有差异,请以具体sourc/sink connector 为准 ## 6.停止connector - -```shell -GET请求 -http://(your worker ip):(port)/connectors/(connector name)/stop +RESTFul 命令格式 `http://(your worker ip):(port)/connectors/(connector name)/stop` 停止demo中的两个connector -curl http://127.0.0.1:8082/connectors/fileSinkConnector/stop -curl http://127.0.0.1:8082/connectors/fileSourceConnector/stop - +```shell +curl http://127.0.0.1:8082/connectors/fileSinkConnector/stop +curl http://127.0.0.1:8082/connectors/fileSourceConnector/stop ``` -看到以下日志说明connector停止成功了 ->**Source task stop**, config:{"properties":{"source-record-converter":"org.apache.rocketmq.connect.runtime.converter.JsonConverter","filename":"/home/zhoubo/IdeaProjects/my-new3-rocketmq-externals/rocketmq-connect/rocketmq-connect-runtime/source-file.txt","task-class":"org.apache.rocketmq.connect.file.FileSourceTask","topic":"fileTopic","connector-class":"org.apache.rocketmq.connect.file.FileSourceConnector","update-timestamp":"1564765189322"}} +curl请求返回status:200则表示停止成功,返回样例: +>{"status":200,"body":"Connector [fileSinkConnector] deleted successfully"} + +看到以下日志说明file sink connector 停止成功了 +```shell +tail -100f ~/logs/rocketmqconnect/connect_default.log +``` +> Completed shutdown for connectorName:fileSinkConnector ## 7.停止Worker进程 -``` +```shell +cd $RMQ_CONNECT_HOME/distribution/target/rocketmq-connect-0.0.1-SNAPSHOT/rocketmq-connect-0.0.1-SNAPSHOT sh bin/connectshutdown.sh ``` ## 8.日志目录 +查看日志目录(下面2个命令是等价的) +```shell +ls $HOME/logs/rocketmqconnect +ls ~/logs/rocketmqconnect +``` ->${user.home}/logs/rocketmqconnect - -## 9.配置文件 - -持久化配置文件默认目录 /tmp/storeRoot - -| key | description | -|----------------------|---------------------------| -| connectorConfig.json | connector配置持久化文件 | -| position.json | source connect数据处理进度持久化文件 | -| taskConfig.json | task配置持久化文件 | -| offset.json | sink connect数据消费进度持久化文件 | -| connectorStatus.json | connector 状态持久化文件 | -| taskStatus.json | task 状态持久化文件 | - -## 10.配置说明 +## 9.配置文件说明 -可根据使用情况修改 [RESTful](https://restfulapi.cn/) 端口,storeRoot 路径,Nameserver 地址等信息 +connect-standalone.conf配置文件中, 配置了 [RESTful](https://restfulapi.cn/) 端口,storeRoot 路径,Nameserver 地址等信息,可根据需要进行修改。 -文件位置:work 启动目录下 conf/connect-standalone.conf +配置文件样例: ```shell #current 
cluster node uniquely identifies @@ -166,17 +224,26 @@ workerId=DEFAULT_WORKER_1 httpPort=8082 # Local file dir for config store -storePathRootDir=/home/connect/storeRoot +storePathRootDir=/Users/YourUsername/rocketmqconnect/storeRoot #需要修改为自己的rocketmq nameserver 接入点 # RocketMQ namesrvAddr namesrvAddr=127.0.0.1:9876 -#用于加载Connector插件,类似于jvm启动加载jar包或者class类,这里目录目录用于放Connector相关的实现插件, -支持文件和目录 -# Source or sink connector jar file dir -pluginPaths=rocketmq-connect-sample/target/rocketmq-connect-sample-0.0.1-SNAPSHOT.jar +# 插件地址,用于Worker加载Source/Sink Connector插件 +# rocketmq-connect 工程已默认包含 rocketmq-connect-sample 模块,因此这里无需配置。 +pluginPaths= +``` + +storePathRootDir配置说明: -# 补充:将 Connector 相关实现插件保存到指定文件夹 -# pluginPaths=/usr/local/connector-plugins/* -``` \ No newline at end of file +单机模式(standalone)下,RocketMQ Connect 会把同步位点信息持久化到本地文件目录 storePathRootDir,持久化文件包括 + +| key | description | +|----------------------|---------------------------| +| connectorConfig.json | connector配置持久化文件 | +| position.json | source connect数据处理进度持久化文件 | +| taskConfig.json | task配置持久化文件 | +| offset.json | sink connect数据消费进度持久化文件 | +| connectorStatus.json | connector 状态持久化文件 | +| taskStatus.json | task 状态持久化文件 | diff --git a/docs/10-connect/07RocketMQ Connect In Action4.md b/docs/10-connect/07RocketMQ Connect In Action4.md index 2a9c439d6b..34588b9b1b 100644 --- a/docs/10-connect/07RocketMQ Connect In Action4.md +++ b/docs/10-connect/07RocketMQ Connect In Action4.md @@ -1,6 +1,6 @@ # RocketMQ Connect实战4 -SFTP Server(文件数据) -> RocketMQ Connect +SFTP Server(文件数据) -> RocketMQ Connect -> SFTP Server(文件) ## 准备 @@ -9,52 +9,67 @@ SFTP Server(文件数据) -> RocketMQ Connect 1. Linux/Unix/Mac 2. 64bit JDK 1.8+; 3. Maven 3.2.x或以上版本; -4. 启动 [RocketMQ](https://rocketmq.apache.org/docs/quick-start/); +4. 启动 RocketMQ。使用[RocketMQ 4.x](https://rocketmq.apache.org/docs/4.x/) 或 + [RocketMQ 5.x](https://rocketmq.apache.org/docs/quickStart/01quickstart/)版本均可; +5. 工具测试 RocketMQ 消息收发是否正常。详见[RocketMQ 4.x](https://rocketmq.apache.org/docs/4.x/) 或 + [RocketMQ 5.x](https://rocketmq.apache.org/docs/quickStart/01quickstart/)文档。 +这里利用环境变量NAMESRV_ADDR来告诉工具客户端RocketMQ的NameServer地址为localhost:9876 +```shell +#$ cd distribution/target/rocketmq-4.9.7/rocketmq-4.9.7 +$ cd distribution/target/rocketmq-5.1.4/rocketmq-5.1.4 -**提示** : ${ROCKETMQ_HOME} 位置说明 +$ export NAMESRV_ADDR=localhost:9876 +$ sh bin/tools.sh org.apache.rocketmq.example.quickstart.Producer + SendResult [sendStatus=SEND_OK, msgId= ... ->bin-release.zip 版本:/rocketmq-all-4.9.4-bin-release -> ->source-release.zip 版本:/rocketmq-all-4.9.4-source-release/distribution +$ sh bin/tools.sh org.apache.rocketmq.example.quickstart.Consumer + ConsumeMessageThread_%d Receive New Messages: [MessageExt... 
+``` +**说明**:RocketMQ具备自动创建Topic和Group的功能,在发送消息或订阅消息时,如果相应的Topic或Group不存在,RocketMQ会自动创建它们。因此不需要提前创建Topic和Group。 -### 启动Connect +### 构建 Connector Runtime +```shell +git clone https://github.com/apache/rocketmq-connect.git -#### Connector插件编译 +cd rocketmq-connect -RocketMQ Connector SFTP -``` -$ cd rocketmq-connect/connectors/rocketmq-connect-sftp/ -$ mvn clean package -Dmaven.test.skip=true -``` +export RMQ_CONNECT_HOME=`pwd` -将 RocketMQ Connector SFTP 编译好的包放入Runtime加载目录。命令如下: -``` -mkdir -p /usr/local/connector-plugins -cp target/rocketmq-connect-sftp-0.0.1-SNAPSHOT-jar-with-dependencies.jar /usr/local/connector-plugins +mvn -Prelease-connect -Dmaven.test.skip=true clean install -U ``` -#### 启动Connect Runtime +### 构建 SFTP Connector Plugin ``` -cd rocketmq-connect +cd $RMQ_CONNECT_HOME/connectors/rocketmq-connect-sftp/ -mvn -Prelease-connect -DskipTests clean install -U +mvn clean package -Dmaven.test.skip=true +``` +将 SFTP RocketMQ Connector 编译好的包放入Runtime加载的Plugin目录 ``` +mkdir -p /Users/YourUsername/rocketmqconnect/connector-plugins +cp target/rocketmq-connect-sftp-0.0.1-SNAPSHOT-jar-with-dependencies.jar /Users/YourUsername/rocketmqconnect/connector-plugins +``` + +### 单机模式运行 Connector Worker + +`connect-standalone.conf`中配置了RocketMQ连接地址等信息,需要根据使用情况进行修改 -修改配置`connect-standalone.conf` ,重点配置如下 ``` -$ cd distribution/target/rocketmq-connect-0.0.1-SNAPSHOT/rocketmq-connect-0.0.1-SNAPSHOT -$ vim conf/connect-standalone.conf +cd $RMQ_CONNECT_HOME/distribution/target/rocketmq-connect-0.0.1-SNAPSHOT/rocketmq-connect-0.0.1-SNAPSHOT + +vim conf/connect-standalone.conf ``` +示例配置信息如下 ``` workerId=standalone-worker -storePathRootDir=/tmp/storeRoot +storePathRootDir=/Users/YourUsername/rocketmqconnect/storeRoot ## Http port for user to access REST API httpPort=8082 @@ -64,86 +79,148 @@ namesrvAddr=localhost:9876 # RocketMQ acl aclEnable=false -accessKey=rocketmq -secretKey=12345678 +#accessKey=rocketmq +#secretKey=12345678 -autoCreateGroupEnable=false clusterName="DefaultCluster" -# 核心配置,将之前编译好包的插件目录配置在此; -# Source or sink connector jar file dir,The default value is rocketmq-connect-sample -pluginPaths=/usr/local/connector-plugins +# 插件地址,用于Worker加载Source/Sink Connector插件 +pluginPaths=/Users/YourUsername/rocketmqconnect/connector-plugins ``` +单机模式(standalone)下,RocketMQ Connect 会把同步位点信息持久化到本地文件目录 storePathRootDir +>storePathRootDir=/Users/YourUsername/rocketmqconnect/storeRoot +如果想重置同步位点,则需要删除持久化的位点信息文件 +```shell +rm -rf /Users/YourUsername/rocketmqconnect/storeRoot/* ``` -cd distribution/target/rocketmq-connect-0.0.1-SNAPSHOT/rocketmq-connect-0.0.1-SNAPSHOT +采用单机模式启动Connector Worker +``` sh bin/connect-standalone.sh -c conf/connect-standalone.conf & - ``` -### SFTP 服务器搭建 +### 搭建 SFTP 服务器 +SFTP(SSH File Transfer Protocol)是一个文件传输协议,用于在计算机之间进行安全的文件传输。SFTP建立在SSH连接之上,它是通过SSH(Secure Shell)协议进行加密和身份验证的。 + +这里为了方便演示,使用 MAC OS 自带的 SFTP 服务(只需开启“远程登录”即可访问),详细参见[允许远程电脑访问你的 Mac](https://support.apple.com/zh-cn/guide/mac-help/mchlp1066/mac)文档。 -使用 MAC OS 自带的 SFTP 服务器 +### 创建源端测试文件 -[允许远程电脑访问你的 Mac](https://support.apple.com/zh-cn/guide/mac-help/mchlp1066/mac) +创建源端测试文件 source.txt ,并写入测试数据 -### 测试数据 +``` +mkdir -p /Users/YourUsername/rocketmqconnect/sftp-test/ -登陆 SFTP 服务器,将具有如何内容的 souce.txt 文件放入用户目录,例如:/path/to/ +cd /Users/YourUsername/rocketmqconnect/sftp-test/ -```text -张三|100000202211290001|20221129001|30000.00|2022-11-28|03:00:00|7.00 +touch source.txt + +echo '张三|100000202211290001|20221129001|30000.00|2022-11-28|03:00:00|7.00 李四|100000202211290002|20221129002|40000.00|2022-11-28|04:00:00|9.00 
-赵五|100000202211290003|20221129003|50000.00|2022-11-28|05:00:00|12.00 +赵五|100000202211290003|20221129003|50000.00|2022-11-28|05:00:00|12.00' >> source.txt +``` + +登录 SFTP 服务,验证是否能正常访问。输入下面命令,输入密码后即可进入SFTP服务器 +```shell +# sftp -P port YourUsername@hostname +sftp -P 22 YourUsername@127.0.0.1 +``` +**说明**:由于是本机MAC OS提供的SFTP服务,所以地址是 127.0.0.1, 端口是默认的22。 + +```shell +sftp> cd /Users/YourUsername/rocketmqconnect/sftp-test/ +sftp> ls source.txt +sftp> bye ``` ## 启动Connector ### 启动 SFTP source connector -同步 SFTP 文件:source.txt -作用:通过登陆 SFTP 服务器,解析文件并封装成通用的ConnectRecord对象,发送的RocketMQ Topic当中 +运行以下命令启动 SFTP source connector,connector将会连接到SFTP服务读取source.txt文件, +每读取文件中的一行内容,就会解析并封装成通用的ConnectRecord对象,发送到RocketMQ Topic当中, +供Sink Connector进行消费。 ```shell curl -X POST --location "http://localhost:8082/connectors/SftpSourceConnector" --http1.1 \ -H "Host: localhost:8082" \ -H "Content-Type: application/json" \ - -d "{ - \"connector.class\": \"org.apache.rocketmq.connect.http.sink.SftpSourceConnector\", - \"host\": \"127.0.0.1\", - \"port\": 22, - \"username\": \"wencheng\", - \"password\": \"1617\", - \"filePath\": \"/Users/wencheng/Documents/source.txt\", - \"connect.topicname\": \"sftpTopic\", - \"fieldSeparator\": \"|\", - \"fieldSchema\": \"username|idCardNo|orderNo|orderAmount|trxDate|trxTime|profit\" - }" + -d '{ + "connector.class": "org.apache.rocketmq.connect.http.sink.SftpSourceConnector", + "host": "127.0.0.1", + "port": 22, + "username": "YourUsername", + "password": "yourPassword", + "filePath": "/Users/YourUsername/rocketmqconnect/sftp-test/source.txt", + "connect.topicname": "sftpTopic", + "fieldSeparator": "|", + "fieldSchema": "username|idCardNo|orderNo|orderAmount|trxDate|trxTime|profit" + }' +``` + +curl请求返回status:200则表示创建成功,返回样例: +>{"status":200,"body":{"connector.class":"... + +看到以下日志说明 file source connector 启动成功了 +```shell +tail -100f ~/logs/rocketmqconnect/connect_runtime.log ``` -运行完以上命令后,SFTP 服务上的文件数据会被组织成给定格式的数据,写入 MQ。之后可以通过 sink connector 或者其他业务系统去消费它。 +>Start connector SftpSourceConnector and set target state STARTED successed!! ### 启动 SFTP sink connector -作用:通过消费Topic中的数据,使用SFTP协议写入到目标文件当中 +运行以下命令启动 SFTP sink connector,connector将会订阅RocketMQ Topic的数据进行消费, +并将每个消息转换为一行文字内容,然后通过SFTP协议写入到sink.txt文件中去。 ```shell curl -X POST --location "http://localhost:8082/connectors/SftpSinkConnector" --http1.1 \ -H "Host: localhost:8082" \ -H "Content-Type: application/json" \ - -d "{ - \"connector.class\": \"org.apache.rocketmq.connect.http.sink.SftpSinkConnector\", - \"host\": \"127.0.0.1\", - \"port\": 22, - \"username\": \"wencheng\", - \"password\": \"1617\", - \"filePath\": \"/Users/wencheng/Documents/sink.txt\", - \"connect.topicnames\": \"sftpTopic\", - \"fieldSeparator\": \"|\", - \"fieldSchema\": \"username|idCardNo|orderNo|orderAmount|trxDate|trxTime|profit\" - }" -``` - -**** \ No newline at end of file + -d '{ + "connector.class": "org.apache.rocketmq.connect.http.sink.SftpSinkConnector", + "host": "127.0.0.1", + "port": 22, + "username": "YourUsername", + "password": "yourPassword", + "filePath": "/Users/YourUsername/rocketmqconnect/sftp-test/sink.txt", + "connect.topicnames": "sftpTopic", + "fieldSeparator": "|", + "fieldSchema": "username|idCardNo|orderNo|orderAmount|trxDate|trxTime|profit" + }' +``` + +curl请求返回status:200则表示创建成功,返回样例: +>{"status":200,"body":{"connector.class":"... + +看到以下日志说明 file source connector 启动成功了 +```shell +tail -100f ~/logs/rocketmqconnect/connect_runtime.log +``` + +>Start connector SftpSinkConnector and set target state STARTED successed!! 
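+
+If the destination file checked in the next step stays empty, a useful intermediate check is whether messages are actually flowing through the `sftpTopic` topic. Below is a sketch using `mqadmin` from the RocketMQ distribution started in the preparation step; the path and version follow that step and may differ in your setup:
+
+```shell
+cd distribution/target/rocketmq-5.1.4/rocketmq-5.1.4
+
+# Non-zero max offsets mean the source connector has published messages to sftpTopic
+sh bin/mqadmin topicStatus -t sftpTopic -n localhost:9876
+
+# Lists consumer groups and their lag, including the group used by the sink connector
+sh bin/mqadmin consumerProgress -n localhost:9876
+```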
+ + +查看sink connector是否将数据写入了目的端文件: +```shell +cat /Users/YourUsername/rocketmqconnect/sftp-test/sink.txt +``` + +如果生成了 sink.txt 文件,并且与 source.txt 内容一样则说明整个流程正常运行。 + +继续向源端文件 source.txt 中写入测试数据, +```shell +cd /Users/YourUsername/rocketmqconnect/sftp-test/ + +echo '张三x|100000202211290001|20221129001|30000.00|2022-11-28|03:00:00|7.00 +李四x|100000202211290002|20221129002|40000.00|2022-11-28|04:00:00|9.00 +赵五x|100000202211290003|20221129003|50000.00|2022-11-28|05:00:00|12.00' >> source.txt + +# Wait a few seconds, check if rocketmq-connect replicate data to sink file succeed +sleep 10 +cat /Users/YourUsername/rocketmqconnect/sftp-test/sink.txt +``` + +**注意**:文件内容可能顺序不一样,这是因为`rocketmq-connect-sftp`向RocketMQ Topic中收发消息时,使用的消息类型是普通消息,区别于顺序消息,消费普通消息时是不保证顺序的。 diff --git a/docs/10-connect/08RocketMQ Connect In Action5-ES.md b/docs/10-connect/08RocketMQ Connect In Action5-ES.md index 783d8bf137..00011e6b82 100644 --- a/docs/10-connect/08RocketMQ Connect In Action5-ES.md +++ b/docs/10-connect/08RocketMQ Connect In Action5-ES.md @@ -1,6 +1,6 @@ # RocketMQ Connect实战5 -Elsticsearch Source - >RocketMQ Connect -> Elasticsearch Sink +Elasticsearch Source -> RocketMQ Connect -> Elasticsearch Sink ## 准备 @@ -9,53 +9,67 @@ Elsticsearch Source - >RocketMQ Connect -> Elasticsearch Sink 1. Linux/Unix/Mac 2. 64bit JDK 1.8+; 3. Maven 3.2.x或以上版本; -4. 启动 [RocketMQ](https://rocketmq.apache.org/docs/quick-start/); +4. 启动 RocketMQ。使用[RocketMQ 4.x](https://rocketmq.apache.org/docs/4.x/) 或 + [RocketMQ 5.x](https://rocketmq.apache.org/docs/quickStart/01quickstart/)版本均可; +5. 工具测试 RocketMQ 消息收发是否正常。详见[RocketMQ 4.x](https://rocketmq.apache.org/docs/4.x/) 或 + [RocketMQ 5.x](https://rocketmq.apache.org/docs/quickStart/01quickstart/)文档。 +这里利用环境变量NAMESRV_ADDR来告诉工具客户端RocketMQ的NameServer地址为localhost:9876 +```shell +#$ cd distribution/target/rocketmq-4.9.7/rocketmq-4.9.7 +$ cd distribution/target/rocketmq-5.1.4/rocketmq-5.1.4 -**tips** : ${ROCKETMQ_HOME} 位置说明 +$ export NAMESRV_ADDR=localhost:9876 +$ sh bin/tools.sh org.apache.rocketmq.example.quickstart.Producer + SendResult [sendStatus=SEND_OK, msgId= ... ->bin-release.zip 版本:/rocketmq-all-4.9.4-bin-release -> ->source-release.zip 版本:/rocketmq-all-4.9.4-source-release/distribution +$ sh bin/tools.sh org.apache.rocketmq.example.quickstart.Consumer + ConsumeMessageThread_%d Receive New Messages: [MessageExt... 
+``` +**说明**:RocketMQ具备自动创建Topic和Group的功能,在发送消息或订阅消息时,如果相应的Topic或Group不存在,RocketMQ会自动创建它们。因此不需要提前创建Topic和Group。 -### 启动Connect +### 构建 Connector Runtime +```shell +git clone https://github.com/apache/rocketmq-connect.git -#### Connector插件编译 +cd rocketmq-connect -Elasticsearch RocketMQ Connector -``` -$ cd rocketmq-connect/connectors/rocketmq-connect-elasticsearch/ -$ mvn clean package -Dmaven.test.skip=true -``` +export RMQ_CONNECT_HOME=`pwd` -将 Elasticsearch RocketMQ Connector 编译好的包放入Runtime加载目录。命令如下: -``` -mkdir -p /usr/local/connector-plugins -cp rocketmq-connect-elasticsearch/target/rocketmq-connect-elasticsearch-1.0.0-jar-with-dependencies.jar /usr/local/connector-plugins +mvn -Prelease-connect -Dmaven.test.skip=true clean install -U ``` - -#### 启动Connect Runtime +### 构建 Elasticsearch Connector Plugin ``` -cd rocketmq-connect +cd $RMQ_CONNECT_HOME/connectors/rocketmq-connect-elasticsearch/ -mvn -Prelease-connect -DskipTests clean install -U +mvn clean package -Dmaven.test.skip=true +``` +将 Elasticsearch RocketMQ Connector 编译好的包放入Runtime加载的Plugin目录 +``` +mkdir -p /Users/YourUsername/rocketmqconnect/connector-plugins +cp target/rocketmq-connect-elasticsearch-1.0.0-jar-with-dependencies.jar /Users/YourUsername/rocketmqconnect/connector-plugins ``` -修改配置`connect-standalone.conf` ,重点配置如下 +### 单机模式运行 Connector Worker + +`connect-standalone.conf`中配置了RocketMQ连接地址等信息,需要根据使用情况进行修改 + ``` -$ cd distribution/target/rocketmq-connect-0.0.1-SNAPSHOT/rocketmq-connect-0.0.1-SNAPSHOT -$ vim conf/connect-standalone.conf +cd $RMQ_CONNECT_HOME/distribution/target/rocketmq-connect-0.0.1-SNAPSHOT/rocketmq-connect-0.0.1-SNAPSHOT + +vim conf/connect-standalone.conf ``` +示例配置信息如下 ``` workerId=standalone-worker -storePathRootDir=/tmp/storeRoot +storePathRootDir=/Users/YourUsername/rocketmqconnect/storeRoot ## Http port for user to access REST API httpPort=8082 @@ -65,53 +79,207 @@ namesrvAddr=localhost:9876 # RocketMQ acl aclEnable=false -accessKey=rocketmq -secretKey=12345678 +#accessKey=rocketmq +#secretKey=12345678 -autoCreateGroupEnable=false clusterName="DefaultCluster" -# 核心配置,将之前编译好elasticsearch包的插件目录配置在此; -# Source or sink connector jar file dir,The default value is rocketmq-connect-sample -pluginPaths=/usr/local/connector-plugins +# 插件地址,用于Worker加载Source/Sink Connector插件 +pluginPaths=/Users/YourUsername/rocketmqconnect/connector-plugins ``` +单机模式(standalone)下,RocketMQ Connect 会把同步位点信息持久化到本地文件目录 storePathRootDir +>storePathRootDir=/Users/YourUsername/rocketmqconnect/storeRoot +如果想重置同步位点,则需要删除持久化的位点信息文件 +```shell +rm -rf /Users/YourUsername/rocketmqconnect/storeRoot/* ``` -cd distribution/target/rocketmq-connect-0.0.1-SNAPSHOT/rocketmq-connect-0.0.1-SNAPSHOT +采用单机模式启动Connector Worker +``` sh bin/connect-standalone.sh -c conf/connect-standalone.conf & +``` + +### 搭建 Elasticsearch 服务 + +Elasticsearch是一个开源的实时分布式搜索和分析引擎。 + +这里为了方便演示,使用 docker 搭建 2个 Elasticsearch 数据库,分别作为 Connector 连接的源和目的端ES数据库。 +``` +docker pull docker.elastic.co/elasticsearch/elasticsearch:7.15.1 +docker run --name es1 -p 9200:9200 -p 9300:9300 -e "discovery.type=single-node" -e "ES_JAVA_OPTS=-Xms1g -Xmx1g" \ + -v /Users/YourUsername/rocketmqconnect/es/es1_data:/usr/share/elasticsearch/data \ + -d docker.elastic.co/elasticsearch/elasticsearch:7.15.1 + +docker run --name es2 -p 9201:9200 -p 9301:9300 -e "discovery.type=single-node" -e "ES_JAVA_OPTS=-Xms1g -Xmx1g" \ + -v /Users/YourUsername/rocketmqconnect/es/es2_data:/usr/share/elasticsearch/data \ + -d docker.elastic.co/elasticsearch/elasticsearch:7.15.1 ``` -### Elasticsearch镜像 
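+
+Before moving on, it can help to confirm that both containers stayed up; a quick check assuming the container names `es1` and `es2` used above (the docker options themselves are explained below):
+
+```shell
+# Both es1 and es2 should be listed with an "Up ..." status
+docker ps
+```
+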
+**docker命令说明**: +- --name es2: 为容器指定一个名称,本例中为es2。 +- -p 9201:9200 -p 9301:9300: 将Elasticsearch的HTTP端口9200和传输端口9300分别映射到主机的9201和9301端口,以便可以通过主机访问Elasticsearch服务。 +- -e "discovery.type=single-node": 设置Elasticsearch的发现类型为单节点模式,这对于单机部署非常适用。 +- -v /Users/YourUsername/rocketmqconnect/es/es2_data:/usr/share/elasticsearch/data: 将主机上的一个目录挂载到容器内的/usr/share/elasticsearch/data目录,用于持久化存储Elasticsearch数据。 + +通过以上命令,您可以运行一个带有自定义配置和数据存储的Elasticsearch容器,并且可以通过主机的9200端口访问其HTTP API。这是在本地开发或测试环境中运行独立的Elasticsearch实例的常见方式。 + + +查看ES日志,查看启动是否有报错 +``` +docker logs -f es1 -使用 docker 搭建环境 Elasticsearch 数据库 +docker logs -f es2 ``` -# starting a elasticsearch instance -docker run --name my-elasticsearch -p 9200:9200 -p 9300:9300 -e "ES_JAVA_OPTS=-Xms1g -Xmx1g" -d 74c2e0ec249c + +使用curl命令检查Elasticsearch是否正常 + +``` +# check es1 +curl -XGET http://localhost:9200 + +# check es2 +curl -XGET http://localhost:9201 ``` -### Kibana镜像 -使用 docker 搭建环境 Kibana +如果成功连接并且Elasticsearch已正常启动,您将看到与Elasticsearch相关的信息和版本号的JSON响应。 + +### 搭建 Kibana 服务 + +Kibana是一个开源的数据可视化工具,用于对Elasticsearch中存储的数据进行搜索、分析和可视化展示。 +它提供了丰富的图表、图形和仪表盘等功能,使用户能够以直观的方式理解和探索数据。 + +这里为了方便演示,使用 docker 搭建 2个 Kibana 服务,分别连接前面搭建的2个ES数据库。 + ``` -docker run --name my-kibana -e ELASTICSEARCH_URL=http://192.168.0.101:9200 -p 5601:5601 -d 5dca66b41943 +docker pull docker.elastic.co/kibana/kibana:7.15.1 + +docker run --name kibana1 --link es1:elasticsearch -p 5601:5601 -d docker.elastic.co/kibana/kibana:7.15.1 + +docker run --name kibana2 --link es2:elasticsearch -p 5602:5601 -d docker.elastic.co/kibana/kibana:7.15.1 + ``` +**docker命令说明**: +- --name kibana2: 为容器指定一个名称,本例中为kibana2。 +- --link es2:elasticsearch: 将容器链接到另一个名为es2的Elasticsearch容器。这将允许Kibana实例连接和与Elasticsearch进行通信。 +- -p 5602:5601: 将Kibana的默认端口5601映射到主机的5602端口,以便可以通过主机访问Kibana的用户界面。 +- -d: 在后台运行容器。 +通过以上命令,您可以在Docker容器中启动一个独立的Kibana实例,并将其连接到另一个正在运行的Elasticsearch实例。 +这样,您可以通过浏览器访问主机的5601、5602端口,来分别访问Kibana1、Kibana2控制台。 -### 测试数据 +查看Kibana日志,查看启动是否有报错 +``` +docker logs -f kibana1 -通过 kibana Dev Tools 创建测试数据:参考 [console-ibana](https://www.elastic.co/guide/en/kibana/8.5/console-kibana.html#console-kibana); +docker logs -f kibana2 +``` +使用浏览器访问 kibana 控制台,地址 +- kibana1: http://localhost:5601 +- kibana2:http://localhost:5602 -源索引:connect_es +如果控制台页面能正常打开,则说明Kibana已正常启动。 + +### 向源端ES写入测试数据 +Kibana 的 Dev Tools 可以帮助您在 Kibana 中与 Elasticsearch 进行直接的交互和操作,执行各种查询和操作,并分析和理解返回的数据。 +参见文档 [console-kibana](https://www.elastic.co/guide/en/kibana/8.9/console-kibana.html)。 + +#### 批量写入测试数据 +浏览器访问Kibana1控制台,左侧菜单找到Dev Tools,进入页面后输入如下命令写入测试数据 +``` +POST /_bulk +{ "index" : { "_index" : "connect_es" } } +{ "id": "1", "field1": "value1", "field2": "value2" } +{ "index" : { "_index" : "connect_es" } } +{ "id": "2", "field1": "value3", "field2": "value4" } +``` +**说明**: +- connect_es:数据的索引名称 +- id/field1/field2:数据中的字段名称,1、value1、value2 分别是字段的值。 + +**注意**:`rocketmq-connect-elasticsearch` 存在一个限制,就是数据中必须要一个可用于 >= 比较运算的字段(字符串 或 数字),该字段会被用于记录同步的位点信息。 +上面的示例中 `id` 字段,就是一个全局唯一、自增的数值类型字段。 + +#### 查数据 +查询索引下的数据: +``` +GET /connect_es/_search +{ + "size": 100 +} +``` + +若无数据,则返回示例为: +``` +{ + "error" : { + ... + "type" : "index_not_found_exception", + "reason" : "no such index [connect_es]", + "resource.type" : "index_or_alias", + "resource.id" : "connect_es", + "index_uuid" : "_na_", + "index" : "connect_es" + }, + "status" : 404 +} +``` + +若有数据,则返回示例为: + +``` +{ + ... 
+ "hits" : { + "total" : { + "value" : 2, + "relation" : "eq" + }, + "max_score" : 1.0, + "hits" : [ + { + "_index" : "connect_es", + "_type" : "_doc", + "_id" : "_dx49osBb46Z9cN4hYCg", + "_score" : 1.0, + "_source" : { + "id" : "1", + "field1" : "value1", + "field2" : "value2" + } + }, + { + "_index" : "connect_es", + "_type" : "_doc", + "_id" : "_tx49osBb46Z9cN4hYCg", + "_score" : 1.0, + "_source" : { + "id" : "2", + "field1" : "value3", + "field2" : "value4" + } + } + ] + } +} + +``` + +#### 删除数据 +如果因重复测试等原因,需要删除索引下的数据,则可使用如下命令 +``` +DELETE /connect_es +``` ## 启动Connector ### 启动Elasticsearch source connector -同步源索引数据:connect_es -作用:通过解析 Elasticsearch 文档数据封装成通用的ConnectRecord对象,发送的RocketMQ Topic当中 +运行以下命令启动 ES source connector,connector将会连接到ES读取 connect_es 索引下的文档数据, +并解析 Elasticsearch 文档数据封装成通用的ConnectRecord对象,发送到RocketMQ Topic当中, 供Sink Connector进行消费。 ``` curl -X POST -H "Content-Type: application/json" http://127.0.0.1:8082/connectors/elasticsearchSourceConnector -d '{ @@ -131,29 +299,57 @@ curl -X POST -H "Content-Type: application/json" http://127.0.0.1:8082/connector }' ``` -### 启动 Elasticsearch sink connector +**说明**:启动命令中指定了源端ES要同步的索引为 connect_es ,以及 索引下自增的字段为 id ,并从id=1开始拉取数据。 -作用:通过消费Topic中的数据,写入到目标索引当中 +curl请求返回status:200则表示创建成功,返回样例: +>{"status":200,"body":{"connector.class":"... + +看到以下日志说明 file source connector 启动成功了 +```shell +tail -100f ~/logs/rocketmqconnect/connect_runtime.log ``` -curl -X POST -H "Content-Type: application/json" http://127.0.0.1:8082/connectors/ElasticsearchSinkConnector -d '{ + +>Start connector elasticsearchSourceConnector and set target state STARTED successed!! + + +### 启动 Elasticsearch sink connector +运行以下命令启动 ES sink connector,connector将会订阅RocketMQ Topic的数据进行消费, +并将每个消息转换为文档数据写入到目的端ES当中。 + +``` +curl -X POST -H "Content-Type: application/json" http://127.0.0.1:8082/connectors/elasticsearchSinkConnector -d '{ "connector.class":"org.apache.rocketmq.connect.elasticsearch.connector.ElasticsearchSinkConnector", "elasticsearchHost":"localhost", - "elasticsearchPort":9202, + "elasticsearchPort":9201, "max.tasks":2, "connect.topicnames":"ConnectEsTopic", "value.converter":"org.apache.rocketmq.connect.runtime.converter.record.json.JsonConverter", "key.converter":"org.apache.rocketmq.connect.runtime.converter.record.json.JsonConverter" }' +``` +**说明**:启动命令中指定了目的端ES地址和端口,对应之前docker启动的es2。 + +curl请求返回status:200则表示创建成功,返回样例: +>{"status":200,"body":{"connector.class":"... + +看到以下日志说明 file source connector 启动成功了 +```shell +tail -100f ~/logs/rocketmqconnect/connect_runtime.log ``` -note:本地测试需要启动两个不同端口的Elasticsearch进程 +>Start connector elasticsearchSinkConnector and set target state STARTED successed!! -以上两个Connector任务创建成功以后 -通过访问sink指定的Elasticsearch是否包含数据 +查看sink connector是否将数据写入了目的端ES的索引当中: +1. 浏览器访问 Kibana2 控制台地址 http://localhost:5602 +2. Kibana2 Dev Tools 页面,查询索引下的数据,若跟源端 es1 中的数据一致则说明Connector运行正常。 +``` +GET /connect_es/_search +{ + "size": 100 +} +``` -对源索引的新增数据 -即可同步到目标索引当中 diff --git a/i18n/en/docusaurus-plugin-content-docs/current/10-connect/03RocketMQ Connect Quick Start.md b/i18n/en/docusaurus-plugin-content-docs/current/10-connect/03RocketMQ Connect Quick Start.md index b6a0c363fb..735bf608fe 100644 --- a/i18n/en/docusaurus-plugin-content-docs/current/10-connect/03RocketMQ Connect Quick Start.md +++ b/i18n/en/docusaurus-plugin-content-docs/current/10-connect/03RocketMQ Connect Quick Start.md @@ -2,163 +2,242 @@ # Quick Start -In standalone mode, [rocketmq-connect-sample] serves as a demo. 
+This tutorial will start a RocketMQ Connector example project "rocketmq-connect-sample" in standalone mode to help you understand the working principle of connectors. +The example project provides a source connector that reads data from source files and sends it to the RocketMQ cluster. +It also provides a sink connector that reads messages from the RocketMQ cluster and writes them to destination files. -The main purpose of rocketmq-connect-sample is to read data from a source file and send it to a RocketMQ cluster, and then read messages from the Topic and write them to a target file. - -## 1. Prepare +## 1. Preparation: Start RocketMQ 1. Linux/Unix/Mac 2. 64bit JDK 1.8+; 3. Maven 3.2.x+; -4. Start [RocketMQ](https://rocketmq.apache.org/docs/quick-start/); -5. Create test Topic +4. Start RocketMQ. Either [RocketMQ 4.x](https://rocketmq.apache.org/docs/4.x/) or + [RocketMQ 5.x](https://rocketmq.apache.org/docs/quickStart/01quickstart/) 5.x version can be used; +5. Test RocketMQ message sending and receiving using the tool. -> sh ${ROCKETMQ_HOME}/bin/mqadmin updateTopic -t fileTopic -n localhost:9876 -c DefaultCluster -r 8 -w 8 +Here, use the environment variable NAMESRV_ADDR to inform the tool client of the NameServer address of RocketMQ as localhost:9876. -**tips** : ${ROCKETMQ_HOME} locational instructions +```shell +#$ cd distribution/target/rocketmq-4.9.7/rocketmq-4.9.7 +$ cd distribution/target/rocketmq-5.1.4/rocketmq-5.1.4 ->bin-release.zip version:/rocketmq-all-4.9.4-bin-release -> ->source-release.zip version:/rocketmq-all-4.9.4-source-release/distribution +$ export NAMESRV_ADDR=localhost:9876 +$ sh bin/tools.sh org.apache.rocketmq.example.quickstart.Producer + SendResult [sendStatus=SEND_OK, msgId= ... +$ sh bin/tools.sh org.apache.rocketmq.example.quickstart.Consumer + ConsumeMessageThread_%d Receive New Messages: [MessageExt... +``` -## 2. Build Connect +**Note**: RocketMQ has the feature of automatically creating Topic and Group. When sending or subscribing to messages, +if the corresponding Topic or Group does not exist, RocketMQ will automatically create them. Therefore, +there is no need to create Topic and Group in advance. -``` +## 2. Build Connector Runtime +```shell git clone https://github.com/apache/rocketmq-connect.git cd rocketmq-connect -mvn -Prelease-connect -DskipTests clean install -U +export RMQ_CONNECT_HOME=`pwd` +mvn -Prelease-connect -Dmaven.test.skip=true clean install -U ``` -## 3. Run Worker +**Note**: The project already includes the code for rocketmq-connect-sample by default, +so there is no need to build the rocketmq-connect-sample plugin separately. +## 3. Run Connector Worker in Standalone Mode + +### Modify Configuration +Modify the `connect-standalone.conf` file to configure the RocketMQ connection +address and other information. Please refer to [9. Configuration File Instructions](#9-configuration-file-instructions) for details. ``` -cd distribution/target/rocketmq-connect-0.0.1-SNAPSHOT/rocketmq-connect-0.0.1-SNAPSHOT +cd $RMQ_CONNECT_HOME/distribution/target/rocketmq-connect-0.0.1-SNAPSHOT/rocketmq-connect-0.0.1-SNAPSHOT -sh bin/connect-standalone.sh -c conf/connect-standalone.conf & +vim conf/connect-standalone.conf +``` + +In standalone mode, RocketMQ Connect persists the synchronization checkpoint information +to the local file directory storePathRootDir. +>storePathRootDir=/Users/YourUsername/rocketmqconnect/storeRoot + +If you want to reset the synchronization checkpoint, you need to delete the persisted +checkpoint file. 
+ +```shell +rm -rf /Users/YourUsername/rocketmqconnect/storeRoot/* ``` -**tips**: The JVM Parameters Configuration can be adjusted in /bin/runconnect.sh as needed. +### Start Connector Worker in Standalone Mode + +```shell +sh bin/connect-standalone.sh -c conf/connect-standalone.conf & +``` + +**tips**: You can modify `docker/connect/bin/runconnect.sh` to adjust JVM startup +parameters as needed. >JAVA_OPT="${JAVA_OPT} -server -Xms256m -Xmx256m" -runtime start successful: +To view the startup log file: ->The standalone worker boot success. +```shell +tail -100f ~/logs/rocketmqconnect/connect_runtime.log +``` -View the startup log files. +If the runtime starts successfully, you will see the following print in the log file: ->tail -100f ~/logs/rocketmqconnect/connect_runtime.log +>The standalone worker boot success. -`ctrl + c` exit log +To exit the log tracking mode of `tail -f` command, you can press the `Ctrl + C` key combination. -## 4. Start source connector +## 4. Start Source Connector -Create a test file named test-source-file.txt in the current directory. +### Create Source File and Write Test Data -``` +```shell +mkdir -p /Users/YourUsername/rocketmqconnect/ +cd /Users/YourUsername/rocketmqconnect/ touch test-source-file.txt echo "Hello \r\nRocketMQ\r\n Connect" >> test-source-file.txt +``` +**Note**: There should be no empty lines (the demo program will throw an error if it +encounters empty lines). The source connector will continuously read the source file +and convert each line of data into a message body to be sent to RocketMQ for consumption +by the sink connector. + +### Start Source Connector + +```shell +curl -X POST -H "Content-Type: application/json" http://127.0.0.1:8082/connectors/fileSourceConnector -d '{ + "connector.class": "org.apache.rocketmq.connect.file.FileSourceConnector", + "filename": "/Users/YourUsername/rocketmqconnect/test-source-file.txt", + "connect.topicname": "fileTopic" +}' +``` + +If the curl request returns status 200, it indicates successful creation. Example response: +>{"status":200,"body":{"connector.class":"org.apache.rocketmq.connect.file.FileSourceConnector","filename":"/Users/YourUsername/rocketmqconnect/test-source-file.txt","connect.topicname":"fileTopic"}} -curl -X POST -H "Content-Type: application/json" http://127.0.0.1:8082/connectors/fileSourceConnector -d '{"connector.class":"org.apache.rocketmq.connect.file.FileSourceConnector","filename":"test-source-file.txt","connect.topicname":"fileTopic"}' +View the log file: +```shell +tail -100f ~/logs/rocketmqconnect/connect_runtime.log ``` -If you see the following log message, it means the file source connector has started successfully. +If you see the following log, it means the file source connector has started successfully: +>Start connector fileSourceConnector and set target state STARTED successed!! ->tail -100f ~/logs/rocketmqconnect/connect_runtime.log -> ->2019-07-16 11:18:39 INFO pool-7-thread-1 - **Source task start**, config:{"properties":{"source-record-... 
-#### source connector configuration instructions +#### Source Connector Configuration Instructions | key | nullable | default | description | | ----------------- | -------- | ------- | ------------------------------------------------------------ | | connector.class | false | | The class name (including the package name) that implements the Connector interface | -| filename | false | | source file name | +| filename | false | | The name of the source file (recommended to use absolute path) | | connect.topicname | false | | Topic required for synchronizing file data | ## 5. Start sink connector +```shell +curl -X POST -H "Content-Type: application/json" http://127.0.0.1:8082/connectors/fileSinkConnector -d '{ + "connector.class": "org.apache.rocketmq.connect.file.FileSinkConnector", + "filename": "/Users/YourUsername/rocketmqconnect/test-sink-file.txt", + "connect.topicnames": "fileTopic" +}' ``` -curl -X POST -H "Content-Type: application/json" http://127.0.0.1:8082/connectors/fileSinkConnector -d '{"connector.class":"org.apache.rocketmq.connect.file.FileSinkConnector","filename":"test-sink-file.txt","connect.topicnames":"fileTopic"}' -cat test-sink-file.txt +If the curl request returns status 200, it indicates successful creation. Example response: +>{"status":200,"body":{"connector.class":"org.apache.rocketmq.connect.file.FileSinkConnector","filename":"/Users/YourUsername/rocketmqconnect/test-sink-file.txt","connect.topicnames":"fileTopic"}} + +View the log file: +```shell +tail -100f ~/logs/rocketmqconnect/connect_runtime.log ``` +If you see the following log, it means the file sink connector has started successfully: +> Start connector fileSinkConnector and set target state STARTED successed!! -> tail -100f ~/logs/rocketmqconnect/connect_runtime.log +Check if the sink connector has written data to the destination file: +```shell +cat /Users/YourUsername/rocketmqconnect/test-sink-file.txt +``` -If you see the following log message, it means the file sink connector has started successfully. +If the test-sink-file.txt file is generated and its content is the same as the +test-source-file.txt, it means the entire process is running correctly. -> 2019-07-16 11:24:58 INFO pool-7-thread-2 - **Sink task start**, config:{"properties":{"source-record-... +Continue writing test data to the source file test-source-file.txt: +```shell +cd /Users/YourUsername/rocketmqconnect/ -If test-sink-file.txt is generated and its content is the same as source-file.txt, it means that the entire process is running normally. +echo "Say Hi to\r\nRMQ Connector\r\nAgain" >> test-source-file.txt -The file contents may be in a different order, which is normal because the order of messages received from different queues in RocketMQ may also be inconsistent. +# Wait a few seconds, check if rocketmq-connect replicate data to sink file succeed +sleep 10 +cat /Users/YourUsername/rocketmqconnect/test-sink-file.txt +``` + +**Note**: The order of file contents may vary because the `rocketmq-connect-sample` uses `normal message` when +sending and receiving messages to/from a RocketMQ topic. This is different from `ordered message`, and consuming +`normal messages` does not guarantee the order. 
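+
+If the sink file does not get updated, it can help to check whether the source connector actually published messages to `fileTopic` and whether they are being consumed. Below is a sketch using `mqadmin` from the RocketMQ distribution used in step 1; the path and version follow that step and may differ in your setup:
+
+```shell
+cd distribution/target/rocketmq-5.1.4/rocketmq-5.1.4
+
+# Non-zero max offsets mean the source connector has published messages to fileTopic
+sh bin/mqadmin topicStatus -t fileTopic -n localhost:9876
+
+# Lists consumer groups and their lag, including the group used by the sink connector
+sh bin/mqadmin consumerProgress -n localhost:9876
+```
+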
#### sink connector configuration instructions | key | nullable | default | description | | ------------------ | -------- | ------- | ------------------------------------------------------------ | | connector.class | false | | The class name (including the package name) that implements the Connector interface | -| filename | false | | The sink pulls data and saves it to a file. | -| connect.topicnames | false | | The topics of the data messages that the sink needs to process. | +| filename | false | | The sink pulls data and saves it to a file(recommended to use absolute path) | +| connect.topicnames | false | | The topics of the data messages that the sink needs to process | -``` -Tips:The configuration file instructions for the sample rocketmq-connect-sample are for reference only, different source/sink connectors have different configurations, please refer to the specific source/sink connector. -``` + +**Tips**:The configuration file instructions for the sample rocketmq-connect-sample are for reference only, different source/sink connectors have different configurations, please refer to the specific source/sink connector. ## 6. Stop connector +The RESTful command format for stopping connectors is +`http://(your worker ip):(port)/connectors/(connector name)/stop` +To stop the two connectors in the demo, you can use the following commands: ```shell -#GET request -http://(your worker ip):(port)/connectors/(connector name)/stop - -#Stopping the two connectors in the demo -curl http://127.0.0.1:8082/connectors/fileSinkConnector/stop -curl http://127.0.0.1:8082/connectors/fileSourceConnector/stop - +curl http://127.0.0.1:8082/connectors/fileSinkConnector/stop +curl http://127.0.0.1:8082/connectors/fileSourceConnector/stop ``` -Seeing the following log message indicates that the connector has been successfully stopped. +If the curl request returns a status of 200, it indicates successful stopping of the connectors. +Example response: +>{"status":200,"body":"Connector [fileSinkConnector] deleted successfully"} ->**Source task stop**, config:{"properties":{"source-record-converter":"org.apache.rocketmq.connect.runtime.converter.JsonConverter","filename":"/home/zhoubo/IdeaProjects/my-new3-rocketmq-externals/rocketmq-connect/rocketmq-connect-runtime/source-file.txt","task-class":"org.apache.rocketmq.connect.file.FileSourceTask","topic":"fileTopic","connector-class":"org.apache.rocketmq.connect.file.FileSourceConnector","update-timestamp":"1564765189322"}} - -## 7. Stopping the Worker process +If you see the following log message, it means the file sink connector has been +successfully shut down: +```shell +tail -100f ~/logs/rocketmqconnect/connect_default.log ``` +> Completed shutdown for connectorName:fileSinkConnector + +## 7. Stop the Worker process + +```shell +cd $RMQ_CONNECT_HOME/distribution/target/rocketmq-connect-0.0.1-SNAPSHOT/rocketmq-connect-0.0.1-SNAPSHOT sh bin/connectshutdown.sh ``` ## 8. Log directory ->${user.home}/logs/rocketmqconnect - -## 9. Configuration file - -The default directory for persistent configuration files is /tmp/storeRoot. 
+You can use the following commands to view the log directory: -| key | description | -| -------------------- | --------------------------------------------------------- | -| connectorConfig.json | Connector configuration persistence files | -| position.json | Source connect data processing progress persistence files | -| taskConfig.json | Task configuration persistence files | -| offset.json | Sink connect data consumption progress persistence files | -| connectorStatus.json | Connector status persistence files | -| taskStatus.json | Task status persistence files | +```shell +ls $HOME/logs/rocketmqconnect +ls ~/logs/rocketmqconnect +``` -## 10. Configuration Instructions +## 9. Configuration File Instructions Modify the RESTful port, storeRoot path, Nameserver address, and other information based on your usage. -The file location is in the conf/connect-standalone.conf under the work startup directory. +Here is an example of a configuration file: ```shell #current cluster node uniquely identifies @@ -168,16 +247,29 @@ workerId=DEFAULT_WORKER_1 httpPort=8082 # Local file dir for config store -storePathRootDir=/home/connect/storeRoot +storePathRootDir=/Users/YourUsername/rocketmqconnect/storeRoot #You need to modify it to your own rocketmq nameserver endpoint. # RocketMQ namesrvAddr namesrvAddr=127.0.0.1:9876 -#This is used for loading Connector plugins, similar to how JVM loads jar packages or classes at startup. This directory is used for placing Connector-related implementation plugins and supports both files and directories. -# Source or sink connector jar file dir -pluginPaths=rocketmq-connect-sample/target/rocketmq-connect-sample-0.0.1-SNAPSHOT.jar +# Plugin path for loading Source/Sink Connectors +# The rocketmq-connect project already includes the rocketmq-connect-sample module by default, so no configuration is needed here. +pluginPaths= +``` + +Explanation of storePathRootDir configuration: + +In standalone mode, RocketMQ Connect persists the synchronization checkpoint information +to the local file directory specified by storePathRootDir. The persistent files include: + + +| key | description | +| -------------------- | --------------------------------------------------------- | +| connectorConfig.json | Connector configuration persistence files | +| position.json | Source connect data processing progress persistence files | +| taskConfig.json | Task configuration persistence files | +| offset.json | Sink connect data consumption progress persistence files | +| connectorStatus.json | Connector status persistence files | +| taskStatus.json | Task status persistence files | -# Addition : put the Connector-related implementation plugins in the specified folder. -# pluginPaths=/usr/local/connector-plugins/* -``` \ No newline at end of file diff --git a/i18n/en/docusaurus-plugin-content-docs/current/10-connect/07RocketMQ Connect In Action4.md b/i18n/en/docusaurus-plugin-content-docs/current/10-connect/07RocketMQ Connect In Action4.md index 3f0dd01471..23d647d6e8 100644 --- a/i18n/en/docusaurus-plugin-content-docs/current/10-connect/07RocketMQ Connect In Action4.md +++ b/i18n/en/docusaurus-plugin-content-docs/current/10-connect/07RocketMQ Connect In Action4.md @@ -1,6 +1,6 @@ # RocketMQ Connect in Action 4 -SFTP Server(file data) -> RocketMQ Connect +SFTP Server (File Data) -> RocketMQ Connect -> SFTP Server (File) ## Preparation @@ -9,55 +9,70 @@ SFTP Server(file data) -> RocketMQ Connect 1. Linux/Unix/Mac 2. 64bit JDK 1.8+; 3. Maven 3.2.x+; -4. 
Start [RocketMQ](https://rocketmq.apache.org/docs/quick-start/); +4. Start RocketMQ. Either [RocketMQ 4.x](https://rocketmq.apache.org/docs/4.x/) or + [RocketMQ 5.x](https://rocketmq.apache.org/docs/quickStart/01quickstart/) 5.x version can be used; +5. Test RocketMQ message sending and receiving using the tool. +Here, use the environment variable NAMESRV_ADDR to inform the tool client of the NameServer address of RocketMQ as localhost:9876. +```shell +#$ cd distribution/target/rocketmq-4.9.7/rocketmq-4.9.7 +$ cd distribution/target/rocketmq-5.1.4/rocketmq-5.1.4 -**Tips** : ${ROCKETMQ_HOME} locational instructions +$ export NAMESRV_ADDR=localhost:9876 +$ sh bin/tools.sh org.apache.rocketmq.example.quickstart.Producer + SendResult [sendStatus=SEND_OK, msgId= ... ->bin-release.zip version:/rocketmq-all-4.9.4-bin-release -> ->source-release.zip version:/rocketmq-all-4.9.4-source-release/distribution +$ sh bin/tools.sh org.apache.rocketmq.example.quickstart.Consumer + ConsumeMessageThread_%d Receive New Messages: [MessageExt... +``` +**Note**: RocketMQ has the feature of automatically creating Topic and Group. When sending or subscribing to messages, +if the corresponding Topic or Group does not exist, RocketMQ will automatically create them. Therefore, +there is no need to create Topic and Group in advance. -### Start Connect +### Build Connector Runtime +```shell +git clone https://github.com/apache/rocketmq-connect.git -#### **Compiling Connector Plugin** +cd rocketmq-connect -RocketMQ Connector SFTP +export RMQ_CONNECT_HOME=`pwd` -``` -$ cd rocketmq-connect/connectors/rocketmq-connect-sftp/ -$ mvn clean package -Dmaven.test.skip=true +mvn -Prelease-connect -Dmaven.test.skip=true clean install -U ``` -Move the compiled RocketMQ Connector SFTP package into the Runtime loading directory. The command is as follows: +### Build SFTP Connector Plugin ``` -mkdir -p /usr/local/connector-plugins -cp target/rocketmq-connect-sftp-0.0.1-SNAPSHOT-jar-with-dependencies.jar /usr/local/connector-plugins -``` - -#### Start Connect Runtime +cd $RMQ_CONNECT_HOME/connectors/rocketmq-connect-sftp/ +mvn clean package -Dmaven.test.skip=true ``` -cd rocketmq-connect -mvn -Prelease-connect -DskipTests clean install -U +Put the compiled jar of the SFTP RocketMQ Connector into the Plugin directory for runtime loading. +``` +mkdir -p /Users/YourUsername/rocketmqconnect/connector-plugins +cp target/rocketmq-connect-sftp-0.0.1-SNAPSHOT-jar-with-dependencies.jar /Users/YourUsername/rocketmqconnect/connector-plugins ``` -Modify the configuration `connect-standalone.conf`, the main configuration is as follows +### Run Connector Worker in Standalone Mode + +Modify the `connect-standalone.conf` file to configure the RocketMQ connection +address and other information. 
``` -$ cd distribution/target/rocketmq-connect-0.0.1-SNAPSHOT/rocketmq-connect-0.0.1-SNAPSHOT -$ vim conf/connect-standalone.conf +cd $RMQ_CONNECT_HOME/distribution/target/rocketmq-connect-0.0.1-SNAPSHOT/rocketmq-connect-0.0.1-SNAPSHOT + +vim conf/connect-standalone.conf ``` +Example configuration information is as follows: ``` workerId=standalone-worker -storePathRootDir=/tmp/storeRoot +storePathRootDir=/Users/YourUsername/rocketmqconnect/storeRoot ## Http port for user to access REST API httpPort=8082 @@ -67,84 +82,164 @@ namesrvAddr=localhost:9876 # RocketMQ acl aclEnable=false -accessKey=rocketmq -secretKey=12345678 +#accessKey=rocketmq +#secretKey=12345678 -autoCreateGroupEnable=false clusterName="DefaultCluster" -# Core configuration, configure the plugin directory that was previously compiled here. -# Source or sink connector jar file dir,The default value is rocketmq-connect-sample -pluginPaths=/usr/local/connector-plugins +# Plugin path for loading Source/Sink Connectors +pluginPaths=/Users/YourUsername/rocketmqconnect/connector-plugins ``` +In standalone mode, RocketMQ Connect persistently stores the synchronization checkpoint information +in the local file directory specified by storePathRootDir. + +>storePathRootDir=/Users/YourUsername/rocketmqconnect/storeRoot +If you want to reset the synchronization checkpoint, you need to delete the persisted checkpoint information files. +```shell +rm -rf /Users/YourUsername/rocketmqconnect/storeRoot/* ``` -cd distribution/target/rocketmq-connect-0.0.1-SNAPSHOT/rocketmq-connect-0.0.1-SNAPSHOT +To start Connector Worker in standalone mode: +``` sh bin/connect-standalone.sh -c conf/connect-standalone.conf & - ``` ### Set up an SFTP server -Use the built-in SFTP server on MAC OS. +SFTP (SSH File Transfer Protocol) is a file transfer protocol used for secure file transfers between computers. +SFTP is built on top of the SSH (Secure Shell) protocol and utilizes encryption and authentication. -[Allow remote computers to access your Mac](https://support.apple.com/zh-cn/guide/mac-help/mchlp1066/mac) +We will use the built-in SFTP service in macOS (by enabling "Remote Login" access). +For detailed instructions, please refer to the +[Allow a remote computer to access your Mac](https://support.apple.com/guide/mac-help/allow-a-remote-computer-to-access-your-mac-mchlp1066/mac)document. + +### Create Source Test File +Create a test file named `source.txt` and write some test data to it: + +```shell +mkdir -p /Users/YourUsername/rocketmqconnect/sftp-test/ -### Test data +cd /Users/YourUsername/rocketmqconnect/sftp-test/ -Log in to the SFTP server and place a file called source.txt with specific contents in the user directory, for example: /path/to/. +touch source.txt + +echo 'John Doe|100000202211290001|20221129001|30000.00|2022-11-28|03:00:00|7.00 +Jane Smith|100000202211290002|20221129002|40000.00|2022-11-28|04:00:00|9.00 +Bob Johnson|100000202211290003|20221129003|50000.00|2022-11-28|05:00:00|12.00' >> source.txt +``` + +Log in to the SFTP service to verify that you can access it normally. Enter the following command, then enter your +password : +```shell +# sftp -P port YourUsername@hostname +sftp -P 22 YourUsername@127.0.0.1 +``` +**Note**: Since this is the SFTP service provided by your local MAC OS, the address is `127.0.0.1` and the port is the default 22. 
-```text -zhangsan|100000202211290001|20221129001|30000.00|2022-11-28|03:00:00|7.00 -lisi|100000202211290002|20221129002|40000.00|2022-11-28|04:00:00|9.00 -zhaowu|100000202211290003|20221129003|50000.00|2022-11-28|05:00:00|12.00 +```shell +sftp> cd /Users/YourUsername/rocketmqconnect/sftp-test/ +sftp> ls source.txt +sftp> bye ``` ## Start Connector -### Start SFTP source connector +### Start SFTP Source Connector -Synchronize the SFTP file: source.txt -Purpose: by logging into the SFTP server, parsing the file and encapsulating it into a generic ConnectRecord object, sending it to the RocketMQ Topic. +Run the following command to start the SFTP source connector. This connector will connect to the +SFTP service to read from the `source.txt` file. For each line of text in the file, the connector +will parse and package the contents into a generic ConnectRecord object, which will then be sent +to a RocketMQ topic for consumption by sink connectors. ```shell curl -X POST --location "http://localhost:8082/connectors/SftpSourceConnector" --http1.1 \ -H "Host: localhost:8082" \ -H "Content-Type: application/json" \ - -d "{ - \"connector.class\": \"org.apache.rocketmq.connect.http.sink.SftpSourceConnector\", - \"host\": \"127.0.0.1\", - \"port\": 22, - \"username\": \"wencheng\", - \"password\": \"1617\", - \"filePath\": \"/Users/wencheng/Documents/source.txt\", - \"connect.topicname\": \"sftpTopic\", - \"fieldSeparator\": \"|\", - \"fieldSchema\": \"username|idCardNo|orderNo|orderAmount|trxDate|trxTime|profit\" - }" + -d '{ + "connector.class": "org.apache.rocketmq.connect.http.sink.SftpSourceConnector", + "host": "127.0.0.1", + "port": 22, + "username": "YourUsername", + "password": "yourPassword", + "filePath": "/Users/YourUsername/rocketmqconnect/sftp-test/source.txt", + "connect.topicname": "sftpTopic", + "fieldSeparator": "|", + "fieldSchema": "username|idCardNo|orderNo|orderAmount|trxDate|trxTime|profit" + }' +``` + +If the curl request returns status: 200, it indicates that the connector was successfully +created. An example response would look like this: +```json +{"status":200,"body":{"connector.class":"... +``` + +To confirm that the file source connector has started successfully, run the following command: +```shell +tail -100f ~/logs/rocketmqconnect/connect_runtime.log ``` -After running the above commands, the file data on the SFTP server will be organized into data in the specified format, and written to MQ. Afterwards, it can be consumed by the sink connector or other business systems. +>Start connector SftpSourceConnector and set target state STARTED successed!! -### Start SFTP sink connector +### Start SFTP Sink Connector -Purpose: by consuming the data in the Topic, use the SFTP protocol to write to the target file. +Run the following command to start the SFTP sink connector. 
This connector will subscribe to the RocketMQ topic +to consume messages and convert each one into a single line of text, which will then be written to the destination +file `sink.txt` using the SFTP protocol: ```shell curl -X POST --location "http://localhost:8082/connectors/SftpSinkConnector" --http1.1 \ -H "Host: localhost:8082" \ -H "Content-Type: application/json" \ - -d "{ - \"connector.class\": \"org.apache.rocketmq.connect.http.sink.SftpSinkConnector\", - \"host\": \"127.0.0.1\", - \"port\": 22, - \"username\": \"wencheng\", - \"password\": \"1617\", - \"filePath\": \"/Users/wencheng/Documents/sink.txt\", - \"connect.topicnames\": \"sftpTopic\", - \"fieldSeparator\": \"|\", - \"fieldSchema\": \"username|idCardNo|orderNo|orderAmount|trxDate|trxTime|profit\" - }" + -d '{ + "connector.class": "org.apache.rocketmq.connect.http.sink.SftpSinkConnector", + "host": "127.0.0.1", + "port": 22, + "username": "YourUsername", + "password": "yourPassword", + "filePath": "/Users/YourUsername/rocketmqconnect/sftp-test/sink.txt", + "connect.topicnames": "sftpTopic", + "fieldSeparator": "|", + "fieldSchema": "username|idCardNo|orderNo|orderAmount|trxDate|trxTime|profit" + }' ``` + +If the curl request returns status: 200, it indicates that the connector was successfully +created. An example response would look like this: +```json +{"status":200,"body":{"connector.class":"... +``` + +Check the logs to confirm successful startup of the SFTP sink connector: +```shell +tail -100f ~/logs/rocketmqconnect/connect_runtime.log +``` + +>Start connector SftpSinkConnector and set target state STARTED successed!! + +Confirm that the data has been written to the destination file by running the following command: +```shell +cat /Users/YourUsername/rocketmqconnect/sftp-test/sink.txt +``` + +If the `sink.txt` file has been generated and its contents match those of the `source.txt` file, the entire process is working correctly. + +Write more test data to the `source.txt` file to continue testing: +```shell +cd /Users/YourUsername/rocketmqconnect/sftp-test/ + +echo 'John Doe|100000202211290001|20221129001|30000.00|2022-11-28|03:00:00|7.00 +Jane Smith|100000202211290002|20221129002|40000.00|2022-11-28|04:00:00|9.00 +Bob Johnson|100000202211290003|20221129003|50000.00|2022-11-28|05:00:00|12.00' >> source.txt + +# Wait a few seconds to give the connector time to replicate data to the sink file. +sleep 10 + +cat /Users/YourUsername/rocketmqconnect/sftp-test/sink.txt +``` + +**Note**: The order of file contents may vary because the `rocketmq-connect-sftp` uses `normal message` when +sending and receiving messages to/from a RocketMQ topic. This is different from `ordered message`, and consuming +`normal messages` does not guarantee the order. \ No newline at end of file diff --git a/i18n/en/docusaurus-plugin-content-docs/current/10-connect/08RocketMQ Connect In Action5-ES.md b/i18n/en/docusaurus-plugin-content-docs/current/10-connect/08RocketMQ Connect In Action5-ES.md index 269fc8cd02..5bec501329 100644 --- a/i18n/en/docusaurus-plugin-content-docs/current/10-connect/08RocketMQ Connect In Action5-ES.md +++ b/i18n/en/docusaurus-plugin-content-docs/current/10-connect/08RocketMQ Connect In Action5-ES.md @@ -1,6 +1,6 @@ # RocketMQ Connect in Action 5 -Elsticsearch Source - >RocketMQ Connect -> Elasticsearch Sink +Elasticsearch Source -> RocketMQ Connect -> Elasticsearch Sink ## preparatory work @@ -8,54 +8,77 @@ Elsticsearch Source - >RocketMQ Connect -> Elasticsearch Sink 1. Linux/Unix/Mac 2. 64bit JDK 1.8+; -3. 
Maven 3.2.x or later; -4. Start [RocketMQ](https://rocketmq.apache.org/docs/quick-start/); +3. Maven 3.2.x+; +4. Start RocketMQ. Either [RocketMQ 4.x](https://rocketmq.apache.org/docs/4.x/) or + [RocketMQ 5.x](https://rocketmq.apache.org/docs/quickStart/01quickstart/) 5.x version can be used; +5. Test RocketMQ message sending and receiving using the tool. +Here, use the environment variable NAMESRV_ADDR to inform the tool client of the NameServer address of RocketMQ as localhost:9876. +```shell +#$ cd distribution/target/rocketmq-4.9.7/rocketmq-4.9.7 +$ cd distribution/target/rocketmq-5.1.4/rocketmq-5.1.4 -**tips** : ${ROCKETMQ_HOME} Position Description +$ export NAMESRV_ADDR=localhost:9876 +$ sh bin/tools.sh org.apache.rocketmq.example.quickstart.Producer + SendResult [sendStatus=SEND_OK, msgId= ... ->bin-release.zip version:/rocketmq-all-4.9.4-bin-release -> ->source-release.zip versioon:/rocketmq-all-4.9.4-source-release/distribution +$ sh bin/tools.sh org.apache.rocketmq.example.quickstart.Consumer + ConsumeMessageThread_%d Receive New Messages: [MessageExt... +``` +**Note**: RocketMQ has the feature of automatically creating Topic and Group. When sending or subscribing to messages, +if the corresponding Topic or Group does not exist, RocketMQ will automatically create them. Therefore, +there is no need to create Topic and Group in advance. -### Start Connect +Here's the English translation of the content: +### Building the Connector Runtime -#### Connector plugin compilation +Clone the repository and build the RocketMQ Connect project: -Elasticsearch RocketMQ Connector -``` -$ cd rocketmq-connect/connectors/rocketmq-connect-elasticsearch/ -$ mvn clean package -Dmaven.test.skip=true -``` +```shell +git clone https://github.com/apache/rocketmq-connect.git -Move the compiled Elasticsearch RocketMQ Connector package into the Runtime load directory. The command is as follows: -``` -mkdir -p /usr/local/connector-plugins -cp rocketmq-connect-elasticsearch/target/rocketmq-connect-elasticsearch-1.0.0-jar-with-dependencies.jar /usr/local/connector-plugins +cd rocketmq-connect + +export RMQ_CONNECT_HOME=`pwd` + +mvn -Prelease-connect -Dmaven.test.skip=true clean install -U ``` +### Build Elasticsearch Connector Plugin +Build the Elasticsearch RocketMQ Connector plugin: -#### Start Connect Runtime +```shell +cd $RMQ_CONNECT_HOME/connectors/rocketmq-connect-elasticsearch/ +mvn clean package -Dmaven.test.skip=true ``` -cd rocketmq-connect -mvn -Prelease-connect -DskipTests clean install -U +Copy the compiled Elasticsearch RocketMQ Connector plugin JAR file into the plugin directory used by the runtime: -``` +```shell +mkdir -p /Users/YourUsername/rocketmqconnect/connector-plugins -Update `connect-standalone.conf` ,Key configurations are as follows: +cp target/rocketmq-connect-elasticsearch-1.0.0-jar-with-dependencies.jar /Users/YourUsername/rocketmqconnect/connector-plugins ``` -$ cd distribution/target/rocketmq-connect-0.0.1-SNAPSHOT/rocketmq-connect-0.0.1-SNAPSHOT -$ vim conf/connect-standalone.conf + +### Run Connector Worker in Standalone Mode + +Modify the `connect-standalone.conf` file to configure the RocketMQ connection +address and other information. 
+ +```shell +cd $RMQ_CONNECT_HOME/distribution/target/rocketmq-connect-0.0.1-SNAPSHOT/rocketmq-connect-0.0.1-SNAPSHOT + +vim conf/connect-standalone.conf ``` +Example configuration information is as follows: ``` workerId=standalone-worker -storePathRootDir=/tmp/storeRoot +storePathRootDir=/Users/YourUsername/rocketmqconnect/storeRoot ## Http port for user to access REST API httpPort=8082 @@ -65,53 +88,217 @@ namesrvAddr=localhost:9876 # RocketMQ acl aclEnable=false -accessKey=rocketmq -secretKey=12345678 +#accessKey=rocketmq +#secretKey=12345678 -autoCreateGroupEnable=false clusterName="DefaultCluster" -# Core configuration where the plugin directory where you compiled the elasticsearch package is located -# Source or sink connector jar file dir,The default value is rocketmq-connect-sample -pluginPaths=/usr/local/connector-plugins +# Plugin path for loading Source/Sink Connectors +pluginPaths=/Users/YourUsername/rocketmqconnect/connector-plugins ``` +In standalone mode, RocketMQ Connect persistently stores the synchronization checkpoint information +in the local file directory specified by storePathRootDir. + +>storePathRootDir=/Users/YourUsername/rocketmqconnect/storeRoot +If you want to reset the synchronization checkpoint, delete the persistence files: +```shell +rm -rf /Users/YourUsername/rocketmqconnect/storeRoot/* ``` -cd distribution/target/rocketmq-connect-0.0.1-SNAPSHOT/rocketmq-connect-0.0.1-SNAPSHOT +To start Connector Worker in standalone mode: +``` sh bin/connect-standalone.sh -c conf/connect-standalone.conf & +``` + +### Set Up Elasticsearch Services + +Elasticsearch is an open-source search and analytics engine. + +We'll use two separate Docker instances of Elasticsearch to serve as our source and destination databases: ``` +docker pull docker.elastic.co/elasticsearch/elasticsearch:7.15.1 -### Elasticsearch Image +docker run --name es1 -p 9200:9200 -p 9300:9300 -e "discovery.type=single-node" -e "ES_JAVA_OPTS=-Xms1g -Xmx1g" \ +-v /Users/YourUsername/rocketmqconnect/es/es1_data:/usr/share/elasticsearch/data \ +-d docker.elastic.co/elasticsearch/elasticsearch:7.15.1 -Use docker to build the Elasticsearch database +docker run --name es2 -p 9201:9200 -p 9301:9300 -e "discovery.type=single-node" -e "ES_JAVA_OPTS=-Xms1g -Xmx1g" \ +-v /Users/YourUsername/rocketmqconnect/es/es2_data:/usr/share/elasticsearch/data \ +-d docker.elastic.co/elasticsearch/elasticsearch:7.15.1 ``` -# starting a elasticsearch instance -docker run --name my-elasticsearch -p 9200:9200 -p 9300:9300 -e "ES_JAVA_OPTS=-Xms1g -Xmx1g" -d 74c2e0ec249c + +Explanation of Docker commands: + +- `--name es2`: Specifies a name for the container, e.g., `es2`. +- `-p 9201:9200 -p 9301:9300`: Maps ports 9200 and 9300 on the Elasticsearch container to host ports 9201 and 9301 so that the Elasticsearch service can be accessed via the host. +- `-e discovery.type=single-node`: configures Elasticsearch to work on a single node without discovering other nodes in a cluster, suitable for single-server deployment. +- `-v /Users/YourUsername/rocketmqconnect/es/es2_data:/usr/share/elasticsearch/data`: Mounts a directory on the host to `/usr/share/elasticsearch/data` within the container for persistent storage of Elasticsearch data. + +This runs a custom-configured instance of Elasticsearch with persistent data storage on a container accessible through port 9200 on the host machine, making it useful for development or testing environments on a local machine. 
+ +View the Elasticsearch logs: + ``` -### Kibana Image +docker logs -f es1 + +docker logs -f es2 +``` + +Verify that Elasticsearch has started successfully: -Use docker to build the Kibana environment ``` -docker run --name my-kibana -e ELASTICSEARCH_URL=http://192.168.0.101:9200 -p 5601:5601 -d 5dca66b41943 +# Check Elasticsearch instance 1 +curl -XGET http://localhost:9200 + +# Check Elasticsearch instance 2 +curl -XGET http://localhost:9201 ``` +A successful connection and correct operation will result in JSON responses containing information +about Elasticsearch and its version number. -### test data +### Set Up Kibana Services +Kibana is an open-source data visualization tool that allows users to interactively explore +and understand data stored within Elasticsearch clusters. It offers rich features such as charts, graphs, and dashboards. -Create test data with kibana Dev Tools: reference [console-ibana](https://www.elastic.co/guide/en/kibana/8.5/console-kibana.html#console-kibana); +For convenience, we'll set up two separate instances of Kibana in Docker and link them to +our previously established Elasticsearch containers using the following command: +``` +docker pull docker.elastic.co/kibana/kibana:7.15.1 -Source Index:connect_es +docker run --name kibana1 --link es1:elasticsearch -p 5601:5601 -d docker.elastic.co/kibana/kibana:7.15.1 -## Start Connector +docker run --name kibana2 --link es2:elasticsearch -p 5602:5601 -d docker.elastic.co/kibana/kibana:7.15.1 +``` -### Start Elasticsearch source connector +Explanation of Docker Commands: +- `--name kibana2`: Assigns a name to the new container, e.g., kibana2 +- `--link es2:elasticsearch`: Links the container to another named Elasticsearch instance (in this case, 'es2'). This enables communication between Kibana and Elasticsearch. +- `-p 5602:5601`: Maps Kibana's default port (5601) to the same port on the host machine to make it accessible through the browser. +- `-d`: runs the Docker container in detached mode. -Synchronizing source index data: connect_es -effect:Send a RocketMQ Topic by parsing Elasticsearch document data and wrapping it into a generic ConnectRecord object +Once the container has launched, you can monitor its log output: + +``` +docker logs -f kibana1 + +docker logs -f kibana2 +``` + + +To access Kibana console pages, simply visit following addresses in your browser +- kibana1: http://localhost:5601 +- kibana2:http://localhost:5602 + +If they load correctly, it indicates successful startup of the respective Kibana instances. + +### Write Test Data to the Source Elasticsearch +Kibana's Dev Tools can help you interact and operate directly with Elasticsearch in Kibana. +You can execute various queries and operations, analyze and understand the returned data. +Refer to the documentation [console-kibana](https://www.elastic.co/guide/en/kibana/8.9/console-kibana.html). + +#### Bulk Write Test Data +Access the Kibana1 console through the browser, find Dev Tools from the left menu, +and enter the following commands on the page to write test data: + +``` +POST /_bulk +{ "index" : { "_index" : "connect_es" } } +{ "id": "1", "field1": "value1", "field2": "value2" } +{ "index" : { "_index" : "connect_es" } } +{ "id": "2", "field1": "value3", "field2": "value4" } +``` + +**Note**: +- connect_es: The index name for the data. +- id/field1/field2: These are field names, and 1, value1, value2 represent the values for the fields. 
+ +**Note**: There is a limitation in `rocketmq-connect-elasticsearch`, which requires a field in the data that +can be used for >= comparison operations (string or number). This field will be used to record the +synchronization checkpoint. In the above example, the `id` field is a globally unique, incrementing numerical field. + +#### Query Data +To query data within an index, use the following command: +``` +GET /connect_es/_search +{ + "size": 100 +} +``` + +If there is no data available, the response will be: +``` +{ + "error" : { + ... + "type" : "index_not_found_exception", + "reason" : "no such index [connect_es]", + "resource.type" : "index_or_alias", + "resource.id" : "connect_es", + "index_uuid" : "_na_", + "index" : "connect_es" + }, + "status" : 404 +} +``` + +If there is data available, the response will be: +``` +{ + ... + "hits" : { + "total" : { + "value" : 2, + "relation" : "eq" + }, + "max_score" : 1.0, + "hits" : [ + { + "_index" : "connect_es", + "_type" : "_doc", + "_id" : "_dx49osBb46Z9cN4hYCg", + "_score" : 1.0, + "_source" : { + "id" : "1", + "field1" : "value1", + "field2" : "value2" + } + }, + { + "_index" : "connect_es", + "_type" : "_doc", + "_id" : "_tx49osBb46Z9cN4hYCg", + "_score" : 1.0, + "_source" : { + "id" : "2", + "field1" : "value3", + "field2" : "value4" + } + } + ] + } +} + +``` + +#### Delete Data + +If you need to delete data within an index due to repeated testing or other reasons, you can use the following command: + +``` +DELETE /connect_es +``` + +## Start Connector + +### Start Elasticsearch Source Connector +Run the following command to start the ES source connector. The connector will connect to Elasticsearch +and read document data from the connect_es index. It will parse the Elasticsearch document data and +package it into a generic ConnectRecord object, which will be sent to a RocketMQ topic for consumption by the Sink Connector. ``` curl -X POST -H "Content-Type: application/json" http://127.0.0.1:8082/connectors/elasticsearchSourceConnector -d '{ @@ -131,26 +318,60 @@ curl -X POST -H "Content-Type: application/json" http://127.0.0.1:8082/connector }' ``` -### Start Elasticsearch sink connector +**Note**: The startup command specifies that the source ES should synchronize the connect_es index, +and the incrementing field in the index is id. Data will be fetched starting from id=1. + +If the curl request returns status:200, it indicates a successful creation, and the sample response will be: +>{"status":200,"body":{"connector.class":"... + +If you see the following logs, it indicates that the file source connector has started successfully. +```shell +tail -100f ~/logs/rocketmqconnect/connect_runtime.log +``` + +>Start connector elasticsearchSourceConnector and set target state STARTED successed!! -effect:Data is written to the target index by consuming the Topic +### Start Elasticsearch Sink Connector +Run the following command to start the ES sink connector. The connector will subscribe to data from +the RocketMQ topic and consume it. It will convert each message into document data and write it to the destination ES. 
``` -curl -X POST -H "Content-Type: application/json" http://127.0.0.1:8082/connectors/ElasticsearchSinkConnector -d '{ +curl -X POST -H "Content-Type: application/json" http://127.0.0.1:8082/connectors/elasticsearchSinkConnector -d '{ "connector.class":"org.apache.rocketmq.connect.elasticsearch.connector.ElasticsearchSinkConnector", "elasticsearchHost":"localhost", - "elasticsearchPort":9202, + "elasticsearchPort":9201, "max.tasks":2, "connect.topicnames":"ConnectEsTopic", "value.converter":"org.apache.rocketmq.connect.runtime.converter.record.json.JsonConverter", "key.converter":"org.apache.rocketmq.connect.runtime.converter.record.json.JsonConverter" }' +``` + +**Note**: The startup command specifies the address and port of the destination ES, which corresponds to +the previously started ES2 in Docker. + +If the curl request returns status:200, it indicates a successful creation, and the sample response will be: +>{"status":200,"body":{"connector.class":"... + +If you see the following logs, it indicates that the file source connector has started successfully: +```shell +tail -100f ~/logs/rocketmqconnect/connect_runtime.log ``` -note:Local testing requires you to start the Elasticsearch process on two different ports +>Start connector elasticsearchSinkConnector and set target state STARTED successed!! + +To check if the sink connector has written data to the destination ES index: + +1. Access the Kibana2 console address in the browser: http://localhost:5602 +2. In the Kibana2 Dev Tools page, query the data within the index. If it matches the data in the source ES1, it means the connector is running properly. + +``` +GET /connect_es/_search +{ + "size": 100 +} +``` -After the two Connector tasks are successfully created Whether the Elasticsearch specified by accessing sink contains data -New data added to the source index can be synchronized to the target index diff --git a/i18n/en/docusaurus-plugin-content-docs/version-5.0/10-connect/03RocketMQ Connect Quick Start.md b/i18n/en/docusaurus-plugin-content-docs/version-5.0/10-connect/03RocketMQ Connect Quick Start.md index 8136302a43..518bf59721 100644 --- a/i18n/en/docusaurus-plugin-content-docs/version-5.0/10-connect/03RocketMQ Connect Quick Start.md +++ b/i18n/en/docusaurus-plugin-content-docs/version-5.0/10-connect/03RocketMQ Connect Quick Start.md @@ -2,163 +2,242 @@ # Quick Start -In standalone mode, [rocketmq-connect-sample] serves as a demo. +This tutorial will start a RocketMQ Connector example project "rocketmq-connect-sample" in standalone mode to help you understand the working principle of connectors. +The example project provides a source connector that reads data from source files and sends it to the RocketMQ cluster. +It also provides a sink connector that reads messages from the RocketMQ cluster and writes them to destination files. -The main purpose of rocketmq-connect-sample is to read data from a source file and send it to a RocketMQ cluster, and then read messages from the Topic and write them to a target file. - -## 1. Prepare +## 1. Preparation: Start RocketMQ 1. Linux/Unix/Mac 2. 64bit JDK 1.8+; 3. Maven 3.2.x+; -4. Start [RocketMQ](https://rocketmq.apache.org/docs/quick-start/); -5. Create test Topic +4. Start RocketMQ. Either [RocketMQ 4.x](https://rocketmq.apache.org/docs/4.x/) or + [RocketMQ 5.x](https://rocketmq.apache.org/docs/quickStart/01quickstart/) 5.x version can be used; +5. Test RocketMQ message sending and receiving using the tool. 
-> sh ${ROCKETMQ_HOME}/bin/mqadmin updateTopic -t fileTopic -n localhost:9876 -c DefaultCluster -r 8 -w 8 +Here, use the environment variable NAMESRV_ADDR to inform the tool client of the NameServer address of RocketMQ as localhost:9876. -**tips** : ${ROCKETMQ_HOME} locational instructions +```shell +#$ cd distribution/target/rocketmq-4.9.7/rocketmq-4.9.7 +$ cd distribution/target/rocketmq-5.1.4/rocketmq-5.1.4 ->bin-release.zip version:/rocketmq-all-4.9.4-bin-release -> ->source-release.zip version:/rocketmq-all-4.9.4-source-release/distribution +$ export NAMESRV_ADDR=localhost:9876 +$ sh bin/tools.sh org.apache.rocketmq.example.quickstart.Producer + SendResult [sendStatus=SEND_OK, msgId= ... +$ sh bin/tools.sh org.apache.rocketmq.example.quickstart.Consumer + ConsumeMessageThread_%d Receive New Messages: [MessageExt... +``` -## 2. Build Connect +**Note**: RocketMQ has the feature of automatically creating Topic and Group. When sending or subscribing to messages, +if the corresponding Topic or Group does not exist, RocketMQ will automatically create them. Therefore, +there is no need to create Topic and Group in advance. -``` +## 2. Build Connector Runtime +```shell git clone https://github.com/apache/rocketmq-connect.git cd rocketmq-connect -mvn -Prelease-connect -DskipTests clean install -U +export RMQ_CONNECT_HOME=`pwd` +mvn -Prelease-connect -Dmaven.test.skip=true clean install -U ``` -## 3. Run Worker +**Note**: The project already includes the code for rocketmq-connect-sample by default, +so there is no need to build the rocketmq-connect-sample plugin separately. +## 3. Run Connector Worker in Standalone Mode + +### Modify Configuration +Modify the `connect-standalone.conf` file to configure the RocketMQ connection +address and other information. Please refer to [9. Configuration File Instructions](#9-configuration-file-instructions) for details. ``` -cd distribution/target/rocketmq-connect-0.0.1-SNAPSHOT/rocketmq-connect-0.0.1-SNAPSHOT +cd $RMQ_CONNECT_HOME/distribution/target/rocketmq-connect-0.0.1-SNAPSHOT/rocketmq-connect-0.0.1-SNAPSHOT -sh bin/connect-standalone.sh -c conf/connect-standalone.conf & +vim conf/connect-standalone.conf +``` + +In standalone mode, RocketMQ Connect persists the synchronization checkpoint information +to the local file directory storePathRootDir. +>storePathRootDir=/Users/YourUsername/rocketmqconnect/storeRoot + +If you want to reset the synchronization checkpoint, you need to delete the persisted +checkpoint file. + +```shell +rm -rf /Users/YourUsername/rocketmqconnect/storeRoot/* ``` -**tips**: The JVM Parameters Configuration can be adjusted in /bin/runconnect.sh as needed. +### Start Connector Worker in Standalone Mode + +```shell +sh bin/connect-standalone.sh -c conf/connect-standalone.conf & +``` + +**tips**: You can modify `docker/connect/bin/runconnect.sh` to adjust JVM startup +parameters as needed. >JAVA_OPT="${JAVA_OPT} -server -Xms256m -Xmx256m" -runtime start successful: +To view the startup log file: ->The standalone worker boot success. +```shell +tail -100f ~/logs/rocketmqconnect/connect_runtime.log +``` -View the startup log files. +If the runtime starts successfully, you will see the following print in the log file: ->tail -100f ~/logs/rocketmqconnect/connect_runtime.log +>The standalone worker boot success. -`ctrl + c` exit log +To exit the log tracking mode of `tail -f` command, you can press the `Ctrl + C` key combination. -## 4. Start source connector +## 4. 
Start Source Connector -Create a test file named test-source-file.txt in the current directory. +### Create Source File and Write Test Data -``` +```shell +mkdir -p /Users/YourUsername/rocketmqconnect/ +cd /Users/YourUsername/rocketmqconnect/ touch test-source-file.txt echo "Hello \r\nRocketMQ\r\n Connect" >> test-source-file.txt +``` +**Note**: There should be no empty lines (the demo program will throw an error if it +encounters empty lines). The source connector will continuously read the source file +and convert each line of data into a message body to be sent to RocketMQ for consumption +by the sink connector. + +### Start Source Connector + +```shell +curl -X POST -H "Content-Type: application/json" http://127.0.0.1:8082/connectors/fileSourceConnector -d '{ + "connector.class": "org.apache.rocketmq.connect.file.FileSourceConnector", + "filename": "/Users/YourUsername/rocketmqconnect/test-source-file.txt", + "connect.topicname": "fileTopic" +}' +``` + +If the curl request returns status 200, it indicates successful creation. Example response: +>{"status":200,"body":{"connector.class":"org.apache.rocketmq.connect.file.FileSourceConnector","filename":"/Users/YourUsername/rocketmqconnect/test-source-file.txt","connect.topicname":"fileTopic"}} -curl -X POST -H "Content-Type: application/json" http://127.0.0.1:8082/connectors/fileSourceConnector -d '{"connector.class":"org.apache.rocketmq.connect.file.FileSourceConnector","filename":"test-source-file.txt","connect.topicname":"fileTopic"}' +View the log file: +```shell +tail -100f ~/logs/rocketmqconnect/connect_runtime.log ``` -If you see the following log message, it means the file source connector has started successfully. +If you see the following log, it means the file source connector has started successfully: +>Start connector fileSourceConnector and set target state STARTED successed!! ->tail -100f ~/logs/rocketmqconnect/connect_runtime.log -> ->2019-07-16 11:18:39 INFO pool-7-thread-1 - **Source task start**, config:{"properties":{"source-record-... -#### source connector configuration instructions +#### Source Connector Configuration Instructions | key | nullable | default | description | | ----------------- | -------- | ------- | ------------------------------------------------------------ | | connector.class | false | | The class name (including the package name) that implements the Connector interface | -| filename | false | | source file name | +| filename | false | | The name of the source file (recommended to use absolute path) | | connect.topicname | false | | Topic required for synchronizing file data | ## 5. Start sink connector +```shell +curl -X POST -H "Content-Type: application/json" http://127.0.0.1:8082/connectors/fileSinkConnector -d '{ + "connector.class": "org.apache.rocketmq.connect.file.FileSinkConnector", + "filename": "/Users/YourUsername/rocketmqconnect/test-sink-file.txt", + "connect.topicnames": "fileTopic" +}' ``` -curl -X POST -H "Content-Type: application/json" http://127.0.0.1:8082/connectors/fileSinkConnector -d '{"connector.class":"org.apache.rocketmq.connect.file.FileSinkConnector","filename":"test-sink-file.txt","connect.topicnames":"fileTopic"}' -cat test-sink-file.txt +If the curl request returns status 200, it indicates successful creation. 
Example response: +>{"status":200,"body":{"connector.class":"org.apache.rocketmq.connect.file.FileSinkConnector","filename":"/Users/YourUsername/rocketmqconnect/test-sink-file.txt","connect.topicnames":"fileTopic"}} + +View the log file: +```shell +tail -100f ~/logs/rocketmqconnect/connect_runtime.log ``` +If you see the following log, it means the file sink connector has started successfully: +> Start connector fileSinkConnector and set target state STARTED successed!! -> tail -100f ~/logs/rocketmqconnect/connect_runtime.log +Check if the sink connector has written data to the destination file: +```shell +cat /Users/YourUsername/rocketmqconnect/test-sink-file.txt +``` -If you see the following log message, it means the file sink connector has started successfully. +If the test-sink-file.txt file is generated and its content is the same as the +test-source-file.txt, it means the entire process is running correctly. -> 2019-07-16 11:24:58 INFO pool-7-thread-2 - **Sink task start**, config:{"properties":{"source-record-... +Continue writing test data to the source file test-source-file.txt: +```shell +cd /Users/YourUsername/rocketmqconnect/ -If test-sink-file.txt is generated and its content is the same as source-file.txt, it means that the entire process is running normally. +echo "Say Hi to\r\nRMQ Connector\r\nAgain" >> test-source-file.txt -The file contents may be in a different order, which is normal because the order of messages received from different queues in RocketMQ may also be inconsistent. +# Wait a few seconds, check if rocketmq-connect replicate data to sink file succeed +sleep 10 +cat /Users/YourUsername/rocketmqconnect/test-sink-file.txt +``` + +**Note**: The order of file contents may vary because the `rocketmq-connect-sample` uses `normal message` when +sending and receiving messages to/from a RocketMQ topic. This is different from `ordered message`, and consuming +`normal messages` does not guarantee the order. #### sink connector configuration instructions | key | nullable | default | description | | ------------------ | -------- | ------- | ------------------------------------------------------------ | | connector.class | false | | The class name (including the package name) that implements the Connector interface | -| filename | false | | The sink pulls data and saves it to a file. | -| connect.topicnames | false | | The topics of the data messages that the sink needs to process. | +| filename | false | | The sink pulls data and saves it to a file(recommended to use absolute path) | +| connect.topicnames | false | | The topics of the data messages that the sink needs to process | -``` -Tips:The configuration file instructions for the sample rocketmq-connect-sample are for reference only, different source/sink connectors have different configurations, please refer to the specific source/sink connector. -``` + +**Tips**:The configuration file instructions for the sample rocketmq-connect-sample are for reference only, different source/sink connectors have different configurations, please refer to the specific source/sink connector. ## 6. 
Stop connector +The RESTful command format for stopping connectors is +`http://(your worker ip):(port)/connectors/(connector name)/stop` +To stop the two connectors in the demo, you can use the following commands: ```shell -#GET request -http://(your worker ip):(port)/connectors/(connector name)/stop - -#Stopping the two connectors in the demo -curl http://127.0.0.1:8082/connectors/fileSinkConnector/stop -curl http://127.0.0.1:8082/connectors/fileSourceConnector/stop - +curl http://127.0.0.1:8082/connectors/fileSinkConnector/stop +curl http://127.0.0.1:8082/connectors/fileSourceConnector/stop ``` -Seeing the following log message indicates that the connector has been successfully stopped. +If the curl request returns a status of 200, it indicates successful stopping of the connectors. +Example response: +>{"status":200,"body":"Connector [fileSinkConnector] deleted successfully"} ->**Source task stop**, config:{"properties":{"source-record-converter":"org.apache.rocketmq.connect.runtime.converter.JsonConverter","filename":"/home/zhoubo/IdeaProjects/my-new3-rocketmq-externals/rocketmq-connect/rocketmq-connect-runtime/source-file.txt","task-class":"org.apache.rocketmq.connect.file.FileSourceTask","topic":"fileTopic","connector-class":"org.apache.rocketmq.connect.file.FileSourceConnector","update-timestamp":"1564765189322"}} - -## 7. Stopping the Worker process +If you see the following log message, it means the file sink connector has been +successfully shut down: +```shell +tail -100f ~/logs/rocketmqconnect/connect_default.log ``` +> Completed shutdown for connectorName:fileSinkConnector + +## 7. Stop the Worker process + +```shell +cd $RMQ_CONNECT_HOME/distribution/target/rocketmq-connect-0.0.1-SNAPSHOT/rocketmq-connect-0.0.1-SNAPSHOT sh bin/connectshutdown.sh ``` ## 8. Log directory ->${user.home}/logs/rocketmqconnect - -## 9. Configuration file - -The default directory for persistent configuration files is /tmp/storeRoot. +You can use the following commands to view the log directory: -| key | description | -| -------------------- | --------------------------------------------------------- | -| connectorConfig.json | Connector configuration persistence files | -| position.json | Source connect data processing progress persistence files | -| taskConfig.json | Task configuration persistence files | -| offset.json | Sink connect data consumption progress persistence files | -| connectorStatus.json | Connector status persistence files | -| taskStatus.json | Task status persistence files | +```shell +ls $HOME/logs/rocketmqconnect +ls ~/logs/rocketmqconnect +``` -## 10. Configuration Instructions +## 9. Configuration File Instructions Modify the RESTful port, storeRoot path, Nameserver address, and other information based on your usage. -The file location is in the conf/connect-standalone.conf under the work startup directory. +Here is an example of a configuration file: ```shell #current cluster node uniquely identifies @@ -168,16 +247,29 @@ workerId=DEFAULT_WORKER_1 httpPort=8082 # Local file dir for config store -storePathRootDir=/home/connect/storeRoot +storePathRootDir=/Users/YourUsername/rocketmqconnect/storeRoot #You need to modify it to your own rocketmq nameserver endpoint. # RocketMQ namesrvAddr namesrvAddr=127.0.0.1:9876 -#This is used for loading Connector plugins, similar to how JVM loads jar packages or classes at startup. This directory is used for placing Connector-related implementation plugins and supports both files and directories. 
-# Source or sink connector jar file dir -pluginPaths=rocketmq-connect-sample/target/rocketmq-connect-sample-0.0.1-SNAPSHOT.jar +# Plugin path for loading Source/Sink Connectors +# The rocketmq-connect project already includes the rocketmq-connect-sample module by default, so no configuration is needed here. +pluginPaths= +``` + +Explanation of storePathRootDir configuration: + +In standalone mode, RocketMQ Connect persists the synchronization checkpoint information +to the local file directory specified by storePathRootDir. The persistent files include: + + +| key | description | +| -------------------- | --------------------------------------------------------- | +| connectorConfig.json | Connector configuration persistence files | +| position.json | Source connect data processing progress persistence files | +| taskConfig.json | Task configuration persistence files | +| offset.json | Sink connect data consumption progress persistence files | +| connectorStatus.json | Connector status persistence files | +| taskStatus.json | Task status persistence files | -# Addition : put the Connector-related implementation plugins in the specified folder. -# pluginPaths=/usr/local/connector-plugins/* -``` \ No newline at end of file diff --git a/i18n/en/docusaurus-plugin-content-docs/version-5.0/10-connect/07RocketMQ Connect In Action4.md b/i18n/en/docusaurus-plugin-content-docs/version-5.0/10-connect/07RocketMQ Connect In Action4.md index 3f0dd01471..f7e29ecd3e 100644 --- a/i18n/en/docusaurus-plugin-content-docs/version-5.0/10-connect/07RocketMQ Connect In Action4.md +++ b/i18n/en/docusaurus-plugin-content-docs/version-5.0/10-connect/07RocketMQ Connect In Action4.md @@ -1,6 +1,6 @@ # RocketMQ Connect in Action 4 -SFTP Server(file data) -> RocketMQ Connect +SFTP Server (File Data) -> RocketMQ Connect -> SFTP Server (File) ## Preparation @@ -9,55 +9,70 @@ SFTP Server(file data) -> RocketMQ Connect 1. Linux/Unix/Mac 2. 64bit JDK 1.8+; 3. Maven 3.2.x+; -4. Start [RocketMQ](https://rocketmq.apache.org/docs/quick-start/); +4. Start RocketMQ. Either [RocketMQ 4.x](https://rocketmq.apache.org/docs/4.x/) or + [RocketMQ 5.x](https://rocketmq.apache.org/docs/quickStart/01quickstart/) 5.x version can be used; +5. Test RocketMQ message sending and receiving using the tool. +Here, use the environment variable NAMESRV_ADDR to inform the tool client of the NameServer address of RocketMQ as localhost:9876. +```shell +#$ cd distribution/target/rocketmq-4.9.7/rocketmq-4.9.7 +$ cd distribution/target/rocketmq-5.1.4/rocketmq-5.1.4 -**Tips** : ${ROCKETMQ_HOME} locational instructions +$ export NAMESRV_ADDR=localhost:9876 +$ sh bin/tools.sh org.apache.rocketmq.example.quickstart.Producer + SendResult [sendStatus=SEND_OK, msgId= ... ->bin-release.zip version:/rocketmq-all-4.9.4-bin-release -> ->source-release.zip version:/rocketmq-all-4.9.4-source-release/distribution +$ sh bin/tools.sh org.apache.rocketmq.example.quickstart.Consumer + ConsumeMessageThread_%d Receive New Messages: [MessageExt... +``` +**Note**: RocketMQ has the feature of automatically creating Topic and Group. When sending or subscribing to messages, +if the corresponding Topic or Group does not exist, RocketMQ will automatically create them. Therefore, +there is no need to create Topic and Group in advance. 
-### Start Connect +### Build Connector Runtime +```shell +git clone https://github.com/apache/rocketmq-connect.git -#### **Compiling Connector Plugin** +cd rocketmq-connect -RocketMQ Connector SFTP +export RMQ_CONNECT_HOME=`pwd` -``` -$ cd rocketmq-connect/connectors/rocketmq-connect-sftp/ -$ mvn clean package -Dmaven.test.skip=true +mvn -Prelease-connect -Dmaven.test.skip=true clean install -U ``` -Move the compiled RocketMQ Connector SFTP package into the Runtime loading directory. The command is as follows: +### Build SFTP Connector Plugin ``` -mkdir -p /usr/local/connector-plugins -cp target/rocketmq-connect-sftp-0.0.1-SNAPSHOT-jar-with-dependencies.jar /usr/local/connector-plugins -``` - -#### Start Connect Runtime +cd $RMQ_CONNECT_HOME/connectors/rocketmq-connect-sftp/ +mvn clean package -Dmaven.test.skip=true ``` -cd rocketmq-connect -mvn -Prelease-connect -DskipTests clean install -U +Put the compiled jar of the SFTP RocketMQ Connector into the Plugin directory for runtime loading. +``` +mkdir -p /Users/YourUsername/rocketmqconnect/connector-plugins +cp target/rocketmq-connect-sftp-0.0.1-SNAPSHOT-jar-with-dependencies.jar /Users/YourUsername/rocketmqconnect/connector-plugins ``` -Modify the configuration `connect-standalone.conf`, the main configuration is as follows +### Run Connector Worker in Standalone Mode + +Modify the `connect-standalone.conf` file to configure the RocketMQ connection +address and other information. ``` -$ cd distribution/target/rocketmq-connect-0.0.1-SNAPSHOT/rocketmq-connect-0.0.1-SNAPSHOT -$ vim conf/connect-standalone.conf +cd $RMQ_CONNECT_HOME/distribution/target/rocketmq-connect-0.0.1-SNAPSHOT/rocketmq-connect-0.0.1-SNAPSHOT + +vim conf/connect-standalone.conf ``` +Example configuration information is as follows: ``` workerId=standalone-worker -storePathRootDir=/tmp/storeRoot +storePathRootDir=/Users/YourUsername/rocketmqconnect/storeRoot ## Http port for user to access REST API httpPort=8082 @@ -67,84 +82,164 @@ namesrvAddr=localhost:9876 # RocketMQ acl aclEnable=false -accessKey=rocketmq -secretKey=12345678 +#accessKey=rocketmq +#secretKey=12345678 -autoCreateGroupEnable=false clusterName="DefaultCluster" -# Core configuration, configure the plugin directory that was previously compiled here. -# Source or sink connector jar file dir,The default value is rocketmq-connect-sample -pluginPaths=/usr/local/connector-plugins +# Plugin path for loading Source/Sink Connectors +pluginPaths=/Users/YourUsername/rocketmqconnect/connector-plugins ``` +In standalone mode, RocketMQ Connect persistently stores the synchronization checkpoint information +in the local file directory specified by storePathRootDir. + +>storePathRootDir=/Users/YourUsername/rocketmqconnect/storeRoot +If you want to reset the synchronization checkpoint, you need to delete the persisted checkpoint information files. +```shell +rm -rf /Users/YourUsername/rocketmqconnect/storeRoot/* ``` -cd distribution/target/rocketmq-connect-0.0.1-SNAPSHOT/rocketmq-connect-0.0.1-SNAPSHOT +To start Connector Worker in standalone mode: +``` sh bin/connect-standalone.sh -c conf/connect-standalone.conf & - ``` ### Set up an SFTP server -Use the built-in SFTP server on MAC OS. +SFTP (SSH File Transfer Protocol) is a file transfer protocol used for secure file transfers between computers. +SFTP is built on top of the SSH (Secure Shell) protocol and utilizes encryption and authentication. 
-[Allow remote computers to access your Mac](https://support.apple.com/zh-cn/guide/mac-help/mchlp1066/mac) +We will use the built-in SFTP service in macOS (by enabling "Remote Login" access). +For detailed instructions, please refer to the +[Allow a remote computer to access your Mac](https://support.apple.com/guide/mac-help/allow-a-remote-computer-to-access-your-mac-mchlp1066/mac)document. + +### Create Source Test File +Create a test file named `source.txt` and write some test data to it: + +```shell +mkdir -p /Users/YourUsername/rocketmqconnect/sftp-test/ -### Test data +cd /Users/YourUsername/rocketmqconnect/sftp-test/ -Log in to the SFTP server and place a file called source.txt with specific contents in the user directory, for example: /path/to/. +touch source.txt + +echo 'John Doe|100000202211290001|20221129001|30000.00|2022-11-28|03:00:00|7.00 +Jane Smith|100000202211290002|20221129002|40000.00|2022-11-28|04:00:00|9.00 +Bob Johnson|100000202211290003|20221129003|50000.00|2022-11-28|05:00:00|12.00' >> source.txt +``` + +Log in to the SFTP service to verify that you can access it normally. Enter the following command, then enter your +password : +```shell +# sftp -P port YourUsername@hostname +sftp -P 22 YourUsername@127.0.0.1 +``` +**Note**: Since this is the SFTP service provided by your local MAC OS, the address is `127.0.0.1` and the port is the default 22. -```text -zhangsan|100000202211290001|20221129001|30000.00|2022-11-28|03:00:00|7.00 -lisi|100000202211290002|20221129002|40000.00|2022-11-28|04:00:00|9.00 -zhaowu|100000202211290003|20221129003|50000.00|2022-11-28|05:00:00|12.00 +```shell +sftp> cd /Users/YourUsername/rocketmqconnect/sftp-test/ +sftp> ls source.txt +sftp> bye ``` ## Start Connector -### Start SFTP source connector +### Start SFTP Source Connector -Synchronize the SFTP file: source.txt -Purpose: by logging into the SFTP server, parsing the file and encapsulating it into a generic ConnectRecord object, sending it to the RocketMQ Topic. +Run the following command to start the SFTP source connector. This connector will connect to the +SFTP service to read from the `source.txt` file. For each line of text in the file, the connector +will parse and package the contents into a generic ConnectRecord object, which will then be sent +to a RocketMQ topic for consumption by sink connectors. ```shell curl -X POST --location "http://localhost:8082/connectors/SftpSourceConnector" --http1.1 \ -H "Host: localhost:8082" \ -H "Content-Type: application/json" \ - -d "{ - \"connector.class\": \"org.apache.rocketmq.connect.http.sink.SftpSourceConnector\", - \"host\": \"127.0.0.1\", - \"port\": 22, - \"username\": \"wencheng\", - \"password\": \"1617\", - \"filePath\": \"/Users/wencheng/Documents/source.txt\", - \"connect.topicname\": \"sftpTopic\", - \"fieldSeparator\": \"|\", - \"fieldSchema\": \"username|idCardNo|orderNo|orderAmount|trxDate|trxTime|profit\" - }" + -d '{ + "connector.class": "org.apache.rocketmq.connect.http.sink.SftpSourceConnector", + "host": "127.0.0.1", + "port": 22, + "username": "YourUsername", + "password": "yourPassword", + "filePath": "/Users/YourUsername/rocketmqconnect/sftp-test/source.txt", + "connect.topicname": "sftpTopic", + "fieldSeparator": "|", + "fieldSchema": "username|idCardNo|orderNo|orderAmount|trxDate|trxTime|profit" + }' +``` + +If the curl request returns status: 200, it indicates that the connector was successfully +created. An example response would look like this: +```json +{"status":200,"body":{"connector.class":"... 
+``` + +To confirm that the file source connector has started successfully, run the following command: +```shell +tail -100f ~/logs/rocketmqconnect/connect_runtime.log ``` -After running the above commands, the file data on the SFTP server will be organized into data in the specified format, and written to MQ. Afterwards, it can be consumed by the sink connector or other business systems. +>Start connector SftpSourceConnector and set target state STARTED successed!! -### Start SFTP sink connector +### Start SFTP Sink Connector -Purpose: by consuming the data in the Topic, use the SFTP protocol to write to the target file. +Run the following command to start the SFTP sink connector. This connector will subscribe to the RocketMQ topic +to consume messages and convert each one into a single line of text, which will then be written to the destination +file `sink.txt` using the SFTP protocol: ```shell curl -X POST --location "http://localhost:8082/connectors/SftpSinkConnector" --http1.1 \ -H "Host: localhost:8082" \ -H "Content-Type: application/json" \ - -d "{ - \"connector.class\": \"org.apache.rocketmq.connect.http.sink.SftpSinkConnector\", - \"host\": \"127.0.0.1\", - \"port\": 22, - \"username\": \"wencheng\", - \"password\": \"1617\", - \"filePath\": \"/Users/wencheng/Documents/sink.txt\", - \"connect.topicnames\": \"sftpTopic\", - \"fieldSeparator\": \"|\", - \"fieldSchema\": \"username|idCardNo|orderNo|orderAmount|trxDate|trxTime|profit\" - }" + -d '{ + "connector.class": "org.apache.rocketmq.connect.http.sink.SftpSinkConnector", + "host": "127.0.0.1", + "port": 22, + "username": "YourUsername", + "password": "yourPassword", + "filePath": "/Users/YourUsername/rocketmqconnect/sftp-test/sink.txt", + "connect.topicnames": "sftpTopic", + "fieldSeparator": "|", + "fieldSchema": "username|idCardNo|orderNo|orderAmount|trxDate|trxTime|profit" + }' ``` + +If the curl request returns status: 200, it indicates that the connector was successfully +created. An example response would look like this: +```json +{"status":200,"body":{"connector.class":"... +``` + +Check the logs to confirm successful startup of the SFTP sink connector: +```shell +tail -100f ~/logs/rocketmqconnect/connect_runtime.log +``` + +>Start connector SftpSinkConnector and set target state STARTED successed!! + +Confirm that the data has been written to the destination file by running the following command: +```shell +cat /Users/YourUsername/rocketmqconnect/sftp-test/sink.txt +``` + +If the `sink.txt` file has been generated and its contents match those of the `source.txt` file, the entire process is working correctly. + +Write more test data to the `source.txt` file to continue testing: +```shell +cd /Users/YourUsername/rocketmqconnect/sftp-test/ + +echo 'John Doe|100000202211290001|20221129001|30000.00|2022-11-28|03:00:00|7.00 +Jane Smith|100000202211290002|20221129002|40000.00|2022-11-28|04:00:00|9.00 +Bob Johnson|100000202211290003|20221129003|50000.00|2022-11-28|05:00:00|12.00' >> source.txt + +# Wait a few seconds to give the connector time to replicate data to the sink file. +sleep 10 + +cat /Users/YourUsername/rocketmqconnect/sftp-test/sink.txt +``` + +**Note**: The order of file contents may vary because the `rocketmq-connect-sftp` uses `normal message` when +sending and receiving messages to/from a RocketMQ topic. This is different from `ordered message`, and consuming +`normal messages` does not guarantee the order. 
\ No newline at end of file diff --git a/i18n/en/docusaurus-plugin-content-docs/version-5.0/10-connect/08RocketMQ Connect In Action5-ES.md b/i18n/en/docusaurus-plugin-content-docs/version-5.0/10-connect/08RocketMQ Connect In Action5-ES.md index 269fc8cd02..d91129dee4 100644 --- a/i18n/en/docusaurus-plugin-content-docs/version-5.0/10-connect/08RocketMQ Connect In Action5-ES.md +++ b/i18n/en/docusaurus-plugin-content-docs/version-5.0/10-connect/08RocketMQ Connect In Action5-ES.md @@ -1,6 +1,6 @@ # RocketMQ Connect in Action 5 -Elsticsearch Source - >RocketMQ Connect -> Elasticsearch Sink +Elasticsearch Source -> RocketMQ Connect -> Elasticsearch Sink ## preparatory work @@ -8,54 +8,77 @@ Elsticsearch Source - >RocketMQ Connect -> Elasticsearch Sink 1. Linux/Unix/Mac 2. 64bit JDK 1.8+; -3. Maven 3.2.x or later; -4. Start [RocketMQ](https://rocketmq.apache.org/docs/quick-start/); +3. Maven 3.2.x+; +4. Start RocketMQ. Either [RocketMQ 4.x](https://rocketmq.apache.org/docs/4.x/) or + [RocketMQ 5.x](https://rocketmq.apache.org/docs/quickStart/01quickstart/) 5.x version can be used; +5. Test RocketMQ message sending and receiving using the tool. +Here, use the environment variable NAMESRV_ADDR to inform the tool client of the NameServer address of RocketMQ as localhost:9876. +```shell +#$ cd distribution/target/rocketmq-4.9.7/rocketmq-4.9.7 +$ cd distribution/target/rocketmq-5.1.4/rocketmq-5.1.4 -**tips** : ${ROCKETMQ_HOME} Position Description +$ export NAMESRV_ADDR=localhost:9876 +$ sh bin/tools.sh org.apache.rocketmq.example.quickstart.Producer + SendResult [sendStatus=SEND_OK, msgId= ... ->bin-release.zip version:/rocketmq-all-4.9.4-bin-release -> ->source-release.zip versioon:/rocketmq-all-4.9.4-source-release/distribution +$ sh bin/tools.sh org.apache.rocketmq.example.quickstart.Consumer + ConsumeMessageThread_%d Receive New Messages: [MessageExt... +``` +**Note**: RocketMQ has the feature of automatically creating Topic and Group. When sending or subscribing to messages, +if the corresponding Topic or Group does not exist, RocketMQ will automatically create them. Therefore, +there is no need to create Topic and Group in advance. -### Start Connect +Here's the English translation of the content: +### Building the Connector Runtime -#### Connector plugin compilation +Clone the repository and build the RocketMQ Connect project: -Elasticsearch RocketMQ Connector -``` -$ cd rocketmq-connect/connectors/rocketmq-connect-elasticsearch/ -$ mvn clean package -Dmaven.test.skip=true -``` +```shell +git clone https://github.com/apache/rocketmq-connect.git -Move the compiled Elasticsearch RocketMQ Connector package into the Runtime load directory. 
The command is as follows: -``` -mkdir -p /usr/local/connector-plugins -cp rocketmq-connect-elasticsearch/target/rocketmq-connect-elasticsearch-1.0.0-jar-with-dependencies.jar /usr/local/connector-plugins +cd rocketmq-connect + +export RMQ_CONNECT_HOME=`pwd` + +mvn -Prelease-connect -Dmaven.test.skip=true clean install -U ``` +### Build Elasticsearch Connector Plugin +Build the Elasticsearch RocketMQ Connector plugin: -#### Start Connect Runtime +```shell +cd $RMQ_CONNECT_HOME/connectors/rocketmq-connect-elasticsearch/ +mvn clean package -Dmaven.test.skip=true ``` -cd rocketmq-connect -mvn -Prelease-connect -DskipTests clean install -U +Copy the compiled Elasticsearch RocketMQ Connector plugin JAR file into the plugin directory used by the runtime: -``` +```shell +mkdir -p /Users/YourUsername/rocketmqconnect/connector-plugins -Update `connect-standalone.conf` ,Key configurations are as follows: +cp target/rocketmq-connect-elasticsearch-1.0.0-jar-with-dependencies.jar /Users/YourUsername/rocketmqconnect/connector-plugins ``` -$ cd distribution/target/rocketmq-connect-0.0.1-SNAPSHOT/rocketmq-connect-0.0.1-SNAPSHOT -$ vim conf/connect-standalone.conf + +### Run Connector Worker in Standalone Mode + +Modify the `connect-standalone.conf` file to configure the RocketMQ connection +address and other information. + +```shell +cd $RMQ_CONNECT_HOME/distribution/target/rocketmq-connect-0.0.1-SNAPSHOT/rocketmq-connect-0.0.1-SNAPSHOT + +vim conf/connect-standalone.conf ``` +Example configuration information is as follows: ``` workerId=standalone-worker -storePathRootDir=/tmp/storeRoot +storePathRootDir=/Users/YourUsername/rocketmqconnect/storeRoot ## Http port for user to access REST API httpPort=8082 @@ -65,53 +88,217 @@ namesrvAddr=localhost:9876 # RocketMQ acl aclEnable=false -accessKey=rocketmq -secretKey=12345678 +#accessKey=rocketmq +#secretKey=12345678 -autoCreateGroupEnable=false clusterName="DefaultCluster" -# Core configuration where the plugin directory where you compiled the elasticsearch package is located -# Source or sink connector jar file dir,The default value is rocketmq-connect-sample -pluginPaths=/usr/local/connector-plugins +# Plugin path for loading Source/Sink Connectors +pluginPaths=/Users/YourUsername/rocketmqconnect/connector-plugins ``` +In standalone mode, RocketMQ Connect persistently stores the synchronization checkpoint information +in the local file directory specified by storePathRootDir. + +>storePathRootDir=/Users/YourUsername/rocketmqconnect/storeRoot +If you want to reset the synchronization checkpoint, delete the persistence files: +```shell +rm -rf /Users/YourUsername/rocketmqconnect/storeRoot/* ``` -cd distribution/target/rocketmq-connect-0.0.1-SNAPSHOT/rocketmq-connect-0.0.1-SNAPSHOT +To start Connector Worker in standalone mode: +``` sh bin/connect-standalone.sh -c conf/connect-standalone.conf & +``` + +### Set Up Elasticsearch Services + +Elasticsearch is an open-source search and analytics engine. 
+ +We'll use two separate Docker instances of Elasticsearch to serve as our source and destination databases: ``` +docker pull docker.elastic.co/elasticsearch/elasticsearch:7.15.1 -### Elasticsearch Image +docker run --name es1 -p 9200:9200 -p 9300:9300 -e "discovery.type=single-node" -e "ES_JAVA_OPTS=-Xms1g -Xmx1g" \ +-v /Users/YourUsername/rocketmqconnect/es/es1_data:/usr/share/elasticsearch/data \ +-d docker.elastic.co/elasticsearch/elasticsearch:7.15.1 -Use docker to build the Elasticsearch database +docker run --name es2 -p 9201:9200 -p 9301:9300 -e "discovery.type=single-node" -e "ES_JAVA_OPTS=-Xms1g -Xmx1g" \ +-v /Users/YourUsername/rocketmqconnect/es/es2_data:/usr/share/elasticsearch/data \ +-d docker.elastic.co/elasticsearch/elasticsearch:7.15.1 ``` -# starting a elasticsearch instance -docker run --name my-elasticsearch -p 9200:9200 -p 9300:9300 -e "ES_JAVA_OPTS=-Xms1g -Xmx1g" -d 74c2e0ec249c + +Explanation of Docker commands: + +- `--name es2`: Specifies a name for the container, e.g., `es2`. +- `-p 9201:9200 -p 9301:9300`: Maps ports 9200 and 9300 on the Elasticsearch container to host ports 9201 and 9301 so that the Elasticsearch service can be accessed via the host. +- `-e discovery.type=single-node`: configures Elasticsearch to work on a single node without discovering other nodes in a cluster, suitable for single-server deployment. +- `-v /Users/YourUsername/rocketmqconnect/es/es2_data:/usr/share/elasticsearch/data`: Mounts a directory on the host to `/usr/share/elasticsearch/data` within the container for persistent storage of Elasticsearch data. + +This runs a custom-configured instance of Elasticsearch with persistent data storage on a container accessible through port 9200 on the host machine, making it useful for development or testing environments on a local machine. + +View the Elasticsearch logs: + ``` -### Kibana Image +docker logs -f es1 + +docker logs -f es2 +``` + +Verify that Elasticsearch has started successfully: -Use docker to build the Kibana environment ``` -docker run --name my-kibana -e ELASTICSEARCH_URL=http://192.168.0.101:9200 -p 5601:5601 -d 5dca66b41943 +# Check Elasticsearch instance 1 +curl -XGET http://localhost:9200 + +# Check Elasticsearch instance 2 +curl -XGET http://localhost:9201 ``` +A successful connection and correct operation will result in JSON responses containing information +about Elasticsearch and its version number. -### test data +### Set Up Kibana Services +Kibana is an open-source data visualization tool that allows users to interactively explore +and understand data stored within Elasticsearch clusters. It offers rich features such as charts, graphs, and dashboards. 
-Create test data with kibana Dev Tools: reference [console-ibana](https://www.elastic.co/guide/en/kibana/8.5/console-kibana.html#console-kibana); +For convenience, we'll set up two separate instances of Kibana in Docker and link them to +our previously established Elasticsearch containers using the following command: +``` +docker pull docker.elastic.co/kibana/kibana:7.15.1 -Source Index:connect_es +docker run --name kibana1 --link es1:elasticsearch -p 5601:5601 -d docker.elastic.co/kibana/kibana:7.15.1 -## Start Connector +docker run --name kibana2 --link es2:elasticsearch -p 5602:5601 -d docker.elastic.co/kibana/kibana:7.15.1 +``` -### Start Elasticsearch source connector +Explanation of Docker Commands: +- `--name kibana2`: Assigns a name to the new container, e.g., kibana2 +- `--link es2:elasticsearch`: Links the container to another named Elasticsearch instance (in this case, 'es2'). This enables communication between Kibana and Elasticsearch. +- `-p 5602:5601`: Maps Kibana's default port (5601) to the same port on the host machine to make it accessible through the browser. +- `-d`: runs the Docker container in detached mode. -Synchronizing source index data: connect_es -effect:Send a RocketMQ Topic by parsing Elasticsearch document data and wrapping it into a generic ConnectRecord object +Once the container has launched, you can monitor its log output: + +``` +docker logs -f kibana1 + +docker logs -f kibana2 +``` + + +To access Kibana console pages, simply visit following addresses in your browser +- kibana1: http://localhost:5601 +- kibana2:http://localhost:5602 + +If they load correctly, it indicates successful startup of the respective Kibana instances. + +### Write Test Data to the Source Elasticsearch +Kibana's Dev Tools can help you interact and operate directly with Elasticsearch in Kibana. +You can execute various queries and operations, analyze and understand the returned data. +Refer to the documentation [console-kibana](https://www.elastic.co/guide/en/kibana/8.9/console-kibana.html). + +#### Bulk Write Test Data +Access the Kibana1 console through the browser, find Dev Tools from the left menu, +and enter the following commands on the page to write test data: + +``` +POST /_bulk +{ "index" : { "_index" : "connect_es" } } +{ "id": "1", "field1": "value1", "field2": "value2" } +{ "index" : { "_index" : "connect_es" } } +{ "id": "2", "field1": "value3", "field2": "value4" } +``` + +**Note**: +- connect_es: The index name for the data. +- id/field1/field2: These are field names, and 1, value1, value2 represent the values for the fields. + +**Note**: There is a limitation in `rocketmq-connect-elasticsearch`, which requires a field in the data that +can be used for >= comparison operations (string or number). This field will be used to record the +synchronization checkpoint. In the above example, the `id` field is a globally unique, incrementing numerical field. + +#### Query Data +To query data within an index, use the following command: +``` +GET /connect_es/_search +{ + "size": 100 +} +``` + +If there is no data available, the response will be: +``` +{ + "error" : { + ... + "type" : "index_not_found_exception", + "reason" : "no such index [connect_es]", + "resource.type" : "index_or_alias", + "resource.id" : "connect_es", + "index_uuid" : "_na_", + "index" : "connect_es" + }, + "status" : 404 +} +``` + +If there is data available, the response will be: +``` +{ + ... 
+ "hits" : { + "total" : { + "value" : 2, + "relation" : "eq" + }, + "max_score" : 1.0, + "hits" : [ + { + "_index" : "connect_es", + "_type" : "_doc", + "_id" : "_dx49osBb46Z9cN4hYCg", + "_score" : 1.0, + "_source" : { + "id" : "1", + "field1" : "value1", + "field2" : "value2" + } + }, + { + "_index" : "connect_es", + "_type" : "_doc", + "_id" : "_tx49osBb46Z9cN4hYCg", + "_score" : 1.0, + "_source" : { + "id" : "2", + "field1" : "value3", + "field2" : "value4" + } + } + ] + } +} + +``` + +#### Delete Data + +If you need to delete data within an index due to repeated testing or other reasons, you can use the following command: + +``` +DELETE /connect_es +``` + +## Start Connector + +### Start Elasticsearch Source Connector +Run the following command to start the ES source connector. The connector will connect to Elasticsearch +and read document data from the connect_es index. It will parse the Elasticsearch document data and +package it into a generic ConnectRecord object, which will be sent to a RocketMQ topic for consumption by the Sink Connector. ``` curl -X POST -H "Content-Type: application/json" http://127.0.0.1:8082/connectors/elasticsearchSourceConnector -d '{ @@ -131,26 +318,60 @@ curl -X POST -H "Content-Type: application/json" http://127.0.0.1:8082/connector }' ``` -### Start Elasticsearch sink connector +**Note**: The startup command specifies that the source ES should synchronize the connect_es index, +and the incrementing field in the index is id. Data will be fetched starting from id=1. + +If the curl request returns status:200, it indicates a successful creation, and the sample response will be: +>{"status":200,"body":{"connector.class":"... + +If you see the following logs, it indicates that the file source connector has started successfully. +```shell +tail -100f ~/logs/rocketmqconnect/connect_runtime.log +``` + +>Start connector elasticsearchSourceConnector and set target state STARTED successed!! -effect:Data is written to the target index by consuming the Topic +### Start Elasticsearch Sink Connector +Run the following command to start the ES sink connector. The connector will subscribe to data from +the RocketMQ topic and consume it. It will convert each message into document data and write it to the destination ES. ``` -curl -X POST -H "Content-Type: application/json" http://127.0.0.1:8082/connectors/ElasticsearchSinkConnector -d '{ +curl -X POST -H "Content-Type: application/json" http://127.0.0.1:8082/connectors/elasticsearchSinkConnector -d '{ "connector.class":"org.apache.rocketmq.connect.elasticsearch.connector.ElasticsearchSinkConnector", "elasticsearchHost":"localhost", - "elasticsearchPort":9202, + "elasticsearchPort":9201, "max.tasks":2, "connect.topicnames":"ConnectEsTopic", "value.converter":"org.apache.rocketmq.connect.runtime.converter.record.json.JsonConverter", "key.converter":"org.apache.rocketmq.connect.runtime.converter.record.json.JsonConverter" }' +``` + +**Note**: The startup command specifies the address and port of the destination ES, which corresponds to +the previously started ES2 in Docker. + +If the curl request returns status:200, it indicates a successful creation, and the sample response will be: +>{"status":200,"body":{"connector.class":"... 
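+
+Once the sink connector is up (see the log check below), you can optionally query the destination Elasticsearch (the es2 container mapped to host port 9201) directly with curl instead of going through Kibana. A quick sanity query along these lines should eventually return the same documents as the source index:
+```
+curl -XGET "http://localhost:9201/connect_es/_search?size=100&pretty"
+```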
+ +If you see the following logs, it indicates that the file source connector has started successfully: +```shell +tail -100f ~/logs/rocketmqconnect/connect_runtime.log ``` -note:Local testing requires you to start the Elasticsearch process on two different ports +>Start connector elasticsearchSinkConnector and set target state STARTED successed!! + +To check if the sink connector has written data to the destination ES index: + +1. Access the Kibana2 console address in the browser: http://localhost:5602 +2. In the Kibana2 Dev Tools page, query the data within the index. If it matches the data in the source ES1, it means the connector is running properly. + +``` +GET /connect_es/_search +{ + "size": 100 +} +``` -After the two Connector tasks are successfully created Whether the Elasticsearch specified by accessing sink contains data -New data added to the source index can be synchronized to the target index diff --git a/versioned_docs/version-5.0/10-connect/03RocketMQ Connect Quick Start.md b/versioned_docs/version-5.0/10-connect/03RocketMQ Connect Quick Start.md index 31c9589538..60d44d4181 100644 --- a/versioned_docs/version-5.0/10-connect/03RocketMQ Connect Quick Start.md +++ b/versioned_docs/version-5.0/10-connect/03RocketMQ Connect Quick Start.md @@ -4,159 +4,217 @@ # 快速开始 -单机模式下[rocketmq-connect-sample]作为 demo +本教程将采用单机模式启动一个RocketMQ Connector示例工程rocketmq-connect-sample,来帮助你了解连接器的工作原理。 +示例工程中提供了源端连接器,作用是从源文件中读取数据然后发送到RocketMQ集群。 +同时提供了目的端连接器,作用是从RocketMQ集群中读取消息然后写入目的端文件。 -rocketmq-connect-sample的主要作用是从源文件中读取数据发送到RocketMQ集群 然后从Topic中读取消息,写入到目标文件 - -## 1.准备 +## 1.准备:启动RocketMQ 1. Linux/Unix/Mac 2. 64bit JDK 1.8+; 3. Maven 3.2.x或以上版本; -4. 启动 [RocketMQ](https://rocketmq.apache.org/docs/quick-start/); -5. 创建测试Topic -> sh ${ROCKETMQ_HOME}/bin/mqadmin updateTopic -t fileTopic -n localhost:9876 -c DefaultCluster -r 8 -w 8 +4. 启动 RocketMQ。使用[RocketMQ 4.x](https://rocketmq.apache.org/docs/4.x/) 或 + [RocketMQ 5.x](https://rocketmq.apache.org/docs/quickStart/01quickstart/)版本均可; +5. 工具测试 RocketMQ 消息收发是否正常。详见[RocketMQ 4.x](https://rocketmq.apache.org/docs/4.x/) 或 + [RocketMQ 5.x](https://rocketmq.apache.org/docs/quickStart/01quickstart/)文档。 +这里利用环境变量NAMESRV_ADDR来告诉工具客户端RocketMQ的NameServer地址为localhost:9876 -**tips** : ${ROCKETMQ_HOME} 位置说明 +```shell +#$ cd distribution/target/rocketmq-4.9.7/rocketmq-4.9.7 +$ cd distribution/target/rocketmq-5.1.4/rocketmq-5.1.4 ->bin-release.zip 版本:/rocketmq-all-4.9.4-bin-release -> ->source-release.zip 版本:/rocketmq-all-4.9.4-source-release/distribution +$ export NAMESRV_ADDR=localhost:9876 +$ sh bin/tools.sh org.apache.rocketmq.example.quickstart.Producer + SendResult [sendStatus=SEND_OK, msgId= ... +$ sh bin/tools.sh org.apache.rocketmq.example.quickstart.Consumer + ConsumeMessageThread_%d Receive New Messages: [MessageExt... 
+``` -## 2.构建Connect +**说明**:RocketMQ具备自动创建Topic和Group的功能,在发送消息或订阅消息时,如果相应的Topic或Group不存在,RocketMQ会自动创建它们。因此不需要提前创建Topic和Group。 -``` +## 2.构建Connector Runtime + +```shell git clone https://github.com/apache/rocketmq-connect.git cd rocketmq-connect -mvn -Prelease-connect -DskipTests clean install -U +export RMQ_CONNECT_HOME=`pwd` +mvn -Prelease-connect -Dmaven.test.skip=true clean install -U ``` -## 3.运行Worker +**注意**:本工程已默认包含 rocketmq-connect-sample 的代码,因此无需单独构建 rocketmq-connect-sample 插件。 + +## 3.单机模式运行 Connector Worker + +### 修改配置 +`connect-standalone.conf`中配置了RocketMQ连接地址等信息,需要根据使用情况进行修改,具体参见[9.配置文件说明](#9配置文件说明)。 ``` -cd distribution/target/rocketmq-connect-0.0.1-SNAPSHOT/rocketmq-connect-0.0.1-SNAPSHOT +cd $RMQ_CONNECT_HOME/distribution/target/rocketmq-connect-0.0.1-SNAPSHOT/rocketmq-connect-0.0.1-SNAPSHOT -sh bin/connect-standalone.sh -c conf/connect-standalone.conf & +vim conf/connect-standalone.conf +``` +单机模式(standalone)下,RocketMQ Connect 会把同步位点信息持久化到本地文件目录 storePathRootDir +>storePathRootDir=/Users/YourUsername/rocketmqconnect/storeRoot + +如果想重置同步位点,则需要删除持久化的位点信息文件 +```shell +rm -rf /Users/YourUsername/rocketmqconnect/storeRoot/* ``` -**tips**: 可修改 /bin/runconnect.sh 适当调整 JVM Parameters Configuration ->JAVA_OPT="${JAVA_OPT} -server -Xms256m -Xmx256m" +### 采用单机模式启动Connector Worker -runtime启动成功: +```shell +sh bin/connect-standalone.sh -c conf/connect-standalone.conf & +``` ->The standalone worker boot success. +**tips**: 可修改 docker/connect/bin/runconnect.sh 适当调整 JVM 启动参数 + +>JAVA_OPT="${JAVA_OPT} -server -Xms256m -Xmx256m" 查看启动日志文件: +```shell +tail -100f ~/logs/rocketmqconnect/connect_runtime.log +``` ->tail -100f ~/logs/rocketmqconnect/connect_runtime.log +runtime若启动成功则日志文件中能看到如下打印内容: +>The standalone worker boot success. -ctrl + c 退出日志 +要退出tail -f命令的日志追踪模式,您可以按下 Ctrl + C 组合键。 ## 4.启动source connector -当前目录创建测试文件 test-source-file.txt -``` +### 创建源端文件并写入测试数据 + +```shell +mkdir -p /Users/YourUsername/rocketmqconnect/ +cd /Users/YourUsername/rocketmqconnect/ touch test-source-file.txt echo "Hello \r\nRocketMQ\r\n Connect" >> test-source-file.txt +``` +**注意**:不能有空行(demo程序遇到空行会报错)。source connector会持续读取源端文件,每读取到一行数据就会转换为消息体发送到RocketMQ,供sink connector消费。 -curl -X POST -H "Content-Type: application/json" http://127.0.0.1:8082/connectors/fileSourceConnector -d '{"connector.class":"org.apache.rocketmq.connect.file.FileSourceConnector","filename":"test-source-file.txt","connect.topicname":"fileTopic"}' +### 启动Source Connector +```shell +curl -X POST -H "Content-Type: application/json" http://127.0.0.1:8082/connectors/fileSourceConnector -d '{ + "connector.class": "org.apache.rocketmq.connect.file.FileSourceConnector", + "filename": "/Users/YourUsername/rocketmqconnect/test-source-file.txt", + "connect.topicname": "fileTopic" +}' ``` +curl请求返回status:200则表示创建成功,返回样例: +>{"status":200,"body":{"connector.class":"org.apache.rocketmq.connect.file.FileSourceConnector","filename":"/Users/YourUsername/rocketmqconnect/test-source-file.txt","connect.topicname":"fileTopic"}} + 看到以下日志说明 file source connector 启动成功了 +```shell +tail -100f ~/logs/rocketmqconnect/connect_runtime.log +``` ->tail -100f ~/logs/rocketmqconnect/connect_runtime.log -> ->2019-07-16 11:18:39 INFO pool-7-thread-1 - **Source task start**, config:{"properties":{"source-record-... +>Start connector fileSourceConnector and set target state STARTED successed!! 
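+
+(可选)如果想进一步确认 source connector 已经把文件内容发送到了 RocketMQ,可以用 mqadmin 工具查看 fileTopic 的位点变化。以下命令仅为示意,需在 RocketMQ 安装目录下执行,并假设 NameServer 地址为 localhost:9876:
+```shell
+# 查看 fileTopic 各队列的最小/最大位点,Max Offset 大于 0 说明已有消息写入
+sh bin/mqadmin topicStatus -n localhost:9876 -t fileTopic
+```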
#### source connector配置说明 | key | nullable | default | description | |-------------------| -------- | ---------------------|--------------------------| | connector.class | false | | 实现 Connector接口的类名称(包含包名) | -| filename | false | | 数据源文件名称 | -| connect.topicname | false | | 同步文件数据所需topic | +| filename | false | | 数据源端文件名称(建议使用绝对路径) | +| connect.topicname | false | | 同步文件数据所使用的RocketMQ topic | ## 5.启动sink connector +```shell +curl -X POST -H "Content-Type: application/json" http://127.0.0.1:8082/connectors/fileSinkConnector -d '{ + "connector.class": "org.apache.rocketmq.connect.file.FileSinkConnector", + "filename": "/Users/YourUsername/rocketmqconnect/test-sink-file.txt", + "connect.topicnames": "fileTopic" +}' ``` -curl -X POST -H "Content-Type: application/json" http://127.0.0.1:8082/connectors/fileSinkConnector -d '{"connector.class":"org.apache.rocketmq.connect.file.FileSinkConnector","filename":"test-sink-file.txt","connect.topicnames":"fileTopic"}' -cat test-sink-file.txt +curl请求返回status:200则表示创建成功,返回样例: +>{"status":200,"body":{"connector.class":"org.apache.rocketmq.connect.file.FileSinkConnector","filename":"/Users/YourUsername/rocketmqconnect/test-sink-file.txt","connect.topicnames":"fileTopic"}} + +看到以下日志说明file sink connector 启动成功了 +```shell +tail -100f ~/logs/rocketmqconnect/connect_runtime.log ``` +> Start connector fileSinkConnector and set target state STARTED successed!! +查看sink connector是否将数据写入了目的端文件: +```shell +cat /Users/YourUsername/rocketmqconnect/test-sink-file.txt +``` -> tail -100f ~/logs/rocketmqconnect/connect_runtime.log +如果生成了 test-sink-file.txt 文件,并且与 source-file.txt 内容一样则说明整个流程正常运行。 -看到以下日志说明file sink connector 启动成功了 +继续向源端文件 test-source-file.txt 中写入测试数据, +```shell +cd /Users/YourUsername/rocketmqconnect/ -> 2019-07-16 11:24:58 INFO pool-7-thread-2 - **Sink task start**, config:{"properties":{"source-record-... 
+echo "Say Hi to\r\nRMQ Connector\r\nAgain" >> test-source-file.txt + +# Wait a few seconds, check if rocketmq-connect replicate data to sink file succeed +sleep 10 +cat /Users/YourUsername/rocketmqconnect/test-sink-file.txt +``` + +**注意**:文件内容可能顺序不一样,这是因为 `rocketmq-connect-sample` 向RocketMQ Topic中收发消息时,使用的消息类型是普通消息,区别于顺序消息,消费普通消息时是不保证顺序的。 -如果 test-sink-file.txt 生成并且与 source-file.txt 内容一样,说明整个流程正常运行。 -文件内容可能顺序不一样,这主要是因为RocketMQ发到不同queue时,接收不同queue消息顺序可能也不一致导致的,是正常的。 #### sink connector配置说明 -| key | nullable | default | description | -|--------------------| -------- | ------- | -------------------------------------------------------------------------------------- | -| connector.class | false | | 实现Connector接口的类名称(包含包名) | -| filename | false | | sink拉去的数据保存到文件 | -| connect.topicnames | false | | sink需要处理数据消息topics | +| key | nullable | default | description | +|--------------------| -------- | ------- |----------------------------------------| +| connector.class | false | | 实现Connector接口的类名称(包含包名) | +| filename | false | | sink消费RocketMQ数据后保存到的目的端文件名称(建议使用绝对路径) | +| connect.topicnames | false | | sink需要处理数据消息topics | -``` -注:source/sink配置文件说明是以rocketmq-connect-sample为demo,不同source/sink connector配置有差异,请以具体sourc/sink connector 为准 -``` +**注意**:source/sink配置文件说明是以rocketmq-connect-sample为demo,不同source/sink connector配置有差异,请以具体sourc/sink connector 为准 ## 6.停止connector - -```shell -GET请求 -http://(your worker ip):(port)/connectors/(connector name)/stop +RESTFul 命令格式 `http://(your worker ip):(port)/connectors/(connector name)/stop` 停止demo中的两个connector -curl http://127.0.0.1:8082/connectors/fileSinkConnector/stop -curl http://127.0.0.1:8082/connectors/fileSourceConnector/stop - +```shell +curl http://127.0.0.1:8082/connectors/fileSinkConnector/stop +curl http://127.0.0.1:8082/connectors/fileSourceConnector/stop ``` -看到以下日志说明connector停止成功了 ->**Source task stop**, config:{"properties":{"source-record-converter":"org.apache.rocketmq.connect.runtime.converter.JsonConverter","filename":"/home/zhoubo/IdeaProjects/my-new3-rocketmq-externals/rocketmq-connect/rocketmq-connect-runtime/source-file.txt","task-class":"org.apache.rocketmq.connect.file.FileSourceTask","topic":"fileTopic","connector-class":"org.apache.rocketmq.connect.file.FileSourceConnector","update-timestamp":"1564765189322"}} +curl请求返回status:200则表示停止成功,返回样例: +>{"status":200,"body":"Connector [fileSinkConnector] deleted successfully"} + +看到以下日志说明file sink connector 停止成功了 +```shell +tail -100f ~/logs/rocketmqconnect/connect_default.log +``` +> Completed shutdown for connectorName:fileSinkConnector ## 7.停止Worker进程 -``` +```shell +cd $RMQ_CONNECT_HOME/distribution/target/rocketmq-connect-0.0.1-SNAPSHOT/rocketmq-connect-0.0.1-SNAPSHOT sh bin/connectshutdown.sh ``` ## 8.日志目录 +查看日志目录(下面2个命令是等价的) +```shell +ls $HOME/logs/rocketmqconnect +ls ~/logs/rocketmqconnect +``` ->${user.home}/logs/rocketmqconnect - -## 9.配置文件 - -持久化配置文件默认目录 /tmp/storeRoot - -| key | description | -|----------------------|---------------------------| -| connectorConfig.json | connector配置持久化文件 | -| position.json | source connect数据处理进度持久化文件 | -| taskConfig.json | task配置持久化文件 | -| offset.json | sink connect数据消费进度持久化文件 | -| connectorStatus.json | connector 状态持久化文件 | -| taskStatus.json | task 状态持久化文件 | - -## 10.配置说明 +## 9.配置文件说明 -可根据使用情况修改 [RESTful](https://restfulapi.cn/) 端口,storeRoot 路径,Nameserver 地址等信息 +connect-standalone.conf配置文件中, 配置了 [RESTful](https://restfulapi.cn/) 端口,storeRoot 路径,Nameserver 地址等信息,可根据需要进行修改。 -文件位置:work 启动目录下 conf/connect-standalone.conf +配置文件样例: ```shell #current 
cluster node uniquely identifies @@ -166,17 +224,26 @@ workerId=DEFAULT_WORKER_1 httpPort=8082 # Local file dir for config store -storePathRootDir=/home/connect/storeRoot +storePathRootDir=/Users/YourUsername/rocketmqconnect/storeRoot #需要修改为自己的rocketmq nameserver 接入点 # RocketMQ namesrvAddr namesrvAddr=127.0.0.1:9876 -#用于加载Connector插件,类似于jvm启动加载jar包或者class类,这里目录目录用于放Connector相关的实现插件, -支持文件和目录 -# Source or sink connector jar file dir -pluginPaths=rocketmq-connect-sample/target/rocketmq-connect-sample-0.0.1-SNAPSHOT.jar +# 插件地址,用于Worker加载Source/Sink Connector插件 +# rocketmq-connect 工程已默认包含 rocketmq-connect-sample 模块,因此这里无需配置。 +pluginPaths= +``` + +storePathRootDir配置说明: -# 补充:将 Connector 相关实现插件保存到指定文件夹 -# pluginPaths=/usr/local/connector-plugins/* -``` \ No newline at end of file +单机模式(standalone)下,RocketMQ Connect 会把同步位点信息持久化到本地文件目录 storePathRootDir,持久化文件包括 + +| key | description | +|----------------------|---------------------------| +| connectorConfig.json | connector配置持久化文件 | +| position.json | source connect数据处理进度持久化文件 | +| taskConfig.json | task配置持久化文件 | +| offset.json | sink connect数据消费进度持久化文件 | +| connectorStatus.json | connector 状态持久化文件 | +| taskStatus.json | task 状态持久化文件 | diff --git a/versioned_docs/version-5.0/10-connect/07RocketMQ Connect In Action4.md b/versioned_docs/version-5.0/10-connect/07RocketMQ Connect In Action4.md index 2a9c439d6b..34588b9b1b 100644 --- a/versioned_docs/version-5.0/10-connect/07RocketMQ Connect In Action4.md +++ b/versioned_docs/version-5.0/10-connect/07RocketMQ Connect In Action4.md @@ -1,6 +1,6 @@ # RocketMQ Connect实战4 -SFTP Server(文件数据) -> RocketMQ Connect +SFTP Server(文件数据) -> RocketMQ Connect -> SFTP Server(文件) ## 准备 @@ -9,52 +9,67 @@ SFTP Server(文件数据) -> RocketMQ Connect 1. Linux/Unix/Mac 2. 64bit JDK 1.8+; 3. Maven 3.2.x或以上版本; -4. 启动 [RocketMQ](https://rocketmq.apache.org/docs/quick-start/); +4. 启动 RocketMQ。使用[RocketMQ 4.x](https://rocketmq.apache.org/docs/4.x/) 或 + [RocketMQ 5.x](https://rocketmq.apache.org/docs/quickStart/01quickstart/)版本均可; +5. 工具测试 RocketMQ 消息收发是否正常。详见[RocketMQ 4.x](https://rocketmq.apache.org/docs/4.x/) 或 + [RocketMQ 5.x](https://rocketmq.apache.org/docs/quickStart/01quickstart/)文档。 +这里利用环境变量NAMESRV_ADDR来告诉工具客户端RocketMQ的NameServer地址为localhost:9876 +```shell +#$ cd distribution/target/rocketmq-4.9.7/rocketmq-4.9.7 +$ cd distribution/target/rocketmq-5.1.4/rocketmq-5.1.4 -**提示** : ${ROCKETMQ_HOME} 位置说明 +$ export NAMESRV_ADDR=localhost:9876 +$ sh bin/tools.sh org.apache.rocketmq.example.quickstart.Producer + SendResult [sendStatus=SEND_OK, msgId= ... ->bin-release.zip 版本:/rocketmq-all-4.9.4-bin-release -> ->source-release.zip 版本:/rocketmq-all-4.9.4-source-release/distribution +$ sh bin/tools.sh org.apache.rocketmq.example.quickstart.Consumer + ConsumeMessageThread_%d Receive New Messages: [MessageExt... 
+``` +**说明**:RocketMQ具备自动创建Topic和Group的功能,在发送消息或订阅消息时,如果相应的Topic或Group不存在,RocketMQ会自动创建它们。因此不需要提前创建Topic和Group。 -### 启动Connect +### 构建 Connector Runtime +```shell +git clone https://github.com/apache/rocketmq-connect.git -#### Connector插件编译 +cd rocketmq-connect -RocketMQ Connector SFTP -``` -$ cd rocketmq-connect/connectors/rocketmq-connect-sftp/ -$ mvn clean package -Dmaven.test.skip=true -``` +export RMQ_CONNECT_HOME=`pwd` -将 RocketMQ Connector SFTP 编译好的包放入Runtime加载目录。命令如下: -``` -mkdir -p /usr/local/connector-plugins -cp target/rocketmq-connect-sftp-0.0.1-SNAPSHOT-jar-with-dependencies.jar /usr/local/connector-plugins +mvn -Prelease-connect -Dmaven.test.skip=true clean install -U ``` -#### 启动Connect Runtime +### 构建 SFTP Connector Plugin ``` -cd rocketmq-connect +cd $RMQ_CONNECT_HOME/connectors/rocketmq-connect-sftp/ -mvn -Prelease-connect -DskipTests clean install -U +mvn clean package -Dmaven.test.skip=true +``` +将 SFTP RocketMQ Connector 编译好的包放入Runtime加载的Plugin目录 ``` +mkdir -p /Users/YourUsername/rocketmqconnect/connector-plugins +cp target/rocketmq-connect-sftp-0.0.1-SNAPSHOT-jar-with-dependencies.jar /Users/YourUsername/rocketmqconnect/connector-plugins +``` + +### 单机模式运行 Connector Worker + +`connect-standalone.conf`中配置了RocketMQ连接地址等信息,需要根据使用情况进行修改 -修改配置`connect-standalone.conf` ,重点配置如下 ``` -$ cd distribution/target/rocketmq-connect-0.0.1-SNAPSHOT/rocketmq-connect-0.0.1-SNAPSHOT -$ vim conf/connect-standalone.conf +cd $RMQ_CONNECT_HOME/distribution/target/rocketmq-connect-0.0.1-SNAPSHOT/rocketmq-connect-0.0.1-SNAPSHOT + +vim conf/connect-standalone.conf ``` +示例配置信息如下 ``` workerId=standalone-worker -storePathRootDir=/tmp/storeRoot +storePathRootDir=/Users/YourUsername/rocketmqconnect/storeRoot ## Http port for user to access REST API httpPort=8082 @@ -64,86 +79,148 @@ namesrvAddr=localhost:9876 # RocketMQ acl aclEnable=false -accessKey=rocketmq -secretKey=12345678 +#accessKey=rocketmq +#secretKey=12345678 -autoCreateGroupEnable=false clusterName="DefaultCluster" -# 核心配置,将之前编译好包的插件目录配置在此; -# Source or sink connector jar file dir,The default value is rocketmq-connect-sample -pluginPaths=/usr/local/connector-plugins +# 插件地址,用于Worker加载Source/Sink Connector插件 +pluginPaths=/Users/YourUsername/rocketmqconnect/connector-plugins ``` +单机模式(standalone)下,RocketMQ Connect 会把同步位点信息持久化到本地文件目录 storePathRootDir +>storePathRootDir=/Users/YourUsername/rocketmqconnect/storeRoot +如果想重置同步位点,则需要删除持久化的位点信息文件 +```shell +rm -rf /Users/YourUsername/rocketmqconnect/storeRoot/* ``` -cd distribution/target/rocketmq-connect-0.0.1-SNAPSHOT/rocketmq-connect-0.0.1-SNAPSHOT +采用单机模式启动Connector Worker +``` sh bin/connect-standalone.sh -c conf/connect-standalone.conf & - ``` -### SFTP 服务器搭建 +### 搭建 SFTP 服务器 +SFTP(SSH File Transfer Protocol)是一个文件传输协议,用于在计算机之间进行安全的文件传输。SFTP建立在SSH连接之上,它是通过SSH(Secure Shell)协议进行加密和身份验证的。 + +这里为了方便演示,使用 MAC OS 自带的 SFTP 服务(只需开启“远程登录”即可访问),详细参见[允许远程电脑访问你的 Mac](https://support.apple.com/zh-cn/guide/mac-help/mchlp1066/mac)文档。 -使用 MAC OS 自带的 SFTP 服务器 +### 创建源端测试文件 -[允许远程电脑访问你的 Mac](https://support.apple.com/zh-cn/guide/mac-help/mchlp1066/mac) +创建源端测试文件 source.txt ,并写入测试数据 -### 测试数据 +``` +mkdir -p /Users/YourUsername/rocketmqconnect/sftp-test/ -登陆 SFTP 服务器,将具有如何内容的 souce.txt 文件放入用户目录,例如:/path/to/ +cd /Users/YourUsername/rocketmqconnect/sftp-test/ -```text -张三|100000202211290001|20221129001|30000.00|2022-11-28|03:00:00|7.00 +touch source.txt + +echo '张三|100000202211290001|20221129001|30000.00|2022-11-28|03:00:00|7.00 李四|100000202211290002|20221129002|40000.00|2022-11-28|04:00:00|9.00 
-赵五|100000202211290003|20221129003|50000.00|2022-11-28|05:00:00|12.00 +赵五|100000202211290003|20221129003|50000.00|2022-11-28|05:00:00|12.00' >> source.txt +``` + +登录 SFTP 服务,验证是否能正常访问。输入下面命令,输入密码后即可进入SFTP服务器 +```shell +# sftp -P port YourUsername@hostname +sftp -P 22 YourUsername@127.0.0.1 +``` +**说明**:由于是本机MAC OS提供的SFTP服务,所以地址是 127.0.0.1, 端口是默认的22。 + +```shell +sftp> cd /Users/YourUsername/rocketmqconnect/sftp-test/ +sftp> ls source.txt +sftp> bye ``` ## 启动Connector ### 启动 SFTP source connector -同步 SFTP 文件:source.txt -作用:通过登陆 SFTP 服务器,解析文件并封装成通用的ConnectRecord对象,发送的RocketMQ Topic当中 +运行以下命令启动 SFTP source connector,connector将会连接到SFTP服务读取source.txt文件, +每读取文件中的一行内容,就会解析并封装成通用的ConnectRecord对象,发送到RocketMQ Topic当中, +供Sink Connector进行消费。 ```shell curl -X POST --location "http://localhost:8082/connectors/SftpSourceConnector" --http1.1 \ -H "Host: localhost:8082" \ -H "Content-Type: application/json" \ - -d "{ - \"connector.class\": \"org.apache.rocketmq.connect.http.sink.SftpSourceConnector\", - \"host\": \"127.0.0.1\", - \"port\": 22, - \"username\": \"wencheng\", - \"password\": \"1617\", - \"filePath\": \"/Users/wencheng/Documents/source.txt\", - \"connect.topicname\": \"sftpTopic\", - \"fieldSeparator\": \"|\", - \"fieldSchema\": \"username|idCardNo|orderNo|orderAmount|trxDate|trxTime|profit\" - }" + -d '{ + "connector.class": "org.apache.rocketmq.connect.http.sink.SftpSourceConnector", + "host": "127.0.0.1", + "port": 22, + "username": "YourUsername", + "password": "yourPassword", + "filePath": "/Users/YourUsername/rocketmqconnect/sftp-test/source.txt", + "connect.topicname": "sftpTopic", + "fieldSeparator": "|", + "fieldSchema": "username|idCardNo|orderNo|orderAmount|trxDate|trxTime|profit" + }' +``` + +curl请求返回status:200则表示创建成功,返回样例: +>{"status":200,"body":{"connector.class":"... + +看到以下日志说明 file source connector 启动成功了 +```shell +tail -100f ~/logs/rocketmqconnect/connect_runtime.log ``` -运行完以上命令后,SFTP 服务上的文件数据会被组织成给定格式的数据,写入 MQ。之后可以通过 sink connector 或者其他业务系统去消费它。 +>Start connector SftpSourceConnector and set target state STARTED successed!! ### 启动 SFTP sink connector -作用:通过消费Topic中的数据,使用SFTP协议写入到目标文件当中 +运行以下命令启动 SFTP sink connector,connector将会订阅RocketMQ Topic的数据进行消费, +并将每个消息转换为一行文字内容,然后通过SFTP协议写入到sink.txt文件中去。 ```shell curl -X POST --location "http://localhost:8082/connectors/SftpSinkConnector" --http1.1 \ -H "Host: localhost:8082" \ -H "Content-Type: application/json" \ - -d "{ - \"connector.class\": \"org.apache.rocketmq.connect.http.sink.SftpSinkConnector\", - \"host\": \"127.0.0.1\", - \"port\": 22, - \"username\": \"wencheng\", - \"password\": \"1617\", - \"filePath\": \"/Users/wencheng/Documents/sink.txt\", - \"connect.topicnames\": \"sftpTopic\", - \"fieldSeparator\": \"|\", - \"fieldSchema\": \"username|idCardNo|orderNo|orderAmount|trxDate|trxTime|profit\" - }" -``` - -**** \ No newline at end of file + -d '{ + "connector.class": "org.apache.rocketmq.connect.http.sink.SftpSinkConnector", + "host": "127.0.0.1", + "port": 22, + "username": "YourUsername", + "password": "yourPassword", + "filePath": "/Users/YourUsername/rocketmqconnect/sftp-test/sink.txt", + "connect.topicnames": "sftpTopic", + "fieldSeparator": "|", + "fieldSchema": "username|idCardNo|orderNo|orderAmount|trxDate|trxTime|profit" + }' +``` + +curl请求返回status:200则表示创建成功,返回样例: +>{"status":200,"body":{"connector.class":"... + +看到以下日志说明 file source connector 启动成功了 +```shell +tail -100f ~/logs/rocketmqconnect/connect_runtime.log +``` + +>Start connector SftpSinkConnector and set target state STARTED successed!! 
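+
+待目的端文件 sink.txt 生成后(见下文的验证步骤),还可以用 awk 按 fieldSeparator(`|`)拆分每一行,核对各列与 fieldSchema 中字段的对应关系。以下命令仅为示意:
+```shell
+# $1 对应 fieldSchema 中的 username,$4 对应 orderAmount
+awk -F'|' '{ printf "username=%s, orderAmount=%s\n", $1, $4 }' /Users/YourUsername/rocketmqconnect/sftp-test/sink.txt
+```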
+ + +查看sink connector是否将数据写入了目的端文件: +```shell +cat /Users/YourUsername/rocketmqconnect/sftp-test/sink.txt +``` + +如果生成了 sink.txt 文件,并且与 source.txt 内容一样则说明整个流程正常运行。 + +继续向源端文件 source.txt 中写入测试数据, +```shell +cd /Users/YourUsername/rocketmqconnect/sftp-test/ + +echo '张三x|100000202211290001|20221129001|30000.00|2022-11-28|03:00:00|7.00 +李四x|100000202211290002|20221129002|40000.00|2022-11-28|04:00:00|9.00 +赵五x|100000202211290003|20221129003|50000.00|2022-11-28|05:00:00|12.00' >> source.txt + +# Wait a few seconds, check if rocketmq-connect replicate data to sink file succeed +sleep 10 +cat /Users/YourUsername/rocketmqconnect/sftp-test/sink.txt +``` + +**注意**:文件内容可能顺序不一样,这是因为`rocketmq-connect-sftp`向RocketMQ Topic中收发消息时,使用的消息类型是普通消息,区别于顺序消息,消费普通消息时是不保证顺序的。 diff --git a/versioned_docs/version-5.0/10-connect/08RocketMQ Connect In Action5-ES.md b/versioned_docs/version-5.0/10-connect/08RocketMQ Connect In Action5-ES.md index 783d8bf137..00011e6b82 100644 --- a/versioned_docs/version-5.0/10-connect/08RocketMQ Connect In Action5-ES.md +++ b/versioned_docs/version-5.0/10-connect/08RocketMQ Connect In Action5-ES.md @@ -1,6 +1,6 @@ # RocketMQ Connect实战5 -Elsticsearch Source - >RocketMQ Connect -> Elasticsearch Sink +Elasticsearch Source -> RocketMQ Connect -> Elasticsearch Sink ## 准备 @@ -9,53 +9,67 @@ Elsticsearch Source - >RocketMQ Connect -> Elasticsearch Sink 1. Linux/Unix/Mac 2. 64bit JDK 1.8+; 3. Maven 3.2.x或以上版本; -4. 启动 [RocketMQ](https://rocketmq.apache.org/docs/quick-start/); +4. 启动 RocketMQ。使用[RocketMQ 4.x](https://rocketmq.apache.org/docs/4.x/) 或 + [RocketMQ 5.x](https://rocketmq.apache.org/docs/quickStart/01quickstart/)版本均可; +5. 工具测试 RocketMQ 消息收发是否正常。详见[RocketMQ 4.x](https://rocketmq.apache.org/docs/4.x/) 或 + [RocketMQ 5.x](https://rocketmq.apache.org/docs/quickStart/01quickstart/)文档。 +这里利用环境变量NAMESRV_ADDR来告诉工具客户端RocketMQ的NameServer地址为localhost:9876 +```shell +#$ cd distribution/target/rocketmq-4.9.7/rocketmq-4.9.7 +$ cd distribution/target/rocketmq-5.1.4/rocketmq-5.1.4 -**tips** : ${ROCKETMQ_HOME} 位置说明 +$ export NAMESRV_ADDR=localhost:9876 +$ sh bin/tools.sh org.apache.rocketmq.example.quickstart.Producer + SendResult [sendStatus=SEND_OK, msgId= ... ->bin-release.zip 版本:/rocketmq-all-4.9.4-bin-release -> ->source-release.zip 版本:/rocketmq-all-4.9.4-source-release/distribution +$ sh bin/tools.sh org.apache.rocketmq.example.quickstart.Consumer + ConsumeMessageThread_%d Receive New Messages: [MessageExt... 
+``` +**说明**:RocketMQ具备自动创建Topic和Group的功能,在发送消息或订阅消息时,如果相应的Topic或Group不存在,RocketMQ会自动创建它们。因此不需要提前创建Topic和Group。 -### 启动Connect +### 构建 Connector Runtime +```shell +git clone https://github.com/apache/rocketmq-connect.git -#### Connector插件编译 +cd rocketmq-connect -Elasticsearch RocketMQ Connector -``` -$ cd rocketmq-connect/connectors/rocketmq-connect-elasticsearch/ -$ mvn clean package -Dmaven.test.skip=true -``` +export RMQ_CONNECT_HOME=`pwd` -将 Elasticsearch RocketMQ Connector 编译好的包放入Runtime加载目录。命令如下: -``` -mkdir -p /usr/local/connector-plugins -cp rocketmq-connect-elasticsearch/target/rocketmq-connect-elasticsearch-1.0.0-jar-with-dependencies.jar /usr/local/connector-plugins +mvn -Prelease-connect -Dmaven.test.skip=true clean install -U ``` - -#### 启动Connect Runtime +### 构建 Elasticsearch Connector Plugin ``` -cd rocketmq-connect +cd $RMQ_CONNECT_HOME/connectors/rocketmq-connect-elasticsearch/ -mvn -Prelease-connect -DskipTests clean install -U +mvn clean package -Dmaven.test.skip=true +``` +将 Elasticsearch RocketMQ Connector 编译好的包放入Runtime加载的Plugin目录 +``` +mkdir -p /Users/YourUsername/rocketmqconnect/connector-plugins +cp target/rocketmq-connect-elasticsearch-1.0.0-jar-with-dependencies.jar /Users/YourUsername/rocketmqconnect/connector-plugins ``` -修改配置`connect-standalone.conf` ,重点配置如下 +### 单机模式运行 Connector Worker + +`connect-standalone.conf`中配置了RocketMQ连接地址等信息,需要根据使用情况进行修改 + ``` -$ cd distribution/target/rocketmq-connect-0.0.1-SNAPSHOT/rocketmq-connect-0.0.1-SNAPSHOT -$ vim conf/connect-standalone.conf +cd $RMQ_CONNECT_HOME/distribution/target/rocketmq-connect-0.0.1-SNAPSHOT/rocketmq-connect-0.0.1-SNAPSHOT + +vim conf/connect-standalone.conf ``` +示例配置信息如下 ``` workerId=standalone-worker -storePathRootDir=/tmp/storeRoot +storePathRootDir=/Users/YourUsername/rocketmqconnect/storeRoot ## Http port for user to access REST API httpPort=8082 @@ -65,53 +79,207 @@ namesrvAddr=localhost:9876 # RocketMQ acl aclEnable=false -accessKey=rocketmq -secretKey=12345678 +#accessKey=rocketmq +#secretKey=12345678 -autoCreateGroupEnable=false clusterName="DefaultCluster" -# 核心配置,将之前编译好elasticsearch包的插件目录配置在此; -# Source or sink connector jar file dir,The default value is rocketmq-connect-sample -pluginPaths=/usr/local/connector-plugins +# 插件地址,用于Worker加载Source/Sink Connector插件 +pluginPaths=/Users/YourUsername/rocketmqconnect/connector-plugins ``` +单机模式(standalone)下,RocketMQ Connect 会把同步位点信息持久化到本地文件目录 storePathRootDir +>storePathRootDir=/Users/YourUsername/rocketmqconnect/storeRoot +如果想重置同步位点,则需要删除持久化的位点信息文件 +```shell +rm -rf /Users/YourUsername/rocketmqconnect/storeRoot/* ``` -cd distribution/target/rocketmq-connect-0.0.1-SNAPSHOT/rocketmq-connect-0.0.1-SNAPSHOT +采用单机模式启动Connector Worker +``` sh bin/connect-standalone.sh -c conf/connect-standalone.conf & +``` + +### 搭建 Elasticsearch 服务 + +Elasticsearch是一个开源的实时分布式搜索和分析引擎。 + +这里为了方便演示,使用 docker 搭建 2个 Elasticsearch 数据库,分别作为 Connector 连接的源和目的端ES数据库。 +``` +docker pull docker.elastic.co/elasticsearch/elasticsearch:7.15.1 +docker run --name es1 -p 9200:9200 -p 9300:9300 -e "discovery.type=single-node" -e "ES_JAVA_OPTS=-Xms1g -Xmx1g" \ + -v /Users/YourUsername/rocketmqconnect/es/es1_data:/usr/share/elasticsearch/data \ + -d docker.elastic.co/elasticsearch/elasticsearch:7.15.1 + +docker run --name es2 -p 9201:9200 -p 9301:9300 -e "discovery.type=single-node" -e "ES_JAVA_OPTS=-Xms1g -Xmx1g" \ + -v /Users/YourUsername/rocketmqconnect/es/es2_data:/usr/share/elasticsearch/data \ + -d docker.elastic.co/elasticsearch/elasticsearch:7.15.1 ``` -### Elasticsearch镜像 
+**docker命令说明**: +- --name es2: 为容器指定一个名称,本例中为es2。 +- -p 9201:9200 -p 9301:9300: 将Elasticsearch的HTTP端口9200和传输端口9300分别映射到主机的9201和9301端口,以便可以通过主机访问Elasticsearch服务。 +- -e "discovery.type=single-node": 设置Elasticsearch的发现类型为单节点模式,这对于单机部署非常适用。 +- -v /Users/YourUsername/rocketmqconnect/es/es2_data:/usr/share/elasticsearch/data: 将主机上的一个目录挂载到容器内的/usr/share/elasticsearch/data目录,用于持久化存储Elasticsearch数据。 + +通过以上命令,您可以运行一个带有自定义配置和数据存储的Elasticsearch容器,并且可以通过主机的9200端口访问其HTTP API。这是在本地开发或测试环境中运行独立的Elasticsearch实例的常见方式。 + + +查看ES日志,查看启动是否有报错 +``` +docker logs -f es1 -使用 docker 搭建环境 Elasticsearch 数据库 +docker logs -f es2 ``` -# starting a elasticsearch instance -docker run --name my-elasticsearch -p 9200:9200 -p 9300:9300 -e "ES_JAVA_OPTS=-Xms1g -Xmx1g" -d 74c2e0ec249c + +使用curl命令检查Elasticsearch是否正常 + +``` +# check es1 +curl -XGET http://localhost:9200 + +# check es2 +curl -XGET http://localhost:9201 ``` -### Kibana镜像 -使用 docker 搭建环境 Kibana +如果成功连接并且Elasticsearch已正常启动,您将看到与Elasticsearch相关的信息和版本号的JSON响应。 + +### 搭建 Kibana 服务 + +Kibana是一个开源的数据可视化工具,用于对Elasticsearch中存储的数据进行搜索、分析和可视化展示。 +它提供了丰富的图表、图形和仪表盘等功能,使用户能够以直观的方式理解和探索数据。 + +这里为了方便演示,使用 docker 搭建 2个 Kibana 服务,分别连接前面搭建的2个ES数据库。 + ``` -docker run --name my-kibana -e ELASTICSEARCH_URL=http://192.168.0.101:9200 -p 5601:5601 -d 5dca66b41943 +docker pull docker.elastic.co/kibana/kibana:7.15.1 + +docker run --name kibana1 --link es1:elasticsearch -p 5601:5601 -d docker.elastic.co/kibana/kibana:7.15.1 + +docker run --name kibana2 --link es2:elasticsearch -p 5602:5601 -d docker.elastic.co/kibana/kibana:7.15.1 + ``` +**docker命令说明**: +- --name kibana2: 为容器指定一个名称,本例中为kibana2。 +- --link es2:elasticsearch: 将容器链接到另一个名为es2的Elasticsearch容器。这将允许Kibana实例连接和与Elasticsearch进行通信。 +- -p 5602:5601: 将Kibana的默认端口5601映射到主机的5602端口,以便可以通过主机访问Kibana的用户界面。 +- -d: 在后台运行容器。 +通过以上命令,您可以在Docker容器中启动一个独立的Kibana实例,并将其连接到另一个正在运行的Elasticsearch实例。 +这样,您可以通过浏览器访问主机的5601、5602端口,来分别访问Kibana1、Kibana2控制台。 -### 测试数据 +查看Kibana日志,查看启动是否有报错 +``` +docker logs -f kibana1 -通过 kibana Dev Tools 创建测试数据:参考 [console-ibana](https://www.elastic.co/guide/en/kibana/8.5/console-kibana.html#console-kibana); +docker logs -f kibana2 +``` +使用浏览器访问 kibana 控制台,地址 +- kibana1: http://localhost:5601 +- kibana2:http://localhost:5602 -源索引:connect_es +如果控制台页面能正常打开,则说明Kibana已正常启动。 + +### 向源端ES写入测试数据 +Kibana 的 Dev Tools 可以帮助您在 Kibana 中与 Elasticsearch 进行直接的交互和操作,执行各种查询和操作,并分析和理解返回的数据。 +参见文档 [console-kibana](https://www.elastic.co/guide/en/kibana/8.9/console-kibana.html)。 + +#### 批量写入测试数据 +浏览器访问Kibana1控制台,左侧菜单找到Dev Tools,进入页面后输入如下命令写入测试数据 +``` +POST /_bulk +{ "index" : { "_index" : "connect_es" } } +{ "id": "1", "field1": "value1", "field2": "value2" } +{ "index" : { "_index" : "connect_es" } } +{ "id": "2", "field1": "value3", "field2": "value4" } +``` +**说明**: +- connect_es:数据的索引名称 +- id/field1/field2:数据中的字段名称,1、value1、value2 分别是字段的值。 + +**注意**:`rocketmq-connect-elasticsearch` 存在一个限制,就是数据中必须要一个可用于 >= 比较运算的字段(字符串 或 数字),该字段会被用于记录同步的位点信息。 +上面的示例中 `id` 字段,就是一个全局唯一、自增的数值类型字段。 + +#### 查数据 +查询索引下的数据: +``` +GET /connect_es/_search +{ + "size": 100 +} +``` + +若无数据,则返回示例为: +``` +{ + "error" : { + ... + "type" : "index_not_found_exception", + "reason" : "no such index [connect_es]", + "resource.type" : "index_or_alias", + "resource.id" : "connect_es", + "index_uuid" : "_na_", + "index" : "connect_es" + }, + "status" : 404 +} +``` + +若有数据,则返回示例为: + +``` +{ + ... 
+ "hits" : { + "total" : { + "value" : 2, + "relation" : "eq" + }, + "max_score" : 1.0, + "hits" : [ + { + "_index" : "connect_es", + "_type" : "_doc", + "_id" : "_dx49osBb46Z9cN4hYCg", + "_score" : 1.0, + "_source" : { + "id" : "1", + "field1" : "value1", + "field2" : "value2" + } + }, + { + "_index" : "connect_es", + "_type" : "_doc", + "_id" : "_tx49osBb46Z9cN4hYCg", + "_score" : 1.0, + "_source" : { + "id" : "2", + "field1" : "value3", + "field2" : "value4" + } + } + ] + } +} + +``` + +#### 删除数据 +如果因重复测试等原因,需要删除索引下的数据,则可使用如下命令 +``` +DELETE /connect_es +``` ## 启动Connector ### 启动Elasticsearch source connector -同步源索引数据:connect_es -作用:通过解析 Elasticsearch 文档数据封装成通用的ConnectRecord对象,发送的RocketMQ Topic当中 +运行以下命令启动 ES source connector,connector将会连接到ES读取 connect_es 索引下的文档数据, +并解析 Elasticsearch 文档数据封装成通用的ConnectRecord对象,发送到RocketMQ Topic当中, 供Sink Connector进行消费。 ``` curl -X POST -H "Content-Type: application/json" http://127.0.0.1:8082/connectors/elasticsearchSourceConnector -d '{ @@ -131,29 +299,57 @@ curl -X POST -H "Content-Type: application/json" http://127.0.0.1:8082/connector }' ``` -### 启动 Elasticsearch sink connector +**说明**:启动命令中指定了源端ES要同步的索引为 connect_es ,以及 索引下自增的字段为 id ,并从id=1开始拉取数据。 -作用:通过消费Topic中的数据,写入到目标索引当中 +curl请求返回status:200则表示创建成功,返回样例: +>{"status":200,"body":{"connector.class":"... + +看到以下日志说明 file source connector 启动成功了 +```shell +tail -100f ~/logs/rocketmqconnect/connect_runtime.log ``` -curl -X POST -H "Content-Type: application/json" http://127.0.0.1:8082/connectors/ElasticsearchSinkConnector -d '{ + +>Start connector elasticsearchSourceConnector and set target state STARTED successed!! + + +### 启动 Elasticsearch sink connector +运行以下命令启动 ES sink connector,connector将会订阅RocketMQ Topic的数据进行消费, +并将每个消息转换为文档数据写入到目的端ES当中。 + +``` +curl -X POST -H "Content-Type: application/json" http://127.0.0.1:8082/connectors/elasticsearchSinkConnector -d '{ "connector.class":"org.apache.rocketmq.connect.elasticsearch.connector.ElasticsearchSinkConnector", "elasticsearchHost":"localhost", - "elasticsearchPort":9202, + "elasticsearchPort":9201, "max.tasks":2, "connect.topicnames":"ConnectEsTopic", "value.converter":"org.apache.rocketmq.connect.runtime.converter.record.json.JsonConverter", "key.converter":"org.apache.rocketmq.connect.runtime.converter.record.json.JsonConverter" }' +``` +**说明**:启动命令中指定了目的端ES地址和端口,对应之前docker启动的es2。 + +curl请求返回status:200则表示创建成功,返回样例: +>{"status":200,"body":{"connector.class":"... + +看到以下日志说明 file source connector 启动成功了 +```shell +tail -100f ~/logs/rocketmqconnect/connect_runtime.log ``` -note:本地测试需要启动两个不同端口的Elasticsearch进程 +>Start connector elasticsearchSinkConnector and set target state STARTED successed!! -以上两个Connector任务创建成功以后 -通过访问sink指定的Elasticsearch是否包含数据 +查看sink connector是否将数据写入了目的端ES的索引当中: +1. 浏览器访问 Kibana2 控制台地址 http://localhost:5602 +2. Kibana2 Dev Tools 页面,查询索引下的数据,若跟源端 es1 中的数据一致则说明Connector运行正常。 +``` +GET /connect_es/_search +{ + "size": 100 +} +``` -对源索引的新增数据 -即可同步到目标索引当中
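+
+继续向源端 ES1 写入增量测试数据,验证新增数据能否同步到目的端索引。以下仅为示例(在 Kibana1 Dev Tools 中执行,id 取比已有数据更大的值,例如 3):
+```
+POST /_bulk
+{ "index" : { "_index" : "connect_es" } }
+{ "id": "3", "field1": "value5", "field2": "value6" }
+```
+
+等待几秒后,在 Kibana2 Dev Tools 中再次查询 connect_es 索引,若能查到 id 为 3 的新数据,则说明增量同步正常。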