[Docs] Update concept related docs info
tcodehuber committed Jul 12, 2024
1 parent 7e02c88 commit a73ebaa
Showing 12 changed files with 75 additions and 85 deletions.
14 changes: 7 additions & 7 deletions docs/en/concept/JobEnvConfig.md
@@ -1,23 +1,23 @@
# Job Env Config

-This document describes env configuration information, the common parameters can be used in all engines. In order to better distinguish between engine parameters, the additional parameters of other engine need to carry a prefix.
+This document describes env configuration information. The common parameters can be used in all engines. In order to better distinguish between engine parameters, the additional parameters of other engine need to carry a prefix.
In flink engine, we use `flink.` as the prefix. In the spark engine, we do not use any prefixes to modify parameters, because the official spark parameters themselves start with `spark.`

## Common Parameter

-The following configuration parameters are common to all engines
+The following configuration parameters are common to all engines.

### job.name

This parameter configures the task name.

### jars

-Third-party packages can be loaded via `jars`, like `jars="file://local/jar1.jar;file://local/jar2.jar"`
+Third-party packages can be loaded via `jars`, like `jars="file://local/jar1.jar;file://local/jar2.jar"`.

### job.mode

-You can configure whether the task is in batch mode or stream mode through `job.mode`, like `job.mode = "BATCH"` or `job.mode = "STREAMING"`
+You can configure whether the task is in batch or stream mode through `job.mode`, like `job.mode = "BATCH"` or `job.mode = "STREAMING"`
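
For instance, the common parameters above could be combined into a single `env` block like this (a minimal sketch; the values are illustrative):

```hocon
env {
  # Common parameters, valid for every engine
  job.name = "example_job"   # task name
  job.mode = "BATCH"         # or "STREAMING"
  # load extra third-party packages, if any
  jars = "file://local/jar1.jar;file://local/jar2.jar"
}
```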

### checkpoint.interval

@@ -47,11 +47,11 @@ you can set it to `CLIENT`. Please use `CLUSTER` mode as much as possible, becau

Specify the method of encryption, if you didn't have the requirement for encrypting or decrypting config files, this option can be ignored.

-For more details, you can refer to the documentation [config-encryption-decryption](../connector-v2/Config-Encryption-Decryption.md)
+For more details, you can refer to the documentation [Config Encryption Decryption](../connector-v2/Config-Encryption-Decryption.md)

## Flink Engine Parameter

-Here are some SeaTunnel parameter names corresponding to the names in Flink, not all of them, please refer to the official [flink documentation](https://flink.apache.org/) for more.
+Here are some SeaTunnel parameter names corresponding to the names in Flink, not all of them. Please refer to the official [Flink Documentation](https://flink.apache.org/).

| Flink Configuration Name | SeaTunnel Configuration Name |
|---------------------------------|---------------------------------------|
@@ -62,4 +62,4 @@ Here are some SeaTunnel parameter names corresponding to the names in Flink, not

## Spark Engine Parameter

-Because spark configuration items have not been modified, they are not listed here, please refer to the official [spark documentation](https://spark.apache.org/).
+Because Spark configuration items have not been modified, they are not listed here, please refer to the official [Spark Documentation](https://spark.apache.org/).
58 changes: 29 additions & 29 deletions docs/en/concept/config.md
@@ -5,24 +5,24 @@

# Intro to config file

-In SeaTunnel, the most important thing is the Config file, through which users can customize their own data
+In SeaTunnel, the most important thing is the config file, through which users can customize their own data
synchronization requirements to maximize the potential of SeaTunnel. So next, I will introduce you how to
-configure the Config file.
+configure the config file.

-The main format of the Config file is `hocon`, for more details of this format type you can refer to [HOCON-GUIDE](https://github.com/lightbend/config/blob/main/HOCON.md),
-BTW, we also support the `json` format, but you should know that the name of the config file should end with `.json`
+The main format of the config file is `hocon`, for more details you can refer to [HOCON-GUIDE](https://github.com/lightbend/config/blob/main/HOCON.md),
+BTW, we also support the `json` format, but you should keep in mind that the name of the config file should end with `.json`.

-We also support the `SQL` format, for details, please refer to the [SQL configuration](sql-config.md) file.
+We also support the `SQL` format, please refer to [SQL configuration](sql-config.md) for more details.

## Example

Before you read on, you can find config file
-examples [here](https://github.com/apache/seatunnel/tree/dev/config) and in distribute package's
+examples [Here](https://github.com/apache/seatunnel/tree/dev/config) from the binary package's
config directory.

-## Config file structure
+## Config File Structure

-The Config file will be similar to the one below.
+The config file is similar to the below one:

### hocon

@@ -125,12 +125,12 @@ sql = """ select * from "table" """
```

-As you can see, the Config file contains several sections: env, source, transform, sink. Different modules
-have different functions. After you understand these modules, you will understand how SeaTunnel works.
+As you can see, the config file contains several sections: env, source, transform, sink. Different modules
+have different functions. After you understand these modules, you will see how SeaTunnel works.
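
As a rough sketch of that overall shape, with the section bodies elided (see the linked examples for complete files):

```hocon
env {
  # engine and job level settings, e.g. job.mode = "BATCH"
}

source {
  # where and how to read data
}

transform {
  # optional: how to process rows between source and sink
}

sink {
  # where and how to write data
}
```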

### env

-Used to add some engine optional parameters, no matter which engine (Spark or Flink), the corresponding
+Used to add some engine optional parameters, no matter which engine (Zeta, Spark or Flink), the corresponding
optional parameters should be filled in here.

Note that we have separated the parameters by engine, and for the common parameters, we can configure them as before.
@@ -140,9 +140,9 @@ For flink and spark engine, the specific configuration rules of their parameters

### source

-source is used to define where SeaTunnel needs to fetch data, and use the fetched data for the next step.
-Multiple sources can be defined at the same time. The supported source at now
-check [Source of SeaTunnel](../connector-v2/source). Each source has its own specific parameters to define how to
+Source is used to define where SeaTunnel needs to fetch data, and use the fetched data for the next step.
+Multiple sources can be defined at the same time. The supported source can be found
+in [Source of SeaTunnel](../connector-v2/source). Each source has its own specific parameters to define how to
fetch data, and SeaTunnel also extracts the parameters that each source will use, such as
the `result_table_name` parameter, which is used to specify the name of the data generated by the current
source, which is convenient for follow-up used by other modules.
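
For instance, a source block using the `FakeSource` connector might look like the following (a sketch; the schema fields are illustrative):

```hocon
source {
  FakeSource {
    # name other modules can use to refer to this data set
    result_table_name = "fake"
    row.num = 16
    schema = {
      fields {
        name = "string"
        age = "int"
      }
    }
  }
}
```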
@@ -180,35 +180,35 @@ sink {
fields = ["name", "age", "card"]
username = "default"
password = ""
-source_table_name = "fake1"
+source_table_name = "fake"
}
}
```

-Like source, transform has specific parameters that belong to each module. The supported source at now check.
-The supported transform at now check [Transform V2 of SeaTunnel](../transform-v2)
+Like source, transform has specific parameters that belong to each module. The supported transform can be found
+in [Transform V2 of SeaTunnel](../transform-v2)

### sink

Our purpose with SeaTunnel is to synchronize data from one place to another, so it is critical to define how
and where data is written. With the sink module provided by SeaTunnel, you can complete this operation quickly
-and efficiently. Sink and source are very similar, but the difference is reading and writing. So go check out
-our [supported sinks](../connector-v2/sink).
+and efficiently. Sink and source are very similar, but the difference is reading and writing. So please check out
+[Supported Sinks](../connector-v2/sink).
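
As a minimal illustration, a sink block writing to the console might look like this (a sketch using the `Console` sink; the table name is illustrative):

```hocon
sink {
  Console {
    # consume the data set produced under that name upstream
    source_table_name = "fake"
  }
}
```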

### Other

You will find that when multiple sources and multiple sinks are defined, which data is read by each sink, and
-which is the data read by each transform? We use `result_table_name` and `source_table_name` two key
-configurations. Each source module will be configured with a `result_table_name` to indicate the name of the
+which is the data read by each transform? We introduce two key configurations called `result_table_name` and
+`source_table_name`. Each source module will be configured with a `result_table_name` to indicate the name of the
data source generated by the data source, and other transform and sink modules can use `source_table_name` to
refer to the corresponding data source name, indicating that I want to read the data for processing. Then
transform, as an intermediate processing module, can use both `result_table_name` and `source_table_name`
-configurations at the same time. But you will find that in the above example Config, not every module is
+configurations at the same time. But you will find that in the above example config, not every module is
configured with these two parameters, because in SeaTunnel, there is a default convention, if these two
parameters are not configured, then the generated data from the last module of the previous node will be used.
This is much more convenient when there is only one source.
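
Putting it together, a sketch of how the two parameters chain modules (all names here are illustrative, and the `Sql` transform is just one possible middle step):

```hocon
source {
  FakeSource {
    result_table_name = "fake"    # this source produces "fake"
  }
}

transform {
  Sql {
    source_table_name = "fake"    # read "fake" ...
    result_table_name = "fake1"   # ... and produce "fake1"
    query = "select name, age from fake where age > 18"
  }
}

sink {
  Console {
    source_table_name = "fake1"   # write the transformed data set
  }
}
```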

-## Config variable substitution
+## Config Variable Substitution

In config file we can define some variables and replace it in run time. **This is only support `hocon` format file**.
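
For example, a config might declare placeholders that get filled in at submit time, as with the `-i` options shown further below (a sketch; quoting rules follow the notes at the end of this section):

```hocon
env {
  job.mode = "BATCH"
  job.name = ${jobName}   # replaced from the command line
}

source {
  FakeSource {
    result_table_name = "${resName}"
  }
}
```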

@@ -266,7 +266,7 @@ We can replace those parameters with this shell command:
-i nameVal=abc
-i username=seatunnel=2.3.1
-i password='$a^b%c.d~e0*9('
--e local
+-m local
```

Then the final submitted config is:
@@ -312,12 +312,12 @@ sink {
```

Some Notes:
-- quota with `'` if the value has special character (like `(`)
-- if the replacement variables is in `"` or `'`, like `resName` and `nameVal`, you need add `"`
-- the value can't have space `' '`, like `-i jobName='this is a job name' `, this will be replaced to `job.name = "this"`
-- If you want to use dynamic parameters,you can use the following format: -i date=$(date +"%Y%m%d").
+- Quota with `'` if the value has special character such as `(`
+- If the replacement variables is in `"` or `'`, like `resName` and `nameVal`, you need add `"`
+- The value can't have space `' '`, like `-i jobName='this is a job name' `, this will be replaced to `job.name = "this"`
+- If you want to use dynamic parameters, you can use the following format: -i date=$(date +"%Y%m%d").

## What's More

-If you want to know the details of this format configuration, Please
+If you want to know the details of the format configuration, please
see [HOCON](https://github.com/lightbend/config/blob/main/HOCON.md).
14 changes: 7 additions & 7 deletions docs/en/concept/connector-v2-features.md
@@ -1,9 +1,9 @@
# Intro To Connector V2 Features

-## Differences Between Connector V2 And Connector v1
+## Differences Between Connector V2 And V1

Since https://github.com/apache/seatunnel/issues/1608 We Added Connector V2 Features.
-Connector V2 is a connector defined based on the SeaTunnel Connector API interface. Unlike Connector V1, Connector V2 supports the following features.
+Connector V2 is a connector defined based on the SeaTunnel Connector API interface. Unlike Connector V1, V2 supports the following features:

* **Multi Engine Support** SeaTunnel Connector API is an engine independent API. The connectors developed based on this API can run in multiple engines. Currently, Flink and Spark are supported, and we will support other engines in the future.
* **Multi Engine Version Support** Decoupling the connector from the engine through the translation layer solves the problem that most connectors need to modify the code in order to support a new version of the underlying engine.
@@ -18,23 +18,23 @@ Source connectors have some common core features, and each source connector supp

If each piece of data in the data source will only be sent downstream by the source once, we think this source connector supports exactly once.

-In SeaTunnel, we can save the read **Split** and its **offset**(The position of the read data in split at that time,
-such as line number, byte size, offset, etc) as **StateSnapshot** when checkpoint. If the task restarted, we will get the last **StateSnapshot**
+In SeaTunnel, we can save the read **Split** and its **offset** (The position of the read data in split at that time,
+such as line number, byte size, offset, etc.) as **StateSnapshot** when checkpointing. If the task restarted, we will get the last **StateSnapshot**
and then locate the **Split** and **offset** read last time and continue to send data downstream.

For example `File`, `Kafka`.
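
As context, the checkpointing that produces these StateSnapshots is typically switched on through the job's env settings (a sketch; the interval value is illustrative and the exact behavior depends on the engine):

```hocon
env {
  job.mode = "STREAMING"
  # ask the engine to snapshot state every 10 seconds
  checkpoint.interval = 10000
}
```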

### column projection

-If the connector supports reading only specified columns from the data source (note that if you read all columns first and then filter unnecessary columns through the schema, this method is not a real column projection)
+If the connector supports reading only specified columns from the data source (Note that if you read all columns first and then filter unnecessary columns through the schema, this method is not a real column projection)

-For example `JDBCSource` can use sql define read columns.
+For example `JDBCSource` can use sql to define reading columns.

`KafkaSource` will read all content from topic and then use `schema` to filter unnecessary columns, This is not `column projection`.
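
A sketch of the JDBC case described above, where the SQL itself limits which columns are read (connection details are placeholders):

```hocon
source {
  Jdbc {
    url = "jdbc:mysql://localhost:3306/test"
    driver = "com.mysql.cj.jdbc.Driver"
    user = "root"
    password = ""
    # only name and age are fetched from the database: real column projection
    query = "select name, age from source_table"
    result_table_name = "jdbc_source"
  }
}
```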

### batch

-Batch Job Mode, The data read is bounded and the job will stop when all data read complete.
+Batch Job Mode, The data read is bounded and the job will stop after completing all data read.

### stream


