-
Notifications
You must be signed in to change notification settings - Fork 148
APEXMALHAR-2427 Kinesis Input Operator documentation. #564
base: master
Are you sure you want to change the base?
Conversation
@chaithu14 Please review. |
a12bd87
to
039bead
Compare
|
||
### Pre-requisites | ||
|
||
This operator uses the AWS kinesis java sdk version 1.9.10. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it a pre-requisite ? You may remove the "pre-requisites" and add this information after Maven Aritifiact
### Introduction | ||
|
||
The kinesis input operator consumes data from the partitions of a Kinesis shard(s) for processing in Apex. | ||
The operator is fault-tolerant, scalable and supports input from multiple shards of AWS kinesis streams in a single operator instance. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A suggestion : you may highlight "AWS Kinesis Streams" and provide link to https://aws.amazon.com/kinesis/streams/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
![AbstractKinesisInput.png](images/kinesisInput/classdiagram.png) | ||
|
||
#### Configuration properties | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add a table with columns "Parameter" , Type, Description, Mandatory , Default Value etc
This will make it readable and consistent with other operator documentation. You may refer to KafkaInputOperator documentation
#### Abstract Methods | ||
|
||
`T getTuple(Record record)`: Converts the Kinesis Stream record(s) to tuple. | ||
`void emitTuple(Pair<String, Record> data)`: This method emits tuples extracted from AWS kinesis streams. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be on another line. Do verify it by viewing the doc in mkdocs on chrome
|
||
- ***streamName*** - String[] | ||
- Mandatory Parameter. | ||
- Specifies the name of the stream from where the records to be accessed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
records are accessed / records are to be accessed
- ***strategy*** - PartitionStrategy | ||
- Operator supports two types of partitioning strategies, `ONE_TO_ONE` and `MANY_TO_ONE`. | ||
|
||
`ONE_TO_ONE`: If this is enabled, the AppMaster creates one input operator instance per kinesis shard partition. So the number of shard partitions equals the number of operator instances. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
number of operator instances equals the number of shard partitions
Default Value = 30 Seconds. | ||
|
||
- ***repartitionCheckInterval*** - Long | ||
- Interval specified in milliseconds. This value specifies the minimum interval between two stat checks. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
stat ?
Default value = 1024. | ||
|
||
- ***windowDataManager*** - WindowDataManager | ||
- If set to a value other than the default, such as `FSWindowDataManager`, specifies that the operator will process the same set of messages in a window before and after a failure. This is important but it comes with higher cost because at the end of each window the operator needs to persist some state with respect to that window. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could be rephrased. The description should convey following .
- What is the purpose of this property ?
- How does the default work ?
- What are my choices if i want to specify any other windowDataManager and how each of those choices will affect the behaviour?
returned by processStats(...) method, the application master invokes | ||
definePartitions(...) on the logical instance which is also the | ||
partitioner instance. Dynamic partition can be disabled by setting the | ||
parameter repartitionInterval value to a negative value. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
highlight repartitionInterval
} | ||
} | ||
``` | ||
Below is the configuration for “KINESIS-STREAM-NAME” Kinesis stream name: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did not quite understand the meaning of this line. Are you referring to configuration for the app ?
No description provided.