From 4a7b0fdfe8c01549f7408597efaa8f0bb41bd0f5 Mon Sep 17 00:00:00 2001 From: Harshini Rangaswamy Date: Fri, 19 Jan 2024 10:49:26 +0100 Subject: [PATCH 1/3] Document default restart strategy for Flink JAR applications --- _toc.yml | 2 + docs/products/flink/concepts/custom-jars.rst | 3 +- .../restart-strategy-jar-applications.rst | 49 +++++++++++++++++++ 3 files changed, 53 insertions(+), 1 deletion(-) create mode 100644 docs/products/flink/howto/restart-strategy-jar-applications.rst diff --git a/_toc.yml b/_toc.yml index f7400ad5f..368c566c9 100644 --- a/_toc.yml +++ b/_toc.yml @@ -665,6 +665,8 @@ entries: title: Create a JAR application - file: docs/products/flink/howto/manage-flink-applications title: Manage Apache Flink applications + - file: docs/products/flink/howto/restart-strategy-jar-applications + title: Restart Strategy for JAR Applications - file: docs/products/flink/howto/list-flink-tables title: Apache Flink tables entries: diff --git a/docs/products/flink/concepts/custom-jars.rst b/docs/products/flink/concepts/custom-jars.rst index 7a44f7100..459069229 100644 --- a/docs/products/flink/concepts/custom-jars.rst +++ b/docs/products/flink/concepts/custom-jars.rst @@ -32,5 +32,6 @@ Custom JARs can be applied in various scenarios, including but not limited to: Related pages -------------- -* :doc:`How to use custom JARs in Aiven for Apache Flink application `. +* :doc:`How to use custom JARs in Aiven for Apache Flink application `. + diff --git a/docs/products/flink/howto/restart-strategy-jar-applications.rst b/docs/products/flink/howto/restart-strategy-jar-applications.rst new file mode 100644 index 000000000..d857e3120 --- /dev/null +++ b/docs/products/flink/howto/restart-strategy-jar-applications.rst @@ -0,0 +1,49 @@ +Restart strategy in JAR applications +====================================== + +A restart strategy is a set of rules that Apache Flink® adheres to when dealing with application failures. 
These strategies enable the automatic restart of a failed job under specific conditions and parameters, which is crucial for high availability and fault tolerance in distributed and scalable systems. + + +Default restart strategy for JAR applications +----------------------------------------------- + +Aiven for Apache Flink® includes a default restart strategy for JAR applications. This strategy uses the **exponential-delay** technique, incrementally increasing the delay time between restarts up to a specified maximum. Once this maximum delay is reached, it remains constant for any subsequent restarts. The default strategy is fully integrated into the Aiven for Apache Flink cluster configuration and automatically applies to all JAR applica + +View the default strategy +```````````````````````````````````````````` + +You can view the default restart strategy configurations for your Aiven for Apache Flink cluster in the Apache Flink Dashboard. Follow these steps to view the current settings: + +1. Access the `Aiven Console `_ and select the Aiven for Apache Flink service. +2. From the **Connection information** section on the overview page, copy the **Service URI** and paste it into your web browser's address bar. +3. When prompted, log in using the **User** and **Password** credentials specified in the **Connection information** section. +4. Once in the **Apache Flink Dashboard**, click the **Job Manager** from the menu. +5. Switch to the **Configuration** tab. +6. Review the configurations and parameters related to the restart strategy. + +Disable default restart strategy +------------------------------------ +While Aiven for Apache Flink® typically recommends using the default restart strategy for JAR applications, there are circumstances, particularly during testing or debugging, where disabling automatic restarts might be necessary. You cannot disable the default restart strategy in Aiven for Apache Flink® through configuration files. 
Instead, directly modify the code of your JAR application to achieve this. + + +.. code-block:: java + + StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment(); + env.setRestartStrategy(RestartStrategies.noRestart()); + +This code sets the restart strategy to 'None', preventing any restart attempts in case of failures. + +Key considerations when disabling default restarts +`````````````````````````````````````````````````````````` + +Before choosing to disable the default restart strategy, consider the following: + +- **Persistent failures**: Disabling restarts means that if a Flink Job fails, it will not attempt to recover, potentially leading to permanent job failure. +- **Testing and debugging**: Disabling is beneficial when identifying issues in the application code, as it prevents the masking of errors through automatic restarts. +- **External factors**: Jobs can fail due to external factors, such as infrastructure changes or maintenance activities. If you disable restarts, your Flink jobs might become vulnerable to failures. +- **Operational risks**: In production environments, it is generally advisable to use the default restart strategy to ensure high availability and fault tolerance. 
+ + +Related pages +-------------- +* `Restart strategies in Apache Flink® `_ \ No newline at end of file From 270213e518fe9df0d9219587e7843c8c06e31e33 Mon Sep 17 00:00:00 2001 From: Harshini Rangaswamy Date: Fri, 19 Jan 2024 14:34:27 +0100 Subject: [PATCH 2/3] addressed feedback --- .../flink/howto/restart-strategy-jar-applications.rst | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/docs/products/flink/howto/restart-strategy-jar-applications.rst b/docs/products/flink/howto/restart-strategy-jar-applications.rst index d857e3120..434be51eb 100644 --- a/docs/products/flink/howto/restart-strategy-jar-applications.rst +++ b/docs/products/flink/howto/restart-strategy-jar-applications.rst @@ -1,13 +1,13 @@ Restart strategy in JAR applications ====================================== -A restart strategy is a set of rules that Apache Flink® adheres to when dealing with application failures. These strategies enable the automatic restart of a failed job under specific conditions and parameters, which is crucial for high availability and fault tolerance in distributed and scalable systems. +A restart strategy is a set of rules that Apache Flink® adheres to when dealing with Flink job failures. These strategies enable the automatic restart of a failed Flink job under specific conditions and parameters, which is crucial for high availability and fault tolerance in distributed and scalable systems. Default restart strategy for JAR applications ----------------------------------------------- -Aiven for Apache Flink® includes a default restart strategy for JAR applications. This strategy uses the **exponential-delay** technique, incrementally increasing the delay time between restarts up to a specified maximum. Once this maximum delay is reached, it remains constant for any subsequent restarts. 
The default strategy is fully integrated into the Aiven for Apache Flink cluster configuration and automatically applies to all JAR applica +Aiven for Apache Flink® includes a default restart strategy for JAR applications. This strategy uses the **exponential-delay** technique, incrementally increasing the delay time between restarts up to a specified maximum. Once this maximum delay is reached, it remains constant for any subsequent restarts. The default strategy is fully integrated into the Aiven for Apache Flink cluster configuration and automatically applies to all JAR applications. View the default strategy ```````````````````````````````````````````` @@ -38,9 +38,9 @@ Key considerations when disabling default restarts Before choosing to disable the default restart strategy, consider the following: -- **Persistent failures**: Disabling restarts means that if a Flink Job fails, it will not attempt to recover, potentially leading to permanent job failure. +- **Persistent failures**: Disabling restarts means that if a Flink Job fails, Flink will not attempt to recover the job, leading to permanent job failure. - **Testing and debugging**: Disabling is beneficial when identifying issues in the application code, as it prevents the masking of errors through automatic restarts. -- **External factors**: Jobs can fail due to external factors, such as infrastructure changes or maintenance activities. If you disable restarts, your Flink jobs might become vulnerable to failures. +- **External factors**: Jobs can fail due to external factors, such as infrastructure changes or maintenance activities. If you disable restarts, your Flink jobs will become vulnerable to failures. - **Operational risks**: In production environments, it is generally advisable to use the default restart strategy to ensure high availability and fault tolerance. 
From 1104b4bc6f88925a77a9b1b059392a50f1abf1db Mon Sep 17 00:00:00 2001 From: Harshini Rangaswamy Date: Fri, 19 Jan 2024 14:37:36 +0100 Subject: [PATCH 3/3] updated sentence --- docs/products/flink/howto/restart-strategy-jar-applications.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/products/flink/howto/restart-strategy-jar-applications.rst b/docs/products/flink/howto/restart-strategy-jar-applications.rst index 434be51eb..b82f58bc7 100644 --- a/docs/products/flink/howto/restart-strategy-jar-applications.rst +++ b/docs/products/flink/howto/restart-strategy-jar-applications.rst @@ -38,7 +38,7 @@ Key considerations when disabling default restarts Before choosing to disable the default restart strategy, consider the following: -- **Persistent failures**: Disabling restarts means that if a Flink Job fails, Flink will not attempt to recover the job, leading to permanent job failure. +- **Persistent failures**: Disabling restarts means that if a Flink Job fails, Flink will not attempt to recover it, leading to permanent job failure. - **Testing and debugging**: Disabling is beneficial when identifying issues in the application code, as it prevents the masking of errors through automatic restarts. - **External factors**: Jobs can fail due to external factors, such as infrastructure changes or maintenance activities. If you disable restarts, your Flink jobs will become vulnerable to failures. - **Operational risks**: In production environments, it is generally advisable to use the default restart strategy to ensure high availability and fault tolerance.
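Reviewer note: the page describes the **exponential-delay** behaviour only in prose (the delay grows between restarts up to a maximum, then stays constant). A minimal sketch of that capped growth in plain Java might help readers picture it. The class name `RestartBackoffSketch` and the values below (1 s initial delay, multiplier 2, 60 s maximum) are hypothetical and chosen purely for illustration; they are not the defaults of an Aiven for Apache Flink cluster, and the sketch does not use any Flink API.

```java
import java.time.Duration;

// Illustration only: models a delay that grows exponentially and is then capped.
final class RestartBackoffSketch {

    // Hypothetical values, NOT the cluster's real configuration.
    static final Duration INITIAL_BACKOFF = Duration.ofSeconds(1);
    static final Duration MAX_BACKOFF = Duration.ofSeconds(60);
    static final double MULTIPLIER = 2.0;

    /** Returns the wait before restart number {@code attempt} (0-based), capped at MAX_BACKOFF. */
    static Duration delayFor(int attempt) {
        // Grow the initial delay by the multiplier once per earlier attempt.
        double grown = INITIAL_BACKOFF.toMillis() * Math.pow(MULTIPLIER, attempt);
        // Once the maximum is reached, the delay stays constant.
        long capped = (long) Math.min(grown, (double) MAX_BACKOFF.toMillis());
        return Duration.ofMillis(capped);
    }

    public static void main(String[] args) {
        for (int attempt = 0; attempt < 8; attempt++) {
            System.out.println("restart " + attempt + ": wait "
                    + delayFor(attempt).getSeconds() + " s");
        }
    }
}
```

With these assumed values the delay doubles from 1 s to 32 s and then holds at 60 s. The sketch deliberately leaves out the other settings that Flink's exponential-delay strategy supports, such as jitter and resetting the backoff after a stable period; the **Configuration** tab of the Job Manager shows the values that actually apply to a given cluster.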