This repository has been archived by the owner on Jan 29, 2024. It is now read-only.

Document default restart strategy for Flink JAR applications #2447

Merged
merged 3 commits into from
Jan 19, 2024
2 changes: 2 additions & 0 deletions _toc.yml
@@ -665,6 +665,8 @@ entries:
title: Create a JAR application
- file: docs/products/flink/howto/manage-flink-applications
title: Manage Apache Flink applications
- file: docs/products/flink/howto/restart-strategy-jar-applications
title: Restart Strategy for JAR Applications
- file: docs/products/flink/howto/list-flink-tables
title: Apache Flink tables
entries:
3 changes: 2 additions & 1 deletion docs/products/flink/concepts/custom-jars.rst
@@ -32,5 +32,6 @@
Related pages
--------------

* :doc:`How to use custom JARs in Aiven for Apache Flink application </docs/products/flink/howto/create-jar-application>`.
* :doc:`How to use custom JARs in Aiven for Apache Flink application </docs/products/flink/howto/create-jar-application>`.



49 changes: 49 additions & 0 deletions docs/products/flink/howto/restart-strategy-jar-applications.rst
@@ -0,0 +1,49 @@
Restart strategy in JAR applications
======================================

A restart strategy is a set of rules that Apache Flink® adheres to when dealing with application failures. These strategies enable the automatic restart of a failed job under specific conditions and parameters, which is crucial for high availability and fault tolerance in distributed and scalable systems.


Default restart strategy for JAR applications
-----------------------------------------------

Aiven for Apache Flink® includes a default restart strategy for JAR applications. This strategy uses the **exponential-delay** technique, which increases the delay between consecutive restarts until it reaches a specified maximum. Once the maximum delay is reached, it stays constant for all subsequent restarts. The default strategy is part of the Aiven for Apache Flink cluster configuration and applies automatically to all JAR applications.
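
To make the exponential-delay behavior concrete, the following is a minimal sketch in plain Java, with no Flink dependency, that computes the delay before each restart attempt. The class name ``BackoffSketch``, its methods, and the values used (1 second initial delay, multiplier of 2, 60 second cap) are illustrative assumptions and are not the values configured in Aiven for Apache Flink. Flink's own exponential-delay strategy also supports jitter and a reset threshold, which this sketch leaves out.

```java
import java.util.ArrayList;
import java.util.List;

final class BackoffSketch {

    // Delay before restart number `attempt` (0-based):
    // initial * multiplier^attempt, capped at maxMillis.
    static long delayMillis(long initialMillis, double multiplier, long maxMillis, int attempt) {
        double delay = initialMillis * Math.pow(multiplier, attempt);
        return (long) Math.min(delay, (double) maxMillis);
    }

    // Delays for the first `attempts` restarts, in order.
    static List<Long> schedule(long initialMillis, double multiplier, long maxMillis, int attempts) {
        List<Long> delays = new ArrayList<>();
        for (int i = 0; i < attempts; i++) {
            delays.add(delayMillis(initialMillis, multiplier, maxMillis, i));
        }
        return delays;
    }

    public static void main(String[] args) {
        // Illustrative values only; check your service's configuration for the real ones.
        System.out.println(schedule(1_000, 2.0, 60_000, 8));
        // prints [1000, 2000, 4000, 8000, 16000, 32000, 60000, 60000]
    }
}
```

With these assumed values, the delay doubles on each attempt until the seventh restart, where it hits the 60 second cap and stays there.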


View the default strategy
````````````````````````````````````````````

You can view the default restart strategy configurations for your Aiven for Apache Flink cluster in the Apache Flink Dashboard. Follow these steps to view the current settings:

1. Access the `Aiven Console <https://console.aiven.io/>`_ and select the Aiven for Apache Flink service.
2. From the **Connection information** section on the overview page, copy the **Service URI** and paste it into your web browser's address bar.
3. When prompted, log in using the **User** and **Password** credentials specified in the **Connection information** section.
4. In the **Apache Flink Dashboard**, select **Job Manager** from the menu.
5. Switch to the **Configuration** tab.
6. Review the configurations and parameters related to the restart strategy.
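
As a possible alternative to the dashboard steps above, the same settings can be read from the Flink REST API, which exposes the cluster configuration at ``/jobmanager/config`` as a JSON array of key/value pairs. The sketch below assumes that the REST API is reachable at the service URI with the same user and password (HTTP basic authentication). The class name ``RestartConfigSketch``, the command-line arguments, and the naive regular-expression parsing are illustrative choices, not part of any Aiven tooling.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class RestartConfigSketch {

    // Matches {"key":"...","value":"..."} entries. Naive on purpose:
    // it does not handle escaped quotes inside keys or values.
    private static final Pattern ENTRY = Pattern.compile(
        "\\{\\s*\"key\"\\s*:\\s*\"([^\"]*)\"\\s*,\\s*\"value\"\\s*:\\s*\"([^\"]*)\"\\s*\\}");

    // Keeps only the entries whose key starts with "restart-strategy".
    static Map<String, String> restartSettings(String json) {
        Map<String, String> result = new LinkedHashMap<>();
        Matcher m = ENTRY.matcher(json);
        while (m.find()) {
            if (m.group(1).startsWith("restart-strategy")) {
                result.put(m.group(1), m.group(2));
            }
        }
        return result;
    }

    // Builds an HTTP basic authentication header value.
    static String basicAuth(String user, String password) {
        String raw = user + ":" + password;
        return "Basic " + Base64.getEncoder().encodeToString(raw.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 3) {
            // Hypothetical invocation; take the values from Connection information.
            System.out.println("usage: RestartConfigSketch https://HOST:PORT USER PASSWORD");
            return;
        }
        HttpRequest request = HttpRequest.newBuilder(URI.create(args[0] + "/jobmanager/config"))
            .header("Authorization", basicAuth(args[1], args[2]))
            .GET()
            .build();
        HttpResponse<String> response =
            HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString());
        restartSettings(response.body()).forEach((k, v) -> System.out.println(k + " = " + v));
    }
}
```

The exact keys returned depend on the Flink version and on the service configuration, so treat the output as a view of the current settings rather than a fixed list.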

Disable default restart strategy
------------------------------------
Although the default restart strategy is recommended for JAR applications, there are circumstances, particularly during testing or debugging, where disabling automatic restarts might be necessary. You cannot disable the default restart strategy in Aiven for Apache Flink® through configuration files. Instead, set the restart strategy directly in the code of your JAR application:


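.. code-block:: java

   import org.apache.flink.api.common.restartstrategy.RestartStrategies;
   import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

   StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
   env.setRestartStrategy(RestartStrategies.noRestart());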

This code sets the restart strategy to no restart, so Flink does not attempt to restart the job after a failure.

Key considerations when disabling default restarts
``````````````````````````````````````````````````````````

Before choosing to disable the default restart strategy, consider the following:

- **Persistent failures**: If restarts are disabled, a failed Flink job does not attempt to recover, which can lead to permanent job failure.
- **Testing and debugging**: Disabling restarts helps when identifying issues in the application code, because automatic restarts no longer mask errors.
- **External factors**: Jobs can fail due to external factors, such as infrastructure changes or maintenance activities. With restarts disabled, your Flink jobs do not recover from these failures automatically.
- **Operational risks**: In production environments, it is generally advisable to use the default restart strategy to ensure high availability and fault tolerance.


Related pages
--------------
* `Restart strategies in Apache Flink® <https://nightlies.apache.org/flink/flink-docs-release-1.18/docs/ops/state/task_failure_recovery/#restart-strategies>`_