Add common Retry strategy including jitter (#43)
* Extend common retry strategy + doc

* Improve retry docs
svroonland authored Sep 30, 2020
1 parent 7cf5e1d commit d5474bd
Showing 5 changed files with 142 additions and 62 deletions.
2 changes: 1 addition & 1 deletion docs/docs/docs/circuitbreaker.md
@@ -18,7 +18,7 @@ Make calls to an external system through the CircuitBreaker to safeguard that sy

## Usage example

```scala mdoc
```scala mdoc:silent
import nl.vroste.rezilience.CircuitBreaker._
import nl.vroste.rezilience._
import zio._
4 changes: 2 additions & 2 deletions docs/docs/docs/general_usage.md
@@ -73,7 +73,7 @@ val result3: ZIO[Any, Throwable, Int] =
result1.mapError(policyError => policyError.toException)
```

Similar methods exist on `BulkheadError` and `PolicyError` (see [Bulkhead](./bulkhead) and [Combining Policies](./combining))
Similar methods exist on `BulkheadError` and `PolicyError` (see [Bulkhead](../bulkhead) and [Combining Policies](../combining_policies))

## ZLayer integration
You can apply `rezilience` policies at the level of an individual ZIO effect. But having to wrap all your calls in, say, a rate limiter can clutter your code somewhat. When you are using the [ZIO module pattern](https://zio.dev/docs/howto/howto_use_layers) with `ZLayer`, it is also possible to integrate a `rezilience` policy with a service at the `ZLayer` level, as sketched below. In the spirit of aspect-oriented programming, the code using your service is not cluttered with the rate-limiting aspect.
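
A minimal sketch of wiring a rate limiter in front of a service at the layer level, using ZIO 1.x's `ZLayer.fromServiceManaged`; the `Database` service and its `query` method are illustrative stand-ins for the `Database` used in the full example:

```scala
import nl.vroste.rezilience._
import zio._
import zio.clock.Clock
import zio.duration._

// Illustrative service definition
trait Database {
  def query(sql: String): Task[Int]
}

// Wrap an existing Database so that every call goes through a RateLimiter
val addRateLimiterToDatabase: ZLayer[Clock with Has[Database], Nothing, Has[Database]] =
  ZLayer.fromServiceManaged { db: Database =>
    // Rate limit to at most 10 calls per second
    RateLimiter.make(10, 1.second).map { rateLimiter =>
      new Database {
        def query(sql: String): Task[Int] = rateLimiter(db.query(sql))
      }
    }
  }
```
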
@@ -96,4 +96,4 @@ val env: ZLayer[Clock, Nothing, Database] = (Clock.live ++ databaseLayer) >>> ad

For policies where the result type has a different `E`, you will need to map the error back to your own `E`. One option is to have something like a general `case class UnknownServiceError(e: Exception)` in your service error type, to which you can map the policy errors, as sketched below. If that is not possible for some reason, you can also define a new service type like `ResilientDatabase` where the error types are `PolicyError[E]`.
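
A minimal sketch of the first option, assuming an effect whose error channel is already a `PolicyError` and using the `toException` conversion mentioned above; the names `DatabaseError`, `UnknownServiceError` and `mapPolicyError` are illustrative:

```scala
import nl.vroste.rezilience._
import zio._

// Illustrative service error type with a catch-all case for errors introduced by the policy
sealed trait DatabaseError
case class UnknownServiceError(e: Exception) extends DatabaseError

// Map the policy's error channel back into the service's own error type
def mapPolicyError[R, A](protectedCall: ZIO[R, PolicyError[Throwable], A]): ZIO[R, DatabaseError, A] =
  protectedCall.mapError(policyError => UnknownServiceError(policyError.toException))
```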

See the [full example](rezilience/shared/src/test/scala/nl/vroste/rezilience/examples/ZLayerIntegrationExample.scala) for more.
See the [full example](https://github.com/svroonland/rezilience/blob/master/rezilience/shared/src/test/scala/nl/vroste/rezilience/examples/ZLayerIntegrationExample.scala) for more.
76 changes: 59 additions & 17 deletions docs/docs/docs/retry.md
Original file line number Diff line number Diff line change
@@ -5,30 +5,72 @@ permalink: docs/retry/
---

# Retry
ZIO already has excellent built-in support for retrying effects on failures using a `Schedule`; there is not much this library can add.

Two helper methods are made available:
`Retry` is a policy that retries effects on failure.

* `Retry.exponentialBackoff`
Exponential backoff with a maximum delay and an optional maximum number of recurs. When the maximum delay is reached, subsequent delays are the maximum.

* `Retry.whenCase`
Accepts a partial function and a schedule and will apply the schedule only when the input matches the partial function. This is useful to retry only on certain types of failures/exceptions.

For consistency with the other policies and to support combining policies, there is `Retry.make(schedule)`.

## Common retry strategy

`Retry` implements a common-practice strategy for retrying:

* The first retry is performed immediately. With transient failures, this gives the highest chance of fast success.
* After that, Retry uses an exponential backoff capped at a maximum duration.
* Some random jitter is added to prevent spikes of retries from many call sites applying the same retry strategy.
* An optional maximum number of retries ensures that retrying does not continue forever (a sketch of the corresponding `Schedule` follows this list).
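
A rough sketch of that `Schedule` with the default parameters spelled out (1 second minimum delay, 1 minute cap, 3 retries); the exact composition is `Retry.Schedules.common`, shown further below:

```scala
import nl.vroste.rezilience._
import zio.Schedule
import zio.duration._

// Sketch: an immediate first retry, then jittered exponential backoff capped at the maximum,
// intersected with a bound on the total number of retries
val commonSketch =
  (Schedule.once andThen Retry.Schedules.exponentialBackoff[Any](min = 1.second, max = 1.minute).jittered) &&
    Schedule.recurs(3)
```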

## Usage example

```scala mdoc:silent
import zio._
import zio.duration._
import zio.clock.Clock
import zio.random.Random
import nl.vroste.rezilience._

val myEffect: ZIO[Any, Exception, Unit] = ZIO.unit

val retry: ZManaged[Clock with Random, Nothing, Retry[Any]] = Retry.make(min = 1.second, max = 10.seconds)

retry.use { retryPolicy =>
retryPolicy(myEffect)
}
```
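
The remaining parameters of `Retry.make` default to `factor = 2.0`, `retryImmediately = true` and `maxRetries = Some(3)`; all of them can be overridden. A minimal sketch (the effect name `flakyCall` is illustrative):

```scala
import zio._
import zio.clock.Clock
import zio.duration._
import zio.random.Random
import nl.vroste.rezilience._

// Illustrative effect standing in for a real remote call
val flakyCall: ZIO[Any, Exception, Unit] = ZIO.unit

// Wait before the first retry, back off up to a 30 second cap and give up after 5 retries
val customRetry: ZManaged[Clock with Random, Nothing, Retry[Any]] =
  Retry.make(min = 200.millis, max = 30.seconds, retryImmediately = false, maxRetries = Some(5))

customRetry.use { retryPolicy =>
  retryPolicy(flakyCall)
}
```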

## Custom retry strategy
ZIO already has excellent built-in support for retrying effects on failures using a `Schedule`, and `rezilience` is built on top of that. `Retry` can accept any ZIO [`Schedule`](https://zio.dev/docs/datatypes/datatypes_schedule).

Some Schedule building blocks are available in `Retry.Schedules`:

* `Retry.Schedules.common(min: Duration, max: Duration, factor: Double, retryImmediately: Boolean, maxRetries: Option[Int])`
The strategy with immediate retry, exponential backoff and jitter as outlined above.

* `Retry.Schedules.exponentialBackoff(min: Duration, max: Duration, factor: Double = 2.0)`
Exponential backoff up to a maximum delay. When the maximum delay is reached, subsequent delays are equal to the maximum (see the sketch after this list).

* `Retry.Schedules.whenCase[Env, In, Out](pf: PartialFunction[In, Any])(schedule: Schedule[Env, In, Out])`
Accepts a partial function and a schedule and applies the schedule only when the input matches the partial function. This is useful to retry only on certain types of failures/exceptions.
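
For example, a plain exponential backoff without jitter can be passed to `Retry.make` directly. A minimal sketch (the explicit `[Any]` type argument and the 30-second cap are arbitrary choices):

```scala
import zio._
import zio.clock.Clock
import zio.duration._
import nl.vroste.rezilience._

// A Retry that backs off exponentially from 1 second up to a 30 second cap, without jitter
val backoffOnly: ZManaged[Clock, Nothing, Retry[Any]] =
  Retry.make(Retry.Schedules.exponentialBackoff[Any](min = 1.second, max = 30.seconds))
```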

## Different retry strategies for different errors

By composing ZIO `Schedule`s, you can define different retries for different types of errors:

```scala mdoc:silent
import java.util.concurrent.TimeoutException
import java.net.UnknownHostException

val isTimeout: PartialFunction[Exception, Any] = {
case _ : TimeoutException =>
}

val isUnknownHostException: PartialFunction[Exception, Any] = {
case _ : UnknownHostException =>
}

val retry2 = Retry.make(
Retry.Schedules.whenCase(isTimeout) { Retry.Schedules.common(min = 1.second, max = 1.minute) } ||
Retry.Schedules.whenCase(isUnknownHostException) { Retry.Schedules.common(min = 1.day, max = 5.days) }
)

retry2.use { retryPolicy =>
retryPolicy(myEffect)
}
```
120 changes: 79 additions & 41 deletions rezilience/shared/src/main/scala/nl/vroste/rezilience/Retry.scala
@@ -1,6 +1,7 @@
package nl.vroste.rezilience
import zio.clock.Clock
import zio.duration._
import zio.random.Random
import zio.{ Schedule, ZIO, ZManaged }

trait Retry[-E] { self =>
@@ -24,56 +24,36 @@ trait Retry[-E] { self =>
}

object Retry {
object Schedule {

/**
* Schedule for exponential backoff up to a maximum interval
*
* @param min Minimum backoff time
* @param max Maximum backoff time. When this value is reached, subsequent intervals will be equal to this value.
* @param factor Exponential factor. 2 means doubling, 1 is constant, < 1 means decreasing
* @tparam E Schedule input
*/
def exponentialBackoff[E](
min: Duration,
max: Duration,
factor: Double = 2.0
): Schedule[Any, E, Duration] =
zio.Schedule.exponential(min, factor).whileOutput(_ <= max) andThen zio.Schedule.fixed(max).as(max)

/**
* Apply the given schedule only when inputs match the partial function
*/
def whenCase[Env, In, Out](pf: PartialFunction[In, Any])(
schedule: Schedule[Env, In, Out]
): Schedule[Env, In, (In, Out)] =
zio.Schedule.recurWhile(pf.isDefinedAt) && schedule
}

/**
* Create a Retry from a ZIO Schedule
* @param schedule
* @tparam R
* @tparam E
* @return
*/
def make[R, E](schedule: Schedule[R, E, Any]): ZManaged[Clock with R, Nothing, Retry[E]] =
ZManaged.environment[Clock with R].map(RetryImpl(_, schedule))

/**
* Create a Retry policy with exponential backoff
* Create a Retry policy with a common retry schedule
*
* By default the first retry is done immediately. With transient / random failures this method gives the
* highest chance of fast success.
* After that Retry uses exponential backoff between some minimum and maximum duration. Jitter is added
* to prevent spikes of retries.
* An optional maximum number of retries ensures that retrying does not continue forever.
*
* @param min Minimum retry backoff delay
* @param max Maximum retry backoff delay
* @param max Maximum backoff time. When this value is reached, subsequent intervals will be equal to this value.
* @param factor Factor with which delays increase
* @return
* @param retryImmediately Retry immediately after the first failure
* @param maxRetries Maximum number of retries
*/
def make(
min: Duration = 1.second,
max: Duration = 1.minute,
factor: Double = 2.0
): ZManaged[Clock, Nothing, Retry[Any]] =
ZManaged.environment[Clock].map(RetryImpl(_, Schedule.exponentialBackoff(min, max, factor)))
factor: Double = 2.0,
retryImmediately: Boolean = true,
maxRetries: Option[Int] = Some(3)
): ZManaged[Clock with Random, Nothing, Retry[Any]] =
make(Schedules.common(min, max, factor, retryImmediately, maxRetries))

/**
* Create a Retry from a ZIO Schedule
*/
def make[R, E](schedule: Schedule[R, E, Any]): ZManaged[Clock with R, Nothing, Retry[E]] =
ZManaged.environment[Clock with R].map(RetryImpl(_, schedule))

private case class RetryImpl[-E, ScheduleEnv](
scheduleEnv: Clock with ScheduleEnv,
Expand All @@ -89,4 +70,61 @@ object Retry {
}
)
}

/**
* Convenience methods to create common ZIO schedules for retrying
*/
object Schedules {

/**
* A common-practice schedule for retrying
*
* By default the first retry is done immediately. With transient / random failures this method gives the
* highest chance of fast success.
* After that Retry uses exponential backoff between some minimum and maximum duration. Jitter is added
* to prevent spikes of retries.
* An optional maximum number of retries ensures that retrying does not continue forever.
*
* See also https://aws.amazon.com/blogs/architecture/exponential-backoff-and-jitter/
*
* @param min Minimum retry backoff delay
* @param max Maximum backoff time. When this value is reached, subsequent intervals will be equal to this value.
* @param factor Factor with which delays increase
* @param retryImmediately Retry immediately after the first failure
* @param maxRetries Maximum number of retries
*/
def common(
min: Duration = 1.second,
max: Duration = 1.minute,
factor: Double = 2.0,
retryImmediately: Boolean = true,
maxRetries: Option[Int] = Some(3)
): Schedule[Any with Random, Any, (Any, Long)] =
((if (retryImmediately) zio.Schedule.once else zio.Schedule.stop) andThen
exponentialBackoff(min, max, factor).jittered) &&
maxRetries.fold(zio.Schedule.forever)(zio.Schedule.recurs)

/**
* Schedule for exponential backoff up to a maximum interval
*
* @param min Minimum backoff time
* @param max Maximum backoff time. When this value is reached, subsequent intervals will be equal to this value.
* @param factor Exponential factor. 2 means doubling, 1 is constant, < 1 means decreasing
* @tparam E Schedule input
*/
def exponentialBackoff[E](
min: Duration,
max: Duration,
factor: Double = 2.0
): Schedule[Any, E, Duration] =
zio.Schedule.exponential(min, factor).whileOutput(_ <= max) andThen zio.Schedule.fixed(max).as(max)

/**
* Apply the given schedule only when inputs match the partial function
*/
def whenCase[Env, In, Out](pf: PartialFunction[In, Any])(
schedule: Schedule[Env, In, Out]
): Schedule[Env, In, (In, Out)] =
zio.Schedule.recurWhile(pf.isDefinedAt) && schedule
}
}
2 changes: 1 addition & 1 deletion rezilience/shared/src/test/scala/nl/vroste/rezilience/RetrySpec.scala
@@ -11,7 +11,7 @@ object RetrySpec extends DefaultRunnableSpec {
override def spec = suite("Retry")(
testM("widen should not retry unmatched errors") {
Retry
.make(Retry.Schedule.exponentialBackoff(1.second, 2.seconds))
.make(Retry.Schedules.exponentialBackoff(1.second, 2.seconds))
.map(_.widen(Policy.unwrap[Throwable]))
.use { retry =>
for {
