This article will show you how to implement the retry pattern in an object-oriented way with separation of concerns and testability in mind.
Background
Recently, while fixing something in our code, I stumbled upon an implementation of retries that worked but was not very nice. It was intertwined with the actual logic of the class and made the unit tests unnecessarily complicated. We already have a nice implementation in our common library, so I reused it. And when I realized there was no article here that would satisfy my needs, I decided to share the design with you.
Retry Pattern
The environment in which our applications operate is inherently unreliable, especially when you are communicating with resources outside your application, such as services or storage. Failures can be either permanent (the resource was not found or you are not authorized to use it) or transient (a timeout, or your calls are being throttled). If you can decide which failures are transient, you can automatically retry those operations instead of returning a failure to the caller.
A successful retry strategy must meet the following requirements:
- Determine if a failure is permanent or transient.
- Automatically retry the operation in case of a transient failure. The number of retries must be limited so that you do not end up in an infinite loop if the transient failure turns out to be not so transient after all.
- Insert some kind of delay between the retries to give the resource some space to recover.
If you want to read more about retry logic, please refer to the retry guidance on Microsoft Docs.
Implementation
Let's start with the retry strategy itself. I'm choosing the async version here, as I think it is commonplace nowadays. It can easily be rewritten to not use tasks. Moreover, the action that is going to be retried returns data. Again, you could have a separate overload that does not return data, but my point here is not to provide a complete implementation, just to show the object-oriented design.
public interface IRetryStrategy
{
    Task<T> Execute<T>(Func<Task<T>> action);
}
We will base our error handling on exceptions.
public interface IExceptionHandler
{
    bool IsTransient(Exception ex);
}
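To give an idea of what such a handler might look like, here is a minimal sketch. It is not the handler from the attached code: the class name `SampleHttpExceptionHandler`, the set of status codes treated as transient, and the decision to treat an `HttpRequestException` without a status code as transient are all assumptions made for illustration. It relies on `HttpRequestException.StatusCode`, which is available in .NET 5 and later.

```csharp
using System;
using System.Net;
using System.Net.Http;

// Repeated here so the sketch compiles on its own.
public interface IExceptionHandler
{
    bool IsTransient(Exception ex);
}

// Hypothetical handler; the classification rules below are assumptions.
public class SampleHttpExceptionHandler : IExceptionHandler
{
    public bool IsTransient(Exception ex)
    {
        if (ex is TimeoutException)
        {
            return true;
        }

        if (ex is HttpRequestException httpEx)
        {
            // Assumption: no status code means no response was received
            // (for example, a network failure), which is worth retrying.
            if (httpEx.StatusCode == null)
            {
                return true;
            }

            switch (httpEx.StatusCode.Value)
            {
                case HttpStatusCode.RequestTimeout:      // 408
                case HttpStatusCode.TooManyRequests:     // 429, throttling
                case HttpStatusCode.BadGateway:          // 502
                case HttpStatusCode.ServiceUnavailable:  // 503
                case HttpStatusCode.GatewayTimeout:      // 504
                    return true;
                default:
                    // 404, 401, 403 and the like are treated as permanent.
                    return false;
            }
        }

        // Anything unknown is treated as permanent so that bugs are not retried.
        return false;
    }
}
```

Treating unknown exceptions as permanent is the conservative choice: a retry only helps when there is a reason to believe the next attempt can succeed.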
And we have one more dependency for getting the delay between retries.
public interface IDelayProvider
{
    TimeSpan GetNextValue();
}
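One possible delay provider is an exponential backoff, where every delay is a multiple of the previous one up to a fixed ceiling. The sketch below is not the provider from the attached code: the class name `SampleBackoffDelayProvider` and the meaning of its constructor parameters (initial delay, maximum delay, growth factor) are assumptions.

```csharp
using System;

// Repeated here so the sketch compiles on its own.
public interface IDelayProvider
{
    TimeSpan GetNextValue();
}

// Hypothetical exponential backoff provider (parameter meanings are assumed).
public class SampleBackoffDelayProvider : IDelayProvider
{
    private readonly TimeSpan max;
    private readonly double factor;
    private TimeSpan next;

    public SampleBackoffDelayProvider(TimeSpan initial, TimeSpan max, double factor)
    {
        // Never start above the ceiling.
        this.next = initial < max ? initial : max;
        this.max = max;
        this.factor = factor;
    }

    public TimeSpan GetNextValue()
    {
        var current = this.next;

        // Grow the delay for the following call, capped at the maximum.
        var grownTicks = Math.Min(current.Ticks * this.factor, this.max.Ticks);
        this.next = TimeSpan.FromTicks((long)grownTicks);

        return current;
    }
}
```

With an initial delay of 0.6 seconds, a maximum of 6 seconds, and a factor of 2.0, successive calls return 0.6, 1.2, 2.4, 4.8, and then 6 seconds. Note that a provider written this way is stateful, so each retried operation needs its own instance.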
Here is the simplified implementation of the retry strategy. Please see the attached code for the full version. There, you will also find implementations of the exception handler and delay provider.
public async Task<T> Execute<T>(Func<Task<T>> action)
{
    var retryCounter = 0;
    while (true)
    {
        try
        {
            return await action();
        }
        catch (Exception ex)
        {
            // Give up when the retry budget is used up or the failure is permanent.
            if (retryCounter >= this.maxRetry || !this.exceptionHandler.IsTransient(ex))
            {
                throw;
            }
        }

        // Transient failure: wait before the next attempt.
        retryCounter++;
        var delay = this.delayProvider.GetNextValue();
        await Task.Delay(delay);
    }
}
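One consequence of this loop is that `maxRetry` counts retries, not attempts, so an operation that keeps failing is invoked `maxRetry + 1` times in total. The sketch below makes that visible with hand-written test doubles. The class `SketchRetryStrategy` is a hypothetical stand-in that repeats the loop above with a three-argument constructor; `AlwaysTransient`, `NoDelay`, and `RetryDemo` are also names invented for this example.

```csharp
using System;
using System.Threading.Tasks;

public interface IExceptionHandler { bool IsTransient(Exception ex); }
public interface IDelayProvider { TimeSpan GetNextValue(); }

// Hypothetical stand-in for the retry strategy class; same loop as above.
public class SketchRetryStrategy
{
    private readonly int maxRetry;
    private readonly IExceptionHandler exceptionHandler;
    private readonly IDelayProvider delayProvider;

    public SketchRetryStrategy(int maxRetry, IExceptionHandler exceptionHandler, IDelayProvider delayProvider)
    {
        this.maxRetry = maxRetry;
        this.exceptionHandler = exceptionHandler;
        this.delayProvider = delayProvider;
    }

    public async Task<T> Execute<T>(Func<Task<T>> action)
    {
        var retryCounter = 0;
        while (true)
        {
            try
            {
                return await action();
            }
            catch (Exception ex)
            {
                if (retryCounter >= this.maxRetry || !this.exceptionHandler.IsTransient(ex))
                {
                    throw;
                }
            }

            retryCounter++;
            await Task.Delay(this.delayProvider.GetNextValue());
        }
    }
}

// Test doubles: every failure is transient and there is no waiting.
public class AlwaysTransient : IExceptionHandler
{
    public bool IsTransient(Exception ex) => true;
}

public class NoDelay : IDelayProvider
{
    public TimeSpan GetNextValue() => TimeSpan.Zero;
}

public static class RetryDemo
{
    // Returns how many times the action ran before the strategy gave up.
    public static async Task<int> CountAttempts(int maxRetry)
    {
        var attempts = 0;
        var strategy = new SketchRetryStrategy(maxRetry, new AlwaysTransient(), new NoDelay());
        try
        {
            await strategy.Execute(() =>
            {
                attempts++;
                return Task.FromException<int>(new TimeoutException("simulated transient failure"));
            });
        }
        catch (TimeoutException)
        {
            // Expected: the last failure is rethrown to the caller.
        }

        return attempts;
    }
}
```

Calling `RetryDemo.CountAttempts(2)` yields 3: one initial attempt plus two retries.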
Using the Code
The retry strategy has several dependencies, so we use a factory to create it. You usually want it to be configurable (number of retries, delays, etc.), which is one more argument for using a factory. Here is one example of how it may look, but you need to tailor it to your needs.
public class RetryStrategyFactory : IRetryStrategyFactory
{
    public IRetryStrategy Create()
    {
        var exceptionHandler = new HttpRequestExceptionHandler();
        var delayProvider = new ExponentialBackoffDelayProvider(
            TimeSpan.FromSeconds(0.6),
            TimeSpan.FromSeconds(6),
            2.0);
        return new RetryStrategy(5, exceptionHandler, delayProvider, new WaitWrapper());
    }
}
Then you use the factory to create the retry strategy every time you need to run an unreliable operation. Please note that you may want to have different strategies for different operations.
public async Task Process()
{
    var retryStrategy = this.retryStrategyFactory.Create();
    var data = await retryStrategy.Execute(() =>
    {
        // Call the unreliable operation here and return its Task<T>.
    });
}
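Because the consumer only depends on the factory and strategy interfaces, its own logic can be unit tested without any retries or delays taking place. The following is a minimal sketch of that idea, not code from the attached project: `GreetingLoader`, `PassThroughRetryStrategy`, and `PassThroughRetryStrategyFactory` are hypothetical names, and `IRetryStrategyFactory` is assumed to expose a single `Create()` method, as the factory above suggests.

```csharp
using System;
using System.Threading.Tasks;

public interface IRetryStrategy
{
    Task<T> Execute<T>(Func<Task<T>> action);
}

// Assumed shape of the factory interface.
public interface IRetryStrategyFactory
{
    IRetryStrategy Create();
}

// Hypothetical consumer; fetchGreeting stands in for an unreliable remote call.
public class GreetingLoader
{
    private readonly IRetryStrategyFactory retryStrategyFactory;
    private readonly Func<Task<string>> fetchGreeting;

    public GreetingLoader(IRetryStrategyFactory retryStrategyFactory, Func<Task<string>> fetchGreeting)
    {
        this.retryStrategyFactory = retryStrategyFactory;
        this.fetchGreeting = fetchGreeting;
    }

    public async Task<string> Process()
    {
        var retryStrategy = this.retryStrategyFactory.Create();
        var data = await retryStrategy.Execute(() => this.fetchGreeting());

        // The class's own logic, kept free of any retry concerns.
        return data.ToUpperInvariant();
    }
}

// Test doubles: run the action exactly once, with no retries and no delays.
public class PassThroughRetryStrategy : IRetryStrategy
{
    public Task<T> Execute<T>(Func<Task<T>> action) => action();
}

public class PassThroughRetryStrategyFactory : IRetryStrategyFactory
{
    public IRetryStrategy Create() => new PassThroughRetryStrategy();
}
```

In a test, `new GreetingLoader(new PassThroughRetryStrategyFactory(), () => Task.FromResult("hello"))` exercises only the loader's own logic, while the retry behavior is covered separately by the tests of the strategy itself.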
Points of Interest
This implementation is almost identical to what we use in our production code. You can argue that there is room for improvement in terms of the object-oriented design and I would agree. For example, using a delegate is very convenient and extremely easy compared to an interface, but it's a bit tricky to mock as you can see in the tests, which indicates the design is not ideal. I was thinking about pushing the design further, but maybe later if there is enough interest.
History
- 17th January, 2021 - Initial release