Over the last week I’ve been slowly writing a little utility library to make it easier to express retry logic within application code. As software becomes more complex and more interconnected, gracefully recovering from transient conditions becomes more important. Your average cloud or enterprise application has to deal with network timeouts, database deadlocks and temporary glitches caused by underlying systems recovering from hardware failures.
Palmer is a library that I’ve created to help solve this problem. You can read more about it over at GitHub. If you just want to download it and have a play, you can install the package from NuGet.
Install-Package Palmer
Using the API is easy with its fluent syntax:
Retry.On<SqlException>().For(5).With(context =>
{
// Some code that might throw a transient SQL exception.
});
This is a 0.1 release so I am looking for feedback and ideas on how to improve the library. I’ve already got a couple of ideas such as:
- Extend the On/AndOn methods to allow passing in a typed exception to make accessing specific exception fields easier.
- Add event hooks to support logging exceptions as they occur.
- Create an extension library for common usage scenarios such as SQL, network issues.
- Support capturing multiple exceptions with one termination condition.
I’ve also added some issues around improving the documentation and turning the library back into a portable library after getting my build issues sorted out.
Great library idea. I had to do something similar myself, when using Linq-to-SQL over SQL Azure, which doesn’t guarantee reliability.
I always find these kinds of APIs confusing because it is not clear whether the repeat number is inclusive or exclusive of the original attempt. It would be good if the API could make this clear.
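To make the inclusive versus exclusive question concrete, here is a minimal hand-written sketch of the two readings. It is not Palmer's implementation, and it says nothing about how `For(5)` actually counts; `AttemptCounting`, `RunWithRetries` and `RunWithAttempts` are hypothetical names invented for illustration. The only difference between the two methods is whether the number includes the original attempt.

```csharp
using System;

// Hypothetical helper for illustration only; not part of Palmer.
public static class AttemptCounting
{
    // "Exclusive" reading: run once, then retry up to maxRetries more times
    // whenever TException is thrown. Returns how many times action was invoked
    // when it finally succeeds; rethrows once the retry budget is used up.
    public static int RunWithRetries<TException>(int maxRetries, Action action)
        where TException : Exception
    {
        if (maxRetries < 0) throw new ArgumentOutOfRangeException(nameof(maxRetries));

        int attempts = 0;
        while (true)
        {
            attempts++;
            try
            {
                action();
                return attempts;
            }
            catch (TException) when (attempts <= maxRetries)
            {
                // Retry budget remains, so swallow the exception and loop again.
            }
        }
    }

    // "Inclusive" reading: the number counts every attempt, including the first.
    public static int RunWithAttempts<TException>(int maxAttempts, Action action)
        where TException : Exception
    {
        if (maxAttempts < 1) throw new ArgumentOutOfRangeException(nameof(maxAttempts));
        return RunWithRetries<TException>(maxAttempts - 1, action);
    }
}
```

With an action that always fails, `RunWithRetries<SqlException>(5, ...)` invokes it six times while `RunWithAttempts<SqlException>(5, ...)` invokes it five times. Putting the word "retries" or "attempts" in the method name is one possible way for an API to remove the ambiguity.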
Thanks for the feedback Rory. I’ve created an issue for this on GitHub:
https://github.com/MitchDenny/Palmer/issues/8
Can I ask another favour? Could you take a stab (in the comments of that issue) at writing what would make it easier to figure out how many times it is going to execute? I must admit I struggle with this one myself.
Interesting idea. A solid retry framework would certainly be very useful for commercial work. When handling failed retries at work I constantly need to deal with the following:
1. Minimum waiting time between retries.
2. Every retry action may take a different set of arguments / data. E.g. 100 web service calls are made, 50 need a retry after the 1st run, 20 need a retry after the 2nd run, etc.
3. It would be nice if retries could be specified via AOP as attributes on service methods.
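Points 1 and 2 above can be sketched by hand as a loop that waits a minimum time between rounds and re-submits only the items that failed in the previous round. This is an illustrative sketch, not a Palmer feature; `BatchRetry`, `ProcessWithRetries` and the `tryProcess` callback (assumed to return `false` on a transient failure) are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Hypothetical helper for illustration only; not part of Palmer.
public static class BatchRetry
{
    // Runs tryProcess over every item. Items for which it returns false are
    // retried in the next round, after sleeping for at least minDelay.
    // Returns the items that were still failing after maxRounds rounds.
    public static List<T> ProcessWithRetries<T>(
        IEnumerable<T> items,
        Func<T, bool> tryProcess,
        int maxRounds,
        TimeSpan minDelay)
    {
        var pending = new List<T>(items);

        for (int round = 1; round <= maxRounds && pending.Count > 0; round++)
        {
            // Enforce the minimum wait before every round except the first.
            if (round > 1) Thread.Sleep(minDelay);

            var failed = new List<T>();
            foreach (var item in pending)
            {
                if (!tryProcess(item)) failed.Add(item);
            }

            // Only the failures carry over, so each round works on a smaller set.
            pending = failed;
        }

        return pending;
    }
}
```

Here `maxRounds` counts every pass, including the first one, so `maxRounds: 3` means one initial run plus at most two retry rounds.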
Hi Mitch,
If you need another good real-world example of a use for this library, check out retrying WCF operations. There are a few good SO posts on this, e.g. http://stackoverflow.com/questions/6130331/how-to-handle-wcf-exceptions-consolidated-list-with-code
One obvious point that is missing is the pause (which I see is on the GitHub issues list).