Retries
When communication with a service fails, or a message from a queue cannot be processed, the consumer service re-attempts the request. Retries are typically paired with a finite retry limit, exponential backoff, and jitter.
Avoiding the Thundering Herd Problem
If a retried message keeps failing, you can trigger a retry “storm” that overloads the queue and its processors with messages that cannot be processed. Mitigate this with a retry backoff: each retry waits longer than the previous one (the delay compounds via exponential backoff), plus jitter. Jitter adds a degree of randomness to the backoff delay so that consumers do not all retry at the same moment and cause processing spikes.
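As a rough illustration of a finite retry limit combined with exponential backoff and jitter, here is a minimal Python sketch. The names `backoff_delay` and `retry`, the default values, and the choice of "full jitter" (a uniformly random delay between zero and the exponential ceiling) are assumptions made for this example, not something these notes prescribe.

```python
import random
import time


def backoff_delay(attempt, base=0.5, cap=30.0):
    """Return a jittered delay in seconds for a zero-based retry attempt.

    The ceiling doubles with each attempt (exponential backoff) up to `cap`;
    the actual delay is picked uniformly below that ceiling (jitter).
    """
    ceiling = min(cap, base * (2 ** attempt))
    return random.uniform(0, ceiling)


def retry(operation, max_attempts=5, base=0.5, cap=30.0,
          retry_on=(Exception,), sleep=time.sleep):
    """Call `operation` until it succeeds or `max_attempts` is reached.

    `retry_on` should be narrowed to transient errors in real code;
    the broad default here is only for illustration.
    """
    for attempt in range(max_attempts):
        try:
            return operation()
        except retry_on:
            if attempt == max_attempts - 1:
                raise  # retries exhausted: surface the last error
            sleep(backoff_delay(attempt, base, cap))
```

A caller would wrap the flaky call, for example `retry(lambda: client.send(message))`, where `client.send` stands in for whatever hypothetical operation may fail transiently. Passing `sleep` as a parameter keeps the sketch easy to exercise without real waiting.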
Where to Use Retries
- Message processing in queues
- Transient network failures
- Rate limits on requested API (typically external)
- Transient overload of requested resource
Timeouts
A timeout prevents indefinite hangs when a request cannot be completed and the requested service never returns a response. The caller stops waiting after a fixed limit and treats the request as failed.
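One possible way to bound how long a caller waits is sketched below in Python using the standard library's `concurrent.futures`. The helper name `call_with_timeout` and the shared executor are assumptions for this example. Note that this only limits the caller's wait: the worker thread keeps running in the background, so where a client library offers its own timeout parameter, that is usually the better choice.

```python
import concurrent.futures

# Shared worker pool for this sketch; the size is arbitrary.
_executor = concurrent.futures.ThreadPoolExecutor(max_workers=4)


def call_with_timeout(func, timeout_seconds, *args, **kwargs):
    """Run `func` in a worker thread and wait at most `timeout_seconds`."""
    future = _executor.submit(func, *args, **kwargs)
    try:
        return future.result(timeout=timeout_seconds)
    except concurrent.futures.TimeoutError:
        # Best effort only: a call that is already running cannot be interrupted.
        future.cancel()
        raise TimeoutError(f"no response within {timeout_seconds}s")
```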
Fallbacks
Provide some non-ideal response or experience when an error occurs so that the system can continue to operate. Typical examples include:
- Cached responses from previous or frequently accessed requests
- Default response (Null object pattern applied to systems design)
- Mocked responses that use placeholder data
- Graceful degradation, which removes problematic functionality while allowing the core functionality to remain
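The first two fallbacks in the list above (a cached response, then a default response) could be chained roughly as follows. This is a minimal sketch: the function name `fetch_with_fallback`, the plain `dict` used as a cache, and the returned source label are all assumptions for illustration.

```python
def fetch_with_fallback(key, fetch, cache, default=None):
    """Return (value, source), where source is "live", "cache", or "default".

    `fetch` is any callable that may raise; `cache` is a dict-like store of
    previously successful responses.
    """
    try:
        value = fetch(key)
    except Exception:
        if key in cache:
            return cache[key], "cache"   # stale but usable response
        return default, "default"        # null-object style default
    cache[key] = value                   # remember the last good response
    return value, "live"
```

Returning the source alongside the value lets the caller decide whether to flag the response as degraded, for example by hiding features that need fresh data.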
Circuit Breaker
Refuse to send requests to a service that has produced some threshold of errors in a given timeframe. This helps prevent the error from cascading through the system, which could lead to data corruption or an invalid system state.
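The idea can be sketched as a small Python class that counts failures inside a sliding time window and refuses calls once the threshold is reached. The class and parameter names, the cooldown period, and the single trial request after the cooldown (often called a "half-open" state) are assumptions added for this example; the notes above describe only the threshold and timeframe.

```python
import time


class CircuitOpenError(Exception):
    """Raised when the breaker refuses to forward a request."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, window_seconds=60.0,
                 cooldown_seconds=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.window_seconds = window_seconds
        self.cooldown_seconds = cooldown_seconds
        self._clock = clock
        self._failures = []      # timestamps of recent failures
        self._opened_at = None   # None means the circuit is closed

    def call(self, operation):
        now = self._clock()
        if self._opened_at is not None:
            if now - self._opened_at < self.cooldown_seconds:
                raise CircuitOpenError("circuit is open; request refused")
            # Cooldown elapsed: let one trial request through (half-open).
        try:
            result = operation()
        except Exception:
            self._record_failure(now)
            raise
        if self._opened_at is not None:
            # Trial request succeeded: close the circuit and start fresh.
            self._opened_at = None
            self._failures.clear()
        return result

    def _record_failure(self, now):
        if self._opened_at is not None:
            self._opened_at = now  # trial failed: restart the cooldown
            return
        # Keep only failures that fall inside the sliding window.
        self._failures = [t for t in self._failures
                          if now - t <= self.window_seconds]
        self._failures.append(now)
        if len(self._failures) >= self.failure_threshold:
            self._opened_at = now
```

While the circuit is open, callers get `CircuitOpenError` immediately instead of waiting on a failing dependency, which is a natural place to serve a fallback response.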