# 🔄 Don't Panic() – goretry.IfNeeded() instead
A reusable Go package implementing intelligent retry patterns with transient error detection, inspired by my System.Retry C# library.

## Features

- **Flexible Retry Policies**: Support for exponential backoff, linear backoff, fixed delay, and custom policies
- **Transient Error Detection**: Configurable logic to determine which errors should trigger retries
- **Context Support**: Full support for Go's context package for cancellation and timeouts
- **Coordination Patterns**: Layer circuit breakers, rate limiting, and distributed coordination on top
- **Multiple Retry Strategies**: Convenience functions similar to the original C# library
- **Thread-Safe Design**: Cryptographically secure jitter and safe concurrent usage
- **Comprehensive Testing**: Extensive test coverage with validation and security tests
- **Idempotent Operations**: Designed for operations that can be safely retried
## Installation

```bash
go get github.com/kriscoleman/GoRetry
```

## Quick Start

```go
import "github.com/kriscoleman/GoRetry"
```

### Basic Retry

```go
// Simple retry with default exponential backoff
err := goretry.IfNeeded(func() error {
    return someOperationThatMightFail()
})
```

### Custom Transient Errors

```go
err := goretry.IfNeeded(func() error {
    return cloudService.Post(credentials)
}, goretry.WithTransientErrorFunc(func(err error) bool {
    // Define your own logic for transient errors
    return strings.Contains(err.Error(), "timeout") ||
        strings.Contains(err.Error(), "connection refused")
}))
```

### With Context and Timeout

```go
ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
defer cancel()

err := goretry.IfNeededWithContext(ctx, func(ctx context.Context) error {
    return cloudService.GetMessages(ctx, credentials)
})
```

## Coordination Patterns

GoRetry provides the retry mechanics, but you can layer coordination patterns on top. The library's context support and flexible policies make it well suited to building higher-level patterns for multiple clients, rate limiting, and distributed coordination.
### Circuit Breaker

Prevent cascading failures when multiple clients hit the same API:

```go
type CircuitBreaker struct {
    retrier     *goretry.Retrier
    failures    atomic.Int32
    state       atomic.Int32 // 0=closed, 1=open, 2=half-open
    lastFailure atomic.Int64
    threshold   int32
}

func (cb *CircuitBreaker) Call(fn func() error) error {
    if cb.isOpen() {
        return errors.New("circuit breaker open")
    }
    return cb.retrier.Do(func() error {
        err := fn()
        if err != nil {
            cb.recordFailure()
        } else {
            cb.recordSuccess()
        }
        return err
    })
}

func (cb *CircuitBreaker) isOpen() bool {
    return cb.state.Load() == 1 &&
        cb.failures.Load() >= cb.threshold
}
```
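The `recordFailure` and `recordSuccess` helpers are omitted above; a minimal sketch of what they might do (hypothetical, not part of GoRetry):

```go
// Hypothetical helpers for the CircuitBreaker above: trip the breaker
// once failures reach the threshold, and reset it on success. A production
// breaker would also use lastFailure to move to half-open after a cooldown.
func (cb *CircuitBreaker) recordFailure() {
    cb.lastFailure.Store(time.Now().UnixNano())
    if cb.failures.Add(1) >= cb.threshold {
        cb.state.Store(1) // open
    }
}

func (cb *CircuitBreaker) recordSuccess() {
    cb.failures.Store(0)
    cb.state.Store(0) // closed
}
```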
### Rate Limiter

Coordinate retries across multiple clients to respect API rate limits:

```go
type RateLimitedRetrier struct {
    retrier *goretry.Retrier
    limiter *rate.Limiter // golang.org/x/time/rate
}

func NewRateLimitedRetrier(policy goretry.RetryPolicy, rps int) *RateLimitedRetrier {
    return &RateLimitedRetrier{
        retrier: goretry.NewRetrier(policy),
        limiter: rate.NewLimiter(rate.Limit(rps), 1),
    }
}

func (rlr *RateLimitedRetrier) Do(fn func() error) error {
    return rlr.retrier.Do(func() error {
        // Wait for rate limit before each attempt
        if err := rlr.limiter.Wait(context.Background()); err != nil {
            return err
        }
        return fn()
    })
}
```
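A usage sketch (`fetchRecords` is a hypothetical stand-in for your own API call). Because the limiter is created with a burst of 1, every attempt, including retries, is spaced out at the configured rate:

```go
rlr := NewRateLimitedRetrier(
    goretry.NewExponentialBackoffPolicy(100*time.Millisecond, 5*time.Second),
    10, // allow at most 10 attempts per second
)
err := rlr.Do(func() error {
    return fetchRecords() // hypothetical rate-limited API call
})
```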
### Coordinated Backoff

Share backoff state across multiple clients:

```go
type CoordinatedRetrier struct {
    retrier      *goretry.Retrier
    backoffState *sync.Map // shared across clients
}

func (cr *CoordinatedRetrier) Do(apiEndpoint string, fn func() error) error {
    return cr.retrier.Do(func() error {
        // Check if endpoint is in coordinated backoff
        if backoffUntil, exists := cr.backoffState.Load(apiEndpoint); exists {
            if time.Now().Before(backoffUntil.(time.Time)) {
                return errors.New("endpoint in coordinated backoff")
            }
        }

        err := fn()
        if isServerError(err) {
            // Set coordinated backoff for all clients
            cr.backoffState.Store(apiEndpoint, time.Now().Add(5*time.Second))
        }
        return err
    })
}
```
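The `isServerError` helper is left undefined above; one possible sketch, assuming your client exposes status codes on its errors (hypothetical, adapt to your error types):

```go
// Hypothetical helper: treat HTTP 5xx responses as server errors.
// Assumes the client's errors expose a StatusCode() method; adapt
// this to however your API client reports status codes.
func isServerError(err error) bool {
    if err == nil {
        return false
    }
    var httpErr interface{ StatusCode() int }
    if errors.As(err, &httpErr) {
        return httpErr.StatusCode() >= 500
    }
    return false
}
```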
### Distributed Coordination

Use external state (Redis, Consul, etc.) for cross-service coordination:

```go
type DistributedRetrier struct {
    retrier *goretry.Retrier
    redis   *redis.Client // github.com/go-redis/redis/v8
}

func (dr *DistributedRetrier) Do(apiKey string, fn func() error) error {
    ctx := context.Background()
    return dr.retrier.Do(func() error {
        // Check distributed backoff state
        backoff, _ := dr.redis.Get(ctx, "backoff:"+apiKey).Result()
        if backoff != "" {
            return errors.New("API in distributed backoff")
        }

        err := fn()
        if isRateLimit(err) {
            // Set distributed backoff for all services
            dr.redis.SetEX(ctx, "backoff:"+apiKey, "true", 30*time.Second)
        }
        return err
    })
}
```
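Likewise, `isRateLimit` is not shown; a hedged sketch, assuming errors may carry an HTTP status code (hypothetical):

```go
// Hypothetical helper: detect rate limiting via HTTP 429 or a
// "rate limit" message. Adapt to your client's error shape.
func isRateLimit(err error) bool {
    if err == nil {
        return false
    }
    var httpErr interface{ StatusCode() int }
    if errors.As(err, &httpErr) && httpErr.StatusCode() == http.StatusTooManyRequests {
        return true
    }
    return strings.Contains(strings.ToLower(err.Error()), "rate limit")
}
```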
### Pattern Comparison

| Pattern | Use Case | Complexity | Benefits |
|---|---|---|---|
| Circuit Breaker | High-traffic APIs, cascading failures | Medium | Prevents thundering herd, fast failure |
| Rate Limiter | API quotas, rate-limited services | Low | Simple, prevents 429s |
| Coordinated Backoff | Shared resources, multiple clients | Medium | Reduces server load |
| Distributed | Microservices, multiple processes | High | Cross-service coordination |
**Key insight:** Coordination patterns handle the "when to allow retries" while GoRetry handles the "how to retry". This separation of concerns makes it easy to build sophisticated retry strategies while keeping the core retry logic simple and reliable.
## Retry Policies

### Exponential Backoff

```go
policy := goretry.NewExponentialBackoffPolicy(100*time.Millisecond, 5*time.Second)
retrier := goretry.NewRetrier(policy)
```

### Fixed Delay

```go
policy := goretry.NewFixedDelayPolicy(500 * time.Millisecond)
retrier := goretry.NewRetrier(policy)
```

### Linear Backoff

```go
policy := goretry.NewLinearBackoffPolicy(
    100*time.Millisecond, // base delay
    50*time.Millisecond,  // increment
    1*time.Second,        // max delay
)
retrier := goretry.NewRetrier(policy)
```

### No Delay

```go
policy := goretry.NewNoDelayPolicy()
retrier := goretry.NewRetrier(policy)
```

### Stop Policy

```go
basePolicy := goretry.NewExponentialBackoffPolicy(100*time.Millisecond, 5*time.Second)
policy := goretry.NewStopPolicy(basePolicy).
    WithMaxAttempts(5).
    WithMaxDuration(30 * time.Second)
```

## Configuration Options

### Maximum Attempts

```go
retrier := goretry.NewRetrier(policy, goretry.WithMaxAttempts(5))
```

### Custom Transient Error Detection

```go
retrier := goretry.NewRetrier(policy,
    goretry.WithTransientErrorFunc(func(err error) bool {
        // Your custom logic here
        return err.Error() == "temporary failure"
    }),
)
```

### Retry Callbacks

```go
retrier := goretry.NewRetrier(policy,
    goretry.WithOnRetry(func(attempt int, err error) {
        log.Printf("Retry attempt %d due to: %v", attempt, err)
    }),
)
```

## Error Handling

When all retries are exhausted, GoRetry returns an `OutOfRetriesError`:

```go
err := goretry.IfNeeded(func() error {
    return errors.New("persistent error")
})

var outOfRetriesErr *goretry.OutOfRetriesError
if errors.As(err, &outOfRetriesErr) {
    fmt.Printf("Failed after %d attempts\n", outOfRetriesErr.Attempts)
    fmt.Printf("Last error: %v\n", outOfRetriesErr.LastErr)
    // Access all errors that occurred
    for i, e := range outOfRetriesErr.AllErrs {
        fmt.Printf("Attempt %d error: %v\n", i+1, e)
    }
}
```

## Transient Error Detection

The package includes a reasonable default for detecting transient errors:

- Network timeout errors (errors implementing the `Timeout() bool` interface)
- Temporary errors (errors implementing the `Temporary() bool` interface)
- Context cancellation and deadline exceeded errors are not considered transient
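In code, a default along these lines might look like the following (a sketch, not GoRetry's exact implementation):

```go
// Sketch of a default transient-error check in this style: context
// cancellation is never transient; timeout/temporary network errors are.
func isTransient(err error) bool {
    if errors.Is(err, context.Canceled) || errors.Is(err, context.DeadlineExceeded) {
        return false
    }
    var timeoutErr interface{ Timeout() bool }
    if errors.As(err, &timeoutErr) && timeoutErr.Timeout() {
        return true
    }
    var tempErr interface{ Temporary() bool }
    return errors.As(err, &tempErr) && tempErr.Temporary()
}
```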
## Convenience Functions

### IfNeeded

```go
err := goretry.IfNeeded(func() error {
    return operation()
})
```

### IfNeededWithContext

```go
err := goretry.IfNeededWithContext(ctx, func(ctx context.Context) error {
    return operation(ctx)
})
```

### IfNeededWithPolicy

```go
policy := goretry.NewFixedDelayPolicy(200 * time.Millisecond)
err := goretry.IfNeededWithPolicy(policy, func() error {
    return operation()
})
```

### IfNeededWithPolicyAndContext

```go
err := goretry.IfNeededWithPolicyAndContext(ctx, policy, func(ctx context.Context) error {
    return operation(ctx)
})
```

See `examples_test.go` for more comprehensive examples of usage patterns.
## License

This project is licensed under the MIT License - see the LICENSE file for details.
## Acknowledgments

This library is inspired by the System.Retry C# library and follows MSDN's Retry Pattern guidelines.