go-notification

Retry & Backoff

Automatic retries with exponential backoff for transient failures.

Transient failures are the norm for notification APIs — network blips, upstream 5xx, connection resets. go-notification retries these automatically with exponential backoff.

Defaults

main.go
notification.New(notification.Config{
    MaxRetries: 3,               // retries AFTER the first attempt → 4 attempts total
    RetryDelay: 1 * time.Second, // base delay for exponential backoff
})

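These are the only two retry knobs. The wait before each retry is RetryDelay * 2^attempt, where attempt counts from 0 for the first retry (so the first wait equals RetryDelay). The result is capped at 60 seconds, and up to 25% jitter is added to avoid retry storms.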

Default schedule (RetryDelay = 1s, MaxRetries = 3):

  • Attempt 1 — fire immediately
  • Retry 1 — after ~1s (+ jitter)
  • Retry 2 — after ~2s (+ jitter)
  • Retry 3 — after ~4s (+ jitter)
  • Give up → call OnError
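
The schedule above can be reproduced with a few lines of Go. This is a minimal sketch, not the library's code: baseDelay, withJitter and maxBackoff are hypothetical names, and it assumes the cap is applied before the jitter is added, which the text does not specify.

```go
package main

import (
	"fmt"
	"math/rand"
	"time"
)

// maxBackoff mirrors the documented 60-second cap (hypothetical name).
const maxBackoff = 60 * time.Second

// baseDelay returns the un-jittered wait before the given retry (1-based):
// retryDelay * 2^(retry-1), capped at maxBackoff.
func baseDelay(retryDelay time.Duration, retry int) time.Duration {
	d := retryDelay
	for i := 1; i < retry; i++ {
		d *= 2
		if d >= maxBackoff {
			return maxBackoff // stop early so large retry counts cannot overflow
		}
	}
	if d > maxBackoff {
		return maxBackoff
	}
	return d
}

// withJitter adds a random extra wait of up to 25% of d.
// Assumption: jitter is additive and applied after the cap.
func withJitter(d time.Duration) time.Duration {
	if d <= 0 {
		return d
	}
	return d + time.Duration(rand.Int63n(int64(d)/4+1))
}

func main() {
	for retry := 1; retry <= 3; retry++ {
		base := baseDelay(time.Second, retry)
		fmt.Printf("retry %d: base %v, with jitter %v\n", retry, base, withJitter(base))
	}
}
```

With RetryDelay = 1s the base waits come out as 1s, 2s and 4s, matching the list above. The jittered values differ on every run.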

MaxRetries: 0 is treated as the default (3). To disable retries entirely, set a negative value (e.g. MaxRetries: -1). RetryDelay <= 0 defaults to 1s.
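
The defaulting rules are easy to misread, so here is a small sketch of them. retryConfig and normalize are hypothetical stand-ins for illustration only; the library's own Config type and internals may look different.

```go
package main

import (
	"fmt"
	"time"
)

// retryConfig is a hypothetical stand-in for the two retry fields of
// notification.Config.
type retryConfig struct {
	MaxRetries int
	RetryDelay time.Duration
}

// normalize applies the documented rules: 0 means "use the default of 3",
// a negative MaxRetries disables retries, and a non-positive RetryDelay
// falls back to 1s.
func normalize(c retryConfig) retryConfig {
	switch {
	case c.MaxRetries == 0:
		c.MaxRetries = 3
	case c.MaxRetries < 0:
		c.MaxRetries = 0 // retries disabled: only the first attempt runs
	}
	if c.RetryDelay <= 0 {
		c.RetryDelay = time.Second
	}
	return c
}

func main() {
	inputs := []retryConfig{
		{}, // zero value
		{MaxRetries: -1},
		{MaxRetries: 5, RetryDelay: 250 * time.Millisecond},
	}
	for _, in := range inputs {
		out := normalize(in)
		fmt.Printf("%+v -> %d attempts total, base delay %v\n", in, out.MaxRetries+1, out.RetryDelay)
	}
}
```

The practical consequence is that a zero-value Config still retries three times, so disabling retries has to be an explicit negative value.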

What gets retried

By default every error is retried until MaxRetries is reached. A driver opts a specific failure out of retry by returning an error that wraps middleware.ErrPermanent:

main.go
import (
    "fmt"
    "github.com/gopackx/go-notification/middleware"
)

// Inside a driver's Send:
if resp.StatusCode >= 400 && resp.StatusCode < 500 {
    // 4xx: the request itself is wrong, so retrying cannot help
    return fmt.Errorf("provider rejected (%d): %w", resp.StatusCode, middleware.ErrPermanent)
}
if resp.StatusCode >= 500 {
    return fmt.Errorf("provider upstream %d", resp.StatusCode) // plain error, retryable
}
return nil

So the convention is: 4xx client errors (bad API key, bad recipient, invalid payload) wrap ErrPermanent and stop immediately; network errors and 5xx are left plain and get retried. Context cancellation also stops retries early.
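
To make the convention concrete, the sketch below shows how a retry loop can tell the two kinds of error apart with errors.Is. It is an illustration under assumptions, not the library's middleware: errPermanent is a local stand-in for middleware.ErrPermanent, retry and attemptsFor are hypothetical helpers, and the wait is a fixed delay rather than exponential backoff to keep the example short.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// errPermanent is a local stand-in for middleware.ErrPermanent.
var errPermanent = errors.New("permanent failure")

// Example errors a driver might return.
var (
	errRejected = fmt.Errorf("provider rejected (400): %w", errPermanent) // 4xx-style
	errUpstream = errors.New("provider upstream 503")                     // 5xx-style
)

// retry calls send until it succeeds, returns a permanent error, runs out
// of retries, or the context is cancelled. It reports the attempts made.
func retry(ctx context.Context, maxRetries int, delay time.Duration, send func() error) (attempts int, err error) {
	for attempts = 1; ; attempts++ {
		err = send()
		if err == nil || errors.Is(err, errPermanent) || attempts > maxRetries {
			return attempts, err
		}
		select {
		case <-ctx.Done():
			return attempts, ctx.Err() // cancellation stops retries early
		case <-time.After(delay): // fixed wait; real backoff would grow here
		}
	}
}

// attemptsFor runs send with MaxRetries = 3 and returns the attempt count.
func attemptsFor(send func() error) int {
	n, _ := retry(context.Background(), 3, time.Millisecond, send)
	return n
}

func main() {
	fmt.Println(attemptsFor(func() error { return errRejected })) // prints 1
	fmt.Println(attemptsFor(func() error { return errUpstream })) // prints 4
}
```

Because the check uses errors.Is, the driver can add as much context to the message as it likes, provided it wraps the sentinel with %w.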

Jitter

Exponential backoff without jitter causes retry storms — every client hits the upstream at the same instant after a blip. The built-in backoff adds up to 25% jitter on each wait. This is automatic and not configurable.

Observability

Individual retries aren't surfaced via a per-attempt callback. To observe delivery outcomes, use OnError (fired once, after all retries are exhausted) and OnSuccess:

main.go
notification.New(notification.Config{
    MaxRetries: 5,
    OnError: func(ctx context.Context, n notification.Notifiable, channel string, err error) {
        failureCounter.Add(1)
        slog.Error("delivery exhausted retries", "channel", channel, "id", n.GetID(), "err", err)
    },
})

A sustained spike in OnError calls means something upstream is degrading. Investigate the cause rather than relying on retries to absorb it.

Pairs with rate limiting

Retry handles failures you didn't prevent; rate limiting prevents many of them (avoiding 429s in the first place). The two features pair well.