Goroutine supervision
If long-lived goroutines may fail, restart policy, backoff, observability, and shutdown ownership must be explicit.
Canonical guidance
- supervise only work that is meant to be long-lived
- decide when to restart, when to give up, and who gets notified
- combine restart logic with backoff and observability
Use when
- long-running watchers
- connection-maintenance loops
- background workers that should recover from transient failure
Avoid
- infinite restart loops with no backoff
- hiding crashes by restarting silently forever
- supervising request-scoped work that should simply return an error
Preferred pattern
- one owner goroutine starts a worker, observes its exit, applies backoff if restartable, and stops when ctx.Done() is closed
Anti-pattern
for { go work() } after every failure
Explanation: This is tempting because restart seems like resilience, but unbounded crash loops just move failure into overload and noise.
Why
- recovery policy is part of lifecycle design, not an afterthought
Sources
- Go Concurrency Patterns: Context - Sameer Ajmani
- Advanced Go Concurrency Patterns - Sameer Ajmani
- time package - Go Team