Rate limiting
Rate limiting protects downstream capacity. Limit at the right boundary, choose burst deliberately, and honor cancellation while waiting.
Canonical guidance
- use token-bucket rate limiting for most request throttling
- choose the limiter boundary deliberately
- make burst allowance as explicit as steady-state rate
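To make the token-bucket and burst semantics concrete, here is a minimal self-contained sketch (the `bucket` type and its field names are illustrative, not a real library; production code would normally use `golang.org/x/time/rate` instead):

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// bucket is a toy token bucket: cap is the burst allowance,
// refill is the steady-state rate in tokens per second.
// Both are explicit constructor parameters, per the guidance above.
type bucket struct {
	mu     sync.Mutex
	tokens float64
	cap    float64
	refill float64 // tokens per second
	last   time.Time
}

func newBucket(ratePerSec, burst float64) *bucket {
	return &bucket{tokens: burst, cap: burst, refill: ratePerSec, last: time.Now()}
}

// allow reports whether one token is available right now,
// refilling based on elapsed time since the last call.
func (b *bucket) allow() bool {
	b.mu.Lock()
	defer b.mu.Unlock()
	now := time.Now()
	b.tokens += now.Sub(b.last).Seconds() * b.refill
	if b.tokens > b.cap {
		b.tokens = b.cap
	}
	b.last = now
	if b.tokens >= 1 {
		b.tokens--
		return true
	}
	return false
}

func main() {
	// 5 tokens/sec steady state, burst of 2: the burst is as
	// explicit a policy choice as the rate itself.
	b := newBucket(5, 2)
	fmt.Println(b.allow(), b.allow(), b.allow()) // two burst tokens pass, the third is denied
}
```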
Use when
- protecting upstream APIs or databases
- per-tenant or per-endpoint fairness
- smoothing producer bursts
Avoid
- sleeping manually between requests
- one global limiter when resources or caller classes need separate budgets
- blocking forever on rate limit waits after the caller gave up
Preferred pattern
- keep a rate.Limiter near the protected dependency and call Wait(ctx) or Allow() based on policy
Anti-pattern
- scattering ad hoc time.Sleep throttling through business code
Explanation: This is tempting because it is simple, but it couples throttling policy to every call site, ignores burst semantics, and cannot honor caller cancellation mid-sleep.
Why
- throttling is a resource and fairness policy, not just a delay
Sources
- Go Wiki: Rate Limiting - Go Team
- golang.org/x/time/rate package - Go Authors
- Go Concurrency Patterns: Context - Sameer Ajmani