指数backoff
当我们连接到一个失败的后端时,通常希望不要立即重试(以避免泛滥的网络或服务器的请求),而是做某种形式的指数backoff。
我们有几个参数:
- INITIAL_BACKOFF (第一次失败重试前后需等待多久)
- MULTIPLIER (在失败的重试后乘以的倍数)
- JITTER (随机抖动因子).
- MAX_BACKOFF (backoff上限)
- MIN_CONNECT_TIMEOUT (最短重试间隔)
建议backoff算法
以指数形式返回连接尝试的起始时间,达到MAX_BACKOFF的极限,并带有抖动。
1
2
3
4
5
6
7
|
ConnectWithBackoff()
current_backoff = INITIAL_BACKOFF
current_deadline = now() + INITIAL_BACKOFF
while (TryConnect(Max(current_deadline, now() + MIN_CONNECT_TIMEOUT))!= SUCCESS)
SleepUntil(current_deadline)
current_backoff = Min(current_backoff *MULTIPLIER, MAX_BACKOFF)
current_deadline = now() + current_backoff + UniformRandom(-JITTER* current_backoff, JITTER * current_backoff)
|
参数默认值:
- MIN_CONNECT_TIMEOUT=20sec
- INITIAL_BACKOFF=1sec
- MULTIPLIER=1.6
- MAX_BACKOFF=120sec
- JITTER=0.2
根据的确切的关注点实现(例如最小化手机的唤醒次数)可能希望使用不同的算法,特别是不同的抖动逻辑。
备用的实现必须确保连接退避在同一时间开始分散,并且不得比上述算法更频繁地尝试连接。
重置backoff
backoff应在某个时间点重置为INITIAL_BACKOFF,以便重新连接行为是一致的,不管连接的是新开始的还是先前断开的连接。
当接收到SETTINGS帧时重置backoff,在那个时候,我们确定这个连接被服务器已经接受了。
grpc/backoff
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
|
// See internal/backoff package for the backoff implementation. This file is
// kept for the exported types and API backward compatibility.
package grpc
import (
"time"
"google.golang.org/grpc/backoff"
)
// DefaultBackoffConfig uses values specified for backoff in
// https://github.com/grpc/grpc/blob/master/doc/connection-backoff.md.
//
// Deprecated: use ConnectParams instead. Will be supported throughout 1.x.
var DefaultBackoffConfig = BackoffConfig{
MaxDelay: 120 * time.Second,
}
// BackoffConfig defines the parameters for the default gRPC backoff strategy.
//
// Deprecated: use ConnectParams instead. Will be supported throughout 1.x.
type BackoffConfig struct {
// MaxDelay is the upper bound of backoff delay.
MaxDelay time.Duration
}
// ConnectParams defines the parameters for connecting and retrying. Users are
// encouraged to use this instead of the BackoffConfig type defined above. See
// here for more details:
// https://github.com/grpc/grpc/blob/master/doc/connection-backoff.md.
//
// Experimental
//
// Notice: This type is EXPERIMENTAL and may be changed or removed in a
// later release.
type ConnectParams struct {
// Backoff specifies the configuration options for connection backoff.
Backoff backoff.Config
// MinConnectTimeout is the minimum amount of time we are willing to give a
// connection to complete.
MinConnectTimeout time.Duration
}
|
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
|
// Package backoff provides configuration options for backoff.
//
// More details can be found at:
// https://github.com/grpc/grpc/blob/master/doc/connection-backoff.md.
//
// All APIs in this package are experimental.
package backoff
import "time"
// Config defines the configuration options for backoff.
type Config struct {
// BaseDelay is the amount of time to backoff after the first failure.
BaseDelay time.Duration
// Multiplier is the factor with which to multiply backoffs after a
// failed retry. Should ideally be greater than 1.
Multiplier float64
// Jitter is the factor with which backoffs are randomized.
Jitter float64
// MaxDelay is the upper bound of backoff delay.
MaxDelay time.Duration
}
// DefaultConfig is a backoff configuration with the default values specfied
// at https://github.com/grpc/grpc/blob/master/doc/connection-backoff.md.
//
// This should be useful for callers who want to configure backoff with
// non-default values only for a subset of the options.
var DefaultConfig = Config{
// 第一次失败之后的延迟时间.
BaseDelay: 1.0 * time.Second,
// 多次失败之后的时间乘数.
Multiplier: 1.6,
// 随机因子.
Jitter: 0.2,
// 最大延迟时间.
MaxDelay: 120 * time.Second,
}
|
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
|
// Strategy defines the methodology for backing off after a grpc connection
// failure.
type Strategy interface {
// Backoff returns the amount of time to wait before the next retry given
// the number of consecutive failures.
Backoff(retries int) time.Duration
}
// DefaultExponential is an exponential backoff implementation using the
// default values for all the configurable knobs defined in
// https://github.com/grpc/grpc/blob/master/doc/connection-backoff.md.
var DefaultExponential = Exponential{Config: grpcbackoff.DefaultConfig}
// Exponential implements exponential backoff algorithm as defined in
// https://github.com/grpc/grpc/blob/master/doc/connection-backoff.md.
type Exponential struct {
// Config contains all options to configure the backoff algorithm.
Config grpcbackoff.Config
}
// Backoff returns the amount of time to wait before the next retry given the
// number of retries.
func (bc Exponential) Backoff(retries int) time.Duration {
// 当重试次数为0时直接返回BaseDelay,为1秒.
if retries == 0 {
return bc.Config.BaseDelay
}
backoff, max := float64(bc.Config.BaseDelay), float64(bc.Config.MaxDelay)
for backoff < max && retries > 0 {
// 当backoff小于max且重试次数大于0时不断的乘以Multiplier.
backoff *= bc.Config.Multiplier
retries--
}
if backoff > max {
backoff = max
}
// Randomize backoff delays so that if a cluster of requests start at
// the same time, they won't operate in lockstep.
// 对时间加上一个随机数.
backoff *= 1 + bc.Config.Jitter*(grpcrand.Float64()*2-1)
if backoff < 0 {
return 0
}
return time.Duration(backoff)
}
|
如果默认的backoff算法不满足需求的时候,还可以自定义backoff算法,通过实现backoffStrategy接口。
1
2
3
4
5
6
|
func withBackoff(bs backoffStrategy) DialOption {
return func(o *dialOptions) {
o.bs = bs
}
}
grpc.Dial(addr, grpc.withBackoff(mybackoff))
|
参考
grpc-go 连接backoff协议
gRPC系列之连接异常机制