Yang's Kernel

If you can't solve a problem, then there is a smaller problem you can't solve either: find it!


Why the WebSocket RFC Recommends Leaving TIME_WAIT on the Server

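Background

I once heard the advice that the TIME_WAIT state should be kept on the client whenever possible (that is, the client sends the first FIN) rather than on the server, because a connection in TIME_WAIT is only released after 2MSL and ties up resources in the meantime. That sounds reasonable. But when I read the WebSocket RFC, I found that it tells the server to send the first FIN so that TIME_WAIT stays on the server, on the grounds that TIME_WAIT does not affect the server's handling of new connections. The two claims conflict, so let's find out where the discrepancy comes from.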

The underlying TCP connection, in most normal cases, SHOULD be closed
first by the server, so that it holds the TIME_WAIT state and not the
client (as this would prevent it from re-opening the connection for 2
maximum segment lifetimes (2MSL), while there is no corresponding
server impact as a TIME_WAIT connection is immediately reopened upon
a new SYN with a higher seq number). In abnormal cases (such as not
having received a TCP Close from the server after a reasonable amount
of time) a client MAY initiate the TCP Close. As such, when a server
is instructed to Close the WebSocket Connection it SHOULD initiate
a TCP Close immediately, and when a client is instructed to do the
same, it SHOULD wait for a TCP Close from the server.

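Server-side TIME_WAIT experiment

Let's start with a simple program to see how the server handles TIME_WAIT. The kernel version is 3.10, and the experiment code is listed at the end of this post.
Start the server: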

LISTEN     0	  1	 127.0.0.1:8888                     *:*                   users:(("server",pid=656524,fd=3))

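Connect to the server with nc from a fixed source port. The server writes hello to the client and then actively closes the connection. Quit nc with Ctrl-C (the kernel replies with the ACK automatically).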

nc -p 59818 127.0.0.1 8888
hello

The connection is now in the TIME_WAIT state:

LISTEN     0	  1	 127.0.0.1:8888                     *:*                   users:(("server",pid=656524,fd=3))
TIME-WAIT 0 0 127.0.0.1:8888 127.0.0.1:59818

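Connect to the server again from the same port. The packet capture shows that the handshake succeeds: the connection that was in TIME-WAIT is reused.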

IP 127.0.0.1.59818 > 127.0.0.1.8888: Flags [S], seq 2318980489, win 43690, options [mss 65495,nop,nop,sackOK,nop,wscale 7], length 0

IP 127.0.0.1.8888 > 127.0.0.1.59818: Flags [S.], seq 4100132887, ack 2318980490, win 43690, options [mss 65495,nop,nop,sackOK,nop,wscale 7], length 0

IP 127.0.0.1.59818 > 127.0.0.1.8888: Flags [.], ack 1, win 342, length 0

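In practice, then, the TIME-WAIT state did not block a new TCP connection with the same four-tuple, so it appears to have no "side effects" here.

Kernel source

In the 3.10 kernel source I am running, when a connection is in TIME-WAIT and a valid SYN arrives, the connection is reused. The core of the check is whether the incoming SYN's sequence number, and its timestamp if timestamps are enabled, is newer than what the old connection last saw, with wraparound taken into account. According to the comment, checking only the sequence number is safe at rates below 40 Mbit/sec; with TCP timestamps (PAWS) enabled, the mechanism is more reliable and carries less risk.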

/* Out of window segment.

All the segments are ACKed immediately.

The only exception is new SYN. We accept it, if it is
not old duplicate and we are not in danger to be killed
by delayed old duplicates. RFC check is that it has
newer sequence number works at rates <40Mbit/sec.
However, if paws works, it is reliable AND even more,
we even may relax silly seq space cutoff.

RED-PEN: we violate main RFC requirement, if this SYN will appear
old duplicate (i.e. we receive RST in reply to SYN-ACK),
we must return socket to time-wait state. It is not good,
but not fatal yet.
*/

if (th->syn && !th->rst && !th->ack && !paws_reject &&
    (after(TCP_SKB_CB(skb)->seq, tcptw->tw_rcv_nxt) ||
     (tmp_opt.saw_tstamp &&
      (s32)(tcptw->tw_ts_recent - tmp_opt.rcv_tsval) < 0))) {
	u32 isn = tcptw->tw_snd_nxt + 65535 + 2;
	if (isn == 0)
		isn++;
	TCP_SKB_CB(skb)->when = isn;
	return TCP_TW_SYN;
}
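
To make the condition easier to follow, here is a small Go model of the check above. It is a sketch, not kernel code: the types twState and synInfo and the function acceptNewSYN are invented for this example, and it assumes the segment has already been identified as a pure SYN (SYN set, RST and ACK clear). The comparisons use signed 32-bit differences so that they remain correct when the sequence number or timestamp wraps around.

```go
package main

import "fmt"

// after reports whether seq1 is later than seq2 in 32-bit sequence space.
// The signed difference keeps the comparison valid across wraparound.
func after(seq1, seq2 uint32) bool {
	return int32(seq2-seq1) < 0
}

// twState holds the fields of the TIME-WAIT socket used by the check.
type twState struct {
	rcvNxt   uint32 // corresponds to tw_rcv_nxt
	tsRecent uint32 // corresponds to tw_ts_recent
}

// synInfo describes the incoming SYN segment.
type synInfo struct {
	seq       uint32 // sequence number of the SYN
	sawTstamp bool   // whether the SYN carried a timestamp option
	tsval     uint32 // timestamp value, valid only if sawTstamp is true
}

// acceptNewSYN models the condition: the SYN is accepted when PAWS did not
// reject it and either its sequence number or its timestamp is newer than
// what the old connection last saw.
func acceptNewSYN(tw twState, syn synInfo, pawsReject bool) bool {
	if pawsReject {
		return false
	}
	newerSeq := after(syn.seq, tw.rcvNxt)
	newerTS := syn.sawTstamp && int32(tw.tsRecent-syn.tsval) < 0
	return newerSeq || newerTS
}

func main() {
	tw := twState{rcvNxt: 1000, tsRecent: 500}

	// Older sequence number, but a newer timestamp.
	fmt.Println(acceptNewSYN(tw, synInfo{seq: 900, sawTstamp: true, tsval: 600}, false))

	// Older sequence number and no timestamp option.
	fmt.Println(acceptNewSYN(tw, synInfo{seq: 900}, false))

	// Newer sequence number after a 32-bit wraparound.
	fmt.Println(after(5, 0xFFFFFFF0))
}
```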

Client-side TIME-WAIT experiment

Can the client reuse a connection that is in TIME-WAIT? Let's check with nc.

# Start the server
nc -l 8888

# Connect from the client
nc -p 59818 127.0.0.1 8888

# Close the client; the TIME-WAIT state "stays" on the client
TIME-WAIT 0 0 127.0.0.1:59818 127.0.0.1:8888

# Reconnect from the same client port: error
nc -p 59818 127.0.0.1 8888
Ncat: Cannot assign requested address.

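The experiment shows that, by default, the client cannot reuse a connection in TIME-WAIT.
There is a way to let the client reuse a TIME-WAIT connection sooner, though: the tcp_tw_reuse kernel parameter.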

net.ipv4.tcp_tw_reuse

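With this option enabled, when the client calls connect() to a remote server and the kernel finds a connection with the same four-tuple that has been in TIME-WAIT for more than 1 s, it reuses that connection.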

# tcp_tw_reuse only takes effect when tcp_timestamps is enabled; it is on by default
sysctl -w net.ipv4.tcp_timestamps=1
# Enable tcp_tw_reuse
sysctl -w net.ipv4.tcp_tw_reuse=1

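Although the client can reuse TIME-WAIT connections through tcp_tw_reuse, this kind of "reuse" is not as safe as reuse on the server side. See the references at the end of this post for details.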

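Summary

The recommendation in RFC 6455 is sound. Both the server and the client have mechanisms for reusing TIME-WAIT connections, but server-side reuse is safer. In addition, server resources are limited, so the server should distrust clients on principle and keep the initiative in closing connections, rather than waste resources waiting for clients to disconnect.

Experiment code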

server

package main

import (
	"fmt"
	"os"
	"syscall"
)

func main() {
	// Create the listening socket.
	fd, err := syscall.Socket(syscall.AF_INET, syscall.SOCK_STREAM, syscall.IPPROTO_TCP)
	if err != nil {
		fmt.Fprintf(os.Stderr, "Error creating socket: %v\n", err)
		return
	}
	// Release the listening socket on every return path.
	defer syscall.Close(fd)

	// Bind to 127.0.0.1:8888.
	addr := syscall.SockaddrInet4{Port: 8888}
	copy(addr.Addr[:], []byte{127, 0, 0, 1}) // localhost

	if err := syscall.Bind(fd, &addr); err != nil {
		fmt.Fprintf(os.Stderr, "Error binding socket: %v\n", err)
		return
	}

	// Start listening with a backlog of 1.
	if err := syscall.Listen(fd, 1); err != nil {
		fmt.Fprintf(os.Stderr, "Error listening: %v\n", err)
		return
	}

	for {
		// Accept a connection.
		nfd, _, err := syscall.Accept(fd)
		if err != nil {
			fmt.Fprintf(os.Stderr, "Error accepting connection: %v\n", err)
			return
		}

		// Send the greeting.
		if _, err := syscall.Write(nfd, []byte("hello")); err != nil {
			fmt.Fprintf(os.Stderr, "Error writing to socket: %v\n", err)
		}

		// Close first, so the server side holds TIME_WAIT.
		syscall.Close(nfd)
	}
}

References

  1. https://www.rfc-editor.org/rfc/rfc1122#page-88
  2. https://www.rfc-editor.org/rfc/rfc1323.html
  3. https://xiaolincoding.com/network/3_tcp/time_wait_recv_syn.html
  4. https://serverfault.com/questions/693529/how-does-server-side-time-wait-really-work
  5. https://web.archive.org/web/20141029223553/http://blogs.technet.com/b/networking/archive/2010/08/11/how-tcp-time-wait-assassination-works.aspx