Notice: this page reproduces a popular Stack Overflow question and answer, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license, state the original address and author information, and attribute it to the original authors (not me): Stack Overflow
Original: http://stackoverflow.com/questions/14388706/
How do SO_REUSEADDR and SO_REUSEPORT differ?
Asked by Mecki
The man pages and programmer documentation for the socket options SO_REUSEADDR and SO_REUSEPORT differ between operating systems and are often highly confusing. Some operating systems don't even have the option SO_REUSEPORT. The web is full of contradictory information on this subject, and often you can find information that is only true for one socket implementation of a specific operating system, which may not even be explicitly mentioned in the text.
So how exactly is SO_REUSEADDR different from SO_REUSEPORT?
Are systems without SO_REUSEPORT more limited?
And what exactly is the expected behavior if I use either one on different operating systems?
Accepted answer by Mecki
Welcome to the wonderful world of portability... or rather the lack of it. Before we start analyzing these two options in detail and take a deeper look at how different operating systems handle them, it should be noted that the BSD socket implementation is the mother of all socket implementations. Basically all other systems copied the BSD socket implementation at some point in time (or at least its interfaces) and then started evolving it on their own. Of course the BSD socket implementation evolved as well at the same time, and thus systems that copied it later got features that were lacking in systems that copied it earlier. Understanding the BSD socket implementation is the key to understanding all other socket implementations, so you should read about it even if you never intend to write code for a BSD system.
There are a couple of basics you should know before we look at these two options. A TCP/UDP connection is identified by a tuple of five values:
{<protocol>, <src addr>, <src port>, <dest addr>, <dest port>}
Any unique combination of these values identifies a connection. As a result, no two connections can have the same five values, otherwise the system would not be able to distinguish these connections any longer.
The protocol of a socket is set when the socket is created with the socket() function. The source address and port are set with the bind() function. The destination address and port are set with the connect() function. Since UDP is a connectionless protocol, UDP sockets can be used without connecting them. Yet it is allowed to connect them, and in some cases this is very advantageous for your code and general application design. In connectionless mode, UDP sockets that were not explicitly bound when data is sent over them for the first time are usually automatically bound by the system, as an unbound UDP socket cannot receive any (reply) data. The same is true for an unbound TCP socket: it is automatically bound before it is connected.
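To see the automatic binding in action, here is a minimal Python sketch. It assumes a Linux or BSD style system where getsockname() on an unbound UDP socket reports port 0 (Windows reports an error there instead); the function name udp_auto_bind_ports is made up for illustration.

```python
import socket


def udp_auto_bind_ports():
    """Return the local port of a UDP socket before and after its first send."""
    # A local receiver, so the datagram has somewhere to go (port picked by the OS).
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    receiver.bind(("127.0.0.1", 0))
    # socket() fixes the protocol: AF_INET + SOCK_DGRAM means UDP over IPv4.
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        port_before = sender.getsockname()[1]   # bind() was never called
        sender.sendto(b"ping", receiver.getsockname())
        port_after = sender.getsockname()[1]    # the system bound the socket itself
        return port_before, port_after
    finally:
        sender.close()
        receiver.close()


if __name__ == "__main__":
    print(udp_auto_bind_ports())
```

The first value should be 0 (not bound yet), the second a port the system picked during the first sendto() call.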
If you explicitly bind a socket, it is possible to bind it to port 0, which means "any port". Since a socket cannot really be bound to all existing ports, the system will have to choose a specific port itself in that case (usually from a predefined, OS specific range of source ports). A similar wildcard exists for the source address, which can be "any address" (0.0.0.0 in case of IPv4 and :: in case of IPv6). Unlike in case of ports, a socket can really be bound to "any address" which means "all source IP addresses of all local interfaces". If the socket is connected later on, the system has to choose a specific source IP address, since a socket cannot be connected and at the same time be bound to any local IP address. Depending on the destination address and the content of the routing table, the system will pick an appropriate source address and replace the "any" binding with a binding to the chosen source IP address.
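Both wildcards can be observed with a few lines of Python. This is only a sketch under the assumption of Linux or BSD behavior; wildcard_then_connect is a hypothetical helper name, and the loopback peer exists just to have a destination to connect to.

```python
import socket


def wildcard_then_connect():
    """Bind to any address / any port, then connect and see what was chosen."""
    peer = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    peer.bind(("127.0.0.1", 0))
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.bind(("0.0.0.0", 0))           # "any address", "any port"
        bound = sock.getsockname()          # wildcard address, port chosen by the OS
        sock.connect(peer.getsockname())    # for UDP this only fixes the destination
        connected = sock.getsockname()      # wildcard replaced by a specific address
        return bound, connected
    finally:
        sock.close()
        peer.close()


if __name__ == "__main__":
    print(wildcard_then_connect())
```

Since the destination is on the loopback interface, the source address picked by the routing decision should be 127.0.0.1, while the port chosen at bind() time stays the same.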
By default, no two sockets can be bound to the same combination of source address and source port. As long as the source port is different, the source address is actually irrelevant. Binding socketA to A:X and socketB to B:Y, where A and B are addresses and X and Y are ports, is always possible as long as X != Y holds true. However, even if X == Y, the binding is still possible as long as A != B holds true. E.g. if socketA belongs to an FTP server program and is bound to 192.168.0.1:21 and socketB belongs to another FTP server program and is bound to 10.0.0.1:21, both bindings will succeed. Keep in mind, though, that a socket may be locally bound to "any address". If a socket is bound to 0.0.0.0:21, it is bound to all existing local addresses at the same time and in that case no other socket can be bound to port 21, regardless of which specific IP address it tries to bind to, as 0.0.0.0 conflicts with all existing local IP addresses.
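The default rules are easy to check yourself. The following Python sketch uses plain TCP sockets without any reuse option; try_bind and default_conflicts are illustrative names, and it uses an OS-chosen port on the loopback address instead of port 21, which would require privileges.

```python
import errno
import socket


def try_bind(address, port):
    """Bind a fresh TCP socket; return (socket, None) or (None, errno name)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((address, port))
        return sock, None
    except OSError as exc:
        sock.close()
        return None, errno.errorcode.get(exc.errno, str(exc.errno))


def default_conflicts():
    """Bind 127.0.0.1:P first, then try 127.0.0.1:P and 0.0.0.0:P again."""
    first, _ = try_bind("127.0.0.1", 0)     # let the OS pick a free port P
    port = first.getsockname()[1]
    results = {}
    for address in ("127.0.0.1", "0.0.0.0"):
        sock, err = try_bind(address, port)
        results[address] = err or "OK"
        if sock is not None:
            sock.close()
    first.close()
    return results


if __name__ == "__main__":
    print(default_conflicts())
```

Both attempts should be rejected with EADDRINUSE: the first because the address and port are identical, the second because the wildcard address conflicts with every specific local address.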
Everything said so far is pretty much the same on all major operating systems. Things start to get OS specific when address reuse comes into play. We start with BSD, since as I said above, it is the mother of all socket implementations.
BSD
SO_REUSEADDR
If SO_REUSEADDR is enabled on a socket prior to binding it, the socket can be successfully bound unless there is a conflict with another socket bound to exactly the same combination of source address and port. Now you may wonder how that is any different than before. The keyword is "exactly". SO_REUSEADDR mainly changes the way wildcard addresses ("any IP address") are treated when searching for conflicts.
Without SO_REUSEADDR, binding socketA to 0.0.0.0:21 and then binding socketB to 192.168.0.1:21 will fail (with error EADDRINUSE), since 0.0.0.0 means "any local IP address", thus all local IP addresses are considered in use by this socket and this includes 192.168.0.1, too. With SO_REUSEADDR it will succeed, since 0.0.0.0 and 192.168.0.1 are not exactly the same address, one is a wildcard for all local addresses and the other one is a very specific local address. Note that the statement above is true regardless of the order in which socketA and socketB are bound; without SO_REUSEADDR it will always fail, with SO_REUSEADDR it will always succeed.
To give you a better overview, let's make a table here and list all possible combinations:
SO_REUSEADDR       socketA          socketB         Result
---------------------------------------------------------------------
  ON/OFF       192.168.0.1:21   192.168.0.1:21    Error (EADDRINUSE)
  ON/OFF       192.168.0.1:21      10.0.0.1:21    OK
  ON/OFF          10.0.0.1:21   192.168.0.1:21    OK
   OFF             0.0.0.0:21   192.168.0.1:21    Error (EADDRINUSE)
   OFF         192.168.0.1:21       0.0.0.0:21    Error (EADDRINUSE)
   ON              0.0.0.0:21   192.168.0.1:21    OK
   ON          192.168.0.1:21       0.0.0.0:21    OK
  ON/OFF           0.0.0.0:21       0.0.0.0:21    Error (EADDRINUSE)
The table above assumes that socketA has already been successfully bound to the address given for socketA, then socketB is created, either gets SO_REUSEADDR set or not, and finally is bound to the address given for socketB. Result is the result of the bind operation for socketB. If the first column says ON/OFF, the value of SO_REUSEADDR is irrelevant to the result.
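In code, the "wildcard plus specific address" rows look roughly like the Python sketch below. Note two assumptions: it sets SO_REUSEADDR on both sockets (BSD only needs it on the socket being bound, but Linux also looks at the socket that is already bound), and neither socket is put into listening state. The function name is hypothetical.

```python
import socket


def bind_wildcard_and_specific():
    """Bind 0.0.0.0:P and 127.0.0.1:P at the same time using SO_REUSEADDR."""
    wildcard = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    specific = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        for sock in (wildcard, specific):
            # The option only has an effect if it is set before bind().
            sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        wildcard.bind(("0.0.0.0", 0))
        port = wildcard.getsockname()[1]
        specific.bind(("127.0.0.1", port))  # raises OSError on a conflict
        return True
    except OSError:
        return False
    finally:
        wildcard.close()
        specific.close()


if __name__ == "__main__":
    print(bind_wildcard_and_specific())
```

Removing the two setsockopt() calls should turn the result into False, which corresponds to the OFF rows of the table.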
Okay, SO_REUSEADDR has an effect on wildcard addresses, good to know. Yet that isn't the only effect it has. There is another well known effect which is also the reason why most people use SO_REUSEADDR in server programs in the first place. For the other important use of this option we have to take a deeper look at how the TCP protocol works.
A socket has a send buffer and if a call to the send() function succeeds, it does not mean that the requested data has actually been sent out, it only means the data has been added to the send buffer. For UDP sockets, the data is usually sent pretty soon, if not immediately, but for TCP sockets, there can be a relatively long delay between adding data to the send buffer and having the TCP implementation really send that data. As a result, when you close a TCP socket, there may still be pending data in the send buffer, which has not been sent yet but your code considers it as sent, since the send() call succeeded. If the TCP implementation closed the socket immediately on your request, all of this data would be lost and your code wouldn't even know about that. TCP is said to be a reliable protocol and losing data just like that is not very reliable. That's why a socket that still has data to send will go into a state called TIME_WAIT when you close it. In that state it will wait until all pending data has been successfully sent or until a timeout is hit, in which case the socket is closed forcefully.
The amount of time the kernel will wait before it closes the socket, regardless of whether it still has data in flight or not, is called the Linger Time. The Linger Time is globally configurable on most systems and by default rather long (two minutes is a common value you will find on many systems). It is also configurable per socket using the socket option SO_LINGER, which can be used to make the timeout shorter or longer, and even to disable it completely. Disabling it completely is a very bad idea, though, since closing a TCP socket gracefully is a slightly complex process and involves sending a couple of packets back and forth (as well as resending those packets in case they got lost) and this whole close process is also limited by the Linger Time. If you disable lingering, your socket may not only lose data in flight, it is also always closed forcefully instead of gracefully, which is usually not recommended. The details about how a TCP connection is closed gracefully are beyond the scope of this answer; if you want to learn more about it, I recommend you have a look at this page. And even if you disabled lingering with SO_LINGER, if your process dies without explicitly closing the socket, BSD (and possibly other systems) will linger nonetheless, ignoring what you have configured. This will happen for example if your code just calls exit() (pretty common for tiny, simple server programs) or the process is killed by a signal (which includes the possibility that it simply crashes because of an illegal memory access). So there is nothing you can do to make sure a socket will never linger under all circumstances.
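For reference, this is roughly how the per-socket linger setting is written and read back from Python. It is a sketch that assumes the usual POSIX layout of struct linger (two ints: l_onoff and l_linger); Windows uses two 16 bit values instead, so the format string would have to change there. set_linger and get_linger are illustrative names.

```python
import socket
import struct

LINGER_FORMAT = "ii"    # assumed layout: struct linger { int l_onoff; int l_linger; }


def set_linger(sock, enabled, seconds):
    """Configure SO_LINGER; 'seconds' is the per-socket timeout."""
    value = struct.pack(LINGER_FORMAT, 1 if enabled else 0, seconds)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER, value)


def get_linger(sock):
    """Return the current (enabled, seconds) pair of the socket."""
    size = struct.calcsize(LINGER_FORMAT)
    raw = sock.getsockopt(socket.SOL_SOCKET, socket.SO_LINGER, size)
    onoff, seconds = struct.unpack(LINGER_FORMAT, raw)
    return bool(onoff), seconds


if __name__ == "__main__":
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        set_linger(s, True, 5)
        print(get_linger(s))
```

Enabling the option with a timeout of 0 is the forceful close the paragraph above warns about, so a value like 5 seconds is used here only as an example.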
The question is, how does the system treat a socket in state TIME_WAIT? If SO_REUSEADDR is not set, a socket in state TIME_WAIT is considered to still be bound to the source address and port and any attempt to bind a new socket to the same address and port will fail until the socket has really been closed, which may take as long as the configured Linger Time. So don't expect that you can rebind the source address of a socket immediately after closing it. In most cases this will fail. However, if SO_REUSEADDR is set for the socket you are trying to bind, another socket bound to the same address and port in state TIME_WAIT is simply ignored, after all it's already "half dead", and your socket can bind to exactly the same address without any problem. In that case it plays no role that the other socket may have exactly the same address and port. Note that binding a socket to exactly the same address and port as a dying socket in TIME_WAIT state can have unexpected, and usually undesired, side effects in case the other socket is still "at work", but that is beyond the scope of this answer and fortunately those side effects are rather rare in practice.
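This is the classic "restart the server on the same port" situation. A minimal Python sketch of it follows; make_listener and restart_on_same_port are made-up names, and the sketch assumes that the side closing first (here the server) is the one left with a socket in TIME_WAIT.

```python
import socket


def make_listener(port=0):
    """Create a listening TCP socket on 127.0.0.1 with SO_REUSEADDR set."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)  # before bind()
    sock.bind(("127.0.0.1", port))
    sock.listen(5)
    return sock


def restart_on_same_port():
    """Serve one connection, shut down, and bind the same port again at once."""
    listener = make_listener()
    port = listener.getsockname()[1]
    client = socket.create_connection(("127.0.0.1", port))
    conn, _ = listener.accept()
    conn.close()        # the server side closes first and keeps the dying socket
    client.close()
    listener.close()
    # Without SO_REUSEADDR this bind may fail with EADDRINUSE for a while.
    restarted = make_listener(port)
    restarted.close()
    return port


if __name__ == "__main__":
    print("rebound port", restart_on_same_port())
```

If you drop the setsockopt() call in make_listener, the second bind is the one that would be expected to fail on most systems until the old socket is really gone.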
There is one final thing you should know about SO_REUSEADDR. Everything written above will work as long as the socket you want to bind has address reuse enabled. It is not necessary that the other socket, the one which is already bound or is in a TIME_WAIT state, also had this flag set when it was bound. The code that decides if the bind will succeed or fail only inspects the SO_REUSEADDR flag of the socket fed into the bind() call; for all other sockets inspected, this flag is not even looked at.
SO_REUSEPORT
SO_REUSEPORT is what most people would expect SO_REUSEADDR to be. Basically, SO_REUSEPORT allows you to bind an arbitrary number of sockets to exactly the same source address and port as long as all prior bound sockets also had SO_REUSEPORT set before they were bound. If the first socket that is bound to an address and port does not have SO_REUSEPORT set, no other socket can be bound to exactly the same address and port, regardless of whether this other socket has SO_REUSEPORT set or not, until the first socket releases its binding again. Unlike in case of SO_REUSEADDR, the code handling SO_REUSEPORT will not only verify that the socket currently being bound has SO_REUSEPORT set but it will also verify that the socket with a conflicting address and port had SO_REUSEPORT set when it was bound.
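The "all sockets must have it set" rule can be tried out with the following Python sketch. It assumes a system whose Python build exposes socket.SO_REUSEPORT (BSD, macOS, Linux 3.9 or later); bind_pair is a hypothetical helper.

```python
import errno
import socket


def bind_pair(first_reuseport, second_reuseport):
    """Bind two UDP sockets to the same address and port.

    Returns "OK" or the errno name of the failing bind() call.
    """
    first = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    second = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        if first_reuseport:
            first.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
        if second_reuseport:
            second.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
        first.bind(("127.0.0.1", 0))
        second.bind(first.getsockname())    # exactly the same address and port
        return "OK"
    except OSError as exc:
        return errno.errorcode.get(exc.errno, str(exc.errno))
    finally:
        first.close()
        second.close()


if __name__ == "__main__":
    for flags in ((True, True), (False, True), (True, False)):
        print(flags, bind_pair(*flags))
```

Only the combination where both sockets set the option before binding should report OK; the other two should report EADDRINUSE.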
SO_REUSEPORT does not imply SO_REUSEADDR. This means if a socket did not have SO_REUSEPORT set when it was bound and another socket has SO_REUSEPORT set when it is bound to exactly the same address and port, the bind fails, which is expected, but it also fails if the other socket is already dying and is in TIME_WAIT state. Binding a socket to the same address and port as another socket in TIME_WAIT state requires either that SO_REUSEADDR is set on that socket or that SO_REUSEPORT was set on both sockets prior to binding them. Of course it is allowed to set both SO_REUSEPORT and SO_REUSEADDR on a socket.
There is not much more to say about SO_REUSEPORT other than that it was added later than SO_REUSEADDR, that's why you will not find it in many socket implementations of other systems, which "forked" the BSD code before this option was added, and that there was no way to bind two sockets to exactly the same socket address in BSD prior to this option.
Connect() Returning EADDRINUSE?
Most people know that bind() may fail with the error EADDRINUSE, however, when you start playing around with address reuse, you may run into the strange situation that connect() fails with that error as well. How can this be? How can a remote address, after all that's what connect adds to a socket, be already in use? Connecting multiple sockets to exactly the same remote address has never been a problem before, so what's going wrong here?
As I said at the very top of my reply, a connection is defined by a tuple of five values, remember? And I also said that these five values must be unique, otherwise the system cannot distinguish two connections any longer, right? Well, with address reuse, you can bind two sockets of the same protocol to the same source address and port. That means three of those five values are already the same for these two sockets. If you now try to connect both of these sockets also to the same destination address and port, you would create two connected sockets whose tuples are absolutely identical. This cannot work, at least not for TCP connections (UDP connections are no real connections anyway). If data arrived for either one of the two connections, the system could not tell which connection the data belongs to. At least the destination address or the destination port must differ between the two connections, so that the system has no problem identifying which connection incoming data belongs to.
So if you bind two sockets of the same protocol to the same source address and port and try to connect them both to the same destination address and port, connect() will actually fail with the error EADDRINUSE for the second socket you try to connect, which means that a socket with an identical tuple of five values is already connected.
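Here is a small Python sketch that provokes exactly this situation on the loopback interface. The helper names are made up, and one caveat is my own assumption rather than part of the explanation above: as far as I know, Linux reports the colliding tuple as EADDRNOTAVAIL instead of EADDRINUSE, so the sketch simply returns whatever error name the system gives.

```python
import errno
import socket


def reusable_client(source):
    """TCP socket bound to 'source' with address/port reuse enabled."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    if hasattr(socket, "SO_REUSEPORT"):     # not available everywhere
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
    sock.bind(source)
    return sock


def connect_same_tuple_twice():
    """Connect two sockets with identical source to the same destination."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))
    server.listen(5)
    first = reusable_client(("127.0.0.1", 0))
    second = reusable_client(first.getsockname())   # same source address and port
    try:
        first.connect(server.getsockname())
        try:
            second.connect(server.getsockname())    # identical five-tuple
            return "OK"
        except OSError as exc:
            return errno.errorcode.get(exc.errno, str(exc.errno))
    finally:
        first.close()
        second.close()
        server.close()


if __name__ == "__main__":
    print(connect_same_tuple_twice())
```

Connecting the second socket to a different destination port would make the tuples differ again and the connect() call should then succeed.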
Multicast Addresses
Most people ignore the fact that multicast addresses exist, but they do exist. While unicast addresses are used for one-to-one communication, multicast addresses are used for one-to-many communication. Most people became aware of multicast addresses when they learned about IPv6, but multicast addresses also existed in IPv4, even though this feature was never widely used on the public Internet.
The meaning of SO_REUSEADDR changes for multicast addresses as it allows multiple sockets to be bound to exactly the same combination of source multicast address and port. In other words, for multicast addresses SO_REUSEADDR behaves exactly as SO_REUSEPORT for unicast addresses. Actually, the code treats SO_REUSEADDR and SO_REUSEPORT identically for multicast addresses, that means you could say that SO_REUSEADDR implies SO_REUSEPORT for all multicast addresses and the other way round.
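A short Python sketch of the multicast case, assuming a BSD or Linux style system that allows binding a UDP socket directly to a multicast address (Windows, to my knowledge, does not). The group 224.1.2.3 is the address the test program further below is said to use; the function name is hypothetical.

```python
import socket

GROUP = "224.1.2.3"     # example IPv4 multicast address


def bind_two_to_multicast():
    """Bind two UDP sockets with SO_REUSEADDR to the same multicast address and port."""
    sockets = []
    try:
        port = 0                            # first socket: let the OS choose
        for _ in range(2):
            sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
            sockets.append(sock)
            sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
            sock.bind((GROUP, port))
            port = sock.getsockname()[1]    # second socket reuses this port
        return True
    except OSError:
        return False
    finally:
        for sock in sockets:
            sock.close()


if __name__ == "__main__":
    print(bind_two_to_multicast())
```

The sketch only demonstrates the binding rule. To actually receive traffic for the group, each socket would additionally have to join it with the IP_ADD_MEMBERSHIP option.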
FreeBSD/OpenBSD/NetBSD
All these are rather late forks of the original BSD code, that's why they all three offer the same options as BSD and they also behave the same way as in BSD.
macOS (MacOS X)
At its core, macOS is simply a BSD-style UNIX named "Darwin", based on a rather late fork of the BSD code (BSD 4.3), which was then later on even re-synchronized with the (at that time current) FreeBSD 5 code base for the Mac OS 10.3 release, so that Apple could gain full POSIX compliance (macOS is POSIX certified). Despite having a microkernel at its core ("Mach"), the rest of the kernel ("XNU") is basically just a BSD kernel, and that's why macOS offers the same options as BSD and they also behave the same way as in BSD.
iOS / watchOS / tvOS
iOS is just a macOS fork with a slightly modified and trimmed kernel, somewhat stripped down user space toolset and a slightly different default framework set. watchOS and tvOS are iOS forks, that are stripped down even further (especially watchOS). To my best knowledge they all behave exactly as macOS does.
Linux
Linux < 3.9
Prior to Linux 3.9, only the option SO_REUSEADDR existed. This option behaves generally the same as in BSD with two important exceptions:
- As long as a listening (server) TCP socket is bound to a specific port, the SO_REUSEADDR option is entirely ignored for all sockets targeting that port. Binding a second socket to the same port is only possible if it would also be possible in BSD without having SO_REUSEADDR set. E.g. you cannot bind to a wildcard address and then to a more specific one or the other way round; both are possible in BSD if you set SO_REUSEADDR. What you can do is bind to the same port and two different non-wildcard addresses, as that's always allowed. In this aspect Linux is more restrictive than BSD.
- The second exception is that for client sockets, this option behaves exactly like SO_REUSEPORT in BSD, as long as both had this flag set before they were bound. The reason for allowing that was simply that it is important to be able to bind multiple sockets to exactly the same UDP socket address for various protocols, and as there used to be no SO_REUSEPORT prior to 3.9, the behavior of SO_REUSEADDR was altered accordingly to fill that gap. In that aspect Linux is less restrictive than BSD.
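The first exception can be checked with the Python sketch below (the function name is made up). Going by the description above, I would expect it to report EADDRINUSE on Linux and OK on a BSD system, because the only difference to the earlier wildcard example is that the first socket is listening.

```python
import errno
import socket


def bind_next_to_wildcard_listener():
    """Bind 127.0.0.1:P while a listening socket already holds 0.0.0.0:P."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    other = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        for sock in (listener, other):
            sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        listener.bind(("0.0.0.0", 0))
        listener.listen(5)                  # the listening state is what matters
        port = listener.getsockname()[1]
        other.bind(("127.0.0.1", port))
        return "OK"
    except OSError as exc:
        return errno.errorcode.get(exc.errno, str(exc.errno))
    finally:
        listener.close()
        other.close()


if __name__ == "__main__":
    print(bind_next_to_wildcard_listener())
```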
Linux >= 3.9
Linux 3.9 added the option SO_REUSEPORT to Linux as well. This option behaves exactly like the option in BSD and allows binding to exactly the same address and port number as long as all sockets have this option set prior to binding them.
Yet, there are still two differences to SO_REUSEPORT on other systems:
- To prevent "port hijacking", there is one special limitation: All sockets that want to share the same address and port combination must belong to processes that share the same effective user ID! So one user cannot "steal" ports of another user. This is some special magic to somewhat compensate for the missing SO_EXCLBIND/SO_EXCLUSIVEADDRUSE flags.
- Additionally the kernel performs some "special magic" for SO_REUSEPORT sockets that isn't found in other operating systems: For UDP sockets, it tries to distribute datagrams evenly; for TCP listening sockets, it tries to distribute incoming connect requests (those accepted by calling accept()) evenly across all the sockets that share the same address and port combination. Thus an application can easily open the same port in multiple child processes and then use SO_REUSEPORT to get a very inexpensive load balancing.
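The datagram distribution can be watched with a Python sketch like the following. It assumes Linux 3.9 or later (socket.SO_REUSEPORT must exist) and uses several sockets in one process instead of child processes to keep it short; spread_datagrams and its parameters are illustrative.

```python
import select
import socket


def spread_datagrams(workers=4, senders=32):
    """Count how many datagrams each SO_REUSEPORT socket receives."""
    receivers = []
    address = ("127.0.0.1", 0)
    for _ in range(workers):
        sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
        sock.bind(address)
        address = sock.getsockname()        # later sockets share the first one's port
        sock.setblocking(False)
        receivers.append(sock)

    # Every sender uses a fresh socket and therefore a different source port.
    for _ in range(senders):
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sender:
            sender.sendto(b"x", address)

    counts = [0] * workers
    while True:                             # drain until nothing arrives for 200 ms
        ready, _, _ = select.select(receivers, [], [], 0.2)
        if not ready:
            break
        for sock in ready:
            sock.recvfrom(16)
            counts[receivers.index(sock)] += 1

    for sock in receivers:
        sock.close()
    return counts


if __name__ == "__main__":
    print(spread_datagrams())
```

The individual counts depend on how the kernel assigns the senders to the sockets, so they will rarely be perfectly even for such a small sample; only their sum is predictable.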
Android
Even though the whole Android system is somewhat different from most Linux distributions, at its core works a slightly modified Linux kernel, thus everything that applies to Linux should apply to Android as well.
Windows
Windows only knows the SO_REUSEADDR option, there is no SO_REUSEPORT. Setting SO_REUSEADDR on a socket in Windows behaves like setting SO_REUSEPORT and SO_REUSEADDR on a socket in BSD, with one exception: A socket with SO_REUSEADDR can always bind to exactly the same source address and port as an already bound socket, even if the other socket did not have this option set when it was bound. This behavior is somewhat dangerous because it allows an application "to steal" the connected port of another application. Needless to say, this can have major security implications. Microsoft realized that this might be a problem and thus added another socket option SO_EXCLUSIVEADDRUSE. Setting SO_EXCLUSIVEADDRUSE on a socket makes sure that if the binding succeeds, the combination of source address and port is owned exclusively by this socket and no other socket can bind to them, not even if it has SO_REUSEADDR set.
For even more details on how the flags SO_REUSEADDR and SO_EXCLUSIVEADDRUSE work on Windows and how they influence binding/re-binding, Microsoft kindly provided a table similar to my table near the top of this reply. Just visit this page and scroll down a bit. Actually there are three tables: the first one shows the old behavior (prior to Windows 2003), the second one shows the behavior of Windows 2003 and up, and the third one shows how the behavior changes in Windows 2003 and later if the bind() calls are made by different users.
Solaris
Solaris is the successor of SunOS. SunOS was originally based on a fork of BSD, SunOS 5 and later was based on a fork of SVR4, however SVR4 is a merge of BSD, System V, and Xenix, so up to some degree Solaris is also a BSD fork, and a rather early one. As a result Solaris only knows SO_REUSEADDR, there is no SO_REUSEPORT. The SO_REUSEADDR behaves pretty much the same as it does in BSD. As far as I know there is no way to get the same behavior as SO_REUSEPORT in Solaris, that means it is not possible to bind two sockets to exactly the same address and port.
Similar to Windows, Solaris has an option to give a socket an exclusive binding. This option is named SO_EXCLBIND. If this option is set on a socket prior to binding it, setting SO_REUSEADDR on another socket has no effect if the two sockets are tested for an address conflict. E.g. if socketA is bound to a wildcard address and socketB has SO_REUSEADDR enabled and is bound to a non-wildcard address and the same port as socketA, this bind will normally succeed, unless socketA had SO_EXCLBIND enabled, in which case it will fail regardless of the SO_REUSEADDR flag of socketB.
Other Systems
In case your system is not listed above, I wrote a little test program that you can use to find out how your system handles these two options. Also if you think my results are wrong, please first run that program before posting any comments and possibly making false claims.
All that the code requires to build is a bit of POSIX API (for the network parts) and a C99 compiler (actually most non-C99 compilers will work as well as long as they offer inttypes.h and stdbool.h; e.g. gcc supported both long before offering full C99 support).
All that the program needs to run is that at least one interface in your system (other than the local interface) has an IP address assigned and that a default route is set which uses that interface. The program will gather that IP address and use it as the second "specific address".
It tests all possible combinations you can think of:
- TCP and UDP protocol
- Normal sockets, listen (server) sockets, multicast sockets
- SO_REUSEADDR set on socket1, socket2, or both sockets
- SO_REUSEPORT set on socket1, socket2, or both sockets
- All address combinations you can make out of 0.0.0.0 (wildcard), 127.0.0.1 (specific address), and the second specific address found at your primary interface (for multicast it's just 224.1.2.3 in all tests)
and prints the results in a nice table. It will also work on systems that don't know SO_REUSEPORT, in which case this option is simply not tested.
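To give an idea of what such a probe looks like, here is a much smaller Python sketch of the same approach. It is not the C test program mentioned above: it only covers non-listening TCP and UDP sockets, sets the chosen option on both sockets, and uses just the wildcard and loopback addresses. The names probe and run_probe are made up.

```python
import errno
import itertools
import socket

ADDRESSES = ("0.0.0.0", "127.0.0.1")


def probe(kind, option, addr_a, addr_b):
    """Bind socket A, then socket B to the same port; return "OK" or an errno name."""
    a = socket.socket(socket.AF_INET, kind)
    b = socket.socket(socket.AF_INET, kind)
    try:
        if option is not None:
            a.setsockopt(socket.SOL_SOCKET, option, 1)
            b.setsockopt(socket.SOL_SOCKET, option, 1)
        a.bind((addr_a, 0))
        b.bind((addr_b, a.getsockname()[1]))
        return "OK"
    except OSError as exc:
        return errno.errorcode.get(exc.errno, str(exc.errno))
    finally:
        a.close()
        b.close()


def run_probe():
    """Return rows of (protocol, option, address A, address B, result)."""
    options = {"none": None, "SO_REUSEADDR": socket.SO_REUSEADDR}
    if hasattr(socket, "SO_REUSEPORT"):     # skipped where the option is unknown
        options["SO_REUSEPORT"] = socket.SO_REUSEPORT
    kinds = {"TCP": socket.SOCK_STREAM, "UDP": socket.SOCK_DGRAM}
    rows = []
    for (proto, kind), (name, opt) in itertools.product(kinds.items(), options.items()):
        for addr_a, addr_b in itertools.product(ADDRESSES, repeat=2):
            rows.append((proto, name, addr_a, addr_b, probe(kind, opt, addr_a, addr_b)))
    return rows


if __name__ == "__main__":
    for row in run_probe():
        print("%-4s %-13s %-10s %-10s %s" % row)
```

Running it on two different systems and comparing the printed tables is a quick way to spot where their behavior differs.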
What the program cannot easily test is how SO_REUSEADDR acts on sockets in TIME_WAIT state as it's very tricky to force and keep a socket in that state. Fortunately most operating systems seem to simply behave like BSD here and most of the time programmers can simply ignore the existence of that state.
Here's the code (I cannot include it here, answers have a size limit and the code would push this reply over the limit).