C#: A simple proof that a GUID is not unique
Notice: this page reproduces a popular StackOverFlow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverFlow
Original question: http://stackoverflow.com/questions/1705008/
Simple proof that GUID is not unique
Asked by Kai
I'd like to prove that a GUID is not unique in a simple test program. I expected the following code to run for hours, but it's not working. How can I make it work?
BigInteger begin = new BigInteger((long)0);
BigInteger end = new BigInteger("340282366920938463463374607431768211456",10); //2^128
for(begin; begin<end; begin++)
Console.WriteLine(System.Guid.NewGuid().ToString());
I'm using C#.
Accepted answer by ligos
Kai, I have provided a program that will do what you want using threads. It is licensed under the following terms: you must pay me $0.0001 per hour per CPU core you run it on. Fees are payable at the end of each calendar month. Please contact me for my paypal account details at your earliest convenience.
using System;
using System.Collections.Generic;
using System.Linq;

namespace GuidCollisionDetector
{
    class Program
    {
        static void Main(string[] args)
        {
            //var reserveSomeRam = new byte[1024 * 1024 * 100];     // This indeed has no effect.

            Console.WriteLine("{0:u} - Building a bigHeapOGuids.", DateTime.Now);
            // Fill up memory with guids.
            var bigHeapOGuids = new HashSet<Guid>();
            try
            {
                do
                {
                    bigHeapOGuids.Add(Guid.NewGuid());
                } while (true);
            }
            catch (OutOfMemoryException)
            {
                // Release the ram we allocated up front.
                // Actually, these are pointless too.
                //GC.KeepAlive(reserveSomeRam);
                //GC.Collect();
            }
            Console.WriteLine("{0:u} - Built bigHeapOGuids, contains {1} of them.", DateTime.Now, bigHeapOGuids.LongCount());

            // Spool up some threads to keep checking if there's a match.
            // Keep running until the heat death of the universe.
            for (long k = 0; k < Int64.MaxValue; k++)
            {
                for (long j = 0; j < Int64.MaxValue; j++)
                {
                    Console.WriteLine("{0:u} - Looking for collisions with {1} thread(s)....", DateTime.Now, Environment.ProcessorCount);
                    System.Threading.Tasks.Parallel.For(0, Int32.MaxValue, (i) =>
                    {
                        if (bigHeapOGuids.Contains(Guid.NewGuid()))
                            throw new ApplicationException("Guids collided! Oh my gosh!");
                    }
                    );
                    Console.WriteLine("{0:u} - That was another {1} attempts without a collision.", DateTime.Now, ((long)Int32.MaxValue) * Environment.ProcessorCount);
                }
            }
            Console.WriteLine("Umm... why hasn't the universe ended yet?");
        }
    }
}
PS: I wanted to try out the Parallel extensions library. That was easy.
And using OutOfMemoryException as control flow just feels wrong.
EDIT
Well, it seems this still attracts votes. So I've fixed the GC.KeepAlive() issue. And changed it to run with C# 4.
And to clarify my support terms: support is only available on 28/Feb/2010. Please use a time machine to make support requests on that day only.
EDIT 2
As always, the GC does a better job than I do at managing memory; any previous attempts at doing it myself were doomed to failure.
Answer by Graviton
Any two GUIDs are very likely unique (not equal).
See this SO entry, and from Wikipedia:
While each generated GUID is not guaranteed to be unique, the total number of unique keys (2^128 or 3.4×10^38) is so large that the probability of the same number being generated twice is very small. For example, consider the observable universe, which contains about 5×10^22 stars; every star could then have 6.8×10^15 universally unique GUIDs.
So you would probably have to wait many billions of years more, and hope that you hit one before the universe as we know it comes to an end.
Answer by Nathan Taylor
for(begin; begin<end; begin)
    Console.WriteLine(System.Guid.NewGuid().ToString());
You aren't incrementing begin, so the condition begin < end is always true.
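A minimal sketch of a loop whose counter does advance, assuming System.Numerics.BigInteger. The question does not say which BigInteger type it uses, and its string-and-radix constructor suggests a different library, so treat the type as an assumption. The names CountingLoop and GenerateGuids and the limit parameter are hypothetical, added so the loop can be tried with a small bound rather than 2^128.

```csharp
using System;
using System.Numerics;

public static class CountingLoop
{
    // Generates one GUID per iteration and returns how many iterations ran.
    public static BigInteger GenerateGuids(BigInteger limit, bool print)
    {
        BigInteger count = BigInteger.Zero;
        // i++ is what advances the loop; without it, i < limit never changes.
        for (BigInteger i = BigInteger.Zero; i < limit; i++)
        {
            Guid g = Guid.NewGuid();
            if (print) Console.WriteLine(g.ToString());
            count++;
        }
        return count;
    }

    public static void Main()
    {
        // 2^128 is the bound from the question; a loop to it will not finish in practice.
        BigInteger end = BigInteger.Pow(2, 128);
        Console.WriteLine("Full bound: {0}", end);

        // Try the loop with a small bound instead.
        Console.WriteLine("Iterations: {0}", GenerateGuids(1000, false));
    }
}
```

Passing BigInteger.Pow(2, 128) as the limit reproduces the loop from the question, with the running time the other answers describe.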
Answer by rjmunro
This will run for a lot more than hours. Assuming it loops at 1 GHz (which it won't - it will be a lot slower than that), it will run for 10790283070806014188970 years. Which is roughly 800 billion times longer than the age of the universe.
Assuming Moore's law holds, it would be a lot quicker to not run this program, wait several hundred years and run it on a computer that is billions of times faster. In fact, any program that takes longer to run than it takes CPU speeds to double (about 18 months) will complete sooner if you wait until the CPU speeds have increased and buy a new CPU before running it (unless you write it so that it can be suspended and resumed on new hardware).
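The running-time estimate can be reproduced with integer arithmetic. This is a rough sketch: the 365-day year and the 13.8-billion-year age of the universe are assumptions not stated in the answer, and RuntimeEstimate and YearsToFinish are hypothetical names.

```csharp
using System;
using System.Numerics;

public static class RuntimeEstimate
{
    // Whole years needed to perform `iterations` steps at `perSecond` steps
    // per second, using a 365-day year (an assumption).
    public static BigInteger YearsToFinish(BigInteger iterations, BigInteger perSecond)
    {
        BigInteger secondsPerYear = 365L * 24 * 60 * 60;
        return iterations / perSecond / secondsPerYear;
    }

    public static void Main()
    {
        // 2^128 iterations at one iteration per nanosecond (1 GHz).
        BigInteger years = YearsToFinish(BigInteger.Pow(2, 128), 1000000000);
        Console.WriteLine("Years at 1 GHz: {0}", years);

        // Assumed age of the universe, in years.
        double ageOfUniverseYears = 13.8e9;
        Console.WriteLine("Multiples of the universe's age: {0:E2}",
            (double)years / ageOfUniverseYears);
    }
}
```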
Answer by tylerl
A GUID is theoretically non-unique. Here's your proof:
- A GUID is a 128-bit number
- You cannot generate 2^128 + 1 or more GUIDs without re-using old GUIDs
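The same pigeonhole argument can be watched in miniature. The sketch below assumes a hypothetical 8-bit identifier (256 possible values) standing in for the 128-bit GUID; PigeonholeDemo and DrawsUntilRepeat are illustrative names. With only 256 values, draw number 257 at the latest must repeat an earlier one.

```csharp
using System;
using System.Collections.Generic;

public static class PigeonholeDemo
{
    // Draws random 8-bit IDs until one repeats and returns the number of
    // draws made, counting the repeating draw itself.
    public static int DrawsUntilRepeat(Random rng)
    {
        var seen = new HashSet<byte>();
        while (true)
        {
            byte id = (byte)rng.Next(256);
            // Add returns false when the value is already in the set.
            if (!seen.Add(id)) return seen.Count + 1;
        }
    }

    public static void Main()
    {
        int draws = DrawsUntilRepeat(new Random());
        Console.WriteLine("First repeat after {0} draws (never more than 257).", draws);
    }
}
```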
However, if the entire power output of the sun were directed at performing this task, it would go cold long before it finished.
GUIDs can be generated using a number of different tactics, some of which take special measures to guarantee that a given machine will not generate the same GUID twice. Finding collisions in a particular algorithm would show that your particular method for generating GUIDs is bad, but would not prove anything about GUIDs in general.
Answer by MZB
Presumably you have reason to believe that the algorithm for producing Guids is not producing truly random numbers, but is in fact cycling with a period << 2^128.
For example, the RFC 4122 methods used to derive GUIDs fix the values of some bits.
Proof of cycling is going to depend upon the possible size of the period.
For small periods, one approach might be a hash table mapping hash(GUID) -> GUID: when a new GUID lands on an occupied slot, terminate if the two GUIDs match, and otherwise replace the stored one. Consider also only doing the replacement a random fraction of the time.
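One way to read the hash-table suggestion, as a rough sketch and not a definitive method: keep a fixed-size array of slots indexed by hash(GUID), stop when a new GUID equals the one already stored in its slot, and otherwise overwrite the slot. CycleProbe, FindRepeat, the slot count, and the toy generator with a deliberately short period are all hypothetical, chosen only for illustration. The variant that replaces only a random fraction of the time is not shown.

```csharp
using System;

public static class CycleProbe
{
    // Returns the 1-based index of the first draw that equals the GUID
    // currently stored in its slot, or -1 if none is seen within maxDraws.
    public static long FindRepeat(Func<Guid> next, int slots, long maxDraws)
    {
        var table = new Guid[slots];
        var used = new bool[slots];
        for (long n = 1; n <= maxDraws; n++)
        {
            Guid g = next();
            int slot = (int)(unchecked((uint)g.GetHashCode()) % (uint)slots);
            if (used[slot] && table[slot] == g) return n; // same GUID seen again
            table[slot] = g;      // empty slot or a different GUID: replace
            used[slot] = true;
        }
        return -1;
    }

    // Toy generator that cycles through `period` distinct GUIDs,
    // standing in for a flawed algorithm with a short period.
    public static Func<Guid> CyclingGenerator(int period)
    {
        int i = 0;
        return () =>
        {
            var bytes = new byte[16];
            BitConverter.GetBytes(i % period).CopyTo(bytes, 0);
            i++;
            return new Guid(bytes);
        };
    }

    public static void Main()
    {
        long hit = FindRepeat(CyclingGenerator(1000), 1 << 16, 1000000);
        Console.WriteLine("Toy generator: repeat detected at draw {0}", hit);

        long none = FindRepeat(Guid.NewGuid, 1 << 16, 1000000);
        Console.WriteLine("Guid.NewGuid: result {0} (-1 means no repeat seen)", none);
    }
}
```

Because a slot remembers only the most recent GUID that hashed to it, a repeat can be missed when another GUID overwrote the slot in between. That fits the answer's point that such a method only gives some probability of finding a cycle.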
Ultimately, if the maximum period between collisions is large enough (and isn't known in advance), any method is only going to yield a probability that the collision would be found if it existed.
Note that if the method of generating Guids is clock based (see the RFC), then it may not be possible to determine if collisions exist because either (a) you won't be able to wait long enough for the clock to wrap round, or (b) you can't request enough Guids within a clock tick to force a collision.
Alternatively you might be able to show a statistical relationship between the bits in the Guid, or a correlation of bits between Guids. Such a relationship might make it highly probable that the algorithm is flawed without necessarily being able to find an actual collision.
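A very simple form of such a statistical check is to count how often each of the 128 bit positions is set across many samples. The sketch below is only a starting point, and BitBias, CountSetBits and ConstantBits are hypothetical names. Positions that never vary reduce the effective key space, but on their own they are not evidence of a flaw: fixed fields such as the ones the RFC prescribes show up the same way.

```csharp
using System;

public static class BitBias
{
    // For each of the 128 bit positions, counts how many sampled GUIDs
    // have that bit set (positions follow the Guid.ToByteArray() layout).
    public static int[] CountSetBits(Func<Guid> next, int samples)
    {
        var counts = new int[128];
        for (int s = 0; s < samples; s++)
        {
            byte[] bytes = next().ToByteArray();
            for (int bit = 0; bit < 128; bit++)
            {
                if ((bytes[bit / 8] & (1 << (bit % 8))) != 0) counts[bit]++;
            }
        }
        return counts;
    }

    // Number of positions that were 0 in every sample or 1 in every sample.
    public static int ConstantBits(int[] counts, int samples)
    {
        int constant = 0;
        foreach (int c in counts)
        {
            if (c == 0 || c == samples) constant++;
        }
        return constant;
    }

    public static void Main()
    {
        const int samples = 100000;
        int[] counts = CountSetBits(Guid.NewGuid, samples);
        Console.WriteLine("Bit positions that never varied: {0}", ConstantBits(counts, samples));
        for (int bit = 0; bit < 128; bit++)
        {
            Console.WriteLine("bit {0,3}: set in {1:P1} of samples", bit, (double)counts[bit] / samples);
        }
    }
}
```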
Of course, if you just want to prove that Guids can collide, then a mathematical proof, not a program, is the answer.
Answer by RCIX
Have you tried begin = begin + new BigInteger((long)1) in place of begin++?
Answer by jason
Of course GUIDs can collide. Since GUIDs are 128 bits, just generate 2^128 + 1 of them and by the pigeonhole principle there must be a collision.
But when we say that a GUID is unique, what we really mean is that the key space is so large that it is practically impossible to accidentally generate the same GUID twice (assuming that we are generating GUIDs randomly).
If you generate a sequence of n GUIDs randomly, then the probability of at least one collision is approximately p(n) = 1 - exp(-n^2 / (2 * 2^128)) (this is the birthday problem with the number of possible birthdays being 2^128).
n      p(n)
2^30   1.69e-21
2^40   1.78e-15
2^50   1.86e-09
2^60   1.95e-03
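The birthday approximation is easy to evaluate directly. A sketch, with hypothetical names BirthdayBound and CollisionProbability: the one subtlety is that for very small exponents 1 - exp(-x) rounds to zero in double precision, so the code switches to the first two Taylor terms there.

```csharp
using System;

public static class BirthdayBound
{
    // Approximate probability of at least one collision among n random
    // 128-bit values: p(n) = 1 - exp(-n^2 / (2 * 2^128)).
    public static double CollisionProbability(double n)
    {
        double x = n * n / (2.0 * Math.Pow(2, 128));
        // For tiny x, 1 - exp(-x) rounds to 0 in double precision,
        // so use the Taylor terms x - x^2/2 instead.
        return x < 1e-5 ? x - x * x / 2 : 1 - Math.Exp(-x);
    }

    public static void Main()
    {
        Console.WriteLine("n      p(n)");
        foreach (int k in new[] { 30, 40, 50, 60 })
        {
            Console.WriteLine("2^{0}   {1:E2}", k, CollisionProbability(Math.Pow(2, k)));
        }
    }
}
```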
To make these numbers concrete, 2^60 = 1.15e+18. So, if you generate one billion GUIDs per second, it will take you 36 years to generate 2^60 random GUIDs, and even then the probability that you have a collision is still 1.95e-03. You're more likely to be murdered at some point in your life (4.76e-03) than you are to find a collision over the next 36 years. Good luck.
Answer by ctacke
If you're worried about uniqueness you can always purchase new GUIDs so you can throw away your old ones. I'll put some up on eBay if you'd like.
Answer by Steve314
Counting to 2^128 - ambitious.
Let's imagine that we can count 2^32 IDs per second per machine - not that ambitious, since it's not even 4.3 billion per second. Let's dedicate 2^32 machines to that task. Furthermore, let's get 2^32 civilisations to each dedicate the same resources to the task.
So far, we can count 2^96 IDs per second, meaning we will be counting for 2^32 seconds (a little over 136 years).
Now, all we need is to get 4,294,967,296 civilisations to each dedicate 4,294,967,296 machines, each machine capable of counting 4,294,967,296 IDs per second, purely to this task for the next 136 years or so - I suggest we get started on this essential task right now ;-)
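The arithmetic behind the 136 years can be checked in a few lines. A sketch, where CountingEffort and SecondsNeeded are hypothetical names and the 365.25-day year is an assumption:

```csharp
using System;
using System.Numerics;

public static class CountingEffort
{
    // Seconds needed to count `total` IDs at a combined rate of `perSecond`.
    public static BigInteger SecondsNeeded(BigInteger total, BigInteger perSecond)
    {
        return total / perSecond;
    }

    public static void Main()
    {
        BigInteger twoTo32 = BigInteger.Pow(2, 32);
        // civilisations * machines * IDs per second per machine = 2^96 per second
        BigInteger combinedRate = twoTo32 * twoTo32 * twoTo32;
        BigInteger seconds = SecondsNeeded(BigInteger.Pow(2, 128), combinedRate);

        // Assumes a 365.25-day year.
        double years = (double)seconds / (365.25 * 24 * 60 * 60);
        Console.WriteLine("{0} seconds, about {1:F1} years", seconds, years);
    }
}
```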