C#: a faster replacement for Dictionary<TKey, TValue>
Original question: http://stackoverflow.com/questions/1869452/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverFlow
A faster replacement to the Dictionary<TKey, TValue>
Asked by Alon Gubkin
I need a fast replacement for System.Collections.Generic.Dictionary<TKey, TValue>. My application should be really fast. So, the replacement should support:
- Generics
- Add
- Get
- Contains
... and that's it. I don't need any support in LINQ or anything. And it should be fast.
A simple piece of code like this:
Stopwatch stopWatch = Stopwatch.StartNew();
Dictionary<string, string> dictionary = new Dictionary<string, string>();
dictionary.Add("fieldName", "fieldValue");
dictionary.Add("Title", "fieldVaaaaaaaaaaaaaaaaalue");
Console.WriteLine(stopWatch.Elapsed);
... prints 00:00:00.0001274, which is a lot of time for me, because my application is doing many other things, some of them from old slow libraries that I must use and that are outside my control.
Any ideas on how to implement a faster one?
Thank you.
Accepted answer by Jon Skeet
Chances are you're seeing JIT compilation. On my box, I see:
00:00:00.0000360
00:00:00.0000060
when I run it twice in quick succession within the same process - and not in the debugger. (Make sure you're not running it in the debugger, or it's a pointless test.)
Now, measuring any time that tiny is generally a bad idea. You'd need to iterate millions of times to get a better idea of how long it's taking.
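A rough sketch of that kind of measurement, assuming one untimed warm-up call is enough to absorb JIT compilation. The class name, method names, and the iteration count are illustrative and not from the answer.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

public static class DictionaryTiming
{
    // Runs the two Add calls from the question once and returns the entry count.
    public static int AddTwoEntries()
    {
        var dictionary = new Dictionary<string, string>();
        dictionary.Add("fieldName", "fieldValue");
        dictionary.Add("Title", "fieldVaaaaaaaaaaaaaaaaalue");
        return dictionary.Count;
    }

    // Times `iterations` runs after one untimed warm-up call.
    public static TimeSpan Measure(int iterations)
    {
        AddTwoEntries(); // warm-up, not timed, so JIT cost is excluded
        long total = 0;
        Stopwatch watch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            total += AddTwoEntries();
        }
        watch.Stop();
        // Using the result keeps the loop body from being trivially removable.
        if (total != 2L * iterations)
        {
            throw new InvalidOperationException("unexpected entry count");
        }
        return watch.Elapsed;
    }

    public static void Main()
    {
        const int iterations = 1000000; // assumed iteration count
        TimeSpan elapsed = Measure(iterations);
        Console.WriteLine("Total: {0}", elapsed);
        Console.WriteLine("Per iteration: {0:F1} ns",
            elapsed.TotalMilliseconds * 1000000.0 / iterations);
    }
}
```

Run it in a release build outside the debugger; the per-iteration figure is only meaningful as an average over the whole loop.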
Do you have good reason to believe it's actually slowing down your code, or are you basing it all on your original timing?
I doubt that you'll find anything significantly faster than Dictionary<TKey, TValue>, and I'd be very surprised to find that it's the bottleneck.
EDIT: I've just benchmarked adding a million elements to a Dictionary<TKey, TValue> where all the keys were existing objects (strings in an array), reusing the same value (as it's irrelevant) and specifying a capacity of a million on construction, and it took about 0.15s on my two-year-old laptop.
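The benchmark described in the EDIT might look roughly like the sketch below. This is a reconstruction under assumptions, not the code the answer's author ran: the key format, the helper names, and the small warm-up run are all made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

public static class MillionAdds
{
    // Builds `count` distinct string keys up front so key creation is not timed.
    public static string[] BuildKeys(int count)
    {
        var keys = new string[count];
        for (int i = 0; i < count; i++)
        {
            keys[i] = "key" + i;
        }
        return keys;
    }

    // Adds every key to a dictionary pre-sized to keys.Length, reusing one value.
    public static Dictionary<string, string> Fill(string[] keys)
    {
        var dictionary = new Dictionary<string, string>(keys.Length);
        foreach (string key in keys)
        {
            dictionary.Add(key, "value");
        }
        return dictionary;
    }

    public static void Main()
    {
        string[] keys = BuildKeys(1000000);
        Fill(BuildKeys(1000)); // small warm-up run so JIT compilation is not timed
        Stopwatch watch = Stopwatch.StartNew();
        Dictionary<string, string> dictionary = Fill(keys);
        watch.Stop();
        Console.WriteLine("{0} entries added in {1}", dictionary.Count, watch.Elapsed);
    }
}
```

Passing the expected size to the constructor means the dictionary should not need to grow and rehash while the keys are being added.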
Is that really likely to be a bottleneck for you, given that you've already said you're using some "old slow libraries" elsewhere in your app? Bear in mind that the slower those other libraries are, the less impact an improved collection class will have. If the dictionary changes are only accounting for 1% of your overall application time, then even if we could provide an instantaneous dictionary, you'd only speed up your app by 1%.
As ever, get a profiler - it'll give you a much better idea of where your time is going.
Answer by Reed Copsey
I agree with Jon Skeet's supposition that this is most likely JIT compilation.
That being said, I wanted to add some other information here:
Most of the speed issues relating to using Dictionary<T,U> are not related to the implementation of Dictionary. Dictionary<T,U> is VERY fast, out of the box. It would be difficult to beat it.
Speed issues relating to Dictionary instances are almost always actually hash code implementation issues. If you're having speed issues when using Dictionary<MyCustomClass,MyValue>, revisit the GetHashCode() implementation you have defined on MyCustomClass. This is even more critical if you're using a custom struct as your key.
In order to get good performance out of Dictionary, GetHashCode() should be:
- Fast
- Able to provide hash codes that generate few conflicts. Unique instances should, when possible, generate unique hash values.
If you get that right, I think you'll be very happy with the default Dictionary implementation.
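As a minimal sketch of those two properties on a custom struct key: GridPoint below is a hypothetical type invented for illustration, and the multiply-and-xor mix is just one common way to combine two fields, not something the answer prescribes.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical key type, used only for illustration.
public struct GridPoint : IEquatable<GridPoint>
{
    public readonly int X;
    public readonly int Y;

    public GridPoint(int x, int y)
    {
        X = x;
        Y = y;
    }

    // A strongly typed Equals lets the default comparer avoid boxing the struct.
    public bool Equals(GridPoint other)
    {
        return X == other.X && Y == other.Y;
    }

    public override bool Equals(object obj)
    {
        return obj is GridPoint && Equals((GridPoint)obj);
    }

    // Cheap to compute, and mixes both fields so (1, 2) and (2, 1) hash differently.
    public override int GetHashCode()
    {
        unchecked
        {
            return (X * 397) ^ Y;
        }
    }
}

public static class GridPointDemo
{
    public static void Main()
    {
        var labels = new Dictionary<GridPoint, string>();
        labels.Add(new GridPoint(1, 2), "first");
        labels.Add(new GridPoint(2, 1), "second");
        Console.WriteLine(labels[new GridPoint(2, 1)]); // prints "second"
    }
}
```

Without these overrides a struct falls back to the inherited ValueType implementations, which are generally slower, so a custom struct key is where this matters most.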
Answer by Cade Roux
If you really need better performance, you're going to have to give up something major - like generics, dynamic memory allocation, etc. All those features sacrifice some performance.
I would avoid using Contains if at all possible and look at TryGetValue, etc.
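To make the difference concrete, here is a small sketch comparing the two lookup styles. The helper names are made up; ContainsKey, the indexer, and TryGetValue are the standard Dictionary<TKey, TValue> members.

```csharp
using System;
using System.Collections.Generic;

public static class LookupDemo
{
    // Two lookups: ContainsKey hashes the key, then the indexer hashes it again.
    public static string GetWithContains(Dictionary<string, string> fields, string key)
    {
        if (fields.ContainsKey(key))
        {
            return fields[key];
        }
        return null;
    }

    // One lookup: TryGetValue reports presence and returns the value together.
    public static string GetWithTryGetValue(Dictionary<string, string> fields, string key)
    {
        string value;
        if (fields.TryGetValue(key, out value))
        {
            return value;
        }
        return null;
    }

    public static void Main()
    {
        var fields = new Dictionary<string, string>();
        fields.Add("fieldName", "fieldValue");
        Console.WriteLine(GetWithTryGetValue(fields, "fieldName")); // prints "fieldValue"
        Console.WriteLine(GetWithTryGetValue(fields, "missing") == null); // prints "True"
    }
}
```

Both helpers return the same results; the second simply does the work in a single hash lookup instead of two.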
Answer by Michael
Odds are you are not going to find anything much faster than Dictionary. I would just use Dictionary. Then, if you see you are not meeting your perf goals and a profiler indicates that adding to or removing from the Dictionary is your bottleneck, you can consider replacing it with a more targeted class.
Note that features such as LINQ do not incur any performance loss if you do not use them.
Answer by Justin Niessner
Don't forget, you're timing the Dictionary constructor in that code as well. I did a test, moving the call to the constructor out of the measurement, and looped 10 times. Here's my test code:
for (int i = 0; i < 10; i++)
{
    Dictionary<string, string> test = new Dictionary<string, string>();
    System.Diagnostics.Stopwatch watch = System.Diagnostics.Stopwatch.StartNew();
    test.Add("fieldName", "fieldValue");
    test.Add("Title", "fieldavlkajlkdjflkjalkjslkdjfiajwelkrjelrkjavoijl");
    Console.WriteLine(watch.Elapsed);
}
Console.ReadKey();
Below are the results:
00:00:00.0000607
00:00:00.0000025
00:00:00.0000015
00:00:00.0000015
00:00:00.0000016
00:00:00.0000017
00:00:00.0000016
00:00:00.0000016
00:00:00.0000016
00:00:00.0000015
I'm not sure how much faster you could get than that...
Update
Looks like this mirrors Jon Skeet's results too... JIT.
Answer by Paul Sasik
Could you use a List and define an enum such that, for example, fieldName = 0, Title = 1, and use each property's unique index as a lookup index into the list? That would be the fastest solution, though the least flexible since you'd be tied to an enum.
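One way this idea could look in practice is sketched below. It uses a fixed-size array rather than the List the answer mentions, since the set of fields is known up front; the Field enum and the FieldTable class are hypothetical names introduced for illustration.

```csharp
using System;

// Hypothetical field set; the names come from the question's sample keys.
public enum Field
{
    FieldName = 0,
    Title = 1
}

public sealed class FieldTable
{
    // One slot per enum member, indexed by the member's numeric value.
    private readonly string[] values = new string[Enum.GetValues(typeof(Field)).Length];

    public void Set(Field field, string value)
    {
        values[(int)field] = value;
    }

    public string Get(Field field)
    {
        return values[(int)field];
    }

    // A slot that was never set still holds null.
    public bool Contains(Field field)
    {
        return values[(int)field] != null;
    }
}

public static class FieldTableDemo
{
    public static void Main()
    {
        var table = new FieldTable();
        table.Set(Field.FieldName, "fieldValue");
        Console.WriteLine(table.Get(Field.FieldName)); // prints "fieldValue"
        Console.WriteLine(table.Contains(Field.Title)); // prints "False"
    }
}
```

The lookup is a plain array index with no hashing at all, but it only works when the enum values are contiguous and start at zero, and every new key means changing the enum.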
Answer by Nate Zaugg
How many items do you plan to add to the dictionary? While Dictionary/Hashtable is usually the fastest, depending on what you are doing there may be something faster (that is, better suited) than a hash table (the underlying structure of a Dictionary). Depending on the usage, a SortedList could be faster if combined with some kind of skip list, or even a self-balancing tree or a trie. This is especially true if you wish to return a range of values rather than a single value.
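To illustrate the range case only, here is a sketch that uses SortedList<int, string> with a hand-written binary search over its sorted keys. The answer does not give an implementation, so the helper names and the inclusive-bounds convention are assumptions.

```csharp
using System;
using System.Collections.Generic;

public static class RangeLookup
{
    // Index of the first key that is >= target, found by binary search.
    public static int LowerBound(IList<int> sortedKeys, int target)
    {
        int lo = 0;
        int hi = sortedKeys.Count;
        while (lo < hi)
        {
            int mid = lo + (hi - lo) / 2;
            if (sortedKeys[mid] < target)
            {
                lo = mid + 1;
            }
            else
            {
                hi = mid;
            }
        }
        return lo;
    }

    // Returns the values whose keys fall within [from, to], both inclusive.
    public static List<string> Range(SortedList<int, string> list, int from, int to)
    {
        var result = new List<string>();
        IList<int> keys = list.Keys;
        IList<string> values = list.Values;
        for (int i = LowerBound(keys, from); i < keys.Count && keys[i] <= to; i++)
        {
            result.Add(values[i]);
        }
        return result;
    }

    public static void Main()
    {
        var list = new SortedList<int, string>();
        list.Add(10, "ten");
        list.Add(20, "twenty");
        list.Add(30, "thirty");
        list.Add(40, "forty");
        Console.WriteLine(string.Join(", ", Range(list, 15, 30))); // prints "twenty, thirty"
    }
}
```

A hash table has no key ordering, so answering the same query with a Dictionary would mean scanning every entry.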
A Hashtable is a good fit when:
- You know how many items you intend to store before population of the table begins. Dynamic resizing will be very painful!
- You have a good hash algorithm with even distribution, which .NET has
- There is a good mechanism in place for collision resolution, which .NET has
- You are looking for a single value
- You can guarantee that all values will be unique
If you're doing some compression, for example, an RB-tree is better than a Hashtable.
Source: http://en.wikipedia.org/wiki/Hashtable#Dynamic_resizing
Answer by JamesHoux
USE INTS AS KEYS FOR MAXIMUM PERFORMANCE:
For anyone who came here from Google, if you want to squeeze every last bit of performance out of a Dictionary, then use Ints as keys. Here's a benchmark comparing Int vs String Keys: https://Hymansondunstan.com/articles/2527
The author of the article even mentions that converting strings to ints is worthwhile if you have such a need.
Also, note that this same behavior occurs in some other languages, like PHP. PHP associative arrays are in fact dictionaries, and if you use ints in ascending order in PHP 7, they outperform string keys tremendously.
Answer by Walter Vehoeven
Dictionaries let you specify an IEqualityComparer. For strings and other types, the generic default comparer may not be the best performing. A little time with ILSpy will show you what the default comparer does; if your implementation suffers in performance, you can inject your own IEqualityComparer. In the end, the dictionary compares the hash code of the key you provide with the existing hash codes in its list of entries.
So if your dictionary has specific needs, perhaps specialize it in a FastDictionary class that gets to the hash code in a more efficient way.
In your implementation that would be:
var dictionary = new Dictionary<string, string>(StringComparer.Ordinal);
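For a case where a built-in comparer is not enough, an injected comparer might look like the sketch below. CachedKey and CachedKeyComparer are hypothetical types invented here; the idea of caching the hash inside the key is only one possible reading of "getting to the hash code in a more efficient way", and whether it actually helps should be measured.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical key wrapper that computes its ordinal hash code once.
public sealed class CachedKey
{
    public readonly string Text;
    public readonly int Hash;

    public CachedKey(string text)
    {
        Text = text;
        Hash = StringComparer.Ordinal.GetHashCode(text);
    }
}

// Hypothetical comparer that reuses the cached hash instead of rehashing the string.
public sealed class CachedKeyComparer : IEqualityComparer<CachedKey>
{
    public bool Equals(CachedKey a, CachedKey b)
    {
        if (ReferenceEquals(a, b)) return true;
        if (a == null || b == null) return false;
        // Compare the cheap integer first, then confirm with an ordinal string compare.
        return a.Hash == b.Hash && string.Equals(a.Text, b.Text, StringComparison.Ordinal);
    }

    public int GetHashCode(CachedKey key)
    {
        return key.Hash;
    }
}

public static class CachedKeyDemo
{
    public static void Main()
    {
        var dictionary = new Dictionary<CachedKey, string>(new CachedKeyComparer());
        dictionary.Add(new CachedKey("fieldName"), "fieldValue");

        string value;
        bool found = dictionary.TryGetValue(new CachedKey("fieldName"), out value);
        Console.WriteLine(found); // prints "True"
        Console.WriteLine(value); // prints "fieldValue"
    }
}
```

The benefit only appears when the same CachedKey instances are reused across many lookups; creating a new key per lookup, as Main does for brevity, still hashes the string each time.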