Original question: http://stackoverflow.com/questions/1147497/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me):
StackOverflow
Is C# List<T>.ToArray performance bad?
Asked by George2
I'm using .NET 3.5 (C#) and I've heard that the performance of C# List<T>.ToArray is "bad", since it memory-copies all elements to form a new array. Is that true?
Accepted answer by Joe
No, that's not true. Performance is good, since all it does is memory-copy all elements (*) to form a new array.
Of course, it depends on what you define as "good" or "bad" performance.
(*) references for reference types, values for value types.
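To make the footnote concrete, here is a minimal sketch of what "references for reference types, values for value types" means in practice. The Person and Point types are hypothetical and exist only for this illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical types used only for this illustration.
class Person { public string Name; }
struct Point { public int X; }

static class ShallowCopyDemo
{
    // Renames the object through the array and reads it back through the list.
    public static string RenameThroughArray()
    {
        var people = new List<Person> { new Person { Name = "Ann" } };
        Person[] array = people.ToArray();
        array[0].Name = "Bob";   // same object as people[0]
        return people[0].Name;
    }

    // Mutates the struct stored in the array and reads the list's element.
    public static int MutateStructInArray()
    {
        var points = new List<Point> { new Point { X = 1 } };
        Point[] array = points.ToArray();
        array[0].X = 99;         // changes only the array's own copy
        return points[0].X;
    }

    static void Main()
    {
        Console.WriteLine(RenameThroughArray());   // Bob
        Console.WriteLine(MutateStructInArray());  // 1
    }
}
```

For the reference type, the list and the array end up pointing at the same object, so no object is reconstructed. For the struct, the array holds an independent copy of the value.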
EDIT
In response to your comment, using Reflector is a good way to check the implementation (see below). Or just think for a couple of minutes about how you would implement it, and take it on trust that Microsoft's engineers won't come up with a worse solution.
public T[] ToArray()
{
    T[] destinationArray = new T[this._size];
    Array.Copy(this._items, 0, destinationArray, 0, this._size);
    return destinationArray;
}
Of course, "good" or "bad" performance only has a meaning relative to some alternative. If in your specific case, there is an alternative technique to achieve your goal that is measurably faster, then you can consider performance to be "bad". If there is no such alternative, then performance is "good" (or "good enough").
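If you want to check whether an alternative is "measurably faster" in your own case, a rough timing sketch like the following may help. The CopyWithLoop helper and the list size are assumptions made for this example, and a single Stopwatch run is only a coarse indication, not a rigorous benchmark.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

static class ToArrayTiming
{
    // Hand-written alternative to compare against List<T>.ToArray().
    public static int[] CopyWithLoop(List<int> source)
    {
        var result = new int[source.Count];
        for (int i = 0; i < source.Count; i++)
        {
            result[i] = source[i];
        }
        return result;
    }

    static void Main()
    {
        var list = new List<int>();
        for (int i = 0; i < 1000000; i++) list.Add(i);

        // Warm up both paths so JIT compilation is not part of the timing.
        list.ToArray();
        CopyWithLoop(list);

        Stopwatch sw = Stopwatch.StartNew();
        int[] viaToArray = list.ToArray();
        sw.Stop();
        Console.WriteLine("ToArray:     {0} ticks ({1} items)", sw.ElapsedTicks, viaToArray.Length);

        sw = Stopwatch.StartNew();
        int[] viaLoop = CopyWithLoop(list);
        sw.Stop();
        Console.WriteLine("manual loop: {0} ticks ({1} items)", sw.ElapsedTicks, viaLoop.Length);
    }
}
```

The numbers will vary by machine and runtime, so treat the output as a starting point for your own measurements.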
EDIT 2
In response to the comment "No re-construction of objects?":
No reconstruction for reference types. For value types the values are copied, which could loosely be described as reconstruction.
Answer by George2
It creates new references in an array, but that is the only thing the method could and should do.
Answer by Sorantis
Reasons to call ToArray()
- If the returned value is not meant to be modified, returning it as an array makes that fact a bit clearer.
- If the caller is expected to perform many non-sequential accesses to the data, there can be a performance benefit to an array over a List<>.
- If you know you will need to pass the returned value to a third-party function that expects an array.
- Compatibility with calling functions that need to work with .NET version 1 or 1.1. These versions don't have the List<> type (or any generic types, for that matter).
Reasons not to call ToArray()
- If the caller ever does need to add or remove elements, a List<> is absolutely required.
- The performance benefits are not necessarily guaranteed, especially if the caller is accessing the data in a sequential fashion. There is also the additional step of converting from List<> to array, which takes processing time.
- The caller can always convert the list to an array themselves.
taken from here
Answer by Daniel Earwicker
Performance has to be understood in relative terms. Converting a List to an array involves copying the list's elements, and the cost of that will depend on the size of the list. But you have to compare that cost to the other things your program is doing. How did you obtain the information to put into the list in the first place? If it was by reading from the disk, or a network connection, or a database, then an array copy in memory is very unlikely to make a detectable difference to the time taken.
Answer by chris166
Yes, it's true that it does a memory copy of all elements. Is it a performance problem? That depends on your performance requirements.
A List<T> contains an array internally to hold all the elements. The array grows if the capacity is no longer sufficient for the list. Any time that happens, the list will copy all elements into a new array. That happens all the time, and for most people it is not a performance problem.
E.g. in Microsoft's List<T> implementation, a list created with the default constructor starts with an empty internal array; the first .Add() allocates room for 4 elements, and from then on the capacity doubles whenever it runs out. When you .Add() the 5th element, it creates a new array of size 8, copies the 4 old values and adds the 5th.
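The growth behaviour can be observed through the public Capacity property. The following is a small sketch; the ObserveCapacities helper is a name made up for this example, and the exact capacities it reports are an implementation detail that may differ between runtime versions.

```csharp
using System;
using System.Collections.Generic;

static class CapacityGrowth
{
    // Records the Capacity every time it changes while adding `count` items.
    public static List<int> ObserveCapacities(int count)
    {
        var list = new List<int>();
        var seen = new List<int> { list.Capacity };
        for (int i = 0; i < count; i++)
        {
            list.Add(i);
            if (list.Capacity != seen[seen.Count - 1])
            {
                seen.Add(list.Capacity);   // the internal array was reallocated
            }
        }
        return seen;
    }

    static void Main()
    {
        foreach (int capacity in ObserveCapacities(17))
        {
            Console.WriteLine(capacity);
        }
    }
}
```

Each printed value corresponds to one reallocation plus one copy of the existing elements, which is the same kind of work ToArray() does once.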
The size difference (the internal array is usually larger than the number of elements in the list) is also the reason why ToArray() returns a new array instance instead of passing out the private reference.
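A quick way to see that size difference is to create a list with spare capacity and compare it with the array that ToArray() returns. This is only an illustrative sketch; the class and method names are made up for the example.

```csharp
using System;
using System.Collections.Generic;

static class ExactSizeCopy
{
    // Builds a list with spare capacity and returns its ToArray() result.
    public static int[] CopyFromOversizedList()
    {
        var list = new List<int>(100);   // internal array has room for 100 elements
        list.Add(1);
        list.Add(2);
        list.Add(3);
        return list.ToArray();           // new array sized to Count, not Capacity
    }

    static void Main()
    {
        int[] array = CopyFromOversizedList();
        Console.WriteLine(array.Length);   // 3
    }
}
```

Handing out the internal array directly would expose the 97 unused slots and let callers modify the list's storage behind its back.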
Answer by Curtis Yallop
For any kind of List/ICollection where it knows the length, it can allocate an array of exactly the right size from the start.
T[] destinationArray = new T[this._size];
Array.Copy(this._items, 0, destinationArray, 0, this._size);
return destinationArray;
If your source type is IEnumerable (not a List/Collection) then the source is:
items = new TElement[4];
..
if (no more space) {
    TElement[] newItems = new TElement[checked(count * 2)];
    Array.Copy(items, 0, newItems, 0, count);
    items = newItems;
}
It starts at size 4 and grows exponentially, doubling each time it runs out of space. Each time it doubles, it has to reallocate memory and copy the data over.
If we know the source-data size, we can avoid this slight overhead. However, in most cases (e.g. array size <= 1024) it will execute so quickly that we don't even need to think about this implementation detail.
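The doubling strategy described above can be sketched as a self-contained method. This is not the framework's actual source, just a simplified illustration under the assumptions stated in the excerpt (start at 4, double when full); the names GrowingBuffer, ToArrayByDoubling and Numbers are made up for the example.

```csharp
using System;
using System.Collections.Generic;

static class GrowingBuffer
{
    // Sketch only: copies a sequence of unknown length into an array.
    public static T[] ToArrayByDoubling<T>(IEnumerable<T> source)
    {
        T[] items = new T[4];
        int count = 0;
        foreach (T item in source)
        {
            if (count == items.Length)
            {
                // Out of space: allocate twice as much and copy everything over.
                T[] newItems = new T[checked(count * 2)];
                Array.Copy(items, 0, newItems, 0, count);
                items = newItems;
            }
            items[count++] = item;
        }

        // Trim to the exact length, which costs one more copy.
        T[] result = new T[count];
        Array.Copy(items, 0, result, 0, count);
        return result;
    }

    // A lazily generated sequence whose length is not known up front.
    static IEnumerable<int> Numbers(int n)
    {
        for (int i = 0; i < n; i++) yield return i;
    }

    static void Main()
    {
        int[] array = ToArrayByDoubling(Numbers(10));
        Console.WriteLine(array.Length);   // 10
    }
}
```

With 10 elements the buffer grows from 4 to 8 to 16 before the final trim, whereas a source with a known count needs only a single allocation and copy.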
References: Enumerable.cs, List.cs (F12ing into them), Joe's answer
Answer by J Gaspar
This is what Microsoft's official documentation says about List<T>.ToArray's time complexity:
The elements are copied using Array.Copy, which is an O(n) operation, where n is Count.
Then, looking at Array.Copy, we see that it is usually not cloning the data but instead using references:
If sourceArray and destinationArray are both reference-type arrays or are both arrays of type Object, a shallow copy is performed. A shallow copy of an Array is a new Array containing references to the same elements as the original Array. The elements themselves or anything referenced by the elements are not copied. In contrast, a deep copy of an Array copies the elements and everything directly or indirectly referenced by the elements.
So in conclusion, this is a pretty efficient way of getting an array from a list.