C#: How to tell if an IEnumerable<T> is subject to deferred execution?

Notice: this page reproduces a popular StackOverflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/1168944/

Time: 2020-08-06 09:54:03  Source: igfitidea

How to tell if an IEnumerable<T> is subject to deferred execution?

c# linq linq-to-entities

Asked by Simon_Weaver

I always assumed that if I was using Select(x => ...) in the context of LINQ to Objects, then the new collection would be created immediately and remain static. I'm not quite sure WHY I assumed this, and it's a very bad assumption, but I did. I often use .ToList() elsewhere, but often not in this case.

This code demonstrates that even a simple 'Select' is subject to deferred execution:

var random = new Random();
var animals = new[] { "cat", "dog", "mouse" };
var randomNumberOfAnimals = animals.Select(x => Math.Floor(random.NextDouble() * 100) + " " + x + "s");

foreach (var i in randomNumberOfAnimals)
{
    testContextInstance.WriteLine("There are " + i);
}

foreach (var i in randomNumberOfAnimals)
{
    testContextInstance.WriteLine("And now, there are " + i);
}

This outputs the following (the random function is called every time the collection is iterated through):

There are 75 cats
There are 28 dogs
There are 62 mouses
And now, there are 78 cats
And now, there are 69 dogs
And now, there are 43 mouses
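
To make the re-evaluation visible without relying on random numbers, here is a minimal sketch that counts how often the selector lambda actually runs. It is written as a C# 9+ top-level program; the names selectorCalls, deferred and snapshot are made up for this illustration and are not part of the original question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var animals = new[] { "cat", "dog", "mouse" };
int selectorCalls = 0; // counts how often the lambda actually runs

// Deferred: nothing runs here; Select only stores the lambda.
IEnumerable<string> deferred = animals.Select(x => { selectorCalls++; return x + "s"; });
Console.WriteLine(selectorCalls); // 0

foreach (var s in deferred) { }   // first pass runs the lambda 3 times
foreach (var s in deferred) { }   // second pass runs it 3 more times
Console.WriteLine(selectorCalls); // 6

// Materialized: ToList() runs the query once and stores the results.
List<string> snapshot = deferred.ToList();
int callsAfterSnapshot = selectorCalls; // 9 at this point

foreach (var s in snapshot) { }   // iterating the list does not touch the lambda
foreach (var s in snapshot) { }
Console.WriteLine(selectorCalls == callsAfterSnapshot); // True
```

The counter only moves when the query itself is enumerated, which is the same effect that produced two different sets of numbers in the output above.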

I have many places where I have an IEnumerable<T> as a member of a class. Often the results of a LINQ query are assigned to such an IEnumerable<T>. Normally for me, this does not cause issues, but I have recently found a few places in my code where it poses more than just a performance issue.

In trying to check for places where I had made this mistake, I thought I could check to see if a particular IEnumerable<T> was of type IQueryable. This, I thought, would tell me if the collection was 'deferred' or not. It turns out that the enumerator created by the Select operator above is of type System.Linq.Enumerable+WhereSelectArrayIterator`2[System.String,System.String] and not IQueryable.

I used Reflector to see what this interface inherited from, and it turns out not to inherit from anything that indicates it is 'LINQ' at all - so there is no way to test based upon the collection type.

I'm quite happy putting .ToArray() everywhere now, but I'd like to have a mechanism to make sure this problem doesn't happen in future. Visual Studio seems to know how to do it because it gives a message about 'expanding the results view will evaluate the collection.'

The best I have come up with is:

bool deferred = !object.ReferenceEquals(randomNumberOfAnimals.First(),
                                        randomNumberOfAnimals.First());

Edit: This only works if a new object is created with 'Select', and it is not a generic solution. I don't recommend it in any case though! It was a bit of a tongue-in-cheek solution.

Accepted answer by Bevan

Deferred execution of LINQ has trapped a lot of people; you're not alone.

The approach I've taken to avoiding this problem is as follows:

Parameters to methods - use IEnumerable<T> unless there's a need for a more specific interface.

Local variables - these are usually declared at the point where I create the LINQ query, so I'll know whether lazy evaluation is possible.

Class members - never use IEnumerable<T>, always use List<T>. And always make them private.

Properties - use IEnumerable<T>, and convert for storage in the setter.

public IEnumerable<Person> People 
{
    get { return people; }
    set { people = value.ToList(); }
}
private List<Person> people; // element type must match the property (Person, not People)

While there are theoretical cases where this approach wouldn't work, I've not run into one yet, and I've been enthusiastically using the LINQ extension methods since late Beta.

BTW: I'm curious why you use ToArray() instead of ToList() - to me, lists have a much nicer API, and there's (almost) no performance cost.

Update: A couple of commenters have rightly pointed out that arrays have a theoretical performance advantage, so I've amended my statement above to "... there's (almost) no performance cost."

Update 2: I wrote some code to do some micro-benchmarking of the difference in performance between Arrays and Lists. On my laptop, and in my specific benchmark, the difference is around 5ns (that's nanoseconds) per access. I guess there are cases where saving 5ns per loop would be worthwhile ... but I've never come across one. I had to hike my test up to 100 million iterations before the runtime became long enough to accurately measure.

Answer by Adam Robinson

The message about expanding the results view evaluating the collection is a standard message presented for all IEnumerable objects. I'm not sure that there is any foolproof means of checking if an IEnumerable is deferred, mainly because even a yield is deferred. The only means of absolutely ensuring that it isn't deferred is to accept an ICollection or IList<T>.
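
As a rough illustration of "even a yield is deferred", the sketch below counts how often the body of an iterator method starts running. Numbers and bodyRuns are hypothetical names invented for this example, and it is written as a C# 9+ top-level program.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

int bodyRuns = 0; // counts how often the iterator body starts executing

// Hypothetical iterator: the compiler turns the body into a state machine that runs on demand.
IEnumerable<int> Numbers()
{
    bodyRuns++;
    yield return 1;
    yield return 2;
}

IEnumerable<int> sequence = Numbers(); // only creates the iterator object
Console.WriteLine(bodyRuns);           // 0

int firstSum = sequence.Sum();         // the first enumeration runs the body
Console.WriteLine(bodyRuns);           // 1

int secondSum = sequence.Sum();        // a second enumeration runs it again
Console.WriteLine(bodyRuns);           // 2
```

Nothing about the static type IEnumerable<int> reveals this behaviour, which is why asking for ICollection or IList<T> in a signature is the more dependable option: the caller then has to hand over something that is already in memory.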

Answer by Sam Harwell

It's absolutely possible to manually implement a lazy IEnumerator<T>, so there's no "perfectly general" way of doing it. What I keep in mind is this: if I'm changing things in a list while enumerating something related to it, always call ToArray() before the foreach.
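
A small sketch of that rule of thumb, assuming a List<int> as the source and a Where query as the "something related to it" (both chosen only for illustration). It is written as a C# 9+ top-level program.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var numbers = new List<int> { 1, 2, 3, 4, 5, 6 };

// Deferred query over the list we are about to change.
IEnumerable<int> evens = numbers.Where(n => n % 2 == 0);

bool threw = false;
try
{
    foreach (var n in evens)
        numbers.Remove(n); // mutates the source while the query is still reading it
}
catch (InvalidOperationException)
{
    threw = true; // List<T> notices the modification when the enumerator advances
}

// Snapshot first, then mutate: the loop reads the copy, not the live list.
foreach (var n in evens.ToArray())
    numbers.Remove(n);

Console.WriteLine(threw);                     // True
Console.WriteLine(string.Join(",", numbers)); // 1,3,5
```

The first loop removes 2 and then fails on the next step; the second loop works because ToArray() finishes reading the list before any element is removed.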

Answer by Reed Copsey

In general, I'd say you should try to avoid worrying about whether it's deferred.

There are advantages to the streaming execution nature of IEnumerable<T>. It is true that there are times when it's disadvantageous, but I'd recommend just always handling those (rare) times specifically: call either ToList() or ToArray() to convert it to a list or array as appropriate.

The rest of the time, it's better to just let it be deferred. Needing to frequently check this seems like a bigger design problem...

Answer by Daniel Earwicker

This is an interesting reaction to deferred execution - most people view it as a positive in that it allows you to transform streams of data without needing to buffer everything up.

Your suggested test won't work, because there's no reason why an iterator method can't yield the same reference object instance as its first object on two successive tries.

IEnumerable<string> Names()
{
    yield return "Fred";
}

That will return the same static string object every time, as the only item in the sequence.

As you can't reliably detect the compiler-generated class that is returned from an iterator method, you'll have to do the opposite: check for a few well-known containers:

public static IEnumerable<T> ToNonDeferred<T>(this IEnumerable<T> source) // must live in a static class; needs using System.Linq
{
    if (source is List<T> || source is T[]) // and any others you encounter
        return source;

    return source.ToArray();
}

By returning IEnumerable<T>, we keep the collection read-only, which is important because we may get back a copy or an original.
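
The "copy or original" distinction can be seen in a short sketch. To keep it a self-contained C# 9+ top-level program, the helper is rewritten here as a plain local function instead of an extension method; that change, and the variable names, are assumptions of this example only.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Non-extension variant of the helper, so it can live in a top-level program.
static IEnumerable<T> ToNonDeferred<T>(IEnumerable<T> source)
{
    if (source is List<T> || source is T[]) // known in-memory containers pass through
        return source;
    return source.ToArray();                // anything else is copied once
}

var list = new List<int> { 1, 2, 3 };
IEnumerable<int> query = list.Select(n => n * 10);

IEnumerable<int> sameList = ToNonDeferred(list); // returns the original instance
IEnumerable<int> copy = ToNonDeferred(query);    // returns a new array

list.Add(4); // a later change to the source

Console.WriteLine(object.ReferenceEquals(sameList, list)); // True
Console.WriteLine(sameList.Count()); // 4: the original, so it sees the change
Console.WriteLine(copy.Count());     // 3: a snapshot, so it does not
Console.WriteLine(query.Count());    // 4: deferred, re-evaluated now
```

Because the caller cannot tell which of the two it received, treating the result as read-only is the safe reading of the return type.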

Answer by Trident D'Gao

My five cents. Quite often you have to deal with an enumerable without having any idea what's inside it.

Your options are:

  • turn it into a list before using it, but chances are it's endless, and then you are in trouble
  • use it as is, and you are likely to face all kinds of funny deferred-execution effects, and you are in trouble again

Here is an example:

[TestClass]
public class BadExample
{
    public class Item
    {
        public String Value { get; set; }
    }
    public IEnumerable<Item> SomebodysElseMethodWeHaveNoControlOver()
    {
        var values = "at the end everything must be in upper".Split(' ');
        return values.Select(x => new Item { Value = x });
    }
    [TestMethod]
    public void Test()
    {
        var items = this.SomebodysElseMethodWeHaveNoControlOver();
        foreach (var item in items)
        {
            item.Value = item.Value.ToUpper();
        }
        var mustBeInUpper = String.Join(" ", items.Select(x => x.Value).ToArray());
        Trace.WriteLine(mustBeInUpper); // output is in lower: at the end everything must be in upper
        Assert.AreEqual("AT THE END EVERYTHING MUST BE IN UPPER", mustBeInUpper); // <== fails here
    }
}

So there is only one way to get away with it: iterate it exactly once, on an as-you-go basis.
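
For comparison, here is a sketch of the same pitfall next to a version that enumerates the source exactly once and keeps the resulting objects. It assumes the sequence is known to be finite. StringBuilder stands in for the Item class purely because it is a built-in mutable type, and GetWords and MakeUpper are hypothetical helpers; the code is a C# 9+ top-level program.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

// Stand-in for the method we do not control: every enumeration builds new objects.
static IEnumerable<StringBuilder> GetWords() =>
    "at the end everything must be in upper".Split(' ').Select(w => new StringBuilder(w));

// Mutates the given object in place.
static void MakeUpper(StringBuilder sb)
{
    string upper = sb.ToString().ToUpperInvariant();
    sb.Clear().Append(upper);
}

// Deferred: the loop mutates objects that are thrown away afterwards.
IEnumerable<StringBuilder> lazy = GetWords();
foreach (var w in lazy) MakeUpper(w);
string fromLazy = string.Join(" ", lazy.Select(w => w.ToString()));

// Enumerated once into a list: the loop and the later read see the same objects.
List<StringBuilder> stored = GetWords().ToList();
foreach (var w in stored) MakeUpper(w);
string fromStored = string.Join(" ", stored.Select(w => w.ToString()));

Console.WriteLine(fromLazy);   // at the end everything must be in upper
Console.WriteLine(fromStored); // AT THE END EVERYTHING MUST BE IN UPPER
```

The ToList() call is only safe here because the source is known to end; for a possibly endless sequence, the single pass has to do all its work inside one foreach instead.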

It was clearly a bad design choice to use the same IEnumerable interface for immediate and deferred execution scenarios. There must be a clear distinction between these two, so that it's clear from the name or by checking a property whether or not the enumerable is deferred.

A hint: in your code, consider using IReadOnlyCollection<T> instead of the plain IEnumerable<T>, because in addition to enumeration you get the Count property. This way you know for sure it's not endless, and you can turn it into a list with no problem.
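
A minimal sketch of what that looks like at a call site, with Summarize as a hypothetical consumer invented for this example (C# 9+ top-level program):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical consumer: IReadOnlyCollection<T> exposes Count, so the input is finite by contract.
static string Summarize(IReadOnlyCollection<string> words)
{
    return words.Count + ": " + string.Join(" ", words);
}

IEnumerable<string> query = new[] { "cat", "dog" }.Select(w => w.ToUpperInvariant());

// Summarize(query); // does not compile: IEnumerable<string> does not convert to IReadOnlyCollection<string>
string fromList = Summarize(query.ToList());      // the caller materializes explicitly
string fromArray = Summarize(new[] { "a", "b" }); // arrays also implement the interface

Console.WriteLine(fromList);  // 2: CAT DOG
Console.WriteLine(fromArray); // 2: a b
```

The compiler, rather than a runtime check, is what keeps a bare LINQ query out of the method.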
