C#: Intersection of multiple lists with IEnumerable.Intersect()

Disclaimer: this page reproduces a popular StackOverflow question (originally presented as a Chinese-English parallel translation) under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original address and author information, and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/1674742/

Date: 2020-08-06 19:53:24  Source: igfitidea

Intersection of multiple lists with IEnumerable.Intersect()

Tags: c#, .net, linq

Asked by Oskar

I have a list of lists whose intersection I want to find, like this:

var list1 = new List<int>() { 1, 2, 3 };
var list2 = new List<int>() { 2, 3, 4 };
var list3 = new List<int>() { 3, 4, 5 };
var listOfLists = new List<List<int>>() { list1, list2, list3 };

// expected intersection is List<int>() { 3 };

Is there some way to do this with IEnumerable.Intersect()?

EDIT: I should have been clearer about this: I really do have a list of lists, and I don't know how many there will be. The three lists above were just an example; what I actually have is an IEnumerable<IEnumerable<SomeClass>>.

SOLUTION

Thanks for all the great answers. It turned out there were four options for solving this: List+aggregate (@Marcel Gosselin), List+foreach (@JaredPar, @Gabe Moothart), HashSet+aggregate (@jesperll) and HashSet+foreach (@Tony the Pony). I did some performance testing on these solutions (varying the number of lists, the number of elements in each list, and the maximum size of the random numbers).

It turns out that in most situations the HashSet performs better than the List (except with large lists and a small random number range, presumably because of the nature of HashSet). I couldn't find any real difference between the foreach method and the aggregate method (the foreach method performs slightly better).

To me, the aggregate method is really appealing (and I'm going with that as the accepted answer), but I wouldn't say it's the most readable solution. Thanks again, all!
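
The comparison described above was not posted with its test code, so the following is only a rough sketch of how such a measurement could be set up. The class name, the helper names (MakeLists, Time) and all sizes are assumptions made for illustration; it covers just the two aggregate variants and reports no results of its own.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

public static class IntersectBenchmarkSketch
{
    // Hypothetical generator: listCount lists, each holding listSize random ints in [0, maxValue).
    public static List<List<int>> MakeLists(int listCount, int listSize, int maxValue, int seed)
    {
        var rng = new Random(seed);
        var lists = new List<List<int>>();
        for (int i = 0; i < listCount; i++)
        {
            var list = new List<int>(listSize);
            for (int j = 0; j < listSize; j++)
                list.Add(rng.Next(maxValue));
            lists.Add(list);
        }
        return lists;
    }

    // List + Aggregate: every step materializes a new List<int>.
    public static List<int> ListAggregate(List<List<int>> lists) =>
        lists.Aggregate((previous, next) => previous.Intersect(next).ToList());

    // HashSet + Aggregate: one HashSet<int> is narrowed in place.
    public static List<int> HashSetAggregate(List<List<int>> lists) =>
        lists.Skip(1)
             .Aggregate(new HashSet<int>(lists.First()),
                        (set, next) => { set.IntersectWith(next); return set; })
             .ToList();

    // Runs the candidate once to warm up, then times `iterations` further runs.
    public static TimeSpan Time(Func<List<int>> candidate, int iterations)
    {
        candidate();
        var stopwatch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
            candidate();
        stopwatch.Stop();
        return stopwatch.Elapsed;
    }

    public static void Main()
    {
        // All sizes here are arbitrary; vary them to explore different workloads.
        var lists = MakeLists(listCount: 10, listSize: 1000, maxValue: 2000, seed: 42);
        Console.WriteLine("List+Aggregate:    " + Time(() => ListAggregate(lists), 100));
        Console.WriteLine("HashSet+Aggregate: " + Time(() => HashSetAggregate(lists), 100));
    }
}
```

Timings from a loop like this depend heavily on the runtime and the data, so they are best treated as a relative indication rather than a precise measurement.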

Accepted answer by Jesper Larsen-Ledet

How about:

var intersection = listOfLists
    .Skip(1)
    .Aggregate(
        new HashSet<T>(listOfLists.First()),
        (h, e) => { h.IntersectWith(e); return h; }
    );

That way it's optimized by using the same HashSet throughout and still in a single statement. Just make sure that the listOfLists always contains at least one list.
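
If an empty outer sequence is possible, or if the elements need custom equality (as may be the case for the SomeClass in the question), a guarded helper along these lines might be one option. This is a sketch only: the IntersectAll name, the optional comparer parameter and the early exit are assumptions, not part of the answer.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class IntersectionSketch
{
    // Hypothetical helper: intersects every inner sequence.
    // Returns an empty set when the outer sequence has no elements.
    // A null comparer means the default equality comparer for T is used.
    public static HashSet<T> IntersectAll<T>(
        IEnumerable<IEnumerable<T>> lists,
        IEqualityComparer<T> comparer = null)
    {
        using (var enumerator = lists.GetEnumerator())
        {
            if (!enumerator.MoveNext())
                return new HashSet<T>(comparer);

            // Seed the set with the first sequence, then narrow it in place.
            var result = new HashSet<T>(enumerator.Current, comparer);

            // Stop early once the set is empty: nothing can be added back.
            while (result.Count > 0 && enumerator.MoveNext())
                result.IntersectWith(enumerator.Current);

            return result;
        }
    }
}

public static class IntersectionSketchDemo
{
    public static void Main()
    {
        var listOfLists = new List<List<int>>
        {
            new List<int> { 1, 2, 3 },
            new List<int> { 2, 3, 4 },
            new List<int> { 3, 4, 5 },
        };
        Console.WriteLine(string.Join(", ", IntersectionSketch.IntersectAll<int>(listOfLists)));

        // Custom equality: case-insensitive comparison of strings.
        var names = new List<List<string>>
        {
            new List<string> { "a", "B" },
            new List<string> { "A", "b", "c" },
        };
        var common = IntersectionSketch.IntersectAll<string>(names, StringComparer.OrdinalIgnoreCase);
        Console.WriteLine(common.Count);
    }
}
```

For a class of your own, the same helper works either with an IEqualityComparer<T> implementation or with Equals and GetHashCode overridden on the class itself.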

Answer by JaredPar

You could do the following:

var result = list1.Intersect(list2).Intersect(list3).ToList();

Answer by Jon Skeet

You can indeed use Intersect twice. However, I believe this will be more efficient:

HashSet<int> hashSet = new HashSet<int>(list1);
hashSet.IntersectWith(list2);
hashSet.IntersectWith(list3);
List<int> intersection = hashSet.ToList();

Not an issue with small sets of course, but if you have a lot of large sets it could be significant.

Basically Enumerable.Intersect needs to create a set on each call - if you know that you're going to be doing more set operations, you might as well keep that set around.

As ever, keep a close eye on performance vs readability - the method chaining of calling Intersect twice is very appealing.
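
To make the trade-off concrete, here is a small side-by-side sketch of the two approaches on input that contains duplicates. The class and method names are made up for this illustration. Both variants collapse duplicates, because Enumerable.Intersect yields distinct elements and a HashSet stores each value once.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class IntersectDuplicatesDemo
{
    // Chained LINQ calls: each Intersect call builds its own internal set.
    public static List<int> Chained(List<int> a, List<int> b, List<int> c) =>
        a.Intersect(b).Intersect(c).ToList();

    // One HashSet<int> created up front and narrowed in place.
    public static List<int> Reused(List<int> a, List<int> b, List<int> c)
    {
        var set = new HashSet<int>(a);
        set.IntersectWith(b);
        set.IntersectWith(c);
        return set.ToList();
    }

    public static void Main()
    {
        var a = new List<int> { 3, 3, 1, 2 };
        var b = new List<int> { 2, 3, 3, 4 };
        var c = new List<int> { 3, 4, 5, 3 };

        // Both lines print a single 3, even though every input repeats it.
        Console.WriteLine(string.Join(", ", Chained(a, b, c)));
        Console.WriteLine(string.Join(", ", Reused(a, b, c)));
    }
}
```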

EDIT: For the updated question:

public List<T> IntersectAll<T>(IEnumerable<IEnumerable<T>> lists)
{
    HashSet<T> hashSet = null;
    foreach (var list in lists)
    {
        if (hashSet == null)
        {
            hashSet = new HashSet<T>(list);
        }
        else
        {
            hashSet.IntersectWith(list);
        }
    }
    return hashSet == null ? new List<T>() : hashSet.ToList();
}

Or if you know it won't be empty, and that Skip will be relatively cheap:

public List<T> IntersectAll<T>(IEnumerable<IEnumerable<T>> lists)
{
    HashSet<T> hashSet = new HashSet<T>(lists.First());
    foreach (var list in lists.Skip(1))
    {
        hashSet.IntersectWith(list);
    }
    return hashSet.ToList();
}

Answer by Marcel Gosselin

Try this, it works but I'd really like to get rid of the .ToList() in the aggregate.

var list1 = new List<int>() { 1, 2, 3 };
var list2 = new List<int>() { 2, 3, 4 };
var list3 = new List<int>() { 3, 4, 5 };
var listOfLists = new List<List<int>>() { list1, list2, list3 };
var intersection = listOfLists.Aggregate((previousList, nextList) => previousList.Intersect(nextList).ToList());

Update:

Following a comment from @pomber, it is possible to get rid of the ToList() inside the Aggregate call and move it outside so that it executes only once. I did not test whether the previous code is faster than the new one. The change needed is to specify the generic type parameter of the Aggregate method on the last line, like below:

var intersection = listOfLists.Aggregate<IEnumerable<int>>(
   (previousList, nextList) => previousList.Intersect(nextList)
   ).ToList();

Answer by gigi

This is my version of the solution with an extension method that I called IntersectMany.

public static IEnumerable<TResult> IntersectMany<TSource, TResult>(this IEnumerable<TSource> source, Func<TSource, IEnumerable<TResult>> selector)
{
    using (var enumerator = source.GetEnumerator())
    {
        if(!enumerator.MoveNext())
            return new TResult[0];

        var ret = selector(enumerator.Current);

        while (enumerator.MoveNext())
        {
            ret = ret.Intersect(selector(enumerator.Current));
        }

        return ret;
    }
}

So the usage would be something like this:

var intersection = (new[] { list1, list2, list3 }).IntersectMany(l => l).ToList();

Answer by Sergey

This is my one-line solution for a list of lists (ListOfLists) that does not use the Intersect function:

var intersect = ListOfLists.SelectMany(x => x).Distinct().Where(w => ListOfLists.TrueForAll(t => t.Contains(w))).ToList();

This should work for .NET 4 (or later).
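
The one-liner above calls List.Contains for every candidate value against every list, and each such call is a linear scan. One possible variant, sketched here under the assumption that the extra memory is acceptable, converts each list to a HashSet first so the membership checks become hash lookups. The class and method names are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ContainsAllSketch
{
    // Hypothetical variant: same "keep values present in every list" idea,
    // but with one HashSet<T> per list instead of repeated List.Contains scans.
    public static List<T> IntersectViaSets<T>(IEnumerable<IEnumerable<T>> lists)
    {
        var sets = lists.Select(list => new HashSet<T>(list)).ToList();
        if (sets.Count == 0)
            return new List<T>();

        // Candidates come from the first set, so they are already distinct.
        return sets[0]
            .Where(value => sets.Skip(1).All(set => set.Contains(value)))
            .ToList();
    }

    public static void Main()
    {
        var listOfLists = new List<List<int>>
        {
            new List<int> { 1, 2, 3 },
            new List<int> { 2, 3, 4 },
            new List<int> { 3, 4, 5 },
        };
        Console.WriteLine(string.Join(", ", IntersectViaSets<int>(listOfLists)));
    }
}
```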

Answer by harakim

This is a simple solution if your lists are all small. If you have larger lists, it won't perform as well as a hash set:

public static IEnumerable<T> IntersectMany<T>(this IEnumerable<IEnumerable<T>> input)
{
    if (!input.Any())
        return new List<T>();

    return input.Aggregate(Enumerable.Intersect);
}

Answer by birdus

After searching the net and not really coming up with something I liked (or that worked), I slept on it and came up with this. Mine uses a class (SearchResult) which has an EmployeeId in it, and that's the value I need to be common across lists. I return all records whose EmployeeId appears in every list. It's not fancy, but it's simple and easy to understand, just what I like. For small lists (my case) it should perform just fine, and anyone can understand it!

private List<SearchResult> GetFinalSearchResults(IEnumerable<IEnumerable<SearchResult>> lists)
{
    // Seed with the first list, keyed by EmployeeId (ToDictionary throws if that list repeats an EmployeeId).
    Dictionary<int, SearchResult> oldList = lists.First().ToDictionary(x => x.EmployeeId, x => x);
    Dictionary<int, SearchResult> newList = new Dictionary<int, SearchResult>();

    // Iterate as IEnumerable<SearchResult>: casting each inner sequence to List<SearchResult> can fail at run time.
    foreach (IEnumerable<SearchResult> list in lists.Skip(1))
    {
        foreach (SearchResult emp in list)
        {
            if (oldList.ContainsKey(emp.EmployeeId))
            {
                // The indexer tolerates a repeated EmployeeId within one list; Add would throw.
                newList[emp.EmployeeId] = emp;
            }
        }

        oldList = new Dictionary<int, SearchResult>(newList);
        newList.Clear();
    }

    return oldList.Values.ToList();
}

Here's an example just using a list of ints, not a class (this was my original implementation).

static List<int> FindCommon(List<List<int>> items)
{
    // Distinct() keeps ToDictionary from throwing when the first list repeats a value.
    Dictionary<int, int> oldList = items[0].Distinct().ToDictionary(x => x, x => x);
    Dictionary<int, int> newList = new Dictionary<int, int>();

    foreach (List<int> list in items.Skip(1))
    {
        foreach (int i in list)
        {
            if (oldList.ContainsKey(i))
            {
                // The indexer tolerates a value repeated within one list; Add would throw.
                newList[i] = i;
            }
        }

        oldList = new Dictionary<int, int>(newList);
        newList.Clear();
    }

    return oldList.Values.ToList();
}
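
The two methods above differ only in the element type and in how the key is obtained, so the same idea could be written once with a key selector. The following is a sketch under that assumption: IntersectByKey and SearchResultStub are hypothetical names, and the stub class merely stands in for the answer's SearchResult (only EmployeeId is taken from the answer).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the answer's SearchResult class.
public sealed class SearchResultStub
{
    public int EmployeeId { get; set; }
    public string Source { get; set; }
}

public static class KeyedIntersectionSketch
{
    // Keeps the items whose key appears in every inner sequence.
    // For each surviving key, the item from the last sequence is returned.
    public static List<T> IntersectByKey<T, TKey>(
        IEnumerable<IEnumerable<T>> lists,
        Func<T, TKey> keySelector)
    {
        Dictionary<TKey, T> current = null;
        foreach (var list in lists)
        {
            var next = new Dictionary<TKey, T>();
            foreach (var item in list)
            {
                var key = keySelector(item);
                // The first sequence seeds the dictionary; later ones filter it.
                if (current == null || current.ContainsKey(key))
                    next[key] = item; // indexer: a repeated key overwrites, never throws
            }
            current = next;
        }
        return current == null ? new List<T>() : current.Values.ToList();
    }

    public static void Main()
    {
        var lists = new List<List<SearchResultStub>>
        {
            new List<SearchResultStub>
            {
                new SearchResultStub { EmployeeId = 1, Source = "a" },
                new SearchResultStub { EmployeeId = 3, Source = "a" },
            },
            new List<SearchResultStub>
            {
                new SearchResultStub { EmployeeId = 3, Source = "b" },
                new SearchResultStub { EmployeeId = 4, Source = "b" },
            },
        };
        var common = IntersectByKey<SearchResultStub, int>(lists, r => r.EmployeeId);
        foreach (var result in common)
            Console.WriteLine(result.EmployeeId + " from " + result.Source);
    }
}
```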