C#: Using LINQ to Objects to find items in one collection that do not match another

Notice: this page is derived from a popular Stack Overflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original Stack Overflow question: http://stackoverflow.com/questions/1647698/

Time: 2020-08-06 19:40:22  Source: igfitidea

Using LINQ to Objects to find items in one collection that do not match another

Tags: c#, linq, linq-to-objects

Asked by TrueWill

I want to find all items in one collection that do not match another collection. The collections are not of the same type, though; I want to write a lambda expression to specify equality.

A LINQPad example of what I'm trying to do:

void Main()
{
    var employees = new[]
    {
        new Employee { Id = 20, Name = "Bob" },
        new Employee { Id = 10, Name = "Bill" },
        new Employee { Id = 30, Name = "Frank" }
    };

    var managers = new[]
    {
        new Manager { EmployeeId = 20 },
        new Manager { EmployeeId = 30 }
    };

    var nonManagers =
        from employee in employees
        where !(managers.Any(x => x.EmployeeId == employee.Id))
        select employee;

    nonManagers.Dump();

    // Based on cdonner's answer:

    var nonManagers2 =
        from employee in employees
        join manager in managers
            on employee.Id equals manager.EmployeeId
            into tempManagers
        from manager in tempManagers.DefaultIfEmpty()
        where manager == null
        select employee;

    nonManagers2.Dump();

    // Based on Richard Hein's answer:

    var nonManagers3 =
        employees.Except(
            from employee in employees
            join manager in managers
                on employee.Id equals manager.EmployeeId
            select employee);

    nonManagers3.Dump();
}

public class Employee
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public class Manager
{
    public int EmployeeId { get; set; }
}

The above works, and will return Employee Bill (#10). It does not seem elegant, though, and it may be inefficient with larger collections. In SQL I'd probably do a LEFT JOIN and find items where the second ID was NULL. What's the best practice for doing this in LINQ?

EDIT: Updated to prevent solutions that depend on the Id equaling the index.

EDIT: Added cdonner's solution - anybody have anything simpler?

EDIT: Added a variant on Richard Hein's answer, my current favorite. Thanks to everyone for some excellent answers!

Accepted answer by Richard Anthony Hein

This is almost the same as some other examples but less code:

employees.Except(employees.Join(managers, e => e.Id, m => m.EmployeeId, (e, m) => e));

It's not any simpler than employees.Where(e => !managers.Any(m => m.EmployeeId == e.Id)) or your original syntax, however.
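As a rough, self-contained sketch of the two forms mentioned in this answer, the program below runs both against the question's sample data. The class name `NonManagerQueries` and the method names `ViaExceptJoin` and `ViaWhereAny` are made up for illustration and are not part of the original answer.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Employee
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public class Manager
{
    public int EmployeeId { get; set; }
}

// Hypothetical helper class, named here only for the example.
public static class NonManagerQueries
{
    // Set difference: join employees to managers, then remove the matches.
    public static List<Employee> ViaExceptJoin(
        IEnumerable<Employee> employees, IEnumerable<Manager> managers)
    {
        return employees
            .Except(employees.Join(managers, e => e.Id, m => m.EmployeeId, (e, m) => e))
            .ToList();
    }

    // Filter: keep employees for which no manager has a matching id.
    public static List<Employee> ViaWhereAny(
        IEnumerable<Employee> employees, IEnumerable<Manager> managers)
    {
        return employees
            .Where(e => !managers.Any(m => m.EmployeeId == e.Id))
            .ToList();
    }
}

public static class Program
{
    public static void Main()
    {
        var employees = new[]
        {
            new Employee { Id = 20, Name = "Bob" },
            new Employee { Id = 10, Name = "Bill" },
            new Employee { Id = 30, Name = "Frank" }
        };
        var managers = new[]
        {
            new Manager { EmployeeId = 20 },
            new Manager { EmployeeId = 30 }
        };

        foreach (var e in NonManagerQueries.ViaExceptJoin(employees, managers))
            Console.WriteLine("Except/Join: " + e.Name + " (#" + e.Id + ")");

        foreach (var e in NonManagerQueries.ViaWhereAny(employees, managers))
            Console.WriteLine("Where/Any: " + e.Name + " (#" + e.Id + ")");
    }
}
```

One difference worth keeping in mind: `Except` has set semantics, so if the same employee object appeared twice in the input it would be returned once, whereas the `Where`/`Any` form keeps every occurrence.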

Answer by cdonner

    /// <summary>
    /// Returns the items in a sequence that have no match, by key, in
    /// another sequence of a different type.
    /// </summary>
    /// <typeparam name="T">Type of the items to filter.</typeparam>
    /// <typeparam name="TOther">Type of the items to exclude by.</typeparam>
    /// <typeparam name="TKey">Type of the key both sequences are compared on.</typeparam>
    /// <param name="items">Sequence to filter.</param>
    /// <param name="other">Sequence whose keys are excluded.</param>
    /// <param name="getItemKey">Extracts the key from an item.</param>
    /// <param name="getOtherKey">Extracts the key from an item of the other sequence.</param>
    /// <returns>The items whose key matches no key in the other sequence.</returns>
    public static IEnumerable<T> Except<T, TOther, TKey>(
                                           this IEnumerable<T> items,
                                           IEnumerable<TOther> other,
                                           Func<T, TKey> getItemKey,
                                           Func<TOther, TKey> getOtherKey)
    {
        // Left outer join: keep only the items for which no match was found.
        return from item in items
               join otherItem in other on getItemKey(item)
               equals getOtherKey(otherItem) into tempItems
               from temp in tempItems.DefaultIfEmpty()
               where ReferenceEquals(null, temp) || temp.Equals(default(TOther))
               select item;
    }

I don't remember where I found this method.
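For illustration, here is one way the method above might be called on the question's data. Extension methods have to live in a static class, so the sketch wraps it in one; the class name `EnumerableExtensions` is an assumption made for this example, not something the answer specifies.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Employee
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public class Manager
{
    public int EmployeeId { get; set; }
}

// Hypothetical container class; the answer does not name one.
public static class EnumerableExtensions
{
    public static IEnumerable<T> Except<T, TOther, TKey>(
        this IEnumerable<T> items,
        IEnumerable<TOther> other,
        Func<T, TKey> getItemKey,
        Func<TOther, TKey> getOtherKey)
    {
        // Left outer join, then keep the items that found no partner.
        return from item in items
               join otherItem in other on getItemKey(item)
               equals getOtherKey(otherItem) into tempItems
               from temp in tempItems.DefaultIfEmpty()
               where ReferenceEquals(null, temp) || temp.Equals(default(TOther))
               select item;
    }
}

public static class Program
{
    public static void Main()
    {
        var employees = new[]
        {
            new Employee { Id = 20, Name = "Bob" },
            new Employee { Id = 10, Name = "Bill" },
            new Employee { Id = 30, Name = "Frank" }
        };
        var managers = new[]
        {
            new Manager { EmployeeId = 20 },
            new Manager { EmployeeId = 30 }
        };

        // The two lambdas pick the key that each side is compared on.
        var nonManagers = employees.Except(managers, e => e.Id, m => m.EmployeeId);

        foreach (var e in nonManagers)
            Console.WriteLine(e.Name + " (#" + e.Id + ")");
    }
}
```

Because this overload takes two key selectors, it does not clash with the built-in `Enumerable.Except` overloads, which take at most a second sequence and a comparer.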

Answer by nitzmahone

Have a look at the Except() LINQ function. It does exactly what you need.
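A caveat when applying this to the question: `Except()` compares two sequences of the same element type, so with `Employee` and `Manager` the keys have to be lined up first. The sketch below shows two possible ways to do that. It is not taken from this answer, the class and method names are hypothetical, and the second method assumes .NET 6 or later, where `Enumerable.ExceptBy` is available.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Employee
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public class Manager
{
    public int EmployeeId { get; set; }
}

// Hypothetical helper class, named only for this example.
public static class KeyedExcept
{
    // Plain Except works once both sides are projected to the same type (int ids).
    public static List<int> NonManagerIds(
        IEnumerable<Employee> employees, IEnumerable<Manager> managers)
    {
        return employees.Select(e => e.Id)
            .Except(managers.Select(m => m.EmployeeId))
            .ToList();
    }

    // Assumes .NET 6+: ExceptBy keeps the Employee objects and compares by key.
    public static List<Employee> NonManagers(
        IEnumerable<Employee> employees, IEnumerable<Manager> managers)
    {
        return employees
            .ExceptBy(managers.Select(m => m.EmployeeId), e => e.Id)
            .ToList();
    }
}

public static class Program
{
    public static void Main()
    {
        var employees = new[]
        {
            new Employee { Id = 20, Name = "Bob" },
            new Employee { Id = 10, Name = "Bill" },
            new Employee { Id = 30, Name = "Frank" }
        };
        var managers = new[]
        {
            new Manager { EmployeeId = 20 },
            new Manager { EmployeeId = 30 }
        };

        Console.WriteLine(string.Join(", ", KeyedExcept.NonManagerIds(employees, managers)));

        foreach (var e in KeyedExcept.NonManagers(employees, managers))
            Console.WriteLine(e.Name + " (#" + e.Id + ")");
    }
}
```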

Answer by G-Wiz

var nonmanagers = employees.Select(e => e.Id)
    .Except(managers.Select(m => m.EmployeeId))
    .Select(id => employees.Single(e => e.Id == id));

Answer by Partha Choudhury

var nonManagers = (from e1 in employees
                   select e1).Except(
                       from m in managers
                       from e2 in employees
                       where m.EmployeeId == e2.Id
                       select e2);

Answer by amelvin

It's a bit late (I know).

I was looking at the same problem and was considering a HashSet because of various performance hints in that direction, including @Skeet's answer to "Intersection of multiple lists with IEnumerable.Intersect()". I asked around my office, and the consensus was that a HashSet would be faster and more readable:

HashSet<int> managerIds = new HashSet<int>(managers.Select(x => x.EmployeeId));
nonManagers4 = employees.Where(x => !managerIds.Contains(x.Id)).ToList();

Then I was offered an even faster solution that uses native arrays to build a bit-mask-like lookup (although the syntax of the native array queries would put me off using them except for extreme performance reasons).
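The lookup-array idea can be sketched on its own as follows. This is only an illustration under stated assumptions: ids are non-negative and small enough that an array indexed by id is affordable, and the names `LookupArrayFilter` and `NonManagers` are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Employee
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public class Manager
{
    public int EmployeeId { get; set; }
}

// Hypothetical helper class, named only for this example.
public static class LookupArrayFilter
{
    // Assumes every id is >= 0; a negative id would throw on indexing.
    public static List<Employee> NonManagers(Employee[] employees, Manager[] managers)
    {
        // Max() throws on an empty sequence, so handle that case up front.
        if (managers.Length == 0)
            return employees.ToList();

        // isManager[id] is true exactly when some manager has that id.
        var isManager = new bool[managers.Max(m => m.EmployeeId) + 1];
        foreach (var manager in managers)
            isManager[manager.EmployeeId] = true;

        // Ids beyond the end of the array cannot belong to a manager.
        return employees
            .Where(e => e.Id >= isManager.Length || !isManager[e.Id])
            .ToList();
    }
}

public static class Program
{
    public static void Main()
    {
        var employees = new[]
        {
            new Employee { Id = 20, Name = "Bob" },
            new Employee { Id = 10, Name = "Bill" },
            new Employee { Id = 30, Name = "Frank" },
            new Employee { Id = 48, Name = "Rick" }
        };
        var managers = new[]
        {
            new Manager { EmployeeId = 20 },
            new Manager { EmployeeId = 30 }
        };

        foreach (var e in LookupArrayFilter.NonManagers(employees, managers))
            Console.WriteLine(e.Name + " (#" + e.Id + ")");
    }
}
```

The array trades memory for speed: its length is the largest manager id plus one, so sparse or very large ids make a HashSet the more practical choice.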

To give this answer a little credence after an awfully long time, I've extended your LINQPad program and data with timings so you can compare what are now six options:

void Main()
{
    var employees = new[]
    {
        new Employee { Id = 20, Name = "Bob" },
        new Employee { Id = 10, Name = "Kirk NM" },
        new Employee { Id = 48, Name = "Rick NM" },
        new Employee { Id = 42, Name = "Dick" },
        new Employee { Id = 43, Name = "Harry" },
        new Employee { Id = 44, Name = "Joe" },
        new Employee { Id = 45, Name = "Steve NM" },
        new Employee { Id = 46, Name = "Jim NM" },
        new Employee { Id = 30, Name = "Frank"},
        new Employee { Id = 47, Name = "Dave NM" },
        new Employee { Id = 49, Name = "Alex NM" },
        new Employee { Id = 50, Name = "Phil NM" },
        new Employee { Id = 51, Name = "Ed NM" },
        new Employee { Id = 52, Name = "Ollie NM" },
        new Employee { Id = 41, Name = "Bill" },
        new Employee { Id = 53, Name = "John NM" },
        new Employee { Id = 54, Name = "Simon NM" }
    };

    var managers = new[]
    {
        new Manager { EmployeeId = 20 },
        new Manager { EmployeeId = 30 },
        new Manager { EmployeeId = 41 },
        new Manager { EmployeeId = 42 },
        new Manager { EmployeeId = 43 },
        new Manager { EmployeeId = 44 }
    };

    System.Diagnostics.Stopwatch watch1 = new System.Diagnostics.Stopwatch();

    int max = 1000000;

    watch1.Start();
    List<Employee> nonManagers1 = new List<Employee>();
    foreach (var item in Enumerable.Range(1,max))
    {
        nonManagers1 = (from employee in employees where !(managers.Any(x => x.EmployeeId == employee.Id)) select employee).ToList();

    }
    nonManagers1.Dump();
    watch1.Stop();
    Console.WriteLine("Any: " + watch1.ElapsedMilliseconds);

    watch1.Restart();       
    List<Employee> nonManagers2 = new List<Employee>();
    foreach (var item in Enumerable.Range(1,max))
    {
        nonManagers2 =
        (from employee in employees
        join manager in managers
            on employee.Id equals manager.EmployeeId
        into tempManagers
        from manager in tempManagers.DefaultIfEmpty()
        where manager == null
        select employee).ToList();
    }
    nonManagers2.Dump();
    watch1.Stop();
    Console.WriteLine("temp table: " + watch1.ElapsedMilliseconds);

    watch1.Restart();       
    List<Employee> nonManagers3 = new List<Employee>();
    foreach (var item in Enumerable.Range(1,max))
    {
        nonManagers3 = employees.Except(employees.Join(managers, e => e.Id, m => m.EmployeeId, (e, m) => e)).ToList();
    }
    nonManagers3.Dump();
    watch1.Stop();
    Console.WriteLine("Except: " + watch1.ElapsedMilliseconds);

    watch1.Restart();       
    List<Employee> nonManagers4 = new List<Employee>();
    foreach (var item in Enumerable.Range(1,max))
    {
        HashSet<int> managerIds = new HashSet<int>(managers.Select(x => x.EmployeeId));
        nonManagers4 = employees.Where(x => !managerIds.Contains(x.Id)).ToList();

    }
    nonManagers4.Dump();
    watch1.Stop();
    Console.WriteLine("HashSet: " + watch1.ElapsedMilliseconds);

    watch1.Restart();
    List<Employee> nonManagers5 = new List<Employee>();
    foreach (var item in Enumerable.Range(1, max))
    {
        bool[] test = new bool[managers.Max(x => x.EmployeeId) + 1];
        foreach (var manager in managers)
        {
            test[manager.EmployeeId] = true;
        }

        nonManagers5 = employees.Where(x => x.Id > test.Length - 1 || !test[x.Id]).ToList();
    }
    nonManagers5.Dump();
    watch1.Stop();
    Console.WriteLine("Native array call: " + watch1.ElapsedMilliseconds);

    watch1.Restart();
    List<Employee> nonManagers6 = new List<Employee>();
    foreach (var item in Enumerable.Range(1, max))
    {
        bool[] test = new bool[managers.Max(x => x.EmployeeId) + 1];
        foreach (var manager in managers)
        {
            test[manager.EmployeeId] = true;
        }

        nonManagers6 = employees.Where(x => x.Id > test.Length - 1 || !test[x.Id]).ToList();
    }
    nonManagers6.Dump();
    watch1.Stop();
    Console.WriteLine("Native array call 2: " + watch1.ElapsedMilliseconds);
}

public class Employee
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public class Manager
{
    public int EmployeeId { get; set; }
}

Answer by Mahendra

It's better to left join the items and filter on the null condition:

var finalcertificates = (from globCert in resultCertificate
                         join toExcludeCert in certificatesToExclude
                             on globCert.CertificateId equals toExcludeCert.CertificateId into certs
                         from toExcludeCert in certs.DefaultIfEmpty()
                         where toExcludeCert == null
                         select globCert).Union(currentCertificate).Distinct().OrderBy(cert => cert.CertificateName);

Answer by ErikE

Managers are employees, too! So the Manager class should subclass the Employee class (or, if you don't like that, then they should both subclass a parent class, or make a NonManager class).

Then your problem is as simple as implementing the IEquatable interface on your Employee superclass (for GetHashCode, simply return the EmployeeID) and then using this code:

var nonManagerEmployees = employeeList.Except(managerList);
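A minimal sketch of what this could look like, assuming a redesigned model in which `Manager` derives from `Employee` and equality is based on `Id` alone. The class layout below is an interpretation of the answer, not code from it.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Assumed redesign: equality and hashing depend only on Id.
public class Employee : IEquatable<Employee>
{
    public int Id { get; set; }
    public string Name { get; set; }

    public bool Equals(Employee other)
    {
        return other != null && Id == other.Id;
    }

    public override bool Equals(object obj)
    {
        return Equals(obj as Employee);
    }

    public override int GetHashCode()
    {
        return Id;
    }
}

// A manager is an employee, so it inherits the Id-based equality.
public class Manager : Employee
{
}

public static class Program
{
    public static void Main()
    {
        var employeeList = new List<Employee>
        {
            new Employee { Id = 20, Name = "Bob" },
            new Employee { Id = 10, Name = "Bill" },
            new Employee { Id = 30, Name = "Frank" }
        };
        var managerList = new List<Manager>
        {
            new Manager { Id = 20, Name = "Bob" },
            new Manager { Id = 30, Name = "Frank" }
        };

        // The type argument is spelled out so the List<Manager> is
        // treated as a sequence of Employee.
        var nonManagerEmployees = employeeList.Except<Employee>(managerList);

        foreach (var e in nonManagerEmployees)
            Console.WriteLine(e.Name + " (#" + e.Id + ")");
    }
}
```

With this design, `Except` uses the default equality comparer, which picks up the `IEquatable<Employee>` implementation, so separate `Employee` and `Manager` objects with the same `Id` count as the same person.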