C# Linq - 在多个 (OR) 条件下进行左连接
声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow
原文地址: http://stackoverflow.com/questions/1264993/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me):
StackOverFlow
Linq - left join on multiple (OR) conditions
提问by Chris W
I need to do a left join on multiple conditions where the conditions are OR
s rather than AND
s. I've found lots of samples of the latter but am struggling to get the right answer for my scenario.
我需要在条件为OR
s 而不是AND
s 的多个条件上进行左连接。我发现了很多后者的样本,但我正在努力为我的场景找到正确的答案。
from a in tablea
join b in tableb on new { a.col1, a.col2 } equals new { b.col1, b.col2 }
group a by a into g
select new () { col1 = a.col1, col2 = a.col2, count = g.Count() }
works great for joins where all conditions must match. I need to get the join to match on a.col1 = b.col1 OR a.col2 = b.col2
.
适用于所有条件都必须匹配的连接。我需要让连接匹配on a.col1 = b.col1 OR a.col2 = b.col2
。
I know it must be easy but I've coming up blank on this!
我知道这一定很容易,但我对此一无所知!
Edit:
编辑:
To give a little more info, the purpose of the query is to get a projection containing all of the fields from 'a' plus a count of the matching records in 'b'. I've amended the sample above to try and illustrate what I'm after. When I run with the above using the approach Jon Skeet has noted I'm getting a count of all records from a, not the count of the related records in b.
为了提供更多信息,查询的目的是获取包含“a”中所有字段以及“b”中匹配记录计数的投影。我已经修改了上面的示例以尝试说明我所追求的内容。当我使用上面的方法运行时,Jon Skeet 注意到我从 a 中获取所有记录的计数,而不是 b 中相关记录的计数。
The basic left join works fine:
基本的左连接工作正常:
from a in tablea
from b in tableb
.Where( b => ( a.col1 == b.col1 || a.col2 == b.col2))
.DefaultIfEmpty()
select new { col1 = a.col1, col2 = a.col2 }
If I revise it to add the grouping as below
如果我修改它以添加如下分组
from a in tablea
from b in tableb
.Where( b => ( a.col1 == b.col1 || a.col2 == b.col2))
.DefaultIfEmpty()
group a by a.col1 into g
select new { col1 = g.Key, count = g.Count() }
I'm getting the count of the records returned from a - not the count of records in matching in b.
我正在获取从 a 返回的记录数 - 而不是 b 中匹配的记录数。
Edit:
编辑:
I'll give the answer to Jon - I've solved my count issue - I hadn't realized I could use a lamda to filter the count (g.Count(x => x != null))
. Plus I need to group b by a rather than a by a as I had above. This gives the correct result but the SQL is not as efficient as I'd write it by hand as it adds a correlated sub query - if anyone can advise a better way of writing it to simulate the following SQL I'd appreciate it!
我会回答 Jon - 我已经解决了我的计数问题 - 我没有意识到我可以使用 lamda 来过滤 count (g.Count(x => x != null))
。另外,我需要按 a 对 b 分组,而不是像上面那样按 a 分组。这给出了正确的结果,但 SQL 不如我手工编写的高效,因为它添加了一个相关的子查询 - 如果有人能建议一种更好的编写方式来模拟以下 SQL,我将不胜感激!
select a.col1, count(b.col1)
from tablea a
left join tableb b
on a.col1 = b.col1
or a.col2 = b.col2
group by a.col1
采纳答案by Jon Skeet
LINQ only directly supports equijoins. If you want to do any other kind of join, you basically need a cross-join and where
:
LINQ 仅直接支持等值联接。如果你想做任何其他类型的连接,你基本上需要一个交叉连接和where
:
from a in tablea
from b in tableb
where a.col1 == b.col1 || a.col2 == b.col2
select ...
It's probably worth checking what the generated SQL looks like and what the query plan is like. There may be more efficient ways of doing it, but this is probably the simplestapproach.
可能值得检查生成的 SQL 是什么样的以及查询计划是什么样的。可能有更有效的方法,但这可能是最简单的方法。
回答by Eamon Nerbonne
Depending on the query provider, you could just choose to use two from clauses:
根据查询提供程序,您可以选择使用两个 from 子句:
from a in tablea
from b in tableb
where a.col1 == b.col1 || a.col2 == b.col2
Which, if you execute on a DB, will be just as efficient. If you execute in-memory (Linq to Objects), this will enumerate all possible combinations, which may be inefficient.
如果您在 DB 上执行,则同样高效。如果在内存中执行(Linq to Objects),这将枚举所有可能的组合,这可能效率低下。
Arg, Skeeted ;-).
Arg,串接 ;-)。
More efficient Linq to Objects alternatives are possible. The join
operator enumerates each source only once, and then does a hash-join, so you could split the or-clause into two seperate joins, and then take their union. A union in linq is just a concatenation without duplicates, so that would look as follows:
更高效的 Linq to Objects 替代方案是可能的。该join
运营商列举每个源只有一次,然后做一个散列连接,所以你可以在或从句分成两个单独的连接,然后把他们的工会。linq 中的联合只是一个没有重复的串联,所以看起来如下:
(from a in tablea
join b in tableb on a.Col1 equals b.Col1
select new {a, b})
.Concat(
from a in tablea
join b in tableb on a.Col2 equals b.Col2
select new {a, b}
).Distinct()
This approach works, and it's just one query, but it's somewhat non-obvious in the sense that the performance characteristics of the code depend on understanding in detail how linq works. Personally, if you want to do a hash-join with potentially multiple matches, a more obvious tool is ToLookup
. An alternative using that might look as follows:
这种方法有效,而且它只是一个查询,但在代码的性能特征取决于对 linq 工作方式的详细理解的意义上,它有点不明显。就个人而言,如果您想对可能有多个匹配项进行哈希连接,一个更明显的工具是ToLookup
. 另一种使用方法可能如下所示:
var bBy1 = tableb.ToLookup(b=>b.Col1);
var bBy2 = tableb.ToLookup(b=>b.Col2);
var q3 =
from a in tablea
from b in bBy1[a.Col1].Concat(bBy2[a.Col2]).Distinct()
...
This solution is actually shorter, and the reason it works is more obvious, so it's the one I'd prefer. Just remember that if you split the ||
operator into two separate queries like in the above two scenarios that you need to manually avoid double-counting the results (i.e. use Distinct
).
这个解决方案实际上更短,它起作用的原因更明显,所以它是我更喜欢的一个。请记住,如果您将||
运算符拆分为两个单独的查询,如在上述两个场景中,您需要手动避免重复计算结果(即使用Distinct
)。