Note: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and author information, and attribute it to the original authors (not me): Stack Overflow
Original URL: http://stackoverflow.com/questions/1452043/

Date: 2020-08-06 17:40:53  Source: igfitidea

What is the fastest (built-in) comparison for string-types in C#

c# performance string

Asked by Willem Van Onsem

What is the fastest built-in comparison method for string types in C#? I don't care about typographical or semantic meaning: the aim is to use the comparator in sorted lists in order to search fast in large collections. I think there are only two methods: Compare and CompareOrdinal. Which is the fastest?

Additionally, is there a faster method for these string comparisons?

Accepted answer by Jon Skeet

I'm assuming you want a less-than/equal/greater-than comparison rather than just equality; equality is a slightly different topic, although the principles are basically the same. If you're actually only searching for presence in something like a SortedList, I'd consider using a Dictionary<string, XXX> instead - do you really need all that sorting?

String.CompareOrdinal, or an overload of String.Compare which allows the comparison type to be specified, with an ordinal (case-sensitive) comparison, e.g. String.Compare(x, y, StringComparison.Ordinal), will be the fastest.
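
For the sorted-collection use case in the question, a rough sketch of how an ordinal comparer could be plugged in. The class and method names (OrdinalSortedSearch, BuildSorted, Contains) are made up for illustration; only List<T>.Sort, List<T>.BinarySearch and StringComparer.Ordinal are framework members.

```csharp
using System;
using System.Collections.Generic;

public static class OrdinalSortedSearch
{
    // Sorts a copy of the input with the ordinal comparer so that
    // later binary searches can rely on the same ordering.
    public static List<string> BuildSorted(IEnumerable<string> items)
    {
        var sorted = new List<string>(items);
        sorted.Sort(StringComparer.Ordinal);
        return sorted;
    }

    // Binary search must use the same comparer the list was sorted with.
    public static bool Contains(List<string> sorted, string key)
    {
        return sorted.BinarySearch(key, StringComparer.Ordinal) >= 0;
    }
}
```

With the input "pear", "Apple", "banana", BuildSorted yields Apple, banana, pear (uppercase letters sort before lowercase ones in ordinal order), and Contains finds "banana" but not "apple".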

Basically an ordinal comparison just needs to walk the two strings, character by character, until it finds a difference. If it doesn't find any differences, and the lengths are the same, the result is 0. If it doesn't find any differences but the lengths aren't the same, the longer string is deemed "larger". If it does find a difference, it can immediately work out which is deemed "larger" based on which character is "larger" in ordinal terms.

To put it another way: it's like doing the obvious comparison between two char[] values.
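
A minimal sketch of that walk, assuming non-null inputs. The names OrdinalSketch and Compare are hypothetical, and this is only an illustration of the idea, not how the framework actually implements String.CompareOrdinal.

```csharp
using System;

public static class OrdinalSketch
{
    // Illustrative char-by-char comparison; assumes x and y are not null.
    public static int Compare(string x, string y)
    {
        int shared = Math.Min(x.Length, y.Length);
        for (int i = 0; i < shared; i++)
        {
            if (x[i] != y[i])
            {
                // First differing character decides the result.
                return x[i] < y[i] ? -1 : 1;
            }
        }
        // No difference in the shared prefix: the longer string is "larger".
        return x.Length.CompareTo(y.Length);
    }
}
```

The sign of OrdinalSketch.Compare(x, y) should agree with the sign of String.CompareOrdinal(x, y) for non-null strings; the exact magnitude returned by the built-in method is not something to rely on.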

Culture-sensitive comparisons have to perform all kinds of tortuous feats, depending on the precise culture you use. For an example of this, see this question. It's pretty clear that having more complex rules to follow can make this slower.
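
To see that the two kinds of comparison really follow different rules, here is a small sketch that reduces each result to its sign. The helper names (CultureVersusOrdinal, OrdinalSign, InvariantSign) are invented for this example.

```csharp
using System;

public static class CultureVersusOrdinal
{
    // Sign of an ordinal comparison: based purely on UTF-16 code unit values.
    public static int OrdinalSign(string x, string y)
    {
        return Math.Sign(string.Compare(x, y, StringComparison.Ordinal));
    }

    // Sign of a culture-sensitive comparison using the invariant culture.
    public static int InvariantSign(string x, string y)
    {
        return Math.Sign(string.Compare(x, y, StringComparison.InvariantCulture));
    }
}
```

OrdinalSign("a", "B") is 1, because 'a' (U+0061) has a larger code unit value than 'B' (U+0042). InvariantSign("a", "B") is normally -1, since linguistic rules put "a" before "B"; the exact behaviour depends on the globalization data available to the runtime.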

Answer by Sam Harwell

Fastest is interned strings with a reference equality test, but you only get equality testing and it's at the heavy expense of memory - so expensive that it's almost never the recommended course.
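
A small sketch of what interning plus a reference test looks like. InternDemo and its two methods are hypothetical helpers; string.Intern and object.ReferenceEquals are the framework calls being illustrated.

```csharp
using System;

public static class InternDemo
{
    // Builds a string at run time, so the result is a new object
    // rather than a reference to an existing literal.
    public static string BuildAtRuntime(string a, string b)
    {
        return string.Concat(a, b);
    }

    // Pure reference comparison: no characters are inspected.
    public static bool SameReference(string x, string y)
    {
        return object.ReferenceEquals(x, y);
    }
}
```

Given string literal = "hello" and string built = InternDemo.BuildAtRuntime("hel", "lo"), SameReference(literal, built) is false even though the contents match, while SameReference(string.Intern(literal), string.Intern(built)) is true. Interned strings stay in the intern pool, which is where the memory cost comes from.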

Past that, a case-sensitive ordinal test will be the fastest, and this method is absolutely recommended for non-culture-specific strings. A case-sensitive comparison is faster than a case-insensitive one, so prefer it if it works for your use case.

When you specify either StringComparison.Ordinal or StringComparison.OrdinalIgnoreCase, the string comparison will be non-linguistic. That is, the features that are specific to the natural language are ignored when making comparison decisions. This means the decisions are based on simple byte comparisons and ignore casing or equivalence tables that are parameterized by culture. As a result, by explicitly setting the parameter to either StringComparison.Ordinal or StringComparison.OrdinalIgnoreCase, your code often gains speed, increases correctness, and becomes more reliable.

Source

Answer by Jaswant Agarwal

I checked both string.Compare and string.CompareOrdinal using a Stopwatch:

    // Case 1: CompareOrdinal (goes in the first button's click handler)
    Stopwatch sw = new Stopwatch();
    sw.Start();
    int x = string.CompareOrdinal("Jaswant Agarwal", "Jaswant Agarwal");
    sw.Stop();
    lblTimeGap.Text = sw.Elapsed.ToString();

    // Case 2: Compare only (goes in the second button's click handler)
    Stopwatch sw = new Stopwatch();
    sw.Start();
    int x = string.Compare("Jaswant Agarwal", "Jaswant Agarwal");
    sw.Stop();
    lblTimeGap.Text = sw.Elapsed.ToString();

In case 1, the average elapsed time was 00:00:00.0000030. In case 2, the average elapsed time was 00:00:00.0000086.

I tried different combinations of equal and unequal strings and found that CompareOrdinal was faster than plain Compare every time.

That is my own observation. You can try it yourself: just put two buttons on a form and copy and paste this code into the corresponding click events.

Answer by gb2d

I designed a unit test to test string comparison speed using some of the methods mentioned in this post. This test was run using .NET 4.

In short, there isn't much difference, and I had to go to 100,000,000 iterations to see a significant difference. Since it seems the characters are compared in turn until a difference is found, how similar the strings are inevitably plays a part.

These results actually seem to suggest that using str1.Equals(str2) is the fastest way to compare strings.

These are the results of the test (elapsed milliseconds), followed by the test class:

######## SET 1 compared strings are the same: 0
#### Basic == compare: 413
#### Equals compare: 355
#### Equals(compare2, StringComparison.Ordinal) compare: 387
#### String.Compare(compare1, compare2, StringComparison.Ordinal) compare: 426
#### String.CompareOrdinal(compare1, compare2) compare: 412

######## SET 2 compared strings are NOT the same: 0
#### Basic == compare: 710
#### Equals compare: 733
#### Equals(compare2, StringComparison.Ordinal) compare: 840
#### String.Compare(compare1, compare2, StringComparison.Ordinal) compare: 987
#### String.CompareOrdinal(compare1, compare2) compare: 776

using System;
using System.Diagnostics;
using NUnit.Framework;

namespace Fwr.UnitTests
{
    [TestFixture]
    public class StringTests
    {
        [Test]
        public void Test_fast_string_compare()
        {
            int iterations = 100000000;
            bool result = false;
            var stopWatch = new Stopwatch();

            Debug.WriteLine("######## SET 1 compared strings are the same: " + stopWatch.ElapsedMilliseconds);

            string compare1 = "xxxxxxxxxxxxxxxxxx";
            string compare2 = "xxxxxxxxxxxxxxxxxx";

            // Test 1

            stopWatch.Start();

            for (int i = 0; i < iterations; i++)
            {
                result = compare1 == compare2;
            }

            stopWatch.Stop();

            Debug.WriteLine("#### Basic == compare: " + stopWatch.ElapsedMilliseconds);

            stopWatch.Reset();

            // Test 2

            stopWatch.Start();

            for (int i = 0; i < iterations; i++)
            {
                result = compare1.Equals(compare2);
            }

            stopWatch.Stop();

            Debug.WriteLine("#### Equals compare: " + stopWatch.ElapsedMilliseconds);

            stopWatch.Reset();

            // Test 3

            stopWatch.Start();

            for (int i = 0; i < iterations; i++)
            {
                result = compare1.Equals(compare2, StringComparison.Ordinal);
            }

            stopWatch.Stop();

            Debug.WriteLine("#### Equals(compare2, StringComparison.Ordinal) compare: " + stopWatch.ElapsedMilliseconds);

            stopWatch.Reset();

            // Test 4

            stopWatch.Start();

            for (int i = 0; i < iterations; i++)
            {
                result = String.Compare(compare1, compare2, StringComparison.Ordinal) != 0;
            }

            stopWatch.Stop();

            Debug.WriteLine("#### String.Compare(compare1, compare2, StringComparison.Ordinal) compare: " + stopWatch.ElapsedMilliseconds);

            stopWatch.Reset();

            // Test 5

            stopWatch.Start();

            for (int i = 0; i < iterations; i++)
            {
                result = String.CompareOrdinal(compare1, compare2) != 0;
            }

            stopWatch.Stop();

            Debug.WriteLine("#### String.CompareOrdinal(compare1, compare2) compare: " + stopWatch.ElapsedMilliseconds);

            stopWatch.Reset();

            Debug.WriteLine("######## SET 2 compared strings are NOT the same: " + stopWatch.ElapsedMilliseconds);

            compare1 = "ueoqwwnsdlkskjsowy";
            compare2 = "sakjdjsjahsdhsjdak";

            // Test 1

            stopWatch.Start();

            for (int i = 0; i < iterations; i++)
            {
                result = compare1 == compare2;
            }

            stopWatch.Stop();

            Debug.WriteLine("#### Basic == compare: " + stopWatch.ElapsedMilliseconds);

            stopWatch.Reset();

            // Test 2

            stopWatch.Start();

            for (int i = 0; i < iterations; i++)
            {
                result = compare1.Equals(compare2);
            }

            stopWatch.Stop();

            Debug.WriteLine("#### Equals compare: " + stopWatch.ElapsedMilliseconds);

            stopWatch.Reset();

            // Test 3

            stopWatch.Start();

            for (int i = 0; i < iterations; i++)
            {
                result = compare1.Equals(compare2, StringComparison.Ordinal);
            }

            stopWatch.Stop();

            Debug.WriteLine("#### Equals(compare2, StringComparison.Ordinal) compare: " + stopWatch.ElapsedMilliseconds);

            stopWatch.Reset();

            // Test 4

            stopWatch.Start();

            for (int i = 0; i < iterations; i++)
            {
                result = String.Compare(compare1, compare2, StringComparison.Ordinal) != 0;
            }

            stopWatch.Stop();

            Debug.WriteLine("#### String.Compare(compare1, compare2, StringComparison.Ordinal) compare: " + stopWatch.ElapsedMilliseconds);

            stopWatch.Reset();

            // Test 5

            stopWatch.Start();

            for (int i = 0; i < iterations; i++)
            {
                result = String.CompareOrdinal(compare1, compare2) != 0;
            }

            stopWatch.Stop();

            Debug.WriteLine("#### String.CompareOrdinal(compare1, compare2) compare: " + stopWatch.ElapsedMilliseconds);

            stopWatch.Reset();
        }
    }
}

Answer by Peter Jamsmenson

I just noticed a 50% performance increase in my own code by comparing string lengths first and, only if they are equal, then using the string.Compare methods. So in a loop I have:

VB:

If strA.Length = strB.Length Then
   If String.Compare(strA, strB, True) = 0 Then
      ' they are equal
   End If
End If

C#:

if(strA.Length == strB.Length)
{
   if(string.Compare(strA,strB,true) == 0)
   {
       //they are equal
   }
}

This could be dependent on your own strings, but it seems to have worked well for me.
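
A hedged variant of the same idea, swapped over to an ordinal, case-insensitive comparison (the answer itself uses the culture-sensitive string.Compare overload). With ordinal rules, two equal strings always have the same length, so the length check cannot reject a genuine match; culture-sensitive comparisons can ignore some characters, so that guarantee does not necessarily carry over. LengthFirst and EqualsIgnoreCase are made-up names, and the built-in method may already perform a similar check internally, so measure before relying on it.

```csharp
using System;

public static class LengthFirst
{
    // Cheap length check first, then an ordinal case-insensitive comparison.
    public static bool EqualsIgnoreCase(string a, string b)
    {
        if (a == null || b == null)
        {
            // Equal only when both are null.
            return object.ReferenceEquals(a, b);
        }
        if (a.Length != b.Length)
        {
            return false;
        }
        return string.Equals(a, b, StringComparison.OrdinalIgnoreCase);
    }
}
```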

Answer by tekiegirl

This might be useful to someone: changing one line of my code brought the unit test of my method down from 140ms to 1ms!

Original

Unit test: 140ms

public bool StringsMatch(string string1, string string2)
{
    if (string1 == null && string2 == null) return true;
    // Note: this instance call throws NullReferenceException when only string1 is null.
    return string1.Equals(string2, StringComparison.Ordinal);
}

New

Unit test: 1ms

public bool StringsMatch(string string1, string string2)
{
    if (string1 == null && string2 == null) return true;
    return string.CompareOrdinal(string1, string2) == 0;
}

Unit Test (NUnit)

[Test]
public void StringsMatch_OnlyString1NullOrEmpty_ReturnFalse()
{
    Authentication auth = new Authentication();
    Assert.IsFalse(auth.StringsMatch(null, "foo"));
    Assert.IsFalse(auth.StringsMatch("", "foo"));
}

Interestingly, StringsMatch_OnlyString1NullOrEmpty_ReturnFalse() was the only unit test that took 140ms for the StringsMatch method. StringsMatch_AllParamsNullOrEmpty_ReturnTrue() was always 1ms and StringsMatch_OnlyString2NullOrEmpty_ReturnFalse() always <1ms.
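
One thing worth checking in a case like this: calling an instance method such as string1.Equals(...) on a null reference throws, and the cost of that exception may be part of what the slow test measured (that is a guess, not something the answer confirms). A null-safe sketch using the static string.Equals overload, which accepts null for either argument; the NullSafeMatch wrapper class is hypothetical.

```csharp
using System;

public static class NullSafeMatch
{
    // The static overload handles null on either side without throwing,
    // so no separate null check is needed.
    public static bool StringsMatch(string string1, string string2)
    {
        return string.Equals(string1, string2, StringComparison.Ordinal);
    }
}
```

Here StringsMatch(null, null) is true, while StringsMatch(null, "foo") and StringsMatch("", "foo") are both false.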

Answer by buddybubble

This is quite an old question, but since I found it, others might as well.

In researching this topic a bit further, I came upon an interesting blog post that compares all methods for string comparison. Probably not highly scientific, but still a good ballpark figure.

Thanks to this article I started using string.CompareOrdinal in a scenario where I had to find out whether one string was in a list of 170,000 other strings, and do this 1600 times in a row. string.CompareOrdinal made it almost 50% faster compared to string.Equals.

Answer by buddybubble

I think there are a few ways most C# developers go about comparing strings, with the following being the most common:

  • Compare - as you mentioned
  • CompareOrdinal - as you mentioned
  • ==
  • String.Equals
  • writing a custom algorithm to compare char by char

If you want to go to extremes, you can use other objects/methods that aren't so obvious:

  • SequenceEqual example (requires System.Linq):

    var c1 = str1.ToCharArray(); var c2 = str2.ToCharArray(); if (c1.SequenceEqual(c2))

  • IndexOf example: if (stringsWeAreComparingAgainst.IndexOf(stringsWeWantToSeeIfMatches, 0, stringsWeWantToSeeIfMatches.Length) == 0)

  • Or you can use a Dictionary or HashSet, with the strings as "keys", and test whether the string you want to compare against already exists in it. For instance: if (hs.Contains(stringsWeWantToSeeIfMatches))
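
For the last option, a rough sketch of a hash-based lookup with an explicit ordinal comparer. OrdinalLookup, Build and CountMatches are illustrative names only; HashSet<string> and StringComparer.Ordinal are the real framework pieces.

```csharp
using System;
using System.Collections.Generic;

public static class OrdinalLookup
{
    // Builds the set once; lookups then hash the key instead of
    // comparing it against every stored string.
    public static HashSet<string> Build(IEnumerable<string> items)
    {
        return new HashSet<string>(items, StringComparer.Ordinal);
    }

    // Counts how many candidates are present in the set.
    public static int CountMatches(HashSet<string> known, IEnumerable<string> candidates)
    {
        int matches = 0;
        foreach (string candidate in candidates)
        {
            if (known.Contains(candidate))
            {
                matches++;
            }
        }
        return matches;
    }
}
```

For a set built from "alpha", "beta", "gamma", checking the candidates "beta", "Beta", "delta" gives one match, because the ordinal comparer is case-sensitive.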

So feel free to slice and dice to find your own ways of doing things. Remember, though, that someone is going to have to maintain the code and probably won't want to spend time trying to figure out why you're using whatever method you've decided to use.

As always, optimize at your own risk. :-)