C#: Split a string into chunks of a certain size
Disclaimer: this page is a translated copy of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and authors, and attribute it to the original authors (not me): StackOverflow
Original URL: http://stackoverflow.com/questions/1450774/
Splitting a string into chunks of a certain size
Question
Suppose I had a string:
string str = "1111222233334444";
How can I break this string into chunks of some size?
For example, breaking it into chunks of size 4 would return the strings:
"1111"
"2222"
"3333"
"4444"
Accepted answer by Konstantin Spirin
static IEnumerable<string> Split(string str, int chunkSize)
{
    return Enumerable.Range(0, str.Length / chunkSize)
        .Select(i => str.Substring(i * chunkSize, chunkSize));
}
Please note that additional code might be required to gracefully handle edge cases (null or empty input string, chunkSize == 0, input string length not divisible by chunkSize, etc.). The original question doesn't specify any requirements for these edge cases, and in real life the requirements might vary, so they are out of scope for this answer.
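The edge cases listed above could be handled along these lines. This is only a sketch under assumed requirements, since the question states none: null input and a non-positive chunkSize throw, an empty string yields no chunks, and a shorter final chunk is returned when the length is not divisible by chunkSize. The names ChunkSketch and SplitSafe are made up for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class ChunkSketch
{
    // Hypothetical helper: validates its arguments eagerly, then yields chunks lazily.
    public static IEnumerable<string> SplitSafe(string str, int chunkSize)
    {
        if (str == null) throw new ArgumentNullException(nameof(str));
        if (chunkSize <= 0) throw new ArgumentOutOfRangeException(nameof(chunkSize));
        return Iterate(str, chunkSize);
    }

    private static IEnumerable<string> Iterate(string str, int chunkSize)
    {
        for (int i = 0; i < str.Length; i += chunkSize)
        {
            // The last chunk may be shorter than chunkSize.
            yield return str.Substring(i, Math.Min(chunkSize, str.Length - i));
        }
    }
}
```

Splitting validation from iteration means a bad argument fails at the call site rather than on first enumeration. A call such as ChunkSketch.SplitSafe("111122223333444455", 4) would then produce four full chunks followed by "55".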
Answer by dove
Why not loops? Here's something that would do it quite well:
string str = "111122223333444455";
int chunkSize = 4;
int stringLength = str.Length;
for (int i = 0; i < stringLength; i += chunkSize)
{
    if (i + chunkSize > stringLength) chunkSize = stringLength - i;
    Console.WriteLine(str.Substring(i, chunkSize));
}
Console.ReadLine();
I don't know how you'd deal with the case where the string length is not a multiple of 4. I'm not saying your idea is not possible; I'm just wondering about the motivation for it if a simple for loop does the job very well. Obviously the above could be cleaned up and even turned into an extension method.
Or, as mentioned in the comments, if you know the length is divisible by 4, then:
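str = "1111222233334444";
stringLength = str.Length; // recompute for the new string
chunkSize = 4; // reset, since the loop above may have shrunk it
for (int i = 0; i < stringLength; i += chunkSize)
    Console.WriteLine(str.Substring(i, chunkSize));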
Answer by Guffa
It's not pretty and it's not fast, but it works, it's a one-liner and it's LINQy:
List<string> a = text.Select((c, i) => new { Char = c, Index = i }).GroupBy(o => o.Index / 4).Select(g => new String(g.Select(o => o.Char).ToArray())).ToList();
Answer by João Silva
Using regular expressions and LINQ:
List<string> groups = (from Match m in Regex.Matches(str, @"\d{4}")
                       select m.Value).ToList();
I find this to be more readable, but it's just a personal opinion. It can also be a one-liner :)
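Note that the pattern \d{4} matches only digits and skips a trailing group shorter than four characters. A possible generalization, assuming any character may appear, a shorter final chunk is wanted, and chunkSize is at least 1, is the pattern .{1,n} with RegexOptions.Singleline. The names RegexChunkSketch and Chunk are illustrative only.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class RegexChunkSketch
{
    // Assumes chunkSize >= 1; a smaller value makes the quantifier invalid.
    public static List<string> Chunk(string str, int chunkSize)
    {
        // ".{1,n}" greedily takes up to n characters per match;
        // Singleline lets '.' match newline characters as well.
        string pattern = ".{1," + chunkSize + "}";
        return Regex.Matches(str, pattern, RegexOptions.Singleline)
                    .Cast<Match>()
                    .Select(m => m.Value)
                    .ToList();
    }
}
```

With this pattern, RegexChunkSketch.Chunk("abcde", 2) would give "ab", "cd" and "e".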
Answer by Eamon Nerbonne
Combining dove's and Konstantin's answers:
static IEnumerable<string> WholeChunks(string str, int chunkSize) {
    for (int i = 0; i < str.Length; i += chunkSize)
        yield return str.Substring(i, chunkSize);
}
This will work for all strings that can be split into a whole number of chunks, and will throw an exception otherwise.
If you want to support strings of any length you could use the following code:
static IEnumerable<string> ChunksUpto(string str, int maxChunkSize) {
    for (int i = 0; i < str.Length; i += maxChunkSize)
        yield return str.Substring(i, Math.Min(maxChunkSize, str.Length - i));
}
However, the OP explicitly stated he does not need this; it's somewhat longer, harder to read, and slightly slower. In the spirit of KISS and YAGNI, I'd go with the first option: it's probably the most efficient implementation possible, and it's very short, readable, and, importantly, throws an exception for nonconforming input.
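To make the difference between the two options concrete, the sketch below repeats both methods (made public here so they can be called from outside) and adds a hypothetical helper, WholeChunksThrows, that is not part of the answer. Because both methods are iterators, the exception from Substring surfaces only when the sequence is enumerated, not when the method is called.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ChunkComparison
{
    // Same logic as WholeChunks in the answer above.
    public static IEnumerable<string> WholeChunks(string str, int chunkSize)
    {
        for (int i = 0; i < str.Length; i += chunkSize)
            yield return str.Substring(i, chunkSize);
    }

    // Same logic as ChunksUpto in the answer above.
    public static IEnumerable<string> ChunksUpto(string str, int maxChunkSize)
    {
        for (int i = 0; i < str.Length; i += maxChunkSize)
            yield return str.Substring(i, Math.Min(maxChunkSize, str.Length - i));
    }

    // Hypothetical helper: reports whether fully enumerating WholeChunks throws.
    public static bool WholeChunksThrows(string str, int chunkSize)
    {
        try
        {
            WholeChunks(str, chunkSize).ToList(); // forces enumeration
            return false;
        }
        catch (ArgumentOutOfRangeException)
        {
            // Substring rejects a range that runs past the end of the string.
            return true;
        }
    }
}
```

For a five-character input and a chunk size of 4, WholeChunksThrows reports an exception, while ChunksUpto returns a four-character chunk followed by a one-character chunk.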
Answer by Alan Moore
How's this for a one-liner?
List<string> result = new List<string>(Regex.Split(target, @"(?<=\G.{4})", RegexOptions.Singleline));
With this regex it doesn't matter if the last chunk is less than four characters, because it only ever looks at the characters behind it.
I'm sure this isn't the most efficient solution, but I just had to toss it out there.
Answer by HoloEd
public static IEnumerable<IEnumerable<T>> SplitEvery<T>(this IEnumerable<T> values, int n)
{
    var ls = values.Take(n);
    var rs = values.Skip(n);
    return ls.Any() ?
        Cons(ls, SplitEvery(rs, n)) :
        Enumerable.Empty<IEnumerable<T>>();
}
public static IEnumerable<T> Cons<T>(T x, IEnumerable<T> xs)
{
    yield return x;
    foreach (var xi in xs)
        yield return xi;
}
Answer by Abhishek Shrestha
Changed slightly to also return a final part whose size is not equal to chunkSize:
public static IEnumerable<string> Split(this string str, int chunkSize)
{
    var splits = new List<string>();
    if (str.Length == 0) { return splits; } // otherwise chunkSize becomes 0 below and the division throws
    if (str.Length < chunkSize) { chunkSize = str.Length; }
    int wholeChunks = str.Length / chunkSize;
    splits.AddRange(Enumerable.Range(0, wholeChunks).Select(i => str.Substring(i * chunkSize, chunkSize)));
    // Add the shorter trailing part only when one exists, so no empty string is appended.
    if (str.Length % chunkSize > 0) { splits.Add(str.Substring(wholeChunks * chunkSize)); }
    return splits;
}
Answer by Jeff Mercado
This should be much faster and more efficient than using LINQ or other approaches used here.
public static IEnumerable<string> Splice(this string s, int spliceLength)
{
    if (s == null)
        throw new ArgumentNullException("s");
    if (spliceLength < 1)
        throw new ArgumentOutOfRangeException("spliceLength");
    if (s.Length == 0)
        yield break;
    var start = 0;
    for (var end = spliceLength; end < s.Length; end += spliceLength)
    {
        yield return s.Substring(start, spliceLength);
        start = end;
    }
    yield return s.Substring(start);
}
Answer by Seth
An important tip if the string that is being chunked needs to support all Unicode characters.
If the string must support international characters that occupy more than one Char, split up the string using the System.Globalization.StringInfo class. Using StringInfo, you can split up the string based on the number of text elements.
string internationalString = '';
The above string has a Length of 2, because the String.Length property returns the number of Char objects in this instance, not the number of Unicode characters.
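Chunking by text elements rather than by Char could look roughly like the sketch below. It relies on StringInfo.GetTextElementEnumerator, which walks the string one text element at a time, so a surrogate pair is never cut in half. The names TextElementChunkSketch and ChunkByTextElements are invented for illustration, and the argument checks reflect assumed requirements rather than anything stated in the answer.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Text;

public static class TextElementChunkSketch
{
    // Hypothetical helper: chunkSize counts text elements, not Char values.
    public static IEnumerable<string> ChunkByTextElements(string str, int chunkSize)
    {
        if (str == null) throw new ArgumentNullException(nameof(str));
        if (chunkSize <= 0) throw new ArgumentOutOfRangeException(nameof(chunkSize));
        return Iterate(str, chunkSize);
    }

    private static IEnumerable<string> Iterate(string str, int chunkSize)
    {
        var current = new StringBuilder();
        int count = 0;
        TextElementEnumerator elements = StringInfo.GetTextElementEnumerator(str);
        while (elements.MoveNext())
        {
            // Each text element may consist of one or more Char values.
            current.Append(elements.GetTextElement());
            if (++count == chunkSize)
            {
                yield return current.ToString();
                current.Clear();
                count = 0;
            }
        }
        // Emit whatever is left as a shorter final chunk.
        if (count > 0) yield return current.ToString();
    }
}
```

A chunk produced this way can have a Length larger than chunkSize, because Length still counts Char values while the chunk size counts text elements.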