Converting byte array to string and back again in C#

Notice: this page is derived from a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/1422314/

Time: 2020-08-06 16:22:34  Source: igfitidea

Tags: c#, string, file, bytearray

Asked by Brian Hicks

So here's the deal: I'm trying to open a file (from bytes), convert it to a string so I can mess with some metadata in the header, convert it back to bytes, and save it. The problem I'm running into right now is with this code. When I compare the string that's been converted back and forth (but not otherwise modified) to the original byte array, it's unequal. How can I make this work?

public static byte[] StringToByteArray(string str)
{
    UTF8Encoding encoding = new UTF8Encoding();
    return encoding.GetBytes(str);
}

public string ByteArrayToString(byte[] input)
{
    UTF8Encoding enc = new UTF8Encoding();
    string str = enc.GetString(input);
    return str;
}

Here's how I'm comparing them.

byte[] fileData = GetBinaryData(filesindir[0], Convert.ToInt32(fi.Length));
string fileDataString = ByteArrayToString(fileData);
byte[] recapturedBytes = StringToByteArray(fileDataString);
Response.Write((fileData == recapturedBytes));

I'm sure it's UTF-8, using:

StreamReader sr = new StreamReader(filesindir[0]);
Response.Write(sr.CurrentEncoding);

which returns "System.Text.UTF8Encoding".

Accepted answer by Adam Robinson

Try the static properties on the Encoding class, which provide you with instances of the various encodings. You shouldn't need to instantiate an Encoding just to convert to/from a byte array. How are you comparing the strings in code?
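As a minimal sketch of that suggestion, the question's two helpers could be rewritten on top of the shared Encoding.UTF8 instance. The class name EncodingHelpers and the sample text are made up for illustration; the method names are taken from the question.

```csharp
using System;
using System.Text;

// Hypothetical container class; only the two method names come from the question.
public static class EncodingHelpers
{
    // Encode text to UTF-8 bytes using the shared static instance.
    public static byte[] StringToByteArray(string str)
    {
        return Encoding.UTF8.GetBytes(str);
    }

    // Decode UTF-8 bytes back into a .NET string.
    public static string ByteArrayToString(byte[] input)
    {
        return Encoding.UTF8.GetString(input);
    }

    public static void Main()
    {
        byte[] bytes = StringToByteArray("header: value");
        Console.WriteLine(ByteArrayToString(bytes)); // prints "header: value"
    }
}
```

This only tidies up the conversion; it does not change the result of the == comparison in the question.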

Edit

You're comparing arrays, not strings. They're unequal because they refer to two different arrays; using the == operator will only compare their references, not their values. You'll need to inspect each element of the array in order to determine if they are equivalent.

public bool CompareByteArrays(byte[] lValue, byte[] rValue)
{
    if(lValue == rValue) return true; // referentially equal
    if(lValue == null || rValue == null) return false; // one is null, the other is not
    if(lValue.Length != rValue.Length) return false; // different lengths

    for(int i = 0; i < lValue.Length; i++)
    {
        if(lValue[i] != rValue[i]) return false;
    }

    return true;
}
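If LINQ is available, the same element-by-element check can probably be written more briefly with Enumerable.SequenceEqual. This is an alternative sketch, not part of the original answer; the class and method names are invented for the example.

```csharp
using System;
using System.Linq;

// Hypothetical helper; an alternative to the hand-written loop.
public static class ByteArrayComparison
{
    public static bool AreEqual(byte[] left, byte[] right)
    {
        if (ReferenceEquals(left, right)) return true;    // same array, or both null
        if (left == null || right == null) return false;  // exactly one is null
        return left.SequenceEqual(right);                 // compares length and each element
    }

    public static void Main()
    {
        byte[] a = { 1, 2, 3 };
        byte[] b = { 1, 2, 3 };
        Console.WriteLine(a == b);         // False: two different array objects
        Console.WriteLine(AreEqual(a, b)); // True: same contents
    }
}
```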

Answer by csharptest.net

Your problem would appear to be the way you're comparing the array of bytes:

Response.Write((fileData == recapturedBytes));

This will always return false since you're comparing the address of the byte array, not the values it contains. Compare the string data, or use a method of comparing the byte arrays. You could also do this instead:

Response.Write(Convert.ToBase64String(fileData) == Convert.ToBase64String(recapturedBytes));

Answer by Sam Harwell

Because .NET strings are Unicode strings, you can no longer do this the way people did in C. In most cases, you should not even attempt to go back and forth between a string and a byte array unless the contents are actually text.

I have to make this point clear: In .NET, if the byte[] data is not text, then do not attempt to convert it to a string, except for the special Base64 encoding for binary data over a text channel. This is a widely-held misunderstanding among people that work in .NET.
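To illustrate the point, here is a small sketch with made-up sample bytes. The byte 0xFF is never valid in UTF-8, so decoding it yields a replacement character and the original bytes cannot be recovered, while Base64 is designed to carry arbitrary bytes as text. The class and method names are hypothetical.

```csharp
using System;
using System.Linq;
using System.Text;

// Hypothetical demo class comparing two ways of carrying bytes through a string.
public static class BinaryRoundTrip
{
    // Decode as UTF-8 text, re-encode, and check whether the bytes came back unchanged.
    public static bool SurvivesUtf8(byte[] data)
    {
        string text = Encoding.UTF8.GetString(data);
        return Encoding.UTF8.GetBytes(text).SequenceEqual(data);
    }

    // Base64 represents arbitrary bytes as text and decodes back to the same bytes.
    public static bool SurvivesBase64(byte[] data)
    {
        string text = Convert.ToBase64String(data);
        return Convert.FromBase64String(text).SequenceEqual(data);
    }

    public static void Main()
    {
        byte[] binary = { 0xFF, 0xFE, 0x41 }; // sample data: not valid UTF-8
        Console.WriteLine(SurvivesUtf8(binary));   // False
        Console.WriteLine(SurvivesBase64(binary)); // True
    }
}
```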

Answer by J.Merrill

When you have raw bytes (8-bit possibly-not-printable characters) and want to manipulate them as a .NET string and turn them back into bytes, you can do so by using

Encoding.GetEncoding(1252)

instead of UTF8Encoding. That encoding works to take any 8-bit value and convert it to a .NET 16-bit char, and back again, without losing any information.
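The round trip through a single-byte encoding can be checked with a sketch like the one below. Note the substitution: it uses ISO-8859-1 (Latin-1) rather than code page 1252, because on .NET Core and .NET 5+ code page 1252 is only available after registering CodePagesEncodingProvider, while ISO-8859-1 is built in and maps each byte value to the char with the same numeric value. The class and method names are hypothetical.

```csharp
using System;
using System.Linq;
using System.Text;

// Hypothetical check: does every possible byte value survive bytes -> string -> bytes?
public static class SingleByteRoundTrip
{
    public static bool AllBytesSurvive(Encoding encoding)
    {
        // Build an array holding every byte value from 0 to 255.
        byte[] all = Enumerable.Range(0, 256).Select(i => (byte)i).ToArray();
        string text = encoding.GetString(all);
        return encoding.GetBytes(text).SequenceEqual(all);
    }

    public static void Main()
    {
        // Assumption: ISO-8859-1 stands in for the answer's code page 1252.
        Encoding latin1 = Encoding.GetEncoding("ISO-8859-1");
        Console.WriteLine(AllBytesSurvive(latin1));        // True
        Console.WriteLine(AllBytesSurvive(Encoding.UTF8)); // False
    }
}
```

The same helper can be pointed at Encoding.GetEncoding(1252) to test the answer's claim on a given runtime.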

In the specific case you describe above, with a binary file, you will not be able to "mess with metadata in the header" and have things work correctly unless the length of the data you mess with is unchanged. For example, if the header contains

{any}{any}ABC{any}{any}

and you want to change ABC to DEF, that should work as you'd like. But if you want to change ABC to WXYZ, you will have to write over the byte that follows "C" or you will (in essence) move everything one byte further to the right. In a typical binary file, that will mess things up greatly.

If the bytes after "ABC" are spaces or null characters, there's a better chance that writing larger replacement data will not cause trouble -- but you still cannot just replace ABC with WXYZ in the .NET string, making it longer -- you would have to replace ABC{whatever_follows_it} with WXYZ. Given that, you might find that it's easier just to leave the data as bytes and write the replacement data one byte at a time.
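One way to leave the data as bytes is sketched below: find the old value in the byte array and overwrite it in place, refusing any replacement of a different length so that nothing after it shifts. The class name, method names, and sample bytes are assumptions made for this example.

```csharp
using System;
using System.Text;

// Hypothetical helper for same-length, in-place edits of a binary header.
public static class HeaderPatcher
{
    // Returns the index of the first occurrence of pattern in data, or -1 if absent.
    public static int IndexOf(byte[] data, byte[] pattern)
    {
        for (int i = 0; i <= data.Length - pattern.Length; i++)
        {
            int j = 0;
            while (j < pattern.Length && data[i + j] == pattern[j]) j++;
            if (j == pattern.Length) return i;
        }
        return -1;
    }

    // Overwrites the first occurrence of oldValue with newValue; lengths must match.
    public static bool ReplaceInPlace(byte[] data, byte[] oldValue, byte[] newValue)
    {
        if (oldValue.Length != newValue.Length)
            throw new ArgumentException("Replacement must be the same length as the original.");

        int index = IndexOf(data, oldValue);
        if (index < 0) return false;

        Buffer.BlockCopy(newValue, 0, data, index, newValue.Length);
        return true;
    }

    public static void Main()
    {
        // Sample data standing in for {any}{any}ABC{any}{any}.
        byte[] file = { 0x00, 0x01, 0x41, 0x42, 0x43, 0x02, 0x03 };
        bool patched = ReplaceInPlace(file, Encoding.ASCII.GetBytes("ABC"), Encoding.ASCII.GetBytes("DEF"));
        Console.WriteLine(patched);                     // True
        Console.WriteLine(BitConverter.ToString(file)); // 00-01-44-45-46-02-03
    }
}
```

Changing ABC to WXYZ would throw here, which matches the constraint described above: a longer value has to overwrite the bytes that follow it rather than be inserted.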