C#: Parsing an XML string into an XML document fails if the string begins with the <?xml... ?> section

Question

Asked by agnieszka

I have an XML file beginning like this:

<?xml version="1.0" encoding="utf-8"?>
<Report xmlns:rd="http://schemas.microsoft.com/SQLServer/reporting/reportdesigner" xmlns="http://schemas.microsoft.com/sqlserver/reporting/2008/01/reportdefinition">
  <DataSources>

When I run the following code:

byte[] fileContent = //gets bytes
string stringContent = Encoding.UTF8.GetString(fileContent);
XDocument xml = XDocument.Parse(stringContent);

I get the following XmlException:

Data at the root level is invalid. Line 1, position 1.

Cutting out the XML declaration (the line with the version and encoding) fixes the problem. Why? How do I process this XML correctly?

Answer 1

Accepted answer by stevehipwell

If you only have bytes, you could either load the bytes into a stream:

XmlDocument oXML;

using (MemoryStream oStream = new MemoryStream(oBytes))
{
  oXML = new XmlDocument();
  oXML.Load(oStream);
}

Or you could convert the bytes into a string (presuming that you know the encoding) before loading the XML:

string sXml;
XmlDocument oXml;

sXml = Encoding.UTF8.GetString(oBytes);
oXml = new XmlDocument();
oXml.LoadXml(sXml);

I've written my examples to be .NET 2.0 compatible. If you're using .NET 3.5, you can use XDocument instead of XmlDocument.

Load the bytes into a stream:

XDocument oXML;

using (MemoryStream oStream = new MemoryStream(oBytes))
using (XmlTextReader oReader = new XmlTextReader(oStream))
{
  oXML = XDocument.Load(oReader);
}

Convert the bytes into a string:

string sXml;
XDocument oXml;

sXml = Encoding.UTF8.GetString(oBytes);
oXml = XDocument.Parse(sXml);
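
One caveat with the string route, as far as I can tell: Encoding.UTF8.GetString does not strip a leading UTF-8 BOM, so the decoded string can start with the character U+FEFF, which is what XDocument.Parse rejects at line 1, position 1. A minimal sketch of one way around it is to decode through a StreamReader, which consumes the BOM while reading. The class and method names (XmlBytes, ParseViaReader) are made up for illustration, and the sketch assumes the bytes really are UTF-8.

```csharp
using System.IO;
using System.Text;
using System.Xml.Linq;

// Hypothetical helper, not part of the original answer.
public static class XmlBytes
{
    // Decodes the bytes through a StreamReader, which skips a leading
    // UTF-8 BOM instead of handing it on as U+FEFF, then parses the text.
    public static XDocument ParseViaReader(byte[] oBytes)
    {
        using (MemoryStream oStream = new MemoryStream(oBytes))
        using (StreamReader oReader = new StreamReader(oStream, Encoding.UTF8))
        {
            string sXml = oReader.ReadToEnd();
            return XDocument.Parse(sXml);
        }
    }
}
```

Calling XmlBytes.ParseViaReader(oBytes) should then behave the same whether or not the bytes start with a BOM.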

Answer 2

Answer by Brian Agnew

Do you have a byte order mark (BOM) at the beginning of your XML, and does it match your encoding? If you chop out your header, you'll also chop out the BOM, and if the BOM was the problem, subsequent parsing may then work.

You may need to inspect your document at the byte level to see the BOM.
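
A byte-level check could be sketched roughly like this. It only looks for the UTF-8 BOM (bytes EF BB BF); BomCheck and HasUtf8Bom are hypothetical names chosen for this example, and other encodings have different signatures that this sketch does not cover.

```csharp
using System.Text;

// Hypothetical helper for inspecting the first bytes of a document.
public static class BomCheck
{
    // Returns true when the data starts with the UTF-8 preamble (EF BB BF).
    public static bool HasUtf8Bom(byte[] data)
    {
        byte[] preamble = Encoding.UTF8.GetPreamble();
        if (data == null || data.Length < preamble.Length)
        {
            return false;
        }
        for (int i = 0; i < preamble.Length; i++)
        {
            if (data[i] != preamble[i])
            {
                return false;
            }
        }
        return true;
    }
}
```

Running BomCheck.HasUtf8Bom on the raw file bytes before decoding them tells you whether the BOM is a plausible cause of the exception.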

Answer 3

Answer by Darin Dimitrov

Why bother reading the file as a byte sequence and then converting it to a string when it is an XML file? Just let the framework do the loading for you and cope with the encodings:

var xml = XDocument.Load("test.xml");

Answer 4

Answer by Dave Cluderay

My first thought was that the encoding is Unicode (UTF-16) when parsing XML from a .NET string, which would conflict with the declared encoding. It seems, though, that XDocument's parsing is quite forgiving in this respect.

The problem is actually related to the UTF-8 preamble/byte order mark (BOM), which is a three-byte signature optionally present at the start of a UTF-8 stream. These three bytes are a hint as to the encoding being used in the stream.

You can determine the preamble of an encoding by calling the GetPreamble method on an instance of the System.Text.Encoding class. For example:

// returns { 0xEF, 0xBB, 0xBF }
byte[] preamble = Encoding.UTF8.GetPreamble();

The preamble should be handled correctly by XmlTextReader, so simply load your XDocument from an XmlTextReader:

XDocument xml;
using (var xmlStream = new MemoryStream(fileContent))
using (var xmlReader = new XmlTextReader(xmlStream))
{
    xml = XDocument.Load(xmlReader);
}
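
To see the difference for yourself, here is a rough reproduction sketch. It builds a UTF-8 document with a BOM, shows that the GetString plus XDocument.Parse route throws, and loads the same bytes through an XML reader. BomRepro and its method names are invented for this example, and XmlReader.Create is used in place of XmlTextReader only because it is the factory method Microsoft recommends; it is assumed to handle the preamble the same way.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;
using System.Xml.Linq;

// Hypothetical reproduction helpers, not taken from the original post.
public static class BomRepro
{
    // Prepends the UTF-8 preamble to the UTF-8 bytes of the given XML text.
    public static byte[] WithBom(string xml)
    {
        byte[] preamble = Encoding.UTF8.GetPreamble();
        byte[] body = Encoding.UTF8.GetBytes(xml);
        byte[] result = new byte[preamble.Length + body.Length];
        Buffer.BlockCopy(preamble, 0, result, 0, preamble.Length);
        Buffer.BlockCopy(body, 0, result, preamble.Length, body.Length);
        return result;
    }

    // Returns true when decoding to a string and parsing throws XmlException.
    public static bool ParseFromStringFails(byte[] fileContent)
    {
        string text = Encoding.UTF8.GetString(fileContent);
        try
        {
            XDocument.Parse(text);
            return false;
        }
        catch (XmlException)
        {
            return true;
        }
    }

    // Loads the document from the raw bytes and lets the reader deal with the BOM.
    public static XDocument LoadFromReader(byte[] fileContent)
    {
        using (var xmlStream = new MemoryStream(fileContent))
        using (var xmlReader = XmlReader.Create(xmlStream))
        {
            return XDocument.Load(xmlReader);
        }
    }
}
```

With bytes produced by BomRepro.WithBom, ParseFromStringFails should report a failure while LoadFromReader returns the document, which matches the behaviour described in the question.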

Answer 5

Answer by eugene.sushilnikov

Try this:

int startIndex = xmlString.IndexOf('<');
if (startIndex > 0)
{
    xmlString = xmlString.Remove(0, startIndex);
}
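
The snippet above drops everything before the first '<', which also removes a leading BOM character (U+FEFF) left in the string. If you would rather remove only that one character and leave any other unexpected leading text in place so the parser still reports it, a narrower variant might look like the sketch below. BomText and StripLeadingBom are hypothetical names used only for this example.

```csharp
// Hypothetical helper: removes a single leading U+FEFF and nothing else.
public static class BomText
{
    public static string StripLeadingBom(string xmlString)
    {
        if (!string.IsNullOrEmpty(xmlString) && xmlString[0] == '\uFEFF')
        {
            return xmlString.Substring(1);
        }
        return xmlString;
    }
}
```

The result can then be passed to XDocument.Parse as in the question.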
