C#: How to build XmlNodes from an XmlReader

Question

Asked by JohnIdol

I am parsing a large number of large files, and profiling shows that my bottleneck is:

XmlDocument doc = new XmlDocument();
doc.Load(filename);

This approach was very handy because I could extract nodes like this:

XmlNodeList nodeList = doc.SelectNodes("myXPath");

I am switching to XmlReader, but when I find the element I need to extract, I am stuck on how to build an XmlNode from it, as I am not too familiar with XmlReader:

XmlReader xmlReader = XmlReader.Create(fileName);

while (xmlReader.Read())
{
   //keep reading until we see my element
   if (xmlReader.Name.Equals("myElementName") && (xmlReader.NodeType == XmlNodeType.Element))
   {
       // How do I get the Xml element from the reader here?
   }
}

I'd like to be able to build a List<XmlNode> object. I am on .NET 2.0.

Any help appreciated!

Answer 1

Accepted answer by Fredrik Mörk

The XmlNode type does not have a public constructor, so you cannot create instances on your own. You will need an XmlDocument that you can use to create them:

List<XmlNode> nodeList = new List<XmlNode>();
XmlDocument doc = new XmlDocument();
while (xmlReader.Read())
{
    // keep reading until we see my element
    if (xmlReader.Name.Equals("myElementName") && (xmlReader.NodeType == XmlNodeType.Element))
    {
        // Create the node through the document. Note that this yields an empty
        // element with the same name; attributes and children are not copied.
        XmlNode myNode = doc.CreateNode(XmlNodeType.Element, xmlReader.Name, "");
        nodeList.Add(myNode);
    }
}

Answer 2

Answered by Abel

XmlReader and XmlDocument process XML in very different ways. XmlReader keeps nothing in memory and uses a forward-only approach, whereas XmlDocument builds a full DOM tree in memory. This helps when performance is an issue, but it also requires you to write your application differently: instead of using XmlNode, you keep nothing and process "on the go", i.e., when an element you need passes by, you act on it. This is close to the SAX approach, but without the callback model.

The answer to "how to get the XmlElement" is: you'll have to build them from scratch based on the info from the reader. This, unfortunately, defeats the performance gain. It is often better to avoid DOM approaches altogether once you switch to XmlReader, except in a few distinct cases.
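
As a rough illustration of what "building from scratch" can look like, here is a minimal sketch that copies only the name, namespace, and attributes of the current element into a new XmlElement. The class and method names (ManualNodeBuilder, CreateShallowElement) and the sample XML are made up for this example; child nodes are deliberately not copied.

```csharp
using System;
using System.IO;
using System.Xml;

public static class ManualNodeBuilder
{
    // Hypothetical helper: builds a shallow XmlElement (name, namespace and
    // attributes, but no children) from the element the reader is positioned on.
    public static XmlElement CreateShallowElement(XmlDocument doc, XmlReader reader)
    {
        if (reader.NodeType != XmlNodeType.Element)
            throw new InvalidOperationException("Reader is not positioned on an element.");

        XmlElement element = doc.CreateElement(reader.Prefix, reader.LocalName, reader.NamespaceURI);

        if (reader.MoveToFirstAttribute())
        {
            do
            {
                // While on an attribute, Prefix/LocalName/NamespaceURI/Value describe that attribute.
                XmlAttribute attr = doc.CreateAttribute(reader.Prefix, reader.LocalName, reader.NamespaceURI);
                attr.Value = reader.Value;
                element.Attributes.Append(attr);
            } while (reader.MoveToNextAttribute());

            // Step back from the attributes to the owning element.
            reader.MoveToElement();
        }

        return element;
    }

    public static void Main()
    {
        // Sample input, invented for this sketch.
        string xml = "<root><item id=\"1\" kind=\"a\">text</item></root>";
        XmlDocument doc = new XmlDocument();

        using (XmlReader reader = XmlReader.Create(new StringReader(xml)))
        {
            while (reader.Read())
            {
                if (reader.NodeType == XmlNodeType.Element && reader.LocalName == "item")
                {
                    XmlElement item = CreateShallowElement(doc, reader);
                    Console.WriteLine(item.OuterXml);
                }
            }
        }
    }
}
```

Copying child content this way would mean repeating the same work for every node type, which is the overhead this answer warns about.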

Also, the "very handy" way to extract nodes using XPath (SelectNodes, as you show above) cannot be used here: XPath requires a DOM tree. Think of the reader approach as a filtering approach: you can add filters to the XmlReader and tell it to skip certain nodes or read until a certain node. This is extremely fast, but it requires a different way of thinking.
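
A minimal sketch of that filtering style, using XmlReader.Skip to jump over unwanted subtrees and XmlReader.ReadToFollowing to jump ahead to a named element. The element names ("order", "log"), the "id" attribute, and the ReaderFiltering class are assumptions made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

public static class ReaderFiltering
{
    // Collects the "id" attribute of every <order> element while skipping
    // each <log> subtree entirely (names are hypothetical).
    public static List<string> ReadOrderIds(TextReader input)
    {
        List<string> ids = new List<string>();
        using (XmlReader reader = XmlReader.Create(input))
        {
            reader.MoveToContent();
            while (!reader.EOF)
            {
                if (reader.NodeType == XmlNodeType.Element && reader.LocalName == "log")
                {
                    // Skip() moves past the whole subtree and leaves the reader
                    // on the node that follows it, so no Read() is needed here.
                    reader.Skip();
                }
                else
                {
                    if (reader.NodeType == XmlNodeType.Element && reader.LocalName == "order")
                        ids.Add(reader.GetAttribute("id"));
                    reader.Read();
                }
            }
        }
        return ids;
    }

    // Jumps straight to the first <order> element in document order, if any.
    public static string FirstOrderId(TextReader input)
    {
        using (XmlReader reader = XmlReader.Create(input))
        {
            return reader.ReadToFollowing("order") ? reader.GetAttribute("id") : null;
        }
    }

    public static void Main()
    {
        string xml = "<data><log><order id=\"x\"/></log><order id=\"1\"/><order id=\"2\"/></data>";
        foreach (string id in ReadOrderIds(new StringReader(xml)))
            Console.WriteLine(id);
    }
}
```

Nothing is materialized as a DOM here; only the values that are actually needed leave the loop.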

Answer 3

Answered by Mehdi Golchin

Use XmlDocument.ReadNode for this. Put the XmlReader in a using statement, and use XmlReader.LocalName instead of Name to drop the namespace prefix.
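
Put together, that advice might look like the following sketch, which builds the List<XmlNode> the question asks for. The NodeExtractor class, the ExtractNodes method, and the sample XML are assumptions for illustration; the code avoids var and LINQ so that it stays within C# 2.0 syntax.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

public static class NodeExtractor
{
    // Hypothetical helper: returns every element whose local name matches,
    // each read as a complete XmlNode (attributes and children included).
    public static List<XmlNode> ExtractNodes(TextReader input, string localName)
    {
        List<XmlNode> nodes = new List<XmlNode>();
        XmlDocument doc = new XmlDocument(); // owner document for the created nodes

        using (XmlReader reader = XmlReader.Create(input))
        {
            reader.MoveToContent();
            while (!reader.EOF)
            {
                if (reader.NodeType == XmlNodeType.Element && reader.LocalName == localName)
                {
                    // ReadNode consumes the whole element and leaves the reader on
                    // the node after it, so Read() must not be called in this branch.
                    nodes.Add(doc.ReadNode(reader));
                }
                else
                {
                    reader.Read();
                }
            }
        }
        return nodes;
    }

    public static void Main()
    {
        // Sample input, invented for this sketch; the first element carries a prefix.
        string xml = "<r><p:myElementName xmlns:p=\"urn:x\" a=\"1\"><c/></p:myElementName><myElementName/></r>";
        foreach (XmlNode node in ExtractNodes(new StringReader(xml), "myElementName"))
            Console.WriteLine(node.OuterXml);
    }
}
```

Because the match is on LocalName, both the prefixed and the unprefixed element are picked up. The returned nodes are owned by doc but are not attached to its tree.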

Answer 4

Answered by Barney Light

I've used the following workaround when I've had to insert data from an XmlReader into an XmlDocument:

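// cmd is assumed to be an existing SqlCommand whose query returns XML
XmlReader rdr = cmd.ExecuteXmlReader();

XmlDocument doc = new XmlDocument();

// create a container node for our resultset
XmlElement root = doc.CreateElement("QueryRoot");
doc.AppendChild(root);

StringBuilder xmlBody = new StringBuilder();

// ReadOuterXml() already advances the reader past the element it returns,
// so calling Read() before every ReadOuterXml() would skip adjacent elements.
rdr.MoveToContent();
while (!rdr.EOF)
{
    if (rdr.NodeType == XmlNodeType.Element)
        xmlBody.Append(rdr.ReadOuterXml());
    else
        rdr.Read();
}

root.InnerXml = xmlBody.ToString();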

Answer 5

Answered by executor

Why not just do the following?

XmlDocument doc = new XmlDocument();
XmlNode node = doc.ReadNode(reader);

Answer 6

Answered by Björn Lindqvist

Here is my approach:

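// Requires: using System.Collections.Generic; using System.Linq; using System.Xml;
public static IEnumerable<XmlNode> StreamNodes(
    string path,
    string[] tagNames)
{
    var doc = new XmlDocument();
    using (XmlReader xr = XmlReader.Create(path))
    {
        xr.MoveToContent();
        while (true)
        {
            if (xr.NodeType == XmlNodeType.Element &&
                tagNames.Contains(xr.Name))
            {
                // ReadNode advances the reader past the element, so no Read() here.
                var node = doc.ReadNode(xr);
                yield return node;
            }
            else
            {
                if (!xr.Read())
                {
                    break;
                }
            }
        }
        // No explicit Close() is needed: the using block disposes the reader.
    }
}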
// Used like this:
foreach (var el in StreamNodes("orders.xml", new string[]{"order"})) 
{
    ....
}

The nodes can then be imported into another document for further processing.
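
A minimal sketch of that import step, assuming the nodes come from some other XmlDocument. The ImportExample class, the CollectInto method, and the element names are made up for illustration. XmlDocument.ImportNode is used because appending a node that is owned by a different document throws an exception.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

public static class ImportExample
{
    // Hypothetical helper: copies each node into a new document under a root element.
    public static XmlDocument CollectInto(string rootName, IEnumerable<XmlNode> nodes)
    {
        XmlDocument target = new XmlDocument();
        XmlElement root = target.CreateElement(rootName);
        target.AppendChild(root);

        foreach (XmlNode node in nodes)
        {
            // deep = true copies the whole subtree; the copy is owned by target,
            // and the original node is left untouched.
            XmlNode copy = target.ImportNode(node, true);
            root.AppendChild(copy);
        }
        return target;
    }

    public static void Main()
    {
        // Stand-in for nodes produced by a streaming reader.
        XmlDocument source = new XmlDocument();
        source.LoadXml("<orders><order id=\"1\"><line/></order><order id=\"2\"/></orders>");

        List<XmlNode> nodes = new List<XmlNode>();
        foreach (XmlNode child in source.DocumentElement.ChildNodes)
            nodes.Add(child);

        XmlDocument result = CollectInto("selected", nodes);

        // Once the nodes live in a regular document, XPath works again.
        Console.WriteLine(result.SelectNodes("/selected/order").Count);
    }
}
```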
