How to read a large XML file without loading it into memory, using XElement

https://stackoverflow.com/questions/2249875

20-09-2019

Question

I want to read a large XML file (100+ MB). Because of its size, I don't want to load it into memory using XElement. I am using LINQ to XML queries to parse and read it.

What is the best way to do this? Are there any examples that combine XPath or XmlReader with LINQ to XML / XElement?

Please help. Thanks.

Solution

Yes, you can combine XmlReader with the XNode.ReadFrom method. See the example in the documentation, which uses C# to selectively process nodes found by the XmlReader as XElement objects.

Other tips

The example code in the MSDN documentation for the XNode.ReadFrom method is as follows:

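using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml;
using System.Xml.Linq;

class Program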
{
    static IEnumerable<XElement> StreamRootChildDoc(string uri)
    {
        using (XmlReader reader = XmlReader.Create(uri))
        {
            reader.MoveToContent();
            // Parse the file and display each of the nodes.
            while (reader.Read())
            {
                switch (reader.NodeType)
                {
                    case XmlNodeType.Element:
                        if (reader.Name == "Child")
                        {
                            XElement el = XElement.ReadFrom(reader) as XElement;
                            if (el != null)
                                yield return el;
                        }
                        break;
                }
            }
        }
    }

    static void Main(string[] args)
    {
        IEnumerable<string> grandChildData =
            from el in StreamRootChildDoc("Source.xml")
            where (int)el.Attribute("Key") > 1
            select (string)el.Element("GrandChild");

        foreach (string str in grandChildData)
            Console.WriteLine(str);
    }
}

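However, I have found that the StreamRootChildDoc method in the example needs to be modified. XNode.ReadFrom leaves the reader positioned on the node that follows the element it just read, so the reader.Read() call in the loop condition then skips that node. When Child elements are directly adjacent, with no whitespace between them, every other one is missed. The modified version only calls Read() when the current node is not a matching element: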

    static IEnumerable<XElement> StreamRootChildDoc(string uri)
    {
        using (XmlReader reader = XmlReader.Create(uri))
        {
            reader.MoveToContent();
            // Walk the file and yield each Child element as an XElement.
            while (!reader.EOF)
            {
                if (reader.NodeType == XmlNodeType.Element && reader.Name == "Child")
                {
                    XElement el = XElement.ReadFrom(reader) as XElement;
                    if (el != null)
                        yield return el;
                }
                else
                {
                    reader.Read();
                }
            }
        }
    }
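
To make the difference between the two loops concrete, here is a minimal sketch that runs both over the same in-memory document. The class name StreamingDemo, the method names and the sample XML are assumptions made for illustration; they are not part of the MSDN example. The input deliberately has no whitespace between the Child elements, which is the case where the Read()-driven loop skips nodes.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Xml;
using System.Xml.Linq;

static class StreamingDemo
{
    // Loop shaped like the documentation example: Read() runs on every iteration,
    // including right after ReadFrom has already advanced the reader.
    public static IEnumerable<XElement> StreamWithRead(TextReader input)
    {
        using (XmlReader reader = XmlReader.Create(input))
        {
            reader.MoveToContent();
            while (reader.Read())
            {
                if (reader.NodeType == XmlNodeType.Element && reader.Name == "Child")
                {
                    if (XNode.ReadFrom(reader) is XElement el)
                        yield return el;
                }
            }
        }
    }

    // Modified loop: Read() runs only when the current node was not consumed by ReadFrom.
    public static IEnumerable<XElement> StreamWithEofCheck(TextReader input)
    {
        using (XmlReader reader = XmlReader.Create(input))
        {
            reader.MoveToContent();
            while (!reader.EOF)
            {
                if (reader.NodeType == XmlNodeType.Element && reader.Name == "Child")
                {
                    if (XNode.ReadFrom(reader) is XElement el)
                        yield return el;
                }
                else
                {
                    reader.Read();
                }
            }
        }
    }

    // Hypothetical sample input: four adjacent Child elements, no whitespace between them.
    public const string SampleXml =
        "<Root><Child Key=\"1\"/><Child Key=\"2\"/><Child Key=\"3\"/><Child Key=\"4\"/></Root>";

    public static void Compare()
    {
        int withRead = StreamWithRead(new StringReader(SampleXml)).Count();
        int withEofCheck = StreamWithEofCheck(new StringReader(SampleXml)).Count();
        Console.WriteLine($"Read() loop: {withRead} elements, EOF loop: {withEofCheck} elements");
    }
}
```

Calling StreamingDemo.Compare() on this input should report 2 elements for the Read()-driven loop and 4 for the EOF-driven loop. With an indented file, the whitespace node between elements absorbs the extra Read(), which is probably why the documentation sample appears to work on its own Source.xml.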

Just keep in mind that you will have to read the file sequentially, and that referring to siblings or descendants will be slow at best and impossible at worst. Otherwise, @MartinHonnn's answer is the way to go.
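
Because the stream only moves forward, one way to keep limited sibling context is to carry it along while iterating. The following is a minimal sketch, assuming that only the immediately preceding element is needed; SiblingContext and WithPrevious are hypothetical names, not part of LINQ to XML. It works on any IEnumerable<XElement>, so it can wrap the output of a streaming method such as StreamRootChildDoc.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

static class SiblingContext
{
    // Hypothetical helper: pairs each streamed element with the one yielded before it.
    // Only two elements are referenced at a time, so memory use stays bounded.
    public static IEnumerable<(XElement Previous, XElement Current)> WithPrevious(
        IEnumerable<XElement> source)
    {
        XElement previous = null; // null for the first element in the stream
        foreach (XElement current in source)
        {
            yield return (previous, current);
            previous = current;
        }
    }

    // Illustration with in-memory elements standing in for a streamed sequence.
    public static List<string> Demo()
    {
        var elements = new[]
        {
            new XElement("Child", new XAttribute("Key", 1)),
            new XElement("Child", new XAttribute("Key", 2)),
            new XElement("Child", new XAttribute("Key", 3)),
        };

        return WithPrevious(elements)
            .Select(p => p.Previous == null
                ? $"{(int)p.Current.Attribute("Key")}: first"
                : $"{(int)p.Current.Attribute("Key")}: after {(int)p.Previous.Attribute("Key")}")
            .ToList();
    }
}
```

Looking at following siblings, or at anything far back in the file, does not fit this pattern without buffering or re-reading, which is the cost the answer above warns about.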

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow