Get the XPath to an XElement?

https://stackoverflow.com/questions/451950

19-08-2019

Question

I have an XElement deep within a document. Given the XElement (and the XDocument?), is there an extension method to get its full (i.e. absolute) XPath, e.g. /root/item/element/child?

For example, myXElement.GetXPath()?

EDIT: OK, it looks like I overlooked something very important. Whoops! The index of the element needs to be taken into account. See my last answer for the proposed corrected solution.

Solution

The extension methods:

using System;
using System.Linq;
using System.Xml.Linq;

public static class XExtensions
{
    /// <summary>
    /// Get the absolute XPath to a given XElement
    /// (e.g. "/people/person[6]/name[1]/last[1]").
    /// </summary>
    public static string GetAbsoluteXPath(this XElement element)
    {
        if (element == null)
        {
            throw new ArgumentNullException("element");
        }

        Func<XElement, string> relativeXPath = e =>
        {
            int index = e.IndexPosition();
            string name = e.Name.LocalName;

            // If the element is the root, no index is required

            return (index == -1) ? "/" + name : string.Format
            (
                "/{0}[{1}]",
                name, 
                index.ToString()
            );
        };

        var ancestors = from e in element.Ancestors()
                        select relativeXPath(e);

        return string.Concat(ancestors.Reverse().ToArray()) + 
               relativeXPath(element);
    }

    /// <summary>
    /// Get the index of the given XElement relative to its
    /// siblings with identical names. If the given element is
    /// the root, -1 is returned.
    /// </summary>
    /// <param name="element">
    /// The element to get the index of.
    /// </param>
    public static int IndexPosition(this XElement element)
    {
        if (element == null)
        {
            throw new ArgumentNullException("element");
        }

        if (element.Parent == null)
        {
            return -1;
        }

        int i = 1; // Indexes for nodes start at 1, not 0

        foreach (var sibling in element.Parent.Elements(element.Name))
        {
            if (sibling == element)
            {
                return i;
            }

            i++;
        }

        throw new InvalidOperationException
            ("element has been removed from its parent.");
    }
}

And the test:

class Program
{
    static void Main(string[] args)
    {
        Program.Process(XDocument.Load(@"C:\test.xml").Root);
        Console.Read();
    }

    static void Process(XElement element)
    {
        if (!element.HasElements)
        {
            Console.WriteLine(element.GetAbsoluteXPath());
        }
        else
        {
            foreach (XElement child in element.Elements())
            {
                Process(child);
            }
        }
    }
}

And sample output:

/tests/test[1]/date[1]
/tests/test[1]/time[1]/start[1]
/tests/test[1]/time[1]/end[1]
/tests/test[1]/facility[1]/name[1]
/tests/test[1]/facility[1]/website[1]
/tests/test[1]/facility[1]/street[1]
/tests/test[1]/facility[1]/state[1]
/tests/test[1]/facility[1]/city[1]
/tests/test[1]/facility[1]/zip[1]
/tests/test[1]/facility[1]/phone[1]
/tests/test[1]/info[1]
/tests/test[2]/date[1]
/tests/test[2]/time[1]/start[1]
/tests/test[2]/time[1]/end[1]
/tests/test[2]/facility[1]/name[1]
/tests/test[2]/facility[1]/website[1]
/tests/test[2]/facility[1]/street[1]
/tests/test[2]/facility[1]/state[1]
/tests/test[2]/facility[1]/city[1]
/tests/test[2]/facility[1]/zip[1]
/tests/test[2]/facility[1]/phone[1]
/tests/test[2]/info[1]

That should settle it, no?

Other answers

I updated Chris's code to take namespace prefixes into account. Only the GetAbsoluteXPath method is modified.

public static class XExtensions
{
    /// <summary>
    /// Get the absolute XPath to a given XElement, including the namespace.
    /// (e.g. "/a:people/b:person[6]/c:name[1]/d:last[1]").
    /// </summary>
    public static string GetAbsoluteXPath(this XElement element)
    {
        if (element == null)
        {
            throw new ArgumentNullException("element");
        }

        Func<XElement, string> relativeXPath = e =>
        {
            int index = e.IndexPosition();

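            var currentNamespace = e.Name.Namespace;

            string name;
            // XName.Namespace is never null; an element without a
            // namespace reports XNamespace.None instead.
            if (currentNamespace == XNamespace.None)
            {
                name = e.Name.LocalName;
            }
            else
            {
                // The prefix is null when the namespace is only declared
                // as a default namespace, so fall back to a local-name() test.
                string namespacePrefix = e.GetPrefixOfNamespace(currentNamespace);
                name = string.IsNullOrEmpty(namespacePrefix)
                    ? "*[local-name()='" + e.Name.LocalName + "']"
                    : namespacePrefix + ":" + e.Name.LocalName;
            }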

            // If the element is the root, no index is required
            return (index == -1) ? "/" + name : string.Format
            (
                "/{0}[{1}]",
                name,
                index.ToString()
            );
        };

        var ancestors = from e in element.Ancestors()
                        select relativeXPath(e);

        return string.Concat(ancestors.Reverse().ToArray()) +
               relativeXPath(element);
    }

    /// <summary>
    /// Get the index of the given XElement relative to its
    /// siblings with identical names. If the given element is
    /// the root, -1 is returned.
    /// </summary>
    /// <param name="element">
    /// The element to get the index of.
    /// </param>
    public static int IndexPosition(this XElement element)
    {
        if (element == null)
        {
            throw new ArgumentNullException("element");
        }

        if (element.Parent == null)
        {
            return -1;
        }

        int i = 1; // Indexes for nodes start at 1, not 0

        foreach (var sibling in element.Parent.Elements(element.Name))
        {
            if (sibling == element)
            {
                return i;
            }

            i++;
        }

        throw new InvalidOperationException
            ("element has been removed from its parent.");
    }
}
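
A prefixed path like the one this version produces can only be evaluated if each prefix is bound to its namespace at query time. The following is a minimal sketch of that step, not part of the answer above: the class name, the namespace URI (urn:example:people) and the inline sample document are all made up for illustration.

```csharp
using System.Xml;
using System.Xml.Linq;
using System.Xml.XPath;

public static class PrefixedXPathSketch
{
    // Hypothetical sample document; the namespace URI is invented.
    private const string SampleXml =
        "<a:people xmlns:a='urn:example:people'>" +
        "<a:person>Ann</a:person><a:person>Bob</a:person>" +
        "</a:people>";

    public static string SelectSecondPerson()
    {
        XDocument doc = XDocument.Parse(SampleXml);

        // The XPath engine does not read prefix declarations from the
        // document, so every prefix used in the expression is registered here.
        var resolver = new XmlNamespaceManager(new NameTable());
        resolver.AddNamespace("a", "urn:example:people");

        XElement person = doc.XPathSelectElement("/a:people/a:person[2]", resolver);
        return person?.Value;
    }
}
```

The prefix registered in the resolver only has to agree with the one used in the XPath string; it does not have to be the prefix the document itself uses.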

Let me share my latest modification to this class. Basically, it omits the index if the element has no same-named siblings, and it handles namespaces with the local-name() function, in case you were having problems with the namespace prefix.

public static class XExtensions
{
    /// <summary>
    /// Get the absolute XPath to a given XElement, matching namespaced
    /// elements by local name (e.g. "/people/person[6]/name/last" or
    /// "/*[local-name()='people']/*[local-name()='person'][6]").
    /// </summary>
    public static string GetAbsoluteXPath(this XElement element)
    {
        if (element == null)
        {
            throw new ArgumentNullException("element");
        }


        Func<XElement, string> relativeXPath = e =>
        {
            int index = e.IndexPosition();

            var currentNamespace = e.Name.Namespace;

            string name;
            if (String.IsNullOrEmpty(currentNamespace.ToString()))
            {
                name = e.Name.LocalName;
            }
            else
            {
                name = "*[local-name()='" + e.Name.LocalName + "']";
                //string namespacePrefix = e.GetPrefixOfNamespace(currentNamespace);
                //name = namespacePrefix + ":" + e.Name.LocalName;
            }

            // If the element is the root or has no sibling elements, no index is required
            return ((index == -1) || (index == -2)) ? "/" + name : string.Format
            (
                "/{0}[{1}]",
                name,
                index.ToString()
            );
        };

        var ancestors = from e in element.Ancestors()
                        select relativeXPath(e);

        return string.Concat(ancestors.Reverse().ToArray()) +
               relativeXPath(element);
    }

    /// <summary>
    /// Get the index of the given XElement relative to its
    /// siblings with identical names. If the given element is
    /// the root, -1 is returned or -2 if element has no sibling elements.
    /// </summary>
    /// <param name="element">
    /// The element to get the index of.
    /// </param>
    public static int IndexPosition(this XElement element)
    {
        if (element == null)
        {
            throw new ArgumentNullException("element");
        }

        if (element.Parent == null)
        {
            // Element is root
            return -1;
        }

        if (element.Parent.Elements(element.Name).Count() == 1)
        {
            // Element has no sibling elements
            return -2;
        }

        int i = 1; // Indexes for nodes start at 1, not 0

        foreach (var sibling in element.Parent.Elements(element.Name))
        {
            if (sibling == element)
            {
                return i;
            }

            i++;
        }

        throw new InvalidOperationException
            ("element has been removed from its parent.");
    }
}
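
To illustrate why the local-name() form sidesteps the prefix problem, here is a small sketch under assumed inputs: the class name, the namespace URI and the sample document are hypothetical. It evaluates a path of the shape this version generates against a document that uses a default namespace, and also shows that the plain-name path matches nothing in that document.

```csharp
using System.Xml.Linq;
using System.Xml.XPath;

public static class LocalNameXPathSketch
{
    // Hypothetical sample document with a default (unprefixed) namespace.
    private const string SampleXml =
        "<people xmlns='urn:example:people'>" +
        "<person>Ann</person><person>Bob</person>" +
        "</people>";

    public static string SelectSecondPerson()
    {
        XDocument doc = XDocument.Parse(SampleXml);

        // local-name() ignores the namespace, so no resolver is needed.
        const string xpath = "/*[local-name()='people']/*[local-name()='person'][2]";
        XElement person = doc.XPathSelectElement(xpath);
        return person?.Value;
    }

    public static bool PlainNamesMatch()
    {
        XDocument doc = XDocument.Parse(SampleXml);

        // Unprefixed names in XPath 1.0 only match elements in no namespace.
        return doc.XPathSelectElement("/people/person[2]") != null;
    }
}
```

The trade-off is that two elements with the same local name in different namespaces become indistinguishable to such a path.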

This is actually a duplicate of this question. Although it isn't marked as the answer, the method in my answer to that question is the only way to unambiguously formulate an XPath to a node within an XML document that will always work under all circumstances. (It also works for all node types, not just elements.)

As you can see, the XPath it produces is ugly and abstract, but it addresses the concerns that many answerers have raised here. Most of the suggestions made here produce an XPath that, when used to search the original document, will produce a set of one or more nodes that includes the target node. It's that "or more" that's the problem. For example, if I have an XML representation of a DataSet, the naive XPath to a specific DataRow's element, /DataSet1/DataTable1, also returns the elements of all the other DataRows in the DataTable. You can't disambiguate that without knowing something about how the XML is formed (e.g., is there a primary-key element?).

But with /node()[1]/node()[4]/node()[11], only one node will ever be returned, no matter what.
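
The linked answer's code is not reproduced here, so the following is only a sketch of how such a purely positional path might be generated with LINQ to XML. The class and method names are hypothetical. It assumes that the LINQ to XML node order lines up with the XPath data model, which may not hold when the tree contains adjacent text nodes or a DOCTYPE.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class PositionalXPathSketch
{
    /// <summary>
    /// Builds a path such as "/node()[1]/node()[2]/node()[1]" by counting
    /// the position of each node among all of its preceding siblings.
    /// </summary>
    public static string GetPositionalXPath(this XNode node)
    {
        if (node == null)
        {
            throw new ArgumentNullException(nameof(node));
        }

        var steps = new List<string>();

        // XNode.Parent is null for the root element, which ends the walk.
        for (XNode current = node; current != null; current = current.Parent)
        {
            // XPath positions start at 1, not 0.
            int position = current.NodesBeforeSelf().Count() + 1;
            steps.Add("/node()[" + position + "]");
        }

        steps.Reverse(); // Steps were collected leaf-to-root.
        return string.Concat(steps);
    }
}
```

Because every step counts all sibling nodes rather than same-named elements, the path never depends on element names or namespaces, which is what makes it unambiguous and also what makes it hard to read.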

As part of a different project I developed an extension method to generate a simple XPath to an element. It's similar to the selected answer, but supports XAttribute, XText, XCData and XComment in addition to XElement. It's available as a NuGet package; project page here: xmlspecificationcompare.codeplex.com

If you're looking for something natively provided by .NET, the answer is no. You would have to write your own extension method to do this.

There may be several XPaths that lead to the same element, so finding the simplest XPath that leads to the node is not trivial.

That said, it's quite easy to find an XPath to the node. Just step up the node tree until you reach the root node, combining the node names as you go, and you have a valid XPath.
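
The walk described above can be sketched in a few lines. This is an illustrative version with hypothetical names, and it deliberately ignores namespaces and sibling positions, so the resulting path may match more than one element.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class SimplePathSketch
{
    /// <summary>
    /// Joins the local names of the element and its ancestors into a
    /// name-only path such as "/root/item/child".
    /// </summary>
    public static string GetSimplePath(XElement element)
    {
        if (element == null)
        {
            throw new ArgumentNullException(nameof(element));
        }

        // AncestorsAndSelf() yields the element first and the root last,
        // so the sequence is reversed to read root-to-leaf.
        var names = element.AncestorsAndSelf()
                           .Reverse()
                           .Select(e => e.Name.LocalName);

        return "/" + string.Join("/", names);
    }
}
```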

By "full xpath" I assume you mean a simple chain of tags, since the number of XPaths that could potentially match any element could be very large.

The problem here is that it's very hard, if not specifically impossible, to build any given XPath that will reversibly trace back to the same element. Is that a condition?

If "no", then perhaps you could build a query by recursively looping with reference to the current element's parentNode. If "yes", then you're going to be looking at extending that by cross-referencing the index position within sibling sets and referencing ID-like attributes if they exist, and whether a general solution is possible will depend heavily on your XSD.
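
As a rough sketch of the "yes" case, the walk up the tree can stop at the first ancestor that carries an ID-like attribute and use sibling positions for everything below it. The names here are hypothetical, the attribute name "id" is only a default, and the code assumes (without checking) that the attribute value is unique in the document. Namespaces are ignored.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class IdAwarePathSketch
{
    public static string GetXPathPreferringId(XElement element, string idAttributeName = "id")
    {
        if (element == null)
        {
            throw new ArgumentNullException(nameof(element));
        }

        var steps = new List<string>();

        foreach (XElement e in element.AncestorsAndSelf())
        {
            XAttribute id = e.Attribute(idAttributeName);

            // Values containing a single quote are skipped to keep the
            // generated string literal valid.
            if (id != null && !id.Value.Contains("'"))
            {
                // Assumed unique in the document, so the walk can stop here.
                steps.Add("//" + e.Name.LocalName + "[@" + idAttributeName + "='" + id.Value + "']");
                break;
            }

            // Position among preceding siblings with the same name, 1-based.
            int position = e.ElementsBeforeSelf(e.Name).Count() + 1;
            steps.Add("/" + e.Name.LocalName + "[" + position + "]");
        }

        steps.Reverse(); // Steps were collected leaf-to-root.
        return string.Concat(steps);
    }
}
```

Whether the uniqueness assumption holds is exactly the kind of thing the schema would have to guarantee.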

Microsoft has provided extension methods to do this since .NET Framework 3.5:

http://msdn.microsoft.com/en-us/library/bb156083(v=vs.100).aspx

Just add a using directive for System.Xml.XPath and invoke the following methods:

XPathSelectElement: selects a single element
XPathSelectElements: selects elements and returns them as an IEnumerable<XElement>
XPathEvaluate: selects nodes (not only elements, but also text, comments, etc.) and returns the result as an object (an IEnumerable<object> for node sets)
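
Note that these methods go from an XPath string to nodes, which is the reverse of what the question asks, but they are handy for checking that a generated path resolves back to the intended element. A minimal sketch with a made-up sample document and hypothetical class and method names:

```csharp
using System.Linq;
using System.Xml.Linq;
using System.Xml.XPath;

public static class XPathQuerySketch
{
    // Hypothetical sample document shaped like the sample output above.
    private const string SampleXml =
        "<tests>" +
        "<test><info>first</info></test>" +
        "<test><info>second</info></test>" +
        "</tests>";

    public static string SelectSecondInfo()
    {
        XDocument doc = XDocument.Parse(SampleXml);

        // Returns the first matching element, or null if nothing matches.
        XElement info = doc.XPathSelectElement("/tests/test[2]/info[1]");
        return info?.Value;
    }

    public static int CountTests()
    {
        XDocument doc = XDocument.Parse(SampleXml);
        return doc.XPathSelectElements("/tests/test").Count();
    }

    public static double CountTestsViaEvaluate()
    {
        XDocument doc = XDocument.Parse(SampleXml);

        // XPathEvaluate returns object; a numeric expression yields a double.
        return (double)doc.XPathEvaluate("count(/tests/test)");
    }
}
```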

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow