Question

This query:

SELECT * 
FROM html 
WHERE url='http://wwww.example.com' 
AND xpath='//tr[@height="20"]'

returns XML:

<results>
    <tr height="20">
        <td height="20" width="425">
            <p>Institution 0</p>
        </td>
        <td width="134">
            <p>Minneapolis</p>
        </td>
        <td width="64">
            <p>MN</p>
        </td>
    </tr>
    ...
</results>

Questions:

  • Is there a way to use XPath to create individual columns?
  • Is there a way to create column aliases?

Example (invalid syntax):

SELECT td[position()=1]/p/. AS name, td[position()=2]/p/. AS city, td[position()=3]/p/. AS region
FROM   ...

Goal:

<results>
    <tr height="20">
      <name>Institution 0</name>
      <city>Minneapolis</city>
      <region>MN</region>
    </tr>
    ...
</results>

Solution

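Not with XPath alone, as you are trying to do. However, YQL can apply an XSL transformation to an XML/HTML document, and the stylesheet can both pick out the individual cells and give them the element names you want. Here's an example: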

XSLT

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
    <xsl:template match="/">
        <rows>
          <xsl:apply-templates select="descendant::tr" />
        </rows>
    </xsl:template>
    <xsl:template match="tr">
        <row>
            <name>
                <xsl:value-of select="td[1]/p" />
            </name>
            <city>
                <xsl:value-of select="td[2]/p" />
            </city>
            <region>
                <xsl:value-of select="td[3]/p" />
            </region>
        </row>
    </xsl:template>
</xsl:stylesheet>

HTML

<html>
    <body>
        <table>
            <tr height="20">
                <td height="20" width="425">
                    <p>Institution 0</p>
                </td>
                <td width="134">
                    <p>Minneapolis</p>
                </td>
                <td width="64">
                    <p>MN</p>
                </td>
            </tr>
            <tr height="20">
                <td height="20" width="425">
                    <p>Institution 1111</p>
                </td>
                <td width="134">
                    <p>Minneapolis 1111</p>
                </td>
                <td width="64">
                    <p>MN 11111</p>
                </td>
            </tr>
        </table>
    </body>
</html>

YQL query

select * from xslt where stylesheet="url/to.xsl" and url="url/to.html"

YQL Result

<results>
    <rows>
        <row>
            <name>Institution 0</name>
            <city>Minneapolis</city>
            <region>MN</region>
        </row>
        <row>
            <name>Institution 1111</name>
            <city>Minneapolis 1111</city>
            <region>MN 11111</region>
        </row>
    </rows>
</results>

» See an example running in the YQL console.
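
For anyone who wants to check the column mapping without YQL, here is a rough local sketch of the same idea in Python, using only the standard library. It is not the YQL mechanism itself, just an illustration of mapping `td[1]/p`, `td[2]/p` and `td[3]/p` to the aliases `name`, `city` and `region`. The names `HTML`, `COLUMNS` and `rows_from_table` are made up for this example, and it assumes the page is well-formed XML (as the sample above is); real-world HTML would probably need a proper HTML parser instead.

```python
import xml.etree.ElementTree as ET

# Sample input copied from the HTML above (assumed to be well-formed XML).
HTML = """<html>
    <body>
        <table>
            <tr height="20">
                <td height="20" width="425"><p>Institution 0</p></td>
                <td width="134"><p>Minneapolis</p></td>
                <td width="64"><p>MN</p></td>
            </tr>
            <tr height="20">
                <td height="20" width="425"><p>Institution 1111</p></td>
                <td width="134"><p>Minneapolis 1111</p></td>
                <td width="64"><p>MN 11111</p></td>
            </tr>
        </table>
    </body>
</html>"""

# Hypothetical alias -> path mapping, mirroring the xsl:value-of selects.
COLUMNS = {"name": "td[1]/p", "city": "td[2]/p", "region": "td[3]/p"}


def rows_from_table(markup):
    """Build a <rows> element with one aliased <row> per matching <tr>."""
    root = ET.fromstring(markup)
    rows = ET.Element("rows")
    # Same row filter as the XPath in the question: //tr[@height="20"]
    for tr in root.iterfind(".//tr[@height='20']"):
        row = ET.SubElement(rows, "row")
        for alias, path in COLUMNS.items():
            cell = tr.find(path)
            text = ""
            if cell is not None and cell.text:
                text = cell.text.strip()
            ET.SubElement(row, alias).text = text
    return rows


if __name__ == "__main__":
    print(ET.tostring(rows_from_table(HTML), encoding="unicode"))
```

The dictionary plays the role of the column aliases: adding a fourth column would just mean adding another entry, much like adding another element to the `row` template in the stylesheet.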

Licensed under: CC-BY-SA with attribution