I'm trying to use Yahoo's content analysis, which seems really easy to use from here

But whenever I execute my code, I get the following output, as it is:

Italian sculptors the Virgin Mary painters http://en.wikipedia.com/wiki/Painting http://en.wikipedia.com/wiki/Adobe_Photoshop http://en.wikipedia.com/wiki/Still_life http://en.wikipedia.com/wiki/Avant-garde http://en.wikipedia.com/wiki/In_the_Sky http://en.wikipedia.com/wiki/Potato 1

What I want is to see an XML document structured with the XML tags just like the way it appears when you click this link

Also, the page source in the browser (right click > View Source) for that output is:

<?xml version="1.0" encoding="UTF-8"?>
<query xmlns:yahoo="http://www.yahooapis.com/v1/base.rng" yahoo:count="1" yahoo:created="2012-11-24T05:54:55Z" yahoo:lang="en-US"><results><entities xmlns="urn:yahoo:cap">
    <entity score="0.784327">
      <text end="16" endchar="16" start="0" startchar="0">Italian sculptors</text>
    </entity>
    <entity score="0.78097">
      <text end="72" endchar="72" start="58" startchar="58">the Virgin Mary</text>
    </entity>
    <entity score="0.509566">
      <text end="29" endchar="29" start="22" startchar="22">painters</text>
      <wiki_url>http://en.wikipedia.com/wiki/Painting</wiki_url>
      <related_entities>
        <wikipedia>
          <wiki_url>http://en.wikipedia.com/wiki/Adobe_Photoshop</wiki_url>
          <wiki_url>http://en.wikipedia.com/wiki/Still_life</wiki_url>
          <wiki_url>http://en.wikipedia.com/wiki/Avant-garde</wiki_url>
          <wiki_url>http://en.wikipedia.com/wiki/In_the_Sky</wiki_url>
          <wiki_url>http://en.wikipedia.com/wiki/Potato</wiki_url>
        </wikipedia>
      </related_entities>
    </entity>
  </entities></results></query><!-- total: 191 -->
<!-- engine6.yql.ac4.yahoo.com -->
1

Here is my code:

<?php
$c = curl_init();
curl_setopt($c, CURLOPT_URL, 'http://query.yahooapis.com/v1/public/yql');
curl_setopt($c, CURLOPT_POST, true);
curl_setopt($c, CURLOPT_POSTFIELDS, "q=select * from contentanalysis.analyze where text='Italian sculptors and painters of the renaissance favored the Virgin Mary for inspiration';");
curl_setopt($c,CURLOPT_HEADER,0);
$op=curl_exec ($c);
curl_close ($c); 
echo $op;
?>

Solution

That is how a browser displays XML when the response is sent with the header Content-type: text/html: the tags are treated as unknown HTML elements, so only their text content is shown. The demo you link to applies its own formatting to make the XML look like that. Set the header to text/xml with header('Content-type: text/xml'); before any output is sent, and the browser should then display the XML formatted.

header('Content-type: text/xml');
echo $op;

You can also escape the markup and output it as preformatted text, so the tags are shown literally:

echo '<pre>';
echo htmlentities($op);
echo '</pre>';
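
If the XML arrives on a single line, the <pre> approach will show it unindented. A minimal sketch of one way to re-indent it first, assuming the DOM extension is available; the $op value here is a made-up stand-in for the real response body, not actual service output:

```php
<?php
// Stand-in for the response body; in the real script $op would come from curl_exec().
$op = '<?xml version="1.0" encoding="UTF-8"?>'
    . '<entities><entity score="0.78"><text>the Virgin Mary</text></entity></entities>';

$doc = new DOMDocument();
$doc->preserveWhiteSpace = false; // must be set before loading
$doc->formatOutput = true;        // ask saveXML() to indent nested elements
$doc->loadXML($op);

$pretty = $doc->saveXML();

// Escape the markup so the browser shows the tags instead of interpreting them.
echo '<pre>' . htmlspecialchars($pretty) . '</pre>';
```

This only changes how the document is displayed; the element content is unchanged.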

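The above explains why XML shows up unformatted in the browser and demonstrates how to fix that. The OP's main problem is that the XML is malformed because of the stray 1 at the end of the output. Without CURLOPT_RETURNTRANSFER, curl_exec() prints the response directly and returns true on success, so echo $op; then prints that true as 1. The following deals with that by capturing the response as a string and keeping only the part that starts at the XML declaration: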

$r = 'http://query.yahooapis.com/v1/public/yql';
$p = "q=select * from contentanalysis.analyze where text='Italian sculptors and painters of the renaissance favored the Virgin Mary for inspiration'";

$c = curl_init($r);
curl_setopt($c, CURLOPT_POST, true);
curl_setopt($c, CURLOPT_POSTFIELDS, $p);
// Include the response headers in the returned string.
curl_setopt($c, CURLOPT_HEADER, true);
// Return the response as a string instead of printing it.
curl_setopt($c, CURLOPT_RETURNTRANSFER, true);
$op = curl_exec($c);
curl_close($c);

// Keep everything from the XML declaration onward, dropping the headers.
// strstr() returns false when '<?xml' is not found.
if (!($xml = strstr($op, '<?xml'))) {
    $xml = null;
}

header('Content-type: text/xml');
echo $xml;
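
The header-stripping step above can be pulled out into a small function, which makes it easy to check without calling the service. This is only a sketch: extract_xml_body is a hypothetical helper name (not part of cURL or the answer above), the sample response is made up, and the type hints assume PHP 7.1 or later.

```php
<?php
// Hypothetical helper: returns the part of a raw HTTP response that starts
// at the XML declaration, or null if no declaration is found.
function extract_xml_body(string $raw): ?string
{
    $pos = strpos($raw, '<?xml');
    return $pos === false ? null : substr($raw, $pos);
}

// Made-up sample response (headers followed by an XML body), for illustration only.
$raw = "HTTP/1.1 200 OK\r\nContent-Type: text/xml\r\n\r\n"
     . '<?xml version="1.0" encoding="UTF-8"?><query><results/></query>';

$xml = extract_xml_body($raw);

// True when the headers were removed and the result begins with the declaration.
var_dump($xml !== null && strpos($xml, '<?xml') === 0);
```

Returning null for a response without an XML declaration lets the caller decide whether to send an error instead of an empty text/xml document.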

Other tips

If you see that result in a browser, just use View Source. That shows everything, including the tags, because the browser renders only the content, not the tags.

You haven't used the header() function to specify the Content-Type HTTP header. Consequently, PHP sends its default Content-Type of text/html, and the browser treats the XML markup as invalid HTML.

Output the correct content-type for your data.

header("Content-Type: application/xml");
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow