Question

A client wants us to deliver content via an RSS feed. They use cURL to fetch the feed contents, but they say they get a 404 error instead. I tried this command in the terminal: $ curl -g --compressed http://mediosymedia.com/wp-content/plugins/nextgen-gallery/xml/media-rss.php > temp.xml and, as the client says, I get the 404 page instead of the feed. When I type the URI into the browser, it shows the feed without any problem.

I cannot change anything in the client app, so how can I ensure that they get the feed instead of the 404 error?

Thanks!

Solution

Indeed, curl gets a 404 response for that URL:

$ curl -g --compressed http://mediosymedia.com/wp-content/plugins/nextgen-gallery/xml/media-rss.php -s -o /dev/null -D-
HTTP/1.1 404 Not Found
Date: Tue, 04 Mar 2014 08:12:27 GMT
Server: Apache
X-Pingback: http://mediosymedia.com/xmlrpc.php
Expires: Wed, 11 Jan 1984 05:00:00 GMT
Cache-Control: no-cache, must-revalidate, max-age=0
Pragma: no-cache
Transfer-Encoding: chunked
Content-Type: text/html; charset=UTF-8 

Many web servers are suspicious of requests that lack a browser User-Agent, because they assume curl is being used for scraping. Blocking on that basis is not very effective, since simply spoofing the User-Agent gets around it:

$ curl -g --compressed http://mediosymedia.com/wp-content/plugins/nextgen-gallery/xml/media-rss.php -s -o /dev/null -D- -H'User-Agent:  Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:27.0) Gecko/20100101 Firefox/27.0'
HTTP/1.1 200 OK
Date: Tue, 04 Mar 2014 08:13:46 GMT
Server: Apache
Expires: Wed, 11 Jan 1984 05:00:00 GMT
Cache-Control: no-cache, must-revalidate, max-age=0
Pragma: no-cache
Transfer-Encoding: chunked
Content-Type: text/xml;charset=utf-8

So, in practice, make sure your requests send a User-Agent other than curl's default.
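
As a minimal sketch of that advice, the helper below always sends an explicit User-Agent and asks curl to exit with an error on HTTP failures rather than saving the error page. The function name `fetch_feed` and the variable `FEED_UA` are illustrative assumptions, not part of the original setup; the agent string is simply the one used in the request above.

```shell
# Hypothetical helper: fetch a URL with an explicit User-Agent.
# FEED_UA is an illustrative value; any non-default agent string could be used.
FEED_UA='Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:27.0) Gecko/20100101 Firefox/27.0'

fetch_feed() {
    # $1 = URL to fetch, $2 = output file
    # -f   exit non-zero on HTTP errors (status >= 400) instead of saving the error page
    # -sS  hide the progress meter but still print error messages
    # -A   set the value of the User-Agent header
    curl -g --compressed -fsS -A "$FEED_UA" -o "$2" "$1"
}

# Usage (not run here, since it needs network access):
#   fetch_feed 'http://mediosymedia.com/wp-content/plugins/nextgen-gallery/xml/media-rss.php' temp.xml \
#       || echo "fetch failed" >&2
```

With `-f`, a 404 makes the command fail visibly instead of silently writing the 404 page into temp.xml.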

OTHER TIPS

My initial thought was that this might be related to cookies (see this question), but it may be a localized issue. This works fine from my machine:

[root@devtest tmp]# curl -g --compressed http://mediosymedia.com/wp-content/plugins/nextgen-gallery/xml/media-rss.php > temp.xml
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 27926    0 27926    0     0  54564      0 --:--:-- --:--:-- --:--:-- 69815

CORRECTION:

Thanks to Julien for pointing out that the downloaded file actually contained the custom 404 page. As he mentions, you need to add a user-agent flag (-A) to your curl requests:

# curl -A "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2.12) Gecko/20101026 Firefox/3.6.12" -g --compressed http://mediosymedia.com/wp-content/plugins/nextgen-gallery/xml/media-rss.php > temp.xml

I would just delete my answer, but it's worth leaving up as a warning to others who might be experiencing this issue - make sure you validate the response!
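
One way to validate the response, sketched under the assumption that a real feed has an `<rss` (or Atom `<feed`) element near the top of the file while an HTML error page does not. The function name `looks_like_feed` is hypothetical, and this is a rough heuristic rather than real XML validation.

```shell
# Hypothetical check: does the saved file look like a feed rather than
# an HTML error page? Heuristic only; it does not parse the XML.
looks_like_feed() {
    # $1 = path to the downloaded file
    # Inspect only the first 512 bytes for an <rss ...> or <feed ...> root element.
    head -c 512 "$1" | grep -Eq '<rss[ >]|<feed[ >]'
}

# Usage (assumes temp.xml was written by an earlier curl call):
#   looks_like_feed temp.xml || echo "temp.xml does not look like a feed" >&2
```

A successful transfer in curl's progress output only says that bytes arrived, so a content check like this catches the case where those bytes are the 404 page.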

Licensed under: CC-BY-SA with attribution