Question

I am trying to get postal codes from this site:

http://pl.wikisource.org/wiki/Lista_kod%C3%B3w_pocztowych_w_Polsce

My code is simple:

    <?php
    $postalCode = $_GET['code'];

    $httpAddr = 'http://pl.wikisource.org/wiki/Lista_kod%C3%B3w_pocztowych_w_Polsce/Okr%C4%99g_'.$postalCode[0].'_'.$postalCode[0].$postalCode[1].'-xxx';

    file_get_contents($httpAddr);
    ?>

But when I set $postalCode to 03-000 (or 01-000 or 05-000; it works for 07-000, 61-000 and 62-000), I receive this error:

Warning: file_get_contents(http://pl.wikisource.org/wiki/Lista_kod%C3%B3w_pocztowych_w_Polsce/Okr%C4%99g_0_03-xxx): failed to open stream: HTTP request failed! HTTP/1.0 403 Forbidden in /var/www/clients/client1/web4/web/ofix/test.php on line 5 

The page address is correct: you can copy and paste it into your web browser and it works. Any ideas?


Solution

As Lightness Races in Orbit suspected, it does seem that the webserver is blocking PHP's request.

Using cURL instead of file_get_contents() reveals the details:

HTTP/1.0 403 Forbidden
Scripts should use an informative User-Agent string with contact information, or they may be IP-blocked without notice.
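For reference, a check along these lines can be sketched as follows. This is a minimal sketch, not the exact script used for the test above: the function name fetchWithCurl is hypothetical, and it assumes the PHP cURL extension is installed. It returns both the status code and the response body, so the explanatory text sent with a 403 is visible instead of being swallowed by a warning.

```php
<?php
// Hypothetical helper: fetch a URL with cURL and return the HTTP status
// code together with the response body, including the body of error responses.
function fetchWithCurl(string $url, ?string $userAgent = null): array
{
    $ch = curl_init($url);
    // Return the body as a string instead of printing it directly.
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    // Follow redirects, in case the server answers with one.
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    if ($userAgent !== null) {
        // Only send a User-Agent header when one is supplied.
        curl_setopt($ch, CURLOPT_USERAGENT, $userAgent);
    }

    $body = curl_exec($ch);
    if ($body === false) {
        // Transport-level failure (DNS, connection, timeout), not an HTTP error.
        throw new RuntimeException('cURL error: ' . curl_error($ch));
    }

    $status = curl_getinfo($ch, CURLINFO_HTTP_CODE);

    return ['status' => $status, 'body' => $body];
}
```

Calling fetchWithCurl($httpAddr) and printing the 'status' and 'body' entries shows what the server actually sent back. Passing a second argument sets the User-Agent header for comparison.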

A web browser sends a valid User-Agent header in its request, which is why the page loads OK in your browser but not in PHP.

In my tests loading this URL in PHP, the request sometimes succeeds with an HTTP status code of 200 and sometimes fails with 403. Notice that the error message says scripts may be blocked (i.e. sometimes they are not blocked).
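One way to act on that message while keeping file_get_contents() is to pass a stream context that sets a User-Agent header. The sketch below is only an illustration: the function names buildDistrictUrl, buildContext and fetchPage are hypothetical, the User-Agent string and contact details are placeholders to replace with your own, and there is no guarantee the server will accept any particular value.

```php
<?php
// Hypothetical helper: build the district page URL from a code such as "03-000",
// using the same URL pattern as the code in the question.
function buildDistrictUrl(string $postalCode): string
{
    if (!preg_match('/^\d{2}-\d{3}$/', $postalCode)) {
        throw new InvalidArgumentException('Expected a postal code like 03-000');
    }
    $base = 'http://pl.wikisource.org/wiki/Lista_kod%C3%B3w_pocztowych_w_Polsce';

    return $base . '/Okr%C4%99g_' . $postalCode[0] . '_' . substr($postalCode, 0, 2) . '-xxx';
}

// Create an HTTP stream context that sends the given User-Agent header.
function buildContext(string $userAgent)
{
    return stream_context_create([
        'http' => [
            'method'     => 'GET',
            'user_agent' => $userAgent,
        ],
    ]);
}

// Fetch a page with file_get_contents(), throwing instead of emitting a warning.
function fetchPage(string $url, string $userAgent): string
{
    $html = @file_get_contents($url, false, buildContext($userAgent));
    if ($html === false) {
        throw new RuntimeException("Request failed for $url");
    }

    return $html;
}

// Usage (placeholder User-Agent; replace the name and contact details):
// $html = fetchPage(
//     buildDistrictUrl('03-000'),
//     'PostalCodeLookup/1.0 (https://example.com/contact; admin@example.com)'
// );
```

The same effect can be had globally with ini_set('user_agent', ...), but a per-request context keeps the setting local to this call.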

Edit

See this question for more info: How to get results from the Wikipedia API with PHP?

Licensed under: CC-BY-SA with attribution