Domanda

As some of you may have heard, several subreddits are having a charity drive at the moment, notably r/atheism. In the interests of helping/encouraging fundraising, I've started writing a little web utility to provide real-time information about these donations (basically, mashing-up data from Reddit with data from FirstGiving) - you can see what I have so far here - it just shows the totals and average figures for each subreddit and it's very preliminary (also not pretty.)

A feature I'd like to add is something which FirstGiving doesn't seem to offer, the ability to search for or link to a specific donation. There were a lot of posts last week in which people tried to offer donation matching and similar, but there were also a lot of fake/troll posts, and no good way to verify whether someone was "delivering" (we all know screenshots are easily faked.) I plan to cache data from FirstGiving to allow someone to link to

Having examined the FirstGiving page, there seems to be an undocumented JSON API call (used when scrolling to the bottom of the page to display more donations) which will return a list of donation amounts, messages and nicknames as an HTML table. Here's what it looks like when I access it in my browser (Opera), according to Opera Dragonfly:

URL:    http://www.firstgiving.com/ProfileWebApi/Donations
Method: POST
Status: 200 OK
Duration:   1220 ms

Request details

POST /ProfileWebApi/Donations HTTP/1.1 
User-Agent: Opera/9.80 (Windows NT 6.1; U; Edition United Kingdom Local; en) Presto/2.10.229 Version/11.60
Host: www.firstgiving.com
Accept-Language: en-GB,en;q=0.9
Accept-Encoding: gzip, deflate
Referer: http://www.firstgiving.com/fundraiser/r-atheism/ratheism
Cookie: ASP.NET_SessionId=rmsl4b45jdxwykanpoqkb255
Connection: Keep-Alive
Content-Length: 111
Content-Type: application/json;
Accept: application/json, text/javascript, */*; q=0.01
X-Requested-With: XMLHttpRequest
Content-Transfer-Encoding: binary
Request body
{"EventGivingGroupId":1476950,"TotalRaised":"190776.020000","PageIsExpired":false,"PageNumber":4,"PageSize":50}
Response details
HTTP/1.1 200 OK 
Cache-Control: private
Content-Length: 62979
Content-Type: application/json; charset=utf-8
Server: Microsoft-IIS/7.5
X-AspNetMvc-Version: 2.0
X-AspNet-Version: 2.0.50727
X-Powered-By: ASP.NET
Date: Tue, 13 Dec 2011 19:13:28 GMT

Body

{"Data":"\u0009\u000d\u000a\u0009\u0009\u0009\u0009\u000d\u000a                         <table class=\"donationTable collapsed\" border=\"0\" cellspacing=\"0\" cellpadding=\"0\" style='height:0px; overflow:hidden;' >\u000d\u000a                            <thead class=\"visuallyhidden\">\u000d\u000a\u0009\u0009                        <tr>\u000d\u000a                                    <th scope=\"col\">Comment<\/th>\u000d\u000a                                    <th scope=\"col\" class=\"amount\">Donation<\/th>\u000d\u000a                                <\/tr>\u000d\u000a                            <\/thead>\u000d\u000a\u0009\u0009\u0009            \u000d\u000a                            <tr>                              \u000d\u000a                                  <td class=\"comment\">\u000d\u000a                                            \u000d\u000a                                                    <strong>Dear Regan Layman<\/strong>\u000d\u000a                                                Happy holidays :)<br \/>\u000d\u000a                                            \u000d\u000a                                                <time datetime=\"2011-12-10T21:55:35.0000000\">\u000d\u000a                                                    12\/10\/2011\u000d\u000a                                                <\/time>\u000d\u000a                                            \u000d\u000a                                   <\/td>\u000d\u000a                               \u000d\u000a                              <td class=\"amount\">\u000d\u000a                                $20.00<sup style=\"font-size:10px;\" title=\"Offline donation\"><\/sup> \u000d\u000a                                \u000d\u000a                              <\/td>\u000d\u000a                        <\/tr>\u000d\u000a\u0009                \u000d\u000a                            <tr>                              \u000d\u000a                                  <td class=\"comment\">\u000d\u000a                                            \u000d\u000a                                                    <strong>Frodo Baggins<\/strong>\u000d\u000a                                                Due to the fact that doctors heal people, not God!<br \/>\u000d\u000a                                            \u000d\u000a                                                <time datetime=\"2011-12-10T21:52:11.0000000\">\u000d\u000a                                                    12\/10\/2011\u000d\u000a                                                <\/time>\u000d\u000a                                            \u000d\u000a                                   <\/td>\u000d\u000a                               \u000d\u000a                              <td class=\"amount\">\u000d\u000a                                $4.64<sup style=\"font-size:10px;\" title=\"Offline donation\"><\/sup> \u000d\u000a                                \u000d\u000a                              <\/td>\u000d\u000a                        <\/tr>\u000d\u000a\u0009                \u000d\u000a                            

(snipped the rest of the response body. Also, there are usually more cookies, but I manually deleted everything except aspsession id, and it worked normally so they don't appear to be relevant to anything except analytics etc)

However, when I try to do the same thing from a perl script, I don't get this useful output. Here is my script:

#!/usr/bin/perl -w

use LWP::Simple;
use JSON;

use HTTP::Cookies;
use LWP::UserAgent;

use Data::Dumper;

my $cookie_jar = HTTP::Cookies->new;
my $ua = LWP::UserAgent->new(cookie_jar => $cookie_jar);
#push @{ $ua->requests_redirectable }, 'POST';
$ua->get('http://www.firstgiving.com/fundraiser/r-atheism/ratheism');

print Dumper $cookie_jar;

my $req = HTTP::Request->new(
    'POST',
    'http://www.firstgiving.com/ProfileWebApi/Donations');
$req->header('Accept-Encoding' => 'gzip, deflate');
$req->header('Referer' => 'http://www.firstgiving.com/fundraiser/r-atheism/ratheism');
$req->header('X-Requested-With' => 'XMLHttpRequest');
$req->header('Content-Transfer-Encoding' => 'binary');
$req->header('Content-type:' => 'application/json');
$req->header('User-Agent' => 'Opera/9.80 (Windows NT 6.1; U; Edition United Kingdom Local; en) Presto/2.10.229 Version/11.60');
$req->content('{"EventGivingGroupId":1476950,"TotalRaised":"190776.020000","PageIsExpired":true,"PageNumber":2,"PageSize":50}');
#$req->content('{"EventGivingGroupId":1476950,"PageNumber":1,"PageSize":50}');

my $post_request = $ua->request($req);
print Dumper( ($post_request) );

and here is the output:

$VAR1 = bless( {
                 'COOKIES' => {
                                'www.firstgiving.com' => {
                                                           '/' => {
                                                                    'ASP.NET_SessionId' => [
                                                                                             0,
                                                                                             'yynhqi2udtz4y055fakdvjiu',
                                                                                             undef,
                                                                                             1,
                                                                                             undef,
                                                                                             undef,
                                                                                             1,
                                                                                             {
                                                                                               'HttpOnly' => undef
                                                                                             }
                                                                                           ]
                                                                  }
                                                         }
                              }
               }, 'HTTP::Cookies' );
$VAR1 = bless( {
                 '_protocol' => 'HTTP/1.1',
                 '_content' => '<html><head><title>Object moved</title></head><body>
<h2>Object moved to <a href="%2ferror%2f404">here</a>.</h2>
</body></html>
',
                 '_rc' => '302',
                 '_headers' => bless( {
                                        'x-powered-by' => 'ASP.NET',
                                        'client-response-num' => 1,
                                        'location' => '/error/404',
                                        'cache-control' => 'private',
                                        'date' => 'Tue, 13 Dec 2011 19:43:56 GMT',
                                        'client-peer' => '204.12.127.197:80',
                                        'x-aspnet-version' => '2.0.50727',
                                        'client-date' => 'Tue, 13 Dec 2011 19:36:45 GMT',
                                        'x-aspnetmvc-version' => '2.0',
                                        'content-type' => 'text/html; charset=utf-8',
                                        'title' => 'Object moved',
                                        'client-transfer-encoding' => [
                                                                        'chunked'
                                                                      ],
                                        'server' => 'Microsoft-IIS/7.5'
                                      }, 'HTTP::Headers' ),
                 '_msg' => 'Found',
                 '_request' => bless( {
                                        '_content' => '{"EventGivingGroupId":1476950,"TotalRaised":"190776.020000","PageIsExpired":true,"PageNumber":2,"PageSize":50}',
                                        '_uri' => bless( do{\(my $o = 'http://www.firstgiving.com/ProfileWebApi/Donations')}, 'URI::http' ),
                                        '_headers' => bless( {
                                                               'cookie2' => '$Version="1"',
                                                               'user-agent' => 'Opera/9.80 (Windows NT 6.1; U; Edition United Kingdom Local; en) Presto/2.10.229 Version/11.60',
                                                               'cookie' => 'ASP.NET_SessionId=yynhqi2udtz4y055fakdvjiu',
                                                               'x-requested-with' => 'XMLHttpRequest',
                                                               'accept-encoding' => 'gzip, deflate',
                                                               'content-transfer-encoding' => 'binary',
                                                               'content-type:' => 'application/json',
                                                               'referer' => 'http://www.firstgiving.com/fundraiser/r-atheism/ratheism'
                                                             }, 'HTTP::Headers' ),
                                        '_method' => 'POST',
                                        '_uri_canonical' => $VAR1->{'_request'}{'_uri'}
                                      }, 'HTTP::Request' )
               }, 'HTTP::Response' );

If I enable the line push @{ $ua->requests_redirectable }, 'POST'; (i.e., allow redirection for POST) it redirects to a 404 error page

If this is some intentional attempt by FirstGiving to keep out non-human clients, I'll of course give up, but their robots.txt doesn't seem to prohibit what I'm doing.

È stato utile?

Soluzione

Add the Accept: application/json, text/javascript, */*; q=0.01 header. Not a header I'd normally expect to be critical, but in this case it seems to be.

I did a quick little test using curl. This worked:

curl -vv -H 'Content-Type: application/json' \
  -H 'Referer: http://www.firstgiving.com/fundraiser/r-atheism/ratheism' \
  -H 'Cookie: ASP.NET_SessionId=svqlde45h0cvrv55hqvhwv55;' \
  -H 'X-Requested-With: XMLHttpRequest' \
  -H 'Accept: application/json, text/javascript, */*; q=0.01' \
  -d '{"EventGivingGroupId":1476950,"TotalRaised":"191532.480000","PageIsExpired":false,"PageNumber":2,"PageSize":50}' \
  'http://www.firstgiving.com/ProfileWebApi/Donations'

This gave me the redirect:

curl -vv -H 'Content-Type: application/json' \
  -H 'Referer: http://www.firstgiving.com/fundraiser/r-atheism/ratheism' \
  -H 'Cookie: ASP.NET_SessionId=svqlde45h0cvrv55hqvhwv55;' \
  -H 'X-Requested-With: XMLHttpRequest' \
  -d '{"EventGivingGroupId":1476950,"TotalRaised":"191532.480000","PageIsExpired":false,"PageNumber":2,"PageSize":50}' \
  'http://www.firstgiving.com/ProfileWebApi/Donations'
Autorizzato sotto: CC-BY-SA insieme a attribuzione
Non affiliato a StackOverflow
scroll top