Question

I am currently using Tweetstream to store tweets in MongoDB.
I have set up a script that I'm running with Python 2.7:

extent = ["144.715, -38.03", "145.219, -37.541"]

with tweetstream.FilterStream(username, password, locations=extent) as stream:
    for tweet in stream:
        db.tweets.save(tweet)

This works and stores tweets in MongoDB, but it also stores tweets that have no geolocation at all, i.e. tweets whose coordinates property is blank.

As I understand it, the script should only save tweets that fall within the specified extent, but that's not what happens.

Can anyone suggest how to modify my script so that only geotagged tweets within my specified extent are saved to MongoDB?


Solution

Twitter supports two different levels of accuracy in geolocation to allow users to limit the information they share.

http://support.twitter.com/forums/26810/entries/78525

Why do I see an exact location for some Tweets, but only the general vicinity (neighborhood or city) for others?

The default display is place location (like neighborhood or town), but some third-party apps let you tweet with your exact location or address. If you select your exact location to be displayed through a third-party app, the actual coordinates can be publicly shared.

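Tweets returned by tweetstream.FilterStream can have either level of accuracy. Twitter's locations filter also matches a tweet when the bounding box of its place overlaps the requested area, so some tweets will only have place-level accuracy, in which case the 'coordinates' key will be None: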

 u'coordinates': None,
 u'place': {u'attributes': {},
            u'bounding_box': {u'coordinates': [[[-122.51368188,
                                                 37.70813196],
                                                [-122.35845384,
                                                 37.70813196],
                                                [-122.35845384,
                                                 37.83245301],
                                                [-122.51368188,
                                                 37.83245301]]],
                              u'type': u'Polygon'},
            u'country': u'United States',
            u'country_code': u'US',
            u'full_name': u'San Francisco, CA',
            u'id': u'5a110d312052166f',
            u'name': u'San Francisco',
            u'place_type': u'city',
            u'url': u'http://api.twitter.com/1/geo/id/5a110d312052166f.json'},

Other tweets will have an exact location, in which case the 'coordinates' key will be populated:

 u'coordinates': {u'type': 'Point', u'coordinates': [-122.51368188, 37.83245301]}
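
A minimal sketch of how the two cases could be told apart, assuming each tweet is a plain dict shaped like the examples above and that the extent is the south-west and north-east corners from the question, in (longitude, latitude) order. The helper name has_point_in_extent is hypothetical and not part of tweetstream:

```python
# Bounding box from the question: (lon, lat) of the SW and NE corners.
SW = (144.715, -38.03)
NE = (145.219, -37.541)


def has_point_in_extent(tweet, sw=SW, ne=NE):
    """Return True only for tweets with an exact point inside the box."""
    coords = tweet.get('coordinates')
    if not coords or coords.get('type') != 'Point':
        return False  # place-level only, or no location at all
    lon, lat = coords['coordinates']  # GeoJSON order: longitude first
    return sw[0] <= lon <= ne[0] and sw[1] <= lat <= ne[1]
```

In the loop from the question, the save call would then be guarded with something like "if has_point_in_extent(tweet): db.tweets.save(tweet)".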

You need to decide whether you're interested in the place-level accuracy tweets. If you are, you can either store their bounding box as a polygon or calculate a centroid. If you are not, skip any tweet whose 'coordinates' key is None before saving.
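
For the centroid option, one possible sketch is shown here. It assumes the 'place' dict has the 'bounding_box' layout shown earlier (a single polygon ring of [longitude, latitude] corners) and takes the midpoint of the box's extremes, which coincides with the centroid for a rectangular box. The function name bounding_box_center is hypothetical:

```python
def bounding_box_center(place):
    """Return the (lon, lat) midpoint of a place's bounding box."""
    ring = place['bounding_box']['coordinates'][0]  # outer ring of the polygon
    lons = [pt[0] for pt in ring]
    lats = [pt[1] for pt in ring]
    # Midpoint of the extremes; equals the centroid when the box is a rectangle.
    return ((min(lons) + max(lons)) / 2.0, (min(lats) + max(lats)) / 2.0)
```

The result could be stored alongside the tweet (for example under a separate key) so that exact points and place-derived approximations remain distinguishable later.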

Licensed under: CC-BY-SA with attribution