Filter images from tweets

https://stackoverflow.com/questions/23605337

20-07-2023

Question

I am new to tweepy, and I am wondering how to find and store the images that a user posts in their tweets. I found several tutorials that show how to get a user's tweets, but I couldn't find a way to filter only the images.

I am using the following code to get a user's tweets. How can I get only the user's images?

EDIT: I edited my code as shown below:

import requests
import tweepy

auth = tweepy.OAuthHandler(CONSUMER_KEY, CONSUMER_SECRET)
auth.set_access_token(OAUTH_TOKEN, OAUTH_SECRET)
api = tweepy.API(auth)
timeline = api.user_timeline(count=10, screen_name="zenitiss")
for tweet in timeline:
    for media in tweet.entities.get("media", [{}]):
        print(media)
        # checks if there is any media-entity
        if media.get("type", None) == "photo":
            # checks if the entity is of the type "photo"
            image_content = requests.get(media["media_url"])
            print(image_content)

However, the for loop does not seem to work: the print media line prints an empty object. When I print the URLs of a user instead, for example karyperry, I get:

{u'url': u'http://t.co/TaP2JZrpxu', u'indices': [42, 64], u'expanded_url':  
u'http://youtu.be/7bDLIV96LD4', u'display_url': u'youtu.be/7bDLIV96LD4'}
{u'url': u'https://t.co/t3hv7VQiPG', u'indices': [42, 65], u'expanded_url': 
u'https://vine.co/v/MgvxZA2qKbV', u'display_url': u'vine.co/v/MgvxZA2qKbV'}
{u'url': u'http://t.co/vnJAAU7KN6', u'indices': [50, 72], u'expanded_url':
u'http://instagram.com/p/n01XZjv-fp/', u'display_url': u'instagram.com/p/n01XZjv-fp/'}
{u'url': u'http://t.co/NycqAwtcgo', u'indices': [78, 100], u'expanded_url':
u'http://bit.ly/1o7xQRj', u'display_url': u'bit.ly/1o7xQRj'}
{u'url': u'http://t.co/BG6ozuRD6D', u'indices': [111, 133], u'expanded_url':
u'http://www.johnnywujek.com/sos', u'display_url': u'johnnywujek.com/sos'}
{u'url': u'http://t.co/nWIQ9ruJ3f', u'indices': [88, 110], u'expanded_url':
u'http://uncf.us/1kSXIwF', u'display_url': u'uncf.us/1kSXIwF'}
{u'url': u'http://t.co/yTbOgqt9fw', u'indices': [101, 123], u'expanded_url':
u'http://instagram.com/p/nvxD8eP-SZ/', u'display_url': u'instagram.com/p/nvxD8eP-SZ/'}

This is the output when I put 'url' instead of 'media' in the loop, i.e. for media in tweet.entities.get("url",[{}]). Most of these URLs point to images.

Solution

The JSON representation of a tweet contains a "media" entity when the tweet includes an image, as mentioned here. Tweepy exposes that entity as a list of dicts, so, assuming there is an image included in the tweet, the URL of the first one is:

tweet.entities["media"][0]["media_url"]
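To make the shape of the entity concrete, here is a minimal sketch that pulls photo URLs out of an entities dict. It assumes, as the loop in this answer does, that "media" is a list of dicts with "type" and "media_url" keys and that the key is absent when the tweet has no attached media. The function name photo_urls and the sample dict are made up for illustration; the sample is hand-written, not real API output.

```python
def photo_urls(entities):
    """Return the media_url of every photo entity in a tweet's entities dict."""
    urls = []
    # Default to an empty list so tweets without media are simply skipped.
    for media in entities.get("media", []):
        if media.get("type") == "photo":
            urls.append(media["media_url"])
    return urls


# Hypothetical, hand-written entities dict (not real API output).
sample_entities = {
    "urls": [{"url": "http://t.co/example", "expanded_url": "http://example.com/"}],
    "media": [
        {"type": "photo", "media_url": "http://pbs.twimg.com/media/example.jpg"},
    ],
}

print(photo_urls(sample_entities))
print(photo_urls({"urls": []}))  # a tweet that only contains links
```

Links to external sites (YouTube, Vine, Instagram and so on) show up under "urls" rather than "media", which would explain why a loop over "media" finds nothing for tweets that only link to such pages.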

Therefore, if you want to store the image, you just need to download it, for example with Python's requests library. Try adding something like the following to your code (or modify it according to your needs):

for media in tweet.entities.get("media", []):
    # the "media" list is only present when the tweet carries media
    if media.get("type") == "photo":
        # the entity is of the type "photo", so download it
        image_content = requests.get(media["media_url"])
        # save image_content.content to a file etc.
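The "save to file" step could look roughly like the sketch below. The helper names (filename_from_url, save_photo, fetch_with_requests) and the idea of passing the download function in as a parameter are assumptions made for illustration, not part of tweepy or requests; only requests.get, raise_for_status and .content are real requests calls.

```python
import os
from urllib.parse import urlparse


def filename_from_url(url):
    """Return the last path component of a URL, ignoring any query string."""
    return os.path.basename(urlparse(url).path)


def save_photo(url, directory, fetch):
    """Download one photo via fetch(url) -> bytes and write it into directory."""
    os.makedirs(directory, exist_ok=True)
    path = os.path.join(directory, filename_from_url(url))
    with open(path, "wb") as handle:
        handle.write(fetch(url))
    return path


def fetch_with_requests(url):
    """Fetch the raw bytes behind url with the requests library."""
    import requests  # imported here so the helpers above work without it

    response = requests.get(url, timeout=10)
    response.raise_for_status()  # fail loudly on HTTP errors
    return response.content
```

Inside the loop above, the download line would then become something like save_photo(media["media_url"], "images", fetch_with_requests), where "images" is an arbitrary output directory.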

Licensed under: CC-BY-SA with attribution
