Parsing Unicode-encoded tweets using JSON in Python

https://stackoverflow.com//questions/9647481

10-12-2019
Question

There are many posts about parsing Twitter JSON, but none that I have seen solves my problem.

This is the code:

import json

file = open('tweet', 'r')
tweet = file.read()
# Contents of the 'tweet' file in the two test runs: first with \u201c/\u201d escapes, then without.
#{"geo":null,"text":"Lmao!! what time? I dont finish evening cleaning till 5 RT \u201c@some_user: football anyone?.....i wanna have a kickabout :(\u201d"}
#{"geo":null,"text":"Lmao!! what time? I dont finish evening cleaning till 5 RT @some_user: football anyone?.....i wanna have a kickabout :("}
def parseStreamingTweet(tweet):
    try:
        singleTweetJson = json.loads(tweet)
        for index in singleTweetJson:
            if index == 'text':
                print "text : ", singleTweetJson[index]
    except ValueError:
        print "Error ", tweet
        print ValueError
        return

parseStreamingTweet(tweet)

This is a test program. Tweets arrive on the stream, and for verification purposes I saved a single tweet to a file and checked it. The sample shown is an edited portion of the Twitter feed.

Can anyone tell me how to parse a tweet that contains Unicode escapes? The first tweet in the comments contains Unicode escapes and the second does not. The first one produces an error, while with the Unicode escape sequences removed, parsing succeeds. What could the solution be?

Solution

I think your code works. The error is probably a UnicodeEncodeError, which is raised when you try to print the unicode value to the terminal. UnicodeEncodeError is a subclass of ValueError, so your except clause catches it and makes it look like a parsing failure. I'm guessing you are running the script in a terminal that is not Unicode-aware. If you instead printed the repr of the unicode value, or wrote it to an output file, it would probably work:

print "text : ", repr(singleTweetJson[index])
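The diagnosis above can be sketched as follows. This is a minimal illustration written for Python 3, which is an assumption, since the question's code is Python 2. The sample string is shortened from the tweet in the question. It suggests that json.loads decodes the \u201c and \u201d escapes without trouble, and that the failure only appears when the text is encoded to ASCII, which is roughly what a non-Unicode terminal forces print to do.

```python
import json

# Shortened sample based on the tweet in the question (raw string keeps the \u escapes literal).
raw = r'{"geo":null,"text":"RT \u201c@some_user: football anyone?\u201d"}'

tweet = json.loads(raw)   # parsing succeeds, escapes become real curly quotes
text = tweet["text"]

# Encoding to ASCII mimics printing to a terminal that cannot represent curly quotes.
try:
    text.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

# Two ways to display the value regardless of the terminal encoding.
escaped = ascii(text)  # Python 3 counterpart of Python 2's repr() on a unicode value
replaced = text.encode("ascii", "backslashreplace").decode("ascii")

print("ASCII-encodable:", ascii_ok)
print("text :", escaped)
print("text :", replaced)
```

In Python 3, ascii() plays the role that repr() plays in the Python 2 line above: it escapes non-ASCII characters instead of trying to encode them for the terminal.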

Also, it's generally bad practice to hide specific exceptions and error messages behind generic catch-all handlers.
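One way to keep the two failure modes apart is sketched below, again assuming Python 3 (json.JSONDecodeError exists from Python 3.5 onward). The function names parse_streaming_tweet and show are hypothetical, chosen for this illustration only. Parsing errors and output-encoding errors are caught separately, so neither is mistaken for the other.

```python
import json


def parse_streaming_tweet(raw):
    """Return the tweet's text, or None if the payload is not valid JSON."""
    try:
        tweet = json.loads(raw)
    except json.JSONDecodeError as exc:  # only genuine parsing problems land here
        print("Invalid JSON:", exc)
        return None
    return tweet.get("text")


def show(text):
    """Print text, falling back to ASCII escapes if the terminal cannot encode it."""
    try:
        print("text :", text)
    except UnicodeEncodeError:  # an output problem, reported as such
        print("text :", ascii(text))


if __name__ == "__main__":
    sample = r'{"geo":null,"text":"\u201cfootball anyone?\u201d"}'
    parsed = parse_streaming_tweet(sample)
    if parsed is not None:
        show(parsed)
```

With this split, a malformed payload prints the decoder's own message, while a terminal that cannot display the curly quotes still shows the text in escaped form.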

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow