Question


Hi! Here is my code:

def on_data(self, data):
    j_data = json.loads(data)
    tweet = data.split(',"text":"')[1].split('","source')[0]
    print j_data[u"text"]


    saveTweet = str(time.time())+'::'+tweet
    saveFile = open('tweetDB1.csv','a')
    saveFile.write(saveTweet)
    saveFile.write('\n')
    saveFile.close()

BUT! I need to write the tweet to the file as a plain string, without any garbage in it. If I use print tweet (instead of print j_data[u"text"]), I get escaped Unicode code points rather than the decoded text: -\u0422\u044b \u0432\u0438\u0434\u0435\u043b

How can I fix it?

Solution

To append just the text field to a CSV file, use:

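import json
import time

def on_data(self, data):
    # Parse the raw JSON payload; this decodes the \u.... escapes.
    j_data = json.loads(data)
    tweet = j_data[u"text"]

    # Append "<timestamp>::<tweet>" as UTF-8 bytes (Python 2).
    with open('tweetDB1.csv', 'a') as save_file:
        save_file.write('{}::{}\n'.format(time.time(), tweet.encode('utf8')))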

This loads the original JSON data into a dictionary, uses just the text field, and encodes it to UTF-8 when writing to the file.
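The snippet above targets Python 2, where the text has to be encoded by hand before it is written. As a rough sketch of the same idea that should also run on Python 3, the file can be opened with io.open and an explicit encoding, so the encoding happens on write. The helper name append_tweet and its path argument are assumptions made for illustration; they are not part of the original code.

```python
import io
import json
import time


def append_tweet(path, data):
    """Append '<timestamp>::<tweet text>' to path as UTF-8.

    Hypothetical helper: `path` and the function name are illustrative only.
    """
    # json.loads turns the \u.... escapes into real Unicode characters.
    text = json.loads(data)["text"]
    line = u"{}::{}\n".format(time.time(), text)
    # io.open encodes the text on write, so no manual .encode() is needed.
    with io.open(path, "a", encoding="utf-8") as save_file:
        save_file.write(line)
    return line
```

Inside on_data this would be called as append_tweet('tweetDB1.csv', data).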

Your version used the JSON-encoded string, ignoring the j_data dictionary altogether. You could just as well have removed the json.loads() call. JSON uses \u.... escape sequences to represent Unicode codepoints. The above writes UTF-8 data instead.
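To make the difference concrete, here is a small sketch comparing the two approaches on a made-up payload. The raw string is an assumption shaped like the data passed to on_data(); real payloads contain many more fields.

```python
import json

# Hypothetical raw payload; the escapes are literal backslash-u sequences.
raw = r'{"id":1,"text":"\u0422\u044b \u0432\u0438\u0434\u0435\u043b","source":"web"}'

# Slicing the raw string keeps the JSON escape sequences as plain text.
sliced = raw.split(',"text":"')[1].split('","source')[0]

# Parsing the JSON decodes each escape into a single Unicode character.
parsed = json.loads(raw)["text"]

print(len(sliced))  # 43: seven 6-character escapes plus one space
print(len(parsed))  # 8: seven Cyrillic letters plus one space
```

The sliced value is what ended up in the file in the question; the parsed value is the actual tweet text.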

Other tips

I think you should encode your data.

saveTweet=saveTweet.encode(encoding='utf-8')

Licensed under: CC-BY-SA with attribution