What data structure should be used to store per-word sentiment counts during sentiment analysis in Python?

StackOverflow https://stackoverflow.com/questions/20649595

Question

We are building a Twitter sentiment analyser in Python. To make the system more efficient, during training we want to store how often each word occurs in positive, negative, and neutral tweets. The sentiment of a word will then be taken as the category with the highest count. Which data structure is suitable for storing words and their sentiment counts (positive, negative, and neutral) dynamically? Example:

 word      positive  negative  neutral
 market          45        12        2
 quite           35        67        5
 good            98         2        7

We need to be able to add words to the structure dynamically.


Solution

Something like this might do the trick for you:

sentiment_words = {}  # dict mapping each word to [positive, negative, neutral] counts

# `words`, ispositive(), isnegative() and isneutral() are placeholders for
# your own token stream and sentiment checks.
for word in words:
    if word not in sentiment_words:  # initialize the word if it's not present yet
        sentiment_words[word] = [0, 0, 0]
    if ispositive(word):  # increment the matching sentiment counter
        sentiment_words[word][0] += 1
    elif isnegative(word):
        sentiment_words[word][1] += 1
    elif isneutral(word):
        sentiment_words[word][2] += 1

If you can say more about the specifics, I might be able to tune it a bit for you.
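
As a variation on the same idea, here is a minimal sketch that uses collections.defaultdict and collections.Counter from the standard library, so new words are created on first use and the counts are addressed by label name rather than list index. It assumes the training data is available as (tweet text, label) pairs; the labelled_tweets sample, the label strings, the whitespace tokenisation, and the dominant_sentiment helper are all illustrative assumptions, not part of the question.

```python
from collections import Counter, defaultdict

# Hypothetical training data: (tweet text, label) pairs. The labels are
# assumed to be "positive", "negative" or "neutral".
labelled_tweets = [
    ("good market today", "positive"),
    ("market is quite bad", "negative"),
    ("quite good", "positive"),
]

# word -> Counter of how often the word appeared under each label.
# defaultdict creates an empty Counter the first time a word is seen.
sentiment_counts = defaultdict(Counter)

for text, label in labelled_tweets:
    for word in text.lower().split():  # naive whitespace tokenisation (assumption)
        sentiment_counts[word][label] += 1


def dominant_sentiment(word):
    """Return the label with the highest count for word, or None if unseen."""
    if word not in sentiment_counts:  # membership test does not create an entry
        return None
    return sentiment_counts[word].most_common(1)[0][0]
```

With this layout, sentiment_counts["good"]["positive"] reads the count directly, and a label a word has never been seen with simply counts as 0. If two labels tie for a word, you would need to decide on a tie-breaking rule yourself.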

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow