Question

I'm playing around with sentiment analysis, and I'm looking for some seed data. Is there a free dictionary around?

It can be really simple: 3 sets of texts/sentences, for "positive", "negative", "neutral". It doesn't have to be huge.

Eventually I'll probably generate my own seed data for my specific use case, but it would be great to have something to play with now while I'm building the thing.


OTHER TIPS

If you're interested in sentiment dictionaries, many authors have published work based on manually built lists, and others on semi-automated methods for obtaining lists of opinionated terms. One good approach is to derive the list from the WordNet database by extending a core set of positive/negative words through relationships such as synonymy.
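To make the seed-expansion idea concrete, here is a minimal sketch. It assumes a tiny hand-written synonym map (`SYNONYMS`) standing in for WordNet, and a hypothetical helper `expand_seeds`; neither comes from any of the resources mentioned here. A real version would read synonym links from WordNet itself.

```python
from collections import deque

# Hypothetical toy synonym graph, used only for illustration.
# In practice these links would come from WordNet.
SYNONYMS = {
    "good": ["great", "fine"],
    "great": ["good", "excellent"],
    "fine": ["good"],
    "excellent": ["great"],
    "bad": ["poor", "awful"],
    "poor": ["bad"],
    "awful": ["bad", "terrible"],
    "terrible": ["awful"],
}


def expand_seeds(seeds, synonyms, max_depth=2):
    """Breadth-first expansion of seed words along synonym links.

    Words reachable within max_depth links inherit the seed's polarity.
    """
    seen = set(seeds)
    queue = deque((word, 0) for word in seeds)
    while queue:
        word, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not follow links beyond the depth limit
        for neighbor in synonyms.get(word, []):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, depth + 1))
    return seen


positive = expand_seeds({"good"}, SYNONYMS)
negative = expand_seeds({"bad"}, SYNONYMS)
```

The depth limit matters: synonym chains drift in meaning, so the further a word is from the seed, the less reliable its inherited polarity tends to be.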

A good example of a manually built list is the General Inquirer.

For a semi-automated method that derives lists, check out SentiWordNet from Esuli and Sebastiani.

I believe these are generally available for research, but you may need to contact the authors about using them for non-research purposes.

B.

You can use the AFINN word list here:

http://www2.imm.dtu.dk/pubdb/views/publication_details.php?id=6010

AFINN is a list of English words rated for valence with an integer between minus five (negative) and plus five (positive). The words were manually labeled by Finn Årup Nielsen in 2009-2011. The file is tab-separated. There are two versions:

AFINN-111: Newest version with 2477 words and phrases.

AFINN-96: 1468 unique words and phrases on 1480 lines. Note that there are 1480 lines because some words are listed twice. The word list is not entirely in alphabetical order.
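Since the file is tab-separated (term, then integer score), loading it takes only a few lines. The sketch below is a rough starting point, not an official loader: `load_afinn` and `score_text` are hypothetical helper names, and the three sample rows are made up for illustration rather than copied from the file. To use the real list, pass an open file handle instead of the in-memory sample.

```python
import io
import re


def load_afinn(lines):
    """Parse tab-separated 'term<TAB>score' lines into a dict.

    If a term appears twice (as in AFINN-96), the later line wins.
    """
    scores = {}
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        # Split on the last tab so multi-word phrases stay intact.
        term, value = line.rsplit("\t", 1)
        scores[term] = int(value)
    return scores


def score_text(text, scores):
    """Sum the valence of every known word; unknown words count as 0."""
    words = re.findall(r"[a-z']+", text.lower())
    return sum(scores.get(word, 0) for word in words)


# Illustrative rows only; the real file has thousands of entries.
SAMPLE = "abandon\t-2\ngood\t3\nbad\t-3\n"
scores = load_afinn(io.StringIO(SAMPLE))
```

A total above zero can be read as "positive", below zero as "negative", and zero as "neutral", which gives the three classes asked about in the question. This word-by-word lookup does not match the multi-word phrases in the list; handling those would need an extra matching step.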

I maintain a list of corpora and word lists for sentiment analysis (my AFINN list is one of them):

http://neuro.compute.dtu.dk/wiki/Sentiment_analysis#Corpora

http://neuro.compute.dtu.dk/wiki/Sentiment_analysis#Affective_word_lists

Licensed under: CC-BY-SA with attribution