Question

I am working on building a Question Classification/Answering corpus as a part of my masters thesis. I'm looking at evaluating my expected answer type taxonomy with respect to inter-rater agreement/reliability, and I was wondering: Does anybody know of any decent (preferably free) Java API(s) that can do this?

I'm reasonably certain all I need is Fleiss' Kappa and Krippendorff's Alpha at this point.

Weka provides a kappa statistic in its evaluation package, but I think it can only evaluate a classifier, and I'm not at that stage yet (because I'm still building the data set and classes).

Thanks.


Solution 3

I couldn't find an existing Java API in time for my research, so I ended up implementing both Fleiss' Kappa and Krippendorff's Alpha myself. Preliminary results for our research can be found in this paper.
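For anyone in the same position, here is a minimal sketch of Krippendorff's Alpha for nominal data in Java. It is not the code used for the paper; the class name `KrippendorffAlpha`, the method `nominal`, and the input layout (one row per item, one column per category, each cell holding the number of raters who chose that category) are assumptions made for illustration. Rows may have different totals, so missing ratings are allowed, and items with fewer than two ratings are skipped because they cannot be paired.

```java
/** Illustrative sketch of Krippendorff's alpha for nominal data (names are hypothetical). */
final class KrippendorffAlpha {

    /**
     * counts[i][j] = number of raters who assigned item i to category j.
     * Rows may sum to different totals (missing ratings are simply absent).
     */
    static double nominal(int[][] counts) {
        int categories = counts.length == 0 ? 0 : counts[0].length;
        double[] categoryTotals = new double[categories];
        double observed = 0.0; // weighted count of disagreeing pairs within items
        double n = 0.0;        // number of pairable ratings overall

        for (int[] row : counts) {
            if (row.length != categories) {
                throw new IllegalArgumentException("all rows need the same number of categories");
            }
            int m = 0;
            for (int c : row) m += c;
            if (m < 2) continue; // an item with fewer than two ratings cannot be paired

            long squares = 0;
            for (int j = 0; j < categories; j++) {
                squares += (long) row[j] * row[j];
                categoryTotals[j] += row[j];
            }
            // Ordered pairs of ratings with different categories, weighted by 1 / (m - 1).
            observed += ((long) m * m - squares) / (double) (m - 1);
            n += m;
        }

        double sumOfSquares = 0.0;
        for (double t : categoryTotals) sumOfSquares += t * t;
        double expected = n * n - sumOfSquares; // disagreeing pairs expected by chance

        // alpha = 1 - D_o / D_e; the result is NaN when only one category was ever used.
        return 1.0 - (n - 1.0) * observed / expected;
    }
}
```

Calling `KrippendorffAlpha.nominal(counts)` returns a value of 1 for perfect agreement and values near 0 when agreement is no better than chance.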

OTHER TIPS

Check the open-source QDAP code (University of Pittsburgh).

I ported a MATLAB implementation of Fleiss' kappa to Python/NumPy.

http://code.google.com/p/hydrat/source/browse/src/hydrat/common/fleiss.py

It is not difficult to implement, so perhaps you could port it to Java yourself.
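To give a rough idea of what a Java version might look like, here is a minimal sketch written from the standard Fleiss' kappa formula rather than ported from the linked file. The class name `FleissKappa`, the method `compute`, and the input layout (one row per item, one column per category, each cell holding the number of raters who chose that category) are assumptions for illustration. Fleiss' kappa requires the same number of ratings for every item, so the sketch rejects anything else.

```java
/** Illustrative sketch of Fleiss' kappa (names are hypothetical). */
final class FleissKappa {

    /**
     * counts[i][j] = number of raters who assigned item i to category j.
     * Every row must sum to the same number of raters (at least 2).
     */
    static double compute(int[][] counts) {
        int items = counts.length;
        if (items == 0) throw new IllegalArgumentException("no items");
        int categories = counts[0].length;

        int raters = 0;
        for (int c : counts[0]) raters += c;
        if (raters < 2) throw new IllegalArgumentException("need at least two ratings per item");

        double[] categoryTotals = new double[categories];
        double agreementSum = 0.0;
        for (int[] row : counts) {
            if (row.length != categories) {
                throw new IllegalArgumentException("all rows need the same number of categories");
            }
            int rowSum = 0;
            long squares = 0;
            for (int j = 0; j < categories; j++) {
                rowSum += row[j];
                squares += (long) row[j] * row[j];
                categoryTotals[j] += row[j];
            }
            if (rowSum != raters) {
                throw new IllegalArgumentException("every item needs the same number of ratings");
            }
            // P_i: share of rater pairs that agree on this item.
            agreementSum += (squares - raters) / (double) ((long) raters * (raters - 1));
        }

        double observed = agreementSum / items; // mean of P_i over all items
        double expected = 0.0;                  // chance agreement: sum of squared category shares
        double totalRatings = (double) items * raters;
        for (double t : categoryTotals) {
            double p = t / totalRatings;
            expected += p * p;
        }

        // The result is NaN when every rating falls into a single category (expected == 1).
        return (observed - expected) / (1.0 - expected);
    }
}
```

Calling `FleissKappa.compute(counts)` returns 1 for perfect agreement and values at or below 0 when agreement is no better than chance.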

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow