I have a set of text files in a particular domain. I need to rank the files based on some metric.

Please suggest a few metrics that can be used to rank my text files (term frequency, size, frequency of use, etc.). I would then like to use text mining techniques to rank the files based on one of these metrics.


Solution

The main problem I ran into was how to rank the documents, whether by their relevance or by some other metric.

I have since concluded that ranking documents by their content (relevance) gives better results.

I am using a vector-based approach to rank documents against the search words given in the query. I am not sure it is the best approach, but it gives results of average accuracy.
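The answer does not say which vector-based approach was used, so the following is only a minimal sketch of one common variant: TF-IDF weighting with cosine similarity between the query and each document. The function names (`tokenize`, `rank_documents`), the smoothed IDF formula, and the simple regex tokenizer are all assumptions made for illustration, not the author's actual method.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens (a deliberately simple tokenizer)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def rank_documents(docs, query):
    """Rank documents by cosine similarity between TF-IDF vectors.

    docs:  dict mapping a document name to its text (hypothetical input shape).
    query: string of search words.
    Returns a list of (name, score) pairs, highest score first.
    """
    tokens = {name: tokenize(text) for name, text in docs.items()}
    n_docs = len(tokens)

    # Document frequency: in how many documents does each term occur?
    df = Counter()
    for words in tokens.values():
        df.update(set(words))

    # Smoothed inverse document frequency (an assumed, commonly used variant).
    idf = {term: math.log((1 + n_docs) / (1 + count)) + 1.0
           for term, count in df.items()}

    def vectorize(words):
        # Sparse TF-IDF vector; terms never seen in the corpus are dropped.
        tf = Counter(words)
        return {term: count * idf[term] for term, count in tf.items() if term in idf}

    def cosine(a, b):
        dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
        norm_a = math.sqrt(sum(w * w for w in a.values()))
        norm_b = math.sqrt(sum(w * w for w in b.values()))
        if norm_a == 0.0 or norm_b == 0.0:
            return 0.0
        return dot / (norm_a * norm_b)

    query_vec = vectorize(tokenize(query))
    scores = [(name, cosine(query_vec, vectorize(words)))
              for name, words in tokens.items()]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Toy corpus, made up for this example.
    corpus = {
        "a.txt": "cats chase mice",
        "b.txt": "dogs chase cats and cats",
        "c.txt": "stock market prices",
    }
    for name, score in rank_documents(corpus, "cats"):
        print(f"{name}: {score:.3f}")
```

Cosine similarity normalizes by vector length, so a long file does not outrank a short one merely because it contains more words. In practice, stop-word removal and stemming would likely improve on the average accuracy mentioned above.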

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow