Question

I have a set of text files in a particular domain. I need to rank the files based on some metric.

Please suggest a few metrics that can be used to rank my text files (term frequency, size, frequency of use, etc.). I would then like to use text mining techniques to rank the files based on one of these metrics.


Solution

The major issue I came across was how to rank the documents: by their relevance or by some other metric.

I have come to the conclusion that ranking documents by their content (relevance) provides better results.

I am using a vector-based approach to rank documents against the search words given in the query. I am not sure it is the best approach, but it provides results with average accuracy.
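
A minimal sketch of what such a vector-based ranking could look like. The answer does not say which weighting or similarity measure was used, so TF-IDF weights and cosine similarity are assumptions here, and the names `tokenize` and `rank_documents` as well as the sample file names are hypothetical:

```python
import math
from collections import Counter


def tokenize(text):
    # Naive tokenizer (assumption): lowercase and split on whitespace.
    return text.lower().split()


def rank_documents(docs, query):
    """Rank documents against a query by cosine similarity of TF-IDF vectors.

    docs is a dict mapping a file name to its text.
    Returns a list of (name, score) pairs, highest score first.
    """
    tokenized = {name: tokenize(text) for name, text in docs.items()}
    n_docs = len(tokenized)

    # Document frequency: number of documents containing each term.
    df = Counter()
    for tokens in tokenized.values():
        df.update(set(tokens))

    def idf(term):
        # Smoothed inverse document frequency, so unseen terms do not divide by zero.
        return math.log((1 + n_docs) / (1 + df[term])) + 1.0

    def vectorize(tokens):
        # Sparse vector: term -> raw term count * idf.
        return {term: count * idf(term) for term, count in Counter(tokens).items()}

    def cosine(a, b):
        dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
        norm_a = math.sqrt(sum(w * w for w in a.values()))
        norm_b = math.sqrt(sum(w * w for w in b.values()))
        if norm_a == 0.0 or norm_b == 0.0:
            return 0.0
        return dot / (norm_a * norm_b)

    query_vec = vectorize(tokenize(query))
    scores = [(name, cosine(query_vec, vectorize(tokens)))
              for name, tokens in tokenized.items()]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    sample = {
        "a.txt": "neural networks for image classification",
        "b.txt": "cooking recipes for pasta and pizza",
        "c.txt": "image segmentation with neural networks and more neural layers",
    }
    for name, score in rank_documents(sample, "neural networks"):
        print(name, round(score, 3))
```

Documents that share no term with the query get a score of 0 and end up at the bottom. A real setup would probably also want stop-word removal and stemming, which this sketch leaves out.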

Licensed under: CC-BY-SA with attribution