Question

I have a set of text files in a particular domain. I need to rank the files based on some metric.

Please suggest a few metrics that can be used to rank my text files (term frequency, size, frequency of use, etc.). I would then like to use text mining techniques to rank the files based on one of these metrics.


Solution

The major issue I came across was how to rank the documents: by their relevance or by some other metric.

I have now come to the conclusion that ranking documents by their content (relevance) gives better results.

I am using a vector-based approach to rank the documents against the search words given in the query. I am not sure it is the best approach, but it gives results with average accuracy.
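
For anyone wanting to try something similar, here is a minimal sketch of one common vector-based approach. It assumes TF-IDF weighting with cosine similarity, which the answer above does not confirm was the exact scheme used. The function names (tokenize, tf_idf_vectors, cosine, rank) and the smoothed IDF formula are illustrative choices, not part of any particular library.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase + whitespace split. A real pipeline would also
    # strip punctuation, remove stop words, and possibly stem.
    return text.lower().split()


def tf_idf_vectors(docs):
    """Return one sparse TF-IDF vector (dict) per document, plus the IDF table."""
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    # Document frequency: in how many documents each term appears.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))
    # Smoothed IDF (an assumed variant) so that no weight is zero or negative.
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    vectors = [
        {t: tf * idf[t] for t, tf in Counter(tokens).items()}
        for tokens in tokenized
    ]
    return vectors, idf


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank(query, docs):
    """Return (score, document index) pairs, best match first."""
    vectors, idf = tf_idf_vectors(docs)
    # Query terms that never occur in the collection are ignored.
    q = {t: tf * idf[t] for t, tf in Counter(tokenize(query)).items() if t in idf}
    scores = [(cosine(q, v), i) for i, v in enumerate(vectors)]
    return sorted(scores, key=lambda s: (-s[0], s[1]))
```

Calling rank("some query", list_of_file_contents) gives the files ordered by similarity to the query. Simpler metrics such as raw term frequency or file size could be swapped in for the score, but they ignore how distinctive a term is across the collection.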

Licensed under: CC-BY-SA with attribution