How to find trending tags in long search strings
Question
I would like to show trending tags on my website based on the searches users make. The problem I can't see a simple solution for is how to extract the important terms from a search string. For example, many users might search for "visual studio" for different purposes: "visual studio 2010", "visual studio unit testing", "visual studio web forms components". In those three searches, "visual studio" is trending. How can an algorithm notice that, given that "visual studio" will in most cases be mixed with many other words?
Thank you!
Solution
- Split every search query into an array of single words.
- For each pair of words in a query, calculate a proximity score from the distance between them (the nearer the words, the higher the score).
- Sum this score for each word pair across all queries.
The word pairs with the highest totals are your "trending tags".
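The steps above can be sketched in Python roughly as follows. This is only an illustration under assumptions: the answer does not say how to turn distance into a score, so the `1 / distance` formula, the function name `trending_pairs`, and the `top_n` parameter are all made up here for the example.

```python
from collections import Counter
from itertools import combinations


def trending_pairs(queries, top_n=5):
    """Return the top_n word pairs ranked by summed proximity score.

    Hypothetical helper: the scoring formula is an assumption,
    not something specified in the answer above.
    """
    scores = Counter()
    for query in queries:
        # Step 1: split the query into single words.
        words = query.lower().split()
        # Step 2: score every pair of positions by how close they are.
        for (i, a), (j, b) in combinations(enumerate(words), 2):
            if a == b:
                continue  # ignore a word paired with itself
            # Assumed score: adjacent words get 1.0, one word apart 0.5, ...
            # Step 3: sum the score for the pair across all queries.
            scores[(a, b)] += 1.0 / (j - i)
    return scores.most_common(top_n)


queries = [
    "visual studio 2010",
    "visual studio unit testing",
    "visual studio web forms components",
]
print(trending_pairs(queries, top_n=1))
# [(('visual', 'studio'), 3.0)]
```

Pairs are kept in the order the words appear in the query, so "visual studio" and "studio visual" would be counted separately. Sorting each pair before counting would merge them, at the cost of losing the readable word order.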
OTHER TIPS
Have a look at this CodePlex project:
http://www.codeplex.com/TheTagCloud
It includes a function that takes an HTML file as input and returns a tag cloud.