Question

I've been trying to learn text mining and related topics in the collective intelligence field. I'd like to build an app that scans a document and shows related posts/articles on the page.

What algorithm(s) would help retrieve the required info?

Thanks

/A

Solution

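A simple method is to count the non-common words on the page (everything left after dropping stop words such as "the" or "and") and how many times each occurs. The more often a word shows up, the more likely it is to describe the content of the post. You can then use the most frequent words as search terms to look up other articles/posts.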
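
As a rough illustration of that counting idea, here is a minimal Python sketch. The stop-word list, the function names `top_keywords` and `related_posts`, and the overlap-based scoring are all assumptions made for the example, not part of the answer above; a real application would likely use a fuller stop-word list and a proper search index.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption); real lists are much longer.
STOP_WORDS = {
    "the", "a", "an", "and", "or", "of", "to", "in", "is", "it",
    "on", "for", "with", "that", "this", "are", "as", "be", "by", "at",
}

def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z']+", text.lower())

def top_keywords(text, n=5):
    """Return the n most frequent non-common words in the text."""
    counts = Counter(
        w for w in tokenize(text) if w not in STOP_WORDS and len(w) > 2
    )
    return [word for word, _ in counts.most_common(n)]

def related_posts(page_text, posts, n_keywords=5):
    """Rank posts (a dict of title -> body) by how many page keywords they contain."""
    keywords = set(top_keywords(page_text, n_keywords))
    scored = []
    for title, body in posts.items():
        overlap = len(keywords & set(tokenize(body)))
        if overlap:
            scored.append((overlap, title))
    # Highest overlap first; ties broken alphabetically by title.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [title for _, title in scored]
```

Calling `related_posts(page_text, posts)` with the current page's text and a dictionary of candidate posts returns the titles that share at least one keyword with the page, best match first.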

OTHER TIPS

You can use the Resource Description Framework (RDF). RDF knowledge bases contain structured facts and the connections between them, so you can fetch the RDF records for every word in the text and connect them into a graph. The nodes with the largest number of edges, and the root nodes if the graph is tree-like, will point to the theme of the document.
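
The "nodes with the most edges" step can be sketched in plain Python, assuming the RDF records have already been retrieved and reduced to (subject, predicate, object) triples. The function name `central_nodes` and the sample triples are hypothetical; fetching the records from an actual RDF store is not shown.

```python
from collections import Counter

def central_nodes(triples, n=3):
    """Return the n nodes that take part in the most triples (highest degree)."""
    degree = Counter()
    for subject, _predicate, obj in triples:
        # Each triple is one edge, so it adds one to both endpoints.
        degree[subject] += 1
        degree[obj] += 1
    # Highest degree first; ties broken alphabetically for a stable result.
    ranked = sorted(degree.items(), key=lambda item: (-item[1], item[0]))
    return [node for node, _ in ranked[:n]]

# Hypothetical triples for the words found in a document.
triples = [
    ("python", "is_a", "language"),
    ("java", "is_a", "language"),
    ("language", "used_in", "programming"),
    ("python", "used_in", "programming"),
]
print(central_nodes(triples, 1))
```

Here the best-connected node is a candidate for the document's theme; detecting root nodes in a tree-like graph would need an extra pass over edge directions.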

Licensed under: CC-BY-SA with attribution