Question

I've been trying to learn text mining and related topics in the collective intelligence field. I'd like to build an app that scans a document and shows related posts/articles on the page.

What algorithm(s) would help retrieve the required information?

Thanks

/A

Solution

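A simple method is to count the non-common words (everything left after dropping stop words such as "the" and "of") and how often each appears on the page. The more often a word shows up, the better it is likely to describe the content of the post. You can then use the most frequent words as search terms to look up other articles/posts.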
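As a rough illustration of the counting idea, here is a minimal Python sketch. The stop-word list, the function names `top_terms` and `related_posts`, and the overlap-based ranking are all assumptions made for the example, not something from the original answer; a real app would likely use a fuller stop-word list and a proper search index.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption); real lists are much longer.
STOP_WORDS = {
    "a", "an", "and", "the", "of", "to", "in", "is", "it", "for",
    "on", "with", "that", "this", "are", "as", "be", "by", "or",
}

def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z']+", text.lower())

def top_terms(text, n=5):
    """Return the n most frequent non-common words in the text."""
    counts = Counter(w for w in tokenize(text) if w not in STOP_WORDS)
    return [word for word, _ in counts.most_common(n)]

def related_posts(page_text, posts, n_terms=5):
    """Rank posts (title -> body) by how many of the page's top terms they contain.

    Posts that share no top term with the page are left out.
    """
    terms = set(top_terms(page_text, n_terms))
    scores = {}
    for title, body in posts.items():
        overlap = len(terms & set(tokenize(body)))
        if overlap:
            scores[title] = overlap
    return sorted(scores, key=scores.get, reverse=True)
```

Calling `related_posts(current_page_text, all_posts)` would then give the titles to show on the page, best match first. Raw counts favor long documents and generally frequent words, so weighting schemes such as TF-IDF are a common next step.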

Other tips

You can use the Resource Description Framework (RDF). RDF knowledge bases contain structured knowledge: concepts and the connections between them. So you can fetch the RDF records for every word in the text and connect them in a graph. The nodes with the most edges, and the root nodes (if the graph is tree-like), will point to the theme of the document.
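The "nodes with the most edges" step can be sketched in a few lines of Python. This assumes the RDF records have already been fetched and reduced to plain (subject, predicate, object) tuples; the function name `central_nodes` and the sample triples are hypothetical and only illustrate the degree count, not how to query an actual RDF store.

```python
from collections import Counter

def central_nodes(triples, n=3):
    """Return the n nodes that take part in the most edges.

    Each triple (subject, predicate, object) is treated as one edge
    between subject and object; the predicate is ignored here.
    """
    degree = Counter()
    for subject, _predicate, obj in triples:
        degree[subject] += 1
        degree[obj] += 1
    return [node for node, _ in degree.most_common(n)]

# Hypothetical hand-written triples standing in for real RDF records.
triples = [
    ("python", "is_a", "language"),
    ("java", "is_a", "language"),
    ("ruby", "is_a", "language"),
    ("python", "used_for", "mining"),
]
print(central_nodes(triples, 2))
```

With these sample triples, "language" has the most edges, which matches the intuition that the best-connected concept hints at the document's theme.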

License: CC-BY-SA with attribution
Not affiliated with StackOverflow