Question

This may look easy, but I am confused.

What is the difference between Text Mining and Information Extraction?

Solution

Information extraction

(IE) is the task of automatically extracting structured information from unstructured and/or semi-structured machine-readable documents. In most cases this activity concerns processing human language texts by means of natural language processing (NLP). Recent activities in multimedia document processing, such as automatic annotation and content extraction from images, audio, and video, could also be seen as information extraction.
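To make "structured information from unstructured text" concrete, here is a minimal sketch using only a regular expression. The sample sentences, the `PATTERN`, and the `extract_hires` helper are all hypothetical and chosen for illustration; real IE systems usually rely on NLP models rather than a single hand-written pattern.

```python
import re

# Hypothetical sample text; the sentences are made up for illustration.
TEXT = (
    "Alice Smith joined Acme Corp in 2015. "
    "Bob Jones joined Globex Inc in 2019."
)

# Assumed pattern for sentences shaped like "<Person> joined <Company> in <Year>".
PATTERN = re.compile(
    r"(?P<person>[A-Z][a-z]+ [A-Z][a-z]+) joined "
    r"(?P<company>[A-Z][A-Za-z]+ (?:Corp|Inc)) in (?P<year>\d{4})"
)

def extract_hires(text):
    """Return one structured record (a dict) per matching sentence."""
    return [
        {"person": m["person"], "company": m["company"], "year": int(m["year"])}
        for m in PATTERN.finditer(text)
    ]

records = extract_hires(TEXT)
for record in records:
    print(record)
```

The free text goes in, and rows with fixed fields (person, company, year) come out. That conversion is the core of what IE is meant to do.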

Information retrieval

(IR) is the activity of obtaining information resources relevant to an information need from a collection of information resources. Searches can be based on metadata or on full-text indexing.

Text mining is a much broader area than information retrieval. Typical text mining tasks include document classification, document clustering, ontology building, sentiment analysis, document summarization, and information extraction. Information retrieval, by contrast, typically deals with crawling, parsing, and indexing documents and then retrieving them.
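For contrast, the indexing and retrieving side of information retrieval can be sketched as a tiny inverted index. This is only an illustration under simple assumptions: the `DOCS` collection, whitespace tokenization, and the `build_index` and `search` helpers are hypothetical, and real search engines add ranking, stemming, and much more.

```python
from collections import defaultdict

# Hypothetical three-document collection, used only for illustration.
DOCS = {
    1: "text mining finds patterns in text",
    2: "information extraction finds entities in text",
    3: "retrieval returns relevant documents",
}

def build_index(docs):
    """Map each lowercase token to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in text.lower().split():
            index[token].add(doc_id)
    return index

def search(index, query):
    """Return the sorted ids of documents that contain every query token."""
    tokens = query.lower().split()
    if not tokens:
        return []
    result = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        result &= index.get(token, set())
    return sorted(result)

INDEX = build_index(DOCS)
print(search(INDEX, "finds text"))
print(search(INDEX, "mining text"))
```

Note that retrieval returns whole documents that match a query. It does not pull facts out of them or discover new patterns across them.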

Source

OTHER TIPS

First, let's look at the meaning of these two important terms.

Text Mining is the automatic discovery of new, previously unknown information through automatic analysis of various textual resources. It starts by extracting facts and events from textual sources, and then enables forming new hypotheses that are further explored by traditional data mining and data analysis methods.

Information Extraction is more of an NLP (natural language processing) and machine learning problem, where you train the machine to extract hidden information from raw text.

So the difference can be put this way: text mining is a much broader area than Information Extraction. Text mining is concerned with looking for patterns in unstructured text. The related task of Information Extraction (IE) is about locating specific items in natural-language documents.
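As a rough sketch of "looking for patterns" rather than locating one specific item, the snippet below counts which terms co-occur across a small corpus. The `CORPUS`, the `STOPWORDS` set, and the `cooccurrence_counts` helper are hypothetical; co-occurrence counting is just one simple stand-in for the many techniques text mining covers.

```python
from collections import Counter
from itertools import combinations

# Hypothetical mini-corpus; sentences and stopword list are assumptions.
CORPUS = [
    "aspirin reduces headache",
    "aspirin eases headache",
    "aspirin reduces fever",
    "ibuprofen reduces fever",
]
STOPWORDS = {"reduces", "eases"}

def cooccurrence_counts(corpus):
    """Count how often each unordered term pair shares a document."""
    counts = Counter()
    for doc in corpus:
        # Sort so that each pair is always stored in the same order.
        terms = sorted(set(doc.lower().split()) - STOPWORDS)
        counts.update(combinations(terms, 2))
    return counts

pairs = cooccurrence_counts(CORPUS)
print(pairs.most_common(1))
```

Nothing here was asked for by name. The frequent pair emerges from the corpus as a whole, which is the kind of pattern that can then be explored as a hypothesis.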

Licensed under: CC-BY-SA with attribution