Question

Perhaps this is too broad, but I am looking for references on how to use deep learning in a text summarization task.

I have already implemented text summarization using standard word-frequency approaches and sentence ranking, but I'd like to explore the possibility of using deep learning techniques for this task. I have also gone through some implementations given on wildml.com using Convolutional Neural Networks (CNN) for sentiment analysis; I'd like to know how one could use libraries such as TensorFlow or Theano for text summarization and keyword extraction. It's been about a week since I started experimenting with neural nets, and I am really excited to see how the performance of these libraries compares to my previous approaches to this problem.

I am particularly looking for some interesting papers and GitHub projects related to text summarization using these frameworks. Can anyone provide me with some references?

Solution

The Google Research Blog should be helpful in the context of TensorFlow.

In the above article, there is a reference to the Annotated English Gigaword dataset which is routinely used for text summarization.

The 2014 paper by Sutskever et al., titled Sequence to Sequence Learning with Neural Networks, could be a meaningful start on your journey, as it turns out that for shorter texts, summarization can be learned end-to-end with a deep learning technique.

Lastly, here is a great GitHub repository demonstrating text summarization while making use of TensorFlow.

OTHER TIPS

This is an open area of research and it certainly depends on the way you frame the problem. If you're talking about multi-document summarization then the problem is slightly different than if you were talking about single-document summarization.

It's worth briefly reviewing the literature.

The link provided by u/Society Of Data Scientists is great, and it's useful for the abstractive summarization task across a single document. There's also work on extractive summarization, which identifies important sentences to extract.
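To make the extractive idea concrete, here is a minimal word-frequency baseline in plain Python, along the lines of what the question describes as already implemented. It is only a sketch: the function names, the tiny stop-word list, and the regex-based sentence splitter are all assumptions made for illustration and do not come from any of the referenced papers or repositories.

```python
import re
from collections import Counter

# Deliberately tiny, illustrative stop-word list (an assumption, not a standard one).
STOP_WORDS = {"the", "a", "an", "of", "to", "and", "in", "is", "it",
              "that", "for", "on", "with", "as", "are", "was"}

def split_sentences(text):
    # Naive splitter: break after ., ! or ? when followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def tokenize(sentence):
    # Lowercase words with stop words removed.
    return [w for w in re.findall(r"[a-z']+", sentence.lower())
            if w not in STOP_WORDS]

def summarize(text, n_sentences=2):
    """Return the n highest-scoring sentences, in their original order."""
    sentences = split_sentences(text)
    freq = Counter(w for s in sentences for w in tokenize(s))

    def score(sentence):
        # Average document frequency of the sentence's content words.
        words = tokenize(sentence)
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]), reverse=True)
    return [sentences[i] for i in sorted(ranked[:n_sentences])]
```

A learned extractive model replaces the hand-written `score` function with a classifier or ranker trained on labelled sentences, while the select-the-top-sentences step stays the same.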

Rush et al. have a nice paper on abstractive summarization with attention, which is based on deep learning.

For extractive summarization, you could use an LSTM to build your classifier and use standard TensorFlow/Torch libraries, but there don't seem to be any current publications on using deep learning for this approach.
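As a rough sketch of what such an LSTM sentence classifier might look like with TensorFlow's Keras API: each sentence is assumed to be encoded as a padded sequence of integer token IDs (0 reserved for padding), and the label is assumed to be 1 if the sentence belongs in the summary and 0 otherwise. The function names, layer sizes, and labelling scheme are illustrative assumptions, not taken from a published model.

```python
import numpy as np
import tensorflow as tf

def build_sentence_classifier(vocab_size=10000, embed_dim=64, lstm_units=64):
    """Binary classifier: P(sentence should be kept in the summary)."""
    model = tf.keras.Sequential([
        # Token IDs -> dense vectors; ID 0 is treated as padding and masked.
        tf.keras.layers.Embedding(vocab_size, embed_dim, mask_zero=True),
        # The LSTM reads the sentence and returns its final hidden state.
        tf.keras.layers.LSTM(lstm_units),
        # A single sigmoid unit gives the keep-probability.
        tf.keras.layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model

def pick_top_sentences(probabilities, k=3):
    """Indices of the k most probable sentences, in document order."""
    flat = np.asarray(probabilities).reshape(-1)
    top = np.argsort(-flat)[:k]
    return sorted(int(i) for i in top)
```

Training would then be a call such as `model.fit(x_train, y_train, ...)` on whatever labelled sentences you have; at inference time, `pick_top_sentences(model.predict(x_doc), k)` selects the sentences that form the extractive summary.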

Here are some additional GitHub repos:

This sounds more like extractive summarization if you are looking for keywords. Here are a few papers which probably have implementations:

Neural Summarization by Extracting Sentences and Words

Extractive Summarization using Deep Learning

Semi-supervised Convolutional Neural Networks for Text Categorization via Region Embedding

Also, spaCy (not affiliated) has a good blog post on the general architecture of text extraction tasks.

Licensed under: CC-BY-SA with attribution
Not affiliated with datascience.stackexchange