Question

Thanks in advance for reading my question!

I've been using CNNs to classify text using Keras and TF. My data consists of strings such as "I read the news" or "I read machine learning news", and my labels are tags: Data Science, Reporter, Child...

My issue is that each text can have multiple labels attached to it. How should I construct my target so that it captures all of those labels for each text?

Description, Tag
"I read the news", Child
"I read the news", Reporter
"I read machine learning news", Data Science
"I read machine learning news", Reporter

Solution

CNNs used for classification (generally) apply softmax as the activation function in the last layer, which gives a probability distribution over all possible labels. A loss function is then used to train the CNN so that it produces the "right" label for a new sample.
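As a rough illustration of what softmax does to the last layer's raw scores, here is a minimal pure-Python sketch. The three scores and their mapping to the tags (Child, Reporter, Data Science) are made up for the example; in practice Keras computes this inside the output layer.

```python
import math

def softmax(logits):
    """Turn raw scores into positive values that sum to 1."""
    # Subtracting the max keeps exp() from overflowing; the result is unchanged.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical raw scores for the labels (Child, Reporter, Data Science).
probs = softmax([1.0, 2.0, 0.5])
```

The largest score gets the largest probability, and because the outputs always sum to 1, raising one label's probability necessarily lowers the others.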

The loss function used in this setting is cross-entropy or KL-divergence, both of which measure how far apart two distributions are. For multiclass classification (which differs from your case in that each input example has exactly one label), the desired distribution is (0, 0, ..., 1, ..., 0), where the 1 appears at the position of the actual label. The loss function penalizes predicted distributions that are "far" from this target.
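To make "far" concrete, the sketch below computes both losses by hand for a one-hot target. The two predicted distributions are invented for illustration, and the code assumes every predicted probability is strictly positive.

```python
import math

def cross_entropy(target, predicted):
    # H(t, p) = -sum_i t_i * log(p_i); terms with t_i == 0 contribute nothing.
    return -sum(t * math.log(p) for t, p in zip(target, predicted) if t > 0)

def kl_divergence(target, predicted):
    # KL(t || p) = sum_i t_i * log(t_i / p_i); terms with t_i == 0 are skipped.
    return sum(t * math.log(t / p) for t, p in zip(target, predicted) if t > 0)

one_hot = [0.0, 1.0, 0.0]   # the true label is at position 1
good = [0.1, 0.8, 0.1]      # hypothetical prediction close to the target
bad = [0.7, 0.2, 0.1]       # hypothetical prediction far from the target

loss_good = cross_entropy(one_hot, good)
loss_bad = cross_entropy(one_hot, bad)
```

For a one-hot target the two losses coincide, because the target's own entropy is zero; they only differ by that entropy term when the target spreads mass over several labels.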

In your case, you can encode the target as (0, 0, 1, 0, ..., 1, ...), putting a 1 at the position of every label associated with the input, then use KL-divergence as the loss function and train your CNN to minimize it. Because KL-divergence compares probability distributions, rescale the target so that its entries sum to 1.
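The encoding can be sketched in plain Python using the rows from the question. The variable names and the `normalize` helper are hypothetical, and sorting the tags alphabetically to fix their positions is just one possible choice. Normalizing is shown because a softmax output sums to 1, so a KL-divergence target needs to as well.

```python
# (description, tag) pairs taken from the question.
rows = [
    ("I read the news", "Child"),
    ("I read the news", "Reporter"),
    ("I read machine learning news", "Data Science"),
    ("I read machine learning news", "Reporter"),
]

# Fix a position for every tag: ['Child', 'Data Science', 'Reporter'].
labels = sorted({tag for _, tag in rows})
index = {tag: i for i, tag in enumerate(labels)}

# Build one multi-hot vector per distinct text.
targets = {}
for text, tag in rows:
    vec = targets.setdefault(text, [0.0] * len(labels))
    vec[index[tag]] = 1.0

def normalize(vec):
    # Rescale a multi-hot vector so its entries sum to 1 (assumes at least one label).
    total = sum(vec)
    return [v / total for v in vec]

kl_targets = {text: normalize(vec) for text, vec in targets.items()}
```

A commonly used alternative, not described in the answer above, is a sigmoid output per label trained with binary cross-entropy; that setup takes the unnormalized multi-hot vectors in `targets` directly.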

Licensed under: CC-BY-SA with attribution
Not affiliated with datascience.stackexchange