Convolutional neural network for sparse one-hot representation

https://datascience.stackexchange.com/questions/5821

16-10-2019

Question

I have some basic features which I encoded in a one-hot vector.

The feature vector has length 400 and is sparse.

I have seen conv nets applied to dense feature vectors.

Are there any problems with applying conv nets to sparse feature vectors?

Solution

I would not apply convolutional neural networks to your problem (at least from what I can gather from the description).

Convolutional nets' strengths and weaknesses are related to a core assumption in the model class: translating patterns of features in a regular way either has a minor impact on the outcome, or has a specific useful meaning. So a pattern 1 0 1 seen in features 9,10,11 is similar in some way to the same pattern seen in features 15,16,17. Having this assumption built into the model allows you to train a network with far fewer free parameters when dealing with e.g. image data, where this is a key property of data captured by scanners and cameras.
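The translation assumption can be illustrated with a small NumPy sketch. The 20-element input, the 1 0 1 filter, and all variable names are illustrative assumptions rather than anything from the question; `np.correlate` stands in for a single convolutional filter.

```python
import numpy as np

# Hypothetical 20-feature input with the pattern 1 0 1 at indices 9, 10, 11.
x = np.zeros(20)
x[[9, 11]] = 1.0

# The same pattern moved six positions to the right (indices 15, 16, 17).
x_shifted = np.roll(x, 6)

# One filter tuned to the 1 0 1 pattern; its three weights are shared
# across every position of the input.
kernel = np.array([1.0, 0.0, 1.0])

# Sliding dot product without padding: one response per offset.
response = np.correlate(x, kernel, mode="valid")
response_shifted = np.correlate(x_shifted, kernel, mode="valid")

peak = int(np.argmax(response))                  # 9
peak_shifted = int(np.argmax(response_shifted))  # 15

print(peak, response[peak])                          # 9 2.0
print(peak_shifted, response_shifted[peak_shifted])  # 15 2.0
```

The filter responds with the same strength in both cases and only the location of the peak moves. That weight sharing is what saves parameters, and it only helps when a pattern means the same thing wherever it occurs.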

With one-hot encoding of features, the feature vector index assigned to each value or category is essentially arbitrary (for example, determined by some hashing function). Translations between indices of the feature vector carry no meaning. The patterns 0 0 1 0 1 0 0 and 0 0 0 1 0 1 0 can represent entirely different things, and any association between them is purely by chance. You can treat a sparse one-hot encoding as an image if you wish, but there is no good reason to do so, and models that assume patterns can be translated while preserving their meaning will not do well.
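A minimal sketch of why the index assignment matters, under assumed toy values: the same two active categories are encoded under two equally valid index assignments. The 8-element vectors, the permutation, and the weights are all hypothetical.

```python
import numpy as np

# Encoding A: the two active categories happen to land on indices 2 and 4.
x_a = np.zeros(8)
x_a[[2, 4]] = 1.0

# Encoding B: the same categories under another arbitrary index assignment.
# perm[j] is the index in encoding A of the category stored at j in B.
perm = np.array([2, 4, 0, 1, 3, 5, 6, 7])
x_b = x_a[perm]  # -> [1, 1, 0, 0, 0, 0, 0, 0]

# A convolution-style filter sees two different local patterns.
kernel = np.array([1.0, 0.0, 1.0])
conv_a = np.correlate(x_a, kernel, mode="valid")
conv_b = np.correlate(x_b, kernel, mode="valid")
print(conv_a.max(), conv_b.max())  # 2.0 1.0

# A fully-connected layer has one weight per (output, input index) pair,
# so relabelling the inputs only relabels the weight columns.
W = np.arange(24.0).reshape(3, 8)  # hypothetical weights, 3 output units
dense_a = W @ x_a
dense_b = W[:, perm] @ x_b
print(np.array_equal(dense_a, dense_b))  # True
```

The filter's response depends on how far apart the active indices happen to fall, which is an accident of the encoding. The dense layer is indifferent to the ordering because it never assumes that neighbouring indices are related.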

For such a small sparse feature vector, assuming you want to try a neural network model, use a simple fully-connected network.
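As a rough sketch of what such a fully-connected network could look like, here is a one-hidden-layer model in plain NumPy. The data is a synthetic stand-in (4 categorical fields with 100 categories each, giving 400 one-hot columns), and the target, layer size, learning rate, and step count are all arbitrary assumptions. In practice a framework such as Keras or PyTorch would be the more usual choice; NumPy is used here only to keep the example self-contained.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in data (an assumption, not the asker's data):
# 4 categorical fields x 100 categories, one-hot encoded into 400 columns.
n_samples, n_fields, n_cats = 256, 4, 100
cats = rng.integers(0, n_cats, size=(n_samples, n_fields))
X = np.zeros((n_samples, n_fields * n_cats))
rows = np.repeat(np.arange(n_samples), n_fields)
cols = (cats + np.arange(n_fields) * n_cats).ravel()
X[rows, cols] = 1.0

# Arbitrary synthetic binary target: depends on the first field's category.
y = (cats[:, 0] < n_cats // 2).astype(float)

# Parameters of a 400 -> 32 -> 1 network.
n_hidden = 32
W1 = rng.normal(scale=0.1, size=(X.shape[1], n_hidden))
b1 = np.zeros(n_hidden)
W2 = rng.normal(scale=0.1, size=(n_hidden, 1))
b2 = np.zeros(1)

def forward(inputs):
    """Return hidden activations and predicted probabilities."""
    h = np.maximum(0.0, inputs @ W1 + b1)   # ReLU hidden layer
    logits = (h @ W2 + b2).ravel()
    p = 1.0 / (1.0 + np.exp(-logits))       # sigmoid output
    return h, p

def bce(p, target):
    """Mean binary cross-entropy."""
    eps = 1e-12
    return float(-np.mean(target * np.log(p + eps)
                          + (1.0 - target) * np.log(1.0 - p + eps)))

_, p_initial = forward(X)
initial_loss = bce(p_initial, y)

# Full-batch gradient descent with hand-written backpropagation.
lr = 0.5
for _ in range(500):
    h, p = forward(X)
    d_logits = (p - y)[:, None] / n_samples  # d(mean BCE)/d(logits)
    dW2 = h.T @ d_logits
    db2 = d_logits.sum(axis=0)
    dh = d_logits @ W2.T
    dh[h <= 0.0] = 0.0                       # ReLU passes no gradient where inactive
    dW1 = X.T @ dh
    db1 = dh.sum(axis=0)
    W1 -= lr * dW1
    b1 -= lr * db1
    W2 -= lr * dW2
    b2 -= lr * db2

_, p_final = forward(X)
final_loss = bce(p_final, y)
train_accuracy = float(np.mean((p_final > 0.5) == (y > 0.5)))
print(f"loss {initial_loss:.3f} -> {final_loss:.3f}, train accuracy {train_accuracy:.2f}")
```

The numbers printed are training-set figures on made-up data, so they say nothing about generalisation. The point is only that every input index gets its own weights and no ordering of the 400 columns is assumed.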

OTHER TIPS

Although I agree with Neil Slater's response, you should keep a couple of things in mind.

1) "You never know!" In data exploration, you never know what you may find. If you have a ton of data, perhaps playing around with a 20x20 conv net will give you some decent results. Of course, it would help if there were more than just a few features for it to learn from. If your length-400 vector is the result of one-hot encoding 4 different features, then it's probably safe to say that a conv net won't give you much.

2) If you're looking for a reason to implement a conv net, then go for it. Even if your accuracy metrics are terrible, you at least get to learn how to create your net, train it, and predict using your own data. One cannot overestimate this learning experience! It is so much more valuable than running yet another MNIST example out of the box.

3) Comparison. Make a regular net and a conv net, then compare the two. Beyond that, compare them to a random forest, logistic regression, and so on. Do this enough times and you start to develop intuition.

I say do it! (unless somebody is paying you...in which case try the regular NN first)

Licensed under: CC-BY-SA with attribution
