Wavelet Packet Decomposition, Feature Selection and SVM

https://stackoverflow.com//questions/11702767

13-12-2019

Question

I want to know more about a fault detection model using Wavelet Packet Decomposition, Feature Selection and SVM. One can read some related papers here:

http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=4547208

https://mospace.umsystem.edu/xmlui/bitstream/handle/10355/4845/research.pdf?sequence=3

My question is about the "Feature Selection" step, where we need to choose Wavelet Packet Nodes (with their computed RMS values) as features for the final SVM classifier. An SVM also needs to know the label of each vector (+1, -1), but how can we obtain this label during the feature selection process? I don't really understand how the Genetic Algorithm (GA) with 10-fold SVM is used in the above papers. Can anyone explain this to me?

Solution

In the listed paper, the authors are attempting to minimize the amount of noise in their classifier by training the SVM only on packets from frequency bands that have relatively low EMI noise. So, they incorporate the Genetic Algorithm in their feature selection step as follows:

First, each individual in the Genetic Algorithm in this case is a string of 0s and 1s whose length equals the number of frequency bands in your particular problem. Each "slot" (called a "gene" in genetic algorithms) indicates whether or not you should use data from that frequency band when training your classifier.

So if you are using 10 frequency bands in total, each individual, or chromosome (a group of slots/genes), will have length 10 and will look like this (just a string of ten 1s and 0s):

0110101000

You start by generating a whole bunch of random chromosomes. How many is a whole bunch? That depends on your problem, and it takes some effort to determine. I would recommend experimenting heavily with this value, from several hundred to several million, depending on your hardware capabilities.
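As a rough illustration of that first step, here is a minimal sketch in Python. The function name `random_population` and the numbers (50 chromosomes, 10 bands) are my own assumptions for the example, not values from the papers.

```python
import random

def random_population(pop_size, n_bands, rng=None):
    """Return pop_size random chromosomes, each a list of n_bands bits (0 or 1)."""
    rng = rng or random.Random()
    return [[rng.randint(0, 1) for _ in range(n_bands)]
            for _ in range(pop_size)]

# Hypothetical sizes: 50 chromosomes over 10 frequency bands.
population = random_population(pop_size=50, n_bands=10, rng=random.Random(0))

# Print the first chromosome as a bit string such as 0110101000.
print("".join(map(str, population[0])))
```

Passing a seeded `random.Random` just makes the run repeatable while you experiment with the population size.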

Okay, so now you have your random strings of 0s and 1s. So what, right?

Well, here is the fun part: you need to evaluate how good EACH of these chromosomes is (individually, one at a time) at selecting data for you. For each chromosome, you cycle through your data and train your SVM with a data packet if and only if that specific chromosome has a "1" in the position corresponding to the frequency band of that packet.

In other words, take the example chromosome 1000000001. If you receive a packet corresponding to the first frequency band, you accept it into the training set for your SVM. If you receive a packet for the 5th frequency band, this chromosome throws it out.
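To make the accept/throw-out rule concrete, here is a small sketch. The helper `select_features` and the RMS numbers are made up for illustration; the point is that the chromosome only filters the feature values, while the label of the sample is left untouched.

```python
def select_features(chromosome, feature_vector):
    """Keep only the values whose band has a '1' in the chromosome string."""
    if len(chromosome) != len(feature_vector):
        raise ValueError("chromosome and feature vector must have equal length")
    return [value for bit, value in zip(chromosome, feature_vector) if bit == "1"]

# Hypothetical sample: one RMS value per wavelet packet node (10 bands),
# plus a label that is assumed to come with the training data.
rms_values = [0.91, 0.12, 0.33, 0.48, 0.27, 0.05, 0.64, 0.19, 0.72, 0.58]
label = +1

# Chromosome 1000000001 keeps only the first and the last band.
reduced = select_features("1000000001", rms_values)
print(reduced, label)  # [0.91, 0.58] 1
```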

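Since the GA is used only for the feature selection step, you now take the packet data the chromosome "accepted" and use only this data to train your SVM as normal (instead of the whole training data set). The +1/-1 labels are not produced by the GA; they come with the labelled training examples, and the chromosome only decides which frequency bands are kept. Then you compute the error of the SVM you trained with this data (for example with 10-fold cross-validation, which is presumably what the "10-fold SVM" in the question refers to) and assign the chromosome a score based on how good it was.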
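Here is one way the scoring of a single chromosome could be sketched. To keep it runnable without third-party libraries, a simple nearest-centroid classifier stands in for the SVM; that is a swapped-in technique, not what the papers use, and in practice you would plug in a real SVM (for example scikit-learn's `SVC`). The function names, the fold assignment by index, and the toy data are all assumptions for the example.

```python
def mask_columns(chromosome, row):
    """Keep the entries of row whose band is switched on (bit == 1)."""
    return [v for bit, v in zip(chromosome, row) if bit == 1]

def nearest_centroid_predict(train_X, train_y, x):
    """Stand-in classifier (NOT an SVM): pick the class with the closest mean."""
    best_label, best_dist = None, float("inf")
    for label in set(train_y):
        rows = [r for r, y in zip(train_X, train_y) if y == label]
        centroid = [sum(col) / len(rows) for col in zip(*rows)]
        dist = sum((a - b) ** 2 for a, b in zip(x, centroid))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label

def fitness(chromosome, X, y, k=10):
    """Fraction of held-out samples classified correctly over k folds,
    training only on the bands the chromosome selects."""
    if not any(chromosome):
        return 0.0  # no band selected, so there is nothing to train on
    Xs = [mask_columns(chromosome, row) for row in X]
    correct = 0
    for fold in range(k):
        # Sample i is held out in fold i % k and used for training otherwise.
        train_idx = [i for i in range(len(Xs)) if i % k != fold]
        test_idx = [i for i in range(len(Xs)) if i % k == fold]
        train_X = [Xs[i] for i in train_idx]
        train_y = [y[i] for i in train_idx]
        for i in test_idx:
            if nearest_centroid_predict(train_X, train_y, Xs[i]) == y[i]:
                correct += 1
    return correct / len(Xs)

# Toy data (made up): band 0 follows the label, band 1 is large unrelated noise.
X, y = [], []
for i in range(20):
    label = 1 if i % 2 == 0 else -1
    X.append([label * 1.0 + 0.01 * i, 10.0 * (((i * 7) % 5) - 2)])
    y.append(label)

print(fitness([1, 0], X, y))  # score when only the informative band is used
print(fitness([0, 1], X, y))  # score when only the noisy band is used
```

Notice that `y` is passed in unchanged for every chromosome: the labels belong to the samples, and only the columns of `X` are filtered.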

After you repeat this for all of the chromosomes, you have a list of every chromosome and how well the resulting SVM did at classification. You then keep track of the best-performing chromosome seen so far, and apply the usual Genetic Algorithm steps (selection, crossover, mutation) to create a whole new batch of chromosomes. The idea is that by combining the well-performing chromosomes from the previous step, you can get some EVEN BETTER chromosomes for selecting data for your SVM. You simply repeat this evaluate/recombine procedure until the new chromosomes stop doing better than your all-time best.

Make sense?
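The evaluate/recombine loop could look roughly like the sketch below. It is only one possible GA variant under my own assumptions: truncation selection (keep the top half as parents), one-point crossover, bit-flip mutation, and a `patience` counter for stopping. The papers may use different operators. The `fitness` argument is any function that scores a chromosome; here a toy function stands in for the cross-validated SVM score.

```python
import random

def evolve(fitness, n_bands, pop_size=30, generations=40,
           mutation_rate=0.05, patience=10, seed=0):
    """Return (best_chromosome, best_score) found by a simple GA."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(n_bands)]
                  for _ in range(pop_size)]
    best, best_score, stale = None, float("-inf"), 0

    for _ in range(generations):
        # Evaluate every chromosome, best first.
        scored = sorted(((fitness(c), c) for c in population),
                        key=lambda pair: pair[0], reverse=True)
        if scored[0][0] > best_score:
            best_score, best, stale = scored[0][0], scored[0][1][:], 0
        else:
            stale += 1
            if stale >= patience:
                break  # no improvement on the all-time best for a while

        # Truncation selection: the top half become parents.
        parents = [c for _, c in scored[:pop_size // 2]]
        population = []
        while len(population) < pop_size:
            mum, dad = rng.sample(parents, 2)
            cut = rng.randint(1, n_bands - 1)          # one-point crossover
            child = mum[:cut] + dad[cut:]
            child = [bit ^ 1 if rng.random() < mutation_rate else bit
                     for bit in child]                 # bit-flip mutation
            population.append(child)
    return best, best_score

# Toy fitness (made up): reward agreement with a hidden "ideal" band mask.
target = [0, 1, 1, 0, 1, 0, 1, 0, 0, 0]

def toy_fitness(chromosome):
    return sum(a == b for a, b in zip(chromosome, target)) / len(target)

best, score = evolve(toy_fitness, n_bands=10)
print("".join(map(str, best)), score)
```

Because the all-time best is stored separately from the population, a bad generation can never make you lose a good chromosome you already found.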

Licensed under: CC-BY-SA with attribution
