Question

For nonlinear data, when using Support Vector Machines, we can use kernels such as the Gaussian RBF or polynomial kernel to make the data linearly separable in a different (potentially unknown to us) feature space, where the algorithm learns a maximum-margin separating hyperplane.

My question is: how do we create heatmaps such as the one in the image below, which show this maximum-margin separating hyperplane in our original space, and how should they be interpreted?

[Image from the question, not reproduced here]


Solution

I think I can answer that, since I implemented such a thing in my own library, even though I don't really know how it is implemented in other libraries. I am confident, though, that if there are other ways, they don't differ much.

It took me a few weeks to understand how such a graph can be drawn.

Let's start with a general function $f:\mathbb{R} \times \mathbb{R} \to \mathbb{R}$. What you want is to draw points with a colour that signifies the value of the function. One way would be to simplify the whole problem and draw one point for each pixel. This would work, but it only produces shaded surfaces, and it makes it impossible to draw lines with various formats (dotted lines, specific colours, line widths).

The real solution I found makes two simplifications. The first is that, instead of colouring with a gradient, you colour bands. Suppose that your function $f$ takes values in $[-1, 1]$. You can split the co-domain into many subintervals: $[-1, -0.9]$, $[-0.9, -0.8]$ and so on. What you then have to do is paint, for each interval, a polygon filled with the appropriate colour. So the original problem is reduced to multiple instances of a simpler problem. Note that when the intervals are small enough the result looks like a gradient, and even a trained eye would not notice.
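The band idea can be sketched in a few lines of Python. This is only an illustration, not code from my library: the names `band_edges` and `band_index` are made up for this example, and it assumes equally wide bands with out-of-range values clamped to the first or last band.

```python
def band_edges(lo, hi, n_bands):
    """Return n_bands + 1 equally spaced edges covering [lo, hi]."""
    step = (hi - lo) / n_bands
    return [lo + i * step for i in range(n_bands + 1)]


def band_index(value, lo, hi, n_bands):
    """Index of the band that contains value (assumption: clamp outside [lo, hi])."""
    if value <= lo:
        return 0
    if value >= hi:
        return n_bands - 1
    return int((value - lo) / (hi - lo) * n_bands)


# 20 bands of width 0.1 over the co-domain [-1, 1]
edges = band_edges(-1.0, 1.0, 20)
```

Each band index can then be mapped to one colour of a palette, so that every band is drawn with a single flat colour.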

The second simplification is to split the area to be drawn into a grid of small rectangles. So instead of drawing one polygon over the whole surface, you fill each small rectangle with an appropriate, much simpler polygon.

If it's not obvious, this simplifies the problem a great deal. Take, for example, one of the rectangles. It has four corners, and you can take an additional point in the centre of the rectangle (you might need that point in certain situations).

The question is how to fill the proper region with the appropriate colour. You need to evaluate the function at all four corners and at the centre.
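As a rough sketch of that step, the following Python generator splits a rectangular region into cells and evaluates the function at each cell's corners and centre. The name `make_cells` and the dictionary layout are assumptions made for this example, not the structure used in my library.

```python
def make_cells(f, x_min, x_max, y_min, y_max, nx, ny):
    """Yield one dict per grid cell with corner/centre coordinates and f values."""
    dx = (x_max - x_min) / nx
    dy = (y_max - y_min) / ny
    for i in range(nx):
        for j in range(ny):
            x0, y0 = x_min + i * dx, y_min + j * dy
            x1, y1 = x0 + dx, y0 + dy
            # Corners in counter-clockwise order, starting at the bottom-left
            corners = [(x0, y0), (x1, y0), (x1, y1), (x0, y1)]
            centre = ((x0 + x1) / 2, (y0 + y1) / 2)
            yield {
                "corners": corners,
                "values": [f(x, y) for x, y in corners],
                "centre": centre,
                "centre_value": f(*centre),
            }


# Example: a 2 x 2 grid over the unit square for f(x, y) = x + y
cells = list(make_cells(lambda x, y: x + y, 0.0, 1.0, 0.0, 1.0, 2, 2))
```

Neighbouring cells share corners, so a real implementation would probably evaluate the function once per grid node and reuse the values; the sketch recomputes them to stay short.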

There are some specific cases:

  • the function values at all corners are smaller than the start of the interval => you need to do nothing
  • the function values at all corners are greater than the end of the interval => you need to do nothing
  • the function values at all corners are within the interval => fill the whole rectangle with the appropriate colour
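These three cases translate almost directly into code. The sketch below uses a hypothetical helper `classify_cell` and labels every remaining situation as "mixed".

```python
def classify_cell(values, lo, hi):
    """Classify a cell by its corner values relative to the interval [lo, hi]."""
    if all(v < lo for v in values):
        return "below"    # nothing to draw for this interval
    if all(v > hi for v in values):
        return "above"    # nothing to draw for this interval
    if all(lo <= v <= hi for v in values):
        return "inside"   # fill the whole rectangle
    return "mixed"        # the interval covers only part of the rectangle


print(classify_cell([0.02, 0.05, 0.07, 0.04], 0.0, 0.1))  # inside
print(classify_cell([0.02, 0.05, 0.15, 0.04], 0.0, 0.1))  # mixed
```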

You can stop here if you want, but your figures will not look smooth. You can go further:

  • top-left, bottom-left and bottom-right corners in the interval, top-right corner bigger => there are two points, one on the top side and one on the right side, where the function equals the upper end of the interval => those two points together with the top-right corner form a triangle that lies above the interval, so the interval's colour fills the rest of the rectangle, and the triangle is filled when the next interval is processed
  • many other cases, which require only common sense to decide which polygons are formed and should be filled.
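Instead of enumerating every case by hand, one possible shortcut is to clip the rectangle against the two ends of the interval. The sketch below is not how my library does it. It assumes the function varies linearly along each edge, so crossing points are found by linear interpolation, and it ignores the centre point. The names `clip_by_level` and `band_polygon` are made up for this example.

```python
def clip_by_level(points, values, level, keep_below):
    """Clip a polygon against f = level (assumption: f is linear along each edge)."""
    out_pts, out_vals = [], []
    n = len(points)
    for k in range(n):
        p, v = points[k], values[k]
        q, w = points[(k + 1) % n], values[(k + 1) % n]
        p_in = v <= level if keep_below else v >= level
        q_in = w <= level if keep_below else w >= level
        if p_in:
            out_pts.append(p)
            out_vals.append(v)
        if p_in != q_in:
            # The edge crosses the level: interpolate the crossing point
            t = (level - v) / (w - v)
            out_pts.append((p[0] + t * (q[0] - p[0]), p[1] + t * (q[1] - p[1])))
            out_vals.append(level)
    return out_pts, out_vals


def band_polygon(points, values, lo, hi):
    """Part of the polygon where lo <= f <= hi (empty list if there is none)."""
    pts, vals = clip_by_level(points, values, hi, keep_below=True)
    pts, vals = clip_by_level(pts, vals, lo, keep_below=False)
    return pts


# Unit square, corners counter-clockwise from bottom-left; only top-right is bigger
square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
values = [0.0, 0.0, 0.2, 0.0]
pentagon = band_polygon(square, values, -0.1, 0.1)  # part inside [-0.1, 0.1]
triangle = band_polygon(square, values, 0.1, 0.3)   # part inside [0.1, 0.3]
```

For this example the first call gives the rectangle minus the top-right triangle, and the second call gives that triangle. Drawing only the crossing points as line segments, rather than filling the polygons, gives the contour lines themselves. This family of techniques is commonly known as marching squares.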

Using this algorithm you can fill polygons or draw lines. In the specific case of an SVM, the function being drawn is the decision function, and the line $f(x,y)=0$ is the decision boundary that separates points classified as positive from points classified as negative. The lines where the function equals $-1$ or $1$ are the boundaries of the SVM's maximal margin.
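To make this concrete, here is a minimal sketch of an RBF-kernel decision function of the usual form $f(x,y)=\sum_i \alpha_i y_i K(s_i,(x,y)) + b$. The support vectors, coefficients, bias and `gamma` below are made up by hand for illustration and do not come from a trained model, so do not expect the $-1$ and $1$ contours of this toy function to behave like true margins.

```python
import math


def rbf_kernel(a, b, gamma):
    """Gaussian RBF kernel between two 2-D points."""
    return math.exp(-gamma * ((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2))


def decision_function(x, y, support_vectors, dual_coefs, bias, gamma):
    """f(x, y) = sum_i coef_i * K(sv_i, (x, y)) + bias, with coef_i = alpha_i * y_i."""
    return sum(
        c * rbf_kernel(sv, (x, y), gamma)
        for sv, c in zip(support_vectors, dual_coefs)
    ) + bias


# Hypothetical, hand-picked values (NOT a trained SVM)
support_vectors = [(-1.0, 0.0), (1.0, 0.0)]
dual_coefs = [-1.0, 1.0]
bias = 0.0
gamma = 0.5


def f(x, y):
    return decision_function(x, y, support_vectors, dual_coefs, bias, gamma)


# Levels one would draw as lines for an SVM: margins and decision boundary
levels = [-1.0, 0.0, 1.0]
```

By symmetry, this toy function is zero along the vertical axis, positive to the right of it and negative to the left, so the $f(x,y)=0$ line is the vertical axis. Passing `f` to the grid and band steps described above yields the heatmap, and the lines at `levels` are drawn on top of it.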

After some time I found out that this kind of approach goes by the name of isolines, or isocurves. There are probably other similar algorithms as well.

My implementation is named mesh contour (I did not find a proper name for it at the beginning) and you can find the source here.

Some examples:

Mesh curves for coarse-grained colour steps

Mesh curves for fine colour steps

SVM on synthetic data

License: CC-BY-SA with attribution
Not affiliated with datascience.stackexchange