Problem

Given four binary vectors which represent "classes":

[1,0,0,0,0,0,0,0,0,0]
[0,0,0,0,0,0,0,0,0,1]
[0,1,1,1,1,1,1,1,1,0]
[0,1,0,0,0,0,0,0,0,0]

What methods are available for classifying a vector of floating point values into one of these "classes"?

Basic rounding works in most cases:

round([0.8,0,0,0,0.3,0,0.1,0,0,0]) = [1 0 0 0 0 0 0 0 0 0] 

But how can I handle some interference?

round([0.8,0,0,0,0.6,0,0.1,0,0,0]) = [1 0 0 0 1 0 0 0 0 0]

This second case should be a better match for 1000000000, but instead, I have lost the solution entirely as there is no clear match.

I want to use MATLAB for this task.
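To make the failure concrete, here is a minimal sketch that rounds both example vectors and checks whether the result equals any row of the class matrix. The variable names (clean, noisy, cleanFound and so on) are made up for illustration; ismember with the 'rows' option returns 0 for both outputs when no row matches.

```matlab
% Class prototypes from the question, one per row
CLASSES = [1,0,0,0,0,0,0,0,0,0
           0,0,0,0,0,0,0,0,0,1
           0,1,1,1,1,1,1,1,1,0
           0,1,0,0,0,0,0,0,0,0];

clean = [0.8,0,0,0,0.3,0,0.1,0,0,0];   % first example: rounds cleanly
noisy = [0.8,0,0,0,0.6,0,0.1,0,0,0];   % second example: 0.6 rounds up to 1

% ismember(..., 'rows') tests whether the rounded vector equals a row of CLASSES
[cleanFound, cleanRow] = ismember(round(clean), CLASSES, 'rows');
[noisyFound, noisyRow] = ismember(round(noisy), CLASSES, 'rows');

fprintf('clean: found=%d, row=%d\n', cleanFound, cleanRow);
fprintf('noisy: found=%d, row=%d\n', noisyFound, noisyRow);
```

The clean vector lands on row 1, while the noisy one matches no row at all, which is the lost solution described above.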


Solution

Find the SSD (sum of squared differences) of your test vector with each "class" and use the one with the least SSD.
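Written out, the rule might look like this. The symbols are my own notation, not part of the question: c_k is the k-th class vector, t is the test vector, and n is the vector length (10 here).

```latex
d_k = \sum_{i=1}^{n} \left( c_{k,i} - t_i \right)^2,
\qquad
\hat{k} = \arg\min_{k} \, d_k
```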

Here's some code: I added a 0 to the end of the test vector you provided since it was only 9 digits whereas the classes had 10.

CLASSES = [1,0,0,0,0,0,0,0,0,0
           0,0,0,0,0,0,0,0,0,1
           0,1,1,1,1,1,1,1,1,0
           0,1,0,0,0,0,0,0,0,0];

TEST = [0.8,0,0,0,0.6,0,0.1,0,0,0];

% Find the difference between the TEST vector and each row in CLASSES
difference = bsxfun(@minus,CLASSES,TEST);
% Class differences
class_diff = sum(difference.^2,2);
% Store the row index of the class with the minimum SSD from TEST
[~, CLASS_ID] = min(class_diff);
% Display
disp(CLASSES(CLASS_ID,:))

For illustrative purposes, difference looks like this:

 0.2    0   0   0   -0.6    0   -0.1    0   0   0
-0.8    0   0   0   -0.6    0   -0.1    0   0   1
-0.8    1   1   1    0.4    1    0.9    1   1   0
-0.8    1   0   0   -0.6    0   -0.1    0   0   0

And the distance of each class from TEST looks like this, class_diff:

 0.41
 2.01
 7.61
 2.01

The first class has the smallest SSD (0.41), so it is the best match.
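If you have several vectors to classify, the same idea can be looped over the rows of a matrix. This is only a sketch: the second and third rows of TESTS are values I made up for illustration, and TESTS, nTests and CLASS_IDS are hypothetical names. Note that min returns the first index when two classes tie.

```matlab
CLASSES = [1,0,0,0,0,0,0,0,0,0
           0,0,0,0,0,0,0,0,0,1
           0,1,1,1,1,1,1,1,1,0
           0,1,0,0,0,0,0,0,0,0];

% One test vector per row; only the first row comes from the question
TESTS = [0.8,0,  0,  0,  0.6,0,  0.1,0,  0,  0
         0.1,0.9,0,  0.2,0,  0,  0,  0,  0,  0
         0,  0.7,0.9,1,  0.8,0.6,1,  0.9,0.7,0.2];

nTests = size(TESTS, 1);
CLASS_IDS = zeros(nTests, 1);
for k = 1:nTests
    % SSD between the k-th test vector and every class row
    difference = bsxfun(@minus, CLASSES, TESTS(k,:));
    [~, CLASS_IDS(k)] = min(sum(difference.^2, 2));
end

disp(CLASS_IDS')
```

For these three rows the selected classes should come out as 1, 4 and 3.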

Other tips

This is the same thing as Jacob did, only with four different distance measures:


%%
CLASSES = [1,0,0,0,0,0,0,0,0,0
           0,0,0,0,0,0,0,0,0,1
           0,1,1,1,1,1,1,1,1,0
           0,1,0,0,0,0,0,0,0,0];

TEST = [0.8,0,0,0,0.6,0,0.1,0,0,0];

%%
% sqrt( sum((x-y).^2) )
euclidean = sqrt( sum(bsxfun(@minus,CLASSES,TEST).^2, 2) );

% sum( |x-y| )
cityblock = sum(abs(bsxfun(@minus,CLASSES,TEST)), 2);

% 1 - dot(x,y)/(sqrt(dot(x,x))*sqrt(dot(y,y)))
cosine = 1 - ( CLASSES*TEST' ./ (norm(TEST)*sqrt(sum(CLASSES.^2,2))) );

% max( |x-y| )
chebychev = max( abs(bsxfun(@minus,CLASSES,TEST)), [], 2 );

dist = [euclidean cityblock cosine chebychev];

%%
% min works column-wise: one smallest distance and one class index per measure
[minDist, classIdx] = min(dist);

Pick the one you like :)
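If the Statistics and Machine Learning Toolbox happens to be installed (an assumption, the code above does not need it), pdist2 accepts these same four measure names, so a rough equivalent could look like this. The names metrics and d are mine.

```matlab
% Assumes the Statistics and Machine Learning Toolbox, which provides pdist2
CLASSES = [1,0,0,0,0,0,0,0,0,0
           0,0,0,0,0,0,0,0,0,1
           0,1,1,1,1,1,1,1,1,0
           0,1,0,0,0,0,0,0,0,0];

TEST = [0.8,0,0,0,0.6,0,0.1,0,0,0];

metrics = {'euclidean', 'cityblock', 'cosine', 'chebychev'};
classIdx = zeros(1, numel(metrics));
for m = 1:numel(metrics)
    % 4-by-1 vector: distance from each class row to TEST
    d = pdist2(CLASSES, TEST, metrics{m});
    [~, classIdx(m)] = min(d);
    fprintf('%-10s -> class %d\n', metrics{m}, classIdx(m));
end
```

For this particular TEST vector all four measures agree on the first class, but they can disagree on other inputs.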

A simple Euclidean distance algorithm should suffice. The class with the minimum distance to the point would be your candidate.

http://en.wikipedia.org/wiki/Euclidean_distance
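On a recent MATLAB this can be written without bsxfun. The sketch below assumes R2017b or later, since it relies on implicit expansion (R2016b) and vecnorm (R2017b). The name dists is mine.

```matlab
CLASSES = [1,0,0,0,0,0,0,0,0,0
           0,0,0,0,0,0,0,0,0,1
           0,1,1,1,1,1,1,1,1,0
           0,1,0,0,0,0,0,0,0,0];

TEST = [0.8,0,0,0,0.6,0,0.1,0,0,0];

% Implicit expansion subtracts TEST from every row of CLASSES;
% vecnorm(..., 2, 2) then takes the Euclidean norm of each row
dists = vecnorm(CLASSES - TEST, 2, 2);

% The class with the smallest distance is the candidate
[minDist, classIdx] = min(dists);
disp(CLASSES(classIdx,:))
```

Since the square root is monotonic, this picks the same class as the sum of squared differences.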

License: CC-BY-SA with attribution
Not affiliated with StackOverflow