How to define confusion matrix for classification?

https://datascience.stackexchange.com/questions/1189

confusion-matrix

16-10-2019

Question

Below is the dataset, where the response variable is play, with two labels (yes and no):

No. outlook temperature humidity    windy   play
1   sunny       hot     high        FALSE   no
2   sunny       hot     high        TRUE    no
3   overcast    hot     high        FALSE   yes
4   rainy       mild    high        FALSE   yes
5   rainy       cool    normal      FALSE   yes
6   rainy       cool    normal      TRUE    no
7   overcast    cool    normal      TRUE    yes
8   sunny       mild    high        FALSE   no
9   sunny       cool    normal      FALSE   yes
10  rainy       mild    normal      FALSE   yes
11  sunny       mild    normal      TRUE    yes
12  overcast    mild    high        TRUE    yes
13  overcast    hot     normal      FALSE   yes
14  rainy       mild    high        TRUE    no

Here are the decision rules, each with its support, confidence, and the instances it correctly classifies:

1: (outlook,overcast) -> (play,yes) 
[Support=0.29 , Confidence=1.00 , Correctly Classify= 3, 7, 12, 13]

2: (humidity,normal), (windy,FALSE) -> (play,yes)
[Support=0.29 , Confidence=1.00 , Correctly Classify= 5, 9, 10]

3: (outlook,sunny), (humidity,high) -> (play,no) 
[Support=0.21 , Confidence=1.00 , Correctly Classify= 1, 2, 8]

4: (outlook,rainy), (windy,FALSE) -> (play,yes) 
[Support=0.21 , Confidence=1.00 , Correctly Classify= 4]

5: (outlook,sunny), (humidity,normal) -> (play,yes) 
[Support=0.14 , Confidence=1.00 , Correctly Classify= 11]

6: (outlook,rainy), (windy,TRUE) -> (play,no) 
[Support=0.14 , Confidence=1.00 , Correctly Classify= 6, 14]

Answer

You are just predicting if Play = Yes or Play = No.

The confusion matrix would look like this:

             Predicted
          +------+------+
          |  Yes |  No  |
    +-------------------+
A   |     |      |      |
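c   | Yes |  TP  |  FN  |
t   |     |      |      |
u   +-------------------+
a   |     |      |      |
l   | No  |  FP  |  TN  |
    |     |      |      |
    +-----+------+------+

TP: True positives (actual Yes, predicted Yes)
FN: False negatives (actual Yes, predicted No)
FP: False positives (actual No, predicted Yes)
TN: True negatives (actual No, predicted No)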

The accuracy can then be calculated as (TP + TN)/(TP + FP + TN + FN).
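
As a rough illustration, the counts and the accuracy can be computed for the dataset in the question. This is only a sketch under some assumptions: the six rules are treated as an ordered decision list where the first matching rule gives the prediction, "yes" is taken as the positive class, and the names FIELDS, DATA, RULES, predict, confusion_counts and accuracy are made up for this example.

```python
# Column names for each row of the dataset in the question.
FIELDS = ("outlook", "temperature", "humidity", "windy", "play")

DATA = [
    ("sunny", "hot", "high", False, "no"),
    ("sunny", "hot", "high", True, "no"),
    ("overcast", "hot", "high", False, "yes"),
    ("rainy", "mild", "high", False, "yes"),
    ("rainy", "cool", "normal", False, "yes"),
    ("rainy", "cool", "normal", True, "no"),
    ("overcast", "cool", "normal", True, "yes"),
    ("sunny", "mild", "high", False, "no"),
    ("sunny", "cool", "normal", False, "yes"),
    ("rainy", "mild", "normal", False, "yes"),
    ("sunny", "mild", "normal", True, "yes"),
    ("overcast", "mild", "high", True, "yes"),
    ("overcast", "hot", "normal", False, "yes"),
    ("rainy", "mild", "high", True, "no"),
]

# The six rules from the question as (conditions, predicted label).
# Assumption: they are tried in order and the first match wins.
RULES = [
    ({"outlook": "overcast"}, "yes"),
    ({"humidity": "normal", "windy": False}, "yes"),
    ({"outlook": "sunny", "humidity": "high"}, "no"),
    ({"outlook": "rainy", "windy": False}, "yes"),
    ({"outlook": "sunny", "humidity": "normal"}, "yes"),
    ({"outlook": "rainy", "windy": True}, "no"),
]


def predict(row):
    """Return the label of the first rule whose conditions all hold, else None."""
    record = dict(zip(FIELDS, row))
    for conditions, label in RULES:
        if all(record[key] == value for key, value in conditions.items()):
            return label
    return None


def confusion_counts(data, positive="yes"):
    """Count TP, FN, FP, TN; a row matched by no rule counts as predicted negative."""
    counts = {"TP": 0, "FN": 0, "FP": 0, "TN": 0}
    for row in data:
        actual = row[-1]
        predicted = predict(row)
        if actual == positive:
            counts["TP" if predicted == positive else "FN"] += 1
        else:
            counts["FP" if predicted == positive else "TN"] += 1
    return counts


def accuracy(counts):
    """(TP + TN) / (TP + FP + TN + FN)."""
    return (counts["TP"] + counts["TN"]) / sum(counts.values())


counts = confusion_counts(DATA)
print(counts)
print(accuracy(counts))
```

Under these assumptions every one of the 14 instances is matched by a rule that predicts its actual label, so TP = 9, TN = 5, FP = FN = 0 and the accuracy is 14/14 = 1. Note that this is accuracy on the same data the rules were derived from, so it says nothing about performance on unseen data.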

License: CC-BY-SA with attribution

Not affiliated with datascience.stackexchange