

Confusion Matrices

A confusion matrix is a visualization tool commonly used in supervised machine learning. It contains information about actual and predicted classifications by a classification system. Usually, each column of the matrix represents the instances of the predicted class, while each row represents the instances of the actual class.

The confusion matrix is useful for evaluating the performance of a classifier, since it shows, for each class, the number of correctly classified and mislabeled instances. Moreover, it makes it easy to see whether the automatic system is confusing two or more classes (i.e., mislabeling one class as another).

An example of a confusion matrix is shown in Table [*](a). It is easy to see, for instance, that $ 45$ instances of Class A are correctly predicted as Class A and, in contrast, $ 9$ and $ 6$ instances of Class A are misclassified as Class B and Class C, respectively. The same can be done for the other classes. The numerical evaluation of a confusion matrix is commonly computed as the percentage of correctly classified instances. For instance, in Table [*](a) an agreement of $ 71\%$ is achieved.
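As a minimal illustration (a sketch assuming Python with NumPy, not part of the original evaluation), this agreement can be obtained directly from the matrix entries of Table [*](a):

    import numpy as np

    # Confusion matrix of Table C.1(a): rows are actual classes, columns are predicted classes.
    confusion = np.array([[45,  9,  6],
                          [ 4, 19,  7],
                          [ 1,  2,  7]])

    # Agreement: correctly classified instances (the diagonal) over all instances.
    agreement = np.trace(confusion) / confusion.sum()
    print(f"Agreement: {agreement:.0%}")  # prints "Agreement: 71%"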


Table C.1: (a) Example of a confusion matrix, and (b) the same data classified as would have been expected by mere chance, given the observed marginal totals.

(a)                Automatic
     Actual     A     B     C   Total
        A      45     9     6     60
        B       4    19     7     30
        C       1     2     7     10
     Total     50    30    20    100

(b)                Automatic
     Actual     A     B     C   Total
        A      30    18    12     60
        B      15     9     6     30
        C       5     3     2     10
     Total     50    30    20    100


Another measure that can be extracted from a confusion matrix is the kappa ($ \kappa$ ) coefficient [34,51], a popular measure for estimating agreement in categorical data. The motivation of this measure is to discount from the percentage of correctly classified instances the proportion of agreement expected by chance. Thus, this coefficient is calculated as:

$\displaystyle \kappa = \frac{P(D)-P(E)}{1-P(E)}$ (C.1)

where $ P(D)$ is the proportion of correctly classified instances (the sum of the diagonal terms divided by the total number of instances) and $ P(E)$ is the proportion of agreement expected by chance (the sum, over all classes, of the products of the corresponding row and column marginal proportions). A $ \kappa$ coefficient equal to one indicates a statistically perfect model, whereas a value equal to zero corresponds to chance-level agreement. Table [*] shows a commonly used interpretation of the various $ \kappa$ values [99].
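As an illustrative sketch (assuming Python with NumPy and the row/column convention used above; the function name is hypothetical), Equation (C.1) can be computed from a confusion matrix as follows:

    import numpy as np

    def kappa(confusion):
        """Cohen's kappa from a confusion matrix (rows: actual class, columns: predicted class)."""
        confusion = np.asarray(confusion, dtype=float)
        total = confusion.sum()
        p_d = np.trace(confusion) / total          # observed agreement P(D)
        row_marg = confusion.sum(axis=1) / total   # actual-class marginal proportions
        col_marg = confusion.sum(axis=0) / total   # predicted-class marginal proportions
        p_e = np.sum(row_marg * col_marg)          # agreement expected by chance P(E)
        return (p_d - p_e) / (1.0 - p_e)

    # For Table C.1(a): (0.71 - 0.41) / (1 - 0.41), i.e. approximately 0.51.
    print(round(kappa([[45, 9, 6], [4, 19, 7], [1, 2, 7]]), 2))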


Table C.2: Common interpretation of the various $ \kappa$ values [99].
$ \kappa$ Agreement
$ <0$ Poor
$ [0,0.20]$ Slight
$ [0.21,0.40]$ Fair
$ [0.41,0.60]$ Moderate
$ [0.61,0.80]$ Substantial
$ [0.81,1.00]$ Almost Perfect


Thus, following the example, Table [*](b) shows the classification of the instances as would have been expected by mere chance, given the observed marginal totals. Using the above formula, $ \kappa = 0.51$ , which according to Table [*] corresponds to moderate agreement. But what does kappa actually mean? Looking at the example, note that of the $ 71$ correctly classified instances in (a) (the sum of the diagonal values), $ 41$ were in fact expected by chance, so the classifier agrees in $ 30$ additional cases. Similarly, the number of mislabeled instances expected by chance is $ 100-41=59$ . The coefficient $ \kappa$ is simply the ratio between these two quantities ($ 30/59$ ), which can be read as ``of all the $ 59$ instances that would have been mislabeled by chance, a total of $ 30$ are in fact correctly classified''.
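In terms of Equation (C.1), the same computation reads:

$\displaystyle \kappa = \frac{P(D)-P(E)}{1-P(E)} = \frac{0.71-0.41}{1-0.41} = \frac{30}{59} \approx 0.51$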

