A confusion matrix is a visualization tool commonly used in supervised machine learning. It contains information about actual and predicted classifications by a classification system. Usually, each column of the matrix represents the instances of the predicted class, while each row represents the instances of the actual class.
The confusion matrix is useful for evaluating the performance of a classifier, since it shows, for each class, the number of correctly classified and mislabeled instances. Moreover, it makes it easy to see whether the automatic system is confusing two or more classes (mislabeling one class as another).
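As a minimal sketch of the layout described above (rows for the actual class, columns for the predicted class), the following Python fragment counts (actual, predicted) pairs into a matrix. The function name `confusion_matrix` and the label lists are hypothetical and chosen only for illustration; they are not the data behind the table discussed in this section.

```python
from collections import Counter

def confusion_matrix(actual, predicted, labels):
    """Return a matrix whose rows are actual classes and columns are predicted classes."""
    counts = Counter(zip(actual, predicted))
    return [[counts[(a, p)] for p in labels] for a in labels]

# Made-up labels for illustration only (not taken from the source table).
labels = ["A", "B", "C"]
actual = ["A", "A", "A", "B", "B", "C", "C", "C"]
predicted = ["A", "A", "B", "B", "C", "C", "C", "A"]

matrix = confusion_matrix(actual, predicted, labels)
for label, row in zip(labels, matrix):
    print(label, row)
```

In this toy matrix, the off-diagonal cell in row A, column B counts the instances of Class A that were mislabeled as Class B.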
An example of a confusion matrix is shown in
Table (a). It is easy to see, for instance, that
instances of Class A are correctly predicted as Class A and, in
contrast,
and
instances of Class A are misclassified as
Class B and Class C, respectively. The same can be done for the
other classes. The numerical evaluation of a confusion matrix is
commonly computed as the percentage of correctly classified
instances. For instance, in Table
(a) an agreement
of
is achieved.
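The percentage of correctly classified instances can be sketched as the sum of the diagonal divided by the total number of instances. The helper name `accuracy` and the matrix values below are assumptions made for illustration; they do not reproduce the figures of the table in this section.

```python
def accuracy(matrix):
    """Fraction of instances on the diagonal (actual class == predicted class)."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total

# Hypothetical 3-class confusion matrix (rows: actual, columns: predicted).
example = [
    [2, 1, 0],
    [0, 1, 1],
    [1, 0, 2],
]

print(f"{accuracy(example):.1%}")
```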
Another measure that can be extracted from a confusion matrix is
the kappa ($\kappa$) coefficient [34,51], which is
a popular measure of agreement in categorical data. The
motivation of this measure is to remove from the percentage of
correctly classified instances the percentage expected by chance.
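Thus, this coefficient is calculated as:
\[ \kappa = \frac{P_o - P_e}{1 - P_e}, \]
where $P_o$ is the observed proportion of correctly classified instances and $P_e$ is the proportion of correct classifications expected by chance.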
Following the example, Table (b) shows the
classification of the instances as expected by mere chance, given
the observed marginal totals. Using the above formula
, which looking at Table
is in the
moderate agreement range. However, what does kappa mean? Looking at the
example, note that of the correctly classified
instances of
(a) (the sum of the diagonal values),
of them were in fact
expected by chance, thus showing that the classifier agrees in
more cases. Similarly, the number of mislabeled instances
expected by chance is
. The coefficient $\kappa$ is
simply this ratio (
), which can be translated as ``of all
the
items that would have been mislabeled by chance, a total of
are in fact correctly classified''.
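The chance-expected table and the ratio interpretation of kappa can be sketched as follows. Each expected cell is taken as (row total × column total) / grand total, which is the usual way of deriving chance agreement from the observed marginal totals. The function names `expected_by_chance` and `kappa`, as well as the matrix values, are hypothetical and are not the figures of the tables in this section.

```python
def expected_by_chance(matrix):
    """Cell (i, j) expected by chance, given the observed row and column totals."""
    total = sum(sum(row) for row in matrix)
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(col) for col in zip(*matrix)]
    return [[r * c / total for c in col_totals] for r in row_totals]

def kappa(matrix):
    """(observed correct - chance correct) / (instances mislabeled by chance)."""
    total = sum(sum(row) for row in matrix)
    expected = expected_by_chance(matrix)
    observed_correct = sum(matrix[i][i] for i in range(len(matrix)))
    chance_correct = sum(expected[i][i] for i in range(len(matrix)))
    # Undefined (division by zero) when chance alone would classify everything correctly.
    return (observed_correct - chance_correct) / (total - chance_correct)

# Hypothetical 3-class confusion matrix (rows: actual, columns: predicted).
example = [
    [2, 1, 0],
    [0, 1, 1],
    [1, 0, 2],
]

print(round(kappa(example), 3))
```

In this toy case, 5 instances are correctly classified while 2.75 would be expected by chance, so the classifier agrees in 2.25 more cases; 5.25 instances would be mislabeled by chance, and the ratio 2.25 / 5.25 gives a kappa of about 0.43.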