In this section, we test our approach using the same training set
but with a test set drawn from the Trueta digital database. The
test set consists of
MLO and
CC view mammograms,
each containing at least one mass.
The evaluation is performed using ROC analysis. The set of
DDSM RoIs depicting masses is used for template construction, and the
remaining RoIs for the false positive reduction model. To
assess how breast density misclassification affects the
performance of the system, we run the experiment twice:
first, using the breast density as annotated in the
database, and second, classifying the breasts using the
algorithm proposed in Chapter
.
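As an illustration of the ROC analysis mentioned above, the following minimal sketch computes the area under the empirical ROC curve from a list of detection scores and ground-truth labels. It uses the pairwise (Mann-Whitney) formulation, which equals the area under the empirical ROC curve. The function name `roc_auc` and the sample scores are hypothetical and are not taken from this work, which does not specify how the ROC curves are computed.

```python
def roc_auc(scores, labels):
    """Area under the empirical ROC curve.

    scores: detection confidences (higher means more likely a mass).
    labels: 1 for a true mass, 0 for normal tissue.
    The AUC is the probability that a randomly chosen positive sample
    scores higher than a randomly chosen negative one (ties count 0.5).
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes are needed to compute the AUC")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical scores for four RoIs: two masses and two normal regions.
example_scores = [0.9, 0.1, 0.8, 0.3]
example_labels = [1, 1, 0, 0]
example_auc = roc_auc(example_scores, example_labels)
```

Running the same function on the scores obtained with the annotated densities and with the automatically estimated ones would give two areas that can be compared directly.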
Table shows the confusion matrices
of the density classification for the MLO and CC views. The algorithm
clearly performs better on MLO mammograms than on CC
ones. The kappa value for the former is
, which according to
Table
lies in the upper part of the
substantial agreement range. In contrast, the kappa value for CC
views is
which is on the border between moderate and
substantial agreement. At the class level, almost all mammograms
belonging to BIRADS I are classified correctly in the MLO
views, whereas performance is lower for the CC views.
Moreover, the two mammograms belonging to BIRADS IV are
misclassified in both confusion matrices.
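To make the agreement measure concrete, the sketch below computes Cohen's kappa and the per-class fraction of correct classifications from a confusion matrix. The interpretation bands assume that the referenced agreement table follows the commonly used Landis and Koch scale, which is an assumption on our part. The function names and the example matrix are hypothetical and do not reproduce the values reported here.

```python
def cohen_kappa(confusion):
    """Cohen's kappa for a square confusion matrix (list of rows).

    Rows are the reference (manual) classes and columns the
    automatically assigned classes.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    observed = sum(confusion[i][i] for i in range(n)) / total
    row_sums = [sum(row) for row in confusion]
    col_sums = [sum(confusion[i][j] for i in range(n)) for j in range(n)]
    # Agreement expected by chance from the marginal distributions.
    expected = sum(r * c for r, c in zip(row_sums, col_sums)) / total ** 2
    return (observed - expected) / (1.0 - expected)


def agreement_label(kappa):
    """Agreement band, assuming the Landis and Koch scale."""
    if kappa < 0.0:
        return "poor"
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"


def per_class_accuracy(confusion):
    """Fraction of each reference class that was classified correctly."""
    return [row[i] / sum(row) if sum(row) else 0.0
            for i, row in enumerate(confusion)]


# Hypothetical two-class matrix, for illustration only.
example = [[20, 5],
           [10, 15]]
example_kappa = cohen_kappa(example)
example_band = agreement_label(example_kappa)
example_recall = per_class_accuracy(example)
```

The per-class values make it easy to see effects such as one density class being recognised well in one view but not in the other, which a single kappa value hides.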
Table shows the
results obtained when training the proposed segmentation
algorithms using the RoIs clustered according to both annotations,
manual and automatic. In general, both results
are less satisfactory than those obtained using the
MIAS database (see Table
). The main
reason is the false positive reduction algorithm,
which is still trained on digitized RoIs rather than
digital ones. This is the same effect we observed when comparing the
results obtained with the MIAS database, but even more pronounced.
Comparing the results according to the origin of the annotations, note
that the results obtained using the automatic estimation
outperform those obtained using
the manual annotations. This shows that the automatic method is
able to capture the mammogram appearance more objectively
than a human expert, even though some mammograms are then
misclassified with respect to the expert's opinion.