The initial experiments evaluate the proposed method against the
classification of each individual expert independently.
We used a leave-one-woman-out methodology, i.e. the left and right
mammograms of a woman are analyzed by a classifier trained on
the mammograms of all other women in the database. This
methodology avoids bias, as the left
and right mammograms of a woman are expected to have similar
internal morphology [94]. The confusion matrices for
the three classifiers (SFS+kNN, C
, and the Bayesian
approach) are shown in Table
,
where each row corresponds to results based on the manual
classification by an individual radiologist. In this work a value
of
was used for kNN. Other odd values ranging from
to
were tested and gave similar results.
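The leave-one-woman-out protocol described above can be sketched as follows. This is a minimal illustration, not the code used in the experiments: the function name `leave_one_woman_out` and the toy identifiers are hypothetical, and the only assumption is that each mammogram carries the identifier of the woman it belongs to.

```python
from collections import defaultdict


def leave_one_woman_out(woman_ids):
    """Yield (woman, train_indices, test_indices), one split per woman.

    All mammograms of the held-out woman (typically left and right) go
    to the test set, so none of her images is seen during training.
    """
    by_woman = defaultdict(list)
    for idx, wid in enumerate(woman_ids):
        by_woman[wid].append(idx)
    for wid, test_idx in by_woman.items():
        held_out = set(test_idx)
        train_idx = [i for i in range(len(woman_ids)) if i not in held_out]
        yield wid, train_idx, test_idx


# Hypothetical toy data: three women with a left and a right mammogram each.
woman_ids = ["w1", "w1", "w2", "w2", "w3", "w3"]
splits = list(leave_one_woman_out(woman_ids))
for wid, train_idx, test_idx in splits:
    print(wid, "train:", train_idx, "test:", test_idx)
```

In each split the classifier would be trained on `train_idx` and evaluated on `test_idx`; accumulating the predictions over all splits gives one confusion matrix per expert.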
For Expert A, the SFS+kNN classifier correctly classifies
about
of the mammograms, while the C
decision tree
achieves
correct classification. kNN clearly outperforms
C
when classifying mammograms belonging to BIRADS II, while
for the rest of BIRADS the performance is quite similar. On the
other hand, C
tends to assign each mammogram either to its true BIRADS class or to
a neighbouring one, while kNN shows
a larger dispersion. The
coefficient also reflects that
kNN performs better than C
, with values equal to
and
, respectively. Note that both classifiers fall
in the Substantial category according to the scale in
Table
.
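As an illustration of how such an agreement coefficient is derived from a confusion matrix, the sketch below computes Cohen's kappa and maps it to a verbal category. It assumes that the coefficient reported above is Cohen's kappa and that the scale referred to is the commonly used Landis and Koch one; the counts in `cm` are hypothetical and are not the results of this work.

```python
def cohen_kappa(cm):
    """Cohen's kappa for a square confusion matrix (rows: expert, cols: classifier)."""
    k = len(cm)
    n = sum(sum(row) for row in cm)
    observed = sum(cm[i][i] for i in range(k)) / n
    row_tot = [sum(cm[i]) for i in range(k)]
    col_tot = [sum(cm[i][j] for i in range(k)) for j in range(k)]
    # Agreement expected by chance, from the marginal totals.
    expected = sum(row_tot[i] * col_tot[i] for i in range(k)) / (n * n)
    return (observed - expected) / (1 - expected)


def landis_koch(kappa):
    """Verbal category on the Landis and Koch scale (assumed here)."""
    if kappa < 0.0:
        return "poor"
    for upper, label in [(0.20, "slight"), (0.40, "fair"), (0.60, "moderate"),
                         (0.80, "substantial"), (1.00, "almost perfect")]:
        if kappa <= upper:
            return label
    return "almost perfect"


# Hypothetical 4x4 matrix for BIRADS I-IV (made-up counts).
cm = [[8, 2, 0, 0],
      [1, 7, 2, 0],
      [0, 2, 7, 1],
      [0, 0, 2, 8]]
kappa = cohen_kappa(cm)
category = landis_koch(kappa)
print(round(kappa, 3), category)
```

Unlike the raw percentage of correct classification, the coefficient discounts the agreement that would be expected by chance given the class distribution.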
The results obtained by the Bayesian classifier are shown in
Table (c). This classifier shows
an increase in the overall performance when compared to the
individual classifiers, reaching
correct classification.
This is an increase of
and
when compared to kNN and
C
, respectively. When considering the individual BIRADS
classes, the percentage of correct classification for BIRADS I is
around
, whilst in the other cases, the percentages are
for BIRADS II,
for BIRADS III, and
for BIRADS
IV. Note that using the Bayesian classifier,
is increased
to
.
The results obtained for Expert B are slightly lower than
those obtained for Expert A. Specifically,
of
the mammograms were correctly classified by using the SFS+kNN
classifier, while the C
results remained at
. The
kNN classifier gives the better results in every BIRADS class
except BIRADS IV, in which C
clearly outperforms kNN. The results obtained by the Bayesian
classifier show an increase in performance of
and
when compared to kNN and C
, respectively, obtaining an
overall performance of
. When considering the individual
BIRADS classes, the percentage of correct classification for
BIRADS I is around
, whilst for the other cases, the
percentages are
for BIRADS II,
for BIRADS III, and
for BIRADS IV. The
value is equal to
.
The last row of Table shows the
results obtained for Expert C. The performance of the classifiers
is similar to that obtained by using the ground truth of Expert B.
The kNN classifier obtained
correct classification, while
C
obtained
. Using the Bayesian classifier,
of the
mammograms were correctly classified. By class, this corresponds to
correct
classification for BIRADS I,
for BIRADS II,
for
BIRADS III, and
for BIRADS IV. The
value is equal
to
.
In conclusion, the best classification rates are obtained using
the Bayesian combination. For Experts A, B, and C,
,
, and
correct classification is obtained,
respectively.
In line with other publications [17,137], the four-class
classification problem can be reduced to the following
two-class problem: {BIRADS I and II} vs {BIRADS III and IV},
that is, low density (low risk) versus high density (high
risk) classification, which from a mammographic risk assessment
point of view might be more appropriate than the four-class
division. Taking Expert A as the ground truth, the percentage of correct
classification is about
for the three classifiers and low
breast densities, while for dense breasts the percentage is
,
, and
for the kNN, C
and the Bayesian
combination, respectively. In contrast, for Expert B, the correct
classification percentage for low density breasts is around
for the single classifiers and
for the combination, while
for high density breasts it is reduced to
for each
classifier, and
for their combination. On the other hand,
using Expert C, the correct classification percentage for low
density breasts is
for the single classifiers and
for the combination, while for high density breasts the kNN
obtains
, and the other classifiers
.
In summary, the two-class approach yields correct classification rates of
,
and
for Experts A, B and
C, respectively.
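The reduction from four classes to two amounts to summing blocks of the four-class confusion matrix. The sketch below shows this under the assumption that rows hold the expert label and columns the classifier output, both ordered BIRADS I to IV; the function names and the counts in `cm4` are hypothetical and are not the results reported here.

```python
def collapse_to_two_class(cm4):
    """Merge a 4x4 BIRADS confusion matrix into low vs high density.

    Low density = BIRADS I and II (indices 0, 1);
    high density = BIRADS III and IV (indices 2, 3).
    """
    groups = [(0, 1), (2, 3)]
    return [[sum(cm4[i][j] for i in gi for j in gj) for gj in groups]
            for gi in groups]


def correct_rates(cm):
    """Return (per-class correct rates, overall correct rate)."""
    per_class = [cm[i][i] / sum(cm[i]) for i in range(len(cm))]
    total = sum(sum(row) for row in cm)
    overall = sum(cm[i][i] for i in range(len(cm))) / total
    return per_class, overall


# Hypothetical 4x4 matrix for BIRADS I-IV (made-up counts).
cm4 = [[8, 2, 0, 0],
       [1, 7, 2, 0],
       [0, 2, 7, 1],
       [0, 0, 2, 8]]
cm2 = collapse_to_two_class(cm4)
per_class, overall = correct_rates(cm2)
print(cm2, per_class, overall)
```

Confusions between BIRADS I and II, or between BIRADS III and IV, count as correct after the merge, which is why the two-class rates are never lower than the four-class ones computed from the same matrix.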