In this section we analyze the performance of the algorithm when different databases are used for learning the size and shape of the masses and for detecting them in the images. For this task, we used the MIAS database to test the system and the DDSM database [68] for training. Due to the large mass size variability of the DDSM database, we used six different size intervals to train the system: , and the number of masses in each interval was, respectively, , , , , , and  masses. Moreover, for the false positive reduction step,  normal RoIs for each mass RoI were included in each size-cluster.
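This training setup can be summarised as a grouping step over the DDSM RoIs. The sketch below is only illustrative: the size boundaries (`SIZE_BOUNDARIES`), the number of normal RoIs per mass (`NORMALS_PER_MASS`), and the inputs `mass_rois` and `normal_roi_pool` are hypothetical placeholders, since the actual intervals and counts used here are not reproduced.

```python
from collections import defaultdict

# Hypothetical size interval boundaries (not the intervals used in the paper).
SIZE_BOUNDARIES = [10, 20, 30, 45, 60, 80, 120]   # assumption
NORMALS_PER_MASS = 3                               # assumption

def size_cluster(roi_size, boundaries=SIZE_BOUNDARIES):
    """Return the index of the size interval a RoI falls into, or None."""
    for idx in range(len(boundaries) - 1):
        if boundaries[idx] <= roi_size < boundaries[idx + 1]:
            return idx
    return None  # outside the modelled range

def build_training_clusters(mass_rois, normal_roi_pool):
    """Group mass RoIs by size and pair each with normal RoIs.

    mass_rois       -- iterable of (roi_image, roi_size) tuples (placeholder)
    normal_roi_pool -- callable(size, n) returning n normal RoIs of that size (placeholder)
    """
    clusters = defaultdict(lambda: {"masses": [], "normals": []})
    for roi, size in mass_rois:
        idx = size_cluster(size)
        if idx is None:
            continue
        clusters[idx]["masses"].append(roi)
        clusters[idx]["normals"].extend(normal_roi_pool(size, NORMALS_PER_MASS))
    return clusters
```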
Figure  shows the performance of the algorithm. The grey lines show the performance of the proposal without the false positive reduction step, while the black ones include it. The lines with squares are obtained when training and testing with the same database, while the lines with pentagrams are obtained when training and testing with different databases. We can see that the Bayesian pattern matching produces more false positives per image when it is trained with a different database. This is basically due to the fact that we are now training with more sizes and, moreover, with smaller patterns. Thus, there is a large set of small regions of normal tissue that are detected as suspicious by the algorithm. However, the false positive reduction step greatly reduces this number, although the performance is slightly worse than when training and testing with the same database. For instance, at a sensitivity of , the number of false positives per image when training and testing with different databases was  without the false positive reduction algorithm and  when including it, while when training and testing with the same database the false positives were  and , respectively.
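The false positives per image quoted above correspond to points on a FROC curve. A minimal sketch of how such a point can be read off a set of scored detections is given below; it is not the evaluation code used here, it assumes at most one detection is counted per true mass, and `detections`, `n_images`, and `n_masses` are generic placeholder inputs.

```python
def froc_point(detections, n_images, n_masses, sensitivity_target):
    """Estimate false positives per image at a target sensitivity.

    detections -- list of (score, is_true_positive) pairs, one per detected region
                  (simplification: each true mass is matched by at most one detection)
    """
    ordered = sorted(detections, key=lambda d: d[0], reverse=True)
    tp = fp = 0
    for _score, is_tp in ordered:
        if is_tp:
            tp += 1
        else:
            fp += 1
        if tp / n_masses >= sensitivity_target:
            return fp / n_images
    return None  # the target sensitivity is never reached
```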
Figure  shows the comparison of the algorithm trained with DDSM when testing on the set of  mammograms from the MIAS database, together with algorithms d1 and d2. Note that the performance of the proposal Eig is similar to that of algorithm d2 at sensitivities around . In contrast, it is clearly better at higher sensitivities and worse at intermediate ones. Note also that including the false positive reduction step clearly improves the performance.
On the other hand, using ROC analysis for the set of  mammograms containing masses, we found that the mean  without false positive reduction was , while including it was . These results are slightly worse than those of the algorithm trained and tested with the same database ( and , respectively) and also worse than algorithm d2 (). However, note that this algorithm is still trained and tested with the same database. On the other hand, both proposals outperform algorithm d1 ().
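The mean values above come from ROC analysis over the mammograms containing masses. A minimal sketch of such a computation is shown below, assuming per-case lists of ground-truth labels and suspiciousness scores (`case_scores` is a placeholder input); it is a generic area-under-the-ROC-curve average, not the specific ROC software used in this work.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

def mean_az(case_scores):
    """Average area under the ROC curve over a set of cases.

    case_scores -- list of (labels, scores) pairs, one per case, where labels
    are 1 for mass and 0 for normal tissue (placeholder input).
    """
    az_values = [roc_auc_score(labels, scores) for labels, scores in case_scores]
    return float(np.mean(az_values))
```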
Table  shows the mean  values detailed per mass size when the training and testing were done with the same database or with different databases. Note that the main performance drop is for the smaller masses, where the mean  is reduced by around  units. This is basically due to the number of false positives detected by the template matching algorithm at small template sizes. The false positive reduction step decreases the number of false positives, although in the cases where this algorithm increases the number of false negatives (classifying a true mass as normal tissue) the mean  of the system is drastically reduced. For the other sizes, the performance is similar whether the training and testing were done with the same database or with different databases, and in some cases the performance is even better when using different databases.
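The per-size breakdown in the table can be obtained by grouping the case-level ROC results by size interval before averaging. The sketch below is illustrative only; `results` is a placeholder list of (size_interval, labels, scores) tuples, where `size_interval` indexes one of the six training intervals.

```python
from collections import defaultdict
from sklearn.metrics import roc_auc_score

def mean_az_per_size(results):
    """Average the per-case area under the ROC curve within each size interval.

    results -- iterable of (size_interval, labels, scores) tuples (placeholder input)
    """
    grouped = defaultdict(list)
    for interval, labels, scores in results:
        grouped[interval].append(roc_auc_score(labels, scores))
    return {interval: sum(vals) / len(vals) for interval, vals in grouped.items()}
```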