next up previous contents
Next: Training and Testing using Up: Combining Bayesian Pattern Matching Previous: Combining Bayesian Pattern Matching   Contents


MIAS Database

The performance of the system is evaluated using a total of $ 120$ mammograms: $ 40$ containing masses confirmed by an expert (who provided the ground truth) and the remaining $ 80$ being normal mammograms.

Both sets of RoIs (one containing only masses, the other containing masses and normal tissue) were extracted from these mammograms and divided into four groups according to mass size. For the first dataset, the groups correspond to the following intervals of mass area: $ <1.20\;cm^2$ , $ (1.20-1.80)\;cm^2$ , $ (1.80-3.60)\;cm^2$ , and $ >3.60\;cm^2$ , containing $ 10$ , $ 8$ , $ 10$ and $ 9$ masses, respectively. For the second set of RoIs, these groups were completed with $ 3$ normal, but suspicious, RoIs for each mass RoI. The results also include algorithms d1 and d2, as well as the algorithm without this false positive reduction step, for direct comparison.
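The grouping of RoIs by mass area described above can be sketched as follows. This is a minimal illustration, not the thesis implementation; the function name and the example areas are assumptions, while the interval boundaries come from the text.

```python
# Hypothetical sketch: binning mass RoIs into the four size intervals
# used in the text (areas in cm^2). Only the bin edges come from the
# dataset description; everything else is illustrative.

def size_group(area_cm2):
    """Return the index (0-3) of the size interval a mass area falls into."""
    edges = [1.20, 1.80, 3.60]        # interval boundaries in cm^2
    for i, edge in enumerate(edges):
        if area_cm2 < edge:
            return i
    return 3                          # area >= 3.60 cm^2

# Example: one illustrative mass area per interval
areas = [0.9, 1.5, 2.4, 5.0]
print([size_group(a) for a in areas])  # -> [0, 1, 2, 3]
```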

Figure 5.6 shows the FROC curve for our proposal explained in Chapter [*] (grey line) and the same approach integrated with the proposed false positive reduction algorithm (black line). Note that the inclusion of this step clearly improves the performance of the algorithm: at the same sensitivity, the number of false positives per image is reduced. For instance, at a sensitivity of $ 0.87$ , the number of false positives per image is reduced by one. Viewed the other way, the inclusion of the false positive reduction algorithm increases the sensitivity at a given false positive rate. For example, at one false positive per image the sensitivity increases from $ 0.27$ to $ 0.58$ .

Figure 5.6: FROC analysis of the algorithm including the false positive reduction step (black line) over the set of $ 120$ mammograms, compared to the algorithm without false positive reduction (grey line). The algorithm with false positive reduction clearly outperforms the one without it.
\includegraphics[width=10.5 cm]{images/frocFP.eps}
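A FROC operating point such as those plotted above pairs sensitivity with the mean number of false positives per image at a given detection threshold. The following is a minimal sketch of that computation under assumed data structures (candidate detections carrying a score and a hit/miss label); it is not the thesis code.

```python
# Hedged sketch of one FROC operating point. `detections` is an assumed
# representation: one (score, is_true_positive) pair per candidate RoI,
# pooled over all images.

def froc_point(detections, n_masses, n_images, threshold):
    """Return (sensitivity, false positives per image) at a score threshold."""
    kept = [d for d in detections if d[0] >= threshold]
    tp = sum(1 for _, hit in kept if hit)       # candidates hitting a mass
    fp = sum(1 for _, hit in kept if not hit)   # candidates on normal tissue
    return tp / n_masses, fp / n_images

# Toy example: 4 candidates, 2 true masses, 2 images
dets = [(0.9, True), (0.8, False), (0.6, True), (0.4, False)]
print(froc_point(dets, n_masses=2, n_images=2, threshold=0.5))  # -> (1.0, 0.5)
```

Sweeping the threshold over all candidate scores traces out the full FROC curve.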

On the other hand, Figure 5.7 shows the FROC curves for the algorithms d1 and d2, and for the proposed system including the false positive reduction step (the black line with squares). The difference between the proposed algorithm and both other approaches is now clearer than in Figure [*]. For instance, at the same sensitivity analyzed in Section [*] (sensitivity $ = 0.8$ ), the mean number of false positives per image is now $ 1.40$ , which is $ 0.93$ fewer than for the algorithm without the false positive reduction step. This shows the benefit of including this algorithm.

Figure 5.7: FROC analysis of the algorithm over the set of $ 120$ mammograms compared to the d1 and d2 algorithms. The inclusion of the false positive reduction step improves the proposed probabilistic template matching.
\includegraphics[width=10.5 cm]{images/frocFPComparison.eps}
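Comparisons like the one above (1.40 versus 2.33 false positives per image at sensitivity 0.8) are read off the FROC curves at a fixed sensitivity, which in general requires interpolating between measured operating points. A minimal sketch of that reading, with invented curve values rather than the thesis data:

```python
# Hedged sketch: reading the false-positive rate off a FROC curve at a
# fixed sensitivity by linear interpolation between operating points.
# The curve below is illustrative, not the measured thesis curve.

def fp_at_sensitivity(curve, target_sens):
    """curve: (sensitivity, fp_per_image) pairs sorted by sensitivity."""
    for (s0, f0), (s1, f1) in zip(curve, curve[1:]):
        if s0 <= target_sens <= s1:
            t = (target_sens - s0) / (s1 - s0)   # position within segment
            return f0 + t * (f1 - f0)            # linear interpolation
    raise ValueError("target sensitivity outside the curve range")

curve = [(0.6, 0.8), (0.8, 1.4), (0.9, 2.5)]     # illustrative points
print(fp_at_sensitivity(curve, 0.8))             # about 1.4
```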

Figure 5.8 again details the performance of the algorithm for each lesion size. Note that larger masses are still more difficult to detect accurately. However, the inclusion of the false positive reduction step allows them to be detected with almost $ 3.00$ fewer false positives per image than without this step. Moreover, the performance for the three smaller sizes is now more uniform than without the false positive reduction step.

Figure 5.8: FROC analysis of the algorithm detailed for each lesion size.
\includegraphics[width=10.5 cm]{images/frocFPSize.eps}

Once the mammograms containing masses are detected, ROC curves are constructed to measure the accuracy with which the masses are detected. The overall performance over the $ 40$ mammograms containing masses resulted in $ A_z$ values of $ 89.3 \pm 5.9$ and $ 89.1 \pm 4.1$ without and with the false positive reduction step, respectively, while the results for the two compared approaches were $ A_z = 84.1 \pm 7.9$ for algorithm d1 and $ A_z = 88.1 \pm 8.4$ for algorithm d2. Note that the false positive reduction step introduces a penalty on the accuracy with which the algorithm detects masses. This is due to the elimination of some RoIs that actually represented a true mass.
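The $ A_z$ values above are areas under the ROC curve. One common way to estimate such an area from a set of operating points is the trapezoidal rule, sketched below; the operating points are made up for illustration, and the thesis may well use a different estimator (e.g. a binormal fit).

```python
# Illustrative sketch: estimating the area under an ROC curve (A_z in
# the text) with the trapezoidal rule. The points are invented; the
# thesis reports A_z as mean +/- standard deviation over cases.

def auc_trapezoid(roc_points):
    """roc_points: (false positive rate, true positive rate) pairs,
    sorted by increasing FPR and including (0, 0) and (1, 1)."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(roc_points, roc_points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0   # one trapezoid slice
    return area

roc = [(0.0, 0.0), (0.1, 0.6), (0.3, 0.85), (1.0, 1.0)]
print(auc_trapezoid(roc))                     # about 0.82
```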

Table 5.4 shows the effect of the lesion size on the different algorithms in terms of the mean and standard deviation of the $ A_z$ values. Note that the inclusion of the false positive reduction step slightly decreases the performance of the proposal in some cases. This is due to the fact mentioned above: a mass that was correctly detected by the proposal was subsequently considered normal tissue by the false positive reduction algorithm. When this is not the case, the obtained $ A_z$ increases.


Table 5.4: Influence of the lesion size (in $ cm^2$ ) for the d1 and d2 algorithms and for the proposal without (Eig) and with (Eig $ \&$ FPRed) false positive reduction. The results show the mean and the standard deviation of the $ A_z$ values.
                          Lesion Size (in $ cm^2$ )
                  $ <$ 1.20        1.20-1.80       1.80-3.60       $ >$ 3.60
 d1               $ 92.1\pm 5.5$   $ 85.8\pm 8.2$  $ 82.4\pm 7.3$  $ 79.1\pm 7.2$
 d2               $ 84.9\pm 8.8$   $ 86.7\pm 8.1$  $ 89.1\pm 9.6$
 Eig              $ 91.3\pm 7.4$   $ 90.3\pm 3.3$  $ 89.6\pm 4.7$
 Eig $ \&$ FPRed  $ 89.9\pm 3.1$   $ 91.4\pm 2.1$  $ 88.5\pm 5.0$



Arnau Oliver 2008-06-17