Introduction

Almost all approaches to mass detection in mammography need a final step to reduce the number of false positives (normal regions marked as suspicious by the algorithm). This is due to the complexity of the internal breast tissue, which leads to the detection of regions that are not masses but normal variations in tissue characteristics.

A variety of techniques for false positive reduction has been developed in recent years. These algorithms classify each RoI either as normal tissue or as depicting an abnormality, which in our case is a mass. Thus, all of them follow a typical classifier scheme: using a database of known cases, the system learns how to differentiate between both kinds of RoIs. Once the system has been trained, a new RoI can be classified.
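
The train-then-classify scheme described above can be sketched in a few lines. This is only an illustration under stated assumptions: the nearest-centroid rule, the two-dimensional feature vectors, and the names `train` and `classify` are hypothetical and do not correspond to any of the reviewed works.

```python
from math import dist
from statistics import fmean

def train(rois, labels):
    """Learn one centroid per class from labelled feature vectors."""
    centroids = {}
    for label in set(labels):
        members = [r for r, l in zip(rois, labels) if l == label]
        # Average each feature dimension over the members of the class.
        centroids[label] = [fmean(column) for column in zip(*members)]
    return centroids

def classify(centroids, roi):
    """Assign a new feature vector to the class with the nearest centroid."""
    return min(centroids, key=lambda label: dist(centroids[label], roi))

# Hypothetical two-feature descriptors (e.g. mean grey level, a texture score).
database = [[0.9, 0.8], [0.8, 0.9], [0.2, 0.1], [0.1, 0.3]]
labels = ["mass", "mass", "normal", "normal"]

model = train(database, labels)       # learning phase on known cases
print(classify(model, [0.85, 0.7]))   # classification of a new RoI
```

Any classifier with the same two phases (for instance LDA or a neural network) could take the place of the centroid rule used here.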

Observing these algorithms, we can distinguish between two main strategies. The first includes the algorithms that first extract features from the RoIs, usually related to their texture, and subsequently train a classifier. The second strategy treats the problem as template matching: each new image is compared to all the RoIs in the database and is then classified as containing a mass or not. Table 5.1 summarizes some works belonging to both strategies.

Table 5.1: Summary of the reviewed works on false positive reduction, with the features used, the classifier/similarity measure used (where LDA means linear discriminant analysis, NN neural network, and ICA independent component analysis), the number of RoIs depicting masses vs. the number of normal RoIs, and the results obtained. Note that for all works accuracy is given in terms of the area under the ROC curve, except for the work of Christoyianni et al. [32], which only gives the percentage of correct classification.

Classifier-Based
Author	Year	Features	Classifier	RoIs	Results
Sahiner [157]		Texture, Morphologic	LDA, NN
Christoyianni [32]		Grey-level, Texture, ICA	NN		$88.23\%$
Qian [144]		Texture, Shape	NN
Tourassi [178]		Grey-level	NN

Template-Based
Author	Year	Features	Similarity	RoIs	Results
Chang [28]		Grey-level, Shape	Likelihood function
Tourassi [179]		Grey-level	Mutual Information

Note that one of the main differences among these works is the ratio between the number of RoIs depicting masses and the total number of RoIs. This is an important issue because the number of wrongly classified RoIs will increase as the number of normal RoIs increases. One should remember that the aim of this step is to reduce the number of false positives, which is usually higher than the number of true positives (as we have seen in Chapter

Sahiner et al. [157] extracted a large set of features and subsequently used genetic algorithms to select the most discriminative ones. With this subset of features, a neural network (NN) and a linear classifier (LDA) were trained and used to classify a new RoI. A similar strategy was used by Christoyianni et al. [32], who extracted grey-level, texture, and independent component analysis (ICA) features and used them to train a neural network. Note also that they applied a principal component analysis (PCA) pre-processing step to reduce the complexity of the problem. On the other hand, Qian et al. [144] analyzed the implementation of an adaptive module to improve the performance of an automatic procedure, which consists of training a Kalman-filter-based neural network using features obtained from a wavelet decomposition.

As explained, the works of Chang et al. [28] and Tourassi et al. [179] are based on comparing a new RoI with all the RoIs in the database. The two clearest differences between them are the similarity measure and the database used. More specifically, the former developed a likelihood measure which depends on the grey level and the shape of the RoIs. Both parameters were compared between the new RoI and the RoIs in the database, which was composed only of RoIs depicting masses, and a likelihood measure was computed from this comparison. On the other hand, the work of Tourassi et al. [179] compares the new RoI with all the RoIs in the database (both with and without masses) using a similarity measure based on mutual information. The new RoI is then labeled as belonging to the closest class.
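
To make the template-based idea concrete, the following sketch labels a new RoI with the class of the database RoI that shares the most mutual information with it. It is not the procedure of Tourassi et al. [179]: the histogram binning, the nearest-neighbour decision, the flat pixel lists, and the function names are all assumptions made for illustration.

```python
from collections import Counter
from math import log2

def mutual_information(a, b, bins=4, max_grey=256):
    """Mutual information (in bits) between two equally sized grey-level
    images, given as flat lists of integers in [0, max_grey)."""
    if len(a) != len(b):
        raise ValueError("images must have the same number of pixels")
    width = max_grey // bins
    qa = [p // width for p in a]          # quantise grey levels into bins
    qb = [p // width for p in b]
    n = len(a)
    joint = Counter(zip(qa, qb))          # joint histogram
    pa, pb = Counter(qa), Counter(qb)     # marginal histograms
    mi = 0.0
    for (x, y), count in joint.items():
        pxy = count / n
        # p(x,y) / (p(x) p(y)) written with raw counts.
        mi += pxy * log2(count * n / (pa[x] * pb[y]))
    return mi

def classify_by_template(database, new_roi):
    """database: list of (pixels, label) pairs. Returns the label of the
    database RoI most similar to new_roi."""
    _, label = max(database, key=lambda e: mutual_information(e[0], new_roi))
    return label

# Hypothetical 8-pixel RoIs, one per class.
mass_template = [200, 200, 50, 50, 200, 200, 50, 50]
normal_template = [100] * 8
database = [(mass_template, "mass"), (normal_template, "normal")]

new_roi = [210, 200, 40, 60, 220, 205, 30, 55]
print(classify_by_template(database, new_roi))
```

Every call to `classify_by_template` recomputes the similarity against the whole database, so the cost of classifying one RoI grows with the database size.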

Note that with the latter strategy, the similarity used for classification has to be recomputed for each new element, as it measures the difference between the new RoI and all the RoIs in the database. The first strategy has a different drawback: a large set of features needs to be computed, and only some of them are finally selected. In contrast, we will show that our approach, which is again based on the eigenfaces algorithm, is more straightforward and efficient. Furthermore, we will show that using the recently developed 2DPCA decomposition [194] instead of the typical PCA greatly improves the results.
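
For orientation, the 2DPCA decomposition is commonly stated as follows. The notation is chosen here for illustration and is an assumption, not taken from this text: $A_1, \dots, A_M$ are the $m \times n$ training images, $G$ is the image covariance matrix, and $d$ is the number of retained projection axes.

```latex
\[
\bar{A} = \frac{1}{M}\sum_{i=1}^{M} A_i, \qquad
G = \frac{1}{M}\sum_{i=1}^{M} (A_i - \bar{A})^{T} (A_i - \bar{A}) \in \mathbb{R}^{n \times n}
\]
\[
G X_k = \lambda_k X_k, \quad \lambda_1 \ge \dots \ge \lambda_n, \qquad
Y_i = A_i \, [\, X_1 \; \cdots \; X_d \,] \in \mathbb{R}^{m \times d}
\]
```

Unlike the typical PCA, which vectorises each image into an $mn$-dimensional vector and therefore works with an $mn \times mn$ covariance matrix, 2DPCA operates on the image matrices directly, so $G$ is only $n \times n$.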