Almost all approaches to mass detection in mammography need a final step to reduce the number of false positives (normal regions marked as suspicious by the algorithm). This is due to the complexity of the internal breast tissue, which leads to the detection of regions that are not masses but normal variations in tissue characteristics.
A number of techniques for false positive reduction have been developed in recent years. These algorithms classify each RoI either as normal tissue or as depicting an abnormality, which in our case is a mass. Thus, all of them follow a typical classifier scheme: using a database of known cases, the system learns how to differentiate between both kinds of RoIs. Once the system has been trained, a new RoI can be classified.
Observing these algorithms, we can distinguish between two main
strategies. The first one comprises the algorithms that first
extract features from the RoIs, usually related to their texture,
and subsequently train the classifier. The second strategy, in
contrast, treats the problem as template matching: each new image
is compared to all the RoIs of the database and is then classified
as containing a mass or not. Table summarizes some works
belonging to both strategies.
Note that, among all those works, one of the main differences is
the ratio between the number of RoIs depicting masses and the
total number of RoIs. This is an important issue because the
number of misclassified RoIs will increase as the number of
normal RoIs increases. One should remember that the aim of this
step is to reduce the number of false positives, which is usually
higher than the number of true positives (as we have seen in
Chapter ).
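The effect of this ratio can be illustrated with a small calculation. The sketch below assumes a hypothetical classifier whose sensitivity and specificity stay fixed; the function name and all numeric values are illustrative and are not taken from any of the cited works. It shows that, with the number of mass RoIs held constant, the expected number of normal RoIs wrongly kept as masses grows in proportion to the number of normal RoIs.

```python
def misclassified_counts(n_mass, n_normal, sensitivity, specificity):
    """Expected (missed_masses, false_positives) for a classifier with
    the given per-class rates. All inputs are hypothetical."""
    missed_masses = n_mass * (1.0 - sensitivity)
    false_positives = n_normal * (1.0 - specificity)
    return missed_masses, false_positives


# Fixed number of mass RoIs, growing number of normal RoIs.
n_mass = 100
for n_normal in (100, 500, 1000):
    missed, fps = misclassified_counts(n_mass, n_normal, 0.90, 0.90)
    mass_ratio = n_mass / (n_mass + n_normal)
    print(f"mass ratio {mass_ratio:.2f}: "
          f"{missed:.0f} missed masses, {fps:.0f} false positives")
```

For this reason, results obtained on databases with different ratios of mass to normal RoIs are not directly comparable.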
Sahiner et al. [157] extracted a large set of features and subsequently used genetic algorithms to select the most discriminative ones. With this subset of features, a neural net (NN) and a linear classifier (LDA) were trained and used to classify a new RoI. A similar strategy was used by Christoyianni et al. [32], who extracted grey-level, texture, and independent component analysis (ICA) features and used them to train a neural net. Note also that they applied principal component analysis (PCA) as a pre-processing step to reduce the complexity of the problem. On the other hand, Qian et al. [144] analyzed the implementation of an adaptive module to improve the performance of an automatic procedure which consists of training a Kalman-filter-based neural net using features obtained from a wavelet decomposition.
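The general shape of this first strategy (extract features, keep the most discriminative ones, train a classifier) can be sketched in a few lines. This is only an illustration under simplifying assumptions, not the method of any cited work: a Fisher score ranking stands in for the genetic-algorithm selection, a nearest-centroid rule stands in for the NN and LDA classifiers, and all function names and feature values are hypothetical.

```python
from statistics import mean, pvariance


def fisher_scores(features, labels):
    """Score each feature column by the squared difference of the class
    means divided by the sum of the class variances (higher is better)."""
    scores = []
    for j in range(len(features[0])):
        pos = [row[j] for row, y in zip(features, labels) if y == 1]
        neg = [row[j] for row, y in zip(features, labels) if y == 0]
        spread = pvariance(pos) + pvariance(neg)
        scores.append((mean(pos) - mean(neg)) ** 2 / (spread + 1e-12))
    return scores


def select_top(features, labels, k):
    """Indices of the k highest-scoring features."""
    scores = fisher_scores(features, labels)
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return ranked[:k]


def train_centroids(features, labels, selected):
    """Mean vector of each class, restricted to the selected features."""
    centroids = {}
    for cls in (0, 1):
        rows = [row for row, y in zip(features, labels) if y == cls]
        centroids[cls] = [mean(r[j] for r in rows) for j in selected]
    return centroids


def classify(roi_features, centroids, selected):
    """Return the class whose centroid is closest in squared Euclidean distance."""
    x = [roi_features[j] for j in selected]

    def dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(x, centroid))

    return min(centroids, key=lambda cls: dist(centroids[cls]))


# Hypothetical training set: three features per RoI, label 1 = mass.
train_x = [[0.9, 5.0, 0.2], [0.8, 4.0, 0.9], [0.2, 5.5, 0.3], [0.1, 4.5, 0.8]]
train_y = [1, 1, 0, 0]
selected = select_top(train_x, train_y, k=1)
centroids = train_centroids(train_x, train_y, selected)
print(selected, classify([0.7, 4.8, 0.5], centroids, selected))
```

Note that every feature has to be computed for every training RoI before the selection step can discard any of them.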
As explained, the works of Chang et al. [28] and Tourassi et al. [179] are based on comparing a new RoI with all the RoIs in the database. The two clearest differences between them are the similarity measure and the database used. More specifically, the former developed a likelihood measure which depends on the grey level and the shape of the RoIs. Both parameters were compared between the new RoI and the set of RoIs in the database, which was composed only of RoIs depicting masses, and a likelihood measure was computed from this comparison. On the other hand, the work of Tourassi et al. [179] compares all the RoIs of the database (including RoIs with and without masses) with the new one using a mutual-information-based similarity measure. The new RoI is then labeled as belonging to the closest class.
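A minimal sketch of template matching with a mutual-information similarity is given below. It is written under stated assumptions and is not the implementation of Tourassi et al.: RoIs are flat lists of grey levels of equal length, the histogram uses a hypothetical number of bins, and "closest class" is read as the label of the single most similar database RoI. The function names are illustrative.

```python
from collections import Counter
from math import log


def mutual_information(roi_a, roi_b, bins=8, max_grey=255):
    """Mutual information (in nats) between two equally sized grey-level
    RoIs, given as flat sequences of pixel values in [0, max_grey]."""
    if len(roi_a) != len(roi_b):
        raise ValueError("RoIs must have the same number of pixels")

    def quantise(value):
        # Map a grey level to one of `bins` histogram bins.
        return min(int(value * bins / (max_grey + 1)), bins - 1)

    qa = [quantise(v) for v in roi_a]
    qb = [quantise(v) for v in roi_b]
    n = len(qa)
    joint = Counter(zip(qa, qb))
    count_a, count_b = Counter(qa), Counter(qb)

    mi = 0.0
    for (a, b), count in joint.items():
        p_ab = count / n
        # p_ab / (p_a * p_b), written with raw counts.
        mi += p_ab * log(count * n / (count_a[a] * count_b[b]))
    return mi


def classify_by_similarity(new_roi, database):
    """Return the label of the database RoI most similar to new_roi.
    `database` is a list of (pixels, label) pairs."""
    _, best_label = max(
        database, key=lambda entry: mutual_information(new_roi, entry[0])
    )
    return best_label


# Hypothetical 8-pixel RoIs, for illustration only.
database = [
    ([10, 10, 200, 200, 10, 200, 10, 200], "mass"),
    ([10, 200, 10, 200, 10, 200, 10, 200], "normal"),
]
new_roi = [20, 20, 210, 210, 20, 210, 20, 210]
print(classify_by_similarity(new_roi, database))
```

Each call to `classify_by_similarity` evaluates the similarity against every RoI in the database, so the cost of classifying one new RoI grows with the database size.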
Note that with the second strategy, the similarity used for classification has to be recomputed for each new element, as it measures the difference between the new RoI and all the RoIs in the database. The first strategy has a different drawback: a large set of features needs to be computed, of which only some will finally be selected. In contrast, we will show that our approach, which is again based on the eigenfaces algorithm, is more straightforward and efficient. Furthermore, we will show that using the recently developed 2DPCA decomposition [194] instead of the typical PCA greatly improves the results.