Clustering is one of the most commonly used techniques in image segmentation, as discussed in the review by Jain et al. [78]. Following that review, clustering techniques can be divided into hierarchical and partitional algorithms. The main difference is that hierarchical methods produce a nested series of partitions, whereas partitional methods produce only a single one. Although hierarchical methods can be more accurate, partitional methods are preferred in applications involving large data sets, such as images, because building nested partitions becomes computationally prohibitive. However, partitional algorithms have two main disadvantages: 1) the number of clusters in the image has to be known a priori, and 2) the spatial information inherent to the image is not used.
A traditional partitional clustering algorithm is the k-Means algorithm [108], which is easy to implement and has low computational complexity. For mass segmentation purposes, this algorithm has been used by Sahiner et al. [155,157] to generate an initial segmentation result. As described in the previous section, Sahiner et al. then improved this segmentation using edge information. In contrast, Li et al. [105] used a generalization of k-Means that included spatial information to refine an initial segmentation (the initial result being obtained by adaptive thresholding).
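As an illustration of the basic k-Means iteration on grey levels, the following minimal sketch alternates between assigning each pixel to its nearest cluster centre and recomputing the centres. It is not the implementation of any of the cited works: the function name `kmeans_grey`, the evenly spread initial centres and the toy pixel list are assumptions made for this example, and no spatial information is used.

```python
def kmeans_grey(values, k, iters=50):
    """Plain k-Means (Lloyd iteration) on scalar grey levels.

    Returns (centres, labels). Hypothetical helper for illustration only.
    """
    lo, hi = min(values), max(values)
    # Assumption: initial centres are spread evenly over the intensity range.
    centres = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iters):
        # Assignment step: each pixel goes to the nearest centre.
        labels = [min(range(k), key=lambda c: abs(v - centres[c]))
                  for v in values]
        # Update step: each centre becomes the mean of its members
        # (an empty cluster keeps its previous centre).
        new_centres = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            new_centres.append(sum(members) / len(members)
                               if members else centres[c])
        if new_centres == centres:  # converged: nothing moved
            break
        centres = new_centres
    return centres, labels


# Toy grey levels: a dark background and a bright region.
pixels = [12, 15, 14, 200, 210, 205, 13, 198]
centres, labels = kmeans_grey(pixels, k=2)
print(centres)  # [13.5, 203.25]
print(labels)   # [0, 0, 0, 1, 1, 1, 0, 1]
```

Note that `k` has to be supplied by the caller, which is the first disadvantage of partitional algorithms mentioned above.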
Alternative clustering algorithms used for mammographic mass segmentation are the Fuzzy C-Means (FCM) algorithm [11], the Dogs and Rabbit (DaR) algorithm [118] and the Expectation Maximization (EM) algorithm [40]. FCM was used with different objectives in the works of Velthuizen [185] and Chen and Lee [29]. While Velthuizen used it to group pixels with similar grey-level values in the original images, Chen and Lee applied it to a set of local features extracted by a multi-resolution wavelet transform and Gaussian Markov random field analysis. Moreover, in the work of Chen and Lee the output of the FCM was the input to an EM algorithm based on Gibbs random fields. The DaR algorithm, on the other hand, was used by Zheng and Chan [195]. In contrast to FCM, which extends k-Means with a fuzzy formulation of the energy function, DaR performs a more robust seed placement, resulting in a more stable clustering algorithm [118]. Other clustering approaches are based on prior assumptions (models) of the image. For example, the algorithm of Li et al. [104] is similar to the EM approach proposed by Chen and Lee [29], but formulates the segmentation as a Markov random field model. Like k-Means, their algorithm is iterative and alternates between estimating the mean intensity and the pixel labels.
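To make the difference between hard and fuzzy assignment concrete, the sketch below implements the standard FCM updates on scalar grey levels: memberships u_ij proportional to d_ij^(-2/(m-1)), and centres computed as means weighted by u_ij^m. It is a generic textbook formulation, not the feature-based pipeline of Chen and Lee or the exact setup of Velthuizen. The function name `fuzzy_c_means`, the fuzzifier `m = 2`, the initial centres and the toy data are assumptions for illustration.

```python
def fuzzy_c_means(values, centres, m=2.0, iters=100, tol=1e-6):
    """Standard Fuzzy C-Means on scalar data (illustrative sketch).

    Returns (centres, memberships); memberships[j][i] is the degree to
    which values[j] belongs to cluster i, as computed in the last iteration.
    """
    centres = [float(c) for c in centres]
    power = 2.0 / (m - 1.0)
    memberships = []
    for _ in range(iters):
        # Membership update: closer centres receive a larger share.
        memberships = []
        for v in values:
            d = [abs(v - c) for c in centres]
            if 0.0 in d:
                # The sample coincides with a centre: full membership there.
                row = [1.0 if di == 0.0 else 0.0 for di in d]
                s = sum(row)
                row = [r / s for r in row]
            else:
                inv = [di ** -power for di in d]
                total = sum(inv)
                row = [x / total for x in inv]
            memberships.append(row)
        # Centre update: mean of the data weighted by membership ** m.
        new_centres = []
        for i in range(len(centres)):
            w = [row[i] ** m for row in memberships]
            new_centres.append(sum(wi * v for wi, v in zip(w, values)) / sum(w))
        shift = max(abs(a - b) for a, b in zip(new_centres, centres))
        centres = new_centres
        if shift < tol:
            break
    return centres, memberships


# Toy grey levels: two clear groups plus one ambiguous value (55).
pixels = [10, 12, 11, 100, 102, 101, 55]
centres, u = fuzzy_c_means(pixels, centres=[0, 120])
```

In this toy example the pixels near 10 and 100 receive memberships close to 1 for their own cluster, while the value 55 is shared almost equally between the two, which a hard k-Means assignment cannot express.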
One of the earliest approaches to mass segmentation was the work of Brzakovic et al. [20], which was based on a multi-resolution fuzzy pyramid linking approach. The pyramid is a data structure in which the input image forms the base and each subsequent level (of lower resolution) is constructed sequentially from the one below it. The links between each node and its four parents were propagated to the upper levels using a fuzzy function. The authors demonstrated that this algorithm is directly related to the isodata clustering algorithm [20]. Note that this strategy takes spatial information (region information) into account.
Thresholding Methods
Like Fu and Mu [56], we consider threshold methods as a special case of partitional clustering methods, where only two clusters are considered. Threshold methods have been widely used for mass segmentation. For instance, Matsubara et al. [116,117] used different grey-level threshold values depending on the type of tissue of the breast based on histogram analysis. More recently, Mudigonda et al. [122] used multilevel thresholding to detect closed edges. In this approach a concentric group of contours represents the propagation of density information from the central-core portion of an object or tissue region in the image into the surrounding tissues. This algorithm can be regarded as a region growing approach, where in each iteration neighbours with similar grey-level values are grouped (the works of Huo et al. [76], as well as Petrick et al. [134], described in subsection , follow a similar strategy). The main drawback of this approach is the assumption that masses have (more or less) uniform density compared to the local background.
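The view of thresholding as two-cluster partitional clustering can be sketched with the classic iterative intermeans (isodata-style) threshold selection: the threshold is repeatedly moved to the midpoint of the mean grey levels of the two classes it induces. This is a generic global technique shown for illustration only. It is not the tissue-dependent thresholding of Matsubara et al. nor the multilevel scheme of Mudigonda et al., and the function name `intermeans_threshold`, the tolerance and the toy pixel values are assumptions.

```python
def intermeans_threshold(values, tol=0.5, max_iter=100):
    """Iterative two-class threshold selection (illustrative sketch).

    Starts at the global mean and moves the threshold to the midpoint of
    the two class means until it changes by less than `tol`.
    """
    t = sum(values) / len(values)
    for _ in range(max_iter):
        low = [v for v in values if v <= t]
        high = [v for v in values if v > t]
        if not low or not high:
            break  # all values fell into one class; keep the current threshold
        new_t = 0.5 * (sum(low) / len(low) + sum(high) / len(high))
        converged = abs(new_t - t) < tol
        t = new_t
        if converged:
            break
    return t


# Toy grey levels: dark background with a small bright region.
pixels = [30, 32, 35, 31, 180, 175, 190, 33]
t = intermeans_threshold(pixels)
mask = [v > t for v in pixels]  # True marks the brighter class
```

The procedure mirrors the two steps of k-Means with two clusters: assigning pixels to a class by comparison with the threshold, then re-estimating the class means.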
In some cases the thresholding is not applied directly to the mammographic image, but to an enhanced version of it. For example, Kobatake et al. [88,92] applied an iris filter designed to enhance rounded opacities and to be insensitive to thin anatomical structures, and then detected round masses using adaptive thresholding. Another example is the work of Saha et al. [154], who first enhanced the image with a scale-based fuzzy connectivity method and subsequently thresholded it to detect masses.
Instead of enhancing the image, a different approach is to first extract some (texture) features from the image and threshold them in a subsequent step. For instance, Undrill et al. [181] thresholded images filtered with Laws masks, while Heath and Bowyer [68] developed a mass detection algorithm based on an Average Fraction Under the Minimum (AFUM) filter. This filter is designed to measure the degree to which the region surrounding a point decreases radially in intensity. The final step is to threshold the filtered image to identify suspicious regions.
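The "extract a texture feature, then threshold it" pattern can be sketched with a single Laws mask. The 1-D vectors L5, E5 and S5 below are the standard Laws kernels, and a 2-D mask is obtained as their outer product. Everything else is an assumption made for illustration: which mask is used, taking the absolute response as the feature, the threshold value, and the helper names `outer`, `filter_valid` and `threshold_texture`. None of this is taken from Undrill et al.

```python
# Standard 1-D Laws kernels of length 5.
L5 = [1, 4, 6, 4, 1]      # level (local average)
E5 = [-1, -2, 0, 2, 1]    # edge
S5 = [-1, 0, 2, 0, -1]    # spot


def outer(col, row):
    """Build a 2-D mask with mask[j][i] = col[j] * row[i]."""
    return [[c * r for r in row] for c in col]


def filter_valid(image, mask):
    """Correlate `image` with `mask` where the mask fits entirely."""
    mh, mw = len(mask), len(mask[0])
    out = []
    for y in range(len(image) - mh + 1):
        out_row = []
        for x in range(len(image[0]) - mw + 1):
            acc = 0
            for j in range(mh):
                for i in range(mw):
                    acc += image[y + j][x + i] * mask[j][i]
            out_row.append(acc)
        out.append(out_row)
    return out


def threshold_texture(image, mask, t):
    """Threshold the absolute mask response (assumed feature choice)."""
    response = filter_valid(image, mask)
    return [[abs(r) > t for r in row] for row in response]


# Toy 5x6 image with a vertical step edge between columns 2 and 3.
step = [[0, 0, 0, 100, 100, 100] for _ in range(5)]
l5e5 = outer(L5, E5)  # smooths vertically, responds to horizontal changes
suspicious = threshold_texture(step, l5e5, t=1000)
```

Because E5 sums to zero, a region of constant intensity produces a zero response, so only locations with local intensity variation survive the threshold.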