The Importance of the Segmentation Step

This section compares our strategy for breast density classification with the others found in the literature. The main difference among these approaches is the density segmentation, which falls into three general categories: no density segmentation, segmentation according to the distance to the skin-line, and segmentation according to the internal tissue.

To quantify the improvement provided by our strategy, the same features and classifier as in our proposal are used for every approach. The strategies are explained in more detail below.

This experiment uses the MIAS database [169] with the annotations obtained from the consensus opinion (the set of mammograms is divided into BIRADS I, BIRADS II, BIRADS III, and BIRADS IV). The same leave-one-woman-out procedure explained earlier is used to evaluate each strategy.
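The leave-one-woman-out procedure can be sketched as follows. This is a minimal illustration, assuming each mammogram carries an identifier of the woman it belongs to; the function name `leave_one_woman_out` and the toy identifiers are hypothetical and not taken from the original experiments.

```python
from collections import defaultdict


def leave_one_woman_out(woman_ids):
    """Yield (train_idx, test_idx) pairs, one per woman.

    All mammograms of the held-out woman go to the test set, so that
    no image of the same woman is ever seen during training.
    """
    by_woman = defaultdict(list)
    for idx, woman in enumerate(woman_ids):
        by_woman[woman].append(idx)
    for woman, test_idx in by_woman.items():
        train_idx = [i for i, w in enumerate(woman_ids) if w != woman]
        yield train_idx, test_idx


# Hypothetical toy data: six mammograms belonging to three women.
woman_ids = ["w1", "w1", "w2", "w2", "w3", "w3"]
splits = list(leave_one_woman_out(woman_ids))
```

Each split would then be used to train the classifier on `train_idx` and to classify the mammograms in `test_idx`, accumulating the predictions into a single confusion matrix.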

The confusion matrix for the first strategy (no segmentation) is shown in Table (a). The overall performance of this approach is $67\%$; per class, we obtained $77\%$, $74\%$, $55\%$, and $57\%$ from BIRADS I to BIRADS IV, respectively. Note that mammograms with low density are classified better than mammograms with high density.
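The overall and per-class percentages reported here can be derived from a confusion matrix as in the following sketch. It assumes that rows hold the true classes and columns the predicted ones; the counts are hypothetical and are not the values of the tables in this section.

```python
def per_class_and_overall_accuracy(confusion):
    """Return (per-class accuracies, overall accuracy).

    Assumption: confusion[i][j] counts samples of true class i
    that were assigned to class j.
    """
    per_class = []
    for i, row in enumerate(confusion):
        total = sum(row)
        per_class.append(row[i] / total if total else 0.0)
    n_samples = sum(sum(row) for row in confusion)
    n_correct = sum(confusion[i][i] for i in range(len(confusion)))
    return per_class, n_correct / n_samples


# Hypothetical counts for four classes (BIRADS I to IV), for illustration only.
confusion = [
    [8, 2, 0, 0],
    [1, 7, 2, 0],
    [0, 3, 6, 1],
    [0, 0, 2, 3],
]
per_class, overall = per_class_and_overall_accuracy(confusion)
```

Note that the overall accuracy is weighted by class size, so it is not the plain average of the per-class values when the classes are unbalanced.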

Table (b) shows the results obtained by the second approach, which segments the breast into regions according to the distance to the skin-line. The performance is considerably higher than with the no-segmentation approach, reaching $75\%$ correct classification. The largest improvements are found for mammograms belonging to BIRADS I and BIRADS III, with $89\%$ and $69\%$ correct classification, respectively.
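To make the idea of skin-line based regions concrete, the sketch below labels breast pixels by their distance to the background. It is only an approximation under stated assumptions: the breast is given as a binary mask containing at least one background pixel, the distance is the city-block distance to the nearest background pixel, and the band width is a free parameter. None of these choices is claimed to match the reviewed method exactly.

```python
from collections import deque


def skinline_distance_regions(mask, band_width):
    """Label breast pixels by distance to the nearest background pixel.

    mask: 2D list with 1 for breast and 0 for background (assumed to
    contain at least one background pixel).
    Returns a 2D list with 0 for background and 1, 2, ... for bands of
    increasing distance from the skin-line.
    """
    rows, cols = len(mask), len(mask[0])
    dist = [[None] * cols for _ in range(rows)]
    queue = deque()
    # Background pixels are the seeds of the breadth-first search.
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] == 0:
                dist[r][c] = 0
                queue.append((r, c))
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and dist[nr][nc] is None:
                dist[nr][nc] = dist[r][c] + 1
                queue.append((nr, nc))
    return [
        [0 if mask[r][c] == 0 else (dist[r][c] - 1) // band_width + 1
         for c in range(cols)]
        for r in range(rows)
    ]


# Toy mask: chest wall on the left border, background on the right.
mask = [
    [1, 1, 1, 1, 0],
    [1, 1, 1, 1, 0],
    [1, 1, 1, 1, 0],
]
regions = skinline_distance_regions(mask, band_width=2)
```

In this toy case the two columns closest to the background form the first band and the two columns next to the chest wall form the second one.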

Finally, Table 3.8 shows the results obtained by segmenting the breast according to the internal breast tissue: (a) the Fuzzy C-Means approach, (b) the Fractal approach, and (c) the Statistical approach. The overall performance of the three algorithms is similar ($86\%$, $84\%$, and $85\%$, respectively), and all of them are clearly better than the results obtained by the other two approaches.

The results obtained by the Fuzzy C-Means and the Statistical approaches are quite similar for all classes except BIRADS III. For BIRADS I both approaches obtained $91\%$ correct classification, for BIRADS II the Statistical approach obtained $84\%$ and the Fuzzy C-Means $83\%$, and for BIRADS IV both approaches obtained $73\%$. In contrast, for BIRADS III the Fuzzy C-Means performs better, with $89\%$ correct classification against $85\%$ for the Statistical approach. The Fractal approach behaves somewhat differently: it obtains $95\%$ correct classification for mammograms belonging to BIRADS I, while for the remaining classes this drops to $82\%$, $83\%$, and $62\%$ from BIRADS II to BIRADS IV, respectively.

Table 3.8: Confusion matrices for MIAS mammogram classification using the internal breast density as the segmentation strategy: (a) Fuzzy C-Means, (b) Fractal, and (c) Statistical approaches. In each matrix, rows and columns are B-I, B-II, B-III, and B-IV.

(a) FCM ( $86\%,\kappa = 0.81$ )

(b) Fractal ( $84\%,\kappa = 0.77$ )

(c) Statistical ( $85\%,\kappa = 0.79$ )
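The $\kappa$ values reported next to each accuracy can be computed from a confusion matrix as sketched below, assuming that $\kappa$ denotes Cohen's kappa coefficient, $\kappa = (p_o - p_e)/(1 - p_e)$, where $p_o$ is the observed agreement and $p_e$ the agreement expected by chance. The example counts are hypothetical and do not come from the table.

```python
def cohen_kappa(confusion):
    """Cohen's kappa of a square confusion matrix given as nested lists."""
    k = len(confusion)
    n = sum(sum(row) for row in confusion)
    # Observed agreement: fraction of samples on the diagonal.
    observed = sum(confusion[i][i] for i in range(k)) / n
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(confusion[i][j] for i in range(k)) for j in range(k)]
    # Chance agreement: product of the marginals, summed over classes.
    expected = sum(row_totals[i] * col_totals[i] for i in range(k)) / (n * n)
    return (observed - expected) / (1.0 - expected)


# Hypothetical two-class example, for illustration only.
kappa = cohen_kappa([[20, 5], [10, 15]])
```

Unlike the overall accuracy, this coefficient discounts the agreement that unbalanced class sizes would produce by chance.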

Figure 3.3: The reviewed strategies for dividing a mammogram into regions. The density of the mammograms shown in column (a) increases from the top row (BIRADS I) to the bottom row (BIRADS IV). Segmentation using (b) a single breast area, (c) the distance between the pixels and the skin-line, (d) a Fuzzy C-Means clustering of pixels with similar appearance, (e) the fractalization of the image, and (f) the statistical approach.

$\includegraphics[height=3.45 cm]{images/pdb005ll.b1.eps}$	$\includegraphics[height=3.45 cm]{images/mdb005ll.b1.eps}$	$\includegraphics[height=3.45 cm]{images/kdb005ll.b1.eps}$	$\includegraphics[height=3.45 cm]{images/cdb005ll.b1.eps}$	$\includegraphics[height=3.45 cm]{images/fdb005ll.b1.eps}$	$\includegraphics[height=3.45 cm]{images/sdb005ll.b1.eps}$
$\includegraphics[height=3.45 cm]{images/pdb041ll.b2.eps}$	$\includegraphics[height=3.45 cm]{images/mdb041ll.b2.eps}$	$\includegraphics[height=3.45 cm]{images/kdb041ll.b2.eps}$	$\includegraphics[height=3.45 cm]{images/cdb041ll.b2.eps}$	$\includegraphics[height=3.45 cm]{images/fdb041ll.b2.eps}$	$\includegraphics[height=3.45 cm]{images/sdb041ll.b2.eps}$
$\includegraphics[height=3.45 cm]{images/pdb194rl.b3.eps}$	$\includegraphics[height=3.45 cm]{images/mdb194rl.b3.eps}$	$\includegraphics[height=3.45 cm]{images/kdb194rl.b3.eps}$	$\includegraphics[height=3.45 cm]{images/cdb194rl.b3.eps}$	$\includegraphics[height=3.45 cm]{images/fdb194rl.b3.eps}$	$\includegraphics[height=3.45 cm]{images/sdb194rl.b3.eps}$
$\includegraphics[height=3.45 cm]{images/pdb171ll.b4.eps}$	$\includegraphics[height=3.45 cm]{images/mdb171ll.b4.eps}$	$\includegraphics[height=3.45 cm]{images/kdb171ll.b4.eps}$	$\includegraphics[height=3.45 cm]{images/cdb171ll.b4.eps}$	$\includegraphics[height=3.45 cm]{images/fdb171ll.b4.eps}$	$\includegraphics[height=3.45 cm]{images/sdb171ll.b4.eps}$
(a)	(b)	(c)	(d)	(e)	(f)

Figure 3.3 shows the segmentation of the breast according to the compared strategies. Except for BIRADS I, the last three columns (which correspond to the segmentation algorithms that use breast tissue information) show similar results, and thus the classification results for these strategies are also similar. Note that for BIRADS I the Fuzzy C-Means obtains a singular result, grouping most of the pixels of the breast into one cluster, except those located near the skin-line, which form the second cluster. This is because, for this set of mammograms, the breast is almost homogeneous and the algorithm can only distinguish between pixels with different compressed tissue (the image is darker in those regions with less compressed tissue). As discussed in Section , the breast texture information is in the breast tissue cluster, while the small ribbon-like cluster does not provide significant information to the system.
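The two-cluster behaviour described for BIRADS I can be illustrated with a minimal Fuzzy C-Means sketch. It clusters grey-level values only, with fuzziness exponent $m = 2$ and evenly spread initial centres; these choices, the function names, and the toy intensities are assumptions made for illustration and do not reproduce the features or settings of the actual segmentation step.

```python
def _memberships(values, centres, m):
    """Membership of each value in each cluster (standard FCM update)."""
    result = []
    for x in values:
        d = [abs(x - c) for c in centres]
        if 0.0 in d:
            # The value coincides with one or more centres: split it among them.
            hits = [1.0 if dk == 0.0 else 0.0 for dk in d]
            total = sum(hits)
            result.append([h / total for h in hits])
        else:
            inv = [dk ** (-2.0 / (m - 1.0)) for dk in d]
            total = sum(inv)
            result.append([v / total for v in inv])
    return result


def fuzzy_c_means_1d(values, n_clusters=2, m=2.0, n_iter=50):
    """Return (centres, memberships) after a fixed number of iterations."""
    lo, hi = min(values), max(values)
    # Assumption: initial centres spread evenly over the intensity range.
    centres = [lo + (hi - lo) * (k + 0.5) / n_clusters for k in range(n_clusters)]
    for _ in range(n_iter):
        u = _memberships(values, centres, m)
        centres = [
            sum((ui[k] ** m) * x for ui, x in zip(u, values))
            / sum(ui[k] ** m for ui in u)
            for k in range(n_clusters)
        ]
    return centres, _memberships(values, centres, m)


# Hypothetical grey levels: a dark ribbon near the skin-line and brighter tissue.
values = [10, 12, 11, 200, 205, 198]
centres, memberships = fuzzy_c_means_1d(values)
labels = [u.index(max(u)) for u in memberships]
```

With an almost homogeneous breast, a clustering of this kind has little to separate beyond the darker ribbon near the skin-line, which is consistent with the singular result observed for BIRADS I.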

Analyzing in more detail the segmentations of the mammograms belonging to the remaining BIRADS categories, one can conclude that the fractal approach provides a pixelated segmentation, while the statistical approach obtains larger and clearly separated regions. The Fuzzy C-Means segmentation is intermediate between the two and, thus, its classification results are slightly better than those of the other two.

The obtained results show that the segmentation step increases the performance of the classification, improving the overall results by at least $8$ percentage points (from $67\%$ without segmentation to $75\%$ with the skin-line strategy). Moreover, segmentation according to the breast tissue clearly outperforms segmentation according to the distance to the skin-line. Finally, the strategy used to segment the internal breast tissue does not cause a major variation in the results, although the Fuzzy C-Means based results are slightly better than the others.