Accuracy of the detection

Once the mammograms that contain masses have been detected, the algorithms must be able to identify precisely the position and borders of those masses. This capability is evaluated here using ROC analysis, with emphasis on performance with respect to the morphological aspects detailed in the MIAS annotations: lesion shape (circular or spiculated mass), lesion size, and breast tissue type (fatty, glandular or dense). In addition, the influence of the number of clusters on the clustering algorithms is evaluated.
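As a minimal sketch of the kind of ROC measurement used in this evaluation, the area under the ROC curve can be computed from a per-pixel score and the annotated mass region. The text does not specify the exact procedure, so the rank-based formulation, the function name `roc_area` and the toy data below are assumptions made only for illustration.

```python
def roc_area(scores, labels):
    """Area under the ROC curve via the rank (Mann-Whitney) identity.

    scores: per-pixel values, higher meaning "more likely mass" (assumed).
    labels: 1 for pixels inside the annotated mass, 0 otherwise (assumed).
    """
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        raise ValueError("need at least one mass pixel and one background pixel")
    # Fraction of (mass, background) pixel pairs ranked correctly; ties count 0.5.
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example: six pixels, three of them inside the annotated mass.
toy_scores = [0.9, 0.8, 0.3, 0.7, 0.2, 0.1]
toy_labels = [1, 1, 1, 0, 0, 0]
toy_area = roc_area(toy_scores, toy_labels)
```

The pairwise loop is quadratic in the number of pixels, which is acceptable for a sketch but would normally be replaced by a sort-based computation on full mammograms.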

The lesion shape has a strong influence on the performance of the segmentation algorithms. Table 2.5 shows the $A_z$ values obtained when segmenting the mammograms with masses. In all cases the algorithms detect circumscribed masses more accurately. However, owing to the small number of samples, this difference is not significant.
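The remark on significance can be illustrated with Welch's t statistic computed from summary values alone. The sketch below uses the a1 row of Table 2.5; the group sizes are not given in the text, so `n_circular` and `n_spiculated` are hypothetical, and the test itself is an assumption rather than the procedure used in this evaluation.

```python
import math


def welch_t(mean1, std1, n1, mean2, std2, n2):
    """Welch's t statistic and approximate degrees of freedom from summary statistics."""
    v1 = std1 ** 2 / n1  # squared standard error of group 1
    v2 = std2 ** 2 / n2  # squared standard error of group 2
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


# Means and standard deviations from the a1 row of Table 2.5.
# The sample sizes are hypothetical: the text only says they are small.
n_circular = 10
n_spiculated = 10
t_a1, df_a1 = welch_t(86.6, 6.9, n_circular, 81.5, 11.7, n_spiculated)
```

With these hypothetical sizes the statistic is about 1.19 with roughly 14.6 degrees of freedom, below the two-sided 5% critical value of about 2.1, which is consistent with the observation that the difference is not significant.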

All algorithms are more accurate in detecting circular masses than spiculated ones. For algorithms

and

this is because both apply enhancement filters, which are likely to remove thin spicules. Algorithm

shows a larger difference, which might be because the original algorithm was applied to Regions of Interest (RoIs), whereas our implementation is applied to the whole image. On the other hand, the performance of algorithms c1 and c2 is hardly affected by the lesion shape, which can be attributed to the nature of clustering: it takes only pixel feature similarity into account, independently of neighbouring pixels. The pattern matching approach (

) loses some accuracy if the lesion is spiculated, because the proposed templates are circular. Finally, algorithms

and

were originally developed to detect spiculated masses, but in our implementation show higher accuracy for circumscribed masses. These last two algorithms show the overall best performance on both circumscribed and spiculated masses.
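The observation that clustering considers only pixel feature similarity, independently of neighbouring pixels, can be checked on a toy example: if the pixels are shuffled, each pixel still receives the same cluster. The sketch below is a plain one-dimensional k-Means on grey levels, with a deterministic initialisation chosen for illustration; it is an assumption and not the implementation evaluated in this section.

```python
import random


def kmeans_1d(values, k, iters=20):
    """Plain k-Means on scalar grey levels; pixel positions are never used."""
    lo, hi = min(values), max(values)
    # Deterministic initial centres spread evenly over the intensity range.
    centres = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iters):
        # Assign every pixel to the nearest centre in feature space.
        labels = [min(range(k), key=lambda c: abs(v - centres[c])) for v in values]
        # Move each centre to the mean of its members (empty clusters stay put).
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            if members:
                centres[c] = sum(members) / len(members)
    return labels, centres


# Toy integer grey levels standing in for a flattened image.
pixels = [12, 15, 14, 200, 210, 205, 90, 95, 13, 198]
labels, centres = kmeans_1d(pixels, k=3)

# Shuffle the pixel positions and cluster again.
order = list(range(len(pixels)))
random.Random(0).shuffle(order)
shuffled_labels, _ = kmeans_1d([pixels[i] for i in order], k=3)

# Each pixel keeps its cluster regardless of where it sits in the image.
same_assignment = all(shuffled_labels[j] == labels[i] for j, i in enumerate(order))
```

Because the assignment depends only on the grey level, thin structures such as spicules are neither favoured nor penalised by their shape.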

Table 2.5: Influence of the lesion shape on the segmentation algorithms. The results show the mean and standard deviation of the $A_z$ values.

	Lesion Shape
Algorithm	Circular	Spiculated
a1	$86.6\pm 6.9$	$81.5\pm 11.7$
a2	$86.3\pm 17.3$
b1	$83.5\pm 14.8$
b2	$90.7\pm 5.8$
c1	$85.2\pm 6.9$
c2	$85.8\pm 6.9$
d1	$84.6\pm 10.8$
d2	$90.9\pm 6.3$

The influence of the lesion size on the accuracy of the algorithms is summarized in Table 2.6. It shows that the

approach works well for small masses, but its performance decreases as the size of the masses increases, because the variation in their shape also increases with size. The opposite is true for algorithms c1 and c2, whose performance increases with the size of the lesion: both k-Means and FCM tend to produce homogeneous clusters [78]. We can also see that the use of filters and statistical approaches (algorithms

and

) follows the same behaviour as the clustering-based methods.

Finally, note that algorithm

increases its performance with the size of the lesion until a maximum is reached, after which its performance decreases. This is due to the use of skeletons. When the mass is small, its skeleton is less important for the detection process. As the mass size increases, its skeleton becomes easier to detect. For large masses, however, the skeleton becomes more difficult to detect and hence of less use in the detection process. It should be noted that this decrease in performance for larger masses is not significant.

Table 2.6: Influence of the lesion size (in

) on the segmentation algorithms. The results show the mean and standard deviation of the $A_z$ values.

Lesion Size ( )



a1	$80.6\pm 5.7$	$79.5\pm 10.6$	$83.1\pm 5.3$	$84.8\pm 4.6$	$88.1\pm 6.3$	$90.1\pm 5.1$
a2	$89.7\pm 6.3$	$81.3\pm 12.4$	$84.7\pm 12.6$	$78.3\pm 15.5$	$83.3\pm 8.7$
b1	$68.9\pm 10.4$	$79.6\pm 10.0$	$83.3\pm 5.4$	$85.3\pm 4.7$	$88.4\pm 6.3$
b2	$84.9\pm 10.7$	$84.9\pm 9.1$	$88.1\pm 5.7$	$90.2\pm 8.3$	$90.0\pm 7.4$
c1	$80.7\pm 9.2$	$82.3\pm 9.4$	$86.8\pm 3.0$	$87.2\pm 5.8$	$88.3\pm 7.7$
c2	$81.5\pm 10.3$	$83.2\pm 9.0$	$86.9\pm 3.6$	$87.4\pm 6.3$	$88.7\pm 8.4$
d1	$93.1\pm 3.4$	$91.7\pm 9.2$	$83.0\pm 9.1$	$82.3\pm 6.6$	$83.0\pm 5.3$
d2	$84.7\pm 9.1$	$85.3\pm 8.0$	$89.9\pm 4.2$	$88.5\pm 8.1$	$90.5\pm 9.3$

The accuracy of the algorithms depending on the breast tissue type is summarized in Table 2.7. Note that most of the algorithms perform better on fatty breasts. This is clearly true for the

and a2 algorithms, whose accuracy drops by $10\%$ for dense breasts. The reason is that, in the glandular and dense breasts of the MIAS database, the difference between mass and normal tissue is less clear than in fatty breasts. In addition, most algorithms perform better on fatty breasts than on breasts with denser tissue.

The two exceptions to the above rules are algorithms

and

. This is because these algorithms base the detection process on contour information and therefore perform better when stronger intensity changes are present.

Table 2.7: Influence of the breast tissue on the segmentation algorithms. The results show the mean and standard deviation of the $A_z$ values.

	Breast Tissue
Algorithm	Fatty	Glandular	Dense
a1	$85.2\pm 19.7$	$82.8\pm 7.8$	$84.3\pm 5.0$
a2	$87.5\pm 15.0$	$79.7\pm 22.0$
b1	$84.0\pm 6.8$	$82.8\pm 7.5$
b2	$87.6\pm 14.1$	$89.8\pm 8.0$
c1	$85.9\pm 14.4$	$85.0\pm 8.0$
c2	$86.2\pm 14.3$	$85.6\pm 8.2$
d1	$88.8\pm 8.7$	$82.6\pm 8.2$
d2	$89.6\pm 13.7$	$90.0\pm 6.5$

As indicated when describing the clustering approaches, over-segmentation of the image results in improved mass detection. Table 2.8 shows the result of segmenting the images using algorithms c1 and c2 with different numbers of initial clusters: the performance of the algorithms increases with the number of initial seeds. However, there is a point beyond which the increase becomes small and the $A_z$ value stabilizes, and this point is not the same for all mammograms. In addition, the segmentation time also increases with the number of initial seed points. In general, an initial set of

seeds gives the best trade-off between processing time and detection performance.

Table 2.8: Dependence of $A_z$ on the number of clusters; $A_z$ grows with the number of clusters until it stabilizes.

	Number of Clusters
Algorithm	5	10	15	20	25
c1	$84.3\pm 5.8$	$87.5\pm 6.9$	$88.1\pm 5.8$	$88.4\pm 5.5$	$88.5\pm 4.6$
c2	$82.2\pm 6.8$	$87.9\pm 4.1$	$87.8\pm 3.4$	$88.7\pm 3.5$
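The stabilisation described above can be located with a simple rule: stop at the last cluster count whose following step still adds less than a chosen minimum gain. The sketch below applies this to the mean values of the c1 row of Table 2.8; the function name and the `min_gain` thresholds are hypothetical, and the rule ignores the per-mammogram variation and processing time mentioned in the text.

```python
def stabilisation_point(cluster_counts, mean_scores, min_gain):
    """Return the first cluster count after which the score gain drops below min_gain."""
    for i in range(1, len(mean_scores)):
        if mean_scores[i] - mean_scores[i - 1] < min_gain:
            return cluster_counts[i - 1]
    # The score was still improving by at least min_gain at every step.
    return cluster_counts[-1]


# Mean values of the c1 row of Table 2.8.
counts = [5, 10, 15, 20, 25]
c1_means = [84.3, 87.5, 88.1, 88.4, 88.5]

# Hypothetical thresholds, chosen only to show how the choice shifts.
strict = stabilisation_point(counts, c1_means, min_gain=1.0)
loose = stabilisation_point(counts, c1_means, min_gain=0.5)
```

On this row a minimum gain of 1.0 selects 10 clusters and a minimum gain of 0.5 selects 15, so the sketch illustrates the trade-off rather than fixing a value.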