Inspection of mammographic images shows that pixels belonging to
similar tissue have similar grey-level values, as can be
seen in Figure . Since our aim is to
cluster those pixels into meaningful regions, the Fuzzy C-Means
algorithm (see Section ) is used to group them
into two separate categories: fatty tissue and dense tissue.
Beforehand, to suppress the microtexture that could appear in some
regions, the breast region is smoothed
with a median filter of size
. In our
experiments, this filter size was a good compromise between noise
reduction and texture preservation of mammographic tissue.
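The smoothing step can be illustrated with a minimal sketch. It assumes the image is a list of rows of integer grey levels and that a binary breast mask is already available; the function name `median_smooth`, the mask argument, and the choice to ignore out-of-mask neighbours are illustrative assumptions, not details given in the text. The window size is left as a parameter. A practical implementation would use an array library rather than nested lists.

```python
from statistics import median_low


def median_smooth(image, mask, size):
    """Median-filter the pixels where mask is true, using a size x size window.

    Only neighbours inside the mask contribute, so background pixels do not
    leak into the breast region near the skin-line (an assumption of this
    sketch). Pixels outside the mask are copied unchanged.
    """
    if size < 1 or size % 2 == 0:
        raise ValueError("size must be a positive odd integer")
    half = size // 2
    rows, cols = len(image), len(image[0])
    out = [row[:] for row in image]  # leave the input untouched
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c]:
                continue
            window = [
                image[rr][cc]
                for rr in range(max(0, r - half), min(rows, r + half + 1))
                for cc in range(max(0, c - half), min(cols, c + half + 1))
                if mask[rr][cc]
            ]
            # median_low always returns a grey level present in the window
            out[r][c] = median_low(window)
    return out
```

Using `median_low` keeps every output value an actual grey level from the neighbourhood, which avoids fractional values when the window is truncated at the image border or at the mask boundary.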
When using partitional clustering algorithms such as Fuzzy C-Means,
the placement of the initial seed points is one of the main
sources of variation in segmentation results [78].
Despite their importance, the seeds for these algorithms are usually
initialized randomly. As we only consider two classes in our
approach, Fuzzy C-Means is instead initialized using histogram
information, with the aim of obtaining representative instances of
both classes. Hence, we initialize the two seeds with the
grey-level values that represent
and
of the
cumulative histogram of the breast pixels of each mammogram
(representing fatty and dense tissue, respectively). Although
these values were empirically determined, the obtained
segmentations do not critically depend on them. Moreover, some
mammograms do not have clearly distinguishable dense and fatty
components. In these cases, the segmentation yields one cluster
grouping the breast tissue and another cluster grouping regions
of less compressed tissue (an elongated, ribbon-like region
following the skin-line). The breast texture
information then lies in the breast tissue cluster, while the ribbon does
not provide significant information to the system.
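The histogram-seeded clustering described above can be sketched as follows. This is a minimal one-dimensional Fuzzy C-Means on grey levels using the standard update rules, not necessarily the exact implementation used here. The function names, the fuzziness exponent `m = 2`, the stopping tolerance, and the two fractions in the usage example are all assumptions introduced for illustration; in particular, the fractions are hypothetical placeholders and not the empirically determined values referred to in the text.

```python
import math


def percentile_grey_level(values, fraction):
    """Grey level at which the cumulative histogram reaches `fraction`."""
    ordered = sorted(values)
    index = math.ceil(fraction * len(ordered)) - 1
    return ordered[min(len(ordered) - 1, max(0, index))]


def membership_row(value, centers, m):
    """Fuzzy memberships of one grey level with respect to every center."""
    dists = [abs(value - c) for c in centers]
    zeros = [d == 0 for d in dists]
    if any(zeros):  # value coincides with a center: full membership there
        share = 1.0 / sum(zeros)
        return [share if z else 0.0 for z in zeros]
    power = 2.0 / (m - 1.0)
    return [1.0 / sum((dk / dj) ** power for dj in dists) for dk in dists]


def fuzzy_c_means_1d(values, seeds, m=2.0, max_iter=100, tol=1e-4):
    """Fuzzy C-Means on scalar grey levels, started from the given seeds."""
    centers = [float(s) for s in seeds]
    for _ in range(max_iter):
        memberships = [membership_row(v, centers, m) for v in values]
        new_centers = []
        for k in range(len(centers)):
            weights = [u[k] ** m for u in memberships]
            total = sum(weights)
            weighted = sum(w * v for w, v in zip(weights, values))
            new_centers.append(weighted / total if total > 0 else centers[k])
        shift = max(abs(a - b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    # memberships consistent with the final centers
    memberships = [membership_row(v, centers, m) for v in values]
    return centers, memberships


# Usage on toy grey levels; 0.15 and 0.85 are hypothetical placeholders.
breast_pixels = [10, 12, 11, 13, 12, 10, 200, 205, 198, 202]
seeds = [percentile_grey_level(breast_pixels, 0.15),
         percentile_grey_level(breast_pixels, 0.85)]
centers, memberships = fuzzy_c_means_1d(breast_pixels, seeds)
# Hard labels: 0 = lower grey-level cluster, 1 = higher grey-level cluster
labels = [0 if u[0] >= u[1] else 1 for u in memberships]
```

Because the seeds are taken from the low and high ends of the cumulative histogram, the first center starts in the darker population and the second in the brighter one, so cluster indices keep a fixed meaning across images instead of depending on a random start.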