The mammograms are stored in a raw format, where each pixel is
represented by two bytes (the grey level of each pixel is
represented using
bits). The size of each mammogram varies, but it can be
determined from the corresponding overlay file.
Each case is stored using the following nomenclature: XXXXAB.raw, where XXXX is a number identifying the case and A indicates whether the mammogram comes from a right breast (``R'') or from a left one (``L''). B can be ``C'' or ``O'', which respectively indicate that the mammogram is a CC or an MLO view.
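The naming scheme above can be decoded mechanically. The following is a minimal sketch, assuming that the case number XXXX consists only of digits; the names `parse_mammogram_name` and `MammogramName`, as well as the file name used in the usage note, are hypothetical and do not come from the database.

```python
import re
from typing import NamedTuple

# Assumption: XXXX is made of digits only; A is "R" or "L"; B is "C" or "O".
_NAME_RE = re.compile(r"^(?P<case>\d+)(?P<side>[RL])(?P<view>[CO])\.raw$")


class MammogramName(NamedTuple):
    case: str  # case number, kept as text to preserve leading zeros
    side: str  # "right" or "left"
    view: str  # "CC" or "MLO"


def parse_mammogram_name(filename: str) -> MammogramName:
    """Split a XXXXAB.raw file name into case number, breast side and view."""
    match = _NAME_RE.match(filename)
    if match is None:
        raise ValueError(f"not a XXXXAB.raw name: {filename!r}")
    side = {"R": "right", "L": "left"}[match["side"]]
    view = {"C": "CC", "O": "MLO"}[match["view"]]
    return MammogramName(match["case"], side, view)
```

For instance, `parse_mammogram_name("0042RO.raw")` would describe an MLO view of the right breast of case 0042. The case number is kept as a string so that it can be reused to build related file names.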
The overlays are stored in a tif file with the same name as the case, followed by a constant suffix (_lb). Thus, the size of each raw file can be obtained from the dimensions of its overlay file.
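As an illustration of how the overlay and the raw file relate, here is a minimal sketch that derives the overlay name and decodes the raw pixels once the width and height are known. Several points are assumptions not stated in the text: the `.tif` extension, little-endian byte order, unsigned values, and row-major pixel order. The helper names `overlay_path` and `decode_raw` are hypothetical. Reading the overlay dimensions themselves requires a TIFF-capable library and is left out of the sketch.

```python
import sys
from array import array
from pathlib import Path


def overlay_path(raw_path):
    """Return the overlay file name: the case name followed by `_lb`.

    Assumption: the overlay uses the `.tif` extension.
    """
    raw_path = Path(raw_path)
    return raw_path.with_name(raw_path.stem + "_lb.tif")


def decode_raw(data, width, height, byteorder="little"):
    """Decode raw mammogram bytes into a list of rows of grey levels.

    Assumptions: two unsigned bytes per pixel, row-major order, and the
    given byte order (little-endian by default).
    """
    expected = width * height * 2  # two bytes per pixel
    if len(data) != expected:
        raise ValueError(f"expected {expected} bytes, got {len(data)}")
    pixels = array("H")  # unsigned short
    if pixels.itemsize != 2:
        raise RuntimeError("unsigned short is not 2 bytes on this platform")
    pixels.frombytes(data)
    if byteorder != sys.byteorder:
        pixels.byteswap()  # convert to the machine's native byte order
    values = pixels.tolist()
    return [values[r * width:(r + 1) * width] for r in range(height)]
```

A typical use would be `decode_raw(Path(name).read_bytes(), width, height)`, with `width` and `height` taken from the overlay. The length check guards against using dimensions that do not match the raw file.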
The most interesting feature of this database is that the overlay
files combine the opinions of six different radiologists.
Thus, each overlay file has only seven grey levels: $0$ means
normal pixels, and the values from $1$ to $6$ count how many
radiologists marked the pixel as belonging to a mass, so $1$
means that only one radiologist marked the pixel, $2$ that two
radiologists marked it, and so on. Thus, after linearly
stretching the grey levels of an overlay, the centre of each mass
is its brightest region, and the grey-level values gradually
decrease towards its border.
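The agreement levels can be used directly, for example to keep only the pixels marked by a minimum number of radiologists, or to stretch the seven levels over a displayable range. The sketch below assumes the overlay has already been loaded as a list of rows of integers between 0 and 6; the function names, the 8-bit output range, and the reading of the linear equalization as a simple linear stretch are assumptions made for illustration.

```python
N_RADIOLOGISTS = 6  # overlay levels run from 0 (normal) to 6 (all agree)


def consensus_mask(overlay, min_votes):
    """Mark the pixels that at least `min_votes` radiologists labelled as mass."""
    if not 1 <= min_votes <= N_RADIOLOGISTS:
        raise ValueError("min_votes must be between 1 and 6")
    return [[level >= min_votes for level in row] for row in overlay]


def stretch_levels(overlay, max_out=255):
    """Linearly map levels 0..6 onto 0..max_out (integer arithmetic)."""
    return [[level * max_out // N_RADIOLOGISTS for level in row]
            for row in overlay]
```

With `min_votes=1` the mask is the union of all annotations, while `min_votes=6` keeps only the region on which every radiologist agreed. After stretching, pixels with full agreement map to `max_out` and normal pixels stay at 0.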