The films were extracted from the UK National Breast Screening Programme, and digitised to micron pixel edge with a Joyce-Loebl scanner, a device with linear response in the optical density range . Each pixel was described as a -bit word. The database also includes ``ground-truth'' information on the locations of any abnormalities that may be present.
The mammograms are named as ``mdbXXXBS'', where:
Each image is stored in raw format: each value in the stored file is the grey level (from 0 to ) of the corresponding pixel in the image. Thus, once the size of the image has been read from its name, reading the image correctly is straightforward.
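Reading such a raw file can be sketched as follows. This is a minimal illustration, not the database's official reader: it assumes one unsigned byte per pixel, row-major order, and no file header, none of which is confirmed by the text above. The function names `decode_raw` and `read_raw_image` are hypothetical, and the width and height must be supplied by the caller from the size encoded in the file name.

```python
from pathlib import Path


def decode_raw(data, width, height):
    """Turn a flat byte string into a list of pixel rows.

    Assumption: one unsigned byte per pixel, stored row by row, no header.
    """
    expected = width * height
    if len(data) != expected:
        raise ValueError(f"expected {expected} bytes, found {len(data)}")
    # Iterating over a bytes object yields integers, one per pixel.
    return [list(data[r * width:(r + 1) * width]) for r in range(height)]


def read_raw_image(path, width, height):
    """Read a raw image file whose size is known from its name (hypothetical helper)."""
    return decode_raw(Path(path).read_bytes(), width, height)
```

The length check is there because a raw file carries no size information of its own: a wrong width or height would otherwise produce a silently skewed image.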
The annotations include the kind of each abnormality, the coordinates of its centre of mass, and the approximate radius of the circle enclosing it. Table summarizes the mammograms present in this database. In it, we can see the number of mammograms containing calcifications, masses, or other kinds of abnormalities, and the number of normal mammograms, all distributed according to the kind of breast tissue (fatty, glandular, or dense).
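An annotation of this form (kind, centre, radius) can be represented as a small record, as in the sketch below. The class name `Annotation`, its field names, and the `contains` helper are hypothetical. The code also assumes that the centre and the queried pixel share one coordinate system, since the origin and axis convention are not stated here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    kind: str      # kind of abnormality, e.g. "mass" (label values are assumptions)
    x: float       # centre of mass, horizontal coordinate
    y: float       # centre of mass, vertical coordinate
    radius: float  # approximate radius of the enclosing circle, in pixels

    def contains(self, px, py):
        """Return True if pixel (px, py) lies inside or on the enclosing circle."""
        return (px - self.x) ** 2 + (py - self.y) ** 2 <= self.radius ** 2
```

Comparing squared distances avoids a square root and keeps the test exact for integer coordinates.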