Tuesday, May 17, 2011

Quick Update

I met with my advisor, Dr. Nelson, today, and he told me that he has arranged for me to go to Moore Cancer Center to work with Dr. Haydee Ojeda-Fournier on analyzing the data. Dr. Nelson also provided me with a user-friendly interface that he used in another study. The code is written in OpenGL, and I can modify it to fit the needs of this study. I will provide more information on it after running some tests on my images.

Wednesday, May 11, 2011

Mammography

Mammography is the process of using low-dose X-rays (usually around 0.7 mSv) to examine the human breast, and it is used as both a diagnostic and a screening tool. Two projective views of the breast are available in mammography: the craniocaudal (CC) view, in which the breast is compressed horizontally and the x-ray is taken in the direction from head to toe, and the mediolateral (ML) view, in which the breast is compressed vertically and the x-ray is taken from the side. The images included here are both craniocaudal views.


During the procedure, the tissue is compressed by the parallel plates of the mammography unit. Parallel-plate compression evens out the thickness of the breast tissue to increase image quality: it reduces the thickness of tissue that the x-rays must penetrate, decreases the amount of scattered radiation (scatter degrades image quality), reduces the required radiation dose, and holds the breast still (preventing motion blur).

Mammographic density refers to the prevalence of fibroglandular tissue in the breast, as opposed to fatty tissue, as it appears on a mammogram. In the following image, Raundahl illustrates three examples of mammograms with different densities:
(a) Low density; (b) Medium density; (c) High density

Jakob Raundahl: Mammographic Pattern Recognition

I have been reading Jakob Raundahl's PhD dissertation, titled Mammographic Pattern Recognition. The focus of his thesis is very much like what I want to do, except that his image acquisition technique is different.

Raundahl's thesis has two major parts: one covers the methodological aspects of the research, and the other covers the application to clinical data and the discussions derived therefrom. Furthermore, he claims that the common ways to evaluate new automated density measures are either visual assessment or correlation with radiologist readings. Consequently, even though advanced and powerful image analysis methods are applied, the endpoint is still an approximation of a radiologist giving a score of 1-4 based on visual assessment.

Since some types of hormone replacement therapy (HRT) have been shown to increase mammographic density, Raundahl uses images from HRT studies to evaluate density measures by their ability to separate the HRT and placebo populations.
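To make this concrete, here is a minimal sketch (my own illustration, not code from the thesis) of that style of evaluation: given per-patient density scores for the two study arms, a rank-based test and an AUC-style score quantify how well a measure separates them. The score arrays below are hypothetical.

    import numpy as np
    from scipy.stats import mannwhitneyu

    # Hypothetical per-patient density scores for the two study arms.
    hrt_scores = np.array([0.42, 0.55, 0.61, 0.48, 0.59])
    placebo_scores = np.array([0.31, 0.38, 0.35, 0.44, 0.29])

    # Rank-based test of whether the two groups differ in density.
    u_stat, p_value = mannwhitneyu(hrt_scores, placebo_scores, alternative="two-sided")

    # The U statistic rescaled to [0, 1] acts as an AUC-like separability score.
    auc = u_stat / (len(hrt_scores) * len(placebo_scores))
    print("p = %.3f, separability (AUC) = %.2f" % (p_value, auc))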

The automated approaches to measuring mammographic density and mammographic patterns mentioned in the thesis are:
  1. Automated Thresholding Method
He discusses three algorithms:
    • Kittler and Illingworth's optimal threshold (KI)
    • KI applied to the variance-normalized image (KIVA), suggested by Sivaramakrishna et al.
    • An adaptive threshold based on the mean breast intensity (1.3*Avg); a minimal sketch of this variant appears after this list.
Conclusion:
These types of approaches are already well developed, and it is better to use more complex methodologies that include structural and textural information. Similarly, for the purpose of my own study, I need to develop indicative measures able to capture structure.
  2. Unsupervised Method
  3. Supervised Method
  4. Supervised framework extended using SFS feature selection
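As promised above, here is a minimal sketch of the adaptive-threshold idea, assuming a 2-D grayscale mammogram and a boolean mask marking the breast region (both placeholders). This is my own illustration of the 1.3*Avg rule, not code from the thesis.

    import numpy as np

    def adaptive_density(image, breast_mask):
        """Percent density from an adaptive threshold at 1.3 * mean breast intensity."""
        breast_pixels = image[breast_mask]
        threshold = 1.3 * breast_pixels.mean()     # adaptive cut-off from the mean breast intensity
        dense_pixels = breast_pixels > threshold   # pixels treated as fibroglandular tissue
        return 100.0 * dense_pixels.sum() / breast_pixels.size

    # Example on synthetic data: a random "mammogram" with the whole frame treated as breast.
    image = np.random.rand(256, 256)
    breast_mask = np.ones_like(image, dtype=bool)
    print("density = %.1f%%" % adaptive_density(image, breast_mask))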

Reflectivity Data

This is the reflectivity data from patients 2 - 12 in the 110051 study. Including left and right, there are 22 breasts in total. The top image is slice 0; the small images underneath each large image are the rest of the slices in that breast.

This montage illustrates the remarkable differences between patients as well as the strong symmetry between the left and right breasts.

Looking at this montage, we also realized that there is a rotation bug: scans starting counterclockwise versus clockwise are rotated by about 90 degrees. It should be an easy fix!

Another issue that becomes evident is that the patient needs to center her breast as well as possible in the scanner; otherwise the breast can be cut off and the image quality reduced (patient 10's left breast was cut off, for example).

Tuesday, May 3, 2011

n-fold Cross Validation

The process of collecting a training dataset is often quite painstaking. Even worse, often a third of the data turns out to be useless, and then what? If one cheats and keeps tweaking the program's parameters against the same data, the result may be overfitting.

One remedy for this problem is to use n-fold cross-validation.

In n-fold cross-validation, the original sample is randomly partitioned into n subsamples. Of the n subsamples, a single subsample is retained as the validation data for testing the model, and the remaining n − 1 subsamples are used as training data. The cross-validation process is then repeated n times (the folds), with each of the n subsamples used exactly once as the validation data. The n results from the folds can then be averaged (or otherwise combined) to produce a single estimate. The advantage of this method over repeated random sub-sampling is that all observations are used for both training and validation, and each observation is used for validation exactly once [1].
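A minimal sketch of this procedure in plain NumPy, just to pin down the mechanics; the data, the train_and_score callback, and the fold count are placeholders rather than anything from this project.

    import numpy as np

    def cross_validate(X, y, train_and_score, n_folds=5, seed=0):
        """Partition the sample into n folds; train on n-1 folds and score on the held-out one."""
        rng = np.random.RandomState(seed)
        indices = rng.permutation(len(X))      # random partition of the original sample
        folds = np.array_split(indices, n_folds)
        scores = []
        for i in range(n_folds):
            test_idx = folds[i]                # each fold serves as validation data exactly once
            train_idx = np.concatenate([f for j, f in enumerate(folds) if j != i])
            scores.append(train_and_score(X[train_idx], y[train_idx], X[test_idx], y[test_idx]))
        return np.mean(scores), np.std(scores)  # single estimate plus its spread across folds

    # Usage: mean_score, std_score = cross_validate(X, y, my_train_and_score, n_folds=5)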

As suggested by Prof. Belongie, 5-fold cross-validation will provide a decent average and a decent standard deviation. But this is the step that comes after collecting data and establishing ground-truth labels.

[1] McLachlan, G.J.; Do, K.A.; Ambroise, C. (2004). Analyzing Microarray Gene Expression Data. Wiley.

"Ground Truth" & labeling

I need to:

(a) gather more data
(b) establish its "ground truth" labels
(c) measure inter-rater or intra-rater reliability (one rater could be myself or my colleague Jakob, and the other could be a radiologist)

There are different methods of collecting ground truth. One strategy would be to ask a rater to sort the 100 images (or image patches) in order of density. For intra-rater reliability, I would then shuffle the deck and have the rater repeat the task. For inter-rater reliability, I would simply compare the sorted lists between raters. In fact, there are a variety of distances between ranked lists that I can use when the time comes.
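One such distance, sketched here with hypothetical orderings of five images, is Kendall's tau between two raters' rankings (other ranked-list distances would work the same way):

    from scipy.stats import kendalltau

    # Hypothetical density rankings of the same five images from two raters
    # (entry i is the rank each rater gave to image i).
    rater_a = [0, 1, 2, 3, 4]
    rater_b = [0, 2, 1, 3, 4]

    tau, p_value = kendalltau(rater_a, rater_b)   # tau = 1 for identical orderings, -1 for reversed
    print("Kendall tau = %.2f (p = %.3f)" % (tau, p_value))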

An alternative: 
If it is too much work for a rater to sort the entire set of images, I could use an approximation in which I pick a subset of random pairs of images and just have the rater say which image in each pair is denser.
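A small sketch of how those pairs could be drawn (the image IDs and number of comparisons are placeholders; the rater's answers would then feed an agreement measure or an approximate ranking):

    import random

    def sample_pairs(image_ids, n_pairs, seed=0):
        """Draw random pairs of distinct images for a rater to compare by density."""
        rng = random.Random(seed)
        return [tuple(rng.sample(image_ids, 2)) for _ in range(n_pairs)]

    # Example: 20 comparisons drawn from a set of 100 images.
    pairs = sample_pairs(list(range(100)), n_pairs=20)
    print(pairs[:3])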