Tuesday, May 3, 2011

"Ground Truth" & labeling

I need to:

(a) gather more data
(b) establish its "ground truth" labels
(c) measure inter-rater and intra-rater reliability (one rater could be me or my colleague Jakob, and the other could be a radiologist)

There are different methods of collecting ground truth. One strategy would be to ask a rater to sort the 100 images (or image patches) in order of density. For intra-rater reliability, I would then shuffle the deck and have the same rater repeat the task. For inter-rater reliability, I would simply compare the sorted lists from different raters. There are several distance measures between ranked lists, such as the Kendall tau distance and Spearman's footrule, that I can choose from when the time comes.
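To make the list comparison concrete, here is a minimal sketch of one such measure, the Kendall tau distance, which counts the pairs of images that two sorted lists put in opposite order. I have not settled on a measure yet, so this is only an illustration; the function names and the idea of identifying images by an ID are my own assumptions.

```python
from itertools import combinations


def kendall_tau_distance(ranking_a, ranking_b):
    """Count the item pairs that the two rankings place in opposite order.

    Each ranking is a list of image IDs (hypothetical identifiers),
    sorted from least dense to most dense. Both must contain the same
    IDs, each exactly once.
    """
    if len(ranking_a) != len(set(ranking_a)) or len(ranking_b) != len(set(ranking_b)):
        raise ValueError("rankings must not contain duplicate items")
    if set(ranking_a) != set(ranking_b):
        raise ValueError("rankings must contain the same items")

    # Position of every item in each ranking.
    pos_a = {item: i for i, item in enumerate(ranking_a)}
    pos_b = {item: i for i, item in enumerate(ranking_b)}

    discordant = 0
    for x, y in combinations(ranking_a, 2):
        # A pair is discordant when the two raters order it differently.
        if (pos_a[x] - pos_a[y]) * (pos_b[x] - pos_b[y]) < 0:
            discordant += 1
    return discordant


def normalized_kendall_tau_distance(ranking_a, ranking_b):
    """Scale the distance to [0, 1]: 0 = identical order, 1 = reversed."""
    n = len(ranking_a)
    max_pairs = n * (n - 1) // 2
    if max_pairs == 0:
        return 0.0
    return kendall_tau_distance(ranking_a, ranking_b) / max_pairs
```

For intra-rater reliability, the two lists would come from the same rater before and after shuffling; for inter-rater reliability, from two different raters. This pairwise loop is quadratic in the number of images, which is fine for 100 images.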

An alternative:
If sorting the entire set of images is too much work for a rater, I could use an approximation: pick a random subset of image pairs and have the rater say only which image in each pair is denser.
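A rough sketch of how the pairwise version might work: draw a random subset of image pairs, record which image in each pair a rater calls denser, and measure reliability as the fraction of pairs on which two sets of judgments agree. The function names, the fixed seed, and the dictionary format for judgments are assumptions made for illustration, not a finished protocol.

```python
import random
from itertools import combinations


def sample_pairs(image_ids, n_pairs, seed=0):
    """Draw n_pairs distinct unordered pairs of image IDs at random.

    A fixed seed (an assumption here) lets every rater see the same pairs.
    """
    rng = random.Random(seed)
    all_pairs = list(combinations(image_ids, 2))
    return rng.sample(all_pairs, min(n_pairs, len(all_pairs)))


def pairwise_agreement(judgments_a, judgments_b):
    """Fraction of shared pairs on which two sets of judgments agree.

    Each argument maps a pair (id_1, id_2) to the ID of the image
    that the rater judged to be denser.
    """
    shared = judgments_a.keys() & judgments_b.keys()
    if not shared:
        raise ValueError("no pairs were judged in both sets")
    agree = sum(judgments_a[pair] == judgments_b[pair] for pair in shared)
    return agree / len(shared)
```

With 100 images there are 4,950 possible pairs, so even a few hundred sampled pairs would be a much lighter task than a full sort. The same agreement score works for intra-rater reliability (one rater, two sessions) and for inter-rater reliability (two raters, same pairs).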
