Crater marking questions - accuracy expectations?
2012-Feb-03, 01:29 AM
Overall I like the site, but on occasion I get the dreaded "Only 42% of the marked craters match the pros; here are the ones that are correct" messages. I get the intent of the message, but I have a few issues.
First, if you follow the tutorial link provided, the image you were working on goes away, so you can't learn from the example in front of you.
Second, if you use the "corrected image" link, your markups disappear, so you can't compare your work to the pros'. Maybe having the pros' marks in green and "Joe's" in blue would be better?
Third, the craters marked by the pros are sometimes very subtle. Check this example:
marked by the pros
In this example, the two red circles are supposed to be craters. I thought they were hills or undulations, or at least not clearly craters. But if they are indeed craters, then I would think the two features I circled in green would also be craters; they look almost identical to me.
Also, the red arrow clearly points to a crater, no question there. The green arrow above it is also clearly a crater; I marked it, but the pro did not. Presumably it was just under the threshold size, but that difference is way too small to be clearly discerned at the zoom levels available.
Lastly, the blue arrow points to a crater marked by the pro. Honestly, that feature is so faint that there is no way I would have marked it. Meanwhile, the feature marked with the orange arrow, while much better defined, was not marked by the pro. Again, that one may be a pixel too small in diameter, I suppose.
I know that this all sounds like whining, but I swear that's not my intent. I don't expect to be as good at this as a pro by any stretch. However, if this is what you want my work to look like then I should quit now so I don't mess up your database with erroneous markings! :)
So I guess that's a long-winded way of asking: What are your expectations for user accuracy on the site?
2012-Feb-03, 07:33 AM
Being the "pro" who marked these, I can say that most of what you wrote is fairly accurate. A few things to note, in no particular order:
I have re-done the image and looked for more subtle ones (what you marked in green) and added another ~200, but they have not yet been fully incorporated into the database. It's likely the ones you marked in green are in the latest version.
Getting to the point where you can distinguish between highly modified craters and "undulations" eventually becomes a judgement call. For me, I take a step back (almost literally) and look at the image; the subtle ones tend to pop out at that point. For example, the thumbnails of your attachments make those highly modified craters look much more obvious to me than they do when blown up to full resolution.
The technique I use for identifying craters is somewhat different from what's used in Moon Mappers, and it can yield slightly different diameters; everyone is going to get slightly different diameters when measuring any crater. I can 99% guarantee that I found the ones you've indicated with the green and orange arrows, but if I measured a diameter of 17.999 pixels when I ID'ed a crater, then it would not be included in the dataset that was uploaded. At some point we have to draw a cut-off, but I'm not sure exactly where that should be. Should we have rounded and included everything I identified as >17.5 pixels? We had simply done a cut-off at 18, but maybe that's not the right way to go about it for my comparison set.
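To make the cut-off question concrete, here is a minimal sketch of the two choices described above. The 18-pixel threshold comes from the post; the function names and sample diameters are hypothetical, purely for illustration:

```python
# Two ways to apply a diameter cut-off, as discussed above.
# A crater measured at 17.999 px is excluded by a hard >= 18 px cut-off,
# but included if diameters are rounded first (everything > 17.5 px counts).

def hard_cutoff(diameters_px, threshold=18.0):
    """Keep only craters whose measured diameter meets the threshold."""
    return [d for d in diameters_px if d >= threshold]

def rounded_cutoff(diameters_px, threshold=18):
    """Round each measured diameter to the nearest pixel, then apply the threshold."""
    return [d for d in diameters_px if round(d) >= threshold]

measured = [17.999, 18.1, 17.4, 25.0]  # hypothetical measurements, in pixels
print(hard_cutoff(measured))     # [18.1, 25.0]
print(rounded_cutoff(measured))  # [17.999, 18.1, 25.0]
```

The only difference between the two is whether a measurement like 17.999 px is treated as "effectively 18" or falls just under the line, which is exactly the ambiguity the post describes.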
It's okay not to get everything I get. I did the image once, then went back and did it again. Then I used the Moon Mappers interface, and my score has averaged 91±4%. Against myself. There is known variability between professionals, and even between hours for a single pro. This is actually another thing we're trying to study with Moon Mappers, as the last formal study of this kind was in the 1980s. I want to get a half dozen expert crater counters to all do the same image, look at the variability among them, and then compare that to the variability among the volunteers.
We're also still playing somewhat with how we're providing feedback to you. I'd love to have the interface do something like, when you click "Done Working" it shows your score, then animates your markings moving to mine and color-codes them for how well you did or something like that. But doing that would take a lot of the programmers' time and they are buried in slightly more critical things than kewel animations that I dream up. That said -- we're in beta! Please feel free to tell us what YOU would like to see as a way that would make sense to you to compare the markings.
We're also still figuring out the exact way we're going to "score" everyone. What's currently on the website is one version of the algorithm, and I've been toying with tweaking it on my end and re-running the numbers. The idea is that we need to come up with an algorithm for comparison that gives us a numerical confidence relative to "experts" that also seems "fair." Kind of a, "You got a score of 50% on paper, but looking at what you actually did, do we think you deserved that 50%?" We're trying to get those to match, and it's a rather difficult problem and at the moment involves a fairly complicated equation that took me about 8 hours to program.
In it - and this is probably the most important part given your arrows - anything <20 pixels that does not match between the two sets (your markings and mine) is completely removed from the scoring. That way we get rid of all the "well, this was 18.1 pixels" versus "but I thought it was 17.9 pixels!" disputes. I also remove anything that doesn't match that's >250 pixels. But again, that's on my side at the moment and hasn't been uploaded to the server, because I want to get something finalized so that I'm not sending the programmers a new change every other day. And I'm going to have to wait for more data to come in before I really settle on anything.
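The pre-scoring filter described above can be sketched as follows. This is an illustration of the rule as stated (unmatched marks outside the 20-250 pixel range are dropped before scoring), not the actual algorithm code; the data shapes and function name are assumptions:

```python
# Sketch of the pre-scoring filter described above.
# "Matched" means a mark was paired with one in the other set
# (user vs. expert); matched marks are always kept, while unmatched
# marks are kept only if their diameter falls in the scored size range.

def filter_for_scoring(marks, matched_ids, d_min=20.0, d_max=250.0):
    """Drop unmatched marks whose diameters fall outside [d_min, d_max].

    marks: list of (mark_id, diameter_px) tuples.
    matched_ids: set of mark_ids that matched a mark in the other set.
    """
    kept = []
    for mark_id, diameter in marks:
        if mark_id in matched_ids or d_min <= diameter <= d_max:
            kept.append((mark_id, diameter))
    return kept

# The 18.1 px vs. 17.9 px disagreement is removed entirely, as is an
# unmatched 300 px mark; only the matched mark survives:
user_marks = [("a", 18.1), ("b", 40.0), ("c", 300.0)]
matched = {"b"}
print(filter_for_scoring(user_marks, matched))  # [('b', 40.0)]
```

Note that an unmatched mark inside the 20-250 px window is still kept, so it would still count against the score as a miss or a false positive; only the borderline-size disagreements are excused.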
And, as I said towards the beginning: It's completely possible that I've made a mistake! If you think I have, in addition to posting the first kind of screenshot you did, could you also click the "Image Link" below the "Done Working" button and paste the URL of the original image to the forum thread? That would help track it down on my end so I can take a look at my original markings.
Finally, and I left this to last for emphasis: I've been identifying craters for 5 years. There are others on the team who have been doing it far longer; one has been at it for 46 years (I had to reference one of his papers from 1967). We don't give you a score to penalize or frustrate you; rather, we want to provide some feedback on how "the pros" would mark the image so that we have a baseline. You've volunteered your free time to help us with this project, and we greatly appreciate it. We want to provide you with live feedback when possible, but we are still tweaking exactly how that feedback is generated and calibrated! Please continue to let us know how we can do this better.
2012-Feb-03, 06:47 PM
I was going to mention something about that also. First, let me say that all the hard work and time y'all put into this isn't going unnoticed ;) .
Having comparable marks would be a great tool for both the attaboys and the ah-shucks. With novice and expert marks overlaid, one could readily see where the disparity lies. I too have a few questions about Nth-degree marking. However, when Stu says he is working 80 hrs a week, I tend to take a wait-and-see stance for the time being.
I wonder if having a thread for "Rim Reckoning" with zoomed, annotated photos of craters and their rims would be helpful at this point, or whether it would just add too much to the workload for the time being.
With it all "coming out in the wash," I understand that it may not make much difference to the computer. However, to users it may make a huge difference in an individual's confidence to "get it in the ballpark."
Oops: what are the chances of getting a crayon on that image blow-up for annotation?
2012-Feb-03, 08:21 PM
placidstorm - Feb is not quite as crazy for me, and actually, I will be giving an invited talk on early Moon Mappers results on March 2. So this will be getting a fair amount of my attention (I think you were getting the 80+ hrs from what I said on my podcast - that may have been a slight exaggeration, but yeah, January has been crazy for me). I have been reading everything you've been writing on here so I am watching for general user feedback and will be compiling it with my recommendations/requests for the programming team.
2012-Feb-04, 01:59 PM
astrostu - Thanks for the great explanation. That's precisely what I was looking for and makes me feel better that I am actually providing the kind of data you guys want. As I said, I wasn't particularly worried about my score, I just didn't want to water down your results.
Keep up the good work you folks are doing!
Powered by vBulletin® Version 4.2.0 Copyright © 2013 vBulletin Solutions, Inc. All rights reserved.