BAAM Science Lesson:  Frequentists vs. Bayesians

Why is the Bayesian so confident?  Because Bayesian analysis measures the accuracy of the screening test against the actual rate of the event rather than against pure chance.  This means you are getting a real measure of the usefulness of the test, not a theoretical analysis of its effectiveness in a purely chance world.  (Editorial note: Chance is rarely a factor in real things.)

In the instance below, the machine lies just 1/32nd of the time. But, the probability that the sun has just exploded is effectively zero.  This means, as a practical matter, every report of the sun exploding is going to be a "false positive."

Let's make it easier to see the math.  Imagine the sun actually explodes 1% of the time.  But, the machine still lies to us 1/32nd of the time.  That means out of every 100 measurements, the machine will report about one real explosion and about three false ones.  That is, even if the sun explodes 1% of the time, the machine's explosion reports are still wrong 75% of the time. (Four reports, only one correct.)
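The counts in this paragraph can be checked with a few lines of Python. This is only a sketch: the function name expected_reports is made up for illustration, and it assumes the machine lies at the same 1/32 rate whether or not the sun has exploded, a detail the rounded figures above ignore.

```python
from fractions import Fraction

def expected_reports(base_rate, lie_rate, n=100):
    """Expected true and false 'explosion' reports in n measurements.

    Hypothetical helper: assumes the machine lies at the same rate
    whether or not the sun has actually exploded.
    """
    true_reports = n * base_rate * (1 - lie_rate)   # real explosion, honest machine
    false_reports = n * (1 - base_rate) * lie_rate  # no explosion, lying machine
    return true_reports, false_reports

true_reports, false_reports = expected_reports(Fraction(1, 100), Fraction(1, 32))
share_false = false_reports / (true_reports + false_reports)

print(f"true reports per 100:  {float(true_reports):.2f}")
print(f"false reports per 100: {float(false_reports):.2f}")
print(f"share of reports that are false: {float(share_false):.1%}")
```

Without rounding, the share of false reports is 99/130, or a little over 76%, which agrees with the "about 75%" reached by counting one real report and three false ones.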

Let's consider a real situation, drug-sniffing dogs. In this case, we'll imagine a pretty well-trained dog. It finds the drugs 95% of the time. It also has just a 10% false positive rate--meaning it alerts only 10% of the time when there are no drugs. How accurate is this dog, really?  In the real world, not very*.   (Technically, we say this dog has 0.95 sensitivity and 0.90 specificity.)

We will assume that 2% of randomly selected cars have contraband drugs.  Thus, out of 100 cars, our dog will probably correctly alert on both of the cars that have drugs. But, with a 10% false positive rate, it will also implicate about 10 people who do not have drugs.  Only 2 out of 12 (16.7%) of the detections are accurate.

Notice that we need to know how often the event actually occurs to know how accurate the test really is.
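To see how strongly the answer depends on the base rate, here is a short sketch that applies Bayes' rule to the dog described above (sensitivity 0.95, specificity 0.90). The function name positive_predictive_value is an illustrative choice, and the base rates other than 2% are hypothetical values, not figures from any study.

```python
def positive_predictive_value(sensitivity, specificity, base_rate):
    """Probability that an alert is correct, by Bayes' rule."""
    true_alerts = sensitivity * base_rate
    false_alerts = (1 - specificity) * (1 - base_rate)
    return true_alerts / (true_alerts + false_alerts)

# Same dog, different (hypothetical) rates of cars carrying drugs.
for base_rate in (0.02, 0.10, 0.50):
    ppv = positive_predictive_value(0.95, 0.90, base_rate)
    print(f"base rate {base_rate:.0%}: {ppv:.1%} of alerts are correct")
```

At a 2% base rate the unrounded answer is about 16.2%, slightly below the 16.7% obtained from the rounded counts of 2 and 10. The same dog looks far better only if drugs are far more common among the cars it sniffs.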

Let's make our dog even better.  Imagine it always alerts when there are drugs.  And, with lots of training, we reduce the false positive rate by a factor of 10.  Now it has a sensitivity of 1.00 and a specificity of 0.99. Out of 100 randomly stopped cars, the dog will catch the two with drugs and implicate just one person who does not have them. That means even using an essentially perfect dog, 1/3 of our reports are false.
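One way to check this figure without any formulas is to simulate a large number of traffic stops. The sketch below assumes the numbers given above (2% of cars carry drugs, the dog never misses, and it falsely alerts on 1% of clean cars); the function name, the number of simulated cars, and the random seed are arbitrary choices.

```python
import random

def simulate_alerts(n_cars, base_rate, sensitivity, false_positive_rate, seed=0):
    """Count correct and wrong alerts over n_cars simulated stops."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    correct = wrong = 0
    for _ in range(n_cars):
        has_drugs = rng.random() < base_rate
        if has_drugs:
            alerted = rng.random() < sensitivity
        else:
            alerted = rng.random() < false_positive_rate
        if alerted and has_drugs:
            correct += 1
        elif alerted:
            wrong += 1
    return correct, wrong

correct, wrong = simulate_alerts(200_000, 0.02, 1.00, 0.01)
share_wrong = wrong / (correct + wrong)
print(f"correct alerts: {correct}, false alerts: {wrong}")
print(f"share of alerts that are false: {share_wrong:.1%}")
```

With this many simulated cars, the share of false alerts should land close to one third, matching the count of two correct alerts and one false alert per 100 cars.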

Things may be even worse than that.  Research by Lisa Lit and her colleagues suggests that handler cueing of alerts is very common. That is, the dogs can be responding more to the handler's signals about the drugs than to the drugs themselves.  In her experiment, in which empty sample envelopes were marked so the handlers would notice, handler cueing accounted for the majority of alerts.

Now, apply this to autism.  Someone tells you they have a test that can detect autism in children as young as 6 months of age.  They tell you it has a sensitivity of 0.99 and a specificity of 0.99.  Pretty good?  If autism actually occurs in 1% of the population, out of every 100 children the test will correctly diagnose one child and incorrectly diagnose another.  Fully half of the children the test identifies will not have autism.  If a treatment is done using this sample, it will appear to be 50% effective--even if it actually does nothing.
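The 50% figure follows from the same arithmetic. The sketch below works through it for a hypothetical group of 10,000 screened children (a size chosen only to give whole numbers), and it assumes that a child who never had autism is later counted as a treatment "success."

```python
from fractions import Fraction

screened = 10_000                 # hypothetical number of children screened
base_rate = Fraction(1, 100)      # autism occurs in 1% of the population
sensitivity = Fraction(99, 100)
specificity = Fraction(99, 100)

with_autism = screened * base_rate        # 100 children
without_autism = screened - with_autism   # 9,900 children

true_positives = with_autism * sensitivity             # correctly identified
false_positives = without_autism * (1 - specificity)   # incorrectly identified

treated = true_positives + false_positives
# If the treatment does nothing, only the false positives will later
# appear free of autism, so they are the apparent "successes."
apparent_success_rate = false_positives / treated

print(f"treated: {treated}, of whom {false_positives} never had autism")
print(f"apparent effectiveness of a useless treatment: {float(apparent_success_rate):.0%}")
```

Under these assumptions the test flags 99 children who have autism and 99 who do not, so exactly half of the treated group would look "cured" no matter what the treatment does.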

*In the case of Florida v. Harris, the United States Supreme Court behaved as frequentists, unanimously upholding the use of dogs to provide probable cause based on certifications in controlled tests, rather than based on a "mechanistic" analysis of actual accuracy as the Florida Supreme Court had.

References:

Lit, L., Schweitzer, J. B., & Oberbauer, A. M. (2011). Handler beliefs affect scent detection dog outcomes. Animal Cognition, 14, 387-394. DOI 10.1007/s10071-010-0373-2

Myers, R. E. (2006). Detector dogs and probable cause. George Mason Law Review, 14(1), 1-36.

 

Behavior Analysis Association of Michigan, Department of Psychology, Eastern Michigan University, Ypsilanti, MI 48197