Critical Appraisal of Diagnostic Test Studies - Results
Introduction
After you have established the validity of a diagnostic test study, you next examine the RESULTS of the study, which are usually expressed as sensitivity, specificity, predictive values and likelihood ratios.
When confidence intervals are given around these statistics (such as sensitivity 85% [95% CI 79 to 93%]), they are ONLY an estimate of the precision of the statistic. Because these statistics do not describe a difference or ratio between groups, the intervals are not used to determine significance (so whether the interval crosses 1 or zero is not applicable here).
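To make the idea of precision concrete, here is a minimal sketch of one common way to put a 95% confidence interval around a sensitivity. The Wilson score method and the counts (85 positives among 100 diseased patients) are assumptions chosen for illustration; a given study may use a different method and will have its own numbers.

```python
import math


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a proportion (z=1.96 gives roughly 95%)."""
    p = successes / total
    denom = 1 + z**2 / total
    center = (p + z**2 / (2 * total)) / denom
    half_width = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return center - half_width, center + half_width


# Hypothetical counts: 85 of 100 patients with the disease test positive.
low, high = wilson_interval(85, 100)
print(f"sensitivity 85% [95% CI {low:.0%} to {high:.0%}]")
```

A larger study with the same sensitivity would give a narrower interval, which is all the interval is telling you here.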
More thoughts about sensitivity and specificity.
Diagnostic Test Statistics
We're going to break down diagnostic testing and uncover the relationships between sensitivity/specificity, likelihood ratios and pre- and post-test probabilities.
Hopefully, by the end, I will convince you that the likelihood ratio is the best choice of statistic for understanding the capabilities of a diagnostic test.
Let's start with the calculations themselves - as a refresher...
If we have the results of a diagnostic test laid out in a table like this:
Using the algebraic notation (a, b, c, d), our equations look like this:
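The table and equations are not reproduced in this text, so what follows is a sketch that assumes the conventional 2x2 layout: a = true positives, b = false positives, c = false negatives, d = true negatives. If the original table assigned the letters differently, the formulas would need to be relabelled to match.

```latex
\[
\begin{array}{l|cc}
 & \text{Disease present} & \text{Disease absent} \\ \hline
\text{Test positive} & a & b \\
\text{Test negative} & c & d
\end{array}
\]

\[
\begin{aligned}
\text{Sensitivity} &= \frac{a}{a+c} &
\text{Specificity} &= \frac{d}{b+d} \\[4pt]
\text{Positive predictive value} &= \frac{a}{a+b} &
\text{Negative predictive value} &= \frac{d}{c+d} \\[4pt]
LR^{+} &= \frac{\text{Sensitivity}}{1-\text{Specificity}} = \frac{a/(a+c)}{b/(b+d)} &
LR^{-} &= \frac{1-\text{Sensitivity}}{\text{Specificity}} = \frac{c/(a+c)}{d/(b+d)}
\end{aligned}
\]
```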
Shout out to http://atomurl.net/math/ for the TeX equation editor plugin for Chrome...awesome!
Diagnostic Decision-Making
Another concept to know well is pre-test and post-test probability: the probability of the disease before and after you do the test you are considering.
For a test to be useful, its result must change the probability of disease, making it more or less likely depending on the result.
Frequently, we can use the overall prevalence of a disease in a population as the baseline pre-test probability, but we can also use the results of clinical decision rules and even "guesstimates" as pre-test probabilities to help us understand if the test we're considering is worth doing.
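The step from pre-test to post-test probability can be sketched with a likelihood ratio: convert the probability to odds, multiply by the likelihood ratio, and convert back. The function names and the example likelihood ratio of 5 below are illustrative assumptions, not values from a particular test.

```python
def probability_to_odds(probability):
    """Convert a probability (0 to 1, exclusive of 1) to odds."""
    return probability / (1 - probability)


def odds_to_probability(odds):
    """Convert odds back to a probability."""
    return odds / (1 + odds)


def post_test_probability(pre_test, likelihood_ratio):
    """Post-test odds = pre-test odds x likelihood ratio."""
    post_odds = probability_to_odds(pre_test) * likelihood_ratio
    return odds_to_probability(post_odds)


# Hypothetical example: pre-test probability 20%, positive likelihood ratio 5.
print(f"{post_test_probability(0.20, 5.0):.2f}")
```

A likelihood ratio of 1 leaves the probability unchanged, which is one way to see that such a test is not worth doing.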
Now, let's define some levels of likelihood here using the "guesstimate" method.
It's always possible to quibble with the meaning of numbers in these cases, but we'll use these same numbers throughout so that you can see the relationships between the different statistics.
Very Unlikely - 5% or 0.05
Unlikely - 20% or 0.20
Even chance - 50% or 0.50
Likely - 80% or 0.80
Very Likely - 95% or 0.95
Now we're ready to do some (basic) math...
Say that you're seeing a patient for a sore throat, and you have available to you a "rapid strep test" - a test that gives you an answer right there (rather than waiting a day for the culture result), so that you can determine the next best step for the patient:
reassurance that the sore throat is viral and should resolve on its own or
diagnosis of streptococcal pharyngitis, which requires treatment with antibiotics.
I've compiled a few tables below so that you can see the effects of varying prevalence (or "pre-test probability") given consistent test characteristics, and the effects of changes in sensitivity and specificity given a consistent pre-test probability.
Accurate diagnosis depends on a combination of the test accuracy and the likelihood of the disease before you test.
If you're very unlikely to have the disease (5% pre-test probability), the chance you have it AFTER a positive test is still less than 1 in 4.
If you're very likely to have the disease (95% pre-test probability), a negative test only reduces the chance that you have the disease to 77%.
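The two figures quoted above can be reproduced with a short calculation. This sketch assumes a sensitivity and specificity of 0.85 each; those values are not stated in the text and are used only because they are consistent with the quoted results. The function names are illustrative.

```python
def probability_after_positive(pre_test, sensitivity, specificity):
    """Probability of disease given a positive test (Bayes' theorem)."""
    true_pos = pre_test * sensitivity
    false_pos = (1 - pre_test) * (1 - specificity)
    return true_pos / (true_pos + false_pos)


def probability_after_negative(pre_test, sensitivity, specificity):
    """Probability of disease given a negative test (Bayes' theorem)."""
    false_neg = pre_test * (1 - sensitivity)
    true_neg = (1 - pre_test) * specificity
    return false_neg / (false_neg + true_neg)


# Assumed test characteristics (not given in the text).
SENSITIVITY = 0.85
SPECIFICITY = 0.85

# The five "guesstimate" levels of pre-test probability.
for pre_test in (0.05, 0.20, 0.50, 0.80, 0.95):
    after_pos = probability_after_positive(pre_test, SENSITIVITY, SPECIFICITY)
    after_neg = probability_after_negative(pre_test, SENSITIVITY, SPECIFICITY)
    print(f"pre-test {pre_test:.2f}: positive -> {after_pos:.2f}, negative -> {after_neg:.2f}")
```

Under these assumptions, a pre-test probability of 0.05 gives about 0.23 after a positive test, and a pre-test probability of 0.95 gives about 0.77 after a negative test.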
Having a more sensitive test doesn't do much to make a positive test more accurate, but it does help increase the accuracy of a negative test.
Varying the specificity helps much more at confirming a positive test result, but doesn't change the accuracy of a negative result much.
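These two observations can be checked with a small sketch that holds the pre-test probability at 0.50 and varies one characteristic at a time. The specific values (0.70, 0.85, 0.99) are assumptions chosen for illustration, and `post_test_probability` is a hypothetical helper, not something from the original tables.

```python
def post_test_probability(pre_test, sensitivity, specificity, positive):
    """Probability of disease after a positive or negative result."""
    if positive:
        with_disease = pre_test * sensitivity
        without_disease = (1 - pre_test) * (1 - specificity)
    else:
        with_disease = pre_test * (1 - sensitivity)
        without_disease = (1 - pre_test) * specificity
    return with_disease / (with_disease + without_disease)


PRE_TEST = 0.50  # "even chance"

print("Varying sensitivity (specificity fixed at 0.85)")
for sensitivity in (0.70, 0.85, 0.99):
    pos = post_test_probability(PRE_TEST, sensitivity, 0.85, positive=True)
    neg = post_test_probability(PRE_TEST, sensitivity, 0.85, positive=False)
    print(f"  sens {sensitivity:.2f}: positive -> {pos:.2f}, negative -> {neg:.2f}")

print("Varying specificity (sensitivity fixed at 0.85)")
for specificity in (0.70, 0.85, 0.99):
    pos = post_test_probability(PRE_TEST, 0.85, specificity, positive=True)
    neg = post_test_probability(PRE_TEST, 0.85, specificity, positive=False)
    print(f"  spec {specificity:.2f}: positive -> {pos:.2f}, negative -> {neg:.2f}")
```

With these numbers, raising sensitivity mostly lowers the probability of disease after a negative result, while raising specificity mostly raises the probability of disease after a positive result.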
SpPIn and SnNOut
The data above help confirm the mnemonics SpPIn and SnNOut:
SpPIn - in a test with a high Specificity, a Positive test rules In the diagnosis
SnNOut - in a test with a high Sensitivity, a Negative test rules Out the diagnosis
These can still mislead you when the pre-test probabilities are very high or very low, but they give us a general sense of how these statistics work.
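As a rough illustration of how the mnemonics can mislead at the extremes, the sketch below uses assumed test characteristics (a "high" value of 0.95 for the statistic of interest and 0.85 for the other) with the "very unlikely" and "very likely" pre-test probabilities defined earlier. The numbers and names are hypothetical.

```python
def post_test_probability(pre_test, sensitivity, specificity, positive):
    """Probability of disease after a positive or negative result."""
    if positive:
        with_disease = pre_test * sensitivity
        without_disease = (1 - pre_test) * (1 - specificity)
    else:
        with_disease = pre_test * (1 - sensitivity)
        without_disease = (1 - pre_test) * specificity
    return with_disease / (with_disease + without_disease)


# SpPIn at a very low pre-test probability: specificity 0.95, positive result.
spin = post_test_probability(0.05, sensitivity=0.85, specificity=0.95, positive=True)

# SnNOut at a very high pre-test probability: sensitivity 0.95, negative result.
snout = post_test_probability(0.95, sensitivity=0.95, specificity=0.85, positive=False)

print(f"Disease probability after a positive test: {spin:.2f}")
print(f"Disease probability after a negative test: {snout:.2f}")
```

In both assumed cases the post-test probability lands near 50%, so the positive result has not ruled the diagnosis in and the negative result has not ruled it out.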