Questions on QC during Lab Week 2023
In celebration of Lab Week 2023, we share some of the questions and answers from recent sessions, and try to get to the question under the question...
Questions (and Answers!) from Lab Week 2023
April 2023
Sten Westgard, MS
Lab Week, that once-a-year celebration of diagnostic laboratory professionals, has arrived. Just as there isn't enough recognition, enough praise, or enough remuneration given to laboratories, very often there aren't enough answers. Questions crop up again and again, particularly in QC, and this week we share them, along with the best answers we can provide.
Q: We run 3 levels of QC, and our protocol is that if 2 of the 3 levels are in range, we proceed with testing samples. If that single level failed because it was < or > 3 SD, is it still ok to validate results?
So this question is about the 1:3s rule with N=3: control limits set at 3 SD, using 3 levels of control per run. When does this 1:3s rule "really" apply? In asking the question, the questioner implies that perhaps a 2of3:3s or 3of3:3s rule would be more desirable.
First off, there are no such rules "on the books" of QC. Certainly it is possible to model them: build power curves out of error simulations, determine the false rejection rates (likely very low) and the error detection rates (probably quite, quite low), and then decide whether or not a 2of3:3s or 3of3:3s rule is appropriate for the particular method in question.
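To make that idea concrete, here is a minimal sketch of such an error simulation in Python. The function and rule names are ours, purely for illustration, not part of any established QC package: it draws Gaussian control values, applies the standard 1:3s rule and a hypothetical 2of3:3s rule, and estimates both false rejection and detection of a 2 SD systematic shift.

```python
import numpy as np

rng = np.random.default_rng(42)

def simulate_rejection_rate(rule, shift_sd=0.0, n_levels=3, n_runs=100_000):
    """Estimate how often a QC rule rejects a run.

    shift_sd=0 gives the false rejection rate; a nonzero shift (a systematic
    error expressed in SD units) gives the error detection rate for that shift.
    """
    # One result per control level per run, expressed in SD units from the mean.
    z = rng.normal(loc=shift_sd, scale=1.0, size=(n_runs, n_levels))
    return rule(z).mean()

# The standard 1:3s rule: reject if ANY level exceeds +/- 3 SD.
def rule_1_3s(z):
    return (np.abs(z) > 3).any(axis=1)

# The hypothetical "2of3:3s" rule from the question: reject only if
# at least 2 of the 3 levels exceed +/- 3 SD.
def rule_2of3_3s(z):
    return (np.abs(z) > 3).sum(axis=1) >= 2

for name, rule in [("1:3s", rule_1_3s), ("2of3:3s", rule_2of3_3s)]:
    pfr = simulate_rejection_rate(rule)                 # no error present
    ped = simulate_rejection_rate(rule, shift_sd=2.0)   # 2 SD systematic shift
    print(f"{name:>8}: false rejection ~{pfr:.4f}, detection of 2 SD shift ~{ped:.4f}")
```

Run something like this and you should see what the answer above predicts: the 2of3:3s variant almost never rejects falsely, but it also misses most real shifts.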
The shorter answer is a bit zen: If you are using the 1:3s N=3 rule, when it is violated, it has been violated.
But let's go deeper and ask a more fundamental question: does your method need the 1:3s N=3 rule?
In order to answer this question, we need additional information about the method - most importantly, the performance specification, something like a CLIA 2024 goal or another quality requirement. That gives us the context for the use of the test results. But that's not the full answer. We also need to know the observed imprecision and observed bias of the method - performance monitoring the laboratory is already required to do. With those measurements, we can calculate a Sigma metric, and the resulting metric helps determine how many rules are necessary to implement (spoiler: high Sigma metrics need fewer rules, fewer levels, and can use wider limits, while lower Sigma metrics need more Westgard Rules, more levels, tighter limits, and may even need more frequent QC runs).
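For reference, the arithmetic behind that calculation is simple: Sigma = (allowable total error - |bias|) / CV, with everything expressed in percent. A minimal sketch in Python, using made-up numbers rather than any real method's performance:

```python
def sigma_metric(tea_pct, bias_pct, cv_pct):
    """Sigma metric = (allowable total error - |bias|) / CV, all in percent."""
    return (tea_pct - abs(bias_pct)) / cv_pct

# Illustrative numbers only (not from any real method):
# 10% allowable total error, 1% observed bias, 1.5% observed CV.
print(sigma_metric(tea_pct=10.0, bias_pct=1.0, cv_pct=1.5))  # -> 6.0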
An examination of any version of the Westgard Sigma Rules will show that even the highest Sigma (6) requires the 1:3s rule. So unless the method in question has an extremely high Sigma metric (a Sigma of 10, 20, or higher?), the 1:3s rule is necessary, and any violation of it at any level should be treated as a real out-of-control event.
Furthermore, even if an extremely high Sigma metric method only needed something like a 1:4s rule, it is highly unlikely that current laboratory software can implement and display that rule.
To sum up: No, it's not okay to validate those results.
Q: If I have one level or multiple levels running below the mean for 10 consecutive days, some within 1 SD and some outside 1 SD, but our lab data is comparable with my peer data, should I readjust my mean and ranges? How many data points should I have prior to adjustments?
The zen answer: if you are using the 10:x rule, when it is violated, it has been violated. Also, if you are using the 4:1s (or 3:1s) rule, when it is violated, it has been violated.
Notice that there is no 10:1s rule "on the books." You can build one if you like, but without knowing its power curve, you cannot tell whether it will catch a real error reliably, nor can you be sure it won't generate a lot of false alarms. In any case, the 10:x and/or 4:1s rule will be triggered earlier than a 10:1s rule.
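To show why, here is a small sketch (hypothetical helper functions in Python, not from any QC package) of how the 10:x rule and a home-made "10:1s" rule would judge the scenario in the question: ten consecutive results on the same side of the mean, only some of them beyond 1 SD.

```python
def violates_10x(values, mean):
    """10:x rule: the last 10 consecutive results fall on the same side of the mean."""
    last = values[-10:]
    if len(last) < 10:
        return False
    return all(v > mean for v in last) or all(v < mean for v in last)

def violates_10_1s(values, mean, sd):
    """Home-made '10:1s' rule: the last 10 consecutive results all exceed the
    same 1 SD limit. Not an established rule; shown only for comparison."""
    last = values[-10:]
    if len(last) < 10:
        return False
    return all(v > mean + sd for v in last) or all(v < mean - sd for v in last)

# Illustrative data: ten results all below the mean, only some beyond -1 SD.
qc = [9.6, 9.8, 9.7, 9.5, 9.9, 9.4, 9.8, 9.6, 9.7, 9.5]
print(violates_10x(qc, mean=10.0))            # True  -> 10:x flags the run
print(violates_10_1s(qc, mean=10.0, sd=0.5))  # False -> the stricter rule stays silent
```

The established 10:x rule flags the run while the home-made rule stays silent, which is exactly why the established rule triggers earlier.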
The deeper question and answer: does this method need the 10:x or the 4:1s rule? Again, Sigma-metrics will determine that.
Even if the lab data is comparable to the peer data, that doesn't invalidate an out-of-control event. We could imagine a scenario where that rule violation leads to trouble-shooting, which uncovers a problem of drift or trending that comes not from the reagent but perhaps from the control material (in other words, a matrix issue, not something reflecting a true issue with patient samples). In that specific scenario, the peer group data is experiencing the same drift/trend because it's the same matrix issue across many laboratories.
Let's change the script: if the local QC data is showing a shift or trend, AND the peer group data is showing a bias in the same direction, that's more evidence that there's a problem with the lab's mean. That's when re-establishing the mean becomes a rational possibility. But please note, changing the SD (and thus the range) is still not okay. A peer group SD will be too wide and won't reflect the real variation of the laboratory. Any QC based on an inflated SD will degrade detection of any real error and give a false sense of security to the laboratory.
These questions spring from an ever-present wound in the laboratory: a lack of fundamental knowledge about QC - why QC is run, and what QC needs to be run. We cannot fault a laboratory for being faced with these challenges and questions. All we hope is that our answers provide not only a short-term answer but also a long-term solution.
As part of your Lab Week celebrations, please send in any questions you have. For over 25 years, we've been trying to give you the best answers.