The previous installment (Part 47, American Laboratory, Apr 2012) dealt with several regression practices that are risky in the world of calibration and recovery. One final matter, relative standard deviations (RSDs), was left until this article. Although this statistic is widely used for reporting uncertainty, the concept can involve risks, even when used outside the regression arena. Problems arise in both calculating and reporting an RSD; these are addressed below.
Much of the difficulty arises because RSDs have units of percent. In the world of measurement units, “per” is a dangerous word. Consider speeds, which in everyday usage are reported in miles-per-hour (mph). Suppose someone travels at 20 mph while going through a school zone, speeds up to 40 mph when able to return to posted speeds, and then increases to 70 mph after entering an interstate highway. The average speed cannot be calculated simply by adding the three speeds and dividing by three. The unit of mph is a “double unit,” so to speak, and must be broken apart before mathematical operations (e.g., averaging) can take place.
Thus, in this example, more information is needed: 1) the total distance driven in each segment or 2) the time spent at each speed. Having either of these two pieces of data will allow the user to break the mph unit into single units, which can be used in arithmetic operations. Once the total miles driven and the total time elapsed have been calculated, the average mph can be derived by dividing the former by the latter.
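A minimal Python sketch of this calculation follows. The article supplies only the three speeds, so the time spent at each speed is a hypothetical assumption, as is the function name `average_speed`; the point is only that the naive mean of the speeds differs from total miles divided by total hours.

```python
def average_speed(speeds_mph, hours):
    """Return total distance divided by total time for a multi-segment trip."""
    if len(speeds_mph) != len(hours):
        raise ValueError("need one duration per speed")
    # Break the "double unit" apart: miles for each segment = mph * hours.
    total_miles = sum(v * t for v, t in zip(speeds_mph, hours))
    total_hours = sum(hours)
    return total_miles / total_hours


speeds = [20.0, 40.0, 70.0]  # mph, from the example above
hours = [0.1, 0.4, 1.5]      # hypothetical time spent at each speed

naive = sum(speeds) / len(speeds)       # adding the speeds and dividing by three
correct = average_speed(speeds, hours)  # total miles / total hours

print(f"naive mean of the speeds:  {naive:.1f} mph")
print(f"total miles / total hours: {correct:.1f} mph")
```

If the distance driven in each segment were known instead, the time for each segment would first be obtained as distance divided by speed, and the same total-over-total division would then apply.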
A related problem can surface in the laboratory. If the analyst needs 50 mL of water and a balance is handy, he or she can pour out 50 g of the liquid and have the required volume. Most people do not think twice about performing such an operation. However, such a move “works” only because the density of water is 1 g/mL; 1 mL has a mass of 1 g, so either a balance or a volumetric container will give the same result.
However, if the liquid is concentrated phosphoric acid, all bets are off. Now the typically forgotten matter of density makes volume and mass unequal. In this case, the density is roughly 1.7 g/mL. Measuring out 50 g of the H3PO4 will result in only 29.4 mL! If a balance will be used to obtain a desired volume, or a graduated container to estimate a given mass, then the density unit must be “broken apart” to achieve correct results.
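The same “breaking apart” of the density unit can be written as a short Python sketch. The density of 1.7 g/mL is the rough figure quoted above, and the function names are illustrative only.

```python
def mass_to_volume(mass_g, density_g_per_ml):
    """Volume (mL) occupied by a given mass (g): divide by the density."""
    return mass_g / density_g_per_ml


def volume_to_mass(volume_ml, density_g_per_ml):
    """Mass (g) needed to obtain a given volume (mL): multiply by the density."""
    return volume_ml * density_g_per_ml


WATER = 1.0            # g/mL
PHOSPHORIC_ACID = 1.7  # g/mL, rough value for the concentrated acid

# Weighing out 50 g gives 50 mL only when the density is 1 g/mL.
print(f"50 g of water: {mass_to_volume(50.0, WATER):.1f} mL")
print(f"50 g of acid:  {mass_to_volume(50.0, PHOSPHORIC_ACID):.1f} mL")

# Going the other way: the mass to weigh out when 50 mL of acid is wanted.
print(f"50 mL of acid: {volume_to_mass(50.0, PHOSPHORIC_ACID):.1f} g")
```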
With RSDs, users typically do not attempt mathematical operations with these percentages. However, it is easy to forget that “percent” indicates that the reported number is a ratio of two other values. One way to avoid unwise conclusions is to keep asking and answering the question, “Percent of what?”
The take-home message is that seeing the word “per” in a unit should raise a red flag in the user’s mind! (Also, be aware that “per” may be hidden in the unit. Some commonplace units are abbreviations that actually contain “per”; examples are “ppmw” for parts per million by weight and “ppbv” for parts per billion by volume.)
Now it is time to focus on the statistical details of RSDs. A generally accepted definition of RSD is, “The standard deviation (s) of a set of data, divided by the mean (xavg) of the data set, expressed in units of percent.” Thus, the formula is:
RSD = (s ÷ xavg) × 100% (1)
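As an illustration of Eq. (1), the sketch below computes an RSD in Python. It assumes that s is the sample standard deviation (n − 1 in the denominator), which is what `statistics.stdev` returns; the ten balance readings are invented for the example.

```python
import statistics


def rsd_percent(values):
    """Relative standard deviation, Eq. (1): (s / mean) * 100%."""
    s = statistics.stdev(values)    # sample standard deviation (n - 1 denominator)
    mean = statistics.mean(values)
    return s / mean * 100.0


# Hypothetical replicate weighings of one object, in grams.
readings = [6.741, 6.739, 6.740, 6.742, 6.738,
            6.740, 6.741, 6.739, 6.740, 6.740]

print(f"n = {len(readings)}, mean = {statistics.mean(readings):.3f} g, "
      f"RSD = {rsd_percent(readings):.3f}%")
```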
General risks with RSDs
How do problems arise when calculating and reporting RSDs?
Consider first a simple scenario wherein an analyst weighs a solid object 10 times in a row. Once the data are available, all 10 numbers are used to calculate the standard deviation and the mean for the experiment’s results. The RSD is easily calculated via Eq. (1) and can be reported along with the mean (e.g., 6.74 g ± 0.013%).
Figure 1 - Plot showing the noisiness of standard deviations. Thousands of random, Normally distributed measurements were simulated, and subsets were chosen to compute the sample standard deviation, s. The spread of the s values decreases as more measurements are incorporated into each calculation. From left to right in the plot, the number of measurements per s calculation is 5, 10, 15, 30, and 150. Twenty sample standard deviations were calculated for each sample size studied.
The analyst is taking risks if he or she reports only the above information. First, the report does not indicate how many measurements went into obtaining the result. Standard deviations are very noisy numbers, as is shown in Figure 1.
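The noisiness of s can be explored with a small simulation in the spirit of Figure 1. This is only a sketch: it draws fresh Normal samples with an assumed true standard deviation of 1 rather than reproducing the authors' simulation, and the function name, seed, and settings are illustrative.

```python
import random
import statistics


def simulate_s(sample_size, n_repeats=20, true_sigma=1.0, seed=0):
    """Return n_repeats sample standard deviations, each from sample_size draws."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    return [
        statistics.stdev([rng.gauss(0.0, true_sigma) for _ in range(sample_size)])
        for _ in range(n_repeats)
    ]


# Same sample sizes as in Figure 1; twenty s values for each size.
for n in (5, 10, 15, 30, 150):
    s_values = simulate_s(n)
    print(f"n = {n:3d}: s ranges from {min(s_values):.2f} to {max(s_values):.2f}")
```

The printed range of s should narrow as the sample size grows, even though every sample comes from the same population.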
Second, RSDs can be misleading when xavg is a small number. In such instances, the RSD can explode, simply because a small (but perhaps acceptable) standard deviation is being compared with another small number. (Remember the “double-unit” nature of a percent!)
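A few lines of Python make the effect concrete. The standard deviation of 0.05 and the list of means are hypothetical; only the ratio in Eq. (1) is taken from the text.

```python
def rsd_from_summary(s, mean):
    """RSD in percent from an already-computed standard deviation and mean."""
    return s / mean * 100.0


s = 0.05  # hypothetical standard deviation, in the same units as the mean

# The same s looks negligible or alarming depending on the size of the mean.
for mean in (100.0, 10.0, 1.0, 0.1):
    print(f"mean = {mean:6.1f}, s = {s}, RSD = {rsd_from_summary(s, mean):.2f}%")
```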
A final risk is that the recipient may not realize what confidence level is involved. By definition, an RSD involves only one standard deviation, and users should be reminded that for one standard deviation, the confidence level associated with the reported result is only 68% at best. The confidence level is that high only in the limit as the sample size tends to infinity and the Student’s-t value approaches the z value.
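The dependence of this confidence level on sample size can be sketched numerically. The code below assumes that the confidence level is taken as the two-sided probability P(|T| ≤ 1) for a Student’s-t variable with n − 1 degrees of freedom, and it integrates the t density with Simpson's rule so that only the Python standard library is needed. The function names are illustrative.

```python
import math


def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(t * t / df))


def t_coverage(n, k=1.0, steps=2000):
    """P(|T| <= k) for n - 1 degrees of freedom, via Simpson's rule (steps even)."""
    df = n - 1
    h = k / steps
    total = t_pdf(0.0, df) + t_pdf(k, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 2 * total * h / 3  # the density is symmetric, so double the half-area


def normal_coverage(k=1.0):
    """P(|Z| <= k) for the standard Normal distribution."""
    return math.erf(k / math.sqrt(2))


for n in (3, 5, 10, 30, 150):
    print(f"n = {n:3d}: +/- 1 s covers {100 * t_coverage(n):.1f}%")
print(f"Normal limit: {100 * normal_coverage():.1f}%")
```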
RSD risks associated with reporting calibration and recovery data
Now think about the more complicated situation of a calibration or recovery study. One risk that appears can be summarized as follows. When determining either s or xavg, it is easy to lose track of (or simply to forget about) reference points; in other words, the “percent” problem arises again.
Why does this risk surface? In all likelihood, the study involved the replicate analysis of several standard solutions of various concentrations. Now, more options are available for calculating and reporting an RSD.
Using the strict definition expressed in Eq. (1), the analyst would use the entire data set of responses in the computation. However, doing so does not make sense, since a wildly high value will result for the standard deviation and a useless number will be generated for the mean. Instead, any such calculations should be performed only after the data have been grouped by concentration. As a consequence, multiple RSDs can be determined. Now the analyst must make choices on what to report, and must be careful to communicate all related information: 1) relative standard deviation, 2) mean concentration, 3) number of replicates, and (optionally) 4) confidence level. An example is, “The RSD at 10 ppb is 12%, based on 10 replicates, for a confidence level of about 66%.” This confidence level falls short of the nominal 68% because of the finite sample size; compare the probability within ±1 standard deviation from the Student’s-t table [with 9=(10-1) degrees of freedom] with that from the standard Normal table.
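Grouping by concentration before applying Eq. (1) might look like the following sketch. The data are invented, the function name is hypothetical, and the sample standard deviation (n − 1 denominator) is assumed; each group needs at least two replicates, or `statistics.stdev` raises an error.

```python
import statistics
from collections import defaultdict


def rsd_by_concentration(records):
    """Summarize (concentration, response) pairs as one RSD per concentration."""
    groups = defaultdict(list)
    for conc, response in records:
        groups[conc].append(response)

    summary = {}
    for conc, responses in sorted(groups.items()):
        mean = statistics.mean(responses)
        s = statistics.stdev(responses)  # sample standard deviation within the group
        summary[conc] = {
            "n": len(responses),
            "mean": mean,
            "rsd_percent": s / mean * 100.0,
        }
    return summary


# Hypothetical study: nominal concentration (ppb) paired with the measured value.
data = [(10, 9.1), (10, 10.4), (10, 11.2), (10, 9.8),
        (100, 98.7), (100, 101.5), (100, 100.2), (100, 99.1)]

for conc, row in rsd_by_concentration(data).items():
    print(f"{conc} ppb: n = {row['n']}, mean = {row['mean']:.2f}, "
          f"RSD = {row['rsd_percent']:.1f}%")
```

Each reported RSD then carries its own concentration and replicate count, which are the items the paragraph above says must accompany it.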
Even if all details are reported, this application of RSD must be made with caution! The temptation (and reality, typically) is to compute the standard deviation at only one concentration, and assume that this value applies to every other concentration. As has been demonstrated in numerous examples throughout this series of articles, standard-deviation behavior over a working concentration range cannot be taken for granted!
Figure 2 - A set of simulated data (same as used in Part 47 of this series), fitted with a quadratic model using WLS. In each tracing, the red plot invokes a shortcut formula for the weight; the shortcut formulas are: a) (1/s) and b) (1/s²). Also plotted (in green) is the case where the weight is based on the modeled standard deviation. In each set of curves, the identities from top to bottom are: upper prediction interval, regression line, and lower prediction interval.
RSD risks associated with weighted-least-squares fitting
As was mentioned at the end of Part 47, RSDs can be involved in a risky calibration practice. If weighted least squares (WLS) is appropriate and the analyst decides to perform regression using only the mean response for each concentration, then two shortcut formulas exist for the weight: a) 1/s and b) 1/s², labeled as in Figure 2.
Figure 2 shows that unrealistic prediction intervals can occur when either of these approaches is utilized. The data set involved is the same as the one used for the illustrations in Part 47. As before, the RSD-related lines are in red, and the modeled-weight tracings are in green.
Once again, the message is the same. In calibration and recovery studies, the recommended statistical approach is to model the standard deviation of the responses, test proposed models rigorously, and use prediction intervals to provide an overall estimate of measurement uncertainty. Such an estimate should be reported with sample results, either in concentration units or as a relative measurement uncertainty (abbreviated as RMU). The associated confidence level should also be communicated.
The next installment will apply the above regression discussion to the subject of detection limits, where some interesting twists surface when delving into measurement-uncertainty details.
David Coleman is an Applied Statistician, and Lynn Vanatta is an Analytical Chemist; e-mail: email@example.com.