Statistics in Analytical Chemistry: Part 25—Calibration Summary

In a modern chemical-analysis laboratory, virtually all of the testing equipment must be calibrated periodically. However, no single calibration procedure is universally applicable. Much of the problem arises because analytical instruments fall into one of two classes: those that immediately provide results in usable units and those that do not.

An example of the former class is a balance scale. The object for which mass measurement is desired is placed on the balance pan and the numerical value of the mass appears on the readout. The calibration of this type of equipment is quite straightforward: Place a mass (or masses) of known amount(s) on the balance, read the results, and adjust as necessary.

An example of the latter class is a chromatograph. When an analyte reaches the detector, the computer reports the signal in arbitrary units (e.g., peak area or peak height). Calibration in this instance is not as rapid as in the balance example. A more involved procedure (i.e., regression) is needed to convert the raw data into useful units (typically concentration).
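As a minimal sketch of what "converting raw data into useful units" means in the simplest case, the following fits a straight line to standards by ordinary least squares and then inverts it for a sample. The data values and the function names (`fit_line`, `response_to_concentration`) are hypothetical and chosen only for illustration; a real calibration may need a different model or weighted fitting, as discussed in Step 3.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = a + b*x; returns (a, b)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = y_bar - b * x_bar
    return a, b


def response_to_concentration(response, a, b):
    """Invert the straight-line calibration: concentration = (response - a) / b."""
    return (response - a) / b


# Hypothetical standards: true concentration vs. peak area (arbitrary units).
conc = [0.0, 1.0, 2.0, 3.0, 4.0]
area = [1.0, 3.0, 5.0, 7.0, 9.0]

a, b = fit_line(conc, area)
sample_conc = response_to_concentration(6.0, a, b)
print(f"intercept={a:.3f}, slope={b:.3f}, sample concentration={sample_conc:.3f}")
```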

Since the beginning of this American Laboratory column, the overriding theme has been the statistics behind the sound calibration of these “regression-based” instruments. This article summarizes the basics that have been presented. For definitions of statistical terms that are used, the reader is referred to Part 24 in this series (American Laboratory, Nov/Dec 2006).

Regression-based calibration should have two objectives: 1) to produce a curve that will transform sample data into concentration units (and do so without bias), and 2) to provide (at a user-chosen confidence level) a statistically sound estimate of the uncertainty in any reported concentration. To accomplish these goals, three main steps are involved. First, a calibration study must be designed. Second, the study must be performed carefully in the laboratory. Third, the data must be evaluated statistically using a set of calibration (or, more generally, regression) diagnostics. This third step will ensure that the resulting curve meets the two calibration objectives listed above.

Step 1: Design the calibration study

In designing the study, two decisions must be made: 1) the number of different concentrations of standard solutions that will be prepared, and 2) the number of replicates of each solution that will be analyzed. In every study, there must be sufficient numbers of both concentrations and replicates to allow for: 1) detection of curvature in the data, 2) modeling of response standard deviation (to see if this statistic trends with concentration), and 3) use of the calibration curve (to predict sample concentrations) without extrapolation at either end. Additionally, the design may need to be adjusted if the analyst is pursuing a low detection limit or high precision.

In the designing process, it is helpful to propose a model and a confidence level for the calibration curve. A rule-of-thumb starting place is a 5 × 5 design (i.e., five replicates of each of five concentrations, typically including blanks). However, the final plan must be based on the intended use of the calibration curve for sample predictions.
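To make the 5 × 5 starting point concrete, the sketch below lists the runs in such a design and counts the degrees of freedom it leaves for the later diagnostics. The concentration values and the helper name `design_summary` are hypothetical; the degree-of-freedom counts assume a straight-line model (two parameters) unless `n_model_params` is changed.

```python
def design_summary(concentrations, n_replicates, n_model_params=2):
    """Enumerate the runs of a calibration design and count degrees of freedom.

    pure_error_df  : replicate information available for estimating noise
    lack_of_fit_df : information left for detecting curvature after fitting
    """
    levels = sorted(set(concentrations))
    n_levels = len(levels)
    n_runs = n_levels * n_replicates
    runs = [(c, rep) for c in levels for rep in range(1, n_replicates + 1)]
    return {
        "runs": runs,
        "n_runs": n_runs,
        "pure_error_df": n_runs - n_levels,
        "lack_of_fit_df": n_levels - n_model_params,
    }


# Hypothetical 5 x 5 design: a blank plus four standards, five replicates each.
summary = design_summary([0.0, 2.5, 5.0, 7.5, 10.0], n_replicates=5)
print(summary["n_runs"], summary["pure_error_df"], summary["lack_of_fit_df"])
```

With only two concentrations, `lack_of_fit_df` would be zero for a straight line, which is one way to see why too few concentrations cannot reveal curvature.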

Step 2: Perform the calibration study in the laboratory

While this step does not directly involve the use of statistics, a few comments are in order. No matter how carefully the study has been designed, if it is not performed properly in the laboratory, the resulting data will be compromised. Standards should be prepared in the pure solvent that is appropriate for the instrument at hand (see below for comments on dealing with sample-matrix issues). If blanks are included in the study, they must be prepared appropriately for the analytical method being studied. If standard preparation is subject to such things as contamination, or if standards degrade rapidly, appropriate action must be incorporated into the lab work.

Step 3: Diagnose the calibration data statistically

This portion of the process involves seven basic parts:

  1. Plot response versus true concentration. Evaluate the overall shape of the scatterplot.
  2. Determine the behavior of the standard deviation of the responses. Plot the standard deviation versus concentration and fit with a straight line, using ordinary least squares as the fitting technique (the general equation for the line is: standard deviation = g + hx). If the p-value for the slope is significant (i.e., <0.01), then the standard deviation trends with concentration; in such cases, weighted least squares must be used to fit proposed curves to the calibration data themselves. The formula for the weight is:

    $(g + hx)^{-2} / \mathrm{Avg}\left[(g + hx)^{-2}\right]$

  3. Fit the proposed model and evaluate the adjusted R² (R²adj). Although R²adj is a weak statistical tool, the value should be close to 1.
  4. Examine the residuals for nonrandomness. The ideal is to have the zero line pass through the mean of each concentration’s residuals. In such a case, there will be a random scatter of the points about the zero line. If a distinct pattern (e.g., parabola or sine wave) exists, then the model probably is not adequate. Appearance of a “trumpet effect” indicates that the standard deviation of the responses may be trending with concentration.
  5. Evaluate the p-value for the slope (and any higher-order terms). For calibration data, the x-term should always be significant (i.e., the term's p-value should be <0.01). For higher-order models to be appropriate, the coefficients for the additional terms should be significant as well.
  6. Perform a lack-of-fit (LOF) test. If the p-value is <0.05, then the model is not adequate. The shape of the residual pattern should be used to help select an alternate model to be tested.
  7. Plot and evaluate the prediction interval. This important step will indicate the uncertainty in sample estimates that are derived from this curve. The width of the interval will depend on the noisiness of the data and the confidence level that has been chosen.
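Part 2 of the list above (modeling the standard deviation and deriving weights) can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the data are hypothetical, the function names are invented for this example, and the average in the weight formula is assumed to be taken over all calibration points. The code returns the t-statistic for the slope h; converting it to a p-value requires a t distribution with n - 2 degrees of freedom, which is left to a statistics package.

```python
import math
from collections import defaultdict
from statistics import stdev


def sd_by_concentration(conc, resp):
    """Group responses by concentration and return (levels, standard deviations)."""
    groups = defaultdict(list)
    for c, r in zip(conc, resp):
        groups[c].append(r)
    levels = sorted(groups)
    return levels, [stdev(groups[c]) for c in levels]


def fit_sd_line(levels, sds):
    """OLS fit of standard deviation = g + h*x; returns (g, h, t-statistic of h)."""
    n = len(levels)
    x_bar = sum(levels) / n
    s_bar = sum(sds) / n
    sxx = sum((x - x_bar) ** 2 for x in levels)
    h = sum((x - x_bar) * (s - s_bar) for x, s in zip(levels, sds)) / sxx
    g = s_bar - h * x_bar
    sse = sum((s - (g + h * x)) ** 2 for x, s in zip(levels, sds))
    se_h = math.sqrt(sse / (n - 2) / sxx)
    t_stat = h / se_h if se_h > 0 else math.inf
    return g, h, t_stat


def weights(conc, g, h):
    """Weight = (g + h*x)^-2 divided by the average of that quantity over all points."""
    if any(g + h * x <= 0 for x in conc):
        raise ValueError("modeled standard deviation must be positive")
    raw = [(g + h * x) ** -2 for x in conc]
    avg = sum(raw) / len(raw)
    return [r / avg for r in raw]


# Hypothetical data: four concentrations, three replicates each, noise growing with x.
conc = [1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4]
resp = [9.8, 10.0, 10.2, 19.65, 20.0, 20.35, 29.6, 30.0, 30.4, 39.45, 40.0, 40.55]

levels, sds = sd_by_concentration(conc, resp)
g, h, t_stat = fit_sd_line(levels, sds)
w = weights(conc, g, h)
print(f"g={g:.3f}, h={h:.3f}, t={t_stat:.2f}")
```

Because the weights are normalized by their average, they average to 1, and the low-concentration points (where the noise is smaller) receive the larger weights.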

The previous seven steps center on making two statistically sound choices: a model and a fitting technique. It must be emphasized that these selections are independent of each other. The model choice depends on the outcome of the lack-of-fit test (with help from the residual pattern). The fitting-technique choice depends on the behavior of the responses’ standard deviations (supported by the presence or absence of a trumpet effect in the residuals plot).
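Since the model choice rests on the lack-of-fit test, a short sketch of that test may help. The version below handles only the simplest case (a straight line fitted by unweighted ordinary least squares) and uses hypothetical data and a hypothetical function name. It splits the residual sum of squares into pure error (scatter of replicates about their own mean) and lack of fit (distance of each mean from the fitted line), then forms the F ratio. The p-value would come from an F distribution with the returned degrees of freedom and is left to a statistics package.

```python
from collections import defaultdict


def lack_of_fit_f(conc, resp):
    """Lack-of-fit F statistic for a straight line fitted by unweighted OLS.

    Returns (F, lack-of-fit df, pure-error df).
    """
    n = len(conc)
    x_bar = sum(conc) / n
    y_bar = sum(resp) / n
    sxx = sum((x - x_bar) ** 2 for x in conc)
    b = sum((x - x_bar) * (y - y_bar) for x, y in zip(conc, resp)) / sxx
    a = y_bar - b * x_bar

    groups = defaultdict(list)
    for x, y in zip(conc, resp):
        groups[x].append(y)

    ss_pure_error = 0.0
    ss_lack_of_fit = 0.0
    for x, ys in groups.items():
        mean_y = sum(ys) / len(ys)
        ss_pure_error += sum((y - mean_y) ** 2 for y in ys)
        ss_lack_of_fit += len(ys) * (mean_y - (a + b * x)) ** 2

    df_lof = len(groups) - 2       # distinct concentrations minus model parameters
    df_pe = n - len(groups)        # total points minus distinct concentrations
    f_stat = (ss_lack_of_fit / df_lof) / (ss_pure_error / df_pe)
    return f_stat, df_lof, df_pe


# Hypothetical curved data (response roughly x squared), two replicates per level.
conc = [0, 0, 1, 1, 2, 2, 3, 3]
resp = [-0.1, 0.1, 0.9, 1.1, 3.9, 4.1, 8.9, 9.1]

f_stat, df_lof, df_pe = lack_of_fit_f(conc, resp)
print(f"F={f_stat:.1f} on ({df_lof}, {df_pe}) degrees of freedom")
```

A large F ratio, as in this deliberately curved example, signals that the straight line misses the group means by far more than replicate noise can explain.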

Occasionally, no model that is tested is adequate, or a model that is adequate is not easily inverted for use in estimating sample concentrations. When a less-than-adequate model is selected for the calibration curve, the width of the uncertainty interval must be adjusted to account for the bias that exists. The procedure for this correction is found in Part 16 (American Laboratory, May 2005).

If the sample analytes are in a matrix other than pure solvent, and recovery problems are known or suspected, then a recovery study should be conducted postcalibration. Typically, the calibration design can be used for this study, too. Recovered concentrations are calculated via the pure-solvent calibration curve and plotted versus true concentration, thereby generating a second graph. These data are also modeled and diagnosed using the seven steps above. If the recovery is unacceptably low or high, the equation for the model can be used to correct the recovered concentrations to true values. The associated prediction interval gives the overall uncertainty (at the chosen confidence level) for the method, since this interval includes the uncertainty for both the calibration and the spiking processes.
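The prediction interval mentioned above can be illustrated for the simplest case: a straight line fitted by unweighted ordinary least squares. The sketch below computes the interval for a single future observation at a chosen x value. The data, the function name, and the `t_crit` parameter are assumptions made for this example; the caller supplies the critical t value for the chosen confidence level and n - 2 degrees of freedom (3.182 for 95% with 3 degrees of freedom). Weighted fits, higher-order models, and inversion of the interval into concentration units are not shown.

```python
import math


def prediction_interval(x, y, x0, t_crit):
    """Prediction interval for one new observation at x0 from a straight-line OLS fit.

    Returns (lower, predicted, upper).
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    a = y_bar - b * x_bar
    sse = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    s = math.sqrt(sse / (n - 2))  # residual standard deviation
    half_width = t_crit * s * math.sqrt(1 + 1 / n + (x0 - x_bar) ** 2 / sxx)
    y0 = a + b * x0
    return y0 - half_width, y0, y0 + half_width


# Hypothetical recovery data: true vs. recovered concentration.
true_conc = [1.0, 2.0, 3.0, 4.0, 5.0]
recovered = [0.9, 2.1, 2.9, 4.1, 4.9]

T_95_DF3 = 3.182  # two-sided 95% critical t value for 3 degrees of freedom
low_mid, pred_mid, high_mid = prediction_interval(true_conc, recovered, 3.0, T_95_DF3)
low_end, pred_end, high_end = prediction_interval(true_conc, recovered, 5.0, T_95_DF3)
print(f"at x=3: {low_mid:.3f} to {high_mid:.3f}; at x=5: {low_end:.3f} to {high_end:.3f}")
```

The interval is narrowest at the mean of the x values and widens toward the ends of the range, and it grows with both the noisiness of the data and the chosen confidence level.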

As a final summary, the following is presented in hopes that it can serve as a useful reference for readers. The authors would more than welcome feedback on this summary article, and especially on the reference box.

Mr. Coleman is an Applied Statistician, Alcoa Technical Center, MST-C, 100 Technical Dr., Alcoa Center, PA 15069, U.S.A.; e-mail: [email protected]. Ms. Vanatta is an Analytical Chemist, Air Liquide-Balazs™ Analytical Services, Box 650311, MS 301, Dallas, TX 75265, U.S.A.; tel.: 972-995-7541; fax: 972-995-3204; e-mail: [email protected].