Statistics in Analytical Chemistry: A new American Laboratory column

Welcome to “Statistics in Analytical Chemistry.” The column will appear every other month and will involve audience participation. We will start with a series of topics, but welcome your comments, questions, and suggestions. How we branch off will depend in large part on you, dear reader.

Why are we starting this column? We will begin by introducing ourselves and explaining how our collaborative efforts have evolved. David Coleman is an applied statistician with Alcoa Technical Center in Alcoa Center, PA. He works with diverse groups of people, including analytical chemists, helping them design experiments (R&D and plant-based) and then analyze the data. Measurement-capability studies are a specialty area. Lynn Vanatta is an analytical chemist with Air Liquide-Balazs™ Analytical Services in Dallas, TX. She does research on ion-chromatographic methods to test semiconductor-grade chemicals for purity and assay. As a result of her endeavors, she is constantly dealing with calibration, detection, and quantitation issues. In 1995, Lynn attended a PITTCON® short course taught by David. The class addressed the very subjects that had concerned Lynn for years. It was clear to her that the material contained the techniques needed for statistically sound data analysis. However, she had questions about how to apply these approaches to real data.

Fortunately, David had encouraged people to contact him with questions and comments. In early 1996, Lynn took him up on his offer, since Air Liquide-Balazs had asked her to calculate detection limits for determining anions in ultrapure water. This project marked the beginning of their collaborative efforts. Lynn sat in on the PITTCON class again in 1996, after making some suggestions for revising the course. By 1997, Lynn’s detection-limit project was complete, and she had learned how to analyze her own data. Beginning that year, she conducted a one-hour computer demonstration during the class, proving that you do not need a Ph.D. in statistics to be able to use the course material (and the material in this column). The demonstration remained part of the course until 2001. By then, requests for hands-on computer work had grown to the point that PITTCON agreed to offer a separate class for such instruction. David now teaches the “theory” on the first day, and Lynn conducts the workshop on the second. In addition, Lynn has developed a related course on her own and teaches it each year at the International Ion Chromatography Symposium.

Meanwhile, Lynn continued (and still continues) to work with David on the design and analysis of her various projects, constantly learning more about calibration, detection, and quantitation. They publish virtually all of their joint research and have tried to explain some of the techniques in those papers. However, over the years, they have seen a growing need for a cohesive package that explains and demonstrates these procedures. They have also felt strongly that these writings should be available in a forum that is easily accessible to chemists. They thought American Laboratory would be the perfect home, and the editorial staff agreed.

We will start by discussing statistically sound calibration techniques, since this topic is at the heart of both detection and quantitation. To make the series of columns as practical as possible, we intend to illustrate the theory with actual data sets from our research. In addition, as we said earlier, we welcome your input. If you have questions, comments, or topic suggestions, please feel free to contact either one of us.

To give you an overview of the initial columns, we have outlined our first major topic (i.e., calibration). The plan is as follows:

  • Measurement and segments of the real number line
  • Calibration—introduction; ordinary-least-squares solution and assumptions
  • Uncertainty intervals—types/definition; prediction-interval equation for a straight line (with discussion of how to minimize the interval’s width; see the brief sketch after this list)
  • Calibration design—things to consider; example designs
  • Calibration diagnostics and model-selection matrix
  • Calibration examples using real data.
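
As a brief preview of the ordinary-least-squares and prediction-interval items above, the following sketch fits a straight-line calibration and computes a prediction interval for a new response. It is written in Python with NumPy and SciPy; the concentration and response values are hypothetical, and the 95% level is chosen purely for illustration. The assumptions behind this formula, and how to check whether real data satisfy them, are precisely what the coming columns will address.

    import numpy as np
    from scipy import stats

    # Hypothetical calibration data: concentration (ppb) vs. instrument response.
    x = np.array([0.5, 1.0, 2.0, 4.0, 8.0, 16.0])
    y = np.array([0.9, 2.1, 3.8, 8.2, 15.9, 32.4])

    n = x.size
    b1, b0 = np.polyfit(x, y, 1)              # ordinary-least-squares slope, intercept
    resid = y - (b0 + b1 * x)
    s = np.sqrt(np.sum(resid**2) / (n - 2))   # residual standard deviation
    sxx = np.sum((x - x.mean())**2)

    # 95% prediction interval for a single new response at concentration x0.
    x0 = 5.0
    y0_hat = b0 + b1 * x0
    t_crit = stats.t.ppf(0.975, df=n - 2)
    half_width = t_crit * s * np.sqrt(1 + 1/n + (x0 - x.mean())**2 / sxx)

    print(f"fit: response = {b0:.3f} + {b1:.3f} * concentration")
    print(f"predicted response at {x0} ppb: {y0_hat:.2f} +/- {half_width:.2f}")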

Before we close this introduction, we would like to mention briefly two concepts that are at the heart of everything we will be doing with calibration, detection, and quantitation: uncertainty estimates and risk levels.

Uncertainty is a fact of life (we cannot be sure how long the traffic light will stay green, what card will be dealt next in poker, or that the ice cream cone will not melt before we finish it; and we certainly do not know tomorrow’s value of our stock portfolio). Although we often act as if analytical measurements are immune from this fact, that assumption is obviously false. We often can take steps to minimize measurement uncertainty, but we can never eliminate it. It behooves us to make a realistic, sound assessment of the overall variability inherent in a measurement. Then we can decide if the measurement is reliable enough for our purposes. Statistics provides us the tools for making this judgment. Our task is twofold: 1) Identify all the relevant sources of variation affecting our measurement process, and 2) determine and apply the appropriate statistical tool(s) for quantifying the uncertainty. This series of articles will focus on the second task. Many formulas are available for calculating uncertainty, but the underlying assumptions vary. We hope to clarify matters in this arena.
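
To make the second task concrete, here is a minimal sketch, in Python with NumPy and SciPy, of one of the most basic uncertainty tools: a t-based confidence interval for the mean of replicate measurements. The replicate values and the 95% level are our own illustrative assumptions, and the interval presumes roughly normal, independent errors; calibration-based uncertainty (prediction intervals) will be treated in detail in later columns.

    import numpy as np
    from scipy import stats

    # Hypothetical replicate measurements of one sample (ppb); illustrative only.
    replicates = np.array([4.8, 5.1, 5.0, 4.7, 5.3, 4.9])

    n = replicates.size
    mean = replicates.mean()
    s = replicates.std(ddof=1)                # sample standard deviation

    # 95% two-sided confidence interval for the true mean.
    alpha = 0.05
    t_crit = stats.t.ppf(1 - alpha / 2, df=n - 1)
    half_width = t_crit * s / np.sqrt(n)

    print(f"mean = {mean:.2f} ppb, s = {s:.2f} ppb")
    print(f"95% CI: {mean - half_width:.2f} to {mean + half_width:.2f} ppb")

The same logic applies on a larger scale: once the relevant sources of variation have been identified (task 1), a tool such as this interval turns them into a defensible statement of uncertainty (task 2).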

Risk levels refer to the fraction of the time we are willing to be incorrect in our measurements. These levels also are crucial, but are horses of a totally different color. Statistics cannot help set these values. How much risk one will take of being wrong depends solely on what one is willing to tolerate. People often try to force this subject into the statistics realm. (When asked how much risk is acceptable, “How good a number can you give me?” is a common reply.) Typically, this strategy is adopted because the individual is uncomfortable about making a decision. (No one wants to be responsible for setting a risk level that might be controversial later.) However, the truth is that people, not statistics, make these decisions. Only when all interested parties sit down and discuss the consequences of being wrong can realistic risk levels be set.
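
To see why the risk level is a decision rather than a calculation, the short continuation below (using the same hypothetical replicate data as in the previous sketch) reports the interval half-width at three commonly quoted risk levels. Statistics can tell us how wide the interval becomes for any level we name; it cannot tell us which level to choose.

    import numpy as np
    from scipy import stats

    # Same hypothetical replicates as above (ppb); illustrative only.
    replicates = np.array([4.8, 5.1, 5.0, 4.7, 5.3, 4.9])
    n, s = replicates.size, replicates.std(ddof=1)

    # The risk level (alpha) is a judgment call, not a statistical output.
    for alpha in (0.10, 0.05, 0.01):
        t_crit = stats.t.ppf(1 - alpha / 2, df=n - 1)
        half_width = t_crit * s / np.sqrt(n)
        print(f"risk = {alpha:.0%}  ->  interval half-width = {half_width:.2f} ppb")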

We hope you will keep these two concepts in mind as we begin. We are enthusiastic about developing these ideas further in future articles.

Mr. Coleman is an Applied Statistician, Alcoa Technical Center, MST-C, 100 Technical Drive, Alcoa Center, PA, U.S.A.; e-mail: [email protected]. Ms. Vanatta is an Analytical Chemist, Air Liquide-Balazs™ Analytical Services, Box 650331, MS 301, Dallas, TX 75265, U.S.A.; tel.: 972-995-7541; fax: 972-995-3204; e-mail: [email protected].