Quote of the Day

Who are a little wise the best fools be.

— John Donne

## Introduction

I recently took an excellent class at Statistics.com called "Prediction & Tolerance Intervals, Measurement and Reliability" taught by Dr. Tom Ryan, a former NIST researcher. I took the class because I have been concerned that some of the statistical methods I am currently using for calibrating optics are not as good as they could be. As part of the class, we were required to perform some practice Gage Repeatability and Reproducibility (aka gage R&R) studies, which required me to use ANOVA. The class gave me many ideas for improving my experiment design and I highly recommend it to those who do a lot of experiments.

Figure 1 shows a caliper, which is a common form of gage (sometimes spelled gauge). A gage is any device that is used to perform a measurement. In my case, I will be performing gage R&R on some optical power measurements.

In this post, I am focusing on my self-education on ANOVA and its application to gage R&R. I have used ANOVA for years to evaluate the significance of test data, but I had never looked at how it works until very recently. The opportunity to study ANOVA in detail came because of some work I needed to do in Taiwan (a place that I enjoy very much). On the flights from Minneapolis to San Francisco to Taiwan, I had 17 hours of flight time and a 5-hour layover to think about how ANOVA works. I put this time to good use.

## Background

### Source Material

My approach here will slavishly follow that of M.J. Moroney in his excellent book "Facts From Figures". First published in 1951, this book can be found in used bookstores for about $1, or you can download it in PDF form from the web. I discovered this little gem years ago and I turn to it when I need a short refresher on basic statistics.

### Why Gage R&R?

When I started my career at HP back in 1979, my mentor there told me that "Manufacturing is in a constant battle against variation -- ideally, we make the millionth unit the same as the first." To battle variation, we must first be able to identify and measure it. The focus of gage R&R is on understanding and measuring the sources of variation in a measurement. Because good product design practice requires that you design for the worst-case parameter variation, excessive variation in measurement forces you to design your product to be tolerant of this variation and that increases cost.

The folks in our Quality team break down variation as shown in Figure 2 (Source).

When we talk about measurement variation, we are talking about precision. Wikipedia describes precision as follows:

The precision of a measurement system, related to reproducibility and repeatability, is the degree to which repeated measurements under unchanged conditions show the same results.

Gage R&R explicitly measures reproducibility and repeatability relative to the level of part variation.

### Approach

Most technical discussions of ANOVA dive into equations with multiple levels of summation symbols. Moroney begins his example by taking a simple experiment and showing how you can break the variance of the result into components without using any summations. I like this approach as a gentle start.

Since I am focused on gage R&R here, my plan is to produce several related worksheets, all contained in a single Excel workbook (available here) that does not use any Visual Basic:

- An Excel worksheet that works through one of the ANOVA examples from "Facts From Figures". While not a gage R&R example, I will use this example as the basis of my gage R&R work. Here is the book excerpt that I am using as my gage R&R reference.
The examples in this book are simple because they are from a time when calculations were done by hand. This is not a bad thing for today because it means you can easily duplicate them using a tool like Excel or Mathcad. While I prefer to use Mathcad for nearly everything, Excel is probably a better vehicle for my study work here.

- A set of Excel worksheets that contain gage R&R examples built from a template I have created. The results agree with those from the same data processed by Minitab, which is the tool Dr. Ryan recommended (he was one of its creators).
Yes, Dr. Ryan strongly discouraged me from using Excel for ANYTHING, but I do believe it can play a role in routine statistical calculations and I will ignore his advice here. As you can see, I often do not follow directions. Sister Mary Agnes from the Osseo Catholic School is probably looking down upon me from heaven with a frown on her face.

This workbook is intended to illustrate the concepts behind ANOVA and gage R&R -- it is not computationally efficient. However, from a conceptual standpoint, I like Moroney's approach of eliminating row and column variation in separate operations to determine the desired variance components.

## Analysis

### Role of ANOVA in Gage R&R

ANOVA is one of the accepted methods for determining the variance components in an experiment (cf. AIAG approach). With respect to this post, there are three sources of variation: the part itself, the operator, and random error. Gage R&R studies often include modeling the interaction between part and operator, but I will ignore this sort of variation to keep this post simple. The methods shown here can be extended to include interaction.

Equation 1 shows the gage R&R variance model that I will be using.

$$\sigma_T^2 = \sigma_{Parts}^2 + \sigma_{Reproducibility}^2 + \sigma_{Repeatability}^2 \qquad \text{(Eq. 1)}$$

where

- $\sigma_T^2$ is the total variance of the data.
- $\sigma_{Parts}^2$ is the component of variance due only to the parts.
- $\sigma_{Reproducibility}^2$ is the component of variance due only to the operators.
- $\sigma_{Repeatability}^2$ is the component of variance due only to the measurement tools.

We will use ANOVA to determine the variance components. Once we have the components, we can determine the relative effect of each one on our measurement. Customarily, we want the repeatability and reproducibility components to be less than 10% of the total variance (example of the 10% standard).
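As a rough illustration of that bookkeeping, here is a minimal Python sketch that takes the three components of Equation 1 and reports each as a percentage of the total variance. The function name and the sample numbers are hypothetical and are not taken from my workbook; they only show the arithmetic.

```python
def percent_contribution(var_parts, var_reproducibility, var_repeatability):
    """Express each variance component as a percentage of the total (Eq. 1)."""
    total = var_parts + var_reproducibility + var_repeatability
    if total <= 0:
        raise ValueError("total variance must be positive")
    return {
        "parts": 100.0 * var_parts / total,
        "reproducibility": 100.0 * var_reproducibility / total,
        "repeatability": 100.0 * var_repeatability / total,
        # Combined gage R&R share: everything that is not part-to-part variation
        "gage_rr": 100.0 * (var_reproducibility + var_repeatability) / total,
    }


# Hypothetical variance components, for illustration only
shares = percent_contribution(8.0, 0.5, 1.5)
for name, pct in shares.items():
    print(f"{name}: {pct:.1f}% of total variance")
```

With these made-up inputs the parts account for 80% of the total variance and the combined gage R&R for 20%, so this measurement system would not meet the 10% guideline.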

### My ANOVA Reference Model

Figure 3 shows my rework of Moroney's Latin Square excerpt, which is focused on "treatments" applied to some crop. This is the example I used as my model for writing a simple gage R&R worksheet. The process for generating Figure 3 can be broken down as follows:

- Average the effect of each set of treatment levels, then determine the mean square (MS) of the treatment.
We will use this term to estimate the variance contribution of the treatment.

- Remove the effect of row variation by replacing each row element with the average element value for that row.
By eliminating the row variation, we can compute the MS of the column variation.

- Remove the effect of column variation by replacing each column element with the average element value for that column.
By eliminating the column variation, we can compute the MS of the row variation.
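The averaging idea in these steps can be sketched in a few lines of Python. This is a simplified illustration, not a copy of the worksheet: it uses a plain rows-by-columns table with one value per cell, leaves out the Latin Square treatment step, and uses made-up data. In the code, "rows" means the differences between row averages and "columns" means the differences between column averages; the function names are my own.

```python
def average(values):
    """Arithmetic mean of a non-empty sequence."""
    values = list(values)
    return sum(values) / len(values)


def mean_squares(table):
    """Split a rows x columns table (one value per cell) into mean squares."""
    n_rows, n_cols = len(table), len(table[0])
    grand = average(v for row in table for v in row)
    row_means = [average(row) for row in table]
    col_means = [average(col) for col in zip(*table)]

    # Replacing every cell with its row average leaves only the differences
    # between rows; each row average then appears n_cols times in the table.
    ss_rows = n_cols * sum((m - grand) ** 2 for m in row_means)
    # Same idea for columns: each column average appears n_rows times.
    ss_cols = n_rows * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for row in table for v in row)
    # Whatever the row and column averages do not explain is the residual.
    ss_residual = ss_total - ss_rows - ss_cols

    df_rows, df_cols = n_rows - 1, n_cols - 1
    return {
        "rows": ss_rows / df_rows,
        "columns": ss_cols / df_cols,
        "residual": ss_residual / (df_rows * df_cols),
    }


# Made-up 3 x 3 data, for illustration only
table = [
    [3.0, 3.0, 6.0],
    [2.0, 6.0, 7.0],
    [7.0, 9.0, 11.0],
]
ms = mean_squares(table)
print(ms)
```

For this table the sums of squares are 42 (rows), 24 (columns), and 4 (residual), which add up to the total of 70. Dividing by 2, 2, and 4 degrees of freedom gives mean squares of 21, 12, and 1.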

### My Gage R&R Worksheet

I have put a number of tabs in this Excel workbook that duplicate the results obtained from Minitab and other tools. We can make a direct analogy between my gage R&R worksheet and Figure 3 as follows.

- Treatments in Figure 3 are comparable to parts in gage R&R. Our analysis will provide us with the part-to-part variation.
- Column variation in Figure 3 is comparable to repeatability variation in gage R&R (statistics folks will often call this term the residual error).
- Row variation in Figure 3 is comparable to reproducibility or operator variation in gage R&R.

The worksheet uses array formulas to compute the gage R&R of data placed into the data area at the bottom of the worksheet. The variances are computed by substituting the MS values into the equations shown in Figure 4.

I will not be deriving the formulas of Figure 4 in this post. You can find them in various places on the web. Here is a link to one PowerPoint presentation that gives these formulas, but with different variable names. I intend to derive them in a later blog post.
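For readers who want to experiment before then, the sketch below assumes the commonly published expected-mean-square relations for a crossed study with no part-operator interaction. Figure 4 is not reproduced in this text, so treat the formulas, the function name, and the sample MS values as assumptions rather than as a copy of the worksheet. Setting negative estimates to zero is also a common convention that I am assuming here.

```python
def variance_components(ms_part, ms_operator, ms_error,
                        n_parts, n_operators, n_replicates):
    """Estimate variance components from ANOVA mean squares.

    Assumes a crossed design (every operator measures every part
    n_replicates times) and no part-operator interaction term.
    """
    # The residual mean square estimates the repeatability variance directly.
    var_repeatability = ms_error
    # Each operator average is built from n_parts * n_replicates readings.
    var_reproducibility = (ms_operator - ms_error) / (n_parts * n_replicates)
    # Each part average is built from n_operators * n_replicates readings.
    var_parts = (ms_part - ms_error) / (n_operators * n_replicates)
    return {
        "repeatability": var_repeatability,
        # A negative estimate is not meaningful, so clamp it to zero.
        "reproducibility": max(0.0, var_reproducibility),
        "parts": max(0.0, var_parts),
    }


# Hypothetical mean squares for 10 parts, 3 operators, 2 replicates
components = variance_components(
    ms_part=25.0, ms_operator=5.0, ms_error=1.0,
    n_parts=10, n_operators=3, n_replicates=2,
)
print(components)
```

With these made-up inputs the estimates are 1.0 for repeatability, 0.2 for reproducibility, and 4.0 for parts.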

## Conclusion

I finally feel like I have an intuitive model of what is going on when I perform an ANOVA. While you do not need to know the details of how the computation is performed, knowing them does give you some insight into the ANOVA process. This reminds me a bit of Fourier analysis. You do not necessarily need to know the details of a Fast Fourier Transform (FFT), but knowing the details does give you some insight into what is going on.

mathscinotes