In reality, this data is non-numeric only in the sense that it records the presence or absence of some characteristic or attribute of an item (nominal data). Usually we transform it into a count of the characteristic, like the number of “naughty” or “nice” kids on Santa’s list. Or, more practical to some, the number of red cars going through an intersection, the number of order forms with mistakes in them, and so on. These counts are all called nominal scale measurements. This scale of measurement gives us the least information of the four types of measurement scales (Nominal, Ordinal, Interval, Ratio).

The question I’d like to address here is: once you have the data, how can you compare it to a standard or to another collection to determine whether there is a significant difference between the two? An example of this is order forms. Say you made what you think is an improvement in the way you handle orders, but you really want to know whether there actually is an improvement. How do you do that? You can use what is called the Chi Square Test.

### Chi Square Test

The Chi Square Test is used to evaluate count data presented in 2-dimensional tables (rows and columns) to answer the question: “Do the groups differ with regard to the proportion of items in the categories?” In our order form example we might have these three categories: No Errors, Minor Errors, and Major Errors. We would collect counts for these three categories before and after the improvement.

Let’s say before the improvement we had 60% error free, 30% minor errors and 10% major errors. After the improvement we looked at 162 orders and found that 93 were error free, 33 had minor errors, and 36 had major errors.

Our two-dimensional table would look like this (in this table, Chi Squared is the value marked **X2**):

For those who want to calculate the Chi Square value the formula is below:
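For reference, here is the usual textbook form of the statistic, written out as a sketch. The symbols $O_i$ (observed count in category $i$), $E_i$ (expected count), $p_i$ (old-process proportion) and $n$ (number of new orders) are labels chosen here for illustration, not notation from the original table.

$$
\chi^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}, \qquad E_i = n\,p_i
$$

In our example the three new-process counts sum to $n = 93 + 33 + 36 = 162$, so the expected counts under the old 60% / 30% / 10% split are $97.2$, $48.6$ and $16.2$:

$$
\chi^2 = \frac{(93 - 97.2)^2}{97.2} + \frac{(33 - 48.6)^2}{48.6} + \frac{(36 - 16.2)^2}{16.2} \approx 0.18 + 5.01 + 24.20 = 29.39
$$

Nearly all of the total comes from the Major Errors category.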

**BUT !!!** there is an easier way using Excel formulas. To do this we use the “CHITEST” function in Excel.

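- “CHITEST” performs the comparison for you and returns the probability of seeing a difference at least this large if the two processes were really the same. Note that the expected range must hold expected counts (the old-process percentages multiplied by the number of new orders), not the percentages themselves.
So in our example I entered the formula: =CHITEST(Actual Range [new process], Expected Range [old process]) OR =CHITEST(B2:B4,D2:D4)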

This gives us a formula result of 0.0000004152, or effectively 0% [.00004152%]. This says that if the new process were really the same as the old one, the probability of seeing counts this different would be about 0%. The two processes are different! Looking at the counts you can see the new process reduced minor errors (from 30% to about 20%) but increased major errors (from 10% to about 22%). Go back to the old process!
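For readers without Excel, the same comparison can be sketched in plain Python. This is a minimal sketch under stated assumptions, not a reproduction of how CHITEST is implemented: the function names `chi_square_statistic` and `p_value_df2` are hypothetical, and the p-value shortcut `exp(-chi2 / 2)` is only valid for 2 degrees of freedom, which is what three categories give us in this example.

```python
import math


def chi_square_statistic(observed, expected):
    """Pearson chi-square: sum of (O - E)^2 / E over the categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def p_value_df2(chi2):
    """Upper-tail chi-square probability.

    Assumption: this closed form holds only for 2 degrees of freedom
    (three categories compared against fixed proportions).
    """
    return math.exp(-chi2 / 2)


# New-process counts: no errors, minor errors, major errors.
observed = [93, 33, 36]
# Old-process proportions for the same three categories.
old_proportions = [0.60, 0.30, 0.10]

# Expected values must be counts, not percentages,
# so scale the old proportions by the new sample size.
n = sum(observed)
expected = [p * n for p in old_proportions]

chi2 = chi_square_statistic(observed, expected)
p_value = p_value_df2(chi2)

print(f"n = {n}")
print(f"chi-square = {chi2:.2f}")
print(f"p-value = {p_value:.10f}")
```

The printed p-value should agree with the CHITEST result quoted above. With a different number of categories the shortcut in `p_value_df2` no longer applies, and a statistics library would be the safer choice.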

Well, there you have my thoughts on comparing non-numeric data. If you have questions or comments, please feel free to contact me by leaving a comment below, emailing me, calling me, or leaving a comment on my website.

Bersbach Consulting

Peter Bersbach

Six Sigma Master Black Belt

http://sixsigmatrainingconsulting.com

peter@bersbach.com

1.520.829.0090