The Chi-Square Test

How Well Does Your Mental Model Fit with Reality?
Or Is Your Mental Model WRONG?


Whenever you think you have an idea of how something works, you have a mental model. That is, in effect, a layman's way of saying you have a hypothesis. The hypothesis then needs to be tested for how closely it fits reality - and reality is the data collected from an experiment.

Frequently experimenters can set up their experiments with so many zillions of trials that the results either obviously fit the model or obviously clobber it. Two examples where statistical analysis is rarely needed: (a) a chemist titrating acid with base and watching for the pH indicator to change color (on the order of Avogadro's number of molecules is involved); (b) a geneticist using E. coli to study the turning on or off of a gene rarely uses fewer than a billion bacteria, and whether or not a colored compound appears in the liquid amounts to automatic averaging over the whole population.

However, there are times when data must be collected on small groups of subjects, and individual quirks and other random factors cause "diffusion" of the results. Consider medical trials: it is hard enough to get a few people to cooperate, let alone millions. So data are collected on the few and compared with a few controls. Is there really a difference between the two groups? That is the sort of question where chi-square analysis comes in. In short, if the differences between your model and reality are small, that is good; if huge, develop a new model! These differences are summarized by "chi-square": for each class, take the deviation (observed minus expected), square it, divide by the expected count, and sum over all classes.
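That calculation can be sketched in a few lines of Python. The counts below are made-up coin-flip numbers chosen for illustration, not data from the experiments discussed later:

```python
def chi_square(observed, expected):
    """Sum over all classes of (observed - expected)^2 / expected."""
    return sum((obs - exp) ** 2 / exp for obs, exp in zip(observed, expected))

# Hypothetical example: 100 coin flips giving 55 heads and 45 tails,
# tested against the expected 50/50 split.
print(chi_square([55, 45], [50, 50]))  # -> 1.0
```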

A G A I N ! The chi-square analysis of one's data does NOT tell you whether your hypothesis is correct. There may be other unseen factors that merely make it appear so, as this pair of simple examples shows. H O W E V E R ! If you get a huge chi-square value, your model is extremely likely to be WRONG!

Table I contains data obtained by Carl Correns in 1900 when he repeated Mendel's cross between strains of garden peas with yellow and green seeds. Correns expected yellow to green seeds in a 3:1 ratio (YY, Yy, yY, yy; with only yy not containing the dominant yellow gene). Later we'll get to the interpretation of the analysis. But it should be obvious that large deviations contradict our model. Furthermore, borrowing a statistician's trick from variance, we square each deviation and then divide by the expected number. Again, the larger these numbers, the more likely the model is wrong; and when we sum them, we get a chi-square value, as shown by the 0.221, below.


TABLE I: Correns' Repeat of Mendel's First Experiment

Class          Data     Expected    Deviation    Dev^2/Expected

Yellow seeds   1,394    1,385.25        8.75          0.055
Green seeds      453      461.75       -8.75          0.166

Totals         1,847    1,847.00        0.00          0.221
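As a check on Table I, the same arithmetic can be reproduced in Python; the expected counts come from splitting the 1,847 total in the 3:1 ratio:

```python
observed = [1394, 453]                       # Correns' yellow and green counts
total = sum(observed)
expected = [total * 3 / 4, total * 1 / 4]    # 3:1 ratio from the model

deviations = [o - e for o, e in zip(observed, expected)]
chi_sq = sum(d ** 2 / e for d, e in zip(deviations, expected))

print(expected)          # [1385.25, 461.75]
print(deviations)        # [8.75, -8.75]
print(round(chi_sq, 3))  # 0.221
```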


Mendel went on to look at what eventually led to the discovery of linkage between genes. Some genes are physically linked to others, while other genes are not, since they lie on different, independently segregating chromosomes. In this experiment, he looked at the F2 results of a dihybrid cross between the genes for round/wrinkled seeds and yellow/green seeds. From his model he expected to get something close to a 9:3:3:1 ratio, with round and yellow being the dominant genes. Let's take a look:

TABLE II: Mendel's Second Experiment

Model   Phenotype   Data    Expected    Deviation    Dev^2/Expected

  9     Rnd/Yel      315      312.75        2.25          0.016
  3     Rnd/Grn      108      104.25        3.75          0.135
  3     Wri/Yel      101      104.25       -3.25          0.101
  1     Wri/Grn       32       34.75       -2.75          0.218

        Totals       556      556.00        0.00          0.470
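Table II's arithmetic generalizes to any expected ratio. A small sketch, using a hypothetical helper that splits the total according to the model's ratio:

```python
def chi_square_from_ratio(observed, ratio):
    """Chi-square of observed counts against expected counts in a given ratio."""
    total = sum(observed)
    parts = sum(ratio)
    expected = [total * r / parts for r in ratio]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Mendel's dihybrid F2 counts against the 9:3:3:1 model (Table II)
print(round(chi_square_from_ratio([315, 108, 101, 32], [9, 3, 3, 1]), 3))  # -> 0.47
```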


Interestingly, we have observed something in the above two tables: the sum of the deviations is always zero. Therefore, if we know all but one of the deviations, we can calculate the remaining one and, hence, complete its row in the table. Or, in other words, all but one of the classes have the freedom to vary, but once they are fixed, the remaining one is fixed too. Hence the term 'degrees of freedom': the number of classes minus one.

Thus, "it obviously follows that" (we must use proper math-textbook language, mustn't we?) the more degrees of freedom (the more comparisons being made), the larger the chi-square calculation is likely to be. Some adjustment must be made for this so that comparing lots of things doesn't give us numbers that would mislead us into thinking our models are wrong. We will see that Table II's larger chi-square value actually indicates a better fit than Table I's smaller value - once it is adjusted for the number of degrees of freedom. But how to do this...

Again, statistics to the rescue! Table III has been developed in which we can match up our degrees of freedom with our chi-square calculation, and find out just how likely it is that a limited number of reality points will have diffused that far away from the ideal values. And thus we get some sort of handle on how likely it is that our model hypothesis is correct. Remember that the closer the data are to ideality, the more likely they are to have 'diffused' that little distance from the ideal by chance alone.



Table III
Chi-Square (X^2) Values for up to 10 Degrees of Freedom
That Are Associated with Various Probabilities

Degrees                                 Probabilities
  of
Freedom   0.95   0.90   0.70   0.50   0.30   0.20   0.10   0.05   0.01   0.001

   1     0.004  0.016   0.15   0.46   1.07   1.64   2.71   3.84   6.64   10.83
   2      0.10   0.21   0.71   1.39   2.41   3.22   4.61   5.99   9.21   13.82
   3      0.35   0.58   1.42   2.37   3.67   4.64   6.25   7.82  11.35   16.27
   4      0.71   1.06   2.20   3.36   4.88   5.99   7.78   9.49  13.28   18.47
   5      1.15   1.61   3.00   4.35   6.06   7.29   9.24  11.07  15.09   20.52
   6      1.64   2.20   3.83   5.35   7.23   8.56  10.65  12.59  16.81   22.46
   7      2.17   2.83   4.67   6.35   8.38   9.80  12.02  14.07  18.48   24.32
   8      2.73   3.49   5.53   7.34   9.52  11.03  13.36  15.51  20.09   26.13
   9      3.33   4.17   6.39   8.34  10.66  12.24  14.68  16.92  21.67   27.88
  10      3.94   4.87   7.27   9.34  11.78  13.44  15.99  18.31  23.21   29.59

         (Probabilities down to 0.05: do not reject without other cause.
          Probabilities of 0.01 and below: REJECT!)



For each value in the table, the associated probability gives the likelihood of obtaining an X^2 with the given degrees of freedom that is as large as or larger than the value in the table.



The chi-square values for experiments 1 and 2, above, relate to this probability table in the following way. Imagine that all points started out on the ideal line. Then they started randomly diffusing away from that line. There would be more data points near the line than far from it - sort of a bell-shaped curve centered upon that line. Hence, any point near the line has a greater probability of being consistent with the model than a point farther away. Thus very low chi-square values (meaning the data didn't diffuse far off the line) have a very high probability of arising by chance under a correct model, and a low probability of indicating a wrong model. And vice versa.

Expt #1 has one degree of freedom with a chi-square = 0.221. Look across the line for Freedom = 1: the value fits between 0.15 and 0.46, which have column headings of 70% and 50%, respectively. Therefore, Correns' data had a 50 to 70% probability of diffusing this far from the ideal (his hypothesis).

Expt #2 has three degrees of freedom with a chi-square = 0.470. Look across the line for Freedom = 3: the value fits between 0.35 and 0.58, which have column headings of 95% and 90%, respectively. Therefore, Mendel's data had a 90 to 95% probability of diffusing this far from the ideal (his hypothesis). Mendel's data are extremely close! Many have wondered in print whether or not he "cooked" his data.
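The table lookup walked through in the last two paragraphs can be sketched in Python. Only the two rows of Table III used here are hard-coded (the full table would work the same way):

```python
# Probability column headings of Table III, and the rows for 1 and 3
# degrees of freedom, as transcribed from the table above.
PROBS = [0.95, 0.90, 0.70, 0.50, 0.30, 0.20, 0.10, 0.05, 0.01, 0.001]
TABLE_III = {
    1: [0.004, 0.016, 0.15, 0.46, 1.07, 1.64, 2.71, 3.84, 6.64, 10.83],
    3: [0.35, 0.58, 1.42, 2.37, 3.67, 4.64, 6.25, 7.82, 11.35, 16.27],
}

def probability_bracket(chi_sq, df):
    """Return the (higher, lower) probabilities whose table values bracket chi_sq."""
    row = TABLE_III[df]
    for i in range(len(row) - 1):
        if row[i] <= chi_sq <= row[i + 1]:
            return PROBS[i], PROBS[i + 1]
    raise ValueError("chi-square value falls off the ends of this table row")

print(probability_bracket(0.221, 1))  # Correns: between 70% and 50%
print(probability_bracket(0.470, 3))  # Mendel:  between 95% and 90%
```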

YET AGAIN ! A small chi-square calculation does NOT tell you that your hypothesis is correct. There may be other unseen factors that make it merely appear so. HOWEVER! If you get a huge chi-square value, your model is extremely likely to be WRONG! (Why must nature always be so critical? What about some support!)

