Degrees Of Freedom By Chong Ho Yu

Toothaker (1986) explained df as the number of independent components minus the number of parameters estimated. This approach is based upon the definition provided by Walker (1940): the number of observations minus the number of necessary relations obtainable from these observations (df = n - r). Although Good (1973) criticized Walker's approach on the ground that the meaning of "necessary relations" is not obvious, the number of necessary relationships is indeed intuitive when there are just a few variables. A full definition of "necessary relationship" is beyond the scope of this article. To avoid confusion, in this article it is simply defined as the relationship between a dependent variable (Y) and each independent variable (X) in the research.

Please keep in mind that this illustration is simplified for conceptual clarity. Although Walker regarded the preceding equation as a universal rule, do not assume that df = n - r can be applied to all situations.

No degree of freedom and effective sample size

Figure 1 illustrates the case of two variables, where there is one relationship under investigation (r = 1) but the scatterplot contains only one datum point. The analyst cannot estimate the regression line at all, because the line can go in any direction, as shown in Figure 1. In other words, there is no useful information.

Figure 1. No degree of freedom with one datum point.

When the degree of freedom is zero (df = n - r = 1 - 1 = 0), there is no way to affirm or reject the model! In this sense, the data have no "freedom" to vary, and you have no "freedom" to conduct research with this data set. Put bluntly, one subject is basically useless, and in effect df defines the effective sample size (Eisenhauer, 2008).
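The indeterminacy described above can be sketched in a few lines of code. This is a minimal illustration, not taken from the original article: the single point (x = 2, y = 3) and the candidate slopes are arbitrary choices made for demonstration.

```python
# A single data point (x = 2, y = 3): infinitely many lines fit it perfectly.
# For any slope b, choosing the intercept a = y - b*x makes the line pass
# through the point, so the data give no basis for preferring one model.
x, y = 2.0, 3.0

def residual(slope, intercept):
    """Vertical distance between the point and the line y = intercept + slope*x."""
    return y - (intercept + slope * x)

# Three very different lines, each with zero residual at the single point:
for slope in (0.0, 5.0, -100.0):
    intercept = y - slope * x
    print(slope, intercept, residual(slope, intercept))  # residual is 0 every time
```

Because every candidate line fits equally well, the data cannot affirm or reject any of them, which is exactly what df = 0 means here.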

Perfect fitting

In order to plot a regression line, you must have at least two data points as indicated in the following scattergram.

Figure 2. Perfect fit with two data points.

In this case, there is one degree of freedom for estimation (df = n - r = 2 - 1 = 1). When there are only two data points, one can always join them with a straight regression line and obtain a perfect correlation (r = 1.00). Since the line passes through both data points and there is no residual, the fit is considered "perfect." The term "perfect fit" can be misleading. Naive students may regard this as a good sign; indeed, the opposite is true. When you marry a perfect man/woman, it may be too good to be true! The so-called "perfect fit" results from a lack of useful information. Since the data do not have much "freedom" to vary and no alternative models can be explored, the researcher has no "freedom" to further the study. Again, the effective sample size is defined by df = n - 1.
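This automatic "perfection" is easy to verify numerically. The sketch below, assuming numpy is available, uses two arbitrary points (chosen only for illustration) and shows that the correlation is perfect and the residuals vanish no matter what values were picked.

```python
import numpy as np

# Any two distinct points yield a "perfect" correlation, regardless of what
# process generated them -- the fitted line simply joins the two points.
x = np.array([1.0, 4.0])
y = np.array([7.0, 2.0])   # arbitrary values, chosen only for illustration

r = np.corrcoef(x, y)[0, 1]
slope, intercept = np.polyfit(x, y, 1)   # the line through both points
residuals = y - (intercept + slope * x)

print(abs(r))       # |r| is 1 (up to floating-point rounding)
print(residuals)    # both residuals are essentially zero
```

Changing `y` to any other pair of distinct values still gives |r| = 1.00, confirming that the perfect fit reflects the sample size, not the quality of the model.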

This point is extremely important because very few researchers are aware that perfect fitting is a sign of serious problems. For instance, when Mendel conducted his research on heredity, his conclusions were derived from almost "perfect" data. Later, R. A. Fisher questioned the data as too good to be true. After re-analyzing them, Fisher found that the "perfectly fitted" data were actually erroneous (Press & Tanur, 2001).

Over-fitting

In addition, when there are too many variables in a regression model, i.e., the number of parameters to be estimated is larger than the number of observations, the model is said to lack degrees of freedom and thus to be over-fitted. To simplify the illustration, a scenario with three observations and two variables is presented.

Figure 3. Over-fit with three data points.

Conceptually speaking, there should be four or more variables and three or fewer observations to make a model over-fitted. Nevertheless, when only three subjects are used to estimate the strength of association between two variables, the situation is bad enough. Since there are just a few observations, the residuals are small, which gives the illusion that the model and the data fit each other very well. When the sample size is larger and the data points scatter around the plot, the residuals are naturally higher, and the model tends to have a lesser degree of fit. Nevertheless, a less-fitted model resulting from more degrees of freedom carries more merit.

Useful information

Finally, you should see that the degree of freedom is the number of pieces of useful information.

Sample size   Degree(s) of freedom   Amount of information
1             0                      no information
2             1                      not enough information
3             2                      still not enough information

Falsifiability

To further explain why lacking useful information is detrimental to research, this article ties degrees of freedom to falsifiability. In the case of "perfect fitting," the model is "always right." In "over-fitting," the model tends to be "almost right." Both models have a low degree of falsifiability. The concept of "falsifiability" was introduced by Karl Popper (1959), a prominent philosopher of science. According to Popper, the validity of knowledge is tied to the probability of falsification. Scientific propositions can be falsified empirically, whereas unscientific claims are always "right" and cannot be falsified at all. We cannot conclusively affirm a hypothesis, but we can conclusively negate it. The more specific a theory is, the higher the possibility that the statement can be negated. For Popper, a scientific method is "proposing bold hypotheses, and exposing them to the severest criticism, in order to detect where we have erred" (1974, p. 68). If the theory can stand "the trial of fire," then we can confirm its validity. When there are no or few degrees of freedom, the data can be fitted with any theory, and thus the theory is said to be unfalsifiable.
