
Lack-of-fit sum of squares and Pure-error sum of squares

데먕 2020. 2. 4. 12:17

1. Overview

In statistics, the sum of squares due to lack of fit (more tersely, the lack-of-fit sum of squares) is one of the two components into which the residual sum of squares is partitioned in an analysis of variance. It appears in the numerator of an F-test of the null hypothesis that a proposed model fits well. The other component is the pure-error sum of squares.

2. Description

2.1 Intuition

$$SSE=SSPE+SSLF$$

$$\sum(\text{observed value}-\text{fitted value})^{2}\quad(\text{error})\\
=\sum(\text{observed value}-\text{local average})^{2}\quad(\text{pure error})\\
+\sum\text{weight}\times(\text{local average}-\text{fitted value})^{2}\quad(\text{lack of fit})$$

2.1.1 The sum of squares due to "pure" error (SSPE)

The sum of squares of the differences between each observed y-value and the average of all y-values corresponding to the same x-value.

2.1.2 The sum of squares due to lack of fit (SSLF)

The weighted sum of squares of differences between each average of y-values corresponding to the same x-value and the corresponding fitted y-value, the weight in each case being simply the number of observed y-values for that x-value.
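The two definitions above can be checked numerically. The following is a minimal sketch, assuming a straight-line model fitted by least squares with NumPy; the function name `lack_of_fit_partition` and the sample data are made up for illustration and do not come from the references.

```python
import numpy as np

def lack_of_fit_partition(x, y):
    """Return (SSE, SSPE, SSLF) for a straight-line fit of y on x.

    Hypothetical helper: the straight-line model is an assumption,
    chosen only to have some fitted values to compare against.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)

    # Least-squares line; np.polyfit returns the highest-degree coefficient first.
    slope, intercept = np.polyfit(x, y, deg=1)
    fitted = intercept + slope * x

    # Residual (error) sum of squares: observed value minus fitted value.
    sse = float(np.sum((y - fitted) ** 2))

    sspe = 0.0  # pure error: observed value minus local average
    sslf = 0.0  # lack of fit: weighted (local average minus fitted value)
    for level in np.unique(x):
        mask = x == level
        local_mean = y[mask].mean()
        n_i = int(mask.sum())  # weight = number of observations at this x-value
        fitted_i = intercept + slope * level
        sspe += float(np.sum((y[mask] - local_mean) ** 2))
        sslf += float(n_i * (local_mean - fitted_i) ** 2)

    return sse, sspe, sslf

# Made-up data with two replicate y-values at each x-value.
x = [1, 1, 2, 2, 3, 3, 4, 4]
y = [1.1, 0.9, 2.3, 1.9, 2.8, 3.4, 4.9, 5.1]
sse, sspe, sslf = lack_of_fit_partition(x, y)
print(sse, sspe + sslf)  # the two numbers agree up to rounding
```

Replicated x-values matter here: if every x-value had only one observation, each local average would equal its single observation and the pure-error term would be zero.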

2.2 Formula

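$$\sum_{i=1}^{n}\sum_{j=1}^{n_{i}}\hat{\varepsilon }_{ij}^{2}=\sum_{i=1}^{n}\sum_{j=1}^{n_{i}}(Y_{ij}-\hat{Y}_{i})^{2}=\underbrace{\sum_{i=1}^{n}\sum_{j=1}^{n_{i}}(Y_{ij}-\bar{Y}_{i})^{2}}_{\text{sum of squares due to pure error}}+\underbrace{\sum_{i=1}^{n}n_{i}(\bar{Y}_{i}-\hat{Y}_{i})^{2}}_{\text{sum of squares due to lack of fit}}$$

Here $Y_{ij}$ is the $j$-th observed y-value at the $i$-th distinct x-value, $n_{i}$ is the number of observations at that x-value, $\bar{Y}_{i}$ is their average, and $\hat{Y}_{i}$ is the fitted value at that x-value.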
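As a sketch of how the two components enter the F-test mentioned in the overview: each sum of squares is divided by its degrees of freedom before the ratio is taken. The symbols $N$ and $p$ below are not used elsewhere in this post and are introduced here as assumptions: $N=\sum_{i=1}^{n}n_{i}$ is the total number of observations, and $p$ is the number of parameters in the fitted model ($p=2$ for a straight line).

$$F=\frac{\sum_{i=1}^{n}n_{i}(\bar{Y}_{i}-\hat{Y}_{i})^{2}\,/\,(n-p)}{\sum_{i=1}^{n}\sum_{j=1}^{n_{i}}(Y_{ij}-\bar{Y}_{i})^{2}\,/\,(N-n)}$$

Under the null hypothesis that the model fits, and assuming independent normal errors with constant variance, this ratio follows an F-distribution with $n-p$ and $N-n$ degrees of freedom. A large value suggests lack of fit.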

3. References

https://en.wikipedia.org/wiki/Lack-of-fit_sum_of_squares

https://www.youtube.com/watch?v=6VhjGw90TB4

http://reliawiki.org/index.php/Simple_Linear_Regression_Analysis