By Ariel Balter, Ph.D. Updated Aug 30, 2022
In regression analysis, we designate one variable as the explanatory variable (x) and the other as the response variable (y). The regression model produces a function y = f(x) that best predicts y from x. For each observation i, the residual is the difference between the observed response y[i] and its predicted value f(x[i]):
Residual = y[i] – f(x[i])
Consider five individuals with the following height (cm) and weight (kg) pairs: (152, 54), (165, 65), (175, 100), (170, 80), and (140, 45). A quadratic fit for weight as a function of height yields the equation:
w = f(h) = 1160 – 15.5 h + 0.054 h²
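Using this model, the residuals (in kilograms) are [2.38, –7.65, –1.25, –5.60, –3.40]. Only the first individual weighs more than the model predicts; the other four weigh less, so their residuals are negative. The sum of residuals is –15.5 kg.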
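The arithmetic can be checked with a short Python sketch that applies the definition Residual = y[i] – f(x[i]) to each height and weight pair. The function and variable names here are illustrative choices, not part of the original example.

```python
# Height (cm) and weight (kg) pairs from the example.
data = [(152, 54), (165, 65), (175, 100), (170, 80), (140, 45)]


def predicted_weight(h):
    """Quadratic model from the text: w = 1160 - 15.5 h + 0.054 h^2."""
    return 1160 - 15.5 * h + 0.054 * h ** 2


def residuals(pairs, model):
    """Return observed minus predicted response for each (x, y) pair."""
    return [y - model(x) for x, y in pairs]


res = residuals(data, predicted_weight)

# Print each signed residual, then their sum.
for (h, w), r in zip(data, res):
    print(f"h={h} cm, w={w} kg, residual={r:+.2f} kg")
print(f"sum of residuals = {sum(res):+.2f} kg")
```

Because each residual is observed minus predicted, a negative value means the model overestimates that person's weight.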
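The simplest regression model is linear, represented by y = m x + b. When the line is fitted by ordinary least squares, which minimizes the sum of squared residuals rather than the total vertical deviation, the residuals sum to zero. This follows from the fit itself: setting the derivative of the squared-residual sum with respect to the intercept b to zero gives Σ (y[i] – m x[i] – b) = 0.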
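As a rough check of the zero-sum property, the sketch below fits a line to the same five height and weight pairs and adds up the residuals. It assumes ordinary least squares with an intercept and uses the standard closed-form slope and intercept formulas; the helper name fit_line is hypothetical.

```python
# Height (cm) and weight (kg) pairs from the example.
data = [(152, 54), (165, 65), (175, 100), (170, 80), (140, 45)]


def fit_line(pairs):
    """Ordinary least-squares fit of y = m x + b (assumed fitting method)."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    # Slope: covariance of x and y divided by the variance of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    m = sxy / sxx
    # Intercept chosen so the line passes through (mean_x, mean_y).
    b = mean_y - m * mean_x
    return m, b


m, b = fit_line(data)
line_residuals = [y - (m * x + b) for x, y in data]

print(f"m = {m:.3f}, b = {b:.2f}")
# Expect a value that is zero up to floating-point rounding.
print(f"sum of residuals = {sum(line_residuals):.2e}")
```

The individual residuals are generally nonzero; it is only their sum that vanishes, because the positive and negative deviations from the fitted line cancel.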