Cook's Distance Calculation
Cook's distance measures the influence of each observation on the regression coefficients. For the 5th observation, we need to calculate:
$$D_i = \frac{(\hat{Y}_i - \hat{Y}_{i(i)})^2}{p \cdot MSE} \cdot \frac{h_{ii}}{(1 - h_{ii})^2}$$
Where:
- $\hat{Y}_i$ is the predicted value using all observations
- $\hat{Y}_{i(i)}$ is the predicted value when observation $i$ is excluded
- $p$ is the number of parameters (2 for simple linear regression)
- $MSE$ is the mean squared error
- $h_{ii}$ is the leverage of observation $i$
Step 1: Calculate predicted values
Using full dataset (5 observations):
$$\hat{Y} = 1.4110 + 0.1512X$$
For observation 5 (X = 2.45):
$$\hat{Y}_5 = 1.4110 + 0.1512 \times 2.45 = 1.4110 + 0.37044 = 1.78144$$
Using first 4 observations:
$$\hat{Y} = 0.3169 + 1.3667X$$
For observation 5 (X = 2.45):
$$\hat{Y}_{5(5)} = 0.3169 + 1.3667 \times 2.45 = 0.3169 + 3.348415 = 3.665315$$
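The two fits can be checked with a minimal sketch in plain Python. It assumes the five (X, Y) pairs used throughout this worked example; `fit_line` is a hypothetical helper written for this illustration, not part of any library. Refitting from the raw data may give a full-data intercept that differs slightly in the second decimal from the quoted 1.4110.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = b0 + b1 * x; returns (b0, b1)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    return y_bar - b1 * x_bar, b1

# Data taken from the worked example (observation 5 is the last pair).
X = [1.85, 0.65, 0.63, 1.24, 2.45]
Y = [3.67, 1.88, 1.35, 0.34, 0.89]

b0_full, b1_full = fit_line(X, Y)          # all 5 observations
b0_loo, b1_loo = fit_line(X[:4], Y[:4])    # observation 5 excluded

yhat5_full = b0_full + b1_full * X[4]      # prediction at X = 2.45, full fit
yhat5_loo = b0_loo + b1_loo * X[4]         # prediction at X = 2.45, reduced fit

print(round(b0_full, 4), round(b1_full, 4), round(yhat5_full, 4))
print(round(b0_loo, 4), round(b1_loo, 4), round(yhat5_loo, 4))
```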
Step 2: Calculate leverage $h_{55}$
Leverage for simple linear regression:
$$h_{ii} = \frac{1}{n} + \frac{(X_i - \bar{X})^2}{\sum_j (X_j - \bar{X})^2}$$
From the full dataset:
- n = 5
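- Mean of X: $\bar{X} = \frac{1.85 + 0.65 + 0.63 + 1.24 + 2.45}{5} = \frac{6.82}{5} = 1.364$
- $X_5 - \bar{X} = 2.45 - 1.364 = 1.086$
- $\sum_j (X_j - \bar{X})^2 = (1.85 - 1.364)^2 + (0.65 - 1.364)^2 + (0.63 - 1.364)^2 + (1.24 - 1.364)^2 + (2.45 - 1.364)^2$
$= 0.236 + 0.510 + 0.539 + 0.015 + 1.179 = 2.479$
$$h_{55} = \frac{1}{5} + \frac{(1.086)^2}{2.479} = 0.2 + \frac{1.179}{2.479} = 0.2 + 0.476 = 0.676$$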
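The leverage formula above can be sketched in a few lines of plain Python. The function name `leverage` is an assumption made for this illustration. As a sanity check, the leverages in a simple linear regression sum to the number of parameters, here 2.

```python
def leverage(xs, i):
    """Leverage h_ii of observation i in simple linear regression."""
    n = len(xs)
    x_bar = sum(xs) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return 1 / n + (xs[i] - x_bar) ** 2 / sxx

X = [1.85, 0.65, 0.63, 1.24, 2.45]

h55 = leverage(X, 4)                              # index 4 is observation 5
h_all = [leverage(X, i) for i in range(len(X))]   # all five leverages

print(round(h55, 3))
print(round(sum(h_all), 6))
```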
Step 3: Calculate MSE
Using full dataset:
- Residuals: $e_i = Y_i - \hat{Y}_i$
- $e_1 = 3.67 - (1.4110 + 0.1512 \times 1.85) = 3.67 - 1.69072 = 1.97928$
- $e_2 = 1.88 - (1.4110 + 0.1512 \times 0.65) = 1.88 - 1.50928 = 0.37072$
- $e_3 = 1.35 - (1.4110 + 0.1512 \times 0.63) = 1.35 - 1.506256 = -0.156256$
- $e_4 = 0.34 - (1.4110 + 0.1512 \times 1.24) = 0.34 - 1.598488 = -1.258488$
- $e_5 = 0.89 - (1.4110 + 0.1512 \times 2.45) = 0.89 - 1.78144 = -0.89144$
$$SSE = \sum e_i^2 = 3.917 + 0.137 + 0.024 + 1.584 + 0.795 = 6.457$$
$$MSE = \frac{SSE}{n - p} = \frac{6.457}{5 - 2} = \frac{6.457}{3} = 2.1523$$
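The residuals, SSE, and MSE can be reproduced with a short sketch that plugs in the coefficients quoted above (1.4110 and 0.1512) rather than refitting. The variable names are chosen for this illustration, and the last digit may differ from the hand calculation because the squared residuals there were rounded before summing.

```python
# Data and quoted full-data coefficients from the worked example.
X = [1.85, 0.65, 0.63, 1.24, 2.45]
Y = [3.67, 1.88, 1.35, 0.34, 0.89]
B0, B1 = 1.4110, 0.1512
P = 2  # number of parameters: intercept and slope

# Residual e_i = Y_i - (B0 + B1 * X_i) for each observation.
residuals = [y - (B0 + B1 * x) for x, y in zip(X, Y)]

sse = sum(e ** 2 for e in residuals)   # sum of squared errors
mse = sse / (len(X) - P)               # divide by n - p degrees of freedom

print([round(e, 6) for e in residuals])
print(round(sse, 3), round(mse, 4))
```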
Step 4: Calculate Cook's distance
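$$D_5 = \frac{(\hat{Y}_5 - \hat{Y}_{5(5)})^2}{p \cdot MSE} \cdot \frac{h_{55}}{(1 - h_{55})^2}$$
$$D_5 = \frac{(1.78144 - 3.665315)^2}{2 \times 2.1523} \cdot \frac{0.676}{(1 - 0.676)^2}$$
$$D_5 = \frac{(-1.883875)^2}{4.3046} \cdot \frac{0.676}{(0.324)^2}$$
$$D_5 = \frac{3.549}{4.3046} \cdot \frac{0.676}{0.105}$$
$$D_5 = 0.824 \times 6.438 = 5.305$$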
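However, this result does not match any of the options. The formula above counts the leverage twice, since $\hat{Y}_i - \hat{Y}_{i(i)} = h_{ii} e_i / (1 - h_{ii})$ already includes it. Let me recalculate using the standard residual form: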
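$$D_i = \frac{e_i^2}{p \cdot MSE} \cdot \frac{h_{ii}}{(1 - h_{ii})^2}$$
$$D_5 = \frac{(-0.89144)^2}{2 \times 2.1523} \cdot \frac{0.676}{(1 - 0.676)^2}$$
$$D_5 = \frac{0.795}{4.3046} \cdot \frac{0.676}{0.105}$$
$$D_5 = 0.1847 \times 6.438 = 1.189$$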
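The residual form can be checked end to end with a minimal sketch that refits the line from the raw data instead of using the rounded coefficients. The function name `cooks_distance` is an assumption for this illustration. With unrounded intermediate values the result for observation 5 comes out near 1.21, slightly above the hand-computed 1.189.

```python
def cooks_distance(xs, ys, i, p=2):
    """Cook's distance of observation i in simple linear regression,
    using D_i = e_i^2 / (p * MSE) * h_ii / (1 - h_ii)^2."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)

    # Least-squares slope and intercept on all observations.
    b1 = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sxx
    b0 = y_bar - b1 * x_bar

    resid = [y - (b0 + b1 * x) for x, y in zip(xs, ys)]
    mse = sum(e ** 2 for e in resid) / (n - p)
    h = 1 / n + (xs[i] - x_bar) ** 2 / sxx   # leverage of observation i

    return resid[i] ** 2 / (p * mse) * h / (1 - h) ** 2

X = [1.85, 0.65, 0.63, 1.24, 2.45]
Y = [3.67, 1.88, 1.35, 0.34, 0.89]

d5 = cooks_distance(X, Y, 4)   # index 4 is observation 5
print(round(d5, 3))
```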
This still does not match the options. Let me try the coefficient-based form of the same statistic:
$$D_i = \frac{(\hat{\beta} - \hat{\beta}_{(i)})' X'X (\hat{\beta} - \hat{\beta}_{(i)})}{p \cdot MSE}$$
Given the regression coefficients:
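- Full dataset: $\hat{\beta}_0 = 1.4110$, $\hat{\beta}_1 = 0.1512$
- Without observation 5: $\hat{\beta}_{0(5)} = 0.3169$, $\hat{\beta}_{1(5)} = 1.3667$
Difference: $\Delta\beta_0 = 1.4110 - 0.3169 = 1.0941$, $\Delta\beta_1 = 0.1512 - 1.3667 = -1.2155$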
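$$X'X = \begin{bmatrix} n & \sum X_i \\ \sum X_i & \sum X_i^2 \end{bmatrix} = \begin{bmatrix} 5 & 6.82 \\ 6.82 & 11.782 \end{bmatrix}$$
where $\sum X_i^2 = 3.4225 + 0.4225 + 0.3969 + 1.5376 + 6.0025 = 11.782$.
$$(\hat{\beta} - \hat{\beta}_{(5)})' X'X (\hat{\beta} - \hat{\beta}_{(5)}) = \begin{bmatrix} 1.0941 & -1.2155 \end{bmatrix} \begin{bmatrix} 5 & 6.82 \\ 6.82 & 11.782 \end{bmatrix} \begin{bmatrix} 1.0941 \\ -1.2155 \end{bmatrix}$$
First compute:
$$X'X(\hat{\beta} - \hat{\beta}_{(5)}) = \begin{bmatrix} 5 \times 1.0941 + 6.82 \times (-1.2155) \\ 6.82 \times 1.0941 + 11.782 \times (-1.2155) \end{bmatrix} = \begin{bmatrix} 5.4705 - 8.2897 \\ 7.4618 - 14.3210 \end{bmatrix} = \begin{bmatrix} -2.8192 \\ -6.8592 \end{bmatrix}$$
Then:
$$\begin{bmatrix} 1.0941 & -1.2155 \end{bmatrix} \begin{bmatrix} -2.8192 \\ -6.8592 \end{bmatrix} = 1.0941 \times (-2.8192) + (-1.2155) \times (-6.8592) = -3.084 + 8.337 = 5.253$$
$$D_5 = \frac{5.253}{2 \times 2.1523} = \frac{5.253}{4.3046} = 1.220$$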
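The coefficient-based form can be sketched the same way, expanding the 2x2 quadratic form by hand so no linear algebra library is needed. The helper names `fit_line` and `cooks_distance_coef` are assumptions for this illustration. Because both fits are recomputed from the raw data without rounding, the result should agree with the residual form to within floating-point error.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = b0 + b1 * x; returns (b0, b1)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b1 = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sxx
    return y_bar - b1 * x_bar, b1

def cooks_distance_coef(xs, ys, i, p=2):
    """Cook's distance via the shift in coefficients when observation i is dropped."""
    b0, b1 = fit_line(xs, ys)
    b0_del, b1_del = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
    d0, d1 = b0 - b0_del, b1 - b1_del

    # Entries of X'X for a design matrix with an intercept column.
    n = len(xs)
    sum_x = sum(xs)
    sum_x2 = sum(x * x for x in xs)

    # Quadratic form d' (X'X) d, written out for the 2x2 case.
    quad = n * d0 * d0 + 2 * sum_x * d0 * d1 + sum_x2 * d1 * d1

    mse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys)) / (n - p)
    return quad / (p * mse)

X = [1.85, 0.65, 0.63, 1.24, 2.45]
Y = [3.67, 1.88, 1.35, 0.34, 0.89]

sum_x2 = sum(x * x for x in X)        # lower-right entry of X'X
d5 = cooks_distance_coef(X, Y, 4)     # index 4 is observation 5
print(round(sum_x2, 4), round(d5, 3))
```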
This still does not match any option exactly. Both standard forms give $D_5 \approx 1.2$, and of the given options the closest is 1.6268, which is option B.