Notes on Simple Linear Regression (2023 Postgraduate Entrance Exam)

1. Random Variables

$$Y_i=\beta_0+\beta_1X_i+\varepsilon_i,\qquad df=n-2$$

1-1. $\varepsilon_i$

Distribution

$$\varepsilon_i\sim N(0,\sigma^2)\ \text{i.i.d.}$$

Properties

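$$\frac{\varepsilon_i}{\sigma}\sim {\rm Std}N,\qquad \frac{\sum \varepsilon_i^2}{\sigma^2}\sim \chi^2(n),\qquad \frac{\sum e_i^2}{\sigma^2}\sim \chi^2(n-2)$$

The sum over the $n$ unobserved errors $\varepsilon_i$ has $n$ degrees of freedom; replacing them with the residuals $e_i$ (see 1-10) costs two, one for each estimated coefficient.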

Point Estimation

$$s^2_\varepsilon:=s^2={\rm MSE}_{\rm OBS}=\frac{{\rm SSE}_{\rm OBS}}{n-2}=\frac{\sum e_i^2}{n-2}$$

1-2. ${\rm SSE}/\sigma^2$

Definition

$${\rm SSE}:=\sum e_i^2$$

where $e_i$ are the residuals defined in 1-10, not the unobserved errors $\varepsilon_i$.

Distribution

$$\frac{{\rm SSE}}{\sigma^2}=\frac{\sum e_i^2}{\sigma^2}\sim \chi^2(n-2)$$

1-3. ${\rm MSE}/\sigma^2$

Definition

$${\rm MSE}:=\frac{{\rm SSE}}{n-2}$$

Distribution

$$\frac{{\rm MSE}}{\sigma^2}=\frac{{\rm SSE}/(n-2)}{\sigma^2}\sim \frac{\chi^2(n-2)}{n-2}$$

1-4. $Y_i$

Definition

$$Y_i:=\beta_0+\beta_1X_i+\varepsilon_i,\quad \beta_0,\beta_1,X_i\in \mathbb R$$

$\hat Y_i\in \mathbb R$

Definition

$$\hat Y_i=\beta_0+\beta_1X_i$$

In Section 1, $\hat Y_i$ denotes the mean response $E(Y_i)$, a fixed number, not the fitted value.

Properties

$$Y_i=\hat Y_i+\varepsilon_i,\qquad E(\bar Y)=\frac{1}{n}\sum \hat Y_i=\frac{1}{n}\sum (\beta_0+\beta_1X_i)=\beta_0+\beta_1\bar X$$

Distribution

$$Y_i\sim \hat Y_i+N(0,\sigma^2)=N(\hat Y_i,\sigma^2),\quad \text{independent}$$

The $Y_i$ are independent but not identically distributed, because their means $\hat Y_i$ differ.

1-5. $b_1$

Definition

$$b_1:=\frac{\ell_{xy}}{\ell_{xx}}=\frac{\sum (X_i-\bar X)(Y_i-\bar Y)}{\sum (X_i-\bar X)^2}=\sum \frac{X_i-\bar X}{\sum (X_i-\bar X)^2}Y_i$$

$k_i\in \mathbb R$

Definition

$$k_i:=\frac{\tilde X_i}{\ell_{xx}}=\frac{X_i-\bar X}{\sum (X_i-\bar X)^2}$$

Properties

$$b_1=\sum k_iY_i$$
$$\sum k_i=\frac{\sum(X_i-\bar X)}{\sum (X_i-\bar X)^2}=0$$
$$\sum k_iX_i=\frac{\sum(X_i-\bar X)X_i}{\sum (X_i-\bar X)^2}=\frac{\sum(X_i-\bar X)^2}{\sum (X_i-\bar X)^2}+\frac{\bar X\sum(X_i-\bar X)}{\sum (X_i-\bar X)^2}=1$$
$$\sum k_i^2=\frac{\sum (X_i-\bar X)^2}{\left(\sum (X_i-\bar X)^2\right)^2}=\frac{1}{\sum (X_i-\bar X)^2}=\frac{1}{\ell_{xx}}$$
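
The identities for $k_i$ can be checked numerically. The sketch below is only an illustration on made-up data (the lists `X` and `Y` are assumptions, not taken from these notes). It computes $b_1$ both as $\ell_{xy}/\ell_{xx}$ and as $\sum k_iY_i$, then evaluates the three sums.

```python
# Hypothetical toy data; any X with at least two distinct values works.
X = [1.0, 2.0, 3.0, 4.0, 5.0]
Y = [2.1, 3.9, 6.2, 7.8, 10.1]

n = len(X)
x_bar = sum(X) / n
y_bar = sum(Y) / n
l_xx = sum((x - x_bar) ** 2 for x in X)
l_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y))

# Weights k_i = (X_i - X_bar) / l_xx
k = [(x - x_bar) / l_xx for x in X]

b1_ratio = l_xy / l_xx                             # b1 as l_xy / l_xx
b1_weighted = sum(ki * y for ki, y in zip(k, Y))   # b1 as sum of k_i * Y_i

sum_k = sum(k)                                     # expected: 0
sum_kx = sum(ki * x for ki, x in zip(k, X))        # expected: 1
sum_k2 = sum(ki ** 2 for ki in k)                  # expected: 1 / l_xx
```

Up to floating-point rounding, `b1_ratio` and `b1_weighted` agree, which is why $b_1$ inherits normality from the $Y_i$.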

Distribution

$$b_1=\sum k_iY_i\sim \sum k_iN(\hat Y_i,\sigma^2)=N\left(\sum k_i\hat Y_i,\sum k_i^2\sigma^2\right)$$
$$\sum k_i\hat Y_i=\sum k_i(\beta_0+\beta_1X_i)=\beta_0\sum k_i+\beta_1\sum k_iX_i=\beta_1,\qquad \sum k_i^2\sigma^2=\sigma^2\sum k_i^2=\frac{\sigma^2}{\ell_{xx}}$$
$$b_1\sim N\left(\beta_1,\frac{\sigma^2}{\ell_{xx}}\right)$$

Point Estimation

$$\hat\beta_1=b_1,\qquad s^2_{b_1}=\frac{{\rm MSE}}{\ell_{xx}}=\frac{{\rm MSE}}{\sum (X_i-\bar X)^2}$$

$(Z-\mu)/s\sim t(df)$

Distribution

$$\frac{Z-\mu}{\sigma}\sim {\rm Std}N,\qquad\frac{s}{\sigma}=\sqrt{\frac{{\rm MSE}}{\sigma^2}}\sim\sqrt{\frac{\chi^2(df)}{df}}$$
$$\frac{Z-\mu}{s}=\frac{Z-\mu}{\sigma}\Big/\frac{s}{\sigma}\sim \frac{{\rm Std}N}{\sqrt{\chi^2(df)/df}}=t(df)$$

1-6. $b_0$

Definition

$$b_0:=\bar Y-b_1\bar X$$

Distribution

$\bar Y\sim N(\beta_0+\beta_1\bar X,\sigma^2/n)$

Definition

$$\bar Y:=\frac{1}{n}\sum Y_i\sim \frac{1}{n}\sum N(\hat Y_i,\sigma^2)=N\left(\frac{\sum \hat Y_i}{n},\frac{\sigma^2}{n}\right)=N\left(\beta_0+\beta_1\bar X,\frac{\sigma^2}{n}\right)$$

Properties

$${\rm Cov}(\bar Y,b_1)={\rm Cov}\left(\frac{1}{n}\sum Y_i,\sum k_iY_i\right)=\frac{\sum k_i}{n}\sigma^2=0$$
$$\begin{aligned}b_0&=\bar Y-b_1\bar X\\&\sim N\left(\beta_0+\beta_1\bar X,\frac{\sigma^2}{n}\right)-\bar XN\left(\beta_1,\frac{\sigma^2}{\ell_{xx}}\right)\\&=N\left(\beta_0,\sigma^2\left(\frac{1}{n}+\frac{\bar X^2}{\ell_{xx}}\right)\right)\end{aligned}$$
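
As a rough illustration of how these quantities are computed in practice, the sketch below fits the line on made-up data (`X` and `Y` are assumptions) and plugs ${\rm MSE}$ in for $\sigma^2$ in the two variance formulas. The names `s_b0` and `s_b1` are chosen here to mirror $s_{b_0}$ and $s_{b_1}$.

```python
import math

# Hypothetical toy data, used only for illustration.
X = [1.0, 2.0, 3.0, 4.0, 5.0]
Y = [2.1, 3.9, 6.2, 7.8, 10.1]

n = len(X)
x_bar = sum(X) / n
y_bar = sum(Y) / n
l_xx = sum((x - x_bar) ** 2 for x in X)
l_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y))

b1 = l_xy / l_xx            # slope estimate
b0 = y_bar - b1 * x_bar     # intercept estimate

# Residuals and the estimate of sigma^2 with n - 2 degrees of freedom
residuals = [y - (b0 + b1 * x) for x, y in zip(X, Y)]
sse = sum(e ** 2 for e in residuals)
mse = sse / (n - 2)

# Standard errors: sigma^2 replaced by MSE in Var(b1) and Var(b0)
s_b1 = math.sqrt(mse / l_xx)
s_b0 = math.sqrt(mse * (1.0 / n + x_bar ** 2 / l_xx))
```

For this data set $\bar X=3$ and $\ell_{xx}=10$, so ${\rm Var}(b_0)$ is $1.1\sigma^2$ while ${\rm Var}(b_1)$ is $0.1\sigma^2$: the intercept is estimated less precisely because $\bar X$ is far from $0$.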

1-7. $Y_h$

Definition

$$Y_h:=b_0+b_1X_h$$

Distribution

$$\begin{aligned}Y_h&=\bar Y+b_1\tilde X_h\\&\sim N\left(\beta_0+\beta_1\bar X,\frac{\sigma^2}{n}\right)+\tilde X_hN\left(\beta_1,\frac{\sigma^2}{\ell_{xx}}\right)\\&=N\left(\beta_0+\beta_1X_h,\sigma^2\left(\frac{1}{n}+\frac{\tilde X_h^2}{\ell_{xx}}\right)\right)\end{aligned}$$

Confidence Interval

$$y_h\in\left[\hat Y_h\pm t_{1-\frac{\alpha}{2}}(n-2)s_h\right]$$

Here $y_h=\beta_0+\beta_1X_h$ is the mean response, $\hat Y_h$ is the observed value of $Y_h$, and $s_h^2={\rm MSE}\left(\frac{1}{n}+\frac{\tilde X_h^2}{\ell_{xx}}\right)$.

Working-Hotelling Confidence Band

$$y_h\in[\hat Y_h\pm Ws_h],\qquad W^2=2F_{1-\alpha}(2,n-2)$$

Remark

Average response of $\infty$ predictions

1-8. $Y_{\rm pred}$

Definition

$$Y_{\rm pred}=Y_h+\varepsilon_{\rm pred},\qquad \varepsilon_{\rm pred}\sim N(0,\sigma^2)$$

Distribution

$$Y_{\rm pred}\sim N\left(\beta_0+\beta_1X_h,\sigma^2\left(1+\frac{1}{n}+\frac{\tilde X_h^2}{\ell_{xx}}\right)\right)$$

Remark

Response of single prediction

1-9. $Y_{\rm predmean}$

Definition

$$Y_{\rm predmean}=Y_h+\frac{1}{m}\sum_{j=1}^m\varepsilon_{{\rm pred}_j},\qquad \varepsilon_{{\rm pred}_j}\sim N(0,\sigma^2)\ \text{i.i.d.}$$

Distribution

$$Y_{\rm predmean}\sim N\left(\beta_0+\beta_1X_h,\sigma^2\left(\frac{1}{m}+\frac{1}{n}+\frac{\tilde X_h^2}{\ell_{xx}}\right)\right)$$

Remark

Average response of $m$ predictions

Relations Between $Y_h,\ Y_{\rm pred},\ Y_{\rm predmean}$

$$Y_h=Y_{\rm predmean}(m=\infty),\qquad Y_{\rm pred}=Y_{\rm predmean}(m=1)$$
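
The three variances differ only in the $1/m$ term. The helper below is a small sketch of that relation; the function name `variance_factor` and the numbers passed to it are assumptions made for illustration.

```python
def variance_factor(n, x_h, x_bar, l_xx, m=None):
    """Multiplier of sigma^2 in Var(Y_predmean).

    m=None stands for m -> infinity (the mean response Y_h);
    m=1 gives a single new observation Y_pred.
    """
    new_obs_term = 0.0 if m is None else 1.0 / m
    return new_obs_term + 1.0 / n + (x_h - x_bar) ** 2 / l_xx


# Hypothetical design: n = 5, X_bar = 3, l_xx = 10, prediction at X_h = 4.
n, x_bar, l_xx, x_h = 5, 3.0, 10.0, 4.0

f_mean = variance_factor(n, x_h, x_bar, l_xx)         # Y_h
f_pred = variance_factor(n, x_h, x_bar, l_xx, m=1)    # Y_pred
f_m10 = variance_factor(n, x_h, x_bar, l_xx, m=10)    # mean of 10 new observations
```

Averaging more new observations shrinks the factor toward the mean-response value but never below it, since the uncertainty in $b_0$ and $b_1$ remains.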

1-10. $e_i$

Definition

$$e_i=Y_i-b_0-b_1X_i$$

Distribution

$$e_i\sim N\left(0,\sigma^2\left(1-\frac{1}{n}-\frac{\tilde X_i^2}{\ell_{xx}}\right)\right),\qquad {\rm Cov}(e_i,e_j)=\sigma^2\left(-\frac{1}{n}-\frac{\tilde X_i\tilde X_j}{\ell_{xx}}\right)\quad (i\neq j)$$
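
A quick numerical check can make the residual variances less abstract. The sketch below uses made-up data (`X`, `Y` are assumptions) to confirm that the residuals satisfy the two normal-equation constraints and that the variance factors $1-\frac{1}{n}-\frac{\tilde X_i^2}{\ell_{xx}}$ add up to $n-2$, which matches $E({\rm SSE})=(n-2)\sigma^2$.

```python
# Hypothetical toy data, used only for illustration.
X = [1.0, 2.0, 3.0, 4.0, 5.0]
Y = [2.1, 3.9, 6.2, 7.8, 10.1]

n = len(X)
x_bar = sum(X) / n
y_bar = sum(Y) / n
l_xx = sum((x - x_bar) ** 2 for x in X)
b1 = sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y)) / l_xx
b0 = y_bar - b1 * x_bar

residuals = [y - b0 - b1 * x for x, y in zip(X, Y)]

# Constraints imposed by least squares: both sums vanish.
sum_e = sum(residuals)
sum_ex = sum(e * x for e, x in zip(residuals, X))

# Var(e_i) / sigma^2 for each observation
var_factors = [1.0 - 1.0 / n - (x - x_bar) ** 2 / l_xx for x in X]
total_factor = sum(var_factors)   # equals n - 2
```

Observations far from $\bar X$ get the smallest factors: the fitted line is pulled toward them, so their residuals vary less.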

1-11. Summary

| Variable | Distribution | Mean | Variance | Covariance |
| --- | --- | --- | --- | --- |
| $\varepsilon_i$ | $N$ | $0$ | $\sigma^2$ | $0$ |
| ${\rm SSE}/\sigma^2$ | $\chi^2(n-2)$ | | | |
| ${\rm MSE}/\sigma^2$ | $\frac{\chi^2(n-2)}{n-2}$ | | | |
| $Y_i$ | $N$ | $\hat Y_i$ | $\sigma^2$ | $0$ |
| $b_1$ | $N$ | $\beta_1$ | $\frac{\sigma^2}{\ell_{xx}}$ | |
| $b_0$ | $N$ | $\beta_0$ | $\sigma^2\left(\frac{1}{n}+\frac{\bar X^2}{\ell_{xx}}\right)$ | |
| $Y_{\rm predmean}$ | $N$ | $\beta_0+\beta_1X_h$ | $\sigma^2\left(\frac{1}{m}+\frac{1}{n}+\frac{\tilde X_h^2}{\ell_{xx}}\right)$ | |
| $e_i$ | $N$ | $0$ | $\sigma^2\left(1-\frac{1}{n}-\frac{\tilde X_i^2}{\ell_{xx}}\right)$ | $\sigma^2\left(-\frac{1}{n}-\frac{\tilde X_i\tilde X_j}{\ell_{xx}}\right)$ |

1-12. ANOVA Table

| ANOVA | ${\rm SS}$ | $df$ | ${\rm MS}$ | $E({\rm MS})$ |
| --- | --- | --- | --- | --- |
| Regression | ${\rm SSR}=\sum(\hat Y_i-\bar Y)^2$ | $1$ | ${\rm MSR}={\rm SSR}$ | $\sigma^2+\beta_1^2\ell_{xx}$ |
| Error | ${\rm SSE}=\sum(Y_i-\hat Y_i)^2$ | $n-2$ | ${\rm MSE}=\frac{{\rm SSE}}{n-2}$ | $\sigma^2$ |
| Total | ${\rm SSTO}=\sum(Y_i-\bar Y)^2$ | $n-1$ | | |

In this table $\hat Y_i=b_0+b_1X_i$ is the fitted value.

  • Coefficient of Determination $R^2:={\rm SSR}/{\rm SSTO}$
  • Coefficient of Correlation $r:=\mathop{\mathrm {sgn}}b_1\cdot \sqrt{R^2}=\ell_{xy}/\sqrt{\ell_{xx}\ell_{yy}}$
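
The decomposition ${\rm SSTO}={\rm SSR}+{\rm SSE}$ and the two expressions for $r$ can be verified on a small example. This is a sketch on made-up data (`X`, `Y` are assumptions), with `fitted` holding the fitted values $b_0+b_1X_i$.

```python
import math

# Hypothetical toy data, used only for illustration.
X = [1.0, 2.0, 3.0, 4.0, 5.0]
Y = [2.1, 3.9, 6.2, 7.8, 10.1]

n = len(X)
x_bar = sum(X) / n
y_bar = sum(Y) / n
l_xx = sum((x - x_bar) ** 2 for x in X)
l_yy = sum((y - y_bar) ** 2 for y in Y)
l_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y))

b1 = l_xy / l_xx
b0 = y_bar - b1 * x_bar
fitted = [b0 + b1 * x for x in X]

ssto = l_yy                                            # total sum of squares
ssr = sum((f - y_bar) ** 2 for f in fitted)            # regression sum of squares
sse = sum((y - f) ** 2 for y, f in zip(Y, fitted))     # error sum of squares

r2 = ssr / ssto
r_from_r2 = math.copysign(math.sqrt(r2), b1)           # sgn(b1) * sqrt(R^2)
r_direct = l_xy / math.sqrt(l_xx * l_yy)               # l_xy / sqrt(l_xx * l_yy)
```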

2. Estimations & Tests

2-1. $\beta_1$

Interval Estimation

$$t_{\frac{\alpha}{2}}(n-2)\leq \frac{b_1-\beta_1}{s_{b_1}}\leq t_{1-\frac{\alpha}{2}}(n-2)$$
$$\beta_1\in\left[b_1\pm t_{1-\frac{\alpha}{2}}(n-2)s_{b_1}\right]$$

Tests

$$H_0:\beta_1=0\quad\text{vs}\quad H_1:\beta_1\neq 0$$
  • $t$-Test
$$t:=\frac{b_1}{s_{b_1}},\qquad W=\Big\{|t|>t_{1-\frac{\alpha}{2}}(n-2)\Big\}$$
  • $F$-Test
$$F:=\frac{{\rm MSR}}{{\rm MSE}}\sim F(1,n-2),\qquad W=\Big\{F>F_{1-\alpha}(1,n-2)\Big\}$$
  • General Linear Test
$$R:\text{Reduced Model},\qquad F:\text{Full Model}$$
$$F:=\frac{{\rm SSE}_R-{\rm SSE}_F}{df_R-df_F}\Big/\frac{{\rm SSE}_F}{df_F},\qquad W=\Big\{F>F_{1-\alpha}(df_R-df_F,df_F)\Big\}$$
$$R:Y_i=\beta_0+\varepsilon_i,\qquad F:Y_i=\beta_1X_i+\beta_0+\varepsilon_i$$
$${\rm SSE}_R={\rm SSTO},\qquad {\rm SSE}_F={\rm SSE},\qquad df_R=n-1,\qquad df_F=n-2,\qquad F={\rm MSR}/{\rm MSE}$$
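
For simple linear regression the general linear test statistic coincides with ${\rm MSR}/{\rm MSE}$, and both equal $t^2$. The sketch below checks this on made-up data (`X`, `Y` are assumptions). Critical values are not computed, because Python's standard library has no $t$ or $F$ quantile function; they would come from a table or a statistics package.

```python
import math

# Hypothetical toy data, used only for illustration.
X = [1.0, 2.0, 3.0, 4.0, 5.0]
Y = [2.1, 3.9, 6.2, 7.8, 10.1]

n = len(X)
x_bar = sum(X) / n
y_bar = sum(Y) / n
l_xx = sum((x - x_bar) ** 2 for x in X)
b1 = sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y)) / l_xx
b0 = y_bar - b1 * x_bar

# Full model: Y = b0 + b1 X; reduced model: Y = b0 only (fitted by Y_bar)
sse_full = sum((y - b0 - b1 * x) ** 2 for x, y in zip(X, Y))
df_full = n - 2
sse_reduced = sum((y - y_bar) ** 2 for y in Y)
df_reduced = n - 1

f_stat = ((sse_reduced - sse_full) / (df_reduced - df_full)) / (sse_full / df_full)

# t statistic for H0: beta_1 = 0
mse = sse_full / df_full
t_stat = b1 / math.sqrt(mse / l_xx)
```

Because $F=t^2$ and $F_{1-\alpha}(1,n-2)=t^2_{1-\frac{\alpha}{2}}(n-2)$, the two-sided $t$-test and the $F$-test always reach the same decision.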

2-2. $\beta_0$

Interval Estimation

$$t_{\frac{\alpha}{2}}(n-2)\leq \frac{b_0-\beta_0}{s_{b_0}}\leq t_{1-\frac{\alpha}{2}}(n-2)$$
$$\beta_0\in\left[b_0\pm t_{1-\frac{\alpha}{2}}(n-2)s_{b_0}\right]$$

3. Error Normality & Variance Constancy

3-1. Studentized and Semi-studentized Residuals

$$s_{e_i}=\sigma_{e_i}\Big|_{\sigma=\sqrt{\rm MSE}},\qquad s_{e_i}^{(\rm stu)}=\frac{e_i}{s_{e_i}},\qquad s_{e_i}^{(\rm semistu)}=\frac{e_i}{\sqrt{\rm MSE}}$$

3-2. Normal Q-Q Plot of Residuals

$$E(\varepsilon_i\mid \text{$e_i$ is the $k$-th smallest among $e$})\approx u_{\frac{k-3/8}{n+1/4}}\sqrt{{\rm MSE}}$$

where $u_p$ is the $p$-quantile of the standard normal distribution.

  • Plot $y=E(\varepsilon_i\mid k),\ x=e_i$
  • The residuals are consistent with normality if the points scatter near $y=x$
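
The expected values for the Q-Q plot can be computed with the standard library alone. The following is a minimal sketch, assuming no tied residuals; the function name `expected_residuals` and the sample residuals are hypothetical.

```python
import math
from statistics import NormalDist


def expected_residuals(residuals, mse):
    """Approximate expected value under normality for each residual, by rank.

    Uses sqrt(MSE) * u_p with p = (k - 3/8) / (n + 1/4), where k is the
    rank of the residual (1 = smallest). Ties are not treated specially.
    """
    n = len(residuals)
    order = sorted(range(n), key=lambda i: residuals[i])
    expected = [0.0] * n
    for rank, i in enumerate(order, start=1):
        p = (rank - 0.375) / (n + 0.25)
        expected[i] = math.sqrt(mse) * NormalDist().inv_cdf(p)
    return expected


# Hypothetical residuals from a fit with n = 5 observations.
residuals = [0.06, -0.13, 0.18, -0.21, 0.10]
mse = sum(e ** 2 for e in residuals) / (len(residuals) - 2)
expected = expected_residuals(residuals, mse)

# Pairs (x, y) = (e_i, expected value) for the plot
qq_points = list(zip(residuals, expected))
```

Plotting `qq_points` with any plotting tool and comparing against the line $y=x$ gives the diagnostic described above.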

3-3. Brown-Forsythe Test for Variance Constancy

$$H_0:{\rm Var}(\varepsilon_i)={\rm const}\quad\text{vs}\quad H_1:{\rm Var}(\varepsilon_i)\neq {\rm const}$$
$$\mathcal X_{\rm OBS}=\mathcal X^{(1)}\cup \mathcal X^{(2)},\qquad \mathcal X^{(1)}=\{x\in\mathcal X_{\rm OBS}\mid x<\bar x\},\qquad \mathcal X^{(2)}=\mathcal X_{\rm OBS}\setminus\mathcal X^{(1)}$$
$$d^{(j)}_i=|e^{(j)}_i-e^{(j)}_{\rm mid}|,\qquad \tilde d^{(j)}_i=d^{(j)}_i-\bar d_j,\qquad s^2=\frac{\sum \left(\tilde d^{(1)}_i\right)^2+\sum \left(\tilde d^{(2)}_i\right)^2}{n-2}$$
$$t_{\rm BF}=\frac{\bar d_1-\bar d_2}{s\sqrt{\frac{1}{n_1}+\frac{1}{n_2}}},\qquad W=\Big\{|t_{\rm BF}|>t_{1-\frac{\alpha}{2}}(n-2)\Big\}$$

Here $e^{(j)}_{\rm mid}$ is the median residual in group $j$, $\bar d_j$ is the mean of the $d^{(j)}_i$, and $n_j$ is the size of group $j$.
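
The Brown-Forsythe statistic is a two-sample $t$ statistic on absolute deviations from the group medians. The sketch below follows that recipe with a pooled variance of the deviations; the function name `brown_forsythe_t` and the data are assumptions, and the comparison against $t_{1-\frac{\alpha}{2}}(n-2)$ is left to a table or a statistics package.

```python
import math
from statistics import mean, median


def brown_forsythe_t(X, residuals):
    """Brown-Forsythe t statistic, splitting the sample at the mean of X."""
    x_bar = mean(X)
    groups = (
        [e for x, e in zip(X, residuals) if x < x_bar],    # group 1: X below the mean
        [e for x, e in zip(X, residuals) if x >= x_bar],   # group 2: the rest
    )
    # Absolute deviations of residuals from their group median
    d = [[abs(e - median(g)) for e in g] for g in groups]
    d_bar = [mean(dj) for dj in d]
    n1, n2 = len(d[0]), len(d[1])
    # Pooled variance of the deviations, n1 + n2 - 2 degrees of freedom
    pooled = sum((v - d_bar[j]) ** 2 for j in range(2) for v in d[j]) / (n1 + n2 - 2)
    return (d_bar[0] - d_bar[1]) / math.sqrt(pooled * (1.0 / n1 + 1.0 / n2))


# Hypothetical residuals whose spread grows with X.
X = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
residuals = [0.1, -0.1, 0.2, -1.0, 1.5, -0.7]
t_bf = brown_forsythe_t(X, residuals)
```

A negative `t_bf` here reflects larger deviations in the high-$X$ group; whether it is significant depends on the critical value.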

3-4. Breusch-Pagan Test for Variance Constancy (Large Samples)

$$\ln \sigma_i^2=\gamma_0+\gamma_1X_i$$
$$H_0:{\rm Var}(\varepsilon_i)={\rm const}\iff \gamma_1=0\quad\text{vs}\quad H_1:{\rm Var}(\varepsilon_i)\neq {\rm const}$$
$$\chi^2_{\rm BP}=\frac{{\rm SSR}_\gamma}{2}\Big/\left(\frac{{\rm SSE}_\beta}{n}\right)^2\ \dot\sim\ \chi^2(1),\qquad W=\Big\{\chi^2_{\rm BP}>\chi^2_{1-\alpha}(1)\Big\}$$

Here ${\rm SSR}_\gamma$ is the regression sum of squares from regressing $e_i^2$ on $X_i$, and ${\rm SSE}_\beta$ is the error sum of squares from the original regression of $Y_i$ on $X_i$.
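
The Breusch-Pagan statistic needs two least-squares fits: the original one, and a second one with the squared residuals as the response. The sketch below shows that two-step computation on made-up data; `fit_line` and `breusch_pagan` are hypothetical helper names, and the comparison with $\chi^2_{1-\alpha}(1)$ is left to a table.

```python
def fit_line(X, Y):
    """Return (b0, b1, SSR, SSE) for the least-squares line of Y on X."""
    n = len(X)
    x_bar = sum(X) / n
    y_bar = sum(Y) / n
    l_xx = sum((x - x_bar) ** 2 for x in X)
    b1 = sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y)) / l_xx
    b0 = y_bar - b1 * x_bar
    fitted = [b0 + b1 * x for x in X]
    ssr = sum((f - y_bar) ** 2 for f in fitted)
    sse = sum((y - f) ** 2 for y, f in zip(Y, fitted))
    return b0, b1, ssr, sse


def breusch_pagan(X, Y):
    """Breusch-Pagan statistic (SSR_gamma / 2) / (SSE_beta / n)^2."""
    n = len(X)
    b0, b1, _, sse_beta = fit_line(X, Y)
    squared_residuals = [(y - b0 - b1 * x) ** 2 for x, y in zip(X, Y)]
    # Second regression: squared residuals on X
    _, _, ssr_gamma, _ = fit_line(X, squared_residuals)
    return (ssr_gamma / 2.0) / (sse_beta / n) ** 2


# Hypothetical data; the test is a large-sample one, so n = 6 is only a demo.
X = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
Y = [1.0, 2.1, 2.8, 4.5, 4.4, 7.2]
chi2_bp = breusch_pagan(X, Y)
```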

4. Samples with Repeated $X$

4-1. Partitioning $\mathcal X_{\rm OBS}$

Divide $\mathcal X_{\rm OBS}$ into $c$ groups, each sharing one value of $X$.

$$\mathcal X_{\rm OBS}=\bigcup \mathcal X^{(j)},\quad j=1,\cdots,c,\qquad {\rm SSE}={\rm SSPE}+{\rm SSLF}$$
$${\rm SSPE}=\sum_j\sum_i(Y_i^{(j)}-\bar Y^{(j)})^2,\qquad {\rm SSLF}=\sum_j\sum_i(\bar Y^{(j)}-\hat Y_i^{(j)})^2$$

| ANOVA | ${\rm SS}$ | $df$ | ${\rm MS}$ |
| --- | --- | --- | --- |
| Regression | ${\rm SSR}=\sum\sum(\hat Y_i^{(j)}-\bar Y)^2$ | $1$ | ${\rm MSR}={\rm SSR}$ |
| Error | ${\rm SSE}=\sum\sum(Y_i^{(j)}-\hat Y_i^{(j)})^2$ | $n-2$ | ${\rm MSE}=\frac{{\rm SSE}}{n-2}$ |
| Lack of Fit | ${\rm SSLF}=\sum\sum(\bar Y^{(j)}-\hat Y_i^{(j)})^2$ | $c-2$ | ${\rm MSLF}=\frac{{\rm SSLF}}{c-2}$ |
| Pure Error | ${\rm SSPE}=\sum\sum(Y_i^{(j)}-\bar Y^{(j)})^2$ | $n-c$ | ${\rm MSPE}=\frac{{\rm SSPE}}{n-c}$ |
| Total | ${\rm SSTO}=\sum\sum(Y_i^{(j)}-\bar Y)^2$ | $n-1$ | |

$$E({\rm MSPE})=\sigma^2,\qquad E({\rm MSLF})=\sigma^2+\frac{\sum n_j\left(\mu_j-(\beta_0+\beta_1X_j)\right)^2}{c-2}$$

Here $n_j$ is the size of group $j$, $X_j$ its common $X$ value, and $\mu_j=E(Y\mid X=X_j)$.

4-2. $F$-Test for Lack of Fit with Repeated $X$

$$H_0:EY=\beta_0+\beta_1X\quad\text{vs}\quad H_1:EY\neq \beta_0+\beta_1X$$
$$F:Y_i^{(j)}=\mu^{(j)}+\varepsilon_i^{(j)},\qquad R:Y_i=\beta_1X_i+\beta_0+\varepsilon_i$$
$${\rm SSE}_F={\rm SSPE},\qquad {\rm SSE}_R={\rm SSE},\qquad df_F=n-c,\qquad df_R=n-2$$
$$F={\rm MSLF}/{\rm MSPE}\sim F(c-2,n-c),\qquad W=\Big\{F>F_{1-\alpha}(c-2,n-c)\Big\}$$
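
The lack-of-fit decomposition can be traced step by step on a small replicated design. The sketch below uses made-up data with $c=4$ distinct $X$ values and two replicates each (all values are assumptions); it computes ${\rm SSLF}$ both as ${\rm SSE}-{\rm SSPE}$ and directly from the group means.

```python
from collections import defaultdict

# Hypothetical replicated data: each X value appears twice.
X = [1, 1, 2, 2, 3, 3, 4, 4]
Y = [1.1, 0.9, 2.3, 1.9, 2.8, 3.2, 4.6, 4.2]

n = len(X)
x_bar = sum(X) / n
y_bar = sum(Y) / n
l_xx = sum((x - x_bar) ** 2 for x in X)
b1 = sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y)) / l_xx
b0 = y_bar - b1 * x_bar

# Group the responses by their X value
groups = defaultdict(list)
for x, y in zip(X, Y):
    groups[x].append(y)
c = len(groups)

sse = sum((y - b0 - b1 * x) ** 2 for x, y in zip(X, Y))
# Pure error: variation of Y around its own group mean
sspe = sum((y - sum(ys) / len(ys)) ** 2 for ys in groups.values() for y in ys)
# Lack of fit: by subtraction, and directly from group means vs. fitted values
sslf = sse - sspe
sslf_direct = sum(
    len(ys) * (sum(ys) / len(ys) - (b0 + b1 * x)) ** 2 for x, ys in groups.items()
)

f_lof = (sslf / (c - 2)) / (sspe / (n - c))   # MSLF / MSPE
```

The statistic `f_lof` would then be compared with $F_{1-\alpha}(c-2,n-c)$ from a table.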

4-3. Box-Cox Transformations of Regression to $Y^\lambda$ or $\ln Y$

$$Y^\lambda=\left\{\begin{aligned} &Y^\lambda,\qquad&&\lambda\neq 0,\\ &\ln Y,\qquad&&\lambda=0 \end{aligned}\right.$$
$$Y^\lambda_i=\beta_0+\beta_1X_i+\varepsilon_i$$

5. Simultaneous PI & CI

$${\rm CI}_0=b_0\pm t_{1-\frac{\alpha}{2}}(n-2)s_{b_0},\qquad {\rm CI}_1=b_1\pm t_{1-\frac{\alpha}{2}}(n-2)s_{b_1}$$
$$\Pr(\beta_0\notin{\rm CI}_0)=\Pr(\beta_1\notin{\rm CI}_1)=\alpha$$
$$\Pr(\beta_0\in{\rm CI}_0\land \beta_1\in{\rm CI}_1)\geq 1-2\alpha$$

5-1. Joint CI of $\beta_0,\beta_1$

$$B:=t_{1-\frac{\alpha}{4}}(n-2)$$
$${\rm BonfCI}_0=b_0\pm Bs_{b_0},\qquad{\rm BonfCI}_1=b_1\pm Bs_{b_1}$$
$$\Pr(\beta_0\in{\rm BonfCI}_0\land \beta_1\in{\rm BonfCI}_1)\geq 1-\alpha$$
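
The joint coverage follows from the Bonferroni inequality. As a short worked derivation, write $A_j=\{\beta_j\in{\rm BonfCI}_j\}$ (a symbol introduced here only for this derivation). Each interval uses the quantile $t_{1-\frac{\alpha}{4}}(n-2)$, so each one misses with probability $\frac{\alpha}{2}$:

$$\begin{aligned}\Pr(A_0\land A_1)&=1-\Pr(\bar A_0\lor \bar A_1)\\&\geq 1-\Pr(\bar A_0)-\Pr(\bar A_1)\\&=1-\frac{\alpha}{2}-\frac{\alpha}{2}=1-\alpha\end{aligned}$$

The bound holds whatever the dependence between $b_0$ and $b_1$, which is why it is a lower bound on the joint coverage rather than an exact value.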

5-2. Simultaneous $Y_h$ CI of $\{X_{h_i}\}_{i=1}^g$

$$\hat Y_h\pm Us_{Y_h}$$

Bonferroni CI

$$U=B_\alpha(g)=t_{1-\frac{\alpha}{2g}}(n-2)$$

Working-Hotelling CI

$$U=W_\alpha=\sqrt{2F_{1-\alpha}(2,n-2)}$$

5-3. Simultaneous $Y_{\rm pred}$ PI of $\{X_{h_i}\}_{i=1}^g$

$$\hat Y_{\rm pred}\pm Us_{Y_{\rm pred}}$$

Bonferroni PI

$$U=B_\alpha(g)=t_{1-\frac{\alpha}{2g}}(n-2)$$

Scheffé PI

$$U=S_\alpha(g)=\sqrt{gF_{1-\alpha}(g,n-2)}$$

6. Regression Assuming $\beta_0=0$

$$Y_i=\beta_1X_i+\varepsilon_i$$

6-1. $b_1$

$$b_1=\frac{\sum X_iY_i}{\sum X_i^2}=\frac{\ell_{xy}}{\ell_{xx}}\Big|_{\bar X=\bar Y=0}\sim N\left(\beta_1,\frac{\sigma^2}{\ell_{xx}}\right)\Big|_{\bar X=0},\qquad df=n-1$$

6-2. $e_i$

$$e_i\sim N\left(0,\sigma^2\left(1-\frac{X_i^2}{\ell_{xx}}\right)\right)\Big|_{\bar X=0},\qquad {\rm Cov}(e_i,e_j)=\sigma^2\left(-\frac{X_iX_j}{\ell_{xx}}\right)\Big|_{\bar X=0}$$

6-3. $Y_{\rm pred}$

$$Y_{\rm pred}\sim N\left(\beta_1X_h,\sigma^2\left(1+\frac{X_h^2}{\ell_{xx}}\right)\right)\Big|_{\bar X=0}$$

6-4. ANOVA Table

| ANOVA | ${\rm SS}$ | $df$ | ${\rm MS}$ | $E({\rm MS})$ |
| --- | --- | --- | --- | --- |
| Regression | ${\rm SSRU}=\sum\hat Y_i^2$ | $1$ | ${\rm MSRU}={\rm SSRU}$ | $\sigma^2+\beta_1^2\ell_{xx}\mid_{\bar X=0}$ |
| Error | ${\rm SSE}=\sum(Y_i-\hat Y_i)^2$ | $n-1$ | ${\rm MSE}=\frac{{\rm SSE}}{n-1}$ | $\sigma^2$ |
| Total | ${\rm SSTOU}=\sum Y_i^2$ | $n$ | | |

6-5. Test of $\beta_1=0$

$$H_0:\beta_1=0\quad\text{vs}\quad H_1:\beta_1\neq 0$$
$$F:=\frac{{\rm MSRU}}{{\rm MSE}}\sim F(1,n-1),\qquad W=\Big\{F>F_{1-\alpha}(1,n-1)\Big\}$$
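
To see how the through-the-origin fit differs in practice, the sketch below computes $b_1$, the uncorrected sums of squares, and the $F$ statistic on made-up data (`X`, `Y` are assumptions). Note that here the sums are taken around $0$, not around $\bar Y$.

```python
# Hypothetical toy data, used only for illustration.
X = [1.0, 2.0, 3.0, 4.0, 5.0]
Y = [2.1, 3.9, 6.2, 7.8, 10.1]

n = len(X)
sum_xx = sum(x * x for x in X)
sum_xy = sum(x * y for x, y in zip(X, Y))

b1 = sum_xy / sum_xx                      # slope of the line through the origin
fitted = [b1 * x for x in X]

ssru = sum(f ** 2 for f in fitted)        # uncorrected regression sum of squares
sse = sum((y - f) ** 2 for y, f in zip(Y, fitted))
sstou = sum(y * y for y in Y)             # uncorrected total sum of squares

mse = sse / (n - 1)                       # only one coefficient is estimated
f_stat = ssru / mse                       # MSRU / MSE, with MSRU = SSRU
```

The identity ${\rm SSTOU}={\rm SSRU}+{\rm SSE}$ holds because $\sum e_iX_i=0$, while $\sum e_i=0$ is no longer guaranteed without an intercept.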