Why do we divide by n-1 when calculating the sample standard deviation?
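For a shorter proof, here are a few things we need to know: $x_1, x_2, x_3, \ldots, x_n$ are independent observations from a population with mean $\mu$ and variance $\sigma^2$, so $E[x_i]=\mu$ and $V(x_i)=\sigma^2$. Since $V(X)=E[X^2]-E[X]^2$, we have $E[x_i^2]=\sigma^2+\mu^2$. Applying the same identity to the sample mean $\bar{X}$, which has $E[\bar{X}]=\mu$ and $V(\bar{X})=\frac{\sigma^2}{n}$, gives $E[\bar{X}^2]=\mu^2+\frac{\sigma^2}{n}$.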
The sample variance is defined as follows: $$S^2=\frac{1}{n-1}\sum_{i=1}^{n}(x_i-\bar{X})^2,$$ where $\bar{X}$ is the sample mean.
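Expanding the sum on the right-hand side of this equation,

$$\sum_{i=1}^{n}(x_i-\bar{X})^2 = x_1^2+x_2^2+x_3^2+\cdots+x_{n-1}^2+x_n^2 - 2\bar{X}(x_1+x_2+x_3+\cdots+x_{n-1}+x_n) + n\bar{X}^2.$$

If we take the expectation of both sides and use $x_1+x_2+\cdots+x_n=n\bar{X}$,

$$\begin{aligned}
E\left[\sum_{i=1}^{n}(x_i-\bar{X})^2\right] &= E[x_1^2]+E[x_2^2]+E[x_3^2]+\cdots+E[x_n^2] - 2E[\bar{X}\cdot n\bar{X}] + nE[\bar{X}^2] \\
&= E[x_1^2]+E[x_2^2]+E[x_3^2]+\cdots+E[x_n^2] - nE[\bar{X}^2] \\
&= n(\mu^2+\sigma^2) - n\left(\mu^2+\frac{\sigma^2}{n}\right) \\
&= (n-1)\sigma^2.
\end{aligned}$$

$$\therefore \sigma^2 = \frac{1}{n-1}E\left[\sum_{i=1}^{n}(x_i-\bar{X})^2\right] = E[S^2].$$

In other words, dividing by $n-1$ makes $S^2$ an unbiased estimator of $\sigma^2$.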
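
The result can also be checked exactly on a small example. The sketch below is only an illustration, not part of the proof: it assumes the population is a fair six-sided die, draws samples of size $n=3$ with replacement, and enumerates every possible sample to compute the expectation with exact fractions. The function name `exact_expected_variances` is made up for this example.

```python
from fractions import Fraction
from itertools import product


def exact_expected_variances(values, n):
    """For i.i.d. draws, uniform over `values`, return a tuple:
    (population variance, E[sum of squares / (n - 1)], E[sum of squares / n]).
    Every possible sample of size n is enumerated, so the result is exact.
    """
    values = [Fraction(v) for v in values]
    k = len(values)
    mu = sum(values) / k
    sigma2 = sum((v - mu) ** 2 for v in values) / k

    total_ss = Fraction(0)
    for sample in product(values, repeat=n):  # all k**n equally likely samples
        xbar = sum(sample) / n
        total_ss += sum((x - xbar) ** 2 for x in sample)

    expected_ss = total_ss / k ** n  # E[sum_i (x_i - xbar)^2]
    return sigma2, expected_ss / (n - 1), expected_ss / n


sigma2, with_n_minus_1, with_n = exact_expected_variances([1, 2, 3, 4, 5, 6], n=3)
print(sigma2, with_n_minus_1, with_n)  # 35/12 35/12 35/18
```

Dividing by $n-1$ reproduces the population variance $35/12$ exactly, while dividing by $n$ gives $35/18$, which is too small by the factor $(n-1)/n = 2/3$.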