ยท Econometrics  ยท 2 min read

Two sample T-test example

``` r dental = read.csv("data/dental.csv") boxplot(resp~treatment,data=dental,col='red') boxplot(log(resp)~treatment,data=dental) var.test(resp~treatment,data=dental) variance equality test var.te...

dental = read.csv("data/dental.csv")
boxplot(resp~treatment,data=dental,col='red')
boxplot(log(resp)~treatment,data=dental)
var.test(resp~treatment,data=dental) # variance equality test
var.test(log(resp)~treatment,data=dental) # variance equality test, log-normal
t.test(resp~treatment,data=dental) # Welch test
t.test(log(resp)~treatment,var.equal=TRUE,data=dental) # pooled variance test
regout=lm(log(resp)~treatment,data=dental)
shapiro.test(resid(regout))

ํ†ต๊ณ„ํ•™์—์„œ ๋‘ ๋ณ€์ˆ˜์˜ ํ‰๊ท ๊ฐ’์ด ๊ฐ™์€์ง€ ์•Œ์•„๋ณด๋Š” ๋ฐฉ๋ฒ•์„ ๋ช‡ ๊ฐ€์ง€๋กœ ๋‚˜๋ˆ ๋ณผ ์ˆ˜ ์žˆ๋‹ค.

  1. ๋™์ผํ•œ ์‹คํ—˜ ๋Œ€์ƒ์˜ ํŠน์ •๋ณ€์ˆ˜ ๊ฐ’์ด ๋ณ€ํ–ˆ๋Š”์ง€ ํ™•์ธํ•˜๋Š” ๊ฒฝ์šฐ, ๋‘ ๋ณ€์ˆ˜์˜ ์ฐจ๋ฅผ T-test ํ•ด๋ณด๋ฉด ๋ฐ”๋กœ ๊ฒฐ๊ณผ๋ฅผ ์•Œ ์ˆ˜ ์žˆ๋‹ค. ์˜ˆ๋ฅผ ๋“ค์–ด, ์–ด๋–ค ๋‹ค์ด์–ดํŠธ ํ”„๋กœ๊ทธ๋žจ์„ ์‹คํ–‰ํ•œ ๊ฐ ๊ฐœ์ธ์˜ ๋ชธ๋ฌด๊ฒŒ ๋ณ€ํ™”๋ฅผ ๋ณด๋Š” ๊ฒฝ์šฐ๊ฐ€ ์ด์— ์†ํ•œ๋‹ค. (Paired sample)
  2. ๋™์ผํ•œ ๋ถ„์‚ฐ์„ ๊ฐ–๋Š” ๊ฒฝ์šฐ (Pooled variance)
  3. ์„œ๋กœ ๋‹ค๋ฅธ ๋ถ„์‚ฐ์„ ๊ฐ–๋Š” ๊ฒฝ์šฐ (Welch test) 2์™€ 3์˜ ๊ฒฝ์šฐ๋ฅผ Variance Equality Test๋ฅผ ํ†ตํ•ด ํ™•์ธํ•  ์ˆ˜ ์žˆ๋‹ค. ๊ทธ์— ์•ž์„œ ๋ฐ•์Šคํ”Œ๋กฏ์„ ํ†ตํ•ด ์‹œ๊ฐ์ ์œผ๋กœ ํ™•์ธํ•ด ๋ณผ ์ˆ˜๋„ ์žˆ๊ฒ ๋‹ค. ์œ„ ๋ฐ์ดํ„ฐ๋Š” ์ƒ๋ฌผํ•™ ์‹คํ—˜ ์ค‘ ๋ฐ•ํ…Œ๋ฆฌ์•„ ๋ฐฐ์–‘์—์„œ ์ถ”์ถœํ•œ ๋ฐ•ํ…Œ๋ฆฌ์•„์˜ ์ˆ˜๋ฅผ ๊ธฐ๋กํ•œ ๊ฒƒ์ด๊ณ  ๋ฐ•ํ…Œ๋ฆฌ์•„ ์ˆ˜๋Š” ์ผ๋ฐ˜์ ์œผ๋กœ ์ง€์ˆ˜์ ์œผ๋กœ ์ฆ๊ฐ€ํ•˜๋Š” ๊ฒƒ์œผ๋กœ ์•Œ๋ ค์ ธ ์žˆ๋‹ค. ๋”ฐ๋ผ์„œ Log ๋ณ€ํ™˜์„ ํ†ตํ•ด ์ •๊ทœ๋ถ„ํฌ๋ฅผ ๋”ฐ๋ฅด๊ฒŒ ๋˜๋ฏ€๋กœ log-normal distribution์ด๋ผ๊ณ  ํ•œ๋‹ค. Two sample t-test๋ฅผ ์ˆ˜ํ–‰ํ•˜๊ธฐ ์œ„ํ•ด ๋ชจ์ง‘๋‹จ์ด ์ •๊ทœ๋ถ„ํฌ๋ฅผ ๋”ฐ๋ฅด๋Š”์ง€ ๋ณด๊ธฐ์œ„ํ•ด Regression์„ ์ˆ˜ํ–‰ํ•œ ํ›„ ์ž”์ฐจํ•ญ์— ๋Œ€ํ•ด Shapiro test๋ฅผ ์ˆ˜ํ–‰ํ•˜์˜€๋‹ค.
Back to Blog