Brandon Harris

Big Data + Analytics @ Discover Financial

While going through some homework tonight, I got stuck checking my answers in R. The numbers were close, but R's results were slightly different from my manual calculations. As it turns out, when doing an independent two-sample t-test, R by default assumes that the two variances are not equal, which is why the output is headed “Welch Two Sample t-test”. I suppose the author of the text decided to spare us from calculating df under the Welch test and just gave us data where the two variances were equal (or maybe we’re just ignoring Welch for now).

Anyway, after poking through the R documentation, I found this little gem regarding the parameters of t.test.

var.equal: a logical variable indicating whether to treat the two variances as being equal. If TRUE then the pooled variance is used to estimate the variance otherwise the Welch (or Satterthwaite) approximation to the degrees of freedom is used.
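For reference, here is a sketch of the two calculations that passage is describing, written in standard textbook notation. The symbols are my own choice (sample sizes $n_1, n_2$, sample means $\bar{x}_1, \bar{x}_2$, sample variances $s_1^2, s_2^2$) and may not match the notation in your text.

```latex
% Pooled-variance test (var.equal = TRUE)
s_p^2 = \frac{(n_1 - 1)\,s_1^2 + (n_2 - 1)\,s_2^2}{n_1 + n_2 - 2},
\qquad
t = \frac{\bar{x}_1 - \bar{x}_2}{s_p \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}},
\qquad
\mathrm{df} = n_1 + n_2 - 2

% Welch test (var.equal = FALSE, the default)
t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}},
\qquad
\mathrm{df} \approx
\frac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^{2}}
     {\dfrac{\left(s_1^2 / n_1\right)^{2}}{n_1 - 1} + \dfrac{\left(s_2^2 / n_2\right)^{2}}{n_2 - 1}}
```

The Welch degrees of freedom are generally not a whole number, which is one quick way to spot which test R actually ran.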

Setting var.equal=TRUE in the call to t.test (I typed the shorthand var.equal=T) forced R to use the pooled-variance method from the text, and my answers magically matched up with the R output.
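To see the difference side by side, here is a minimal sketch. The two vectors are made-up numbers for illustration only (they are not my homework data), and the variable names are arbitrary. It runs t.test both ways and then redoes the pooled calculation by hand.

```r
# Hypothetical samples, invented purely for illustration
a <- c(5.1, 4.9, 5.6, 5.8, 6.0, 5.3)
b <- c(4.4, 4.8, 5.0, 4.6, 5.2)

# Default call: var.equal = FALSE, so R runs the Welch test
welch <- t.test(a, b)

# Pooled-variance version, the method most intro texts teach first
pooled <- t.test(a, b, var.equal = TRUE)

# Manual pooled calculation, to compare against the pooled output
n1 <- length(a)
n2 <- length(b)
sp2 <- ((n1 - 1) * var(a) + (n2 - 1) * var(b)) / (n1 + n2 - 2)
t_manual <- (mean(a) - mean(b)) / sqrt(sp2 * (1 / n1 + 1 / n2))

print(welch$method)                # which test the default call ran
print(pooled$method)               # which test var.equal = TRUE ran
print(unname(welch$parameter))     # Welch df
print(unname(pooled$parameter))    # pooled df, n1 + n2 - 2
print(all.equal(unname(pooled$statistic), t_manual))
```

The method string for the default call contains "Welch", and the pooled degrees of freedom equal n1 + n2 - 2. If your hand-computed t statistic matches the pooled output but not the default output, the var.equal setting is the culprit.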

Hopefully this post gets picked up by Google at some point and saves someone else the 30 minutes of checking and re-checking their manual work.

Lesson learned… Read the R docs first to understand what’s happening behind the scenes, then double-check my work.