R is the easiest language to speak badly
Back to R. As I found out, there are lots of different ways to calculate means on subsets of data. I begin to wonder why so many different interfaces and functions have been developed over the years, and also why I didn’t use the aggregate function more often in the past.
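To make the "lots of different ways" concrete, here is a minimal sketch using only base R and the built-in iris data. The choice of these three approaches is my own illustration, not an exhaustive list, and the variable names are made up for the example.

```r
# Three base R routes to the mean of Sepal.Width per Species.

# 1. tapply(): apply a function to a vector, grouped by a factor.
by_tapply <- tapply(iris$Sepal.Width, iris$Species, mean)

# 2. split() + sapply(): cut the vector into a list by group, then summarise.
by_split <- sapply(split(iris$Sepal.Width, iris$Species), mean)

# 3. aggregate(): formula interface, returns a data frame.
by_aggregate <- aggregate(Sepal.Width ~ Species, data = iris, FUN = mean)

print(by_tapply)
print(by_split)
print(by_aggregate)
```

The first two return a named vector (or array), while aggregate returns a data frame, which is often the more convenient shape for further work.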
Can we blame internet search engines? Why should I learn a programming language properly when I can find approximate answers to my problem online? I may not end up with the best answer, but with something that works after all: don’t know why, but it works.
And sometimes the help files can be more difficult to understand than the code in the examples. Hence, I end up playing around with the example code until it works, and only then do I try to figure out how it works. That was my experience with reshape.
Maybe this is a bit harsh. It is always up to the individual to improve their language skills, but you can still get drunk in a pub even if all you know is how to order a beer. I think it was George Bernard Shaw who said: “R is the easiest language to speak badly.” No, actually he said: “English is the easiest language to speak badly.” Maybe that explains the success of both English and R?
Reading helps. More and more books on R have been published in recent years, and not only in English. But which should you pick? Xi’an’s review of The Art of R Programming suggests that it might be a good start.
Back to aggregate. Has anyone noticed that the formula interface of aggregate is different from that of summaryBy?
aggregate(cbind(Sepal.Width, Petal.Width) ~ Species, data=iris, FUN=mean)
     Species Sepal.Width Petal.Width
1     setosa       3.428       0.246
2 versicolor       2.770       1.326
3  virginica       2.974       2.026
versus
library(doBy)
summaryBy(Sepal.Width + Petal.Width ~ Species, data=iris, FUN=mean)
     Species Sepal.Width.mean Petal.Width.mean
1     setosa            3.428            0.246
2 versicolor            2.770            1.326
3  virginica            2.974            2.026
And another slightly more complex example:
aggregate(cbind(ncases, ncontrols) ~ alcgp + tobgp, data = esoph, FUN=sum)
summaryBy(ncases + ncontrols ~ alcgp + tobgp, data = esoph, FUN=sum)
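The difference matters in practice. In aggregate, the left-hand side of the formula is evaluated as an ordinary R expression, so a "+" there means arithmetic addition rather than "and also this variable". The sketch below uses only base R and the iris data; the object names are mine.

```r
# cbind() on the left-hand side: one mean per variable and species.
per_variable <- aggregate(cbind(Sepal.Width, Petal.Width) ~ Species,
                          data = iris, FUN = mean)

# "+" on the left-hand side: aggregate() adds the two columns row by row
# first, then takes the mean of that single summed column.
summed <- aggregate(Sepal.Width + Petal.Width ~ Species,
                    data = iris, FUN = mean)

print(per_variable)
print(summed)
```

So carrying the summaryBy habit of writing "+" over to aggregate does not raise an error. It quietly returns one column of summed values instead of two separate columns of means.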
Citation
For attribution, please cite this work as: Markus Gesmann (Feb 01, 2012) R is the easiest language to speak badly. Retrieved from https://magesblog.com/post/2012-02-01-r-is-easiest-language-to-speak-badly/
@misc{ 2012-r-is-the-easiest-language-to-speak-badly,
author = { Markus Gesmann },
title = { R is the easiest language to speak badly },
url = { https://magesblog.com/post/2012-02-01-r-is-easiest-language-to-speak-badly/ },
year = { 2012 },
updated = { Feb 01, 2012 }
}