How many more R-bloggers posts can I expect?
I noticed that the monthly number of posts on R-bloggers stopped increasing over the last year. Indeed, the last couple of months saw a decline in posts compared to the previous year. So, has most of what can be said and written about R been said already?
Who knows? Well, I took a stab at looking into the future. However, I can tell you already that I am not convinced by my predictions. But maybe someone else will be inspired to take this work forward.
First, I have to get the data - that’s easy, I can scrape the monthly post counts from the R-bloggers homepage.
Looking at the incremental and cumulative plots, and believing that eventually the number of R posts will decrease, I thought that a logistic growth function would provide a nice fit to the data and also give an asymptotic view of the total number of posts on R-bloggers.
Although the fit, see below, looks reasonable at first glance, I don’t believe it provides a sensible prediction of the future. The model would forecast only another 1,269 posts by the end of 2016, with not much more to expect after that. Indeed, the asymptotic total number of posts K is only 14,396. I don’t believe this can be right, not even as a proxy, when the current count of monthly posts is well above 100.
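For readers who want to experiment with the idea, here is a minimal sketch of a logistic growth fit to cumulative counts. It does not use the scraped R-bloggers data: the series is simulated, and the names (dat, month, cum_posts, true_K and so on) as well as the choice of nls with the self-starting SSlogis model are illustrative assumptions, not necessarily how the fit above was produced.

```r
# Logistic growth: cum(t) = K / (1 + exp((mid - t) / scal))
# K is the asymptote, mid the inflection month, scal the time scale.

set.seed(1)

# Simulated cumulative post counts (an assumption, NOT the R-bloggers data)
true_K    <- 14000
true_mid  <- 60
true_scal <- 12
month     <- 1:96
cum_posts <- true_K / (1 + exp((true_mid - month) / true_scal)) +
  rnorm(length(month), sd = 50)
dat <- data.frame(month = month, cum_posts = cum_posts)

# SSlogis is a self-starting model, so no start values are needed
fit <- nls(cum_posts ~ SSlogis(month, K, mid, scal), data = dat)
est <- coef(fit)

# Extrapolate three years ahead and compare with the fitted asymptote
future    <- data.frame(month = 97:132)
pred      <- predict(fit, newdata = future)
remaining <- est[["K"]] - tail(dat$cum_posts, 1)

print(est)
print(remaining)
```

The difference between the fitted asymptote and the latest cumulative count is the model's implied number of posts still to come, which is the quantity that looks implausibly small in the fit discussed above.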
I played around with the data and the logistic growth function a little further, using annual instead of monthly data, changing the time horizon and fixing K, yet without much success.
Eventually I recalled a talk by Rob Hyndman about his forecast package. After all, I have a time series here. So, applying the forecast function to the incremental data provides a somewhat more realistic prediction of 2,695 posts for the next 12 months, but with an increasing trend in monthly posts for 2014, which I find hard to believe given the observations over the last year.
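The time series approach can be sketched as follows, assuming the forecast package is installed. Again the monthly counts are simulated rather than scraped, and the object names (monthly_posts, fc) and the start date are hypothetical. Called on a monthly ts object without further arguments, forecast() selects an exponential smoothing model automatically; whether that matches the model behind the numbers quoted above is an assumption.

```r
library(forecast)

set.seed(42)

# Simulated monthly post counts (an assumption, NOT the R-bloggers data)
n <- 72
monthly_posts <- ts(round(100 + 1.5 * (1:n) + rnorm(n, sd = 15)),
                    start = c(2008, 1), frequency = 12)

# Forecast the next 12 months; the model is chosen automatically
fc <- forecast(monthly_posts, h = 12)

# Point forecasts per month and their sum over the next year
print(fc$mean)
print(sum(fc$mean))

# Plot the history together with the prediction intervals
plot(fc)
```

Summing the point forecasts in fc$mean gives a 12-month total comparable in kind to the 2,695 figure quoted above, although the value itself will differ because the input here is simulated.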
Well, I presented two models here: One predicts a rapid decline in monthly posts on R-bloggers, while the other forecasts an increase. Neither feels right to me. Of course time will tell, but have you got any ideas or views?
R version 3.0.2 (2013-09-25)
Platform: x86_64-apple-darwin10.8.0 (64-bit)
attached base packages:
 stats graphics grDevices utils datasets methods base
other attached packages:
 forecast_4.8 xts_0.9-7 zoo_1.7-10 XML_3.95-0.2
loaded via a namespace (and not attached):
 colorspace_1.2-4 fracdiff_1.4-2
 grid_3.0.2 lattice_0.20-23
 nnet_7.3-7 parallel_3.0.2
 quadprog_1.5-5 Rcpp_0.10.6
 RcppArmadillo_0.3.920.1 tools_3.0.2