linear model | mages' blog

Bayesian regression models using Stan in R

It seems the summer is coming to end in London, so I shall take a final look at my ice cream data that I have been playing around with to predict sales statistics based on temperature for the last couple of weeks [1], [2], [3]. Here I will use the new brms (GitHub, CRAN) package by Paul-Christian Bürkner to derive the 95% prediction credible interval for the four models I introduced in my first post about generalised linear models.

Visualising the predictive distribution of a log-transformed linear model

Last week I presented visualisations of theoretical distributions that predict ice cream sales statistics based on linear and generalised linear models, which I introduced in an earlier post. Theoretical distributions Today I will take a closer look at the log-transformed linear model and use Stan/rstan, not only to model the sales statistics, but also to generate samples from the posterior predictive distribution. The posterior predictive distribution is what I am most interested in.

Visualising theoretical distributions of GLMs

Two weeks ago I discussed various linear and generalised linear models in R using ice cream sales statistics. The data showed not surprisingly that more ice cream was sold at higher temperatures. icecream <- data.frame( temp=c(11.9, 14.2, 15.2, 16.4, 17.2, 18.1, 18.5, 19.4, 22.1, 22.6, 23.4, 25.1), units=c(185L, 215L, 332L, 325L, 408L, 421L, 406L, 412L, 522L, 445L, 544L, 614L) ) I used a linear model, a log-transformed linear model, a Poisson and Binomial generalised linear model to predict sales within and outside the range of data available.

Generalised Linear Models in R

Linear models are the bread and butter of statistics, but there is a lot more to it than taking a ruler and drawing a line through a couple of points. Some time ago Rasmus Bååth published an insightful blog article about how such models could be described from a distribution centric point of view, instead of the classic error terms convention. I think the distribution centric view makes generalised linear models (GLM) much easier to understand as well.

Reserving based on log-incremental payments in R, part III

This is the third post about Christofides’ paper on Regression models based on log-incremental payments [1]. The first post covered the fundamentals of Christofides’ reserving model in sections A - F, the second focused on a more realistic example and model reduction of sections G - K. Today’s post will wrap up the paper with sections L - M and discuss data normalisation and claims inflation. I will use the same triangle of incremental claims data as introduced in my previous post.

Reserving based on log-incremental payments in R, part II

Following on from last week’s post I will continue to go through the paper Regression models based on log-incremental payments by Stavros Christofides [1]. In the previous post I introduced the model from the first 15 pages up to section F. Today I will progress with sections G to K which illustrate the model with a more realistic incremental claims payments triangle from a UK Motor Non-Comprehensive account: # Page D5.

Reserving based on log-incremental payments in R, part I

A recent post on the PirateGrunt blog on claims reserving inspired me to look into the paper Regression models based on log-incremental payments by Stavros Christofides [1], published as part of the Claims Reserving Manual (Version 2) of the Institute of Actuaries. The paper is available together with a spreadsheet model, illustrating the calculations. It is very much based on ideas by Barnett and Zehnwirth, see [2] for a reference.