13 Nov 2023
Hands-on Adaptive Learning of GLMs for Risk Modelling in R
In recent years, machine learning techniques have found their way into the insurance world. While these methods generally improve model accuracy, both explainability and manual interventions continue to play a key role in risk and tariff modelling. This is why practitioners in many lines of business still apply Generalised Linear Models (GLMs) today for non-life pricing.
But conventional modelling with GLMs comes with downsides. It is a mostly manual and step-by-step process, which may result in overfitting or unrecognised main/interaction effects.
However, GLMs do offer variants in the flavour of machine learning that automatically adapt to patterns in the data. These techniques are known as regularised GLMs, and their most prominent versions are the Lasso, Ridge regression and elastic nets. Not only can these methods proactively prevent overfitting, they can also adaptively learn non-linear patterns in the data, with the pre-processing and selection of variables implicitly built in.
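As a sketch of what such a penalty looks like, one common formulation of the elastic-net GLM estimator is given below. The notation is an assumption made for illustration and is not taken from the session material: \ell is the GLM log-likelihood over n observations, \beta_0 the intercept, \beta the coefficient vector, \lambda \ge 0 the penalty strength and \alpha \in [0, 1] the mixing weight.

```latex
(\hat{\beta}_0, \hat{\beta})
  = \operatorname*{arg\,min}_{\beta_0,\,\beta}
    \left\{
      -\frac{1}{n}\,\ell(\beta_0, \beta)
      + \lambda \left[ \alpha \,\lVert \beta \rVert_1
      + \frac{1-\alpha}{2}\,\lVert \beta \rVert_2^2 \right]
    \right\}
```

Setting \alpha = 1 gives the Lasso, \alpha = 0 gives Ridge regression, and values in between give an elastic net.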
In this web session, we will dive into a specific algorithm that uses GLM regularisation in an easy yet powerful way. In this algorithm, we first postulate a complex model structure that represents all potential linear and non-linear patterns for the main effects (and possibly interaction effects) in the data. We then introduce a global penalty term that reduces the model to the statistically significant effects only, with the penalty chosen so that model accuracy on unseen data is best.
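The idea of "postulate a rich structure, then penalise" can be sketched in a few lines of R. This is a minimal illustration under stated assumptions, not the algorithm presented in the session: it uses the glmnet package, simulated data, hypothetical variable names (age, exposure, claims) and a simple step-function encoding of a single continuous variable.

```r
# Illustrative sketch only: simulated data, hypothetical variable names.
library(glmnet)

set.seed(1)
n <- 5000
age <- sample(18:80, n, replace = TRUE)
exposure <- runif(n, 0.5, 1)

# Assumed true claim frequency: doubled for ages below 25, flat otherwise
true_freq <- 0.10 * ifelse(age < 25, 2, 1)
claims <- rpois(n, exposure * true_freq)

# Postulate a rich structure: one step indicator per age threshold.
# Column j is 1 when age >= threshold j, so each coefficient is a
# possible jump in the log-frequency between neighbouring ages.
thresholds <- 19:80
x <- outer(age, thresholds, ">=") * 1
colnames(x) <- paste0("age_ge_", thresholds)

# Lasso-penalised Poisson GLM; the penalty strength lambda is chosen
# by cross-validation, i.e. by accuracy on held-out data.
cv_fit <- cv.glmnet(x, claims, family = "poisson",
                    offset = log(exposure), alpha = 1)

# Coefficients at the selected lambda, without the intercept
beta <- as.matrix(coef(cv_fit, s = "lambda.1se"))[-1, 1]

# Thresholds that survive the penalty are the learned breakpoints
print(beta[beta != 0])
```

Thresholds whose coefficients are shrunk to zero merge neighbouring ages into one group, so the grouping of the variable is learned from the data rather than set by hand.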
Applying the algorithm results in a simple but generally more accurate model in which the relevant effects are learned adaptively in a data-driven, simultaneous and automated way. A key feature is that we can account for all common types of explanatory variables (continuous, ordinal, nominal) both at the same time and in the same way. The desired balance between model simplicity and forecast accuracy can be set by means of a single control parameter. The final model has a proven GLM structure that is still explainable and allows seamless integration into existing pricing workflows.
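To illustrate how a single parameter can trade simplicity against accuracy, the following sketch fits a Lasso path with the glmnet package and tabulates how many effects remain at each penalty strength. Everything here is an assumption made for illustration: the data are simulated, the variable names are hypothetical, and the encoding (step indicators for the continuous and ordinal variables, dummy indicators for the nominal one) is one possible choice, not necessarily the one used in the session.

```r
# Illustrative sketch only: simulated data, hypothetical variable names.
library(glmnet)

set.seed(2)
n <- 4000
age    <- sample(18:80, n, replace = TRUE)                     # continuous
bonus  <- sample(1:5, n, replace = TRUE)                       # ordinal
region <- factor(sample(c("north", "south", "east", "west"),
                        n, replace = TRUE))                    # nominal

# Assumed true frequency structure used to simulate claim counts
freq <- 0.1 * exp(0.6 * (age < 25) - 0.1 * (bonus - 1) +
                  0.3 * (region == "north"))
y <- rpois(n, freq)

# Step coding for ordered variables: one indicator per observed threshold
step_code <- function(v, name) {
  cuts <- sort(unique(v))[-1]
  m <- outer(v, cuts, ">=") * 1
  colnames(m) <- paste0(name, "_ge_", cuts)
  m
}

# All three variable types enter one design matrix
x <- cbind(step_code(age, "age"),
           step_code(bonus, "bonus"),
           model.matrix(~ region - 1))

# Lasso path over a decreasing sequence of penalty strengths
fit <- glmnet(x, y, family = "poisson", alpha = 1)

# For each lambda: number of non-zero effects and share of deviance explained
path <- data.frame(lambda    = fit$lambda,
                   n_effects = fit$df,
                   dev_ratio = fit$dev.ratio)
print(head(path))
print(tail(path))
```

Larger values of lambda keep fewer effects and give a simpler model, while smaller values admit more effects and fit the training data more closely.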
During the web session, we will first explore the theoretical foundations of regularised GLMs and the explicit design of the algorithm. The remainder will then be hands-on as we provide extensive code that implements the algorithm in the statistical programming language R. We will discuss and run the code. You will learn how to use the programme and apply the algorithm to non-life claims data for pricing. A further focus will be on the visualisation of the results, especially on the insights gained from the learned meta-results of the algorithm, e.g., the implicit way in which it selected, prioritised and pre-processed variables.
Organised by the EAA – European Actuarial Academy GmbH.
This web session is suited for actuaries in (but not restricted to) non-life pricing with experience in risk modelling using GLMs.
We assume basic knowledge of the theoretical foundations of GLMs and the conventional way of risk modelling, e.g., in P&C, although this is not strictly required to follow the web session.
As we will be diving into the underlying R code for this specific use case, you should bring general knowledge of the R programming language (https://www.r-project.org/) and experience with RStudio (https://www.rstudio.com). Both R and RStudio need to be installed prior to attending the web session in order to participate in the case study. We will distribute technical requirements in advance.
Technical Requirements
Please check with your IT department if your firewall and computer settings support web session participation (the programme Zoom is used for the web session). Please also make sure that you are joining the web session with a stable internet connection.
Purpose and Nature
The purpose of this web session is twofold:
First, you will gain an understanding of the algorithm and its underlying theoretical foundations.
Second, you will receive a comprehensive executable R programme that implements the algorithm. During the web session we will discuss and apply the code hands-on. You will learn how to run the algorithm for different settings and data. After the web session, you will have an R programme to add to your actuarial toolbox, which you can easily apply to your own data and extend or modify according to your needs.