6 May 2022
Advanced Concepts of Clustering in Insurance
Cluster analysis is the task of grouping a set of objects (e.g. observations, policies, claims) in such a way that objects in the same group (called a cluster) are more similar to each other than to those in other groups. In contrast to simple segmentation (e.g. by geographical location only), clustering uses several features at once to differentiate among those groups. Potential applications are manifold and centre on questions such as:
The course shows how different algorithms can be used to obtain a segmentation of insurance data. The methods covered range from centroid-based (k-means, k-prototypes) to probabilistic (Gaussian Mixture Models) and density-based (DBSCAN) approaches. We demonstrate how the clustering results can be visualised and evaluated, and how they can be used to identify outliers in the data set. We also cover techniques that reduce the dimension of the data so that the segments can be computed either on aggregated information or on only a subset of the available information. The course emphasises practical application and therefore showcases all concepts on an insurance data set.
Organised by the EAA – European Actuarial Academy GmbH.
The web session is open to all interested persons. Prior knowledge of statistical clustering is not required but is recommended, for example through participation in the introductory course “Practical Application of Clustering in Insurance”. Experience with the programming language R is helpful, as it is used to analyse the insurance data set.
Technical Requirements
Please check with your IT department whether your firewall and computer settings support web session participation (Zoom is used for this online training). Please also make sure that you join the web session with a stable internet connection.
Purpose and Nature
The following topics will be covered:
The theoretical coverage is supplemented with a practical example on an insurance data set using the programming language R.