How have we used R to improve how we process Tripartite Template (TPT) asset data?


This is the third in our case study series, showcasing some of the ways we have been using R at Barnett Waddingham.

Checking and cleaning data is an incredibly important first step in actuarial workflows. If you put garbage data into your models, you will get garbage results out of them: garbage in, garbage out (GIGO)!

Unfortunately, checking and cleaning data can also be one of the most tedious and least exciting parts of an actuary's day.

This case study shows how we have used R to automate the checking and cleaning of Tripartite asset data. We are now using this tool on the valuations we carry out as part of our outsourced Chief Actuary roles.

Why did we automate the checking of TPT data?

Historically, we carried out our checking and cleaning of TPT data using a semi-manual process in Excel. The process was adequate in the sense that we were satisfied the final data was appropriate to use in our valuations, but it was time-consuming and tedious to carry out.

TPT asset data is a great candidate for automation as the structure is standardised and there are often large volumes of data. Now that we have automated this process using R, the checking and cleaning of TPT data is a breeze!  
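To give a flavour of what this kind of automation looks like, below is a minimal sketch of running a few automated checks on a standardised asset extract. The file name, column names and checks are simplified illustrations rather than the actual TPT field names or the checks in our tool.

```r
# Illustrative sketch only: the file name, column names and checks are
# simplified stand-ins, not the actual TPT fields or our production checks.
library(readr)
library(dplyr)

# Read one fund's asset extract (hypothetical file)
tpt <- read_csv("tpt_extract.csv", show_col_types = FALSE)

required_cols <- c("portfolio_id", "asset_id", "cic_code",
                   "market_value", "portfolio_total")

# Run a few simple checks and collect the results
checks <- list(
  columns_present  = all(required_cols %in% names(tpt)),
  no_missing_value = !any(is.na(tpt$market_value)),
  totals_reconcile = isTRUE(all.equal(sum(tpt$market_value),
                                      first(tpt$portfolio_total),
                                      tolerance = 1e-6))
)

checks
```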

How did we improve our process?

We carried out the improvements in the following steps:

  • We transposed the Excel-based process into R and ensured that the existing process was fully replicated.
  • We then built an R Shiny front-end to the automated process, which is accessed through a simple web interface.
  • Finally, we added a module that records the checking and cleaning carried out in the tool to automatically create an audit trail.  

What are the benefits of the new process?

The R Shiny front-end means that the end user does not need to edit any R code directly. They simply use the front-end to select which tests should be run and to set out how the data should be amended if needed.
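As a rough illustration of how a front-end like this can work, the snippet below sketches a minimal Shiny app that lets a user upload a file and tick which checks to run. The check names, controls and layout are hypothetical and far simpler than the production tool.

```r
# Hypothetical sketch of a Shiny front-end for selecting and running checks;
# the check names and layout are illustrative, not our production tool.
library(shiny)

available_checks <- c("Required columns present",
                      "No missing market values",
                      "Totals reconcile to portfolio total")

ui <- fluidPage(
  titlePanel("TPT data checking"),
  fileInput("tpt_file", "Upload TPT extract (CSV)"),
  checkboxGroupInput("checks", "Checks to run:",
                     choices = available_checks,
                     selected = available_checks),
  actionButton("run", "Run checks"),
  tableOutput("results")
)

server <- function(input, output, session) {
  results <- eventReactive(input$run, {
    req(input$tpt_file)
    # In the real tool each selected check would be evaluated against the
    # uploaded data; here we simply echo the user's selection.
    data.frame(check = input$checks, status = "pending")
  })
  output$results <- renderTable(results())
}

shinyApp(ui, server)
```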

The automated audit trail makes the governance aspects of the process incredibly easy as well. The generated document contains the following:

  • Standard audit information about which data was used, who carried out the work and where the results are saved.
  • Summary statistics for the data.
  • The results of the data checks prior to any data cleaning.
  • The data cleaning carried out and the rationale logged when making the changes to the data.
  • The results of the data checks after the data cleaning has taken place.
  • Space for the user to document why any residual failures of data checks after data cleaning are deemed acceptable.
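One common way to produce a document like this from R is to render a parameterised R Markdown template, passing in the run's metadata, check results and cleaning log. The sketch below assumes such a template ("audit_template.Rmd") exists with matching parameters declared in its YAML header; it illustrates the general approach rather than our actual audit module, and all the values shown are placeholders.

```r
# Hypothetical sketch: render an audit document from a parameterised
# R Markdown template. "audit_template.Rmd" is an assumed template with
# these params declared in its YAML header; the values are placeholders.
library(rmarkdown)

audit_params <- list(
  data_file     = "tpt_extract.csv",
  run_by        = Sys.info()[["user"]],
  run_date      = format(Sys.Date()),
  checks_before = data.frame(check  = "Totals reconcile",
                             passed = FALSE),
  cleaning_log  = data.frame(asset_id  = "A123",
                             change    = "Market value corrected",
                             rationale = "Rebased to reported portfolio total"),
  checks_after  = data.frame(check  = "Totals reconcile",
                             passed = TRUE)
)

render("audit_template.Rmd",
       output_file = "tpt_audit_trail.docx",
       params      = audit_params)
```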

Can we help you?

We want to share our knowledge on automated data checking, cleaning and manipulation so that others can reap the same benefits as us.

If you or your team could benefit from an efficient and automated data process, please contact us below and we will be happy to provide additional information and discuss your data process.
