Common triangle manipulations

This tutorial will guide you through some common triangle manipulation functions available in Bermuda. First, let’s load some sample data to work with.

from bermuda import meyers_tri

meyers_tri is one of the triangles from Glenn Meyers’ monograph Stochastic Loss Reserving using Bayesian MCMC Models.

>>> meyers_tri

       Cumulative Triangle

 Number of slices:  1
 Number of cells:  100
 Triangle category:  Regular
 Experience range:  1988-01-01/1997-12-31
 Experience resolution:  12
 Evaluation range:  1988-12-31/2006-12-31
 Evaluation resolution:  12
 Dev Lag range:  0.0 - 108.0 months
 Fields:
   earned_premium
   paid_loss
   reported_loss
 Common Metadata:
   currency  USD
   country  US
   risk_basis  Accident
   reinsurance_basis  Net
   loss_definition  Loss+DCC

The Meyers triangle has 100 cells: 10 annual experience periods starting in 1988, each observed at 10 annual evaluations, the last of which falls at the end of 2006. The development lags range from 0 to 108 months because Bermuda measures development lag as the time elapsed since the end of each experience period. The triangle also carries metadata, telling us that the currency is US dollars, the country is the US, the risk basis is accident, the values are net of reinsurance, and the loss definition is loss plus defense and cost containment expenses (DCC).
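The cell count and lag range in the summary can be reproduced by hand. The sketch below does not use Bermuda at all; it assumes each annual period is evaluated at 10 consecutive year-ends, starting at its own period end, which is consistent with the printed summary but is not read from the data itself. The helper dev_lag_months is a hypothetical name introduced for illustration.

```python
import datetime


def dev_lag_months(period_end: datetime.date, evaluation_date: datetime.date) -> int:
    """Whole months from the end of an experience period to an evaluation date."""
    return (evaluation_date.year - period_end.year) * 12 + (
        evaluation_date.month - period_end.month
    )


# Assumption: 10 annual periods ending 1988-12-31 through 1997-12-31.
period_ends = [datetime.date(year, 12, 31) for year in range(1988, 1998)]

# Assumption: each period is evaluated at 10 year-ends, starting at its own end.
cells = [
    (period_end, datetime.date(period_end.year + k, 12, 31))
    for period_end in period_ends
    for k in range(10)
]

lags = [dev_lag_months(period_end, evaluation) for period_end, evaluation in cells]
evaluation_dates = sorted({evaluation for _, evaluation in cells})

print(len(cells), min(lags), max(lags))  # 100 0 108
print(evaluation_dates[0], evaluation_dates[-1])  # 1988-12-31 2006-12-31
```

Measuring the lag from the period end is why the first evaluation of each period has a lag of 0 rather than 12 months.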

Clipping, filtering and aggregating triangles

It’s common to need to remove portions of a triangle based on experience period and/or evaluation date. Bermuda offers a clip method that makes these operations easy. For instance, imagine we wanted to turn Meyers’ 10x10 triangle into an upper-diagonal triangle, keeping only the cells evaluated on or before the end of the latest experience period. We can do so by clipping evaluation dates at a maximum of the latest experience period end.

# Triangle.periods returns a list of (period_start, period_end) tuples
_, latest_period_end = max(meyers_tri.periods)
clipped = meyers_tri.clip(max_eval=latest_period_end)

The new triangle, clipped, now has 55 cells with an evaluation date range from 1988 through 1997.

>>> clipped

       Cumulative Triangle

 Number of slices:  1
 Number of cells:  55
 Triangle category:  Regular
 Experience range:  1988-01-01/1997-12-31
 Experience resolution:  12
 Evaluation range:  1988-12-31/1997-12-31
 Evaluation resolution:  12
 Dev Lag range:  0.0 - 108.0 months
 Fields:
   earned_premium
   paid_loss
   reported_loss
 Common Metadata:
   currency  USD
   country  US
   risk_basis  Accident
   reinsurance_basis  Net
   loss_definition  Loss+DCC

Triangle.clip can similarly clip for minimum and maximum experience periods as well as evaluation dates.

A more powerful, lower-level operation is Triangle.filter, which takes any function of cells that returns a boolean, and filters cells accordingly. For instance, the same clipping operation as above could be performed with filter:

clipped = meyers_tri.filter(
    lambda cell: cell.evaluation_date <= meyers_tri.periods[-1][1]
)
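The 55-cell count follows directly from the predicate "evaluation date no later than the end of the latest period". A minimal Bermuda-free sketch, assuming the same 10x10 layout of annual periods and year-end evaluations described earlier (the tuples below are stand-ins, not Bermuda cells):

```python
import datetime

# Assumption: (period_end, evaluation_date) pairs for a 10x10 annual triangle.
cells = [
    (datetime.date(year, 12, 31), datetime.date(year + k, 12, 31))
    for year in range(1988, 1998)
    for k in range(10)
]

# The same rule that the clip and filter calls express: keep cells evaluated
# on or before the end of the latest experience period.
max_eval = max(period_end for period_end, _ in cells)
kept = [(period_end, evaluation) for period_end, evaluation in cells if evaluation <= max_eval]

print(max_eval, len(kept))  # 1997-12-31 55
```

The 1988 period keeps all 10 of its evaluations, the 1989 period keeps 9, and so on down to a single cell for 1997, giving 10 + 9 + ... + 1 = 55.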

Bermuda triangles can also be aggregated across their experience period and evaluation period axes. For example, we could turn the Meyers triangle into a single 10-year period using Triangle.aggregate.

import datetime

aggregated = meyers_tri.aggregate(
    period_resolution=(10, "year"),
    period_origin=datetime.date(1987, 12, 31),
)

For this to work, we use the period_origin argument to tell Bermuda that we want the aggregation to start from 1987-12-31, so that a single 10-year period covers every cell through 1997-12-31. By default, all cells are summed. The result is a triangle with a single period of 19 cells, some of which have negative development lags:

>>> (aggregated.periods, aggregated)

([(datetime.date(1988, 1, 1), datetime.date(1997, 12, 31))],
        Cumulative Triangle


  Number of slices:  1
  Number of cells:  19
  Triangle category:  Regular
  Experience range:  1988-01-01/1997-12-31
  Experience resolution:  120
  Evaluation range:  1988-12-31/2006-12-31
  Evaluation resolution:  12
  Dev Lag range:  -108.0 - 108.0 months
  Fields:
    earned_premium
    paid_loss
    reported_loss
  Common Metadata:
    currency  USD
    country  US
    risk_basis  Accident
    reinsurance_basis  Net
    loss_definition  Loss+DCC
 )

The negative development lags arise because the first cell for the experience period 1988-01-01 to 1997-12-31 is evaluated at 1988-12-31, which is 9 years (108 months) before the end of the period.
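The lag range of the aggregated triangle can be checked with plain date arithmetic. This sketch assumes one evaluation per year-end from 1988 through 2006, matching the evaluation range in the output above; months_between is a hypothetical helper, not a Bermuda function.

```python
import datetime


def months_between(start: datetime.date, end: datetime.date) -> int:
    """Signed number of whole months from start to end."""
    return (end.year - start.year) * 12 + (end.month - start.month)


# End of the single aggregated experience period.
period_end = datetime.date(1997, 12, 31)

# Assumption: one evaluation at each year-end from 1988 through 2006.
evaluation_dates = [datetime.date(year, 12, 31) for year in range(1988, 2007)]

lags = [months_between(period_end, evaluation) for evaluation in evaluation_dates]

print(len(lags), lags[0], lags[-1])  # 19 -108 108
```

Every evaluation before 1997-12-31 precedes the end of the aggregated period, so its lag is negative.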

Merging and summarizing multi-slice triangles

Imagine we now have separate triangles for paid losses and premiums, which might arise if someone has loaded triangle data from different sources. We can create two separate triangles by using the Triangle.select method.

meyers_paid = meyers_tri.select("paid_loss")
meyers_premium = meyers_tri.select("earned_premium")

We can combine these two triangles in a number of ways. Simply concatenating the two triangles with + keeps the cells from both inputs, so the result has twice as many cells, each holding only one of the two fields:

>>> meyers_paid + meyers_premium

       Cumulative Triangle

 Number of slices:  1
 Number of cells:  200
 Triangle category:  Regular
 Experience range:  1988-01-01/1997-12-31
 Experience resolution:  12
 Evaluation range:  1988-12-31/2006-12-31
 Evaluation resolution:  12
 Dev Lag range:  0.0 - 108.0 months
 Optional Fields:
   earned_premium (50.0% coverage)
   paid_loss (50.0% coverage)
 Common Metadata:
   currency  USD
   country  US
   risk_basis  Accident
   reinsurance_basis  Net
   loss_definition  Loss+DCC

The __repr__ method now tells us that there are 200 cells, and two Optional Fields that each have 50% coverage across triangle cells.

To create the original triangle again, the canonical method is merge, which is available as a method on Triangle.

>>> meyers_merged = meyers_paid.merge(meyers_premium)
>>> meyers_merged

       Cumulative Triangle

 Number of slices:  1
 Number of cells:  100
 Triangle category:  Regular
 Experience range:  1988-01-01/1997-12-31
 Experience resolution:  12
 Evaluation range:  1988-12-31/2006-12-31
 Evaluation resolution:  12
 Dev Lag range:  0.0 - 108.0 months
 Fields:
   earned_premium
   paid_loss
 Common Metadata:
   currency  USD
   country  US
   risk_basis  Accident
   reinsurance_basis  Net
   loss_definition  Loss+DCC
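The difference between concatenating and merging can be pictured with plain Python containers. This is only a conceptual sketch: the tuples and dicts below are hypothetical stand-ins for cells, the zero values are placeholders, and nothing here reflects how Bermuda implements either operation.

```python
# Hypothetical cell stand-in: (coordinate, field values), where the coordinate
# is (experience year, development lag in months).
coordinates = [(year, lag) for year in range(1988, 1998) for lag in range(0, 120, 12)]
paid_cells = [(coord, {"paid_loss": 0.0}) for coord in coordinates]
premium_cells = [(coord, {"earned_premium": 0.0}) for coord in coordinates]


def coverage(cells, field):
    """Fraction of cells that carry the given field."""
    return sum(field in values for _, values in cells) / len(cells)


# Concatenation keeps every cell, so each field appears in only half of them.
concatenated = paid_cells + premium_cells

# Merging combines the fields of cells that share a coordinate.
merged = {}
for coord, values in concatenated:
    merged.setdefault(coord, {}).update(values)
merged_cells = list(merged.items())

print(len(concatenated), coverage(concatenated, "paid_loss"))  # 200 0.5
print(len(merged_cells), coverage(merged_cells, "paid_loss"))  # 100 1.0
```

The 50% coverage reported for the concatenated triangle corresponds to the first print; the merged result carries both fields in every cell.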

There is an exception to this behavior, which occurs if the two triangles have different metadata. For instance, imagine we had the paid loss and earned premium triangles above, but with distinct metadata. We can create these triangles with help from the derive_metadata triangle method.

meyers_paid = meyers_tri.select("paid_loss").derive_metadata(
    details=dict(slice=1)
)

meyers_premium = meyers_tri.select("earned_premium").derive_metadata(
    details=dict(slice=2)
)

A merge operation would now return the same as meyers_paid + meyers_premium because merging shouldn’t take place across distinct metadata or triangle slices. In this case, the canonical pattern is to summarize the combined, multi-slice triangle.

>>> combined = meyers_paid + meyers_premium
>>> combined.summarize()

which returns a single-slice triangle like the one we started with. summarize works by figuring out the greatest common denominator of metadata elements, and using that to summarize triangles using a set of pre-defined field aggregation functions, which for loss and premium fields are all sums. If there is an unrecognized field, Bermuda will raise an error. For instance, let’s create a new field called paid_losses rather than paid_loss, using the derive_fields method, and try to summarize the triangles:

>>> meyers_paid_2 = meyers_paid.derive_fields(paid_losses=lambda cell: cell["paid_loss"])
>>> combined = meyers_paid_2 + meyers_premium
>>> combined.summarize()

 ...
 TriangleError: Don't know how to aggregate `paid_losses` values

The result is a TriangleError that indicates summarize does not know, a priori, how to summarize paid_losses. However, we can pass in a custom function to tell Bermuda what to do. Currently, this functionality is reserved for people comfortable with looking in the internals of bermuda.utils.summarize.SUMMARIZE_DEFAULTS, since the custom function relies on Bermuda's summarization logic:

>>> from bermuda.utils.summarize import _conforming_sum

>>> combined.summarize(
...     summary_fns={"paid_losses": lambda v: _conforming_sum(v["paid_losses"])}
... )

       Cumulative Triangle

 Number of slices:  1
 Number of cells:  100
 Triangle category:  Regular
 Experience range:  1988-01-01/1997-12-31
 Experience resolution:  12
 Evaluation range:  1988-12-31/2006-12-31
 Evaluation resolution:  12
 Dev Lag range:  0.0 - 108.0 months
 Fields:
   earned_premium
   paid_loss
   paid_losses
 Common Metadata:
   currency  USD
   country  US
   risk_basis  Accident
   reinsurance_basis  Net
   loss_definition  Loss+DCC

This procedure now returns a summarized triangle with both paid_loss and paid_losses.
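The general pattern at work, a table of default aggregation functions that can be extended per call, can be sketched in a few lines. Everything below is hypothetical: DEFAULT_SUMMARY_FNS and summarize_field are illustrative names, the function signature is simpler than the one Bermuda's summary_fns expects, and the sketch says nothing about the actual contents of SUMMARIZE_DEFAULTS.

```python
# Hypothetical defaults: known loss and premium fields are summed.
DEFAULT_SUMMARY_FNS = {
    "paid_loss": sum,
    "reported_loss": sum,
    "earned_premium": sum,
}


def summarize_field(field, values, summary_fns=None):
    """Aggregate one field's values, preferring caller-supplied functions."""
    fns = {**DEFAULT_SUMMARY_FNS, **(summary_fns or {})}
    if field not in fns:
        # Mirrors the spirit of the error above, using a plain ValueError.
        raise ValueError(f"Don't know how to aggregate `{field}` values")
    return fns[field](values)


print(summarize_field("paid_loss", [1.0, 2.0]))  # 3.0

try:
    summarize_field("paid_losses", [1.0, 2.0])
except ValueError as error:
    print(error)  # Don't know how to aggregate `paid_losses` values

# Supplying a function for the unknown field resolves the error.
print(summarize_field("paid_losses", [1.0, 2.0], {"paid_losses": sum}))  # 3.0
```

Failing loudly on unknown fields is a reasonable design choice here: silently summing a field that should instead be averaged or kept as-is would produce a plausible-looking but wrong triangle.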

In some cases, we might have premium present in two triangles that should not be summarized. For instance, imagine we had two triangles for paid and reported losses, each with earned premium as a field. We can use summarize_premium=False in our call to summarize to ensure that premium fields are not summed.

paid = meyers_tri.select(["paid_loss", "earned_premium"]).derive_metadata(
    details=dict(slice=1)
)
reported = meyers_tri.select(["reported_loss", "earned_premium"]).derive_metadata(
    details=dict(slice=2)
)

summarized = (paid + reported).summarize(summarize_premium=False)

How can we check that the triangle summarized has the same earned premium as both paid and reported triangles? Triangle cells are easy to iterate over, so one option is to zip both triangles, and iterate and compare their values, such as:

assert all(
    cell1["earned_premium"] == cell2["earned_premium"]
    for cell1, cell2
    in zip(summarized, paid)
)

But Bermuda also provides an extract() method on triangles, which returns a NumPy array and can make this easier.

assert (summarized.extract("earned_premium") == paid.extract("earned_premium")).all()
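One caveat about the zip-based check: Python's zip stops at the shorter input, so a comparison between sequences of different lengths can pass even though one side has extra elements. The sketch below uses plain lists as stand-ins for per-cell premium values; same_values is a hypothetical helper that adds an explicit length check.

```python
def same_values(left, right):
    """True only if both sequences have equal length and equal elements."""
    left, right = list(left), list(right)
    return len(left) == len(right) and all(a == b for a, b in zip(left, right))


# Placeholder premium values; the second list is missing its last element.
full = [100.0, 200.0, 300.0]
truncated = [100.0, 200.0]

# zip silently ignores the unmatched trailing element.
print(all(a == b for a, b in zip(full, truncated)))  # True

# The length-aware helper catches the mismatch.
print(same_values(full, truncated))  # False
print(same_values(full, list(full)))  # True
```

On Python 3.10 and later, passing strict=True to zip raises a ValueError on a length mismatch, which achieves the same protection.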