Common triangle manipulations¶
This tutorial will guide you through some common triangle manipulation functions available in Bermuda. First, let’s load some sample data to work with.
from bermuda import meyers_tri
meyers_tri is one of the triangles from Glenn Meyers’ monograph Stochastic Loss Reserving Using Bayesian MCMC Models.
>>> meyers_tri
Cumulative Triangle
Number of slices: 1
Number of cells: 100
Triangle category: Regular
Experience range: 1988-01-01/1997-12-31
Experience resolution: 12
Evaluation range: 1988-12-31/2006-12-31
Evaluation resolution: 12
Dev Lag range: 0.0 - 108.0 months
Fields:
earned_premium
paid_loss
reported_loss
Common Metadata:
currency USD
country US
risk_basis Accident
reinsurance_basis Net
loss_definition Loss+DCC
The Meyers triangle has 100 cells spanning 10 annual experience periods starting in 1988, each with 10 annual evaluations, the latest of which falls at the end of 2006. The development lags range from 0 to 108 months because Bermuda measures development lag from the end of each experience period: the evaluation at a period’s own year-end is lag 0, and the tenth annual evaluation is lag 108. The triangle also provides metadata, telling us the values are denominated in US dollars, the country of business is the US, the triangle is on an accident basis, the values are net of ceded reinsurance, and the loss definition is loss plus defense and cost containment expenses (DCC).
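Cells themselves are easy to inspect. As a small sketch, relying only on behaviour used later in this tutorial (triangles are iterable cell by cell, and cells expose evaluation_date and dictionary-style field access), we can peek at a single cell:
# Peek at one cell of the triangle
first_cell = next(iter(meyers_tri))
print(first_cell.evaluation_date)
print(first_cell["paid_loss"])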
Clipping, filtering and aggregating triangles¶
It’s common to need to remove portions of a triangle based
on experience period and/or evaluation date.
Bermuda offers a clip
method that makes these operations
easy. For instance, imagine we wanted to turn Meyers’ 10x10 triangle
into an upper diagonal triangle.
We can do so by clipping evaluation dates at a maximum of the
latest experience period end.
# Triangle.periods returns a list of (period_start, period_end) tuples
_, latest_period_end = max(meyers_tri.periods)
clipped = meyers_tri.clip(max_eval=latest_period_end)
The new triangle, clipped, now has 55 cells with an evaluation date range from 1988 through 1997.
>>> clipped
Cumulative Triangle
Number of slices: 1
Number of cells: 55
Triangle category: Regular
Experience range: 1988-01-01/1997-12-31
Experience resolution: 12
Evaluation range: 1988-12-31/1997-12-31
Evaluation resolution: 12
Dev Lag range: 0.0 - 108.0 months
Fields:
earned_premium
paid_loss
reported_loss
Common Metadata:
currency USD
country US
risk_basis Accident
reinsurance_basis Net
loss_definition Loss+DCC
Triangle.clip can similarly clip on minimum and maximum experience periods as well as on evaluation dates.
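For example, a clip that keeps only the more recent experience periods might look like the sketch below; note that only max_eval appears in this tutorial, so the min_period argument name here is an assumption about the clip signature.
import datetime

# Hypothetical: keep only experience periods starting on or after 1990-01-01
recent = meyers_tri.clip(min_period=datetime.date(1990, 1, 1))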
A more powerful, lower-level operation is Triangle.filter, which takes any function of cells that returns a boolean, and filters cells accordingly. For instance, the same clipping operation as above could be performed with filter:
clipped = meyers_tri.filter(
    lambda cell: cell.evaluation_date <= meyers_tri.periods[-1][1]
)
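The predicate can be any function of a cell. For instance, here is a sketch that keeps only the earliest evaluations, reusing the cell.evaluation_date attribute from above:
import datetime

# Keep only cells evaluated on or before 1993-12-31
early_evals = meyers_tri.filter(
    lambda cell: cell.evaluation_date <= datetime.date(1993, 12, 31)
)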
Bermuda triangles can also be aggregated across their experience period and evaluation period axes. For example, we could turn the Meyers triangle into a single 10-year period using Triangle.aggregate.
import datetime
aggregated = meyers_tri.aggregate(
    period_resolution=(10, "year"),
    period_origin=datetime.date(1987, 12, 31),
)
For this to work, we use the period_origin argument to tell Bermuda to anchor the aggregation at 1987-12-31, so that a single 10-year period covers all experience through 1997-12-31. By default, cell values are combined by summing. The result is a triangle with negative development lags but a single period of 19 cells:
>>> (aggregated.periods, aggregated)
([(datetime.date(1988, 1, 1), datetime.date(1997, 12, 31))],
Cumulative Triangle
Number of slices: 1
Number of cells: 19
Triangle category: Regular
Experience range: 1988-01-01/1997-12-31
Experience resolution: 120
Evaluation range: 1988-12-31/2006-12-31
Evaluation resolution: 12
Dev Lag range: -108.0 - 108.0 months
Fields:
earned_premium
paid_loss
reported_loss
Common Metadata:
currency USD
country US
risk_basis Accident
reinsurance_basis Net
loss_definition Loss+DCC
)
The negative development lags indicate that the first cell for the experience period 1988-01-01 to 1997-12-31 is evaluated at 1988-12-31, 108 months (9 years) prior to the end of the period.
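The same machinery works at intermediate resolutions as well. As a sketch that reuses only the arguments shown above, aggregating to two-year experience periods anchored at the same origin should produce five periods (1988-1989 through 1996-1997), assuming the origin anchors period boundaries the same way as in the 10-year example.
# Two-year experience periods, anchored at 1987-12-31 as before
biennial = meyers_tri.aggregate(
    period_resolution=(2, "year"),
    period_origin=datetime.date(1987, 12, 31),
)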
Merging and summarizing multi-slice triangles¶
Imagine we now have separate triangles
for paid losses and premiums, which might arise if someone has loaded
triangle data from different sources. We can create two separate triangles
by using the Triangle.select
method.
meyers_paid = meyers_tri.select("paid_loss")
meyers_premium = meyers_tri.select("earned_premium")
We can combine these two triangles in a number of ways. Simply concatenating the two triangles stacks their cells into a single triangle in which each field is only partially populated:
>>> meyers_paid + meyers_premium
Cumulative Triangle
Number of slices: 1
Number of cells: 200
Triangle category: Regular
Experience range: 1988-01-01/1997-12-31
Experience resolution: 12
Evaluation range: 1988-12-31/2006-12-31
Evaluation resolution: 12
Dev Lag range: 0.0 - 108.0 months
Optional Fields:
earned_premium (50.0% coverage)
paid_loss (50.0% coverage)
Common Metadata:
currency USD
country US
risk_basis Accident
reinsurance_basis Net
loss_definition Loss+DCC
The __repr__
method now tells us that there are 200 cells,
and two Optional Fields
that each have 50% coverage across
triangle cells.
To create the original triangle again, the canonical method is merge, which is available as a method on Triangle.
>>> meyers_merged = meyers_paid.merge(meyers_premium)
>>> meyers_merged
Cumulative Triangle
Number of slices: 1
Number of cells: 100
Triangle category: Regular
Experience range: 1988-01-01/1997-12-31
Experience resolution: 12
Evaluation range: 1988-12-31/2006-12-31
Evaluation resolution: 12
Dev Lag range: 0.0 - 108.0 months
Fields:
earned_premium
paid_loss
Common Metadata:
currency USD
country US
risk_basis Accident
reinsurance_basis Net
loss_definition Loss+DCC
There is an exception to this operation, which occurs if the two triangles have different metadata. For instance, imagine we had the paid loss and earned premium triangles above, but with distinct metadata. We can create these triangles with help from the derive_metadata triangle method.
meyers_paid = meyers_tri.select("paid_loss").derive_metadata(
    details=dict(slice=1)
)
meyers_premium = meyers_tri.select("earned_premium").derive_metadata(
    details=dict(slice=2)
)
A merge operation would now return the same as meyers_paid + meyers_premium, because merging shouldn’t take place across distinct metadata or triangle slices. In this case, the canonical pattern is to summarize the combined, multi-slice triangle.
>>> combined = meyers_paid + meyers_premium
>>> combined.summarize()
which returns the single triangle that we started with. summarize works by figuring out the greatest common denominator of the metadata elements, and then combining slices with a set of pre-defined field aggregation functions; loss and premium fields are all summed.
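As a quick sanity check that summarizing the two slices recovers the original values, we can reuse the cell-iteration pattern from the end of this tutorial; this sketch assumes both triangles iterate their cells in the same order.
# Compare paid losses cell by cell between the summarized and original triangles
summarized_back = combined.summarize()
original = meyers_tri.select(["paid_loss", "earned_premium"])
assert all(
    cell1["paid_loss"] == cell2["paid_loss"]
    for cell1, cell2 in zip(summarized_back, original)
)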
If there is an unrecognized field, Bermuda will error. For instance, let’s create a new field called paid_losses rather than paid_loss, using the derive_fields method, and try to summarize the triangles:
>>> meyers_paid_2 = meyers_paid.derive_fields(paid_losses=lambda cell: cell["paid_loss"])
>>> combined = meyers_paid_2 + meyers_premium
>>> combined.summarize()
...
TriangleError: Don't know how to aggregate `paid_losses` values
The result is a TriangleError indicating that summarize does not know, a priori, how to summarize paid_losses.
However, we can pass in a custom function to tell Bermuda what to do. Currently, this functionality is reserved for people comfortable with looking at the internals of bermuda.utils.summarize.SUMMARIZE_DEFAULTS, since the custom function relies on Bermuda’s summarization logic:
>>> from bermuda.utils.summarize import _conforming_sum
>>> combined.summarize(
... summary_fns={"paid_losses": lambda v: _conforming_sum(v["paid_losses"])}
... )
Cumulative Triangle
Number of slices: 1
Number of cells: 100
Triangle category: Regular
Experience range: 1988-01-01/1997-12-31
Experience resolution: 12
Evaluation range: 1988-12-31/2006-12-31
Evaluation resolution: 12
Dev Lag range: 0.0 - 108.0 months
Fields:
earned_premium
paid_loss
paid_losses
Common Metadata:
currency USD
country US
risk_basis Accident
reinsurance_basis Net
loss_definition Loss+DCC
This procedure now returns a summarized triangle with both paid_loss and paid_losses.
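If the duplicate field isn’t actually needed, a simpler route that uses only methods already shown in this tutorial is to drop it before combining, so that the default aggregation functions suffice:
# Re-select only the field Bermuda already knows how to aggregate
combined = meyers_paid_2.select("paid_loss") + meyers_premium
combined.summarize()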
In some cases, we might have premium present in
two triangles that should not be summarized.
For instance, imagine we had two triangles for paid and reported losses,
each with earned premium as a field.
We can use summarize_premium=False
in our call to summarize
to ensure that premium fields are not summed.
paid = meyers_tri.select(["paid_loss", "earned_premium"]).derive_metadata(
    details=dict(slice=1)
)
reported = meyers_tri.select(["reported_loss", "earned_premium"]).derive_metadata(
    details=dict(slice=2)
)
summarized = (paid + reported).summarize(summarize_premium=False)
How can we check that the triangle summarized has the same earned premium as both the paid and reported triangles? Triangle cells are easy to iterate over, so one option is to zip the two triangles and compare their values cell by cell:
assert all(
    cell1["earned_premium"] == cell2["earned_premium"]
    for cell1, cell2 in zip(summarized, paid)
)
But Bermuda also provides an extract() method on triangles, which returns a NumPy array and can make this easier.
assert (summarized.extract("earned_premium") == paid.extract("earned_premium")).all()
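Because extract returns plain NumPy arrays, it also lends itself to quick diagnostics. For instance, assuming the arrays are aligned cell by cell as in the check above, an element-wise paid loss ratio is simply:
# Element-wise paid loss ratio for each cell of the summarized triangle
loss_ratio = summarized.extract("paid_loss") / summarized.extract("earned_premium")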