Common triangle manipulations
=============================

This tutorial will guide you through some common triangle manipulation functions
available in Bermuda. First, let's load some sample data to work with.

.. code-block:: python

    from bermuda import meyers_tri

``meyers_tri`` is one of the triangles from Glenn Meyers' monograph
`Stochastic Loss Reserving using Bayesian MCMC Models `_.

.. code-block:: python

    >>> meyers_tri
    Cumulative Triangle
    Number of slices: 1
    Number of cells: 100
    Triangle category: Regular
    Experience range: 1988-01-01/1997-12-31
    Experience resolution: 12
    Evaluation range: 1988-12-31/2006-12-31
    Evaluation resolution: 12
    Dev Lag range: 0.0 - 108.0 months
    Fields:
        earned_premium
        paid_loss
        reported_loss
    Common Metadata:
        currency           USD
        country            US
        risk_basis         Accident
        reinsurance_basis  Net
        loss_definition    Loss+DCC

The Meyers triangle has 100 cells spanning 10 annual experience periods starting in
1988, with annual evaluation dates running from 1988 through 2006. The development
lags range from 0 to 108 months; Bermuda measures development lag as the time since
the end of each experience period. The triangle also carries metadata, telling us
that values are denominated in US dollars, the business was written in the US, the
triangle is on an accident basis, the values are net of ceded reinsurance, and the
loss definition is loss plus defense and cost containment (DCC) expenses.

Clipping, filtering and aggregating triangles
---------------------------------------------

It's common to need to remove portions of a triangle based on experience period
and/or evaluation date. Bermuda offers a ``clip`` method that makes these operations
easy. For instance, imagine we wanted to turn Meyers' 10x10 triangle into an upper
triangle. We can do so by clipping evaluation dates at a maximum of the latest
experience period end.

.. code-block:: python

    # Triangle.periods returns a list of (period_start, period_end) tuples
    _, latest_period_end = max(meyers_tri.periods)

    clipped = meyers_tri.clip(max_eval=latest_period_end)

The new triangle, ``clipped``, now has 55 cells with an evaluation date range from
1988 through 1997.

.. code-block:: python

    >>> clipped
    Cumulative Triangle
    Number of slices: 1
    Number of cells: 55
    Triangle category: Regular
    Experience range: 1988-01-01/1997-12-31
    Experience resolution: 12
    Evaluation range: 1988-12-31/1997-12-31
    Evaluation resolution: 12
    Dev Lag range: 0.0 - 108.0 months
    Fields:
        earned_premium
        paid_loss
        reported_loss
    Common Metadata:
        currency           USD
        country            US
        risk_basis         Accident
        reinsurance_basis  Net
        loss_definition    Loss+DCC

``Triangle.clip`` can similarly clip to minimum and maximum experience periods as
well as evaluation dates. A more powerful, lower-level operation is
``Triangle.filter``, which takes any function that maps a cell to a boolean and
keeps only the cells for which it returns ``True``. For instance, the same clipping
operation as above could be performed with ``filter``:

.. code-block:: python

    clipped = meyers_tri.filter(
        lambda cell: cell.evaluation_date <= meyers_tri.periods[-1][1]
    )
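Because ``filter`` accepts an arbitrary predicate, it can express selections that
``clip`` does not cover directly. As a small sketch, relying only on the fact (used
again at the end of this tutorial) that iterating over a triangle yields its cells,
we could keep just the latest diagonal of the clipped triangle:

.. code-block:: python

    # Find the most recent evaluation date present in the clipped triangle ...
    latest_eval = max(cell.evaluation_date for cell in clipped)

    # ... and keep only the cells evaluated on that date, i.e. the latest diagonal.
    latest_diagonal = clipped.filter(lambda cell: cell.evaluation_date == latest_eval)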
Bermuda triangles can also be aggregated across their experience period and
evaluation period axes. For example, we could turn the Meyers triangle into a
single 10-year period using ``Triangle.aggregate``.

.. code-block:: python

    import datetime

    aggregated = meyers_tri.aggregate(
        period_resolution=(10, "year"),
        period_origin=datetime.date(1987, 12, 31),
    )

Here, the ``period_origin`` argument tells Bermuda to count 10-year periods starting
from 1987-12-31, so every experience period through 1997-12-31 falls into a single
aggregated period. By default, cell values are summed. The result is a triangle with
negative development lags but a single period of 19 cells:

.. code-block:: python

    >>> (aggregated.periods, aggregated)
    ([(datetime.date(1988, 1, 1), datetime.date(1997, 12, 31))],
     Cumulative Triangle
     Number of slices: 1
     Number of cells: 19
     Triangle category: Regular
     Experience range: 1988-01-01/1997-12-31
     Experience resolution: 120
     Evaluation range: 1988-12-31/2006-12-31
     Evaluation resolution: 12
     Dev Lag range: -108.0 - 108.0 months
     Fields:
         earned_premium
         paid_loss
         reported_loss
     Common Metadata:
         currency           USD
         country            US
         risk_basis         Accident
         reinsurance_basis  Net
         loss_definition    Loss+DCC
     )

The negative development lags indicate that the first cell for the experience period
1988-01-01 to 1997-12-31 is evaluated at 1988-12-31, nine years (108 months) before
the end of the period.

Merging and summarizing multi-slice triangles
---------------------------------------------

Imagine we now have separate triangles for paid losses and premiums, which might
arise when triangle data has been loaded from different sources. We can create two
separate triangles using the ``Triangle.select`` method.

.. code-block:: python

    meyers_paid = meyers_tri.select("paid_loss")
    meyers_premium = meyers_tri.select("earned_premium")

We can combine these two triangles in a number of ways. Simply adding the two
triangles together concatenates their cells:

.. code-block:: python

    >>> meyers_paid + meyers_premium
    Cumulative Triangle
    Number of slices: 1
    Number of cells: 200
    Triangle category: Regular
    Experience range: 1988-01-01/1997-12-31
    Experience resolution: 12
    Evaluation range: 1988-12-31/2006-12-31
    Evaluation resolution: 12
    Dev Lag range: 0.0 - 108.0 months
    Optional Fields:
        earned_premium (50.0% coverage)
        paid_loss (50.0% coverage)
    Common Metadata:
        currency           USD
        country            US
        risk_basis         Accident
        reinsurance_basis  Net
        loss_definition    Loss+DCC

The ``__repr__`` method now tells us that there are 200 cells and two
``Optional Fields`` that each have 50% coverage across triangle cells. To recover
the original triangle, the canonical approach is the ``merge`` method on
``Triangle``.

.. code-block:: python

    >>> meyers_merged = meyers_paid.merge(meyers_premium)
    >>> meyers_merged
    Cumulative Triangle
    Number of slices: 1
    Number of cells: 100
    Triangle category: Regular
    Experience range: 1988-01-01/1997-12-31
    Experience resolution: 12
    Evaluation range: 1988-12-31/2006-12-31
    Evaluation resolution: 12
    Dev Lag range: 0.0 - 108.0 months
    Fields:
        earned_premium
        paid_loss
    Common Metadata:
        currency           USD
        country            US
        risk_basis         Accident
        reinsurance_basis  Net
        loss_definition    Loss+DCC
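As a quick sanity check, we can confirm that the merged triangle carries the same
paid losses as the original. This is only a sketch: it uses the cell iteration
pattern shown at the end of this tutorial and assumes both triangles iterate their
cells in the same order.

.. code-block:: python

    # Compare paid losses cell by cell between the merged and the original triangle.
    assert all(
        merged_cell["paid_loss"] == original_cell["paid_loss"]
        for merged_cell, original_cell in zip(meyers_merged, meyers_tri)
    )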
There is an exception to this operation, which occurs if the two triangles have
different metadata. For instance, imagine we had the paid and earned premium
triangles above but with distinct metadata. We can create these triangles with help
from the ``derive_metadata`` triangle method.

.. code-block:: python

    meyers_paid = meyers_tri.select("paid_loss").derive_metadata(
        details=dict(slice=1)
    )
    meyers_premium = meyers_tri.select("earned_premium").derive_metadata(
        details=dict(slice=2)
    )

A ``merge`` operation would now return the same result as
``meyers_paid + meyers_premium``, because merging shouldn't take place across
distinct metadata or triangle slices. In this case, the canonical pattern is to
``summarize`` the combined, multi-slice triangle.

.. code-block:: python

    >>> combined = meyers_paid + meyers_premium
    >>> combined.summarize()

This returns the single triangle that we started with. ``summarize`` works by
figuring out the greatest common denominator of metadata elements and using that to
summarize triangles with a set of pre-defined field aggregation functions; loss and
premium fields are all summed. If there is an unrecognized field, Bermuda will raise
an error. For instance, let's create a new field called ``paid_losses`` rather than
``paid_loss``, using the ``derive_fields`` method, and try to summarize the
triangles:

.. code-block:: python

    >>> meyers_paid_2 = meyers_paid.derive_fields(paid_losses=lambda cell: cell["paid_loss"])
    >>> combined = meyers_paid_2 + meyers_premium
    >>> combined.summarize()
    ...
    TriangleError: Don't know how to aggregate `paid_losses` values

The result is a ``TriangleError`` indicating that ``summarize`` does not know,
a priori, how to summarize ``paid_losses``. However, we can pass in a custom
function to tell Bermuda what to do. Currently, this functionality is reserved for
people comfortable with looking at the internals of
``bermuda.utils.summarize.SUMMARIZE_DEFAULTS``, since the custom function relies on
Bermuda's summarization logic:

.. code-block:: python

    >>> from bermuda.utils.summarize import _conforming_sum
    >>> combined.summarize(
    ...     summary_fns={"paid_losses": lambda v: _conforming_sum(v["paid_losses"])}
    ... )
    Cumulative Triangle
    Number of slices: 1
    Number of cells: 100
    Triangle category: Regular
    Experience range: 1988-01-01/1997-12-31
    Experience resolution: 12
    Evaluation range: 1988-12-31/2006-12-31
    Evaluation resolution: 12
    Dev Lag range: 0.0 - 108.0 months
    Fields:
        earned_premium
        paid_loss
        paid_losses
    Common Metadata:
        currency           USD
        country            US
        risk_basis         Accident
        reinsurance_basis  Net
        loss_definition    Loss+DCC

This procedure now returns a summarized triangle with both ``paid_loss`` and
``paid_losses``.
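An alternative to teaching ``summarize`` about the new field is to drop it before
summarizing. The sketch below uses only the ``select`` method introduced earlier;
it assumes ``select`` keeps the derived metadata on each triangle intact:

.. code-block:: python

    # Keep only the fields that summarize() already knows how to aggregate,
    # then combine and summarize as before.
    combined_known = meyers_paid_2.select("paid_loss") + meyers_premium
    summarized_known = combined_known.summarize()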
In some cases, we might have premium present in two triangles that should not be
summarized. For instance, imagine we had two triangles for paid and reported losses,
each with earned premium as a field. We can use ``summarize_premium=False`` in our
call to ``summarize`` to ensure that premium fields are not summed.

.. code-block:: python

    paid = meyers_tri.select(["paid_loss", "earned_premium"]).derive_metadata(
        details=dict(slice=1)
    )
    reported = meyers_tri.select(["reported_loss", "earned_premium"]).derive_metadata(
        details=dict(slice=2)
    )

    summarized = (paid + reported).summarize(summarize_premium=False)

How can we check that the triangle ``summarized`` has the same earned premium as
both the ``paid`` and ``reported`` triangles? Triangle cells are easy to iterate
over, so one option is to zip the two triangles together and compare their values
cell by cell:

.. code-block:: python

    assert all(
        cell1["earned_premium"] == cell2["earned_premium"]
        for cell1, cell2 in zip(summarized, paid)
    )

But Bermuda also provides an ``extract()`` method on triangles, which returns a
NumPy array and can make this easier:

.. code-block:: python

    assert (summarized.extract("earned_premium") == paid.extract("earned_premium")).all()
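Because ``extract`` returns a NumPy array, it also composes naturally with ordinary
array arithmetic. As a final sketch, relying only on the ``extract`` behaviour shown
above, we could compute a cell-level paid loss ratio on the summarized triangle:

.. code-block:: python

    import numpy as np

    # Element-wise paid loss ratio for every cell of the summarized triangle.
    paid_loss_ratio = summarized.extract("paid_loss") / summarized.extract("earned_premium")

    # Round for readability when printing.
    print(np.round(paid_loss_ratio, 3))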