Architecture

While almost all end-user interaction with triangle data can be accomplished with public methods available on Triangle objects, it helps to understand major elements in the internal structure of Triangle.

Cell

The basic building block class in Bermuda triangles is the Cell. There are three types of cells: Cell and its subclasses CumulativeCell and IncrementalCell. All cells consist of an experience period start date and end date, a development lag, one or more observed values, and a Metadata object. A loose representation of a single cell may look something like this:

  • Experience Start Date: 2017-07-01

  • Experience End Date: 2017-07-31

  • Evaluation Date: 2018-10-31 (development lag of 15 months)

  • Metadata:

    • Country: US

    • Currency: USD

    • Risk Basis: Accident

    • Reinsurance Basis: Gross

    • Per Occurrence Limit: $1M

    • Loss Definition: Loss+DCC

    • Details:

      • State: Texas

      • Coverage: Bodily Injury

  • Values:

    • Paid Loss: $1,234,567

    • Reported Loss: $2,345,678

    • Earned Premium: $3,456,789

If this data structure looks very similar to a single row of a tabular triangle, that’s no coincidence. The structure of cells intentionally mirrors tabular triangle rows. We refer to individual observed values within a cell as “fields”. In the example above, the fields are paid loss, reported loss, and earned premium. Cell.values is implemented as a Python dictionary, so there are essentially no restrictions on the fields that can be stored within a cell. In the example above, all of the fields are amounts of money, but we could just as easily have included reported claim counts, closed claim counts, or any number of other fields. Furthermore, since each cell’s values are independent, there is no requirement that all cells have the same set of fields. If paid loss is present in every cell, but earned premium is only present in some cells, that’s not a problem.
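As a rough illustration of the structure described above, the following sketch models a cell with a plain dataclass. SketchCell and months_between are hypothetical names invented for this example; they are not Bermuda's Cell class, which has its own constructor and internal representation. The month arithmetic assumes both dates fall on month ends.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class SketchCell:
    """Hypothetical stand-in for a cell; not Bermuda's actual Cell class."""
    period_start: date
    period_end: date
    evaluation_date: date
    values: dict = field(default_factory=dict)  # field name -> observed value


def months_between(period_end: date, evaluation_date: date) -> int:
    """Whole-month development lag, assuming both dates are month ends."""
    return (evaluation_date.year - period_end.year) * 12 + (
        evaluation_date.month - period_end.month
    )


full = SketchCell(
    date(2017, 7, 1), date(2017, 7, 31), date(2018, 10, 31),
    values={"paid_loss": 1_234_567, "reported_loss": 2_345_678,
            "earned_premium": 3_456_789},
)
# A second cell with a different set of fields is perfectly valid.
partial = SketchCell(
    date(2017, 8, 1), date(2017, 8, 31), date(2018, 10, 31),
    values={"paid_loss": 987_654, "reported_claims": 42},
)

print(months_between(full.period_end, full.evaluation_date))  # 15
print(sorted(set(full.values) | set(partial.values)))
```

Because values is an ordinary dictionary, the two cells can coexist even though only one of them carries earned premium.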

Every cell carries a set of metadata, including items such as the country the risk is in, the currency that loss and premium amounts are denominated in, whether the exposure period is on an accident basis or a policy basis, and so forth. The set of attributes is extensible via the details dictionary attribute. The example above shows state and coverage, but this is just for illustrative purposes; state and coverage are not required members of the details field, and any other arbitrary attributes can be added to the details field if relevant.

We track all of this data at the cell level because it can be critical for appropriate modeling when mixing data from several different sources. For example, it would obviously be inappropriate to fit a combined loss development model with volume-weighting to data from two different portfolios, one of which is measured in dollars and the other in yen, without converting currencies first! Similarly, mixing accident-basis and policy-basis experience periods is usually a recipe for disaster, unless special mitigation measures are taken. With that being said, the metadata on each observation is not burdensome for end-users. If an end-user doesn’t care about one or more metadata fields in a given analytical context, they can simply omit them and Bermuda will gracefully supply sensible defaults.
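To make the idea of defaulted, extensible metadata concrete, here is a minimal sketch. SketchMetadata, its attribute names, its default values, and safe_to_combine are all assumptions made for this example; Bermuda's Metadata class may use different names and defaults.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class SketchMetadata:
    """Hypothetical metadata record; field names and defaults are assumptions."""
    country: Optional[str] = None
    currency: Optional[str] = None
    risk_basis: str = "Accident"
    reinsurance_basis: str = "Gross"
    details: dict = field(default_factory=dict)  # free-form extra attributes


def safe_to_combine(metadata_items) -> bool:
    """True when every item shares one currency and one risk basis."""
    currencies = {m.currency for m in metadata_items}
    bases = {m.risk_basis for m in metadata_items}
    return len(currencies) == 1 and len(bases) == 1


us = SketchMetadata(country="US", currency="USD",
                    details={"state": "Texas", "coverage": "Bodily Injury"})
jp = SketchMetadata(country="JP", currency="JPY")
minimal = SketchMetadata()  # every attribute omitted, defaults fill in

print(safe_to_combine([us, us]))  # True
print(safe_to_combine([us, jp]))  # False
```

A check of this kind is the sort of guard that cell-level metadata makes possible before fitting a single model to data from several sources.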

Triangles

Collections of Cells are aggregated into a Triangle. From the end-user’s perspective, a Triangle is an undifferentiated agglomeration of Cells. Under the hood, the Triangle class indexes and groups cells by common metadata attributes; we refer to these internal groups of Cells as “slices”. A slice consists of a list of Cells, all of which pertain to the same logical group of exposures. For example, a slice may contain cells for the accident-month triangle for Company X, or for the policy-quarter triangle for private passenger bodily injury claims in the state of Missouri for Product Y written by Company Z. Slice grouping is automatically determined based on the metadata associated with each cell.
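The grouping of cells into slices can be pictured as a dictionary keyed by metadata. The sketch below uses plain dictionaries and tuples in place of Cell and Metadata objects; group_into_slices and the key layout are hypothetical and only illustrate the idea, not Bermuda's internal indexing.

```python
from collections import defaultdict

# Each cell is a plain dict here; the keys are assumptions for the sketch.
cells = [
    {"metadata": ("Company X", "Accident"), "period": "2020-01", "dev_lag": 0},
    {"metadata": ("Company X", "Accident"), "period": "2020-01", "dev_lag": 1},
    {"metadata": ("Company Z", "Policy"), "period": "2020-Q1", "dev_lag": 0},
]


def group_into_slices(cells):
    """Group cells by their (hashable) metadata key."""
    slices = defaultdict(list)
    for cell in cells:
        slices[cell["metadata"]].append(cell)
    return dict(slices)


slices = group_into_slices(cells)
for metadata, members in slices.items():
    print(metadata, len(members))
```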

Operations on Triangles

We stated earlier that one of the design goals of bermuda.Triangle is ergonomics. To that end, triangles include a rich set of operations out of the box. We summarize some of the most common operations below. In general, when we have a choice between implementing a behavior as a function or as a method, we prefer the method in almost all cases. There are a few reasons for this. First, all methods on triangles are non-mutating/non-destructive, so there’s no semantic distinction between functions and methods. Second, it tends to be easier and more natural to express a sequence of operations on a triangle as a sequence of chained method calls than as a nested sequence of function calls. Finally, from a rhetorical perspective, we think of Triangle objects as having a convenient and tidy namespace for holding operations on triangular data, so we don’t have to import functions from another namespace or qualify the function names.

Operators

  • Equality: The == operator on triangles returns True if the contents of both operands are identical (not if the two operands are references to the same object, as is the default behavior for Python objects).

  • Concatenation: The + operator on two triangles returns a single triangle with the concatenated contents of the two operands.
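The two operators above correspond to Python's __eq__ and __add__ special methods. The following sketch shows one way such semantics could be written for a toy container; SketchTriangle is hypothetical and stores cells as simple tuples, so it should not be read as Bermuda's implementation.

```python
class SketchTriangle:
    """Hypothetical container used only to illustrate the two operators."""

    def __init__(self, cells):
        self.cells = tuple(cells)  # immutable, so operations cannot mutate it

    def __eq__(self, other):
        # Compare contents, not object identity.
        if not isinstance(other, SketchTriangle):
            return NotImplemented
        return self.cells == other.cells

    def __add__(self, other):
        # Return a new object; neither operand is modified.
        if not isinstance(other, SketchTriangle):
            return NotImplemented
        return SketchTriangle(self.cells + other.cells)


a = SketchTriangle([("2020-01", 0, 100.0)])
b = SketchTriangle([("2020-01", 0, 100.0)])
c = SketchTriangle([("2020-02", 0, 80.0)])

print(a == b, a is b)       # True False
print(len((a + c).cells))   # 2
print(len(a.cells))         # 1, the left operand is unchanged
```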

Properties

A triangle object, referred to below as triangle, has the following basic properties:

  • triangle.slices returns a dictionary of slices contained in the triangle.

  • triangle.cells returns a list of all cells in the triangle.

  • triangle.periods is the sorted list of all distinct experience periods in the triangle.

  • triangle.dev_lags() returns the sorted list of all distinct development lags in the triangle. dev_lags accepts unit as a keyword argument that can be month, day or timedelta.

  • triangle.evaluation_dates is the sorted list of all distinct evaluation dates in the triangle.

  • triangle.evaluation_date is the latest evaluation date in the triangle.

  • triangle.fields is the sorted list of all distinct fields in cells in the triangle.

  • triangle.metadata is the sorted list of all distinct metadata in the triangle.

  • triangle.common_metadata returns a single metadata element common to all cells in the triangle.

  • triangle.metadata_differences returns a list of the unique metadata in the triangle that are not in triangle.common_metadata.
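Several of the properties above are "sorted distinct" views over the cells, while common_metadata and metadata_differences split metadata into shared and varying parts. The sketch below shows one plausible reading using plain dictionaries. The function bodies and the dictionary layout are assumptions, and Bermuda's exact semantics for these properties may differ.

```python
from datetime import date

texas = {"currency": "USD", "country": "US", "state": "Texas"}
missouri = {"currency": "USD", "country": "US", "state": "Missouri"}

cells = [
    {"period": (date(2020, 1, 1), date(2020, 3, 31)), "dev_lag": 3, "metadata": texas},
    {"period": (date(2020, 1, 1), date(2020, 3, 31)), "dev_lag": 0, "metadata": texas},
    {"period": (date(2020, 4, 1), date(2020, 6, 30)), "dev_lag": 0, "metadata": missouri},
]

# Sorted lists of distinct values, analogous to periods and dev_lags.
periods = sorted({c["period"] for c in cells})
dev_lags = sorted({c["dev_lag"] for c in cells})


def common_metadata(items):
    """Attributes whose value is identical across every metadata item."""
    first, *rest = items
    return {k: v for k, v in first.items()
            if all(k in m and m[k] == v for m in rest)}


def metadata_differences(items):
    """Unique leftovers once the common attributes are removed."""
    common = common_metadata(items)
    differences = []
    for m in items:
        leftover = {k: v for k, v in m.items() if k not in common}
        if leftover not in differences:
            differences.append(leftover)
    return differences


all_metadata = [c["metadata"] for c in cells]
print(dev_lags)
print(common_metadata(all_metadata))
print(metadata_differences(all_metadata))
```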

Triangles also implement several higher-order properties. For explanation of Bermuda-specific triangle terminology, see the discussion on triangle philosophy and terminology.

  • triangle.is_empty returns True if there are no cells in the triangle, and False otherwise.

  • triangle.is_disjoint returns True if all experience periods in the triangle are disjoint, and False if the triangle is erratic.

  • triangle.is_semi_regular tests whether the triangle is semi-regular.

  • triangle.is_regular tests whether the triangle is regular.

  • triangle.has_consistent_currency and triangle.has_consistent_risk_basis test whether every cell in the triangle has the same currency or risk basis, respectively. Inconsistency in either of these two pieces of metadata is among the most common reasons a modeling approach turns out to be invalid.

  • triangle.is_incremental returns True if the triangle is incremental, otherwise False.
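As an illustration of what a disjointness test involves, the sketch below checks whether any two distinct experience periods overlap. The function periods_are_disjoint is hypothetical, it treats period end dates as inclusive, and Bermuda's own definition of is_disjoint may handle edge cases differently.

```python
from datetime import date


def periods_are_disjoint(periods):
    """True if no two distinct (start, end) periods overlap; ends are inclusive."""
    ordered = sorted(set(periods))  # sort by start date, then end date
    # After sorting, any overlap must show up between neighbouring periods.
    return all(
        prev_end < next_start
        for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:])
    )


quarterly = [
    (date(2020, 1, 1), date(2020, 3, 31)),
    (date(2020, 4, 1), date(2020, 6, 30)),
]
# Adding an annual period that spans both quarters introduces overlap.
mixed = quarterly + [(date(2020, 1, 1), date(2020, 12, 31))]

print(periods_are_disjoint(quarterly))  # True
print(periods_are_disjoint(mixed))      # False
```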

Basic Mutators

Triangles have the following methods that return modified triangles:

  • triangle.select() accepts a list of field names. For each cell in the triangle, any fields that are not in the supplied list of names are removed from the cell’s set of values. Any cell that has none of the listed fields is removed entirely.

  • triangle.clip() filters a triangle based on cutoff dates. For example, triangle.clip(max_eval=datetime.date(2018, 12, 31)) removes all cells with an evaluation date after December 31st, 2018. clip accepts the keyword arguments min_eval, max_eval, min_period, max_period, min_dev, max_dev, and dev_lag_unit. Multiple arguments can be supplied – if so, only those cells that satisfy all supplied conditions are returned.

  • triangle.right_edge returns the rightmost edge of the triangle – i.e., for each distinct experience period within each slice, the cell with the latest evaluation date is retained and all other cells are dropped.
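The behavior of these three operations, and the way non-mutating methods chain together, can be sketched with a toy class. SketchTriangle, the dictionary-based cells, and the "slice" key are hypothetical; only the method names and the max_eval keyword are taken from the descriptions above, and the real signatures may differ.

```python
from datetime import date


class SketchTriangle:
    """Hypothetical triangle: an immutable tuple of dict-based cells."""

    def __init__(self, cells):
        self.cells = tuple(cells)

    def select(self, names):
        kept = []
        for cell in self.cells:
            values = {k: v for k, v in cell["values"].items() if k in names}
            if values:  # drop cells left with no values at all
                kept.append({**cell, "values": values})
        return SketchTriangle(kept)

    def clip(self, max_eval=None):
        return SketchTriangle(
            c for c in self.cells
            if max_eval is None or c["evaluation_date"] <= max_eval
        )

    @property
    def right_edge(self):
        latest = {}  # (slice, period) -> cell with the latest evaluation date
        for cell in self.cells:
            key = (cell["slice"], cell["period"])
            if (key not in latest
                    or cell["evaluation_date"] > latest[key]["evaluation_date"]):
                latest[key] = cell
        return SketchTriangle(latest.values())


tri = SketchTriangle([
    {"slice": "A", "period": 2017, "evaluation_date": date(2017, 12, 31),
     "values": {"paid_loss": 100.0, "earned_premium": 500.0}},
    {"slice": "A", "period": 2017, "evaluation_date": date(2018, 12, 31),
     "values": {"paid_loss": 180.0, "earned_premium": 500.0}},
    {"slice": "A", "period": 2017, "evaluation_date": date(2019, 12, 31),
     "values": {"paid_loss": 200.0, "earned_premium": 500.0}},
    {"slice": "A", "period": 2018, "evaluation_date": date(2018, 12, 31),
     "values": {"earned_premium": 550.0}},
])

result = tri.select(["paid_loss"]).clip(max_eval=date(2018, 12, 31)).right_edge
print([(c["period"], c["values"]) for c in result.cells])
print(len(tri.cells))  # 4, the original triangle is untouched
```

Each call returns a new object, which is what allows the three steps to be written as a single chained expression.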

Representations

  • triangle.to_data_frame() returns a pandas.DataFrame representation of a triangle, for ease of graphing, exporting, and ad hoc manipulation. The I/O functions triangle.to_long_data_frame and triangle.to_wide_data_frame are used for transforming triangles to and from long and wide CSVs, respectively. Similarly, there are triangle.to_json() and triangle.to_binary() output functions.

  • triangle._repr_html_() provides a friendlier rich-HTML representation of a triangle for use in Jupyter notebooks.

Intermediate Mutators

Some triangle mutators require direct manipulation of individual cells. Cells are fairly straightforward to work with, so this does not pose too much of an obstacle. For an individual cell (call it cell), the experience period start, experience period end, and development lag can be accessed via cell.period_start, cell.period_end, and cell.dev_lag. The internal cell representation of these values may be unintuitive, so be warned. We can access individual fields within cells as (for example) cell.values["paid_loss"], or just cell["paid_loss"] for short.
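The bracket shorthand for field access is the kind of behavior Python exposes through __getitem__. A minimal sketch of the pattern, using a hypothetical SketchCell rather than Bermuda's Cell, might look like this:

```python
class SketchCell:
    """Hypothetical cell showing how item access can delegate to values."""

    def __init__(self, values):
        self.values = dict(values)

    def __getitem__(self, field_name):
        # cell["paid_loss"] is shorthand for cell.values["paid_loss"].
        return self.values[field_name]


cell = SketchCell({"paid_loss": 1_234_567, "earned_premium": 3_456_789})
print(cell.values["paid_loss"] == cell["paid_loss"])  # True
```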

  • triangle.filter() allows for filtering of triangles based on arbitrary cell-level predicates. For example, triangle.filter(lambda cel: cel["paid_loss"] > 0) removes all cells with zero (or negative) paid loss. The predicate function passed to filter must take a single argument (a single cell), and the predicate is then applied to every cell in the triangle, one by one. This means filter cannot be used to express conditions that depend on multiple cells.

  • triangle.derive_fields() allows for adding new fields to cells that are transformations of existing fields. For example, triangle.derive_fields(paid_LR=lambda ob: ob["paid_loss"] / ob["earned_premium"]) would add a new field paid_LR to every cell, containing the paid loss ratio according to the definition provided. derive_fields can also be used to overwrite existing fields.

  • triangle.aggregate() allows for aggregation of a triangle’s experience period or evaluation date resolution, such as turning quarterly triangles into annual triangles.

  • triangle.summarize() allows for turning multi-slice triangles into a smaller number of triangles that share common metadata. Bermuda will automatically work out the greatest-common-denominator of Metadata objects in the triangle, and will try to combine fields using default aggregation functions for commonly-used fields (e.g. paid_loss, reported_loss, earned_premium etc.). Alternatively, users can pass in their own set of summarization functions.

  • triangle.blend() allows for the blending of multiple triangles with the same cell fields using either a linear weighted average, or a ‘mixture blend’ that samples randomly from the different triangle fields according to weights passed in by the user. This is particularly useful if your triangle holds samples from upstream stochastic modeling.

  • triangle.split() splits triangles by metadata attributes. For instance, if your triangle holds multiple lines of business triangles, you can split by the line of business metadata identifying attribute to obtain a dictionary of separate triangles.

  • triangle.merge() offers triangle cell value joining functionality, where the join_type argument can be used to specify full, inner, left, right, left-anti or right-anti joining operations.

  • triangle.coalesce() is similar to merge, but can take more than two triangles as input, where earlier triangles’ cell fields take precedence over later triangles’ cell fields. This is similar to an iterated left-join on multiple triangles.

  • triangle.to_incremental() turns a cumulative triangle into an incremental triangle, or is a no-op if the triangle is already incremental. triangle.to_cumulative() provides the opposite functionality.

  • triangle.add_statics() adds static field values from one triangle to the current triangle. Similar functionality might be achieved with a left-join merge operation or even derive_fields, but add_statics offers greater control over merging single cell fields into the base triangle.

  • triangle.make_right_triangle() creates a lower-diagonal of the existing triangle with empty cell field values.

  • triangle.make_right_diagonal() creates a new triangle diagonal for user-specified evaluation dates.
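To give a feel for three of the mutators above (filter, derive_fields, and to_incremental), here is a sketch written as free functions over dictionary-based cells. The function names mirror the method names for readability, but the bodies, the flat cell layout, and the single-field to_incremental signature are assumptions for illustration only.

```python
def filter_cells(cells, predicate):
    """Keep the cells for which predicate(cell) is true, one cell at a time."""
    return [c for c in cells if predicate(c)]


def derive_fields(cells, **definitions):
    """Return new cells with extra fields computed from each cell's own values."""
    out = []
    for c in cells:
        new = dict(c)  # copy, so the input cells are left untouched
        for name, func in definitions.items():
            new[name] = func(c)
        out.append(new)
    return out


def to_incremental(cells, field):
    """Difference a cumulative field along dev_lag within each period."""
    out, previous = [], {}
    for c in sorted(cells, key=lambda c: (c["period"], c["dev_lag"])):
        prior = previous.get(c["period"], 0.0)
        out.append({**c, field: c[field] - prior})
        previous[c["period"]] = c[field]
    return out


cumulative = [
    {"period": 2020, "dev_lag": 0, "paid_loss": 0.0, "earned_premium": 100.0},
    {"period": 2020, "dev_lag": 12, "paid_loss": 40.0, "earned_premium": 100.0},
    {"period": 2020, "dev_lag": 24, "paid_loss": 70.0, "earned_premium": 100.0},
]

positive = filter_cells(cumulative, lambda cell: cell["paid_loss"] > 0)
with_lr = derive_fields(
    positive, paid_LR=lambda cell: cell["paid_loss"] / cell["earned_premium"]
)
incremental = to_incremental(cumulative, "paid_loss")

print([c["dev_lag"] for c in positive])       # [12, 24]
print([c["paid_LR"] for c in with_lr])        # [0.4, 0.7]
print([c["paid_loss"] for c in incremental])  # [0.0, 40.0, 30.0]
```

Note that the predicate and the derived-field functions each see exactly one cell, whereas differencing needs the previous cell in the same period, which is why it cannot be expressed as a per-cell operation.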

Plots

The Triangle class currently has a couple of useful visualizations using Plotly, but better visualization functionality will be added in the future.

triangle.plot_data_completeness() shows the triangular data structure as a scatter plot in (experience_period, development_lag) coordinate space. Each point represents a cell, colored according to the proportion of cell field values that are present in the cell. If all cells have the same number of cell fields, they will all be the same color.
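The quantity behind the coloring can be sketched without any plotting library. In the example below, completeness is a hypothetical helper that measures the share of the triangle's known fields present in each cell; how Bermuda actually counts a field as present is an assumption here.

```python
cells = [
    {"period": 2020, "dev_lag": 0,
     "values": {"paid_loss": 10.0, "earned_premium": 100.0}},
    {"period": 2020, "dev_lag": 12,
     "values": {"paid_loss": 40.0}},
]

# Every field name that appears anywhere in the triangle.
all_fields = sorted({name for c in cells for name in c["values"]})


def completeness(cell, fields):
    """Fraction of the given fields that have a value in this cell."""
    present = sum(1 for name in fields if cell["values"].get(name) is not None)
    return present / len(fields)


# One (period, dev_lag, completeness) point per cell, ready to be plotted.
points = [(c["period"], c["dev_lag"], completeness(c, all_fields)) for c in cells]
print(points)  # [(2020, 0, 1.0), (2020, 12, 0.5)]
```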

triangle.plot_right_edge() plots the most recent diagonal (the ‘right edge’) of the triangle. By default it shows paid and/or reported loss ratios (computed using earned premium), but users can pass their own functions of cell values.