Triangle Input/Output ===================== To get started with ``Triangle`` objects, consider importing the ``meyers_tri`` object from Bermuda. .. code-block:: python from bermuda import meyers_tri which returns the following triangle: .. code-block:: python >>> meyers_tri Cumulative Triangle Number of slices: 1 Number of cells: 100 Triangle category: Regular Experience range: 1988-01-01/1997-12-31 Experience resolution: 12 Evaluation range: 1988-12-31/2006-12-31 Evaluation resolution: 12 Dev Lag range: 0.0 - 108.0 months Fields: earned_premium paid_loss reported_loss Common Metadata: currency USD country US risk_basis Accident reinsurance_basis Net loss_definition Loss+DCC Bermuda ``Triangle`` objects can be exported in a variety of formats for use in other applications. The recommended format for saving triangles to disk is to use the ``Triangle.to_binary`` method. This method saves the triangle in a binary format that is optimized for reading into and writing from Python. The ``to_binary`` method requires a file path as an argument, and forces the use of a ``.trib`` extension. We also offer a compressed binary file format, which can be saved using the ``.tribc`` extension. While this format is more memory efficient, it is much slower to read and write, so it is generally not recommended. .. code-block:: python meyers_tri.to_binary('meyers_tri.trib') Once saved, binary files can be read back into Python using the ``Triangle.from_binary`` method .. code-block:: python from bermuda import Triangle meyers_tri = Triangle.from_binary('meyers_tri.trib') Tabular Formats ------------------------------------------------- Triangles can also be saved in a variety of commonly used tabular formats. For convenience, we've labeled these formats as ``wide``, ``long``, and ``array`` formats. The ``wide`` format is a table where each row represents a single cell in the triangle. It has fixed columns in the order ``['period_start', 'period_end', 'evaluation_date', *fields, *metadata]``. It's considered ``wide`` because all of the fields are split out as separate columns. Take a look at the Meyers triangle as a wide pandas ``DataFrame``: .. code-block:: pycon >>> meyers_wide_df = meyers_tri.to_wide_data_frame() >>> meyers_wide_df.axes [RangeIndex(start=0, stop=100, step=1), Index(['period_start', 'period_end', 'evaluation_date', 'earned_premium', 'paid_loss', 'reported_loss', 'reinsurance_basis', 'risk_basis', 'country', 'currency', 'loss_definition'], dtype='object')] This is in contrast to the ``long`` format, which is a table with a single row for each value in each cell of the triangle. Note the following triangle is longer (300 rows instead of 100), and the columns fit the pattern ``['period_start', 'period_end', 'evaluation_date', *metadata, 'field', 'value']``: .. code-block:: pycon >>> meyers_long_df = meyers_tri.to_long_data_frame() >>> meyers_long_df.axes [RangeIndex(start=0, stop=300, step=1), Index(['period_start', 'period_end', 'evaluation_date', 'reinsurance_basis', 'risk_basis', 'country', 'currency', 'loss_definition', 'field', 'value'], dtype='object')] Both of these formats can be saved to disk using the ``to_wide_csv`` and ``to_long_csv`` methods, and read back into memory using ``from_wide_csv`` and ``from_long_csv``. .. code-block:: python meyers_tri.to_wide_csv('meyers_tri_wide.csv') meyers_tri.to_long_csv('meyers_tri_long.csv') meyers_tri = Triangle.from_wide_csv('meyers_tri_wide.csv', detail_cols=[]) meyers_tri = Triangle.from_long_csv('meyers_tri_long.csv') Note that the wide format requires the user to specify either ``detail_cols`` or ``field_cols``. This tells Bermuda which columns in the wide format are cell fields (i.e. ``paid_loss``, ``earned_premium`` etc.) and which are metadata (i.e. ``coverage`` Finally, we allow export into what we call an ``array`` format -- essentially a triangle-shaped data frame (we avoid the term 'triangle' in order to avoid confusion with the ``Triangle`` class). This is the format that actuaries would typically be most familiar with, where each row represents a single period, and each column represents a certain development lag from the end of that period. Note that our convention throughout the Bermuda library is to index development lags from the *end* of the period rather than the beginning. Therefore, the first column of the array format is typically a 0 lag observation that takes place at the end of the period. The development lags are denoted in months, and periods are saved as date objects denoting the period start. The array data frame can only operate on a single-sliced triangle and will only return values for a single field. Any missing evaluation dates for a period will show up as NaN. .. code-block:: pycon >>> from datetime import date >>> clipped_meyers = meyers_tri.clip(max_eval = date(1990, 12, 31)) >>> paid_array = clipped_meyers.to_array_data_frame('paid_loss') >>> paid_array period 0 12 24 0 1988-01-01 952000 1529000.0 2813000.0 1 1989-01-01 849000 1564000.0 NaN 2 1990-01-01 983000 NaN NaN >>> reported_array = clipped_meyers.to_array_data_frame('reported_loss') >>> reported_array period 0 12 24 0 1988-01-01 1722000 3830000.0 3603000.0 1 1989-01-01 1581000 2192000.0 NaN 2 1990-01-01 1834000 NaN NaN Some fields are often static with respect to evaluation date, like ``earned_premium`` or ``earned_exposure``. Rather than display these fields in a triangle it often makes sense to output them for each period at the latest evaluation date. This can be done using the ``to_right_edge_data_frame`` method, which will provide the values of all fields at the right edge of the triangle. .. code-block:: pycon >>> right_edge_array = clipped_meyers.to_right_edge_data_frame() >>> right_edge_array period evaluation_date paid_loss reported_loss earned_premium 0 1988-01-01 1990-12-31 2813000 3603000 5812000 1 1989-01-01 1990-12-31 1564000 2192000 4908000 2 1990-01-01 1990-12-31 983000 1834000 5454000 We're frequently presented with data provided in these array formats that we'd like to load into Bermuda ``Triangle`` objects. This can be accomplished using the ``Triange.from_array_data_frame`` method. This method requires a single field argument, but also allows for other metadata to be provided along with the tabular data. .. code-block:: python from bermuda import Metadata reported_tri = Triangle.from_array_data_frame( reported_array, 'reported_loss', metadata = Metadata(loss_definition="Loss+DCC") ) Often we'll have multiple tabular triangles representing different fields, in which case we can use the ``bermuda.io.array_triangle_builder`` helper function to build up a single multi-field triangle. .. code-block:: python from bermuda.io import array_triangle_builder loss_triangle = array_triangle_builder( dfs = [reported_array, paid_array], fields = ['reported_loss', 'paid_loss'], metadata = Metadata(loss_definition="Loss+DCC") ) This triangle now has both paid and reported losses, but it's missing earned premium. Let's read that in from a tabular format and add it to the triangle. The ``Triangle.from_statics_data_frame`` function assumes the first column represents the period of the associated data. Note that the ``static_data_tri`` must have metadata matching the existing loss triangle or the static values will not be attached correctly. .. code-block:: python >>> static_df = right_edge_array[['period', 'earned_premium']] >>> static_data_tri = Triangle.from_statics_data_frame( ... static_df, ... metadata = Metadata(loss_definition="Loss+DCC") ... ) >>> full_triangle = loss_triangle.add_statics(static_data_tri) >>> full_triangle Cumulative Triangle Number of slices: 1 Number of cells: 6 Triangle category: Regular Experience range: 1988-01-01/1990-12-31 Experience resolution: 12 Evaluation range: 1988-12-31/1990-12-31 Evaluation resolution: 12 Dev Lag range: 0.0 - 24.0 months Fields: earned_premium paid_loss reported_loss Common Metadata: risk_basis Accident loss_definition Loss+DCC Other Formats ------------------------------------------------- Bermuda also supports reading and writing triangles in a JSON format. This format is particularly useful for interacting with our upcoming modeling API - more to come on that soon! .. code-block:: python meyers_tri.to_json('meyers_tri.json') meyers_tri = Triangle.from_json('meyers_tri.json') Finally, we support reading triangles from and exporting to the ``chainladder`` package. .. code-block:: python import chainladder as cl from bermuda import chain_ladder_to_triangle chain_ladder_tri = cl.load_sample("clrd") bermuda_tri = chain_ladder_to_triangle(chain_ladder_tri) chain_ladder_tri = bermuda_tri.to_chain_ladder()