Writing Functions for Regression

Writing the functions needed to create a custom regression metric.



As we have seen on the Custom Metrics Introductory page, the key components of a custom regression metric are the specific Python functions we need to provide for the custom metric to work. Here we will see how to create them.

We will assume the user has access to a Jupyter Notebook running Python with the NannyML open-source library installed.

Sample Dataset

We have created a sample dataset to facilitate developing the code needed for custom regression metrics. The dataset is publicly accessible here. It is a pure covariate shift dataset that consists of:

  • 7 numerical features: ['feature1', 'feature2', 'feature3', 'feature4', 'feature5', 'feature6', 'feature7']

  • Target column: y_true

  • Model prediction column: y_pred

  • A timestamp column: timestamp

  • An identifier column: identifier

We can inspect the dataset with the following code in a Jupyter cell:

import pandas as pd
import nannyml as nml

reference = pd.read_parquet("https://github.com/NannyML/sample_datasets/raw/main/synthetic_pure_covariate_shift_datasets/regression/synthetic_custom_metrics_regression_reference.pq")
monitored = pd.read_parquet("https://github.com/NannyML/sample_datasets/raw/main/synthetic_pure_covariate_shift_datasets/regression/synthetic_custom_metrics_regression_monitored.pq")
reference.head(5)
+----+------------+------------+------------+------------+------------+------------+------------+----------+----------+----------------------------+
|    | feature1   | feature2   | feature3   | feature4   | feature5   | feature6   | feature7   | y_true   | y_pred   | timestamp                  |
+====+============+============+============+============+============+============+============+==========+==========+============================+
| 0  | 0.899145   | -2.64707   | 2.80074    | 2.02636    | -2.53157   | -2.12171   | -0.360711  | 7.97047  | 3.15523  | 2020-03-11 00:00:00        |
+----+------------+------------+------------+------------+------------+------------+------------+----------+----------+----------------------------+
| 1  | -1.09015   | -0.442365  | 1.94911    | -0.378131  | 0.517571   | 2.67345    | -1.71358   | -13.5075 | -13.1339 | 2020-03-11 00:01:40.800000 |
+----+------------+------------+------------+------------+------------+------------+------------+----------+----------+----------------------------+
| 2  | 0.619741   | 0.924163   | 1.30714    | 2.60199    | -0.776712  | 1.41447    | -0.848892  | -6.39705 | -5.324   | 2020-03-11 00:03:21.600000 |
+----+------------+------------+------------+------------+------------+------------+------------+----------+----------+----------------------------+
| 3  | -1.7384    | -0.54207   | 1.58942    | 4.12909    | -1.78157   | -0.275194  | -1.82792   | -11.8357 | -3.20461 | 2020-03-11 00:05:02.400000 |
+----+------------+------------+------------+------------+------------+------------+------------+----------+----------+----------------------------+
| 4  | -4.78688   | 0.330358   | -2.56052   | -2.32385   | 1.19089    | -2.58183   | -1.68192   | 4.17999  | 5.68762  | 2020-03-11 00:06:43.200000 |
+----+------------+------------+------------+------------+------------+------------+------------+----------+----------+----------------------------+

Developing custom regression metric functions

NannyML Cloud requires two functions for the custom metric to be used. The first is the loss function, used to calculate the instance-level loss. The second is the aggregate function, used to aggregate over the instance-level results. We are using this decomposition to be able to use the Direct Loss Estimation (DLE) algorithm for performance estimation: for realized performance the loss is calculated, while for estimated performance the loss is estimated.

Let's use the pinball metric as our custom regression metric. We start with creating the loss function.

Custom Functions API

The API of these functions is set by NannyML Cloud and is shown as a template on the New Custom Regression Metric screen.

import numpy as np
import pandas as pd

def loss(
    y_true: pd.Series,
    y_pred: pd.Series,
    chunk_data: pd.DataFrame,
    **kwargs
) -> np.ndarray:
    pass


def aggregate(
    loss: np.ndarray,
    chunk_data: pd.DataFrame,
    **kwargs
) -> float:
    pass

Let's describe the data that are available to us for creating the loss and aggregate functions:

  • y_true: A pandas.Series python object containing the target column.

  • y_pred: A pandas.Series python object containing the model predictions column.

  • chunk_data: A pandas.DataFrame python object containing all columns associated with the model. This allows using other columns of the provided data in the calculation of the custom metric.
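
The chunk_data argument is what lets a custom metric use columns beyond the target and prediction. As a purely illustrative sketch, assuming a hypothetical sample_weight column (not present in the sample dataset), an aggregate function could weight each observation's loss:

import numpy as np
import pandas as pd

def aggregate(
    loss: np.ndarray,
    chunk_data: pd.DataFrame,
    **kwargs
) -> float:
    # Hypothetical example: weight each observation's loss by a
    # 'sample_weight' column taken from the model's data.
    weights = chunk_data["sample_weight"].to_numpy()
    return float(np.average(loss, weights=weights))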

Custom Pinball metric

We will create a custom metric from the pinball loss. Let's use an alpha value of 0.9.
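
For reference, the standard definition of the pinball (quantile) loss of a single observation at quantile level α is:

$$
L_\alpha(y, \hat{y}) = \alpha \, \max(y - \hat{y},\ 0) + (1 - \alpha) \, \max(\hat{y} - y,\ 0)
$$

The loss function would be: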

import numpy as np
import pandas as pd

def loss(
    y_true: pd.Series,
    y_pred: pd.Series,
    chunk_data: pd.DataFrame,
    **kwargs
) -> np.ndarray:
    y_true = y_true.to_numpy()
    y_pred = y_pred.to_numpy()

    alpha = 0.9
    factor1 = alpha * np.maximum(y_true - y_pred, 0)
    factor2 = (1 - alpha) * np.maximum(y_pred - y_true, 0)
    return factor1 + factor2

The aggregate function is simpler:

import numpy as np
import pandas as pd

def aggregate(
    loss: np.ndarray,
    chunk_data: pd.DataFrame,
    **kwargs
) -> float:
    return loss.mean()
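
The split into loss and aggregate matters for metrics whose aggregation is not a simple mean. As a purely illustrative sketch (not part of the pinball metric, and named aggregate_rmse here only to avoid overriding the function above), an RMSE-style custom metric would return squared errors from its loss function and take the square root of their mean during aggregation:

import numpy as np
import pandas as pd

def aggregate_rmse(
    loss: np.ndarray,
    chunk_data: pd.DataFrame,
    **kwargs
) -> float:
    # Assumes the matching loss function returned squared errors;
    # the aggregation is then the square root of their mean.
    return float(np.sqrt(loss.mean()))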

We can test these functions on the dataset loaded earlier. Assuming we have run the function definitions above in a Jupyter cell, we can call them as follows:

_loss = loss(
    y_true=reference['y_true'],
    y_pred=reference['y_pred'],
    chunk_data=reference,
)
aggregate(
    loss=_loss,
    chunk_data=reference,
)
0.6298987071792433

We can double-check the result with scikit-learn:

from sklearn.metrics import mean_pinball_loss
mean_pinball_loss(y_true=reference.y_true, y_pred=reference.y_pred, alpha=0.9)
0.6298987071792433

The results match as expected, meaning we have correctly specified our custom metric.
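
Before moving to the cloud product, we can also get a rough feel for how the metric behaves over time by applying the two functions per chunk ourselves. The sketch below is only a local approximation of NannyML's chunking, using monthly chunks and assuming the timestamp column is parsed as a datetime:

# Approximate per-chunk evaluation: group the reference data into monthly
# chunks by timestamp and apply loss + aggregate to each chunk.
chunks = reference.groupby(reference["timestamp"].dt.to_period("M"))
per_chunk_pinball = chunks.apply(
    lambda chunk: aggregate(
        loss=loss(chunk["y_true"], chunk["y_pred"], chunk_data=chunk),
        chunk_data=chunk,
    )
)
print(per_chunk_pinball)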

Testing a Custom Metric in the Cloud product

We saw how to add a regression custom metric in the Custom Metric page. We can further test it by using the dataset in the cloud product. The datasets are publicly available, hence we can use the Public Link option when adding data to a new model. The process of creating a new model is described in the Monitoring a tabular data model tutorial.

Reference Dataset Public Link:

https://github.com/NannyML/sample_datasets/raw/main/synthetic_pure_covariate_shift_datasets/regression/synthetic_custom_metrics_regression_reference.pq

Monitored Dataset Public Link:

https://github.com/NannyML/sample_datasets/raw/main/synthetic_pure_covariate_shift_datasets/regression/synthetic_custom_metrics_regression_monitored.pq

Note that when we are on the Metrics page of the Add new model Wizard, we can go to Performance monitoring and directly add a custom metric we have already specified.

After the model has been added to NannyML Cloud and the first run has been completed we can inspect the monitoring results. Of particular interest to us is the comparison between estimated and realized performance for our custom metric.

We see that NannyML can accurately estimate our custom metric across the whole dataset, even in the areas where there is a performance difference. This means that our loss function is compatible with DLE, and we can reliably use both performance calculation and performance estimation for our custom metric.

You may have noticed that for custom metrics we don't have a sampling error implementation. Therefore you will have to make a qualitative judgement, based on the results, whether the estimated and realized performance results are a good enough match or not.

Next Steps

You are now ready to use your new custom metric in production. However, you may want to make your implementation more robust to account for the data you will encounter in production. For example, you can add missing value handling to your implementation.
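
A minimal sketch of what that could look like, assuming rows with a missing target or prediction should simply be ignored (the Handling Missing Values page covers this topic in more detail):

import numpy as np
import pandas as pd

def loss(
    y_true: pd.Series,
    y_pred: pd.Series,
    chunk_data: pd.DataFrame,
    **kwargs
) -> np.ndarray:
    y_true = y_true.to_numpy(dtype=float)
    y_pred = y_pred.to_numpy(dtype=float)

    # Assumption: drop observations where either value is missing
    # before computing the pinball loss.
    mask = ~(np.isnan(y_true) | np.isnan(y_pred))
    y_true, y_pred = y_true[mask], y_pred[mask]

    alpha = 0.9
    factor1 = alpha * np.maximum(y_true - y_pred, 0)
    factor2 = (1 - alpha) * np.maximum(y_pred - y_true, 0)
    return factor1 + factor2

Whether dropping, imputing, or failing loudly is the right choice depends on what missing values mean for your model.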
