Adding Custom Metrics programmatically through the NannyML SDK
Custom metrics in the SDK are part of the monitoring module and are managed by instantiating a new nml_sdk.monitoring.CustomMetric(). Before anything else, you will need to set the NannyML SDK address and your API token. You can see how to do this on the Cloud SDK Getting Started page.
import nannyml_cloud_sdk as nml_sdk

## First, authenticate to NannyML cloud
nml_sdk.url = "https://beta.app.nannyml.com"
nml_sdk.api_token = r"api token goes here"

## Then create a new custom metric instance
custom_metric = nml_sdk.monitoring.CustomMetric()
The CustomMetric class allows the user to perform four main actions:
Create a new custom metric
List all the existing custom metrics
Get the details of a custom metric
Delete a custom metric
Create a custom metric
The custom metric function
To create a new custom metric you need to provide a Python function, or a string containing a Python function, that receives a set of named arguments and returns a value. This function must be named calculate, estimate, aggregate, or loss, depending on the role it serves for the problem type.
For example, the calculate function, as seen below, is used for Binary Classification and Multiclass Classification.
Passing a function
import pandas as pd

def calculate(
    y_true: pd.Series,
    y_pred: pd.Series,
    y_pred_proba: pd.Series,
    chunk_data: pd.DataFrame,
    labels: list[str],
    class_probability_columns: list[str],
    **kwargs
) -> float:
    # Perform the metric calculation here
    # Return a float value
    return 1
Passing a string
calculate = """
import pandas as pd

def calculate(
    y_true: pd.Series,
    y_pred: pd.Series,
    y_pred_proba: pd.Series,
    chunk_data: pd.DataFrame,
    labels: list[str],
    class_probability_columns: list[str],
    **kwargs
) -> float:
    # Perform the metric calculation here
    # Return a float value
    return 1
"""
The calculate function is expected to accept a fixed set of arguments and return a float. When the metric is evaluated, these arguments are passed to the function and a float value must be returned. Providing a function that does not accept the correct arguments or does not return a float will cause errors during execution.
Some of the arguments passed to the calculate function are pandas objects, so the pandas library needs to be imported.
Creating the custom metric
After defining a function, you can call custom_metric.create to register it as a custom metric. In the example below, the custom metric is created for Binary Classification, passing just the calculate function.
custom_metric.create(
    calculation_function=calculate,
    name='example_metric',
    description='Example of a binary classification custom metric.',
    problem_type='BINARY_CLASSIFICATION',
)
If the provided function is not valid Python, i.e. there is a syntax error in your function, an error will be thrown:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/nannyml-cloud-sdk/nannyml_cloud_sdk/monitoring/custom_metric.py", line 210, in create
    return execute(_CREATE_CUSTOM_METRIC, {
  File "/nannyml-cloud-sdk/nannyml_cloud_sdk/client.py", line 58, in wrapper
    raise ApiError(ex.errors[0]['message']) from ex
nannyml_cloud_sdk.errors.ApiError: Provided calculate function is not valid Python code
Custom metric problem types
If you need a custom metric for Binary Classification or Multiclass Classification, you need to provide the calculate function, and you can also pass an optional estimate function. The estimate function has the following structure:
estimate_function = """
import pandas as pd

def estimate(
    estimated_target_probabilities: pd.Series,
    y_pred: pd.Series,
    y_pred_proba: pd.Series,
    chunk_data: pd.DataFrame,
    labels: list[str],
    class_probability_columns: list[str],
    **kwargs
) -> float:
    # Perform the metric estimation here
    # Return a float value
    return 1
"""
The import statements need to be repeated in every custom function definition.
The custom_metric.create function for Binary Classification or Multiclass Classification is called like this:
custom_metric.create( name="custom_metric_name", # string containing the custom metric unique name description="custom metric description", # The custom metric description problem_type="BINARY_CLASSIFICATION", # BINARY_CLASSIFICATION or MULTICLASS_CLASSIFICATION calculation_function=calculate_function, # String containing a valid python code for the calculate function estimation_function=estimate_function, # Optional string containing a valid python code for the estimate function lower_value_limit=0.0, # Optional float value for lower limit upper_value_limit=1.0, # Optional float value for upper limit)
If you need a custom metric for the Regression problem type, instead of a calculate and an estimate function you need to provide an aggregate and a loss function.
The Aggregate function structure:
aggregate_function = """
import numpy as np
import pandas as pd

def aggregate(loss: np.ndarray, chunk_data: pd.DataFrame) -> float:
    pass
"""
The Loss function structure:
loss_function = """
import numpy as np
import pandas as pd

def loss(y_true: pd.Series, y_pred: pd.Series, chunk_data: pd.DataFrame) -> np.ndarray:
    pass
"""
Then creating the custom metric looks like this:
custom_metric.create( name="custom_metric_name", # string containing the custom metric unique name description="custom metric description", # The custom metric description problem_type="REGRESSION", # REGRESSION is the only allowed value for this case loss_function=loss_function, aggregation_function=aggregate_function, lower_value_limit=0.0, # Optional float value for lower limit if it exists upper_value_limit=1.0, # Optional float value for upper limit if it exists )
Assign custom metric to a model
A new custom metric, when created, is not assigned to any model. To assign it to a model, you first need to retrieve the unique identifier of the model (model_id) and then call monitoring.Model.add_custom_metric, passing the model_id and the metric_id as parameters.
The model_id can be retrieved by calling nml_sdk.monitoring.Model.list(). This function either lists all the available models or filters them by name or problem type. It returns a list of dictionaries; the model_id is the value of the id key of a dictionary.
## Searching for models with problem type equal to binary classification
print(nml_sdk.monitoring.Model.list(problemType='BINARY_CLASSIFICATION'))
>>> [{'name': 'Model1', 'id': 1, 'problemType': 'BINARY_CLASSIFICATION', 'createdAt': datetime.datetime(2024, 8, 19, 9, 14, 59, 678112, tzinfo=datetime.timezone.utc)}]
## The model_id here is represented by 'id': 1

## Creating a custom metric using the previously defined calculate and estimate functions
custom_metric = nml_sdk.monitoring.CustomMetric()
cm = custom_metric.create(
    name="new_custom_metric",
    description="custom metric description",
    problem_type="BINARY_CLASSIFICATION",
    calculation_function=calculate_function,
    estimation_function=estimate_function,
    lower_value_limit=0.0,
    upper_value_limit=1.0,
)
print(cm)
>>> {'name': 'new_custom_metric', 'id': 1, 'problemType': 'BINARY_CLASSIFICATION', 'createdAt': datetime.datetime(2024, 8, 27, 12, 6, 17, 42718, tzinfo=datetime.timezone.utc), 'description': '', 'calculateFn': '', 'estimateFn': ''}
## Add the new custom metric to the existing model
nml_sdk.monitoring.Model.add_custom_metric(model_id=1, metric_id=cm['id'])
From now on, every time you run the model "Model1", the new custom metric will be calculated alongside the standard metrics.
A custom metric can be assigned to any existing model as long as their problem types match.
Listing the custom metrics
It is possible to list the existing custom metrics, filtering by name and problem type, as sketched below:
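A minimal sketch, assuming the CustomMetric list method accepts optional name and problem type filters as described above (the exact parameter names may differ; check the SDK reference):

## List all existing custom metrics
print(custom_metric.list())

## List custom metrics filtered by name and problem type (filter parameter names assumed)
print(custom_metric.list(name='new_custom_metric', problem_type='BINARY_CLASSIFICATION'))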
Listing custom metrics doesn't expose the function implementations. To check the code inside a custom metric, you need to call the custom_metric.get function, as in the sketch below.
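A minimal sketch, assuming get accepts a metric_id argument like delete does and returns the same dictionary fields shown by create above (confirm the exact signature in the SDK reference):

## Retrieve the full definition of a custom metric, including the stored functions
metric = custom_metric.get(metric_id=cm['id'])
print(metric['calculateFn'])  # the calculate function implementation as a string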
The function implementations returned here are raw one-line strings: line breaks are replaced by '\n', and tabs may be replaced by '\t' if your text editor does not use spaces.
Removing a custom metric from a model
Just as it is possible to assign a custom metric to a model, you can also remove a custom metric from a model, as in the sketch below.
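A minimal sketch, assuming a remove counterpart to monitoring.Model.add_custom_metric exists with the same parameters (the method name here is an assumption; check the SDK reference for the exact call):

## Remove the custom metric from the model (method name assumed)
nml_sdk.monitoring.Model.remove_custom_metric(model_id=1, metric_id=cm['id'])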
After removing a custom metric from a model, the metric will no longer be calculated when running this model, and its previous results will no longer be shown.
Deleting a custom metric
If a custom metric is no longer needed, it can be deleted:
custom_metric.delete(metric_id=1)
SDK custom metrics end-to-end examples
Binary classification
This is an example of a custom F_2 function. Note that it is possible to use external libraries in the custom code; in this example, fbeta_score from sklearn.metrics is used. For more context on custom metrics for binary classification, you can refer to the tutorial Writing Functions for Binary Classification, where the concepts of the calculate and estimate functions are defined in more detail.
import nannyml_cloud_sdk as nml_sdk

## First, authenticate to NannyML cloud
nml_sdk.url = "https://beta.app.nannyml.com"
nml_sdk.api_token = r"api token goes here"

import pandas as pd
import numpy as np
from sklearn.metrics import fbeta_score


def calculate(
    y_true: pd.Series,
    y_pred: pd.Series,
    y_pred_proba: pd.DataFrame,
    chunk_data: pd.DataFrame,
    labels: list[str],
    class_probability_columns: list[str],
) -> float:
    # labels and class_probability_columns are only needed for multiclass classification
    # and can be ignored for binary classification custom metrics
    return fbeta_score(y_true, y_pred, beta=2)


def estimate(
    estimated_target_probabilities: pd.DataFrame,
    y_pred: pd.Series,
    y_pred_proba: pd.DataFrame,
    chunk_data: pd.DataFrame,
    labels: list[str],
    class_probability_columns: list[str],
) -> float:
    # labels and class_probability_columns are only needed for multiclass classification
    # and can be ignored for binary classification custom metrics
    estimated_target_probabilities = estimated_target_probabilities.to_numpy().ravel()
    y_pred = y_pred.to_numpy()

    # Create estimated confusion matrix elements
    est_tp = np.sum(np.where(y_pred == 1, estimated_target_probabilities, 0))
    est_fp = np.sum(np.where(y_pred == 1, 1 - estimated_target_probabilities, 0))
    est_fn = np.sum(np.where(y_pred == 0, estimated_target_probabilities, 0))
    est_tn = np.sum(np.where(y_pred == 0, 1 - estimated_target_probabilities, 0))

    beta = 2
    fbeta = (1 + beta**2) * est_tp / ((1 + beta**2) * est_tp + est_fp + beta**2 * est_fn)
    fbeta = np.nan_to_num(fbeta)
    return fbeta


## Create an instance of the custom metric module
custom_metric = nml_sdk.monitoring.CustomMetric()
cm = custom_metric.create(
    name="custom_F_2",
    description="Custom implementation for F_2",
    problem_type="BINARY_CLASSIFICATION",
    calculation_function=calculate,
    estimation_function=estimate,
    lower_value_limit=0.0,
    upper_value_limit=1.0,
)

## We will add this custom metric to the existing model, model_id = 1
nml_sdk.monitoring.Model.add_custom_metric(model_id=1, metric_id=cm['id'])

# Trigger analysis of the new data
nml_sdk.monitoring.Run.trigger(model_id=1)
Multiclass Classification
This is an example of a custom F_2 multiclass classification function. For more context on custom metrics for multiclass classification, you can refer to the tutorial Writing Functions for Multiclass Classification, where the concepts of the calculate and estimate functions are defined in more detail.
import nannyml_cloud_sdk as nml_sdk

## First, authenticate to NannyML cloud
nml_sdk.url = "https://beta.app.nannyml.com"
nml_sdk.api_token = r"api token goes here"

import pandas as pd
import numpy as np
from sklearn.metrics import fbeta_score
from sklearn.preprocessing import label_binarize


def calculate(
    y_true: pd.Series,
    y_pred: pd.Series,
    y_pred_proba: pd.DataFrame,
    chunk_data: pd.DataFrame,
    labels: list[str],
    class_probability_columns: list[str],
    **kwargs
) -> float:
    return fbeta_score(y_true, y_pred, beta=2, average='macro')


def estimate(
    estimated_target_probabilities: pd.DataFrame,
    y_pred: pd.Series,
    y_pred_proba: pd.DataFrame,
    chunk_data: pd.DataFrame,
    labels: list[str],
    class_probability_columns: list[str],
    **kwargs
):
    beta = 2

    def estimate_fb(_y_pred, _y_pred_proba, beta) -> float:
        # Estimates the Fb metric.
        #
        # Parameters
        # ----------
        # _y_pred: np.ndarray
        #     Predicted class label of the sample
        # _y_pred_proba: np.ndarray
        #     Probability estimates of the sample for each class in the model.
        # beta: float
        #     beta parameter
        #
        # Returns
        # -------
        # metric: float
        #     Estimated Fb score.
        est_tp = np.sum(np.where(_y_pred == 1, _y_pred_proba, 0))
        est_fp = np.sum(np.where(_y_pred == 1, 1 - _y_pred_proba, 0))
        est_fn = np.sum(np.where(_y_pred == 0, _y_pred_proba, 0))
        est_tn = np.sum(np.where(_y_pred == 0, 1 - _y_pred_proba, 0))
        fbeta = (1 + beta**2) * est_tp / ((1 + beta**2) * est_tp + est_fp + beta**2 * est_fn)
        fbeta = np.nan_to_num(fbeta)
        return fbeta

    estimated_target_probabilities = estimated_target_probabilities.to_numpy()
    y_preds = label_binarize(y_pred, classes=labels)

    ovr_estimates = []
    for idx, _ in enumerate(labels):
        ovr_estimates.append(
            estimate_fb(
                y_preds[:, idx],
                estimated_target_probabilities[:, idx],
                beta=2,
            )
        )
    multiclass_metric = np.mean(ovr_estimates)
    return multiclass_metric


## Create an instance of the custom metric module
custom_metric = nml_sdk.monitoring.CustomMetric()
cm = custom_metric.create(
    name="custom_F_2",
    description="Custom implementation for F_2",
    problem_type="MULTICLASS_CLASSIFICATION",
    calculation_function=calculate,
    estimation_function=estimate,
    lower_value_limit=0.0,
    upper_value_limit=1.0,
)

## We will add this custom metric to the existing model, model_id = 1
nml_sdk.monitoring.Model.add_custom_metric(model_id=1, metric_id=cm['id'])

# Trigger analysis of the new data
nml_sdk.monitoring.Run.trigger(model_id=1)
Regression
To define a Regression custom metric, you need to set up a loss function and an aggregate function. These functions are used both to calculate realized performance and to estimate performance. Please refer to the document Writing Functions for Regression, where the concepts of the loss and aggregate functions are defined in more detail.
import nannyml_cloud_sdk as nml_sdk

## First, authenticate to NannyML cloud
nml_sdk.url = "https://beta.app.nannyml.com"
nml_sdk.api_token = r"api token goes here"

import numpy as np
import pandas as pd


def loss(y_true: pd.Series, y_pred: pd.Series, chunk_data: pd.DataFrame, **kwargs) -> np.ndarray:
    y_true = y_true.to_numpy()
    y_pred = y_pred.to_numpy()
    alpha = 0.9
    factor1 = alpha * np.maximum(y_true - y_pred, 0)
    factor2 = (1 - alpha) * np.maximum(y_pred - y_true, 0)
    return factor1 + factor2


def aggregate(loss: np.ndarray, chunk_data: pd.DataFrame, **kwargs) -> float:
    return loss.mean()


## Create an instance of the custom metric module
custom_metric = nml_sdk.monitoring.CustomMetric()
cm = custom_metric.create(
    name="custom_alpha_loss",
    description="Custom implementation for Direct Loss Estimation",
    problem_type="REGRESSION",
    loss_function=loss,
    aggregation_function=aggregate,
    lower_value_limit=0.0,
    upper_value_limit=None,
)

## We will add this custom metric to the existing model, model_id = 1
nml_sdk.monitoring.Model.add_custom_metric(model_id=1, metric_id=cm['id'])

# Trigger analysis of the new data
nml_sdk.monitoring.Run.trigger(model_id=1)