Getting the Probability Distribution of a Performance Metric When Some Observations Have Labels

This page describes how NannyML estimates the probability distribution of a performance metric when some observations have labels while others don't.

In some cases, experiment data contains both observations with labels and observations without labels. This happens, for example, when the model's predictions affect whether the label is ever observed. In credit scoring, we never see the label for applicants who were rejected. This is an example of a censored confusion matrix: specific elements of the confusion matrix are not available (in the credit scoring case, we never observe true and false negatives).
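
To make the idea of a censored confusion matrix concrete, here is a minimal sketch in plain Python. It is an illustration only, not NannyML's implementation. The function name `censored_confusion_counts` and the toy data are hypothetical, and the sketch assumes that a prediction of 1 means the applicant is accepted, a prediction of 0 means the applicant is rejected, and a missing label is represented as `None`.

```python
from collections import Counter


def censored_confusion_counts(y_pred, y_true):
    """Count confusion matrix cells, tracking observations without labels.

    Hypothetical helper for illustration. Assumes binary predictions (0 or 1)
    and labels that are 0, 1, or None when the label was never observed.
    """
    counts = Counter()
    for pred, label in zip(y_pred, y_true):
        if label is None:
            # The label is censored: we only know what the model predicted.
            counts["unlabeled_pos" if pred == 1 else "unlabeled_neg"] += 1
        elif pred == 1:
            counts["TP" if label == 1 else "FP"] += 1
        else:
            counts["TN" if label == 0 else "FN"] += 1
    keys = ("TP", "FP", "TN", "FN", "unlabeled_pos", "unlabeled_neg")
    return {key: counts[key] for key in keys}


# Toy credit scoring data (assumed): rejected applicants (prediction 0)
# never receive a label, so their labels are None.
y_pred = [1, 1, 1, 0, 0, 1, 0, 1]
y_true = [1, 0, 1, None, None, 1, None, 1]

result = censored_confusion_counts(y_pred, y_true)
print(result)
```

In this toy data all rejected applicants end up in `unlabeled_neg`, so the true negative and false negative counts stay at zero. That is not because the model made no such predictions, but because their outcomes were never observed.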