# Data quality

The data quality dashboard lets you analyze how missing and unseen values change over time. The video below explains how to use it:

{% embed url="<https://youtu.be/q2LauEsfr0E>" %}

<figure><img src="https://73177058-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FcgDbLHPAHMe3kHJeobtk%2Fuploads%2Fgit-blob-a9c0ef4901e2c320d0acd41fc8f1b56da1decd35%2Fdata%20quality%20issues.001.jpeg?alt=media" alt=""><figcaption><p>Data quality dashboard.</p></figcaption></figure>

There are three main components of the **Data quality** dashboard:

### **1. Filters**

{% tabs %}
{% tab title="1.1 Segmentation" %}
Segmentation allows you to split your data into groups and analyze them separately.

For a given model, every column selected for segmentation, either during configuration or in the model settings, appears under the segmentation filter. A segment is created for each distinct value in that column.

In the filter section, you can select the segments you want to see visualized. You can also select **All data** to visualize results for the entire dataset.

<figure><img src="https://73177058-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FcgDbLHPAHMe3kHJeobtk%2Fuploads%2Fgit-blob-7d15e7b5f340bdcfbd297e85bb422ba8a6ca3f8c%2FScreen%20Shot%202024-07-25%20at%206.41.58%20PM.png?alt=media" alt="" width="292"><figcaption><p>Select segments of interest</p></figcaption></figure>
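
Conceptually, segmentation groups rows by the distinct values of the chosen column. The following is a minimal sketch of that idea in plain Python, not the NannyML Cloud implementation. The `rows` data, the `region` column, and the `segment_rows` helper are hypothetical names used only for illustration.

```python
from collections import defaultdict


def segment_rows(rows, column):
    """Group rows (dicts) by the distinct values found in `column`."""
    segments = defaultdict(list)
    for row in rows:
        segments[row[column]].append(row)
    return dict(segments)


# Hypothetical data: `region` plays the role of a segmentation column.
rows = [
    {"region": "EU", "age": 31},
    {"region": "US", "age": 45},
    {"region": "EU", "age": 27},
]

segments = segment_rows(rows, "region")
segments["All data"] = rows  # the unsegmented view of the same dataset

for name, members in segments.items():
    print(name, len(members))
```

Each distinct value (`EU`, `US`) becomes its own segment, and the extra `All data` entry mirrors the option to view the entire dataset.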
{% endtab %}

{% tab title="1.2 Metrics" %}
Filter which data quality metrics you want to see. Metrics that are not calculated do not appear under the filter. You can select which data quality metrics to calculate under [model settings](https://docs.nannyml.com/cloud/v0.22.0/product-tour/model-side-panel/model-settings).

{% hint style="info" %}
Note that the unseen values metric does not apply to continuous columns. It is typically used to assess whether new categories are appearing that the model was not trained on.
{% endhint %}

<figure><img src="https://content.gitbook.com/content/cgDbLHPAHMe3kHJeobtk/blobs/qFaNYWHmIp15UrMfoMIo/data_quality_metrics.png" alt="" width="252"><figcaption><p>Metrics filter.</p></figcaption></figure>
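
To make the two metrics concrete, here is a rough sketch of how a missing values rate and an unseen values rate could be computed for a single chunk of a categorical column. It is an illustration under stated assumptions, not the calculation NannyML Cloud performs: the helper names and sample data are hypothetical, and excluding missing entries before checking for unseen categories is an assumption.

```python
import math


def is_missing(value):
    """Treat None and float NaN as missing."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def missing_rate(values):
    """Fraction of values in a chunk that are missing."""
    return sum(is_missing(v) for v in values) / len(values)


def unseen_rate(values, reference_categories):
    """Fraction of non-missing values whose category never occurs in reference.

    Assumption: missing entries are excluded before the comparison.
    """
    present = [v for v in values if not is_missing(v)]
    return sum(v not in reference_categories for v in present) / len(present)


# Hypothetical categorical column: payment method.
reference = ["card", "cash", "card", "transfer"]
analysis_chunk = ["card", "crypto", None, "cash"]

print(missing_rate(analysis_chunk))                 # 1 of 4 values is missing
print(unseen_rate(analysis_chunk, set(reference)))  # "crypto" is not in reference
```

The sketch also shows why unseen values only make sense for categorical columns: the check relies on a finite set of categories observed in the reference data.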
{% endtab %}

{% tab title="1.3 Columns" %}
Filter the resulting charts by the columns you want to see.

{% hint style="info" %}
Note that the data quality metrics can also be applied to the model output and the target, not just the features.
{% endhint %}

<figure><img src="https://content.gitbook.com/content/cgDbLHPAHMe3kHJeobtk/blobs/tUAWxcZplIPH4lyy6gjI/new_data_quality_columns.png" alt="" width="312"><figcaption><p>Columns filter.</p></figcaption></figure>
{% endtab %}

{% tab title="1.4 Alert status" %}
Select which charts to display based on their alert status: alerts in the last chunk, alerts in previous chunks, or no alerts at all. You can also include all charts regardless of whether and when alerts occurred.

<figure><img src="https://content.gitbook.com/content/cgDbLHPAHMe3kHJeobtk/blobs/F21yXJidp1O950PKNluO/new_alert_status.png" alt="" width="233"><figcaption><p>Alert status filter.</p></figcaption></figure>
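
One way to picture these categories is as a classification of each chart by its per-chunk alert flags. The sketch below is only an illustration: the `alert_status` helper and its labels are hypothetical, and treating the categories as mutually exclusive (with the last chunk taking priority) is an assumption rather than documented behavior.

```python
def alert_status(chunk_alerts):
    """Classify a chart from its per-chunk alert flags, ordered oldest to newest."""
    if not any(chunk_alerts):
        return "no alerts"
    if chunk_alerts[-1]:
        # Assumption: an alert in the last chunk takes priority over older ones.
        return "alerts in last chunk"
    return "alerts in previous chunks"


print(alert_status([False, False, True]))
print(alert_status([True, False, False]))
print(alert_status([False, False, False]))
```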
{% endtab %}

{% tab title="1.5 Tags" %}
Filter charts by the previously specified tags.

<figure><img src="https://content.gitbook.com/content/cgDbLHPAHMe3kHJeobtk/blobs/kzFBPGQCl9JBmYNWiSTo/tags.png" alt="" width="290"><figcaption><p>Tags filter.</p></figcaption></figure>
{% endtab %}
{% endtabs %}

### **2. Visualisations**

{% tabs %}
{% tab title="2.1 Sort by" %}
You can change the order of the charts based on the metric name, the number of alerts, or the recency of the alerts.

<figure><img src="https://content.gitbook.com/content/cgDbLHPAHMe3kHJeobtk/blobs/YdRFsEKXyiA5PzdOKnw5/covariate_shift_sort_by.png" alt="" width="194"><figcaption><p>Sort by window.</p></figcaption></figure>
{% endtab %}

{% tab title="2.2 Ascending/descending order" %}
For all sorting methods, the icons shown below toggle between ascending and descending order. The icon displayed depends on the selected sorting method.

* **Column Name and Method Name**: The icon toggles between alphabetical order and reverse alphabetical order. The default mode is alphabetical order.
* **Nr of Alerts**: The icon toggles between ascending and descending order based on the number of alerts. The default mode displays plots with the most alerts first.
* **Recency of Alerts**: The icon toggles between showing newer alerts first and older alerts first. The default mode shows the most recent alerts first.

<div><figure><img src="https://73177058-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FcgDbLHPAHMe3kHJeobtk%2Fuploads%2Fgit-blob-a4483533a098e43c14d66e08fecd24325233ab73%2FScreen%20Shot%202024-07-25%20at%205.42.46%20PM.png?alt=media" alt="" width="118"><figcaption><p>Toggle for 'Nr of alerts'</p></figcaption></figure> <figure><img src="https://73177058-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FcgDbLHPAHMe3kHJeobtk%2Fuploads%2Fgit-blob-3dbbbb79b9e0761387c92f967681ff37d812fdf0%2FScreen%20Shot%202024-07-25%20at%205.36.52%20PM.png?alt=media" alt="" width="120"><figcaption><p>Toggle for other sorting methods</p></figcaption></figure></div>
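
The default orderings described above can be sketched in plain Python as follows. The `charts` data and the helper functions are hypothetical and serve only to illustrate the sort keys; placing charts without alerts last when sorting by recency is an assumption.

```python
# Hypothetical charts: per-chunk alert flags, ordered oldest to newest.
charts = [
    {"column": "income", "alerts": [True, False, False]},
    {"column": "age", "alerts": [False, True, True]},
    {"column": "region", "alerts": [False, False, False]},
]


def nr_of_alerts(chart):
    """Total number of chunks that raised an alert."""
    return sum(chart["alerts"])


def last_alert_index(chart):
    """Index of the most recent chunk with an alert, or -1 if there is none."""
    return max((i for i, flag in enumerate(chart["alerts"]) if flag), default=-1)


# Defaults: alphabetical order, most alerts first, most recent alerts first.
by_name = sorted(charts, key=lambda chart: chart["column"])
by_alerts = sorted(charts, key=nr_of_alerts, reverse=True)
by_recency = sorted(charts, key=last_alert_index, reverse=True)

print([chart["column"] for chart in by_name])
print([chart["column"] for chart in by_alerts])
print([chart["column"] for chart in by_recency])
```

Toggling the icon corresponds to flipping the `reverse` argument for the selected sort key.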
{% endtab %}

{% tab title="2.3 Date range" %}
You can select a specific period of interest which applies to all charts.

<figure><img src="https://content.gitbook.com/content/cgDbLHPAHMe3kHJeobtk/blobs/xPaRsRwdV0sIMI6GSt82/date_range.png" alt="" width="342"><figcaption><p>Date range window.</p></figcaption></figure>
{% endtab %}

{% tab title="2.4 Date reset" %}
To reset a previously selected period, whether it was set with the date range or the date slider, press the "Reset" button.

<figure><img src="https://content.gitbook.com/content/cgDbLHPAHMe3kHJeobtk/blobs/kdFahAG0g4dNyuqqiKzh/plot_data_format_gifs.gif" alt=""><figcaption><p>Date range reset.</p></figcaption></figure>
{% endtab %}

{% tab title="2.5 Date slider" %}
Similar to selecting a date range, you can choose a specific period of interest by simply moving the date slider.

<figure><img src="https://content.gitbook.com/content/cgDbLHPAHMe3kHJeobtk/blobs/kdFahAG0g4dNyuqqiKzh/plot_data_format_gifs.gif" alt=""><figcaption><p>Date slider</p></figcaption></figure>
{% endtab %}

{% tab title="2.6 Chart" %}
The charts are interactive; you can hover over them for more details. Red squares signal alerts. The blue line represents the metric in the reference period, and the light blue line represents the metric in the analysis period.

You can also zoom in on any part of a chart. Simply press and hold your mouse button, then draw a square over your area of interest. To reset the zoom, just double-click on the chart.

<figure><img src="https://content.gitbook.com/content/cgDbLHPAHMe3kHJeobtk/blobs/JVsEuZllhDqmYfI0qKwR/data_quality_plot.png" alt=""><figcaption><p>Data quality plot.</p></figcaption></figure>
{% endtab %}
{% endtabs %}

### **3. Plot config**

{% tabs %}
{% tab title="3.1 Plot format" %}
There are two plot formats: line and step. A line plot connects consecutive points with straight lines to show trends, while a step plot uses horizontal and vertical segments to show the exact changes between points clearly.

<figure><img src="https://content.gitbook.com/content/cgDbLHPAHMe3kHJeobtk/blobs/TnstUQvrKerp7qJa5af1/plot_format_gifs.gif" alt=""><figcaption><p>Plot formats.</p></figcaption></figure>
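
The difference between the two formats comes down to which coordinates are drawn. As a rough sketch, a line plot draws the points as they are, whereas a step plot adds a corner before each new value. The `step_coordinates` helper below is hypothetical, and holding each value until the next point is an assumption about the step style, not a description of how the dashboard renders it.

```python
def step_coordinates(xs, ys):
    """Expand points into the corner coordinates of a step plot.

    Assumption: each value is held until the next x position,
    then the line jumps vertically to the new value.
    """
    step_xs, step_ys = [xs[0]], [ys[0]]
    for x, y, previous_y in zip(xs[1:], ys[1:], ys[:-1]):
        step_xs += [x, x]           # horizontal move to x, then vertical jump
        step_ys += [previous_y, y]
    return step_xs, step_ys


# Hypothetical metric values for three consecutive chunks.
xs, ys = [0, 1, 2], [5, 7, 6]
print(step_coordinates(xs, ys))
```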
{% endtab %}

{% tab title="3.2 Datasets" %}
Select which datasets to display: zoom in on the reference period, zoom in on the analysis period, or create a separate subplot for each.

<figure><img src="https://content.gitbook.com/content/cgDbLHPAHMe3kHJeobtk/blobs/pD9YFsPGQXRDG7jVkRV1/datasets_gif.gif" alt=""><figcaption><p>Datasets plots.</p></figcaption></figure>
{% endtab %}

{% tab title="3.3 Plot elements" %}
Toggle chart components on or off, such as alerts, confidence bands, thresholds, and legends.

<figure><img src="https://content.gitbook.com/content/cgDbLHPAHMe3kHJeobtk/blobs/M43s0FrJXM5sGeQ6xxSo/plot_elements_correct_gif.gif" alt=""><figcaption><p>Plot elements.</p></figcaption></figure>
{% endtab %}
{% endtabs %}
