Field Behavior Monitoring

Overview

Upriver provides a wide variety of monitors that are automatically set for every field in the data source. Users can also define custom monitors using Upriver's Custom Metrics (see Custom Metrics).

For every monitored metric, Upriver automatically sets threshold expectations by analyzing the data, and users can set their own thresholds on top of these values (see User Expectations in Data Source Fields).

Monitored Metrics

Field Metadata

| Metric | Description |
| --- | --- |
| Completeness (%) | Percentage of rows where the field has a non-null value. A row counts as incomplete when the value is null or, for semi-structured data, when the field is not available. |
| Default Value Completeness (%) | Completeness when the field's default value is also treated as empty, e.g., the completeness when empty strings are counted as null values. |
| Null Count | Count of rows where the field value is null or the field is not available (for semi-structured data). |
| Non-null Count | Count of rows where the field value is not null. |
| Field Uniqueness (%) | Percentage of unique values for a specific field. Calculated as cardinality / row_count. |
| Row Uniqueness | Percentage of unique rows across all fields. |
| Cardinality | Number of distinct values seen for a given field. |
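To make these definitions concrete, here is a minimal Python sketch of how the field metadata metrics could be computed over a batch of rows. It is an illustration only, not Upriver's implementation: the function names `field_metadata` and `row_uniqueness_pct`, the `default_value` parameter, and the output keys are all hypothetical. It assumes that completeness is the share of rows with a non-null value, that a missing key counts as null, and that nulls are excluded from cardinality.

```python
import json
from typing import Any


def field_metadata(rows: list[dict[str, Any]], field: str, default_value: Any = "") -> dict[str, float]:
    """Compute per-field metadata metrics for one batch of rows (illustrative only)."""
    row_count = len(rows)
    if row_count == 0:
        raise ValueError("at least one row is required")

    # A missing key (semi-structured data) is treated the same as an explicit null.
    values = [row.get(field) for row in rows]
    non_null = [v for v in values if v is not None]
    # Assumption: the default value (here an empty string) also counts as empty.
    non_default = [v for v in non_null if v != default_value]

    # Assumption: nulls are excluded from cardinality; values must be hashable.
    cardinality = len(set(non_null))

    return {
        "completeness_pct": 100.0 * len(non_null) / row_count,
        "default_value_completeness_pct": 100.0 * len(non_default) / row_count,
        "null_count": row_count - len(non_null),
        "non_null_count": len(non_null),
        "cardinality": cardinality,
        # Matches the formula in the table: cardinality / row_count.
        "field_uniqueness_pct": 100.0 * cardinality / row_count,
    }


def row_uniqueness_pct(rows: list[dict[str, Any]]) -> float:
    """Percentage of distinct rows, comparing all fields of each row."""
    # Serialize each row to a canonical string so that rows can be compared as a whole.
    distinct = {json.dumps(row, sort_keys=True, default=str) for row in rows}
    return 100.0 * len(distinct) / len(rows)


rows = [
    {"id": 1, "email": "a@example.com"},
    {"id": 2, "email": ""},
    {"id": 3},  # the email field is not available in this row
    {"id": 1, "email": "a@example.com"},
]
metrics = field_metadata(rows, "email")
uniqueness = row_uniqueness_pct(rows)
```

In this sample, one of the four rows lacks the `email` field and one holds an empty string, so the sketch reports 75% completeness and 50% default value completeness.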

Field Data Monitors

| Metric | Description | Relevant Field Type |
| --- | --- | --- |
| Mean | Average value across all rows. | Numeric |
| Standard Deviation | Standard deviation of values across all rows. | Numeric |
| Max Value | Maximum value across all rows. | Numeric |
| Min Value | Minimum value across all rows. | Numeric |
| Percentiles | Expected value for the following percentiles: 0 - 10, 20, 30, 40, 50, 60, 70, 80, 90 - 100. | Numeric |
| Sum | Sum of values across all rows. | Numeric |
| String Length (mean) | Average string length across all rows. | String |
| String Length (standard deviation) | Standard deviation of string lengths across all rows. | String |
| String Length (max) | Maximum string length across all rows. | String |
| String Length (min) | Minimum string length across all rows. | String |
| String Length Percentiles | Expected string lengths for the following percentiles: 0 - 10, 20, 30, 40, 50, 60, 70, 80, 90 - 100. | String |
| Array Size (mean) | Average array size across all rows. | Array |
| Array Size (standard deviation) | Standard deviation of array sizes across all rows. | Array |
| Array Size (max) | Maximum array size across all rows. | Array |
| Array Size (min) | Minimum array size across all rows. | Array |
| Array Size Percentiles | Expected array sizes for the following percentiles: 0 - 10, 20, 30, 40, 50, 60, 70, 80, 90 - 100. | Array |
| Value Distribution (Histogram) | The distribution of the different values seen for a given field. Available only for categorical fields. | Any |
| Semantic Format Distribution | The semantic formats seen for a given field (e.g., UUID, SSN, email, IP) and the percentage of rows in which the field adheres to each format. | Numeric, String |
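The data monitors above can be illustrated with a short Python sketch that uses only the standard library. It is a sketch under stated assumptions rather than a description of how Upriver computes these values: the function names, the regular expressions in `FORMAT_PATTERNS`, and the `unknown` label are hypothetical, the standard deviation is taken as the population standard deviation, and only the 10th through 90th percentiles are computed for brevity. String length and array size monitors follow by applying the same numeric summary to the lengths.

```python
import re
import statistics
from collections import Counter
from typing import Any, Sequence


def numeric_summary(values: Sequence[float]) -> dict[str, float]:
    """Mean, standard deviation, max, min, and sum for a numeric field."""
    return {
        "mean": statistics.fmean(values),
        # Assumption: population standard deviation; the sample version may be intended.
        "std": statistics.pstdev(values),
        "max": max(values),
        "min": min(values),
        "sum": sum(values),
    }


def deciles(values: Sequence[float]) -> dict[int, float]:
    """10th to 90th percentiles; needs at least two values."""
    cuts = statistics.quantiles(values, n=10, method="inclusive")
    return {10 * (i + 1): cut for i, cut in enumerate(cuts)}


def value_distribution(values: Sequence[Any]) -> dict[Any, float]:
    """Histogram of a categorical field, as a percentage of rows per value."""
    return {value: 100.0 * n / len(values) for value, n in Counter(values).items()}


# Hypothetical, deliberately simple patterns; real format detection is stricter.
FORMAT_PATTERNS = {
    "uuid": re.compile(r"[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}"),
    "email": re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+"),
    "ipv4": re.compile(r"(\d{1,3}\.){3}\d{1,3}"),
}


def semantic_format_distribution(values: Sequence[Any]) -> dict[str, float]:
    """Percentage of rows whose value fully matches each known format."""
    counts: Counter = Counter()
    for value in values:
        text = str(value)
        # Take the first pattern that matches the whole value, else "unknown".
        label = next(
            (name for name, pattern in FORMAT_PATTERNS.items() if pattern.fullmatch(text)),
            "unknown",
        )
        counts[label] += 1
    return {label: 100.0 * n / len(values) for label, n in counts.items()}


summary = numeric_summary([1, 2, 3, 4, 5])
percentiles = deciles([1, 2, 3, 4, 5])
# String length monitors reuse the numeric summary on the lengths.
length_summary = numeric_summary([len(s) for s in ["ab", "abcd"]])
histogram = value_distribution(["red", "blue", "red", "red"])
formats = semantic_format_distribution(["a@example.com", "10.0.0.1", "hello", "b@example.org"])
```

For the last call, two of the four sample values match the email pattern, one matches the IPv4 pattern, and one matches nothing, so the distribution is 50% email, 25% ipv4, and 25% unknown.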
