Data Source Configuration
Upriver allows you to add and monitor any data source with just a few clicks.
Configuration Attributes
Each data source has the following attributes that make up its configuration.

Basic Configuration
Name
This is the name under which the data source is displayed in the data set catalog and on other pages (such as incidents and notifications).
This name can be changed to something more meaningful than the name given to the physical storage.
Owners
You can assign users to a data source by selecting from any of the users who have access to the platform. This feature enhances collaboration by clearly identifying who is responsible for each data source.
When a user is assigned as the owner, they will be tagged and notified whenever incidents occur related to their data source, ensuring timely attention and resolution.
Type
Select the data source type from the supported types. For each data source type, a different sub-menu will appear with the relevant details for the data source.
Advanced Configuration
Sampling Rate
The fraction of the data that will be used to create the profile. Users can either set a fixed rate as a value between 0 and 1 (for example, 0.1 samples 10% of the data) or select dynamic sampling.
If the user selects Dynamic Sampling, Upriver's algorithms choose the sampling rate based on the volume of data and the distribution they observe.
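To make the meaning of a fixed sampling rate concrete, here is a minimal sketch of fraction-based sampling. It is an illustration only, not Upriver's implementation: the function name sample_rows and the seed parameter are hypothetical, and dynamic sampling is not modeled.

```python
import random

def sample_rows(rows, rate, seed=0):
    """Keep each row independently with probability `rate` (0 to 1).

    Hypothetical helper for illustration; not part of Upriver.
    """
    if not 0 <= rate <= 1:
        raise ValueError("rate must be between 0 and 1")
    rng = random.Random(seed)  # seeded so the example is reproducible
    return [row for row in rows if rng.random() < rate]

rows = list(range(1000))
sample = sample_rows(rows, rate=0.1)
print(len(sample))  # close to 100; the exact count depends on the seed
```

A rate of 1 keeps every row and a rate of 0 keeps none; values in between trade profile accuracy for less data processed.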
Update Time
The interval at which the data analysis process will run. Users can either select one of the available hours from the drop-down or select dynamic update.
If the user selects Dynamic Update, Upriver's algorithms choose the relevant interval by analyzing the update patterns seen for the data source.
Pivot Fields
The names of the columns used to create the different "pivots". See Pivot Fields.
Filter
A filter can be used to monitor only the parts of the data where a certain condition holds true. To use the filter, enter the name of the field/column to filter on and the desired value.
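Conceptually, a filter defined by a field name and a value behaves like an equality check on each row. The sketch below assumes rows are represented as dictionaries and that matching is exact equality; both are assumptions for illustration, and apply_filter is a hypothetical name rather than an Upriver function.

```python
def apply_filter(rows, field, value):
    """Keep only rows whose `field` equals `value` (assumed exact match)."""
    return [row for row in rows if row.get(field) == value]

rows = [
    {"country": "US", "amount": 10},
    {"country": "DE", "amount": 7},
    {"country": "US", "amount": 3},
]

# Only the rows where country == "US" would be monitored.
monitored = apply_filter(rows, field="country", value="US")
print(monitored)
```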
Cardinality Threshold
The number of distinct values used to decide whether a field is categorical.
Fields with fewer distinct values than the set threshold will be considered categorical.
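The rule above can be sketched as a simple distinct-count comparison. This is an illustrative sketch, not Upriver's code: is_categorical is a hypothetical name, and ignoring missing values (None) when counting distinct values is an assumption.

```python
def is_categorical(values, cardinality_threshold):
    """Return True if the field has fewer distinct values than the threshold.

    Assumption: None is treated as missing and not counted as a value.
    """
    distinct = {v for v in values if v is not None}
    return len(distinct) < cardinality_threshold

status = ["open", "closed", "open", "pending", "closed"]
user_ids = [101, 102, 103, 104, 105]

print(is_categorical(status, cardinality_threshold=5))    # 3 distinct values
print(is_categorical(user_ids, cardinality_threshold=5))  # 5 distinct values
```

With a threshold of 5, the status field (3 distinct values) is categorical, while user_ids (5 distinct values) is not, because the comparison is strictly "fewer than".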
Default Incident Severity
The user can set the default severity that will be assigned to incidents related to the data source. See Managing Incidents.
Staleness Threshold
The maximum number of days a data source can go without receiving data before it is considered "stale". A stale data source is one that is no longer being updated.
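As a rough illustration of the staleness check, the sketch below compares the time since data was last seen against the threshold in days. The function name is_stale and its parameters are hypothetical, and the use of UTC timestamps is an assumption.

```python
from datetime import datetime, timedelta, timezone

def is_stale(last_data_seen, staleness_threshold_days, now=None):
    """Return True if more than the threshold number of days has passed
    since data was last seen. Hypothetical helper for illustration."""
    if now is None:
        now = datetime.now(timezone.utc)
    return now - last_data_seen > timedelta(days=staleness_threshold_days)

now = datetime(2024, 1, 10, tzinfo=timezone.utc)
last_seen = datetime(2024, 1, 5, tzinfo=timezone.utc)  # 5 days earlier

print(is_stale(last_seen, staleness_threshold_days=3, now=now))  # 5 days > 3
print(is_stale(last_seen, staleness_threshold_days=7, now=now))  # 5 days <= 7
```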
Timestamp Column
Specifies the name of the column to be used as the timestamp for each row in the table.
External ID
A user-defined identifier that represents the ID used for the data source in the user's own systems.
Webhook
A URL where incidents will be sent. Enables integration with external systems by forwarding incident data in real time.
Nullify Empty Strings (Checkbox)
If enabled, empty strings ("") will be treated as empty fields when calculating the completeness of a field.
Nullify Empty Arrays (Checkbox)
If enabled, empty arrays ([]) will be treated as empty fields when calculating the completeness of a field.
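The effect of the two checkboxes on completeness can be sketched as follows. This assumes completeness is the share of non-empty values in a field, which is an interpretation rather than a documented formula; the function name and parameter names are hypothetical.

```python
def completeness(values, nullify_empty_strings=False, nullify_empty_arrays=False):
    """Share of values that count as non-empty (assumed definition)."""
    def is_empty(v):
        if v is None:
            return True
        if nullify_empty_strings and v == "":
            return True
        if nullify_empty_arrays and isinstance(v, list) and len(v) == 0:
            return True
        return False

    if not values:
        return 0.0  # assumption: a field with no values has zero completeness
    filled = sum(1 for v in values if not is_empty(v))
    return filled / len(values)

values = ["a", "", None, [], "b"]

print(completeness(values))                              # only None is empty
print(completeness(values, nullify_empty_strings=True))  # "" is also empty
print(completeness(values, nullify_empty_strings=True,
                   nullify_empty_arrays=True))           # "" and [] are also empty
```

For this sample field, completeness drops from 0.8 to 0.6 to 0.4 as each option is enabled, since more values are treated as empty.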