Redshift

Integrating Amazon Redshift with Upriver

Amazon Redshift is a fully managed, petabyte-scale data warehouse solution provided by Amazon Web Services (AWS). By integrating Redshift with Upriver, you can streamline data governance and maintain high data quality directly within your data warehouse. This integration enables efficient data monitoring, traceability, and consistency, ensuring your analytics and reporting pipelines remain accurate and reliable.

Prerequisites

To monitor Amazon Redshift with Upriver, complete the following steps:

  1. Provision AWS for Upriver operation.

  2. Configure a Redshift integration in Upriver:

    1. Create a Redshift user for Upriver.

    2. Grant Upriver's user access to your Redshift instance.

Step 1: Provision AWS

Before connecting Redshift to Upriver, make sure your AWS account is properly set up. Follow the guidelines provided on this page to ensure correct setup.

Provisioning AWS for Redshift access is only needed for Hybrid/Outpost deployments.

Step 2: Configure Redshift integration in Upriver's Platform

Once your AWS account is set up, it's time to configure a Redshift integration in the Upriver platform by following these steps:

  1. Configure a new Data Integration: navigate to Settings -> Data Integrations and click "+ Add" in the top-right corner.

  2. Fill in basic settings:

    • Name - A user-defined name.

    • Type - The type of integration. Choose Redshift.

  3. Create a Redshift user for Upriver: To connect to Redshift, Upriver needs a dedicated user in the database. You can run the following script to create the user. Replace <username> and <password> with the username and password you want to assign to it. Replace <schema_name> with the schema you want to grant access to (repeat the grants for each schema if there is more than one):

Upriver supports different users for different hosts, but it does not support different users for different databases/schemas accessed via the same host.
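
As a rough illustration of what a user-creation script of this kind might contain, the sketch below generates standard Redshift statements for one user and one or more schemas. It is not Upriver's official script: the helper name `build_upriver_user_sql` is hypothetical, and the privileges it grants (USAGE on the schema and SELECT on its tables) are an assumption about what read-only monitoring would need. Check the generated SQL against your own access policy before running it.

```python
import re

# Conservative pattern for unquoted identifiers (an assumption made for this sketch).
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def _check_identifier(name: str) -> str:
    """Reject names that would need quoting, to keep the generated SQL simple."""
    if not _IDENTIFIER.match(name):
        raise ValueError(f"unsupported identifier: {name!r}")
    return name


def build_upriver_user_sql(username, password, schemas):
    """Return SQL statements that create a user and grant it read access.

    The choice of privileges (USAGE + SELECT) is an assumption, not a
    documented Upriver requirement.
    """
    user = _check_identifier(username)
    # Refuse quotes so the password can be embedded safely in a string literal.
    if "'" in password:
        raise ValueError("password must not contain a single quote")
    statements = [f"CREATE USER {user} PASSWORD '{password}';"]
    for schema in schemas:
        schema = _check_identifier(schema)
        statements.append(f"GRANT USAGE ON SCHEMA {schema} TO {user};")
        statements.append(f"GRANT SELECT ON ALL TABLES IN SCHEMA {schema} TO {user};")
    return statements


if __name__ == "__main__":
    # Placeholder values; substitute your own <username>, <password> and <schema_name>.
    for statement in build_upriver_user_sql("upriver_user", "Example-Passw0rd", ["analytics", "sales"]):
        print(statement)
```

Generating the statements rather than typing them by hand makes it easier to repeat the grants for several schemas. Note that `GRANT SELECT ON ALL TABLES` only covers tables that exist when it runs, so tables created later may need an additional grant.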

  4. Grant Upriver's user access to the Redshift instance: To give Upriver access to your Redshift, set up a Redshift-managed VPC endpoint by following the AWS guide for Redshift-managed VPC endpoints. Alternatively, you can make Redshift publicly accessible and use a security group or firewall to limit access as needed. For SaaS deployments, contact your Upriver representative to get the VPC you need to connect to, so that they can complete the setup on Upriver's side. For Outpost/Hybrid deployments, connect to the VPC created by the CloudFormation script (details are available in the CloudFormation outputs). Note that you will need to create a security group that allows Redshift access for that VPC yourself.

  5. Fill in the connection details:

    • Host: The URL of the host for your Redshift. If you've set up a VPC endpoint, this is the host of that endpoint (for SaaS deployments, you'll receive the host from your Upriver representative). For a publicly accessible Redshift, this is the workgroup host for serverless Redshift, or the cluster endpoint for provisioned Redshift.

    • Port: The port used in the connection. By default, this is 5439.

    • User: Upriver's Redshift username, as created in step 3.

    • Password: Upriver's Redshift user's password.
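
Before entering these values in the form, it can help to confirm that the host and port are reachable from the network Upriver will connect from. The sketch below is one way to do that, under stated assumptions: the `RedshiftConnectionSettings` class and the `is_reachable` helper are hypothetical names that simply mirror the four fields above, and the check only opens a TCP connection without authenticating.

```python
import socket
from dataclasses import dataclass


@dataclass(frozen=True)
class RedshiftConnectionSettings:
    """Hypothetical container mirroring the Host, Port, User and Password fields."""

    host: str
    user: str
    password: str
    port: int = 5439  # default port mentioned in the connection details

    def __post_init__(self) -> None:
        if not self.host.strip():
            raise ValueError("host must not be empty")
        if not 1 <= self.port <= 65535:
            raise ValueError("port must be between 1 and 65535")


def is_reachable(settings: RedshiftConnectionSettings, timeout_seconds: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port can be opened.

    This checks network reachability only (VPC endpoint, security group,
    firewall); it does not verify the username or password.
    """
    try:
        with socket.create_connection((settings.host, settings.port), timeout=timeout_seconds):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Placeholder host and credentials; replace with your own endpoint details.
    settings = RedshiftConnectionSettings(
        host="example-host.invalid",
        user="upriver_user",
        password="Example-Passw0rd",
    )
    print("reachable:", is_reachable(settings))
```

A failed check usually points at the network path (endpoint, security group or firewall rules) rather than at the credentials, which narrows down where to look.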

Step 3: Configure the Data Source in Upriver's Platform

Now that your Amazon Redshift integration is set up, you can configure Data Sources to be monitored.

To do this:

  1. Follow the instructions provided in the Data Source Configuration section of the documentation.

  2. When configuring an Amazon Redshift Data Source, choose the correct integration in the Connection step. The integration you choose should point to the Redshift instance that holds the table you wish to monitor.

  3. Provide the required Database, Schema and Table.

  4. Continue with the rest of the configuration as needed.
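
If a Data Source fails to find its table, one possible troubleshooting step is to check, while logged in as Upriver's user, whether the Database, Schema and Table combination is visible at all. The sketch below builds a parameterized lookup against `information_schema.tables` for that purpose. The helper name `build_table_visibility_query` is hypothetical, the `%s` placeholder style is an assumption that matches common PostgreSQL-compatible drivers, and the sample values are placeholders.

```python
from typing import Tuple


def build_table_visibility_query(database: str, schema: str, table: str) -> Tuple[str, Tuple[str, str, str]]:
    """Return a parameterized query and its parameters for a table lookup.

    The query is meant to be run while connected to `database` as the user
    configured in the integration; a non-empty result means the table is
    visible to that user.
    """
    for label, value in (("database", database), ("schema", schema), ("table", table)):
        if not value.strip():
            raise ValueError(f"{label} must not be empty")
    sql = (
        "SELECT table_type FROM information_schema.tables "
        "WHERE table_catalog = %s AND table_schema = %s AND table_name = %s"
    )
    return sql, (database, schema, table)


if __name__ == "__main__":
    # Placeholder names; use the values entered in the Data Source form.
    query, params = build_table_visibility_query("dev", "analytics", "orders")
    print(query)
    print(params)
```

Passing the names as parameters rather than formatting them into the SQL string avoids quoting problems. Redshift folds unquoted identifiers to lowercase by default, so the values may need to be given in lowercase to match.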


Monitor and Manage Your Data

After you configure a Redshift Data Source, Upriver monitors it automatically. You can track data issues and enforce governance policies, helping keep your data accurate and trustworthy throughout the pipeline.
