Align Redshift page with new refactored docs structure (#4358)
AndrewRTsao authored Mar 15, 2024
1 parent fd8dd8d commit d5e5f9b
Showing 13 changed files with 83 additions and 37 deletions.
1 change: 1 addition & 0 deletions docs/docs/build/connect/connect.md
@@ -16,6 +16,7 @@ To provide a non-exhaustive list, Rill supports the following connectors:
- [Azure Blob Storage](/reference/connectors/azure.md)
- [BigQuery](/reference/connectors/bigquery.md)
- [Athena](/reference/connectors/athena.md)
- [Redshift](/reference/connectors/redshift.md)
- [DuckDB and MotherDuck](/reference/connectors/motherduck.md)
- [PostgreSQL](/reference/connectors/postgres.md)
- [MySQL](/reference/connectors/mysql.md)
4 changes: 2 additions & 2 deletions docs/docs/reference/connectors/athena.md
@@ -21,7 +21,7 @@ To check if you already have the AWS CLI installed and authenticated, open a ter
```bash
aws iam get-user --no-cli-pager
```
If it prints information about your user, there is nothing more to do. Rill will be able to connect to Athena that you have access to.
If it prints information about your user, there is nothing more to do. Rill will be able to connect to any existing Athena instances that your user has privileges to access.

If you do not have the AWS CLI installed and authenticated, follow these steps:

@@ -50,7 +50,7 @@ If this project has already been deployed to Rill Cloud and credentials have bee
When deploying a project to Rill Cloud (i.e. `rill deploy`), Rill requires you to explicitly provide an access key and secret for an AWS service account with access to Athena used in your project.
If you subsequently add sources that require new credentials (or if you input the wrong credentials during the initial deploy), you can update the credentials used by Rill Cloud by running:
If you subsequently add sources that require new credentials (or if you had simply input the wrong credentials during the initial deploy), you can update the credentials used by Rill Cloud by running:
```
rill env configure
```
1 change: 1 addition & 0 deletions docs/docs/reference/connectors/connectors.md
@@ -23,6 +23,7 @@ We are always adding new connectors as part of our release cycle. If there's a s
- [Azure Blob Storage (Azure)](azure.md)
- [BigQuery](bigquery.md)
- [Amazon Athena](athena.md)
- [Amazon Redshift](redshift.md)
- [DuckDB / MotherDuck](motherduck.md)
- [PostgreSQL](postgres.md)
- [MySQL](mysql.md)
4 changes: 2 additions & 2 deletions docs/docs/reference/connectors/googlesheets.md
@@ -2,13 +2,13 @@
title: Google Sheets
description: Connect to data in Google Sheets
sidebar_label: Google Sheets
sidebar_position: 12
sidebar_position: 13
---


### Google Sheets

Rill has the ability to read from any http(s) URL endpoint that produces a valid data file in a supported format. For example, to bring in data from Google Sheets as a CSV file directly into Rill as a source ([leveraging the direct download link syntax](https://www.highviewapps.com/blog/how-to-create-a-csv-or-excel-direct-download-link-in-google-sheets/)), you can create a `source_name.yaml` file in the `sources` directory of your Rill project directory with the following content:
Rill has the ability to read from any http(s) URL endpoint that produces a valid data file in a supported format. For example, to bring in data from [Google Sheets](https://www.google.com/sheets/about/) as a CSV file directly into Rill as a source ([leveraging the direct download link syntax](https://www.highviewapps.com/blog/how-to-create-a-csv-or-excel-direct-download-link-in-google-sheets/)), you can create a `source_name.yaml` file in the `sources` directory of your Rill project directory with the following content:

```yaml
type: "duckdb"
2 changes: 1 addition & 1 deletion docs/docs/reference/connectors/motherduck.md
@@ -2,7 +2,7 @@
title: DuckDB / MotherDuck
description: Connect to data in DuckDB locally or MotherDuck
sidebar_label: DuckDB / MotherDuck
sidebar_position: 6
sidebar_position: 7
---

<!-- WARNING: There are links to this page in source code. If you move it, find and replace the links and consider adding a redirect in docusaurus.config.js. -->
2 changes: 1 addition & 1 deletion docs/docs/reference/connectors/mysql.md
@@ -2,7 +2,7 @@
title: MySQL
description: Connect to data in MySQL
sidebar_label: MySQL
sidebar_position: 8
sidebar_position: 9
---

<!-- WARNING: There are links to this page in source code. If you move it, find and replace the links and consider adding a redirect in docusaurus.config.js. -->
2 changes: 1 addition & 1 deletion docs/docs/reference/connectors/postgres.md
@@ -2,7 +2,7 @@
title: PostgreSQL
description: Connect to data in PostgreSQL
sidebar_label: PostgreSQL
sidebar_position: 7
sidebar_position: 8
---

<!-- WARNING: There are links to this page in source code. If you move it, find and replace the links and consider adding a redirect in docusaurus.config.js. -->
92 changes: 68 additions & 24 deletions docs/docs/reference/connectors/redshift.md
@@ -2,24 +2,26 @@
title: Amazon Redshift
description: Connect to data in Amazon Redshift
sidebar_label: Redshift
sidebar_position: 40
sidebar_position: 6
---

<!-- WARNING: There are links to this page in source code. If you move it, find and replace the links and consider adding a redirect in docusaurus.config.js. -->

## How to configure credentials in Rill
## Overview

How you configure access to Redshift depends on whether you are developing a project locally using `rill start` or are setting up a deployment using `rill deploy`.
[Amazon Redshift](https://docs.aws.amazon.com/redshift/) is a fully managed, petabyte-scale data warehouse service in the cloud, offering fast query and I/O performance for data analysis applications. It enables users to run complex analytical queries against structured data using SQL, ETL processes, and BI tools, leveraging massively parallel processing (MPP) to efficiently handle large volumes of data. Redshift's architecture is designed for high performance on large datasets, supporting data warehousing and analytics of all sizes, making it a pivotal component in a modern data-driven decision-making ecosystem. By leveraging the AWS SDK for Go and utilizing intermediary parquet files in S3 (to ensure performance), Rill is able to connect to and read from Redshift as a source.

### Configure credentials for local development
![Connecting to Redshift](/img/reference/connectors/redshift/redshift.png)

When developing a project locally, Rill uses the credentials configured in your local environment using the AWS CLI.
## Local credentials

When using Rill Developer on your local machine (i.e. `rill start`), Rill uses the credentials configured in your local environment using the AWS CLI.

To check if you already have the AWS CLI installed and authenticated, open a terminal window and run:
```bash
aws iam get-user --no-cli-pager
```
If it prints information about your user, there is nothing more to do. Rill will be able to connect to Redshift that you have access to.
If it prints information about your user, there is nothing more to do. Rill will be able to connect to any existing Redshift databases that your user has privileges to access.

If you do not have the AWS CLI installed and authenticated, follow these steps:

@@ -38,32 +40,74 @@ If you do not have the AWS CLI installed and authenticated, follow these steps:
You have now configured AWS access from your local environment. Rill will detect and use your credentials next time you try to ingest a source.
### Configure credentials for deployments on Rill Cloud
:::tip Did you know?
If this project has already been deployed to Rill Cloud and credentials have been set for this source, you can use `rill env pull` to [pull these cloud credentials](/build/credentials/credentials.md#rill-env-pull) locally (into your local `.env` file). Please note that this may override any credentials that you have set locally for this source.
When deploying a project to Rill Cloud, Rill requires you to explicitly provide an access key and secret for an AWS service account with access to Redshift used in your project.
:::
When you first deploy a project using `rill deploy`, you will be prompted to provide credentials for the remote sources in your project that require authentication.
## Cloud deployment
If you subsequently add sources that require new credentials (or if you input the wrong credentials during the initial deploy), you can update the credentials used by Rill Cloud by running:
When deploying a project to Rill Cloud (i.e. `rill deploy`), Rill requires you to explicitly provide an access key and secret for an AWS service account with access to the Redshift database used in your project.
When you first deploy a project using `rill deploy`, you will be prompted to provide credentials for the remote sources in your project that require authentication. If you subsequently add sources that require new credentials (or if you had simply input the wrong credentials during the initial deploy), you can update the credentials used by Rill Cloud by running:
```
rill env configure
```
:::info
Note that you must `cd` into the Git repository that your project was deployed from before running `rill env configure`.
:::
:::tip Did you know?
If you've configured credentials locally already (in your `<RILL_HOME>/.home` file), you can use `rill env push` to [push these credentials](/build/credentials/credentials.md#rill-env-push) to your Rill Cloud project. This will allow other users to retrieve / reuse the same credentials automatically by running `rill env pull`.
:::
## Appendix
### Redshift Serverless permissions
Associate a IAM role (that has S3 access) with the Redshift Serverless namespace or the Redshift cluster(https://docs.aws.amazon.com/redshift/latest/mgmt/serverless-iam.html). Redshift connector places temporary files in S3 to accelerate extraction.
Redshift connector does the following AWS queries while ingesting data from Redshift:
1. Redshift Serverless:[`GetCredentials`](https://docs.aws.amazon.com/redshift-data/latest/APIReference/API_ExecuteStatement.html) if you use Workgroup name to connect.
1. Reshift Data API:[`DescribeStatement`, `ExecuteStatement`](https://docs.aws.amazon.com/redshift-data/latest/APIReference/API_ExecuteStatement.html) to unload data to S3.
1. S3:[`ListObjects`](https://docs.aws.amazon.com/AmazonS3/latest/API/API_ListObjects.html) to identify files unloaded by Redshift
1. S3:[`GetObject`](https://docs.aws.amazon.com/AmazonS3/latest/API/API_GetObject.html) to ingest files unloaded by Redshift.
Make sure your account or a service account have corresponding permissions to perform these requests.
### Redshift cluster permissions
1. Reshift:[`GetClusterCredentialsWithIAM`](https://docs.aws.amazon.com/redshift-data/latest/APIReference/API_ExecuteStatement.html) if you use Cluster Identifier to connect.
1. Reshift Data API:[`DescribeStatement`, `ExecuteStatement`](https://docs.aws.amazon.com/redshift-data/latest/APIReference/API_ExecuteStatement.html) to unload data to S3.
1. S3:[`ListObjects`](https://docs.aws.amazon.com/AmazonS3/latest/API/API_ListObjects.html) to identify files unloaded by Redshift
1. S3:[`GetObject`](https://docs.aws.amazon.com/AmazonS3/latest/API/API_GetObject.html) to ingest files unloaded by Redshift.
When using **Redshift Serverless**, make sure to associate an [IAM role (that has S3 access)](https://docs.aws.amazon.com/redshift/latest/mgmt/serverless-iam.html) with the Serverless namespace or the Redshift cluster.
:::info What happens when Rill is reading from Redshift Serverless?
Our Redshift connector will place temporary files in parquet format in S3 to help accelerate the extraction process (maximizes performance). To provide some more details, the Redshift connector will execute the following queries / requests while ingesting data from Redshift:
1. Redshift Serverless:[`GetCredentials`](https://docs.aws.amazon.com/redshift-data/latest/APIReference/API_ExecuteStatement.html) if you are using _Workgroup_ name to connect.
2. Reshift Data API:[`DescribeStatement`, `ExecuteStatement`](https://docs.aws.amazon.com/redshift-data/latest/APIReference/API_ExecuteStatement.html) to unload data to S3.
3. S3:[`ListObjects`](https://docs.aws.amazon.com/AmazonS3/latest/API/API_ListObjects.html) to identify files unloaded by Redshift.
4. S3:[`GetObject`](https://docs.aws.amazon.com/AmazonS3/latest/API/API_GetObject.html) to ingest files unloaded by Redshift.
:::
:::warning Check your service account permissions
Your account or service account will need to have the <u>appropriate permissions</u> necessary to perform these requests.
:::
### Redshift Cluster permissions
Similarly, when using **Redshift Cluster**, make sure to associate an [IAM role (that has S3 access)](https://docs.aws.amazon.com/redshift/latest/mgmt/redshift-iam-authentication-access-control.html) with the appropriate Redshift cluster.
:::info What happens when Rill is reading from a Redshift Cluster?
Our Redshift connector will place temporary files in parquet format in S3 to help accelerate the extraction process (maximizes performance). To provide some more details, the Redshift connector will execute the following queries / requests while ingesting data from Redshift:
1. Redshift:[`GetClusterCredentialsWithIAM`](https://docs.aws.amazon.com/redshift-data/latest/APIReference/API_ExecuteStatement.html) if you are using _Cluster Identifier_ to connect.
2. Redshift Data API:[`DescribeStatement`, `ExecuteStatement`](https://docs.aws.amazon.com/redshift-data/latest/APIReference/API_ExecuteStatement.html) to unload data to S3.
3. S3:[`ListObjects`](https://docs.aws.amazon.com/AmazonS3/latest/API/API_ListObjects.html) to identify files unloaded by Redshift.
4. S3:[`GetObject`](https://docs.aws.amazon.com/AmazonS3/latest/API/API_GetObject.html) to ingest files unloaded by Redshift.
:::
:::warning Check your service account permissions
Your account or service account will need to have the <u>appropriate permissions</u> necessary to perform these requests.
:::
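
Reviewer note on the `redshift.md` changes above: the new warning boxes mention "appropriate permissions" without spelling them out. As a minimal sketch only, the requests listed in the hunk could map to an IAM policy like the one built below. The function name `redshift_read_policy` and the bucket name `example-unload-bucket` are hypothetical and do not come from the Rill docs. The wildcard `Resource` on the Redshift statement is an assumption made for brevity and should be narrowed in a real setup.

```python
import json


def redshift_read_policy(bucket: str, serverless: bool = True) -> dict:
    """Build an IAM policy document covering the requests listed in the diff.

    `bucket` is a hypothetical name for the S3 bucket that holds the
    temporary parquet files; it is not taken from the Rill docs.
    """
    # The credential call differs by deployment type:
    # Workgroup name (Serverless) vs. Cluster Identifier (provisioned cluster).
    credentials_action = (
        "redshift-serverless:GetCredentials"
        if serverless
        else "redshift:GetClusterCredentialsWithIAM"
    )
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": [
                    credentials_action,
                    "redshift-data:ExecuteStatement",
                    "redshift-data:DescribeStatement",
                ],
                # Assumption: left wide open for brevity; scope down in practice.
                "Resource": "*",
            },
            {
                # The S3 ListObjects API is authorized by the s3:ListBucket action.
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": f"arn:aws:s3:::{bucket}",
            },
            {
                # GetObject applies to the objects inside the bucket.
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": f"arn:aws:s3:::{bucket}/*",
            },
        ],
    }


if __name__ == "__main__":
    print(json.dumps(redshift_read_policy("example-unload-bucket"), indent=2))
```

Passing `serverless=False` swaps in the cluster credential action, which mirrors the only difference between the two permission lists in the hunk.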
4 changes: 2 additions & 2 deletions docs/docs/reference/connectors/salesforce.md
@@ -2,14 +2,14 @@
title: Salesforce
description: Connect to data in a Salesforce org using the Bulk API
sidebar_label: Salesforce
sidebar_position: 11
sidebar_position: 12
---

<!-- WARNING: There are links to this page in source code. If you move it, find and replace the links and consider adding a redirect in docusaurus.config.js. -->

## Overview

Salesforce is a leading cloud-based Customer Relationship Management (CRM) platform designed to help businesses connect with and understand their customers better. It offers a comprehensive suite of applications focused on sales, customer service, marketing automation, analytics, and application development. Salesforce enables organizations of all sizes to build stronger relationships with their customers through personalized experiences, streamlined communication, and predictive insights. Rill is able to ingest data from Salesforce as a source by utilizing the Bulk API, which requires a Salesforce username along with a password (and in some cases, a token, depending on the org configuration) to authenticate against a Salesforce org.
[Salesforce](https://www.salesforce.com/) is a leading cloud-based Customer Relationship Management (CRM) platform designed to help businesses connect with and understand their customers better. It offers a comprehensive suite of applications focused on sales, customer service, marketing automation, analytics, and application development. Salesforce enables organizations of all sizes to build stronger relationships with their customers through personalized experiences, streamlined communication, and predictive insights. Rill is able to ingest data from Salesforce as a source by utilizing the Bulk API, which requires a Salesforce username along with a password (and in some cases, a token, depending on the org configuration) to authenticate against a Salesforce org.

![Connecting to Salesforce](/img/reference/connectors/salesforce/salesforce.png)

4 changes: 2 additions & 2 deletions docs/docs/reference/connectors/snowflake.md
@@ -2,14 +2,14 @@
title: Snowflake
description: Connect to data in Snowflake
sidebar_label: Snowflake
sidebar_position: 10
sidebar_position: 11
---

<!-- WARNING: There are links to this page in source code. If you move it, find and replace the links and consider adding a redirect in docusaurus.config.js. -->

## Overview

Snowflake is a cloud-based data platform designed to facilitate data warehousing, data lakes, data engineering, data science, data application development, and data sharing. It separates compute and storage, enabling users to scale up or down instantly without downtime, providing a cost-effective solution for data management. With its unique architecture and support for multi-cloud environments, including AWS, Azure, and Google Cloud Platform, Snowflake offers seamless data integration, secure data sharing across organizations, and real-time access to data insights, making it a common choice to power many busienss intelligence applications or use cases. Rill supports natively connecting to and reading from Snowflake as a source using the [Go Snowflake Driver](https://pkg.go.dev/github.com/snowflakedb/gosnowflake).
[Snowflake](https://docs.snowflake.com/en/user-guide-intro) is a cloud-based data platform designed to facilitate data warehousing, data lakes, data engineering, data science, data application development, and data sharing. It separates compute and storage, enabling users to scale up or down instantly without downtime, providing a cost-effective solution for data management. With its unique architecture and support for multi-cloud environments, including AWS, Azure, and Google Cloud Platform, Snowflake offers seamless data integration, secure data sharing across organizations, and real-time access to data insights, making it a common choice to power many busienss intelligence applications or use cases. Rill supports natively connecting to and reading from Snowflake as a source using the [Go Snowflake Driver](https://pkg.go.dev/github.com/snowflakedb/gosnowflake).

![Connecting to Snowflake](/img/reference/connectors/snowflake/snowflake.png)

4 changes: 2 additions & 2 deletions docs/docs/reference/connectors/sqlite.md
@@ -2,14 +2,14 @@
title: SQLite
description: Connect to data in SQLite
sidebar_label: SQLite
sidebar_position: 9
sidebar_position: 10
---

<!-- WARNING: There are links to this page in source code. If you move it, find and replace the links and consider adding a redirect in docusaurus.config.js. -->

## Overview

SQLite is a lightweight, self-contained SQL database engine renowned for its reliability, speed, and full-featured, serverless architecture. SQLite is primarily known as an in-process database and widely used in embedded systems, mobile applications, and various small-to-medium sized applications due to its simplicity, zero-configuration, and single-file database format. SQLite supports standard SQL syntax and includes features such as transactions and atomic commit and rollback, making it a practical choice for applications requiring a compact, efficient data management system. Rill supports connecting and reading from a SQLite database as a source through the [DuckDB SQLite extension](https://duckdb.org/docs/extensions/sqlite.html).
[SQLite](https://www.sqlite.org/about.html) is a lightweight, self-contained SQL database engine renowned for its reliability, speed, and full-featured, serverless architecture. SQLite is primarily known as an in-process database and widely used in embedded systems, mobile applications, and various small-to-medium sized applications due to its simplicity, zero-configuration, and single-file database format. SQLite supports standard SQL syntax and includes features such as transactions and atomic commit and rollback, making it a practical choice for applications requiring a compact, efficient data management system. Rill supports connecting and reading from a SQLite database as a source through the [DuckDB SQLite extension](https://duckdb.org/docs/extensions/sqlite.html).

![Connecting to SQLite](/img/reference/connectors/sqlite/sqlite.png)

Binary file modified docs/static/img/reference/connectors/athena/athena.png