diff --git a/docs/docs/build/connect/connect.md b/docs/docs/build/connect/connect.md index cf7e7c7b5e8..9106319a930 100644 --- a/docs/docs/build/connect/connect.md +++ b/docs/docs/build/connect/connect.md @@ -16,6 +16,7 @@ To provide a non-exhaustive list, Rill supports the following connectors: - [Azure Blob Storage](/reference/connectors/azure.md) - [BigQuery](/reference/connectors/bigquery.md) - [Athena](/reference/connectors/athena.md) +- [Redshift](/reference/connectors/redshift.md) - [DuckDB and MotherDuck](/reference/connectors/motherduck.md) - [PostgreSQL](/reference/connectors/postgres.md) - [MySQL](/reference/connectors/mysql.md) diff --git a/docs/docs/reference/connectors/athena.md b/docs/docs/reference/connectors/athena.md index 89dbf95c94d..35f90f7a354 100644 --- a/docs/docs/reference/connectors/athena.md +++ b/docs/docs/reference/connectors/athena.md @@ -21,7 +21,7 @@ To check if you already have the AWS CLI installed and authenticated, open a ter ```bash aws iam get-user --no-cli-pager ``` -If it prints information about your user, there is nothing more to do. Rill will be able to connect to Athena that you have access to. +If it prints information about your user, there is nothing more to do. Rill will be able to connect to any existing Athena instances that your user has privileges to access. If you do not have the AWS CLI installed and authenticated, follow these steps: @@ -50,7 +50,7 @@ If this project has already been deployed to Rill Cloud and credentials have bee When deploying a project to Rill Cloud (i.e. `rill deploy`), Rill requires you to explicitly provide an access key and secret for an AWS service account with access to Athena used in your project. 
-If you subsequently add sources that require new credentials (or if you input the wrong credentials during the initial deploy), you can update the credentials used by Rill Cloud by running: +If you subsequently add sources that require new credentials (or if you had simply input the wrong credentials during the initial deploy), you can update the credentials used by Rill Cloud by running: ``` rill env configure ``` diff --git a/docs/docs/reference/connectors/connectors.md b/docs/docs/reference/connectors/connectors.md index 819e015b16e..681f31122f2 100644 --- a/docs/docs/reference/connectors/connectors.md +++ b/docs/docs/reference/connectors/connectors.md @@ -23,6 +23,7 @@ We are always adding new connectors as part of our release cycle. If there's a s - [Azure Blob Storage (Azure)](azure.md) - [BigQuery](bigquery.md) - [Amazon Athena](athena.md) +- [Amazon Redshift](redshift.md) - [DuckDB / MotherDuck](motherduck.md) - [PostgreSQL](postgres.md) - [MySQL](mysql.md) diff --git a/docs/docs/reference/connectors/googlesheets.md b/docs/docs/reference/connectors/googlesheets.md index 6ba0ec26394..ca0182f9310 100644 --- a/docs/docs/reference/connectors/googlesheets.md +++ b/docs/docs/reference/connectors/googlesheets.md @@ -2,13 +2,13 @@ title: Google Sheets description: Connect to data in Google Sheets sidebar_label: Google Sheets -sidebar_position: 12 +sidebar_position: 13 --- ### Google Sheets -Rill has the ability to read from any http(s) URL endpoint that produces a valid data file in a supported format. 
For example, to bring in data from Google Sheets as a CSV file directly into Rill as a source ([leveraging the direct download link syntax](https://www.highviewapps.com/blog/how-to-create-a-csv-or-excel-direct-download-link-in-google-sheets/)), you can create a `source_name.yaml` file in the `sources` directory of your Rill project directory with the following content: +Rill has the ability to read from any http(s) URL endpoint that produces a valid data file in a supported format. For example, to bring in data from [Google Sheets](https://www.google.com/sheets/about/) as a CSV file directly into Rill as a source ([leveraging the direct download link syntax](https://www.highviewapps.com/blog/how-to-create-a-csv-or-excel-direct-download-link-in-google-sheets/)), you can create a `source_name.yaml` file in the `sources` directory of your Rill project directory with the following content: ```yaml type: "duckdb" diff --git a/docs/docs/reference/connectors/motherduck.md b/docs/docs/reference/connectors/motherduck.md index 9a78dc40c11..66940b491f8 100644 --- a/docs/docs/reference/connectors/motherduck.md +++ b/docs/docs/reference/connectors/motherduck.md @@ -2,7 +2,7 @@ title: DuckDB / MotherDuck description: Connect to data in DuckDB locally or MotherDuck sidebar_label: DuckDB / MotherDuck -sidebar_position: 6 +sidebar_position: 7 --- diff --git a/docs/docs/reference/connectors/mysql.md b/docs/docs/reference/connectors/mysql.md index 3a0fcc8f88e..41c2803708c 100644 --- a/docs/docs/reference/connectors/mysql.md +++ b/docs/docs/reference/connectors/mysql.md @@ -2,7 +2,7 @@ title: MySQL description: Connect to data in MySQL sidebar_label: MySQL -sidebar_position: 8 +sidebar_position: 9 --- diff --git a/docs/docs/reference/connectors/postgres.md b/docs/docs/reference/connectors/postgres.md index 00c5490ff06..ab340eff563 100644 --- a/docs/docs/reference/connectors/postgres.md +++ b/docs/docs/reference/connectors/postgres.md @@ -2,7 +2,7 @@ title: PostgreSQL description: 
Connect to data in PostgreSQL sidebar_label: PostgreSQL -sidebar_position: 7 +sidebar_position: 8 --- diff --git a/docs/docs/reference/connectors/redshift.md b/docs/docs/reference/connectors/redshift.md index 7316d6f54d9..54f459a877a 100644 --- a/docs/docs/reference/connectors/redshift.md +++ b/docs/docs/reference/connectors/redshift.md @@ -2,24 +2,26 @@ title: Amazon Redshift description: Connect to data in Amazon Redshift sidebar_label: Redshift -sidebar_position: 40 +sidebar_position: 6 --- -## How to configure credentials in Rill +## Overview -How you configure access to Redshift depends on whether you are developing a project locally using `rill start` or are setting up a deployment using `rill deploy`. +[Amazon Redshift](https://docs.aws.amazon.com/redshift/) is a fully managed, petabyte-scale data warehouse service in the cloud, offering fast query and I/O performance for data analysis applications. It enables users to run complex analytical queries against structured data using SQL, ETL processes, and BI tools, leveraging massively parallel processing (MPP) to efficiently handle large volumes of data. Redshift's architecture is designed for high performance on large datasets, supporting data warehousing and analytics of all sizes, making it a pivotal component in a modern data-driven decision-making ecosystem. By leveraging the AWS SDK for Go and utilizing intermediary parquet files in S3 (to ensure performance), Rill is able to connect to and read from Redshift as a source. -### Configure credentials for local development +![Connecting to Redshift](/img/reference/connectors/redshift/redshift.png) -When developing a project locally, Rill uses the credentials configured in your local environment using the AWS CLI. +## Local credentials + +When using Rill Developer on your local machine (i.e. `rill start`), Rill uses the credentials configured in your local environment using the AWS CLI. 
To check if you already have the AWS CLI installed and authenticated, open a terminal window and run: ```bash aws iam get-user --no-cli-pager ``` -If it prints information about your user, there is nothing more to do. Rill will be able to connect to Redshift that you have access to. +If it prints information about your user, there is nothing more to do. Rill will be able to connect to any existing Redshift databases that your user has privileges to access. If you do not have the AWS CLI installed and authenticated, follow these steps: @@ -38,32 +40,74 @@ If you do not have the AWS CLI installed and authenticated, follow these steps: You have now configured AWS access from your local environment. Rill will detect and use your credentials next time you try to ingest a source. -### Configure credentials for deployments on Rill Cloud +:::tip Did you know? + +If this project has already been deployed to Rill Cloud and credentials have been set for this source, you can use `rill env pull` to [pull these cloud credentials](/build/credentials/credentials.md#rill-env-pull) locally (into your local `.env` file). Please note that this may override any credentials that you have set locally for this source. -When deploying a project to Rill Cloud, Rill requires you to explicitly provide an access key and secret for an AWS service account with access to Redshift used in your project. +::: -When you first deploy a project using `rill deploy`, you will be prompted to provide credentials for the remote sources in your project that require authentication. +## Cloud deployment -If you subsequently add sources that require new credentials (or if you input the wrong credentials during the initial deploy), you can update the credentials used by Rill Cloud by running: +When deploying a project to Rill Cloud (i.e. `rill deploy`), Rill requires you to explicitly provide an access key and secret for an AWS service account with access to the Redshift database used in your project. 
+ +When you first deploy a project using `rill deploy`, you will be prompted to provide credentials for the remote sources in your project that require authentication. If you subsequently add sources that require new credentials (or if you had simply input the wrong credentials during the initial deploy), you can update the credentials used by Rill Cloud by running: ``` rill env configure ``` + +:::info + Note that you must `cd` into the Git repository that your project was deployed from before running `rill env configure`. +::: + +:::tip Did you know? + +If you've configured credentials locally already (in your `/.home` file), you can use `rill env push` to [push these credentials](/build/credentials/credentials.md#rill-env-push) to your Rill Cloud project. This will allow other users to retrieve / reuse the same credentials automatically by running `rill env pull`. + +::: + +## Appendix + ### Redshift Serverless permissions -Associate a IAM role (that has S3 access) with the Redshift Serverless namespace or the Redshift cluster(https://docs.aws.amazon.com/redshift/latest/mgmt/serverless-iam.html). Redshift connector places temporary files in S3 to accelerate extraction. -Redshift connector does the following AWS queries while ingesting data from Redshift: -1. Redshift Serverless:[`GetCredentials`](https://docs.aws.amazon.com/redshift-data/latest/APIReference/API_ExecuteStatement.html) if you use Workgroup name to connect. -1. Reshift Data API:[`DescribeStatement`, `ExecuteStatement`](https://docs.aws.amazon.com/redshift-data/latest/APIReference/API_ExecuteStatement.html) to unload data to S3. -1. S3:[`ListObjects`](https://docs.aws.amazon.com/AmazonS3/latest/API/API_ListObjects.html) to identify files unloaded by Redshift -1. S3:[`GetObject`](https://docs.aws.amazon.com/AmazonS3/latest/API/API_GetObject.html) to ingest files unloaded by Redshift. - -Make sure your account or a service account have corresponding permissions to perform these requests. 
- ### Redshift cluster permissions -1. Reshift:[`GetClusterCredentialsWithIAM`](https://docs.aws.amazon.com/redshift-data/latest/APIReference/API_ExecuteStatement.html) if you use Cluster Identifier to connect. -1. Reshift Data API:[`DescribeStatement`, `ExecuteStatement`](https://docs.aws.amazon.com/redshift-data/latest/APIReference/API_ExecuteStatement.html) to unload data to S3. -1. S3:[`ListObjects`](https://docs.aws.amazon.com/AmazonS3/latest/API/API_ListObjects.html) to identify files unloaded by Redshift -1. S3:[`GetObject`](https://docs.aws.amazon.com/AmazonS3/latest/API/API_GetObject.html) to ingest files unloaded by Redshift. +When using **Redshift Serverless**, make sure to associate an [IAM role (that has S3 access)](https://docs.aws.amazon.com/redshift/latest/mgmt/serverless-iam.html) with the Serverless namespace or the Redshift cluster. + +:::info What happens when Rill is reading from Redshift Serverless? + +Our Redshift connector will place temporary files in parquet format in S3 to help accelerate the extraction process (maximizes performance). To provide some more details, the Redshift connector will execute the following queries / requests while ingesting data from Redshift: + +1. Redshift Serverless:[`GetCredentials`](https://docs.aws.amazon.com/redshift-data/latest/APIReference/API_ExecuteStatement.html) if you are using _Workgroup_ name to connect. +2. Redshift Data API:[`DescribeStatement`, `ExecuteStatement`](https://docs.aws.amazon.com/redshift-data/latest/APIReference/API_ExecuteStatement.html) to unload data to S3. +3. S3:[`ListObjects`](https://docs.aws.amazon.com/AmazonS3/latest/API/API_ListObjects.html) to identify files unloaded by Redshift. +4. S3:[`GetObject`](https://docs.aws.amazon.com/AmazonS3/latest/API/API_GetObject.html) to ingest files unloaded by Redshift. 
+ +::: + +:::warning Check your service account permissions + +Your account or service account will need to have the appropriate permissions necessary to perform these requests. + +::: + +### Redshift Cluster permissions + +Similarly, when using **Redshift Cluster**, make sure to associate an [IAM role (that has S3 access)](https://docs.aws.amazon.com/redshift/latest/mgmt/redshift-iam-authentication-access-control.html) with the appropriate Redshift cluster. + +:::info What happens when Rill is reading from a Redshift Cluster? + +Our Redshift connector will place temporary files in parquet format in S3 to help accelerate the extraction process (maximizes performance). To provide some more details, the Redshift connector will execute the following queries / requests while ingesting data from Redshift: + +1. Redshift:[`GetClusterCredentialsWithIAM`](https://docs.aws.amazon.com/redshift-data/latest/APIReference/API_ExecuteStatement.html) if you are using _Cluster Identifier_ to connect. +2. Redshift Data API:[`DescribeStatement`, `ExecuteStatement`](https://docs.aws.amazon.com/redshift-data/latest/APIReference/API_ExecuteStatement.html) to unload data to S3. +3. S3:[`ListObjects`](https://docs.aws.amazon.com/AmazonS3/latest/API/API_ListObjects.html) to identify files unloaded by Redshift. +4. S3:[`GetObject`](https://docs.aws.amazon.com/AmazonS3/latest/API/API_GetObject.html) to ingest files unloaded by Redshift. + +::: + +:::warning Check your service account permissions + +Your account or service account will need to have the appropriate permissions necessary to perform these requests. 
+ +::: diff --git a/docs/docs/reference/connectors/salesforce.md b/docs/docs/reference/connectors/salesforce.md index adb7763c9c3..94a4ecf3793 100644 --- a/docs/docs/reference/connectors/salesforce.md +++ b/docs/docs/reference/connectors/salesforce.md @@ -2,14 +2,14 @@ title: Salesforce description: Connect to data in a Salesforce org using the Bulk API sidebar_label: Salesforce -sidebar_position: 11 +sidebar_position: 12 --- ## Overview -Salesforce is a leading cloud-based Customer Relationship Management (CRM) platform designed to help businesses connect with and understand their customers better. It offers a comprehensive suite of applications focused on sales, customer service, marketing automation, analytics, and application development. Salesforce enables organizations of all sizes to build stronger relationships with their customers through personalized experiences, streamlined communication, and predictive insights. Rill is able to ingest data from Salesforce as a source by utilizing the Bulk API, which requires a Salesforce username along with a password (and in some cases, a token, depending on the org configuration) to authenticate against a Salesforce org. +[Salesforce](https://www.salesforce.com/) is a leading cloud-based Customer Relationship Management (CRM) platform designed to help businesses connect with and understand their customers better. It offers a comprehensive suite of applications focused on sales, customer service, marketing automation, analytics, and application development. Salesforce enables organizations of all sizes to build stronger relationships with their customers through personalized experiences, streamlined communication, and predictive insights. Rill is able to ingest data from Salesforce as a source by utilizing the Bulk API, which requires a Salesforce username along with a password (and in some cases, a token, depending on the org configuration) to authenticate against a Salesforce org. 
![Connecting to Salesforce](/img/reference/connectors/salesforce/salesforce.png) diff --git a/docs/docs/reference/connectors/snowflake.md b/docs/docs/reference/connectors/snowflake.md index 5d99ae024c5..283a725c41d 100644 --- a/docs/docs/reference/connectors/snowflake.md +++ b/docs/docs/reference/connectors/snowflake.md @@ -2,14 +2,14 @@ title: Snowflake description: Connect to data in Snowflake sidebar_label: Snowflake -sidebar_position: 10 +sidebar_position: 11 --- ## Overview -Snowflake is a cloud-based data platform designed to facilitate data warehousing, data lakes, data engineering, data science, data application development, and data sharing. It separates compute and storage, enabling users to scale up or down instantly without downtime, providing a cost-effective solution for data management. With its unique architecture and support for multi-cloud environments, including AWS, Azure, and Google Cloud Platform, Snowflake offers seamless data integration, secure data sharing across organizations, and real-time access to data insights, making it a common choice to power many busienss intelligence applications or use cases. Rill supports natively connecting to and reading from Snowflake as a source using the [Go Snowflake Driver](https://pkg.go.dev/github.com/snowflakedb/gosnowflake). +[Snowflake](https://docs.snowflake.com/en/user-guide-intro) is a cloud-based data platform designed to facilitate data warehousing, data lakes, data engineering, data science, data application development, and data sharing. It separates compute and storage, enabling users to scale up or down instantly without downtime, providing a cost-effective solution for data management. 
With its unique architecture and support for multi-cloud environments, including AWS, Azure, and Google Cloud Platform, Snowflake offers seamless data integration, secure data sharing across organizations, and real-time access to data insights, making it a common choice to power many business intelligence applications or use cases. Rill supports natively connecting to and reading from Snowflake as a source using the [Go Snowflake Driver](https://pkg.go.dev/github.com/snowflakedb/gosnowflake). ![Connecting to Snowflake](/img/reference/connectors/snowflake/snowflake.png) diff --git a/docs/docs/reference/connectors/sqlite.md b/docs/docs/reference/connectors/sqlite.md index 407f9f144af..4f28b440d89 100644 --- a/docs/docs/reference/connectors/sqlite.md +++ b/docs/docs/reference/connectors/sqlite.md @@ -2,14 +2,14 @@ title: SQLite description: Connect to data in SQLite sidebar_label: SQLite -sidebar_position: 9 +sidebar_position: 10 --- ## Overview -SQLite is a lightweight, self-contained SQL database engine renowned for its reliability, speed, and full-featured, serverless architecture. SQLite is primarily known as an in-process database and widely used in embedded systems, mobile applications, and various small-to-medium sized applications due to its simplicity, zero-configuration, and single-file database format. SQLite supports standard SQL syntax and includes features such as transactions and atomic commit and rollback, making it a practical choice for applications requiring a compact, efficient data management system. Rill supports connecting and reading from a SQLite database as a source through the [DuckDB SQLite extension](https://duckdb.org/docs/extensions/sqlite.html). +[SQLite](https://www.sqlite.org/about.html) is a lightweight, self-contained SQL database engine renowned for its reliability, speed, and full-featured, serverless architecture. 
SQLite is primarily known as an in-process database and widely used in embedded systems, mobile applications, and various small-to-medium sized applications due to its simplicity, zero-configuration, and single-file database format. SQLite supports standard SQL syntax and includes features such as transactions and atomic commit and rollback, making it a practical choice for applications requiring a compact, efficient data management system. Rill supports connecting and reading from a SQLite database as a source through the [DuckDB SQLite extension](https://duckdb.org/docs/extensions/sqlite.html). ![Connecting to SQLite](/img/reference/connectors/sqlite/sqlite.png) diff --git a/docs/static/img/reference/connectors/athena/athena.png b/docs/static/img/reference/connectors/athena/athena.png index 6ee61090464..d3f21fcd1b7 100644 Binary files a/docs/static/img/reference/connectors/athena/athena.png and b/docs/static/img/reference/connectors/athena/athena.png differ diff --git a/docs/static/img/reference/connectors/redshift/redshift.png b/docs/static/img/reference/connectors/redshift/redshift.png new file mode 100644 index 00000000000..7b4ce853111 Binary files /dev/null and b/docs/static/img/reference/connectors/redshift/redshift.png differ
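
Review note on the Redshift permissions appendix in `redshift.md`: the four requests listed there (credentials lookup, Data API `ExecuteStatement` / `DescribeStatement`, S3 `ListObjects`, S3 `GetObject`) could be illustrated with a sample IAM policy. The sketch below is one possible mapping and is not taken from the Rill docs. It assumes the usual IAM action names for those calls (`s3:ListBucket` is the action that authorizes `ListObjects`), uses a hypothetical helper name `redshift_read_policy` and a hypothetical bucket name, and leaves the Redshift resources as `"*"`, which a real policy would likely narrow.

```python
import json


def redshift_read_policy(bucket: str, serverless: bool) -> dict:
    """Build an IAM policy document covering the requests listed in the appendix.

    `bucket` is the S3 bucket that receives the unloaded parquet files
    (hypothetical name supplied by the caller). `serverless` selects the
    credentials action for a Workgroup (True) or a Cluster Identifier (False).
    """
    credentials_action = (
        "redshift-serverless:GetCredentials"
        if serverless
        else "redshift:GetClusterCredentialsWithIAM"
    )
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Credentials lookup plus the Data API calls used to run UNLOAD.
                "Effect": "Allow",
                "Action": [
                    credentials_action,
                    "redshift-data:ExecuteStatement",
                    "redshift-data:DescribeStatement",
                ],
                # Assumption: left wide open for brevity; narrow in practice.
                "Resource": "*",
            },
            {
                # ListObjects is authorized by s3:ListBucket on the bucket itself.
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": f"arn:aws:s3:::{bucket}",
            },
            {
                # GetObject is authorized on the objects inside the bucket.
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": f"arn:aws:s3:::{bucket}/*",
            },
        ],
    }


if __name__ == "__main__":
    # "my-unload-bucket" is a placeholder, not a bucket Rill creates.
    print(json.dumps(redshift_read_policy("my-unload-bucket", serverless=True), indent=2))
```

Passing `serverless=False` swaps in the cluster credentials action and leaves the other statements unchanged, which mirrors how the Serverless and Cluster lists in the appendix differ only in their first item.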