Update README.md (#968)
Signed-off-by: Robin Tang <[email protected]>
Tang8330 authored Oct 21, 2024
1 parent c909d84 commit fd51d56
Showing 1 changed file with 15 additions and 20 deletions.
35 changes: 15 additions & 20 deletions README.md
@@ -1,7 +1,15 @@
<h1
align="center">
<img
align="center"
alt="Artie Transfer"
src="https://github.com/user-attachments/assets/1438aee9-614a-463e-9018-edc14931de8c"
style="width:100%;"
/>
</h1>
<div align="center">
<img height="150px" src="https://github.com/artie-labs/transfer/assets/4412200/238df0c7-6087-4ddc-b83b-24638212af6a"/>
<h3>Artie Transfer</h3>
<p><b>⚡️ Blazing fast data replication between OLTP and OLAP databases ⚡️</b></p>
<p>⚡️ Blazing fast data replication between OLTP and OLAP databases ⚡️</p>
<a href="https://artie.com/slack"><img src="https://img.shields.io/badge/[email protected]?logo=slack"/></a>
<a href="https://artie.com/docs/open-source/running-artie/overview"><img src="https://user-images.githubusercontent.com/4412200/226736695-6b8b9abd-c227-41c7-89a1-805a04c90d08.png"/></a>
<a href="https://github.com/artie-labs/transfer/blob/master/LICENSE.txt"><img src="https://user-images.githubusercontent.com/4412200/201544613-a7197bc4-8b61-4fc5-bf09-68ee10133fd7.svg"/></a>
@@ -11,7 +19,7 @@
</div>
<br/>

Artie Transfer is a real-time data replication solution for databases and data warehouses/data lakes.
Artie Transfer is a real-time data replication solution for databases and data warehouses/lakes.

Typical ETL solutions rely on batched processes or schedulers (e.g. DAGs, Airflow), which means the data in the downstream data warehouse is often several hours to days old. This problem is exacerbated as data volumes grow, since batched processes take increasingly longer to run.

@@ -21,11 +29,10 @@ Benefits of Artie Transfer:

- Sub-minute data latency: always have access to live production data.
- Ease of use: just set up a simple configuration file, and you're good to go!
- Automatic table creation and schema detection: Artie infers schemas and automatically merges changes to downstream destinations.
- Reliability: Artie has automatic retries and processing is idempotent.
- Scalability: handle anywhere from 1GB to 100+ TB of data.
- Monitoring: built-in error reporting along with rich telemetry statistics.

- Automatic table creation and schema detection: Artie infers schemas and automatically merges changes to downstream destinations.
- Reliability: Artie has automatic retries and processing is idempotent.
- Scalability: handle anywhere from 1GB to 100+ TB of data.
- Monitoring: built-in error reporting along with rich telemetry statistics.

Take a look at this [guide](#getting-started) to get started!

@@ -35,18 +42,6 @@ Take a look at this [guide](#getting-started) to get started!
<img src="https://github.com/artie-labs/transfer/assets/4412200/a30a2ee1-7bdd-437c-9acb-ce6591654d18"/>
</div>

### Pre-requisites

As you can see from the architecture diagram above, Artie Transfer is a Kafka consumer and expects CDC messages to be in a particular format.

The optimal set-up looks something like this:
* [Debezium](https://github.com/debezium/debezium) or [Artie Reader](https://github.com/artie-labs/reader) depending on the source
* Kafka
* One Kafka topic per table, such that you can toggle the number of partitions based on throughput.
* The partition key should be the primary key for the table to avoid out-of-order writes at the row level.

Please see the [supported section](#what-is-currently-supported) on what sources and destinations are supported.
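
To illustrate why the partition key should be the table's primary key, here is a minimal sketch in Python. It is not Artie Transfer's or Kafka's actual code: `partition_for` and `route` are hypothetical helpers, and the CRC32 hash stands in for Kafka's default partitioner (which uses a different hash). The point is only that a deterministic hash of the primary key sends every change for a given row to the same partition, so a consumer reading that partition sees those changes in the order they were produced.

```python
import zlib
from collections import defaultdict


def partition_for(primary_key: str, num_partitions: int) -> int:
    """Map a row's primary key to a partition.

    Assumption: CRC32 is used purely for illustration; Kafka's default
    partitioner uses a different hash function.
    """
    return zlib.crc32(primary_key.encode("utf-8")) % num_partitions


def route(events, num_partitions):
    """Group (primary_key, operation) events by partition, keeping arrival order."""
    partitions = defaultdict(list)
    for primary_key, operation in events:
        partition = partition_for(primary_key, num_partitions)
        partitions[partition].append((primary_key, operation))
    return partitions


# Hypothetical CDC events for a single table (one topic per table).
events = [
    ("customer:1", "insert"),
    ("customer:2", "insert"),
    ("customer:1", "update"),
    ("customer:1", "delete"),
]

routed = route(events, num_partitions=4)

# All changes for customer:1 share one partition, in production order.
target = partition_for("customer:1", 4)
customer_1_ops = [op for key, op in routed[target] if key == "customer:1"]
```

If the key were random or round-robin instead, the `update` and `delete` for the same row could land in different partitions and be applied out of order downstream.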

## Examples

To run Artie Transfer's stack locally, please refer to the [examples folder](https://github.com/artie-labs/transfer/tree/master/examples).