This repository has been archived by the owner on Jan 29, 2024. It is now read-only.

More updates to content
harshini-rangaswamy committed Sep 15, 2023
1 parent eee2353 commit 512a18c
Showing 3 changed files with 6 additions and 7 deletions.
2 changes: 1 addition & 1 deletion docs/products/kafka/concepts/kafka-tiered-storage.rst
@@ -5,8 +5,8 @@ Discover the tiered storage capability in Aiven for Apache Kafka®. Learn how it

Overview
---------
Tiered storage in Aiven for Apache Kafka® lets you manage your data more efficiently by leveraging multiple storage types—local disk and remote cloud storage options like AWS S3 and Google Cloud Storage. This feature offers a tailored approach to data storage, allowing you to allocate frequently accessed data to high-speed local disks while offloading less critical or infrequently accessed data to more cost-effective remote storage solutions. Tiered storage enables you to indefinitely store data on specific topics without running out of space. Once enabled, it is configured per topic, giving you granular control over data storage.

Tiered storage in Aiven for Apache Kafka® lets you manage your data more efficiently by leveraging multiple storage types—local disk and remote cloud storage options like AWS S3 and Google Cloud Storage. This feature offers a tailored approach to data storage, allowing you to allocate frequently accessed data to high-speed local disks while offloading less critical or infrequently accessed data to more cost-effective remote storage solutions. Tiered storage allows you to store data on specific topics indefinitely without running out of space. Once enabled, it is configured on a per-topic basis, giving you granular control over data storage.

.. note::
    Azure Blob Storage is not yet supported for tiered storage in Aiven for Apache Kafka.
9 changes: 4 additions & 5 deletions docs/products/kafka/concepts/tiered-storage-how-it-works.rst
@@ -6,16 +6,15 @@ Aiven for Apache Kafka® tiered storage is a feature that optimizes data managem
* **Local tier**: Primarily consists of faster and typically more expensive storage solutions like solid-state drives (SSDs).
* **Remote tier**: Relies on slower, cost-effective options like cloud object storage.

In Aiven for Apache Kafka's tiered storage architecture, **remote storage** refers to storage options external to the Kafka broker's local disk. This typically includes cloud-based or self-hosted object storage solutions like AWS S3, Google Cloud Storage, and Azure Blob Storage. Although network-attached block storage solutions like AWS EBS are technically external to the broker machine, Apache Kafka considers them local storage within its tiered storage architecture.
In Aiven for Apache Kafka's tiered storage architecture, **remote storage** refers to storage options external to the Kafka broker's local disk. This typically includes cloud-based or self-hosted object storage solutions like AWS S3 and Google Cloud Storage. Although network-attached block storage solutions like AWS EBS are technically external to the broker machine, Apache Kafka considers them local storage within its tiered storage architecture.

Tiered storage operates in a way that is seamless for both Apache Kafka producers and consumers. This means that producers and consumers interact with Apache Kafka in the same way, regardless of whether tiered storage is enabled or not.

Administrators can configure tiered storage per topic by defining the retention period and retention bytes to specify how much data should be retained on the local disk as opposed to remote storage.
Administrators can configure tiered storage per topic by defining the retention period and retention bytes to specify how much data should be retained on the local disk instead of remote storage.
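
As a rough illustration of the per-topic settings described above, the following Python sketch assembles a set of topic configurations. The keys are the Apache Kafka topic-level configuration names for tiered storage; the helper ``tiered_topic_config`` and all values are hypothetical, and the way these settings are exposed in Aiven for Apache Kafka may differ.

```python
# Illustrative sketch only: the helper name and the values are assumptions.
HOUR_MS = 60 * 60 * 1000
DAY_MS = 24 * HOUR_MS


def tiered_topic_config(local_retention_ms, retention_ms,
                        local_retention_bytes, retention_bytes):
    """Build topic-level configs that split retention between local and remote tiers."""
    for local, total, unit in (
        (local_retention_ms, retention_ms, "ms"),
        (local_retention_bytes, retention_bytes, "bytes"),
    ):
        # In Apache Kafka, -1 on the total limit means "no limit".
        if total != -1 and local > total:
            raise ValueError(
                f"local.retention.{unit} must not exceed retention.{unit}"
            )
    return {
        "remote.storage.enable": "true",
        "local.retention.ms": str(local_retention_ms),
        "retention.ms": str(retention_ms),
        "local.retention.bytes": str(local_retention_bytes),
        "retention.bytes": str(retention_bytes),
    }


# Keep about 6 hours or 1 GiB per partition locally; keep 30 days overall.
config = tiered_topic_config(
    local_retention_ms=6 * HOUR_MS,
    retention_ms=30 * DAY_MS,
    local_retention_bytes=1024 ** 3,
    retention_bytes=-1,
)
```

The local limits only control what stays on the broker's disk; the overall ``retention.ms`` and ``retention.bytes`` limits still decide when data is deleted entirely.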


Local vs. remote data retention
---------------------------------

When tiered storage is enabled, data is initially stored on the local disk of the Kafka broker. Data is then asynchronously transferred to remote storage based on the pre-defined local retention threshold. During periods of high data ingestion or transient errors, such as network connectivity issues, the local storage might temporarily hold more data than specified by the local retention threshold.

Segment management
@@ -29,10 +28,10 @@ Data is transferred to remote storage asynchronously and does not interfere with
Any data exceeding the local retention threshold will not be purged by the log cleaner until it is successfully uploaded to remote storage.
The replication factor is not considered during the upload process, and only one copy of each segment is uploaded to the remote storage. Most remote storage options have their own measures, including data replication, to ensure data durability.
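
The rule that local data is purged only after a successful upload can be modeled with a small Python sketch. This is a simplified model under stated assumptions, not the broker's actual implementation: the ``Segment`` class and ``locally_deletable`` function are hypothetical, and only a size-based local retention threshold is considered.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    base_offset: int
    size_bytes: int
    uploaded: bool  # True once the segment is safely in remote storage


def locally_deletable(segments, local_retention_bytes):
    """Return the segments that may be removed from local disk, oldest first."""
    local_size = sum(s.size_bytes for s in segments)
    deletable = []
    for seg in sorted(segments, key=lambda s: s.base_offset):
        if local_size <= local_retention_bytes:
            break  # local data is back within the threshold
        if not seg.uploaded:
            break  # never drop data that is not yet in remote storage
        deletable.append(seg)
        local_size -= seg.size_bytes
    return deletable


segments = [
    Segment(base_offset=0, size_bytes=100, uploaded=True),
    Segment(base_offset=100, size_bytes=100, uploaded=True),
    Segment(base_offset=200, size_bytes=100, uploaded=False),  # upload pending
    Segment(base_offset=300, size_bytes=100, uploaded=False),
]
deletable = locally_deletable(segments, local_retention_bytes=150)
```

In this example only the first two segments are eligible, so the local disk still holds 200 bytes against a 150-byte threshold until the pending upload completes. This mirrors the temporary overshoot described for periods of high ingestion or transient errors.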


Data retrieval
-----------------
When consumers fetch records stored in remote storage, the broker downloads and caches these records locally. This allows for quicker access in subsequent retrieval operations.
The retention time and the maximum size of the cache can be configured.
When consumers fetch records stored in remote storage, the broker downloads and caches these records locally. This allows for quicker access in subsequent retrieval operations. You can configure the retention time and the maximum size of the cache.
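
To make the two cache settings concrete, here is a minimal Python sketch of a cache bounded by both a retention time and a maximum size. It is a generic illustration, not the cache used by the broker: the ``ChunkCache`` class, its parameters, and the ``fetch_remote`` callback are all hypothetical.

```python
import time
from collections import OrderedDict


class ChunkCache:
    """Hypothetical local cache with a size limit and a retention time."""

    def __init__(self, max_bytes, ttl_seconds, clock=time.monotonic):
        self.max_bytes = max_bytes
        self.ttl_seconds = ttl_seconds
        self.clock = clock
        self._entries = OrderedDict()  # key -> (data, stored_at)
        self._size = 0

    def get(self, key, fetch_remote):
        now = self.clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[1] < self.ttl_seconds:
            self._entries.move_to_end(key)  # mark as recently used
            return entry[0]
        if entry is not None:
            self._remove(key)  # entry outlived its retention time
        data = fetch_remote(key)  # slow path: download from remote storage
        self._entries[key] = (data, now)
        self._size += len(data)
        # Evict least recently used entries until the cache fits again.
        while self._size > self.max_bytes and len(self._entries) > 1:
            self._remove(next(iter(self._entries)))
        return data

    def _remove(self, key):
        data, _ = self._entries.pop(key)
        self._size -= len(data)
```

A second ``get`` for the same key within the retention time is served locally without calling ``fetch_remote`` again, which is the quicker subsequent access described above.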



@@ -1,7 +1,7 @@
Trade-offs and limitations
============================

The main trade-off of tiered storage in Aiven for Apache Kafka® is the higher latency while accessing and reading data from remote storage compared to local disk storage. While adding local caching can partially solve this problem, it cannot eliminate the latency.
The main trade-off of tiered storage is the higher latency while accessing and reading data from remote storage compared to local disk storage. While adding local caching can partially solve this problem, it cannot eliminate the latency completely.

Limitations
-------------