This repository has been archived by the owner on Jan 29, 2024. It is now read-only.

Updated content after review feedback
harshini-rangaswamy committed Sep 14, 2023
1 parent addd8ef commit 17edb34
Showing 4 changed files with 29 additions and 12 deletions.
2 changes: 2 additions & 0 deletions _toc.yml
@@ -325,6 +325,8 @@ entries:
title: How it works
- file: docs/products/kafka/concepts/tiered-storage-guarantees
title: Guarantees
- file: docs/products/kafka/concepts/tiered-storage-backups
title: Backups
- file: docs/products/kafka/concepts/tiered-storage-limitations
title: Limitations

16 changes: 9 additions & 7 deletions docs/products/kafka/concepts/kafka-tiered-storage.rst
@@ -6,15 +6,15 @@ Discover the tiered storage capability in Aiven for Apache Kafka®. Learn how it
Overview
---------

Tiered storage provides the ability to use multiple storage types to store data, such as local disk and cloud storage, based on how frequently it is accessed. With Aiven for Apache Kafka, you can use tiered storage to allocate some of your data to high-speed local disks and move the rest to more cost-efficient remote storage options like AWS S3, Google Cloud Storage, or Azure blob storage.
Tiered storage provides the ability to use multiple storage types to store data, such as local disk and cloud storage, based on how frequently it is accessed. With Aiven for Apache Kafka, you can use tiered storage to allocate some of your data to high-speed local disks and move the rest to more cost-efficient remote storage options like AWS S3 and Google Cloud Storage.

Tiered storage offers multiple benefits, including:

* **Scalability**: Tiered storage allows Aiven for Apache Kafka instances to scale almost infinitely with cloud solutions, eliminating concerns about storage limitations.
* **Cost efficiency**: By moving less frequently accessed data to cost-effective storage tiers, you can realize significant financial savings.
* **Operational speed**: With the bulk of data offloaded to remote storage, service rebalancing in Aiven for Apache Kafka becomes faster, making for a smoother operational experience.
* **Infinite data retention**: With the scalability of cloud storage, you can achieve unlimited data retention, valuable for analytics and compliance.
* **Flexibility**: Data can be easily moved between storage tiers depending on usage and requirements, offering more flexibility.
* **Scalability:** Tiered storage allows Aiven for Apache Kafka instances to scale almost infinitely with cloud solutions, eliminating concerns about storage limitations.
* **Cost efficiency:** By moving less frequently accessed data to cost-effective storage tiers, you can realize significant financial savings.
* **Operational speed:** With the bulk of data offloaded to remote storage, service rebalancing in Aiven for Apache Kafka becomes faster, making for a smoother operational experience.
* **Infinite data retention:** With the scalability of cloud storage, you can achieve unlimited data retention, valuable for analytics and compliance.
* **Transparency:** Even older Kafka clients can benefit from tiered storage without needing to be explicitly aware of it.

When and why to use it
------------------------
@@ -33,6 +33,8 @@ Segments are encrypted with 256-bit AES encryption before being uploaded to the

Pricing
-------
Tiered storage users are billed for the remote storage usage in GB/hour, using the highest usage in each hour.
Tiered storage costs are determined by the amount of remote storage used, measured in GB/hour. The highest usage level within each hour is the basis for calculating charges.
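
The hourly-peak rule above can be illustrated with a small sketch. This is not Aiven's billing implementation: the function name ``billable_gb_hours``, the idea of discrete usage samples, and the sample values are all assumptions made for illustration. It only shows how "highest usage within each hour" turns a series of measurements into GB-hours.

```python
from collections import defaultdict
from datetime import datetime


def billable_gb_hours(samples):
    """Sum the peak remote-storage usage (GB) observed in each clock hour.

    `samples` is an iterable of (timestamp, gb_used) pairs. The sampling
    model is hypothetical; only the "highest usage in each hour" rule
    comes from the documentation.
    """
    peaks = defaultdict(float)
    for ts, gb_used in samples:
        # Truncate the timestamp to the start of its hour.
        hour = ts.replace(minute=0, second=0, microsecond=0)
        peaks[hour] = max(peaks[hour], gb_used)
    return sum(peaks.values())


# Hypothetical usage samples spanning two clock hours.
samples = [
    (datetime(2023, 9, 14, 10, 5), 120.0),
    (datetime(2023, 9, 14, 10, 40), 150.0),  # peak for the 10:00 hour
    (datetime(2023, 9, 14, 11, 15), 140.0),  # peak for the 11:00 hour
]
total = billable_gb_hours(samples)  # 150.0 + 140.0
```

Multiplying the result by a per-GB-hour rate would give the charge; the rate itself is not stated in this document.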




13 changes: 13 additions & 0 deletions docs/products/kafka/concepts/tiered-storage-backups.rst
@@ -0,0 +1,13 @@
Tiered storage backups
========================

In Aiven for Apache Kafka®, data in topics with tiered storage enabled persists across power cycles. However, this is not the same as a conventional backup system. Only data in topics with tiered storage enabled is copied to remote storage, and active segments remain on local storage due to limitations in Apache Kafka.

.. note::
   Remote data can remain vulnerable to accidental or intentional deletions as it stays connected to Apache Kafka brokers.

To ensure security, Aiven for Apache Kafka employs client-side encryption at rest and multi-stage data integrity checks. The remote data is stored in the same cloud region as the Aiven for Apache Kafka service.

Metadata backups are automatically restored during regular power cycles, but significant incidents may require manual operator intervention. The backup and restoration procedures for local storage remain separate and unchanged.

As active segments are not uploaded to the remote storage, the data stored in them will be lost after powering off the service.
10 changes: 5 additions & 5 deletions docs/products/kafka/concepts/tiered-storage-guarantees.rst
@@ -1,6 +1,6 @@
Guarantees
============
With tiered storage in Aiven for Apache Kafka®, there are two primary types of data retention guarantees: total retention and local retention.
With Aiven for Apache Kafka's tiered storage, there are two primary types of data retention guarantees: *total retention* and *local retention*.

**Total retention**: Tiered storage ensures that your data will be available up to the limit defined by the total retention threshold, regardless of whether it is stored locally or remotely. This means that your data will not be deleted until the total retention threshold, whether on local or remote storage, is reached.

@@ -10,9 +10,9 @@ With tiered storage in Aiven for Apache Kafka®, there are two primary types of
Example
--------

Let's say you have a topic with a **total retention threshold** of **1000 bytes** and a **local retention threshold** of **200 bytes**. This means that:
Let's say you have a topic with a total retention threshold of 1000 GB and a local retention threshold of 200 GB. This means that:

* All data for the topic will be retained, regardless of whether it is stored locally or remotely, as long as the total size of the data does not exceed 1000 bytes.
* If the total size of the data exceeds 1000 bytes, Aiven for Apache Kafka will begin deleting the oldest data from remote storage.
* No data will be deleted from local storage until it has been safely transferred to remote storage.
* All data for the topic will be retained, regardless of whether it is stored locally or remotely, as long as the total size of the data does not exceed 1000 GB.
* If the size of the data on local storage exceeds 200 GB, Kafka will move older segments to the remote storage and delete them from the local disk. No data will be deleted from local storage until it has been safely transferred to remote storage.
* If the total size of the data exceeds 1000 GB, Apache Kafka will begin deleting the oldest data from remote storage.
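
The interaction of the two thresholds can be sketched as a toy model. This is a simplification under stated assumptions, not how Apache Kafka is implemented: the function ``apply_retention`` is hypothetical, segments are represented only by their size in GB (oldest first), and transfers to remote storage are assumed to always succeed.

```python
from collections import deque


def apply_retention(segment_sizes_gb, local_limit_gb=200, total_limit_gb=1000):
    """Toy model of local and total retention for one topic.

    `segment_sizes_gb` lists segment sizes oldest-first. Returns a tuple
    (local, remote, deleted), each a list of segment sizes oldest-first.
    Hypothetical helper for illustration only.
    """
    local = deque(segment_sizes_gb)
    remote = deque()
    deleted = []

    # Local retention: move the oldest local segments to remote storage
    # until the local size is within the local threshold.
    while local and sum(local) > local_limit_gb:
        remote.append(local.popleft())

    # Total retention: delete the oldest remote segments until the
    # combined local + remote size is within the total threshold.
    while remote and sum(local) + sum(remote) > total_limit_gb:
        deleted.append(remote.popleft())

    return list(local), list(remote), deleted


# Twelve 100 GB segments: 1200 GB in total, above both thresholds.
local, remote, deleted = apply_retention([100] * 12)
```

With these inputs the model keeps the two newest segments locally (200 GB), holds eight segments remotely (800 GB), and deletes the two oldest segments, which brings the total back to 1000 GB.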
