This repository has been archived by the owner on Jan 29, 2024. It is now read-only.

Addressed review feedback
harshini-rangaswamy committed Sep 18, 2023
1 parent 512a18c commit 6b2eed1
Showing 6 changed files with 27 additions and 17 deletions.
4 changes: 3 additions & 1 deletion _toc.yml
@@ -318,9 +318,11 @@ entries:
- file: docs/products/kafka/concepts/monitor-consumer-group
- file: docs/products/kafka/concepts/kafka-quotas
title: Quotas
- file: docs/products/kafka/concepts/kafka-tiered-storage
- file: docs/products/kafka/concepts/list-kafka-tiered-storage
title: Tiered storage
entries:
- file: docs/products/kafka/concepts/kafka-tiered-storage
title: Overview
- file: docs/products/kafka/concepts/tiered-storage-how-it-works
title: How it works
- file: docs/products/kafka/concepts/tiered-storage-guarantees
24 changes: 11 additions & 13 deletions docs/products/kafka/concepts/kafka-tiered-storage.rst
@@ -1,11 +1,7 @@
Tiered storage in Aiven for Apache Kafka®
===========================================
Tiered storage overview
==========================

Discover the tiered storage capability in Aiven for Apache Kafka®. Learn how it works and explore its use cases. Check why you might need it and what benefits you get using it.

Overview
---------
Tiered storage in Aiven for Apache Kafka® lets you manage your data more efficiently by leveraging multiple storage types—local disk and remote cloud storage options like AWS S3 and Google Cloud Storage. This feature offers a tailored approach to data storage, allowing you to allocate frequently accessed data to high-speed local disks while offloading less critical or infrequently accessed data to more cost-effective remote storage solutions. Tiered storage enables you to indefinitely store data on specific topics without running out of space. Once enabled, it is configured per topic, giving you granular control over data storage.
Tiered storage in Aiven for Apache Kafka allows you to manage your data more efficiently by leveraging two distinct storage types—local disk and remote cloud storage options like AWS S3 and Google Cloud Storage. This feature offers a tailored approach to data storage, allowing you to allocate frequently accessed data to high-speed local disks while offloading less critical or infrequently accessed data to more cost-effective remote storage solutions. Tiered storage enables you to indefinitely store data on specific topics without running out of space. Once enabled, it is configured per topic, giving you granular control over data storage needs.


.. note::
@@ -14,8 +10,8 @@ Tiered storage in Aiven for Apache Kafka® lets you manage your data more effici

Tiered storage offers multiple benefits, including:

* **Scalability:** Tiered storage allows Aiven for Apache Kafka instances to scale almost infinitely with cloud solutions, eliminating concerns about storage limitations.
* **Cost efficiency:** By moving less frequently accessed data to cost-effective storage tiers, you can realize significant financial savings.
* **Scalability:** With tiered storage in Aiven for Apache Kafka, storage and computing are effectively decoupled, enabling them to scale independently. This flexibility ensures that while the storage capacity can expand almost infinitely with cloud solutions, compute resources can also be adjusted based on demand, thus eliminating any concerns about storage or processing limitations.
* **Cost efficiency:** By moving less frequently accessed data to a cost-effective storage tier, you can achieve significant financial savings.
* **Operational speed:** With the bulk of data offloaded to remote storage, service rebalancing in Aiven for Apache Kafka becomes faster, making for a smoother operational experience.
* **Infinite data retention:** With the scalability of cloud storage, you can achieve unlimited data retention, valuable for analytics and compliance.
* **Transparency:** Even older Kafka clients can benefit from tiered storage without needing to be explicitly aware of it.
@@ -31,14 +27,16 @@ Understanding when and why to use tiered storage in Aiven for Apache Kafka will
* **High-speed data ingestion**: Tiered storage can offer a solution when dealing with unpredictable or sudden influxes of data. By supplementing the local disks with cloud storage, sudden increases in incoming data can be managed, ensuring optimum system performance.


Security
--------
Segments are encrypted with 256-bit AES encryption before being uploaded to the remote storage. The encryption keys are not shared with the cloud storage provider and generally do not leave Aiven machines.

Pricing
-------
Tiered storage costs are determined by the amount of remote storage used, measured in GB/hour. The highest usage level within each hour is the basis for calculating charges.
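
The billing rule above can be illustrated with a small worked example. This is only a sketch: the sample measurements, the function name ``billable_gb_hours``, and the rate are hypothetical and are not taken from Aiven's pricing. It takes the highest remote storage usage seen within each hour and sums those peaks into billable GB-hours.

```python
from collections import defaultdict


def billable_gb_hours(samples):
    """Sum the peak remote storage usage (in GB) observed in each hour.

    `samples` is an iterable of (hour_index, remote_gb) measurements.
    Both the input shape and this function are assumptions for illustration.
    """
    peak_per_hour = defaultdict(float)
    for hour, remote_gb in samples:
        # Only the highest usage level within an hour counts for that hour.
        peak_per_hour[hour] = max(peak_per_hour[hour], remote_gb)
    return sum(peak_per_hour.values())


# Hypothetical measurements: two in hour 0, two in hour 1, one in hour 2.
samples = [(0, 100.0), (0, 120.0), (1, 110.0), (1, 90.0), (2, 95.0)]
total_gb_hours = billable_gb_hours(samples)  # peaks are 120, 110 and 95 GB

# Made-up price per GB-hour, used only to show the final multiplication.
HYPOTHETICAL_RATE_PER_GB_HOUR = 0.0001
estimated_cost = total_gb_hours * HYPOTHETICAL_RATE_PER_GB_HOUR
```

In this example, hour 0 is charged at 120 GB even though usage started the hour at 100 GB, because the peak within the hour is what counts.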


Related reading
----------------

* :doc:`How tiered storage works in Aiven for Apache Kafka® <../tiered-storage-how-it-works>`
* :doc:`Guarantees <../tiered-storage-guarantees>`
* :doc:`Backups <../tiered-storage-backups>`
* :doc:`Limitations <../tiered-storage-limitations>`

7 changes: 7 additions & 0 deletions docs/products/kafka/concepts/list-kafka-tiered-storage.rst
@@ -0,0 +1,7 @@
Tiered storage in Aiven for Apache Kafka®
===========================================

Discover the tiered storage capability in Aiven for Apache Kafka®. Learn how it works and explore its use cases. Check why you might need it and what benefits you get using it.

.. tableofcontents::

1 change: 0 additions & 1 deletion docs/products/kafka/concepts/tiered-storage-backups.rst
@@ -10,4 +10,3 @@ To ensure security, Aiven for Apache Kafka employs client-side encryption at res

Metadata backups are automatically restored during regular power cycles, but significant incidents may require manual operator intervention. The backup and restoration procedures for local storage remain separate and unchanged.

As active segments are not uploaded to the remote storage, the data stored in them will be lost after powering off the service.
2 changes: 1 addition & 1 deletion docs/products/kafka/concepts/tiered-storage-guarantees.rst
@@ -13,6 +13,6 @@ Example
Let's say you have a topic with a total retention threshold of 1000 GB and a local retention threshold of 200 GB. This means that:

* All data for the topic will be retained, regardless of whether it is stored locally or remotely, as long as the total size of the data does not exceed 1000 GB.
* If the total size of the data exceeds 200 GB, Kafka will move older segments to the remote storage and delete them from the local disk. No data will be deleted from local storage until it has been safely transferred to remote storage.
* If tiered storage is enabled per topic, older segments will be uploaded immediately to remote storage, irrespective of whether the local retention threshold of 200 GB is exceeded. Data will be deleted from local storage only after it has been safely transferred to remote storage.
* If the total size of the data exceeds 1000 GB, Apache Kafka will begin deleting the oldest data from remote storage.
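
The example above can be modelled with a short simulation. This is a simplified sketch, not how Apache Kafka implements retention: it assumes every listed segment is closed and already uploaded to remote storage, it ignores the active segment, and the function name ``apply_retention`` and the segment sizes are hypothetical.

```python
def apply_retention(segment_sizes_gb, total_limit_gb=1000, local_limit_gb=200):
    """Split closed segments (ordered oldest to newest) into three groups.

    Returns (deleted, remote_only, local_and_remote). Simplified model that
    assumes every segment has already been copied to remote storage.
    """
    segments = list(segment_sizes_gb)

    # Total retention: drop the oldest segments while the total exceeds the limit.
    deleted = []
    while segments and sum(segments) > total_limit_gb:
        deleted.append(segments.pop(0))

    # Local retention: keep the newest segments that fit within the local limit.
    local = []
    used_gb = 0
    for size in reversed(segments):
        if used_gb + size > local_limit_gb:
            break
        local.insert(0, size)
        used_gb += size

    # Everything older than the local window now lives only in remote storage.
    remote_only = segments[: len(segments) - len(local)]
    return deleted, remote_only, local


# Twelve hypothetical 100 GB segments: 1200 GB in total.
deleted, remote_only, local = apply_retention([100] * 12)
```

With these inputs, the two oldest segments fall outside the 1000 GB total threshold, the two newest fit within the 200 GB local threshold, and the remaining eight are kept only in remote storage.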

6 changes: 5 additions & 1 deletion docs/products/kafka/concepts/tiered-storage-how-it-works.rst
@@ -31,7 +31,11 @@ The replication factor is not considered during the upload process, and only one

Data retrieval
-----------------
When consumers fetch records stored in remote storage, the broker downloads and caches these records locally. This allows for quicker access in subsequent retrieval operations. You can configure the retention time and the maximum size of the cache.
When consumers fetch records stored in remote storage, the broker downloads and caches these records locally. This allows for quicker access in subsequent retrieval operations.
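
The download-and-cache behaviour described above follows a read-through cache pattern, sketched below. This is an illustration only: the class ``SegmentCache``, the ``fake_download`` helper, and the segment identifier are hypothetical and do not reflect the broker's actual implementation, which also has to handle eviction.

```python
class SegmentCache:
    """Minimal read-through cache: download on first access, then reuse."""

    def __init__(self, download):
        # `download` is any callable that fetches a segment from remote storage.
        self._download = download
        self._cache = {}

    def fetch(self, segment_id):
        # Only contact remote storage when the segment is not cached locally.
        if segment_id not in self._cache:
            self._cache[segment_id] = self._download(segment_id)
        return self._cache[segment_id]


# Hypothetical stand-in for remote storage that records each download.
downloads = []


def fake_download(segment_id):
    downloads.append(segment_id)
    return f"records-of-{segment_id}".encode()


cache = SegmentCache(fake_download)
first = cache.fetch("topic-0/00000000")   # triggers a download
second = cache.fetch("topic-0/00000000")  # served from the local cache
```

The second fetch returns the same records without another download, which is why subsequent retrievals of the same data are quicker.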


Security
--------
Segments are encrypted with 256-bit AES encryption before being uploaded to the remote storage. The encryption keys are not shared with the cloud storage provider and generally do not leave Aiven machines.

