Support for streaming a large file via Context Imprint (AES CTR) #1043

Open
ashughes opened this issue Feb 16, 2024 · 4 comments
Labels: core (Themis Core written in C, its packages), feature request

Comments

@ashughes

Is your feature request related to a problem? Please describe.
We need to be able to encrypt large files and to perform random access when decrypting them. Based on my understanding of AES CTR, this should already be possible; however, no Themis API allows us to do it.

Describe the solution you'd like to see
I believe the following API (in Java) should provide the necessary primitives to support this use case with minimal dependencies (e.g. no Java File or stream APIs):

public interface ContextImprintByOffset {
  byte[] encrypt(byte[] data, byte[] context, long offset);
  byte[] decrypt(byte[] data, byte[] context, long offset);
}

In the above interface an offset parameter is added which would be used to offset the counter appropriately when using the IV to encrypt and decrypt each byte. Additionally, the context parameter could be removed from this interface and instead provided as a constructor parameter when creating the instance (along with the key) since the additional context should be the same for every call to encrypt and decrypt for a given file.
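
The offset-to-counter mapping described above can be sketched with the JDK's own `AES/CTR/NoPadding` cipher. This is not Themis code, and it does not reflect how Context Imprint derives its IV. The names `CtrOffsetSketch`, `advanceCounter`, and `applyAt` are hypothetical, and the sketch assumes a 16-byte IV that the cipher treats as a big-endian 128-bit initial counter block.

```java
import java.math.BigInteger;
import java.security.GeneralSecurityException;
import java.util.Arrays;
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical illustration only; not part of Themis.
final class CtrOffsetSketch {
    private static final int BLOCK = 16; // AES block size in bytes

    /** Adds blockIndex to the 128-bit big-endian counter block, wrapping modulo 2^128. */
    static byte[] advanceCounter(byte[] iv, long blockIndex) {
        BigInteger sum = new BigInteger(1, iv).add(BigInteger.valueOf(blockIndex));
        byte[] raw = sum.toByteArray(); // may be shorter than 16 bytes or carry one extra leading byte
        byte[] out = new byte[BLOCK];
        int n = Math.min(raw.length, BLOCK);
        System.arraycopy(raw, raw.length - n, out, BLOCK - n, n); // keep the low 16 bytes
        return out;
    }

    /** Encrypts or decrypts (the CTR operation is the same) data located at byte `offset` of the stream. */
    static byte[] applyAt(byte[] key, byte[] iv, byte[] data, long offset) {
        if (offset < 0) {
            throw new IllegalArgumentException("offset must be non-negative");
        }
        try {
            Cipher cipher = Cipher.getInstance("AES/CTR/NoPadding");
            cipher.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(key, "AES"),
                    new IvParameterSpec(advanceCounter(iv, offset / BLOCK)));
            // Pad the front so the keystream bytes before `offset` in the first block are discarded.
            int skip = (int) (offset % BLOCK);
            byte[] padded = new byte[skip + data.length];
            System.arraycopy(data, 0, padded, skip, data.length);
            byte[] out = cipher.doFinal(padded);
            return Arrays.copyOfRange(out, skip, out.length);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Under these assumptions, calling `applyAt` on a slice of the plaintext with the slice's starting offset yields the same bytes as the matching slice of the whole-file ciphertext, which is what makes random-access reads possible.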

Describe alternatives you've considered
An alternative is that we could split our large files into chunks, encrypt each chunk individually using Seal, Token Protect, or Context Imprint, and concatenate each chunk into an output file. For random access reads, we would determine which chunks need to be decrypted in order to return the requested data. This would effectively be creating our own data format based on the chosen encryption mode, chunk size, and header format.
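
The random-access lookup in the chunked approach comes down to index arithmetic, sketched below. `ChunkRange`, `chunksFor`, and `ciphertextPosition` are hypothetical names, and the sketch assumes fixed-size plaintext chunks with a constant per-chunk ciphertext overhead (0 for a length-preserving mode); neither assumption comes from Themis.

```java
// Hypothetical illustration only; not part of Themis.
final class ChunkRange {
    /** Returns {firstChunk, lastChunk}, inclusive, covering plaintext bytes [offset, offset + length). */
    static long[] chunksFor(long offset, long length, int chunkSize) {
        if (offset < 0 || length <= 0 || chunkSize <= 0) {
            throw new IllegalArgumentException("offset >= 0, length > 0, chunkSize > 0 required");
        }
        long first = offset / chunkSize;
        long last = (offset + length - 1) / chunkSize; // index of the chunk holding the final byte
        return new long[] {first, last};
    }

    /** Byte position of an encrypted chunk in the output file, given a constant per-chunk overhead. */
    static long ciphertextPosition(long chunkIndex, int chunkSize, int overhead) {
        return chunkIndex * ((long) chunkSize + overhead);
    }
}
```

A reader would decrypt every chunk in the returned range and then trim the result to the requested bytes.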

Since AES CTR effectively already supports this capability, it would be nice to simply utilize that instead of defining our own format and strategy.

@Lagovas
Collaborator

Lagovas commented Feb 26, 2024

Hi, sorry for the late answer. We don't plan to add such an API, because the main goal of the Themis design is to be as simple as possible for users who are not familiar with cryptography and to hide all of cryptography's complexity. So most of the modes provide one function to encrypt and one to decrypt, with no state except, in some languages, a key. Encryption of large files requires a state object that is initialized and passed to every subsequent encryption call. This state holds the counter value and the IV. The same counter and IV must not be reused for a different plaintext, so users must not forget to create a new context for every new plaintext in order to generate a new IV. All of this complicates usage and requires an understanding of the security risks, which is what we wanted to avoid.

We suggest solving it on the application level, as you already described:

An alternative is that we could split our large files into chunks, encrypt each chunk individually using Seal, Token Protect, or Context Imprint, and concatenate each chunk into an output file.

In the case of Secure Cell in Seal mode, the ciphertext has an authentication tag that provides integrity checks, so your app would detect tampering. In the case of Context Imprint, the app wouldn't detect tampering and would, as a result, get corrupted plaintext. So it would be great to have at least a MAC for the whole file to verify its integrity.
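
A whole-file MAC of the kind suggested here could be sketched with the JDK's `HmacSHA256`. This is an assumption about one possible approach, not a Themis API: `FileMacSketch`, `tag`, and `verify` are hypothetical names, and the sketch assumes the MAC is computed over the encrypted chunks in file order with a key kept separate from the encryption key.

```java
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.util.List;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical illustration only; not part of Themis.
final class FileMacSketch {
    /** Computes HMAC-SHA256 over the encrypted chunks, fed in file order. */
    static byte[] tag(byte[] macKey, List<byte[]> encryptedChunks) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(macKey, "HmacSHA256"));
            for (byte[] chunk : encryptedChunks) {
                mac.update(chunk); // streaming update, so the file never has to fit in memory
            }
            return mac.doFinal();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Recomputes the tag and compares it with the stored one using a constant-time comparison. */
    static boolean verify(byte[] macKey, List<byte[]> encryptedChunks, byte[] expectedTag) {
        return MessageDigest.isEqual(tag(macKey, encryptedChunks), expectedTag);
    }
}
```

Note that checking a whole-file tag means reading the whole file, so it covers full reads rather than individual random-access reads.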

@ashughes
Author

Thanks for the thoughtful response! I totally understand the intent is to keep things simple and hide the complexity of cryptography (and we definitely appreciate it).

However, I wanted to clarify a few things. Unless I have a misunderstanding of how this would work, I think the only additional input needed from the user would be the offset parameter. The context parameter could be removed completely from the interface and instead provided as a constructor parameter when creating the Context Imprint instance (along with the key) since the additional context should be the same for every call to encrypt and decrypt for a given file.

Under the hood, the implementation would then determine the correct counter value and IV for the given offset. My understanding is that this is how AES CTR works already. The difference is just allowing the user to provide an offset so that the encryption and decryption can be done in chunks instead of all at once.

In my opinion, encrypting/decrypting large files is a common use case that's missing from Secure Cell.

@Lagovas
Collaborator

Lagovas commented Feb 26, 2024

How would you protect users from reusing the same object?

SymmetricKey symmetricKey = new SymmetricKey();
byte[] context = ....
SecureCell.ContextImprint encryptor = SecureCell.ContextImprintWithKey(symmetricKey, context);

byte[] plaintext1 = ...
byte[] plaintext2 = ...

encryptor.encrypt(plaintext1, 0);
encryptor.encrypt(plaintext2, 0);  // !!!! same key, context, and offset reused for a different plaintext

We want to prevent the use of the same encryptor with the same IV and counter for different plaintexts. Such an interface cannot protect against this flow.
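
The risk in this flow can be made concrete with a small sketch using the JDK's `AES/CTR/NoPadding` cipher rather than Themis. `KeystreamReuseDemo`, `ctr`, and `xor` are hypothetical names. When two plaintexts are encrypted with the same key and the same initial counter, the keystream cancels out, so XORing the two ciphertexts gives the XOR of the two plaintexts without any knowledge of the key.

```java
import java.security.GeneralSecurityException;
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical illustration only; not part of Themis.
final class KeystreamReuseDemo {
    /** Plain AES-CTR encryption starting from the given 16-byte initial counter block. */
    static byte[] ctr(byte[] key, byte[] iv, byte[] data) {
        try {
            Cipher cipher = Cipher.getInstance("AES/CTR/NoPadding");
            cipher.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(key, "AES"), new IvParameterSpec(iv));
            return cipher.doFinal(data);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** XORs two arrays over their common prefix length. */
    static byte[] xor(byte[] a, byte[] b) {
        byte[] out = new byte[Math.min(a.length, b.length)];
        for (int i = 0; i < out.length; i++) {
            out[i] = (byte) (a[i] ^ b[i]);
        }
        return out;
    }
}
```

For two plaintexts `p1` and `p2` encrypted with the same `key` and `iv`, `xor(ctr(key, iv, p1), ctr(key, iv, p2))` equals `xor(p1, p2)`, so an attacker who knows or guesses one plaintext recovers the other.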

@ashughes
Author

We thought about this scenario as well when discussing it internally, and I agree with you that it is not ideal. However, it is only slightly different from the existing problem of using the same additional context with multiple calls to encrypt:

SymmetricKey symmetricKey = new SymmetricKey();
byte[] context = ....
SecureCell.ContextImprint encryptor = SecureCell.ContextImprintWithKey(symmetricKey);

byte[] plaintext1 = ...
byte[] plaintext2 = ...

encryptor.encrypt(plaintext1, context);
encryptor.encrypt(plaintext2, context);  // !!!! same key and context reused for a different plaintext

The only two options we've come up with that might help clarify are:

  1. Make the constructor something like SecureCell.ContextImprintWithKeyForFile(symmetricKey, context) and document that this should only be used for a single file and a new instance should be created with a new context for each file.
  2. Keep the additional context in the encrypt and decrypt (like my original example interface) and document that it must be the same value for each subsequent call for the same file.

@Lagovas added the core (Themis Core written in C, its packages) and feature request labels on Feb 27, 2024