Simplified AES (S-AES) works the same way as the AES algorithm, except that S-AES uses a 16-bit key for encryption, whereas AES uses a 128-, 192-, or 256-bit key. To achieve parallelization, S-AES uses 16 threads to encrypt or decrypt the 16 bits of a block in parallel. The text is converted to binary on the CPU, and the resulting array of bits is then sent to the GPU for encryption. In addition to S-AES itself, two modes that use S-AES for encryption/decryption are also parallelized: CCM and XTS.
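The CPU-side conversion described above can be sketched in Python as follows. This is only an illustration: the function names, the MSB-first bit order, the UTF-8 encoding, and the zero padding of the last block are assumptions, not details taken from the project.

```python
def text_to_bits(text):
    """Return the UTF-8 bytes of `text` as a flat list of bits, MSB first (assumed order)."""
    return [(byte >> shift) & 1
            for byte in text.encode("utf-8")
            for shift in range(7, -1, -1)]


def split_blocks(bits, block_size=16):
    """Split the bit list into 16-bit blocks, zero-padding the last one (assumed padding)."""
    padded = bits + [0] * (-len(bits) % block_size)
    return [padded[i:i + block_size] for i in range(0, len(padded), block_size)]


# "Hi!" is 3 bytes = 24 bits, so it fills two 16-bit blocks after padding.
blocks = split_blocks(text_to_bits("Hi!"))
```

Each 16-bit block in `blocks` is the unit that would be handed to 16 GPU threads, one bit per thread.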
CCM (Counter with Cipher Block Chaining-Message Authentication Code) is an authenticated encryption mode, which means that it simultaneously protects the confidentiality and the authenticity (integrity) of communications.
The algorithm for CCM encryption is given below.
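As a rough illustration of the two building blocks that CCM combines (a CBC-MAC for authentication and counter mode for encryption), here is a heavily simplified Python sketch on 16-bit blocks. The block cipher `toy_encrypt` is a hypothetical stand-in for S-AES, and the sketch omits CCM's formatting of the nonce, lengths, and associated data, so it is not the project's algorithm nor a conforming CCM implementation.

```python
MASK16 = 0xFFFF


def toy_encrypt(block, key):
    """Hypothetical stand-in for S-AES: a keyed 16-bit permutation, NOT secure."""
    x = (block ^ key) & MASK16
    return ((x << 3) | (x >> 13)) & MASK16  # rotate left by 3 bits


def cbc_mac(blocks, key):
    """Chain every block through the cipher; the final state is the tag."""
    tag = 0
    for block in blocks:
        tag = toy_encrypt(tag ^ block, key)
    return tag


def keystream(key, nonce, counter):
    """Counter-mode keystream block (simplified counter format, an assumption)."""
    return toy_encrypt((nonce + counter) & MASK16, key)


def ccm_encrypt(blocks, key, nonce):
    tag = cbc_mac(blocks, key)
    # Counter 0 masks the tag, counters 1..m encrypt the payload blocks.
    cipher = [b ^ keystream(key, nonce, i + 1) for i, b in enumerate(blocks)]
    return cipher, tag ^ keystream(key, nonce, 0)


def ccm_decrypt(cipher, enc_tag, key, nonce):
    blocks = [c ^ keystream(key, nonce, i + 1) for i, c in enumerate(cipher)]
    if cbc_mac(blocks, key) ^ keystream(key, nonce, 0) != enc_tag:
        raise ValueError("authentication failed")
    return blocks
```

The counter-mode part is what parallelizes well, since every keystream block depends only on the nonce and its own counter; the CBC-MAC chain is sequential by construction.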
XTS-AES is a block cipher mode of operation approved by NIST in 2010. The XTS-AES algorithm uses the AES algorithm twice, with two keys. The following parameters are associated with the algorithm.
- Key - The 256- or 512-bit XTS-AES key; this is parsed as a concatenation of two fields of equal size, Key1 and Key2, such that Key = Key1 || Key2. For S-AES, Key has a length of 32 bits, with Key1 and Key2 being 16 bits each.
- Pj - The jth block of the plaintext. All blocks except possibly the final block have a length of 128 bits (for S-AES, 16 bits). A plaintext data unit, typically a disk sector, consists of the sequence P1, P2, ..., Pm.
- Cj - The jth block of the ciphertext. All blocks except possibly the final block have a length of 128 bits (for S-AES, 16 bits).
- j - The sequential number of the 128-bit block inside the data unit.
- i - The value of the 128-bit tweak. Each data unit (sector) is assigned a tweak value that is a nonnegative integer. The tweak values are assigned consecutively, starting from an arbitrary nonnegative integer.
- a - A primitive element of GF(2^128) that corresponds to the polynomial x (i.e., 0000...10). For S-AES the field is GF(2^16).
- a^j - a multiplied by itself j times in GF(2^128) (for S-AES, GF(2^16)).
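Multiplying by a in GF(2^16) amounts to a one-bit left shift followed by a conditional reduction. The sketch below illustrates this in Python. The reduction polynomial x^16 + x^12 + x^3 + x + 1 is an assumption chosen for illustration (the text does not state which polynomial the project uses), and the function names are hypothetical.

```python
# Assumed reduction polynomial x^16 + x^12 + x^3 + x + 1; not taken from the source.
REDUCTION_POLY = 0x1100B


def mul_alpha(value):
    """Multiply a 16-bit field element by a (the polynomial x)."""
    value <<= 1
    if value & 0x10000:          # degree-16 term appeared, reduce it away
        value ^= REDUCTION_POLY
    return value


def mul_alpha_pow(value, j):
    """Multiply `value` by a^j by applying mul_alpha j times."""
    for _ in range(j):
        value = mul_alpha(value)
    return value
```

For example, `mul_alpha_pow(1, 15)` is the element x^15 (0x8000), and one more multiplication wraps around through the reduction polynomial.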
The XTS-AES operation on a single block is shown in the scheme below.
The full XTS-AES mode is shown in the scheme below.
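In the standard definition of XTS, the single-block operation computes T = E(Key2, i) multiplied by a^j, then C = E(Key1, P xor T) xor T. The Python sketch below follows that structure on 16-bit blocks. The cipher `toy_encrypt`/`toy_decrypt` is a hypothetical, insecure stand-in for S-AES, and the reduction polynomial is an assumption, so this shows only the shape of the computation.

```python
MASK16 = 0xFFFF
REDUCTION_POLY = 0x1100B  # assumed polynomial for GF(2^16), not from the source


def toy_encrypt(block, key):
    """Hypothetical stand-in for S-AES encryption (NOT secure)."""
    x = (block ^ key) & MASK16
    return ((x << 3) | (x >> 13)) & MASK16


def toy_decrypt(block, key):
    """Inverse of toy_encrypt: rotate right by 3 bits, then remove the key."""
    x = ((block >> 3) | (block << 13)) & MASK16
    return x ^ key


def tweak(key2, i, j):
    """T = E(Key2, i) multiplied by a^j in GF(2^16)."""
    t = toy_encrypt(i & MASK16, key2)
    for _ in range(j):
        t <<= 1
        if t & 0x10000:
            t ^= REDUCTION_POLY
    return t


def xts_encrypt_block(p, key1, key2, i, j):
    t = tweak(key2, i, j)
    return toy_encrypt(p ^ t, key1) ^ t


def xts_decrypt_block(c, key1, key2, i, j):
    t = tweak(key2, i, j)
    return toy_decrypt(c ^ t, key1) ^ t
```

Because each block's tweak depends only on i and j, the blocks of a data unit can be processed independently, which is what makes the mode a natural fit for parallel execution.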
The Simplified AES algorithm takes a 16-bit block and a 16-bit key as input and produces a 16-bit block as output. The algorithm uses the operations SBOX, ShiftRows, MixColumns, and AddRoundKey, and their inverses InvSBOX, InvShiftRows, and InvMixColumns (analogous to the AES algorithm).
Encryption has three rounds, where the first two rounds are identical and the third round does not have MixColumns. Let
, where
Then B enters the structure below (three rounds). First, the SBOX substitution is made, given with the
This SBOX is built within the Galois Field
Then comes the MixColumns operation, which multiplies the Maximum Distance Separable (MDS) matrix by the result of the ShiftRows operation. The MDS matrix for MixColumns is
where multiplication and addition are performed within the Galois Field
In AddRoundKey, the resulting matrix is XORed with the first-round key
This matrix then enters the second round, where the same operations are repeated, and then round 3.
The encryption scheme for Simplified AES is given below.
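The MixColumns step described above is a matrix multiplication over a small Galois field. Since the matrix and field used by this project are not reproduced in the text, the Python sketch below assumes the textbook S-AES values from Stallings: a 2x2 state of 4-bit nibbles, the field GF(2^4) modulo x^4 + x + 1, the matrix [[1, 4], [4, 1]], and its inverse [[9, 2], [2, 9]]. The project's own values may differ.

```python
def gf16_mul(a, b):
    """Multiply two nibbles in GF(2^4) modulo x^4 + x + 1 (assumed field)."""
    result = 0
    for _ in range(4):
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a & 0x10:       # reduce when the degree reaches 4
            a ^= 0x13
    return result


MIX = [[1, 4], [4, 1]]       # assumed MDS matrix (textbook S-AES)
INV_MIX = [[9, 2], [2, 9]]   # its inverse over GF(2^4)


def mat_mul(matrix, state):
    """Multiply a 2x2 matrix by a 2x2 state, where state[row][col] is a nibble."""
    return [[gf16_mul(matrix[r][0], state[0][c]) ^ gf16_mul(matrix[r][1], state[1][c])
             for c in range(2)]
            for r in range(2)]


state = [[0x6, 0x4], [0xC, 0x0]]
mixed = mat_mul(MIX, state)
restored = mat_mul(INV_MIX, mixed)
```

Addition in the field is plain XOR, which is why the two products in each entry are combined with `^`.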
Decryption also has three rounds, where the first round does not have InvMixColumns and the last two rounds (the second and the third) are identical.
Given the encrypted text
In the first round, the first operation is AddRoundKey, so it is calculated
After AddRoundKey comes InvShiftRows, where the second row is shifted right by four bits.
Then the substitution is done with the InvSBOX given by the matrix/table:
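Shifting an 8-bit row by four bits simply swaps its two nibbles. The Python sketch below shows this under an assumed layout: the 16-bit state is read as four nibbles n0 n1 n2 n3 (most significant first) that fill a 2x2 matrix column by column, so the second row holds n1 and n3. The layout and the function names are assumptions for illustration.

```python
def shift_rows(state):
    """Swap the two nibbles of the second row of a 16-bit state (assumed layout)."""
    n0 = (state >> 12) & 0xF
    n1 = (state >> 8) & 0xF
    n2 = (state >> 4) & 0xF
    n3 = state & 0xF
    # The first row (n0, n2) is untouched; the second row (n1, n3) is rotated by one nibble.
    return (n0 << 12) | (n3 << 8) | (n2 << 4) | n1


# With only two nibbles per row, rotating left or right gives the same swap,
# so the inverse operation is the same function.
inv_shift_rows = shift_rows
```

Applying the function twice returns the original state, which is why encryption and decryption can share it under this layout.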
This InvSBOX is built within the Galois Field
After InvSBOX, the first round ends and the last two rounds begin.
In the second round, AddRoundKey is performed first, where the obtained value is XORed with the second-round key
Then comes the InvMixColumns operation, which multiplies the Maximum Distance Separable (MDS) matrix by the result of the preceding AddRoundKey operation. The MDS matrix for InvMixColumns is
where multiplication and addition are done within the Galois Field
After InvMixColumns comes InvShiftRows, where the second row is shifted right by four bits.
Then the substitution is done with the InvSBOX (the inverse of SBOX)
This matrix then enters the third round, where the same operations are repeated, and finally it is XORed with the key
The Simplified AES decryption scheme is given below:
Three keys are generated for S-AES and are used for both encryption and decryption. The scheme is given below:
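For reference, the textbook S-AES key expansion from Stallings also derives three 16-bit round keys from one 16-bit key. The Python sketch below implements that textbook schedule; whether this project uses exactly the same S-box, round constants, and schedule is an assumption, and the function names are hypothetical.

```python
# Textbook S-AES S-box and round constants (assumed to apply here).
SBOX = [0x9, 0x4, 0xA, 0xB, 0xD, 0x1, 0x8, 0x5,
        0x6, 0x2, 0x0, 0x3, 0xC, 0xE, 0xF, 0x7]
RCON = [0x80, 0x30]


def sub_rot(word):
    """Swap the two nibbles of an 8-bit word (RotNib), then substitute each one (SubNib)."""
    hi, lo = word >> 4, word & 0xF
    return (SBOX[lo] << 4) | SBOX[hi]


def expand_key(key):
    """Expand a 16-bit key into three 16-bit round keys."""
    w = [key >> 8, key & 0xFF]                      # w0, w1
    for rcon in RCON:
        w.append(w[-2] ^ rcon ^ sub_rot(w[-1]))     # w2, then w4
        w.append(w[-1] ^ w[-2])                     # w3, then w5
    return [(w[i] << 8) | w[i + 1] for i in range(0, 6, 2)]


round_keys = expand_key(0x4AF5)
```

With the textbook example key 0100 1010 1111 0101 (0x4AF5), this yields the round keys 0x4AF5, 0xDD28, and 0x87AF.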
William Stallings - Cryptography and Network Security: Principles and Practice