AES

Advanced Encryption Standard

AES-128/192/256

AES offers three different versions:

  • AES-128 takes a key of 128 bits (16 bytes)

  • AES-192 takes a key of 192 bits (24 bytes)

  • AES-256 takes a key of 256 bits (32 bytes)

No practical attack against AES-128 is publicly known, and it is expected to remain secure for a long time.
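As a small illustration, the key length alone selects the AES variant. The sketch below maps a key's size to its variant; the helper name `describe_key` is made up for this example, and the round counts (10, 12, 14) come from the AES specification (FIPS 197) rather than from the text above.

```python
import secrets

# Key length in bytes -> (variant name, number of rounds per FIPS 197)
AES_VARIANTS = {16: ("AES-128", 10), 24: ("AES-192", 12), 32: ("AES-256", 14)}

def describe_key(key: bytes) -> str:
    """Return a short description of the AES variant that a key selects."""
    if len(key) not in AES_VARIANTS:
        raise ValueError(f"invalid AES key length: {len(key)} bytes")
    name, rounds = AES_VARIANTS[len(key)]
    return f"{name}: {len(key) * 8}-bit key, {rounds} rounds"

# Generate a random key of each valid size and describe it.
for size in (16, 24, 32):
    print(describe_key(secrets.token_bytes(size)))
```

Any other key length is rejected, which is also how real AES libraries typically behave.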

AES Interface

Looking at the interface of AES for encryption, we see the following:

  • The algorithm takes a variable-length key as discussed previously. For AES-128, this key length is 128 bits.

  • It also takes a plaintext of exactly 128 bits.

  • The block size is 128 bits.

  • It outputs a ciphertext of exactly 128 bits.

For AES-128, everything is 128 bits (16 bytes): the key, the plaintext block, and the ciphertext block.

Because AES encrypts a fixed-size plaintext, we call it a block cipher. Some other ciphers can encrypt arbitrary-length plaintexts, as you will see later in this chapter.

The decryption operation is exactly the reverse of this: it takes the same key, a ciphertext of 128 bits, and returns the original 128-bit plaintext. Effectively, decryption reverts the encryption. This is possible because the encryption and decryption operations are deterministic; they produce the same results no matter how many times you call them.

In technical terms, a block cipher is a keyed permutation: it maps all the possible plaintexts to all the possible ciphertexts. Changing the key changes that mapping.

A permutation is also reversible. From a ciphertext, you have a map back to its corresponding plaintext (otherwise, decryption wouldn't work).
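To make the idea of a keyed permutation concrete, here is a toy sketch on 4-bit blocks (16 possible values instead of 2^128). It is not AES and not secure: the key simply seeds Python's non-cryptographic `random` module to shuffle a lookup table. The function names are invented for this example.

```python
import random

def keyed_permutation(key: int, bits: int = 4) -> list[int]:
    """Build a toy permutation table: index = plaintext, value = ciphertext."""
    table = list(range(2 ** bits))
    random.Random(key).shuffle(table)  # the key decides the mapping
    return table

def invert(table: list[int]) -> list[int]:
    """Build the reverse mapping: index = ciphertext, value = plaintext."""
    inverse = [0] * len(table)
    for plaintext, ciphertext in enumerate(table):
        inverse[ciphertext] = plaintext
    return inverse

encrypt_table = keyed_permutation(key=1234)
decrypt_table = invert(encrypt_table)

# Every plaintext maps to exactly one ciphertext, and the map can be undone.
for plaintext in range(16):
    assert decrypt_table[encrypt_table[plaintext]] == plaintext
print(encrypt_table)
```

Calling `keyed_permutation` with a different key will, in general, produce a different table, which mirrors the statement that changing the key changes the mapping.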

AES Internals

Let’s dig a bit deeper into the guts of AES to see what's inside. Note that AES sees the state of the plaintext during the encryption process as a 4-by-4 matrix of bytes.

This doesn't really matter in practice, but this is how AES is defined. AES also has a round function that it iterates several times, starting on the original input (the plaintext).

Each call to the round function transforms the state further, eventually producing the ciphertext. Each round uses a different round key, which is derived from the main symmetric key during KeyExpansion.
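The 4-by-4 state mentioned above can be sketched in a few lines. One detail that the text does not spell out, taken here from the AES specification (FIPS 197), is that the 16 input bytes fill the matrix column by column. The helper names are hypothetical.

```python
def bytes_to_state(block: bytes) -> list[list[int]]:
    """Arrange 16 bytes as a 4x4 matrix, filled column by column."""
    if len(block) != 16:
        raise ValueError("an AES block is exactly 16 bytes")
    return [[block[row + 4 * col] for col in range(4)] for row in range(4)]

def state_to_bytes(state: list[list[int]]) -> bytes:
    """Flatten the 4x4 matrix back into 16 bytes (inverse of bytes_to_state)."""
    return bytes(state[row][col] for col in range(4) for row in range(4))

state = bytes_to_state(bytes(range(16)))
for row in state:
    print(row)
# [0, 4, 8, 12]
# [1, 5, 9, 13]
# [2, 6, 10, 14]
# [3, 7, 11, 15]
```

The first four input bytes end up in the first column, not the first row, which is why the printed rows step by four.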

KeyExpansion: From the 128-bit key, 11 separate 128-bit round keys are derived, one for each AddRoundKey step. This step is also known as the key schedule.

The combination of the key schedule and the rounds ensures that the slightest change in the bits of the key or the message produces a completely different ciphertext.

A full round consists of the following four steps (the final round skips MixColumns):

  • SubBytes

    • Confusion through substitution (using S-Box)

  • ShiftRows

    • Diffusion through permutation part 1

  • MixColumns

    • Diffusion through permutation part 2

  • AddRoundKey

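    • Encryption (XOR of the state with the round key)

The first three steps are fixed, public transformations that anyone can reverse. AddRoundKey is the only step that involves the key: it computes XOR(state, round key), so reversing it requires knowledge of the round key.

Together, these steps provide the two properties a secure cipher needs, confusion and diffusion: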

  • Confusion means that each bit of the ciphertext should depend on several parts of the key, obscuring the connections between the two.

    • The property of confusion hides the relationship between the ciphertext and the key.

    • This property makes it difficult to find the key from the ciphertext: if a single bit of the key is changed, most or all of the bits in the ciphertext are affected.

    • Confusion increases the ambiguity of ciphertext and it is used by both block and stream ciphers.

    • In substitution–permutation networks, confusion is provided by substitution boxes (S-boxes).

  • Diffusion means that if we change a single bit of the plaintext, then about half of the bits in the ciphertext should change, and similarly, if we change one bit of the ciphertext, then about half of the plaintext bits should change. This is equivalent to the expectation that encryption schemes exhibit an avalanche effect.

    • The purpose of diffusion is to hide the statistical relationship between the ciphertext and the plaintext. For example, diffusion ensures that any patterns in the plaintext, such as redundant bits, are not apparent in the ciphertext. Block ciphers achieve this by "diffusing" the information about the plaintext's structure across the rows and columns of the cipher.

    • In substitution–permutation networks, diffusion is provided by permutation boxes (P-boxes).
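To make the reversibility of the round steps concrete, here is a minimal sketch of two of them on a 4-by-4 state: ShiftRows and AddRoundKey. It assumes the standard definition from FIPS 197, where ShiftRows rotates row r to the left by r positions. SubBytes and MixColumns are left out to keep the example short, and the function names are chosen for this illustration only.

```python
State = list[list[int]]

def rotate_left(row: list[int], n: int) -> list[int]:
    """Rotate a row left by n positions (a negative n rotates right)."""
    n %= len(row)
    return row[n:] + row[:n]

def shift_rows(state: State) -> State:
    """ShiftRows: row r is rotated left by r positions. No key involved."""
    return [rotate_left(row, r) for r, row in enumerate(state)]

def inv_shift_rows(state: State) -> State:
    """Anyone can undo ShiftRows by rotating each row back to the right."""
    return [rotate_left(row, -r) for r, row in enumerate(state)]

def add_round_key(state: State, round_key: State) -> State:
    """AddRoundKey: XOR each state byte with the matching round key byte."""
    return [[s ^ k for s, k in zip(s_row, k_row)]
            for s_row, k_row in zip(state, round_key)]

state = [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11], [12, 13, 14, 15]]
round_key = [[0xA5] * 4 for _ in range(4)]  # made-up round key for the demo

print(shift_rows(state))
# [[0, 1, 2, 3], [5, 6, 7, 4], [10, 11, 8, 9], [15, 12, 13, 14]]

# ShiftRows is undone without any secret; AddRoundKey is undone by
# XORing with the same round key a second time.
assert inv_shift_rows(shift_rows(state)) == state
assert add_round_key(add_round_key(state, round_key), round_key) == state
```

XOR is its own inverse, so decryption applies the same AddRoundKey operation; the difference from the other steps is only that it cannot be undone without the round key.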

AES-CBC

AES-ECB is bad. AES-CBC is better (but not perfect).

AES-ECB is bad because AES encryption is deterministic: encrypting the same block of plaintext twice leads to the same ciphertext. ECB (electronic codebook) mode encrypts each block individually, so the resulting ciphertext can show repeating patterns wherever the plaintext repeats.
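The pattern leak from encrypting each block independently (the mode usually called ECB) can be shown with a short sketch. To keep it dependency-free, a truncated HMAC-SHA256 stands in for AES here. That stand-in is an assumption of this example: it is deterministic and keyed like AES, but it is not a real block cipher and cannot be decrypted.

```python
import hashlib
import hmac

BLOCK = 16  # AES block size in bytes

def toy_block_encrypt(key: bytes, block: bytes) -> bytes:
    """Stand-in for AES: deterministic and keyed, but NOT a real cipher."""
    return hmac.new(key, block, hashlib.sha256).digest()[:BLOCK]

def encrypt_blocks_independently(key: bytes, plaintext: bytes) -> list[bytes]:
    """Encrypt each 16-byte block on its own, with no IV and no chaining."""
    if len(plaintext) % BLOCK:
        raise ValueError("plaintext must be a multiple of 16 bytes")
    return [toy_block_encrypt(key, plaintext[i:i + BLOCK])
            for i in range(0, len(plaintext), BLOCK)]

key = b"0123456789abcdef"
plaintext = b"ATTACK AT DAWN!!" * 2 + b"something else.."
blocks = encrypt_blocks_independently(key, plaintext)

for block in blocks:
    print(block.hex())
# The first two plaintext blocks are identical, so their ciphertexts are too.
print(blocks[0] == blocks[1])  # True
```

An observer who sees the output learns that the first two blocks of the message are equal without knowing the key.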

CBC works for any deterministic block cipher (not just AES) by taking an additional value called an initialization vector (IV) to randomize the encryption. The IV has the same length as the block (16 bytes for AES) and must be random and unpredictable.

To encrypt with the CBC mode of operation, start by generating a random IV of 16 bytes, then XOR the generated IV with the first 16 bytes of plaintext before encrypting those. This effectively randomizes the encryption. Indeed, if the same plaintext is encrypted twice but with different IVs, the mode of operation renders two different ciphertexts. If there is more plaintext to encrypt, use the previous ciphertext block (like we used the IV previously) and XOR it with the next block of plaintext before encrypting it. This randomizes the next block of encryption as well. Remember, the output of the block cipher is unpredictable, so the previous ciphertext block should be as good a source of randomness as the real IV.
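The chaining described above can be written out by hand. In the sketch below, a small 4-round Feistel construction built from SHA-256 stands in for AES so that the example runs with the standard library only. This toy cipher is an assumption of the example and must not be used for real encryption; only the CBC logic around it is the point. Padding is left out, so the plaintext must already be a multiple of 16 bytes.

```python
import hashlib
import secrets

BLOCK = 16

def _xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def _f(key: bytes, rnd: int, half: bytes) -> bytes:
    """Feistel round function of the toy cipher (8 bytes in, 8 bytes out)."""
    return hashlib.sha256(key + bytes([rnd]) + half).digest()[:8]

def toy_encrypt_block(key: bytes, block: bytes) -> bytes:
    """Toy invertible 16-byte block cipher standing in for AES."""
    left, right = block[:8], block[8:]
    for rnd in range(4):
        left, right = right, _xor(left, _f(key, rnd, right))
    return left + right

def toy_decrypt_block(key: bytes, block: bytes) -> bytes:
    left, right = block[:8], block[8:]
    for rnd in reversed(range(4)):
        left, right = _xor(right, _f(key, rnd, left)), left
    return left + right

def cbc_encrypt(key: bytes, iv: bytes, plaintext: bytes) -> bytes:
    previous, out = iv, b""
    for i in range(0, len(plaintext), BLOCK):
        # XOR with the IV (first block) or the previous ciphertext block.
        previous = toy_encrypt_block(key, _xor(plaintext[i:i + BLOCK], previous))
        out += previous
    return out

def cbc_decrypt(key: bytes, iv: bytes, ciphertext: bytes) -> bytes:
    previous, out = iv, b""
    for i in range(0, len(ciphertext), BLOCK):
        block = ciphertext[i:i + BLOCK]
        out += _xor(toy_decrypt_block(key, block), previous)
        previous = block
    return out

key = b"0123456789abcdef"
message = b"sixteen byte msg" * 2
iv1, iv2 = secrets.token_bytes(BLOCK), secrets.token_bytes(BLOCK)

print(cbc_encrypt(key, iv1, message).hex())
print(cbc_encrypt(key, iv2, message).hex())  # same message, different IV
assert cbc_decrypt(key, iv1, cbc_encrypt(key, iv1, message)) == message
```

In practice, you would let a library such as PyCryptodome handle both the block cipher and the chaining.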

AES encryption with PyCryptodome:

# AES-CBC encryption
>>> import json
>>> from base64 import b64encode
>>> from Crypto.Cipher import AES
>>> from Crypto.Util.Padding import pad
>>> from Crypto.Random import get_random_bytes
>>>
>>> data = b"secret"
>>> key = get_random_bytes(16)
>>> cipher = AES.new(key, AES.MODE_CBC)
>>> ct_bytes = cipher.encrypt(pad(data, AES.block_size))
>>> iv = b64encode(cipher.iv).decode('utf-8')
>>> ct = b64encode(ct_bytes).decode('utf-8')
>>> result = json.dumps({'iv':iv, 'ciphertext':ct})
>>> print(result)  # the IV and ciphertext differ on every run
{"iv": "bWRHdzkzVDFJbWNBY0EwSmQ1UXFuQT09", "ciphertext": "VDdxQVo3TFFCbXIzcGpYa1lJbFFZQT09"}

To decrypt with the CBC mode of operation, reverse the operations. As the IV is needed, it must be transmitted in clear text along with the ciphertext. Because the IV is supposed to be random, no information is leaked by observing the value:

AES decryption with PyCryptodome:

# AES-CBC decryption
>>> import json
>>> from base64 import b64decode
>>> from Crypto.Cipher import AES
>>> from Crypto.Util.Padding import unpad
>>>
>>> # We assume that the key was securely shared beforehand and that
>>> # json_input holds the JSON string produced by the encryption step
>>> try:
...     b64 = json.loads(json_input)
...     iv = b64decode(b64['iv'])
...     ct = b64decode(b64['ciphertext'])
...     cipher = AES.new(key, AES.MODE_CBC, iv)
...     # unpad raises ValueError if the padding is malformed
...     pt = unpad(cipher.decrypt(ct), AES.block_size)
...     print("The message was: ", pt)
... except (ValueError, KeyError):
...     print("Incorrect decryption")

An IV needs to be unique (it cannot repeat) as well as unpredictable (it really needs to be random). When an IV repeats or is predictable, the encryption becomes deterministic again, and a number of clever attacks become possible. This was the case with the famous BEAST attack (Browser Exploit Against SSL/TLS) on the TLS protocol.

AES-CBC-HMAC

So far, we have failed to address one fundamental flaw: the ciphertext as well as the IV in the case of CBC can still be modified by an attacker.

Indeed, there's no integrity mechanism to prevent that! Changes to the ciphertext or IV can cause unexpected changes in the decrypted plaintext. For example, in AES-CBC, an attacker can flip specific bits of plaintext by flipping bits in its IV and ciphertext.

To prevent modifications to the ciphertext, we can use a message authentication code (MAC). For AES-CBC, we usually use HMAC in combination with the SHA-256 hash function to provide integrity. After padding and encrypting the plaintext, we compute the MAC over both the IV and the ciphertext; otherwise, an attacker can still modify the IV without being caught.

Prior to decryption, the tag needs to be verified. The combination of all of these algorithms is referred to as AES-CBC-HMAC and was one of the most widely used authenticated encryption modes until we started to adopt more modern all-in-one constructions.
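The tagging and verification steps can be sketched with Python's standard `hmac` module. Random bytes stand in for the real AES-CBC output here, and the function names are invented for this example. The sketch also assumes a MAC key that is separate from the encryption key, which is common practice but not stated in the text above.

```python
import hashlib
import hmac
import secrets

def compute_tag(mac_key: bytes, iv: bytes, ciphertext: bytes) -> bytes:
    """HMAC-SHA256 over the IV followed by the ciphertext."""
    return hmac.new(mac_key, iv + ciphertext, hashlib.sha256).digest()

def verify_tag(mac_key: bytes, iv: bytes, ciphertext: bytes, tag: bytes) -> bool:
    """Check the tag in constant time; decrypt only if this returns True."""
    return hmac.compare_digest(compute_tag(mac_key, iv, ciphertext), tag)

mac_key = secrets.token_bytes(32)      # separate from the encryption key
iv = secrets.token_bytes(16)
ciphertext = secrets.token_bytes(32)   # placeholder for real AES-CBC output
tag = compute_tag(mac_key, iv, ciphertext)

# An attacker flips one bit of the IV in transit.
tampered_iv = bytes([iv[0] ^ 0x01]) + iv[1:]

print(verify_tag(mac_key, iv, ciphertext, tag))           # True
print(verify_tag(mac_key, tampered_iv, ciphertext, tag))  # False
```

Because the IV is covered by the tag, the modified IV is rejected before any decryption takes place.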

AEAD

The most modern way of encrypting data is to use an all-in-one construction called Authenticated Encryption with Associated Data (AEAD). The construction is very close to what AES-CBC-HMAC provides, as it also offers confidentiality of your plaintexts while detecting any modifications that could have occurred on the ciphertexts. What’s more, it provides a way to authenticate associated data.

The associated data argument is optional and can be empty or it can also contain metadata that is relevant to the encryption and decryption of the plaintext. This data will not be encrypted and is either implied or transmitted along with the ciphertext. In addition, the ciphertext’s size is larger than the plaintext because it now contains an additional authentication tag (usually appended to the end of the ciphertext). To decrypt the ciphertext, we are required to use the same implied or transmitted associated data. The result is either an error, indicating that the ciphertext was modified in transit, or the original plaintext.
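The shape of an AEAD interface can be sketched with the standard library alone. The construction below is a toy encrypt-then-MAC scheme (a SHA-256 counter keystream plus HMAC-SHA256), not a standardized AEAD and not something to deploy; the function names, the two-key layout, and the 12-byte nonce are all assumptions made for this example. What it illustrates is the interface: associated data is authenticated but not encrypted, the tag is appended to the ciphertext, and decryption returns either the plaintext or an error.

```python
import hashlib
import hmac
import secrets

TAG_LEN = 32  # HMAC-SHA256 output size

def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy keystream: SHA-256 over key, nonce, and a block counter."""
    out, counter = b"", 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]

def _tag(mac_key: bytes, nonce: bytes, ad: bytes, ct: bytes) -> bytes:
    # Prefix the associated data with its length so boundaries are unambiguous.
    msg = len(ad).to_bytes(8, "big") + ad + nonce + ct
    return hmac.new(mac_key, msg, hashlib.sha256).digest()

def toy_aead_encrypt(enc_key, mac_key, nonce, plaintext, ad=b""):
    stream = _keystream(enc_key, nonce, len(plaintext))
    ct = bytes(p ^ k for p, k in zip(plaintext, stream))
    return ct + _tag(mac_key, nonce, ad, ct)  # tag appended to ciphertext

def toy_aead_decrypt(enc_key, mac_key, nonce, sealed, ad=b""):
    ct, tag = sealed[:-TAG_LEN], sealed[-TAG_LEN:]
    if not hmac.compare_digest(tag, _tag(mac_key, nonce, ad, ct)):
        raise ValueError("authentication failed")
    stream = _keystream(enc_key, nonce, len(ct))
    return bytes(c ^ k for c, k in zip(ct, stream))

enc_key, mac_key = secrets.token_bytes(32), secrets.token_bytes(32)
nonce = secrets.token_bytes(12)  # must never repeat under the same keys
header = b"message-id: 42"       # associated data: sent in the clear

sealed = toy_aead_encrypt(enc_key, mac_key, nonce, b"secret", ad=header)
print(len(sealed))  # 38: 6 bytes of ciphertext plus a 32-byte tag
print(toy_aead_decrypt(enc_key, mac_key, nonce, sealed, ad=header))  # b'secret'
```

Decrypting with different associated data, or with a modified ciphertext, raises `ValueError` instead of returning a plaintext.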

The most widely used AEAD is AES-GCM. It was designed for high performance by taking advantage of hardware support for AES and by using a MAC (GMAC) that can be implemented efficiently.

Lab

Reference
