I am creating an encryption scheme with AES in cbc mode with a 256-bit key. Before I learned about CBC mode and initial values, I was planning on creating a 32-bit salt for each act of encryption and storing the salt. The password/entered key would then be padded with this salt up to 32 bits.
ie. if the pass/key entered was "tree," instead of padding it with 28 0s, it would be padded with the first 28 chars of this salt.
However, this was before I learned of the iv, also called a salt in some places. The question for me has now arisen as to whether or not this earlier method of salting has become redundant in principle with the IV. This would be to assume that the salt and the iv would be stored with the cipher text and so a theoretical brute force attack would not be deterred any.
Storing this key and using it rather than 0s is a step that involves some effort, so it is worth asking I think whether or not it is a practically useless measure. It is not as though there could be made, with current knowledge, any brute-force decryption tables for AES, and even a 16 bit salt pains the creation of md5 tables.
Thanks, Elijah
It's good that you know CBC, as it is certainly better than using ECB mode encryption (although even better modes such as the authenticated modes GCM and EAX exist as well).
I think there are several things that you should know about, so I'll explain them here.
Keys and passwords are not the same. Normally you create a key used for symmetric encryption out of a password using a key derivation function. The most common one discussed here is PBKDF2 (password based key derivation function #2), which is used for PBE (password based encryption). This is defined in the latest, open PKCS#5 standard by RSA labs. Before entering the password need to check if the password is correctly translated into bytes (character encoding).
The salt is used as another input of the key derivation function. It is used to prevent brute force attacks using "rainbow tables" where keys are pre-computed for specific passwords. Because of the salt, the attacker cannot use pre-computed values, as he cannot generate one for each salt. The salt should normally be 8 bytes (64 bits) or longer; using a 128 bit salt would give you optimum security. The salt also ensures that identical passwords (of different users) do not derive the same key.
The output of the key derivation function is a secret of dkLen
bytes, where dkLen
is the length of the key to generate, in bytes. As an AES key does not contain anything other than these bytes, the AES key will be identical to the generated secret. dkLen
should be 16, 24 or 32 bytes for the key lengths of AES: 128, 192 or 256 bits.
OK, so now you finally have an AES key to use. However, if you simply encrypt each plain text block with this key, you will get identical result if the plain text blocks are identical. CBC mode gets around this by XOR'ing the next plain text block with the last encrypted block before doing the encryption. That last encrypted block is the "vector". This does not work for the first block, because there is no last encrypted block. This is why you need to specify the first vector: the "initialization vector" or IV.
The block size of AES is 16 bytes independent of the key size. So the vectors, including the initialization vector, need to be 16 bytes as well. Now, if you only use the key to encrypt e.g. a single file, then the IV could simply contain 16 bytes with the value 00
h.
This does not work for multiple files, because if the files contain the same text, you will be able to detect that the first part of the encrypted file is identical. This is why you need to specify a different IV for each encryption you perform with the key. It does not matter what it contains, as long as it is unique, 16 bytes and known to the application performing the decryption.
[EDIT 6 years later] The above part is not entirely correct: for CBC the IV needs to be unpredictable to an attacker, it doesn't just need to be unique. So for instance a counter cannot be used.
Now there is one trick that might allow you to use all zero's for the IV all the time: for each plain text you encrypt using AES-CBC, you could calculate a key using the same password but a different salt. In that case, you will only use the resulting key for a single piece of information. This might be a good idea if you cannot provide an IV for a library implementing password based encryption.
[EDIT] Another commonly used trick is to use additional output of PBKDF2 to derive the IV. This way the official recommendation that the IV for CBC should not be predicted by an adversary is fulfilled. You should however make sure that you do not ask for more output of the PBKDF2 function than that the underlying hash function can deliver. PBKDF2 has weaknesses that would enable an adversary to gain an advantage in such a situation. So do not ask for more than 256 bits if SHA-256 is used as hash function for PBKDF2. Note that SHA-1 is the common default for PBKDF2 so that only allows for a single 128 bit AES key.
IV's and salts are completely separate terms, although often confused. In your question, you also confuse bits and bytes, key size and block size and rainbow tables with MD5 tables (nobody said crypto is easy). One thing is certain: in cryptography it pays to be as secure as possible; redundant security is generally not a problem, unless you really (really) cannot afford the extra resources.
When you understand how this all works, I would seriously you to find a library that performs PBE encryption. You might just need to feed this the password, salt, plain data and - if separately configured- the IV.
[Edit] You should probably look for a library that uses Argon2 by now. PBKDF2 is still considered secure, but it does give unfair advantage to an attacker in some cases, letting the attacker perform fewer calculations than the regular user of the function. That's not a good property for a PBKDF / password hash.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With