Padding (cryptography)

From HORSE - Holistic Operational Readiness Security Evaluation.
Jump to navigation Jump to search

Padding

In cryptography, padding refers to a number of distinct practices.

Classical Cryptography

Official messages often start and end in predictable ways: My dear ambassador, Weather report, Sincerely yours, etc. The primary use of padding with classical ciphers is to prevent the cryptanalyst from using that predictability to find cribs that aid in breaking the encryption. Random length padding also prevents an attacker from knowing the exact length of the plaintext message.

Many classical ciphers arrange the plaintext into particular patterns (e.g., squares, rectangles, etc) and if the plaintext doesn't exactly fit, it is often necessary to supply additional letters to fill out the pattern. Using nonsense letters for this purpose has a side benefit of making some kinds of cryptanalysis more difficult.

Symmetric Cryptography

Hash Functions

All modern cryptographic hash functions process messages in fixed-length blocks. Padding is appended to the final block in a predictable way that includes the total length of the message; this padding ensures that the final block is the right length, and is a key part of the security proof for this way of building hash functions, which is known as the Merkle-Damgård construction.

CBC mode

Cipher-block chaining (CBC) mode is a popular block cipher mode of operation. It requires messages whose length is a multiple of the block size (typically 8 or 16 bytes), so messages have to be padded to bring them to this length. One method is to fill out the last block with a 1-bit followed by zero bits. If the input happens to fill up an entire block, another block is added to accommodate the padding; otherwise, the end of the input plaintext might be misinterpreted as padding. Another method is to append n bytes with value (n-1) to the end of the plaintext to make its length divisible by 8/16. If it is already divisible by 8/16, then for the same reasons as before, a new 8/16 long new block is added. This means the padding is either one byte representing 0, or two bytes representing 1 etc.

Padding Methods

Two simple ways of padding a message are:

Bit Padding

This is described in RFC1321.

A single set ('1') bit is added to the message and then as many reset ('0') bits as required are added. The number of reset ('0') bits added will depend on the block boundary to which the message needs to be extended. In bit terms this is "1000 ... 0000", in hex byte terms this is "80 00 ... 00 00".

This method can be used to pad messages which are any number of bits long, not necessarily a whole number of bytes long.

Byte Padding

This is described in RFC1319.

Padding is in whole bytes. The value of each added byte is the number of bytes that are added, i.e. N bytes, each of value N are added. The number of bytes added will depend on the block boundary to which the message needs to be extended.

The padding will be one of:

 01
 02 02
 03 03 03
 04 04 04 04
 05 05 05 05 05
 etc.

This method can only be used to pad messages which are a whole number of bytes long.

Both forms of padding can easily be removed from the decrypted cyphertext to give the original plaintext.

Public Key Cryptography

In public key cryptography, padding is the process of preparing a message for encryption or signing with a primitive such as RSA. A popular example is OAEP. This is called "padding" because originally, random material was simply appended to the message to make it long enough for the primitive, but this is not a secure form of padding and is no longer used. A modern padding scheme aims to ensure that the attacker cannot manipulate the plaintext to exploit the mathematical structure of the primitive and will usually be accompanied by a proof, often in the random oracle model, that breaking the padding scheme is as hard as solving the hard problem underlying the primitive.

Traffic Analysis

Even if perfect cryptographic routines are used, the attacker can gain knowledge of the amount of traffic that was generated. The attacker might not know what A and B were talking about, but can know that they were talking and how much they talked. In certain circumstances this can be very bad. Consider for example when a military is organising a secret attack against another nation: it suffices to know for the other nation that there is a lot of secret activity going on to can alert them.

Padding messages is a way to make it harder to do traffic analysis. Normally, a number of random bits are appended to the end of the message with an indication at the end how much this random data is. The randomness should have a minimum value of 0, a maximum number of N and an even distribution between the two extremes. Note, that increasing 0 does not help, only increasing N helps, though that also means that a lower percentage of the channel will be used to transmit real data. Also note, that since the cryptographic routine is assumed to be uncrackable (otherwise the padding length itself is crackable), it does not help to put the padding anywhere else, e.g. at the beginning, in the middle, or in a sporadic manner. For the same reason, padding can be structured (e.g. it can simply be a set of zeros) - though structured padding can be hazard, as explained in timing attack.

See Also