Chapter 9 Message Authentication and Hash Functions
Chapter 9 Message Authentication and Hash Functions
Chapter 9 Message Authentication and Hash Functions
A hash function H accepts a variable-length block of data M as input and produces a fixed-size hash
value h = H(M). A “good” hash function has the property that the results of applying the function to a
large set of inputs will produce outputs that are evenly distributed and apparently random. In general
terms, the principal object of a hash function is data integrity. A change to any bit or bits in M results,
with high probability, in a change to the hash value.
The kind of hash function needed for security applications is referred to as a cryptographic hash
function. A cryptographic hash function is an algorithm for which it is computationally infeasible
(because no attack is significantly more efficient than brute force) to find either
(a) a data object that maps to a pre-specified hash result (the one-way property) or
(b) two data objects that map to the same hash result (the collision-free property).
Because of these characteristics, hash functions are often used to determine whether or not data has
changed.
Message Authentication:
Figure below illustrates a variety of ways in which a hash code can be used to provide message
authentication, as follows.
a) The message plus concatenated hash code is encrypted using symmetric encryption. Because
only A and B share the secret key, the message must have come from A and has not been
altered. The hash code provides the structure or redundancy required to achieve authentication.
Because encryption is applied to the entire message plus hash code, confidentiality is also
provided.
b) Only the hash code is encrypted, using symmetric encryption. This reduces the processing
burden for those applications that do not require confidentiality.
c) It is possible to use a hash function but no encryption for message authentication. The technique
assumes that the two communicating parties share a common secret value S. A computes the
hash value over the concatenation of M and S and appends the resulting hash value to M.
Because B possesses S, it can recompute the hash value to verify. Because the secret value itself
is not sent, an opponent cannot modify an intercepted message and cannot generate a false
message.
d) Confidentiality can be added to the approach of method (c) by encrypting the entire message
plus
the hash code.
When confidentiality is not required, method (b) has an advantage over methods (a) and (d), which
encrypts the entire message, in that less computation is required. Nevertheless, there has been growing
interest in techniques that avoid encryption.
Several reasons for this interest are pointed out as:
• Encryption software is relatively slow. Even though the amount of data to be encrypted per
message is small, there may be a steady stream of messages into and out of a system.
• Encryption hardware costs are not negligible. Low-cost chip implementations of DES are
available, but the cost adds up if all nodes in a network must have this capability.
• Encryption hardware is optimized toward large data sizes. For small blocks of data, a high
proportion of the time is spent in initialization/invocation overhead.
• Encryption algorithms may be covered by patents, and there is a cost associated with licensing
their use.
The message authentication function is concerned with the types of functions that may be used to pro-
duce an authenticator. These may be grouped into three classes:
• Hash function: A function that maps a message of any length into a fixed- length hash value,
which serves as the authenticator.
• Message encryption: The ciphertext of the entire message serves as its authenticator.
• Message authentication code (MAC): A function of the message and a secret key that produces a
fixed-length value that serves as the authenticator.
Message Authentication Code (MAC):
A authentication technique that involves the use of a secret key to generate a small fixed-size block of
data that is appended to the message is known as Message Authentication Code (MAC). This technique
assumes that two communicating parties, say A and B, share a common secret key K. When A has a
message to send to B, it calculates the MAC as a function of the message and the key:
MAC = C (K, M)
where
M = input message
C = MAC function
The message plus MAC are transmitted to the intended recipient. The recipient performs the same
calculation on the received message, using the same secret key, to generate a new MAC. The received
MAC is compared to the calculated MAC.
If we assume that only the receiver and the sender know the identity of the secret key, and if the
received
MAC matches the calculated MAC, then:
• The receiver is assured that the message has not been altered. If an attacker alters the message
but does not alter the MAC, then the receiver’s calculation of the MAC will differ from the
received MAC. Because the attacker is assumed not to know the secret key, the attacker cannot
alter the MAC to correspond to the alterations in the message.
• The receiver is assured that the message is from the alleged sender. Because no one else knows
the secret key, no one else could prepare a message with a proper MAC.
• If the message includes a sequence number (such as is used with TCP), then the receiver can be
assured of the proper sequence because an attacker cannot successfully alter the sequence
number.
A MAC function is similar to encryption. One difference is that the MAC algorithm need not be
reversible, as it must be for decryption. In general, the MAC function is a many-to-one function. The
domain of the function consists of messages of some arbitrary length, whereas the range consists of all
possible MACs and all possible keys.
• A cryptographic hash function (CHF) is a hash function that is suitable for use in cryptography.
• It is a mathematical algorithm that maps data of arbitrary size (often called the "message") to a
bit string of a fixed size (the "hash value", "hash", or "message digest") and is a one-way
function, that is, a function which is practically infeasible to invert.
• Ideally, the only way to find a message that produces a given hash is to attempt a brute-force
search of possible inputs to see if they produce a match, or use a table of matched hashes.
• Cryptographic hash functions are a basic tool of modern cryptography.
Figure: Cryptographic Hash Function; h = H(M)
The ideal cryptographic hash function has the following main properties:
• it is deterministic, meaning that the same message always results in the same hash
• it is quick to compute the hash value for any given message
• it is infeasible to generate a message that yields a given hash value (Pre-Image Resistant)
• it is infeasible to find two different messages with the same hash value (Collision Resistance)
• A small change to a message should change the hash value so extensively that the new hash
value appears uncorrelated with the old hash value (avalanche effect)
Given that the hash depends on the input to the hash function and will change with the input hash
functions are used:
• to ensure that messages have not been tampered with (message authentication, digital
signatures, checksums) and
• to check for equality while preserving secrecy / efficiently (for example, checking if password is
correct without storing the actual password, or checking for duplicates in lists of large items).
• They are also used as proof-of-work (for example, in cryptocurrencies like bitcoin), error-
correcting codes, randomization and to make cryptographic algorithms more efficient.
There are two direct applications of hash function based on its cryptographic properties.
Password Storage
• Instead of storing password in clear, mostly all logon processes store the hash values of
passwords in the file.
• The Password file consists of a table of pairs which are in the form (user id, h(P)).
• The process of logon is depicted in the following illustration:
• An intruder can only see the hashes of passwords, even if he accessed the password. He can
neither logon using hash nor can he derive the password from hash value since hash
function possesses the property of pre-image resistance.
Data integrity check is a most common application of the hash functions. It is used to generate
the checksums on data files. This application provides assurance to the user about correctness of
the data.
The process is depicted in the following illustration:
The integrity check helps the user to detect any changes made to original file. It however, does
not provide any assurance about originality. The attacker, instead of modifying file data, can
change the entire file and compute all together new hash and send to the receiver. This integrity
check application is useful only if the user is sure about the originality of file.