MD5 & SHA

With growing public awareness of digital privacy, it’s no surprise to see increasing interest in
cryptographic algorithms and cybersecurity. The MD5 algorithm was one of the first hashing
algorithms to gain worldwide adoption, arriving as the successor to the MD4 algorithm. Despite
the security vulnerabilities discovered since then, MD5 remains part of the data infrastructure in
a multitude of environments.

Before diving headfirst into the main topic, it is best to review the basic concept of hashing.

What Is Hashing?

Hashing converts a piece of data of any length into a fixed-size value that looks nothing like the
input. The data is scrambled so thoroughly that the hashed value bears no visible resemblance to
the original.

Hashing uses a hash function to convert ordinary data into an unrecognizable format. A hash
function is a fixed sequence of mathematical operations that transforms the original information
into its hashed value, known as the hash digest or simply the digest. The digest size is always the
same for a particular hash function such as MD5 or SHA-1, irrespective of input size.
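
The fixed digest size can be checked with a minimal sketch using Python's standard `hashlib` module. The helper name `digest_bits` is an illustrative assumption, not something from the text above.

```python
import hashlib

def digest_bits(algorithm: str, data: bytes) -> int:
    """Return the digest length in bits for the given algorithm and input."""
    return len(hashlib.new(algorithm, data).digest()) * 8

# Inputs of very different sizes give the same digest size per algorithm.
for name in ("md5", "sha1"):
    sizes = {digest_bits(name, b"x" * n) for n in (0, 1, 1000)}
    print(name, sizes)  # md5 {128}, then sha1 {160}
```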

Hashing has two primary use cases:

Password Verification

It is common to store website user credentials in hashed form so that third parties who obtain the
stored data cannot read the passwords directly. Since a hash function always produces the same
output for the same input, the server can verify a login by comparing digests without ever
storing the plaintext password.

The entire process is as follows:

1. A user signs up to the website with a new password
2. The website passes the password through a hash function and stores the digest on the server
3. When the user tries to log in, they enter the password again
4. The website passes the entered password through the hash function again to generate a digest
5. If the newly generated digest matches the one on the server, the login is verified
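
The five steps above could be sketched roughly as follows. This is an illustration only: the `users` dictionary stands in for the server's storage, the function names are hypothetical, and SHA-256 is used without a salt purely to mirror the steps. Real systems typically add a per-user salt and a deliberately slow password-hashing function.

```python
import hashlib
import hmac

# Hypothetical in-memory stand-in for the server's credential store.
users = {}

def hash_password(password: str) -> str:
    """Pass the password through a hash function and return the hex digest."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()

def sign_up(username: str, password: str) -> None:
    # Steps 1-2: only the digest is stored, never the password itself.
    users[username] = hash_password(password)

def log_in(username: str, password: str) -> bool:
    # Steps 3-5: hash the entered password and compare with the stored digest.
    stored = users.get(username)
    if stored is None:
        return False
    # compare_digest performs the comparison in constant time.
    return hmac.compare_digest(stored, hash_password(password))

sign_up("alice", "correct horse")
print(log_in("alice", "correct horse"))  # True
print(log_in("alice", "wrong guess"))    # False
```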

Integrity Verification

Files can be checked for data corruption using hash functions. As in the scenario above, a hash
function always gives the same output for the same input, no matter how many times it is
computed, so any change to the file shows up as a different digest.

The entire process follows this order:

1. A user uploads a file to the internet
2. The uploader also publishes the file's hash digest alongside it
3. When another user downloads the file, they recalculate the hash digest
4. If the recalculated digest matches the published one, file integrity is maintained
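
A minimal sketch of the download check, assuming the published digest is available as a hex string. The function names and the chunk size are illustrative choices, not part of the text above.

```python
import hashlib

def file_digest(path: str, algorithm: str = "md5", chunk_size: int = 65536) -> str:
    """Hash a file in chunks so large files need not fit in memory."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def verify_download(path: str, published_digest: str, algorithm: str = "md5") -> bool:
    """Recalculate the digest and compare it with the published one."""
    return file_digest(path, algorithm) == published_digest.strip().lower()
```

A caller would pass the path of the downloaded file and the digest shown next to the download link; a `False` result means the file differs from what was published.
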

What Is the MD5 Algorithm?

MD5 (Message-Digest Algorithm 5) is a cryptographic hash algorithm that generates a 128-bit
digest from a string of any length. The digests are represented as 32-digit hexadecimal numbers.

Ronald Rivest designed the algorithm in 1991 to provide a means of digital signature
verification. It was later integrated into many other frameworks and protocols as a security
building block.

The digest size is always 128 bits, and by design a minor change in the input string produces a
drastically different digest. This behavior makes it hard to find inputs with related digests and
helps keep hash collisions, where two different inputs produce the same digest, as unlikely as
possible.
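
The effect of a small input change can be observed with a short sketch. The helper `differing_bits` and the two sample strings are illustrative assumptions.

```python
import hashlib

def differing_bits(a: bytes, b: bytes) -> int:
    """Count how many of the 128 digest bits differ between two inputs."""
    da = int.from_bytes(hashlib.md5(a).digest(), "big")
    db = int.from_bytes(hashlib.md5(b).digest(), "big")
    return bin(da ^ db).count("1")

first, second = b"hello world", b"hello worle"  # differ in one character
print(hashlib.md5(first).hexdigest())   # 32 hexadecimal digits
print(hashlib.md5(second).hexdigest())  # a very different 32 digits
print(differing_bits(first, second), "of 128 bits differ")
```

For a well-mixed hash, roughly half of the digest bits are expected to change on average.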

Importance of MD5 Hash Algorithm in Cryptography

• The MD5 algorithm is a cryptographic hash function that produces a 128-bit (16-byte) hash
value from any given input.
• In cryptography, MD5 was intended to support data integrity and authenticity by generating,
for practical purposes, distinct hash values for distinct data inputs.
• It converts arbitrary-sized data into a fixed-size 128-bit hash, which made it attractive for
applications like digital signatures, certificate generation, and data integrity verification.
• By producing a consistent hash for the same input and different hashes for even minor
changes in input, MD5 helps detect data corruption and tampering.
• However, due to vulnerabilities like collision attacks, where different inputs produce the
same hash, MD5 has been superseded by more secure algorithms like SHA-256 for critical
cryptographic applications.

How Does the MD5 Algorithm Work?

The MD5 algorithm works by padding the input, appending its length, initializing variables,
processing the message in 512-bit blocks, and producing the final hash.

Step 1: Padding the Input

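• The first step in the MD5 algorithm is padding the input message so that its length (in
bits) is congruent to 448 modulo 512.
• This is done by appending a single '1' bit followed by enough '0' bits to reach the required
length, leaving exactly 64 bits free for the length field so that the total message length
becomes a multiple of 512 bits. Padding is always applied, even if the message length is already
congruent to 448 modulo 512.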
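
A minimal sketch of this padding step for byte-oriented input. The function name is an illustrative assumption; the byte `0x80` is the single '1' bit followed by seven '0' bits, and 448 bits corresponds to 56 bytes modulo 64.

```python
def pad_to_448_mod_512(message: bytes) -> bytes:
    """Append a '1' bit, then '0' bits, until the length is 448 mod 512 bits."""
    padded = message + b"\x80"        # the '1' bit plus seven '0' bits
    while len(padded) % 64 != 56:     # 56 bytes == 448 bits
        padded += b"\x00"
    return padded

print(len(pad_to_448_mod_512(b"abc")) * 8)       # 448
print(len(pad_to_448_mod_512(b"a" * 56)) * 8)    # 960, a whole extra block
```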

Step 2: Appending the Length

• After padding, the length of the original message in bits (before padding) is appended as a
64-bit value, low-order bytes first.
• Because the original message length is embedded in the hash input, messages of different
lengths cannot end up with identical padded input.
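
The length field could be appended as in the sketch below, which assumes the Step 1 padding has already been applied. The function name is hypothetical; `"<Q"` packs an unsigned 64-bit integer in little-endian order.

```python
import struct

def append_length(padded: bytes, original_len_bytes: int) -> bytes:
    """Append the original message length in bits as a 64-bit little-endian value."""
    bit_length = (original_len_bytes * 8) % 2**64  # only the low 64 bits are kept
    return padded + struct.pack("<Q", bit_length)

message = b"abc"
padded = message + b"\x80" + b"\x00" * 52   # Step 1 output for "abc": 56 bytes
block = append_length(padded, len(message))
print(len(block) * 8)  # 512
```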

Step 3: Initializing Variables

MD5 uses four 32-bit variables, which are initialized to specific constants. These variables, often
denoted as A, B, C, and D, are set to the following values in hexadecimal:

• A = 0x67452301
• B = 0xefcdab89
• C = 0x98badcfe
• D = 0x10325476
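
These constants look arbitrary, but a short sketch shows the pattern: stored low-order byte first, they spell out the byte sequence 01 23 45 67 89 ab cd ef fe dc ba 98 76 54 32 10.

```python
import struct

# Initial values of the four 32-bit state variables.
A, B, C, D = 0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476

# "<4I" packs four unsigned 32-bit integers in little-endian byte order.
state_bytes = struct.pack("<4I", A, B, C, D)
print(state_bytes.hex())  # 0123456789abcdeffedcba9876543210
```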

Step 4: Processing in 512-bit Blocks

• The padded message is processed in 512-bit blocks, each divided into sixteen 32-bit words.
• The main algorithm operates on each block in four rounds of 16 operations each, totaling
64 operations.
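
Splitting the padded message into blocks and words might look like the following sketch. The generator name is an illustrative assumption; MD5 reads each 32-bit word in little-endian order.

```python
import struct

def blocks_of_words(padded: bytes):
    """Yield each 512-bit block as a tuple of sixteen 32-bit little-endian words."""
    if len(padded) % 64 != 0:
        raise ValueError("padded message must be a multiple of 512 bits")
    for offset in range(0, len(padded), 64):
        yield struct.unpack("<16I", padded[offset:offset + 64])

sample = bytes(range(64)) * 2            # two 512-bit blocks of sample data
blocks = list(blocks_of_words(sample))
print(len(blocks), len(blocks[0]))       # 2 16
print(hex(blocks[0][0]))                 # 0x3020100
```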

Step 5: Main Loop

• The core of the MD5 algorithm involves four non-linear functions (F, G, H, and I) and
four rounds of transformation.
• Each function takes three 32-bit words as input and produces a 32-bit output. The
operations are performed as follows:
1) Round 1: Uses the function F(B, C, D) = (B & C) | ((~B) & D)
2) Round 2: Uses the function G(B, C, D) = (B & D) | (C & (~D))
3) Round 3: Uses the function H(B, C, D) = B ⊕ C ⊕ D
4) Round 4: Uses the function I(B, C, D) = C ⊕ (B | (~D))
• The algorithm performs a series of bitwise operations, modular additions, and left
rotations in each round.
• Each operation modifies one of the four variables (A, B, C, D) using a different word
from the block and a constant derived from the sine function.
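
The four round functions and the sine-derived constants can be sketched as below. The names `MASK`, `K`, and `rotl` are illustrative; the constant for operation i (counting from 0) is taken as the integer part of 2^32 × |sin(i + 1)|, with the angle in radians. Python integers are unbounded, so results are masked to 32 bits.

```python
import math

MASK = 0xFFFFFFFF  # keep every result within 32 bits

def F(b, c, d): return ((b & c) | (~b & d)) & MASK   # round 1: b selects c or d
def G(b, c, d): return ((b & d) | (c & ~d)) & MASK   # round 2: d selects b or c
def H(b, c, d): return (b ^ c ^ d) & MASK            # round 3: parity
def I(b, c, d): return (c ^ (b | (~d & MASK))) & MASK  # round 4

def rotl(x, n):
    """Rotate a 32-bit value left by n bits."""
    return ((x << n) | (x >> (32 - n))) & MASK

# One additive constant per operation, derived from the sine function.
K = [int(abs(math.sin(i + 1)) * 2**32) & MASK for i in range(64)]

print(hex(K[0]))                  # 0xd76aa478
print(hex(rotl(0x80000001, 1)))   # 0x3
```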

Step 6: Producing the Final Hash

• After all the 512-bit blocks have been processed, the final hash value is produced by
concatenating the variables A, B, C, and D, each written low-order byte first.
• The resulting 128-bit value is the MD5 hash of the input message.
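
Putting the six steps together, a compact educational sketch of the whole algorithm might look like this. It is not an optimized or hardened implementation; the per-operation rotation amounts in `S` and the word-selection formulas follow the MD5 specification (RFC 1321), and the function name `md5_hex` is an illustrative choice. In practice, a library routine such as `hashlib.md5` would be used instead.

```python
import math
import struct

MASK = 0xFFFFFFFF
# Left-rotation amounts for the 64 operations (four rounds of 16).
S = [7, 12, 17, 22] * 4 + [5, 9, 14, 20] * 4 + [4, 11, 16, 23] * 4 + [6, 10, 15, 21] * 4
# Sine-derived additive constants.
K = [int(abs(math.sin(i + 1)) * 2**32) & MASK for i in range(64)]

def rotl(x, n):
    return ((x << n) | (x >> (32 - n))) & MASK

def md5_hex(message: bytes) -> str:
    # Steps 1-2: pad to 448 mod 512 bits, then append the 64-bit bit length.
    padded = message + b"\x80" + b"\x00" * ((55 - len(message)) % 64)
    padded += struct.pack("<Q", (len(message) * 8) % 2**64)
    # Step 3: initialize the state.
    a0, b0, c0, d0 = 0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476
    # Step 4: process each 512-bit block as sixteen little-endian words.
    for offset in range(0, len(padded), 64):
        m = struct.unpack("<16I", padded[offset:offset + 64])
        a, b, c, d = a0, b0, c0, d0
        # Step 5: 64 operations, 16 per round.
        for i in range(64):
            if i < 16:
                f, g = (b & c) | (~b & d), i
            elif i < 32:
                f, g = (b & d) | (c & ~d), (5 * i + 1) % 16
            elif i < 48:
                f, g = b ^ c ^ d, (3 * i + 5) % 16
            else:
                f, g = c ^ (b | (~d & MASK)), (7 * i) % 16
            f = (f + a + K[i] + m[g]) & MASK
            a, d, c, b = d, c, b, (b + rotl(f, S[i])) & MASK
        # Add this block's result into the running state.
        a0, b0 = (a0 + a) & MASK, (b0 + b) & MASK
        c0, d0 = (c0 + c) & MASK, (d0 + d) & MASK
    # Step 6: concatenate A, B, C, D in little-endian byte order.
    return struct.pack("<4I", a0, b0, c0, d0).hex()

print(md5_hex(b""))     # d41d8cd98f00b204e9800998ecf8427e
print(md5_hex(b"abc"))  # 900150983cd24fb0d6963f7d28e17f72
```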

Applications of MD5 Algorithm

• Data Integrity

Users can ensure that data has not been altered or corrupted in transit by generating an MD5
hash of a file or piece of data before transmission and comparing it with the hash generated
after transmission.

• Digital Signatures

MD5 has been used to compute the digest of a message or document, which a signature scheme
then signs so that recipients can verify the message's integrity.

• Certificate Generation and Verification

In Public Key Infrastructure (PKI) systems, MD5 has been used in generating and verifying
digital certificates. When a certificate authority (CA) issues a certificate, it computes a hash of
the certificate data and signs it, and the resulting signature is included in the certificate. This
allows the certificate's integrity to be verified when it is used or shared.

• Password Storage

MD5 has been historically used to hash passwords before storing them in databases. By hashing
passwords, systems can store the hash instead of the plain text password, providing an extra layer
of software security.

• Checksums and File Integrity

MD5 is often used to create checksums for files. A checksum is a small piece of data derived
from digital data to detect errors introduced during transmission or storage.

• Verifying Software and Digital Content

MD5 hashes are often used to verify the integrity of software distributions and digital content.

• Detecting Duplicate Files

The MD5 algorithm can be used in applications that detect duplicate files by generating and
comparing the MD5 hashes of files.

• Malware Detection

MD5 hashes can also be used in cybersecurity to detect malware. Security researchers and
antivirus software providers maintain databases of known malware hashes. By comparing the
MD5 hash of a suspicious file to the database, it is possible to identify known malware quickly.

• Forensic Investigations

In digital forensics, MD5 hashes are used to create hash values of digital evidence to ensure its
integrity. When evidence is collected, an MD5 hash is generated and recorded.

• Version Control Systems

In some version control systems, MD5 hashes are used to identify specific revisions or versions
of files. This allows for efficient change tracking and ensures that specific versions can be
retrieved accurately.
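
Several of the applications above, such as duplicate detection and lookups against a list of known hashes, come down to using the digest as a compact identifier for content. The sketch below illustrates that idea on in-memory data; the function names, the sample files, and the `known_bad` set are all hypothetical.

```python
import hashlib
from collections import defaultdict

def md5_of(content: bytes) -> str:
    return hashlib.md5(content).hexdigest()

def find_duplicates(files: dict) -> list:
    """Group file names whose contents share the same MD5 digest."""
    groups = defaultdict(list)
    for name, content in files.items():
        groups[md5_of(content)].append(name)
    return [sorted(names) for names in groups.values() if len(names) > 1]

def is_known(content: bytes, known_hashes: set) -> bool:
    """Check a file's digest against a set of previously recorded digests."""
    return md5_of(content) in known_hashes

# Made-up sample data: names mapped to file contents.
files = {"a.txt": b"same bytes", "b.txt": b"same bytes", "c.txt": b"other bytes"}
print(find_duplicates(files))  # [['a.txt', 'b.txt']]

known_bad = {md5_of(b"pretend malicious payload")}
print(is_known(b"pretend malicious payload", known_bad))  # True
print(is_known(b"harmless file", known_bad))              # False
```

Because MD5 collisions can be constructed deliberately, a match is best treated as strong evidence against accidental differences rather than as proof against a determined attacker.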
