Fixed Point and Floating Point Number Representations
Digital computers use the binary number system to represent all types of information
internally. Alphanumeric characters, like numbers, are encoded as sequences of binary
digits (bits, each either 0 or 1). Digital representations are easy to design and to store,
and they offer greater accuracy and precision.
Numbers can be written in several positional number systems, for example the binary,
octal, decimal, and hexadecimal number systems. Of these, the binary number system is
the most relevant for representing numbers in a digital computer system.
There are two major approaches to storing real numbers (i.e., numbers with a fractional
component) in modern computing: (i) fixed-point notation and (ii) floating-point
notation. In fixed-point notation, there is a fixed number of digits after the radix
point, whereas floating-point notation allows a varying number of digits after the
radix point.
Fixed-Point Representation −
This representation has a fixed number of bits (digits) for the integer part and for the
fractional part. For example, if the fixed-point format is IIII.FFFF, with four decimal
digits on each side of the point, then the smallest non-zero value you can store is
0000.0001 and the largest value is 9999.9999. A fixed-point number representation has
three parts: the sign field, the integer field, and the fractional field.
Example − Assume a number uses a 32-bit format that reserves 1 bit for the sign, 15 bits
for the integer part, and 16 bits for the fractional part.
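With the 32-bit format given above, the smallest positive number that can be stored is
$2^{-16} \approx 0.0000153$ and the largest positive number is
$2^{15} - 2^{-16} \approx 32767.99998$. Representable values are evenly spaced, with a gap
of $2^{-16}$ between neighbours. The position of the radix point is fixed by the format:
moving it left or right changes how the 31 magnitude bits are split between the integer
field and the fractional field.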
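The 1/15/16 layout from the example can be sketched in a few lines of Python. This is a minimal illustration that assumes a sign-magnitude encoding; the names `to_fixed`, `from_fixed`, `SCALE`, and `MAX_MAG` are hypothetical helpers and do not come from the original text.

```python
# Sketch of the 32-bit fixed-point layout: 1 sign bit, 15 integer bits,
# 16 fractional bits. Sign-magnitude encoding is an assumption made here.
INT_BITS = 15
FRAC_BITS = 16
SCALE = 1 << FRAC_BITS                        # 2**16 steps per unit
MAX_MAG = (1 << (INT_BITS + FRAC_BITS)) - 1   # largest 31-bit magnitude


def to_fixed(x):
    """Return (sign, magnitude), rounding x to the nearest multiple of 2**-16."""
    sign = 1 if x < 0 else 0
    mag = round(abs(x) * SCALE)
    if mag > MAX_MAG:
        raise OverflowError("value does not fit in 15 integer bits")
    return sign, mag


def from_fixed(sign, mag):
    """Convert (sign, magnitude) back to a Python float."""
    return (-1) ** sign * mag / SCALE


smallest = from_fixed(0, 1)         # 2**-16
largest = from_fixed(0, MAX_MAG)    # 2**15 - 2**-16
print(smallest, largest)
print(to_fixed(-1.5))               # (1, 98304)
```

Because every value is an integer multiple of 2**-16, the spacing between neighbouring values is the same everywhere in the range.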
Floating-Point Representation −
This representation does not reserve a specific number of bits for the integer part or the
fractional part. Instead, it reserves a certain number of bits for the number itself (called
the mantissa or significand) and a certain number of bits to say where within that number
the radix point sits (called the exponent).
The floating-point representation of a number has two parts. The first part is a signed
fixed-point number called the mantissa. The second part designates the position of the
decimal (or binary) point and is called the exponent. The fixed-point mantissa may be a
fraction or an integer. A floating-point number is always interpreted as a number of the
form $m \times r^{e}$, where r is the radix (base).
Only the mantissa m and the exponent e are physically represented in the register
(including their signs); the radix is implied and is not stored. A floating-point binary
number is represented in a similar manner, except that it uses base 2 for the exponent. A
floating-point number is said to be normalized if the most significant digit of the
mantissa is non-zero, which in binary means that it is 1.
So, the actual number is $(-1)^{s}(1+m) \times 2^{e-\text{Bias}}$, where s is the sign bit,
m is the mantissa, e is the exponent value, and Bias is the bias number.
Note that signed integers and exponents can in general be represented in sign-magnitude,
one’s complement, or two’s complement form. IEEE 754 formats instead store the exponent
in biased form, which is why Bias is subtracted in the formula.
The floating-point representation is more flexible. Any non-zero number x can be
written in the normalized form $\pm(1.b_1 b_2 b_3 \ldots)_2 \times 2^{n}$.
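As a rough illustration of normalization, the sketch below uses Python's standard `math.frexp` to rewrite a non-zero number as a significand in [1, 2) times a power of two. The function name `normalize` is a hypothetical helper introduced here, not something from the original text.

```python
import math


def normalize(x):
    """Return (sign, significand, n) such that
    x == (-1)**sign * significand * 2**n with 1 <= significand < 2."""
    if x == 0 or math.isinf(x) or math.isnan(x):
        raise ValueError("only finite non-zero numbers have a normalized form")
    # math.frexp gives abs(x) == frac * 2**exp with 0.5 <= frac < 1,
    # so shift one binary place to get a leading 1 before the point.
    frac, exp = math.frexp(abs(x))
    sign = 1 if x < 0 else 0
    return sign, frac * 2, exp - 1


print(normalize(6.5))        # (0, 1.625, 2)   since 6.5 = 1.625 * 2**2
print(normalize(-0.15625))   # (1, 1.25, -3)   since 0.15625 = 1.25 * 2**-3
```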
Example − Suppose a number uses a 32-bit format: 1 sign bit, 8 bits for the signed
exponent, and 23 bits for the fractional part. The leading bit 1 is not stored (as it is
always 1 for a normalized number) and is referred to as a “hidden bit”.
Note that the 8-bit exponent field is used to store integer exponents -126 ≤ n ≤ 127. The
two remaining bit patterns (all 0s and all 1s) are reserved for special values.
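To make the 1/8/23 layout concrete, the following sketch splits a single-precision value into its three fields with Python's standard `struct` module and rebuilds it from the formula. It assumes the IEEE 754 single-precision bias of 127, which matches the exponent range -126 to 127; the helper names `float32_fields` and `float32_value` are hypothetical.

```python
import struct


def float32_fields(x):
    """Split the single-precision encoding of x into (sign, exponent, fraction)."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    sign = bits >> 31                 # 1 bit
    exponent = (bits >> 23) & 0xFF    # 8 bits, stored in biased form
    fraction = bits & 0x7FFFFF        # 23 stored bits, hidden bit not included
    return sign, exponent, fraction


def float32_value(sign, exponent, fraction, bias=127):
    """Rebuild a normalized value as (-1)**s * (1 + m) * 2**(e - bias)."""
    m = fraction / 2 ** 23
    return (-1) ** sign * (1 + m) * 2.0 ** (exponent - bias)


s, e, f = float32_fields(-6.5)
print(s, e, hex(f))               # 1 129 0x500000
print(float32_value(s, e, f))     # -6.5
```

Here -6.5 is -1.625 × 2², so the stored exponent is 2 + 127 = 129 and the fraction field holds 0.625 × 2²³.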
The precision of a floating-point format is the number of positions reserved for binary
digits plus one (for the hidden bit). In the examples considered here the precision is
23+1=24.
The gap between 1 and the next normalized floating-point number is known as machine
epsilon. For the example above, the gap is $(1 + 2^{-23}) - 1 = 2^{-23}$. This is not the
same as the smallest positive floating-point number (the smallest positive normalized
value is $2^{-126}$), because floating-point numbers are non-uniformly spaced, unlike in
the fixed-point scenario.
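The spacing of single-precision numbers can be checked directly by stepping to the next bit pattern. This is a small sketch using only the standard `struct` module; the helper names `float32_to_bits` and `bits_to_float32` are hypothetical.

```python
import struct


def float32_to_bits(x):
    return struct.unpack(">I", struct.pack(">f", x))[0]


def bits_to_float32(bits):
    return struct.unpack(">f", struct.pack(">I", bits))[0]


def gap_above(x):
    """Distance from x to the next larger single-precision number (x > 0)."""
    return bits_to_float32(float32_to_bits(x) + 1) - x


epsilon = gap_above(1.0)                         # machine epsilon, 2**-23
gap_at_1024 = gap_above(1024.0)                  # 2**-13, a much wider gap
smallest_normal = bits_to_float32(0x00800000)    # exponent field 1, fraction 0

print(epsilon == 2 ** -23)            # True
print(gap_at_1024 == 2 ** -13)        # True
print(smallest_normal == 2 ** -126)   # True
```

The gap grows with the magnitude of the number, and the smallest positive normalized value is far smaller than machine epsilon.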
As stated above, the actual number is $(-1)^{s}(1+m) \times 2^{e-\text{Bias}}$. The sign bit
is 0 for a positive number and 1 for a negative number. In the IEEE 754 formats the
exponent is stored in biased form rather than in two’s complement: subtracting Bias from
the stored value e gives the true exponent.
Half Precision (16 bit): 1 sign bit, 5 bit exponent, and 10 bit mantissa
Single Precision (32 bit): 1 sign bit, 8 bit exponent, and 23 bit mantissa
Double Precision (64 bit): 1 sign bit, 11 bit exponent, and 52 bit mantissa
Quadruple Precision (128 bit): 1 sign bit, 15 bit exponent, and 112 bit mantissa
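The field widths listed above determine the bias, exponent range, and precision of each format. The sketch below derives them using the usual IEEE 754 convention that the bias is 2^(k-1) - 1 for a k-bit exponent field; the names `FORMATS` and `describe` are hypothetical and used only for illustration.

```python
# (exponent bits, mantissa bits) for each format in the list above
FORMATS = {
    "half": (5, 10),
    "single": (8, 23),
    "double": (11, 52),
    "quadruple": (15, 112),
}


def describe(exp_bits, frac_bits):
    """Derive basic parameters of a binary floating-point format."""
    bias = 2 ** (exp_bits - 1) - 1
    return {
        "total_bits": 1 + exp_bits + frac_bits,   # sign + exponent + mantissa
        "bias": bias,
        "min_exponent": 1 - bias,                 # smallest normalized exponent
        "max_exponent": bias,                     # largest normalized exponent
        "precision": frac_bits + 1,               # stored bits plus hidden bit
    }


for name, (exp_bits, frac_bits) in FORMATS.items():
    print(name, describe(exp_bits, frac_bits))
```

For single precision this gives a bias of 127, exponents from -126 to 127, and a precision of 24 bits.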
There are some special values in the IEEE 754 standard that depend on particular values
of the exponent and mantissa.
All exponent bits 0 with all mantissa bits 0 represents 0. If the sign bit is 0, then
+0, else -0.
All exponent bits 1 with all mantissa bits 0 represents infinity. If the sign bit is 0,
then +∞, else -∞.
All exponent bits 0 with non-zero mantissa bits represents a denormalized (subnormal)
number.
All exponent bits 1 with non-zero mantissa bits represents NaN (Not a Number), which
signals an invalid or undefined result.
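The special-value rules can be expressed as a small classifier over single-precision bit patterns. This is an illustrative sketch; `classify_float32` and `bits_of` are hypothetical helper names, and the 1/8/23 field layout is the single-precision one described earlier.

```python
import struct


def bits_of(x):
    """Return the 32-bit single-precision pattern of x as an integer."""
    return struct.unpack(">I", struct.pack(">f", x))[0]


def classify_float32(bits):
    """Name the kind of value encoded by a 32-bit pattern."""
    sign = "-" if (bits >> 31) & 1 else "+"
    exponent = (bits >> 23) & 0xFF
    fraction = bits & 0x7FFFFF
    if exponent == 0xFF and fraction != 0:
        return "NaN"                      # the sign carries no numeric meaning here
    if exponent == 0xFF:
        return sign + "infinity"
    if exponent == 0:
        return sign + ("zero" if fraction == 0 else "subnormal")
    return sign + "normal"


print(classify_float32(0x80000000))             # -zero
print(classify_float32(0x00000001))             # +subnormal
print(classify_float32(bits_of(float("inf"))))  # +infinity
print(classify_float32(0x7FC00000))             # NaN
```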
https://www.tutorialspoint.com/fixed-point-and-floating-point-number-representations