This document provides a project report on implementing a modified Booth encoding algorithm to reduce the computation time of two's complement multipliers. It describes how the modified Booth encoding scheme can decrease the maximum height of the partial product array from n/2+1 rows to n/2 rows, allowing faster compression without increasing delay. The project uses Verilog code with Cadence tools to synthesize and simulate an 8-16 bit radix-4 Booth encoded multiplier. Simulation results show that this technique can help design low power, high speed multipliers that are important for applications like digital signal processing.
This document provides a project report on implementing a modified Booth encoding algorithm to reduce the computation time of two's complement multipliers. It describes how the modified Booth encoding scheme can decrease the maximum height of the partial product array from n/2+1 rows to n/2 rows, allowing faster compression without increasing delay. The project uses Verilog code with Cadence tools to synthesize and simulate an 8-16 bit radix-4 Booth encoded multiplier. Simulation results show that this technique can help design low power, high speed multipliers that are important for applications like digital signal processing.
This document provides a project report on implementing a modified Booth encoding algorithm to reduce the computation time of two's complement multipliers. It describes how the modified Booth encoding scheme can decrease the maximum height of the partial product array from n/2+1 rows to n/2 rows, allowing faster compression without increasing delay. The project uses Verilog code with Cadence tools to synthesize and simulate an 8-16 bit radix-4 Booth encoded multiplier. Simulation results show that this technique can help design low power, high speed multipliers that are important for applications like digital signal processing.
This document provides a project report on implementing a modified Booth encoding algorithm to reduce the computation time of two's complement multipliers. It describes how the modified Booth encoding scheme can decrease the maximum height of the partial product array from n/2+1 rows to n/2 rows, allowing faster compression without increasing delay. The project uses Verilog code with Cadence tools to synthesize and simulate an 8-16 bit radix-4 Booth encoded multiplier. Simulation results show that this technique can help design low power, high speed multipliers that are important for applications like digital signal processing.
K L University Department of Electronics & Communication Engineering Year: 2013
i
Modified Booth Encoding Algorithm Mini Project report submitted in partial fulfilment of the requirement for the award of the Degree of B.Tech By A.SREE MADHURI 11004264
M.UDAYA BHANU 11004264
P.VINEELA 11004287 K L University Department of Electronics & Communication Engineering
ii Certificate
This is to certify that the project report entitled MODIFIED BOOTH ENCODING ALGORITHM being submitted by Miss A.SREE MADHURI (1100264), Miss M.UDAYA BHANU (11004278), Miss P.VINEELA (11004287), and Mr DAYANAND (11004304). In partial fulfilment for the award of the Degree of Bachelor of Technology in ECE to the KL University is a record of benefited work carried out by them under my guidance and supervision.
The results embodied in this project report have not been submitted to any other University or Institute for the award of any Degree or Diploma.
Mr. G.V.Ganesh Dr.K.S. Ramesh Mr. PranobKumar Dr.ASCS Sastry Internal Guide Miniproject-coordinator Academic Coordinator (HOD-Dept of ECE)
iii ACKNOWLEDGEMENT The successful completion of any task is not possible without proper suggestion, guidance and environment. Combination of these three factors acts like backbone to my project titled MODIFIED BOOTH ENCODING ALGORITHM. We express our sincere thanks to our guide, G.V.Ganesh Asst. Professor, Department of ECE, for her valuable suggestions during our course period, timely help, guidance and providing us with the most essential materials required for the completion of this work. We are thankful to all teaching and non-teaching staff of the Department of ECE for the cooperation given for the successful completion of my project. We would like to thank our Head of Department, Dr A.S.C.S.SASTRY sir for providing support and simulating environment. We would like to express my gratitude to the Management of KL UNIVERSITY for providing me a pleasant environment.
iv
ABSTRACT
Multipliers are used for high performance embedded cores and in all multiplier designs. In conventional twos complement multiplier the main problem is that it requires more computation time. In this paper computation time of the Twos complement multiplier is reduced by decreasing the maximum height of the partial product array by one row in a radix-4 Modified Booth Encoded Multiplier. The classic twos complement (nxn) bit multiplier using the radix-4 MBE scheme generates a partial product(PP) array with a maximum height of n/2+1 rows, here we are going to reduce the maximum height of PP array to n/2. This technique allows for faster compression of the partial product array without any increase in the delay and can be extended to higher radix encodings. This technique mostly relies on circuit optimization and minimization of the critical paths. We are using Cadence RTL compiler for synthesize report and Model simulator for simulation results. Short bit- width (8-16 bits) twos complement multipliers with single-cycle throughput and latency have emerged and become very important building blocks for high-performance embedded processors and DSP execution cores. In this case, the multiplier must be highly optimized to fit within the required cycle time and power budgets. Using these multipliers we can save cost (time and area) for adding partial products. Lower power consumption is there in this case of radix-4 multiplier because it is high speed parallel multiplier. It is used in multi-media and communication systems. As the multiplier and accumulator are the essential elements of the digital signal processing such as filtering, convolution, this algorithm is efficient.
KEY WORDS: Multiplier, Modified Booth Algorithm, Compiler, Power consumption.
Multipliers play an important role in todays digital signal processing and various other applications. With advances in technology, many researchers have tried and are trying to design multipliers which offer either of the following design targets high speed, low power consumption, regularity of layout and hence less area or even combination of them in one multiplier thus making them suitable for various high speed, low power and compact VLSI implementation. The common multiplication method is add and shift algorithm. In parallel multipliers number of partial products to be added is the main parameter that determines the performance of the multiplier. To reduce the number of partial products to be added, Modified Booth algorithm is one of the most popular algorithms. To achieve speed improvements Wallace Tree algorithm can be used to reduce the number of sequential adding stages. The Modified Booth Multiplier was proposed by O. L. Macsorley in 1961. The recoding method is widely used to generate the partial products for implementation of large parallel multipliers, which adopts the parallel encoding scheme. One of the solutions of realizing high speed multipliers is to enhance parallelism which helps to decrease the number of subsequent stages.
The majority of the steps involved calculating the set of partial product and summing the partial product together. The technique for multiplying decimal number is on the basis of computing the partial products, shifting them to the left and then adding them together. The first stage of majority of multipliers involves making the partial products which is an array of AND gates. For partial product generation, an n-bit by n-bit multiplier needs n2AND gates. The most difficult part is to get the partial products, as it involves multiplying the long number by one digit. The technique is very slow since it involves several intermediate additions. The multiplier bits are categorized into groups of s bits, called selection groups in s-bit selection. By multiplying the ith selection group Si with the multiplicand N, the ith partial product Pi is received, and shifting it to the same position x as The selection group, these multiplications takes lot of time. The sign with a separate rule generally in the 2s complement representation is the second issue. That compels the multiplication process to be adapted to manage 2s complement numbers, and that perplexes the process a little more. For the multiplication process it takes more time and it consumes high power, delay increased and switching activity also increased .More and more sophisticated signal processing systems are being used on a VLSI chip, as the scale of integration remains developing. The signal Processing applications not only need great computation capacity but also eat up substantial amount of energy. Power consumption has become a vital concern in todays VLSI system Design, while performance and area stay to be the two key design tolls. From two important Forces, the demand for low-power VLSI system wakes up. In portable devices, the low power design straightly leads to prolonged operation time. Multiplication is a fundamental operation. Multipliers have large area, long latency and consume substantial power.
So, in low power VLSI system design, the low power multiplier design has been a significant part. There has been widespread work on low-power multipliers at technology, physical, circuit and logic levels. Commonly, the performance of the multiplier decides the systems performance, because the slowest element in the system generally is the multiplier. And also, generally it is the most area consuming. Thus, a major design issue is the optimizing the power and area of the multiplier. Nevertheless, area and power are normally contradicting constraints so that improving power results mainly in low areas.
Two disadvantages were there in the Original version of Booths multiplier (Radix-2). Many add / subtract operations became changeable and thus became inconvenient while deciding multipliers. When there are separated 1s, the algorithm becomes ineffective. By means of Radix 4 Booths algorithm these issues are overcome which can scan strings of three bits with the encoding algorithm. The design of low power multiplier contains encoding, partial product generators, bypassing and finally an adder. To reduce the power by decreasing the number of partial products, this low power multiplier method is implemented. To produce the product by multiplying the multiplicand A by 0, 1, -1, 2 or -2, the partial products generator is designed. For product generator, multiply by zero intends the multiplicand is multiplied by 0. Multiply by 1 intends the product still keeps the same as the multiplicand value. Multiply by -1 intends that the product is the 2s complement form of the number. Multiply by -2 means to shift left one bit the 2s complement of the multiplicand value and multiply by 2 intends just shift left the multiplicand by one place. In these multipliers, many techniques are implemented to diminish the power dissipation. Tree multiplication is implemented for this multiplication process for high speed multiplication process. In microprocessor and digital signal processors, the digital multiplier is an arithmetic unit and implemented for emerging media processors. Before and after every partial product were empty gaps without any bits to be added. It would be tough to create a common summation network for such circuits which works with all possible partial product generators.
Multimedia and DSP applications are extremely multiplication intensive so that the performance and power consumption of these systems are ruled by multipliers. To create several partial products, the calculations of the multipliers manipulate two input data for addition operations, which in the CMOS circuit design needs several switching activities. Hence, switching activity within the functional unit needs for most of power consumption. For the intention of power reduction technique implemented as the kernel operator is applicable exact data path of video and audio codec technique the common multipliers can be simplified to a network of shift, adders and subtract or also decrease the switching quality of the following signal. To decrease the number of partial product in the system is one of the best ways to decrease the switching activity in resulting in less power consumption. To decrease the switching activity of the system, the different kinds of multiplication methods are implemented. To decrease the power consumption of the system, the ultra low power multiplier is implemented. It implements positive feedback charge distributing logic and 30% of power and the switching activity of the multiplier has been decreased. A power efficient 16 times 16 Configurable Booth Multiplier was presented in that supported single 16-bit, single 8-bit, or twin parallel 8-bit multiplication operations.
To increase the probability of partial product becoming zero, the Booth encoding is implemented, which contributes in reduction of the redundant switching activities and the total power consumed by the configurable Booth multiplier. The recoding method which was introduced in Booth decoder increased the number of zeros in multiplicand. Such that, the number of switching activity were minimized resulting in reduced power consumption. Modified Booth encoding is most often used to avoid variable size partial product arrays.
CHAPTER-2
SOFTWARE AND HARDWARE REQUIREMENTS
SOFTWARE REQUIREMENTS
1. Xilinx ISE design suite 2. Modelsim Altera 3. Digilent Adept 2.15.3
Xilinx ISE (Integrated Software Environment) is a software tool produced by Xilinx for synthesis and analysis of HDL designs, enabling the developer to synthesize ("compile") their designs, perform timing analysis, examine RTL diagrams, simulate a design's reaction to different stimuli, and configure the target device with the programmer. Along with Xilinx ISE Design suite modelsim is used as a simulator to show the simulated wave form analysis. After loading the required file from work to the simulator it generates the waveforms. Digilent Adept is a unique and powerful solution which allows you to communicate with Digilent system boards and a wide assortment of logic devices.
HARDWARE REQUIREMENTS NEXYS 2
The Nexys-2 is a powerful digital system Design platform built around a Xilinx Spartan 3E FPGA. With 16Mbytes of fast SDRAM and 16Mbytes of Flash ROM, the Nexys-2 is ideally Suited to embedded processors like Xilinx's 32-bit RISC Micro blaze. The on-board high-speed USB2 port, together with a collection of I/O devices, data ports, and expansion connectors, allow a wide range of designs to be completed without the need for any additional components.
Fig:- 2.1
CHAPTER-3
REVIEW OF LITERATURE
MULTIPLIER
A binary multiplier is an electronic circuit used in digital electronics, such as a computer, to multiply two binary numbers. The multiplier is a device which is used to perform the multiplication operation. The adders and multipliers are the basic components of designing of communication circuits. The multiplier is the basic key component of any digital signal processing system. Multiplication includes two basic steps, generation of partial products and their accumulation. Multipliers are key components of many high performance systems such as FIR filters, microprocessors, digital signal processors and multimedia applications. The type of the multiplier used for an application is based upon the requirements of the application. There are different types of multiplier available according to the requirements. The difference is in the way in which data is processed for the multiplication, the examples are serial multipliers, parallel multipliers and serial-parallel multipliers. Using VHDL language a single component can be described using all three styles of modelling.
In several of the digital circuits, the multiplier is mainly implemented. To use the digital multiplier, various methods can be implemented. The majority of the steps involved calculating the set of partial product and summing the partial product together. The technique for multiplying decimal number is on the basis of computing the partial products, shifting them to the left and then adding them together. The first stage of majority of multipliers involves making the partial products which is an array of AND gates. For partial product generation, an n-bit by n-bit multiplier needs n2 AND gates. The most difficult part is to get the partial products, as it involves multiplying the long number by one digit. The technique is very slow since it involves several intermediate additions. The multiplier bits are categorized into groups of s bits, called selection groups in s-bit selection.
In the process of normal multiplication of binary numbers the multiplicand is written first and then followed by the multiplier. For negative numbers the 2s complement form is required for representation of numbers in binary form. After writing like that take the left most significant bit of the multiplier and then multiply it with the multiplicand and then write down the result after that take the bit adjacent to the lsb bit and multiply it with the multiplicand and then shift it to the left once and then add it to the previous partial product and then take the third bit and multiply with multiplicand and shift it twice and add to the partial product and similarly the fourth step and now result is obtained.
1. Circuit complexity increases with the use of usual multipliers so, time taken for execution is more by using these multipliers. 2. By using multipliers number of partial product array is generally high. 3. Any single error in the partial product array leads to results which are not accurate.
So we go for an efficient multiplication algorithm which overcomes the above disadvantages. This is a modified booth encoding algorithm which comes from booth algorithm.
3.2 BOOTH ALGORITHM
It is a powerful algorithm for signed-number multiplication, which treats both positive and negative numbers uniformly. For the standard add-shift operation, each multiplier bit generates one multiple of the multiplicand to be added to the partial product. If the multiplier is very large, then a large number of multiplicands have to be added. In this case the delay of multiplier is determined mainly by the number of additions to be performed. If there is a way to reduce the number of the additions, the performance will get better.
Booth algorithm is a method that will reduce the number of multiplicand multiples. For a given range of numbers to be represented, a higher representation radix leads to fewer digits. Since a k-bit binary number can be interpreted as K/2-digit radix-4 number, a K/3-digit radix-8 number, and so on, it can deal with more than one bit of the multiplier in each cycle by using high radix multiplication.
The main bottleneck in the speed of multiplication is the addition of partial products. More the number of bits the multiplier/multiplicand is composed of, more is the number of partial products, longer is the delay in calculating the product. The critical path of the multiplier depends upon the number of partial products. In radix-2 booths algorithm, if we are multiplying 2 n bits number, we have n partial products to add. Radix-4 booths multiplication is an answer to reducing the number of partial products. Using Radix-4 booths multiplier, the number of partial products are reduced to n/2 if we are multiplying two n bits numbers, if n is even number, or (n+1)/2, if n is an odd number. By reducing the number of partial products, one can effectively speed up the multiplier by a factor roughly equal to 2.
3.3 BOOTH ENCODING RULES
xi xi-1 Operation Comments yi 0 0 Shift only String of zeros 0 1 1 Shift only String of ones 0 1 0 Subtract and shift Beginning of a string of ones 1 0 1 Add and shift End of a string of ones 1
Table: 3.1
Booth algorithm is a method that will reduce the number of multiplicand multiples. For a given range of numbers to be represented, a higher representation radix leads to fewer digits. Since a n-bit binary number can be interpreted as n/2-digit radix-4 number, a n/3-digit radix-8 number, and so on, it can deal with more than one bit of the multiplier in each cycle by using high radix multiplication. By using booth encoding algorithm we n/2+1 number of partial products but by using modified booth encoding algorithm we get n/2 number of partial products. Thus the when it is implemented using FPGA the area occupied will be less.
000011 is the multiplicand. 011101 is the multiplier. According to booth multiplication 3 partial products are to be obtained. By adding the partial products we get the final result. The result is 000001010111 that is 87 in decimal.
3.6 ADVANTAGES OF BOOTH ALGORITHM
1. The booth algorithm has its major advantages if the operands have a large number of bits.
2. Booth algorithm is efficient if the multiplier contains long sequences of 1s.
3. Bit pairing can reduce the number of summands further.
3.7 DISADVANTAGES OF BOOTH ALGORITHM
1. It has its limitations if the multiplier contains only small groups of 1s or even alternating 01 pairs. 2. Booth algorithm may double the number of non-zero summands which is a disadvantage.
CHAPTER-4
HARDWARE AND SOFTWARE REQUIREMENT ANALYSIS
MODIFIED BOOTH MULTIPLIER
The Modified Booth multiplier is an extension of Booths multiplier. In Modified Booth, the number of partial products reduced by N/2, that is half of total partial products as compare to simple multiplication process. So, clearly if the number of partial products becomes reduced, the area of the multiplier also will reduce and automatically as the result of it, the speed will increased. So, this multiplier is more efficient.
4.1 MODIFIED BOOTH ALGORITHM
The Modified Booth algorithm is the most frequently used method to generate partial products. The partial products are reduced by n/2 by using this algorithm. So as the result of this, the multiplier can be implemented using less hardware components as compare to conventional multiplier. This algorithm can save multiplier layout area and reduces delay at the same time which are the important design advantages. One of the method for high speed multiplier is to enhance the parallelism by reducing the number of calculating stages .Booth encoding reduces partial products to N/2. It converts the multiplier from radix-2 to radix-4 using redundant digit set {-2, -1, 0, 1, 2}. So in new multiplier withradix-4 there are only N/2 digits. Booth algorithm is a method that will reduce the number of multiplicand multiples. For a given range of numbers to be represented, a higher representation radix leads to fewer digits. Since a k-bit binary number can be interpreted as K/2-digit radix-4 number, a K/3-digit radix-8 number, and so on, it can deal with more than one bit of the multiplier in each cycle by using high radix multiplication.
4.2 MODULES AND THEIR FUNCTIONALITIES
As shown in the example , if multiplication is done in radix 4, in each step, the partial product term (Bi+1Bi)2 A needs to be formed and added to the cumulative partial product. Whereas in radix-2 multiplication, each row of dots in the partial products matrix represents 0 or a shifted version of A must be included and added.
STEPS TO EXECUTE THE CODE IN VERILOG
1. First double click on the Xilinx ISE icon in its respective location. 2. Close the existing projects if any are opened. 3. Create a new project with a name (MBE say) in a folder by click on the Verilog module. 4. Then a window appears with the module name created earlier along with respective inputs and outputs. 5. The program must be typed in the module and then the module must be saved. 6. Towards left side of the Xilinx window there exists synthesize option where the syntax of the module can be verified using the option check syntax. 7. When it was verified that there are no errors in the module it must be synthesized by running the synthesize option. 8. Double click on view RTL schematic and Technology schematic to view the designs. 9. To view the output of the module a test bench must be created. 10. For this right click on the name of the module towards left of the window and then click on new source. 11. There appears a window and then select Verilog test fixture and then name the test- bench as (MBEtb say) and then click finish. 12. Give the respective inputs in the test-bench and then save it. 13. Click on the simulation option towards left of the window and then double click on the Model-sim Simulator to view the output waveforms.
CHAPTER-5
HARDWARE AND SOFTWARE DESIGN
VERILOG CODE
module finalmodified(a,b,c,clk); input [3:0]a; input [3:0]b; input clk; output [7:0]c; reg [5:0] w1,w2,w4,w5; reg [7:0]w3; always@(posedge clk) begin if(b[1:0]==2'b00) w1=4'b0000; else if(b[1:0]==2'b01) w1=a[3:0]; else if(b[1:0]==2'b10) w1={a[3:0],1'b0}; else if(b[1:0]==2'b11) begin w2={a[3:0],1'b0}; w1=a[3:0]+w2[4:0]; end if(b[3:2]==2'b00) w3=4'b0000; else if(b[3:2]==2'b01) w3={a[3:0],2'b00}; else if(b[3:2]==2'b10) w3={a[3:0],3'b000}; else if(b[3:2]==2'b11) begin w4={a[3:0],1'b0}; w5=a[3:0]+w4[4:0]; w3={w5,2'b0}; end end assign c=w1[5:0]+w3[7:0]; endmodule
// Instantiate the Unit Under Test (UUT) finalmodified uut ( .a(a), .b(b), .c(c), .clk(clk));
initial begin // Initialize Inputs a = 4'b1111; b = 4'b1111; clk = 1;
// Wait 100 ns for global reset to finish#100;
// Add stimulus here
end
endmodule
CHAPTER-6
RESULTS AND SIMULATIONS (multiplication result for numbers 15 and 8)
Fig:-6.1
Fig:-6.2
6.1 DESIGN SUMMARY
Fig:-6.3 6.2 TECHNOLOGY SHEMATIC
Fig:-6.4
6.4 TIMING SUMMARY SPEED GRADE: -4 Minimum period: No path found Minimum input arrival time before clock: 4.929 ns Maximum output arrival time after clock: 7.204 ns
6.5 SNAPSHOT OF THE OUTPUT
CONCLUSION
In this project a low power multiplier using Modified booth encoding algorithm was proposed. The modifications to the conventional architecture included the Modified Booth algorithm for Radix-4 reducing the switching activities use of encoding and bypassing zeros technique. The results were shown for dynamic power and delay reduction compared to other multiplier design. For this proposed low power multiplier was successfully designed and synthesized using XILINX ISE. Output power for proposed design was 50% minimized and delay was minimized up to 30% compared to the existing low power Booths multipliers.
CHAPTER-7
REFERENCES
1.Eric Whartona, Dr. Karen Panettac, Dr. Sos Agaian, Digital electronic arithmetic with applications, IEEE Inter. Conf., 2007.
2. Design and performance of pixel-level pipelined-parallel architecture for high speed wavelet-based image compression, Computers and Electrical Engineering, 2005.
3.G. Deng and L. W. Cahill Logarithmic number system and its application to image processing, Department of Electronic Engineering, La Trobe University, Bundoora Victoria 3083 Australia. IEEE.
4. R. Hashemian and C.P. Chen A New Parallel Technique for Design of Decrement/Increment and Twos Complement Circuits, Proc. 34th Midwest Symp. Circuits and Systems, vol. 2, pp. 887-890, 1991.
5. J.-Y .Kang and J. - L.Gaudiot, A Fast and Well - Structured Multiplier, Proc. Euromicro Symp. Digital System Design, pp. 508-515, Sept. 2004.
6. M.D. Ercegovac and T. Lang, Digital Arithmetic. Morgan Kaufmann Publishers, 2003.