
Variable block carry skip logic using reversible gates

2010

Reversible circuits have applications in digital signal processing, computer graphics, quantum computation and cryptography. In this paper, a generalized k*k reversible gate family is proposed and a 3*3 gate of the family is discussed. Inverter, AND, OR, NAND, NOR, and EXOR gates can be realized by this gate. An implementation of a full-adder circuit using two such 3*3 gates is given. This full-adder circuit contains only two reversible gates and produces no extra garbage outputs. The proposed full-adder circuit is efficient in terms of gate count, garbage outputs and quantum cost. A 4-bit carry skip adder is designed using this full-adder circuit and a variable block carry skip adder is discussed. The equations required to evaluate these adders are presented.

Md. Rafiqul Islam, Md. Saiful Islam, Muhammad Rezaul Karim, Abdullah Al Mahmud and Hafiz M. Hasan Babu
Department of Computer Science and Engineering, University of Dhaka, Dhaka-1000, Bangladesh

1. INTRODUCTION

In a reversible circuit, the input vector can be uniquely recovered from the output vector; that is, each input pattern maps to a unique output pattern. Logic computations that are not reversible necessarily generate heat, irrespective of their implementation technology. According to [2], zero energy dissipation is possible only if the network consists of reversible gates. Synthesis of reversible logic circuits differs significantly from the synthesis of classical combinational logic circuits, because in a reversible circuit the number of inputs must equal the number of outputs, every output can be used only once (i.e., no fan-out is permitted), and the resulting circuit must be acyclic. Therefore, a good synthesis method must take the following rules into account:

1. Use as many outputs of every gate as possible, and thus minimize garbage (unused) outputs.
2. Do not create more constant inputs to gates (the inputs required to turn an irreversible specification into a reversible one) than is absolutely necessary.
3. Avoid leading the output signal of a gate to more than one input, because each fan-out of two requires adding one copying circuit.

The rest of the paper is organized as follows. Section 2 presents the families of reversible gates and their quantum cost. Section 3 presents a generalized k*k reversible gate and discusses an instance of this family of gates. Section 4 first establishes the minimum number of constant inputs and garbage outputs required to design a full adder circuit, and then proposes a new full adder composition. Section 5 presents the design of a carry skip adder that uses the proposed full adder circuit as its basic building block. Section 6 presents a variable block carry skip adder. Experimental results are shown in Section 7. Section 8 concludes the paper. References are listed in Section 9.

2. FAMILIES OF REVERSIBLE GATES AND THEIR QUANTUM COST

There exist many universal reversible gates [1,3,7,10,11]. There is only one 1*1 reversible gate, the inverter (A → A′). This gate is very important since it does not introduce garbage outputs. Some of the popular and important gates are the 2*2 Feynman gate ((A, B) → (P = A, Q = A⊕B)), the 3*3 Toffoli gate ((A, B, C) → (P = A, Q = B, R = AB⊕C)), the 3*3 Fredkin gate ((A, B, C) → (P = A, Q = A′B⊕AC, R = A′C⊕AB)) and the Peres gate [1] ((A, B, C) → (P = A, Q = A⊕B, R = AB⊕C)).

Figure 1. 2*2 Feynman Gate
Figure 2. 3*3 Toffoli Gate

The detailed cost of a reversible gate depends on the particular realization technology of quantum logic. According to [9], the cost of every 2*2 gate is assumed to be the same. A 1*1 gate costs nothing, since it can always be absorbed into an arbitrary 2*2 gate that precedes or follows it.
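
The gate equations quoted in this section can be checked mechanically. The following is a minimal Python sketch, not part of the original paper: it encodes the four gates as written above, tests that each one is a bijection on its input patterns, and then cascades two Peres gates with one constant 0 input to check the full-adder claim made in the abstract. The function names and the wiring of the two Peres gates are assumptions made for illustration, since the paper's own circuit drawing is not reproduced in the text.

```python
from itertools import product


# Gate equations transcribed from Section 2 (bits are the integers 0 and 1).
def feynman(a, b):
    return (a, a ^ b)


def toffoli(a, b, c):
    return (a, b, (a & b) ^ c)


def fredkin(a, b, c):
    not_a = 1 - a
    return (a, (not_a & b) ^ (a & c), (not_a & c) ^ (a & b))


def peres(a, b, c):
    return (a, a ^ b, (a & b) ^ c)


def and_gate(a, b):
    # Classical two-input AND, included only as an irreversible reference.
    return (a & b,)


def is_reversible(gate, width):
    """True if the 2**width input patterns map to 2**width distinct outputs."""
    outputs = {gate(*bits) for bits in product((0, 1), repeat=width)}
    return len(outputs) == 2 ** width


def peres_full_adder(a, b, cin):
    """Two cascaded Peres gates with one constant 0 (assumed wiring).

    Returns (sum, carry_out, garbage), where garbage holds the two
    outputs that are not used further.
    """
    g1_p, g1_q, g1_r = peres(a, b, 0)            # g1_q = a^b, g1_r = a&b
    g2_p, g2_q, g2_r = peres(g1_q, cin, g1_r)    # g2_q = sum, g2_r = carry
    return g2_q, g2_r, (g1_p, g2_p)


if __name__ == "__main__":
    for name, gate, width in [
        ("Feynman", feynman, 2),
        ("Toffoli", toffoli, 3),
        ("Fredkin", fredkin, 3),
        ("Peres", peres, 3),
        ("AND", and_gate, 2),
    ]:
        print(name, "reversible:", is_reversible(gate, width))

    ok = all(
        s + 2 * cout == a + b + cin
        for a, b, cin in product((0, 1), repeat=3)
        for s, cout, _ in [peres_full_adder(a, b, cin)]
    )
    print("two-Peres full adder correct:", ok)
```

Under these assumptions the sketch uses one constant input and leaves two unused outputs, which matches the gate, garbage and constant-input counts the paper reports for its proposed design.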
Thus, every permutation quantum gate is built from 1*1 and 2*2 quantum primitives, and its cost is calculated as the total number of 2*2 gates. Using the well-known realization of the Toffoli gate with truly quantum 2*2 primitives, according to [9], the cost of the Toffoli gate is five 2*2 gates, or simply 5, as shown in Figure 2. The cost of the Fredkin gate is exactly the same as the cost of the Toffoli gate [5], as shown in Figure 3. The Peres gate can be realized with cost 4 [9]. This is the cheapest quantum realization of a complete (universal) permutation gate.

Figure 3. 3*3 Fredkin Gate

3. A GENERALIZED K*K REVERSIBLE GATE FAMILY

A generalized k*k reversible gate family is proposed in Figure 4(a), where fk-2(A1, A2, … , Ak-2) is an arbitrary function of A1, A2, … , Ak-2 and fk-1(A1, A2, … , Ak-1) is a function of A1, A2, … , Ak-1. The gate is a (k-2)-through gate. The mathematical properties of the gate family and a systematic method for reversible logic synthesis using this family of gates are now being studied. With k=2, this family of gates performs the same function as the Feynman gate. A 3*3 gate of the family is shown in Figure 4(b). The equation of this gate was known to Peres [1]. The quantum cost of this circuit is 4. The operation of this gate is shown in Figure 5.

Figure 4. A generalized k*k reversible gate family and a 3*3 gate of the family
Figure 5. Operation of Peres gate

4. COMPOSITION OF FULL ADDER CIRCUIT

Theorem: A full adder can be realized with at least two garbage outputs and one constant input.

Proof: The full-adder output equations S = A⊕B⊕Cin and Cout = (A⊕B)Cin ⊕ AB produce the same output (S, Cout) = (1, 0) for the three distinct input combinations (0,0,1), (0,1,0), and (1,0,0). One extra output bit can distinguish at most two of these three inputs, so separating all repeated values of the outputs S and Cout requires at least two garbage outputs. Thus, the total number of outputs is 2+2 = 4. Since in a reversible circuit the number of inputs must equal the number of outputs and there are three inputs (A, B, and Cin), at least one constant input is necessary.

A full adder implementation using two 3*3 Toffoli gates and two 2*2 Feynman gates is presented in [8]. The circuit requires four reversible gates, produces two garbage outputs, and has a quantum cost of 10. Another full adder implementation using four 3*3 Fredkin gates is presented in [6]. The circuit requires four reversible gates, produces two garbage outputs, and has a quantum cost of 20. In this paper, we present a new full adder composition. It consists of only two Peres gates and has a quantum cost of 8, which is lower than that of all the existing compositions. We call this circuit the Peres full adder; it is shown in Figure 6.

Figure 6. Peres Full Adder

5. CARRY SKIP-ADDER

The carry skip adder reduces the delay due to the carry computation. Consider the full adder’s operation. If exactly one of the two addend inputs is a logical 1, the cell will propagate the carry input to the carry output. Therefore, the ith full adder will propagate its carry input, Ci, to its carry output, Ci+1, when Pi = Ai⊕Bi is 1. Multiple full adders, called a block, can generate a “block” propagate signal to detour the incoming carry around to the block’s carry output. Figure 7 shows a 4-bit carry skip adder block. Each block is a small ripple carry adder producing the block’s sum and carry bits. However, each block quickly calculates whether the block’s carry input is propagated to its carry output.

Figure 7. Four Bit Carry Skip Adder

A B-bit full adder requires 2B Peres gates using the circuit in Figure 6. A B-input AND gate requires B-1 Peres gates. Therefore, a B-bit carry skip adder, including the Peres gate that generates Cout, requires 3B Peres gates.

Consider the B-bit carry skip adder block in Figure 7 generating a block carry out Cout via carry ripple through the full adders. The least significant full adder requires a path delay of 2 Peres gates to generate C1 from the addends. Then, the carry “ripples” through the subsequent full adders with a path delay of 1 Peres gate per bit. Finally, the Peres gate on the left of Figure 7 generates Cout. Therefore, the delay to generate the block carry out Cout (via ripple) with a B-bit carry skip adder is

[Equation (1)]

The full adder in Figure 6 generates the sum bit Si, the carry bit Ci, and the propagate signal Pi (G2), each of which passes through 2 reversible gates. Therefore, the delay to generate each of Si, Pi and Ci is 2 reversible gates. Then, all propagate signals for the carry skip adder block are combined by a B-input AND gate with delay log2 B. Finally, the Peres gate on the left of Figure 7 generates Cout.

[Equation (2)]

The total worst-case delay Tfixed of an N-bit carry skip adder with fixed block size B is the sum of the ripple carry delay through the first carry skip adder block, the skip delays through the intermediate blocks, and the ripple carry delay through the last block, or

[Equation (3)]

Assuming log2 B ≈ B/2, we get

[Equation (4)]

Minimizing Tfixed with respect to the block size B gives

[Equation (5)]

Substituting (5) into (4) gives the shortest delay for a fixed block size carry skip adder:

[Equation (6)]

6. VARIABLE BLOCK CARRY SKIP ADDER

Varying the size of the carry skip adder blocks can reduce the total worst-case delay, since carries generated or absorbed in the adder circuits have shorter data paths. Without loss of generality, the first and last carry skip blocks are b bits wide, and the carry skip adder is divided into t blocks, where t is even. Assuming the carry skip adder block sizes are

[Equation (7)]

summing the number of bits in the blocks, equating to N, and rearranging gives

[Equation (8)]

The total worst-case delay Tvariable of an N-bit carry skip adder with variable block sizes is the sum of the ripple carry delay through the first carry skip adder block, the skip delays through the intermediate blocks, and the ripple carry delay through the last block. Assuming the variable block sizes in (7), the total delay is

[Equation (9)]

Assuming log2 k ≈ k/2 and rearranging, (9) becomes

[Equation (10)]

Inserting (8) into (10) and collecting terms gives

[Equation (11)]

The optimal number of blocks is found with

[Equation (12)]

Therefore, the optimal variable block size carry skip adder has delay

[Equation (13)]

7. RESULTS

We compare our proposed full adder circuit with existing designs; the results are shown in Table 1 and Table 2. The previous papers did not consider the quantum cost of those circuits. We calculate the quantum cost of those adder circuits and compare it with that of our proposed design.

Table 1: Comparison Table 1

Full-adder Composition           No. of gates used   No. of Garbage Outputs   No. of Constant Inputs   Quantum Cost
Proposed Peres                   2                   2                        1                        8
Toffoli, Khan and Feynman [4]    3                   2                        1                        -
Toffoli and Feynman [8]          4                   2                        1                        10
Khan and Feynman gate [7]        3                   3                        2                        -
Fredkin [6]                      4                   3                        2                        20

Table 2: Comparison Table 2 (the last three columns are the Gate Calculations)

Full-adder Composition           Unit Clock Cycle   Two input EXOR   Two input AND   NOT
Proposed Peres                   2                  4                2               0
Toffoli, Khan and Feynman [4]    3                  4                3               0
Toffoli and Feynman [8]          4                  4                2               0
Khan and Feynman gate [7]        3                  5                4               6
Fredkin [6]                      4                  8                16              4

The analytical performance of the carry skip adder in [6] and of our carry skip adder (Figure 7) is given in Table 3. It is evident from Table 3 that our design performs better. The advantage is largest for small block sizes (approximately a factor of two), which are also the sizes required in practice. We choose powers of two for the block size, which is the natural choice.

Table 3: Tfixed for different implementations

No. of Bits   Tfixed for [6]   Tfixed for Peres
4             13.49            4.93
8             21.9             9.80
16            34.98            17.86
32            55.82            31.60
64            89.97            55.71
128           147.64           99.19
256           247.94           179.43
512           427.27           330.38
1024          755.87           618.85
2048          1370.54          1176.77
4096          2539.74          2265.70

8. CONCLUSION

The main goal of this paper is to find a good architecture for adder circuits using reversible logic by minimizing gate count, garbage outputs, constant inputs, and quantum cost. A technology-independent analysis of these adder circuits is given, since quantum or optical logic implementations are not available.

9. REFERENCES

[1] A. Peres, “Reversible Logic and Quantum Computers”, Physical Review A, 32:3266-3276, 1985.
[2] C.H. Bennett, “Logical Reversibility of Computation”, IBM J. Research and Development, 17:525-532, November 1973.
[3] De Vos, “Towards reversible digital computers”, Proc. European Conference on Circuit Theory and Design, Budapest (1997), pp. 923-931.
[4] Hafiz Hasan Babu, Rafiqul Islam, Ahsan Raza Chowdhury, Syed Mostahed Ali Chowdhury, “Reversible Logic Synthesis for Minimization of Full-Adder Circuit”, Euromicro Symposium on Digital Systems Design (DSD’03), September 01-06, 2003, Belek-Antalya, Turkey.
[5] J. Smolin, D.P. DiVincenzo, “Five two-qubit gates are sufficient to implement the Fredkin gate”, Physical Review A, Vol. 53, no. 4, April 1996, pp. 2855-2856.
[6] J.W. Bruce, M.A. Thornton, L. Shivakumaraiah, P.S. Kokate, and X. Li, “Efficient Adder Circuits Based on a Conservative Reversible Logic Gate”, IEEE Computer Society Annual Symposium on VLSI, April 25-26, 2000, Pittsburgh, Pennsylvania.
[7] M. H. Azad Khan, “Design of Full-Adder with Reversible Gates”, 5th ICCIT 2002, East West University, 27-28 Dec 2002.
[8] M. Perkowski, Lech Jozwiak, Pawel Kerntopf, Alan Mishchenko, Anas Al-Rabadi, “A General Decomposition for Reversible Logic”, In 5th International Reed-Muller Workshop, pages 119-138, 2001.
[9] M. Perkowski, Martin Lukac, Mikhail Pivtoraiko, “A Hierarchical Approach to Computer-Aided Design of Quantum Circuit”, In 6th International Symposium on Representations and Methodology of Future Computing Technologies, pages 201-209, March 2003.
[10] P. Kerntopf, “Maximally Efficient Binary and Multi-Valued Reversible Gates”, ULSI Workshop, Warsaw, Poland, May 2001.
[11] T. Toffoli, “Reversible computing”, Tech memo MIT/LCS/TM151, MIT Lab for Comp. Sci, 1980.
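
As a cross-check on the Tfixed values the paper tabulates (Table 3), the following minimal Python sketch compares the printed numbers with a closed form of the shape 2·sqrt(k·N) + N/2 − 4. This expression and the constants k = 3 (Peres) and k = 15 (design of [6]) are assumptions obtained by fitting the printed table; they are not taken from the paper's own delay equations, which are not reproduced in this text. The names `t_fixed`, `K_PERES` and `K_REF6` are hypothetical.

```python
import math

# Table 3 as printed: number of bits -> (Tfixed for [6], Tfixed for Peres).
TABLE_3 = {
    4: (13.49, 4.93),
    8: (21.9, 9.80),
    16: (34.98, 17.86),
    32: (55.82, 31.60),
    64: (89.97, 55.71),
    128: (147.64, 99.19),
    256: (247.94, 179.43),
    512: (427.27, 330.38),
    1024: (755.87, 618.85),
    2048: (1370.54, 1176.77),
    4096: (2539.74, 2265.70),
}

# Fitted constants (assumptions, not stated in the paper's text).
K_PERES = 3
K_REF6 = 15


def t_fixed(n_bits, k):
    """Assumed closed form for the optimal fixed-block delay."""
    return 2.0 * math.sqrt(k * n_bits) + n_bits / 2.0 - 4.0


def max_abs_error():
    """Largest gap between the assumed closed form and the printed table."""
    worst = 0.0
    for n_bits, (ref6, peres) in TABLE_3.items():
        worst = max(worst, abs(t_fixed(n_bits, K_REF6) - ref6))
        worst = max(worst, abs(t_fixed(n_bits, K_PERES) - peres))
    return worst


if __name__ == "__main__":
    for n_bits, (ref6, peres) in TABLE_3.items():
        print(n_bits, ref6, round(t_fixed(n_bits, K_REF6), 2),
              peres, round(t_fixed(n_bits, K_PERES), 2))
    print("max abs error:", max_abs_error())
```

With these fitted constants every printed entry is matched to within 0.01, which suggests that the two columns differ only in the constant under the square root. That reading remains an inference from the table rather than a statement made by the authors.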