Ioan Mang Raport Plagiat ucv-IC11
file:///D:/ioan_mang_raport_plagiat_ucv-IC11.html
From Quick Submit (Quick Submit) Processed on 09-May-2012 8:49 PM PDT ID: 248554052 Word Count: 3204 Similarity Index 90% Similarity by Source Internet Sources: 88% Publications: 90% Student Papers: 21%
sources: 1
76% match (Internet) http://www.vlsi.ee.upatras.gr/~sklavos/Papers/Papers02/IEEE02_Rijndael.pdf
1% match (publications)
"Dynamically configurable security for SRAM FPGA bitstreams", International Journal of Embedded Systems, 2006
1% match (publications)
Massoud Masoumi. "Design and evaluation of basic standard encryption algorithm modules using nanosized complementary metal-oxide-semiconductor-molecular circuits", Nanotechnology, 01/14/2006
N. Sklavos. "Architectures and VLSI implementations of the AES-Proposal Rijndael", IEEE Transactions on Computers, 12/2002
Mang, Erica. "An FPGA-based implementation of the key transformation procedure in the MARS algorithm", Journal of Computer Science & Control Systems/18446043, 20100501
paper text:
TWO DIFFERENT
1 of 12
10-May-12 06:56
RIJNDAEL ALGORITHM
Science
Proposal Rijndael, are presented in this paper. These alternative architectures support both the encryption and the decryption process. They reduce the required hardware resources and achieve high-speed performance. Their design philosophies are completely different. The first uses feedback logic and reaches a throughput of 259 Mbit/sec. It performs efficiently in applications with limited area resources. The second architecture is optimized for high-speed performance using a pipelining technique. Its throughput can reach 3.65 Gbit/sec. Key words: Rijndael,
seems to be one of the most important issues in the communication standards. Of course, many encryption algorithms support the defense of private communications. However, the implementation of these algorithms is a complicated and difficult process and sometimes results in intolerable performance and resource allocation in hardware terms. The reason is that these encryption algorithms were designed some years ago and for general cryptographic purposes. In recent years, new flexible algorithms specially designed for the new protocols and applications have been introduced to meet the increasing demand for cryptography. In October 2000, the National Institute of Standards and Technology (NIST) announced the cipher Rijndael as the Advanced Encryption Standard (AES), replacing the aging Data Encryption Standard (DES)
(Schneier 1996).
In the Third Advanced Encryption Standard (AES) Candidate Conference (AES 2000), papers from different research groups were presented
main purpose of these works was the evaluation of the AES finalist algorithms in terms of hardware implementation performance. In order to achieve this, all the authors used general-purpose architectures and not specialized designs for each algorithm implementation. This is a fair methodology for comparing different algorithms. On the other hand, it is not well suited to the implementation of each algorithm separately. In addition, in two of these works
only the encryption mode of operation was implemented and not the decryption. References (Elbirt et al 2001) do not support the on-chip generation of the encryption-decryption keys necessary for the algorithm. In other words, the proposed designs do not support the complete operation of the algorithms and perform inefficiently in terms of both the encryption and decryption modes of data transformation. Especially for the Rijndael algorithm, other works
(Kuo et al 2001)
(Kuo et al 2001)
(Mroczkowski et al 2001),
two different designs are introduced, one for encryption and one for decryption. They have been implemented in two separate FPGA devices. This is not the right way to implement a block cipher. It is not efficient for the implementation of communications protocols, especially in integrated circuits with tight resource allocation specifications. The proposed implementation in
(Mroczkowski et al 2001)
operation of the algorithm. In this paper, two architectures and VLSI implementations of the AES proposal are presented. These alternative designs support both the encryption and decryption processes in the same device. They are proposed in order to reduce the required hardware resources and to achieve high-speed performance. In the first design, the appropriate key expansion unit is integrated with the encryption/decryption core. Performance analysis and comparison results with other works are also reported.
algorithm called Rijndael has been developed and published by Daemen and Rijmen
(Daemen et al 2001).
This algorithm is an iterated block cipher with a variable block length and a variable key length. The block length and the key length can be independently set to 128, 192, or 256 bits. The number of algorithm rounds depends on the block and key length. The different transformations of the algorithm operate on the intermediate result, called State. The State can be pictured as a rectangular array of bytes. This array has four rows. The number of columns is called Nb and is equal to the block length divided by 32. The Key is also considered as a rectangular array with the same number of rows as the State. The number of columns is equal to the key length divided by 32. This number is denoted as Nk. The number
of rounds, Nr, depends on the values of Nb and Nk. For block and key lengths equal to 128 bits, both Nb and Nk are equal to four and the number of rounds Nr is defined as 10. These specifications are supported by the proposed implementations, which are analyzed in detail in the following sections.
MixColumn step removed. A key expansion unit is defined in order to generate the appropriate key for every round from the initial key value. When all rounds of transformation are completed, a cipher data block with the same length as the plain data has been generated. The decryption process has the same structure as the encryption process. The only main difference is that every function used in the basic round is replaced by its mathematical inverse. The key expansion unit performs almost the same operation as in the encryption process. The only difference is that the round keys for decryption are obtained by applying the inverse MixColumn to the corresponding round keys. The initial value of the key for the decryption operation is also different: the appropriate basic decryption key must be loaded into the key buffer before decryption begins
(Daemen et al 2001).
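The relationship between the block length, the key length, and the parameters Nb, Nk, and Nr can be illustrated with a minimal sketch. The function name `rijndael_params` is hypothetical, and the rule Nr = max(Nb, Nk) + 6 is taken from the Rijndael specification rather than from the text above, which only states the 128-bit case.

```python
def rijndael_params(block_bits, key_bits):
    """Return (Nb, Nk, Nr) for the given block and key lengths in bits."""
    allowed = (128, 192, 256)
    if block_bits not in allowed or key_bits not in allowed:
        raise ValueError("block and key lengths must be 128, 192, or 256 bits")
    nb = block_bits // 32   # number of columns of the State
    nk = key_bits // 32     # number of columns of the Key
    # Assumption: round-count rule from the Rijndael specification.
    nr = max(nb, nk) + 6
    return nb, nk, nr


if __name__ == "__main__":
    # The 128-bit block / 128-bit key case served by the proposed designs.
    print(rijndael_params(128, 128))  # (4, 4, 10)
```

For the 128-bit configuration discussed in this paper, the sketch reproduces Nb = Nk = 4 and Nr = 10.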
architectures are proposed for the Rijndael algorithm in order to reduce the required hardware resources and to achieve high-speed performance. Both architectures serve the encryption and decryption process in the same hardware device.
fundamental algebraic functions that operate on arrays of bytes. These transformations are:
SubBytes: Operates on each byte of the State independently. This substitution is the composition of two transformations: the multiplicative inverse in GF(2^8) and an affine mapping over GF(2). The decryption process uses the multiplicative inverse in GF(2^8), too, and the inverse of the affine mapping transformation over GF(2).
ShiftRow: Cyclically shifts the rows of the State over different offsets. The operation is almost the same in the decryption process, except that the shifting offsets have different values.
MixColumn: In this transformation, the columns of the State are considered as polynomials over GF(2^8). They are multiplied by the fixed polynomial c(x) = 03 x^3 + 01 x^2 + 01 x + 02 for encryption and by the polynomial d(x) = 0B x^3 + 0D x^2 + 09 x + 0E for decryption. Both polynomial multiplications are modulo (x^4 + 1).
KeyAddition: The round key is applied to the State by a simple bit-by-bit XOR. KeyAddition is the same for the decryption process. Before the first round, a key addition layer is applied to the cipher data. This transformation is referred to as the initial round key addition. The final round of the cipher is equal to the basic round with the
Figure 1. Basic block round architecture Figure 2. Key Expansion Unit architecture
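The MixColumn multiplication modulo (x^4 + 1) can be sketched in software as follows. This is only an illustration of the arithmetic, not the hardware design of the paper. The names `gmul`, `mix_column`, `ENC_COEFFS`, and `DEC_COEFFS` are hypothetical, and the coefficients of c(x) and d(x) and the reduction polynomial x^8 + x^4 + x^3 + x + 1 are taken from the Rijndael specification.

```python
def gmul(a, b):
    """Multiply two bytes in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1 (0x11B)."""
    product = 0
    for _ in range(8):
        if b & 1:
            product ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return product


# Polynomial coefficients, constant term first (assumed from the specification).
ENC_COEFFS = (0x02, 0x01, 0x01, 0x03)  # c(x) = 03 x^3 + 01 x^2 + 01 x + 02
DEC_COEFFS = (0x0E, 0x09, 0x0D, 0x0B)  # d(x) = 0B x^3 + 0D x^2 + 09 x + 0E


def mix_column(column, coeffs):
    """Multiply one 4-byte State column by a fixed polynomial modulo x^4 + 1."""
    result = []
    for i in range(4):
        acc = 0
        for j in range(4):
            # Modulo x^4 + 1, the exponent i is reached from every pair
            # (coefficient index (i - j) mod 4, column byte j).
            acc ^= gmul(coeffs[(i - j) % 4], column[j])
        result.append(acc)
    return result


if __name__ == "__main__":
    column = [0xDB, 0x13, 0x53, 0x45]
    mixed = mix_column(column, ENC_COEFFS)
    restored = mix_column(mixed, DEC_COEFFS)
    print([hex(b) for b in mixed], restored == column)
```

Applying `DEC_COEFFS` to the output of `ENC_COEFFS` restores the original column, which reflects the fact that d(x) is the inverse of c(x) modulo (x^4 + 1).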
3.1. Basic Block Round
The architecture of the basic block round is shown in Figure 1. As already mentioned in the previous section, each basic round of the algorithm is composed of basic building blocks:
SubByte, ShiftRow, MixColumn, and KeyAddition. The structure of
proposed architecture is shown in detail in Figure 2. This architecture performs both the encryption and the decryption process, with the input plaintext and key vector each equal to 128 bits. The
initial round key addition. A key buffer of 128-bit width is used for key storage. In the initial round key addition transformation, the input State is XOR-ed with the input key. In the first step, the initial round key addition is executed and the key for the first round is calculated. In each following clock cycle, one transformation round is executed and, at the same time, the appropriate key for the next round is calculated. The whole process ends when 10 rounds of transformation are completed. The input register is used to hold the transformed State after every round of operation. The State is fed back to this register with the use of a feedback technique. The
Basic Block Round architecture is shown in Figure 1 and has been described in detail in Section 3.1. The Key Expansion Unit architecture is illustrated in Figure 2. The round keys are derived from the initial key. This unit has two basic components: the Key Transformation and the Round Key Selection. The total number of round key bits is equal to the block length multiplied by the number of rounds plus one. The proposed implementation, with a 128-bit block length and 10 rounds, generates ten 128-bit round keys in addition to the initial key, for a total of 11 × 128 = 1408 round key bits. The round keys are taken from the initial key in a complicated way, defined in detail in the algorithm
between encryption and decryption processes. The basic difference is that, in decryption, the round keys are obtained by applying the inverse MixColumn to the corresponding round keys. The total execution time is one clock cycle for every round, plus one clock cycle for the initial round key addition. So, the system needs 11 clock cycles in order to completely transform a 128-bit data block.
3.3 Second Architecture Using RAM for Key Storage
The second proposed architecture is shown in Figure 3. Its main characteristics are: 1) the pipelining technique and 2) the use of a RAM for key storage and loading. It is not possible to apply
pipelining in many cryptographic applications. However, the internal architecture of the Rijndael algorithm allows it to be implemented with a pipelining technique. The pipelined architecture offers the benefit of high-speed performance, so the implementation can be used in applications with demanding throughput requirements. This goal is achieved by using a number of operating blocks, at the cost of a larger covered area. The proposed architecture uses 10 basic round blocks, which are cascaded by using pipeline registers. In this architecture, 10 blocks of data can be transformed at the same time. The main disadvantage of the second proposed design is the increased area it requires. To address this problem, RAM was used for key storage. Many FPGAs provide embedded RAM, which may be used to replace the Key Expansion Unit and the internal buffer of this architecture for the initial key. In this way, the appropriate key for each round can be addressed from the RAM. External RAM blocks can also be used.
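The idea of addressing a precomputed round key by its round number can be sketched in software for the 128-bit case. This is a minimal illustration that follows the key schedule of the AES specification (FIPS-197), not the hardware Key Expansion Unit or RAM organization of the paper. The names `gmul`, `build_sbox`, and `expand_key_128` are hypothetical, and the returned list stands in for the RAM contents.

```python
def gmul(a, b):
    """Multiply two bytes in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1 (0x11B)."""
    product = 0
    for _ in range(8):
        if b & 1:
            product ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return product


def build_sbox():
    """Build the SubBytes table: multiplicative inverse, then affine mapping."""
    sbox = []
    for x in range(256):
        inv = 1
        for _ in range(254):          # x^254 is the inverse of x; 0 maps to 0
            inv = gmul(inv, x)
        s = inv
        for shift in (1, 2, 3, 4):    # affine mapping over GF(2)
            s ^= ((inv << shift) | (inv >> (8 - shift))) & 0xFF
        sbox.append(s ^ 0x63)
    return sbox


def expand_key_128(key):
    """Return the 11 round keys (16 bytes each) derived from a 128-bit key."""
    if len(key) != 16:
        raise ValueError("this sketch handles 128-bit keys only")
    sbox = build_sbox()
    words = [list(key[4 * i:4 * i + 4]) for i in range(4)]
    rcon = 0x01
    for i in range(4, 44):
        temp = list(words[i - 1])
        if i % 4 == 0:
            temp = temp[1:] + temp[:1]        # rotate the word by one byte
            temp = [sbox[b] for b in temp]    # byte substitution
            temp[0] ^= rcon                   # add the round constant
            rcon = gmul(rcon, 0x02)
        words.append([a ^ b for a, b in zip(words[i - 4], temp)])
    # Entry r of the table is the round key for round r (0 = initial key).
    return [bytes(sum(words[4 * r:4 * r + 4], [])) for r in range(11)]


if __name__ == "__main__":
    table = expand_key_128(bytes.fromhex("2b7e151628aed2a6abf7158809cf4f3c"))
    for round_number, round_key in enumerate(table):
        print(round_number, round_key.hex())
```

The example key is the one used in the key expansion appendix of FIPS-197. A lookup such as `table[round_number]` plays the role of the RAM read in the second architecture.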
The
demands in terms of the key length. In such architectures the switching time of the RAM is a factor that has to be considered in the total performance timing measurements.
Figure
implemented by using VHDL, with structural description logic. Both implementations were simulated for correct encryption and decryption operation using the test vectors provided by the AES submission package (AES 2000). The VHDL codes of the two designs were synthesized, placed, and routed using Xilinx (Virtex) FPGA devices (Xilinx 2001). The two architectures were simulated again to verify correct functionality in real-time operating conditions. The measurements of the performance analysis are shown in Table 1. Measurements from other designs are added in the same table. The first architecture was optimized with covered area constraints. Xilinx Virtex XCV300BG432 was selected for this architecture implementation. The throughput reaches the value of
external clock with frequency of 22 MHz. In the proposed architecture, the critical path is 45 ns.
Architecture               Process  Device        Area (CLBs unless noted)  Frequency (MHz)  Throughput (Mbit/s)
First                      En/De    XCV300BG560   2358                      22               259
Second                     En/De    XCV1000BG560  17314                     28.5             3650
(Dandalis et al 2000)      Encr.    Xilinx        5673                      -                353
(Elbirt et al 2000)        Encr.    XCV1000BG560  5302/10992                14.1/31.8        300/1940
(Gaj et al 2000)           En/De    Xilinx        2902                      25.9             331
(Weeks et al 2000)         En/De    ASIC          35x10^6 um^2              -                265
(Kuo et al 2001)           Encr.    ASIC          3.96 mm^2                 100              910
(Fischer et al 2001)       En/De    Altera        845 LE                    -                750
(Mroczkowski et al 2001)   Decr.    Altera        2885                      41.5             248
The
Throughput = block_size × frequency / total_clock_cycles (1)
The transformed block size is 128 bits and the frequency is 22 MHz. The number of clock cycles necessary for one block encryption or decryption is 11. For the second pipelining architecture, the device has 128k bits of embedded RAM, divided into 32 RAM blocks, that are separate from
supported embedded RAM. The Virtex block RAM also includes dedicated routing to provide an efficient interface with both Configurable Logic Blocks (CLBs) and other block RAMs. The throughput in the pipelining architecture is given by:
Throughput = block_size / Tclkbasic (2)
where Tclkbasic is the delay of a single round, including register delay. Tclkbasic is 35 ns. The width of the transformed block is 128 bits. The second architecture achieves a throughput of 3.65 Gbit/sec. The external clock frequency is 28.5 MHz. All the compared architectures operate with data and key block width of 128 bits. One could claim that the proposed first architecture performs about 10 to 15 percent slower than the other architectures. Nevertheless, this is a natural result of the algorithm philosophy and not a tradeoff. In this cryptographic algorithm, the key expansion unit is partially modified in the case of the decryption process. In particular, as the Rijndael introducers clarify in their AES proposal
InvMixColumn has to be applied to all round keys except the first and the last one during the decryption process. In our first proposed architecture, the critical path is determined by the key expansion unit. In a hardware implementation that supports both encryption and decryption, the critical path of the key expansion unit for the slower process defines the critical path of the total system. The two proposed architectures support encryption and decryption in the same dedicated hardware device. So, in a comparison of hardware performance with other architectures that support only encryption
these
(Elbirt
device. In the first proposed architecture, the appropriate key expansion unit has been integrated in the same FPGA device. This extra feature of the architecture requires, of course, more hardware resources and decreases the performance of the algorithm core. 5. Conclusions Two different philosophies of VLSI architectures for the design and implementation of the Rijndael encryption algorithm have been presented. The first uses feedback logic and supports the key expansion unit in the same device, and it performs efficiently in applications with limited area resources. The second is optimized for high-speed performance using a pipelining technique, with a high data throughput of 3.65 Gbit/sec. The resulting VLSI circuits achieve significantly high data rates, supporting both operation processes (encryption/decryption) of the Rijndael algorithm. They can be applied to online encryption/decryption needs of high-speed networking protocols like Asynchronous Transfer Mode (ATM) or Fiber Distributed Data Interface (FDDI).
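The two throughput figures quoted for the proposed architectures can be checked against Equations (1) and (2) with a short calculation. The function names are hypothetical. Using the 45 ns critical path as the clock period of the first architecture is an assumption made here, since the rounded 22 MHz frequency gives a slightly lower value.

```python
def iterative_throughput(block_bits, frequency_hz, clock_cycles):
    """Equation (1): throughput of the feedback (first) architecture, bit/s."""
    return block_bits * frequency_hz / clock_cycles


def pipelined_throughput(block_bits, round_delay_s):
    """Equation (2): throughput of the pipelined (second) architecture, bit/s."""
    return block_bits / round_delay_s


if __name__ == "__main__":
    # First architecture: 128-bit block, 11 clock cycles per block.
    rounded = iterative_throughput(128, 22e6, 11)         # rounded 22 MHz clock
    from_path = iterative_throughput(128, 1 / 45e-9, 11)  # assumed 45 ns period
    # Second architecture: one block leaves the pipeline every Tclkbasic = 35 ns.
    pipelined = pipelined_throughput(128, 35e-9)
    print(f"first (22 MHz):  {rounded / 1e6:.1f} Mbit/s")
    print(f"first (45 ns):   {from_path / 1e6:.1f} Mbit/s")
    print(f"second (35 ns):  {pipelined / 1e9:.3f} Gbit/s")
```

With the rounded 22 MHz clock, Equation (1) gives 256 Mbit/s. With a 45 ns period it gives about 258.6 Mbit/s, which rounds to the reported 259 Mbit/s. Equation (2) with a 35 ns round delay gives about 3.657 Gbit/s, in line with the reported 3.65 Gbit/sec.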
A. Dandalis, V.K. Prasanna, and J.D.P. Rolim, A Comparative Study of Performances of AES Final Candidates Using FPGAs, Proc. Third
Performance Evaluation of the AES Block Cipher Candidate Algorithm Finalists, Proc. Third Advanced Encryption Standard (AES) Candidate Conf., Apr. 2000.
in Reconfigurable Hardware, Proc. CHES 2001, May 2001.
K. Gaj and P. Chodowiec, Comparison of the Hardware Performance of the AES Candidates Using Reconfigurable Hardware,
2000.
K. Gaj and P. Chodowiec, Fast Implementation and Fair Comparison of the Final Candidates for Advanced Encryption Standard Using Field Programmable Gate Arrays, Proc. RSA Security Conf., Apr. 2001.
H. Kuo and I. Verbauwhede, Architectural Optimization for a 1.82 Gbits/sec VLSI Implementation of the AES Rijndael Algorithm, Proc. CHES 2001, May 2001.
FPGA,
Simulations of round 2 Advanced Encryption Standard Algorithms, Proc. Third Advanced Encryption Standard (AES) Candidate Conf., Apr 2000.
Xilinx Inc., San Jose, Calif., Virtex, 2.5 V Field Programmable Gate Array, 2001, www.xilinx.com.