Design and Comparison of Three 20-Gb/s Backplane Transceivers For Duobinary, PAM4, and NRZ Data
Jri Lee, Member, IEEE, Ming-Shuan Chen, and Huai-De Wang
Abstract: A full study of three data formats, namely duobinary, PAM4, and NRZ, is presented to estimate the performance of the corresponding transceivers under different conditions. Transceiver prototypes designed and optimized for the three signaling schemes are presented to evaluate their performance as well as their feasibility. The three transceivers have been tested thoroughly on Rogers and FR4 boards. Fabricated in 90-nm CMOS technology, all three transceivers achieve error-free operation with 20-Gb/s $2^{31}-1$ PRBS data over 40-cm Rogers and 10-cm FR4 channels. A general comparison reveals that the NRZ data still achieves the best performance at 20 Gb/s.
Index Terms: Duobinary, pulse-amplitude modulation (PAM4), non-return-to-zero (NRZ), bit error rate (BER), backplane transceiver.
I. INTRODUCTION
The pursuit of higher data rates in wireline communications has been demonstrated in the past and will continue in the future. Recent research on high-speed, short-range serial links over electrical backplanes or optical fibers has revealed the design trends for the next generation: chip-to-chip and board-to-board communication is moving toward 20 Gb/s, and 100-Gb/s Ethernet is also on the way [1]. Fig. 1 shows the simulated power dissipation as a function of bandwidth of a typical differential pair in 90-nm CMOS with inductive peaking and fanout-of-4 loading. With the device sizes labeled in the inset, the interconnect is also taken into consideration by extracting the parasitic capacitance from layout. Drawing a best-fit curve, we conclude that good power efficiency can be maintained up to 15 GHz. That is, the on-chip design margin for 20-Gb/s data is reasonably adequate. However, contemporary backplane materials and connectors fail to provide sufficient bandwidth for such high-speed data transmission, encouraging research on signal processing and/or data coding to overcome the poor channel properties. The original idea is based on the fact that modifying the chips is always easier and cheaper than altering the board itself. Over the years, engineers have been dealing with different data formats that can satisfy the bandwidth requirement with acceptable complexity. Among the existing solutions, non-return-to-zero (NRZ), duobinary, and 4-level pulse-amplitude modulation (PAM4) are most commonly used in various applications. The NRZ transceiver
Manuscript received December 22, 2007; revised March 27, 2008. Current version published September 10, 2008. The authors are with the Electrical Engineering Department, National Taiwan University, Taipei, Taiwan, R.O.C. (e-mail: [email protected]). Digital Object Identifier 10.1109/JSSC.2008.2001934
can be realized in a relatively simple way, providing another advantage in high-speed I/O links when the power budget is limited. As the data rate increases, the ubiquitous NRZ data would gradually hit the bandwidth limit, and the duobinary and PAM4 signals are considered as substitutes due to their efficient utilization of bandwidth. As will be shown in the next section, the spectra of duobinary and PAM4 are exactly half as wide as that of the NRZ data, making these formats potentially favorable in high-speed links. Generally speaking, duobinary signaling is further superior to PAM4 because it makes use of the intrinsic roll-off of the channel as part of the desired transfer function, requiring even less boost from the equalizers and alleviating the stringent requirement at high frequencies.
In this paper, we design and analyze three different transceiver topologies for the duobinary, PAM4, and NRZ signals. Operating at 20 Gb/s, all three transceivers are optimized to achieve the best performance with reasonable power consumption. Both Rogers and FR4 boards with different channel lengths are tested thoroughly to characterize the behavior of the transceivers. A careful comparison among the different data formats is conducted and verified by the experimental results.
This paper is organized as follows. Section II reviews the fundamental operation of the duobinary signal and its implementation issues. The design details of the duobinary, PAM4, and NRZ transceivers are described in Sections III, IV, and V, respectively. Section VI summarizes the measurement results, and Section VII draws a conclusion.
II. DUOBINARY SIGNALING
Having been used in optical communications and recently moving into electrical systems [2]-[4], duobinary modulation can achieve a data rate theoretically twice as much as the channel bandwidth. Intersymbol interference (ISI) is introduced in a controlled manner such that it can be cancelled
LEE et al.: DESIGN AND COMPARISON OF THREE 20-GB/S BACKPLANE TRANSCEIVERS FOR DUOBINARY, PAM4, AND NRZ DATA
Fig. 2. (a) Linear model of duobinary signaling. (b) Composition of duobinary spectrum.
Fig. 3. Output spectra and waveforms for different data formats passing through an ideal filter. (a) NRZ. (b) Duobinary. (Data rate = 20 Gb/s.)
out to recover the original signal. Unlike PAM4 and NRZ signals, duobinary signals incorporate the channel loss as part of the overall response [5], substantially reducing the required boost and relaxing the equalizer design. A duobinary signal is originally defined as the sum of the present bit and the previous one of a binary sequence [6]:
$$w[n] = x[n] + x[n-1]. \qquad (1)$$
It correlates two adjacent bits to introduce the desired ISI. Considering the equivalent linear model as shown in Fig. 2, we have as the transfer function
$$H(f) = A\left(1 + e^{-j 2\pi f T_b}\right) \qquad (2)$$
where $T_b$ denotes the bit period, and the attenuating factor $A$ is used to equalize the total power of $x$ and $w$. It can also be shown that the duobinary spectrum $S_w(f)$ is given by
$$S_w(f) = S_x(f)\,|H(f)|^2 \qquad (3)$$
$$\propto \left[\frac{\sin(2\pi f T_b)}{2\pi f T_b}\right]^2, \qquad (4)$$
where $S_x(f) \propto [\sin(\pi f T_b)/(\pi f T_b)]^2$ is the spectrum of the random NRZ input.
As shown in Fig. 2(b), $S_w(f)$ is still a sinc function but with only half the bandwidth as compared with $S_x(f)$. In other words, the duobinary coding squeezes the spectrum toward dc, and reduces the required channel bandwidth by 50%. Note that almost 90% of the signal power stays in the main lobe of a sinc function. To further clarify the analysis, we apply the NRZ and duobinary data through a brickwall filter cutting off at half the data rate. As can be seen in Fig. 3, the received NRZ data suffers from 81.8% ISI and 0.8-UI jitter, whereas the duobinary is almost unaffected. This is because the former loses 51.4% of the power but the latter loses only 10%.
It is worth noting that although the PAM4 signal possesses the same spectral efficiency as the duobinary does, the latter can further take advantage of the channel response as part of the transfer function. Fig. 4(a) illustrates the operation of duobinary signaling, where the transmit preemphasis and receive equalizer cooperate to reshape the low-pass response of the channel so that the overall transfer function approximates the first lobe of $H(f)$. In other words, a duobinary transceiver absorbs a significant amount of channel loss and makes it useful in the overall response, allowing more relaxed preemphasis and equalizer design. Fig. 4(b) shows the simulated results for the required boost
Authorized licensed use limited to: Texas A M University. Downloaded on June 22, 2009 at 18:38 from IEEE Xplore. Restrictions apply.
Fig. 4. (a) Concept of duobinary signal formation. (b) Required boost at Nyquist frequency. (Data rate = 20 Gb/s.)
at Nyquist frequencies for duobinary, PAM4, and NRZ codes.¹ It can be shown that with a data rate of 20 Gb/s, the required equalization for duobinary is lower than that for PAM4 and NRZ by 2.8 and 8.9 dB in a 20-cm FR4 channel, and by 4.0 and 6.8 dB in a 40-cm Rogers channel, respectively. The simulation is conducted in SpectreRF as follows. First, we measure the S-parameters of the backplane traces with different lengths and deliver a pulse into the channels. Next, we convert the coefficients² into a transfer function, obtaining the corresponding boost in different conditions. Note that the duobinary transmitter may need a suppression (rather than a lift) in the vicinity of the Nyquist frequency for proper spectrum shaping.
In reality, a precoder must be implemented in the transmit side. Here, we follow the design of [7], and the complete duobinary transceiver is shown in Fig. 5. The reshaped duobinary data gets decoded by an LSB distiller that takes the LSB as the output, recovering the binary NRZ data
1The
as the original sequence. The waveforms of important nodes are also depicted in Fig. 5.
III. DUOBINARY TRANSCEIVER
The proposed duobinary transceiver is illustrated in Fig. 6. This prototype conceptually resembles the structure in Fig. 5 but employs no equalizer in the receiver for simplicity. The transmitter consists of a skew-tolerant precoder and a 3-tap feedforward equalizer, and the receiver contains a self-adjusted three-level (1.58-bit) ADC. We present the design details in this section.
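The signal chain of Fig. 5 can be illustrated with a short behavioral sketch. It is not the authors' circuit: the function names, the bit pattern, and the symbols $d[n]$ (input data) and $p[n]$ (precoded data) are chosen here for illustration, and the channel is assumed ideal so that the delay-and-add produces exact levels 0, 1, and 2.

```python
def precode(data, init=0):
    """Precoder sketch: p[n] = d[n] XOR p[n-1], with p[-1] = init."""
    out, prev = [], init
    for d in data:
        prev ^= d
        out.append(prev)
    return out


def delay_and_add(precoded, init=0):
    """Duobinary sum w[n] = p[n] + p[n-1], giving the three levels 0, 1, 2."""
    out, prev = [], init
    for p in precoded:
        out.append(p + prev)
        prev = p
    return out


def lsb_distill(levels):
    """Keep only the LSB of each level: the middle level (1) decodes to ONE."""
    return [w & 1 for w in levels]


data = [1, 0, 1, 1, 0, 0, 1, 0]          # illustrative bit pattern
levels = delay_and_add(precode(data))     # three-level duobinary sequence
recovered = lsb_distill(levels)           # matches the original data
print(levels)
print(recovered)
```

Because $p[n] + p[n-1]$ and $p[n] \oplus p[n-1]$ share the same LSB, the decoder needs no memory, which is why a simple level discriminator suffices at the receiver.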
A. Transmitter
Although it looks simple and feasible, the precoder in Fig. 5 is difficult to implement, primarily due to the stringent timing requirement in the feedback loop. Cascading active or passive devices to develop a precise delay of one bit period in an open loop is not an option because of the high power, large area, and uncertain PVT variations. Using a clock-driven flip-flop seems to be the only choice, but it suffers from a severe phase requirement as well. This effect can be clearly explained by Fig. 7(a), where the XOR gate and the flip-flop experience delays of $T_{XOR}$ and $T_{D \to Q}$, respectively. To make this precoder work properly, these two delays must comprise an exact bit period $T_b$:
$$T_{XOR} + T_{D \to Q} = T_b. \qquad (5)$$
That is, the input clock has very little margin for phase movement in order to produce a proper D-to-Q delay for the flip-flop. Such a timing issue becomes aggravated at high speed and requires a complex control scheme.
To overcome the difficulties, we realize the precoder in an alternative way as illustrated in Fig. 7(b) [8]. The input data and clock pass through an AND gate, which is followed by a divide-by-2 circuit. The output thus toggles whenever a data ONE arrives, leading to the following operation:
$$p[n] = d[n] \oplus p[n-1], \qquad (6)$$
where $d[n]$ is the input data and $p[n]$ the precoder output. This structure provides advantages over that in Fig. 7(a) in breaking the loop and allowing a much more relaxed phase relationship between the input clock and data. The clock now reveals a margin as wide as 180° for skews, which is no longer a limiting factor in most designs. Note that the initial state of the divider has no influence on the final result; a precoder output with opposite polarity still yields the same output after decoding.
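The two statements above (the toggling divider performs the XOR recursion, and its initial state does not matter after decoding) can be checked with a small behavioral model. This is only a sketch under simplifying assumptions: one ideal clock pulse per bit, no gate delays, and an ideal channel; all function names are hypothetical.

```python
import random


def xor_feedback_precoder(data, init=0):
    """Fig. 7(a) style recursion: p[n] = d[n] XOR p[n-1]."""
    out, prev = [], init
    for d in data:
        prev = d ^ prev
        out.append(prev)
    return out


def toggle_precoder(data, init=0):
    """Fig. 7(b) style model: the AND gate passes one clock pulse per data ONE,
    and every passed pulse flips the divide-by-2 output."""
    out, state = [], init
    for d in data:
        clock_pulse = 1              # idealized: one clock pulse per bit
        if d & clock_pulse:          # gated clock reaches the divider
            state ^= 1               # divide-by-2 toggles
        out.append(state)
    return out


def decode(precoded, init=0):
    """Delay-and-add followed by LSB extraction (ideal channel assumed)."""
    out, prev = [], init
    for p in precoded:
        out.append((p + prev) & 1)
        prev = p
    return out


rng = random.Random(0)
data = [rng.getrandbits(1) for _ in range(64)]

same_structure = toggle_precoder(data) == xor_feedback_precoder(data)
decoded_from_0 = decode(toggle_precoder(data, init=0), init=0)
decoded_from_1 = decode(toggle_precoder(data, init=1), init=1)
print(same_structure, decoded_from_0 == data, decoded_from_1 == data)
```

Starting the divider from the opposite state inverts every precoded bit, but the sum of two adjacent inverted bits has the same LSB as before, so the decoded data are unchanged.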
The popular feedforward equalizer also proves useful in duobinary systems. At 20 Gb/s, the number of taps becomes quite limited. Here, 4 taps are considered for the waveform reshaping. All the FIR equalizing methods and techniques that have been extensively used for NRZ data can be applied to duobinary, except that a single pulse ONE (preceded and followed by successive ZEROs) is expected to generate two consecutive bits of ONE at the far end. With a pulse response shown in Fig. 8(a),³ the coefficients are readily available by solving the following equations:
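As a hedged illustration of this kind of calculation, the sketch below fits 4 FIR taps so that a single transmitted ONE arrives as two consecutive equal samples. The pulse-response samples are made up for illustration (they are not the measured response of Fig. 8(a)), and the least-squares criterion is an assumption rather than the authors' method.

```python
import numpy as np

# Hypothetical symbol-spaced samples of a channel pulse response (not measured data).
h = np.array([0.05, 0.55, 0.30, 0.12, 0.05])
n_taps = 4

# Convolution matrix: column j holds h delayed by j symbols, so H @ c == np.convolve(h, c).
n_out = len(h) + n_taps - 1
H = np.zeros((n_out, n_taps))
for j in range(n_taps):
    H[j:j + len(h), j] = h

# Duobinary target: a single ONE should arrive as two consecutive equal samples.
cursor = int(np.argmax(h))
target = np.zeros(n_out)
target[cursor:cursor + 2] = 1.0

# Least-squares tap weights (an assumed criterion; other criteria are possible).
c, *_ = np.linalg.lstsq(H, target, rcond=None)
equalized = H @ c
c_normalized = c / np.sum(np.abs(c))   # scale so that the tap magnitudes sum to 1

print(np.round(c_normalized, 3))
print(np.round(equalized, 3))
```

With more output samples than taps the system is overdetermined, so the residual ISI is minimized rather than forced to zero; a tap that comes out small in such a fit is a natural candidate for removal.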
Fig. 8(b) and (c) summarize the optimal coefficients at 20-Gb/s data rate as a function of channel length for Rogers and FR4 boards. It can be shown that one of the coefficients is relatively small in both cases, urging us to omit it (and the corresponding flip-flop) for an agile design. The complete transmitter design is depicted in Fig. 8(d), where all blocks are implemented in current-mode logic (CML) to increase the operation speed.
B. Receiver
As suggested by Fig. 5, the duobinary receiver could be as simple as a quantizer with only the LSB taken out to convert the duobinary signal back to the NRZ data. It is equivalent to discriminating the middle level (logic ONE) from the two side levels (logic ZERO), as shown in Fig. 9(a). Here, a 3-level (1.58-bit) flash ADC is followed by an XOR gate to distill the LSB. However, this simple topology suffers from a number of drawbacks. The linearity and input common-mode level
³The example pulse response shown in Fig. 8(a) is obtained from a 20-cm Rogers channel.
Fig. 8. (a) Typical pulse response for 20-Gb/s data. (b) Normalized FIR coefficients for Rogers. (c) Normalized coefficients for FR4. (d) Duobinary transmitter.
need precise reference voltages, otherwise the signal integrity degrades. The pulsewidth of the output may also get distorted, resulting in significant jitter or ISI. The proposed architecture alleviates the above difficulties by incorporating a reference-free comparator and a servo controller that dynamically optimizes the output data eye. As shown in Fig. 9(b), the comparator compares the input with two threshold levels virtually equivalent to the two reference voltages, generating two outputs. Amplified to logic level by the subsequent hysteresis buffers [9], the two outputs are then XORed to produce the final output. The recovered data inevitably bears jitter, since (1) the threshold levels may drift due to mismatches and PVT variations; (2) the threshold-crossing points for the rising and falling edges would differ intrinsically. Here, the pulsewidth distortion associated with the first issue is corrected by means of a negative feedback loop,
which contains a low-pass filter (LPF) and a V/I converter. With the assumption that the input data is purely random, the high loop gain forces the thresholds to stay at the optimal positions such that the output waveform reaches an equal pulsewidth for ZEROs and ONEs. In contrast to the design in [4], this arrangement recovers the data without extracting the clock, providing a compact solution. If necessary, the remaining jitter due to the second issue can be further removed by placing a regular CDR circuit behind it. Note that for simplicity, no receive-side equalization is used in this prototype. The comparator and V/I converter design is depicted in Fig. 10(a), where the input quad along with the tail currents and loading resistor form two zero-crossing thresholds. Mirrored from the V/I converter, the two variable currents create a threshold tuning range of 205 mV. Fig. 10(b) illustrates the
Fig. 10. (a) Comparator and V/I converter in duobinary Rx. (b) Threshold levels as a function of the control signal.
variation of threshold levels as a function of the control signal. The key point here is that the threshold adjustment is fully symmetric with respect to the input common-mode level. It not only eliminates the reference offset issue but also facilitates the pulsewidth equalization. The low-pass filter in Fig. 9(b) is realized as a simple RC network with a corner frequency of 20 kHz, implying a voltage drift of less than 1.5 mV for 31 consecutive bits. A single-stage opamp is employed here, achieving 32-dB voltage gain, 85° phase margin, and 2.6-GHz unity-gain bandwidth with a power consumption of 1.2 mW. Repeated simulation under severe PVT variations ensures the loop stability. Note that the performance could be affected by different kinds of mismatch, including imbalanced rising/falling times of the signal and comparator offsets. Monte Carlo simulation reveals that the threshold levels would deviate from the optimal positions by 12.5 mV, which corresponds to additional jitter of 0.5 ps. The device sizes here are properly chosen to minimize the deviation. As compared with [4], this approach reduces the circuit complexity, especially in the CDR design. The robust architecture indeed facilitates high-speed operation and saves power. More detail can be found in [10].
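The pulsewidth-equalizing action of the servo loop can be illustrated with a behavioral sketch. Everything in it is an assumption made for illustration: signal levels are normalized to -1, 0, and +1, edges between symbol centers are straight lines, the two thresholds sit at plus and minus `vth` around the common-mode level, and the opamp, LPF, and V/I converter are lumped into one discrete-time integrator with a hypothetical gain.

```python
import random


def duobinary_levels(n_bits, seed=1):
    """Random three-level sequence w[n] = p[n] + p[n-1] - 1, i.e. -1, 0, +1."""
    rng = random.Random(seed)
    prev, levels = 0, []
    for _ in range(n_bits):
        p = rng.getrandbits(1)
        levels.append(p + prev - 1)
        prev = p
    return levels


def one_fraction(levels, vth):
    """Fraction of time the XOR output is ONE, i.e. -vth < v < +vth.

    Assumes straight-line edges between symbol centers and 0 < vth < 1.
    """
    inside = 0.0
    for a, b in zip(levels, levels[1:]):
        if a == 0 and b == 0:
            inside += 1.0     # stays at the middle level
        elif a == 0 or b == 0:
            inside += vth     # ramp between 0 and +/-1 spends a fraction vth inside
        # two side levels in a row: never inside (duobinary has no -1 <-> +1 jump)
    return inside / (len(levels) - 1)


def servo(levels, vth=0.2, gain=0.5, steps=60):
    """Integrating feedback: move both thresholds until ONEs and ZEROs balance."""
    for _ in range(steps):
        error = 0.5 - one_fraction(levels, vth)
        vth = min(max(vth + gain * error, 0.01), 0.99)
    return vth


levels = duobinary_levels(50000)
vth_final = servo(levels)
print(round(vth_final, 3), round(one_fraction(levels, vth_final), 3))
```

In this simplified model the loop settles with the thresholds near the midpoints between adjacent levels, which is where ONEs and ZEROs occupy equal time for random data; a mistuned starting threshold is pulled back without any external reference.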
IV. PAM4 TRANSCEIVER
A. Transmitter
Fig. 11 illustrates the PAM4 transmitter design. It incorporates a demultiplexer (DMUX) to deserialize the original input, two signal paths (MSB and LSB) to independently preemphasize the data, and two joint combiners to construct the PAM4 signal. Serving as a 3-tap feedforward equalizer, each signal path performs FIR equalization with identical coefficients [11]. The two preemphasis results are combined together (with the MSB twice as large as the LSB) in current mode and converted to voltage output by means of the inductively-peaked terminations. The combiner design is depicted in Fig. 12(a), where the weighting factor tuning is realized by adjusting the tail currents. Due to the limited testing facilities, only a single-ended clock at 20 GHz is applicable for the transmitter. To drive the differential circuit, we employ a single-ended-to-differential (S/D) converter as depicted in Fig. 12(b). Here, a self-biased input level is created, which together with a local feedback increases the gain and minimizes the waveform distortion. Compared with a typical topology such as that in [12], this structure achieves higher
Fig. 12. PAM4 Tx building blocks. (a) Combiner. (b) S/D converter.
gain and lower magnitude distortion between the two output nodes. All the blocks are implemented as standard CML with the overall bandwidth and power consumption optimized.
B. Receiver
The receiver design is shown in Fig. 13. Owing to the multilevel input, no buffer can be placed in the very front end unless it possesses high linearity over a wide range. Similar to a 2-bit flash ADC, prior art such as [13] utilizes three slicers with programmable offsets to discriminate the four levels. Here, we propose a single preamplifier that generates three thermometer codes simultaneously. These codes reach the full logic level by means of amplification (hysteresis buffers) and regeneration (flip-flops). Subsequently, the PAM4 decoder translates the thermometer codes into the two binary codes. In order to evaluate the signal integrity, we serialize them again by a 2-to-1 MUX to recover the 20-Gb/s data output. Although the final muxing is unnecessary in a real design, it does facilitate the testing of this prototype.
The preamplifier is illustrated in Fig. 14(a), where the inductively-peaked termination ensures a broadband matching at the input. The switching quad, loading resistors, and tunable current sources produce three outputs with three different threshold levels, and the upper and lower ones are symmetric with respect to the middle one, i.e., the input common-mode level. Note that the total current of the tunable sources is kept constant so as to minimize the output common-mode variation. The hysteresis buffers [9] again amplify the outputs while cleaning up ambiguous transitions, and clear thermometer codes are presented to the decoder after the retiming and regeneration of the flip-flops. Fig. 14(b) reveals the decoder design, where complementary operation is imposed in the current-mode logic. With a supply of 1.8 V, it is possible to accommodate multiple stacks with 250-mV overdrive for each stage. Note that the stacked devices operating at 10 Gb/s need not remain in saturation all the time, since the circuit functions properly as long as the current can be completely switched from one arm to the other. An auxiliary pair helps to speed up the operation with moderate gain boosting during transition.
V. NRZ TRANSCEIVER
The NRZ transceiver is depicted in Fig. 15. As a vehicle for comparison, the transmitter is identical to the duobinary circuit in Fig. 8(d) with the precoder removed. In contrast to the multilevel signals such as duobinary and PAM4, the binary input here allows nonlinear amplification in the receiver front end to increase the signal-to-noise ratio (SNR). A transimpedance amplifier (TIA) is employed as the receiver front-end buffer, converting the signal current into voltage more efficiently. It achieves 15% larger bandwidth as compared with a typical input
buffer made of a simple differential pair. All three transceivers are fully differential, and building blocks (e.g., flip-flops) are reused as much as possible so as to make a fair comparison.
VI. EXPERIMENTAL RESULTS
All the transceivers have been fabricated in 90-nm CMOS technology and tested in chip-on-board assemblies. High-speed I/Os are co-designed with pads and routing traces to achieve 50-Ω termination precisely. Fig. 16(a) depicts the photos of the chips with their dimensions listed below. The testing setup is illustrated in Fig. 16(b), and the photo of a testing board (40-cm Rogers) is shown in Fig. 16(c). Three important points are specified to demonstrate the waveforms: position A (transmitter's output), B (far end), and C (receiver's output). Fig. 17 shows the measured frequency response of the channels and the corresponding pulse response at 20 Gb/s. The duobinary, PAM4, and NRZ transceivers consume 195 mW, 408 mW, and 126 mW from supplies of 1.5 V, 1.8 V, and 1.5 V, respectively.⁴ Unless otherwise specified, the following measurements are obtained
⁴To achieve better performance, the PAM4 transceiver requires a higher supply because of the four levels.
with pseudo-random bit sequence (PRBS) of $2^{31}-1$. We discuss the measured results below.
Duobinary: Fig. 18 depicts the transmitter's output (position A) with minimum (0 dB) and maximum (9.5 dB) boost at 20 Gb/s, compared with simulations. The optimized duobinary waveforms at position B for different channels are shown in Fig. 19. The recovered data at the receiver's output (position C) with the longest traces are shown in Fig. 20, suggesting jitters of 3.41 ps,rms/29.11 ps,pp (Rogers) and 4.34 ps,rms/24.22 ps,pp (FR4). Fig. 21 plots the BER as a function of channel length for different media.
PAM4: Fig. 22 shows the far-end (position B) waveforms for different channels, and Fig. 23 depicts the receiver's output (position C). Note that the finite clock skew in the receiver causes pulsewidth distortion on the output of the MUX, resulting in eye diagrams with dual transition traces as shown in Fig. 23. Since the MUX is used only for testing here, it will not be an issue in a real design. The BER performance is summarized in Fig. 24.
NRZ: The same testing procedure has been applied to the NRZ transceiver as well. The waveforms at positions B and C are plotted in Figs. 25 and 26, respectively. Again, Fig. 27 depicts the BER performance.
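For a quick comparison, the reported power and jitter figures can be normalized to the 20-Gb/s data rate. The short calculation below is only arithmetic on numbers quoted in this section; the variable names are chosen here, and the energy figures ignore the fact that the PAM4 prototype runs from a higher supply.

```python
DATA_RATE = 20e9                      # bits per second
UI_PS = 1e12 / DATA_RATE              # unit interval in picoseconds

# Reported power consumption in mW
power_mw = {"duobinary": 195.0, "PAM4": 408.0, "NRZ": 126.0}
energy_pj_per_bit = {
    name: p * 1e-3 / DATA_RATE * 1e12 for name, p in power_mw.items()
}

# Reported (rms, peak-to-peak) jitter in ps of the recovered duobinary data
jitter_ps = {"Rogers": (3.41, 29.11), "FR4": (4.34, 24.22)}
jitter_ui = {
    board: (rms / UI_PS, pp / UI_PS) for board, (rms, pp) in jitter_ps.items()
}

for name, e in energy_pj_per_bit.items():
    print(f"{name}: {e:.2f} pJ/bit")
for board, (rms, pp) in jitter_ui.items():
    print(f"{board}: {rms:.3f} UI rms, {pp:.3f} UI pp")
```

Expressed this way, the NRZ prototype spends roughly a third of the energy per bit of the PAM4 prototype, and the duobinary rms jitter stays below a tenth of the 50-ps unit interval on both boards.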
Fig. 16. (a) Chip micrographs and dimensions. (b) Testing setup. (c) Photo of the 40-cm testing board.
Fig. 18. Boosting performance of duobinary Tx measured at position A. (a) Minimum (0 dB). (b) Maximum (9.5 dB). (Data rate = 20 Gb/s, vertical scale: 50 mV/div, horizontal scale: 10 ps/div.)
Fig. 19. Far-end (position B) waveforms with duobinary signals for (a) 15-cm Rogers, (b) 3-cm FR4, and (c) 10-cm FR4 channels. (Data rate = 20 Gb/s, vertical scale: 50 mV/div, horizontal scale: 10 ps/div.)
Fig. 20. Eye diagram of recovered data for duobinary transceiver. (Data rate = 20 Gb/s, vertical scale: 100 mV/div, horizontal scale: 10 ps/div.)
Fig. 21. BER measurements for duobinary transceiver in (a) Rogers, and (b) FR4 board.
Fig. 28 presents the spectra of different data formats at position B. As expected, the duobinary and PAM4 signals reveal notches at half the data rate. Note that for the duobinary signal, the notch slightly deviates from 10 GHz, primarily because the physical circuits can only mimic the first lobe of the transfer function. The 9.3-Hz spacing shown in the inset corresponds to the $2^{31}-1$ PRBS at 20 Gb/s.
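The quoted line spacing follows directly from the pattern length: a pattern of $N$ bits sent at rate $R$ repeats every $N/R$ seconds, so its spectral lines are $R/N$ apart. The sketch below evaluates this for $N = 2^{31}-1$ and, as a sanity check on a scale that runs quickly, confirms that a 7-bit maximal-length LFSR repeats after $2^7-1$ bits. The tap positions (7, 6) and the function name are assumptions made here; the paper does not state its PRBS polynomial.

```python
def lfsr_period(order, taps, seed=1):
    """Count the steps a Fibonacci LFSR needs to return to its seed state."""
    mask = (1 << order) - 1
    state, steps = seed, 0
    while True:
        bit = 0
        for t in taps:
            bit ^= (state >> (t - 1)) & 1   # XOR of the tapped register bits
        state = ((state << 1) | bit) & mask  # shift the feedback bit in
        steps += 1
        if state == seed:
            return steps


DATA_RATE = 20e9                     # bits per second
pattern_length = 2**31 - 1           # bits in one PRBS period
line_spacing_hz = DATA_RATE / pattern_length

prbs7_period = lfsr_period(7, (7, 6))   # small maximal-length example

print(f"{line_spacing_hz:.4f} Hz")
print(prbs7_period)
```

The 31-bit pattern itself is not stepped through here, since one full period is more than two billion bits.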
In order to fairly compare the signal integrity, we operate the three transceivers with the same supply voltage of 1.5 V and examine the far-end (position B) eye opening after a 40-cm Rogers channel (Fig. 29). It can be clearly shown that the duobinary signal presents the largest magnitude (200 mV) and eye opening (35 mV), whereas the NRZ signal exhibits the smallest (i.e., 60-mV magnitude and 10-mV opening). However, the
Fig. 22. Far-end (position B) waveforms with PAM4 signals for (a) 15-cm Rogers, (b) 5-cm FR4, and (c) 10-cm FR4 channels. (Data rate = 20 Gb/s, vertical scale = 100 mV/div, horizontal scale = 20 ps/div.)
Fig. 23. Eye diagrams of recovered data for PAM4 transceiver. (Data rate = 18 Gb/s, vertical scale = 100 mV/div, horizontal scale = 10 ps/div.)
Fig. 24. BER measurements for PAM4 transceiver in (a) Rogers, and (b) FR4 board.
Fig. 25. Far-end (position B) waveforms with NRZ signals for (a) 15-cm Rogers, (b) 5-cm FR4, and (c) 25-cm FR4 channels. (Data rate = 20 Gb/s, vertical scale = 50 mV/div, horizontal scale = 10 ps/div.)
NRZ signal can still achieve an outstanding BER, primarily due to the simple receiver structure. In other words, the NRZ data can be amplified without considering the linearity, improving the signal integrity substantially.
As mentioned earlier, a regular CDR circuit can be adopted in the proposed duobinary transceiver. A lower loop bandwidth is thus expected in such a CDR in order to suppress the input data jitter. Basically, it is possible to acquire the noise profile from the recovered data with a pre-compiled pattern (e.g., 0101...), and put it into the bandwidth optimization procedure like other phase-locking systems [14]. Since the receiver may create deterministic jitter because of the clock-free architecture, it is desirable to codesign the receiver and CDR so as to optimize the overall performance. The PAM4 receiver, on the contrary, suffers from a more complicated CDR design as compared with the other two.
It is also instructive to compare the overall performance of the three circuits. The NRZ signal continues to play an important role in different systems owing to its plain structure and power efficiency, whereas the duobinary provides an alternative solution for long-distance, high-speed communications. The NRZ data actually achieves the best performance in terms of BER
Fig. 26. Eye diagrams of recovered data for NRZ transceiver. (Data rate = 20 Gb/s, vertical scale = 100 mV/div, horizontal scale = 10 ps/div.)
Fig. 27. BER measurements for NRZ transceiver in (a) Rogers, and (b) FR4 board.
Fig. 29. Comparison of far-end waveforms. (Data rate = 20 Gb/s, supply voltage = 1.5 V, 40-cm Rogers.)
and power dissipation. The NRZ data also stands out if a CDR needs to be included in the receiver design, although it is clear that with the proposed architecture, the CDR design for a duobinary signal could be as simple as that for an NRZ signal. On the other hand, the PAM4 signal may need linear amplification in the receiver front end to increase the SNR if the input signal is too small. This is not a trivial task in any technology. Besides, the retiming flip-flops are almost mandatory in a PAM4 receiver [13], complicating the clock recovery and causing high power consumption. For these reasons, PAM4 becomes less attractive in modern transceiver designs. Table I compares the performance of these three transceivers with prior art.
VII. CONCLUSION
A complete comparison and design analysis of three popular data signaling schemes is presented. Novel architectures and circuit techniques have been introduced in the three transceiver prototypes targeting duobinary, PAM4, and NRZ signals, and all of them achieve error-free operation for at least 40-cm Rogers and 10-cm FR4 channels at 20 Gb/s. The advantages and disadvantages of the different topologies are discussed, providing empirical information for future backplane transceiver design.
REFERENCES
[1] 100 Gigabit Ethernet Forum - 100G Ethernet Forum NG Ethernet Forum [Online]. Available: http://ng-ethernet.com/ethernet_forum/index.php?c=2
[2] A. Lender, "The duobinary technique for high-speed data transmission," IEEE Trans. Commun. Electron., vol. 82, pp. 214-218, May 1963.
[3] J. H. Sinsky et al., "High-speed electrical backplane transmission using duobinary signaling," IEEE Trans. Microw. Theory Tech., vol. 53, no. 1, pp. 152-160, Jan. 2005.
[4] K. Yamaguchi et al., "12 Gb/s duobinary signaling with ×2 oversampled edge equalization," in IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers, 2005, pp. 70-71.
[5] J. Sinsky et al., "10 Gb/s duobinary signaling over electrical backplanes: Experimental results and discussion," Lucent Technologies, Bell Labs [Online]. Available: http://www.ieee802.org/3/ap/public/jul04/sinsky_01_0704.pdf
[6] F. Stremler, Introduction to Communication Systems, 3rd ed. Reading, MA: Addison-Wesley, 1990.
[7] M. Tomlinson, "New automatic equalizer employing modulo arithmetic," Electron. Lett., vol. 7, pp. 138-139, Mar. 1971.
[8] H. Shankar, "Duobinary modulation for optical systems," Inphi Corp. [Online]. Available: http://www.inphi-copr.com/products/whitepapers/DuobinaryModulationForOpticalSystems.pdf
[9] J. Lee, "A 75-GHz PLL in 90-nm CMOS," in IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers, 2007, pp. 432-433.
[10] J. Lee et al., "A 20-Gb/s duobinary transceiver in 90-nm CMOS," in IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers, 2008, pp. 102-103.
[11] C. Menolfi et al., "A 25 Gb/s PAM4 transmitter in 90-nm CMOS SOI," in IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers, 2005, pp. 72-73.
[12] B. Razavi, Design of Integrated Circuits for Optical Communications. New York: McGraw-Hill, 2002.
[13] T. Toifl et al., "A 22-Gb/s PAM-4 receiver in 90-nm CMOS SOI technology," IEEE J. Solid-State Circuits, vol. 41, no. 4, pp. 954-965, Apr. 2006.
[14] H. Tao et al., "40-43-Gb/s OC-768 16:1 MUX/CMU chipset with SFI-5 compliance," IEEE J. Solid-State Circuits, vol. 38, no. 12, pp. 2169-2180, Dec. 2003.
Jri Lee (S'03-M'04) received the B.Sc. degree in electrical engineering from National Taiwan University (NTU), Taipei, Taiwan, R.O.C., in 1995, and the M.S. and Ph.D. degrees in electrical engineering from the University of California, Los Angeles (UCLA), both in 2003. After two years of military service (1995-1997), he was with Academia Sinica, Taipei, from 1997 to 1998, and subsequently Intel Corporation from 2000 to 2002. He joined National Taiwan University (NTU) in 2004, where he is currently an Associate Professor of electrical engineering. His research interests include high-speed wireless and wireline transceivers, phase-locked loops, and data converters. Dr. Lee is currently serving on the Technical Program Committees of the International Solid-State Circuits Conference (ISSCC), Symposium on VLSI Circuits, and Asian Solid-State Circuits Conference (A-SSCC). He received the Beatrice Winner Award for Editorial Excellence at the 2007 ISSCC, the Takuo Sugano Award for Outstanding Far-East Paper at the 2008 ISSCC, and the NTU Outstanding Teaching Award in 2007 and 2008.
Ming-Shuan Chen was born in Taipei, Taiwan, R.O.C., in 1984. He received the B.S. degree in electrical engineering from National Tsing-Hua University, Hsinchu, Taiwan, in 2006, and the M.S. degree in electronics engineering from National Taiwan University, Taipei, Taiwan, in 2008. His research interests focus on mixed-signal integrated circuit design for high-speed communication systems.
Huai-De Wang was born in Taipei, Taiwan, R.O.C., in 1984. He received the B.S. degree in electrical engineering from National Taiwan University, Taipei, Taiwan, in 2006. He is currently pursuing the Ph.D. degree in the Graduate Institute of Electrical Engineering, National Taiwan University, Taipei. His research interests are phase-locked loops and high-speed transceivers for wireline communication.