Sony Philips Super Audio CD (SACD) White Paper
At the time of its launch, the Compact Disc was literally state-of-the-art in the sense that its 16-bit PCM digital audio exercised the full capabilities of early-80s-vintage semiconductor and storage technology. In the following fifteen years, important progress has been made in optical discs, D/A conversion, digital filtering, digital signal processing, magnetic tape, hard disks and semiconductor processing power. Research has made similar advances in identifying the sources of digital audio distortion, sources that accounted for the lingering perception that analog audio systems continued to outperform digital systems in key areas.
To be successful, any new digital audio system must satisfy needs at every link in the audio chain, from recording artists in the studios, to music companies, retailers and consumers. Before proposing the Super Audio CD, Sony and Philips considered the broadest range of industry and consumer needs:
Archiving. Music companies count as their assets the musical heritage of over 100 years of recording. These include fragile, ancient acetates and lacquers plus hundreds of thousands of reels of audio tape. All these media have a finite archival life. For example, tape manufacturers typically specify a life of 30 years. This suggests that the master tapes from the 50s and 60s require immediate transfer onto some newer, more durable medium. Because these precious masters may not be able to tolerate more than a single playback, today's archival copy must capture all of the original recording, down to the merest hint of harmonics buried in noise. The technology behind the Super Audio Compact Disc must support such ultra-high-quality archiving.
Production. Musicians want the greatest possible artist's palette for their creativity. Producers have a constant desire for higher and higher sound quality.
Distribution. The Super Audio CD must spare music retailers the issue of dual inventory, the need to maintain separate stocks of conventional Compact Discs and Super Audio Compact Discs for each available title.
Consumers. Audiophiles have clearly expressed their demand for better sound quality. But modern formats must also be capable of enhanced benefits like multichannel sound, text, graphics, and video. In addition, the market for a new generation of music software would be extremely narrow if it didn't offer backward compatibility with the Compact Disc. As readers already know, Compact Disc is the most successful digital format of all time. Nearly 500 million players and over 10 billion Compact Discs testify to consumer acceptance on a massive scale. Consumer research shows that for many, CD represents the height of convenience, home/portable versatility and sound quality. For all these reasons, the new discs must play on consumers' existing CD players. And consumers' vast libraries of CDs must play on the new generation of machines.
As proposed by Sony and Philips, the Super Audio Compact Disc satisfies these demands. It has the potential to make every constituency (recording artists, producers, engineers, music companies, retailers, audiophiles and general music lovers) extremely happy.
The result of years of development, the Super Audio Compact Disc delivers new convenience, new capabilities and an altogether new standard of audio performance. It looks just like an ordinary Compact Disc. It plays on ordinary CD players. But under the surface, it's a two-layer hybrid disc with a host of new and surprising capabilities:
The Super Audio CD is mastered with Direct Stream Digital (DSD) technology. A separate layer, compatible with the Compact Disc Red Book standard, is transferred from DSD using Super Bit Mapping Direct downconversion. The result is a disc that sounds noticeably better than conventional CDs when played in a conventional CD player. The new disc makes better use of the full 16 bits of resolution that the CD format can deliver.
Ready for a new generation of high-density players. The same disc that plays on conventional hardware contains a separate signal layer designed for the new Super Audio CD players. So advanced consumers can choose to step up to the new generation of performance.
The ultimate quality in two-channel stereo. The high-density recording layer contains the original Direct Stream Digital 2-channel sound. Consumers will enjoy frequency response from DC to over 100,000 Hz, plus dynamic range greater than 120 dB across the audio band: specifications unmatched by any previous record/replay system. Independent critics and record producers have praised preliminary demonstrations of DSD sound as relaxed, musical, detailed and transparent, with a far greater sense of space around each instrument and voice.
The ultimate quality in multichannel sound. The high-density layer can also contain a Direct Stream Digital six-channel recording of the same piece of music. Each of the six channels can be recorded separately with full 100 kHz frequency response and 120 dB dynamic range. As a result, the six-channel sound image has unparalleled resolution and transparency.
Text and graphics. The musical performance can be accompanied by text (including disc name, artist name, track name, lyrics and liner notes) and graphics.
[Figure: 12 cm hybrid disc. The high-density layer carries two-channel stereo, a six-channel mix and extra data (text, graphics, video).]
| Specification | Conventional Compact Disc | Super Audio Compact Disc |
| --- | --- | --- |
| Diameter | 4-3/4 in (120 mm) | 4-3/4 in (120 mm) |
| Thickness | 1/20 in (1.2 mm) | 1/20 in (1.2 mm) |
| Signal Sides | One | One |
| Signal Layers | One | Two: CD-density reflective layer and high-density semi-transmissive layer |
| Data Capacity: Reflective Layer | 780 MB | 780 MB |
| Data Capacity: Semi-Transmissive Layer | - | 4,700 MB (4.7 GB) |
| Audio Coding: Standard Audio | 16-bit PCM, 44.1 kHz sampling | 16-bit PCM, 44.1 kHz sampling |
| Audio Coding: Super Audio | - | 1-bit Direct Stream Digital, 2.8224 MHz sampling |
| Audio Coding: Multichannel | - | 6 channels of Direct Stream Digital |
| Frequency Response | 5-20,000 Hz | DC-100,000 Hz (DSD) |
| Dynamic Range | 96 dB across the audio bandwidth | 120 dB across the audio bandwidth (DSD) |
| Playback Time | 74 minutes | 74 minutes |
| Enhanced Capabilities | CD Text | Text, Graphics, Video |
The Super Audio Compact Disc and the International Steering Committee.
The world's music companies are keenly interested in the features and capabilities of the next-generation high-density audio disc. For this reason, the recording industry's three major trade associations have joined forces to form an International Steering Committee (ISC) to review proposals. The three associations are the Recording Industry Association of America (RIAA), the Recording Industry Association of Japan (RIAJ) and Europe's International Federation of the Phonographic Industry (IFPI). To provide input to high-density audio development, the International Steering Committee has offered a wish list of key technical requirements. Prominent on this list is a hybrid structure that ensures backward compatibility with existing CD players. The Super Audio Compact Disc clearly meets this need plus every other requirement on the International Steering Committee list.
| International Steering Committee (ISC) Requirement | Super Audio Compact Disc Solution |
| --- | --- |
| 1. Active Copyright Management System. | Yes. |
| 2. Copyright Identification. | Yes. Proposed Digital Watermark can carry disc ID, ISRC and SID codes. |
| 3. Anti-piracy measures. | Yes. Visible and invisible Digital Watermarks differentiate legitimate from pirate copies. Watermarks cannot easily be removed. |
| 4. Compatibility. | Yes. Thanks to hybrid disc construction, with CD Red Book reflective layer plus high-density semi-reflective layer. |
| 5. Store audio, video and data. | Yes. Enhanced Data area provided. |
| 6. Conditional access. | Yes. Digital Watermark can accommodate embedded keys. |
| 7. Highest quality two-channel stereo and 6-channel sound. | Yes. Direct Stream Digital, 2.8224 MHz sampling, DC to over 100,000 Hz frequency response, 120 dB dynamic range across the audio band. |
| 8. Archive, master, transfer without loss of sound quality. | Yes. DSD is designed for high precision conversion to and from 16-bit/44.1 kHz. |
| 9. Extended disc functions, including text. | Yes. Enhanced Data area provided. |
| 10. Must not require caddy. | Yes. |
Most new format launches tend to disenfranchise consumers who have purchased previous-generation technologies. In contrast, the hybrid disc approach empowers consumers. Record stores will stock the new Super Audio CDs, exactly as they do with existing CDs. Casual consumers could buy the new discs, without knowing or caring about the new technology they contained. The new discs will work on all home, car and portable CD players. But discriminating audiophiles will have the option of buying a new-generation player that can deliver the full impact of Super Audio CD reproduction.
A single Super Audio Compact Disc can contain three versions of the music, stored on two separate layers. First is CD-compatible stereo, stored on the reflective layer (top). High-resolution stereo is stored on the semi-transmissive layer (bottom-center) as is six-channel sound, where available (bottom-outside).
It's no surprise that the Super Audio Compact Disc solution meets the requirements of the world's music companies. After all, it was developed from the outset to satisfy every link in the chain: recording artists, producers, music companies, retailers and consumers.
Enabling Technologies.
Super Audio Compact Disc accomplishes so many goals because it embodies several powerful, new technologies. In fact, there are five enabling technologies behind the new disc:
1. Direct Stream Digital (DSD) 1-bit representation of the audio waveform with 2.8224 MHz sampling achieves sound quality unprecedented in analog or digital audio.
2. Super Bit Mapping Direct downconversion enables much of the DSD sound quality to be heard on conventional CD players.
3. Hybrid disc technology uses a two-layer approach. The hybrid disc is a normal CD with an additional semi-transmissive high-density layer. This enables a single disc to work compatibly on both standard CD and Super Audio Compact Disc players.
4. Direct Stream Transfer coding increases data capacity to give content providers the ability to combine 2-channel audio, 6-channel audio, plus additional text, graphics and video elements with great flexibility.
5. Digital Watermarking uses both visible and invisible approaches to establish new, more secure methods of thwarting would-be pirates.
Each of these technologies is new. Each opens up important possibilities. And each deserves a more detailed description.
Direct Stream Digital (DSD) Encoding.
Sony and Philips both have a well-known history of accomplishment in Pulse Code Modulation (PCM) digital audio. Starting in the late 1970s with commercial 14-bit systems, and moving up to 16-, 18-, 20- and 24-bit systems, these two companies have made an unmatched investment in multibit PCM technology, generating an unequaled string of multibit PCM products. So it's not casually that these two companies now propose a fundamental move away from multibit PCM.
Successively higher bit rates and higher sampling rates for PCM systems have, in fact, improved sound quality. But the improvements are getting smaller and smaller. And the reason for these diminishing returns is becoming clear: filtering. Every PCM system requires steep filters at the input to absolutely block any signal at or above half the sampling frequency. (In conventional 44.1 kHz sampling, brick wall filters must pass 20 kHz audio, yet reject 22.05 kHz, a difficult task.) In addition, requantization noise is added by the multi-stage or cascaded decimation (downsampling) digital filters used in recording and the multi-stage interpolation (oversampling) digital filters used in playback. Every increase in the sampling rate eases the difficulty of the brick wall filter. But simply increasing the sampling rate can't correct the vexing problem of multi-stage decimation and interpolation.
This problem was the inspiration for Direct Stream Digital. By using existing processes and simply eliminating decimation and interpolation, we developed a whole new way of capturing digital audio. As in conventional PCM systems, the analog signal is first converted to digital by 64x oversampling delta-sigma modulation. The result is a 1-bit digital representation of the audio signal. Where conventional systems immediately decimate the 1-bit signal into a multibit PCM code, Direct Stream Digital records the 1-bit pulses directly.
Conventional multibit PCM requires decimation filters on the record side plus interpolation filters on the playback side.
Direct Stream Digital eliminates the filters and records the original 1-bit signal directly.
The delta-sigma analog-to-digital converter uses a negative feedback loop to accumulate the audio waveform. If the input waveform, accumulated over one sampling period, rises above the value accumulated in the negative feedback loop during previous samples, the converter outputs a digital 1. If the waveform falls relative to the accumulated value, a digital 0 is output. As a result, full positive waveforms will be all 1s. Full negative waveforms will be all 0s. The zero point will be represented by alternating 1s and 0s. Because the instantaneous amplitude of the analog waveform is represented by the density of pulses, the method is sometimes called Pulse Density Modulation (PDM).
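The feedback loop described above can be sketched in a few lines of Python. This is a minimal first-order illustration under stated assumptions, not the converter used for DSD: the names `delta_sigma_1bit` and `pulse_density`, the input range of -1.0 to +1.0, and the single-integrator loop are all hypothetical simplifications chosen for readability.

```python
def delta_sigma_1bit(samples):
    """Encode samples in the range -1.0..+1.0 as a list of 1-bit values.

    Assumption: a first-order loop with a single integrator, for illustration only.
    """
    integrator = 0.0  # accumulated difference between input and fed-back output
    feedback = 0.0    # analog value of the previous output bit (+1.0 or -1.0)
    bits = []
    for x in samples:
        integrator += x - feedback           # negative feedback: subtract last output
        bit = 1 if integrator >= 0.0 else 0  # 1-bit quantizer
        feedback = 1.0 if bit else -1.0
        bits.append(bit)
    return bits


def pulse_density(bits):
    """Fraction of 1s in the pulse train (0.0 = full negative, 1.0 = full positive)."""
    return sum(bits) / len(bits)
```

Under these assumptions, a constant full-positive input produces all 1s, a constant full-negative input produces all 0s, and a zero input produces an alternating pattern. For a constant input x, the pulse density settles near (x + 1) / 2.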
Simply looking at a multibit PCM pulse train tells you little about the audio waveform that it encodes. However, the Direct Stream Digital pulse train looks remarkably like the analog waveform it represents. The pulses point up where the analog waveform approaches full positive and down where the analog wave approaches full negative. (The pulse train has been shaded for clarity.)
The resulting pulse train has some remarkable properties. Like PCM digital audio, DSD is inherently resistant to the distortion, noise, wow and flutter of recording media and transmission channels. But unlike PCM, DSD looks quite analog. Simple inspection of the digital pulse train tells you much about the frequency and amplitude of the waveform. And digital-to-analog conversion can be as simple as running the pulse train through an analog low-pass filter!
In actual practice, the delta-sigma pulse train is relatively noisy. Ultra-high signal-to-noise ratios as required for DSD in the audio band are achieved through 5th-order noise shaping filters. These effectively shift the noise up in frequency, out of the audio band.
Sony and Philips designed DSD to capture the complete information of today's best analog systems. The best 30 ips half-inch analog recorders can capture frequencies past 50 kHz. DSD can represent this with a frequency response from DC to 100 kHz. To cover the dynamic range of a good analog mixing console, the residual noise power was held at -120 dB through the audio band. This combination of frequency response and dynamic range is unmatched by any other recording system, digital or analog.
Both companies wanted to gain feedback for the purpose of refining DSD technology. So in 1995, demonstrations of early versions of the system were conducted in the recording centers of Tokyo, Los Angeles, New York and London. DSD has been demonstrated to a broad cross section of artists, producers, recording and mastering engineers and audiophile consumers. Presentations included carefully volume-matched three-way comparisons among DSD, state-of-the-art 20-bit PCM and the ultimate standard, a live studio feed. The responses ranged from cautious optimism to unbridled I-just-heard-the-future-of-audio enthusiasm.
A simplified illustration of the effect of noise shaping. The maximum audio frequency, fm, is nominally 20,000 Hz. Noise shaping moves most of the noise power far above the audio band, where it will be inaudible.
A notorious torture-test for recording systems, the 10 kHz square wave (top trace) includes component frequencies well above the audio band. The 16-bit PCM system approximates this with a 10 kHz sine wave (second to top trace). In comparison, the 1-bit Direct Stream Digital captures the wave's true shape (bottom trace).
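The square-wave comparison follows from simple arithmetic, sketched here in Python. An ideal square wave contains only odd harmonics of its fundamental, each with relative amplitude 1/n. The function name `square_wave_harmonics` and the choice of 22,050 Hz (half of 44.1 kHz) and 100,000 Hz as the usable bandwidths are assumptions made for illustration.

```python
def square_wave_harmonics(fundamental_hz, bandwidth_hz):
    """List (frequency, relative amplitude) for each odd harmonic within the bandwidth."""
    harmonics = []
    n = 1
    while n * fundamental_hz <= bandwidth_hz:
        harmonics.append((n * fundamental_hz, 1.0 / n))
        n += 2  # an ideal square wave has only odd harmonics
    return harmonics


# Assumed bandwidths: half the CD sampling rate, and the 100 kHz figure quoted for DSD.
cd_components = square_wave_harmonics(10_000, 22_050)
dsd_components = square_wave_harmonics(10_000, 100_000)
print([f for f, _ in cd_components])
print([f for f, _ in dsd_components])
```

With these assumed bandwidths, only the 10 kHz fundamental fits within the CD band, which is why the PCM trace looks like a sine wave, while a 100 kHz bandwidth also admits the harmonics at 30, 50, 70 and 90 kHz.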
DSD in Archiving.
DSD is a superlative solution for music companies desperate to transfer archival recordings before they disintegrate forever. The unprecedented frequency response and dynamic range mean that DSD will precisely capture every nuance of the original, down to the noise floor and below.
Archiving benefits from another important difference between DSD and multibit PCM. In the multibit PCM world, significant improvement means changing the digital word length or changing the sampling frequency. And that means breaking established formats and creating new ones. DSD is quite different. By changing the characteristics of the loop filter in the analog-to-digital converter, you can actually change the audio specifications of DSD. Some filters can be optimized for bandwidth. Others can be optimized for low noise. Future designs with higher-order filters and greater sophistication can yield performance beyond the grasp of today's technology. And this can all happen without losing compatibility. Older archived DSD recordings will be compatible with the new machines. And newly archived material will be compatible with existing hardware! In this way, DSD archives become future-proof. They retain their currency, even as filter technology makes significant strides.
DSD in Recording.
DSD samples music at 64 times the rate of Compact Disc (64 x 44,100 Hz). This yields a sampling rate of 2,822,400 Hz. At first, recording all those bits may seem like a daunting task. But remember, the CD uses 16 bits for each sample, so the bit rate per channel is 16 x 44,100 Hz or 705,600 bits per second. DSD uses one bit per sample, so the bit rate per channel is 2,822,400 bits per second. That's only four times the data of Compact Disc. While this data rate is high, it's well within the capabilities of many current recording systems, both tape and hard disk.
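The bit-rate arithmetic above can be checked with a short Python sketch. The constant and variable names are chosen here for illustration; the figures themselves come from the text.

```python
CD_SAMPLING_RATE_HZ = 44_100
CD_BITS_PER_SAMPLE = 16
DSD_OVERSAMPLING_FACTOR = 64
DSD_BITS_PER_SAMPLE = 1

# Per-channel figures, as in the text.
dsd_sampling_rate_hz = DSD_OVERSAMPLING_FACTOR * CD_SAMPLING_RATE_HZ
cd_bit_rate = CD_BITS_PER_SAMPLE * CD_SAMPLING_RATE_HZ
dsd_bit_rate = DSD_BITS_PER_SAMPLE * dsd_sampling_rate_hz
ratio = dsd_bit_rate / cd_bit_rate

print(f"DSD sampling rate: {dsd_sampling_rate_hz:,} Hz")
print(f"CD bit rate per channel: {cd_bit_rate:,} bits/s")
print(f"DSD bit rate per channel: {dsd_bit_rate:,} bits/s")
print(f"DSD / CD data ratio: {ratio}")
```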
DSD in Production.
The DSD pulse train can be downconverted to conventional PCM digital audio. But in the long run, the full benefits can best be retained by an all-DSD production chain. Both Sony and Philips have begun serious work on that chain. Prototype DSD-capable recording systems already exist. In addition, Sonic Solutions is developing a DSD-compatible version of Sonic's well-known SonicStudio line of digital audio workstations. Substantial progress has been made in the mixing, crossfading, and equalizing of DSD 1-bit signals. In short, there's no theoretical barrier to the creation of a full range of DSD post production tools, recorders, mixers, editors and effects processors. Of course, while DSD establishes new standards in the recording studio, the consumer marketplace remains enthusiastically wedded to the Compact Disc. Clever technology is required to downconvert 1-bit DSD into 16-bit PCM for distribution on Compact Disc. That technology is called Super Bit Mapping Direct processing.
Downconverting Direct Stream Digital from 1-bit/64fs to 16-bit/1fs is not theoretically difficult. Every DAT recorder and A/D converter has a circuit that does much the same thing. But we needed to downconvert DSD in such a way as to retain the maximum possible signal quality in the 16-bit world. The answer was to completely filter and noise shape the DSD signal in a single stage. Thus, interstage requantizing errors would be eliminated. Aliasing would be minimized. And ripple would be suppressed. Sony designed a super-power one-stage FIR digital filter/noise shaper with an amazing 32,639 taps. This is Sony's real-time Super Bit Mapping Direct processor. Just as Sony's existing Super Bit Mapping circuit helps approach 20-to-24-bit precision in 16-bit digital audio, the new Super Bit Mapping Direct processor enables DSD to be released on industry-standard Compact Discs with audibly superior performance. Subjective comparisons conclude that much of the original DSD benefit is preserved in 16-bit Compact Disc release. Prototype SBM Direct processors have already been built. And they've already been used in the creation of commercial Compact Disc titles. These are now entering the market, enabling anyone with a CD player to judge the sound for themselves.
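The idea of filtering and decimating in a single stage can be illustrated with a small Python sketch. This is not the Super Bit Mapping Direct design: it assumes a generic Hann-windowed sinc low-pass filter with a hypothetical 1,025 taps rather than 32,639, it omits noise shaping and 16-bit requantization entirely, and the function names are invented for this example.

```python
import math


def windowed_sinc(num_taps, cutoff):
    """Low-pass FIR taps; cutoff is a fraction of the input sampling rate (0 < cutoff < 0.5)."""
    mid = (num_taps - 1) / 2
    taps = []
    for n in range(num_taps):
        t = n - mid
        ideal = 2 * cutoff if t == 0 else math.sin(2 * math.pi * cutoff * t) / (math.pi * t)
        window = 0.5 - 0.5 * math.cos(2 * math.pi * n / (num_taps - 1))  # Hann window
        taps.append(ideal * window)
    total = sum(taps)
    return [c / total for c in taps]  # normalize to unity gain at DC


def decimate_single_stage(bits, taps, factor=64):
    """Filter a 1-bit stream and keep every `factor`-th output, all in one FIR stage."""
    levels = [1.0 if b else -1.0 for b in bits]  # map bits to +1 / -1
    out = []
    for start in range(0, len(levels) - len(taps) + 1, factor):
        out.append(sum(c * levels[start + k] for k, c in enumerate(taps)))
    return out


# A pulse train with three 1s in every four bits has a pulse density of 0.75,
# which corresponds to a constant level of +0.5.
bits = [1, 1, 1, 0] * 1024
taps = windowed_sinc(1025, 1 / 128)  # 1/128 of 2.8224 MHz is 22.05 kHz
pcm = decimate_single_stage(bits, taps)
print(len(pcm), pcm[0])
```

Because only every 64th filter output is ever computed, there is no intermediate sampling rate and therefore no interstage requantization, which is the point the text makes about a one-stage design.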
The arithmetic of DSD downconversion.
Downconversion to 16-bit/44.1 kHz digital audio is just one option for the DSD bit stream. The system's 2.8224 MHz sampling rate is specifically designed for high precision downconversion to all current PCM sampling rates. In all cases, the conversions are performed with simple integer multiplies and divides. As a result, music companies can use DSD for both archiving and mastering. And DSD masters can be easily downconverted for release at any sampling rate or wordlength. This makes DSD a digital Rosetta Stone, able to speak all languages with equal facility. It also means that DSD can support a hierarchy of quality for distribution that allows the music company to precisely position different products for different applications.
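The claim about integer multiplies and divides can be checked by reducing each rate ratio to lowest terms. The Python sketch below uses the standard `fractions` module; the list of PCM rates is an assumption chosen for illustration, since the text says only "all current PCM sampling rates".

```python
from fractions import Fraction

DSD_RATE_HZ = 2_822_400
# Assumed set of common PCM sampling rates; not an exhaustive list from the source.
PCM_RATES_HZ = [32_000, 44_100, 48_000, 88_200, 96_000]


def conversion_ratio(pcm_rate_hz):
    """Return the PCM/DSD rate ratio in lowest terms (multiply by numerator, divide by denominator)."""
    return Fraction(pcm_rate_hz, DSD_RATE_HZ)


for rate in PCM_RATES_HZ:
    r = conversion_ratio(rate)
    print(f"{rate} Hz: multiply by {r.numerator}, divide by {r.denominator}")
```

Rates in the 44.1 kHz family need only a division (by 64 or 32 here), while 32, 48 and 96 kHz need a multiply by 5 followed by a division by 441, 294 and 147 respectively.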
The sampling rate of Direct Stream Digital lends itself to simple downconversion to all the standard PCM distribution formats.
A single DSD server can perform multiple tasks to support a music company's production and marketing environments.
Hybrid Discs.
To support both conventional CD players and Super Audio CD players, the new disc is a hybrid. The conventional CD layer is fully compatible with Compact Disc Red Book specifications. The CD player's laser reads this layer through a semi-transmissive layer. The Super Audio Compact Disc player's laser reads this semi-transmissive layer, which contains the Direct Stream Digital signals. On this two-layer disc, both signal layers are read from the same side. The reverse side is available for label printing, as with conventional Compact Discs.
[Figure: Cross-section of the hybrid disc, showing the protective layer, two 0.6 mm PC substrates, the semi-transmissive layer and the laser pickup, with layer dimensions of approximately 10 µm and 0.05 µm marked.]
The Super Audio CD is similar in concept to the standard CD, with an additional high-density layer.
| Specification | CD Red-Book Compatible Layer | Super Audio CD Layer |
| --- | --- | --- |
| Reflectivity | Reflective | Semi-Transmissive |
| Capacity | 780 MB | 4,700 MB (4.7 GB) |
| Minimum Pit/Land Length | 0.83 µm | 0.40 µm |
| Track Pitch | 1.6 µm | 0.74 µm |
| Laser Wavelength | 780 nm | 650 nm |
| Pickup Lens Numerical Aperture (NA) | 0.45 | 0.60 |
The two layers are read from the same side. The CD laser reads the reflective layer through the semi-transmissive layer.
This hybrid-disc technology is all that's needed to avoid the issue of dual inventory: separate retail stocks of old CDs and new CDs for each title.
Direct Stream Transfer.
The 4.7 GB layer of the Super Audio Compact Disc can hold two complete, 74-minute versions of the music: DSD 2-channel stereo and DSD 6-channel sound. One key to this accomplishment is a Philips lossless coding method called Direct Stream Transfer. In general, there are two types of bit rate reduction technologies. Lossy data reduction actually chooses parts of the signal that can be ignored, for example, based on psychoacoustic models. Examples include MPEG-1 and MPEG-2 for video, ATRAC, Dolby Digital (AC-3) and DTS for audio.
Developed primarily for computer applications, lossless coding is different. It reduces the data rate while preserving the original signal bit-for-bit. As a simplified example, a string of eight consecutive 0s might be more efficiently encoded as 8x0. Direct Stream Transfer is a far more sophisticated process, involving data framing, prediction and entropy encoding stages. In tests, Direct Stream Transfer has achieved a 50% reduction in bit rate, with zero loss in data integrity. For lossless coding, 50% data reduction is quite impressive. Halving the data required means doubling the storage capacity.
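The simplified "8x0" example above is a form of run-length encoding, sketched below in Python. This illustrates only the bit-for-bit property of lossless coding; it is not Direct Stream Transfer, which the text describes as using framing, prediction and entropy encoding. The function names are hypothetical.

```python
from itertools import groupby


def rle_encode(bits):
    """Collapse each run of identical symbols into a (count, symbol) pair."""
    return [(len(list(run)), symbol) for symbol, run in groupby(bits)]


def rle_decode(runs):
    """Expand (count, symbol) pairs back into the original string."""
    return "".join(symbol * count for count, symbol in runs)


original = "0000000011010000"
encoded = rle_encode(original)
print(encoded)
print(rle_decode(encoded) == original)
```

Decoding reproduces the input exactly, which is what distinguishes a lossless scheme from the lossy methods listed earlier. Simple run-length encoding only saves space when the data contains long runs, which is one reason practical lossless coders rely on prediction and entropy coding instead.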
Digital Watermarking.
Copy management systems can successfully defend copyrighted works against casual duplication. But recent history points to the existence of large, well-funded counterfeiting operations prepared to mass-produce illegal copies of Compact Discs. Our goals for the Super Audio Compact Disc required a higher level of protection against this new, more sophisticated crime. Philips and Sony have developed a multi-faceted technology called Digital Watermarking. Using a technology called Pit Signal Processing (PSP), the system can actually put a faint image or watermark on the signal side of the disc. This image, which can take the form of text or graphics, is extremely hard for pirates to duplicate clearly, no matter what duplication strategy is used. Visibly corrupted watermarks then become a sure sign of piracy. They alert consumers and retailers that something is wrong. And they help prosecutors trace illegal copies back to the source. The Super Audio CD Digital Watermark system also embraces disc bar codes, plus invisible, irremovable information embedded on the disc. It adds up to powerful protection not only for copyright holders, but also for consumers.
Conclusion.
Direct Stream Transfer technology relies on data framing, prediction and entropy encoding.
In proposing the Super Audio Compact Disc, Sony and Philips have anticipated the full range of needs that the next-generation music carrier must meet. No other proposal so completely satisfies the desires of recording artists, producers, engineers, music companies, retailers, sophisticated audiophiles and general consumers. No other proposal achieves such high levels of audiophile performance. And no other proposal guarantees compatibility with the nearly 500 million existing Compact Disc players and 10 billion existing Compact Discs. Audiophiles and music lovers are seeking the next level in digital audio reproduction. The Super Audio Compact Disc is the ideal solution.
Just as high-tech watermarks help defeat currency counterfeiters, Digital Watermarks on the signal side of the disc can help defeat piracy.
© 1997 Sony Electronics Inc. and Philips Electronics N.V. All rights reserved. All trademarks are property of their respective owners. Features and specifications are preliminary and subject to change without prior notice.