
AVID < > HD

HD Handbook: an A to Z guide

Contents

Introduction
Real-time HD Workflows and Avid HD Solutions
Guide to HD Terms and Technology
Avid A to Z Guide Topics:
    Production and Cost Advantages of HD
    Broadcast and HD
    HD, Film, and Digital Betacam: Advantages and Detractions
    Non-broadcast Use of HD
    Progressive Versus Interlaced
    Multi-format Distribution
    Audio and HD
    HD and Compression
Appendices

Introduction

To many of us, it seems as if high definition (HD) has been talked about for years. For others, it's as though it has appeared out of nowhere. In many ways, both perceptions are correct. The dynamics behind this fundamental change to the world's media and entertainment industries have been difficult to predict or to harness. Early adoption in production was driven both by a quest for quality and by the strategic objective of making valuable properties future-proof. In regions where standard-definition delivery is of a higher quality (for example, where 625-line PAL standards prevail), the benefits of long-term viability and worldwide distribution are compelling. And, most recently, consumers have been driving demand for HD televisions and HD delivery at a far more rapid pace than that mandated by governments and regulatory agencies.

Up to now, the professional has been forced to make a difficult choice between quality and efficiency. Uncompressed HD media is, of course, the benchmark of image quality. It also requires an enormous investment in storage and bandwidth resources and makes real-time collaboration over a standard network nearly impossible. To make HD more accessible, many companies have concocted HD solutions that depend on highly compressed camera acquisition formats. While this reduces investment costs, the true price is paid in reduced image quality, especially when this compressed media is put through the multiple generations of image processing typical of advanced postproduction.

Avid understands this dilemma, and has focused its efforts on developing an HD strategy that provides a viable path for both the film and video postproduction industries. The Avid HD strategy begins with support for all popular HD formats, from HDV to uncompressed HD, across the family of Avid systems. What separates the Avid HD strategy from all others is Avid DNxHD encoding technology. Simply put, Avid DNxHD encoding is engineered for postproduction processing. It delivers mastering-quality HD media at standard definition (SD) data rates and file sizes. Among other important benefits, this means that today's standalone systems and Avid Unity shared media networks are, for all intents and purposes, HD-ready. Today.

Avid is making Avid DNxHD encoding technology available royalty-free, reinforcing its commitment to open standards. The source code for Avid DNxHD encoding technology will be licensable free of charge, available through the Avid Website as a download to any user who wants to compile it on any platform. This handbook details workflows that can be accelerated or adapted to HD production with the advent of high-efficiency Avid DNxHD encoding. It also provides an extensive glossary of HD terms and technologies used throughout the industry, with expanded essays focusing on key topics. We hope you'll find it useful and informative.

Workflows

Real-time HD Workflows and Avid HD Solutions


July 2004

2004 is the year high definition (HD) hit the mainstream. All the major networks have HD broadcast channels or are broadcasting HD during primetime hours. Cable and satellite providers are offering HD at low cost. And sales of HD screens to consumers have exploded as demand drives prices lower. Even productions without current HD broadcast requirements are choosing to produce programming in HD, ensuring a high-quality master for repurposing content in the future.

The popularity of HD is also expanding because a wider range of acquisition products is now available. New prosumer cameras that support lower-end HDV have recently been introduced, while high-end HD resolutions can now be achieved with HDCAM SR-capable cameras. With these newer acquisition methods, HD formats now scale from 19 megabits per second (HDV) to 440 megabits per second (HDCAM SR), making HD a reality across the industry and in all markets.

HD is used in a variety of ways, all of which need to be taken into consideration in order to design a complete workflow solution. Workflow details and metadata management will differ when considering HD as an acquisition format, HD as a postproduction format, or HD as a delivery format. This section discusses the incorporation of HD formats into various postproduction workflows supported by Avid.
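To put those acquisition data rates in storage terms, here is a rough sketch (the rates are the nominal figures quoted above; audio, metadata, and overhead are ignored, and decimal gigabytes are assumed):

    # Approximate storage for one hour of material at nominal video data rates.
    RATES_MBPS = {"HDV": 19, "DVCPRO HD": 100, "HDCAM": 135, "HDCAM SR": 440}

    def gb_per_hour(mbps):
        bits_per_hour = mbps * 1_000_000 * 3600
        return bits_per_hour / 8 / 1_000_000_000   # decimal GB

    for fmt, rate in RATES_MBPS.items():
        print(f"{fmt:10s} ~{gb_per_hour(rate):6.1f} GB/hour")
    # HDV ~8.6, DVCPRO HD ~45.0, HDCAM ~60.8, HDCAM SR ~198.0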

Avid and HD
Avid has become an established leader in the HD arena by providing support for HD media itself, as well as by enabling workflow solutions as part of the overall HD production pipeline. Avid|DS HD and Avid DS Nitris systems have led the way with high-quality uncompressed 10-bit HD with real-time, two-stream playback. Highly efficient workflows depend on more than just playback performance, however: sharing of media and metadata management must also be considered. Historically, simultaneous sharing and playback of multiple-stream HD media has been impossible due to the enormous bandwidth requirements of uncompressed HD data. That reality has changed, however, with the introduction of Avid DNxHD, highly efficient encoding that provides mastering-quality HD media in dramatically reduced file sizes. Powered by Avid DNxHD encoding technology, Media Composer Adrenaline and Avid DS Nitris systems can share the same mastering-quality HD content in real time over an Avid Unity MediaNetwork. Also, the source code for Avid DNxHD encoding is licensable free of charge through the Avid Website to any user who wants to compile it on any platform. Avid believes that MXF-based Avid DNxHD encoding is poised to become an industry standard: companies that want to output or read Avid DNxHD media can do so, and content creators who want to protect their assets can have the source code as insurance.

From acquisition to delivery, many workflows are possible in Avid environments that are not possible in other systems. Avid enables these highly efficient workflows through superior metadata management, tracking metadata such as timecode, audio timecode, and film KeyKode for all captured sources. For example, by tracking audio timecode from the set, in addition to the timecode used for picture, Avid systems can easily synchronize the two in postproduction, creating a new source clip yet still keeping track of the original sources for downstream processes.


Native DVCPRO HD and HDV Workflow


When working with DVCPRO HD and HDV media, capturing and editing in their native formats is critical. Systems that can't handle native media must transcode to another format, wasting time and storage space and sacrificing quality. Avid editing products will support native DVCPRO HD and HDV from Avid Xpress Pro through Avid Media Composer Adrenaline systems. Media is captured over IEEE 1394 (FireWire) directly into the editing application; deck control, picture, and sound are all handled over this single connection. As shown in the diagram in Fig. [A], the production and postproduction processes are self-contained, but the final output can vary depending on the client's distribution needs.

With DVCPRO HD and HDV projects, Avid offers the best of both worlds. Cuts-only sections of the project remain native: storage-efficient, original digital quality. But native formats are not ideal for sections with effects, compositing, and titles, because quality can suffer. So for these sections, Avid takes advantage of mastering-quality Avid DNxHD encoding. The result is a final project with the best possible quality, performance, and storage efficiency. With Adrenaline systems, a wide range of HD output formats is possible, from native DVCPRO HD and HDV* to Windows Media HD encoding. With the Avid HD expansion option, mastering-quality HD can be output over an HD-SDI connection to an HD VTR. And high-throughput, collaborative workflows can be configured with Media Composer Adrenaline and Avid DS Nitris systems sharing Avid DNxHD media.

Fig. [A]: Native DVCPRO HD and HDV Workflow

[Diagram: DVCPRO HD or HDV material is captured directly into Avid Xpress Pro or Media Composer Adrenaline over a single IEEE 1394 (FireWire) connection. All titles, effects, and renders use the Avid DNxHD resolutions to maintain high quality during the editorial process; non-native media is rendered to DVCPRO HD or HDV* for native output over FireWire. Media Composer Adrenaline and Avid DS Nitris systems output over HD-SDI to HDCAM, DVCPRO HD, or D5 decks for a DVCPRO HD master, and other delivery targets include Windows Media 9 Series, MPEG-2, Internet streaming, HD-DVD, D-VHS, and HDV.]

*Planned for future release


HD Workflow Overview
To fulfill their HD acquisition, storage, and distribution needs, users have the flexibility to choose from a full range of Avid solutions. An HD workflow can be as simple as one using HDV, or much more complex, such as that for film-based television or feature film. Avid mastering-quality postproduction HD solutions range from uncompressed 10-bit HD to the highly efficient resolutions of the Avid DNxHD family. The data rates of the Avid DNxHD resolutions are, on average, the same as that of uncompressed standard definition (SD) video. Because of this, the infrastructure needed to support HD is very similar to that needed to support standard definition. This efficiency translates into easy-to-configure standalone HD storage systems, or high-performance collaborative solutions when connected to an Avid Unity MediaNetwork system.
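The arithmetic behind that claim can be sketched as follows (assuming raw 4:2:2 component rates of 13.5 MHz for SD and 74.25 MHz for HD; this is an illustration, not a product specification):

    # Raw 4:2:2 video data rate: luminance plus two half-rate color components.
    def raw_422_mbps(luma_hz, bits):
        samples_per_sec = luma_hz * 2          # Y + (B-Y)/2 + (R-Y)/2
        return samples_per_sec * bits / 1_000_000

    print(raw_422_mbps(13_500_000, 8))    # uncompressed SD:  216 Mb/s
    print(raw_422_mbps(74_250_000, 8))    # uncompressed HD: 1188 Mb/s
    # Avid DNxHD 145 and 220 run at 145 and 220 Mb/s -- SD-class data rates.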

Fig. [B]: Workflow with Avid Media Composer Adrenaline with HD Expansion Option

[Diagram: material from HDCAM, DVCPRO HD, or D5 decks is captured over HD-SDI with real-time encode through the HD Expansion Option. Titles and graphics are created in Avid DNxHD on the Media Composer Adrenaline system, with real-time HD preview. Sequences are encoded for output as Windows Media 9 HD, Avid DNxHD, DVCPRO HD, or HDV, with HD-SDI real-time decode back out to the decks.]


Offline/Online Workflow
There are several factors that will determine the need for an offline versus an online approach, such as the amount of footage, the time to deliver the final program, and the storage budget. For instance, even though the cost of high-quality HD acquisition has come down, productions tend to shoot more than ever before, ultimately increasing the amount of total footage. Efficient processes are therefore needed for postproduction, such as an offline edit followed by a recapture for the online finish.

Fig. [C]: 1080p/23.976 HD Offline Workflow with Media Composer Adrenaline (the same processes are used for all frame rates)

[Diagram: tape sources play from an HD VTR fitted with down-conversion cards; material is captured into an Avid Media Composer Adrenaline HD 23.976p project, either as 1080p/23.976 or as SD with 2:3 insertion.]

It is easy to correlate the Avid DNxHD resolutions to existing HD formats (Fig. [D]):

Format           Bit Depth        Sampling          Bandwidth
Avid DNxHD 145   8-bit            4:2:2             145 Mb/sec
Avid DNxHD 220   8- and 10-bit    4:2:2             220 Mb/sec
Panasonic D5     8- and 10-bit    4:2:2             220 Mb/sec
DVCPRO HD        8-bit            4:2:2             100 Mb/sec
HDCAM            8-bit            3:1:1             135 Mb/sec
HDCAM SR         10-bit           4:2:2 or 4:4:4    440 Mb/sec

Of course, there is always uncompressed HD available via the Avid DS Nitris system, which offers the industry's best-performing 10-bit uncompressed HD solution today. Uncompressed HD is expected to be offered as an additional option for the Media Composer Adrenaline product in the future.


When the offline is complete, the user can recapture the material at any of the Avid DNxHD resolutions for finishing and mastering. A good guideline is to use the Avid DNxHD resolution that matches the megabit data rate of the acquisition format being shot. So for HDCAM-originated material, the user will use the Avid DNxHD 145 resolution, which closely matches the data rate of HDCAM but provides much higher image quality and is far better suited to color correction and multi-generational effects renders. Users sourcing from D5 will most likely use the Avid DNxHD 220 (8-bit) or Avid DNxHD 220x (10-bit) versions of the Avid DNxHD codec. The end result will be better mastering results using significantly less storage.
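That guideline could be expressed as a simple lookup; the sketch below is hypothetical, based on the figures in this section, not an Avid API:

    # Suggested finishing resolution by acquisition format (per the guideline
    # above: pick the Avid DNxHD resolution nearest the source data rate).
    SUGGESTED_DNXHD = {
        "HDCAM":     "Avid DNxHD 145",                      # ~135 Mb/s source
        "DVCPRO HD": "Avid DNxHD 145",                      # 100 Mb/s source
        "D5":        "Avid DNxHD 220 or Avid DNxHD 220x",   # 220 Mb/s source
    }

    def finishing_resolution(acquisition_format):
        return SUGGESTED_DNXHD.get(acquisition_format,
                                   "match the source's megabit data rate")

    print(finishing_resolution("HDCAM"))   # Avid DNxHD 145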


Film-Based Television Workflow


Film-based television will either transfer all the film sources to HD or transfer selects, depending on the client's needs as well as the scheduling within a facility. Avid editing systems track all the sources from original acquisition through HD and SD, so the user can, at any time, go back, retransfer just selects, and maintain a frame-accurate conform of the new sources. Feature films are also using HD as a means of viewing dailies, and then using those dailies for conforms when screening the different versions of a film. The conform process in an Avid workflow is far more efficient and less expensive than conforming with a film workprint, especially when many effects are being used.

Real-time Collaboration with HD Resolution


Any of the above workflows can be further enhanced by connecting two or more systems together via an Avid Unity MediaNetwork system. One of the key advantages of Avid DNxHD encoded media is that it enables the same real-time HD collaboration and workflow that customers enjoy today in SD environments using Avid Unity MediaNetwork systems. By enabling simultaneous media access for all users, without the need to copy, push, or pull, the overall time spent in postproduction is reduced while the highest-quality picture is maintained.

Fig. [D]: Film-Based Television Workflow

[Diagram: film is transferred at 1080p/23.976. Tape and audio sources are captured from an HD VTR into an Avid Media Composer Adrenaline HD 23.976p project. Deliverables include 1080i/59.94, 1080p/23.976, and 720p/59.94 masters, cut lists and EDLs, output to an HD projector or VTR, and NTSC SD with 2:3 insertion.]


Other Workflow Enablers


MXF
Avid DNxHD resolutions are wrapped in the MXF format, which was established by the Pro-MPEG Forum and codified as an international SMPTE standard.

Transcode
The user will also have the ability to transcode easily between all resolutions sharing the same frame rate. This is already a popular workflow on today's Media Composer Adrenaline systems. Transcoding allows any one resolution to be changed to another in an automated fashion rather than recapturing from tape. Consider an offline workflow where the sources are captured as HD but the producer, editor, or director wants to take a version of the selects on a laptop. Transcoding would allow the conversion of all selected master clips and sequences to a lower resolution targeted for the laptop, the end result offering more storage efficiency and potentially more video streams.

SD and HD Combined*
The Avid Media Composer Adrenaline system will also allow mixing of SD and HD in the same timeline for matching frame rates*. When this happens, the output format determines the resolution, and up-conversion or down-conversion occurs in real time (SD is up-converted to HD, or HD is down-converted to SD). This allows for an offline-to-online conform check, as well as a real-time proxy while all the elements are being recaptured in the target resolution. Once an HD sequence has been conformed, the output will have both the HD and SD signals available at the same time. In the case of a 1080p/23.976 conform, the SD down-convert will also have real-time 2:3 pulldown insertion applied. The user has the choice of three aspect ratios for the SD down-convert: full-height anamorphic, 16:9 letterbox, and 4:3 center crop. The user can also use the reformat tool for pan and scan, for fine control of the composition, but that output will not be real-time on Media Composer Adrenaline systems with the HD Option card. Reformatting on Avid DS Nitris systems is real-time for all resolutions via a DVE.
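The geometry behind those SD down-convert choices is simple arithmetic; a minimal sketch (pure math, not the real-time hardware path):

    # How much of a 16:9 picture survives each 4:3 down-convert choice.
    SRC, DST = 16 / 9, 4 / 3

    letterbox_fraction = DST / SRC   # 0.75: image fills 3/4 of screen height
    crop_fraction = DST / SRC        # 0.75: center crop keeps 3/4 of the width

    print(f"16:9 letterbox uses {letterbox_fraction:.0%} of the 4:3 screen height")
    print(f"4:3 center crop keeps {crop_fraction:.0%} of the 16:9 picture width")
    # Full-height anamorphic keeps everything but squeezes pixels horizontally.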

Graphics and Codecs


With the Adrenaline HD Option card, there will be support for all of the HD resolutions, aspect ratios, and color spaces when importing from any of the more than 25 formats currently available in all Avid editing systems. In addition, the Avid DNxHD codecs will be available as Avid QuickTime codecs so that third-party QuickTime-aware applications can use the reference files that point to the original media.

*expected end of year 2004

Numeric

1000/1001
Frame rates of 23.976, 29.97, and 59.94 fps (instead of 24, 30, and 60 fps) are used in some HD systems in order to relate directly to the 59.94 fields per second of NTSC. The use of these frame rates is necessary for HD so that simulcasts of HD and SD material keep in step. Although HD standards are new, they have to relate to existing standards, which means that legacy issues arise; this is one of them. The 1000/1001 offset first occurred in 1953 when 525/60 monochrome TV went to NTSC color. The color subcarrier frequency was set to be half an odd multiple (455) of the line frequency in order to minimize its visibility in the picture. Then the sound carrier was set at half an even multiple of the line frequency to lessen its beating with the color subcarrier. For compatibility with existing monochrome receivers, the sound carrier was fixed at the existing monochrome frequency of 4.5 MHz. For monochrome this was nearly 286 times the line frequency, but now it was made to be exactly 286 times the line frequency, which meant that it, the color subcarrier, and the frame rate all changed by the factor 1000/1001.

Here's the math:
Line frequency (Fl) would have been 525 x 30                 = 15,750 Hz
Color subcarrier (Fsc) would have been 455/2 x 15,750        = 3,583,125 Hz
But the 4.5 MHz sound carrier meant Fl became 4,500,000/286  = 15,734.265 Hz
So Fsc became 15,734.265 x 455/2                             = 3,579,545.2 Hz
And the frame rate became 15,734.265/525                     = 29.97 Hz

The nominal 24 fps of film transferred to television also has to change in order for 2:3 pulldown to work correctly when producing video for NTSC broadcast. The whole 1000/1001 issue exists to make sure that NTSC broadcasts continue to work well. Thus, HDTV standards include 1000/1001 offset frequencies for frame, field, and line rates, as well as digital sampling frequencies, all to keep in step with NTSC. But when analog is finally turned off, everyone can revert to 24, 30, and 60 Hz frame rates, and all other nominal frequencies. There are many further knock-on effects. The drop-frame timecode used with NTSC makes up the 1000/1001 difference by dropping frame numbers (two per minute, except every tenth minute). Audio digital sampling frequencies, which are locked to the video, are also affected, hence the .999 (1000/1001) offset. For example, 48 and 44.1 kHz sampling become 47.952 and 44.056 kHz respectively.
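The same arithmetic, checked in a few lines of code (a sketch of the derivation above):

    # NTSC's 1000/1001 offset, derived from the 4.5 MHz sound carrier.
    line_rate_mono = 525 * 30              # 15,750 Hz (monochrome)
    line_rate_ntsc = 4_500_000 / 286       # 15,734.265... Hz (color)
    subcarrier = line_rate_ntsc * 455 / 2  # 3,579,545.45 Hz
    frame_rate = line_rate_ntsc / 525      # 29.97002... Hz

    print(line_rate_ntsc / line_rate_mono)   # 0.999000999... = 1000/1001
    print(frame_rate)                        # 29.970029970...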


1080i
Short for 1080 lines, interlace scan. This is the very widely used format which is defined as 1080 lines, 1920 pixels per line, interlace scan. The 1080i statement alone does not specify the frame rate which, as defined by SMPTE and ITU, can be 25, 29.97, or 30 Hz. See also: Common Image Format, Interlace, ITU-R.BT 709, Table 3

1080p
1080 x 1920 sized pictures, progressively scanned. Frame rates can be as for 1080i (25, 29.97, and 30 Hz) as well as 23.976, 24, 50, 59.94, and 60 Hz. See also: Common Image Format, Progressive, ITU-R.BT 709, Table 3

13.5 MHz
Sampling frequency used in the 601 digital coding of SD video. The frequency was chosen to be a whole multiple of the 525 and 625-line television system frequencies and high enough to faithfully portray the 5.5 MHz of luminance information present in SD images. Digital sampling of most HD uses 74.25 MHz, which is 5.5 times 13.5 MHz. See also: 2.25MHz, ITU-R BT.601


176.4 kHz / 192 kHz


Seen as the current and future standard for archiving and production, this high sample-rate format is four times the current norm, giving almost transparent audio recording. However, a stereo file of 24-bit audio at 192 kHz requires about 69 MB of disk per minute, as opposed to just over 17 MB at 48 kHz.
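The arithmetic behind those figures (raw sample data only, ignoring file headers):

    # Disk space per minute for uncompressed stereo audio.
    def mb_per_minute(sample_rate_hz, bits=24, channels=2):
        bytes_per_second = sample_rate_hz * (bits / 8) * channels
        return bytes_per_second * 60 / 1_000_000

    print(mb_per_minute(48_000))    # ~17.3 MB/min
    print(mb_per_minute(192_000))   # ~69.1 MB/min -- four times as much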


2.25 MHz
This is the lowest common multiple of the 525/59.94 and 625/50 television line frequencies, being 15.734265 kHz and 15.625 kHz respectively. Although seldom mentioned, its importance is great as it is the basis for all digital component sampling frequencies both at SD and HD. See also: 13.5 MHz
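Its role as a common building block is easy to verify (a sketch using the frequencies quoted in this glossary):

    # 2.25 MHz underlies both SD and HD component sampling frequencies.
    base = 2_250_000
    print(13_500_000 / base)         # 6.0   -> SD luminance sampling (13.5 MHz)
    print(74_250_000 / base)         # 33.0  -> HD luminance sampling (74.25 MHz)
    print(base / 15_625)             # 144.0 -> multiple of the 625/50 line rate
    print(base / (4_500_000 / 286))  # 143.0 -> multiple of the 525/59.94 line rate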



2:3 Pulldown
This refers to the widely used method of mapping the 24 frames per second of motion picture film and, more recently, 24p video onto television systems with a 60 fields- or frames-per-second refresh rate, such as 525/60i, 1080/60i, and 720/60p. The 2:3 refers to two and then three of the television's 60 fields or frames being mapped to successive film frames. The sequence repeats every 1/6th of a second: after four 24 fps film frames and ten television frames at 60p, or ten fields at 60i. DVDs for the 525/60i (NTSC) market hold motion picture material stored at 24 fps, and the 2:3 operation is executed in the player. See also: Cadence, Compression-Friendly

Film frames @ 24 fps:   A     B       C     D
TV fields @ 60i:        a a   b b b   c c   d d d

Edit points should fall on the boundaries of the repeating pulldown sequence.
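The cadence can be generated programmatically; a minimal sketch:

    # Map film frames onto 60i fields using the 2:3 cadence.
    def pulldown_fields(film_frames="ABCD"):
        fields = []
        for i, frame in enumerate(film_frames):
            fields += [frame.lower()] * (2 if i % 2 == 0 else 3)
        return " ".join(fields)

    print(pulldown_fields())   # a a b b b c c d d d  (10 fields per 4 frames)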


2K
This is a picture format. Usually it refers to 1536 lines with 2048 pixels per line in RGB color space, which is the basis for digital intermediates and file-based workflows. This is not a television format, but 35mm film is often scanned to this resolution for use as digital film for effects work and, increasingly, for grading, cutting, and mastering. For publishing to television, a 16:9 (1080 x 1920) or a 4:3 aspect ratio window can be selected from the 2K material for HD and SD distribution. The format is also suitable to support high-quality transfers back to film or for D-cinema exhibition.


3:1:1
A sampling system used by HDCAM. This system reduces data considerably, and is considered acceptable for capture, but is not optimized for postproduction. In this system, Y, R-Y and B-Y are all sub-sampled.


3.375 MHz
Original unit of sampling frequency suitable for encoding chrominance information with a maximum bandwidth of 1.5 MHz. 3.375 MHz corresponds to unity in sampling ratios such as 4:1:1.

24p
Short for 24 frames, progressive scan. In most cases this refers to the HD picture format with 1080 lines and 1920 pixels per line (1080 x 1920/24p) although 720 lines and 1280 pixels is also used at 24p. The frame rate is also used for SD at 480 and 576 lines with 720 pixels per line, often as an offline for an HD 24p edit, or to create a pan-and-scan version of an HD down-conversion.

3D Compositing
Compositing which takes account of the z plane depth into the picture. For instance, compositing fog into a scene where z depth affects the shading. Other examples include casting shadows, surface plane clipping and blur, and other depth effects. Seamless effects require great attention to detail and being able to work in the z plane is a part of that process. See also: Compositing

24PsF
24p Segmented Frame. This blurs some of the film/video boundaries. It is video captured in a film-like way, formatted for digital recording, and can pass through existing HD video infrastructure. Like film, the images are captured at one instant in time rather than by the usual line-by-line interlace TV scans. The images are then recorded to tape as two temporally coherent fields (segments), one with odd lines and the other with even lines. This is a pure electronic equivalent of a film shoot and telecine transfer, except the video recorder operates at film rate (24 fps), not at television rate. Normal interlaced-scan images are not temporally/spatially coherent between field pairs. 24PsF provides a direct way to assist the display of this relatively low picture rate, as showing 48 segments per second reduces the flicker that would be apparent with 24 whole progressive frames. 25PsF and 30PsF rates are also included in the ITU-R BT.709-4 definition. See also: ITU-R BT. 709

4:1:1
A sampling system similar to 4:2:2 but where the R-Y and B-Y samples are at every fourth Y sample along a line. The horizontal sampling frequencies are 13.5, 3.375, and 3.375 MHz. Filtering ensures color information between samples is taken into account. It is used in DVCPRO (625 and 525 formats) as well as in DVCAM (525/NTSC) VTRs. No applications have yet arisen for this in HD. See also: 3.375 MHz



4:2:0
With 4:2:0 sampling, the horizontal sampling along a line is the same as for 4:2:2, but chrominance is only sampled on every other line. Appropriate pre-filtering of the analog chrominance means that the values on the lines not coincident with the samples are taken into account. 4:2:0's main importance is as the sampling system used by MPEG-2 encoders for distribution to digital viewers. With 4:2:0, the luminance and chrominance are sampled on one line just as in 4:2:2, but on the next line only the luminance is sampled. This halves the vertical resolution of the color information, rendering it unsuitable for postproduction, but the technique reduces the number of samples from 4:2:2 by 25 percent overall, an effective compression scheme. See also: 4:2:2, MPEG-2

4:2:2 sampling of luminance and color difference signals (chrominance on every line):

Line 1:  Y+B-Y+R-Y   Y   Y+B-Y+R-Y   Y   Y+B-Y+R-Y   Y
Line 2:  Y+B-Y+R-Y   Y   Y+B-Y+R-Y   Y   Y+B-Y+R-Y   Y
Line 3:  Y+B-Y+R-Y   Y   Y+B-Y+R-Y   Y   Y+B-Y+R-Y   Y

4:2:0 sampling reduces data by sampling color only on alternate lines:

Line 1:  Y+B-Y+R-Y   Y   Y+B-Y+R-Y   Y   Y+B-Y+R-Y   Y
Line 2:  Y           Y   Y           Y   Y           Y
Line 3:  Y+B-Y+R-Y   Y   Y+B-Y+R-Y   Y   Y+B-Y+R-Y   Y
Line 4:  Y           Y   Y           Y   Y           Y

4:2:2
This describes a digital sampling scheme that is very widely used in television. Many professional VTRs and nearly all studio infrastructures use SDI connections that depend on 4:2:2, and a similar dependence on 4:2:2 is growing in HD. For SD, 4:2:2 means that the components of pictures, Y, B-Y, R-Y, are digitally sampled at the relative rates of 4:2:2, which are actually 13.5, 6.75, and 6.75 MHz respectively. With 13.5 MHz assigned 4, the 6.75 MHz chrominance components each become 2.

Assigning 4 to 13.5 MHz may be because this allows other related sampling structures to be easily defined (4:2:0, 4:1:1). Certainly it is used as a ratio that applies to HD as well as SD, even though HD sampling is 5.5 times faster than SD. The structure of the sampling involves taking simultaneous (co-sited) samples of all three components on the first pixel of a line, followed by a sample of just the Y (luminance) component on the next. Thus luminance is assigned twice the digital bandwidth of chrominance. All lines of the picture are sampled the same way. Note that the samples are not simply instantaneous values of the analog levels but of their filtered results, and so reflect values on either side of the sampling point.

The scheme takes into account the need for sophisticated editing and postproduction, and so has a relatively high sampling rate for the color information (R-Y and B-Y), allowing good keying, color correction, and other processes working from the signal. This contrasts with the analog coded systems it replaced, PAL and NTSC, where such post work was impractical: digital processing needs to operate on the components, not directly on the coded signal, and there was insufficient chrominance bandwidth for making good key signals. 4:2:2 is often quoted when referring to HD, as this is a convenient way of indicating the relative sampling frequencies. But note that it does not mean they are the same values as for SD; the actual frequencies are 5.5 times higher: 74.25, 37.125, and 37.125 MHz. See also: 13.5 MHz, 2.25 MHz, Component Video, Y B-Y R-Y, SDI
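The j:a:b notation translates directly into data rates; a sketch (this simple formula covers per-line schemes such as 4:2:2 and 4:4:4, but not 4:2:0's vertical subsampling):

    # Raw data rate for a j:a:b sampled signal at a given luminance frequency.
    def data_rate_mbps(luma_hz, ratio=(4, 2, 2), bits=10):
        j, a, b = ratio
        samples_per_sec = luma_hz * (j + a + b) / j
        return samples_per_sec * bits / 1_000_000

    print(data_rate_mbps(74_250_000, (4, 2, 2), 10))   # HD 4:2:2 10-bit: 1485.0
    print(data_rate_mbps(74_250_000, (4, 4, 4), 10))   # HD 4:4:4 10-bit: 2227.5
    print(data_rate_mbps(13_500_000, (4, 2, 2), 8))    # SD 4:2:2  8-bit:  216.0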



4:2:2:4
4:2:2:4 is the same as 4:2:2 but with the addition of a full bandwidth key/alpha/matte channel. Unlike 4:2:2, this cannot be recorded on a single VTR or travel on a single SDI connection, but disk-based systems can more easily accommodate it. The 4:2:2:4 signal often exists as a video with key signal inside production and postproduction equipment and is transferred over networks or using dual SDI connections.


6.1
Similar to 5.1 but the surround speakers gain an extra center surround signal for improved rear spatial effect.

4:2:4
Short name for the Dolby Stereo/Prologic encoding and decoding system combination. Due to the analog/non-discrete encoding system of Dolby Stereo, many audio studios simulate the entire delivery process with a dummy encoder and decoder to hear the effect that the encoding process has on the final mix. The installation of the encoder (four into two, i.e., LCRS into LtRt) connected to the decoder (two into four, i.e., LtRt into LCRS) gives the system its name. See also: Dolby Matrix, Dolby Stereo

601
See ITU-R BT. 601

7.1
Similar to 5.1. However, it was felt that only three speakers across the front did not cover the large projection screens of modern theatres. Adding two additional speakers, Near Left and Near Right, gives better front spatial positioning and stops dips appearing when panning across large distances.

4:4:4
4:4:4 is quite different from all other quoted sampling systems in that it usually refers to sampling RGB color space such as 2K/4K rather than that of the Y, B-Y, R-Y components (although ITU-R BT.601 does allow for this as a sampling rate for components, it is rarely used). As television pictures are formed from red, blue, and green sensors, this is considered by many as the most accurate form of the digital signal, anything else being a form of compression. 4:4:4 simply makes full-bandwidth samples of R, G, and B. In studios this is usually converted to 4:2:2 as soon as possible as it does not fit the normal SDI/HD-SDI infrastructure and cannot be recorded on VTRs. However, IT networks and disk-based recorders can handle it and the signal is increasingly used in high-end postproduction.

709
See ITU-R BT. 709

720p
Short for 720 lines, progressive scan. Defined in SMPTE 296M and a part of both the ATSC and DVB television standards, the full format is 1280 pixels per line, 720 lines, and 60 progressively scanned pictures per second. It is mainly used by the particular broadcasters who transmit 720p. Its 60 progressively scanned pictures per second offer the benefits of progressive scan at a picture refresh rate high enough to portray action well, with advantages for sporting events, smoother slow-motion replays, etc.

74.25 MHz
The sampling frequency commonly used for luminance (Y) or RGB values of HD video. Being 33 x 2.25 MHz, the frequency is a part of the hierarchical structure used for SD and HD. It is a part of SMPTE 274M and ITU-R BT.709. See also: 1000/1001, 2.25 MHz

4:4:4:4
4:4:4:4 is 4:4:4 with an added full bandwidth key/alpha/matte channel.

44.1 kHz / 48 kHz


These are currently the standard audio sample rates used throughout the professional audio industry. Derived from video field/frame rates, they offer approximately 20 kHz audio response for digital recording systems.

88.2 kHz / 96 kHz


One of the new high sample rate audio formats. It can be either 16 or 24-bit and is double the current 44.1/48 kHz, giving greatly extended frequency response.

5.1
5.1 audio describes a multi-channel speaker system that has three speakers in front of the listener (left, center, and right), two surround speakers behind (left and right), and a sub-woofer for low frequencies. This configuration is used by Dolby for its Digital playback systems for theater and home DVD, and is also used by Digital Theater Systems (DTS). See also: AC3

.999
The offset in frequencies caused by the NTSC transmission standard. See 1000/1001


A
AAF
Advanced Authoring Format. AAF is an industry-driven, open standard designed specifically for exchanging rich metadata between different media authoring systems. Based on Avid's seminal OMF technology, AAF is completely resolution- and format-independent. Unlike EDL, AAF is equally capable of handling film, video, audio, and file-based sources at any frame rate or image size, assembled into any number of layers. And because AAF is extensible, it can store complex effects information and custom user metadata. Avid is a founding member of the AAF Association and has contributed much of the intellectual property for AAF, including the core object model and the Software Developers Kit (SDK). Avid continues to be active on the AAF board of directors and has invested heavily in implementing AAF in its products. Almost every Avid product supports AAF today, including Avid Xpress DV and Avid Xpress Pro software; Media Composer Adrenaline, NewsCutter Adrenaline FX, NewsCutter XP, Symphony, and Avid DS Nitris editing systems; Digidesign Pro Tools; and SOFTIMAGE|XSI software. AAF is the preferred interchange mechanism for exchanging projects between Avid products, and Avid's preferred method for project interchange with products from other vendors. See also: EDL, MXF, OMFI Website: www.aafassociation.org

AES
Audio Engineering Society (USA) Website: www.aes.org

AES 31
This is a series of standards defined by the AES for audio file interchange between digital audio devices. AES 31 Level 1 defines the use of WAV or Broadcast WAV files for raw audio data transfer. AES 31 Level 2 defines a basic audio EDL for the transfer of edits along with raw audio. AES 31 Level 3 is the proposed format for the interchange of mix information such as automation, mixer snapshots, EQ settings, etc. Currently, only Level 1 has been adopted by many of the professional manufacturers, and the Level 2 proposal is nearing completion. However, the lack of integration between video and audio manufacturers pigeonholes the AES 31 format into the audio market alone. Meanwhile, the OMF format, which has support from both audio and video manufacturers, has become the system of choice for professional production companies. It already carries audio and level information plus video data and graphic information, allowing many different professionals in the production process to share a common format of collaboration.

AC3
The audio compression algorithm developed by Dolby. AC3 is the basis for the Dolby Digital Film, Dolby Digital DVD, and Dolby E formats. Other manufacturers have also licensed it, e.g., Liquid Audio, which uses it for delivering high-quality music over the Internet. The system takes advantage of psycho-acoustics, reducing or leaving out the parts of the sound that we do not notice so much. It applies compression in varying degrees across the frequency range and can provide a high level of data compression without perceptually degrading the audio.

Active Picture
The part of the picture that contains the image. With the analog 625- and 525-line systems, only 575 and 487 lines respectively actually contain the picture. Similarly, the total time per line is 64 and 63.5 microseconds, but only around 52 and 53.3 microseconds contain picture information. As the signal is continuous, the extra time allows for picture scans to reset to the top of the frame and the beginning of the line. Digitally sampled SD formats contain 576 lines and 720 pixels per line (625-line system), and 486 lines and 720 pixels per line (525-line system), but only about 702 of the 720 pixels contain picture information; the 720 pixels are equivalent to 53.3 microseconds.

The sampling process begins during line blanking of the analog signal, just before the left edge of active picture, and ends after the active analog picture returns to blanking level. Thus, the digitized image includes the left/right frame bounds and their rise/fall times as part of the digital scan line. HD systems are usually quoted just by their active line content, so a 1080-line system has 1080 lines of active video. This may be mapped onto a larger frame, such as 1125 total lines, to fit with analog connections.

AES/EBU
Audio Engineering Society/European Broadcasting Union. The term AES/EBU is associated with the digital audio standard which they have jointly defined. An example is the AES/EBU interface format, which is also accepted by ANSI and defines a number of digital sampling standards using 16- and 24-bit resolutions and frequencies of 44.1 kHz for CDs and the 48 kHz commonly used for audio channels on professional digital VTRs. 88.2 and 96 kHz sampling are also defined. Website: www.aes.org


AIFF/AIFC
One of the very first audio file formats for PCs, the Audio Interchange File Format system is still used today. The OMF audio specification uses a modified version called AIFC for its audio transfer. See also: OMF

ARC
Aspect Ratio Converter. This equipment generally transforms images between television's traditional 4:3 aspect ratio and the 16:9 used by HD and widescreen digital SD pictures. Fundamentally, the process involves resizing and, perhaps, repositioning (panning) the picture. There are a number of choices for displaying a 16:9 image on a 4:3 screen and a 4:3 image on a 16:9 screen:

16:9 to 4:3 display:
- Full screen: losing the left and right edges
- Full image (letterbox): black bars at top and bottom

4:3 to 16:9 display:
- Full screen: losing the top and bottom of the image
- Full image: black bars at left and right

Some equipment goes further to offer presentation options that smoothly zoom from one state to another. See also: Aspect Ratio

Aliasing
Artifacts created as a result of inadequate video sampling or processing. Spatial aliasing results from the pixel-based nature of digital images and leads to the classic jagged edge (a.k.a. jaggies) appearance of curved and diagonal detail and twinkling on detail. This results from sampling rates or processing accuracy too low for the detail. Temporal aliasing occurs where the speed of the action is too fast for the frame rate, the classic example being wagon wheels that appear to rotate the wrong way. See also: Anti-aliasing


Anamorphic
This generally describes cases where vertical and horizontal magnifications are not equal. The mechanical anamorphic process uses an additional lens to compress the image by some added amount, often on the horizontal axis. In this way, a 1.85:1 or a 2.35:1 aspect ratio can be squeezed horizontally into a 1.33:1 (4:3) aspect film frame. When the anamorphic film is projected, it passes through another anamorphic lens to stretch the image back to the wider aspect ratio. This is often used with SD widescreen images which keep to the normal 720 pixel count but stretch them over a 33-percent wider display. It can also apply to lenses, such as a camera lens used to shoot 16:9 widescreen where the CCD chips are 4:3 aspect ratio. See also: Aspect ratio

Aspect Ratio
For pictures, this refers to the ratio of picture width to height. HD pictures use a 16:9 aspect ratio, which may also be noted as 1.77:1. This is a third wider than the traditional 4:3 television aspect ratio (1.33:1) and is claimed to enhance the viewing experience, as a wider field of view retains more of our concentration.

Pixel aspect ratio refers to the width versus height of a pixel in an image. HD always uses square pixels, as do most computer applications; SD does not. The matter is further complicated by SD using both 4:3 and 16:9 (widescreen) images, which all use the same pixel and line counts. Care is needed to alter pixel aspect ratio when moving between systems using different pixel aspect ratios, so that objects retain their correct shape.

With both 4:3 and 16:9 images and displays in use, some thought is needed to ensure a shoot will suit its target displays. All HD, and an increasing proportion of SD, shoots are 16:9, but many SD displays are 4:3. As most HD productions will also be viewed on SD, keeping the main action in the middle 4:3 safe area is clearly a good idea unless the display is letterboxed. There are also other aspect ratios to deal with, depending on source content versus distribution, such as a widescreen 2.40:1 film transferred to either 4:3 SD or 16:9 HD. Another distribution aspect ratio is 14:9, used by the BBC, which crops just a little from both the 16:9 and the 4:3, resulting in less data loss on the X and Y axes. See also: Anamorphic, ARC, Letterbox, Square pixels

ANSI
American National Standards Institute Website: www.ansi.org

Anti-aliasing
Attempts to reduce the visible effects of aliasing. This is particularly the case with spatial anti-aliasing that typically uses filtering processes to smooth the effects of aliasing which may be noticeable as jaggedness on diagonal lines, or twinkling on areas of fine detail. A better solution is to improve the original sampling and processing. See also: Aliasing



ATM
Asynchronous Transfer Mode (ATM). This is an IT networking technology, mostly used by telecom companies to provide reliable connections for transferring streaming data, such as television, at speeds up to 10 Gbit/s, though 155 and 622 Mbit/s are most common for television. ATM is connection-based and establishes a path through the network before data is sent, which allows a high quality of service. See also: Ethernet, Fibre Channel Website: www.atmforum.com


ATSC
Advanced Television Systems Committee. In the USA, this body was responsible for creating the DTV standard. This was a combined industry effort and includes SD as well as HD standards for digital television, describing 18 video formats. ATSC also specifies the use of MPEG-2 video compression and AC-3 for audio. The transmission system for terrestrial and cable broadcasting is based on vestigial sideband (VSB) technology: 8-VSB is used for terrestrial and 16-VSB for cable. Current active broadcast users of ATSC are in North America. See also: Table 3 Website: www.atsc.org

AVR
AVR is a range of Motion-JPEG video compression schemes devised by Avid Technology for use in its ABVB hardware-based non-linear systems. An AVR is referred to as a constant-quality M-JPEG resolution, since the same quantization table (of coefficients) is applied to each frame of a video clip during digitization. For any given AVR, the actual compressed data rate will increase as the complexity of the imagery increases. For example, a head shot typically results in a low data rate, while a crowd shot from a sporting event will yield a high data rate. To avoid system bandwidth problems, AVRs utilize a mode of rate control called rollback, which prevents the compressed data rate from increasing beyond a preset limit for a sustained period. So, when the data rate exceeds the rollback limit on a given frame, high spatial-frequency information is simply discarded from subsequent frames until the rate returns to a tolerable level. See also: DCT, JPEG

Avid DNxHD
Avid DNxHD is a 10- and 8-bit HD encoding technology that delivers mastering-quality HD media with storage bandwidth and capacity requirements rivaling those of uncompressed standard-definition (SD) files. Specifically engineered for multi-generation editing, the quality and efficiency of Avid DNxHD encoded HD media enables real-time collaborative HD production in networked environments. See appendices B and C.

Avid Unity MediaNetwork
Designed specifically for dynamically storing and sharing high-bandwidth, high-resolution media, the Avid Unity MediaNetwork solution offers significant performance, setup, and administrative advantages over standard storage area networks (SANs). Built on Avid's highly optimized file system architecture, Avid Unity MediaNetwork delivers a full range of uncompressed and compressed media in real time, while enabling editing, finishing, audio, and graphics work to take place at the same time using the same media files and projects in a shared workspace. Avid Unity solutions offer:
- Improved workflow efficiency through true simultaneous sharing of media assets, right down to the file level
- Openness to all network-aware systems, allowing the entire facility to collaborate in real time
- Multiple levels of redundancy, constituting a rock-solid, reliable system that ensures peace of mind at all times

B

Bit
One state of digital information that may be either 0 or 1. It is normally written with a lower-case b; the use of upper-case B should be reserved for byte. Hence: kb = kilobit, Mb = megabit, Gb = gigabit. See also: Byte

Blocks
See DCT

Blue Screen
Backdrop for shooting items intended to be composited by keying them onto other backgrounds. The color blue or green is normally chosen as being unique in the picture and not present in the foreground item to be keyed. This should enable easy and accurate derivation into a key signal. Consideration may also be given to the color spill onto the object. For example, if the object is to be set into a forest, maybe a green screen would be preferred. Modern color correction and key processing allow a wider choice of color and the possibility of correcting for less-than-perfect shoots. However, this will increase postproduction time and effort and always risks compromising the final result.


Derivation of keys from blue screen shots depends on the accuracy and resolution of color information. Unlike SD, where the popular Digital Betacam or DVCPRO 50 records 4:2:2 sampled video using only 2:1 or 3:1 compression, HD recorders do not yet offer equivalent quality, except the uncompressed D6 VTR (Philips Voodoo). HD needs a 4:2:2 sampled recorder capable of 400-600 Mbit/s to offer as good a service, way in excess of the 100-140 Mbit/s camcorders now on offer. Currently, VTRs with high compression ratios and restrictions in chrominance bandwidth can limit the effectiveness of HD keying. Partly for this reason, 35mm film is often favored for shooting HD blue screen shots. High color-resolution HD capture, such as Ikegami's Avid DNxHD direct-to-disk camera, is another solution.

Metric prefixes provide an easy way to describe very large numbers. Gigabytes are already in our PCs, terabytes are used for HD storage, and petabyte tape stores are available. The table below relates each prefix to HD storage (uncompressed 1080/60i and HDCAM):

      Binary (bytes)         Decimal (bytes)   @1080/60i     HDCAM
kB    2^10 = 1,024           10^3              1/4 line      2 lines
MB    2^20 = 1.048 x 10^6    10^6              2/3 frame     5 frames
GB    2^30 = 1.073 x 10^9    10^9              6 1/2 secs    45 secs
TB    2^40 = 1.099 x 10^12   10^12             1 3/4 hrs     12 1/4 hrs
PB    2^50 = 1.126 x 10^15   10^15             74 days       518 days

Beyond these: Exabyte = 2^60 or 10^18, Zettabyte = 2^70 or 10^21, Yottabyte = 2^80 or 10^24.

BWF
Broadcast WAV File. This is based on the WAV audio file but has extensions to make it more useful in broadcast and professional environments. For example, it can carry a timecode reference as well as other metadata with information about the original source, owner, and date, as well as the type of audio, compressed or uncompressed. BWF is backwards-compatible with WAV and can be read by a WAV decoder. See also: WAV Website: www.ebu.ch

Byte (kilobyte, megabyte, gigabyte, terabyte, and petabyte)
A byte is a group of eight bits of data, normally capable of holding a character. Its abbreviation is usually an upper-case B (lower-case b refers to bit). Hence: kilobyte = kB, megabyte = MB, gigabyte = GB, terabyte = TB, petabyte = PB. There are two interpretations of kilo, mega, giga, and tera. The binary way works by multiples of 2^10. As 2^10 is 1024, which is near to 1000, 1024 is referred to as 1 kilo (byte or bit). That is how RAM is measured out when you update your computer; 256 MB of RAM is 256 x 2^20, which is 256 x 1,048,576 bytes. But if you buy a disk drive, a different method is used. Here, the decimal system is used, working in multiples of 10^3. So a 20 GB drive has 20 x 1,000,000,000 bytes capacity, although operating systems will report disk capacities based on 2^10. This sounds good until you realize you could have had 20 x 2^30, which is 20 x 1,073,741,824 bytes, over 7 percent more. Unless stated otherwise, the decimal method is used in this book.
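The 7 percent difference is quickly checked (a sketch of the arithmetic above):

    # Decimal vs. binary interpretation of a "20 GB" drive.
    decimal_gb = 20 * 10**9
    binary_gb = 20 * 2**30
    print(binary_gb - decimal_gb)        # 1,474,836,480 bytes difference
    print(binary_gb / decimal_gb - 1)    # ~0.0737, i.e. over 7 percent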

C

Cadence
In video, this refers to the pace of pictures. Normally the cadence is steady and real-time, with each successive frame given over to describing the same amount of real-time action. Cadence changes where the 2:3 pulldown process uses two and then three 60 Hz television fields for successive frames of 24 frame-per-second film material. A program with continuous cadence has an unbroken 2:3 pulldown from beginning to end. Continuous cadence is preferred for MPEG encoding, as its effects are predictable and so can be more efficiently encoded; it can also impact other downstream processes. See also: 2:3 Pulldown

Chroma Keying
The process of deriving and using a key signal formed from areas of a particular color in a picture (often blue or green). See also: Keying

CineAlta
Sonys name for its family of products that bridge cinematography and HDTV which includes HDCAM camcorders and studio VTRs as well as extending to whole production and postproduction systems.


COFDM

Coded Orthogonal Frequency Division Multiplexing: a modulation system used with the DVB and ISDB digital broadcast systems, which uses multiple carriers to convey digital television services within 6, 7, or 8 MHz television channels. DVB specifies using either approximately two thousand (2K mode) or eight thousand (8K mode) carriers, while ISDB describes three modes with from 1,405 to 5,617 carriers. This relatively modern technique has very high immunity to signal reflections and can support similar coverage areas to analog broadcasts but with much reduced power. Its signal parameters can be selected to fit requirements for robustness, data rate, and channel bandwidth. Depending on setup, good mobile reception can be achieved even in urban environments, at high driving speeds and beyond. The 8K mode is particularly well suited to running large-area (high-power) single frequency networks (SFNs) which, besides economizing on spectrum, enable mobile reception without frequency hopping. It is favored for its flexibility and for delivering DTV in many less-than-ideal circumstances. See also: DVB, ISDB, SFN


Component Video
Most traditional digital television equipment handles video in its component form: as a combination of pure luminance Y, and pure color information carried in the two color difference signals R-Y and B-Y (analog) or Cr, Cb (digital). The components are derived from the RGB delivered by imaging devices, cameras, telecines, computers, etc. Part of the reasoning for using components is that the human eye can see much more detail in luminance than in the color information (chrominance). So, up to a point, restricting the bandwidth of the color difference signals will have a negligible impact on the viewed pictures. This is a form of compression that is used in PAL and NTSC color coding systems and has been carried through, to some degree, in component digital signals both at SD and HD. For the professional video applications, the color difference signals are usually sampled at half the frequency of the luminance - as in 4:2:2 sampling. There are also other types of component digital sampling such as 4:1:1 with less color detail (used in DV), and 4:2:0 used in MPEG-2. See also: 4:2:0, 4:2:2, Y R-Y B-Y

Color Correction
Historically, this is the process of adjusting the colors in a picture so that they match those from other shots or create a particular look. Color correction in television has become highly sophisticated, aided more recently by digital technology. This includes secondary color correction, which can be targeted at specific areas of pictures or ranges of color. The operation is live and interactive, enabling fine adjustments to achieve precise results in a short time.

Compositing (a.k.a. Vertical Editing)


The process of adding layers of moving (or still) video to assemble a scene. This involves many tools, such as DVE (sizing and positioning), color correction, and keying. As the operation frequently entails adding many layers, the work is best suited to nonlinear equipment using mastering-quality video to avoid generation losses. Techniques are now highly developed and are a key part of modern production for both film and television, cutting production costs and bringing new possibilities and new effects.

Color Space
The space encompassed by a color system. Examples are: RGB, Y, B-Y, R-Y, HSL (hue, saturation, and luminance) for video, and CMYK from print. Moving between media, platforms or applications can require a change of color space. This involves image processing and so many such changes may degrade the picture if the processing is not of high quality. It is important to note that when converting from Y, B-Y, R-Y to RGB more bits are required in the RGB color space to maintain the dynamic range. For example, if the Y, B-Y, R-Y color space video is 8 bits per component then the RGB color space video will need to be 10 bits.

Compression (Audio)
Techniques to reduce the amount of data or bandwidth used to describe audio. Most methods involve some sort of perceptual coding that exploits weaknesses in the human auditory system. One example is the fact that louder signals of a particular frequency mask lower level signals of a similar frequency. By dropping the data that is in those masked frequencies the signal can be compressed. See also: 5.1, Dolby E

Common Image Format (CIF)


An image format recommended for program production and international exchange. For HD (HD-CIF), this is set out in ITU-R BT.709-4, which was approved in May 2000. It describes the preferred format as 1080 lines, 1920 pixels per line with a 16:9 picture aspect ratio, at 24, 25, and 30 frames per second progressively scanned and progressive segmented frame (PsF), and at 50 and 60 interlaced fields and progressive frames per second. See also: ITU-R BT. 709, Publishing


Compression (Video)
Techniques to reduce the amount of data or bandwidth used to describe video. As moving pictures need vast amounts of data to describe them, various methods have long been used to reduce this for SD, and as HD is approximately six times bigger, the requirement for compression is even more pressing. Compression methods are usually based around the idea of removing spatially or temporally redundant picture detail that we will not miss. Our perception of color is not as sharp as it is for black and white, so the color resolution is reduced (as in 4:2:2). Similarly, fine detail with little contrast is less noticed than bigger, higher-contrast areas, which is the basis for the scaling (down), or quantizing, of discrete cosine transform (DCT) coefficients as in AVR and JPEG; Huffman coding is applied to further reduce the data. MPEG-2, which also starts with DCT, adds temporal compression by identifying movement between video frames, so for much of the time it can send just information about movement (much less data) rather than whole pictures.

Each of these techniques does a useful job but needs to be applied with care, and it is important to choose the appropriate compression technique for each purpose. For example, Avid DNxHD technology was designed for the demands of postproduction. With compression algorithms not designed for postproduction, multiple compression cycles may occur while moving along the chain, causing a build-up of undesirable artifacts through the concatenation of compression errors. Also, processes such as keying and color correction depend on greater fidelity than we can see, so disappointing results may ensue from originals using compression algorithms not designed for postproduction. See also: AVR, Component Video, Huffman Coding, JPEG, MPEG-2


Content
Any material completed and ready for delivery to viewers. Content is the product of applying metadata to essence. See also: Essence, Metadata

Conversion
See: Cross-Conversion, Down-Conversion, Standards Conversion, Up-Conversion

Co-sited Sampling
Where samples of luminance and chrominance are all taken at the same instant. This is designed so that the relative timing (phase) of all signal components is not skewed by the sampling system. Sampling is usually co-sited. See also: 4:2:2

Compression Ratio
The ratio of the uncompressed (video or audio) data to the compressed data. It does not define the resulting picture or sound quality, as the effectiveness of the compression system needs to be taken into account. Compression for HD currently runs approximately between 4:1 and 7:1 for the postproduction encoding techniques used in Avid DNxHD technology, and between 6:1 and 14:1 for VTR formats. For transmission, the actual values depend on the broadcaster's use of the available data bandwidth, but around 40:1 is commonly used for SD and somewhat higher, 50:1 or 60:1, for HD (also depending on format).
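
As a worked example, using the uncompressed figure quoted in the MPEG-2 entry later in this guide and assuming a nominal 220 Mb/s mastering-quality rate (the exact rate depends on the chosen format):

    # 10-bit 4:2:2 sampling carries 20 bits per pixel (Y plus alternating Cb/Cr)
    width, height, fps = 1920, 1080, 30
    uncompressed = width * height * 20 * fps / 1e6        # ~1,244 Mb/s
    compressed = 220.0                                    # assumed mastering rate
    print(f"{uncompressed:.0f} Mb/s -> {uncompressed / compressed:.1f}:1")
    # 1244 Mb/s -> 5.7:1, inside the 4:1 to 7:1 range quoted above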

Cross-Conversion
This term refers to changing between HD video formats, for example from 1080/50i to 1080/60i, or 720/60p to 1080/50i. It also covers up-res and down-res processes but does not imply moving between HD and SD formats. The digital processes involved use spatial interpolation to change the number of lines and temporal interpolation to change the vertical scan rate. Note that while it is straightforward to go from progressive to interlaced scanning, the reverse is more complex, as movement between two interlaced fields has to be resolved into a single progressive frame. To maintain quality, Avid DNxHD technology is designed to avoid cross-conversion by encoding in the original resolution and frame rate, since both spatial and temporal interpolation can introduce unwanted artifacts. See also: Down-Conversion, Down-Res, Up-Conversion, Up-Res, Standards Conversion

Compression-Friendly
Material that is well suited to compression. This can become important in transmission, where very limited data bandwidth is available and high compression ratios have to be used. MPEG-2 compression looks at spatial detail as well as movement in pictures, and an excess of both may show at output as poor picture quality. This excess may be produced by poor technical quality delivered at the end of the production chain. Random noise will be interpreted as movement by the MPEG-2 encoder, so it wastes valuable data space conveying false movement information. Movement portrayal can also be upset by poor-quality frame-rate conversions that produce judder on movement, again increasing unwanted movement data to be transmitted at the expense of spatial detail. Such circumstances also increase the chance of motion estimation going wrong, producing blocking in the pictures. Errors can be avoided by the use of good-quality equipment throughout the production chain. Also, the choice of video format can help. For example, there is less movement in 25 progressively scanned images than in 50 interlaced fields, so the former compress more easily. The efficiency increase is typically 15-20 percent.



CSO
Color Separation Overlay. Another name for chroma keying. See also: Keying

D-cinema (a.k.a. E-cinema)


The process of digital electronic cinema, which may involve the whole scene-to-screen production chain or just the distribution and exhibition of cinema material by digital means. The 1080/24p HD format has been used in some productions. Although this is not capable of representing the full theoretically available detail of 35mm film, audiences are generally impressed with the results. The lack of film weave, scratches, sparkles, etc., along with loss-free generations through the production process, delivers technical excellence through to the cinema screen. Being in digital form, the material opens new possibilities for movie distribution by tape and disks. Some thought has also been given to the use of satellite links and telecommunications channels. The main savings over film are expected in the areas of copying and distribution of prints, where an estimated $800 million per year is spent by studios on releasing and shipping film prints around the world. At the same time there could be far more flexibility in screening, as schedules could be more easily and quickly updated or changed. As most shoots still use film, current distribution schemes start with scanning the film and compressing the resulting digital images. These are then distributed, stored, and replayed locally from hard disk arrays in cinemas. With film special effects and, increasingly, much postproduction adopting digital technology, digital production, distribution, and exhibition make increasing sense. Among the necessary technologies, the recent rapid development of high-resolution, large-screen digital projectors has made digital cinema exhibition possible. Projectors are based on either of two technologies, D-ILA or DLP. As yet, D-cinema standards are not fixed, but the SMPTE DC 28 task force is working on this. See also: DC 28, DLP, D-ILA

D5-HD
This is an HD application of the D5 half-inch digital VTR format from Panasonic and is widely used for HD mastering. Using a standard D-5 cassette, it records and replays over two hours of 1080/59.94i, 1035/59.94i, 1080/23.98p, 720/59.94p, 1080/50i, 1080/25p and 480/59.94i. It can slew a 24 Hz recording to use the material directly in 25/50 Hz applications. There are eight discrete channels of 24-bit 48 kHz digital audio to allow for 5.1 sound and stereo mixes. This is derived from the standard D5 which records a data rate of 235 Mb/s, so compression is needed to reduce the video bitrate from up to 1224 Mb/s.

D6
The D6 tape format uses a 19mm D-1 like cassette to record 64 minutes of uncompressed HD material in any of the current HDTV standards. The recording rate is up to 1020 Mb/s and uses 10-bit luminance and 8-bit chrominance and records 12 channels of AES/EBU stereo digital audio. The primary D6 VTR on the market is VooDoo from Thomson multimedia and it is often used in film-to-tape applications. ANSI/SMPTE 277M and 278M are standards for D6.

D7-HD
See DVCPRO HD

Dark Chip
See DLP Cinema

DAT
Digital Audio Tape system. This uses a 4mm tape with a helical scan rotary head system to achieve 16-bit stereo 44.1/48 kHz record and playback. The system was originally designed as a consumer format to compete with the compact cassette. However, with the introduction of DAT machines that include timecode, it was adopted by the professional audio industry and later by the postproduction industry.

DCT
Discrete Cosine Transform. Used as a first stage of many digital video compression schemes, DCT converts 8 x 8 pixel blocks of pictures to express them as frequencies and amplitudes. In itself this may not reduce the data, but it arranges the information so that it can be compressed. As the high-frequency, low-amplitude detail is least noticeable, the corresponding coefficients are progressively reduced, some often to zero, to fit the required file size per picture (constant bit rate) or to achieve a specified quality level. It is this process, known as quantization, which actually reduces the data. For VTR applications the file size is fixed, and the compression scheme's efficiency is shown in its ability to use all the file space without overflowing it. This is one reason why a quoted compression ratio is not a complete measure of picture quality. DCT takes place within a single picture and so performs intra-frame (I-frame) compression. It is a part of the most widely used compression schemes in television. See also: AVR, Compression Ratio, JPEG, MPEG-2
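
A minimal Python sketch of the transform-and-quantize sequence described above, using SciPy's DCT; the flat quantization step is illustrative, whereas real coders weight high-frequency coefficients more heavily:

    import numpy as np
    from scipy.fftpack import dct

    def dct2(block):
        # 2-D type-II DCT with orthonormal scaling, applied to an 8 x 8 block
        return dct(dct(block, axis=0, norm='ortho'), axis=1, norm='ortho')

    block = np.random.randint(0, 256, (8, 8)).astype(float)
    coeffs = dct2(block)

    q = 16.0                             # a single flat step size, for illustration
    quantized = np.round(coeffs / q)     # the lossy stage: many coefficients -> 0
    print(np.count_nonzero(quantized), "of 64 coefficients survive")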

DC 28
SMPTE Task Force On Digital Cinema. DC 28 aims to aid the development of this new area by developing standards for items such as picture formats, audio standards, and compression. While those who have seen the results of todays HD-based digital cinema are generally very impressed, it is believed that yet higher standards may be proposed to offer something in advance of that experience.



DEVA
A robust four-track, 24-bit, hard disk audio location recording system manufactured by Zaxcom that integrates well with both nonlinear audio and video systems.


DiBEG
Digital Broadcast Experts Group, set up in 1997 to promote the exchange of technical information about digital TV. Based in Tokyo, many of its members are from Japanese manufacturers, and Japan's Ministry of Posts and Telecommunications has produced ISDB, the standard for digital TV broadcasting that has now been chosen for Japan. Website: www.dibeg.org

D-ILA
Direct-Drive Image Light Amplifier. A technology that uses a liquid crystal reflective CMOS chip for light modulation in a digital projector. In a drive for higher resolutions, the latest developments by JVC have produced a 2K (2,048 x 1,536) array, which is said to meet the SMPTE DC 28.8 recommendation for 2000 lines of resolution for digital cinema. The 1.3-inch diagonal, 3.1 million-pixel chip is addressed digitally by the source signal. The tiny 13.5-micron pitch between pixels is intended to help eliminate stripe noise to produce bright, clear, high-contrast images. This is an efficient reflective structure, bouncing more than 93 percent (aperture) of the used light off the pixels. See also D-cinema Website: www.jvc.com/prof

Disk Drives
Over the last two decades, disk drives have continually increased their usage in television applications. Hard, fixed drives offer the greatest performance for capacity, speed of access to data, and reliability. Removable hard drives may have somewhat less performance as they tend to operate in more arduous conditions. Optical drives are also removable and provide low-cost storage but have lower data rates and are used mainly as an exchange medium and archive. Hard drive development has continued at an astounding rate for over 20 years. During that time the big-capacity drive has expanded from 80 MB stored on 14-inch platters (form factor) to as much as 250 GB on 3.5-inch form factor ATA drives (up to 147 GB on SCSI and Fibre Channel drives) just 1 inch in height. Historically, capacity has doubled every two years (41 percent per year), but even this has been beaten in recent years, nearing 60 percent per year. Hence there has been ready acceptance of the increased requirements of HD over SD for nonlinear applications. The up to seven-fold increase in capacity represents less than four years of disk development. Indeed, unlike SD development, where tape editing operations were well established, disk-based nonlinear now offers the only practical means of uncompressed-quality service. Typical modern high-capacity drives have up to 4 stacked platters providing 8 data surfaces. In principle they are very simple and comprise just two moving parts: the spinning platters and the swinging arm that positions the read/write heads. Their simplicity contributes to reliability, which is typically quoted as around 1,000,000 hours mean time between failure (MTBF). This does not mean they will run for 20 years without a hitch, but that they are unlikely to fail within their quoted service life, typically five years. The technology is very highly developed. Drives are assembled in clean rooms and hermetically sealed as, to achieve such performance, the read/write heads fly on air pressure so close to the disk surface that any dust or even smoke particles may cause a loss of data. In operation, the heads must not touch the disk surface, as temporary or permanent loss of data will ensue from such a crash, which imposes very tight limits on shock and vibration. Development normally proceeds in either of two directions: increasing (usually doubling) the tracks per inch (TPI) or the recording density, bits per inch (BPI), along the tracks, making an increase in areal density. Figures quoted for a 73 GB drive are 64,000 TPI, 570,000 BPI, and an areal density of over 36,000 Mb/square inch. Note that increases of BPI also raise the data rate. Another way is to increase the rotation speed of the disks. 10,000 RPM is fast, and some 15,000 RPM models are available. This, in turn, reduces the time for the disk to spin to the start of required data (latency). The other component of data access is the time to position the heads over the correct track (positioning time) which, for the same 73 GB drive, adds 2.99 ms on average (but only 0.35 ms to the next track), making a total average access time of 4.7 ms (typical) for reads. Note that while capacities have increased over 1,000-fold, access times to data have, roughly speaking, only decreased by 50 percent. Disk performance also varies according to where the data is stored. Fastest data is retrieved from near the circumference, slowest near the centre. Also, as useful data is not delivered during access time, data fragmentation will affect the maximum sustained data rate. Television requires continuous data, so real-time disk stores have to be designed to meet this need, even during adverse access conditions. The driving force for ever-larger disk drive capacities has little to do with text or normal office use, and everything to do with the requirement to store video and audio. Development is expected to continue for some years yet, at least at the pace of the last ten years. If sustained for the next ten years, this could make for some 32-fold further increase to 3 or 4 TB capacities. See also: RAID, Storage Capacity
[Diagram: hard disk drive mechanism: spinning platters, swinging arm and read/write head. Positioning (P) = 6 ms average; latency (L) = 3 ms average; time to access data = 9 ms average, 18 ms maximum]
[Chart: hard disk drive capacity at the historic 41 percent per annum growth rate, from 100 GB in 2001 toward 3 TB by 2011]
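
The latency figures above follow from simple arithmetic: on average the platter must turn half a revolution before the required data arrives under the head. A quick sketch (positioning time taken from the 73 GB example above):

    def avg_access_ms(rpm, positioning_ms):
        latency_ms = 0.5 * 60000.0 / rpm    # half a revolution, in milliseconds
        return positioning_ms + latency_ms

    for rpm in (10000, 15000):
        print(rpm, "RPM:", round(avg_access_ms(rpm, 2.99), 2), "ms average")
    # 10,000 RPM -> 5.99 ms; 15,000 RPM -> 4.99 ms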

DLP
Digital Light Processing: Texas Instruments, Inc.'s digital projection technology, which involves the application of digital micromirror devices (DMDs) for television, including HD, as well as cinema (see DLP Cinema below). DMD chips have an array of mirrors which can be angled by +/- 10 degrees so as to reflect projection lamp light through the projection lens, or not. Since mirror response time is fast (~10 microseconds), rapidly varying the time of through-the-lens reflection allows greyscales to be perceived. For video, each video field is subdivided into time intervals, or bit times. So, for 8-bit video, 256 grey levels are produced and, with suitable pre-processing, digital images are directly projected. The array, which is created by micromachining technology, is built up over conventional CMOS SRAM address circuitry. Array sizes for video started with 768 x 576 pixels (442,368 mirrors) for SD. The later 1280 x 1024 DMD has been widely seen in HD and D-cinema presentations. Most agree it is at least as good as projected film. TI expects to offer an over-2000-pixel-wide chip in the near future. While much interest focuses on the DMD chips themselves, some processing is required to drive the chips. One aspect is degamma: the removal of gamma correction from the signal to suit the linear nature of the DMD-based display. Typically, this involves a LUT (Look Up Table) to convert one given range of signal values to another. Website: www.dlp.com See also: Gamma


DLP Cinema

This refers to the application of Texas Instruments' DLP technology to the specific area of film exhibition. Here, particular care is taken to achieve high contrast ratios and deliver high brightness to large screens. The development of "Dark" chips has played an important part by substantially reducing spurious reflected light from the digital micromirror devices. This has been achieved by making the chip's substrate, and everything except the mirror faces, non-reflective. In addition, the use of normal projection lamp power produces up to a 12 ft-L light level on a 60-foot screen. See also: D-cinema, DLP


DNxHD
see: Avid DNxHD


Dolby Digital Surround EX


Dolby added a third surround channel to improve the rear spatial quality of the soundtrack. For playback there are three front channels and three rear channels in the 6.1 configuration.

Down-Res
Reducing the raster size of video images. This is usually from an HD format to an SD format and implies there is no change in frame rate. The process involves spatial interpolation, color correction (for SD/HD differences) and, for SD, possibly re-framing the image to best fit display at 4:3 aspect ratio. It is generally agreed that HD video down-converted to SD has superior quality to SD-native material. See also: Down-Conversion, Publishing, Standards Conversion, Up-Res

Dolby E
Dolby E encodes up to eight audio channels plus metadata into a two-channel bitstream with a standard data rate of 1.92 Mb/s (20-bit audio at 48 kHz x 2). This is a broadcast multi-channel audio encoding system, based on AC3, that can take eight discrete tracks and encode them onto one AES/EBU digital stream. Many broadcasters choose to transmit four stereo pairs in different languages, or a full 5.1 and a stereo mix. The system also packages the audio data in blocks that line up with the frame edges of the video. This allows the Dolby E stream to be seamlessly edited along with the picture without affecting or corrupting the encoded signal. See also: Compression (Audio)

DS Nitris (SD and HD)


The DS Nitris system is Avid's flagship effects and editing solution for HD, with the power of the Avid DNA system. DS Nitris is a high-end effects, editing, and content creation solution able to deliver in both SD and HD. The Nitris Digital Nonlinear Accelerator hardware offers unparalleled performance: 8 real-time streams of 10-bit SD with color correction, scale, crop, and 3D on each stream, as well as 2 streams of 10-bit uncompressed HD.

Dolby Matrix
Another name for the 4:2:4 monitoring system. See also: 4:2:4, Dolby Stereo

DSP
Digital Signal Processing. This is a generic term for hardware that enhances the speed of digital signal processing. For video and audio this often makes the difference between creating real-time results and waiting for much slower, software-based processing. It is not only when processing HD's bigger-than-SD pictures that the extra speed becomes important; audio mixing of 90-100 channels also benefits from the on-demand power available from DSPs. Many people recognize the use of DSP as marking the boundary of professional equipment.

Dolby Stereo / Pro Logic


This is the most popular analog multi-channel format to date. The system encodes LCRS signals into Dolby Lt and Rt (Left Total and Right Total) signals using an analog process. The Lt/Rt signals typically end up on the 35mm print for Dolby Stereo, and on the VHS tape or DVD for Dolby Pro Logic releases. The system is not a discrete encoding system like the digital versions (such as DTS and Dolby E), and the Dolby encoding process can adversely affect the spatial content when integrating stereo material such as premixed music.

DTF/DTF2
Name for Sonys half-inch Digital Tape Format which offers very high data storage capacity (up to 200 GB) on half-inch tape cartridges. Such stores are often used for storing digital video such as HD in postproduction areas, where they may be available to clients on a network.

Down-Conversion
Down-Conversion includes the processes of Down-Res but implies that frame rates are changed. For instance, moving from 1080/60i to 625/50i involves Down-Conversion. See also: Cross-Conversion, Down-Res, Publishing, Standards Conversion, Up-Res

Down Mixing

The process of taking a surround mix and automatically turning it into a stereo mix, rather than re-purposing the entire mix. This can save time but needs to be carefully monitored to make sure that important elements of the film soundtrack, such as dialogue or motion-tracked sound effect elements, do not get lost. See also: Metadata

DTS

Digital Theatre System, a competitive system to Dolby AC3. It was originally designed for film sound delivery: the system uses an optical timecode track on the 35mm release print that drives a CD-ROM-based audio playback system. The advantages are twofold: it provides higher audio quality, due to the audio not being compressed into small areas of the 35mm print, and one release print can be used for several language markets with only the CD-ROMs changing, rather than the entire print.


DTS ES


ES, standing for Extended Surround, uses a third surround (rear) channel for better rear spatial positioning. The system allows both 5.1 and 6.1 compatibility by matrixing the rear center channel into the LS and RS signals. On playback in a 6.1 theatre, this is removed from LS and RS and played discretely via center surround, but in 5.1 it is left in the LS and RS signals.


DTV
Digital Television. This is a general term that covers both SD and HD digital formats.

DVCPRO HD (a.k.a. D7-HD)


This is the HD version of the Panasonic DV VTR hierarchy. DV and DVCPRO record 25 Mb/s; DVCPRO 50 records 50 Mb/s; and DVCPRO HD records 100 Mb/s. All use the DV intra-frame digital compression scheme and the 6.35 mm (1/4-inch) DV tape cassette. Video sampling is at 4:2:2, and both 1080i and 720p formats are supported. There are 8 x 16-bit 48 kHz audio channels. The recording data rate means that considerable compression must be used to reduce approximately 1 Gb/s of video and audio data. Video compression of 6.7:1 is quoted. A feature of DVCPRO HD camcorders is variable progressive frame rates for shooting, from 4-33, 36, 40, and 60 Hz. DVCPRO HD supports various frame rates within the 60p structure, such as 24p, 25p, and 50p. Because of this, VFR (variable frame rate) shooting is also supported, by flagging the proper frames to enable undercranked and overcranked motion photography for 24 fps, 25 fps, and 30 fps playback.

Dual Link
SDI and HD-SDI links enable the transport of uncompressed 4:2:2 video with or without embedded digital audio. Dual link refers to the solution used for larger requirements, such as video with key (4:2:2:4), RGB (4:4:4), and RGB with key (4:4:4:4), carried over two links.

DVB
Digital Video Broadcasting. DVB is a consortium, based in Geneva, that establishes common international standards for digital television. It comprises approximately 300 companies from over 30 countries around the world, all working in the broadcasting and related fields. DVB standards are open and published through ETSI (see entry below). DVB has standards for SD and HD digital video formats as well as for transmission systems. It uses MPEG-2 video compression and MPEG Layer II audio compression. DVB-S is based on QPSK; cable DVB-C uses 64-QAM and has support for higher-order modulations. Terrestrial, DVB-T, is based on COFDM (Coded Orthogonal Frequency Division Multiplexing) with QPSK, 16-QAM, and 64-QAM modulation. This provides a robust service which can match, and even improve on, analog coverage. The coding and modulation parameters can be selected to tailor transmission to specific requirements, such as for mobile or portable urban reception, and to fit 6, 7, and 8 MHz channels. Typically, around 24 Mb/s of useful data is delivered in an 8 MHz fixed terrestrial channel, or about half that when configured to operate in arduous mobile conditions. See also: COFDM, QPSK, QAM Website: www.dvb.org
[Map: world digital television standards by region, showing combinations of DVB-S, DVB-C, DVB-T, ISDB-T/DiBEG, DSS (Hughes satellite), OpenCable (CableLabs), and ATSC]

E-cinema
See D-cinema

EDL
Edit Decision List. This is data that describes how material is to be edited, e.g., from offline to online, or a record of what happened in the editing process. EDLs were devised before the days of nonlinear editing and were never updated to accommodate any of the digital enhancements, such as DVEs and advanced color correction and keying. Even so, they remain in wide use as a well-recognized means of conveying the more basic editing decisions: cuts, dissolves, wipes, slo-mo, etc. Popular formats are CMX 3400 and 3600. More recently, new initiatives, such as the AAF and OMF file formats, offer the far wider capabilities needed for today's production needs. The OMF format has become a de facto standard for transferring full decision data between offline and online operations. See also: AAF, OMF
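
To make the "basic decisions" point concrete, here is a minimal Python sketch that emits a single CMX 3600-style cut event; the field widths and helper name are illustrative rather than a complete implementation of the format.

    def edl_event(num, reel, src_in, src_out, rec_in, rec_out):
        # event number, source reel, track (V), transition (C = cut), timecodes
        return (f"{num:03d}  {reel:<8}V     C        "
                f"{src_in} {src_out} {rec_in} {rec_out}")

    print(edl_event(1, "TAPE01",
                    "01:00:00:00", "01:00:05:00",
                    "00:00:00:00", "00:00:05:00"))
    # 001  TAPE01  V     C        01:00:00:00 01:00:05:00 00:00:00:00 00:00:05:00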




Essence
Term used to describe essential material which, for television, is what appears on the screen and comes out of the speakers: video, audio and text. Essence consists of those recorded elements that may be incorporated by means of editing, mixing or effects compositing into a finished program (content). See also: Content, Metadata


Ethernet
Ethernet (IEEE 802.x) is a widely used Local Area Network technology which is also popular for video applications. Currently, 100 Mb/s and 1 Gb/s data speeds are in general use, while a 10 Gb/s version is on the way. (These rates are after the 8B/10B coding; actual transmission rates are 25 percent higher.) Depending on configuration, Ethernet can offer high-speed data transfers compatible with real-time video. Ethernet is connectionless. It is able to transfer data between devices, but a continuous, real-time service cannot be guaranteed, as other network users may interfere, unless a switch is used to isolate the traffic of each system. Its data packets include a destination address that all connected devices listen for, each deciding whether the packet is for it or not. To transmit, a device waits for silence on the network before starting. CSMA/CD (Carrier Sense Multiple Access Collision Detect) handles cases where two devices start transmitting simultaneously: they simply wait a random length of time before starting again. See also: ATM, Fibre Channel Website: http://standards.ieee.org/getieee802/
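
The collision handling mentioned above uses truncated binary exponential backoff. A minimal sketch (this applies to classic half-duplex CSMA/CD; switched, full-duplex links avoid collisions altogether):

    import random

    def backoff_slots(collisions):
        # after the nth collision, wait a random number of slot times
        # in the range [0, 2^min(n, 10) - 1]
        return random.randint(0, 2 ** min(collisions, 10) - 1)

    # one slot time is 512 bit times: 5.12 microseconds at 100 Mb/s
    print([backoff_slots(n) for n in (1, 2, 5)])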


Gamma (Correction)
Gamma describes the differences in transfer curve characteristics between video source devices, such as cameras, and the response of the display devices, usually cathode ray tubes. Gamma correction is normally applied early to the source video R, G, B signals as part of the processing in cameras, to make the video signal more impervious to atmospheric noise during over-the-air analog transmissions. However, the more recent use of other display devices, using very different technologies and having very different gammas, means that gamma must be modified in each display system to match the transfer characteristics of the given display technology. Digital Micromirror Devices (DMDs) are actually time-modulated: the amount of light they reflect onto the screen is a function of a duty cycle for time ON. Thus, DMD-based systems program the display gamma for any given luminance level by adjusting the exposure time for that level through a Look Up Table (LUT). Gamma-corrected colors or components are annotated with a prime, e.g., R', G', B', and Y', Cr', Cb'. As virtually all references in this document involve gamma-corrected signals, the primes have not been included, for simplicity. See also: DLP
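
A minimal sketch of the LUT idea described above, mapping 8-bit gamma-corrected code values back to linear light; the simple 2.2 power law is an assumption, as real transfer characteristics differ per display:

    GAMMA = 2.2   # assumed simple power law
    lut = [int(round(255 * (code / 255.0) ** GAMMA)) for code in range(256)]

    def degamma(pixel):
        # one table look-up per pixel: cheap enough for real-time use
        return lut[pixel]

    print(degamma(128))   # mid-grey code value maps to ~56, well below half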

ETSI
European Telecommunications Standards Institute. Based in southern France it unites some 890 members from 52 countries inside and outside Europe and, where possible, promotes worldwide standardization. Its work is involved with international standardization bodies, mainly the ITU-T and the ITU-R. Website: www.etsi.org

FireWire
See IEEE 1394

Gamut (Color)
The range of possible colors available in an imaging system. The red, blue, and green phosphors on television screens and the RGB color pick-up CCDs in the camera define the limits of the colors that can be displayed: the color gamut. Between the camera and the viewer's screen there are many processes, many using component 4:2:2 video. However, not all component value combinations relate to valid RGB colors (for example, combinations where Y is zero). Equipment that generates images directly in component color space, such as some graphics machines, can produce colors within the component range but invalid in RGB, which can also exceed the limits allowed for PAL and NTSC. There is potential for overloading equipment. This applies especially to transmitters, which may cut out to avoid damage.

Fibre Channel (FC)


A set of networking standards for high-speed data interchange and storage interfaces. Fibre Channel is quoted at 1 Gb/s or 2 Gb/s transmission speeds which, due to the 8B/10B coding used to improve the transmission characteristics, correspond to maximum data speeds of about 800 Mb/s and 1.6 Gb/s. Both are capable of full duplex (simultaneous bidirectional operation). Like all networking, Fibre Channel is defined in layers, labeled FC-0 to FC-4, which range from a definition of the physical media (FC-0) up to the protocol layers (FC-4), which include SCSI, the widely used disk interface. This is the key to its operation with disks and in storage networking. Fibre Channel is specified for copper as well as fibre connections. Because of its close association with disk drives, many drives are offered with Fibre Channel interfaces. Its TV application is mostly in the creation of storage area networks (SANs). It may be configured in an arbitrated loop (FC-AL) or, for more speed, using fabric switching. FC is a connectionless protocol and uses an arbitration sequence to ensure network access before transmission. See also: SCSI, SAN




GOP


Group Of Pictures, as in MPEG-2 video compression. This is the number of frames from one integral I-frame to the next, the frames between being predictive (types B and P). Long GOP usually refers to MPEG-2 transmission coding, where the GOP is often as long as half a second (13 or 15 frames at 25 or 30 fps), which helps to achieve the required very high compression ratios.

[Diagram: a typical group of pictures, GOP = 13]


Long GOP MPEG-2 is designed for transmission, not for editing. Even cutting is not straightforward, and its accuracy is limited to the GOP length unless further processing is applied. A GOP of 1 indicates I-frame-only video, which can be cut at every frame without need of processing. Studio applications of MPEG-2 have very short GOPs: Betacam SX has a GOP of 2; IMX has a GOP of 1 (i.e., I-frame only, no predictive frames), which means cutting at any frame is straightforward. Other formats such as DVCPRO HD, HDCAM, and D5-HD do not use MPEG but are also I-frame only. See also: Inter-frame Compression, MPEG-2

Green Screen
see: Blue Screen


HD
High Definition Television. This is defined by the ATSC and others as having a resolution of approximately twice that of conventional television (meaning analog NTSC, implying 486 visible lines) both horizontally and vertically, a picture aspect ratio of 16:9, and a frame rate of 24 fps or higher. This is not quite straightforward, as the 720-line x 1280-pixels-per-line progressive scan format is well accepted as HD. This is partly explained by the better vertical resolution of its progressive scanning. Apart from the video format, another variation on SD is a slightly different colorimetry where, for once, the world agrees on a common standard. As HD's 1080 x 1920 image size is close to the 2K used for film, there is a crossover between film and television. This is even more the case if using a 16:9 window of 2K, as here there is very little difference in size. It is generally agreed that any format containing at least twice the standard definition format on both H and V axes is high definition. After some initial debate about the formats available to prospective HD producers and television stations, the acceptance of 1080-HD video at various frame rates as a common image format by the ITU has made matters far more straightforward. While television stations may have some latitude in their choice of format, translating, if required, from the common image formats should be routine and give high-quality results. See also: Common Image Format, Interlace Factor
[Diagram: relative raster sizes: 2K film (2048 x 1536), 1080-HD (1920 x 1080), 720-HD (1280 x 720), and 576-line SD (720 x 576)]

HDCAM
Sonys HD camcorder version of the popular Digital Betacam. HDCAM defines a half-inch tape format used in the camcorders. There are studio recorders too.


In the camcorder, the camera section includes 2/3-inch, 2.1 million-pixel CCDs to capture 1080 x 1920 images. The lenses have compatibility with Digital Betacam products, as well as accepting HD lenses for the highest picture quality. The recorder offers up to 40 minutes on a small cassette, making the package suitable for a wide range of program origination, including on location. A series of steps reduces the baseband video data rate, down-sampling from 996 Mb/s to 622 Mb/s after pre-filtering, and then 4.4:1 compression is used to achieve the lower 140 Mb/s recorded to tape. The format also supports four channels of AES/EBU audio, and the total recording rate to tape is 185 Mb/s. HDCAM is effectively sampled at 3:1:1, with the horizontal resolution sub-sampled to 1440 pixels. It fulfills many HD needs but is not an ideal medium for Blue Screen work. Video formats supported by HDCAM are 1080 x 1920 pixels at 24, 25, and 30 progressive fps and at 50 and 60 Hz interlace. Material shot at 24p can be directly played back into 50 Hz or 60 Hz environments. Also, the ability to play back at different frame rates can be used to speed up or slow down the action. See also: CineAlta


Huffman Coding
A method of compressing data by recognizing repeated patterns and assigning short codes to those that occur frequently, and longer codes to those that are less frequent. The codes are assigned according to a Huffman table. Sending the codes, rather than all the original data, can achieve as much as 2:1 lossless compression, and the method is often used as a part of video compression schemes such as JPEG. See also: JPEG
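
A minimal Python sketch of building such a table with a priority queue; symbol frequencies are counted and the two least frequent branches are repeatedly merged:

    import heapq
    from collections import Counter

    def huffman_codes(symbols):
        # frequent symbols end up near the root and so get the short codes
        heap = [[freq, [sym, ""]] for sym, freq in Counter(symbols).items()]
        heapq.heapify(heap)
        while len(heap) > 1:
            lo = heapq.heappop(heap)
            hi = heapq.heappop(heap)
            for pair in lo[1:]:
                pair[1] = "0" + pair[1]
            for pair in hi[1:]:
                pair[1] = "1" + pair[1]
            heapq.heappush(heap, [lo[0] + hi[0]] + lo[1:] + hi[1:])
        return dict(heap[0][1:])

    print(huffman_codes("AAAABBBCCD"))
    # a prefix-free table, e.g. 'A' -> '0', 'B' -> '10', with longer codes for C and D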


IEEE 1394 (FireWire)


Digital video and multimedia applications have brought about the need to move large amounts of data quickly and at a guaranteed rate. The IEEE 1394 (FireWire) serial bus is the industry-standard implementation of Apple Computer, Inc.'s 1394 digital I/O technology. It's a versatile, high-speed, low-cost method of interconnecting a variety of devices and delivering high levels of data in real time. Current versions support data transfer rates of up to 400 Mb/s (1394a) and 800 Mb/s (1394b). 1394 technology offers several advantages over other technologies, including:
- Guaranteed delivery of multiple data streams through isochronous data transport
- The ability to connect up to 63 devices without the need for additional hardware (such as hubs)
- A flexible, six-wire cable
- Complete plug-and-play operation, including the hot swapping of live devices

HDCAM SR
Sony's 4:4:4 RGB high-definition (HD) camera and high-data-rate HDCAM SR recording system (without pre-filtering or sub-sampling). HDCAM SR provides the industry's highest real-time compressed data rate to tape, up to 880 Mb/s (in the SRW-1), utilizing mild compression rates, advanced channel coding, and a self-aligning data track architecture. In the RGB domain, HDCAM SR can provide archive-quality compressed high definition digital videotape recordings for Digital Intermediate applications.

HD-CIF
See Common Image Format

HD-SDI
High Definition Serial Digital Interface, defined in SMPTE 292M. This is a high definition version of the widely used SDI interface developed for standard definition applications. It has a total data rate of 1.485 Gb/s and will carry 8- or 10-bit Y, Cr, Cb at 74.25, 37.125, 37.125 M-samples/s. It can also carry audio and ancillary data. Thus HD connections can be made by a single coax BNC within a facility, or extended to 2 km over fibre.

ILA
See D-ILA

Inter-frame Compression
Video compression that uses information from several successive video frames to make up the data for its compressed predictive frames. The most common example is MPEG-2 with a GOP greater than 1. Such an MPEG-2 stream contains a mix of both I-frames and predictive B and P (Bi-directional predictive and Predictive) frames. Predictive frames cannot be decoded in isolation from those in the rest of the GOP so the whole GOP must be decoded. This is good for transmission but does not offer the flexibility needed for accurate editing. See also: GOP, MPEG-2

HDV
HDV is a format embraced by JVC, Sony, Canon, Sharp, and others, and proposed as an international standard format. HDV cameras and decks record and play high definition images using DV cassette tapes. Media is encoded using long-GOP MPEG-2, and native files are transferred using IEEE 1394 (FireWire). Initial cameras were single-chip, with 3-chip cameras planned to increase image quality. For postproduction, native HDV editing increases storage efficiency and image quality.


Interlace
A method of ordering the lines of scanned images as two (or more) fields per frame, each comprising alternate lines only. Television uses only 2:1 interlace: a field of odd lines (1, 3, 5, etc.) followed by a field of even lines (2, 4, 6, etc.). This doubles the vertical refresh rate, as there are twice as many interlaced fields as there are whole frames. The result is better portrayal of movement and reduced flicker without increasing the number of full frames or the required signal bandwidth. There is an impact on vertical resolution, and care is needed in image processing. See also: Interlace Factor, Progressive
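
The field structure is easy to express in code. A minimal sketch; note that in a real interlaced camera the two fields are exposed 1/50th or 1/60th of a second apart rather than sliced from one stored frame, and that temporal offset is what complicates later processing:

    import numpy as np

    frame = np.arange(1080 * 1920).reshape(1080, 1920)   # a dummy frame
    field_one = frame[0::2]    # lines 1, 3, 5, ...
    field_two = frame[1::2]    # lines 2, 4, 6, ...
    print(field_one.shape, field_two.shape)              # (540, 1920) each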


Interlace Factor
Use of interlaced, rather than progressive, scans has no effect on the vertical resolution of still images. However, if the image moves at all, which it usually does, the resolution is reduced by the Interlace Factor, which may be 0.7 or less. This is due to the time displacement between the two fields of interlace. Thus the overall vertical resolution of a scanned moving image, which is also subject to the Kell Factor, will be 50 percent of the scan-line count, or less. See also: Kell Factor

Intra-frame Compression (a.k.a. I-frame compression)


Compression which takes information from one picture only. This way, all the information to re-create a frame is contained within its own compressed data and is not dependent on other frames. This means that I-frame compressed video can simply be cut at any picture boundary without the need for any decoding and recoding. So it is possible for this compressed video to be edited and the result output as first generation material. Any other operations such as wipes, dissolves, mixes, DVE moves etc., can only be performed on the baseband signal, requiring that the video is first decompressed. See also: AVR, DVCPRO HD, JPEG, MPEG-2

Interoperability
Interoperability is the ability of one application to exchange and use data from another so the two can work together or interoperate. The standards developed by the television industry through organizations such as SMPTE and ITU have ensured that video and audio and some metadata, such as timecode, can pass between systems and be used by them. Hence an HD-SDI connection will be accepted by the HD-SDI interface that is commonly found on HD equipment. With the introduction of IT-based equipment into television production there came a mass of new de facto standards, often modified to suit the needs of a particular application. The plug-and-play simplicity and efficiency of the HD-SDI example was not there. Years of work by both industries have brought huge improvements exemplified by the work of Pro-MPEG with MXF files, and the AAF association with the AAF file format. See also: AAF, MXF, OMF

ISDB
Integrated Services Digital Broadcasting. A digital broadcasting system under continuing development in Japan, to provide normal SD and HD television as well as support and multimedia services for satellite (ISDB-S), cable (ISDB-C), and terrestrial (ISDB-T) broadcasters. Satellite HDTV services started broadcasting in December 2000. ISDB-T is planned to offer services for both stationary and mobile receivers together; and with operation possible in 6, 7, or 8 MHz channels, it is aimed at international markets. It has many similarities to DVB-T, including robustness against multipath and propagation fading, as well as allowing SFN operation. MPEG-2 is the chosen video compression scheme, and both SD and HD can be carried. ISDB-T also uses OFDM, but this is Band Segmented Transmission OFDM, which consists of a set of frequency blocks called OFDM segments. Flexibility is obtained as each segment may have different transmission parameters to allow hierarchical transmission. The type of modulation (e.g., QPSK, 16-QAM, and 64-QAM) and error correction can be independently specified for each segment group, of up to three hierarchical layers in a channel. So, for example, different layers may be optimized for mobile SD and stationary HD applications. In addition, one segment can be independently transmitted as audio and data services for partial reception by portable receivers. ISDB-T development continues, but it is not expected to be in commercial use until 2005. Website: www.dibeg.org/isdbt.htm

Interpolation
The technique of re-distributing information over different sized areas or dimensions. For example, changing the size of a picture requires spatial interpolation. Simply spreading the existing pixels over the raster lines will look very ragged. A good interpolation process will calculate the value of each new pixel according to its position relative to the original information. There are several established mathematical methods, or filters, for this: bi-linear, bi-cubic, bi-linear summed, and bi-cubic summed. Of these, the latter is considered to give the best results. Generally, all of these spatial interpolation techniques work by looking at an area of adjacent pixels in the original picture and applying a formula to generate a new pixel on a different raster. Note that such operations often have to be executed in real time, requiring considerable processing power for HD output if they are to be introduced as live elements. A real-time example is Digital Video Effects (DVE). Standards conversion also requires real-time operation. Besides spatial interpolation, it also has to interpolate in the time domain to change between the frame rates of different television standards. See also: Anti-aliasing, Standards Conversion

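A minimal bilinear example in Python; production gear uses the better filters listed above, but the position-weighted averaging principle is the same:

    import numpy as np

    def bilinear(img, new_h, new_w):
        # each output pixel is a weighted average of the four nearest
        # input pixels, weighted by its position on the original raster
        h, w = img.shape
        ys = np.linspace(0, h - 1, new_h)
        xs = np.linspace(0, w - 1, new_w)
        y0 = np.floor(ys).astype(int); y1 = np.minimum(y0 + 1, h - 1)
        x0 = np.floor(xs).astype(int); x1 = np.minimum(x0 + 1, w - 1)
        wy = (ys - y0)[:, None]
        wx = (xs - x0)[None, :]
        return ((1 - wy) * (1 - wx) * img[np.ix_(y0, x0)] +
                (1 - wy) * wx * img[np.ix_(y0, x1)] +
                wy * (1 - wx) * img[np.ix_(y1, x0)] +
                wy * wx * img[np.ix_(y1, x1)])

    sd = np.random.rand(576, 720)
    hd = bilinear(sd, 1080, 1920)   # a crude up-res from 576 to 1080 lines
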


ITU-R BT. 709


ITU-R BT.709-4 was approved in May 2000 and describes 1080 x 1920 16:9 picture aspect ratio formats, at 24, 25, and 30 frames per second progressively scanned and with progressive segmented frame (PsF), and at 50 and 60 interlaced fields and progressive frames-per-second. Y, Cr, Cb sampling is at 74.25, 37.125, 37.125 MHz and RGB at 74.25, 74.25, 74.25 MHz. These are designated as preferred video formats for new productions and international exchange. Earlier versions of the standard only described the 1125/60i and 1250/50i HDTV formats, which have 1035 and 1152 active lines respectively. Their digital representation is defined using 8 or 10 bits per sample. As with 601 above, the common sampling frequency multiple is 2.25MHz, used here to produce rates of 74.25 (Y) and 37.125 (Cr and Cb) MHz for the 1125/60i system, and 72 and 36 MHz for the 1250/50i system. See also: Common Image Format

ISO
The International Standards Organization is a United Nations body and has published over 10,000 international standards since 1951. Membership consists mainly of national standards bodies such as ANSI (USA), BSI (UK) and DIN (Germany). ISO standards help ensure interoperability, for example for removable disks, networking, and compression systems. It works closely with the ITU on matters of common interest. Website: www.iso.ch

ITU
International Telecommunications Union. The specialized agency of the United Nations for telecommunications which encourages standards for interconnectivity, promotes the best use of spectrum and encourages telecoms growth in less developed countries. ITU-R is involved with radio communications and has taken over from the CCIR, and the broadcast television (BT) section recommendations are of obvious interest. Also, ITU-T (formerly CCITT) is concerned with telecommunications standards. Website: www.itu.ch


JBOD
Just a bunch of disks such as a collection of drives connected on a single data bus (e.g., SCSI or Fibre Channel). These can behave as a large area of disk storage but the ability to share is not implied. See also: SAN

ITU-R BT. 601


This is the standard for the digital encoding of 525/60i and 625/50i SD component television signals (486 and 576 active lines). It defines 4:2:2 sampling (at 13.5, 6.75, 6.75 MHz) for Y, R-Y, B-Y, as well as 4:4:4 sampling of R, G, B, making 720 pixels per active line. There may be 8 or 10 bits per sample. There is allowance for 16:9 as well as 4:3 picture aspect ratios. In order for the sampling points to form a static pattern on the picture, the sampling frequencies were chosen to be exact multiples of both the 525- and 625-line frequencies. The lowest common frequency is 2.25 MHz. Multiples of this are used to provide sufficient bandwidth for luminance (13.5 MHz) and color difference (6.75 MHz). Note that the same 2.25 MHz frequency is also used as the basis for HD digital sampling. Sampling levels allow some headroom for digital signals whereby, for 8-bit coding, black is assigned to level 16 and white to 235, while for color difference zero is at level 128 and the excursion runs from 16 to 240. See also: 4:2:2
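
The headroom mapping can be written out directly. A small sketch, assuming normalized inputs (luma 0 to 1, color difference -1 to +1); the function name is illustrative:

    def to_video_levels(y_norm, c_norm):
        # 8-bit headroom: black 16, white 235; color-difference zero at 128,
        # with the excursion running from 16 to 240
        y = 16 + round(y_norm * 219)
        c = 128 + round(c_norm * 112)
        return y, c

    print(to_video_levels(1.0, 0.0))   # (235, 128): white, zero color difference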

JFIF
JPEG File Interchange Format a compression scheme used by Avid in its Meridien hardware-based nonlinear systems. A JFIF M-JPEG resolution is termed constant rate since compressing clips of varying complexity results in a fixed data rate. Each JFIF resolution is defined by a target data rate and a base quantization table. When digitizing, the quantization table is linearly scaled (known as rolling Q) to conform the actual compressed data rate to the target rate. Due to the flexibility of this approach, imagery compressed by a JFIF resolution generally looks better than that compressed by an AVR of comparable average data rate.

JPEG
Joint Photographic Experts Group refers to a form of intra-frame compression of still images, given the extension .jpg in DOS/Windows file formats. The technique uses DCT-based digital compression working with 8 x 8 pixel blocks (see DCT) and Huffman coding. By altering the DCT quantization levels (Q), the compression ratio can be selected, from high quality at around 2:1 to as much as 50:1 for browse-quality images. JPEG is very similar to, but not exactly the same as, the I-frame compression in MPEG-2.


The actual quantity of data generated depends on the amount of detail in the picture, so to achieve a constant output bit rate with moving pictures (video), as is normally required if recording to a VTR, the quantization levels need dynamic adjustment. The aim is to fill, but not overflow, the allotted storage space per picture. For constant quality and variable bit rate, the quantization can be held constant, but care is needed not to exceed any upper data limits. See also: AVR, DCT, JFIF, MPEG-2


Kell Factor
Named for Ray Kell, an engineering researcher who in the early 1930s studied the effects of line scans and rasters on image resolution. A television scan line is one of a stack of lines in a raster. A line of resolution, or TV line, is described along the vertical axis as any transition between a white scan line and a dark scan line. Thus, a TV line is actually an edge between white and black. If there are 576 active scan lines in a given picture, where these scan lines alternate between white and black, the visual result would be 288 TV-line transitions to black interspersed with 288 TV-line transitions to white. For such an image to be presented by a two-field interlaced scan system, one field would be white and one would be black. In the real world the variability of brightness of objects in a frame is not nearly so absolute. Thus, the effective resolution is perceived neither as 288 actual white lines on a dark field (or vice versa) nor as the full 576 scan lines, but as a function of the two phenomena. The Kell Factor describes the visual mix of perceived sharpness from the absolute lines and edges as 70 percent of the actual scan-line count. This applies irrespective of progressive or interlaced scans. However, the Kell Factor is a study of static images. Moving images have other temporal considerations. See also: Interlace Factor
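
Putting the two factors together gives the often-quoted figures. A quick check, using the 0.7 values from this entry and the Interlace Factor entry:

    active_lines = 576
    kell = 0.7                # perceived sharpness vs. absolute line count
    interlace_factor = 0.7    # further loss once an interlaced image moves

    print(round(active_lines * kell))                     # ~403 TV lines, static
    print(round(active_lines * kell * interlace_factor))  # ~282 TV lines, moving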


LCR
LCR describes theater stereo, usually when a stereo music track is remixed for cinema. The master music track then contains a Left, Center, and Right signal. This remixing of stereo material into LCR allows it to be easily integrated into film and surround mixes without suffering the spatial anomalies that a stereo mix would encounter when being processed by the Dolby Stereo algorithm.

Keying
A general term for the process of placing an object or section of picture over another, as in keying text over video. This is a video version of matting in film but may use interactive tools and feature live operation. Operation splits into two areas: deriving the key signal and applying it to produce the keyed result. In HD's high-quality, big-picture environment it is essential that keyed results of invisible effects are convincing. Increasing use of compositing to add scenery, objects, and actors to make a scene that the camera never saw requires excellence in keying, so that the keyed items look photo-real, like a part of the original image. Keying tools have developed rapidly with the introduction of digital technology and online nonlinear editing. If working with electronically generated material, such as graphics or captions, the key signal is supplied along with the video. Otherwise, sophisticated means are available to derive the key signal. Typically, objects are shot against a blue or green screen and that key color then defines the key signal. In reality the key color spills onto the object, so de-spill techniques are applied. The boundary between the object and background is often the subject of much effort. It is rarely a hard cut (hard key), which tends to look jagged and false, but a carefully set up dissolve to render a smooth, natural-looking edge (shaped or linear key). Further techniques are used to key semi-transparent material such as smoke, fog, and glass. Often this uses a non-additive mix technique, which apportions foreground and background according to luminance. The availability of highly developed digital keying techniques has been a large factor in moving motion picture effects into the digital domain. Their excellence and efficiency have already changed the way many are made, cutting costs by simplifying the shoot and avoiding some expensive on-location work. In digital systems, the key is a full-bandwidth signal (like Y, luminance) and is often associated with its foreground video when stored. Disk-based nonlinear systems can store and replay this video-with-key combination in one operation, but it would take two VTRs. See also: Blue Screen, 4:2:2:4, 4:4:4:4

LCRS
The speaker configuration for the Dolby Stereo or Dolby Pro Logic format: Left, Center, Right, and Surround. There are, in fact, two surround speakers in the format; however, they both get the same signal.

Letterbox
A popular method for showing widescreen images in their original framing on non-widescreen 4x3 displays. When a widescreen image is sized to show its full width on a 4x3 screen, the height of the image is less than the height of the screen. A letterboxed program compensates by centering the image vertically and adding black bands above and below the image.


LSB
The Least Significant Bit. This is the last digit, and so the smallest in value, of a binary number. For instance:


Binary 11111110 = decimal 254
Binary 11111111 = decimal 255


Changing the LSB here alters the value by just one. See also: MSB, Truncation

Macroblock
A 16 x 16 pixel block, comprising four adjacent DCT blocks. Macroblocks are used to generate motion vectors in MPEG-2 coding. Most coders use a block-matching technique to establish where the block has moved, and so generate motion vectors to describe the movement. This works most of the time but has its well-known moments of failure. For example, slow fades to black tend to defeat the technique, making the resulting misplaced blocks quite visible. Better technologies are available for use in movement estimation, such as phase correlation. See also: DCT
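
A minimal sketch of the block-matching idea in Python: an exhaustive search that minimizes the sum of absolute differences (SAD) over a small window. Real coders use far smarter search strategies and, as noted above, the technique has failure modes:

    import numpy as np

    def motion_vector(prev, curr, by, bx, search=7, size=16):
        # find where the block at (by, bx) in `curr` came from in `prev`
        block = curr[by:by + size, bx:bx + size].astype(int)
        best_sad, best_mv = None, (0, 0)
        for dy in range(-search, search + 1):
            for dx in range(-search, search + 1):
                y, x = by + dy, bx + dx
                if y < 0 or x < 0 or y + size > prev.shape[0] or x + size > prev.shape[1]:
                    continue
                sad = np.abs(prev[y:y + size, x:x + size].astype(int) - block).sum()
                if best_sad is None or sad < best_sad:
                    best_sad, best_mv = sad, (dy, dx)
        return best_mv   # the vector sent in place of the block's pixels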

M-JPEG
JPEG compression applied to moving pictures. As the detail contained within each frame varies, some decision is required as to whether to use a constant bitrate scheme or constant quality. See also: AVR, JPEG

Motion Vectors
See Macroblock, MPEG-2


MPEG
Moving Pictures Expert Group. A group of industry experts involved with setting standards for moving pictures and sound. These are not only those for the compression of video and audio (such as MPEG-2 and MP3) but also include standards for indexing, filing, and labeling material. Website: www.mpeg.org

MADI
Multi-channel Audio Digital Interface (AES 10). This uses a single BNC connector that allows up to 56 channels of 24-bit audio transfer.

Media Composer
This series of nonlinear editing systems has formed the core part of Avid's business over recent years. The most recent addition is Media Composer Adrenaline, the first HD-expandable Media Composer system. There are many permutations of hardware platforms, video cards, and breakout boxes on both Apple Mac and PC platforms. Seen as the de facto standard in editing for both online and offline work, Media Composer touches 90 percent of all film work and 80 percent of mainstream television production. See also: AVR

MPEG-2
ISO/IEC 13818-1. This is a video compression system primarily designed for use in the transmission of digital video and audio to viewers, using very high compression ratios. Its importance is significant, as it is used for all DTV transmissions worldwide, SD and HD, as well as for DVDs and many other applications where high video compression ratios are needed. The profiles and levels table (below) shows that it is not a single standard but a whole family, which uses similar tools in different combinations for various applications. Although all profile and level combinations use MPEG-2, moving from one part of the table to another may be impossible without decoding to baseband video and recoding.

Metadata
Metadata is data about data. Essence, or video and audio, is of little use without rights and editing details; this information also adds long-term value to archives. Metadata is any information about the essence, for instance how, when (timecode), and where it was shot, who owns the rights, what processes it has been, or should be, subjected to in postproduction and editing, and where it should be sent next. Uses with audio alone include AES/EBU with metadata to describe sample rate; metadata in AC3 also helps the management of low frequencies and the creation of stereo down-mixes. Typically the audio and video essence is preserved as it passes through a production system, but the metadata is often lost. Avid, with OMF technology, and the AAF association have both done much to rectify this for the area of editing and postproduction. See also: AAF, Essence, OMF


Profile Level

Simple 4:2:0 I, B

Main 4:2:0 I, B, P

422P 4:2:2 I, B, P

SNR* 4:2:0 I, B, P

Spatial* 4:2:0 I, B, P

High 4:2:0, 4:2:2 I, B, P

However, the data pipes for ATSC (19.2 Mb/s) or DVB (20 Mb/s, depending on channel width, parameters etc.) imply the need for around 40:1 compression. See also: DCT, GOP, Intra-frame Compression, Inter-frame Compression, Macroblock

High

1920x1152 80 Mb/s

1920x1152 100 Mb/s

MPEG-4
ISO/IEC 14496. MPEG-4 is designed for conveying multimedia by representing units of aural, visual, or audiovisual content as media objects, which can be of natural or synthetic origin (from microphones, cameras, or computers). It describes the composition of the object to create scenes which can be interacted with, at the receiver. Multiplex and synchronizing data is defined so that the whole data can be transmitted over networks or broadcast. The resulting transmission bitrates can be very low. The media objects are given a hierarchical structure e.g., still images, video, and audio objects which may be two or three-dimensional. A compound media aural and visual object (AVO) could be a talking person and their voice. Thus, complex scenes are more easily composed. Binary Format for Scenes (BIFS) describes the composition of scenes. A weather forecast, for example, might be composed of a background map, weather symbols (cloud, sun, etc.), a talking head and audio; the BIFS would describe how these would be composed and run together. The whole would require a small fraction of the data needed for MPEG-2. There could also be scope for the viewer to interact with the objects. The standard is very broad-based with bitrates from 5 kb/s to 10 Mb/s, progressive and interlaced scans, and resolutions from less than 352x288 to beyond HD.


Profiles outline the set of compression tools used; Levels describe the picture format/quality, from high definition down to VHS. There is a bit rate defined for each allocated profile/level combination. In all cases, the levels and bit rates quoted are maximums, so lower values may be used. The combinations applicable to modern HD are those at the High-1440 and High Levels.

MPEG-2 is deliberately highly asymmetrical: decoding is far simpler than encoding, so millions of viewers enjoy reasonably priced receivers while a few broadcasters incur the higher costs. Coding has two parts. The first uses DCT-based intra-frame (I-frame) compression and quantizing to reduce the data, almost identically to JPEG. The second involves inter-frame compression: calculating the movement of macroblocks and then substituting just that information for the pictures between successive I-frames, making a GOP. The movement is conveyed as motion vectors, showing direction and distance, which amounts to far less data than is needed for I-frames. Motion vector calculation is not an exact science, so there can be huge differences in quality between different MPEG compressors. Decompression is deterministic, so all decompressors should produce the same result.

The encoding process necessarily needs to look at several frames at once and so introduces a considerable delay; similarly, the decoder delays pictures. For transmissions this can add up to over a second. Where MPEG-2 is used on broadcast contribution circuits, this becomes noticeable when news reporters appear to delay answering a question.

To fit HD video and audio down a transmission data pipe requires very high compression. Uncompressed 10-bit 4:2:2 HD requires up to 1.244 Gb/s; MPEG-2, being 8-bit and sampled at 4:2:0, brings the data down to 746 Mb/s. However, the data pipes for ATSC (19.2 Mb/s) or DVB (20 Mb/s, depending on channel width, parameters, etc.) imply the need for around 40:1 compression. See also: DCT, GOP, Intra-frame Compression, Inter-frame Compression, Macroblock
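The arithmetic behind these figures is easy to verify. The short Python sketch below (illustrative only; the function and names are not part of any broadcast system or Avid product) reproduces the uncompressed data rates and the resulting compression ratio:

    def video_data_rate_mbps(width, height, fps, bits_per_sample, samples_per_pixel):
        # Raw video data rate in megabits per second.
        return width * height * fps * bits_per_sample * samples_per_pixel / 1e6

    # 10-bit 4:2:2: one luma sample per pixel plus, on average, one chroma
    # sample (Cr and Cb each run at half the luma rate): 2 samples per pixel.
    print(video_data_rate_mbps(1920, 1080, 30, 10, 2))     # 1244.16 Mb/s (1.244 Gb/s)

    # 8-bit 4:2:0: luma plus quarter-rate Cr and Cb: 1.5 samples per pixel.
    source = video_data_rate_mbps(1920, 1080, 30, 8, 1.5)  # 746.496 Mb/s
    print(source / 19.2)                                   # ~38.9, i.e., around 40:1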

MPEG-7
This is the Multimedia Content Description Interface. Still under development, it aims to create a standard for describing multimedia content data in a way that conveys some sense of the information's meaning and that can be passed onto, or accessed by, a device or computer code. This will enable effective access to content for a broad range of applications. There are three main elements used to describe content: Descriptors (D); Description Schemes (DS), which specify the relationships between Descriptors and Description Schemes; and a Description Definition Language (DDL) used to create new DSs and Ds and to modify existing DSs. System tools support transmission in both textual and binary formats for efficient storage and transmission, management, and protection of intellectual property. MPEG-7 will make it far easier to find digitally stored information. Applications are very widespread, from electronic program guides, digital libraries, e-commerce, and education to dating services. It may become possible to use new methods, such as drawing a few lines or singing a few notes, as keys to a search.

Multi-format (Audio)
The ability of an audio production system to handle all the current multi-channel audio formats: stereo (ST), LCRS, 5.1, 6.1, and 7.1.

MXF
Material Exchange Format, supported by the Pro-MPEG Forum, is aimed at the exchange of program material between file servers, tape streamers, and digital archives. It usually contains one complete sequence, though this may comprise a sequence of clips and program segments. MXF is derived from the AAF data model and integrates closely with AAF files, so it bridges the worlds of file and streaming transfers, helping to move material between AAF file-based postproduction and streaming program replay over standard networks. Together, the two formats extend reliable essence and metadata pathways from content creation to playout. The MXF body carries the content, which can include MPEG, DV, and uncompressed video, as an interleaved sequence of picture frames, each with audio and data essence, plus frame-based metadata. MXF has been submitted to the SMPTE as a proposed standard. Website: www.pro-mpeg.org See also: AAF

MPEG-21
MPEG-21 aims to define a multimedia framework enabling transparent and augmented use of multimedia resources across the wide range of networks and devices used by different communities, providing a big picture of how these elements relate to each other and fit together. The result is an open framework for multimedia delivery and consumption, with both the content creator and the content consumer as focal points, so that content creators and service providers have equal opportunities in an MPEG-21-enabled open market. This will also give the content consumer access to a large variety of content in an interoperable manner. Standardization activity areas include Digital Item Declaration (DID), Digital Item Identification and Description (DII&D), and Intellectual Property Management and Protection (IPMP).


Non-additive Mix
See Keying

OCN
Original Camera Negative film. This has very high value, is handled with great care and, to avoid damage, as little as possible. The onward path toward making a program involves either scanning the OCN and proceeding in the digital domain, or copying to make an interpositive film, and so on into the film production chain. Note that whichever path is taken, there is no way to make a perfect copy of the OCN whereas, if using HD digital originals, exact lossless clones can be made.

MSB
The Most Significant Bit. This is the first digit, and so the largest in value, of a binary number. For instance:

Binary 11111111 = decimal 255
Binary 01111111 = decimal 127

Changing the MSB here alters the value by 128. Note that the MSB is sometimes used to indicate the sign of a number, e.g., 0 for positive numbers, 1 for negative numbers. See also: LSB
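A quick check of the example, here in Python:

    print(0b11111111)                  # 255
    print(0b01111111)                  # 127
    print(0b11111111 ^ 0b10000000)     # 127: clearing the MSB removes 2**7 = 128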

OMF
Open Media Framework. An industry standard initiated by Avid Technology to allow metadata and essence to move from system to system regardless of platform or operating system. OMF was also the foundation for what was to become AAF, and it is still in use in many workflows, such as audio post. See also: AAF


Photo Real
Expression describing effects-generated material that looks as if it originated from a camera. This may apply to computer-generated objects or to items shot on camera and composited into the picture. Here, attention to detail such as shadows and reflections, as well as keying, is needed to maintain the illusion. Achieving such quality at HD and film resolutions is all the more demanding, as their bigger, sharper displays make detail, including errors, easier to see.

Publishing
As various television formats are now in use, programs may be made in a format to suit general publication rather than just one particular broadcaster; certainly, program makers are increasingly seeking wider sales, often in the international marketplace. Before the arrival of HD, SD programs were often subjected to standards conversions but, even today, the complexities of changing frame rates still involve some loss of image quality. The arrival of HD, and the designation of 1080/24p as a common image format, has enabled publishing to deliver far better quality to all markets from a single edited master. The large picture format means that any change in image size will be down rather than up, and the frame rate can be translated, or re-mapped, onto the 50 and 60 Hz systems used around the world. See also: 2:3 Pulldown, Down-Res, Universal Master

Plug-ins
A generic term for software that can be added to an existing application to enhance its functionality. Nonlinear video and audio systems often gain new effects or extended functionality via plug-ins.

PPM
Peak Program Metering uses a ballistic response characteristic for metering TV broadcast audio. The system provides a clear way of measuring audio to prevent overloads and spikes. See also: VU

QPSK and QAM


QPSK (Quadrature Phase Shift Keying) and QAM (Quadrature Amplitude Modulation) are methods of carrier modulation that define the number of signal states used to carry data in various communications systems, such as DVB-T. They enable more states, and therefore more data, to be carried within a specified bandwidth, with a trade-off in robustness. QPSK is, in fact, 4-QAM and describes the four states of two bits per carrier; 16-QAM (16 states of four bits per carrier) carries the same data in half the bandwidth, and 64-QAM describes the 64 states of six bits per carrier. Therefore, within a given bandwidth, 16-QAM carries twice the data rate and 64-QAM three times the data rate of QPSK. Such methods are used in areas such as telephone modems and for the transport of digital information in DVB transmission systems. Depending on the chosen modulation of the single carriers in DVB-T, two bits per carrier (QPSK), four bits per carrier (16-QAM), or six bits per carrier (64-QAM) are transmitted. See also: DVB
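The bits-per-carrier figures follow from the base-2 logarithm of the number of modulation states, as this small illustrative Python sketch shows:

    from math import log2

    for name, states in [("QPSK (4-QAM)", 4), ("16-QAM", 16), ("64-QAM", 64)]:
        bits = int(log2(states))            # bits carried per symbol
        print(f"{name}: {states} states = {bits} bits per carrier, "
              f"{bits // 2}x the QPSK data rate")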

Progressive


Sequence for scanning an image where the vertical scan progresses from line 1 to the end in one sweep. In HDTV a number of progressive vertical frame (refresh) rates are allowed and used: 24 Hz is popular for its compatibility with motion pictures and its ability to be easily translated into all the world's television formats; 25 and 30 Hz correspond to existing SD frame rates (although those use interlaced scans); 50 and 60 Hz are also allowed but, due to bandwidth restrictions, are limited in picture size, e.g., 1280 x 720/60p.

Today, progressive scanning is most commonly found in computer displays. It produces rock-steady images where the detail is easy to see, with refresh rates running up to, and beyond, 100 Hz. For the equipment designer, progressive images are generally easier to process, as there is no difference between the two fields of a frame to contend with.

Progressive scans do have disadvantages, arising from images being vertically refreshed only once per frame. At the lower rates of 24, 25, and 30 Hz, which can be used in television with the larger 1080-line formats, there would be considerable flicker on displays unless there were some processing to display each picture twice (as with double shuttering in cinema projectors). One way around this is to use progressive segmented frames (PsF). Besides flicker, the other potential problem area is fast action or pans, as the lower refresh rate means that movement will tend to stutter. It was to solve precisely these problems that interlace has been used for television. See also: 24PsF, Interlace, Kell Factor

RAID
Redundant Array of Independent (or Inexpensive) Disks. By grouping disk drives together with an appropriate RAID controller, it is possible to offer faster data rates, greater storage capacity, and protection from loss of data should a drive fail. Given the storage requirements of HD, no single disk drive can offer sufficient data rate to support uncompressed operation, but a RAID can. Most RAIDs are developed for IT-based applications; those specifically designed for on-line video operation should offer the continuous performance required, even during a disk-fail condition.

RAIDs protect data by recording additional information beyond the basic digital video and audio. This may take the form of a complete mirror of the data, involving twice the storage, or of a checksum of the data stored across several drives. Then, if one drive drops out, the missing data can be calculated from the remaining good disks and the checksum (see the parity sketch after the levels below). In a similar way, after swapping out the faulty drive, the missing data is reconstructed and recorded to the new drive. Several types of protection/configuration, termed levels, have been developed to suit individual data storage requirements:


SAN
Storage Area Network. This is an increasingly popular method of providing large amounts of storage for shared use. Typically there is a collection of disk drives or RAIDs connected via a Fibre Channel infrastructure. SANs can be used in a number of different ways. First, the pool of storage can be partitioned so that each client gets access to a particular allocation of the storage. Second, there can be volume-level locking, where clients can share files for reading but only one can write to the system at a time. Third, and most sophisticated, a custom file system that allows multiple readers/writers, such as Avid Unity MediaNetwork, can be overlaid on the SAN. With file sharing, all clients can have access to material, and work groups can be assembled to work together on the various aspects of projects. This has a big impact on workflow and is seen by many as the way to shape areas such as postproduction. In a similar way, other areas of operation, such as production and transmission, can be pulled together. Note there is also potential for simplifying back-ups by use of a single back-up operation for the whole SAN. Not all SANs are the same: the effectiveness of operation is very dependent on the suitability of the SAN configuration and the software. See also: JBOD, Fibre Channel, SCSI

[Figure: A SAN sharing storage between workstations. Workstations connect through Fibre Channel switches to shared storage of disks and tapes.]

Level 0/1 or 10 (mirroring and striping)
RAID 0/1 is a dual-level array that combines multiple RAID 1 (mirrored) sets into a single array, with data striped across all the mirrored sets. Compared with RAID 5, where lower cost and fault tolerance are the priorities, RAID 0/1 uses more drives in order to provide better performance. Each drive in the array is duplicated (mirrored), which eliminates the overhead and delay of parity. This provides the highest performance with data protection.

Level 0/5 or 50
RAID 0/5 is a dual-level array that combines multiple RAID 5 sets into a single array. In a RAID 0/5 array, a single hard drive failure can occur in each of the RAID 5 sets without any loss of data on the entire array.
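The checksum protection mentioned above can be illustrated with exclusive-OR parity, the principle behind parity-based RAID levels. A toy Python sketch (illustrative only; real controllers work on whole drives, not four-byte strings):

    def xor_blocks(blocks):
        # Bytewise exclusive-OR of equal-length blocks.
        out = bytearray(len(blocks[0]))
        for block in blocks:
            for i, byte in enumerate(block):
                out[i] ^= byte
        return bytes(out)

    stripes = [b"AAAA", b"BBBB", b"CCCC"]   # data striped across three drives
    parity = xor_blocks(stripes)            # checksum stored on a fourth drive

    # The drive holding stripes[1] fails: recompute its data from the rest.
    rebuilt = xor_blocks([stripes[0], stripes[2], parity])
    assert rebuilt == stripes[1]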

RGB
Red, Green, and Blue. Cameras, telecines, and most computer equipment originate images in this color space. For digital sampling, all three colors are sampled in the same way at full bandwidth hence 4:4:4. 4:4:4 images may offer better source material for the most critical chroma keying, but they occupy 50 percent more data space than 4:2:2 and, as no VTRs record 4:4:4, data recorders or disks must be used to store them. Also, there are no television means to connect them, so IT-based networking technology is used. Often 4:4:4 is only used in postproduction areas and is converted to 4:2:2 when material is more widely distributed. See also: 4:4:4, Gamut


Rounding
The intelligent truncation of digital numbers to produce a more accurate result. Several methods have been developed for invisible rounding of digital video, so that a process such as changing from twenty bits to ten does not simply discard the lower ten bits. This usually involves using the value of the lower ten bits to modify the remaining ten bits in some random or pseudo-random way. Good rounding can be applied many times to the signal without causing accumulated defects or generation loss. See also: Truncation


SCSI
Small Computer Systems Interface. This is a high-performance disk interface generally used where the recording and replay of high-speed data is required, which makes SCSI popular for video applications. There are several flavors of the interface, depending on speed (number of transfers per second) and width, either Narrow (eight bits) or Wide (16 bits). The interfaces are capable of the following maximum transfer rates, though this does not mean the devices (disks) supply that data rate:

Normal SCSI    5 M transfers/s
Fast SCSI      10 M transfers/s
Ultra SCSI     20 M transfers/s
Ultra SCSI 2   40 M transfers/s

The connections involve either a 50-pin or 68-pin cable running between 3 and 25 meters. The number of devices allowed (which includes the host adapter) varies with the version and, for Ultra and faster speeds, the maximum cable length for some signaling types depends on the number of devices on the chain. The most recent version of SCSI, Ultra320, supports up to 16 devices.

SDII
SDII (Sound Designer II) is the original sound file format developed by Digidesign for its audio editing platform. The format was later adopted by Avid for its early Mac-based nonlinear video systems.

SDTI
Serial Data Transport Interface (SMPTE 305.2M). This is a development of SDI, using the same connector and cable to carry packetized data. It transports data such as MPEG-2, IMX, DV, and even HDCAM and DVCPRO HD compressed video in real-time (DV can be carried 4x faster than real-time). As much of the SDI infrastructure will support SDTI, the two can be used together, and many routers are both SDI- and SDTI-compatible. SDTI carries packetized data but has limited application in general IT. SDTI-CP (content package) standardizes the format of the data sent down the cable, opening it to more general data-carrying applications.

Segmented Frame
See 24PsF

SFN
Single Frequency Network. A collection of terrestrial transmitters operating on the same frequency to provide continuous reception over a large area. SFNs are highly efficient in spectrum usage and ideal for mobile applications. The requirements are that all transmitters carry exactly the same programming, are kept locked in phase and frequency, and have their power adjusted to create even coverage and avoid destructive interference. See also: COFDM, DVB, ISDB

SDDS

Sony Digital Dynamic Sound. This is a multi-channel film encoding and delivery system developed by Sony. It uses the outer edges of either side of the 35mm print to provide a 7.1 channel audio format.

SMPTE
Society of Motion Picture and Television Engineers. Based in the USA, it has branches and members in many countries around the world. Among its activities, SMPTE members and committees are involved in the setting of many of the industry standards and recommended practices. Its involvement with both film and television means it is well placed to help with the convergence of these media and has set up DC 28 to create standards for D-cinema. Website: www.smpte.org

SDI
Serial Digital Interface (SMPTE 259M). This places real-time 601 4:2:2-sampled digital video onto a single 75-ohm coax cable with a BNC connector, over lengths up to 200m depending on cable type. It supports up to 10-bit video and has a data rate of 270 Mb/s. Four groups of four channels of AES/EBU digital audio can also be embedded into the video. Most modern video equipment is supplied complete with SDI connectors.

SDIF

Sony Digital Interface, used on early Sony CD mastering systems. SDIF 3 is the latest version and uses the DSD (Direct Stream Digital) format for 24-bit, 192 kHz recording and transfer.

SMPTE 274M

SMPTE 274M defines 1080-line HD television scanning for multiple picture rates. These are all 1920 x 1080 pixels and define progressive frame rates of 60, 59.94, 50, 30, 29.97, 25, 24, and 23.98 Hz, as well as interlaced rates of 60, 59.94, and 50 Hz. Note that progressive rates above 30 Hz are not used in television due to their vast data rate requirement. The 1000/1001 offset frequencies (59.94, 29.97, 23.98) are legacies from broadcast NTSC, where 59.94 Hz (not 60) is used to avoid frequencies interfering within the transmitted signal; thus NTSC/HD simulcasts stay in sync, and conversion between SD and HD is facilitated. When NTSC is switched off, only the nominal frequencies need be used. 274M also defines mapping the 1080-line pictures onto a 1125-line system. See also: 1000/1001, ITU-R BT.709, Table 3


SMPTE 292M
See HD-SDI

SMPTE 296M
SMPTE 296M defines 720-line x 1280-pixels-per-line HD television scanning for progressive 60 and 59.94 Hz picture rates.

SPDIF
Sony/Philips Digital Interface. A domestic two-channel digital audio interface that is capable of 24-bit transfer.

Square Pixels
The pixel aspect ratio where each pixel describes a square area of the displayed image. This is the case with HD, as the picture format standards define line length (number of pixels per line) and number of lines in exact 16:9 ratios, which is also the display aspect ratio of the pictures. Generally, computers generate images with square pixels, but SD television pixels are not square. This means care is needed when transferring images between applications to maintain correct aspect ratios (so circles remain circular). See also: Anamorphic, Aspect Ratio

Storage Capacity
Fixed disk capacity is measured in megabytes and gigabytes, but editors need to know store size in terms of length of video. For any store, the storage time available depends on the video format used, and HD offers many. The calculation here is based on a commonly used format, 1080/30 (1080 lines, 1920 pixels per line, at 30 frames per second) using 4:2:2 sampling at 10 bits:

Luminance samples per line                          = 1920
Chrominance samples per line (Cr + Cb)              = 1920
Total samples per line                              = 3840
Samples per picture           = 3840 x 1080         = 4,147,200 (4.15 M samples)
10-bit data per picture       = 4.1472 x 10/8       = 5.184 M (8-bit) bytes
Data per second @ 30 fps      = 5.184 x 30          = 155.52 Mbytes/s
Data per hour                 = 0.15552 x 3600      = 560 Gbytes/h

In a similar way, storage for some other formats can be quickly worked out:

1080/24 sampled at 4:2:2 uses                            448 Gbytes/h
720/60 sampled at 4:2:2 uses                             498 Gbytes/h
1080/30 with 4:4:4 sampling uses                         840 Gbytes/h
2K film (1536 x 2048, 24 frames/s, 4:4:4 sampling) uses  1.02 Tbytes/h

Typically, between two and four hours of storage is considered sufficient for editing, meaning that massive disk capacities are needed. These are usually supplied by grouping drives into RAIDs, to offer both the capacity and the data speed for real-time working, as well as data recovery from fault conditions.
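The same arithmetic can be wrapped in a small function to estimate other formats. An illustrative Python sketch (the function name and defaults are not from any Avid tool) that reproduces the figures above:

    def gb_per_hour(width, height, fps, bits=10, samples_per_pixel=2):
        # Uncompressed storage in gigabytes per hour.
        bytes_per_frame = width * height * samples_per_pixel * bits / 8
        return bytes_per_frame * fps * 3600 / 1e9

    print(round(gb_per_hour(1920, 1080, 30)))                       # 560 (1080/30 4:2:2)
    print(round(gb_per_hour(1920, 1080, 24)))                       # 448 (1080/24 4:2:2)
    print(round(gb_per_hour(1280, 720, 60)))                        # 498 (720/60 4:2:2)
    print(round(gb_per_hour(1920, 1080, 30, samples_per_pixel=3)))  # 840 (1080/30 4:4:4)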

Standards Conversion
The process of translating video from one television standard into another. This may involve changing the number of lines, pixels per line (spatial interpolation), frames per second, or the interlace/progressive sequence (temporal interpolation). HD introduces many new video formats and multiplies the permutations for standards conversion, both within HD and between HD and SD, leading to up-conversion (Up-Res), down-conversion (Down-Res), and cross-conversion. HD's large pictures demand powerful processing to maintain quality and real-time operation. Up-Res and Down-Res are generally used to describe changes in picture size without affecting picture rates.

If the field or frame rate remains unchanged, such as a conversion from 1080/24p to 720/24p, then the only requirement is for spatial interpolation. Conversions with the same overall frame rate from progressive to interlace, such as 720/30p to 1080/60i, are equally straightforward. However, those from interlace to progressive, even at the same frame rate, are more complex, as any movement between a pair of interlaced fields must be detected and resolved into one progressive frame. The technique becomes more involved for the temporal interpolation needed to change frame rates, e.g., from 60i to 50i. To maintain smooth movement and picture sharpness, accurate in-between pictures have to be calculated for mapping onto the new fields that occur at different times to the originals. For best results this demands that all movement is accurately analyzed, so that objects in the video can be correctly placed at times that were not recorded in the original. Although conversion techniques have improved immensely over the years, quality varies, and the complex processes of temporal interpolation are worth avoiding where possible. Ideally, conversion should be reserved for final publication, not used as a part of editing and postproduction. Certainly the use of two temporal interpolation processes anywhere in the scene-to-screen chain is unlikely to look good. See also: Publishing, Interpolation



Table 3
The video formats allowed for broadcast in the ATSC DTV standard are listed in Table 3 of document ATSC Standard A/53 C.
Table 3 Compression Format Constraints

Vertical size   Horizontal size   Aspect ratio    Frame rate code     Progressive
value           value             information                         sequence
1080            1920              1, 3            1, 2, 4, 5          1
1080            1920              1, 3            4, 5                0
720             1280              1, 3            1, 2, 4, 5, 7, 8    1
480             704               2, 3            1, 2, 4, 5, 7, 8    1
480             704               2, 3            4, 5                0
480             640               1, 2            1, 2, 4, 5, 7, 8    1
480             640               1, 2            4, 5                0

Legend for the MPEG-2 coded values in Table 3:
Aspect ratio information: 1 = square samples; 2 = 4:3 display aspect ratio; 3 = 16:9 display aspect ratio
Frame rate code: 1 = 23.976 Hz; 2 = 24 Hz; 4 = 29.97 Hz; 5 = 30 Hz; 7 = 59.94 Hz; 8 = 60 Hz
Progressive sequence: 0 = interlaced scan; 1 = progressive scan

This table lists no fewer than 18 DTV formats for SD and HD. Initially, this led to some confusion about which should be adopted in which circumstances. Now most HD production and operation is centered on the 1080-line formats, with 24p, 25p, or 60i vertical scanning. See also: DVB

Surround Panning
This is the positioning of a sound anywhere within the surround domain, either using front/back and left/right control or via a joystick controller. With automation, this function can track the on-screen motion of characters or effects.

Symphony
Avid Symphony is a nonlinear editing and finishing tool with real-time effects processing which offers advanced color correction, DVE, captioning and titles. Working at SD, its universal mastering allows users to generate both NTSC and PAL versions of an edit in real-time from a 24p master.


System Nomenclature


A term used to describe television standards. The standards are mostly written in a self-explanatory form but there is room for confusion concerning vertical scanning rates. For example, 1080/60i implies there are 60 interlaced fields per second that make up 30 frames. Then 1080/30p describes 30 frames per second, progressively scanned. The general rule appears to be that the final figure always indicates the number of vertical refreshes per second. However, Table 3 uses a different method. It defines frame rates (numbers of complete frames) and then defines whether they are interlaced or progressive. So here the frame rate code 5 is 30 Hz which produces 30 vertical refreshes when progressive, and 60 when interlaced. See also: Interlace, Progressive, Table 3
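Under the general rule described above, a system name can be decoded mechanically, as in this illustrative Python sketch (not any standard's definition):

    def parse_format(name):
        lines, rest = name.split("/")
        scan = rest[-1]                     # 'i' (interlaced) or 'p' (progressive)
        refreshes = float(rest[:-1])        # vertical refreshes per second
        frames = refreshes / 2 if scan == "i" else refreshes
        return int(lines), scan, refreshes, frames

    print(parse_format("1080/60i"))   # (1080, 'i', 60.0, 30.0): 60 fields, 30 frames
    print(parse_format("1080/30p"))   # (1080, 'p', 30.0, 30.0)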


TDIF
Tascam Digital InterFace, used on Tascam digital audio products. The format uses a 25-pin D-type connector to carry up to eight channels of 24-bit digital audio.

Timecode
A method of identifying the individual frames of video. It uses a 24-hour clock, counting hours, minutes, seconds, and frames. Most professional video recording equipment includes provision for recording timecode, which may be inserted into the vertical interval of the video or onto a separate track reserved for the purpose.
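For the non-drop-frame case, converting a frame count to timecode is simple modular arithmetic, as in this illustrative Python sketch (drop-frame counting, used with 29.97 Hz systems, needs extra rules not shown here):

    def frames_to_timecode(frame_count, fps=25):
        frames = frame_count % fps
        seconds = (frame_count // fps) % 60
        minutes = (frame_count // (fps * 60)) % 60
        hours = (frame_count // (fps * 3600)) % 24
        return f"{hours:02d}:{minutes:02d}:{seconds:02d}:{frames:02d}"

    print(frames_to_timecode(90_061, fps=25))   # 01:00:02:11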


Truncation (Rounding)
Reducing the number of bits used to describe a value. This is everyday practice; we may say 1,000 instead of 1,024, in the same way that we leave off the cents when talking about money. There is also a need to truncate the digits used in digital video systems. With due care this can be invisible; without it, degradation becomes visible. For example:

Decimal: 186 x 203 = 37,758
Binary: 10111010 x 11001011 = 1001001101111110

It is the nature of binary mathematics that multiplication, which is commonplace in video processing (e.g., mixing pictures), produces words with a length equal to the sum of the lengths of the two source numbers. For instance, multiplying two 8-bit video values produces a 16-bit result, which will grow again if another process is applied. Although highways within equipment may carry this, ultimately the result has to be truncated to fit the outside world which, for HD, may be a 10-bit HD-SDI interface or an 8-bit MPEG-2 encoder. In the example, truncating by dropping the lower eight bits lowers the value by 01111110, or 126. Depending on video content, and any onward processing where the error is compounded, this may or may not be visible. Typically, flat (no detail) areas of low brightness are prone to showing this type of discrepancy as banding, which is, for example, sometimes visible in computer-generated images. Inside equipment, it is a matter of design quality to truncate numbers in an intelligent way that will not produce visible errors even after further processing. Outside, plugging 10-bit equipment into 8-bit needs care. Intelligent truncation is referred to as Rounding. See also: Rounding
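A short Python sketch (illustrative, not any product's algorithm) makes the error concrete: plain truncation of the 16-bit product always errs low, by up to 255, while adding half an LSB before the shift, a basic form of rounding, centers the error on zero:

    import random

    a, b = 186, 203
    p = a * b                          # 37,758 = 1001001101111110 in binary
    print(p - ((p >> 8) << 8))         # 126: the error from dropping the low byte

    # Over many random products, truncation is biased low by about half an
    # LSB of the result, while rounding's average error is close to zero.
    random.seed(0)
    prods = [random.randrange(256) * random.randrange(256) for _ in range(100_000)]
    print(sum(q - ((q >> 8) << 8) for q in prods) / len(prods))           # large positive bias
    print(sum(q - (((q + 128) >> 8) << 8) for q in prods) / len(prods))   # near zero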

Universal Master
The 1080/24p format has well defined and efficient paths to all major television formats and is capable of delivering high quality results to all. An edited master tape in this format is sometimes referred to as a Universal Master. See also: HD-CIF, Publishing


Up-Conversion
A process that changes a smaller picture format to a larger one. This should include some color correction to allow for the differences between SD and HD set-ups. Up-conversion usually implies that, besides this spatial change, there will also be a temporal change, i.e., of the frame rate. For example, changing from 625/50 to 1080/60i is an up-conversion. See also: Down-Conversion, Multi-format Distribution (page 48), Standards Conversion, Up-Res

It is the nature of binary mathematics that multiplication, which is commonplace in video processing (e.g., mixing pictures), produces words of a length equal to the sum of the two numbers. For instance, multiplying two 8-bit video values produces a 16-bit result which will grow again if another process is applied. Although highways within equipment may carry this, ultimately the result will have to be truncated to fit the outside world which, for HD, may be a 10-bit HD-SDI interface or 8-bit MPEG-2 encoder. In the example, truncating by dropping the lower eight bits lowers its value by 01111110, or 126. Depending on video content, and any onward processing where the error is compounded, this may or may not be visible. Typically, flat (no detail) areas of low brightness are prone to showing this type of discrepancy as banding. This is, for example, sometimes visible from computer generated images. Inside equipment, it is a matter of design quality to truncate numbers in an intelligent way that will not produce visible errors even after further processing. Outside, plugging 10-bit equipment into 8-bit needs care. Intelligent truncation is referred to as Rounding. See also: Rounding

Up-Res
Changing from a smaller video format to a larger one, for example, going from SD to HD, as from 480/30i (525/60) to 1080/30i. This may require changing the aspect ratio, as 480/30i may be 4:3 or 16:9, but the HD output is always 16:9. Up-Res does not imply a change of frame rate; this example does not call for one and technically demands increasing the picture size by 2.25 (1080/480) horizontally and vertically, or by 3 (1080/360) if filling the 16:9 picture area from a 4:3 original. In addition, there should be a color correction, as the RGB phosphors of the two systems differ slightly. Although the resulting video will have an HD line count, it will not be up to normal HD-original standard. See also: Cross-Conversion, Down-Res, Standards Conversion, Up-Conversion
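The scaling factors quoted follow directly from the line counts; a two-line illustrative check in Python:

    sd_lines, hd_lines = 480, 1080
    print(hd_lines / sd_lines)              # 2.25: 16:9 SD scaled up to 16:9 HD
    print(hd_lines / (sd_lines * 3 / 4))    # 3.0: a 4:3 original crops to 360 lines
                                            # to fill the 16:9 HD picture area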

VU
Volume Unit metering system the standard ballistic system for measuring audio that gives an overall impression of the audio signal volume. See also: PPM

Unity MediaNetwork
See: Avid Unity MediaNetwork

Universal Format
1080/24p is sometimes referred to as the Universal Format for television. The reason is its suitability for translation into all other formats to produce high quality results in all cases. See also: HD-CIF, Publishing, Universal Master

WAV
An audio file format developed by Microsoft for its Windows operating system, which can contain audio coded in many different ways. WAV files contain metadata identifying which type of coding is used, assisting decoding. See also: BWF



Y, Cr, Cb
This signifies video components in the digital form. Y, Cr, Cb is the digitized form of Y, R-Y, B-Y.

Y, R-Y, B-Y
See Component Video

YUV
De facto shorthand for any standard involving component video. This has been frequently, and incorrectly, used as shorthand for SD analog component video Y, R-Y, B-Y. Y is correct, but U and V are axes of the PAL color subcarrier which are modulated by scaled and filtered versions of B-Y and R-Y respectively. Strangely, the term is still used to describe component analog HD. This is double folly. Although Y is still correct, all HD coding is digital and has nothing to do with subcarriers or their axes.


Topics

Production and Cost Advantages of HD


For the viewer, HD is a whole new experience. More detail and bigger, wider pictures are not only impressive and compelling but also far more involving. So viewers see obvious advantages and, given a choice, HD wins hands down over SD. Historically, HD production costs have been a barrier, mainly because of the high cost of equipment, but recent price falls and the continuing development of technology have changed the landscape. Today, anyone aiming at an international audience or creating content with any significant shelf life could take advantage of HD, even if there is no domestic HD audience; for those with HD audiences, HD production makes even more sense. And HD is not for acquisition only: most shows are mastered in HD, even if shot on film.

Obviously, motion picture budgets are greater than those for TV, but it is significant for both industries that top companies, directors, producers, and artists are already working with the HD medium. Visionaries like George Lucas and Robert Rodriguez are striving to push the boundaries of what is possible. They are eagerly embracing HD not just for the sake of it, but because it offers real benefits that translate into increased revenue potential, and they see many advantages to using HD in both production and postproduction.

Episodic television production is currently an area of rapid HD expansion. Many shows have been shot on 35mm or 16mm film for years, but it really should make sense to shoot television for television. Now it does, and there are real cost benefits in making the switch to HD. The use of HD video instead of film impacts the production process and cuts many of the traditional corners and delays. Up front, there is a huge difference in stock cost: HD can be as low as $2 per minute, and there are no lab bills. The immediacy of video also brings new working methods to the studio. There is no waiting for dailies and no telecine, while on-set monitoring means that everyone can see exactly what is being recorded as it happens.

After the shoot, working directly in the digital medium means there is easy access to all the tools that have otherwise only been accessible after the telecine transfer. With the whole show already in digital form, it is easy to use effects to clean up or add to the production, even more so now that nonlinear online editing is available for HD. Calibration and color correction issues are minimized, effects are more readily integrated, and matching CG shots with actual footage is simpler. Editing the footage is more efficient in the nonlinear environment, and changes, such as adding an alternative ending or creating a different version, can be made easily, even at a late stage in production. Once again, there is no wait for the lab, as everything (compositing, effects, retouching) is immediate and interactive. The whole show can be color corrected and finished and, with a complete lack of scans and format changes, along with the use of uncompressed video in post and editing, the show will still be in pristine first-generation quality.

If the production is aimed at an international audience, the use of 1080/24p as the production format makes sense. With this recognized as an international exchange format, the passage to worldwide digital distribution is extremely efficient, with all TV formats receiving very high-quality copies regardless of acquisition format. In addition, copying tape is far cheaper than copying film, and the use of 1080/24p means there is no need to scan and edit a 525 master and then go back and re-scan and re-edit a 625 master. From this view alone, considerable savings are made. There are those who still prefer the latitude of film for acquisition; even so, scanning that footage and continuing the production process in HD makes a great deal of sense.

Production genres such as natural history and period drama can expect to have appeal over many years, from generation to generation. Where the timetable for the rollout of HD into homes is either not planned or too vague to justify extra cost, the short- and long-term benefits of producing in HD still make a lot of sense. Just as 35mm material provides the best input for SD, so HD passes on much of its quality to the smaller formats. This is a top-down approach: working from large-scale images and reducing them to today's formats creates exceptionally good results, beyond those possible from an SD shoot. In addition, the bigger pictures leave more scope for post-shoot zooms and re-framing. Down the line, sales and reruns will have enhanced value, as they will already be in an HD format, and there will be no rush back to telecines and edit bays when the client specifies HD, which will increasingly be the case.

HD is being seen more and more as a solution for feature film postproduction rather than as a competing format. With HD dailies, screenings can be quickly conformed and projected to a large audience for feedback and review. High definition production offers many benefits outside the immediate market for HD television, and even though that market will grow, the advantages in other areas already make sense for a growing number of productions. Shooting sitcoms in HD rather than 35mm is a rapidly growing trend; the cost and quality paybacks are immediate, since these shows are typically shot with multiple cameras. Current SD productions can get an immediate quality and flexibility boost via the production of HD masters, as well as long-term increased value. Now that the HD production chain can use nonlinear editing and digital effects, complete with zero generation loss, the benefits accrue to the production and show through to the viewers.


Broadcast and HD
The ultimate success of high definition television depends on people watching it, and the means to reach the mass audience is broadcasting. Getting the bigger pictures and greater sound experience to viewers has involved, ultimately, the introduction of digital television broadcasting and the setting of numerous new standards. Many had hoped that HD would be the catalyst for a new world television standard, but legacy and politics make that improbable. Driven by different aims, there are now three broadcast systems for HD: ATSC from the USA, DVB from Europe, and ISDB from Japan. Broadly, ATSC's prime aim was to provide an HD service for the USA; DVB, emanating from Europe, was designed to be a flexible, international system; and ISDB offers a hierarchical SD/HD all-in-one solution. Each has defined terrestrial, cable, and satellite services, and each has an eye on the international market.

Two defining events have helped shape HDTV. The first was in 1990, when General Instrument (GI) proposed an all-digital HDTV system. The second, in 2000, was the adoption of ITU-R BT.709-4, based on a high definition Common Image Format (HD-CIF) of 1920 x 1080 pixels with a 16:9 picture aspect ratio, progressive scanning at 24, 25, and 30 fps, and both interlaced and progressive scanning at 50 and 60 pictures per second. The former event allowed the necessary technology to move forward, and the latter laid to rest the confusion of the 18 formats allowed in the ATSC's famous Table 3 (page 35), and the many more quoted by DVB.

Japan has been a prime mover in HD. NHK began development in 1970 and in 1986 began experimental broadcasting of the 1125/60i analog MUSE system (Multiple Sub-Nyquist Sampling Encoding) via its BS-2 satellite. One-hour daily test transmissions started in 1989, rising to 17 hours by 1997. Coverage of major spectacles, including the Atlanta and Sydney Olympics, the Nagano Winter Olympics, World Cup Soccer 98, and images of Earth from shuttle flights in 1998 and 2000, has also been a programming feature. According to JEITA (the Japan Electronics and Information Technology Industries Association), there were one million HDTV households in Japan by the time digital HD broadcasting started in December 2000, using ISDB-S and standardized on the 1080/60i video format. Uptake of digital set-top boxes and new digital receivers has been steep; according to JEITA, the target is 20 million by 2005. Currently, viewers have the choice of seven digital HD channels by satellite, all with 24-hour programming. For terrestrial coverage, ISDB-T is still under development and not expected to be commercially available until 2005.

HD terrestrial broadcasts in the USA began in earnest on 1 November 1998, although many stations covered the space shuttle Discovery's launch a couple of days earlier. With limited broadcasts, early adoption was partly fueled by interest in better quality images for home theatre applications with DVD. Since then, broadcasting hours and the number of stations carrying HD have increased dramatically, especially in 2003 and 2004. By 2003, cable and satellite providers offered HD programming at relatively low cost, and pricing of HDTV displays dropped below $1,000. As a result, sales of HDTV displays accelerated, with an estimated 4 million HDTV sets in place in the USA. Of the networks, CBS stands out, transmitting nearly all prime-time evening programming and some sporting events in HDTV. Significantly, an agreement with the satellite broadcaster EchoStar (DISH) will mean a rapid expansion of its HD coverage. There are already some other DBS (Direct Broadcast Satellite) service providers showing HDTV, and cable systems are also involved and expanding; for example, Action Sports Cable Network (ASCN) plans a 24/7 high-definition channel.

In Europe, an analog HD system was developed but never run as a public broadcast standard: HD-MAC (Multiplexed Analogue Components) used 1250 lines and could be received in 625-line form. Now DVB includes HD, but it is only used for SD, except in Australia, where HD broadcasting was due to start in January 2001 but the digital opportunity has first been exploited for multi-channel SD. Although European digital television services are rapidly spreading via terrestrial, cable, and satellite, there are no plans for HD. Nonetheless, an increasing amount of programming, especially that for international markets, is being produced in HD.

There is no doubt that the first attraction of HDTV is its bigger, wider, sharper pictures. Most people will make the double leap from analog conventional definition to digital high definition, and many comment that the digital HD experience is compelling. To go with the pictures, there is AC-3 digital surround sound with up to 5.1 channels, which adds to the experience. However, there is another side to HD broadcasts that some believe will be the greater pull: interaction. A benefit of larger, sharper pictures is that additional information, such as text, can be clearly displayed at the sides without spoiling the main action, and such information can be carried using just some of the broadcast data pipe. So-called active viewing effectively selects the appropriate data according to button pushes on the remote, rather like changing channels but more refined. For example, a re-run of the last play, or additional information about players in a match, can be recalled during the game; typically this would include text and graphics. By completing a return path to the broadcaster, e.g., by phone line, the interplay becomes interactive, with the broadcast data refined according to the viewer's selections. Viewers could vote on programs such as Big Brother, buy directly from commercials, register for an e-coupon, or seek more information about program content items such as recipes or clothing. Such T-commerce is expected to take off from 2005, with transactions estimated at nearly $45 billion.

The increasing volume of available HD material, and equipment costs approaching those of SD, together create a much more favorable environment for HD growth than has ever existed before. However, while there is talk of analog turn-off and reclamation of its spectrum, no one is talking of an end to SD; both resolutions will co-exist in a market with wider choice. See also: ATSC, DiBEG, ISDB, DVB

HD, Film and Digital Betacam: Advantages and Detractions


Today, there is more choice than ever of recording media for moving pictures. These break down into three areas: standard definition television, high definition television, and film. The choice is a result of continued technical development, allowing users to select the right medium for their application. This is not a case of recorder wars but much more a matter of fitting individual requirements. The selection should be based on factors such as program look, shooting environment, and cost.

Look
Probably the most commonly asked question of those involved with HD is: which is better, HD or film? And the reply should be: it depends on what you are doing and the look you want. Standard definition, high definition, and film all have different looks, as they acquire moving images in different ways, light affects each medium in different ways, and the route to our screens varies. As a result, the three will never look the same, although some special treatments can help to blur the boundaries. From this angle, it is a matter of basic program specification as to what the required look is.

Take the example of programming with fast action or quick pans, such as sports coverage. Here a high image refresh rate is needed to give good motion portrayal. Film, as used in television, is basically 24 progressive, shuttered frames per second (though it can be over-cranked to 25 or 30), which still offers a poor rendition of motion. Video offers similar or faster frame rates, with a choice of progressive or interlaced scans. As interlace effectively gives twice the refresh rate, it offers better rendition of motion.

There are other possibilities. Film cameras have shutters, so that the film is not exposed until it is held stationary in the gate; then, when the shutter opens, the whole frame receives light at the same instant. Video cameras have no absolute need of shutters, as there is no equivalent of film moving through the gate, but they can use them if required. Without shutters, the pictures contain the maximum of motion blur, which smoothes the look of movement, and the pictures are scanned line by line, not all from the same instant. But if you want to see clear detail of action, such as whether the tennis ball was in or out of court, then shuttering helps to produce clear, sharp action replay. However, the shuttered camera produces a judder effect on fast movement when viewed at normal speed, rather like film.

For look, there is a purely emotional argument to be made in favor of film's graininess and slightly dream-like look, which helps the audience to suspend disbelief. On the other hand, if accuracy and realism are required, then video produces a very clean and real-life result. Comments about the HD acquisition experience indicate that sets may need more dressing when moving from 35mm to a video shoot. The visibility of detail such as skin pores may drive a re-think in the makeup department, and one producer reported that it was no longer possible to recycle the same bunch of extras, as they were too easily recognizable.

There is one area where film will likely remain a better choice than HD for at least a few years: slow-motion effects. Film can be run at very fast speeds, which translates into very smooth, high-quality slow motion. To do this with HD would require a commensurate increase in data rate, which is not viable with today's technology.

The key is always to deliver the required experience to the viewer. If standard definition is required, then a Digital Betacam or Super 16 shoot will generally suffice. If there is any requirement for high definition, then HD or 35mm will need to be used. These will not only produce good results at HD; the down-converted SD will look superb too. However, although up-converting from SD to HD can sometimes produce remarkably good results, it is not the right way to support the top quality needed for HD and should be avoided where possible.

Costs and Production Style


As ever, cost is important. Spending money on a 35mm or HD shoot can be wasted if the only viewers are SD. However, there is no doubt that shooting on these formats produces the best looking results at SD. High-budget programming, such as commercials and widely distributed episodics, has often been shot on 35mm but, for the latter, there is a swing now towards HD shoots as the 24p frame rate offers easy, very high quality transfers to all world TV formats. Everyone has to keep an eye on budgets and it is generally accepted that HD offers savings over 35mm film. Certainly the stock is far cheaper and there are no lab or dailies expenses to consider. However, many prefer HD for its immediacy. Everyone can see the results as they happen and there is no waiting for the lab. As HD is television, there are all the possibilities for live cutting and effects and the creation of a far more interactive production scene.

Environment
The mechanics of videotape recording depend on a very close and accurate contact between the recording head and tape. Dust, dirt, shock, and vibration may cause the recording to temporarily drop out or fail. Also, such elements do not favor electronics. Certainly water and electronics do not mix. While video cameras have been used in many adverse extremes, they need care just as any other delicate instrument.

Latitude Characteristic of Film and TV


One area where electronic and film acquisition differ is in latitude. Television cameras operate well over their normal usable light intensity range but do not fail gracefully beyond it; they tend to cut off. So, in a bright outside scene containing a wide range of contrast, it may be difficult to keep the bright areas from clipping, or whiting out, while retaining detail in the shadows, or vice versa. On the other hand, film negative is formulated to provide very graceful performance far beyond the video contrast range. This leaves room for later grading in the lab to allow for shadows or bright areas, or even for bending the characteristics to include detail over a wider contrast range. Until the characteristics of electronic cameras are developed further, film will always be preferred by many where high-contrast scenes are involved. But where lighting can be more controlled, such as in a studio, the contrast argument need not apply.

Video for Film


Some productions prefer a film look but require television production. For this, if an interlaced format is used, its fields must be transformed into frames at half the field rate, a process that may sacrifice vertical resolution, and grain effects are then added. Despite the treatments, the look is not quite that of film-originated material. There is also increasing interest in shooting motion picture features on HD. To get as close as possible to the film technique, Sony has created its CineAlta camera, which records 1080/24PsF. Here, the whole area of each image is captured at the same instant, just as on film; segmented frame (sF) refers to the method of recording the images on otherwise standard HD equipment. Although there is no inherent grain, this is very close to the way film captures images. Recording media have evolved, and now there is a real choice between film and television at standard and high definition. Each of the recording systems produces different results; they are not better or worse versions of the same thing. The right selection will provide all the target markets with at least the quality they expect, convey the look that is required of the production, and fit within budgets. See also: 24PsF, CineAlta

Resolution
There is a choice of resolutions available in film and television. Popular film formats are 35mm and Super 16. While Super 16 does an excellent job for standard definition, many people prefer 35mm for high definition. However, there is nothing to stop either film format being scanned into any video format. In many cases, 35mm is favored for both.


Non-broadcast Use of HD
One thing everyone agrees on is the stunning picture quality that can be achieved with HDTV systems. Although it will take a while for countries, broadcasters, and viewers to switch over to HD en masse, anyone seeing the pictures for the first time is struck by their detail and clarity. Bigger, clearer, wider pictures draw viewers into the scenes and capture much more of their attention. It is a very different viewing experience that has applications beyond broadcast television itself.

Electronic cinematography is now being used for some motion picture shoots. One of the benefits is the immediacy of the medium: no waiting to see what comes back from the lab, as everyone can see the results on HD monitors as they are shot. George Lucas has famously made use of 1080/24p HD for the latest Star Wars features. Certainly this makes sense from many angles, in that there are no telecine scans needed and the material can fit straight into the HD production chain. With D-cinema, it can be digital all the way to the screen, preserving quality and adding to the flexibility of production. However, 35mm film still has many advantages over electronic acquisition. Current electronic cameras come nowhere near equaling the latitude of film negative and its ability to capture wide contrast ranges. Some would also point to the spatial resolution of film being far greater, and to the importance of the film look. No doubt discussions and preferences will continue for a long time yet, but the mere fact that end-to-end digital/electronic production can produce results that are appealing to cinema audiences is highly significant.

Cinema
Traditionally, the only place such an experience has been available is at the cinema. The rapid development of high-resolution digital projectors has transformed the prospects for the exhibition of motion pictures. Putting this together with HD video and 5.1 or other surround sound builds a new D-cinema operational model for distribution and exhibition that has the potential to challenge the use of film. 35mm film has been the world's one moving picture standard for a long time: it is accepted in any country and shown in cinemas and on television. Any system that intends to parallel it, or replace it, should have the same global acceptance. This is part of the work being undertaken by the SMPTE DC28 Task Force on Digital Cinema. There is an argument that D-cinema projection should seek not just to equal film, and therefore be an alternative, but to offer something better. However, nearly all those who have seen digitally projected movies say that it is already better than film. The lack of the all-too-familiar defects, such as scratches and film weave, is an attraction. In addition, there is the prospect of far more consistent quality over the whole distribution chain, as this medium does not noticeably wear after a week's showings, and all copies can be first-generation digital copies from the master. Whether or not HD standards become embedded into the DC28 results, they have already helped to show the potential of high definition applications in large-screen environments. At the same time, an eye has to be kept on costs, and there are big differences here. The cost of supplying and installing electronic projection equipment and its necessary storage system is still much higher than that for a film projector, but huge savings can be made on distribution. With long-term cost benefits, installations of D-cinema projection continue to take place. For example, new movies could be downloaded using telecommunications links over cable or satellite, or replay may occur directly from the studio. Either way, the program at the cinema could be swiftly altered to fit audience requirements.

Corporate
Everyone likes to look at attractive pictures. The ability of HD systems to show large, sharp images is being used as a part of corporate presentations and promotions to impress potential clients and investors. They offer an edge at trade shows, help pull in audiences and can put over messages while the audience is wowed by the quality of the presentation. In nearly all cases, large screens are used, either on panels or using projectors. With the costs of HD production coming down toward that of SD, the added value is increasingly likely to make sense.

Retail
Window dressing has for years been used to attract customers into shops. However, it is mostly static and limited, as there is only so much space available. HD's lifelike images can be used to offer moving presentations in this confined environment. For example, clothes in the window can also be shown as modeled on the catwalk. Maybe this would be more appropriate as a 9:16 portrait display rather than the traditional landscape, but it still uses the same HD technology. A lifelike, nearly life-sized model showing off the clothes in the window has great potential to draw customers into a shop. In a similar way, displays can be used throughout the shop to show clothes in use rather than just hanging on the peg. At the same time, while attention has been captured, other promotions and illustrations can be added to the display.


This does not just apply to clothes but to all goods and services. Demonstrations of cooking in a food store, holidays at a travel agent's, and gardening ideas are all possibilities. In the past, the cost of HD has limited its applications; now the newer technologies are driving down prices and so increasing its possible uses. Attractive imagery is a powerful attention grabber and can be used to promote a wide range of goods and services. See also: D-cinema

Progressive versus Interlaced


Traditionally, television images have been interlaced. The 625/50i (PAL) and 525/60i (NTSC) standards have depended on it for years. However, another part of the imaging industry, computers, has settled on progressive scans. There are good reasons for both, but it is only since the introduction of digital television and the HD formats that progressive scanning has been accepted for television, alongside the continued use of interlace.

Progressive
In an ideal world, there is no doubt that progressive scans would always be best. For any given number of lines, they offer the best vertical resolution and rock-steady detail, which is why we can easily read the text on our computer screens. These are typically refreshed at quite a high rate, around 70-100+ times a second. Taking into account our persistence of vision, this is plenty to avoid visible flicker, which becomes marginal at around 50 times a second (although this depends on the ambient light level). It is also fast enough to give good movement portrayal. The action of scanning itself imposes limits on the resolution of the scanned image. This is assessed by the Kell Factor, which states that the actual vertical resolution of a scanned image is only about 70 percent of its line count. This is due to the laws of physics and applies to both progressive and interlaced scans.
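As a worked example (a minimal sketch in Python; the 0.7 figure is the approximation quoted above), the Kell Factor implies the following effective vertical resolutions for common line counts:

```python
# Effective vertical resolution after the Kell Factor.
# The 0.7 value is the rule-of-thumb quoted in the text above.
KELL_FACTOR = 0.7

for lines in (480, 576, 720, 1080):
    print(f"{lines} scanned lines -> ~{lines * KELL_FACTOR:.0f} lines of resolved detail")
```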

Interlace
Broadcasting television pictures has always been constrained by the limited bandwidth available for their signals. Over the years, various devices have been used to compress or reduce the amount of transmitted information to fit the available bandwidth pipe. PAL, NTSC, MPEG-2, and interlace are all methods that help good-looking television pictures arrive at our homes via relatively skinny communications channels. An immediate interlace benefit is its support of 2:3 pull-down, which relies on there being 60 TV fields per second. How else would Hollywood's 24 frames-per-second movies have made it to 60-field television before the days of sophisticated standards conversion? All things being equal, if PAL or NTSC television pictures were progressive, they would require twice the bandwidth for transmission. By sending a field of odd lines followed by one of even lines, the whole picture is refreshed twice as often, but with each scan at half the vertical resolution. Allowing for the persistence of the eye, the perceived resolution is no different from progressive where there is no movement between interlaced fields, but it drops by about 30 percent during movement (see Interlace Factor). Another drawback is that some detail appears to bounce up and down by a line (twitter), for instance, when some edge detail is displayed on an even line but not the adjacent odd line. Thus, a time dimension has been added to spatial detail! The effect is that small text becomes tiring to read, and even still images seem slightly alive as some horizontal, or near-horizontal, edges twitter. Hence, interlace would be very poor for computer displays.
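To make the 2:3 pull-down cadence concrete, here is a minimal sketch (Python, with hypothetical frame labels) of how four film frames become ten video fields, so that 24 film frames fill exactly 60 fields per second:

```python
# 2:3 pull-down: 24 film frames/s mapped onto 60 video fields/s.
# Each film frame alternately contributes 2 then 3 fields, so four
# film frames become ten fields (five interlaced video frames).
def two_three_pulldown(frames):
    fields = []
    for i, frame in enumerate(frames):
        count = 2 if i % 2 == 0 else 3   # the 2, 3, 2, 3... cadence
        fields.extend([frame] * count)
    return fields

print(two_three_pulldown(["A", "B", "C", "D"]))
# ['A', 'A', 'B', 'B', 'B', 'C', 'C', 'D', 'D', 'D']
```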

[Figure: Interlaced scan. The object to be scanned is delivered as two fields; the apparent difference in vertical position between field 1 and field 2 makes object edges twitter.]

1080/720
There is a very straightforward example of the progressive/interlaced resolution trade-off in the ATSC Table 3 (page 37). There you find 1080/30i and 720/60p. Working the figures, the 33 percent reduction of vertical (and proportionally horizontal) picture size, combined with 60 progressive frames rather than 60 interlaced fields (30 interlaced frames is 60 fields), produces a reduction of data of just 11 percent, which is not really significant. It could be argued that the 1080-line interlaced format has higher vertical resolution for still picture areas and similar vertical resolution for moving ones, and it clearly wins on horizontal resolution (1920 versus 1280). The removal of twitter in the progressive images is a bonus.
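Working those figures as arithmetic (a quick sketch; decimal rounding only):

```python
# The data-rate comparison behind the "11 percent" figure above:
# 1080/30i delivers 30 full 1920 x 1080 frames/s (as 60 fields),
# while 720/60p delivers 60 full 1280 x 720 frames/s.
rate_1080i30 = 1920 * 1080 * 30   # pixels per second
rate_720p60 = 1280 * 720 * 60

print(f"1080/30i: {rate_1080i30 / 1e6:.1f} Mpixel/s")   # ~62.2
print(f"720/60p:  {rate_720p60 / 1e6:.1f} Mpixel/s")    # ~55.3
print(f"reduction: {(1 - rate_720p60 / rate_1080i30) * 100:.0f} percent")
```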

Hidden Progressive Benefits


Although resolution and movement portrayal are usually headlined, there are other important issues that concern the area of image processing. These hit the equipment designers first, but they can also impact production. Freezing a progressive image is relatively straightforward: just continuously display one frame. The image is completely static and retains its full resolution. Freezing an interlaced image is not so simple. If there is movement between the two fields, those areas will oscillate at frame rate, which is very distracting. Corrective action may take many forms. The simplest is to display only one field and repeat it in place of the other; the downside is that near-vertical lines become jagged. Another solution is to average between the lines of the frozen field to create the missing field; this softens the image but reduces the jagged near-vertical lines. The technically superior solution is to use both fields, detect any movement between them, and then, in those areas only, apply the averaging technique.

Graphics are often composed from live video grabs. These are freezes and are subject to the same limitations. In addition, good graphics design needs to be aware of interlace effects; softening a horizontal line, for example, can avoid a distracting twitter. Such effects also impinge on operations such as picture re-sizing in a DVE (Digital Video Effect). Here, DVEs apply a much more sophisticated form of the averaging technique to present clean-looking compressed images. Similar problems occur if there is movement between the two fields of a frame. Again, the DVE could be just field-based, reducing vertical definition (not so much a problem until attempting a seamless cut from full size to live), or it could detect movement and take evasive action. Again, these precautions aren't needed with progressively scanned images.

Further downstream, video is encoded into an MPEG-2 bitstream for home delivery over the air, cable, and satellite, or by DVD. Part of the MPEG-2 encoder's operation involves detecting how areas of the images move, and sending this information with a few whole pictures rather than a continuous stream of whole pictures. This greatly reduces the data used to describe the video. Clearly, there is more movement involved in 60 interlaced fields than in 30, or even 24, progressive frames. So progressive is more MPEG-friendly, and the reduction in movement data allows more room for spatial detail: sharper pictures. Taking the example of analog 525-line (NTSC): if this had been transmitted at 30 progressive frames per second, the image would have flickered terribly. Moving to 60 progressive frames would solve the flicker but double the bandwidth. The reduction of vertical resolution on moving images to halve the bandwidth is a good compromise for television.
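As an illustration of the motion-adaptive approach (a minimal Python/NumPy sketch under simple assumptions, not any particular product's implementation; `frame` is assumed to be a 2D luma array with field 1 on even lines and field 2 on odd lines):

```python
import numpy as np

# Motion-adaptive freeze of an interlaced frame. Where the two fields
# disagree (movement), the odd lines are rebuilt by averaging the
# neighboring even lines; static areas keep both fields and so retain
# full vertical resolution.
def motion_adaptive_freeze(frame: np.ndarray, threshold: float = 10.0) -> np.ndarray:
    out = frame.astype(np.float64).copy()
    h = frame.shape[0]
    for y in range(1, h - 1, 2):                       # odd (field 2) lines
        average = (out[y - 1] + out[y + 1]) / 2.0      # line average from field 1
        moving = np.abs(out[y] - average) > threshold  # crude per-pixel motion detect
        out[y][moving] = average[moving]               # average only where it moves
    return out.astype(frame.dtype)
```

Real deinterlacers use more robust motion detection and better interpolation than a two-line average, but the structure is the same: interpolate only where the fields disagree.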

What you see


Beyond all the discussion, it is what viewers see that counts. With the exception of 720/60p, few actually see raw progressive pictures. Lower frame rates of 30 and 24 per second would flicker to distraction on most screens. Receivers include circuitry to do what is already done in cinemas: double shuttering. Displaying each frame twice is sufficient to remove the flicker problem. Such provision has not been possible until recently, so interlace has not only eased the bandwidth problem but also the display limitations, by providing a time-lapsed form of double shuttering that requires no additional complication for the receiver. Today, receivers and set-top boxes are capable of much more, and there is a real choice between progressive and interlace. The simple answer is that there is no answer: both have benefits. The good news is that modern consumer equipment can handle both and display them to advantage, so the choice falls back on the program makers to pick whichever suits their production best. Those who think there has to be only one or the other have failed to realize that we now live in a multi-format world. See also: 24PsF, Interlace, Interlace Factor, Kell Factor, Progressive

Multi-format Distribution
With the explosion of multi-channel television broadcast systems, E-cinema, and media such as DVD, the Internet, and streaming, there is a huge demand for programming. At the same time, program budgets are spread ever thinner, so the need is still greater to distribute programs on a wide basis and amortize costs. It is rare for a new standard to arise out of nowhere, and HD is no exception; it has a legacy stretching back into SD. And so, just as SD had different standards across countries, so too does HD, although some issues have been resolved. While those in the USA were challenged by ATSC's Table 3 (page 37), Europe's DVB developed an even bigger list of permitted video formats for SD and HD digital television. Add to that the need to address other media, and it is clear that finding some common ground would be a great benefit.

Top Down
The requirement is to deliver programs in the best possible quality to all the media that need them, at the right price. Technically, converters exist to move from virtually any television standard to any other. However, with picture quality and cost both issues, care is needed. If SD material is well shot (the best is sourced from well-scanned 35mm film), edited uncompressed, and then up-converted to HD, the results are surprisingly good. This shows the generally unseen full quality of the 601 digital sampling system. However, it is not as good as HD-original material. There is an increasing number of cases where material is shot on film, telecined to SD, edited, and distributed. Then another client becomes interested and wants an HD version. For most clients, up-conversion of a whole program is seen as something to avoid if possible, so it's back to telecine to scan at HD and then re-edit. Now the realization is growing, even in Europe where there is no HD delivery system, that any program with anything more than a short shelf life is increasingly likely to be needed in HD.

Which Production Format?


Unlike the old days, we now live in a multi-format world that provides not only choice but also demands decisions. There are three major variables to choose: number of lines, frame rate, and progressive or interlaced scan. Choosing the right combination will minimize the conversion effort required and provide the best results to the target markets. Above all, it is important that format issues do not compromise the material delivered to customers. Starting with the right production format is key.

First, the top-down strategy dictates that if there is an HD customer now, or possibly in the future, the production should be in HD. This means the number of lines has to be 1080 or 720 (used with progressive scans), but 1080 is the most common. Depending on the techniques used, the spatial conversion, or re-sizing, required to change the number of lines can be of high quality.

Second, there is the issue of interlaced versus progressive scans. Progressive scanning provides the best vertical resolution and is preferred for all types of dramas, news, and magazine programs, but it is not ideal for presenting fast action. Here, interlace has an advantage in that it updates the image at double the rate, providing twice the number of half-resolution vertical scans per second. Although the vertical resolution takes a hit, especially during action, the eye does not perceive such detail during movement, so this is a better compromise for action such as sports coverage or quick camera pans. Note that, even when keeping the same frame rate, changing from interlaced to progressive scans constitutes a frame conversion: the movement between the two interlaced fields must somehow be resolved into a single still progressive frame. The reverse operation is simple, although it does not create movement between fields; it is effectively done every time the frames of a motion picture are scanned to an interlaced TV format, as with 525/60 (NTSC) and 625/50 (PAL).

Choosing the right frame rate is important, as frame rate conversions are technically complex and, although they can look remarkably good, they are technically compromised. This may not only limit further postproduction work, which often depends on access to high technical quality, but may also affect downstream performance. Coding video into the MPEG-2 bitstream delivered to digital viewers depends on making decisions about the movement of objects in successive pictures; frame rate conversion may make this process less accurate, resulting in poorer pictures delivered to the final customer.

24p frame scanning is also used for other image formats. 720/24p is used as a lower-priced, better-quality alternative to film for shooting episodics, and at standard definition it provides a format for HD offline.

Aspect Ratios
HD uses 16:9 aspect ratio pictures, and 16:9 is fast becoming popular for widescreen SD too. However, it will be years, if ever, before all programs are produced in 16:9, and 4:3 productions and screens will probably continue to exist for evermore. So programs will need to be delivered for both aspect ratios, the 4:3 version taking a pan window out of the 16:9 original images.
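A small sketch (Python; the function name and pan parameter are illustrative) of taking a 4:3 pan window out of a 16:9 original:

```python
# Extract a full-height 4:3 crop window from a 16:9 frame.
# pan ranges from 0.0 (far left) to 1.0 (far right).
def pan_window_4x3(width_16x9: int, height: int, pan: float = 0.5):
    crop_width = round(height * 4 / 3)       # 4:3 width at full picture height
    max_offset = width_16x9 - crop_width     # horizontal room to pan within
    x = round(max_offset * pan)
    return x, 0, crop_width, height          # x, y, width, height

print(pan_window_4x3(1920, 1080))            # (240, 0, 1440, 1080): centered window
```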

Cinema
Some productions may be destined for the cinema as well as for television. Although HD projected material has been well received and is considered by many to be superior to film, there are other possibilities. Scanning 35mm film open gate using the 2K format is deemed to produce a good enough result to record back to film after digital postproduction. These 2048 x 1536 images do not offer significantly more horizontal detail than HD, but there is more vertical space (more lines), which offers some freedom to select a 16:9 or 4:3 window when copying from the 2K master. The scans are often made in 16-bit RGB because the dynamic range of film is between 10 and 12 bits. This makes film a good acquisition medium: you can be less careful with the lighting and adjust the dynamic range to fit video once in postproduction. See also: 2:3 Pulldown, Universal Master

1080/24p
A popular format choice is 1080/24p. This is well accepted as an international exchange format, partly because nearly every TV station in the world has no problem accepting 24-frame motion picture film (which is effectively progressively scanned). Using 1080 lines means the images are high definition and can provide high quality at both HD and SD. 24p does not need frame-rate conversion processing but can simply be mapped directly into other frame rates. Those requiring 60 fields per second can use 2:3 pull-down, and 25p is achieved by replaying the material four percent faster. This puts the audio pitch out, but pitch correction can now be used (although this has not been an issue in the past). Changing the lines per picture for 720 or SD formats uses a refined DVE zoom-down, which can be quite accurate. Through these operations, 1080/24p can be published to all television standards with very high technical quality.
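The four-percent figure follows directly from the frame rates; a quick sketch of the arithmetic (the semitone conversion assumes the standard equal-temperament formula):

```python
import math

# 24p replayed at 25 fps: speed-up, runtime change, and pitch rise.
speedup = 25 / 24                        # ~1.0417, i.e. about 4% faster
runtime_min = 90 / speedup               # what happens to a 90-minute master
pitch_shift = 12 * math.log2(speedup)    # pitch rise in semitones

print(f"speed-up:   {speedup:.4f} ({(speedup - 1) * 100:.1f}% faster)")
print(f"runtime:    90 min -> {runtime_min:.1f} min")
print(f"pitch rise: {pitch_shift:.2f} semitones")   # ~0.71 without correction
```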


Audio and HD
The introduction of HD video and the major changes it imposes on the video/film production process are not mirrored in the audio world. The HD standard does not lay down any specifications for the delivery of audio, due mainly to the fact that the audio world pretty much has its house in order. The implications for audio in HD lie in combining two firmly established areas of audio production: film sound and television sound. The consumer experience of HD on the TV at home will be bigger, wider, higher-resolution pictures, so why not make it sound better at the same time? The answer is not based on some new emerging technology standards; they are here today and have been with us for some time.

For many, many years, the film world has enjoyed multi-channel sound; in fact, Disney's Fantasia introduced it to the world in 1940. The added workflow and surround sound mixing tasks involved in multi-channel sound have been developed and exploited for over 40 years. From the early beginnings of mono sound, we moved through stereo, quad, LCRS (Left speaker, Center speaker, Right speaker, Surround speakers), then 5.1 (left, center, right, left surround, right surround, and sub), 6.1 (left, center, right, left surround, center surround, right surround, and sub), and 7.1 (left, near left, center, near right, right, left surround, center surround, right surround, and sub). However, the TV world has generally been restricted, partly by the bandwidth of the transmission or delivery medium and partly by consumer choice, to a mono or, at best, stereo TV broadcast signal. What TV did develop was a fast and tightly integrated audio postproduction process that encompassed nonlinear video and audio editing. The acquisition-to-delivery time in TV is greatly reduced, especially for ENG (electronic news gathering), something that film does not experience.

As stated, the question of audio for HD already has an answer, and the technology already exists. The most common example is the explosion of DVD and home cinema. Here we find a consumer environment that can replay multi-channel audio at very high quality compared to traditional home delivery formats. The introduction by Dolby, the company that pioneered the film sound production process, of a broadcast audio format called Dolby E has enabled digital broadcasters to deliver full multi-channel sound for their digital TV transmissions. So the traditional working practices for producing multi-channel audio mixes for HD are the same as for SD productions. However, there is some room for improvement within the working practices. All the current delivery formats for multi-channel audio use compression, primarily Dolby, Digital Theatre Systems (DTS), Sony Dynamic Digital Sound (SDDS), and MPEG. Now, the basic rule of any compression system is this: the better the quality of the data you start with, the better the result. For example, most compression systems deal very well with clean program material such as dialogue.

However, noisy material proves to be a headache for these advanced algorithms and can lead to unwanted artifacts. The professional audio market is constantly working to improve the standard of sound reproduction. In fact, many professional music engineers have never been comfortable with the introduction of digital recording some 20 years ago. The standard 16-bit, 44.1/48 kHz format was regarded by many as limited in both its dynamic range (i.e., its ability to capture large transients) and its frequency response (48 kHz gives a theoretical response of 24 kHz, at best).

The single biggest improvement in modern digital audio is the acceptance of 24-bit audio recording. 16 bits offers a theoretical dynamic range of approximately 96 dB (6 dB per bit x 16 bits). However, 18 dB is allowed for headroom (extra dynamic range in case things unexpectedly get really loud), leaving only about 78 dB for the program material. Again, if there is a quiet dialogue passage, it may only be using half of this available level, and this is where the problems start. Unlike analog tape, digital recording can exhibit some nasty side effects, notably an increase in THD (total harmonic distortion) as the level drops. Consider an explosion: it is loud and probably uses all 96 dB of dynamic range, all 16 bits of data, to digitize the signal. In contrast, a low-level bird call in a forest atmosphere may be way below the level of the louder sound effects; it could be using only 6 to 10 of the bits available in the recording process. Anyone who has played with early synthesizers or samplers, or used children's toys that talk or play sound, can hear how bad low-bitrate record and playback systems can sound. Adding another 8 bits to the recording process gives a much wider theoretical dynamic range of 144 dB, an improved noise floor, and improved resolution for low-level recordings.

The second improvement for audio reproduction, currently hotly debated in the industry, is the use of the high sample rates of 96 and 192 kHz. Utilizing advances in A-D and D-A technology, these formats offer extended frequency response for audio recording. However, they also double and quadruple the amount of storage and data processing required, so current DSP technology and drive storage typically make these systems usable only with small track counts, not with large multi-track systems (e.g., 64 tracks). Many regard the benefits of these formats as negligible against the extra cost of new technology and storage. However, history teaches us that, over the coming years, storage prices will continue to drop and DSP chips will get cheaper and faster, so it is fair to say that in the near-to-long-term future there will be higher sample rate systems that deliver even higher quality audio.
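The bit-depth figures above follow from the "6 dB per bit" rule of thumb; a quick sketch of the arithmetic (the exact factor is 20 log10(2), about 6.02 dB per bit):

```python
import math

# Theoretical dynamic range per bit depth, minus the 18 dB of
# headroom discussed above.
DB_PER_BIT = 20 * math.log10(2)   # ~6.02 dB

for bits in (16, 24):
    dynamic_range = bits * DB_PER_BIT
    usable = dynamic_range - 18   # after headroom allowance
    print(f"{bits}-bit: ~{dynamic_range:.0f} dB range, ~{usable:.0f} dB after headroom")
```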


Now, virtually all professional nonlinear audio editing systems and most professional digital mixers can happily accommodate 24-bit audio data paths; some can even utilize the emerging higher sample rates. The main problem lies with acquisition. The last five years have seen the near-total adoption of 16-bit, 48 kHz DAT systems for location sound recording, using machines such as the Fostex PD4 and HHB Portadat, but these systems were not easily upgradable to 24-bit recording. Instead, multi-channel recording devices, such as the industry-standard Tascam DA88, were increasingly used on some locations. Here, an external A-D system bit-split the 24-bit signal into 16-bit chunks so it could be recorded across two 16-bit tracks. Although cumbersome for some locations, the system worked well, and Tascam later came to the rescue with a new range of high bit-rate 8-track machines that could record 24 bits directly.

Another device deserving mention is the Zaxcom Deva, a location hard disk recorder with four tracks of 24-bit recording. Its record time is limited only by drive space, and it does not suffer the mechanical or environmental (dust, heat, moisture, cold, etc.) problems of tape-based systems. The added advantage is that the unit produces SDII (Sound Designer II) audio files with timecode stamping for direct import into Media Composer or Pro Tools products, removing the need for time-consuming conform sessions.

As the adoption of HD continues to grow, production companies and studios can take comfort in the fact that the audio world is ready for this brave new world. The only question to be asked is how the production is to be delivered: TV, film, or DVD. Up to the point of delivery the workflow is identical; choosing which multi-channel format to mix to, and which to encode with, will be driven by the delivery medium.
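A minimal sketch (Python; the exact word layout is illustrative, not the scheme any particular A-D unit used) of carrying one 24-bit sample across two 16-bit tracks:

```python
# Bit-split one 24-bit sample across two 16-bit track words:
# the top 16 bits go to track A, the low 8 bits (padded) to track B.
def split_24bit(sample: int) -> tuple[int, int]:
    track_a = (sample >> 8) & 0xFFFF   # most significant 16 bits
    track_b = (sample & 0xFF) << 8     # low 8 bits in a 16-bit word
    return track_a, track_b

def join_24bit(track_a: int, track_b: int) -> int:
    return (track_a << 8) | (track_b >> 8)

s = 0xABCDEF                            # a 24-bit sample
a, b = split_24bit(s)
assert join_24bit(a, b) == s            # lossless round trip
```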

HD and Compression
The whole ethos of HD is based on providing a better viewing experience. Bigger, wider screens mean not only that pictures have far greater impact but also that any imperfections will be more easily seen. With technical picture quality very much on the agenda, there is every reason to take a look at the production chain to make sure viewers receive the compelling images that HD promises. Every part of the chain, from scene to screen, needs attention. In the middle, a great deal of the signal handling and processing takes place in postproduction and editing. Here, attention to potential quality losses is required. Maintaining maximum quality through these stages requires encoding techniques designed for real-time postproduction, such as Avid DNxHD encoding, uncompressed HD, or digital transfer and editing of material such as HDV in its native format. Post and editing have no control over the upstream video quality, or over what happens after they deliver the edited master. In between, there is much that can be done. It is accepted that HD videotape recording formats require the use of compression in order to be available at affordable prices or compact enough for camcorder use. In addition, the onward delivery system to viewers will require heavy compression to squeeze the data from big pictures down the narrow available channels. But the editing environment is not restricted by the need for portability, nor is it limited by over-the-air channel bandwidth. There is the space to deploy the technology to do the job as needed, without such physical restrictions.

Linear Editing
The technical demands of SD on storage and data rate are enormous, but those of HD are up to seven times bigger (see Storage). When HD transmissions started in the USA in November 1998, virtually all editing was linear. This was widely viewed as a step back from the flexibility and speed of nonlinear, but the technology to make HD nonlinear editing viable was simply not available at the time. Apart from D6, all HD tape formats are compressed. The popular 1080/60i format requires a staggering 1.2 Gb/s for uncompressed processing (using 4:2:2, 10-bit sampling), but the available VTRs at best offer only about a quarter of this rate. From this point of view, the D5 HD VTR prevails and is widely used for mastering. DVCPRO HD records 100 Mb/s and HDCAM around 140 Mb/s for video, also representing high compression compared to the SD Digibeta and DVCPRO 50 benchmarks at around 2:1 and 3:1.

The immediate concern is for multi-generation work using these codecs, which were not designed for postproduction. Using them in postproduction can cause significant signal distortions, which are at odds with the HD picture-quality ethic. Even if the pictures look good, that does not tell the whole story. Modern postproduction involves much more than splicing shots together. Processes such as keying, layering, and secondary color correction are commonplace. These processes stretch image contrast to pull key signals from the pictures. Without the right codecs, such processes effectively raise the level of otherwise invisible errors to the point that they become visible, imposing a limit on what can be done with the pictures. Typically, the DCT blocks (involved in compression) start to show. The introduction of affordable HDCAM and DVCPRO HD formats has done much to move the whole HD business forward. These codecs do a superb job for camcorders and over limited tape-to-tape generations, but their performance in an editing environment is not ideal.
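The 1.2 Gb/s figure can be checked with a quick calculation (a sketch; decimal gigabytes, ignoring blanking and audio):

```python
# Uncompressed data rate for 1080/60i (30 full frames/s) with
# 4:2:2, 10-bit sampling: luma at full width, plus two chroma
# channels at half width (one extra sample per pixel in total).
def uncompressed_rate(width, height, fps, bits, chroma_per_pixel):
    samples_per_frame = width * height * (1 + chroma_per_pixel)
    return samples_per_frame * bits * fps          # bits per second

rate = uncompressed_rate(1920, 1080, 30, 10, 1.0)  # 4:2:2
print(f"{rate / 1e9:.2f} Gb/s")                    # ~1.24 Gb/s
print(f"{rate / 8 / 1e9 * 3600:.0f} GB/h")         # ~560 GB/h
```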


Back to the future


Much has changed. Disk and digital processing technology, combined with Avid DNxHD encoding designed for postproduction, is already here to bring the recognized advantages of the nonlinear environment to the HD arena. These advancements allow real-time performance, compelling storage efficiencies, and real-time collaboration with HD images that are indistinguishable from uncompressed images. So the output of the NLE system has the same quality as its input.

This high-quality environment combines the flexibility and tools of NLE with the freedom to adjust and retry edits, effects, and color correction, everything, without fear of quality loss. Given the high viewer expectations for HD, such attention to detail is necessary. Achieving that level of editing performance requires edit-length storage, i.e., one or two hours, available with at least one, and preferably more, real-time HD channel bandwidths. The requirements of the many video formats vary only slightly, but the most demanding, and frequently used, is 1080/30. 10-bit mastering-quality Avid DNxHD encoding requires just 100 GB/h, as compared to 560 GB/h for uncompressed 4:2:2 sampling, or 840 GB/h if using 4:4:4. The postproduction workflow is relatively straightforward. The stored video can be directly accessed not only for cuts but also for processing applications that operate directly on the picture data. So wipes, dissolves, keying, color correction, and all effects treatments require no additional processing of the video. In a similar way, access to other applications via networks is straightforward.

With the NLE system's store holding I-frame-only compressed video, such as HDCAM or DVCPRO HD, the material can be re-ordered, i.e., cut, while still in compressed form. In this case, first-generation material is offered at the output. However, any other treatment, even dissolves or wipes, will require decoding to uncompress the video before the process can be executed. It is also likely that other networked equipment will require access to, and delivery of, uncompressed video. Afterwards, the processed video will need to be re-compressed and put back to the edit store. This procedure, even though it may be invisible to the user, costs time, involves additional equipment, and will degrade the pictures if the uncompress/compress cycle is overused. However, it uses between 7 and 11 times less video storage and is less demanding on data speed.

[Figure: Two scene-to-screen workflows (SHOOT > RECORD > EDIT > MASTER > PUBLISH to HD, SD 16:9, and SD 4:3). An uncompressed-HD NLE system performs cuts, dissolves, keys, DVE, color correction, wipes, and effects directly; a compression-native NLE system handles cuts only while in compressed form.]
Workflow
A workflow using Avid DNxHD encoding allows real-time performance, low storage cost, and real-time networked workflows similar to today's SD experience, yet it maintains mastering-quality HD right to the start of the transmission chain. Knowing this is in place means that the compromises before and after editing will not stand in the way of HD excellence being delivered to viewers. The final processes of the scene-to-screen chain perform far better with artifact-free, low-noise pictures. The MPEG-2 encoder has to compress the HD video to fit within the available delivery channel; for terrestrial broadcast this could be from 15 to 24 Mb/s. Experience with MPEG-2 shows that encoding is far more efficient when working with clean images. This is because artifacts, such as DCT blocking or ringing on sharp edges, as well as excessive film grain or electrical noise, appear as extra information to be coded. Rapidly changing artifacts, such as noise and grain, are interpreted as movement, and movement data is generated accordingly. This takes from the limited bandwidth budget, leaving less room for picture detail.

[Figure: Uncompressed and compressed HD production workflows. In the compression-native path, any treatment beyond cuts (wipes, dissolves, DVE, keys, color correction, effects) requires an uncompress/compress cycle, and network connections exchange both compressed and uncompressed HD video.]

Uncompressed Linear?
It is expected that uncompressed HD editing will not be available in linear suites for some time, if ever. Tape machines are designed to work alone and from a single tape. The priorities of videotape storage currently point towards compact, robust, and economic formats, as well as a moderately compressed mastering format. Data tape is also widely used for archiving, but it does not offer real-time service and is complex to view; it is uncompressed and linear, yet of little live use for attended edit sessions. The whole HD scene-to-screen process is pushing the limits of physics and technology; it needs all the care and attention it can get, every step of the way. Until recently, HD editing was limited and costly, but the arrival of mastering-quality technology, like Avid DNxHD encoding, has brought the affordable performance that HD needs.


Appendices A-B

Appendix A - New Project Types


Media Composer Adrenaline and NewsCutter Adrenaline FX systems will accommodate new project types when upgraded to HD via the HD Expansion Option: six new HD project formats, plus one SD project format for HD offline editing, in addition to the existing project types.

Format           HD/SD   Rate/FPS   Units/Second
23.976p NTSC     SD      23.976     frames
24p NTSC         SD      24         frames
30i NTSC         SD      29.97      fields
24p PAL          SD      24         frames
720p/29.97 HDV   HD      29.97      frames
25p PAL          SD      25         frames
25i PAL          SD      25         fields
720p/23.976      HD      23.976     frames
720p/59.94       HD      59.94      frames
1080p/23.976     HD      23.976     frames
1080i/50         HD      25         fields
1080i/59.94      HD      29.97      fields

It is expected that additional project types will become available in future versions of the product.

Appendix B - Avid DNxHD Family of Resolutions


Each of the six new project formats will offer multiple video resolutions, available in the media creation settings for capture, titles, import, mixdown, and render. Avid DNxHD encoding supports both 8-bit and 10-bit component images. All HD resolutions will use the MXF container format. When combined with the multiple supported frame rates there will be 15 Avid DNxHD resolutions:
Project Format   Resolution        Frame Size    Bits   FPS      Mb/s   Byte/F    min/GB
1080i/59.94      Avid DNxHD 220x   1920 x 1080   10     29.970   220    913408    0.621
1080i/59.94      Avid DNxHD 220    1920 x 1080   8      29.970   220    913408    0.621
1080i/59.94      Avid DNxHD 145    1920 x 1080   8      29.970   145    602112    0.942
720p/59.94       Avid DNxHD 220x   1280 x 720    10     59.940   220    454656    0.621
720p/59.94       Avid DNxHD 220    1280 x 720    8      59.940   220    454656    0.621
720p/59.94       Avid DNxHD 145    1280 x 720    8      59.940   145    299008    0.942
1080i/50         Avid DNxHD 185x   1920 x 1080   10     25.000   183    913408    0.745
1080i/50         Avid DNxHD 185    1920 x 1080   8      25.000   183    913408    0.745
1080i/50         Avid DNxHD 120    1920 x 1080   8      25.000   121    602112    1.130
1080p/23.976     Avid DNxHD 175x   1920 x 1080   10     23.976   176    913408    0.776
1080p/23.976     Avid DNxHD 175    1920 x 1080   8      23.976   176    913408    0.776
1080p/23.976     Avid DNxHD 115    1920 x 1080   8      23.976   116    602112    1.177
720p/23.976      Avid DNxHD 90x    1280 x 720    10     23.976   87     454656    1.566
720p/23.976      Avid DNxHD 90     1280 x 720    8      23.976   87     454656    1.566
720p/23.976      Avid DNxHD 60     1280 x 720    8      23.976   57     299008    2.381
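The columns of the table are internally consistent, as a quick sketch shows (Python; bit rate follows from bytes per frame and frame rate, minutes per gigabyte from the bit rate with 1 GB taken as 1024 MB; small differences against the printed values are rounding):

```python
# Consistency check for two rows of the Avid DNxHD resolution table.
def check(byte_per_frame, fps, mbps_listed, min_per_gb_listed):
    mbps = byte_per_frame * 8 * fps / 1e6          # computed bit rate
    min_per_gb = (1024 * 8) / mbps_listed / 60     # media minutes per GB
    print(f"computed {mbps:6.1f} Mb/s (listed {mbps_listed}), "
          f"{min_per_gb:.3f} min/GB (listed {min_per_gb_listed})")

check(913408, 29.970, 220, 0.621)   # Avid DNxHD 220 at 1080i/59.94
check(602112, 25.000, 121, 1.130)   # Avid DNxHD 120 at 1080i/50
```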


Appendix C
Avid DNxHD Naming Conventions
The Avid DNxHD family begins with three HD resolutions identified by bandwidth (megabits/second) and bit depth: Avid DNxHD 220x (10-bit), Avid DNxHD 220 (8-bit), and Avid DNxHD 145 (8-bit). The names designate the highest bit rate for each of the resolutions, but the actual bandwidth required will differ based on format and frame rate. For example, a 1920x1080i/59.94 HD raster captured as Avid DNxHD 220x requires 220 megabits/second, whereas 1280x720p/23.976 captured as Avid DNxHD 220x requires only 87 megabits/second. Refer to Appendix B for the full family of resolutions.

Avid DNxHD Quality


Since Avid DNxHD technology is specifically designed for nonlinear editing and multigeneration compositing in collaborative postproduction and broadcast news environments, it includes a choice of 8- or 10-bit raster sampling, three user-selectable bit rates, and the ability to maintain image quality more effectively than other HD codecs. The chart below compares Avid DNxHD technology to other HD resolutions in use today:
Format           Bit Depth       Sampling         Bandwidth
Avid DNxHD 145   8-bit           4:2:2            145 Mb/sec
Avid DNxHD 220   8- and 10-bit   4:2:2            220 Mb/sec
DVCPRO HD        8-bit           4:2:2            100 Mb/sec
HDCAM            8-bit           3:1:1            135 Mb/sec
HDCAM SR         10-bit          4:2:2 or 4:4:4   440 Mb/sec

Avid DNxHD Benefits


Avid DNxHD 145 8-bit encoding requires approximately 20% less storage capacity than 8-bit uncompressed standard-definition video. Because of the reduced bandwidth of Avid DNxHD encoding, single editing systems can work in HD with a simple 4- or 8-way drive stripe set. Perhaps more importantly, Avid DNxHD technology enables the first truly collaborative real-time HD environment with Avid Unity MediaNetwork systems. Avid DNxHD technology also supports Avid's Emmy award-winning multicamera functionality with up to three real-time streams of Avid DNxHD 145. By working in the final mastering resolution, the time-consuming offline-to-online recapture is eliminated.

Other popular compressed HD formats do not natively support the full HD raster. Raster downsampling reduces the high-frequency detail in the image, making the images easier to compress, but it makes HD images look softer by degrading their true resolution in the process. Unlike other compressed HD formats, such as HDCAM and DVCPRO HD, Avid DNxHD technology maintains the full raster of the active video, sampling every available pixel within the image. The table below shows the raster downsampling performed as part of other HD compression schemes:
Format      Resolution / Frame Rate   Y From   Y To   Chroma From   Chroma To
HDCAM       1080i/60                  1920     1440   960           480
DVCPRO HD   1080i/60                  1920     1280   960           640
DVCPRO HD   1080i/50                  1920     1440   960           720
DVCPRO HD   720p/60                   1280     960    640           480

Avid DNxHD and Openness


Avid DNxHD technology utilizes the MXF wrapper so media can be exchanged with any other MXF-compliant system. Because MXF is a codified international SMPTE standard, users and developers can rest assured that media files created by Avid applications will always be accessible with or without Avid equipment.


Avid DNxHD and Future Potential


Avid DNxHD encoding technology is a scalable solution that will allow Avid to add different formats, resolutions, and data rates as required by the marketplace. As demands move to different bit rates, there will be an Avid DNxHD resolution to meet those needs. The source code for Avid DNxHD technology will be licensable free of charge through the Avid Web site as a download to any user who wants to compile it on any platform. For more information please refer to: http://www.avid.com/dnxhd

© 2004 Avid Technology, Inc. All rights reserved. Product features, specifications, system requirements, and availability are subject to change without notice. Adrenaline, Avid DNA, Avid DNxHD, Avid, Avid Unity, Avid Xpress, Digidesign, Digital Nonlinear Accelerator, make manage move | media, Media Composer, Meridien, NewsCutter, Nitris, OMF, Pro Tools, Softimage, Symphony, and XSI are either registered trademarks or trademarks of Avid Technology, Inc. in the United States and/or other countries. Emmy is a registered trademark of ATAS/NATAS. All other trademarks contained herein are the property of their respective owners.
