Next-Generation Interactive Broadcast Services
3 authors:
Jörg Heuer
Siemens
All content following this page was uploaded by Jörg Heuer on 01 August 2015.
Published in: Proc. WSDB 2004 - 5th Workshop Digital Broadcasting, September 23-24, 2004, Erlangen, Germany.
[...] into individual semantically coherent entities called Program Items. To each Program Item, additional related background information can be assigned: HTML pages from the Web presentation of the News show, video clips featuring earlier reports on similar topics, or audio clips featuring, e.g., radio reports related to a News story. This additional content is called asynchronous because it is only loosely coupled to the timeline of the program. Asynchronous content can also be assigned to a complete News show or to Topics, a concept which allows grouping multiple related News reports (e.g., reports on the presidential elections) and providing a rich set of background information for them with moderate editorial effort. The second class of additional content is synchronous, i.e., it must be tightly synchronized with the main program’s timeline. An example is an additional video stream carrying a sign language interpreter for hearing-impaired people.

Personalization requires a set top box (STB) which offers Personal Digital Recording (PDR) functionality and opportunities to filter media data. A service may be personalized in two ways. First, additional content is shown in a live situation only if the personal profile of the viewer indicates this (e.g., show a signer only if the viewer is known to be hearing-impaired). Second, the complete program may be recorded, and the individual Program Items are then filtered according to personal preferences. This allows creating a personal News program featuring Program Items that may originate from multiple different News shows.

The digital nature of the media data allows access to such a rich media service using a variety of portable devices. While a PDA or TabletPC may be used to consume the service in-house via WLAN access to the STB, access with mobile smartphones while on the move is possible, too.

Figure 1. Metadata model for personalized scalable rich media TV services

In contrast to classic TV, which consists just of audio and video, new TV services contain a variety of media objects with diverse relationships and meanings. To support the various ways of presenting, accessing and personalizing such a service, a rich set of metadata called a service description is necessary which contains the required information. Figure 1 illustrates the metadata model. The segmentation of the main program along the time line into program items is shown. For personalization and random access via the STB user interface, each program item is annotated with e.g. title, abstract and copyright. Furthermore, categories and topics provide means to group program items. Based on these metadata, a personalization component can select Program Items based on a user profile expressing preference for several categories or keywords. Such a user profile can be specified by the user, or it can be collected by indexing the metadata of Program Items actively selected by a user for watching.

Figure 2 depicts the user interface of a scalable News service which allows access to the individual News items plus background information by exploiting the annotations and organization metadata.

Figure 2. User interface of personalized, scalable News service

The second large group of metadata describes the additional content, supporting the rich media aspect of the service. For each additional content item (ACI), properties like title, synchronization with the main program or type have to be described. Each ACI references one or more Media Items which provide access to the actual media essences containing the additional content. The reference is realized by means of a media locator (URI), which may e.g. reference an HTML page via HTTP, a video clip via RTSP or via DVB MPE (multi protocol encapsulation).

Service scalability means that a service is designed such that it can be deployed on various devices with different capabilities. A scalable service can benefit greatly from a scalable content format. Such a format would allow encoding each video stream once and then adapting it to different devices by just decoding a well-defined part of the data packets to reduce resolution or bitrate. However, scalable content formats for video like MPEG-4 Fine Granularity Scalability [10] have not seen wide acceptance yet. The current MPEG-21 activity on Scalable Video Coding [8] may provide a solution in a few years. Meanwhile, a scalable service can also be realized by a combination of simulcasting and transcoding, at the cost of lower bandwidth efficiency. Simulcasting means transmitting a media item simultaneously in various formats. Transcoding means having a transcoder unit in the STB which converts a Media Item from one format/bitrate into another. The proposed metadata structure supports both approaches. By referencing multiple Media Items for one content item, simulcast is supported. For each Media Item, a description of the media properties is provided in the metadata. The most appropriate Media Item to present a content item on a specific device can be found by matching these media properties with the capabilities of the actual target device. To support transcoding, a Media Item can be marked as “virtual” by omitting the media locator but providing the media descriptions. This way, a transcoder engine can use a sibling media item from the same content item as the source for transcoding, turning the virtual media item into a real one and inserting a URI to the newly created media essence.

[...] mobile devices (a TabletPC and a PDA) connected to the STB via WLAN. The user interface to personalize and select the content (cf. Figure 2) is generated on the STB by an MHP application, taking the metadata into account. The mobile devices have access to scaled-down versions of the stored content.
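
The matching of Media Item properties against device capabilities, including the handling of “virtual” Media Items, can be illustrated with a minimal Python sketch. The paper does not define a concrete data model or API, so the class names, the fields (codec, resolution, bitrate), the "highest bitrate wins" preference and the `transcode` callback are all assumptions made for illustration only.

```python
from dataclasses import dataclass, replace
from typing import Callable, Optional, Sequence


@dataclass(frozen=True)
class MediaItem:
    # Hypothetical media properties; the paper only says "media properties".
    codec: str
    width: int
    height: int
    bitrate_kbps: int
    locator: Optional[str] = None  # None marks a "virtual" Media Item


@dataclass(frozen=True)
class DeviceCapabilities:
    codecs: frozenset
    max_width: int
    max_height: int
    max_bitrate_kbps: int


def fits(item: MediaItem, device: DeviceCapabilities) -> bool:
    """True if the device can decode and display the item."""
    return (item.codec in device.codecs
            and item.width <= device.max_width
            and item.height <= device.max_height
            and item.bitrate_kbps <= device.max_bitrate_kbps)


def select_media_item(
    items: Sequence[MediaItem],
    device: DeviceCapabilities,
    transcode: Optional[Callable[[MediaItem, MediaItem], str]] = None,
) -> Optional[MediaItem]:
    """Pick the Media Item of one content item to present on a device."""
    candidates = [i for i in items if fits(i, device)]
    # Simulcast case: prefer an item that already has a media locator.
    real = [i for i in candidates if i.locator is not None]
    if real:
        return max(real, key=lambda i: i.bitrate_kbps)
    # Transcoding case: realize a virtual item from a real sibling item.
    virtual = [i for i in candidates if i.locator is None]
    sources = [i for i in items if i.locator is not None]
    if virtual and sources and transcode is not None:
        target = max(virtual, key=lambda i: i.bitrate_kbps)
        source = max(sources, key=lambda i: i.bitrate_kbps)
        # The transcoder returns the URI of the newly created essence.
        return replace(target, locator=transcode(source, target))
    return None
```

In this sketch a real (simulcast) item is always preferred over transcoding, on the assumption that transcoding in the STB is the more expensive path; a deployment could weigh the two differently.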
Multiprotocol encapsulation allows carrying IP data packets in a DVB stream, making it possible to embed RTSP streaming video into a rich media service. This way, synchronized additional video streams can be sent along with the main programme (like a signer for hearing-impaired people or streams taken from additional camera angles in a Sports program).

Rich Media Content Delivery over Co-operating Broadcast and Broadband Networks

As described in the previous section, digital broadcast systems (DVB) are capable of carrying not only the main audiovisual content, but also additional media objects and even downstream IP traffic to a multitude of users at the same time. On the other hand, broadband IP networks (DSL) are becoming widely available, being ideally suited to carry personalized content on demand. We believe that a future personalized broadcast system will combine DVB and DSL networks for delivering a personalized service at optimized costs.

In order to make content transmission over co-operating networks a reality, two issues must be considered. First, the system should be able to select the channel via which a media item will be delivered. We call this feature Smart Routing. Second, the system must be able to synchronize content delivered through DSL with the main TV program delivered via DVB.

Smart Routing

The basic idea of smart routing is to save transmission costs by transmitting the additional content over the channel which offers the lowest transmission costs. For DVB, the costs are independent of the number of users. For DSL, the transmission costs grow proportionally with the number of users because an individual stream is transmitted for each user. Making a routing decision means selecting the N media items of a rich media program which are most likely to be consumed by the most users and transmitting them in the DVB channel, where N depends on the available capacity in the DVB channel and the bitrate of the individual media items. To estimate how likely it is that a media item will be in high demand, the following criteria can be used:
• Estimation by the program author or playout operator and insertion into the service description
• Prediction using usage statistics from previous similar programs and heuristics based on media properties
• Actual measurements during the current program

While the first two methods provide a fixed routing decision and are suitable for both asynchronous and synchronous additional content (i.e. clips and streams), the last method allows re-routing and can thus only be applied to asynchronous additional content. The reason is that for synchronous additional content, the routing decision cannot be changed while the content is playing: a seamless change would require too much overhead. In contrast, the routing decision for asynchronous additional content may be revised at any time. Revising means that all users currently consuming this content via the “old” channel will keep doing so, while new users will receive the content through the “new” channel. As asynchronous clips are usually short, the “old” channel will be freed quite fast.

This way, we can distinguish the following three ways of smart routing:
1. Fixed routing of asynchronous content: asynchronous content is inserted into either the DVB or DSL channel depending on a pre-set field in the service description.
2. Re-routing of asynchronous content: if the number of users of an audiovisual Media Item accessed over DSL exceeds a threshold, the item will be inserted into the DVB stream, e.g. using private sections. If the usage figures drop again, the ACI is no longer made available via DVB but can still be pulled via DSL.
3. Fixed routing of synchronous content: synchronous content is inserted into either the DVB or DSL channel depending on a pre-set field in the service description.

A routing decision is executed by the system by triggering the playout system to insert the media item into the desired transmission channel. Furthermore, the system must change the media locator (URI) of a media item in the service description. This way, the Content Access System (cf. Figure 4) is instructed to extract the media from the correct channel. As a prerequisite for that, the service description must be updated regularly, and these updates must be signalled frequently to the Content Access System. As both DVB and DSL can carry IP traffic, the handling of the packets is the same after extracting them from the transmission channel.

Content Synchronization

Systems for rich media TV services must provide a new kind of synchronization: synchronizing additional content with the main TV program. While frame-accurate sync is required only in some very rare cases, synchronization with an accuracy of a few frames has many applications: additional camera angles in sports programs, quiz or talk shows; or a sign language interpreter to make TV programs accessible for hearing-impaired people. Synchronization has two facets: first, transmission delays must be compensated to ensure that the data packets carrying the additional content arrive at the right time in the decoder. Second, the presentation of the main and the additional content must be synchronized. Ideally, this should be possible using standard components at the receiver side.

When an additional video stream is streamed over RTSP and transmitted via DVB in MPE, transmission synchronization does not pose a major problem because no transmission delays between the main program and the additional content occur. When the additional stream is transmitted via DSL, however, both transmission delay compensation and presentation synchronization are necessary.
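
Returning briefly to the routing rules of the Smart Routing section, the two decisions described there (choosing the N items to carry in the DVB channel, and re-routing an asynchronous item when its DSL usage crosses a threshold) can be sketched in Python as follows. This is an illustrative sketch only: the paper does not prescribe a selection algorithm, so the greedy fill in demand order, the `Candidate` fields and the function names are assumptions.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Candidate:
    # Hypothetical description of one media item of a rich media program.
    name: str
    bitrate_kbps: int
    expected_users: int       # from author estimate, statistics or measurement
    synchronous: bool = False


def select_for_dvb(candidates: Sequence[Candidate],
                   dvb_capacity_kbps: int) -> List[str]:
    """Greedy choice of items for DVB (an assumed heuristic).

    Items are considered in order of expected demand; an item is taken
    if its bitrate still fits into the remaining DVB capacity.
    """
    chosen: List[str] = []
    free = dvb_capacity_kbps
    ranked = sorted(candidates, key=lambda c: c.expected_users, reverse=True)
    for item in ranked:
        if item.bitrate_kbps <= free:
            chosen.append(item.name)
            free -= item.bitrate_kbps
    return chosen


def reroute(current_channel: str, item: Candidate,
            dsl_users: int, threshold: int) -> str:
    """Channel offered to NEW users of an item; current users are unaffected."""
    if item.synchronous:
        # Synchronous content keeps its fixed routing while it is playing.
        return current_channel
    return "DVB" if dsl_users > threshold else "DSL"


clips = [
    Candidate("clip_a", 500, 1000),
    Candidate("clip_b", 800, 50),
    Candidate("signer", 300, 400, synchronous=True),
]
print(select_for_dvb(clips, 1000))
```

A real system would probably apply some hysteresis around the threshold so that an item does not oscillate between channels; the text does not discuss this, so it is left out here.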
Figure 5: A signer synchronized with the TV program

In the SAVANT project [9], we have developed a combined approach to compensate transmission delays and ensure presentation synchronization:
• Additional video content is streamed as MPEG-4 via RTSP/RTP.
• A timing control component in the playout system ensures that an additional content stream is started at the right point in time, denoted in the service description and triggered by the clock driving the playout of the main video. If necessary (e.g. if the main content is sent over satellite), the start time may be delayed slightly.
• Each RTP packet is time-stamped with a reference to the clock of the main video (Normal Play Time, NPT).
• In the Content Access System (CAS), an RTSP Proxy intercepts the incoming data packets and buffers them.
• The transport stream demultiplexer in the CAS provides the RTSP Proxy with the NPT timing information extracted from the main video.
• Knowing the time stamp of an RTP packet (which contains the NPT value when the packet left the playout system) and the current NPT value of the main video in the CAS, the delay can be compensated by adjusting the time stamp in the RTP packet accordingly.
• The RTP packets with the adjusted time stamp are then passed to an unmodified MPEG-4 player which presents the additional stream.

Figure 5 shows a News program enhanced with a signer which is synchronized with the main content using the method just described.

Convergent Mobile Broadcast Systems

Up to here, extensions to broadcast systems have been discussed which target the classic static TV set, with some in-house mobility added by using WLAN access. However, digital media technology can offer more: delivering interactive broadcast to mobile phones and ultra-portable devices. This issue is currently being addressed by the DVB forum [2]. A version of the DVB-T standard called DVB-H [3], especially designed for efficient battery use of mobile receivers, will be used as the downlink to broadcast multimedia content to mobile phones. Additionally, personalized information, interactions and content protection / billing functions will be provided using the GPRS or UMTS channels of the mobile telephone network, creating converged mobile services. In alternative configurations of converged broadcast-mobile systems, DVB-H may be replaced by a multimedia version of DAB [5] or the Multimedia Broadcast Multicast Service (MBMS) [1] in cellular networks.

SERVICE CONVERGENCE

Mobile phones and digital TV receivers have in common that both device categories are becoming widespread in daily life. However, a unique attribute of DVB set top boxes is the high display resolution compared to mobile devices. In particular, the enriched capabilities of the graphical user interface (GUI) make such devices easier and more intuitive to use. Both aspects make it attractive to integrate services other than broadcast reception into a digital STB.

A very first trend towards this goal was the integration of web browsing capabilities over DSL into STBs. However, due to the lower display resolution of TVs and the different browser characteristics of STBs compared to the capabilities of PCs in general, web content has to be designed specifically for reception on TV. This has recently led ISPs to provide web content following a “Walled Garden” concept, which results in expensive information preparation. With a growing number of STBs providing web browsing capabilities, first business models for STB-specific web portals are being discussed. Since these portals are accessed automatically when the box is turned on, the content of the portal appears to the user as if it had been pushed to the STB, comparable to a broadcast service.

The convergence of such web and broadcast services is recognizable in that the web content delivered this way allows interacting with the broadcast content, for example by programming PDRs via Electronic Program Guides (EPGs) or through interactive games accompanying broadcasts. Even richer interactive web-based applications are enabled by the synchronization mechanisms described in the previous section. A prerequisite for service convergence is that the service architecture has to converge. This is shown in Figure 6, which presents a high-level architecture view interfacing interactive TV servers with play out servers.

By adding DSL interfaces to the STB, it can act as a terminating point not only for broadcast but also for IP-based communication services. Due to their simplicity, SMS and MMS clients have been ported to first STB prototypes. The obvious advantage of these concepts is the further simplified use of the services. An example of an MMS Client on TV is shown in Figure 7. In this case the integration of MMS with broadcast allows new usage scenarios such as messaging of commented broadcast content. However, obstacles for this kind of service convergence are legal aspects of intellectual property rights regarding broadcast content and the lack of a purely IP-based standardisation of SMS and MMS services as shown in Figure 6.

[...]
a) Convergence which requires the coupling of service provisioning.
b) Convergence which results from coupling of services on the receiving device and leads to new usage scenarios of services.

In both cases, the service convergence examples seen so far aim at the enhancement of ease of use and user experience at the same time.
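
As an aside to the synchronization mechanisms referred to above, the time stamp adjustment performed by the RTSP Proxy in the SAVANT approach can be sketched in Python. The paper does not give the exact arithmetic, so the model below is an assumption: each packet carries the NPT at which it should be shown, the proxy knows the current NPT of the main video and the current media clock of the player, and it rewrites the RTP time stamp so that the unmodified player presents the packet exactly when the main video reaches that NPT. The 90 kHz RTP clock is the rate commonly used for video payloads; all names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

RTP_CLOCK_HZ = 90_000  # assumed video RTP clock rate


@dataclass
class RtpPacket:
    npt_seconds: float   # NPT of the main video stamped at playout
    timestamp: int = 0   # RTP time stamp as seen by the player
    payload: bytes = b""


def adjust_timestamp(packet: RtpPacket,
                     main_npt_now: float,
                     player_clock_now: float) -> Optional[RtpPacket]:
    """Re-base a buffered packet onto the player's clock.

    Returns None if the packet arrived too late, i.e. the main video
    has already passed the packet's NPT.
    """
    lead = packet.npt_seconds - main_npt_now  # seconds until presentation
    if lead < 0:
        return None
    ticks = round((player_clock_now + lead) * RTP_CLOCK_HZ) % 2**32
    return RtpPacket(packet.npt_seconds, ticks, packet.payload)


pkt = RtpPacket(npt_seconds=12.0)
adjusted = adjust_timestamp(pkt, main_npt_now=11.5, player_clock_now=100.0)
print(adjusted.timestamp if adjusted else "dropped")
```

Dropping late packets is one possible policy; a proxy could equally present them immediately. The sketch only shows why buffering plus time stamp rewriting lets an unmodified player stay in step with the main program.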
CONCLUSION
In this paper, we have listed some recent trends in enhanced
digital broadcast systems and have sketched technical prob-
lems, possible solutions and usage scenarios related to these
trends. It can be expected that in the next few years a num-
ber of new, interesting services will emerge based on the
new opportunities of digital media technology and converg-
ing networks.