
About the semantic verification of SMIL documents

2000 IEEE International Conference on Multimedia and Expo. ICME2000. Proceedings. Latest Advances in the Fast Changing World of Multimedia (Cat. No.00TH8532)

This paper presents a formal approach based on the RT-LOTOS formal description technique for the semantic verification of SMIL documents. The reachability analysis of RT-LOTOS specifications provides the verification of consistency properties of a document and, later on, it also enables the generation of a valid scheduling graph for its presentation. This graph characterizes the reference behaviors for the presentation of a document. Also, some erroneous semantic interpretations of SMIL documents which are not conformant with their reference behaviors are illustrated using some currently available SMIL players.

P.N.M. Sampaio, C.A.S. Santos, J.P. Courtiat
LAAS – CNRS, 7 Av. du Colonel Roche, 31400 Toulouse – France
{psampaio, saibel, courtiat}@laas.fr

Keywords: Formal Methods, RT-LOTOS, Multimedia and Hypermedia Documents, SMIL

1. INTRODUCTION

The specification of the temporal structure of hypermedia documents has been addressed in several publications through models, languages and authoring tools, for instance Firefly [1], IMAP [2], MADEUS [3] and, more recently, the Synchronized Multimedia Integration Language (SMIL) [4], which has been standardized by the W3C as a new solution for describing interactive multimedia applications to be presented on the web. However, most of these publications focus on the specification of authoring requirements and synchronization constraints, and few of them address semantic verification issues.

This paper continues the approach previously introduced in [5] [6], which presents a formal methodology based on RT-LOTOS [7], a temporal extension of the standard LOTOS formal description technique [8], for the design of Interactive Multimedia Documents (IMDs). This work extends the methodology by proposing the semantic verification of the presentation of IMDs based on a simple and operational scheduling graph. To illustrate the application of this methodology, this paper addresses SMIL, an XML-based language defined by a DTD. An XML parser verifies the syntactic correctness of the author’s documents. However, the semantic correctness of their presentation is not always ensured; that is, the author’s synchronization requirements for the presentation of a document may not always be completely satisfied. The application of the above methodology provides a complete semantic verification framework for the presentation of Interactive Multimedia Documents.

2. FORMAL DESIGN METHODOLOGY

The proposed methodology aims to provide a framework for the design (specification, verification and presentation) of complex Interactive Multimedia Documents. It relies on the Formal Description Technique RT-LOTOS and its associated verification/simulation tool RTL [7], developed at LAAS-CNRS.

[Figure 1. Formal design methodology: High-Level Document Authoring (SMIL, NCM Model, etc.) → Automatic Translation into an RT-LOTOS Formal Specification → Derivation of the Minimal Reachability Graph → Analysis of Consistency Properties → Is the document consistent? (yes: Scheduling Graph; no: the high-level description is revisited)]

Figure 1 illustrates this methodology. An IMD can be edited by means of a high-level authoring model (e.g., NCM [9], SMIL, etc.), which is later automatically translated into an RT-LOTOS specification. It is important to note that the RT-LOTOS specification is kept totally hidden from the author during the specification and verification process. This specification is then analyzed by means of the RTL tool, which generates a minimal reachability graph. Temporal consistency is then expressed through reachability properties, such as intrinsic and extrinsic consistency [5], defined on the minimal reachability graph. Aggregation techniques are also applied in order to avoid the state space explosion problem that may arise with the use of labeled transition systems [10].
In [11], a document was considered consistent if the action characterizing the start of the document presentation is necessarily followed, some time later, by an action characterizing the end of the presentation. This definition was revisited in [6] in order to make a clear distinction between two kinds of events that may lead to temporal inconsistencies, namely: internal non-deterministic events, which are related to the flexibility of media presentation durations as well as to incomplete timing constraints; and external non-deterministic events, which are related to the occurrence of events such as user interactions.

[Figure 2. Temporal scenario using the switch tag. Panels (a) and (c): timeline diagrams of the media objects Img1, Audio, Txt, Video and Img2 (not recoverable from the extracted text). Panel (b): SMIL document:]

    <par id="par01">
      <seq id="seq01">
        <Img1 dur="5s" ... />
        <switch>
          <Audio dur="20s" ... />
          <Txt dur="3s" ... />
        </switch>
      </seq>
      <seq id="seq02" end="id(seq01)(end)">
        <Video dur="10s" ... />
        <Img2 ... />
      </seq>
    </par>

Temporal inconsistencies may be the consequence of internal non-determinism, external non-determinism, or both. Basically, the temporal consistency of a document can be characterized by identifying the inconsistency sources of a temporal scenario and checking whether they can be handled by a temporal formatter. If a potentially inconsistent branch is generated by the occurrence of an internal non-deterministic event, the inconsistency can be handled by the presentation system. However, if the branch is generated by the occurrence of an external non-deterministic event, the system cannot ignore it in order to avoid the inconsistency, since the occurrence of this event is not controllable [6]. If all the temporal constraints of the document can be fulfilled (that is, if the document is consistent), then the scheduling of its presentation can be performed. The scheduling is then accomplished based on an appropriate representation (the scheduling graph), which is obtained from the reachability graph.
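The simple consistency notion from [11] (the start action is necessarily followed by the end action) can be sketched as a graph search. The following is a minimal, untimed illustration and not the RTL tool's algorithm: the graph encoding, the function name `is_consistent` and the state numbers are assumptions made for this example, and the distinction between internal and external non-determinism from [6] is deliberately left out.

```python
from typing import Dict, List, Tuple

# Hypothetical encoding of a reachability graph:
# state -> list of (action, next_state).
Graph = Dict[int, List[Tuple[str, int]]]


def is_consistent(graph: Graph, start: int, end_action: str = "endDoc") -> bool:
    """Return True if every maximal path from `start` fires `end_action`."""
    IN_PROGRESS, DONE = 1, 2
    status: Dict[int, int] = {}

    def visit(state: int) -> bool:
        status[state] = IN_PROGRESS
        edges = graph.get(state, [])
        if not edges:
            return False  # deadlock reached before the end action
        for action, nxt in edges:
            if action == end_action:
                continue  # this branch terminates correctly
            seen = status.get(nxt)
            if seen == IN_PROGRESS:
                return False  # cycle that can postpone the end forever
            if seen is None and not visit(nxt):
                return False
        status[state] = DONE
        return True

    return visit(start)


# Graph shaped like the switch scenario (state numbers are illustrative).
consistent_graph: Graph = {
    0: [("startDoc", 1)],
    1: [("sImg1_sVideo", 2)],
    2: [("eImg1_sAudio", 3), ("eImg1_sTxt", 4)],
    3: [("eVideo_sImg2", 5)],
    5: [("eAudio_eImg2", 6)],
    4: [("eTxt_eVideo", 6)],
    6: [("endDoc", 7)],
}

# Variant in which the Txt branch gets stuck and never reaches endDoc.
deadlocked_graph: Graph = dict(consistent_graph)
deadlocked_graph[4] = []

print(is_consistent(consistent_graph, 0))
print(is_consistent(deadlocked_graph, 0))
```

The first call reports a consistent graph and the second an inconsistent one, because one branch ends in a state with no outgoing action before `endDoc` has occurred.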
The scheduling graph is simple and operational, and still provides controllability of the document during its presentation. Conversely, if the document turns out to be inconsistent after the reachability analysis, its high-level description must be revisited. For this purpose, the reachability analysis provides feedback to the author by proposing valid solutions for the presentation of the document, in particular with respect to the occurrence of non-controllable events such as user interactions.

3. SEMANTIC VERIFICATION

An important issue in the specification of the temporal constraints of an IMD is how to meet the user’s QoS requirements during its presentation. In the case of SMIL, the switch tag is applied to express the user’s preferences concerning bit-rate transmission capabilities, preferred language, screen size, alternative media presentation, etc. However, in some cases, satisfying the user’s QoS requirements may lead to unexpected and undesirable timing constraints. This section illustrates how different SMIL players, such as RealPlayer G2 [12], GRiNS [13], HPAS [14] and Soja [15], deal with this issue, producing some semantic misinterpretations.

Consider the scenario illustrated in Figure 2, which consists of a sequence of a video clip (Video) followed by an image (Img2). This sequence must be presented simultaneously with another image (Img1) followed by some related information. This information corresponds to one child element of the SMIL switch operator, either an audio segment (Audio) or a text (Txt). An element is chosen in a switch operator if it can be decoded and if the evaluation of its test attributes is “true” [4]. The end of presentation of Img2 is determined by the end of the sequential presentation of Img1 and the element chosen in the switch operator.
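The effect of the switch choice on the end of the first sequence can be sketched in a few lines. This is an illustration under stated assumptions: SMIL 1.0 [4] evaluates the children of a switch in document order and selects the first acceptable one; the `acceptable` predicate and the `can_play_audio` flag are hypothetical stand-ins for decodability and test-attribute evaluation, and the durations are those given in Figure 2.

```python
from typing import Callable, Optional, Sequence


def select_switch_child(
    children: Sequence[str], acceptable: Callable[[str], bool]
) -> Optional[str]:
    """Return the first acceptable child of a switch, or None if there is none."""
    for child in children:
        if acceptable(child):
            return child
    return None


# Durations in seconds, taken from the scenario of Figure 2.
DURATIONS = {"Img1": 5, "Audio": 20, "Txt": 3}


def seq01_end(can_play_audio: bool) -> Optional[int]:
    """End time of seq01 (Img1 followed by the chosen switch child)."""
    chosen = select_switch_child(
        ["Audio", "Txt"],
        # Hypothetical acceptance rule: Txt is always acceptable,
        # Audio only when the player can present it.
        lambda name: name != "Audio" or can_play_audio,
    )
    if chosen is None:
        return None
    return DURATIONS["Img1"] + DURATIONS[chosen]


print(seq01_end(True))   # Audio chosen: 5 + 20
print(seq01_end(False))  # Txt chosen: 5 + 3
```

Because the second sequence ends when seq01 ends, these two values (25 and 8 seconds) are also the two possible end times of the whole scenario.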
Assume that the durations of Img1, Audio, Txt and Video are, respectively, 5, 20, 3 and 10 seconds, and that media object Img2 does not have an explicit duration; that is, its presentation duration depends on the presentation of the alternative media (Audio or Txt). The reachability graph obtained from the verification of the previous SMIL document (Figure 2(b)) is illustrated in Figure 3(a). Note that the branches of this graph lead only to valid temporal solutions, since all of them lead to the occurrence of the action endDoc (arcs 10-12 and 11-12). Thus, the branches that cross the states 0, 2, 5 and 0, 2, 7 until 12 are temporally consistent. These branches describe, respectively, the presentation of the scenario when Audio or Txt is chosen by the switch operator.

The reachability graph is representative enough for the verification of consistency properties, as presented in [6]. Besides, it also provides all the possible behaviors for the presentation of an IMD. However, for scheduling purposes, a simple and operational scheduling graph can be obtained from a consistent reachability graph (one in which all the branches lead to the occurrence of the action end). For this reason, a scheduling graph called a Time Labeled Automaton (TLA for short), formalized in [16], is adopted. The TLA makes the semantic verification of an IMD’s presentation straightforward, since it describes the reference behavior for the scheduling of the document. Thus, a document’s presentation is semantically consistent if its resulting behavior conforms to its associated scheduling graph.

A TLA has as many clocks (called timers) as there are states in the automaton, and each timer measures the time during which the automaton remains in a state. The timer associated with a state is reset when the automaton enters the state, and it is frozen at its current value when the automaton leaves the state.
Each transition of the TLA is associated with two timed conditions: (1) a mandatory firing window (denoted W); and (2) an optional enabling condition (denoted K). These conditions are expressed as inequalities and define the temporal constraints to be satisfied for firing the associated transition. Since the scenario illustrated in Figure 2 does not present non-deterministic temporal constraints, the TLA associated with this scenario (depicted in Figure 3(b)) is composed only of W conditions, which are expressed as equalities of known duration.

[Figure 3. Reachability graph and TLA for the previous scenario: (a) Reachability Graph; (b) TLA]

The transitions of a TLA describe all the actions (start and end of presentation of media objects) to be executed during the presentation of a multimedia scenario. For instance, consider the TLA for the previous scenario, as illustrated in Figure 3(b). Initially, action sImg1_sVideo (between states 1 and 2) takes place at t=0 seconds. This action denotes the simultaneous start of presentation of Img1 and Video. Then, at t=5 seconds, action eImg1_sAudio or eImg1_sTxt occurs. These actions denote the end of presentation of Img1 and, respectively, the start of presentation of Audio or Txt. Further on, if Txt is chosen by the switch operator, its presentation terminates at t=8 seconds, interrupting the presentation of Video (eTxt_eVideo). Similarly, if Audio is chosen by the switch operator, the branch that crosses states 2, 3 until 12 is executed. It is interesting to note that the progression of time is always relative to the time elapsed since the previous transition of the TLA.
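Since each timer is relative to the moment its state was entered, the absolute time of an action is the running sum of the timer values along a branch. The sketch below illustrates this and the notion of conformance as trace membership. It is an assumption-laden illustration, not the formalization of [16]: the state names (`s0`, `a1`, `t1`, ...) are hypothetical, the timer values are derived from the stated durations (5, 20, 3 and 10 seconds), only W conditions written as equalities are modeled, and the automaton is assumed to be acyclic.

```python
from typing import Dict, List, Tuple

# state -> list of (timer value at firing, action, next state).
# State names are illustrative, not those of Figure 3(b).
Automaton = Dict[str, List[Tuple[int, str, str]]]
Trace = List[Tuple[int, str]]

TLA: Automaton = {
    "s0": [(0, "startDoc", "s1")],
    "s1": [(0, "sImg1_sVideo", "s2")],
    "s2": [(5, "eImg1_sAudio", "a1"), (5, "eImg1_sTxt", "t1")],
    "a1": [(5, "eVideo_sImg2", "a2")],    # Video ends 5 s after Img1
    "a2": [(15, "eAudio_eImg2", "a3")],   # Audio ends 15 s after Video
    "a3": [(0, "endDoc", "end")],
    "t1": [(3, "eTxt_eVideo", "t2")],     # Txt ends 3 s after it starts
    "t2": [(0, "endDoc", "end")],
}


def timed_traces(tla: Automaton, state: str = "s0", now: int = 0) -> List[Trace]:
    """Enumerate all branches as lists of (absolute time, action)."""
    edges = tla.get(state, [])
    if not edges:
        return [[]]
    traces: List[Trace] = []
    for delay, action, nxt in edges:
        fired_at = now + delay  # the timer restarts on entering each state
        for rest in timed_traces(tla, nxt, fired_at):
            traces.append([(fired_at, action)] + rest)
    return traces


def conforms(observed: Trace, tla: Automaton) -> bool:
    """An observed behavior conforms if it is one of the reference traces."""
    return observed in timed_traces(tla)


# Hypothetical observation in which Audio starts 5 s late.
LATE_AUDIO: Trace = [
    (0, "startDoc"), (0, "sImg1_sVideo"), (10, "eImg1_sAudio"),
    (10, "eVideo_sImg2"), (25, "eAudio_eImg2"), (25, "endDoc"),
]

for trace in timed_traces(TLA):
    print(trace)
print(conforms(LATE_AUDIO, TLA))
```

The enumeration yields one trace per switch choice, ending at 25 seconds when Audio is chosen and at 8 seconds when Txt is chosen; the late-audio observation is rejected because it matches neither.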
From the TLA it is possible to automatically derive the reference timeline for the presentation of the document, as illustrated in Figure 4(a). The reference timeline characterizes all the possible temporal scenarios associated with the presentation of the media objects of the document. Figure 4(b) illustrates the notation that has been adopted to describe the reference timeline.

[Figure 4. Reference timeline for the previous scenario: (a) reference timeline, with labels t0=0s, t1=5s and t2 ∈ {8,25}s; (b) notation: occurrence of an event, temporal window for an alternative presentation, end of a timeline presentation, time progression]

According to the reference timeline for the previous scenario, the presentation of Img1 and Video starts simultaneously with the occurrence of the action sImg1_sVideo. After 5 seconds, the end of presentation of Img1 occurs and media objects Audio or Txt are presented alternatively. If Audio is chosen by the switch operator, its respective timeline is executed: the end of Video takes place and the presentation of Img2 starts after 10 seconds of presentation, and the scenario finishes with the end of presentation of Audio, which takes place after 25 seconds. Conversely, if Txt is chosen by the switch operator, the end of its presentation occurs after 8 seconds, interrupting the presentation of the whole scenario. In this case, Img2 is never presented.

As the reference timeline represents all the possible behaviors described by the TLA for the presentation of a scenario, we can also assume that the scenario presented by a player is semantically correct if the produced behavior belongs to the set of behaviors described by the reference timeline. Hence, consider the behaviors that result from presenting the previous scenario, written with correct SMIL syntax, as illustrated in Figure 5(a). Figure 5(b) illustrates the notation that has been adopted to describe these behaviors.

The first player applied was RealPlayer G2 from RealNetworks. According to G2, Img1 and Video are presented with durations of 5 and 10 seconds, respectively. After the end of the presentation of Img1, if media object Audio is chosen by the switch operator, it is unexpectedly presented after a delay of 5 seconds, with a duration of 15 seconds. Note that G2 does not support a multiplexed audio device for the simultaneous presentation of an audio sequence and a video clip (QuickTime movie). Otherwise, if media object Txt is chosen by the switch operator, it is presented with a duration of 3 seconds. In both cases, media object Img2 is presented after Video with a duration of 5 seconds (this is the default duration for G2, since Img2 does not have an explicit duration).

GRiNS (from CWI) and HPAS (from Digital) also present some unexpected behaviors for this scenario. In the beginning, Img1 and Video are presented with durations of 5 and 10 seconds, respectively. After the end of presentation of Img1, if media object Audio is chosen by the switch operator, it is also presented after a delay of 5 seconds, with a duration of 15 seconds. Thus, after the end of presentation of Video, Img2 is presented for 15 seconds, until the end of presentation of Audio. Conversely, if media object Txt is chosen by the switch operator, it is presented for 3 seconds, interrupting the presentation of Video and, consequently, never presenting Img2.

Finally, Soja (from Helio Barbizon) also produces a different behavior for this scenario. However, since this tool supports only the audio file formats *.au and *.auz as continuous objects, we replaced Video by another audio object (Audio2). Thus, the scenario starts with the presentation of Img1 and Audio2 for 5 and 10 seconds, respectively.
After the end of presentation of Img1, if media object Audio is chosen by the switch operator, it is presented with a duration of 20 seconds. In this case, after the end of presentation of Audio2, media object Img2 is presented for 15 seconds, until the end of presentation of Audio. Otherwise, if media object Txt is chosen by the switch operator, it is presented for 3 seconds and, after the presentation of Audio2, Img2 is presented indefinitely, leading the scenario to a deadlock.

[Figure 5. Presentation of the previous scenario: (a) timelines from 0 to 30 seconds produced by G2, GRiNS, HPAS and Soja for the media objects Img1, Audio, Txt, Video (Audio2 for Soja) and Img2; (b) notation: presentation of a media object, alternative presentation of media objects, conditional presentation of a media object (* presented when Audio is chosen by the switch operator), interruption of a presentation by a media object, endless presentation of a media object (∞)]

4. CONCLUSION

This paper presented an illustrative example of the application of RT-LOTOS to the formal design of IMDs and to the semantic verification of SMIL documents. As we have seen, most of the players provide a different presentation for the same scenario, and none of these presentations is in conformance with the behavior described by the reference timeline. Different behaviors are produced because these players do not properly handle either temporal constraints that involve an attempt to multiplex audio presentation channels or media objects that do not have an implicit duration. These players support syntactically correct SMIL documents; however, their semantics are still implementation-dependent.

Using reachability analysis and then generating the respective TLA, it is possible to obtain a semantically correct interpretation of a document and, furthermore, to derive the reference timeline for its presentation. Another important property of the TLA, which was not presented in this paper, is that it is a scheduling graph that enables the occurrence of non-deterministic (non-controllable) events to be controlled within valid temporal intervals, so that the synchronization constraints of an IMD can be fulfilled during its presentation.

5. ACKNOWLEDGMENTS

The work reported in this paper is funded by a CNRS research grant (Action Télécoms). The first and second authors are supported by a grant of the Brazilian Government (CAPES).

6. REFERENCES

[1] Buchanan, M.C.; Zellweger, P.T. Automatically Generating Consistent Schedules for Multimedia Documents. Multimedia Systems Journal, v.1, n.2, 1993. pp.55-67.
[2] Vazirgiannis, M.; Boll, S. Events in Interactive Multimedia Applications: Modeling and Implementation Design. In: Proc. of IEEE International Conference on Multimedia Computing and Systems (ICMCS’97), Ottawa, Canada, June 1997.
[3] Jourdan, M.; Layaïda, N.; Roisin, C.; Sabry-Ismail, L.; Tardif, L. Madeus, an Authoring Environment for Interactive Multimedia Documents. In: Proc. of ACM Multimedia’98, Bristol, UK, Sep. 1998. pp.267-272.
[4] W3C Recommendation. Synchronized Multimedia Integration Language (SMIL) 1.0 Specification. URL: http://www.w3.org/TR/REC-smil, June 1998.
[5] Santos, C.A.S.; Soares, L.F.G.; Souza, G.L.; Courtiat, J.-P. Design methodology and formal validation of hypermedia documents. In: Proc. of ACM Multimedia’98, Bristol, UK, Sep. 1998. pp.39-48.
[6] Santos, C.A.S.; Sampaio, P.N.M.; Courtiat, J.-P. Revisiting the Concept of Hypermedia Document Consistency. ACM Multimedia’99, Orlando, USA, November 1999.
[7] Courtiat, J.-P.; Santos, C.A.S.; Lohr, C.; Outtaj, B. Experience with RT-LOTOS, a temporal extension of the LOTOS formal description technique. To appear in Computer Communications N.23.
[8] Bolognesi, T.; Brinksma, E. Introduction to the ISO Specification Language LOTOS. In: P.H.J. van Eijk, C.A. Vissers and M. Diaz, editors, The Formal Description Technique LOTOS, pages 23-76. Elsevier Science Publishers B.V. (North-Holland), 1989.
[9] Soares, L.F.G.; Rodriguez, N.L.R.; Casanova, M.A. Nested Composite Nodes and Version Control in an Open Hypermedia System. Information Systems Journal, Sep. 1995. pp.501-519.
[10] Santos, C.A.S.; Courtiat, J.-P.; Saqui-Sannes, P. A Design Methodology for the Formal Specification and Verification of Hypermedia Documents. In: Proc. of FORTE/PSTV’98, Paris, France, November 1998. Chapman & Hall.
[11] Courtiat, J.-P.; Oliveira, R.C. Proving Temporal Consistency in a New Multimedia Synchronization Model. In: Proc. of ACM Multimedia’96, Boston, USA, Nov. 1996. pp.141-152.
[12] RealSystem G2 Home Page (Version 6.0). URL: http://www.real.com.
[13] GRiNS Home Page (Version 1.0). URL: http://www.oratrix.com/GRiNS/.
[14] HPAS Home Page (Version 1.0). URL: http://www.research.digital.com/SRC/HPAS/.
[15] Helio Barbizon Home Page (Version Cherbourg 1.0). URL: http://www.helio.org/.
[16] Lohr, C.; Santos, C.A.S.; Sampaio, P.N.M.; Courtiat, J.-P. Time Labeled Automata. Submitted for publication, 2000.