
Towards iTV Accessibility: The MPEG-21 case

2008, 1st International Conference on Pervasive Technologies Related to Assistive Environments, PETRA 2008

This paper discusses the accessibility of interactive television (iTV), regarded as a primary factor for its satisfactory adoption and commercial success. It presents work undertaken in the context of a research project that aims to deliver iTV services to disabled children. The project approaches this objective through the upcoming MPEG-21 ISO standard, addressing iTV accessibility through metadata and adaptation. In contrast to previous studies, which focus mostly on users with low vision, this work treats iTV accessibility more broadly. A case study is presented, accompanied by a discussion of the relevant architectures, technologies and design issues encountered.

Evangelos Vlachogiannis
Department of Product and Systems Design Engineering, University of the Aegean, Hermoupolis, Syros, Greece

Damianos Gavalas, Christos Anagnostopoulos, George E. Tsekouras
Department of Cultural Technology and Communication, University of the Aegean, Mytilene, Lesvos, Greece

Categories and Subject Descriptors
H.5.1 [Multimedia Information Systems], J.4 [Social and Behavioral Sciences]

General Terms
Documentation, Design, Experimentation, Standardization, Legal Aspects, Human Factors.

Keywords
Accessibility, usability, human computer interaction, pervasive environments, metadata, iTV, MPEG-21, SMIL, XSLT.

1. INTRODUCTION
Interactive television (iTV) has recently become a challenging application area for research on adaptive web information systems. The field increasingly adopts techniques and technologies initially developed for the World Wide Web ([6], [7]).
This is most apparent in the case of IP-TV, but it generally applies to all kinds of iTV. Considering also that TV sets considerably outnumber PCs worldwide [20], it becomes evident that interaction requirements, and specifically the need for accessibility, are crucial. For instance, an iTV user now faces a large number of services (the term used for TV channels) offering a wide range of possibilities. A similar "explosion" occurred earlier on the World Wide Web: first search engines, and later portals (equipped with search facilities) and adaptation mechanisms, made the huge volume of information manageable. Carmichael et al. [3] identify similarities between the direction of iTV and that of the web, and note that the experience gained from the latter has to be transferred to the iTV domain to avoid repeating the same mistakes, which has not happened so far. On the other hand, the differences between the two media, mainly the lack of computer literacy among viewers, have been raised in work on the evaluation of the iTV interface [4]. As early as 1997, RNIB provided recommendations for the accessibility of iTV [5]. Carmichael et al. conclude that the accessibility features that have not yet received the necessary emphasis are subtitles, captions and audio description [3], features that are emphasized in web specifications (WCAG 2.0, SMIL (footnote 1), SVG (footnote 2)).

The MPEG-21 standard [8], recently released by ISO, aims at defining an open framework for multimedia applications and seems a natural fit for the world of iTV. The literature reports several attempts to incorporate accessibility into MPEG-21, most of which focus on visual disabilities (e.g. [11], [13], [14], [19]). Rice [11] presents the difficulties that visually disabled users face while consuming iTV services.
That work emphasizes parameters such as screen size, font size and color, icon identification and screen layout, and concludes that, because requirements diverge so widely, personalization is the best way to address the problem. Thang et al. [14] proposed a systematic contrast-enhancement method that improves content visibility for low-vision users through MPEG-21 content adaptation. Yang et al. [19] propose a technique for making iTV accessible to people with visual deficiencies, especially color blindness; it combines MPEG-21 with related descriptive metadata and the design of an adaptive system. Berglund & Johansson [2] study the benefits of speech and dialogue in the iTV domain and derive several design considerations.

This paper presents work undertaken in the context of a Greek national project that aims to use the MPEG-21 framework to adapt iTV content to the requirements of disabled children. The authors propose a number of viewpoints from which iTV accessibility needs to be examined. From this perspective, the approach focuses on the interaction of the stakeholders through adaptation. Contrary to most approaches found in the literature, it investigates iTV accessibility broadly, without focusing on a specific user group such as users with low vision.

The paper is organized as follows: Section 2 introduces MPEG-21 and highlights its accessibility role. Section 3 presents the proposed iTV architecture and, finally, Section 4 concludes our work and outlines directions for future work.

1 Accessibility Features of SMIL: http://www.w3.org/TR/SMIL-access/
2 Accessibility Features of SVG: http://www.w3.org/TR/SVG-access

2. MPEG-21 and its ACCESSIBILITY ROLE
MPEG-21 is, among other things, an attempt to provide the iTV designer with a framework that offers a big, integrative picture of an iTV system.
Based on that, an indicative scenario has been devised that covers the production, delivery and consumption of digital content, aiming to identify the primary entities and the way they are involved in the overall design outcome (see Figure 1). According to this scenario:

• The content designer (CD) identifies the target groups.
• The CD, supported by MPEG-21 metadata, describes the target groups using their characteristics (e.g. blindness) and associates interaction modes (e.g. auditory description) using an appropriate authoring tool.
• The CD develops the required content components (digital items) based on the interaction modes decided above. These are integrated into the metadata using the authoring tool.
• End user A, say a blind user, wants to consume the developed content. She has already stored her profile. The context of use is completed with attributes such as access device capabilities, audio configuration, time and location of the end user.
• The context of use is delivered to the serving system together with the user request. The system infers an appropriate composition of the content components and maps the user's context of use to it.
• If the context of use changes during consumption, the system needs to be aware of the change so that it can adapt to the new requirements.

Figure 1. MPEG-21 involvement in iTV: a possible scenario.

Although MPEG-21 addresses adaptation, and specifically accessibility, by including several related XML elements in its schema, on its own it cannot ensure the accessibility of the delivered content. Rather, it is a fundamental precondition for the systems involved to produce accessible output. In other words, it should provide the infrastructure that gives digital content the requisite variety, both for the content designer, so that accessible content can be designed, and for the systems involved, so that they have the information required to deliver an accessible result.
From this point of view, the content provider, the author (also referred to as content designer), the authoring tools, the systems of the content provider and, of course, the consumer with her accompanying interaction profile (preferences, device capabilities, etc.) are all identified as playing a major, cascading role in iTV accessibility. Briefly, the role of MPEG-21 in iTV accessibility is revealed through the following dimensions:

Alternative content: MPEG-21 offers metadata (defined in Part 7 of the standard [9]) that allows content providers to offer content in one or more alternative ways. These alternatives often correspond to different modalities and can thus include captions, audio descriptions, etc.

Digital Content Navigation: In iTV environments, navigation within the available content is provided by an Electronic Program Guide (EPG). This is the interactive portion of the system, which offers the user the required functionality, including service (channel) selection and retrieval, program information and scheduling, profiling and personalization, rating and even acting upon the content.

Description of context of use (IN PARAMS): The usage context refers to all the information that needs to be taken into account to adapt digital content to the user's requirements.

Description of presentation parameters of digital content (OUT PARAMS): This determines which technical characteristics need to be adapted. An important implementation consideration was the transformation of MPEG-21 to SMIL as an intermediate solution to ensure compatibility with media players. This involved a mapping between the two formats, realized using XSLT.

Device accessibility: This refers to the accessibility of the hardware involved, including remote controls and set-top boxes (footnote 3).
Content provider accessibility policy: Probably an important contribution of MPEG-21 to the field of accessibility is the capability of applying, and claiming conformance to, an accessibility policy. In other words, content providers need to be able to apply an accessibility policy based on the target consumer group and on their own quality-assurance requirements. For instance, such a policy could require digital content to be accompanied by subtitles in two languages (e.g. English, Greek) and every image to carry an alternative text of between two and ten words. Applying such policies requires a mechanism for validating digital content against a policy description; this could, for instance, be implemented with Schematron [12], an XML structure validation language for making assertions about the presence or absence of patterns in trees.

3 http://www.tiresias.org/equipment/settop_boxes.htm

3. itvSimu ARCHITECTURE
Within our research project, the need became evident for a simulation platform acting as an interaction interface between our iTV architecture and the prospective viewer. A user interface prototype has therefore been implemented to enable users to effectively browse, search, download and consume the provided audio-visual content. For disabled people, 'effectively' means that both the content and the value-added services need to be accessible to the user, as already discussed. This interface is a sub-system of the overall system architecture, outlined in Figure 2, which consists of an authoring tool, an expert system, storage (a native XML database) and the user interface, referred to as itvSimu. The authoring tool allows content providers to easily author a variety of multimedia resources supporting an MPEG-21 compliant metadata model [1].
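An authoring tool of this kind is one place where the accessibility policy described at the end of Section 2 could be checked before content is published. The following is a minimal sketch of such a check, not the project's implementation: plain Python checks stand in for the Schematron assertions suggested above, and the element and attribute names (`item`, `subtitles`, `image`, `lang`, `alt`) are hypothetical and not taken from the MPEG-21 schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical metadata layout: element and attribute names are invented for this sketch.
SAMPLE = """
<item>
  <subtitles lang="en" src="subs_en.txt"/>
  <subtitles lang="el" src="subs_el.txt"/>
  <image src="poster.png" alt="Children playing football in a park"/>
  <image src="logo.png" alt="Logo"/>
</item>
"""


def check_policy(xml_text, required_langs=("en", "el"), min_words=2, max_words=10):
    """Return a list of human-readable policy violations (empty if compliant)."""
    root = ET.fromstring(xml_text)
    violations = []

    # Rule 1: subtitles must exist for every required language.
    present = {s.get("lang") for s in root.iter("subtitles")}
    for lang in required_langs:
        if lang not in present:
            violations.append(f"missing subtitles for language '{lang}'")

    # Rule 2: every image needs an alternative text of min_words..max_words words.
    for img in root.iter("image"):
        words = len((img.get("alt") or "").split())
        if not min_words <= words <= max_words:
            violations.append(
                f"image '{img.get('src')}' has {words}-word alternative text"
            )
    return violations


if __name__ == "__main__":
    for violation in check_policy(SAMPLE):
        print(violation)
```

Returning a list of violations, rather than raising on the first one, lets a content provider see every rule that a digital item breaks in a single pass.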
The expert system uses an algorithm originally devised for clustering web documents [15] to classify digital items and user profiles based on their attributes and to enable intelligent TV program recommendations. These systems communicate through web services under a flexible, distributed architecture. This paper presents the design and implementation of the itvSimu system.

Figure 2. iTV adaptation architecture

3.1 Design Approach
In effect, the developed user interface comprises an EPG simulator. The choice of implementation technologies was not straightforward, given the plethora of available standards and technologies such as MHP (http://www.mhp.org/), GEM-IPTV, TV-Anytime, DVB-IP, Java-TV and more. Given the requirement to incorporate networking functionality into the EPG subsystem, a web-based approach was adopted instead of a standalone application. This approach allows the EPG to run through a standard browser interface. The design approach is described next.

During the early phases of the prototype's design, the stakeholders were identified:

• The end user: he/she interacts with the iTV interface, browsing and consuming digital content. The end user is associated with an XML-based user profile which includes personal data, preferences regarding audiovisual content (e.g. sports, news, movies) and potential disabilities (hearing problems, visual impairments, etc.).
• The Service Provider: the analogue of the traditional TV channel.
• The TV Guide Provider: a service that informs end users about the offered services and their time schedule.

Occasionally, the Service Provider and the TV Guide Provider coincide; for simplicity, we have made this assumption in designing our prototype.
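To make the role of the XML-based user profile concrete, the sketch below parses a small profile and uses it to compose the resources of one programme. It is an illustration under stated assumptions, not the project's code: the element names are borrowed from the itvProfile schema described in Section 3.2, but their flat nesting and the `true`/`false` value format are assumed, and the resource file names simply follow Figure 4.

```python
import xml.etree.ElementTree as ET

# Assumed serialization: the real itvProfile schema may nest or type these differently.
PROFILE_XML = """
<itvProfile>
  <LanguageNative>el</LanguageNative>
  <ContentPreferences>movies news</ContentPreferences>
  <Disabilities>blind</Disabilities>
  <AudioDescription>true</AudioDescription>
  <Captions>false</Captions>
</itvProfile>
"""


def parse_profile(xml_text):
    """Read the few profile fields this sketch needs into a plain dict."""
    root = ET.fromstring(xml_text)

    def text(tag, default=""):
        # findtext returns None when the element is missing or empty.
        return (root.findtext(tag) or default).strip()

    return {
        "language": text("LanguageNative"),
        "preferences": text("ContentPreferences").split(),
        "disabilities": text("Disabilities").split(),
        "audio_description": text("AudioDescription").lower() == "true",
        "captions": text("Captions").lower() == "true",
    }


def select_resources(profile):
    """Compose the resources of one programme for the given profile."""
    resources = ["video.mov"]
    if profile["audio_description"]:
        resources.append("audiodescription.mov")
    if profile["captions"]:
        resources.append("captions.txt")
    return resources


if __name__ == "__main__":
    print(select_resources(parse_profile(PROFILE_XML)))
```

Keeping the parsed profile as a plain dictionary separates reading the XML from the adaptation decision, so either side can change independently.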
Our focus has been on the interaction of the end user with the iTV interface, since it affects the overall functionality of a personalized system, with particular emphasis on disabled users. Figure 3 illustrates the three elementary sub-systems of the iTV user interface: the player, the EPG and the logger. These subsystems are supported by auxiliary services that enhance the functionality of the iTV simulator. Below we analyze the functional and interactivity requirements of these subsystems and discuss the solutions adopted in our prototype.

Figure 3. A screenshot of the iTV user interface: selection of a TV programme through the EPG selection service

3.2 itvSimu Subsystems
Logger subsystem: the simplest, yet crucial, software module, as it gives the user feedback on "hidden" operations. It records and displays all (implicit or explicit) user actions (e.g. profile modification, starting/pausing/resuming a TV program, etc.). It has been implemented using the Observer pattern in Java, with observed actions activating the logger.
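The Observer arrangement described for the logger can be sketched as follows. This is an illustrative Python version rather than the Java code of the prototype, and the class and method names (`InteractionSubject`, `Logger`, `notify`, `update`) are hypothetical.

```python
class InteractionSubject:
    """Subject of the Observer pattern: a subsystem publishes user actions through it."""

    def __init__(self):
        self._observers = []

    def attach(self, observer):
        self._observers.append(observer)

    def notify(self, action, explicit=True, **details):
        # Forward the action to every attached observer.
        for observer in self._observers:
            observer.update(action, explicit, details)


class Logger:
    """Observer that records each reported action so it can be shown to the user."""

    def __init__(self):
        self.entries = []

    def update(self, action, explicit, details):
        kind = "explicit" if explicit else "implicit"
        self.entries.append(f"[{kind}] {action} {details}")


if __name__ == "__main__":
    player = InteractionSubject()
    log = Logger()
    player.attach(log)
    player.notify("play", programme="News")
    player.notify("watched_ratio", explicit=False, value=0.8)
    print("\n".join(log.entries))
```

With this structure the player and the EPG do not need to know about the logger; they only report actions, and any number of observers can be attached.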
Player subsystem: it reproduces iTV programs (digital items) and records the user's interaction history. Its elementary module is the digital content player. Such a player should support more than the basic functions (play, pause, rewind, etc.), for example subtitles and audio descriptions. Given that no MPEG-21 player is currently available, we have chosen SMIL as an intermediate technology, mainly because of the numerous available SMIL players (e.g. X-Smiles [18], QuickTime player). In particular, the MPEG-21 digital item declarations are transformed to SMIL through an appropriate XSLT transformation, and the resulting SMIL markup is then parsed by the SMIL player. This approach ensures the interoperability of the iTV interface, since SMIL is now considered a mature web technology. In our prototype, the SMIL player has been implemented using the QuickTime for Java API (footnote 5) [10]. The XSLT transformation of MPEG-21 digital items to SMIL documents depends on the user profile, taking into account potential user disabilities. An example of such a digital item declaration and its SMIL representation is given in Figure 4.

    <?xml version="1.0" encoding="UTF-8"?>
    <DIDL xmlns:xsi=http://www.w3.org/…="....\MPEG-21-DI\DIDL.xsd">
      <ITEM>
        <DESCRIPTOR>
          <STATEMENT TYPE="text/plain">Movie for normal, blind or deaf individuals</STATEMENT>
        </DESCRIPTOR>
        <ITEM>
          <DESCRIPTOR>
            <STATEMENT TYPE="text/plain">Movie for normal individuals</STATEMENT>
          </DESCRIPTOR>
          <COMPONENT>
            <RESOURCE REF="video.mov" TYPE="video/mov"/>
          </COMPONENT>
        </ITEM>
        <ITEM>
          <DESCRIPTOR>
            <STATEMENT TYPE="text/plain">Movie for blind individuals</STATEMENT>
          </DESCRIPTOR>
          <COMPONENT>
            <RESOURCE REF="video.mov" TYPE="video/mov"/>
            <RESOURCE REF="audiodescription.mov" TYPE="audio/mp3"/>
          </COMPONENT>
        </ITEM>
        <ITEM>
          <DESCRIPTOR>
            <STATEMENT TYPE="text/plain">Movie for deaf individuals</STATEMENT>
          </DESCRIPTOR>
          <COMPONENT>
            <RESOURCE REF="video.mov" TYPE="video/mov"/>
            <RESOURCE REF="captions.txt" TYPE="text/plain"/>
          </COMPONENT>
        </ITEM>
      </ITEM>
    </DIDL>

    <?xml version="1.0" encoding="UTF-8" ?>
    <smil xmlns:qt="http://www.apple.com/...." time-slider="true">
      <head>
        <layout>
          <root-layout width="320" height="350" background-color="black" />
          <region id="captions" backgroundColor="yellow" top="250" height="100" left="1" width="310" />
          <region id="movie" left="0" top="0" width="620" height="740" />
        </layout>
      </head>
      <body>
        <par>
          <textstream src="captions.txt" region="captions" systemCaptions="on" />
          <video src="video.mov" alt="Movie title" region="movie" begin="00:00.0" dur="00:14:02.000" />
        </par>
      </body>
    </smil>

Figure 4. A Digital Item Declaration document transformed to SMIL format, which synchronizes a video with captions (appropriate for hearing-impaired individuals).

The second function of the Player subsystem is to provide user interaction information to the expert (recommendation) system. An XML-based description of the user interaction is first stored in a native XML database located on the iTV server and then retrieved by the recommendation system to enable more effective and reliable reasoning. In effect, the user interaction history is a function f(x, y, ..., z), where x, y, ..., z are the values of 'interaction parameters'. Such parameters are either explicitly provided by the user or implicitly inferred by the player. An example of an implicit parameter is the ratio of the playing time of a video to its overall duration, while the rating of a TV program (on a 0-10 scale) could be explicitly provided by the viewer. With two such parameters x and y, the interaction history function could be expressed as f(x, y) = a·x + b·y, where a and b are weights reflecting the designer's priorities, which could be either static or dynamically specified (through training). As shown in Figure 5, the user's interaction history and the TV program ratings posted by users that belong to the same 'user cluster' (e.g. the same disability group) form the input of the recommendation system. The latter recommends, among the available digital content, those programs that suit the user's profile.

Figure 5. TV schedule recommendation. (Diagram labels: Digital content rating; MPEG-21 documents (digital items); User Profile; User Interaction History; Recommendation System; iTV Schedule recommendation.)

EPG subsystem: this is the most 'interactive' subsystem, since the user relies on it to browse, navigate and download audiovisual content. In the context of our research project we have identified several use cases in which the iTV end user may use the EPG in order to:

• navigate within the available iTV services (zapping);
• personalize the audiovisual content based on her potential disabilities and content preferences;
• schedule a reminder for a TV program.

An important consideration during the EPG's development has been the representation and retrieval of the TV schedule. To satisfy this design requirement we have used TV-Anytime Programme metadata [16] along with the BBC's TV-Anytime Java API (footnote 6). The overall functionality of the EPG has been loosely based on the specifications of the Java TV API (JSR-000927). The result of retrieving the BBC TV schedule on the iTV interface is shown in Figure 3.

The most important part of content personalization has been the modelling of user characteristics (e.g. disabilities) and preferences. To address this issue, we have adopted the Interaction Profile of the DAWIS framework for the design of adaptive web information systems [17]. The most abstract layer of the DAWIS Interaction Profile consists of the Service Interaction Profile, the Delivery Context Interaction Profile, the User Interaction Profile and the Platform Interaction Profile. Based on that, an itvProfile schema has been developed and serialized in XML syntax, including elements such as LanguageNative, Languages, ContentPreferences, Disabilities, Subtitles, Captions, AudioDescription and SignLanguage. The itvProfile instances are stored in a separate collection in the XML database through XQuery (footnote 7).

5 QuickTime for Java (QTJ) is a software library that allows software written in Java to provide multimedia functionality by making calls into the native QuickTime library. QTJ offers SMIL support and can also handle a larger variety of multimedia formats than the 'traditional' Java Media Framework (JMF) API.
6 http://www.bbc.co.uk/opensource/projects/tv_anytime_api/
7 XQuery 1.0: An XML Query Language: http://www.w3.org/TR/xquery/

4. CONCLUSIONS
This section summarizes what has been achieved so far and shares our design and implementation experience with the standardization committees of the relevant recommendation documents.

So far, the developed system is at prototype level and its constituent systems (i.e. expert system, authoring tool, iTV simulator) have not yet been evaluated as a whole. However, at this stage, itvSimu seems to offer an interesting and simplified architecture that can realize a primitive IP-TV platform and further serve as benchmarking software for research in the field of content adaptation and accessibility. Currently, the prototype covers only a subset of user groups. The reason is that evaluating the adaptation behavior requires a considerable number of users with diverse profiles and a corresponding number of digital items. Such an evaluation is left as future work. In addition, it would be interesting to consider more runtime parameters (implicit profile) and more effective models for combining them, possibly through AI techniques and simulation. Finally, a separate version of itvSimu optimized for users with hearing problems (e.g. incorporating auditory menus functionality) will be implemented.

From the point of view of standardization efforts, selecting standards turned out to be a difficult task, as there are many of them, often overlapping and/or contradicting each other. Consequently, even if a designer uses open standards, her final overall design comes out as a proprietary solution composed of several open standards.

5. ACKNOWLEDGMENTS
This work is supported by the General Secretariat of Research and Technology (Project "Software Applications for Interactive Kids TV-MPEG-21", project framework "Image, Sound, and Language Processing", project number: EHΓ-16). The participants are the University of the Aegean, the Hellenic Public Radio and Television (ERT) and Time Lapse Picture Hellas.

6. REFERENCES
[1] Anagnostopoulos, C.; Tsekouras, G.; Gavalas, D.; Economou, D.; Psoroulas, I. (2007). Increasing Interactivity in IPTV using MPEG-21 Descriptors. Proceedings of the 4th IFIP Conference on Artificial Intelligence Applications & Innovations (AIAI'2007), pp. 65-72, September 2007.
[2] Berglund, A.; Johansson, P. (2004). Using speech and dialogue for interactive TV navigation. Universal Access in the Information Society, 3(3-4):224-238.
[3] Carmichael, A.; Rice, M.; Sloan, D. (2006). Inclusive Design and Interactive Digital Television: Has an Opportunity been Missed? In 3rd Cambridge Workshop on Universal Access and Assistive Technology, Fitzwilliam College, Cambridge, 10-12 April 2006.
[4] Chorianopoulos, K.; Spinellis, D. (2006). User Interface Evaluation of Interactive TV: A Media Studies Perspective. Universal Access in the Information Society, 5(2):209-218, Springer.
[5] Darby, S. (1997). Introduction to Enhancing Accessibility of Digital Television. In RNIB.
[6] Ferguson, D.; Perse, E. (2000). The World Wide Web as a Functional Alternative to Television. Journal of Broadcasting & Electronic Media, 44(2):155-174.
[7] Gil, A.; Pazos, J.; Lopez, C.; Lopez, J.; Rubio, R.; Ramos, M.; Diaz, R. (2002). Surfing the Web on TV: the MHP approach. Proceedings of the 2002 IEEE International Conference on Multimedia and Expo (ICME '02), vol. 2, pp. 285-288.
[8] ISO MPEG-21, Part 1: Information technology - Multimedia framework (MPEG-21) - Vision, Technologies and Strategy, ISO/IEC TR 21000-1:2004.
[9] ISO/IEC 21000-7, Information technology - Multimedia framework (MPEG-21) - Part 7: Digital Item Adaptation, First Edition, 2004.
[10] QuickTime for Java, http://developer.apple.com/quicktime/qtjava/.
[11] Rice, M. (2004). Personalisation of interactive television for visually impaired viewers. In 2nd Cambridge Workshop on Universal Access and Assistive Technology, Fitzwilliam College, Cambridge, 22-24 March 2004.
[12] Schematron 1.5, http://xml.ascc.net/schematron/schematron1-5.sch.
[13] Springett, M.; Griffiths, R. (2007). Accessibility of Interactive Television for Users with Low Vision: Learning from the Web. In Proceedings of the 5th European Conference on Interactive TV (EuroITV'2007), LNCS 4471, pp. 76-85, May 2007.
[14] Thang, T.C.; Yang, S.; Ro, Y.M.; Wong, E.K. (2007). Media Accessibility for Low-Vision Users in the MPEG-21 Multimedia Framework. IEICE Transactions on Information and Systems, E90-D(8), pp. 1271-1278.
[15] Tsekouras, G.; Anagnostopoulos, C.; Gavalas, D.; Economou, D. (2007). Classification of Web Documents using Fuzzy Logic Categorical Data Clustering. Proceedings of the 4th IFIP Conference on Artificial Intelligence Applications & Innovations (AIAI'2007), pp. 93-100, September 2007.
[16] TV-Anytime, ETSI TS 102 822: Broadcast and On-line Services: Search, select, and rightful use of content on personal storage systems.
[17] Vlachogiannis, E. et al. (2008). A reference framework for the Design of Adaptive Web Information Systems (DAWIS) inspired from a general systems' research. Working paper.
[18] X-Smiles SMIL player, http://www.xsmiles.org/xsmiles_smil.html.
[19] Yang, A.; Ro, Y.; Nam, J.; Hong, J.; Choi, S.; Lee, J. (2004). Improving Visual Accessibility for Color Vision Deficiency Based on MPEG-21. ETRI Journal, 26(3):195-202, June 2004.
[20] Zillmann, D. (2000). The coming of media entertainment. In: Zillmann, D.; Vorderer, P. (eds) Media entertainment: the psychology of its appeal. Lawrence Erlbaum Associates, Mahwah, pp. 1-20.