Internet protocol-based push to talk

This paper discusses a way of offering voice instant messaging based on Internet Protocol using Session Initiation Protocol. The purpose of this investigation is to enhance modern social communication amongst the people of South Africa, who are already accustomed to text-based instant messaging. The proposed application aims to implement the traditional Push-to-Talk technology using Internet Protocol. Thus the proposed IP-based Push-to-Talk is a new approach to voice communication which emulates a walkie-talkie system. On a mobile phone, IP-based Push-to-Talk, herein referred to as Push-to-Talk over a cell phone, can be viewed as a voice SMS. The adoption of a Push-to-Talk service was inspired by the fact that it applies half-duplex communication. This supports the primary objective of offering cheap voice instant messaging. In half-duplex communication, only one person can talk at a time, thereby avoiding bidirectional charging. The project was implemented on two platforms, a PC and a mobile phone. The PC Push-to-Talk was implemented through a client-server approach whilst the mobile Push-to-Talk used a peer-to-peer approach. Several software engineering strategies were used for user requirements gathering as well as for testing. Six users participated in the test and the results were gathered through questionnaires. The results showed that half-duplex communication is efficient and economical, as it uses fewer system resources.

Hlabishi I. Kobo, William D. Tucker and Michael J. Norman
Department of Computer Science, University of the Western Cape, Private Bag X17, Bellville 7535, South Africa
Tel: +27 21 9592461, Fax: +27 21 9593006
E-mail: {2654952, btucker, mnorman}@uwc.ac.za

Index Terms- Network services, Protocols, Push-to-Talk, Session Initiation Protocol, Real-Time Transport Protocol

I.
INTRODUCTION

Internet Protocol-based push-to-talk (IP-PTT) is a new instant messaging (IM) type of communication that is voice-based rather than text-based like many popular IM services. IP-PTT uses the PTT concept of radio systems that have been in existence for years, similar to a walkie-talkie or citizen's band (CB) radio. IP-PTT makes use of Internet Protocol over a variety of networks instead of analog radio frequencies, and can operate using Voice over Internet Protocol (VoIP) standards. The communication model for IP-PTT is half-duplex, whereby only one person can talk at a time [1] [2] [3], following the walkie-talkie and CB radio models. This can make IP-PTT more efficient than real-time VoIP, since having only one speaker at a time eliminates interruptions [www.mobilein.com].

Transmission of voice occurs through a push or press of a dedicated PTT button. A PTT session is initiated when the button is pressed and terminated when it is released. Call setup and termination can easily be controlled by the Session Initiation Protocol (SIP), a protocol more often used for real-time VoIP. The transmission of IP-PTT voice data packets occurs as normal. Because of IP's ubiquitous nature, IP-PTT can eliminate the geographical limitations imposed on traditional PTT systems.

Many South Africans have joined the community of mobile instant messaging through MXit [4]. MXit is a free mobile text-based IM system developed locally in South Africa. Since MXit is both mobile and text-based, a voice-based IP-PTT similar in functionality to MXit would likely increase the use of mobile data. Modern communication technology worldwide is converging on a mobile environment, and South Africans are much more likely to have a cell phone than a computer [http://mybroadband.co.za/news/cellular/11723.html]. Mobile IP-PTT is often referred to as PTT over a cell phone (PoC).
PoC can support 2.5G and 3G mobile data networks as well as the next generation network [3] [www.mobilein.com]. The concept of PoC was introduced in 2003 and standardization started in 2004, with the first version finalized in 2005 [www.mobilein.com] [6]. The Open Mobile Alliance (OMA) is the official standards body that oversees all of the infrastructures and processes supporting PoC. In June 2006, the first version, OMA PoC 1.0, was released. PoC is available and popular on many cellular networks around the world, but unfortunately not in South Africa. We find this situation curious since PoC seems a natural way to increase data usage, and therefore revenue generation, in South Africa. In the USA, Nextel Communications and Motorola offer the service, and both Orange and Vodafone offer PoC in the UK. The Nextel PTT service is called Direct Connect [1]. As shown in [12], PoC can even be implemented using Bluetooth. Thus there are many options for building PoC systems, some for revenue and some for recreation.

This paper presents exploratory work to design, build and evaluate a purely IP-based PTT system, first on a PC and then on a mobile phone. Two methods of implementation were explored: a) a client-server approach was employed for the desktop PC application, where the server was used to relay the real-time packets between two IP-PTT clients, and b) a peer-to-peer (P2P) approach was employed for the mobile phone PoC prototype. The latter approach does not involve any middle layer, as it transmits directly from one endpoint to the other.

The rest of this paper is organized as follows: Section II comprises the bulk of the paper and presents the methods for the design and evaluation of the prototypes. Section III discusses the findings, and Section IV highlights conclusions and avenues for future work.

II.
METHODS

This section presents the methods used to define and analyse user requirements, the design of the prototypes and the protocols used to implement them, and finally the procedure for testing the prototypes and the results obtained from those tests. Standard software engineering (SE) approaches were used throughout the development process. A generic SE life cycle was adopted and used iteratively, as in the incremental model (see Fig. 1). User requirements are always a determining factor in software development. User requirements data were gathered through structured questionnaires and interviews with students at our university. Fig. 1 shows the overall iteration of the project development, with the final product in this case being the PC and mobile phone prototypes.

Fig. 1. The customary project life cycle provides an overview of the entire project, and consequently provides an outline for this section. These steps were iteratively applied to achieve the development of the prototypes.

A. User requirements

Many instant messaging services offered on mobile phones are text-based. According to a survey of 20 students at our university, users are still using short messaging service (SMS) even though they could send thousands of messages with a cellular IM service for the cost of a single SMS. This is because IM usage is charged by byte rather than by message. We found that the users interviewed want a convenient way of exchanging messages. According to most of them, SMS is not convenient in urgent situations because of the time it takes to key in a message, and they recognise that it is also expensive. Because cellular voice calls are even more expensive, users identified speed and cost as the key areas in which a voice-based IM-like application such as PoC would be beneficial. The users' concerns include emergency situations, planning events like parties, as well as general social interaction amongst students on campus.
Among the users interviewed, youths in particular want to engage in interactive social communication with friends, family and colleagues. They want voice instant messaging in place of textual IM and they would also prefer that the service be affordable. They also want the system to support real-time communication without any difficulties. About 70% of the people interviewed have a PoC application integrated on their mobile phones but cannot use it because network service providers do not support it. We discovered that Vodacom offers a PTT service, but only for corporate customers. None of the students interviewed worked for a corporation. It is interesting that some of the corporate areas supported include construction, transport, security, distribution, manufacturing, and surface mining, as well as companies operating in the catering, hospitality and courier industries (Pieter Uys, Vodacom Chief Operating Officer, December 2005). Yet this does not apply to the vast majority of South African users.

Users interviewed told us they have difficulty using text instant messaging in urgent situations. Users often make use of 'language compression', such as "b4" and "2cu", to speed up the keying in of a message, as well as to reduce the amount of data to be sent and the consequent cost. This type of spelling and vocabulary can easily lose the contextual meaning of the message. This is mainly because people have different understandings of any given 'mobile' language, and it is usually based on phonetic contractions of spoken English. This can be acceptable to English-literate people. However, the English literacy of South African people is typically very low [www.eee.co.za]. For these reasons, the people we interviewed deduced that text-based instant messaging is not 'multilingual friendly'. Thus, overall, the concept of a voice-based approach to IM seemed feasible to the users we interviewed.

B.
Requirements analysis and design

Thorough analysis of the user requirements provided us with the design direction of the system. Fig. 2 illustrates a use-case view of the requirements analysis in terms of the functional activities that a user can perform. Fig. 3 shows a high level view of the design relationships between technology objects, in this case, multiple clients and a single server.

Fig. 2. A use-case diagram shows the users (on the left outside the box) as role players. They use the entities of the application via the phone's user interface to the underlying functionality.

Fig. 3. The relationships between objects identify how one object maps to another. Many clients can use a single server. The M represents 'many' while a 1 represents 'one', i.e. M:1 qualifies the many-to-one relationship of multiple clients for one PoC server.

Floor control is also of vital importance because it controls the allocation of the floor during IP-PTT sessions. To obtain the floor, or the ability to speak, a client sends the PTT request to the PoC server and waits for feedback. The status of the floor is broadcast to all parties by the PoC server. Fig. 4 provides a high level description of the floor control class with UML (unified modelling language).

Fig. 4. Floor control methods illustrate the high level design of the floor control of the PoC application. The methods manage the PoC application's functional operation to give a user the ability to speak in half-duplex mode.

C. Protocols and prototypes

The prototypes run on two platforms, PC and mobile, and although they are not explicitly compliant, both use common protocols defined in the IP Multimedia Subsystem (IMS). IMS is a functional architecture based on IP that provides multimedia services [7]. IMS is specified by the 3rd Generation Partnership Project (3GPP) [7] [8] and extended further by the European Telecommunications Standards Institute (ETSI) [7] to accommodate the convergence of multimedia services in the next generation network [7] [8].

Session Initiation Protocol (SIP) forms the core of the IP-PTT architecture. SIP is a signalling protocol used for multimedia session set-up and termination [9]. SIP is used in this context for the creation and termination of PTT calls. Fig. 5 shows the basic flow of voice messages.

Fig. 5. Message flow in SIP, as taken from [13]. User A first sends an invite to user B that goes through the network proxy before the endpoints can communicate. The message flow is described in more detail in Table I.

TABLE I
SIP MESSAGE FLOW TAKEN FROM [13]

Step  Event         Action       Description
1     A calls B     INVITE       A sends an INVITE request to the proxy.
2                   100 Trying   Proxy sends a 100 response to A to acknowledge the request.
3                   INVITE       Proxy forwards invite to B.
4                   100 Trying   B acknowledges request with a 100 response.
5                   180 Ringing  B sends a 180 response to the proxy to indicate that B is being alerted.
6                   180 Ringing  The proxy forwards B's 180 response to A.
7     B answers     200 OK       B sends a 200 response to the proxy for connection established.
8                   200 OK       The proxy forwards B's 200 response to A.
9                   ACK          A acknowledges the 200 response from the proxy.
10                  ACK          The proxy forwards the acknowledgement to B.
11    B terminates  BYE          B sends a BYE request to the proxy.
12                  BYE          The proxy forwards B's BYE request to A.

This table describes the steps of Fig. 5 in more detail. The flow shows a typical SIP call set-up and termination between two user agents acting via a proxy. Fig. 5 contains fourteen steps; the last two are omitted here.

The voice components within a given IP-PTT prototype are synchronously streamed over the IP network using Real-time Transport Protocol (RTP) [10]. RTP works together with the RTP Control Protocol (RTCP), which provides RTP with statistics and control information.
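The call set-up and termination sequence of Table I can be replayed as a short Python sketch. This is only an illustration of the message order, not the prototypes' code: the `SipMessage` type, the `caller_states` function and the state labels ("calling", "trying", and so on) are hypothetical names chosen here, and they are not the transaction states defined in RFC 3261.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SipMessage:
    step: int       # step number from Table I
    sender: str     # "A", "B" or "proxy"
    receiver: str   # "A", "B" or "proxy"
    text: str       # request method or response status line


# The twelve steps of Table I: A calls B through a proxy, B answers,
# and B later terminates the call.
TABLE_I_FLOW = [
    SipMessage(1, "A", "proxy", "INVITE"),
    SipMessage(2, "proxy", "A", "100 Trying"),
    SipMessage(3, "proxy", "B", "INVITE"),
    SipMessage(4, "B", "proxy", "100 Trying"),
    SipMessage(5, "B", "proxy", "180 Ringing"),
    SipMessage(6, "proxy", "A", "180 Ringing"),
    SipMessage(7, "B", "proxy", "200 OK"),
    SipMessage(8, "proxy", "A", "200 OK"),
    SipMessage(9, "A", "proxy", "ACK"),
    SipMessage(10, "proxy", "B", "ACK"),
    SipMessage(11, "B", "proxy", "BYE"),
    SipMessage(12, "proxy", "A", "BYE"),
]


def caller_states(flow, caller="A"):
    """Return the sequence of (illustrative) call states seen by the caller."""
    state = "idle"
    history = [state]
    for msg in flow:
        if msg.sender == caller and msg.text == "INVITE":
            state = "calling"
        elif msg.receiver == caller and msg.text == "100 Trying":
            state = "trying"
        elif msg.receiver == caller and msg.text == "180 Ringing":
            state = "ringing"
        elif msg.receiver == caller and msg.text == "200 OK":
            state = "answered"
        elif msg.sender == caller and msg.text == "ACK":
            state = "established"
        elif msg.receiver == caller and msg.text == "BYE":
            state = "terminated"
        else:
            # Message exchanged between the proxy and the other party only.
            continue
        history.append(state)
    return history
```

Calling `caller_states(TABLE_I_FLOW)` walks user A from "idle" to "terminated"; messages that only pass between the proxy and B leave A's state unchanged, which mirrors the role of the proxy in Fig. 5.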
Presence on the application is provided by the SIP presence protocol extension, SIMPLE (Session Initiation Protocol for Instant Messaging and Presence Leveraging Extensions). The process of passing voice over the Internet is managed by these IP protocols, and others such as User Datagram Protocol (UDP), and is collectively referred to as VoIP.

The implementation of our IP-PTT prototype used two architectures. The first was a client-server approach used for desktop PC IP-PTT (see Fig. 6). This method places a server in between IP-PTT clients. The server relays real-time voice as packet data from one end to the other. Another purpose of the server is the registration and authentication of users. The server used in this case was an open source Asterisk server (www.asterisk.org). We used an open source Java-based SIP environment called SIP Communicator (sipcommunicator.org) as the basis for the desktop IP-PTT client. We chose SIP Communicator because of its simplicity and object-oriented style, which enabled us to implement the IP-PTT in a straightforward fashion using well-documented classes and methods. We used a wired local area network (LAN) as the transport medium for this approach.

Fig. 6. Client-server approach. The application on the clients uses SIP signalling with RTP as the voice carrier between the server and the clients. This is a typical VoIP application on a LAN.

The second approach was PoC deployed in P2P fashion. Another open source SIP stack, PJSIP (www.pjsip.org), was used on the mobile phone as the basis for the P2P approach. PJSIP supports the Symbian platform, and we chose to work with a Nokia phone. Fig. 7 shows an overview of this P2P approach. The network medium in this case was WiFi to ensure free mobility, although it should be noted that any cellular data protocol may also be used. WiFi is essentially 'free' in our lab, whereas 2.5G and 3G data are not.
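The floor control procedure described in Section II-B (a client requests the floor, waits for feedback, and the floor status is broadcast to all parties) can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: the class and method names below are hypothetical and are not the methods shown in Fig. 4, and the broadcast is modelled as an in-memory list rather than as network messages from a PoC server.

```python
class FloorControl:
    """Toy half-duplex floor arbiter: at most one participant may speak."""

    def __init__(self, participants):
        self.participants = set(participants)
        self.holder = None          # participant currently holding the floor
        self.notifications = []     # (recipient, status) pairs, in send order

    def _broadcast(self):
        # Tell every participant who holds the floor (or that it is idle).
        status = "idle" if self.holder is None else f"taken by {self.holder}"
        for recipient in sorted(self.participants):
            self.notifications.append((recipient, status))

    def request(self, user):
        """PTT button pressed: grant the floor only if nobody else holds it."""
        if user not in self.participants:
            raise ValueError(f"unknown participant: {user}")
        if self.holder is None:
            self.holder = user
            self._broadcast()
            return True
        # Denied unless the requester already holds the floor.
        return self.holder == user

    def release(self, user):
        """PTT button released: free the floor if this user was holding it."""
        if self.holder != user:
            return False
        self.holder = None
        self._broadcast()
        return True
```

For example, with `fc = FloorControl(["alice", "bob"])`, a call to `fc.request("alice")` grants the floor, a subsequent `fc.request("bob")` is denied until `fc.release("alice")` has been called, and every grant or release appends one status notification per participant. Keeping the decision in one place is what enforces the one-speaker-at-a-time rule in this sketch.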
The most distinctive feature of IP-PTT is the use of half-duplex real-time voice communication. We used the Wireshark packet analyzer to observe the packet flow and confirm that communication was one-way.

Fig. 7. Peer-to-peer architecture. This approach allows the clients to communicate directly with each other without the need for an interim relay server.

D. Evaluation

The evaluation of the prototypes was based on testing using various strategies commonly used by computer scientists as a part of the overall software development life cycle. Validation testing was carried out to examine whether the software functioned in a manner reasonably expected by the end-user [11]. Stress testing was also conducted to test the system under heavy load from a variety of traffic-related factors. We also conducted performance testing to examine the run-time performance of the software within the context of the overall integrated system [11]. Performance was tested with end-users and systematically measured with the Windows task manager. Six users tested the system hands-on, and structured questionnaires and interviews were again used to gather user feedback. The users were given three task scenarios to choose from: social interaction, emergency and corporate (secretariat) situations. The task scenarios were used to test the efficiency of the application from a user's perspective. We tested both prototypes as follows: on the local area network (LAN) using two PCs and on a WiFi network using Nokia E51 and E71 handsets.

The results were gathered from the various testing methods outlined above, thus examining the application from different angles in order to triangulate a firm picture of how the prototypes performed from both technical and user orientations. The validation test was successful as 100% of the users completed the task at hand, and about 95% of the users were satisfied with the results. Fig.
8 shows a graphical representation of the results for the following two questions:
Question 1: Did you manage to complete your task?
Question 2: Are you satisfied with the efficiency of the application?
Note that most of the questions from the questionnaires are not included in this paper.

Fig. 8. Validation test results. The graph shows user satisfaction as well as the task completion rate. The x-coordinate shows the test factors and the y-coordinate shows the percentages of Yes/No answers.

The reason given for any unsatisfactory answers was mainly cuts in the voice stream, which were not the same for all users as the testing was conducted at different times under varying network traffic levels.

For stress testing, the desktop IP-PTT application was tested under high network traffic to simulate a relatively high number of users. With traffic on the network, real-time VoIP communication tends to suffer due to latency (delay of voice stream packets on the network). This was confirmed by the breakdown of the voice streams and the delay in the arrival of voice packets. In the mobile phone scenario, the many WiFi networks on our campus can have negative effects on the PoC prototype due to interference. The interference can cause a lot of unnecessary background noise. Nonetheless, the peer-to-peer approach overcomes the latency problems encountered in the PC application because, even in the presence of the noise, the voice still arrives on time and clear.

Software should meet its performance requirements; otherwise it will be unacceptable even if it is validated. Fig. 9 shows how we used the Windows task manager to measure performance during the PTT session on a PC.

Fig. 9. Windows task manager. This screenshot illustrates the use of this tool during IP-PTT sessions to examine CPU usage, shown on the left. Bandwidth consumption is shown on the right hand side.

CPU usage during the session shown in Fig.
9 is 2%, which confirms minimal consumption of system resources by the application prototype. Before the application was run, we stopped every other application on the PC so that the CPU usage and bandwidth consumption were 0%. The bandwidth consumption is depicted on the right hand side of Fig. 9 and is 0.09% during the solitary session. This is due to the half-duplex nature of the application. The bandwidth usage on the remote side (receiver) is 0%, as is expected for half-duplex communication.

The users' perspective of the prototype's performance was gathered with questionnaires. Fifty percent of the users rated the performance in terms of response time as good and the other half said it was acceptable. Fig. 10 depicts the response of the users to this question:
Question: How would you rate the performance of the system in terms of the response time?

Fig. 10. User's view of performance. This graph shows the user's perspective of the prototype's performance, as collected with a questionnaire after hands-on use by the end-user.

III. DISCUSSION

Analysis of the results allows us to deduce that both prototypes appear to meet the desired user needs in terms of basic PTT functionality. However, in keeping with the incremental, iterative nature of the software engineering life cycle, there is a need for improvement in some aspects of the prototypes. All participants completed their tasks successfully although a few were unsatisfied. The main concerns amongst the participants were the latency on the desktop IP-PTT prototype and the interference on the mobile PoC prototype. Overall, we found that users prefer PoC since the interference tends to fade away within a period of seconds. The mobility and portability of the dynamic PoC also appear to play a role as far as preference is concerned.
It must be noted that the interference encountered during the tests is actually very realistic since more and more WiFi networks are being established. From careful evaluation of the prototypes, and a literature survey, we deduce that the client-server approach employed for IP-PTT on the PC desktop had a negative effect on the application under extreme load. The ideal implementation of the prototype might combine the two approaches. We kept them separate to evaluate the feasibility of half-duplex communication in both instances.

In essence, a client-server approach is usually employed in large communication networks with heavy traffic, in order to keep strict control and monitoring over the usage. This approach can be more secure as a result of robust authentication on the servers. Peer-to-peer, on the other hand, is very inexpensive due to its instability [technet.microsoft.com/en-us/library/cc751396.aspx]; it is thus suitable for small networks. It was for this reason that peer-to-peer was employed in the WiFi environment, as a WiFi network can be a small controllable network.

Interviews with users indicate that they are mainly concerned about the cost of IP-PTT. Costs are based solely on bandwidth consumption. The half-duplex technique enables one-way bandwidth usage, in contrast to full-duplex VoIP applications. Although the majority of the users were satisfied with the efficiency of the application, only about half of the participants rated the performance 'in respect of time' as "good". The other half felt it was "acceptable".

IV. CONCLUSIONS AND FUTURE WORK

This paper discussed the implementation of IP-based Push-to-Talk prototypes on a PC and mobile phone using Session Initiation Protocol. IP-PTT is a voice instant messaging service that operates like a walkie-talkie and/or CB radio system. PTT on a cellular phone is called PoC.
IP-PTT employs half-duplex communication that proved to be very efficient as far as bandwidth consumption is concerned. This paper roughly compared two architectural approaches, client-server and peer-to-peer, on two different platforms and network media: desktop PC on a LAN and mobile phone on a WiFi network, respectively. We conclude that the client-server approach was good for the PC on a LAN since such a network could be very big. A wired network is very stable and much more secure than a wireless network. On the other hand, it is very difficult to apply the client-server approach to dynamic environments like PoC. Considering the amount of time it would take to handle the instability and interference of WiFi networks, we decided rather to use a peer-to-peer approach for PoC using WiFi. Both prototypes were tested with three task scenarios: social, emergency and corporate. Based on results from user questionnaires and interviews, we can conclude that IP-PTT and PoC offer acceptable voice messaging possibilities. The test results showed that, from the users' point of view, performance leaves a lot to be improved in the next prototypes.

Future work includes adding a video capability, thus developing a Push-to-Video in accordance with the NGN's IMS multimedia convergence. Adding video will enhance communication at large as well as accommodate other social groups such as Deaf people. It would also be ideal to consider an asynchronous communication mode. This would allow users to still be able to send audio or video IM where real time is not possible.

ACKNOWLEDGEMENTS

The authors thank the National Research Foundation (NRF) for the scholarship offered to the first author. We also thank Telkom, Cisco and THRIP (Technology and Human Resources for Industry Programme) for financial support via the Telkom Centre of Excellence (CoE) programme. THRIP funding is managed by the National Research Foundation (NRF).
Any opinion, findings and conclusions or recommendations expressed in this material are those of the author(s) and therefore the NRF does not accept any liability in regard thereto.

REFERENCES

[1] C. Schmandt, J. Kim, K. Lee, G. Vallejo, and M. Ackerman, “Mediated voice communication via mobile IP,” Proceedings of the 15th annual ACM symposium on User interface software and technology, 2002, pp. 141–150.
[2] E. O’Regan and D. Pesch, “Performance Estimation of a SIP based Push-to-Talk Service for 3G Networks,” Cork Institute of Technology, Ireland, Adaptive Wireless Systems Group, 2004.
[3] L.Y. Wu, M.H. Tsai, Y.B. Lin, and J.S. Yang, “A client-side design and implementation for push to talk over cellular service,” Wireless Communications and Mobile Computing, vol. 7, 2007, pp. 539–552.
[4] R. Thomas, “Parents Guide to MXit,” South Africa, 2006.
[5] R. Koivisto, “Towards the Next Wave of Mobile Communications: Push-to-Talk over a Cellular: Still Searching the Flow of Success,” in Proceedings of the Research Seminar on Telecommunications Business, TML-C19, 2005, pp. 45–96.
[6] Open Mobile Alliance, “Push to Talk over Cellular Architecture,” OMA-AD-PoC_V1_0-20050428-C Candidate Version 2.0, February 2008.
[7] G. Bertrand, “The IP Multimedia Subsystem in Next Generation Networks,” Rapport technique, ENST Bretagne, 2007.
[8] C. Menkens and N. Kjellin, “IMS Social Network Application with J2ME compatible Push-To-Talk Service,” Next Generation Mobile Applications, Services and Technologies, 2007. NGMAST'07. The 2007 International Conference on, 2007, pp. 70–75.
[9] J. Rosenberg, et al., “SIP: Session Initiation Protocol,” IETF RFC 3261, June 2002.
[10] H. Schulzrinne, S. Casner, R. Frederick, and V. Jacobson, “RTP: A Transport Protocol for Real-Time Applications,” IETF RFC 3550, July 2003.
[11] R.S. Pressman, Software Engineering: A Practitioner's Approach, 6th ed. McGraw-Hill, 2004, pp. 406–408.
[12] V. Ronnholm, “Push-to-Talk over Bluetooth,” Proceedings of the 39th Annual Hawaii International Conference on System Sciences, vol. 9, 2006, p. 232c.
[13] Forum Nokia, “SIP/VOIP Call Flaw Messages,” October 2008.

Hlabishi I. Kobo obtained a BSc Honours degree in Computer Science from the University of the Western Cape in 2009. He is presently studying towards an MSc degree with the Bridging Applications and Networks Group (BANG) at the same institution. His main research interest is now wireless mesh routing protocols on mobile phones.

William D. Tucker is a Senior Lecturer in Computer Science at UWC and leads BANG research. He obtained a PhD from the University of Cape Town in 2009. His main research interest is applying Internet Protocol to the ICT4D context.

Michael J. Norman is a Senior Lecturer in Computer Science at UWC. His interests are in Software Engineering.