TTP Trains
Department of Electrical Engineering, University of Padova, Via Gradenigo 6/a, 35131 Padova. 1 [email protected] 2 [email protected]

II describes the requirements for a railway communication network. Section III reviews the TCN protocol. Section IV introduces the time-triggered protocols and illustrates two commercial solutions, namely Time-Triggered Protocol (TTP), version C, and FlexRay. Section V is concerned with the application of a time-triggered protocol aboard trains. Section VI concludes the paper.

II. TRAIN COMMUNICATION SYSTEMS

Both locomotives and railway vehicles are equipped with several electric/electronic devices: actuators that execute the traction maneuvers, electronic units for the control and management of the train, and sensors that measure both the controlled variables and the quantities required by the train management, such as the status of the rolling stock (wear, heating, and so on) and of the equipment installed for passenger comfort (air conditioning and other services). So far the TCN protocol has not been used for the transmission of critical information; however, the communication network must be dependable enough for the train to meet its availability specifications. For example, in Italian railways, the communication network must not cause more than one service interruption per year per train. From the hardware point of view (nodes, bus), the dependability requirements are fulfilled by using reliable components and by duplicating the most critical ones. As for the protocols, they must guarantee the correctness of the data exchange, the real-time operation of the network, fault detection, and a pre-established behavior of the network after the occurrence of a fault. The last item is achieved by duplicating the hardware, but the protocols must be able to manage the duplication appropriately and quickly.

III. TCN PROTOCOL

The TCN protocol provides for two separate networks arranged hierarchically in two levels, termed layers, as shown in Fig.1. The upper layer is formed by the Wire Train Bus (WTB), which interconnects the locomotive and the cars. The devices within each vehicle are connected by the Multifunction Vehicle Bus (MVB) at the lower layer. In the standard architecture each vehicle is equipped with a gateway that transfers the data from WTB to MVB and vice versa. However, an MVB network can extend over many vehicles and, if the train composition is fixed, it can replace WTB, thus connecting the devices of the whole train. The largest number of nodes in
Abstract

Modern trains are equipped with a communication network to exchange data among the numerous electric/electronic devices now installed in railway vehicles. For some years the Train Communication Network (TCN) protocol has been the standard for the communication network aboard trains. A crucial issue for this application is achieving higher levels of dependability while fulfilling the transmission needs. Recently, communication networks built around time-triggered protocols have been developed that incorporate powerful techniques to make the network behavior dependable. After reviewing the characteristics of TCN, the paper illustrates two commercial time-triggered protocols, namely Time-Triggered Protocol and FlexRay, and investigates their application aboard trains.
I. INTRODUCTION

Communication networks have been used for many years to exchange data among actuation, sensing and control devices. They were first applied to the process industry and, subsequently, to the manufacturing industry, road vehicles and railway vehicles. Indeed, the increasing number of devices installed in modern trains and the need to strengthen their integration have forced the migration from analog connections to serial digital ones (i.e. to communication networks). This migration has the merits of reducing cable wiring and permitting the implementation of advanced functions for driving and supervising a train; on the other hand, it raises the concern of using a shared bus to transmit information that can be critical not only for the train integrity but also for the safety of the passengers. Examples of such information are the brake and door commands and the retransmission of data received from the fixed signaling system. The communication network in use in European trains is based on the Train Communication Network (TCN) protocol, which was developed more than a decade ago and was adopted as the IEC 61375 Standard in 1999 [1]. Recently, communication protocols of a new conception have been introduced, purposely designed for networking safety-critical systems. They are the time-triggered protocols, which operate intrinsically in real time and are characterized by a high degree of dependability. The purpose of this paper is to investigate the characteristics of the time-triggered protocols, compared to TCN, from the perspective of railway applications, paying particular attention to the fault-tolerant capabilities of the protocols. The paper is organized as follows. Section
a WTB network is 32 with at most 2 nodes in each vehicle. MVB, in turn, can connect a maximum of 256 nodes in the same vehicle [2]-[4].
Both WTB and MVB classify the transported data into two groups, termed process variables and messages. The process variables are quantities related to the train control, such as speed and current of the traction motor, and commands of the operator. They must reach their destination within a short and fixed time interval, the length of which depends on the specific process variable. The data transfer between devices situated in two different vehicles and connected by both WTB and MVB has to be completed within 100 ms, whereas a maximum delay of 16 ms is allowed for the critical variables traveling only on WTB. The messages carry infrequent information with data size ranging from a few bytes to many kilobytes. Delays of up to several seconds are tolerated in transferring the messages, as they are typically used for non-critical diagnostic operations or passenger services.

WTB and MVB are master-slave protocols. There is one master for the WTB network and one master for each MVB network. The WTB master is always replicated, whilst the replication of the MVB masters is optional. The masters arrange the communication activities cyclically, as shown in Fig. 2. The basic period is 1 or 2 ms for MVB and 25 ms for WTB, and is divided into two phases, termed periodic and sporadic. During the periodic phase, only process variables are exchanged, in accordance with a master-slave mechanism. During the sporadic phase, only messages are exchanged, and the nodes needing to transmit a message access the bus whenever it is idle according to a CSMA method. A message too long to be transmitted during one sporadic phase is sent in two or more phases.
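The cyclic organization of the periodic phase can be illustrated with a short sketch. It assumes, purely for illustration, that each process variable has an individual period that is an integer multiple of the basic period and that the master polls a variable in every basic period whose index is divisible by that multiple. The variable names, their periods and the function `build_poll_schedule` are hypothetical and are not taken from the standard.

```python
def build_poll_schedule(variable_periods_ms, basic_period_ms, n_periods):
    """Return, for each basic period, the list of variables the master polls.

    Assumption (not from the standard): a variable with period P is polled
    in every basic period whose index is a multiple of P / basic_period.
    """
    schedule = []
    for index in range(n_periods):
        polled = []
        for name, period_ms in variable_periods_ms.items():
            if period_ms % basic_period_ms != 0:
                raise ValueError(f"{name}: period must be a multiple of the basic period")
            if index % (period_ms // basic_period_ms) == 0:
                polled.append(name)
        schedule.append(polled)
    return schedule


# Hypothetical process variables on an MVB with a 1 ms basic period.
example = build_poll_schedule(
    {"motor_speed": 1, "motor_current": 2, "door_status": 4},
    basic_period_ms=1,
    n_periods=4,
)
```

In this sketch the fastest variable is polled in every basic period, while slower ones are interleaved, which is one way to keep the periodic phase short enough to leave room for the sporadic phase.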
record of the update time of each process variable and raise an alarm if it exceeds a given time interval. The situation is different for the messages since they are not sent periodically: there is no record of their update time, but a message is retransmitted in the successive sporadic phases if the related data frame is erroneous. TCN does not provide for the management of other kinds of faults, such as a babbling-idiot node, and relies on the application software to guarantee the safety of the train and of the passengers. In case of downtime of the communication network, the train is automatically driven to a safe state by gradually reducing the speed to zero.

The format of the master frame is reported in Fig. 3 [5]. It begins with one start bit (SB), followed by a start delimiter of 8 bits and a string of 16 bits. The first 4 bits of the string constitute the F-code and give the length of the subsequent slave frame; the other 12 bits are the identifier of the slave node that produces the requested variable. The master frame closes with an 8-bit check sequence and a 1-bit end delimiter (ED). The format of the slave frame is very similar to that of the master frame, as reported in Fig.4. It begins with the start bit followed by the start delimiter and closes with the check sequence and the end delimiter. The length of the data field can be 16, 32 or 64 bits. When a larger amount of data must be transmitted, one or three additional blocks, each formed by 64 bits of data and 8 bits of check sequence, are inserted in the frame, and the total number of bits of the data field reaches 128 or 256, respectively.
Fig.4. Format of the slave frame for 16, 32 or 64 data bits and for 256 data bits.
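The 16-bit string of the master frame described above (4-bit F-code followed by a 12-bit identifier) can be sketched as a simple bit-packing operation. The sketch assumes that "first" means most significant, so the F-code occupies the upper four bits; the function names are hypothetical, and the start bit, delimiters and check sequence are not modeled.

```python
F_CODE_BITS = 4
ADDRESS_BITS = 12


def pack_master_word(f_code, address):
    """Pack the F-code and the 12-bit identifier into one 16-bit word.

    Assumption: the F-code sits in the most significant four bits.
    """
    if not 0 <= f_code < (1 << F_CODE_BITS):
        raise ValueError("F-code must fit in 4 bits")
    if not 0 <= address < (1 << ADDRESS_BITS):
        raise ValueError("identifier must fit in 12 bits")
    return (f_code << ADDRESS_BITS) | address


def unpack_master_word(word):
    """Split a 16-bit word back into (F-code, identifier)."""
    if not 0 <= word < (1 << (F_CODE_BITS + ADDRESS_BITS)):
        raise ValueError("word must fit in 16 bits")
    return word >> ADDRESS_BITS, word & ((1 << ADDRESS_BITS) - 1)


word = pack_master_word(0x3, 0x123)
```

The 12-bit identifier field allows 4096 distinct values, and the 4-bit F-code allows 16.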
If a node receives an erroneous frame from either the master or a slave, there is no provision for its retransmission; the erroneous frame and all the next few slave frames are ignored until a valid master frame is received. Under normal circumstances, the process variables are updated after the respective communication period. To comply with persistent faults, the nodes keep a
In a WTB network, the data are transmitted at 1 Mb/s using Manchester encoding. The transmission medium is redundant and is formed by two twisted and shielded wire pairs, each of them running on a different side of the vehicle. The maximum length of the network without repeaters is 860 m. In an MVB network, the data are transmitted at 1.5 Mb/s. Depending on the length of the bus, either optical fibers (for lengths over 200 m), transformer-coupled twisted pairs or RS-485 cables are employed. The transmission medium is usually duplicated, but this is not required by the standard.

A peculiarity of TCN, useful for railway applications, is the so-called inauguration procedure, which is in charge of integrating the new WTB nodes every time the train composition changes. The inauguration procedure exploits the hardware structure, drawn in Fig.5, of the bus (or
communication) controller of a WTB network. Each bus controller has two distinct circuits to interface the two sections of the trunk cable entering the node, so that it is able to manage the two sections individually. In the intermediate nodes of the network, the two sections are short-circuited by an electronic switch and only one interface circuit is activated: the node is connected to the bus in the conventional drop mode. In the end nodes of the network, instead, the two sections of the trunk cable are separated and terminated with a resistor. A particular frame specifying the number of nodes of the WTB network is sent every 50 ms on the two terminals of the trunk cable. Receipt of a similar frame via the WTB network means that railway vehicles have been added to the train and starts the inauguration procedure. The procedure is executed by the WTB network with the greater number of nodes and consists in assigning a new address to the nodes of the smaller WTB network. The inauguration procedure is able to cope with some unpredictable events, such as the recognition of a faulty node and its substitution by the replica. The maximum duration of the inauguration procedure is 1 s for 32 nodes. Of course, this procedure is not envisaged by the MVB protocol because the number of nodes of an MVB network is fixed.
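The address reassignment performed during inauguration can be sketched as follows. This is a deliberately simplified model: it only captures the rule stated above (the larger network keeps its addresses and renumbers the nodes of the smaller one) and the 32-node limit. The sequential numbering starting from 1, the node names and the function `inaugurate` are assumptions made for illustration and do not reproduce the actual WTB addressing scheme.

```python
MAX_WTB_NODES = 32


def inaugurate(network_a, network_b):
    """Merge two WTB node lists and return a node -> address mapping.

    Simplifying assumption: addresses are consecutive integers from 1.
    The larger network keeps its numbering; the nodes of the smaller
    network receive the next free addresses.
    """
    if len(network_a) >= len(network_b):
        larger, smaller = network_a, network_b
    else:
        larger, smaller = network_b, network_a
    if len(larger) + len(smaller) > MAX_WTB_NODES:
        raise ValueError("a WTB network supports at most 32 nodes")
    addresses = {node: addr for addr, node in enumerate(larger, start=1)}
    for offset, node in enumerate(smaller, start=len(larger) + 1):
        addresses[node] = offset
    return addresses


# Hypothetical coupling of a three-node train with a two-node train.
merged = inaugurate(["loco", "car1", "car2"], ["car3", "car4"])
```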
IV. TIME-TRIGGERED PROTOCOLS

The time-triggered protocols are characterized by a bus access governed by the passing of time, called Time Division Multiple Access (TDMA). With TDMA, time is divided into time slots and during each time slot only one node of the network is allowed to access the bus. Moreover, the time slots are arranged cyclically so that in a communication cycle each node accesses the bus at least once. This access mode avoids collisions between messages and ensures that each message reaches its destination within a fixed and a priori known time interval. To work properly, TDMA requires that all the nodes of the network share the same time reference, which becomes the global time of the network. The distinctive merits of the time-triggered protocols are their dependability features. Indeed, the assignment of the time slots to the nodes gives every node of the network knowledge of which node is entitled to send a message at a given instant. Every node is therefore able to detect a missing message or to identify a faulty node occupying the bus at a wrong instant. Moreover, some time-triggered protocols force the reconfiguration of a faulty node by means of special frames. In addition, the time-triggered protocols cope even with a babbling-idiot
node that keeps on transmitting even though the other nodes try to reconfigure it. To this purpose, hardware devices called bus guardians connect the transmitting port of each node to the bus only during the assigned time slots, preventing a babbling-idiot node from monopolizing the bus.

The safety of the communication network is greatly improved if the bus is made redundant by building it with two transmission channels. TDMA allows a simple management of the concurrent transmission of the same message over the two channels. The replication of the channel leads to a nearly fully fault-tolerant behavior against transmission errors due to electromagnetic interference or to a break in the cable. In fact, if the two channels are placed along different paths, they are generally subjected neither to the same electromagnetic noise nor to a break at the same time. Obviously, the two channels meet in the area where they connect to the nodes, and both can suffer a fault there because of the spatial proximity. The redundancy of the bus also helps in detecting and locating other kinds of faults. For instance, when the data frames coming from the two channels have the same incorrect Cyclic Redundancy Check (CRC) string, the fault is ascribed to the communication controller of the transmitting node. The absence of transmissions from a node on both channels is a symptom of a fault of that node whilst, if its data frames are missing on only one channel, the fault is likely located in one of the transceivers of the node. In any case the fault is detected within a time interval not exceeding one communication cycle.

The time-triggered protocols available today are TTP, FlexRay and Time-Triggered CAN (TTCAN) [6]-[8]. TTP has been developed by TTTech for safety-critical real-time distributed applications. FlexRay has been expressly developed by a consortium of car manufacturers for safety-critical automotive applications.
TTCAN has been designed by modifying the native bus access strategy of CAN. TTP and FlexRay have built-in services for the management of two transmission channels, for the detection of faulty nodes, and for the reaction to faults [9]. TTCAN, on the contrary, is not provided with these services and leaves their implementation, if needed, to application routines created by the user. Other time-triggered protocols based on CAN exist, such as Flexible Time-Triggered CAN (FTT-CAN) [10] and FlexCAN [11], but they are not yet commercially available. Furthermore, they implement the time-triggering services in an application layer placed over the CAN protocol, so that the host controller has to spend some of its resources to manage the bus access. Moreover, TTP and FlexRay have a transmission rate of at least 10 Mb/s, while the maximum rate for TTCAN and the other CAN-based protocols is 1 Mb/s. For these reasons, only TTP and FlexRay will be taken into consideration in the next sections of the paper.

A) TTP Protocol

TTP divides the global time into slots, each of them assigned to a single node for the bus access. The time slots
are grouped in TDMA rounds that specify the sequence of the nodes in accessing the bus. As a general rule, the nodes are ordered in the set-up stage and the same order is maintained in assigning the time slots within the TDMA rounds. The rounds may differ in the nodes that are entitled to access the bus and in the type of messages that they transmit. Periodicity of the operation is achieved by repeating continuously a cluster cycle, obtained by sequencing a number of TDMA rounds. TTP comes with two parallel transmission channels and provides for the simultaneous transmission of the data on the two channels. Usually the same data are sent on the two channels but this is not compulsory. Fig.6 exemplifies a cluster cycle, where the first slot of the three depicted TDMA rounds is taken by the node A. In the first round, the node A transmits the messages m1, m2 and m3, and in the second one the messages m1, m6 and m8, on both the channels. In the third round, the node A transmits the messages m1 and m2 on the channel A and the messages m1, m2 and m4 on the channel B. This is done when the information carried on by the message m4 is non-vital and its possible loss due, for example, to a failure of the channel B does not jeopardize the safety of the application.
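The example of Fig.6 can be written down as a small data structure to make the effect of unequal channel contents explicit. The sketch below encodes the three slots of node A exactly as described in the text; the representation as sets and the helper `lost_messages` are hypothetical and serve only to show which messages would disappear if one channel failed.

```python
# Messages sent by node A in its slot of each TDMA round, per channel,
# as described for the cluster cycle of Fig.6.
CLUSTER_CYCLE_NODE_A = [
    {"A": {"m1", "m2", "m3"}, "B": {"m1", "m2", "m3"}},
    {"A": {"m1", "m6", "m8"}, "B": {"m1", "m6", "m8"}},
    {"A": {"m1", "m2"}, "B": {"m1", "m2", "m4"}},
]


def lost_messages(tdma_round, failed_channel):
    """Return the messages carried only by the failed channel in this round."""
    surviving = set()
    for channel, messages in tdma_round.items():
        if channel != failed_channel:
            surviving |= messages
    return tdma_round[failed_channel] - surviving


lost_in_third_round = lost_messages(CLUSTER_CYCLE_NODE_A[2], "B")
```

Only m4 is exposed to a failure of channel B, which matches the remark that this arrangement is acceptable when m4 carries non-vital information.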
N-frames are devoted to the message transportation whilst the I-frames are devoted to the network synchronization and to the management of transmission errors and faulty nodes. In place of the messages, the I-frames contain the communication Controller state (C-state). The C-state is formed by the global time, the index of the current MEDL row and the membership vector. The latter one is a set of bits, each of them belonging to a specific node. The bits indicate whether the corresponding nodes have sent a correct message (state 1) or not (state 0) in the previous TDMA round. When a node detects that its own bit is 0, it suspends the transmission and begins the procedure of reconnection to the network.
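The membership vector lends itself to a bit-mask representation. The following sketch assumes that the bit position equals a node index assigned at set-up (an assumption, since the text does not specify the bit ordering); the function names are hypothetical. It shows how a node would clear the bit of a sender whose frame was not correct and how a node decides to start the reconnection procedure.

```python
def update_membership(membership_vector, node_index, frame_ok):
    """Set the node's bit to 1 after a correct frame, to 0 otherwise."""
    mask = 1 << node_index
    if frame_ok:
        return membership_vector | mask
    return membership_vector & ~mask


def must_reconnect(membership_vector, own_index):
    """A node that finds its own bit at 0 suspends transmission."""
    return (membership_vector >> own_index) & 1 == 0


# Four nodes, all initially members; node 2 then fails to send a correct frame.
vector = 0b1111
vector = update_membership(vector, 2, frame_ok=False)
```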
Every TTP node has its own MEssage Descriptor List (MEDL). As Fig.7 shows, MEDL is a memory segment organized in the form of array. Each row of the array refers to one time slot of the cluster cycle and is drawn up at the network set-up with information on the data frame transmitted in that time slot. The information items are the transmission starting time, the storing address of the messages in the dual-port RAM -termed Communication Network Interface- interfacing the host controller with the communication controller, and four attributes denoted with D, L, I, and A. Attribute D is concerned with the type of data frame, attribute L with the length of the frame, attribute I with the flow of the frame (in or out), and attribute A with the communication mode (the concept of communication mode is explained below). Using MEDL information, a node accesses the bus when the global time equates the transmission starting time stored in the current row and the flow of the respective frame is set out. MEDL can contain information pertinent to more cluster cycles. This enables TTP to modify the transmission task, also called communication mode, by changing the cluster cycle in use. The change is effective at the same time for all the nodes and is initiated by setting some bits of the data frame header. In TTP there are two types of data frames: Normal frames (N-frames) and Initialization frames (I-frames). The
Under correct operation of the network, all the nodes agree on the content of the membership vector. The agreement could be checked by encapsulating the vector in the data field of all the frames, at the expense of lengthening the transmission times. The problem is circumvented by appending the membership vector to the data field when calculating the CRC of the N-frames; the receiving node, in turn, appends its own membership vector to the received data field and calculates the CRC. If the local CRC coincides with that of the received data frame, the node assumes that the frame has arrived safely and that the two membership vectors are equal; otherwise either a transmission error has occurred or the membership vectors are different: in both cases the messages are discarded.

TTP has been implemented on the physical layers of various standards, e.g. CAN, RS485 and Ethernet, using both copper cable and glass fiber as transmission medium. The transmission rate ranges from 1 Mbit/s, with CAN transceivers, to a maximum of 5 Mbit/s and 25 Mbit/s with RS485 and Ethernet, respectively. An ad-hoc physical layer is close to release and is expected to exhibit high transmission speed, built-in diagnostics of bus failures and compatibility with the automotive 42 V power supply.

B) FlexRay Protocol

FlexRay is a protocol that, in addition to the time-triggered strategy, implements the event-triggered one, so that the transmission bandwidth of the network is shared between the two types of transmission. The global time is here divided into communication cycles that, in turn, are formed by two different segments: static and dynamic. The access to the bus is time-triggered in the static segment and event-triggered in the dynamic one. The static segment has the structure of Fig.8. The segment comprises several time slots, each of them long enough to allow for the transmission of one data frame. All the data frames have a serial number, termed data frame
identifier, and are transmitted by the owner nodes as the time slot counter matches their serial number. The dynamic segment comprises several time slots too, as shown by Fig.9, but their length is shorter than a data frame; for this reason they are called minislots. Differently from a static segment, where the data frames are always transmitted during the assigned time slots, in a dynamic segment the data frames are transmitted only if this is requested by the application and the time remaining up to the end of the segment is long enough for their transmission. As with the static segment, each data frame has a serial number and its transmission starts when the minislot counter matches the serial number. In order to send the whole data frame, the count of the minislots is suspended during the transmission and is resumed when the bus becomes again idle. Hence collisions are also avoided during the dynamic segment. At the same time, a priority transmission mechanism is implicitly defined by the rule of sending the data frames in the ascending order of identifier.
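The minislot mechanism of the dynamic segment can be sketched with a short simulation. It is a simplified model under stated assumptions: time is measured in whole minislots, frame lengths are given as a positive number of minislots, the minislot counter starts at 1, and the function name `run_dynamic_segment` is hypothetical. The counter is frozen while a frame is on the bus and advances by one after each idle minislot or completed transmission.

```python
def run_dynamic_segment(n_minislots, pending_frames):
    """Simulate one dynamic segment.

    pending_frames maps a frame identifier to its length in minislots.
    Returns a list of (identifier, start_minislot) for the frames sent.
    """
    sent = []
    elapsed = 0   # minislots of bus time already consumed
    counter = 1   # minislot counter compared with the frame identifiers
    while elapsed < n_minislots:
        length = pending_frames.get(counter)
        if length is not None and elapsed + length <= n_minislots:
            # The frame fits in the remaining time: transmit it while
            # the counter is suspended.
            sent.append((counter, elapsed))
            elapsed += length
        else:
            # Nothing to send (or not enough time left): one idle minislot.
            elapsed += 1
        counter += 1
    return sent


# Hypothetical segment of 10 minislots with three pending frames.
result = run_dynamic_segment(10, {2: 4, 3: 3, 5: 4})
```

In this run frames 2 and 3 are transmitted, while frame 5 is skipped because only one minislot remains when its identifier is reached; this illustrates both the implicit priority of low identifiers and the end-of-segment rule.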
exceeds the lower threshold, the node suspends transmission but is still enabled to receive the messages and to attempt to synchronize itself: if it succeeds, the error counter is zeroed and the node resumes ordinary operation. Otherwise, when the count reaches the higher threshold, the node stops its activities and must be rebooted by the user.

FlexRay implements a robust technique against electromagnetic disturbances. Each received bit is sampled at a frequency eight times higher than the transmission rate and the digital value taken by the majority of the last five samples is assigned to the bit. This means that impulsive disturbances shorter than two sampling times do not cause errors in reading the bit. The FlexRay protocol is independent of the physical layer, the only constraint being the maximum transmission rate, which must be 10 Mbit/s. A purpose-built line driver is under development; in the meantime, the well-known RS485 standard is adopted.

V. TRAIN APPLICATION OF TIME-TRIGGERED PROTOCOL

From the above-described characteristics of TCN and of the time-triggered protocols, three classes of features emerge: a) features of TCN that are useful for its application aboard trains but are not offered by the time-triggered protocols, b) features common to both TCN and the time-triggered protocols, such as redundancy of the bus and of the nodes, and c) features of the time-triggered protocols that could be useful in the railway application but are not offered by TCN. If a time-triggered protocol were supplemented with the features of class a), it would become a good substitute for TCN and would exhibit even better performance. The most important items of class a) are the hierarchical organization into two layers and the automatic integration of new nodes in the network by means of the inauguration procedure.
According to their specifications, the time-triggered protocols are organized in only one layer; however, the subdivision of one single network into one train network and many vehicle networks can be obtained by assigning to each node the proper transmission and reception time slots. This is exemplified in Fig. 10, where nodes belonging to the various networks are identified by different shapes (circle, square) and hatchings. The circle nodes are equivalent to the MVB nodes and constitute the vehicle networks. They are entitled to communicate only with the circle or square nodes having the same hatching. The square nodes are equivalent to the WTB nodes and constitute the train network. They are entitled to communicate only with each other and with the circle nodes having the same hatching. The implementation of this network is simple and straightforward. It is not suitable, however, when the total number of nodes on the train is very large, because the messages coming from one vehicle, even if received only by the nodes of the same vehicle, occupy the same bus used by the nodes of the other vehicles, thus reducing the bandwidth available for their messages. To
Fig.8. FlexRay static segment (DF stands for data frame and ID for identifier).
Like TTP, FlexRay comes with two transmission channels. They are utilized in different manner by the static and dynamic segments: during the static segment the same frames are sent over the two channels while during a dynamic segment the two channels are used disjointedly and two different frames can be transmitted simultaneously.
FlexRay does not have tools similar to MEDL and membership vector, but uses alternative solutions for managing the network. Synchronization is obtained by means of suitable messages sent by dedicated nodes. Communication tasks are known by the nodes in two steps: at power-on, every node is acquainted only with the time slots assigned to its data frames whilst the association between the time slots and the data frames of the other nodes are learnt step by step by monitoring the traffic on the bus. The only error accounted for by a node is the loss of synchronization with the global time. Its occurrences are counted and compared with two thresholds: when the count
overcome this problem, two solutions can be conceived. In the first solution, the network can be split into two or more trunks, each of them encompassing one or more vehicles, connected by intelligent bridges that transfer from one trunk to another one only the messages coming from the nodes of the train network. In the second solution, a suitable location of the bridges reproduces the two-layer structure of TCN, where the nodes of the train network exchange data among themselves without traveling through any bridge. The two solutions are sketched in Fig.11 and Fig.12, respectively. Both the figures use the same conventions on forms and hatchings of the nodes as in Fig.10.
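The filtering role of an intelligent bridge in the first solution can be sketched as follows. The sketch assumes that each message carries a flag telling whether its source belongs to the train network (the square nodes of Fig.10); the `Message` class, its fields and the function `forward_across_bridge` are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    source: str          # name of the transmitting node (illustrative)
    is_train_node: bool  # True if the source belongs to the train network
    payload: bytes


def forward_across_bridge(messages):
    """Return only the messages a bridge passes on to the next trunk.

    Messages from vehicle-network nodes stay on their own trunk, so they
    do not consume bandwidth on the other trunks.
    """
    return [message for message in messages if message.is_train_node]


# Hypothetical traffic observed on one trunk during a communication cycle.
traffic = [
    Message("car1_door_sensor", False, b"\x01"),
    Message("car1_train_node", True, b"\x02"),
    Message("car1_hvac", False, b"\x03"),
]
forwarded = forward_across_bridge(traffic)
```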
detection of a wide range of faults without interruption of the transmission, thus giving the network powerful capabilities in terms of efficiency, safety and continuity of the service. The bit rate of the time-triggered protocols is about ten times higher than that of TCN. A wider bandwidth is an additional merit of the time-triggered protocols, as it allows the number of exchanged variables to be increased or their update period to be reduced. In the first case a more detailed knowledge of the train states is obtained and more accurate diagnostic procedures can be arranged; in the second case a faster response to the changes in the train states is achieved.

VI. CONCLUSIONS

The paper has dealt with the communication networks used in a train to exchange data among the locomotive and the railway vehicles. The purpose of the paper was to investigate the application aboard trains of the recently developed networks based on the time-triggered protocols. First, the paper has examined the safety requirements for a train communication network. Then the characteristics of the currently used protocol (TCN) and of two time-triggered protocols (TTP and FlexRay) have been illustrated. Finally, the merits and drawbacks of applying a time-triggered protocol aboard trains have been discussed and two solutions aimed at reproducing the data traffic of TCN have been formulated.

REFERENCES
Fig.11. One layer time-triggered network with bridges.

[1] Electric Railway Equipment - Train Bus - Part 1: Train Communication Network, IEC Std 61375, 1999.
[2] H. Kirrmann and P.A. Zuber, The IEC/IEEE train communication network, IEEE Micro, Vol. 21, N. 2, pp. 81-92, March-April 2001.
[3] P.A. Laplante and F.C. Woolsey, IEEE 1473: An open-source communication protocol for railway vehicles, IT Pro, pp. 12-16, Nov.-Dec. 2003.
[4] C. Schafers and G. Han, IEC 61375-1 and UIC 556 - International Standards for Train Communication, Proceedings of VTC, 2000, pp. 1581-1585.
[5] J. Jimenez, J.L. Martin, A. Zuloaga, U. Bidarte, and J. Arias, Comparison of two designs for the Multifunction Vehicle Bus, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, accepted for publication.
[6] T. Führer, B. Müller, W. Dieterle, F. Hartwich, R. Hugel, and M. Walther, Time-triggered Communication on CAN (Time-triggered CAN-TTCAN), Proc. of the 7th International CAN Conference, 2000. Available at: http://www.can-cia.de/can/ttcan/fuehrer.pdf.
[7] TTTech, Time Triggered Protocol TTP/C, High level Specification Document, 2002.
[8] C. Temple, Protocol Overview, FlexRay International Seminar, June 2003.
[9] J. Rushby, A Comparison of Bus Architectures for Safety-Critical Embedded Systems, NASA/CR-2003-212161, 2003. Available at: http://www.tttech.com/technology/docs/protocol_comparisons/Rushby_2003-03-Bus_Architectures.pdf.
[10] L. Almeida, P. Pedreiras, and J.A. Fonseca, The FTT-CAN Protocol: Why and How, IEEE Transactions on Industrial Electronics, Vol. 49, N. 6, December 2002.
[11] J.R. Pimentel and J.A. Fonseca, FlexCAN: A Flexible Architecture for Highly Dependable Embedded Applications, Proc. of the 3rd International Workshop on Real-Time Networks, Catania, Italy, 2004.
Fig.12. Two layer time-triggered network with bridges (TT stands for Time-Triggered).
The inauguration procedure should be executed by software routines implemented in the host controller. Their development could result in a rather difficult task because the accommodation of new nodes in an existing network necessitates the reassignment of the time slots to all the nodes. The most important features of class c) are: i) the TDMA medium access, ii) the powerful functions for fault detection and iii) the higher transmission rate. TDMA profitably takes the place of the master-slave access of TCN. This permits both a better use of the communication bandwidth due to the lack of polling messages and the