Papers by Peter Romirer-maierhofer
IEEE Transactions on Network and Service Management, 2000
The availability of synchronized packet-level traces captured at different links allows the extraction of one-way delays for the network section in between. Delay statistics can be used as quality indicators to validate the health of the network and to detect global performance drifts and/or localized problems. Since packet delays depend not only on the network status but also on the arriving traffic rate, the delay analysis must be coupled with the analysis of the traffic patterns at short time scales.
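The delay extraction described in this abstract can be illustrated with a minimal sketch. It assumes each trace is a list of (packet identifier, timestamp) pairs and that the same identifier can be computed at both capture points; the function name and the input layout are hypothetical and are not taken from the paper.

```python
def one_way_delays(trace_a, trace_b):
    """Match packets seen at link A and link B and return their one-way delays.

    trace_a, trace_b: iterables of (packet_id, timestamp_seconds), captured
    with synchronized clocks. The packet_id scheme is an assumption here.
    """
    first_seen_at_a = {}
    for packet_id, timestamp in trace_a:
        # Keep only the first sighting of each packet at link A.
        first_seen_at_a.setdefault(packet_id, timestamp)

    delays = {}
    for packet_id, timestamp in trace_b:
        # Packets missing at A, or already matched, are skipped.
        if packet_id in first_seen_at_a and packet_id not in delays:
            delays[packet_id] = timestamp - first_seen_at_a[packet_id]
    return delays


trace_a = [("p1", 0.0), ("p2", 1.0), ("p3", 2.0)]
trace_b = [("p1", 0.25), ("p3", 2.5)]
delays = one_way_delays(trace_a, trace_b)
```

Packets that appear in only one trace (here "p2") produce no delay sample, so the number of unmatched packets would have to be tracked separately.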
Elektrotechnik Und Informationstechnik - e&i, 2006
2013 Proceedings IEEE INFOCOM, 2013
Lecture Notes in Computer Science, 2008
In this paper we investigate the dynamics of one-way delays in an operational mobile core network. Our ultimate motivation is to develop anomaly detection schemes for the packet delay process in order to reveal network and equipment problems. This requires an online measurement system capable of collecting and processing delay statistics in real time. We present an experimental deployment of such a measurement system in an operational General Packet Radio Service (GPRS)/Universal Mobile Telecommunications System (UMTS) network and elaborate on some practical implementation issues. We present some measurement results for the Serving GPRS Support Nodes (SGSN) of UMTS and GPRS. We find that the delay at a UMTS-SGSN is moderately influenced by user mobility, while flow control and user mobility considerably impact the delay process at a GPRS-SGSN. We show that simple summary indicators can be extracted from the delay statistics, as a combination of percentiles and threshold-crossing probabilities. Such indicators can be used for the purpose of detecting abnormal delay deviations, pointing to problems in the network equipment.
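As a rough illustration of the two kinds of summary indicators named in this abstract, the sketch below computes a nearest-rank percentile and a threshold-crossing probability over a batch of delay samples. The nearest-rank definition, the function names and the sample values are assumptions made for the example; the abstract does not say which definitions or thresholds the authors used.

```python
import math


def percentile(samples, q):
    """Nearest-rank percentile of the samples, for q in (0, 100]."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(q * len(ordered) / 100))
    return ordered[rank - 1]


def threshold_crossing_probability(samples, threshold):
    """Fraction of samples strictly above the threshold."""
    return sum(1 for s in samples if s > threshold) / len(samples)


# Hypothetical delay samples in milliseconds, with one outlier.
delays_ms = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]
median_delay = percentile(delays_ms, 50)
p90_delay = percentile(delays_ms, 90)
crossing = threshold_crossing_probability(delays_ms, 50)
```

In this toy batch the single outlier leaves the median and the 90th percentile unchanged but shows up in the threshold-crossing probability, which is one reason to combine both kinds of indicator.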
Lecture Notes in Computer Science, 2011
Mobile users in cellular networks produce calls, initiate connections and send packets. Such events have a binary outcome: success or failure. The term "failure" is used here in a broad sense: it can take different meanings depending on the type of event, from packet loss or late delivery to call rejection. The Mean Failure Probability (MFP) provides a simple summary indicator of network-wide performance (i.e., a Key Performance Indicator, KPI) that is an important input for the network operation process. However, the robust estimation of the MFP is not trivial. The most common approach is to take the ratio of the total number of failures to the total number of requests. Such a simplistic approach suffers from the presence of heavy users, and therefore does not work well when the distribution of traffic (i.e., requests) across users is heavy-tailed, a typical case in real networks. This motivates the exploration of more robust methods for MFP estimation. In a previous work [1] we derived a simple but robust sub-optimal estimator, called EPWR, based on the weighted average of individual (per-user) failure probabilities. In this follow-up work we tackle the problem from a different angle and formalize the problem following a Bayesian approach, deriving two variants of non-parametric optimal estimators. We apply these estimators to a real dataset collected from a real 3G network. Our results confirm the goodness of the proposed estimators and show that EPWR, despite its simplicity, yields near-optimum performance.
Lecture Notes in Computer Science, 2015
Nowadays mobile devices are highly heterogeneous both in terms of terminal types (e.g., smartphones versus data modems) and usage scenarios (e.g., mobile browsing versus machine-to-machine applications). Additionally, the complexity of mobile terminals is continuously growing due to increases in computational power and advances in mobile operating systems. In this scenario novel traffic patterns may arise in mobile networks, and it is highly desirable for operators to understand their impact on the network performance. We address this problem by characterizing the traffic of different device types and operating systems, analyzing real traces from a large-scale mobile operator. We find the presence of highly time-synchronized spikes in both data and signaling plane traffic generated by different types of devices. Additionally, by investigating a real case, we show that a device-specific view on traffic can efficiently support the root cause analysis of some types of network anomalies. Our analysis confirms that large traffic peaks, potentially leading to large-scale anomalies, can be induced by the misbehavior of a specific device type. Accordingly, we advocate the need for novel analysis methodologies for automatic detection and possibly mitigation of such device-triggered network anomalies.
GLOBECOM 2009 - 2009 IEEE Global Telecommunications Conference, 2009
In this work we present a novel scheme for statistical-based anomaly detection in 3G cellular networks. The traffic data collected by a passive monitoring system are reduced to a set of per-mobile user counters, from which time-series of unidimensional feature distributions are derived. An example of a feature is the number of TCP SYN packets seen in uplink for each mobile user in fixed-length time bins. We design a change-detection algorithm to identify deviations in each distribution time-series. Our algorithm is designed specifically to cope with the marked non-stationarities, daily/weekly seasonality and long-term trend that characterize the global traffic in a real network. The proposed scheme was applied to the analysis of a large dataset from an operational 3G network. Here we present the algorithm and report on our practical experience with the analysis of real data, highlighting the key lessons learned in the perspective of the possible adoption of our anomaly detection tool on a production basis.
Lecture Notes in Computer Science, 2009
In this study we present network-wide measurements of Round-Trip-Time (RTT) from an operational 3G network, separately for GPRS/EDGE and UMTS/HSxPA sections. The RTT values are estimated from passive monitoring based on the timestamps of TCP handshaking packets. Compared to a previous study in 2004, the measured RTT values have decreased considerably. We show that the network-wide RTT percentiles in UMTS/HSxPA are very stable in time and largely independent of the network load. Additionally, we present separate RTT statistics for handsets and laptops, finding that they are very similar in UMTS/HSxPA. During the study we identified a problem with the RTT measurement methodology, mostly affecting GPRS/EDGE data, due to early retransmission of SYNACK packets by some popular servers.
Lecture Notes in Computer Science, 2014
The bandwidth demand of today's mobile applications is permanently increasing. This requires more frequent upgrades of the mobile network capacity in the radio access as well as in the backhaul section. In such a quickly evolving scenario, the risk of capacity bottlenecks is increased; therefore network operators need tools to promptly detect capacity bottlenecks or, conversely, validate the current network state. To this end, we propose to exploit the passive observation of individual TCP connections. Since TCP is a closed-loop protocol, the performance of every TCP connection depends on the status of the whole end-to-end path. Leveraging this property, we propose a method to infer the presence of a capacity bottleneck along the path of an individual TCP connection by passively monitoring the DATA and ACK packets at a single monitoring point. We validate our approach with test traffic in a real 3G/4G operational network. The realized monitoring algorithm offers a powerful tool to network operators for on-line performance assessment and network troubleshooting.
Video and Multimedia Transmissions over Cellular Networks, 2009
International Journal of Network Management, 2010
... Similarly, we define the external dispersion Γ(k) as a synthetic indicator extracted from the set of ... The detection scheme is based on the comparison between the internal and external metrics. ... method as it allows coping with the marked non-stationarity of real traffic (see analysis ...
2009 IEEE Globecom Workshops, 2009
In this work we address the problem of estimating the network-wide packet loss rate across the radio access section of a 3G cellular network. The reference scenario consists of a passive monitoring probe located in the Core Network. The probe counts the number of TCP packets directed to each individual mobile terminal and, from the analysis of (un)acknowledged packets, infers the loss ratio for each individual terminal. The problem is then to derive a synthetic indicator representative of the network-wide packet loss, which can be used to detect large-scale performance drifts and network anomalies. We show that common simplistic indicators like the total rate of lost packets (across all terminals) and the average per-terminal loss rate do not work well in the general case. The key problem is the large disparity of traffic volume across individual terminals. In this contribution we formulate the problem in terms of optimal statistical inference and provide a set of robust near-optimum estimators that are relatively simple to implement. We validate the proposed estimators with simulations in synthetic scenarios and provide results from a real operational 3G network.
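The two simplistic indicators mentioned in this abstract can be sketched as follows, to show how they diverge when traffic volume is very uneven across terminals. The function names, the (lost, sent) input layout and the numbers are illustrative assumptions; the robust estimators proposed in the paper are not reproduced here.

```python
def pooled_loss_ratio(per_terminal):
    """Total lost packets over total sent packets, across all terminals."""
    total_lost = sum(lost for lost, _ in per_terminal)
    total_sent = sum(sent for _, sent in per_terminal)
    return total_lost / total_sent


def mean_per_terminal_loss(per_terminal):
    """Unweighted average of the individual per-terminal loss ratios."""
    ratios = [lost / sent for lost, sent in per_terminal if sent > 0]
    return sum(ratios) / len(ratios)


# Hypothetical counters: one heavy terminal with high loss, three light ones.
terminals = [(500, 1000), (0, 10), (0, 10), (0, 10)]
pooled = pooled_loss_ratio(terminals)
averaged = mean_per_terminal_loss(terminals)
```

With these counters the pooled ratio is dominated by the single heavy terminal (about 0.49), while the unweighted average gives each light terminal, observed over only ten packets, the same weight as the heavy one (0.125). Neither value is an obviously reliable summary of network-wide loss.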
e & i Elektrotechnik und Informationstechnik, 2006
A 3G network is a magnificently complex object embedded in a highly heterogeneous and ever-changing usage environment. It combines the functional complexity of the wireless cellular paradigm with the protocol dynamics of TCP/IP networks. Understanding such an environment is more urgent and at the same time more difficult than for legacy 2G networks. Continuous traffic monitoring by means of an advanced system, coupled with routine expert-driven traffic analysis, provides an in-depth understanding of the status and performance of the network as well as of the statistical behaviour of the user population. Such knowledge allows for a better engineering and operation practice of the whole network, and specifically the early detection of hidden risks and emerging troubles. Furthermore, the exploitation of certain TCP/IP dynamic behaviour, particularly the TCP control-loop, coupled with information extracted from the 3GPP layers, provides a cost-effective means to monitor the status of the whole network without requiring access to all network elements. This article summarizes the main lessons learned from a two-year research activity on traffic monitoring and analysis on top of an operational 3G network.
As global location systems offer only restricted availability, they are not suitable for a worldwide tracking application without extensions. This thesis contains a goods-tracking solution, which can be considered globally working in contrast to formerly developed technologies. For the creation of an innovative approach, an evaluation of the previous efforts has to be made.
2008 16th IEEE Workshop on Local and Metropolitan Area Networks, 2008
In this paper we investigate the dynamics of one-way delays in an operational mobile core network. Our final goal is to develop anomaly detection schemes for the packet delay process in order to reveal network and equipment problems. This requires a preliminary exploration of the delay process in the core network, which we undertake in this study.
Computer Networks, 2014
Network operators collect monitoring data from various sources, ranging from equipment logs to fine-grained packet traces, for the purpose of network operation and troubleshooting. Such data are typically summarized into a set of Key Performance Indicators (KPI), measured over fixed-length time bins, that are then inspected to detect anomalies. Many widely adopted KPIs take the form of failure ratios, where the denominator and numerator count respectively the total number of attempts (e.g., starting a call or sending a packet) and the total number thereof with unsuccessful outcome (call rejection, packet loss). These KPIs are often affected by large statistical fluctuations, and their intrinsic "noise" can obfuscate the presence of real anomalies. In this paper we study the problem and derive a concrete proposal for alternative KPI forms amenable to practical adoption. We formulate the problem in terms of robust estimation of the underlying Mean Failure Probability (MFP) across all active users, deriving a set of robust estimators with different trade-offs between optimality and simplicity. The performance of each estimator is evaluated extensively by analysis and simulations with synthetic data, for different traffic distributions in order to assess their generality. Numerical results are also provided for a sample real dataset from an operational 3G cellular network.
Lecture Notes in Computer Science, 2010
In this work we discuss the use of passive measurements of TCP performance indicators in support of network operation and troubleshooting, presenting a case-study from a real 3G cellular network. From the analysis of TCP handshaking packets measured in the core network we infer Round-Trip-Times (RTT) on both the client and server sides separately for UMTS/HSPA and GPRS/EDGE sections. We also keep track of the relative share of packet pairs which did not lead to a valid RTT sample, e.g. due to loss and/or retransmission events, and use this metric as an additional performance signal. In a previous work we identified the risk of measurement bias due to early retransmission of TCP SYNACK packets by some popular servers. In order to mitigate this problem we introduce here a novel algorithm for dynamic classification and filtering of early retransmitters. We present a few illustrative cases of abrupt changes observed in the real network, based on which we derive some lessons learned about using such data for detecting anomalies in a real network. Thanks to such measurements we were able to discover a hidden congestion bottleneck in the network under study.
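A minimal sketch of handshake-based RTT inference at a monitoring point between client and server is given below. It assumes the usual pairing (SYN to SYNACK for the server side, SYNACK to ACK for the client side) and treats a missing or out-of-order timestamp as an invalid sample; the function names are hypothetical, and the paper's filtering of early retransmitters is not reproduced.

```python
def pair_rtt(t_first, t_second):
    """RTT from one packet pair, or None if the pair gives no valid sample."""
    if t_first is None or t_second is None or t_second <= t_first:
        return None
    return t_second - t_first


def handshake_rtts(t_syn, t_synack, t_ack):
    """Return (server_side_rtt, client_side_rtt) seen at the monitoring point.

    Timestamps are in seconds; None marks a packet that was never observed.
    """
    server_side = pair_rtt(t_syn, t_synack)   # monitor -> server -> monitor
    client_side = pair_rtt(t_synack, t_ack)   # monitor -> client -> monitor
    return server_side, client_side


def invalid_share(samples):
    """Fraction of packet pairs that did not lead to a valid RTT sample."""
    return sum(1 for s in samples if s is None) / len(samples)


complete = handshake_rtts(0.0, 0.5, 0.75)
lost_synack = handshake_rtts(0.0, None, 1.0)
share = invalid_share([0.5, None, 0.25, None])
```

A retransmitted SYNACK would need extra handling, since pairing the ACK with the wrong copy biases the client-side sample; this sketch only covers the single-copy case.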