Autonomic Critical Infrastructure Protection
(ACIP) System
Bilal Al Baalbaki, Youssif Al-Nashif, Salim Hariri
Douglas Kelly
NSF Center for Cloud and Autonomic Computing
The University of Arizona
Tucson, AZ, USA
[email protected]
{alnashif, hariri}@ece.arizona.edu
AVIRTEK, Inc.
Tucson, AZ, USA
[email protected]
Abstract— The dependency of critical infrastructures on the
Supervisory Control And Data Acquisition (SCADA) systems has
increased rapidly in the last few years to perform remote
monitoring and control services for a wide range of utilities such
as power distribution, gas production, and waste water
treatment. The trend toward operating the grid over IP networks
using open standard protocols, and the growing number of
attacks targeting critical infrastructure, have made the security
of SCADA systems an important research issue. Most of the
currently used SCADA communication protocols have no
encryption, authentication, or authorization, which makes them
vulnerable and easy targets for cyber-attacks. This paper presents
an Autonomic Critical Infrastructure Protection (ACIP) system
that is based on anomaly-based intrusion detection and
autonomic computing to secure the control functions and
management tasks of critical infrastructure control systems with
little or no involvement from the users or administrators. We
show how we applied ACIP to the widely used Modbus
communication protocol to securely transfer commands and data
between RTUs and industrial control systems in smart grids.
Keywords— Smart Grid,
renewable energy, SCADA, information technology, ACIP.
I.
INTRODUCTION
Industrial Control Systems (ICSs) are considered the
backbone of many critical infrastructure sectors, and are widely
used in many fields of industry such as electric power, oil, and
water treatment. ICSs include SCADA systems, Distributed
Control Systems (DCSs) and Programmable Logic Controllers
(PLCs) [1]. SCADA is used in distributed systems to control
processes and operations that are geographically dispersed, such
as railways, oil lines, and electrical power grids. It supports long
distance communications with centralized control and
monitoring capabilities [1]. A DCS is a process-driven control
system that is implemented in small local plants. A DCS is mainly
concerned with the sequence of the industrial operations, so it
gathers the data in a steady state, and the communication with
field devices is usually done through a Local Area Network
(LAN) [2]. PLCs are computer-based devices that initially have a
fixed number of inputs and outputs and that can be programmed
(using ladder logic, BASIC, C, etc.) to perform a specific role or
function; the inputs and outputs can be expanded by
adding additional racks. Additionally, PLCs have built-in ports
(RS-232 and/or Ethernet), and they communicate using one of
the industrial protocols (such as Modbus, DNP3, PROFIBUS,
etc.).
This work is partially supported by AFOSR DDDAS award number
FA95550-12-1-0241, and National Science Foundation research projects NSF
IIP-0758579, NCS-0855087 and IIP-1127873.
978-1-4799-0792-2/13/$31.00 ©2013 IEEE
In this paper, we focus on developing a protection system
for communication protocols (e.g., the Modbus protocol). The
power grid is old and insecure. Consequently, its failures can
lead to catastrophic results. The current electric grid suffers
from many drawbacks: lack of real time monitoring, insecure
communications, and long recovery times, just to name a few.
Updating the existing power grid requires an intelligent
infrastructure that can adapt to changes in unbalanced
loads, to equipment failures, to adversary attacks, etc.
Since SCADA systems perform run-time monitoring and relay
physical changes in the power infrastructure by controlling the
PLCs, applying them to the conventional electric grid will give
this outdated grid a sense of intelligence, and will move it
toward a smart grid that can be self-configured,
self-optimized, self-healed and self-* (any desired property or
function). Although SCADA is a powerful tool for supervising
grid operations, it was not designed to protect its
infrastructure against cyber-attacks. This shortcoming makes
SCADA an attractive environment for cyber-physical attacks.
In this paper we show how to utilize anomaly behavior analysis
techniques and autonomic computing to build an Autonomic
Critical Infrastructure Protection (ACIP) system to secure the
control functions and management tasks of critical
infrastructure control systems with little or no involvement
from the users or administrators. This paper is organized as
follows. The next section gives more detailed background and
describes SCADA attacks with a focus on the Modbus protocol.
Section 3 describes some related work. Section 4 explains the
ACIP approach. Section 5 provides an overview of the ACIP
testbed and an experimental evaluation. Finally, Section 6
concludes the current work and presents possible future
opportunities.
II.
BACKGROUND
SCADA systems are considered more vulnerable due to
their connectivity to external corporate networks and the
Internet. They use the TCP/IP protocol stack and run in the
Microsoft Windows environment, which might include computers
infected by viruses and worms [3].
Threats can be carried out from inside or outside the system
network or facility (internal and external threats). Furthermore,
attacks are classified into two main categories:
1) Physical Attacks: They are launched by an insider who has
privileges or has penetrated the system. Acting as a
single attack vector (manually or by sending false
commands), the insider can damage the whole system or take
it down from inside.
2) Cyber Attacks: They are launched by a penetrator who does
not have physical access to the SCADA components and
devices, and who targets the network or the SCADA
protocols in several ways.
Vulnerabilities in control system communication protocols
are the primary weakness that makes SCADA systems
vulnerable to cyber-attacks. We focus on SCADA attacks in
smart power grids. The bidirectional communications and the
integration with Internet protocols and devices increase the
vulnerability of smart grids to cyber-attacks.
Attacks are divided into direct and indirect attacks. Direct
attacks have specific consequences on the infected machine.
Indirect attacks are applied against one machine in order to
affect other machines.
1) Direct Attacks
a) Jamming: A jammer can disrupt the communication
medium by putting streams of packets into the shared medium.
As a result, the sender will not be able to send any packet
because the medium is busy all the time. Usually the adversary
sends thousands of trivial packets such as echo messages. This
type of attack may break the connection between the control
center and the substations, which leads to
another attack, Denial of Service (DoS). During a DoS, the
monitoring, reporting, set point access, and all other
functionalities of the Master Terminal Unit (MTU) will be out
of service. According to [4], a jammer can reduce the
performance of the system from 80-90% to about 40%.
b) Replay: This attack is implemented by sniffing the
packets being transmitted and then sending them again with the
same payload to the MTU through the Advanced Metering
Infrastructure (AMI). This causes the controller to issue a new
command to the victim, which can be any device in the
compromised network. Such a forged request can lead to
equipment damage, and to network overloading if the
victim is a high power consumer. In the worst case, the
request affects human safety.
2) Indirect Attacks
a) Brute Force: In June 2011, NATO (North Atlantic Treaty
Organization) experienced a security breach that led to the
public release of first and last names, usernames, and
passwords for more than 11,000 registered users of its
e-bookshop [5]. Gaining an access point is the initial step for an
intruder to start establishing other attacks. Unfortunately, most
SCADA systems are secured with weak passwords, which can
be recovered with sniffing programs such as Wireshark
[6], where a simple analysis of the captured packets can
reveal the password. Another approach is to use
password cracking programs that try to guess a password by
implementing probabilistic algorithms. This type of attack has
no immediate effect on the system, but it has potential
effects.
b) Worm: An attacker can upload a file to an I/O server, such
as the file used for time synchronization. If the uploaded file
carries a worm, the corrupted file is then used to synchronize
the time for all the devices. As a result, all devices will be
affected by this attack.
c) Man-in-the-Middle (MITM): This attack can affect the
confidentiality or the integrity of the message. By spoofing the
IP address of one of the system devices, an adversary
communicates with the MTU as if it were the client, and vice
versa for the client. The attacker acts as a bridge passing the
packets between the router and the victim.
Modbus [10] is a simple request-response application layer
messaging protocol developed by Modicon. It is one of the most
commonly used protocols in SCADA systems. In recent years,
the prevalence of TCP/IP networks (e.g., the Internet) and the
availability of inexpensive industrial equipment have enticed
many critical infrastructure operators to use TCP/IP
communications for control networks. The legacy Modbus
Serial protocol has been implemented over TCP/IP. Since
Modbus TCP relies on TCP, a connection-oriented protocol, it
inherits the three-way handshake. Before
sending data, the sender should check whether the shared
medium is free or busy, which is done through the CSMA/CD
technique [11].
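To make the request-response format concrete, the following is a minimal sketch of how a Modbus TCP request is laid out on the wire, based on the public Modbus specification rather than on the ACIP implementation. The function names and the example values (transaction 1, unit 17, ten registers) are illustrative assumptions. Note that no field in the frame identifies or authenticates the sender.

```python
import struct


def build_read_holding_registers(transaction_id, unit_id, start_addr, quantity):
    """Build a Modbus TCP request for function code 0x03 (Read Holding Registers)."""
    # PDU: function code (1 byte), starting address (2 bytes), quantity (2 bytes)
    pdu = struct.pack(">BHH", 0x03, start_addr, quantity)
    # MBAP header: transaction id (2), protocol id (2, zero for Modbus),
    # length (2, counts the unit id byte plus the PDU), unit id (1)
    mbap = struct.pack(">HHHB", transaction_id, 0, len(pdu) + 1, unit_id)
    return mbap + pdu


def parse_header(frame):
    """Split a Modbus TCP frame into its MBAP fields and function code."""
    transaction_id, protocol_id, length, unit_id = struct.unpack(">HHHB", frame[:7])
    return {
        "transaction_id": transaction_id,
        "protocol_id": protocol_id,
        "length": length,
        "unit_id": unit_id,
        "function_code": frame[7],
    }


# Illustrative values only: read 10 registers starting at address 0 from unit 17.
frame = build_read_holding_registers(1, 17, 0, 10)
fields = parse_header(frame)
```

Because every field is sent in the clear and is predictable, any host that can reach the device can construct a frame that the device will accept as a valid command.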
Many attacks that exploit the Modbus protocol
specifications have been documented. We identified 20 distinct
attacks with 59 attack instances in the case of Modbus serial,
and 28 distinct attacks with 113 attack instances in the case of
Modbus TCP/IP [5-12]. The primary targets are the master
unit, field devices, serial or network communication links and
the messages [5].
III.
RELATED WORK
To secure the Smart Grid (SG), different techniques have been
applied to different parts of the grid. Aravinthan et al.
presented a bidirectional method of verification. First, the
appliance sends a request to join the network with a password
given by the utility. The AMI examines this password. If it is
accepted, the AMI generates a unique private key for the
consumer for the next sessions [4]. Yan et al. [7] divided the
protected assets into three parts. First, the system, which can be
protected by providing a trusted platform module (TPM) using a
Core Root of Trust for Measurement (CRTM) for load time and
run time data, a security kernel, encryption for sensitive data,
and a shielded channel to prevent eavesdropping. Second, the
integrity of the process, which can be handled by providing
fingerprints for the devices such as AES128 [8], or by using a
hash-based message authentication code for the exchanged data.
Third, the integrity of the generated data, which depends on the
collected data and can be secured by applying a checksum to
each received message and ensuring that the path is trusted. The
researchers in [9] divide the data into low frequency (LF) and
high frequency (HF), each of which has a specific ID. The
distinction is based on the data patterns being carried to the
utility. For instance, power usage patterns are considered HF.
The AMI has two methods for dealing with HF and LF: HF data
are transmitted directly to the utility, but LF data are
encrypted and given a private session.
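As an illustration of the hash-based message authentication mentioned above, the following sketch tags a message with an HMAC and verifies it on receipt, using only the Python standard library. The shared key, the choice of SHA-256, and the example payload are assumptions made for this sketch; the surveyed work does not prescribe them.

```python
import hashlib
import hmac


def tag_message(key: bytes, message: bytes) -> bytes:
    """Compute an HMAC-SHA256 tag that the sender appends to the message."""
    return hmac.new(key, message, hashlib.sha256).digest()


def verify_message(key: bytes, message: bytes, tag: bytes) -> bool:
    """Recompute the tag and compare it in constant time."""
    expected = hmac.new(key, message, hashlib.sha256).digest()
    return hmac.compare_digest(expected, tag)


# Hypothetical pre-shared key and payload, for illustration only.
shared_key = b"example-pre-shared-key"
reading = b"meter=42;kwh=17.5"
tag = tag_message(shared_key, reading)

accepted = verify_message(shared_key, reading, tag)
tampered = verify_message(shared_key, b"meter=42;kwh=99.9", tag)
```

A tag of this kind protects integrity and origin but not confidentiality, and it does not by itself stop a replayed message; a counter or timestamp inside the authenticated payload would be needed for that.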
IV.
AUTONOMIC CRITICAL INFRASTRUCTURE PROTECTION
(ACIP) APPROACH
Many methods have been presented to protect SCADA systems,
such as specification-based, probabilistic-based, and
signature-based detection, exceptions detection via SNORT
rules [13], etc.
Our approach to developing the ACIP system, which is shown
in Fig. 1, is based on Autonomic Computing [14]. As Fig. 1
shows, the main ACIP modules are: Online Monitoring, Feature
Selection and Correlation, Multi-level Behavior Analysis,
Decision Fusion, Automated/Semi-automated Actions,
Visualization, and Adaptive Learning. In what follows, we give
a brief overview of each of these modules.
Fig. 1 Autonomic Critical Infrastructure Protection
Architecture
(1) Online Monitoring: Our monitoring approach covers
both physical and cyber spaces. We use autonomic agents and
existing monitoring tools to continuously (24/7) collect
multidimensional datasets that capture application level
behaviors and network level behaviors.
(2) Feature Selection and Correlation: Our assumption is
that any anomalous event (e.g., network attack, failure,
degradation in performance, or misconfiguration) will lead to a
certain anomalous behavior scenario, and it is imperative to be
able to accurately identify this anomalous behavior. For
example, communication between the HMI and the RTUs during
normal operation is significantly different from that during a
DoS attack. The low level monitored data about protocol
services, applications, and infrastructure resources are stored in
well-defined data structures, which we refer to as AppFlow™
structures. AppFlow™ can be viewed as an n-dimensional
array of features that capture temporal and spatial behaviors of
SCADA applications, and characterize their dynamic
interactions with respect to the keys. Hardware, software and
network features can be categorized into three classes:
Hardware Flow, Software Flow, and Network Flow. By using
data mining and statistical techniques, one can accurately
characterize the normal operating region for the application
with respect to the selected features and any application trend
toward anomalous operating regions. For example, an
application can be characterized with respect to features such
as CPU, memory, storage, network and system calls.
(3) Multi-level Behavior Analysis: The main objective of
using multiple levels to analyze the behavior is to decrease the
false alarms, while achieving a high detection rate.
During each observation period, T, network behavior,
protocol behavior, and application behavior analysis are used
simultaneously to detect anomalous events in the SCADA
system. If an anomalous behavior is detected by any of these
analysis modules, then an alert is sent to the Decision Fusion
module.
(4) Decision Fusion Module: With a large number of
alerts being generated by each level of the behavior analysis, it
is necessary to fuse the information generated in order to
increase the detection rate and also reduce the overhead of
processing false alarms.
(5) Risk and Impact Analysis Engine: When the risk and
impact analysis module (or engine) receives an alert from the
analysis modules, it determines what part of the system will be
affected by the detected anomaly. It also checks its knowledge
repository to determine the impact of each response action on
the SCADA system. In order to determine the impact of an
attack with respect to different criteria, or of a protection
action, dependency graph analysis and game theory techniques
are applied in our impact analysis. At the end, the response
action that gives the best protection against the detected attack
with a minimal impact will be selected and performed by an
autonomic agent.
(6) Automated/Semi-automated Actions: The complexity
and propagation speed of anomalous events (e.g., network
attacks) make manual-intensive responses too slow and
ineffective. Our approach will use autonomic agents to
implement automated or semi-automated actions to mitigate
and/or eliminate the impact of the anomalous event in a timely
manner.
(7) Adaptive Learning: We use an unsupervised learning
algorithm to identify changes in network and system
operations. These changes may be significantly different from
the current baseline models of normal operations. However,
they still represent normal behavior based on the dependency
and correlation analysis of the monitored features. If there is a
discrepancy in the decisions provided by each analysis module,
the adaptive learning module implements a feedback
mechanism to adaptively re-train the errant system in real time.
In addition, this module performs a comprehensive analysis of
network and application traffic data that might lead to the
discovery of a new anomalous behavior that was not detected
by one or more of the analysis modules. Consequently, the
supervised learning module may be activated again to generate
new rules, or to expand the trained profiles so as to capture
new types of anomalous behaviors.
(8) Visualization module: This module utilizes
visualization techniques (multivariate and network/graph
visualization) to achieve better understanding of the current
system state and its alerts such that it will significantly improve
the capability of administrators and analysts to promptly
respond to anomalous events.
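The interplay between the per-level behavior analysis and the Decision Fusion module can be sketched as follows. This is an illustrative toy, not the ACIP implementation: the z-score profile, the weighted-average fusion rule, the threshold of 3.0, and all feature names and training values are assumptions introduced for the example.

```python
from statistics import mean, pstdev


class LevelProfile:
    """Baseline of one analysis level, learned from normal observation windows."""

    def __init__(self, name, training_windows):
        self.name = name
        keys = list(training_windows[0].keys())
        self.mu = {k: mean([w[k] for w in training_windows]) for k in keys}
        # Fall back to 1.0 when a feature never varied, to avoid dividing by zero.
        self.sigma = {k: pstdev([w[k] for w in training_windows]) or 1.0 for k in keys}

    def score(self, window):
        """Largest absolute z-score over the features of one observation window."""
        return max(abs(window[k] - self.mu[k]) / self.sigma[k] for k in self.mu)


def fuse(scores, weights, threshold=3.0):
    """Weighted average of per-level scores; raise an alarm above the threshold."""
    total = sum(weights[name] for name in scores)
    fused = sum(weights[name] * s for name, s in scores.items()) / total
    return fused >= threshold, fused


# Hypothetical training data: one feature per level, five normal windows each.
profiles = {
    "network": LevelProfile("network", [{"pkts": v} for v in (10, 12, 11, 9, 10)]),
    "protocol": LevelProfile("protocol", [{"write_ratio": v} for v in (0.10, 0.12, 0.11, 0.09, 0.10)]),
    "application": LevelProfile("application", [{"cpu": v} for v in (20, 22, 21, 19, 20)]),
}
weights = {"network": 1.0, "protocol": 1.0, "application": 1.0}


def analyze(window_by_level):
    """Score one observation period T at every level, then fuse the results."""
    scores = {name: profiles[name].score(w) for name, w in window_by_level.items()}
    return fuse(scores, weights)


normal_alarm, normal_score = analyze(
    {"network": {"pkts": 11}, "protocol": {"write_ratio": 0.11}, "application": {"cpu": 21}}
)
flood_alarm, flood_score = analyze(
    {"network": {"pkts": 200}, "protocol": {"write_ratio": 0.11}, "application": {"cpu": 21}}
)
```

A weighted average lets one very strong deviation trigger an alarm while damping small deviations at a single level; a k-of-n vote over per-level alerts would be an equally plausible fusion rule.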
V.
ACIP TESTBED AND EXPERIMENTAL EVALUATION
Three Modbus servers are used to generate normal Modbus
traffic to communicate with five different types of PLCs. To
generate enough Modbus traffic, we developed a Modbus traffic
generator. In order to accurately characterize the normal
behavior of the Modbus protocol, we needed to collect and
store vast amounts of Modbus traffic. Since we did not have
access to a large Modbus PLC network to sniff, we created a
tool capable of generating many different types of messages.
The 8 function codes generated by this tool give potentially
thousands of legal message combinations. The function codes
that write to registers or coils "exercise" the PLC by sending
different bit combinations to it. For example, the "Exercise
Multiple coils" selection inverts all of the bits each time it is
called.
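The coil-exercising behavior described above can be sketched as a small generator that emits a Write Multiple Coils request (function code 0x0F) with every bit inverted on each call. The frame layout follows the public Modbus specification; the class and function names, and the starting state of all coils OFF, are assumptions for this sketch and are not taken from our tool.

```python
import struct


def pack_coils(bits):
    """Pack coil states into bytes, first coil in the least significant bit."""
    out = bytearray((len(bits) + 7) // 8)
    for i, bit in enumerate(bits):
        if bit:
            out[i // 8] |= 1 << (i % 8)
    return bytes(out)


def write_multiple_coils(transaction_id, unit_id, start_addr, bits):
    """Build a Modbus TCP request for function code 0x0F (Write Multiple Coils)."""
    data = pack_coils(bits)
    # PDU: function code, starting address, coil count, byte count, packed values
    pdu = struct.pack(">BHHB", 0x0F, start_addr, len(bits), len(data)) + data
    # MBAP header: transaction id, protocol id 0, length (unit id + PDU), unit id
    mbap = struct.pack(">HHHB", transaction_id, 0, len(pdu) + 1, unit_id)
    return mbap + pdu


class CoilExerciser:
    """Emit requests that flip every coil in a block each time it is called."""

    def __init__(self, unit_id, start_addr, n_coils):
        self.unit_id = unit_id
        self.start_addr = start_addr
        self.bits = [False] * n_coils  # assumed initial state: all coils OFF
        self.transaction_id = 0

    def next_request(self):
        self.bits = [not b for b in self.bits]
        self.transaction_id = (self.transaction_id + 1) & 0xFFFF
        return write_multiple_coils(
            self.transaction_id, self.unit_id, self.start_addr, self.bits
        )


exerciser = CoilExerciser(unit_id=1, start_addr=0, n_coils=10)
first = exerciser.next_request()   # all ten coils ON
second = exerciser.next_request()  # all ten coils OFF
```

Alternating between the two extreme bit patterns gives the profiling stage a simple, repeatable write workload against which anomalous write patterns stand out.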
Many attacks focus on function code 8, the diagnostic
register. None of the PLCs in our network allow function code
8. This eliminates the restart attack and the diagnostic register
reset attacks.
The following attacks were detected by the ACIP system:
Clear All Registers: This attack writes 0 to each bit of the
holding registers. This event indicates that an attacker has
erased the counters and diagnostics in an effort to hide
attack information or increase the time to recover from an
attack.
Invert All Registers: This attack changes the state of each
bit of the holding registers (ON to OFF and OFF to ON).
An attacker can change the value of a single coil or input
register, or multiple coils or registers at the same time.
This attack can change the flow of operation of the
system. The master's operation would then be erroneous,
and it would eventually halt.
Buffer Overflow: This attack appends data to the end of a
valid packet to overflow the buffers. This heap-based
buffer overflow allows remote attackers to cause a denial
of service and possibly execute arbitrary code via a
Modbus response packet with a crafted length field.
Modbus Network Scanning: This attack involves sending
benign messages to all possible addresses on a Modbus
network to obtain information about field devices.
Irregular TCP Framing: Multiple Modbus messages
cannot be placed in a single TCP frame. This attack
creates improperly framed messages, which may cause a
master unit or field device to close a connection.
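Two of the attacks above, appended data and irregular TCP framing, can be caught by comparing the length declared in the Modbus header with the size of the payload actually received. The following sketch illustrates that check; it is a simplified rule written for this example and not the anomaly behavior analysis used by ACIP. It assumes that each captured TCP payload should carry exactly one Modbus message, and the anomaly labels are hypothetical.

```python
import struct

MBAP_LEN = 7   # transaction id (2) + protocol id (2) + length (2) + unit id (1)
MAX_ADU = 260  # Modbus TCP limit: 7-byte MBAP header + 253-byte PDU


def check_framing(payload: bytes):
    """Return a list of anomaly labels for one TCP payload (empty if well formed)."""
    if len(payload) < MBAP_LEN + 1:
        return ["truncated"]
    _tid, protocol_id, length, _unit = struct.unpack(">HHHB", payload[:MBAP_LEN])
    issues = []
    if protocol_id != 0:
        issues.append("bad_protocol_id")
    # The length field counts the unit id plus the PDU, so it ranges from 2 to 254.
    if not 2 <= length <= MAX_ADU - 6:
        issues.append("bad_length_field")
    expected = 6 + length  # bytes before the length field ends, plus what it declares
    if len(payload) > expected:
        issues.append("trailing_data")   # appended bytes or a second message
    elif len(payload) < expected:
        issues.append("short_payload")
    return issues


# A well-formed Read Holding Registers request (unit 17, 10 registers).
valid = bytes.fromhex("00010000000611030000000a")

clean = check_framing(valid)
stacked = check_framing(valid + valid)           # two messages in one payload
padded = check_framing(valid + b"A" * 300)       # data appended to a valid packet
wrong_proto = check_framing(bytes.fromhex("00010001000611030000000a"))
```

Since TCP is a byte stream, a legitimate message can occasionally be split across segments, so a deployed check would reassemble the stream before applying a rule of this kind.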
VI.
CONCLUSION AND FUTURE WORK
In this paper we presented an approach to implement an
Autonomic Critical Infrastructure Protection (ACIP) system.
We illustrated our approach by securing the Modbus
communication protocol used in SCADA systems. We showed
that the ACIP implementation is capable of real time
monitoring, and of detecting and stopping attacks against the
Modbus protocol. We are currently working on applying our
model-based anomaly behavior analysis to other protocols that
are widely used in smart grids such as DNP3, BACnet, ZigBee,
etc.
VII. REFERENCES
[1] Department of Energy (DOE), “Smart Grid Stakeholder Books:
Consumer Advocates,” Litos Strategic Communication under contract No.
DE-AC26-04NT41817, Dec 2009.
[2] B. Galloway, G. Hancke, “Introduction to Industrial Control
Networks,” Communications Surveys & Tutorials, IEEE , vol.PP, no.99,
pp.1,21, 0, July 2012
[3] National Institute for Standards and Technology Report on Smart Grid
Interoperability Standards Roadmap EPRI, [Online]. Available:
http://www.nist.gov/smartgrid/InterimSmartGridRoadmapNISTRestructure.pdf,
Jun 17, 2012
[4] V. Aravinthan, V. Namboodiri, S. Sunku, and W. Jewell, “Wireless
AMI application and security for controlled home area networks,” Power and
Energy Society General Meeting, 2011 IEEE , vol., no., pp.1-8, 24-29 July
2011
[5] J. Oates. “NATO Site Hacked.” [Online]. Available:
http://www.theregister.co.uk/2011/06/24/nato_hack_attack/, Feb 21, 2013
[6] C. Sanders, “Practical Packet Analysis: Using Wireshark to Solve
Real-World Network Problems,” San Francisco, CA: No Starch Press, May 2007
[7] Y. Yan, Y. Qian, H. Sharif and D. Tipper, “A Survey on Cyber
Security for Smart Grid Communications,” Communications Surveys &
Tutorials, IEEE , vol.14, no.4, pp.998-1010, Fourth Quarter 2012
[8] L. Xiaona and X. Liping, “AES encryption algorithm keyless entry
system,” Consumer Electronics, Communications and Networks (CECNet),
2012 2nd International Conference on , vol., no., pp.3090-3093, 21-23 April
2012
[9] X. Li, X. Liang, R. Lu, X. Shen, X. Lin and H. Zhu, “Securing smart
grid: cyber-attacks, countermeasures, challenges,” IEEE Communications
Magazine, vol.50, no.8, pp.38-45, August 2012
[10] Steven Cheung et al. “Using Model-based Intrusion Detection for
SCADA Networks”, Computer Science Laboratory, SRI International, Dec.
2006.
[11] F. S. F. Poon, M. S. Iqbal, “Design of a physical layer security
mechanism for CSMA/CD networks” Communications, Speech and Vision,
IEE Proceedings I , vol.139, no.1, pp.103,112, Feb. 1992
[12] C. W. Ten, C. C. Liu and G. Manimaran, “Vulnerability Assessment of
Cyber security for SCADA Systems,” Power Systems, IEEE Transactions on,
vol.23, no.4, pp.1836-1846, Nov. 2008
[13] S. D. Garbrecht, “The Benefits of Object-Based Architectures for
SCADA and Supervisory Systems,” [Online]. Available:
http://global.wonderware.com/EN/Pages/WonderwareDevelopmentStudio.aspx
[14] Prosoft Technology, “IEC61850: A Protocol with Powerful Potential,”
Dec 2009.