Written Exam
Version X.1
Evolving Technologies
Study Guide
By: Nicholas J. Russo
CCIE™ #42518 (RS/SP)
CCDE™ #20160041
Nicholas (Nick) Russo holds active CCIE certifications in Routing and Switching and Service Provider, as
well as CCDE. Nick authored a comprehensive study guide for the CCIE Service Provider version 4
examination and this document provides updates to the written test for all CCIE/CCDE tracks. Nick also
holds a Bachelor of Science in Computer Science, and a minor in International Relations, from the
Rochester Institute of Technology (RIT). Nick lives in Maryland, USA with his wife, Carla, and daughter,
Olivia. For updates to this document and Nick’s other professional publications, please follow the author
on his Twitter, LinkedIn, and personal website.
Technical Reviewers: Guilherme Loch Góes, Angelos Vassiliou, and many from the RouterGods team.
This material is not sponsored or endorsed by Cisco Systems, Inc. Cisco, Cisco Systems, CCIE and the CCIE
Logo are trademarks of Cisco Systems, Inc. and its affiliates. The symbol ™ is included in the Logo
artwork provided to you and should never be deleted from this artwork. All Cisco products, features, or
technologies mentioned in this document are trademarks of Cisco. This includes, but is not limited to,
Cisco IOS®, Cisco IOS-XE®, and Cisco IOS-XR®. Within the body of this document, not every instance of
the aforementioned trademarks is combined with the symbols ® or ™ as demonstrated above.
THE INFORMATION HEREIN IS PROVIDED ON AN “AS IS” BASIS, WITHOUT ANY WARRANTIES OR
REPRESENTATIONS, EXPRESS, IMPLIED OR STATUTORY, INCLUDING WITHOUT LIMITATION, WARRANTIES
OF NONINFRINGEMENT, MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
Author’s Notes
This book is designed for any CCIE track (as well as the CCDE) that introduces the “Evolving
Technologies” section of the blueprint for the written qualification exam. It is not specific to any
examination track and provides an overview of the three key evolving technologies: Cloud, Software
Defined Networking (SDN), and Internet of Things (IoT). More detailed references are included at the
end of each chapter; candidates are encouraged to review those resources. Italic text represents cited
text from another source, not created by the author. This is typically directly from a Cisco document, which is
appropriate given that this is a summary of Cisco’s vision on the topics therein.
This book is not an official publication, does not have an ISBN assigned, and is not protected by any
copyright. It is not for sale and is intended for free, unrestricted distribution. The opinions expressed in
this study guide belong to the author and are not necessarily those of Cisco.
Contents
1. Cloud
1.1 Compare and contrast Cloud deployment models
1.1.1 Infrastructure, platform, and software services (XaaS)
1.1.2 Performance and reliability
1.1.3 Security and privacy
1.1.4 Scalability and interoperability
1.2 Describe Cloud implementations and operations
1.2.1 Automation and orchestration
1.2.2 Workload mobility
1.2.3 Troubleshooting and management
1.2.4 OpenStack components
1.3 Resources and References
2. Network programmability [SDN]
2.1 Describe functional elements of network programmability (SDN) and how they interact
2.1.1 Controllers
2.1.2 APIs
2.1.3 Scripting
2.1.4 Agents
2.1.5 Northbound vs. Southbound protocols
2.2 Describe aspects of virtualization and automation in network environments
2.2.1 DevOps methodologies, tools and workflows
2.2.2 Network/application function virtualization (NFV, AFV)
2.2.3 Service function chaining
2.2.4 Performance, availability, and scaling considerations
2.3 Resources and References
3. Internet of Things (IoT)
3.1 Describe architectural framework and deployment considerations
3.1.1 Performance, reliability and scalability
3.1.2 Mobility
3.1.3 Security and privacy
3.1.4 Standards and compliance
3.1.5 Migration
3.1.6 Environmental impacts on the network
3.2 Resources and References
1. Cloud
Cisco has defined cloud as follows:
IT resources and services that are abstracted from the underlying infrastructure and provided "on-demand" and "at scale" in a multitenant environment.
Cisco identifies three key components from this definition that differentiate cloud deployments from
ordinary data center (DC) outsourcing strategies:
a. “On-demand” means that resources can be provisioned immediately when needed, released
when no longer required, and billed only when used.
b. “At-scale” means the service provides the illusion of infinite resource availability in order to meet
whatever demands are made of it.
c. “Multitenant environment” means that the resources are provided to many consumers from a
single implementation, saving the provider significant costs.
These distinctions are important for a few reasons. Some organizations joke that migrating to cloud is
simple; all they have to do is update their on-premises DC diagram with the words “Private Cloud” and
upper management will be satisfied. While it is true that the term “cloud” is often abused, it is
important to differentiate it from a traditional private DC. This is discussed next.
a. Public: Public clouds are generally the type of cloud most people think about when the word
“cloud” is spoken. They rely on a third-party organization (off-premises) to provide infrastructure
where a customer pays a subscription fee for a given amount of compute/storage, time, data
transferred, or any other metric that meaningfully represents the customer’s “use” of the cloud
provider’s shared infrastructure. Naturally, the supported organizations do not need to maintain
the cloud’s physical equipment. This is viewed by many businesses as a way to reduce capital
expenses (CAPEX) since purchasing new DC equipment is unnecessary. It can also reduce
operating expenses (OPEX) since the cost of maintaining an on-premise DC, along with trained
staff, could be more expensive than a public cloud solution. A basic public cloud design is shown below; the enterprise/campus edge uses some kind of transport to reach the public cloud network.
[Figure: basic public cloud design. The campus and campus edge reach the CSP edge across the Internet, a private WAN, or an IXP.]
b. Private: Like the joke above, this model is like an on-premises DC except it must supply the three
key ingredients identified by Cisco to be considered a “private cloud”. Specifically, this implies
automation/orchestration, workload mobility, and compartmentalization must all be supported
in an on-premises DC to qualify. The organization is responsible for maintaining the cloud’s
physical equipment, which is extended to include the automation and provisioning systems. This
can increase OPEX as it requires trained staff. Like the on-premises DC, private clouds provide
application services to a given organization and multi-tenancy is generally limited to business
units or projects/programs within that organization (as opposed to entirely different customers).
The diagram below illustrates a high-level example of a private cloud.
[Figure: private cloud. The campus core connects to the on-premises DC operating as a private cloud.]
c. Virtual Private: A virtual private cloud is a combination of public and private clouds. An
organization may decide to use this to offload some (but not all) of its DC resources into the
public cloud, while retaining some things in-house. This can be seen as a phased migration to
public cloud, or by some skeptics, as a non-committal trial. This allows a business to objectively
assess whether the cloud is the "right business decision". This option is a bit complex as it may
require moving workloads between public/private clouds on a regular basis. At the very
minimum, there is the initial private-to-public migration which occurs no matter what, and this
could be time consuming, challenging, and expensive. This is sometimes called a “hybrid cloud”
as well and could, in fact, represent a business’ IT end-state. The diagram below illustrates a high-level example of a virtual private (hybrid) cloud.
[Figure: virtual private (hybrid) cloud. The campus and its on-premises private cloud reach the CSP edge across the Internet, a private WAN, or an IXP.]
d. Inter-cloud: Like the Internet (an interconnection of various autonomous systems to exchange
network reachability information), Cisco suggests that, in the future, the contiguity of cloud
computing may extend between many third-party organizations. This is effectively how the
Internet works; a customer signs a contract with a given service provider (SP) yet has access to
resources from several thousand other service providers on the Internet. The same concept
could be applied to cloud and this is an active area of research for Cisco.
Below is a based-on-a-true-story discussion that highlights some of the decisions and constraints
relating to cloud deployments.
b. Years later, the same organization decides to keep their most important data on-premises to
meet seemingly-inflexible Government regulatory requirements, yet feels that migrating a
portion of their private cloud to the public cloud is a solution to reduce OPEX. This helps
increase the scalability of the systems for which the Government does not regulate, such as
virtualized network components or identity services, as the on-premises DC is bound by CAPEX
reductions. The private cloud footprint can now be reduced as it is used only for a subset of
tightly controlled systems, while the more generic platforms can be hosted from a cloud
provider at lower cost. Note that actually exchanging/migrating workloads between the two
clouds at will is not appropriate for this organization as they are simply trying to outsource
capacity to reduce cost. This deployment could be considered a “virtual private cloud” by Cisco,
but is also commonly referred to as a “hybrid cloud”.
c. Years later still, this organization considers a full migration to the public cloud. Perhaps this is
made possible by the relaxation of the existing Government regulations or by the new security
enhancements offered by cloud providers. In either case, the organization can migrate its
customized systems to the public cloud and consider a complete decommission of their existing
private cloud. Such decommissioning could be done gracefully, perhaps by first shutting down
the entire private cloud and leaving it in “cold standby” before removing the physical racks.
Rather than using the public cloud to augment the private cloud (like a virtual private cloud), the
organization could migrate to a fully public cloud solution.
1.1.1 Infrastructure, platform, and software services (XaaS)
a. Software as a Service (SaaS) is where application services are delivered over the network on a
subscription and on-demand basis. A simple example would be creating a document without
installing the appropriate text editor on the user’s personal computer. Instead, the application is
hosted “as a service” that a user can access anywhere, anytime, from any machine. SaaS is an
interface between users and a hosted application, often times a hosted web application.
c. Infrastructure as a Service (IaaS) is where compute, network, and storage are delivered over the
network on a pay-as-you-go basis. The approach that Cisco is taking is to enable service
providers to move into this area. This is likely the first thing that comes to mind when individuals
think of “cloud”. It represents the classic “outsourced DC” mentality that has existed for years
and gives the customer flexibility to deploy any applications they wish. Compared to SaaS, IaaS
just provides the “hardware”, roughly speaking, while SaaS provides both the infrastructure and the application. IaaS may also provide
a virtualization layer by means of a hypervisor. A good example of an IaaS deployment could be
a miniature public cloud environment within an SP point of presence (POP) which provides
additional services for each customer: firewall, intrusion prevention, WAN acceleration, etc. IaaS
is effectively an interface between an operating system and the underlying hardware resources.
d. IT foundation is the basis of the above value chain layers. It provides basic building blocks to
architect and enable the above layers. While more abstract than the XaaS layers already
discussed, the IT foundation is generally a collection of core technologies that evolve over time.
For example, DC virtualization became very popular about 15 years ago and many organizations
spent most of the last decade virtualizing “as much as possible”. DC fabrics have also changed in
recent years; the original designs represented a traditional core/distribution/access layer design
yet the newer designs represent leaf/spine architectures. These are “IT foundation” changes
that occur over time which help shape the XaaS offerings, which are always served using the
architecture defined at this layer.
Although not defined in formal Cisco documentation, there are many more flavors of XaaS. Below are
some additional examples of storage related services commonly offered by large cloud providers:
1.1.2 Performance and reliability
Many of the trade-offs in a cloud environment revolve around the introduction of automation. Automation is discussed in detail in section 1.2.1, but the trade-offs are discussed here as they directly influence the performance and reliability of a system. Note that this discussion is typically relevant for private and virtual private clouds, as a public cloud provider will always be large enough to warrant several automation tools.
Automation usually reduces the total cost of ownership (TCO), which is a desirable thing for any
business. This is the result of reducing the time (and labor wages) it takes for individuals to “do things”:
provision a new service, create a backup, add VLANs to switches, test MPLS traffic-engineering tunnel
computations, etc. The trade-off is that all software (including the automation system being discussed)
requires maintenance, whether that is in the form of in-house development or a subscription fee from a
third-party. If in the form of in-house development, software engineers are paid to maintain and
troubleshoot the software which could potentially be more expensive than just doing things manually,
depending on how much maintenance and unit testing the software requires. Most individuals who
have worked as software developers (including the author) know that bugs or feature requests always
seem to pop up, and maintenance is continuous for any non-trivial piece of code. Businesses must also
consider the cost of the subscription for the automation software against the cost of not having it (in
labor wages). Typically this becomes a simple choice as the network grows; automation often shines
here. This is why automation is such a key component of cloud environments because the cost of
dealing with software maintenance is almost always less than the cost of a large IT staff.
Automation can also be used for root cause analysis (RCA) whereby the tool can examine all the
components of a system to test for faults. For example, suppose an eBGP session fails between two
organizations. The script might test for IP reachability between the eBGP routers first, followed by
verifying no changes to the infrastructure access lists applied on the interface. It might also collect
performance characteristics of the inter-AS link to check for packet loss. Last, it might check for
fragmentation on the link by sending large pings with “don’t fragment” set. This information can feed
into the RCA which is reviewed by the network staff and presented to management after an outage.
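To illustrate, below is a minimal Python sketch of such a check, assuming a Linux host running the script; the peer address is a placeholder, and a real RCA tool would also collect the access-list and interface counters from the routers themselves:

import subprocess

PEER = "192.0.2.1"  # placeholder eBGP peer address

def ping(host, size=56, df=False, count=4):
    # Send ICMP echoes; optionally use large packets with the don't-fragment bit set (Linux ping flags)
    cmd = ["ping", "-c", str(count), "-s", str(size), host]
    if df:
        cmd[1:1] = ["-M", "do"]
    return subprocess.run(cmd, capture_output=True).returncode == 0

report = {
    "ip_reachability": ping(PEER),                        # basic reachability test
    "no_fragmentation": ping(PEER, size=1472, df=True),   # large ping with DF set
}
print(report)  # results feed the RCA reviewed by network staff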
The main takeaway is that automation should be deployed where it makes sense (TCO reduction) and
where it can be maintained with a reasonable amount of effort. Failing to provide the maintenance
resources needed to sustain an automation infrastructure can lead to disastrous results. With
automation, the “blast radius”, or potential scope of damage, can be very large. A real-life story from
the author: when updating SNMPv3 credentials, the wrong privacy algorithm was configured, causing
100% of devices to be unmanageable via SNMPv3 for a short time. Correcting the change was easily
done using automation, and the business impact was minimal, but it negatively affected every router,
switch, and firewall in the network.
Automation helps maximize the performance and reliability of a cloud environment. Another key aspect
of cloud design is accessibility, which assumes sufficient network bandwidth to reach the cloud
environment. A DC that was once located at a corporate site with 2,000 employees was accessible to
those employees over a company’s campus LAN architecture. Often times this included high-speed core
and DC edge layers whereby accessing DC resources was fast and highly available. With public cloud, the
Internet/private WAN becomes involved, so cloud access becomes an important consideration.
Below is a table that compares access methods, reliability, and other characteristics of the different
cloud solutions.
Access (private WAN, IXP, Internet VPN, etc):
- Public Cloud: Often times relies on Internet VPN, but could also use an Internet Exchange (IX) or private WAN.
- Private Cloud: Corporate LAN or WAN, which is often private. Could be Internet-based if SD-WAN deployments (e.g. Cisco IWAN) are considered.
- Virtual Private Cloud: Combination of the corporate WAN for the private cloud components and whatever the public cloud access method is.
- Inter-Cloud: Same as public cloud, except relies on the Internet as transport between clouds/cloud deployments.

Reliability (including accessibility):
- Public Cloud: Heavily dependent on highly-available and high-bandwidth links to the cloud provider.
- Private Cloud: Often times high given the common usage of private WANs (backed by carrier SLAs).
- Virtual Private Cloud: Typically higher reliability to access the private WAN components, but depends entirely on the public cloud access method.
- Inter-Cloud: Assuming applications are distributed, reliability can be quite high if at least one "cloud" is accessible (anycast).

Fault Tolerance / intra-cloud availability:
- Public Cloud: Typically high as the cloud provider is expected to have a highly redundant architecture.
- Private Cloud: Often constrained by corporate CAPEX; tends to be a bit lower than a managed cloud service given the smaller DCs.
- Virtual Private Cloud: Unlike public or private, the networking link between the two is an important consideration for fault tolerance.
- Inter-Cloud: Assuming applications are distributed, fault tolerance can be quite high if at least one "cloud" is accessible (anycast).

Performance (speed, etc):
- Public Cloud: Typically high as the cloud provider is expected to have a very dense compute/storage architecture.
- Private Cloud: Often constrained by corporate CAPEX; tends to be a bit lower than a managed cloud service given the smaller DCs.
- Virtual Private Cloud: Unlike public or private, the networking link between the two is an important consideration, especially when applications are distributed across the two clouds.
- Inter-Cloud: Unlike public or private, the networking link between the two is an important consideration, especially when applications are distributed across the two clouds.
1.1.3 Security and privacy
From a purely network-focused perspective, many would argue that public cloud security is superior to
private cloud security. This is the result of hiring an organization whose entire business revolves around
providing a secure, high-performing, and highly-available network. A business where “the network is not
the business” may be less inclined or less interested in increasing OPEX within the IT department, the
dreaded cost center. The counter-argument is that public cloud physical security is always questionable,
even if the digital security is strong. Should a natural disaster strike a public cloud facility and scatter disk drives across a large geographic region (a tornado comes to mind), what is the cloud provider’s plan to protect customer data? What if the data is being stored in a region of the world known to have unfriendly relations towards the home country of the supported business? These are important questions to ask because when data is in the public cloud, the customers never really know
exactly “where” the data is being stored. This uncertainty can be offset by using “availability zones”
where some cloud providers will ensure the data is confined to a given geographic region. In many
cases, this sufficiently addresses the concern for most customers, but not always. As a customer, it is
also hard to enforce and prove this. This sometimes comes with an additional cost, too. Note that
disaster recovery (DR) is also a component of business continuity (BC) but like most things, it has
security considerations as well.
The following table compares the security and privacy characteristics of the different cloud deployment options.
Digital security:
- Public Cloud: Typically has the best trained staff, but focused on the network and not much else (the network is the business).
- Private Cloud: Focused IT staff, but likely not IT-focused upper management (the network is likely not the business).
- Virtual Private Cloud: Coordination between clouds could provide attack surfaces, but this isn't widespread.
- Inter-Cloud: Coordination between clouds could provide attack surfaces (like what BGPsec is designed to solve).

Physical security:
- Public Cloud: One cannot pinpoint their data within the cloud provider's network.
- Private Cloud: Generally high as a business knows where the data is stored, breaches notwithstanding.
- Virtual Private Cloud: Combination of public and private; depends on application component distribution.
- Inter-Cloud: One cannot pinpoint their data anywhere in the world.

Privacy:
- Public Cloud: Transport from premises to cloud should be secured (Internet VPN, secure private WAN, etc).
- Private Cloud: Generally secure assuming the corporate WAN is secure.
- Virtual Private Cloud: Need to ensure any replicated traffic between public/private clouds is protected; generally this is true as the link to the public cloud is protected.
- Inter-Cloud: Need to ensure any replicated traffic between distributed public clouds is protected; not the responsibility of end customers, but cloud providers should provide it.
1.1.4 Scalability and interoperability
The following table briefly discusses the scalability and interoperability considerations across the cloud types.
Scalability:
- Public Cloud: Appears to be "infinite", which allows the customer to provision new services quickly.
- Private Cloud: High CAPEX and OPEX to expand it, which limits scale within a business.
- Virtual Private Cloud: Scales well given public cloud resources.
- Inter-Cloud: Highest; massively distributed architecture.

Interoperability:
- Public Cloud: Up to the developer to use cloud provider APIs; these are provided as part of the cloud offering.
- Private Cloud: Interoperable with the underlying platform; i.e., one "OpenStack application" should be deployable to another OpenStack instance.
- Virtual Private Cloud: Combination of public/private, depending on where the resource is located. Migration between the two could be limited depending on the APIs invoked.
- Inter-Cloud: Up to the developer to use cloud provider APIs; these are provided as part of the cloud offering. Assumes consistent API presentation between different cloud ASes.
a. Private WAN (like MPLS L3VPN): Using the existing private WAN, the cloud provider is
connected as an extranet. To use MPLS L3VPN as an example, the cloud-facing PE exports a
central service route-target (RT) and imports corporate VPN RT. This approach could give direct
cloud access to all sites in a highly scalable, highly performing fashion. Traffic performance
would (should) be protected under the ISP’s SLA to cover both site-to-site customer traffic and
site-to-cloud/cloud-to-site customer traffic. The ISP may even offer this cloud service natively as
part of the service contract; as discussed earlier, certain services could be collocated in an SP
POP as well. The private WAN approach is likely to be expensive and as companies try to drive
OPEX down, a private WAN may not even exist. Private WAN is also good for virtual private
(hybrid) cloud assuming the ISP’s SLA is honored and is routinely measuring better performance
than alternative connectivity options. Virtual private cloud makes sense over private WAN
because the SLA is assumed to be better, therefore the intra-DC traffic (despite being inter-site)
will not suffer performance degradation. Services could be spread between the private and
public clouds assuming the private WAN bandwidth is very high and latency is very low, both of
which would be required in a cloud environment. It is not recommended to do this as the
amount of intra-workflow bandwidth (database server on-premises and application/web server
in the cloud, for example) is expected to be very high. The diagram below depicts private WAN
connectivity assuming MPLS L3VPN. In this design, branches could directly access cloud
resources without transiting the main site.
b. Internet Exchange Point (IXP): A customer’s network is connected via the IXP LAN (might be a
LAN/VLAN segment or a layer-2 overlay) into the cloud provider’s network. The IXP network is
generally access-like and connects different organizations together for exchange, but typically
does not provide transit services between sites like a private WAN. Some describe an IXP as a
“bandwidth bazaar” or “bandwidth marketplace” where such exchanges can happen in a local
area. A strict SLA may not be guaranteed but performance would be expected to be better than
the Internet VPN. This is likewise an acceptable choice for virtual private (hybrid) cloud but lacks
the tight SLA typically offered in private WAN deployments. A company could, for example, use
internet VPNs for inter-site traffic and an IXP for public cloud access. A private WAN for inter-site
access is also acceptable. The diagram below shows a private WAN for branch connectivity and
an IXP used for cloud connectivity from the main campus site.
c. Internet VPN: By far the most common deployment, a customer creates a secure VPN over the
Internet (could be multipoint if outstations required direct access as well) to the cloud provider.
It is simple and cost effective, both from a WAN perspective and DC perspective, but offers no
SLA whatsoever. Although suitable for most customers, it is likely to be the most inconsistent
performing option. While broadband Internet connectivity is much cheaper than private WAN
bandwidth (in terms of price per Mbps), the quality is often lower. Whether this is “better” is
debatable and depends on the business drivers. Also note that Internet VPNs, even high
bandwidth ones, offer no latency guarantees at all. This option is best for fully public cloud
solutions since the majority of traffic transiting this VPN tunnel should be user service flows. The
solution is likely to be a poor choice for virtual private clouds, especially if workloads are
distributed between the private and public clouds. The biggest drawback of the Internet VPN
access design is that slow cloud performance as a result of the “Internet” is something a
company cannot influence; buying more bandwidth is the only feasible solution. In this example,
the branches don’t have direct Internet access (but they could), so they rely on an existing
private WAN to reach the cloud service provider. This design is shown below.
How a cloud provider network is built, operated, and maintained is discussed in the remaining sections.
1.2.1 Automation and orchestration
Automation and orchestration are two different things, although the terms are sometimes used interchangeably
(and incorrectly so). Automation refers to completing a single task, such as deploying a virtual machine,
shutting down an interface, or generating a report. Orchestration refers to assembling/coordinating a
process/workflow, which is effectively an ordered set of tasks glued together with conditions.
example, deploy this virtual machine, and if it fails, shutdown this interface and generate a report.
Automation is to task as orchestration is to process/workflow.
Often times the task to automate is what an engineer would configure using some
programming/scripting language such as Java, C, Python, Perl, Ruby, etc. The variance in tasks can be
very large since an engineer could be presented with a totally different task every hour. Creating 500
VLANs on 500 switches isn’t difficult, but is monotonous, so writing a short script to complete this task is
ideal. Adding this script as an input for an orchestration engine could properly insert this task into a
workflow. For example, run the VLAN-creation script after the nightly backups but before 6:00 AM the
following day. If it fails, the orchestrator can revert all configurations so that the developer can
troubleshoot any script errors.
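As a rough sketch of the VLAN-creation task above, the Python example below uses the Netmiko library (one of several options) to push the same VLAN set to each switch; the addresses, credentials, and VLAN range are placeholders, and in practice an orchestration engine would schedule and roll back this work:

from netmiko import ConnectHandler

SWITCHES = ["10.1.1.1", "10.1.1.2"]   # placeholder; the real job targets 500 switches
VLAN_IDS = range(100, 600)            # 500 VLANs

# Build the configuration lines once; the same set applies to every switch
config_lines = []
for vlan_id in VLAN_IDS:
    config_lines += [f"vlan {vlan_id}", f" name AUTO_VLAN_{vlan_id}"]

for switch in SWITCHES:
    conn = ConnectHandler(device_type="cisco_ios", host=switch,
                          username="admin", password="secret")  # placeholder credentials
    conn.send_config_set(config_lines)   # push the VLAN configuration
    conn.save_config()                   # copy running-config to startup-config
    conn.disconnect()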
With all the advances in network automation, it is important to understand the role of configuration
management (CM) and how new technologies may change the logic. Traditional networking CM typically
consisted of a configuration control board (CCB) along with an organization that maintained device
configurations. While the corporate governance gained by the CCB has value, the maintenance of device
configurations may not. Using the “infrastructure as code” concept, organizations can template/script
their device configurations and apply CM practices only to the scripts. One example is using Ansible with
the Jinja2 template language. Simply maintaining these scripts, along with their associated playbooks
and variable files, has many benefits:
1. Less to manage: A network with many nodes is likely to have many device configurations that
are almost identical. One such example would be restaurant/retail chains as it relates to WAN
sites. By creating a template for one architecture, then maintaining site-specific variable files,
updating configurations becomes simpler.
2. Enforcement: Simply running the script will baseline the entire network based on the CCB’s
policy. This can be done on a regular basis to wipe away any vestigial (or malicious/damaging)
configurations from devices quickly.
3. Easy to test: Running the scripts in a development environment, such as on some VMs in a
private data center or compute instances in public cloud, can simplify the testing of your code
before applying it to the production network.
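To make the template idea concrete, here is a minimal sketch that renders a Jinja2 template directly from Python (Ansible would normally perform the rendering from a playbook); the template text and site variables are invented for illustration:

from jinja2 import Template

# A trimmed-down "one architecture" template kept under CM control
TEMPLATE = """hostname {{ site }}-rtr01
interface GigabitEthernet0/0
 ip address {{ wan_ip }} 255.255.255.252
 description WAN uplink for {{ site }}"""

# Site-specific variables (normally one YAML file per site)
sites = [
    {"site": "STORE101", "wan_ip": "10.10.1.1"},
    {"site": "STORE102", "wan_ip": "10.10.2.1"},
]

for variables in sites:
    print(Template(TEMPLATE).render(**variables))
    print("!")  # separator between rendered configurations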
1.2.2 Workload mobility
Workloads sometimes need to be redistributed within a cloud environment. For example, with four hosts in a
cluster, one of them might be performing more than 50% of the computationally-expensive work while
the others are underutilized. The ability to move these workloads is an important capability.
It is important to understand that workload mobility is not necessarily the same thing as VM mobility.
For example, a workload’s accessibility can be abstracted using anycast while the application exists in
multiple availability zones (AZ) spread throughout the cloud provider’s network. Using Domain Name
System (DNS), different application instances can be utilized based on geographic location, time of day,
etc. The VMs have not actually moved but the resource performing the workload may vary.
Although this concept has been around since the initial virtualization deployments, it is even more
relevant in cloud, since the massively scalable and potentially distributed nature of that environment is
abstracted into a single “cloud” entity. Using the cluster example from above, those 4 hosts might not
even be in the same DC, or even within the same cloud provider (as could be the case with Inter-cloud).
The concept of workload mobility needs to be extended to large scale; note that this doesn’t necessarily
imply layer-2 extensions across the globe. It simply implies that the workload needs to be moved or
distributed differently, which can be solved with geographically-based anycast solutions, for example.
Other protocols and languages, such as NETCONF and YANG, also help automate/simplify network
management indirectly. NETCONF (RFC6241) is the protocol by which configurations are installed and
changed. YANG (RFC6020) is the modeling language used to represent device configuration and state;
the modeled data itself is commonly encoded in eXtensible Markup Language (XML). Put simply, NETCONF is the transport vessel for YANG-modeled
information to be transferred from a network management system (NMS) to a network device. Although
YANG can be quite complex for humans, it is similar to SNMP in that it is simple for machines. YANG is an
abstraction away from network device CLIs which promotes simplified management in cloud
environments and a progressive migration toward one of the SDN models discussed later in this
document. Devices that implement NETCONF/YANG provide a uniform manageability interface which
means vendor hardware/software can be swapped in a network without affecting the management
architecture, operations, or strategy. NETCONF is to YANG as HTTP is to HTML.
YANG defines how data is structured/modeled rather than containing data itself. Below is a snippet from RFC 6020, which defines YANG (section 4.2.2.1). The YANG model defines a “host-name”
field as a string (array of characters) with a human-readable description. Pairing YANG with NETCONF,
the XML syntax references the data field by its name to set a value.
YANG Example:

leaf host-name {
    type string;
    description "Hostname for this system";
}

NETCONF XML Example:

<host-name>my.example.com</host-name>
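As a sketch of how NETCONF carries such a value to a device, the Python snippet below uses the ncclient library to send the host-name leaf inside an edit-config operation; the device address, credentials, and XML namespace are placeholders that vary by platform and data model:

from ncclient import manager

# The YANG "host-name" leaf expressed as a NETCONF XML payload (namespace is a placeholder)
CONFIG = """
<config>
  <system xmlns="urn:example:system">
    <host-name>my.example.com</host-name>
  </system>
</config>
"""

with manager.connect(host="10.0.0.10", port=830, username="admin",
                     password="secret", hostkey_verify=False) as m:
    m.edit_config(target="running", config=CONFIG)  # install the configuration change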
There are many options for storing data (as opposed to modeling it or defining its composition). One such option is
YAML Ain’t Markup Language (YAML). It addresses a problem similar to YANG’s since it is primarily used for
configuration files, but generally contains a subset of functionality as it was specifically designed to be
simpler. Below is an example of a YAML configuration, most likely to be used as input for a provisioning
script or something similar. Note that YAML documents conventionally begin with "---" and may optionally end with "...".
---
vrf: "customer1"
nexthop: "10.0.0.1"
devices:
  - 1:
      net: "192.168.0.0"
      mask: "255.255.0.0"
      state: "present"
  - 2:
      net: "172.16.0.0"
      mask: "255.240.0.0"
      state: "absent"
...
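As a quick illustration of how a provisioning script might consume the file above, the snippet below parses it with the PyYAML library (one common choice) into native Python structures; the file name is a placeholder:

import yaml  # PyYAML

with open("provision.yml") as f:       # the YAML document shown above
    data = yaml.safe_load(f)

print(data["vrf"])                      # customer1
for device in data["devices"]:          # each list entry is a one-key mapping
    for device_id, params in device.items():
        print(device_id, params["net"], params["state"])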
JavaScript Object Notation (JSON) is another data format that is similar to YAML in concept.
It was designed to be simpler than traditional markup languages and uses key/value pairs to store
information. The “value” of a given pair can be another key/value pair, which enables hierarchical data
nesting. The key/value pair structure and syntax is very similar to the “dictionary” data type in Python.
JSON is even lighter than YAML and is also commonly used for maintaining configuration files. Below is a syntax example of JSON which represents the same data and structure as the YAML example.
"vrf": "customer1",
"nexthop": "10.0.0.1",
"devices": {
"1": {
"net": "192.168.0.0",
"mask": "255.255.0.0",
"state": "present"
},
"2": {
"net": "172.16.0.0",
"mask": "255.240.0.0",
"state": "absent"
Data structured in XML is very common and has been popular for decades. XML is very verbose and
explicit, relying on starting and ending tags to identify the size/scope of specific data fields. Below is an
example of XML that resembles the structure of the previous YAML and JSON examples.
<stuff>
  <process>update routes</process>
  <vrf>customer1</vrf>
  <nexthop>10.0.0.1</nexthop>
  <devices>
    <device id="1">
      <net>192.168.0.0</net>
      <mask>255.255.0.0</mask>
      <state>present</state>
    </device>
    <device id="2">
      <net>172.16.0.0</net>
      <mask>255.240.0.0</mask>
      <state>absent</state>
    </device>
  </devices>
</stuff>
YANG isn’t directly comparable with YAML, JSON, and XML because it solves a different problem. If any
one of these languages solved all of the problems, then the others would not exist. Understanding the
business drivers and the problems to be solved using these tools is the key to choosing the right one.
Troubleshooting a cloud network is often reliant on real-time network analytics. Collecting network
performance statistics is not a new concept, but designing software to intelligently parse, correlate, and
present the information in a human-readable format is becoming increasingly important to many
businesses. With a good analytics engine, the NMS can move/provision flows around the network
(assuming the network is both disaggregated and programmable) to resolve any problems. For problems
that cannot be resolved automatically, the issues are brought to the administrator’s attention using
these engines. The administrator can use other troubleshooting tools or NMS features to isolate and
repair the fault. Sometimes these analytics tools will export reports in YAML, JSON, or XML, which can
be archived for reference. They can also be fed into in-house scripts for additional, business-specific
analysis.
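As an example of that last point, an in-house script might consume an exported report and apply a business-specific threshold; the report layout and the 5% loss figure below are purely illustrative assumptions:

import json

LOSS_THRESHOLD = 5.0   # business-specific: flag links with more than 5% packet loss

with open("analytics_report.json") as f:     # hypothetical export from the analytics engine
    report = json.load(f)

# Assume the report carries a list of per-link measurements
for link in report.get("links", []):
    if link.get("packet_loss_pct", 0) > LOSS_THRESHOLD:
        print(f"Investigate {link['name']}: {link['packet_loss_pct']}% loss")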
prompts and user interfaces, such as GUIs, CLIs, etc.
The fundamental idea is to change the way IT is consumed (including compute, storage, and network).
The value proposition of this change includes increasing efficiency (peak of sums, not sum of peaks) and
on-demand elastic provisioning (faster engineering processes). For cost reduction in both CAPEX and
OPEX, the cost models generally resemble “pay for what you use”. A customer can lease the space from
a public cloud provider for a variable amount of time. In some cases, entire IT shops might migrate to a
public cloud indefinitely. In others, a specific virtual workload may need to be executed one time for 15
minutes in the public cloud since some computationally-expensive operations may take too long in the
on-premises DC. “Cloud bursting” is an example of utilizing a large amount of cloud resources for a very
short period of time, perhaps to reduce/compress a large chunk of data, which is a one-time event.
OpenStack releases are scheduled every 6 months and many vendors from across the stack contribute
to the code. The entire goal is to have an open-source cloud computing platform; while it may not be as
feature-rich as large-scale public cloud implementations, it is considered a viable and stable alternative.
OpenStack is composed of multiple projects (improved maintainability) which follow a basic process:
OpenStack has approximately 18 components which are discussed briefly below. The components have
code-names for quick reference within the OpenStack community; these are included in parentheses.
Many of the components are supplementary and are not part of core OpenStack deployments, but can
add value for specific cloud needs. Note that OpenStack compares directly to existing public cloud
solutions offered by large vendors, except is open source with all code being available online.
a. Compute (Nova): Fabric controller (the main part of an IaaS system). Manages pools of compute
resources. A compute resource could be a VM, container, or bare metal server. Side note:
Containers are similar to VMs except they share a kernel. They are otherwise independent, like
VMs, and are considered a lighter-weight yet secure alternative to VMs.
b. Networking (Neutron): Manages networks and IP addresses. Ensures the network is not a
bottleneck or otherwise limiting factor in a production environment. This is technology-agnostic
network abstraction which allows the user to create custom virtual networks, topologies, etc.
For example, virtual network creation includes adding a subnet, gateway address, DNS, etc.
c. Block Storage (Cinder): Manages creation, attaching, and detaching of block storage devices to
servers. This is not an implementation of storage itself, but provides an API to access that
storage. Many storage appliance vendors often have a Cinder plug-in for OpenStack integration;
this ultimately abstracts the vendor-specific user interfaces from the management process.
Storage volumes can be detached and moved between instances (an interesting form of file
transfer, for example) to share information and migrate data between projects.
d. Identity (Keystone): Directory service contains users mapped to services they can access.
Somewhat similar to group policies applied in corporate deployments. Tenants are stored here
which allows them to access resources/services within OpenStack; commonly this is access to
the OpenStack Dashboard (Horizon) to manage an OpenStack environment.
e. Image (Glance): Provides discovery, registration, and retrieval of virtual machine images. It
supports a RESTful API to query image metadata and the image itself.
f. Object Storage (Swift): Storage system with built-in data replication and integrity. Objects and
files are written to disk using this interface which manages the I/O details. Scalable and resilient
storage for all objects like files, photos, etc. This means the customer doesn’t have to deploy a
block-storage solution themselves, then manage the storage protocols (iSCSI, NFS, etc).
g. Dashboard (Horizon): The GUI for administrators and users to access, provision, and automate
resources. The dashboard is based on Python Django framework and is layered on top of service
APIs. Logging in relies on Keystone for identity management which secures access to the GUI.
The dashboard supports different tenants (business units, groups/teams, customers, etc) with
separate permissions and credentials; this is effectively role-based access control. The GUI
provides the most basic/common functionality for users without needing CLI access, which is
supported for advanced functions. The “security group” construct is used to enforce access
control (often need to configure this before being able to access the new instances).
h. Orchestration (Heat): Orchestrates cloud applications via templates using a variety of APIs.
i. Workflow (Mistral): Manages user-created workflows (triggered manually or by some event).
j. Telemetry (Ceilometer): Provides a Single Point of Contact for billing systems used within the
cloud environment.
k. Database (Trove): This is a Database-as-a-service provisioning engine.
l. Elastic Map Reduce (Sahara): Automated way to provision Hadoop clusters, like a wizard.
m. Bare Metal (Ironic): Provisions bare metal machines rather than virtual machines.
n. Messaging (Zaqar): Cloud messaging service for web developers (full RESTful API) used to
communicate between SaaS and mobile applications.
o. Shared File System (Manila): Provides an API to manage shares in a vendor agnostic fashion
(create, delete, grant/deny access, etc).
p. DNS (Designate): Multi-tenant REST API for managing DNS (DNS-as-a-service).
q. Search (Searchlight): Provides search capabilities across various cloud services and is being
integrated into the Dashboard. Searching for compute instance status and storage system
names are common use cases for administrators.
r. Key Manager (Barbican): Provides secure storage, provisioning, and management of secrets
(passwords).
The key components of OpenStack and their interactions are depicted in an architecture diagram whose source is included in the references (it was not created by the author). That diagram shows a near-minimal OpenStack deployment with respect to the number of services depicted. At the time of this writing, and according to OpenStack’s Q&A forum, the minimum services required appear to be Nova, Keystone, Glance, and Horizon. Such a deployment would not have any networking or remote storage support, but could be used by developers looking to run code on OpenStack compute instances in isolation.
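To tie a few of these components together, the sketch below uses the openstacksdk Python library (one possible client) to boot a compute instance: Keystone authenticates the call, Glance supplies the image, Neutron the network, and Nova the server. The cloud name, image, flavor, and network names are placeholders:

import openstack  # openstacksdk

conn = openstack.connect(cloud="mycloud")        # credentials resolved via clouds.yaml (Keystone)

image = conn.image.find_image("cirros")           # Glance image lookup
flavor = conn.compute.find_flavor("m1.small")     # Nova flavor lookup
network = conn.network.find_network("private")    # Neutron network lookup

server = conn.compute.create_server(
    name="demo-instance",
    image_id=image.id,
    flavor_id=flavor.id,
    networks=[{"uuid": network.id}],
)
server = conn.compute.wait_for_server(server)     # block until the instance is ACTIVE
print(server.status)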
Those who don’t need to create and operate their own private clouds may be inclined to use a well-
known and trusted public cloud provider. At the time of this writing, the three most popular cloud
providers are Amazon Web Services (AWS), Microsoft Azure, and Google Cloud Platform (GCP). The
table below compares the OpenStack components to their counterparts in the aforementioned public
cloud providers. This chart is the result of the author’s personal research and will likely change over time as these cloud providers modify their cloud offerings.
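Each provider exposes its services through its own APIs and SDKs. As a small example, the snippet below uses AWS’s boto3 library to list EC2 instances (EC2 being roughly the counterpart of Nova); the region is a placeholder and credentials are assumed to be configured outside the script:

import boto3  # AWS SDK for Python

ec2 = boto3.client("ec2", region_name="us-east-1")   # placeholder region

# Print instance IDs and states; similar in spirit to listing Nova instances on OpenStack
response = ec2.describe_instances()
for reservation in response["Reservations"]:
    for instance in reservation["Instances"]:
        print(instance["InstanceId"], instance["State"]["Name"])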
1.3 Resources and References
Cisco Cloud Homepage
Cisco Cloud Whitepaper
Wikipedia OpenStack Components
Cisco Cloud Overview
Unleashing IT (Cisco)
OpenStack Homepage
Understanding Cisco Cloud Fundamentals
Designing Networks and Services for the Cloud
OpenStack Whitepaper (Third Party)
Jinja2 Template Language
RFC6020 - YANG
2. Network programmability [SDN]
Software-Defined Networking (SDN) is the concept that networks can be both programmable and
disaggregated concurrently, ultimately providing additional flexibility, intelligence, and customization for
the network administrators. Because the definition of SDN varies so widely within the network
community, it should be thought of as a continuum of different models rather than a single, prescriptive
solution.
There are four main SDN models as defined in “The Art of Network Architecture: Business-Driven Design”
by Russ White and Denise Donohue (Cisco Press 2015). The models are discussed briefly below.
a. Distributed: Although not really an “SDN” model at all, it is important to understand the status
quo. Network devices today each have their own control-plane components which rely on
distributed routing protocols (such as OSPF, BGP, etc). These protocols form paths in the
network between all relevant endpoints (IP prefixes, etc). Devices typically do not influence one
another’s routing decisions individually as traffic is routed hop-by-hop through the network
without centralized oversight. This model totally distributes the control-plane across all devices.
Such control-planes are also autonomous; with minimal administrative effort, they often form
neighborships and advertise topology and/or reachability information. Some of the drawbacks
include potential routing loops (typically transient ones during periods of convergence) and
complex routing schemes in poorly designed/implemented networks. The diagram below
depicts several routers each with their own control-plane and no centralization.
b. Augmented: This model retains a fully distributed control-plane while adding a centralized
controller that can augment forwarding decisions at specific points in the network, such as
points of presence (POPs) in an SP core. Cisco’s Performance Routing (PfR) is an example of the
augmented model. Another example includes offline path computation element (PCE) servers for
automated MPLS TE tunnel creation. In both cases, a small set of routers (PfR border routers or
TE tunnel head-ends) are modified, yet the remaining routers are untouched. This model has a
lower impact on the existing network because the wholesale failure of the controller simply
returns the network to the distributed model, which is known to work “well enough” in many
cases. The augmented model is depicted below.
[Figure: augmented model. The PfR MC pushes policy only to the network edge.]
c. Hybrid: This model is very similar to the augmented model except that controller-originated
policy can be imposed anywhere in the network. This gives additional granularity to network
administrators; the main benefit over the augmented model is that the hybrid model is always
topology-independent. The controller can overwrite the forwarding table of any device, which
means that topological restrictions are removed. Cisco’s Application Centric Infrastructure (ACI)
is a good example of this model. ACI separates reachability from policy, which is critical from
both survivability and scalability perspectives. The failure of the centralized controller in these
models has an identical effect to that of a controller in the augmented model; the network falls
back to a distributed control-plane model. The impact of a failed controller is a little more
significant since more devices are affected by the controller’s policy. A diagram is shown on the
following page.
[Figure: hybrid model. The ACI APIC can push policy anywhere in the network.]
d. Centralized: This is the model most commonly referenced when the phrase “SDN” is used. It
relies on a single controller, which hosts the entire control-plane. Ultimately, this device
commands all of the devices in the forwarding-plane. The controller pushes the forwarding
tables, populated with the proper information (which doesn’t necessarily have to be an IP-based
table; it could be anything), to the forwarding hardware as specified by the administrators. This offers
very granular control, in many cases, of individual flows in the network. The hardware
forwarders can be commoditized into white boxes (or branded white boxes, sometimes called
brite boxes) which are often inexpensive. Another value proposition of centralizing the control-
plane is that a “device” can be almost anything: router, switch, firewall, load-balancer, etc.
Emulating software functions on generic hardware platforms can add flexibility to the business.
The most significant drawback is the newly-introduced single point of failure and the inability to
create failure domains as a result. Some SDN scaling architectures suggest simply adding
additional controllers for fault tolerance or to create a hierarchy of controllers for larger
networks. While this is a valid technique, it somewhat invalidates the “centralized” model
because with multiple controllers, the distributed control-plane is reborn. The controllers still
must synchronize their routing information using some network-based protocol and the
possibility of inconsistencies between the controllers is real. When using this multi-controller
architecture, the network designer must understand that there is, in fact, a distributed control-plane in the network; it has just been moved around. The failure of all controllers means the
entire failure domain supported by those controllers will be inoperable. The failure of the
communication paths between controllers could likewise cause inconsistent/intermittent
problems with forwarding, just like a fully distributed control-plane. OpenFlow is a good
example of a fully-centralized model.
[Figure: centralized model. An OpenFlow controller controls everything.]
2.1 Describe functional elements of network programmability (SDN) and how they
interact
2.1.1 Controllers
As discussed briefly above, “controllers” are components that are responsible for programming
forwarding tables of data-plane devices. Controllers themselves could even be routers, like Cisco’s PfR
operating as a master controller (MC), or they could be software-only appliances, as seen with
OpenFlow networks or Cisco’s Application Policy Infrastructure Controller (APIC) used with ACI. The
models discussed above help detail the significance of the controller; this is entirely dependent on the
deployment model. The more involved a controller is, the more flexibility the network administrator
gains. This must be weighed against the increased reliance on the controller itself.
A well-known example of an SDN controller is OpenDaylight (ODL). ODL is commonly used as the SDN
controller for OpenFlow deployments. OpenFlow is the communications protocol between ODL and the
data-plane devices responsible for forwarding packets (southbound). ODL communicates with business
applications via APIs so that the network can be programmed to meet the requirements of the
applications (northbound).
It is worth discussing a few of Cisco’s solutions in this section as they are both popular with customers
and relevant to Cisco’s vision of the future of networking. Cisco’s Intelligent WAN (IWAN) is an
evolutionary strategy to bring policy abstraction to the WAN to meet modern design requirements, such
as path optimization, cost reduction via commodity transport, and transport independence. IWAN has
several key components:
1. Dynamic Multipoint Virtual Private Network (DMVPN): This feature is a multipoint IP tunneling
mechanism that allows sites to communicate to a central hub site without the hub needing to
configure every remote spoke. Some variants of DMVPN allow for direct spoke-to-spoke traffic
exchange using a reactive control-plane used to map overlay and underlay addresses into pairs.
DMVPN can provide transport independence as it can be used as an overlay atop the public
Internet, private WANs (MPLS), or any other transport that carries IP.
2. IP Service Level Agreement (IP SLA): This feature is used to synthesize traffic to match
application flows on the network. By sending probes that look like specific applications, IWAN
can test application performance and make adjustments. This is called “active monitoring”.
3. Netflow: Like IP SLA, Netflow is used to measure the performance of specific applications across
an IWAN deployment, but does so without sending traffic. These measures can be used to
approximate bandwidth utilization, among other things. This is called “passive monitoring”.
4. IP Routing: Although not a new feature, some kind of overlay routing protocol is still needed.
One of IWAN’s greatest strengths is that it can still rely on IP routing for a subset of flows, while
choosing to optimize others. A total failure of the IWAN “intelligence” constructs will allow the
WAN to fall back to classic IP routing, which is a known-good design and guaranteed to work.
For this reason, existing design best practices and requirements gathering cannot be skipped
when IWAN is deployed as these decisions can have significant business impacts.
5. Performance Routing (PfR): PfR is the glue of IWAN that combines all of the aforementioned
features into a comprehensive and functional system. It enhances IP routing in a number of
ways:
a. Adjusting routing attributes, such as BGP local-preference, to prefer certain paths
b. Injecting longer matches to prefer certain paths
c. Installing dynamic route-maps for policy-routing when application packets are to be
forwarded based on something other than their destination IP address
When PfR is deployed, PfR speakers are classified as master controllers (MC) or border routers
(BR). MCs are the SDN “controllers” where policy is configured and distributed. The BRs are
relatively unintelligent in that they consume commands from the MC and apply the proper
policy. There can be a hierarchy of MC/BR as well to provide greater availability for remote sites
that lose WAN connectivity. MCs are typically deployed in a stateless HA pair using loopback
addresses with variable lengths; the MCs typically exist in or near the corporate data centers.
Cisco’s IWAN wiki page contains a high-level drawing of how IWAN works. IWAN is generally positioned as an SD-WAN solution, a way to connect branch offices to HQ locations such as data centers.
Another common Cisco SDN solution is Application Centric Infrastructure (ACI). As discussed earlier, ACI
separates policy from reachability and could be considered a hybrid SDN solution, much like IWAN. ACI
is more revolutionary than IWAN as it reuses less technology and relies on custom hardware and
software. Specifically, ACI is supported on the Cisco Nexus 9000 product line using a version of software
specific to ACI. This differs from the original NX-OS which is considered a “standalone” or “non-ACI”
deployment of the Nexus 9000. Unlike IWAN, ACI is positioned within the data center itself.
The creation of ACI, to include its complement of customized hardware, was driven by a number of
factors (not a comprehensive list):
1. Software alone cannot solve the migration of 1Gbps to 10Gbps in the server access layer or
10Gbps to 40Gbps/100Gbps in the DC aggregation and core layers.
2. The overall design of the DC has to change to better support east/west traffic flows being
generated by distributed, multi-tiered applications.
3. Rapid service deployment for internal IT consumers in a secure and scalable way. This prevents
individuals from going “elsewhere” when enterprise IT providers cannot meet their needs.
4. Central management isn’t a new thing, but has traditionally failed as network devices did not
have machine-friendly interfaces since they were often configured directly by humans. Such
interfaces are called Application Programming Interfaces (APIs), which are discussed later in this document.
The controller used for ACI is known as the Application Policy Infrastructure Controller (APIC). Cisco’s
approach in developing this controller was different from the classic thought process of “the controller
needs to get a packet from point A to point B”. Networks have traditionally done this job well. Instead,
APIC focuses on “when the packets can move and what happens when they do” (quoting Cisco). That is
to say, under what policy conditions a packet should be forwarded, dropped, or rerouted over an alternative
link, etc. Packet forwarding continues in the distributed control-plane model as discussed before, but
the APIC is able to configure any node in the network with specific policy, to include security policy, or to
enhance/modify any given flow in the data center. Policy is retained in the nodes even in the event of a
controller failure, but policy can only be modified by the APIC.
The ACI infrastructure is built on a leaf/spine network fabric which has a number of interesting
characteristics:
1. Adding bandwidth is achieved by adding spines and linking the leaves to them.
2. Adding access density is achieved by adding leaves and linking them to the spines.
3. “Border” leaves are identified as the egress point from the fabric, not the spines.
4. Nothing connects to the spines other than leaves, and spines are never connected laterally.
This architecture is not new (in general), but is becoming popular in DC fabrics for its superior
distribution of bandwidth for north/south and east/west DC flows. The significant advantage of this
design for any SDN DC solution is that it is universally useful; other SDN vendors in the DC space typically
prefer that the underlying architecture look this way. This topology need not change even as the APIC
policies change significantly since it is designed only for high-speed transport. In an ACI network, the
network makes no attempt to automatically classify, treat, and prioritize specific applications absent
input from the user (via APIC). That is to say, it is both cost-prohibitive and error-prone for the network
to make such a classification when the business drivers (i.e., human input) are what drive the
prioritization, security policy, and other treatment characteristics of a given application.
Policy applied to the APIC is applied to the network using several constructs:
1. Application Network Profile: Logical template for how the application connects and works. All
tiers of a given application are encompassed by this profile. The profile can contain multiple policies
which are applied between the components of an application. These policies can define things like
QoS, availability, and security requirements. This “declarative” model is intuitive and is application-
focused. The policy also follows applications across the DC as they migrate for mobility purposes.
2. Endpoint Groups (EPG): EPGs are designed to group elements together that share a common
policy. Consider a classic three-tier application. All web servers would be in one EPG, while the
application servers would be a second. The database servers would be in a third. The policy
application would occur between these endpoint groups. Components are placed into groups based
on any number of fields, such as VLAN, IP address, port (physical or layer-4), and other fields.
3. Contracts: These items determine the types of traffic (and their treatment) between EPGs. The
contract is the application of policy between EPGs which effectively represents an agreement
between two entities to exchange information. Contracts are aptly named as it is not possible to
violate a contract; this is enforced by the APIC policies, which are driven by business requirements.
The following page depicts a high level image of the ACI infrastructure provided by Cisco.
Cisco’s Campus Fabric is a main component of the Digital Network Architecture (DNA), a major Cisco
networking initiative. Campus Fabric relies on a VXLAN-based data plane, encapsulating traffic at the
edges of the fabric inside IP packets to provide L2VPN and L3VPN service. Security Group Tags (SGT),
Quality of Service (QoS) markings, and the VXLAN Virtual Network ID (VNI) are all carried in the VXLAN
header, giving the underlay network some ability to apply policy to transit traffic. Campus Fabric was
designed with mobility, scale, and performance in mind.
The solution uses the Locator/ID Separation Protocol (LISP) as its control-plane. LISP is like a
combination of DNS and NHRP as a mapping server binds endpoint IDs (EIDs) to routing locations
(RLOCs) in a centralized manner. Like NHRP, LISP is a reactive control plane whereby EIDs are exchanged
between endpoints via “conversational learning”. That is to say, edge nodes don’t retain all state at all
times, but rather only when it is needed. The initial setup of communications between two nodes when
the state is absent can take some time as LISP converges. Unlike DNS, the LISP mapping server does not
reply directly to LISP edge nodes as such a reply is not a guarantee that two edge nodes can actually
communicate. The LISP mapping server forwards the request to the remote edge node authoritative for
a given EID, which generates the response. This behavior is similar to how NHRP works in DMVPN
phases 2 and 3 when spoke-to-spoke tunnels are dynamically built.
Campus Fabric offers separation using both policy-based segmentation via Security Group Tags (SGT)
and network-based segmentation via VXLAN/LISP. These options are not mutually exclusive and can be
deployed together for even better separation between virtual networks. Extending virtual networks
outside of the fabric is done using VRF-Lite in an MPLS Inter-AS Option A fashion, effectively extending
the virtual networks without merging the control-planes. This architecture can be thought of like an SD-
LAN although Cisco (and the industry in general) do not use this term. The IP routed underlay is kept
simple with complex, tenant-specific overlays added on top according to the business needs.
2.1.2 APIs
An Application Programmability Interface (API) is meant to define a standard way of interfacing with a
software application or operating system. It may consist of functions (methods, routines, etc), protocols,
system call constructs, and other “hooks” for integration. Both the controllers and business applications
would need the appropriate APIs revealed for integration between the two. This makes up the
northbound communication path as discussed in section 2.1.5. By creating a common API for
communications between controllers and business applications, either one can be changed at any time
without significantly impacting the overall architecture.
A common API that is discussed within the networking world is the Representational State Transfer
(REST) API. REST represents an “architectural style” of transferring information between clients and
servers. In essence, it is a way of defining attributes or characteristics of how data is moved. REST is
commonly used with HTTP by combining traditional HTTP methods (GET, POST, PUT, DELETE, etc) and
Uniform Resource Identifiers (URIs). The end result is that API requests look like URIs and are used to
fetch/write specific pieces of data to a target machine. This simplification helps promote automation,
especially for web-based applications or services. Note that HTTP is stateless which means the server
does not store session information for individual flows; REST API calls retain this stateless functionality
as well. This allows for seamless REST operation across HTTP proxies and gateways.
Because APIs are such a critical topic, this document details two new APIs present in Cisco’s IOS XE
software: the REST API and RESTCONF. The detailed walkthroughs for each API are shown below.
(Partial output of curl --version on the Linux client, showing its supported features: AsynchDNS GSS-Negotiate IDN IPv6 Largefile NTLM NTLM_WB SSL libz unix-sockets)
First, the basic configuration to enable the REST API feature on IOS XE devices is shown below. A brief
verification shows that the feature is enabled and uses TCP port 55443 by default. This port number is
important later as the curl command will need to know it.
interface GigabitEthernet1
 description MGMT INTERFACE
 ip address dhcp
 ! or a static IP address
virtual-service csr_mgmt
 ip shared host-interface GigabitEthernet1
 activate
ip http secure-server
transport-map type persistent webui HTTPS_WEBUI
 secure-server
transport type persistent webui input HTTPS_WEBUI
remote-management
 restful-api
Using curl for IOS XE REST API invocations requires a number of options. Those options are summarized
below. They are also described in the man pages for curl. This specific demonstration will be limited to
obtaining an authentication token, posting a QoS class-map configuration, and verifying that it was
written.
-d: sends the specified data in an HTTP POST request
-k: insecure. This allows curl to accept certificates not signed by a trusted CA. For testing purposes, this is
required to accept the router's self-signed certificate. It is not a good idea to use it in production
networks.
-3: forces curl to use SSLv3 for the transport to the managed device. This can be detrimental and should
be used cautiously (discussed later).
The first step is obtaining an authentication token. This allows the HTTPS client to supply authentication
credentials once, such as a username/password pair, and then use the token for authentication on all
future API calls. The initial attempt at obtaining this token fails. This is a common error, so the
troubleshooting used to resolve the issue is described in this document. The two HTTPS endpoints cannot
communicate because they do not support a common cipher suite. Note that it is critical to specify the REST API
port number (55443) in the URL, otherwise the standard HTTPS server will respond on port 443 and the
request will fail.
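For reference, a reconstructed version of the initial token request is shown below. The host name, port, URI path, and ansible credentials match the transcripts later in this section; the exact combination of flags is an assumption.

curl -v -k -3 -X POST -u ansible:ansible https://csr1:55443/api/v1/auth/token-services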
Sometimes installing or updating the client's SSL/TLS-related packages can solve the issue. In this case, these updates did
not help.
If that fails, curl the following website. It will return a JSON listing of all ciphers supported by your
current HTTPS client. Piping the output into “jq”, a popular utility for querying JSON structures, pretty-
prints the JSON output for human readability.
[root@ip-10-125-0-100 restapi]# curl https://www.howsmyssl.com/a/check | jq
% Total % Received % Xferd Average Speed Time Time Time
Current
Dload Upload Total Spent Left Speed
100 1417 100 1417 0 0 9572 0 --:--:-- --:--:-- --:--:--
9639
{
"given_cipher_suites": [
"TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384",
"TLS_ECDHE_ECDSA_WITH_AES_256_CBC_SHA",
"TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256",
"TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA",
"TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384",
"TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA",
"TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256",
"TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA",
"TLS_DHE_RSA_WITH_AES_256_GCM_SHA384",
"TLS_DHE_RSA_WITH_AES_256_CBC_SHA",
"TLS_DHE_DSS_WITH_AES_256_CBC_SHA",
"TLS_DHE_RSA_WITH_AES_256_CBC_SHA256",
"TLS_DHE_RSA_WITH_AES_128_GCM_SHA256",
"TLS_DHE_RSA_WITH_AES_128_CBC_SHA",
"TLS_DHE_DSS_WITH_AES_128_CBC_SHA",
"TLS_DHE_RSA_WITH_AES_128_CBC_SHA256",
"TLS_DHE_RSA_WITH_3DES_EDE_CBC_SHA",
"TLS_DHE_DSS_WITH_3DES_EDE_CBC_SHA",
"TLS_RSA_WITH_AES_256_GCM_SHA384",
"TLS_RSA_WITH_AES_256_CBC_SHA",
"TLS_RSA_WITH_AES_256_CBC_SHA256",
"TLS_RSA_WITH_AES_128_GCM_SHA256",
"TLS_RSA_WITH_AES_128_CBC_SHA",
"TLS_RSA_WITH_AES_128_CBC_SHA256",
"TLS_RSA_WITH_3DES_EDE_CBC_SHA",
"TLS_RSA_WITH_RC4_128_SHA",
"TLS_RSA_WITH_RC4_128_MD5"
],
"ephemeral_keys_supported": true,
"session_ticket_supported": false,
"tls_compression_supported": false,
"unknown_cipher_suite_supported": false,
"beast_vuln": false,
"able_to_detect_n_minus_one_splitting": false,
"insecure_cipher_suites": {
"TLS_RSA_WITH_RC4_128_MD5": [
"uses RC4 which has insecure biases in its output"
],
"TLS_RSA_WITH_RC4_128_SHA": [
"uses RC4 which has insecure biases in its output"
]
},
"tls_version": "TLS 1.2",
"rating": "Bad"
}
The utility "sslscan" can help find the problem. The issue is that the CSR1000v only supports the TLSv1
variants of the ciphers, not the SSLv3 variants. The curl command issued above forced curl to use SSLv3
with the "-3" option, as prescribed by the documentation. This is a minor error in the documentation
which has been reported and may be fixed at the time of your reading. This troubleshooting excursion is
likely to have value for those learning about REST APIs on IOS XE devices in a general sense, since
establishing HTTPS transport is a prerequisite.
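As a hedged illustration (assuming the sslscan utility is installed on the Linux client), a command like the one below reveals which cipher variants the router accepts; grep is used on the RC4 cipher only to shorten the output, as explained in the next paragraph.

sslscan csr1:55443 | grep RC4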
Removing the "-3" option fixes the issue; when the curl command is issued again, the
"TLS_DHE_RSA_WITH_AES_256_CBC_SHA" cipher is chosen for the connection. Using "sslscan" was still
useful: filtering its output with grep for the RC4 cipher (chosen only to shorten the output), one can see
that the TLSv1 variant was accepted while the SSLv3 variant was rejected, suggesting a lack of SSLv3
cipher support on the router. Below is the correct output from a successful curl.
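A reconstruction of the corrected command, identical to the earlier attempt but without the "-3" flag, is shown below; again, the exact flags are an assumption.

curl -v -k -X POST -u ansible:ansible https://csr1:55443/api/v1/auth/token-services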
* issuer: CN=restful_api,ST=California,O=Cisco,C=US
* Server auth using Basic with user 'ansible'
> POST /api/v1/auth/token-services HTTP/1.1
> Authorization: Basic YW5zaWJsZTphbnNpYmxl
> User-Agent: curl/7.29.0
> Host: csr1:55443
> Accept:application/json
> Content-Length: 0
> Content-Type: application/x-www-form-urlencoded
>
< HTTP/1.1 200 OK
< Server: nginx/1.4.2
< Date: Sun, 07 May 2017 16:35:18 GMT
< Content-Type: application/json
< Content-Length: 200
< Connection: keep-alive
<
* Connection #0 to host csr1 left intact
{"kind": "object#auth-token", "expiry-time": "Sun May 7 16:50:18 2017",
"token-id": "YGSBUtzTpfK2QumIEk8dt9rXhHjZfAJSZXYXDXg162Q=", "link":
"https://csr1:55443/api/v1/auth/token-services/6430558689"}
The final step is using an HTTPS POST request to write new data to the router. One can embed the JSON
text as a single line into the curl command using the -d option. The command appears intimidating at a
glance. Note the single quotes (') surrounding the JSON data with the -d option; these are required since
the keys and values inside the JSON structure have double quotes ("). Additionally, the
username/password is omitted from the request, and additional headers (-H) are applied to include the
authentication token string and the JSON content type.
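A hedged reconstruction of the POST request is shown below. The headers and token match the transcript that follows, the target URL is inferred from the Location header in the response, and the JSON body is illustrative only; the exact field names should be taken from the IOS XE REST API documentation.

TOKEN="YGSBUtzTpfK2QumIEk8dt9rXhHjZfAJSZXYXDXg162Q="
# JSON keys below are illustrative; consult the REST API documentation for the exact schema
curl -v -k -X POST https://csr1:55443/api/v1/qos/class-map \
  -H "X-Auth-Token: $TOKEN" \
  -H "content-type: application/json" \
  -d '{"cmap-name": "CMAP_AF11", "description": "QOS CLASS MAP FROM REST API CALL", "match": "dscp af11"}'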
> Host: csr1:55443
> Accept:application/json
> X-Auth-Token: YGSBUtzTpfK2QumIEk8dt9rXhHjZfAJSZXYXDXg162Q=
> content-type: application/json
> Content-Length: 136
>
* upload completely sent off: 136 out of 136 bytes
< HTTP/1.1 201 CREATED
< Server: nginx/1.4.2
< Date: Sun, 07 May 2017 16:48:05 GMT
< Content-Type: text/html; charset=utf-8
< Content-Length: 0
< Connection: keep-alive
< Location: https://csr1:55443/api/v1/qos/class-map/CMAP_AF11
<
* Connection #0 to host csr1 left intact
This newly-configured class-map can be verified using an HTTPS GET request. The data field is stripped
to the empty string, POST is changed to GET, and the class-map name is appended to the URL. The
verbose option (-v) is omitted for brevity. Writing this output to a file and using the jq utility can be a
good way to query for specific fields. Piping the output to "tee" allows it to be written to the screen and
redirected to a file simultaneously.
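A hedged example of this verification follows; the output filename is arbitrary, and the token is the one obtained earlier.

curl -k -H "X-Auth-Token: $TOKEN" \
  https://csr1:55443/api/v1/qos/class-map/CMAP_AF11 | tee cmap.json | jq '.'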
Logging into the router to verify the request via CLI is a good idea while learning, although using HTTPS
GET verified the same thing.
description QOS CLASS MAP FROM REST API CALL
match dscp af11
end
Enabling RESTCONF requires a single hidden command in global configuration, shown below as simply
“restconf”. This feature is not TAC supported at the time of this writing and should be used for
experimentation only. Additionally, a loopback interface with an IP address and description is
configured. For simplicity, RESTCONF testing will be limited to insecure HTTP to demonstrate the
capability without dealing with SSL/TLS ciphers.
The curl utility is as useful with RESTCONF as it was with the classic REST API. The difference is that the data
retrieval process is more intuitive. First, we query the interface IP address, then the description. Both of
the URLs are simple and the overall curl command syntax is easy to understand. The output comes back
in easy-to-read XML which is convenient for machines that will use this information. Some data is
nested, like the IP address, as there could be multiple IP addresses. Other data, like the description,
need not be nested as there is only ever one description per interface.
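A hedged example of these queries is shown below, assuming plain HTTP as described above. The interface name (Loopback0), the reuse of the ansible credentials, and the exact URI structure are assumptions based on the draft RESTCONF implementation in this IOS XE release; consult the IOS XE RESTCONF documentation for the authoritative paths.

curl -u ansible:ansible http://csr1/restconf/api/config/native/interface/Loopback0/ip/address
curl -u ansible:ansible http://csr1/restconf/api/config/native/interface/Loopback0/description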
<address xmlns="http://cisco.com/ns/yang/ned/ios" xmlns:y="http://tail-
f.com/ns/rest" xmlns:ios="http://cisco.com/ns/yang/ned/ios">
<primary>
<address>172.16.192.168</address>
<mask>255.255.255.255</mask>
</primary>
</address>
This section does not detail other HTTP operations such as POST, PUT, and DELETE using RESTCONF. The
feature is still very new and is tightly integrated with Postman, a tool that generates HTTP requests
automatically.
2.1.3 Scripting
Scripting is a fundamental topic of an effective automation design. Scripting can allow network
administrators to define the policy constraints for a given application in a manner that is consistent and
clear. Other operators can “fork” this script (copy it and start a new version from a common point) and
potentially “merge” the changes back in later. This is how advanced policies can be defined in an SDN-
based architecture. The scripts can be manually applied as discussed earlier with the VLAN-creation
example (section 1.2.1), or they could be assembled in sequence by an orchestrator as part of a larger
workflow or process. In any case, an administrator needs to write the script in the first place, and like all
pieces of code, it must be maintained, tested, versioned, and continuously monitored. Popular
repositories for text-based configuration management include GitHub and Bitbucket, while popular scripting
languages include Python, Ruby, JavaScript, and Perl. Neither of these lists is complete.
To show an example, a Google Codejam solution is shown below. The challenge was finding the minimal
scalar product between two vectors of equal length. The solution is to sort both vectors: one sorted
greatest-to-least, and one sorted least-to-greatest. Then, performing the basic scalar product logic, the
problem is solved. This code is not an exercise in absolute efficiency or optimization as it was written to
be modular and readable. The example below was written in Python 3.5.2.
class VectorPair:
    '''
    Constructor takes in two vectors and the vector length.
    '''
    def __init__(self, v1, v2, n):
        self.v1 = v1
        self.v2 = v2
        self.n = n

    '''
    Given two vectors of equal length, the scalar product is pairwise
    multiplication of values and the sum of all pairwise products.
    '''
    def _resolve_sp(self, v1, v2):
        sp = 0
        # Multiply the vectors pairwise and accumulate the sum
        for x, y in zip(v1, v2):
            sp += x * y
        return sp

    '''
    Given two vectors of equal length, the minimum scalar product is
    the smallest number that exists given all permutations of
    multiplying numbers between the two vectors.
    '''
    def resolve_msp(self):
        # Sort one vector least-to-greatest and the other greatest-to-least,
        # then compute the scalar product of the sorted pair
        return self._resolve_sp(sorted(self.v1), sorted(self.v2, reverse=True))
Version Control
The author maintains a GitHub page with personal code that is free from any copyright protection. This
GitHub account is used to demonstrate a revision control example. Suppose that a change to the Python
script above is required; specifically, a trivial comment change. Checking the Git status first, the
repository is up to date as no changes have been made.
### OPEN THE TEXT EDITOR AND MAKE CHANGES (NOT SHOWN) ###
Nicholass-MBP:min-scalar-prod nicholasrusso$ grep Constructor VectorPair.py
Constructor takes in two vectors and the vector length.
Git status now reports that VectorPair.py has been modified but not added to the set of files to be
committed to the repository. The red text in the output is actually shown in the terminal and is retained
here for completeness.
modified: VectorPair.py
no changes added to commit (use "git add" and/or "git commit -a")
Adding this file to the list of changed files effectively stages it for commit to the repository. The
green text from the terminal indicates this.
modified: VectorPair.py
Next, the file is committed with a comment explaining the change. This command updates only the local
repository, not the GitHub repository. The local branch now deviates from the code everyone else can see,
and Git does not know whether this is intentional (a true branch) or just an inconsistency (the updates
have not yet been copied to the remote repository).
To update the remote repository, the committed updates must be “pushed”. After this is complete, the
Git status utility informs us that there are no longer any changes.
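For clarity, the complete sequence of Git commands used in this example is summarized below; the commit message shown is illustrative.

git status                                  # check for local modifications
git add VectorPair.py                       # stage the modified file for commit
git commit -m "Update constructor comment"  # commit the change to the local repository
git push -u                                 # push the committed change to GitHub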
Nicholass-MBP:min-scalar-prod nicholasrusso$ git push -u
Counting objects: 4, done.
Delta compression using up to 8 threads.
Compressing objects: 100% (4/4), done.
Writing objects: 100% (4/4), 455 bytes | 0 bytes/s, done.
Total 4 (delta 2), reused 0 (delta 0)
remote: Resolving deltas: 100% (2/2), completed with 2 local objects.
To https://github.com/nickrusso42518/google-codejam.git
e8d0c54..74ed39a master -> master
Branch master set up to track remote branch master from origin.
Logging into the GitHub web page, one can verify the changes were successful. At the root directory
containing all of the Google Codejam challenges, the comment added to the last commit is visible.
Looking into the min-scalar-prod directory and specifically the VectorPair.py file, Git clearly displays the
additions/removals from the file. As such, Git is a powerful tool that can be used for scripting, data files
(YAML, JSON, XML, YANG, etc.) and any other text documents that need to be revision controlled.
2.1.4 Agents
Management agents are typically on-box, add-on software components that allow an automation,
orchestration, or monitoring tool to communicate with the managed device. The agent exposes an API
that would have otherwise not been available. On the topic of monitoring, the agents allow the device
to report traffic conditions back to the controller (telemetry). Given this information, the controller can
sense (or, with analytics, predict) congestion, route around failures, and perform all manner of fancy
traffic-engineering as required by the business applications. Many of these agents perform the same
general function as SNMP yet offer increased flexibility and granularity as they are programmable.
Agents could also be used for non-management purposes. Interface to the Routing System (I2RS) is an
SDN technique where a specific control-plane agent is required on every data-plane forwarder. This
agent is effectively the control-plane client that communicates northbound towards the controller. This
is the channel by which the controller consults its RIB and populates the FIB of the forwarding devices.
The same is true for OpenFlow (OF) which is a fully centralized SDN model. The agent can be considered
an interface to a data-plane forwarder for a control-plane SDN controller.
Although not specific to “agents”, there are several common applications/frameworks that are used for
automation. Some of them rely on agents while others do not. Three of them are discussed briefly
below, as these are found in Cisco's "NX-OS DevNet Network Automation Guide". Note that only subsets of
the exact definitions are included here. Since these are third-party products, the author does not want to
misrepresent the facts or capabilities as understood by Cisco.
a. Puppet (by Puppet Labs): The Puppet software package is an open source automation toolset
for managing servers and other resources by enforcing device states, such as configuration
settings. Puppet components include a puppet agent which runs on the managed device (client)
and a puppet master (server) that typically runs on a separate dedicated server and serves
multiple devices. The Puppet master compiles and sends a configuration manifest to the agent.
The agent reconciles this manifest with the current state of the node and updates state based on
differences. A puppet manifest is a collection of property definitions for setting the state on the
device. Manifests are commonly used for defining configuration settings, but they can also be
used to install software packages, copy files, and start services.
In summary, Puppet is agent-based (requiring software installed on the client) and pushes
complex data structures to managed nodes from the master server. Puppet manifests are used
as data structures to track node state and display this state to the network operators. Puppet is
not commonly used for managing Cisco devices as most Cisco products, at the time of this
writing, do not support the Puppet agent. The following products support Puppet today:
1. Cisco Nexus 7000 and 7700 switches running NX-OS 7.3(0)D1(1) or later
2. Cisco Nexus 9300, 9500, 3100, and 3000 switches running NX-OS 7.3(0)I2(1) or later
3. Cisco Network Convergence System (NCS) 5500 running IOS-XR 6.0 or later
4. Cisco ASR9000 routers running IOS-XR 6.0 or later
b. Chef (by Chef Software): Chef is a systems and cloud infrastructure automation framework that
deploys servers and applications to any physical, virtual, or cloud location, no matter the size of
the infrastructure. Each organization is comprised of one or more workstations, a single server,
and every node that will be configured and maintained by the chef-client. A cookbook defines a
scenario and contains everything that is required to support that scenario, including libraries,
recipes, files, and more. A Chef recipe is a collection of property definitions for setting state on
the device. While recipes are commonly used for defining configuration settings, they can also be
used to install software packages, copy files, start services, and more.
In summary, Chef is very similar to Puppet in that it requires agents and manages devices using
complex data structures. The concepts of cookbooks and recipes are specific to Chef (hence the
name) which contribute to a hierarchical data structure management system. A Chef cookbook
is loosely equivalent to a Puppet manifest. Like Puppet, Chef is not commonly used to manage
Cisco devices due to requiring the installation of an agent. Below is a list of platforms
that support being managed by Chef:
1. Cisco Nexus 7000 and 7700 switches running NX-OS 7.3(0)D1(1) or later
2. Cisco Nexus 9300, 9500, 3100, and 3000 switches running NX-OS 7.3(0)I2(1) or later
3. Cisco Network Convergence System (NCS) 5500 running IOS-XR 6.0 or later
4. Cisco ASR9000 routers running IOS-XR 6.0 or later
c. Ansible (by Red Hat): Ansible is an open source IT configuration management and automation
tool. Unlike Puppet and Chef, Ansible is agent-less, and does not require a software agent to be
installed on the target node (server or switch) in order to automate the device. By default,
Ansible requires SSH and Python support on the target node, but Ansible can also be easily
extended to use any API.
In summary, Ansible is somewhat lighter-weight than Puppet or Chef given that management is
agent-less. No custom software needs to be installed on any device provided that it supports
SSH. This can be a drawback since individual device CLIs must be exposed to network operators
(or, at best, the Ansible automation engine) instead of using a more abstract API design. Ansible
is very commonly used to manage Cisco network devices as it requires no agent installation on
the managed devices. Any Cisco device that can be accessed using SSH can be managed by
Ansible. This includes Cisco ASA firewalls, older Cisco ISRs, and older Cisco Catalyst switches.
Network Automation Example using Ansible
The author has deployed Ansible in production and is most familiar with Ansible when compared against
Puppet or Chef. This section will illustrate the value of automation using a simple but powerful script.
These tests were conducted on a Linux machine in Amazon Web Services (AWS) which was targeting a
Cisco CSR1000v. Before beginning, all of the relevant version information is shown below for reference.
Ansible playbooks are collections of plays. Each play targets a specific set of hosts and contains a list of
tasks. In YAML, arrays/lists are denoted with a hyphen (-) character. The first play in the playbook begins
with a hyphen since it’s the first element in the array of plays. The play has a name, target hosts, and
some other minor options. Gathering facts can provide basic information like time and date, which are
used in this script. When connection: local is used, the Python modules that Ansible invokes are
executed on the control machine (Linux) and not on the target. This is required for many Cisco devices
being managed via the CLI.
The first task defines a credentials dictionary. This contains transport information like SSH port (default
is 22), target host, username, and password. The ios_config and ios_command modules, for example,
require this to log into the device. The second task uses the ios_config module to issue specific
commands. The commands will specify the SNMPv3 user/group and update the auth/priv passwords for
that user. For accountability reasons, a timestamp is written to the configuration as well using the
"facts" gathered earlier in the play. The ios_config module also accepts minor options such as save: yes and
match: none. The first option saves the configuration after the commands are issued, while the second
ignores whatever the router already has in its configuration; the commands in the task will forcibly
overwrite whatever is already configured. The changed_when: false option tells Ansible
to always report a status of "ok" rather than "changed" which makes the script "succeed" from an
operations perspective.
- name: "Updating SNMPv3 pre-shared keys"
  hosts: csr1
  gather_facts: yes
  connection: local
  tasks:
    - name: "SYS >> Define router credentials"
      set_fact:
        provider:
          host: "{{ inventory_hostname }}"
          username: "ansible"
          password: "ansible"

    - name: "IOS >> Issue commands to update the passwords, save config"
      ios_config:
        provider: "{{ provider }}"
        commands:
          - "snmp-server user {{ snmp.user }} {{ snmp.group }} v3 auth sha {{ snmp.authpass }} priv aes 256 {{ snmp.privpass }}"
          - "snmp-server contact PASSWORDS UPDATED {{ ansible_date_time.date }} at {{ ansible_date_time.time }}"
        save: yes
        match: none
      changed_when: false
...
The playbook above makes a number of assumptions that have not been reconciled yet. First, one
should verify that "csr1" is defined and reachable. It is configured as a static hostname-to-IP mapping in
the system hosts file. Additionally, it is defined in the Ansible hosts file as a valid host. Last, it is valuable
to ping the host to ensure that it is powered on and responding over the network. The verification for all
aforementioned steps is below.
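A hedged sketch of these verification steps is shown below; the file paths assume a typical Linux host with a default Ansible installation.

grep csr1 /etc/hosts            # static hostname-to-IP mapping on the control machine
grep csr1 /etc/ansible/hosts    # csr1 defined as a valid host in the Ansible inventory
ping -c 3 csr1                  # confirm the router is powered on and responding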
Next, Ansible needs to populate variables for things like snmp.user and snmp.group. Ansible is smart
enough to look for file names matching the target hosts in a folder called "host_vars" and automatically
add all variables to the play. These files are in YAML format and items can be nested as shown below.
This makes it easier to organize variables for different features. Some miscellaneous BGP variables are
shown in the file below even though our script doesn't care about them. Note that if "groups" are used
in the Ansible hosts file, variable files named after those groups can be placed inside the "group_vars"
directory for similar treatment. Note that there are secure ways to deal with plain-text passwords with
Ansible, such as Ansible Vault. This feature is not demonstrated in this document.
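A hedged reconstruction of the host_vars file for csr1 is shown below. The SNMP values match those visible in the playbook output later in this section; the .yml extension is an assumption, and the miscellaneous BGP variables mentioned above are omitted.

mkdir -p host_vars
cat > host_vars/csr1.yml << 'EOF'
---
snmp:
  user: USERV3
  group: GROUPV3
  authpass: ABC123
  privpass: DEF456
EOF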
The final step is to execute the playbook. Debugging is enabled so that the generated commands are
shown in the output below, which normally does not happen. Note that the variable substitution, as well
as Ansible timestamping, appears to be working. The play contained three tasks, all of which succeed.
Although gather_facts didn't look like a task in the playbook, behind the scenes the "setup" module was
executed on the control machine, which counts as a task.
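The invocation would resemble the command below; the playbook filename is an assumption, and the -v flag enables the verbose (debugging) output described above.

ansible-playbook snmpv3_update.yml -v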
TASK [IOS >> Issue commands to update the passwords, save config] ***********
ok: [csr1] => {"banners": {}, "changed": false, "commands": ["snmp-server
user USERV3 GROUPV3 v3 auth sha ABC123 priv aes 256 DEF456", "snmp-server
contact PASSWORDS UPDATED 2017-05-07 at 18:05:27"], "updates": ["snmp-server
user USERV3 GROUPV3 v3 auth sha ABC123 priv aes 256 DEF456", "snmp-server
contact PASSWORDS UPDATED 2017-05-07 at 18:05:27"]}
PLAY RECAP ******************************************************************
csr1 : ok=3 changed=0 unreachable=0 failed=0
To verify that the configuration was successfully applied, log into the target router and inspect the
running configuration manually. To confirm that the configuration was saved, check the startup-configuration
as well. The verification is shown below.
2.1.5 Northbound vs. Southbound protocols
The northbound interfaces are considered APIs, which are interfaces to existing business applications.
This is generally used so that applications can make requests of the network, which could include
specific performance requirements (bandwidth, latency, etc). Because the controller “knows” this
information by communicating with the infrastructure devices via management agents, it can determine
the best paths through the network to satisfy these constraints. This is loosely analogous to the original
intent of the Integrated Services QoS model using Resource Reservation Protocol (RSVP) where
applications would reserve bandwidth on a per-flow basis. It is also similar to MPLS TE constrained SPF
(CSPF) where a single device can source-route traffic through the network given a set of requirements.
The logic is being extended to applications with a controller “shim” in between, ultimately providing a
full network view for optimal routing. A REST API is an example of a northbound interface.
The southbound interfaces include the control-plane protocol between the centralized controller and
the network forwarding hardware. These are the less intelligent network devices used for forwarding
only (assuming a centralized model). A common control-plane used for this purpose would be
OpenFlow; the controller determines the forwarding tables per flow per network device, programs this
information, and then the devices obey it. Note that OpenFlow is not synonymous with SDN; it is just an
example of one southbound control-plane protocol. Because the SDN controller is sandwiched between
the northbound and southbound interfaces, it can be considered “middleware” in a sense. The
controller is effectively able to evaluate application constraints and produce forwarding-table outputs.
The image below depicts a very high-level diagram of the SDN layers as it relates to interaction between
components.
a. Ethernet VLANs using 802.1q encapsulation. Often used to create virtual networks at layer 2 for
security segmentation, traffic hairpinning through a service chain, etc. This is a form of data
multiplexing over Ethernet links. It isn’t a tunnel/overlay since the layer 2 reachability
information (MAC address) remains exposed and used for forwarding decisions.
b. VPN Routing and Forwarding (VRF) tables or other layer-3 virtualization techniques. Similar
uses as VLANs except virtualizes an entire routing instance, and is often used to solve a similar
set of problems. Can be combined with VLANs to provide a complete virtual network between
layers 2 and 3. Can be coupled with GRE for longer-range virtualization solutions over a core
network that may or may not have any kind of virtualization. This is a multiplexing technique as
well but is control-plane only since there is no change to the packets on the wire, nor is there
any inherent encapsulation (not an overlay).
c. Frame Relay DLCI encapsulation. Like a VLAN, creates segmentation at layer 2 which might be
useful for last-mile access circuits between PE and CE for service multiplexing. The same is true
for Ethernet VLANs when using EV services such as EV-LINE, EV-LAN, and EV-TREE. This is a data-
plane multiplexing technique specific to Frame Relay.
d. MPLS VPNs. Different VPN customers, whether at layer 2 or layer 3, are kept completely
isolated by being placed in a different virtual overlay across a common core that has no/little
native virtualization. This is an example of an overlay type of virtual network.
e. Virtual eXtensible Local Area Network (VXLAN): Just like MPLS VPNs; creates virtual overlays atop a
potentially non-virtualized core. VXLAN is a MAC-in-IP/UDP tunneling encapsulation designed to
provide layer-2 mobility across a data center fabric with an IP-based underlay network. The
advantage is that the large layer-2 domain, while it still exists, is limited to the edges of the
network, not the core. VXLAN by itself uses a “flood and learn” strategy so that the layer-2 edge
devices can learn the MAC addresses from remote edge devices, much like classic Ethernet
switching. This is not a good solution for large fabrics where layer-2 mobility is required, so
VXLAN can be paired with BGP’s Ethernet VPN (EVPN) address family to provide MAC routing
between endpoints. Being UDP-based, the VXLAN source ports can be varied per flow to provide
better underlay (core IP transport) load-sharing, if required.
f. Network Virtualization using Generic Routing Encapsulation (NVGRE): This technology extends
classic GRE tunneling to include a subnet identifier within the GRE header, allowing GRE to
tunnel layer-2 Ethernet frames over IP/GRE. The use cases for NVGRE are also identical to
VXLAN except that, being a GRE packet, layer-4 port-based load sharing is not supported. Some
devices can support GRE key-based hashing, but this does not have flow-level visibility.
g. Overlay Transport Virtualization (OTV): Just like MPLS VPNs; creates virtual overlays atop a potentially
non-virtualized core, except it provides a control-plane for MAC routing. IP multicast traffic is also routed intelligently using
GRE encapsulation with multicast destination addresses. This is another example of an overlay
type of virtual network.
The tools and workflows used within the DevOps community support an information-sharing
environment. Many of them are focused on version control, service monitoring, configuration
management, orchestration, containerization, and everything else typically needed to support a service
through its lifecycle. The key to DevOps is that using a specific DevOps "tool" does not mean an
organization has embraced the DevOps culture or mentality. A good phrase is "People over Process over
Tools"; a successful DevOps team relies on those things in that order of importance.
DevOps also introduces several new concepts. Two critical ones are continuous integration (CI) and
continuous delivery (CD). The CI/CD mindset suggests several changes to traditional software
development. Some of the key points are listed below.
a. Everyone can see the changes: Dev, Ops, Quality Assurance (QA), management, etc
b. Verification is an exact clone of the production environment, not simply a smoke-test on a
developer’s test bed
c. The build and deployment/upgrade process is automated
d. Provide software in short timeframes and ensure releases are always available in increments
e. Reduce friction, increase velocity
f. Reduce silos, increase collaboration
On the topic of software development models, it is beneficial to compare the commonly used models
with the new “agile” or DevOps mentality. Additional details on these software development models can
be found in the references. The table below contains a comparison chart of the different software
development models.
2.2.2 Network/application function virtualization (NFV, AFV)
NFV and AFV refer to taking specific network functions, virtualizing them, and assembling them in a
sequence to meet a specific business need. NFV and AFV by themselves, in isolation, are generally
synonymous with creating virtual instances of things which were once physical. Many vendors offer
virtual routers (Cisco CSR1000v, Cisco IOS XRv 9000, etc), security appliances (Cisco ASAv, Cisco NGIPSv,
etc), telephony and collaboration components (Cisco UCM, CUC, IM&P, UCCX, etc) and many other
virtual products that were once physical appliances. Separating these products into virtual functions
allows a wide variety of organizations, from cloud providers to small enterprises, to select only the
components they require.
Although not directly related to NFV/AFV, two new design paradigms are becoming popular within the
data center. Hyper-convergence and disaggregation are polar opposites but are both highly effective in
solving specific business problems.
Hyper-convergence attempts to address issues with data center management and resource
consumption/provisioning. For example, the traditional DC architecture will consist of four main
components: network, storage, compute, and services (firewalls, load balancers, etc.). These decoupled
items could be combined into a single and unified management infrastructure. The virtualization and
management layers are integrated into a single appliance, and these appliances can be bolted together
to scale-out linearly. Cisco sometimes refers to this as the Lego block model. This reduces the capital
investments a business must make over time since the architecture need not change as the business
grows. Hyper-converged systems, by virtue of their integrated management solution, simplify life cycle
management of DC assets as the “single pane of glass” concept can be used to manage all components.
Disaggregation is the opposite of hyper-convergence in that rather than combining functions (storage,
network, and compute) into a single entity, it breaks them apart even further. A network appliance, such
as a router or switch, can be decoupled from its network operating system (NOS). A white box or brite
box switch can be purchased at low cost with some other NOS installed, such as Cumulus Linux. Cumulus
generally does not sell hardware, only a NOS, much like VMware. Server/computer disaggregation has
been around for decades since the introduction of the personal computer (PC) whereby the common
Microsoft Windows operating system was installed on machines from a variety of manufacturers.
Disaggregation in the network realm has been adopted more slowly but has merit for the same reasons.
Another comparison not directly related to NFV/AFV relates to comparing virtual machines (VMs) to
containers. Conceptually, containers and virtual machines are similar in that they are a way to virtualize
services/machines on a single platform, effectively achieving multi-tenancy. This document will focus on
their differences and use cases.
Virtual machine systems rely on a hypervisor, which is a software shim that sits between the VMs
themselves and the underlying hardware. The hardware chipset must support virtualization so that the
hypervisor can present virtualized hardware to the VMs. Each VM has its own OS which is independent
from the hypervisor. Hypervisors come in two flavors:
1. Type 1: Runs on bare metal and is effectively an OS by itself. VMware ESXi is an example.
2. Type 2: Requires an underlying OS and provides virtualization services on top. Linux Kernel-
based Virtual Machine (KVM) is an example.
VMs are considered quite heavyweight with respect to the overhead needed to run them. This can
reduce the efficiency of a hardware platform as the VM count grows. It is especially inefficient when all
of the VMs run the same OS with very few differences other than configuration.
Containers on a given machine all share the same OS, unlike with VMs. This reduces the amount of
overhead, such as idle memory taxes, storage space for VM OS images, and the general maintenance
associated with VMs. Multi-tenancy is achieved by memory isolation, effectively segmenting
the different services deployed in different containers. There is still a thin software shim between the
underlying OS and the containers known as the container manager, which enforces the multi-tenancy
via memory isolation and other techniques.
The main drawback of containers is that all containers must share the same OS. For applications or
services where such behavior is desired (for example, a container per customer consuming a specific
service), containers are a good choice. As a general-purpose virtualization platform in environments
where requirements may change often (such as military networks), containers are a poor choice.
Docker is a popular example of a container manager, while Kubernetes orchestrates containers at scale.
Below is an image from Docker that compares VMs to containers.
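As a small, hedged illustration of this multi-tenancy (assuming Docker is installed and using the public nginx image purely as an example), the two containers below share the host kernel yet run isolated from one another:

docker run -d --name tenant1 --memory 256m nginx   # first isolated workload
docker run -d --name tenant2 --memory 256m nginx   # second isolated workload on the same OS
docker ps                                          # both containers run on a single shared OS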
Tying the virtual machine discussion back into NFV, the European Telecommunications Standards
Institute (ETSI) has released a pair of documents that help describe the NFV architectural framework
and its use cases, challenges, and value propositions. These documents are linked in the references.
The high-level NFV architecture looks similar to the “Virtual Machines” diagram above because, in
general, Virtual Network Functions (VNFs) that are deployed as part of an NFV environment are virtual
machines. They can also be containers. At a high level, hardware resources such as compute, storage,
and networking are encompassed in the infrastructure layer, which rests beneath the hypervisor and its
associated operating system (the “virtualization layer” in ETSI terms). ETSI also defines a level of
hardware abstraction in what is called the NFV Infrastructure (NFVI) layer whereby virtual compute,
virtual storage, and virtual networking can be exposed to VNFs. These VNFs are VMs or containers that
sit at the apex of the framework and perform some kind of network function, such as firewall, load
balancer, WAN accelerator, etc. The ETSI whitepaper is protected by copyright and the author has not
yet received written permission to use the graphic detailing this architecture. Readers are encouraged
to reference page 10 of the ETSI NFV Architectural document.
The “use case” whitepaper (not protected by any copyrights) is written by a diverse group of network
operators and uses a more conversational tone. This document discusses some of the advantages and
challenges of NFV which are pertinent to understanding the value proposition of NFV in general.
Examples from that document are listed below.
Advantages/Benefits:
a. Faster rollout of value-added services
b. Reduced CAPEX and OPEX
c. Less reliance on the vendor hardware refresh cycle
d. Mutually beneficial to SDN; complementary

Disadvantages/Challenges:
a. Likely to observe decreased performance
b. Scalability exists only in a purely NFV environment
c. Interoperability between different VNFs
d. Management and orchestration alongside legacy systems
There are several transport options for steering traffic through a chain of services (service chaining):
a. MPLS and Segment Routing. Some headend LSR needs to impose different MPLS labels for each
service in the chain that must be visited to provide a given service. MPLS is a natural choice here
given the label stacking capabilities and theoretically unlimited label stack depth.
b. Network Service Header (NSH). Similar to the MPLS option except it is purpose-built for
service chaining. Being purpose-built, NSH can be extended or modified in the future to better
support new service chaining requirements, where doing so with MPLS shim header formats is
less likely. MPLS would need additional headers or other ways to carry "more" information.
c. Out of band centralized forwarding. Although it seems unmanageable, a centralized controller
could simply instruct the data-plane devices to forward certain traffic through the proper
services without any in-band encapsulation being added to the flow. This would result in an
explosion of core state which could limit scalability, similar to policy-based routing at each hop.
d. Cisco vPath: This is a Cisco innovation that is included with the Cisco Nexus 1000v series switch
for use as a distributed virtual switch (DVS) in virtualized server environments. Each service is
known as a virtual service node (VSN) and the administrator can select the sequence in which
each node should be transited in the forwarding path. Traffic transiting the Nexus 1000v switch
is subject to redirection using some kind of overlay/encapsulation technology. Specifically, MAC-
in-MAC encapsulation is used for layer-2 tunnels while MAC-in-UDP is used for layer-4 tunnels.
Comparison of the four SDN models (Distributed, Augmented, Hybrid, Centralized):

Availability
  Distributed: Dependent on the protocol convergence times and redundancy in the network.
  Augmented: Dependent on the protocol convergence times and redundancy in the network. Doesn't matter how bad the SDN controller is … its failure is tolerable.
  Hybrid: Dependent on the protocol convergence times and redundancy in the network. Doesn't matter how bad the SDN controller is … its failure is tolerable.
  Centralized: Heavily reliant on a single SDN controller, unless one adds controllers to split failure domains or to create resilience within a single failure domain (which introduces a distributed control-plane in both cases).

Granularity / control
  Distributed: Generally low for IGPs but better for BGP. All devices generally need a common view of the network to prevent loops independently. MPLS TE helps somewhat.
  Augmented: Better than distributed since policy injection can happen at the network edge, or a small set of nodes. Can be combined with MPLS TE for more granular selection.
  Hybrid: Moderately granular since SDN policy decisions are extended to all nodes. Can influence decisions based on any arbitrary information within a datagram.
  Centralized: Very highly granular; complete control over all routing decisions based on any arbitrary information within a datagram.

Scalability (assume flow-based policy and state retention)
  Distributed: Very high in a properly designed network (failure domain isolation, topology summarization, reachability aggregation, etc).
  Augmented: High, but gets worse with more policy injection. Policies are generally limited to key nodes (such as border routers).
  Hybrid: Moderate, but gets worse with more policy injection. Policy is proliferated across the network to all nodes (though the exact quantity may vary per node).
  Centralized: Depends; all devices retain state for all transiting flows. Hardware-dependent on TCAM and whether SDN can use other tables such as L4 information, IPv6 flow labels, etc.
2.3 Resources and References
The Art of Network Architecture
CLN Recorded SDN Seminars
Cisco Service Provider NFV
Cisco vPath and vServices
Cisco Devnet Homepage
Cisco SDN
BRKNMS-2446 - Do IT with DevOps (2015 San Diego)
BRKDEV-2001 - DevOps in Programmable Network Environments (2015 San Diego)
BRKCRT-2603 - Cloudy with a chance of SDN (2015 San Diego)
BRKSDN-2761 - OpenDaylight: The Open Source SDN Controller Platform (2015 San Diego)
BRKSDN-1903 - A Model-driven Approach to Software Defined Networks with Yang, Netconf/Restconf
(2015 San Diego)
BRKRST-1014: Introduction to Software-Defined Networking (SDN) and Network Programmability (2015
San Diego)
Cisco - Introducing Network Programmability Fundamentals
Cisco - Introduction to APIC EM Deployment
Introduction to SDN (LiveLesson)
ETSI NFV Whitepaper
ETSI NFV Architectural Framework
NX-OS DevNet Network Automation Guide
RFC6241 - NETCONF
RFC6020 - YANG
Cisco IOS XE REST API
Cisco IOS XE RESTCONF
Ansible ios_config module
Ansible ios_command module
Cisco IWAN 2.1 Wiki
Cisco Campus Fabric CVD (Oct 2016)
3. Internet of Things (IoT)
IoT, sometimes called the Internet of Everything (IoE), is the concept that many non-person entities (NPEs),
or formerly non-networked devices, in the world become networked. This typically includes
things like window blinds, light bulbs, water treatment plant sensors, home heating/cooling units, street
lights, and anything else that could be remotely controlled or monitored. The business drivers for IoT are
substantial: electrical devices (like lights and heaters) could consume less energy by being smartly
adjusted based on changing conditions, window blinds can open and close based on the luminosity of a
room, and chemical levels can be adjusted in a water treatment plant by networked sensors. These are
all real-life applications of IoT and network automation in general.
The term Low-power and Lossy Networks (LLN) is commonly used in the IoT space since it describes the
vast majority of IoT networks. LLNs have the following basic characteristics (incomplete list):
a. Bandwidth constraints
b. Highly unreliable
c. Limited resources (power, CPU, and memory)
d. Extremely high scale (hundreds of millions and possibly more)
A basic IoT architecture consists of several layers, discussed below.
a. Data center (DC) Cloud: Although not a strict requirement, the understanding that a public
cloud infrastructure exists to support IoT is a common one. A light bulb manufacturer could
partner with a networking vendor to develop network-addressable light bulbs which are
managed from a custom application running in the public cloud. This might be better than a
private cloud solution since, if the application is distributed, regionalized instances could be
deployed in geographically dispersed areas using an “anycast” design for scalability and
performance improvements. As such, public cloud is generally assumed to be the DC presence
for IoT networks.
b. Core Networking and Services: This could be a number of transports to connect the public cloud
to the sensors. The same is true for any connection to public cloud, in reality, since even
businesses need to consider the manner in which they connect to public cloud. The primary
three options (private WAN, IXP, or Internet VPN) were discussed in the Cloud section. The same
options apply here. A common set of technologies/services seen within this layer include IP,
MPLS, mobile packet core, QoS, multicast, security, network services, hosted cloud applications,
big data, and centralized device management (such as a network operations facility).
c. Multi-service Edge (access network): Like most SP networks, the access technologies tend to
vary greatly based on geography, cost, and other factors. Access networks can be optically-
based to provide Ethernet handoffs to IoT devices; this would make sense for relatively large
devices that would have Ethernet ports and would be generally immobile. Mobile devices, or
those that are small or remote, might use cellular technologies such as 2G, 3G, or 4G/LTE for
wireless backhaul to the closest POP. A combination of the two could be used by extending
Ethernet to a site and using 802.11 WIFI to connect the sensors to the WLAN. The edge network
may require use of “gateways” as a short-term solution for bridging (potentially non-IP) IoT
networks into traditional IP networks. The gateways come with an associated high CAPEX and
OPEX since they are custom devices to solve a very specific use-case. Specifically, gateways are
designed to perform some subset of the following functions, according to Cisco:
I. Map semantics between two heterogeneous domains: The word semantics in this
context refers to the way in which two separate networks operate and how each
network interprets things. If the embedded systems network is a transparent radio
mesh using a non-standard set of protocols while the multi-service edge uses IP over
cellular, the gateway is responsible for “presenting” common interfaces to both
networks. This allows devices in both networks to communicate using a “language” that
is common to each.
II. Perform translation in terms of routing, QoS, security, management, etc: These items are
some concrete examples of semantics. An appropriate analogy for IP networkers is
stateless NAT64; an inside-local IPv4 host must send traffic to some outside-local IPv4
address which represents an outside-global IPv6 address. The source of that packet
becomes an IPv6 inside-global address so that the IPv6 destination can properly reply.
III. Do more than just protocol changes: The gateways serve as interworking devices
between architectures at an architectural level. The gateways might have a mechanism
for presenting network status/health between layers, and more importantly, be able to
fulfill their architectural role in ensuring end-to-end connectivity across disparate
network types.
d. Embedded Systems (Smart Things Network): This layer represents the host devices themselves.
They can be wired or wireless, smart or less smart, or any other classification that is useful to
categorize an IoT component. Often times, such devices support zero-touch provisioning (ZTP)
which helps with the initial deployment of massive-scale IoT deployments. For static
components, these components are literally embedded in the infrastructure and should be
introduced during the construction of a building, factory, hospital, etc. These networks are
rather stochastic (meaning that behavior can be unpredictable). The author classifies wireless
devices into three general categories which help explain what kind of RF-level transmission
methods are most sensible:
I. Long range: Some devices may be placed very far from their RF base stations/access
points and could potentially be highly mobile. Smart automobiles are a good example of
this; such devices are often equipped with cellular radios, such as 4G/LTE. Such an
option is not optimal for supporting LLNs given the cost of radios and power required to
run them. To operate a private cellular network, the RF bands must be licensed (in the
USA, at least), which creates an expensive and difficult barrier for entry.
II. Short range with “better” performance: Devices that are within a local area, such as a
building, floor of a large building, or courtyard area, could potentially use unlicensed
frequency bands while transmitting at low power. These devices could be CCTV sensors,
user devices (phones, tablets, laptops, etc), and other general-purpose things whereby
maximum battery life and cost savings are eclipsed by the need for superior
performance. IEEE 802.11 WiFi is commonly used in such environments. IEEE 802.16
WiMAX specifications could also be used, but in the author’s experience, it is rare.
III. Short range with “worse” performance: Many IoT devices fall into this final category
whereby the device itself has a very small set of tasks it must perform, such as sending a
small burst of data when an event occurs (e.g., some nondescript sensor). Devices are
expected to be installed one time, rarely maintained, procured/operated at low cost,
and be value-engineered to do very few things. These devices are less commonly
deployed in home environments since many homes have WiFi; they are more commonly
seen spread across cities. Examples might include street lights, sprinklers, and
parking/ticketing meters. IEEE has defined 802.15.4 to support low-rate wireless
personal area networks (LR-WPANs), which is used for many such IoT devices. Note that
802.15.4 is the foundation for upper-layer protocols such as ZigBee and WirelessHART.
ZigBee, for example, is becoming popular in homes to network some IoT devices, such
as thermostats, which may not support WiFi in their hardware.
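To make the gateway translation function (item II under the multi-service edge layer) more concrete, below is a minimal Python sketch of one small piece of stateless NAT64: algorithmically embedding an IPv4 address into an IPv6 address using the RFC 6052 well-known prefix, then recovering it. The sample address and function names are illustrative assumptions only, not part of any Cisco product.

# Illustrative sketch only: RFC 6052 address synthesis as used by stateless NAT64.
import ipaddress

WKP = ipaddress.IPv6Network("64:ff9b::/96")  # NAT64 well-known prefix (RFC 6052)

def synthesize(ipv4_str: str) -> ipaddress.IPv6Address:
    """Embed an IPv4 address in the low-order 32 bits of the NAT64 prefix."""
    return ipaddress.IPv6Address(int(WKP.network_address) | int(ipaddress.IPv4Address(ipv4_str)))

def extract(ipv6_str: str) -> ipaddress.IPv4Address:
    """Recover the embedded IPv4 address (the reverse mapping)."""
    return ipaddress.IPv4Address(int(ipaddress.IPv6Address(ipv6_str)) & 0xFFFFFFFF)

print(synthesize("192.0.2.33"))        # 64:ff9b::c000:221
print(extract("64:ff9b::c000:221"))    # 192.0.2.33

The same stateless, algorithmic idea (deriving one domain's addressing from the other's) is what allows a gateway to map semantics between two domains without keeping per-flow state.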
IEEE 802.15.4 is worth a brief discussion by itself. Unlike WiFi, nodes can be "full-function" devices that act as
both hosts and routers; this is typical for mesh technologies. A device called a PAN coordinator is
analogous to a WiFi access point (WAP) in that it connects the PAN to the wired infrastructure; this
technically qualifies the PAN coordinator as a "gateway" as discussed earlier.
The basic IoT architecture is depicted below.
[Figure: basic IoT architecture, from top to bottom: datacenter cloud (public, private, etc.); core transport (Internet, private WAN, etc.); multi-service edge (Ethernet, WiFi, LTE, etc.); embedded devices (sensors, machines, etc.)]
As a general comment, one IoT strategy is to "mesh under" and "route over". This loosely follows the 7-
layer OSI model by constraining layers 1-2 to the IoT network, to include RF networking and
link-layer communications, then overlaying IP on top for network reachability.
Additional details about routing protocols for IoT are discussed later in this document.
The access type is mostly significant when performance is discussed. Although 4G LTE is very popular
and widespread in the United States and other countries, it is not available everywhere. Some parts of
the world are still heavily reliant on 2G/3G cellular service which is less capable and slower. A widely
distributed IoT network may have a combination of these access types with various levels of
performance. Higher performing 802.11 WiFi speeds typically require more expensive radio hardware,
more electricity, and a larger physical size. Physical access types (wired devices) are generally
immobile, which could be considered a detriment if mobility is required for an IoT device to do its
job effectively.
A new term which is becoming more popular in the IoT space is “fog” computing. It is sometimes
referred to as “edge” computing outside of Cisco environments, which is a more self-explanatory term.
Fog computing distributes storage, compute, and networking from a centralized cloud environment
closer to the users where a given service is being consumed. The drivers for edge computing typically
revolve around performance, notably latency reduction, as content is closer to users. The concept is
somewhat similar to Content Distribution Networking (CDN) in that users should not need to reach back
to a small number of remote, central sites to consume a service.
Fog computing is popular in IoT environments not just for performance reasons, but also for consumer
convenience. Wearable devices that are managed by, or tethered to, other personally owned devices are a good
example. Some examples might be smart watches, smart calorie trackers, smart heart monitors, and
other wearables that "register" to a user's cell phone or laptop rather than to a large aggregation machine
in the cloud.
With respect to cost reduction when deploying a new service, comparing “cloud” versus “fog” can be
challenging and should be evaluated on a case-by-case basis. If the cost of backbone bandwidth from
the edge to the cloud is expensive, then fog computing might be affordable despite needing a capital
investment in distributed compute, storage, and networking. If transit bandwidth and cloud services are
cheap while distributed compute/storage resources are expensive, then fog computing is likely a poor
choice.
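As a rough illustration of how such a cost comparison might be framed, the short Python sketch below weighs a cloud-centric design against a fog/edge design. Every figure in it is a hypothetical assumption invented for illustration; the break-even point flips entirely as the unit costs, data volumes, and summarization ratios change.

# Hypothetical back-of-the-envelope comparison; none of these numbers are real pricing.
SITES = 40                      # remote sites producing sensor data
GB_PER_SITE = 500               # raw telemetry per site per month

# Option 1: ship everything to a central cloud
backhaul_per_gb = 0.08          # assumed WAN/transit cost per GB
cloud_processing_per_gb = 0.02  # assumed cloud compute/storage cost per GB
cloud_monthly = SITES * GB_PER_SITE * (backhaul_per_gb + cloud_processing_per_gb)

# Option 2: process locally (fog) and only ship 5% of the data as summaries
edge_node_capex = 3000          # assumed cost of a small edge compute/storage node
amortization_months = 36
summary_ratio = 0.05
fog_monthly = (SITES * edge_node_capex / amortization_months
               + SITES * GB_PER_SITE * summary_ratio
               * (backhaul_per_gb + cloud_processing_per_gb))

print(f"cloud-centric: ${cloud_monthly:,.0f} per month")
print(f"fog/edge:      ${fog_monthly:,.0f} per month")

With these particular assumptions the cloud-centric option is cheaper; raise the backhaul cost or the per-site data volume and the fog option quickly wins, which is exactly why the evaluation must be case-by-case.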
Finally, it is worth noting the endless cycle between the push to centralize and decentralize. Many
technical authors have noted this recurring phenomenon dating back many decades. From the
mainframe to the PC to the cloud to the fog, the pendulum continues to swing. The most realistic and
supportable solution is one that embraces both extremes, as well as moderate approaches, deploying
the proper solutions to meet the business needs.
3.1.2 Mobility
The mobility of an IoT device is going to be largely determined by its access method. Devices that are on
802.11 WiFi within a factory will likely have mobility through the entire factory, or possibly the entire
complex, but will not be able to travel large geographic distances. For some specific manufacturing work
carts (containing tools, diagnostic measurement machines, etc), this might be an appropriate method.
Devices connected via 4G LTE will have greater mobility but will likely represent something that isn’t
supposed to be constrained to the factory, such as a service truck or van. Heavy machinery bolted to the
factory floor might be wired since it is immobile.
3.1.3 Security and privacy
Securing an IoT network relies on many of the same practices as a traditional enterprise network, applied at much larger scale and often to physically exposed devices. General practices include:
a. Use IEEE 802.1X for wired and wireless authentication for all devices. This is normally tied into a
Network Access Control (NAC) architecture which authorizes a set of permissions per device.
b. Encrypt wired and wireless traffic using MACsec/IPsec as appropriate.
c. Maintain physical accounting of all devices, especially small ones, to prevent theft and reverse
engineering.
d. Do not allow unauthorized access to sensors; ensure remote locations are secure also.
e. Provide malware protection for sensors so that the compromise of a single sensor is detected
quickly and suppressed.
f. Rely on cloud-based threat analysis (again, assuming a cloud is used) rather than a distributed
model, given the size of the IoT access network and device footprint. Sometimes this extension
of the cloud is called the "fog" and encompasses other things that produce and act on IoT data.
Another discussion point on the topic of security is determining how/where to “connect” an IoT
network. This is going to be determined based on the business needs, as always, but the general logic is
similar to what traditional corporate WANs use. Note that the terms “producer-oriented” and
“consumer-oriented” are creations of the author and exist primarily to help explain IoT concepts.
a. Fully private connections: Some IoT networks have no legitimate need to be accessible via the
public Internet. Such examples would include Government sensor networks which may be
deployed in a battlefield support capacity. More common examples might include Cisco’s
“Smart Grid” architecture which is used for electricity distribution and management within a
city. Exposing such a critical resource to a highly insecure network offers little value since the
public works department can likely control it from a dedicated NOC. System updates can be
performed in-house and the existence of the IoT network can be (and often times, should be)
largely unknown by the general population. In general, IoT networks that fall into this category
are “producer-oriented” networks.
b. Public Internet: Other IoT networks are designed to have their information shared or made
public between users. One example might be a managed thermostat service; users can log into a
web portal hosted by the service provider to check their home heating/cooling statistics, make
changes, pay bills, request refunds, submit service tickets, and the like. Other networks might be
specifically targeted to sharing information publicly, such as fitness watches that track how long
an individual exercises. The information could be posted publicly and linked to one’s social
media page so others can see it. A more practical and useful example could include public safety
information via a web portal hosted by the Government. In general, IoT networks that fall into
this category are “consumer-oriented” networks.
3.1.4 Standards and compliance
Consider a smart street lighting deployment with three stakeholders: emergency services (the user), the
municipality (the owner), and the manufacturer (the vendor). Who has provisioning access? Who accepts liability?
There is more than meets the eye with respect to standards and compliance for street lights. Most
municipalities (such as counties or townships within the United States) have ordinances that dictate how
street lighting works. The light must be a certain color, must not “trespass” into adjacent streets, must
not negatively affect homeowners on that street, etc. This complicates the question above because the
lines become blurred between organizations rather quickly. In cases like this, the discussions must occur
between all stakeholders, generally chaired by a Government/company representative (depending on
the consumer/customer), to draw clear boundaries between responsibilities.
Radio frequency (RF) spectrum is a critical point as well. While WiFi can operate in the 2.4 GHz and 5.0
GHz bands without a license, there are no unlicensed 4G LTE bands at the time of this writing. Deploying
4G LTE capable devices on an existing carrier’s network within a developed country may not be a
problem. Deploying 4G LTE in developing or undeveloped countries, especially if 4G LTE spectrum is
tightly regulated but poorly accessible, can be a challenge.
Several new protocols have been introduced specifically for IoT, some of which are standardized:
a. RPL - IPv6 Routing Protocol for LLNs (RFC 6550): RPL is a distance-vector routing protocol
specifically designed to support IoT. At a high-level, RPL is a combination of control-plane and
forwarding logic of three other technologies: regular IP routing, multi-topology routing (MTR),
and MPLS traffic-engineering (MPLS TE). RPL is similar to regular IP routing in that directed
acyclic graphs (DAG) are created through the network. This is a fancy way of saying “loop-free
shortest path” between two points. These DAGs can be “colored” into different topologies
which represent different network characteristics, such as high bandwidth or low latency. This
forwarding paradigm is similar to MTR in concept. Last, traffic can be assigned to a colored DAG
based on administratively-defined constraints, including node state, node energy, hop count,
throughput, latency, reliability, and color (administrative preference). This is similar to MPLS
TE’s constrained shortest path first (CSPF) process which is used for defining administrator-
defined paths through a network based on a set of constraints, which might have technical
and/or business drivers behind them.
b. 6LoWPAN - IPv6 over Low Power WPANs (RFC 4919): This technology was developed as an
adaptation layer that allows IPv6 to run over IEEE 802.15.4 wireless networks. Specifically, it
"adapts" IPv6 to work over LLNs, which encompasses many functions:
I. MTU correction: The minimum MTU for IPv6 across a link, as defined in RFC 2460, is
1280 bytes. The maximum frame size for IEEE 802.15.4 networks is 127 bytes. Clearly, no
value can mathematically satisfy both conditions concurrently. 6LoWPAN performs
fragmentation and reassembly by breaking large IPv6 packets into multiple IEEE 802.15.4
frames for transmission across the wireless network (a rough arithmetic sketch appears
after this protocol list).
II. Header compression: Many compression techniques are stateful and CPU-hungry. This
strategy would not be appropriate for low-cost LLNs, so 6LoWPAN utilizes an algorithmic
(stateless) mechanism. RFC4944 defines some common assumptions:
i. The version is always IPv6.
ii. Both source and destination addresses are link-local.
iii. The low-order 64-bits of the link-local addresses can be derived from the layer-2
network addressing in an IEEE 802.15.4 wireless network.
iv. The packet length can be derived from the layer-2 header.
v. Next header is always ICMP, TCP, or UDP.
vi. Flow label and traffic class are always zero.
As an example, an IPv6 header (40 bytes) and a UDP header (8 bytes) are 48 bytes long
when concatenated. This can be compressed down to 7 bytes by 6LoWPAN.
III. Mesh routing: Somewhat similar to WiFi, mesh networking is possible, but requires up
to 4 unique addresses. The original source/destination addresses can be retained in a
new “mesh header” while the per-hop source/destination addresses are written to the
MAC header.
IV. MAC level retransmissions: IP was designed to be fully stateless and any retransmission
or flow control was the responsibility of upper-layer protocols, such as TCP. When using
6LoWPAN, retransmissions can occur at layer-2.
c. CoAP - Constrained Application Protocol (RFC7252): At a high-level, CoAP is very similar to HTTP
in terms of the capabilities it provides. It is used to support the transfer of application data using
common methods such as GET, POST, PUT, and DELETE. CoAP runs over UDP port 5683 by
default (5684 for secure CoAP) and was specifically designed to be lighter weight and faster than
HTTP. Like the other IoT protocols, CoAP is designed for LLNs, and more specifically, to support
machine-to-machine communications. CoAP has a number of useful features and
characteristics:
I. Supports multicast: Because it is UDP-based, IP multicast is possible. This can be used
both for application discovery (in lieu of DNS) or efficient data transfer.
II. Built-in security: CoAP supports using datagram TLS (DTLS) with both pre-shared key
and digital certificate support. As mentioned earlier, CoAP DTLS uses UDP port 5684.
III. Small header: The CoAP overhead adds only 4 bytes.
IV. Fast response: When a client sends a CoAP GET to a server, the requested data is
immediately returned in an ACK message, which is the fastest possible data exchange.
Despite CoAP being designed for maximum efficiency, it is not a general replacement for HTTP.
It only supports a subset of HTTP capabilities and should only be used within IoT environments.
To interwork with HTTP, one can deploy an HTTP/CoAP proxy as a "gateway" device between
the multi-service edge and smart device networks. (A minimal CoAP client sketch appears after
this protocol list.)
d. Message Queuing Telemetry Transport (ISO/IEC 20922:2016): MQTT is, in a sense, the
predecessor of CoAP in that it was created in 1999 and was specifically designed for lightweight,
machine-to-machine communications over constrained links. Like HTTP, it relies on TCP, but uses ports
1883 and 8883 for plain-text and TLS communications, respectively. Being based on TCP also
implies a client/server model, similar to HTTP, which is not necessarily the case with CoAP.
Compared to CoAP, MQTT has been losing traction given the additional benefits that CoAP offers
for modern IoT networks. (A minimal MQTT subscriber sketch appears after this protocol list.)
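To illustrate a few of the protocol details above, three short Python sketches follow. The first works through the MTU arithmetic behind 6LoWPAN fragmentation (item b). The per-frame MAC and fragmentation overheads below are assumptions chosen only for illustration, since the real overhead varies with addressing and security options.

# Rough 6LoWPAN fragmentation arithmetic; overhead values are assumptions.
IPV6_MIN_MTU = 1280      # minimum IPv6 link MTU (RFC 2460)
FRAME_SIZE = 127         # maximum IEEE 802.15.4 frame size
MAC_OVERHEAD = 25        # assumed MAC header/FCS bytes per frame
FRAG_HEADER = 5          # assumed 6LoWPAN fragmentation header bytes

payload_per_frame = FRAME_SIZE - MAC_OVERHEAD - FRAG_HEADER
frames = -(-IPV6_MIN_MTU // payload_per_frame)   # ceiling division
print(f"{frames} frames needed to carry one 1280-byte IPv6 packet")

# Header compression from the same section: 40-byte IPv6 + 8-byte UDP headers
# (48 bytes total) can shrink to roughly 7 bytes when the RFC 4944 assumptions hold.
print(f"uncompressed headers: {40 + 8} bytes, compressed: ~7 bytes")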
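The second sketch shows a minimal CoAP GET exchange (item c). It assumes the third-party aiocoap Python library and a reachable CoAP server; the library choice, server address, and resource path are illustrative assumptions rather than anything mandated by RFC 7252.

# Minimal CoAP client sketch; requires the third-party "aiocoap" package.
import asyncio
from aiocoap import Context, Message, GET

async def main():
    protocol = await Context.create_client_context()
    # CoAP runs over UDP port 5683 by default, as noted above
    request = Message(code=GET, uri="coap://192.0.2.10/sensors/temperature")
    response = await protocol.request(request).response
    print(response.code, response.payload.decode())

asyncio.run(main())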
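Finally, a minimal MQTT subscriber (item d) might look like the sketch below, assuming the third-party paho-mqtt library in its 1.x callback style; the broker hostname and topic are hypothetical. TLS on port 8883 would typically be enabled with client.tls_set() before connecting.

# Minimal MQTT subscriber sketch; requires the third-party "paho-mqtt" package (1.x API).
import paho.mqtt.client as mqtt

def on_connect(client, userdata, flags, rc):
    # Subscribe once the TCP (port 1883) session and MQTT CONNECT handshake complete
    client.subscribe("factory/line1/temperature")

def on_message(client, userdata, msg):
    print(msg.topic, msg.payload.decode())

client = mqtt.Client()
client.on_connect = on_connect
client.on_message = on_message
client.connect("broker.example.com", 1883, keepalive=60)
client.loop_forever()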
3.1.5 Migration
Migrating to IoT need not be swift. For example, consider an organization which is currently running a
virtual private cloud infrastructure with some critical in-house applications in their private cloud. All
remaining commercial applications are in the public cloud. Assume this public cloud is hosted locally by
an ISP and is connected via an MPLS L3VPN extranet into the corporate VPN. If this corporation operates a
large manufacturing business and wants to begin deploying various IoT components, it can begin with
the large and immobile pieces.
The multi-service edge (access) network from the regional SP POP to the factory likely already supports
Ethernet as an access technology, so devices can use that for connectivity. Over time, a corporate WLAN
can be extended for 802.11 WiFi-capable devices. Assuming this organization is not deploying a private
4G LTE network, sensors can immediately be added using a commercial carrier's cellular service as well.
The migration strategy
towards IoT is very similar to adding new remote branch sites, except the number of hosts could be very
large. The LAN, be it wired or wireless, still must be designed correctly to support all of the devices.
3.2 Resources and References
Cisco IoT Security
Cisco IoT General Information
Cisco IoT Assessment
Cisco IoT Homepage
BRKCRS-2444 - The Internet of Things: an Architectural Foundation and its Protocol
BRKSPG-2611 - IP Routing in Smart Object Networks and the Internet of Things
BRKIOT-2020 - The Evolution from Machine-to-Machine (M2M) to the Internet of Everything:
Technologies and Standards
DEVNET-1108 - Cisco Executives Discuss the Internet of Things
RFC6550 - RPL
RFC4919 - 6LoWPAN
RFC7252 - CoAP
MQTT Website
MQTT Specification (ISO/IEC 20922:2016)