Cloud Computing Network Final
Introduction
Over the past few years, cloud computing has rapidly emerged as a widely accepted computing
paradigm built around core concepts such as on-demand computing resources, elastic scaling,
elimination of up-front investment, reduction of operational expenses, and establishing a pay-
per-usage business model for information technology and computing services. There are
different models of cloud computing that are offered today as services like Software as a Service
(SaaS), Platform as a Service (PaaS), and Infrastructure as a Service (IaaS) [1]. IaaS, which is
the focus of this work, refers to the capability that is provided to the consumers to provision
processing, storage and networks, and other fundamental computing resources where they are
able to deploy and run arbitrary software. The consumer does not manage or control the
underlying cloud infrastructure but has control over operating systems, storage, deployed
applications, and possibly selected networking components (e.g., firewalls, load balancers, etc.).
Amazon is arguably the first major proponent of IaaS through its Elastic Compute Cloud (EC2)
service.
Cloud-computing technology is still evolving. Various companies, standards bodies, and
alliances are addressing several remaining gaps and concerns. Some of these concerns are:
What are the challenges behind the virtual networking in IaaS deployment? What are the
potential solutions using the existing technologies for the implementation of virtual networks
inside IaaS vision? Is there any room to utilize innovative paradigms like Software Defined
Networking (SDN) [2] to address virtual networking challenges? When cloud federation (or even
cloud bursting) is involved, should the servers in the cloud be on the same Layer 2 network as
the servers in the enterprise, or should a Layer 3 topology be involved because the cloud
servers are on a network outside the enterprise? In addition, how would this approach work
across multiple cloud data centers?
Consider a case where an enterprise uses two separate cloud service providers. Sharing state across providers (e.g., authentication information) is one of the problems with having the clouds “interoperate.”
For virtualized cloud services, VM migration is another factor to be considered in federation.
In this work we present a tutorial on networking in IaaS and the key challenges and issues that should be addressed using existing technologies or novel and innovative mechanisms. Virtual networking and the extension of cloud computing facilities, along with federation issues, are the focus of this work. SDN, as a novel and innovative mechanism, provides proper solutions for these issues and is included in a comparison of virtual networking techniques. A high-level SDN-based cloud federation framework is presented as an innovative opportunity, and the last part of this paper concludes our contribution.
Networking in IaaS
Although cloud computing does not necessarily depend on virtualization, several cloud
infrastructures are built with virtualized servers. Within a virtualized environment, some of the
networking functionalities (e.g., switching, firewall, application-delivery controllers, and load
balancers) can reside inside a physical server. Consider the case of the software-based Virtual
Switch as shown in Figure 1. The Virtual Switch inside the same physical server can be used to
switch the traffic between the VMs and aggregate the traffic for connection to the external
physical switch. The Virtual Switch is often implemented as a plug-in to the hypervisor. The VMs
have virtual Ethernet adapters that connect to the Virtual Switch, which in turn connects to the
physical Ethernet adapter on the server and to the external Ethernet switch. Unlike physical
switches, the Virtual Switch does not necessarily have to run network protocols for its
operation, nor does it need to treat all its ports the same because it knows that some of them
are connected to virtual Ethernet ports. It can function through appropriate configuration from
an external management entity.
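To make this configuration-driven operation concrete, the following minimal Python sketch models a virtual switch whose forwarding table is pushed by an external management entity rather than learned through network protocols. This is an illustration under our own assumptions (the class and method names are invented for the example), not any particular hypervisor's implementation.

    class VirtualSwitch:
        """Toy virtual switch: forwarding entries are pushed by a
        management entity instead of being learned via STP or
        dynamic MAC learning."""

        def __init__(self):
            self.fib = {}  # MAC address -> port (virtual NIC or uplink)

        def configure(self, mac: str, port: str) -> None:
            # Called by the external management plane, e.g., on VM placement
            self.fib[mac] = port

        def forward(self, dst_mac: str) -> str:
            # Unknown destinations go out the physical uplink instead of
            # being flooded, since VM locations are known a priori
            return self.fib.get(dst_mac, "uplink0")

    vswitch = VirtualSwitch()
    vswitch.configure("52:54:00:aa:bb:01", "vnet0")  # VM 1's virtual NIC
    vswitch.configure("52:54:00:aa:bb:02", "vnet1")  # VM 2's virtual NIC
    print(vswitch.forward("52:54:00:aa:bb:02"))      # vnet1: stays in host
    print(vswitch.forward("00:11:22:33:44:55"))      # uplink0: external switch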
Challenges in IaaS
Among the various challenges that should be addressed in an IaaS deployment, in this work we focus on virtual networking, cloud extension, and cloud federation issues; in the sequel we present innovative opportunities that could be utilized to address them.
Existing networking protocols and architectures such as the Spanning Tree Protocol and Multi-Chassis Link Aggregation (MC-LAG) can limit the scale, latency, throughput, and VM migration of enterprise cloud networks. Therefore, open standards and proprietary protocols have been proposed to address cloud computing networking issues. While existing layer 3 “fat tree”
networks provide a proven approach to address the requirements for a highly virtualized cloud
data center, there are several industry standards that enhance features of a flattened layer 2
network, using Transparent Interconnection of Lots of Links (TRILL), Shortest Path Bridging
(SPB) or have the potential to enhance future systems based on SDN concepts and OpenFlow.
The key motivation behind the TRILL, SPB, and SDN-based approaches is the relatively flat nature of the data-center topology and the requirement to forward packets across the shortest path between the endpoints (servers) to reduce latency, rather than through a root bridge or the priority mechanism normally used in the Spanning Tree Protocol (STP). IEEE 802.1Qaz, known as
Enhanced Transmission Selection (ETS), in line with other efforts, allows low-priority traffic to
burst and use the unused bandwidth from the higher-priority traffic queues, thus providing
greater flexibility [4]. Vendor-proprietary protocols have also been developed by major networking equipment manufacturers to address the same issues. For instance, Juniper Networks produces switches using a proprietary multipath L2/L3 encapsulation protocol called QFabric, which allows multiple distributed physical devices in the network to share a common control plane
and a separate common management plane. Virtual Cluster Switching (VCS) is a multipath layer
2 encapsulation protocol by Brocade, based on TRILL and Fabric Shortest Path First (FSPF) path
selection protocol and a proprietary method to discover neighboring switches. Cisco’s FabricPath is a multipath layer 2 encapsulation based on TRILL, which does not include TRILL’s next-hop header and has a different MAC learning technique. They all address the same issues with different features for scalability, latency, oversubscription, and management. However, none of these solutions has reached the same level of maturity as STP and MC-LAG [4].
Layer 2 (switching) and Layer 3 (routing) are two possible options for cloud
infrastructure networking. Layer 2 is the simpler option, where the Ethernet MAC address and
Virtual LAN (VLAN) information are used for forwarding. The drawback of switching (L2) is scalability: L2 networking flattens the network topology, which is not ideal when there is a large number of nodes. The routing (L3) option and subnets provide segmentation for the appropriate functions, at the cost of lower forwarding performance and higher network complexity.
Existing cloud networking architectures follow a “one size fits all” paradigm in meeting the diverse requirements of a cloud. The network topology, forwarding protocols, and security policies are all designed against the sum of all requirements, preventing optimal use of network resources. The key requirements and challenges are:
• Application performance: Cloud tenants should be able to specify bandwidth requirements for applications hosted in the cloud, ensuring performance similar to on-premises deployments. Many tiered applications require some guaranteed bandwidth between server instances to satisfy user transactions within an acceptable time frame and meet predefined SLAs. Insufficient bandwidth between these servers will impose significant latency on user interactions. Therefore, without explicit control, variations in cloud workloads and oversubscription can cause delays and drift of response times beyond acceptable limits, leading to SLA violations for the hosted applications.
• Flexible deployment of appliances: Enterprises deploy a wide variety of security
appliances in their data centers, such as Deep Packet Inspection (DPI) or Intrusion
Detection Systems (IDS), and firewalls to protect their applications from attacks. These
are often employed alongside other appliances that perform load balancing, caching and
application acceleration. When deployed in the cloud, an enterprise application should
continue to be able to flexibly exploit the functionality of these appliances.
• Policy enforcement complexities: Traffic isolation and access control to the end-users
are among the multiple forwarding policies that should be enforced. These policies
directly impact the configuration of each router and switch. Changing requirements,
different protocols (e.g., OSPF, LAG, VRRP), different flavors of L2 spanning tree
protocols, along with vendor specific protocols, make it extremely challenging to build,
operate and inter-connect a cloud network at scale.
• Topology dependent complexity: The network topology of data centers is usually tuned
to match a pre-defined traffic requirement. For instance, a network topology, which is
optimized for east-west traffic (i.e., traffic among servers in a data center), is not the
same as the topology for north-south (traffic to/from the Internet). The topology design
also depends on how L2 and/or L3 utilizes the effective network capacity. For instance, adding a simple link and switch in the presence of a spanning-tree-based L2 forwarding protocol may not provide additional capacity. Furthermore, evolving the
topology based on traffic pattern changes also requires complex configuration of L2 and
L3 forwarding rules.
• Application rewriting: Applications should run “out of the box” as much as possible, in
particular for IP addresses and for network-dependent failover mechanisms.
Applications may need to be rewritten or reconfigured before deployment in the cloud
to address several network related limitations. Two key issues are: 1) lack of a broadcast
domain abstraction in the cloud network and 2) cloud-assigned IP addresses for virtual
servers.
• Location dependency: Network appliances and servers (e.g., hypervisors) are typically
tied to a statically configured physical network, which implicitly creates a location
dependency constraint. For instance, the IP address of a server is typically determined based on the VLAN or subnet it belongs to. VLANs and subnets are based on physical switch port configuration. Therefore, a VM cannot be easily and smoothly migrated across the network. Constrained VM migration decreases the level of resource utilization and flexibility. Besides, the physical mapping of VLAN or subnet space to the physical ports of a switch often leads to a fragmented IP address pool.
• Multi-layer network complexity: A typical three-layer data center network includes a top-of-rack (TOR) layer connecting the servers in a rack, an aggregation layer, and a core layer, which provides connectivity to/from the Internet edge. This multi-layer architecture imposes significant complexities in defining the boundaries of L2 domains, L3 forwarding networks and policies, and layer-specific multi-vendor networking equipment.
Providers of cloud computing services are currently operating their own data centers.
Connectivity between the data centers to provide the vision of “one cloud” is completely within the
control of the cloud service provider. There may be situations where an organization or enterprise
needs to be able to work with multiple cloud providers due to locality of access, migration from one
cloud service to another, merger of companies working with different cloud providers, cloud providers
who provide best-of-class services, and similar cases. Cloud interoperability and the ability to share various types of information between clouds become important in such scenarios. Although cloud service providers might see less immediate need for any interoperability, enterprise customers will see a need to push them in this direction. This broad area of cloud interoperability is sometimes
known as cloud federation. Cloud federation manages consistency and access controls when two or
more independent cloud computing facilities share either authentication, computing resources,
command and control, or access to storage resources. Some of the considerations in cloud federation
are as follows:
• An enterprise user wishing to access multiple cloud services would be better served if there were just a single authentication and/or authorization mechanism (i.e., a single sign-on scheme; a minimal sketch follows this list). This may be implemented through an authentication server maintained by an enterprise that provides the appropriate credentials to the cloud service providers.
Alternatively, a central trusted authentication server could be used to which all cloud
services are interfaced. Computing and storage resources may be orchestrated through
the individual enterprise or through an interoperability scheme established between the
cloud providers. Files may need to be exchanged, services invoked, and computing
resources added or removed in a proper and transparent manner. A related area is VM
migration and how it can be done transparently.
• Cloud federation has to provide transparent workload orchestration between the clouds
on behalf of the enterprise user. Connectivity between clouds includes Layer 2 and/or
Layer 3 considerations and tunneling technologies that need to be agreed upon.
Consistency and a common understanding are required independent of the
technologies. An often-ignored concern for cloud federation is charging, billing, and reconciliation. Management and billing systems need to work together for cloud federation to be a viable option. This reality is underlined by the fact that clouds rely on per-usage billing. Cloud service providers might need to look closely at telecom service provider business models for peering arrangements as a possible starting point. Cloud federation is a relatively new area in cloud computing. It is likely that standards organizations will first need to agree on a set of requirements before the service interfaces can be defined and subsequently materialized.
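As a sketch of the single sign-on consideration above, the snippet below obtains one token from an enterprise authentication server and presents it to two federated cloud providers. All endpoints, URL paths, and field names here are hypothetical assumptions for illustration, not any real provider's API.

    import requests  # endpoints below are hypothetical

    ENTERPRISE_AUTH = "https://auth.example-enterprise.com/token"  # hypothetical
    CLOUD_PROVIDERS = ["https://api.cloud-a.example.com",          # hypothetical
                       "https://api.cloud-b.example.com"]

    def single_sign_on(user: str, password: str) -> list:
        """Obtain one credential from the enterprise auth server and
        present it to every federated cloud provider."""
        token = requests.post(ENTERPRISE_AUTH,
                              data={"user": user, "password": password}
                              ).json()["token"]
        sessions = []
        for api in CLOUD_PROVIDERS:
            # Each provider trusts the enterprise auth server, so the same
            # token is accepted everywhere: one sign-on, many clouds.
            r = requests.get(api + "/v1/session",
                             headers={"Authorization": "Bearer " + token})
            sessions.append(r.json())
        return sessions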
Consider an IaaS cloud, to which an enterprise connects to temporarily augment its server
capacity. It would be ideal if the additional servers provided by the IaaS cloud were part of the same addressing scheme as the enterprise (e.g., 10.x.x.x). As depicted in Figure 2, the IaaS cloud
service provider has partitioned a portion of its public cloud to materialize a private cloud for
enterprise “E”. The private cloud is reachable as a LAN extension to the servers in enterprise E’s
data center. A secure VPN tunnel establishes the site-to-site VPN connection. The VPN gateway
on the cloud service provider side (private cloud “C”) maintains multiple contexts, one for each private cloud. Traffic for enterprise “E” is decrypted and forwarded via an Ethernet switch to the private cloud. A server in enterprise “E”’s internal data center sees a server in private cloud “C” as being on the same network. Some evolution scenarios can be considered for this
scheme [5]:
• Automation of the VPN connection between the enterprise and cloud service provider: This automation can be done through a management system responsible for cloud bursting and server augmentation. The system sets up the VPN tunnels and configures the servers on the cloud service provider end (a configuration-generation sketch follows this list). The management system is set up and operated by the cloud service provider.
• Integration of the VPN functions with the site-to-site VPN network functions from
service providers: For instance, service providers offer MPLS Layer 3 VPNs and Layer 2
VPNs (also known as Virtual Private LAN Service, or VPLS) as part of their offerings.
Enterprise and cloud service providers could be set up to use these network services.
• Cloud service providers using multiple data centers: In such a situation, a VPLS-like
service can be used to bridge the individual data centers, providing the enterprise with complete transparency regarding the location of the cloud servers.
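As a sketch of the VPN automation scenario above, the snippet below shows how a management system might render a strongSwan-style site-to-site IPsec stanza for enterprise “E” and private cloud “C”. The template fields, connection name, and addresses are illustrative assumptions, not a prescribed configuration.

    # Render a strongSwan-style ipsec.conf stanza for one site-to-site
    # tunnel; parameter values below are illustrative assumptions.
    IPSEC_TEMPLATE = """conn {name}
        left={enterprise_gw}
        leftsubnet={enterprise_subnet}
        right={cloud_gw}
        rightsubnet={cloud_subnet}
        authby=secret
        auto=start
    """

    def render_vpn_config(name, enterprise_gw, enterprise_subnet,
                          cloud_gw, cloud_subnet):
        return IPSEC_TEMPLATE.format(name=name,
                                     enterprise_gw=enterprise_gw,
                                     enterprise_subnet=enterprise_subnet,
                                     cloud_gw=cloud_gw,
                                     cloud_subnet=cloud_subnet)

    print(render_vpn_config("enterprise-E-to-cloud-C",
                            "198.51.100.1", "10.1.0.0/16",  # enterprise side
                            "203.0.113.1", "10.2.0.0/16"))  # private cloud side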
Cloud networking is not a trivial task. Modern data centers designed to provide cloud service offerings face challenges similar to those of building the Internet itself, owing to their size. Even in the simplest case (e.g., providing VMs as Amazon’s EC2 does), a single facility may need to host as many as one million networked devices. These requirements call for technologies that are high-performance, scalable, robust, reliable, flexible, and easy to monitor, control, and manage.
SDN-based Cloud Computing Networking
SDN is an emerging network architecture where “network control functionality” is decoupled from “forwarding functionality” and is directly programmable [6], [7]. This migration
of control, formerly tightly integrated in individual networking equipment, into accessible
computing devices (logically centralized) enables the underlying infrastructure to be
“abstracted” for applications and network services. Therefore applications can treat the
network as a logical or virtual entity. As a result, enterprises and carriers gain unprecedented
programmability, automation, and network control, enabling them to build innovative, highly
scalable, flexible networks that readily adapt to changing business needs.
A logical view of the SDN architecture is depicted in Figure 3. OpenFlow is the first
standard interface designed specifically for SDN, providing high-performance, granular traffic
control across multiple vendors’ network devices. Network intelligence is logically centralized in SDN control software (e.g., OpenFlow controllers), which maintains a global view of the network. As a result, the network, in its ultimate abstracted view, appears as a single logical switch.
Adopting the SDN architecture greatly simplifies the design and operation of networks, since it removes the need to know and understand the operational details of hundreds of
protocols/standards. Enterprises and carriers gain vendor-independent control over the entire
network from a single logical point.
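To ground this, here is a minimal OpenFlow 1.3 controller sketch written against the open-source Ryu framework (our choice for illustration; any OpenFlow controller would do). On switch connection it installs a table-miss rule that sends unmatched packets to the controller, the canonical first step in programming the network from a logically centralized point.

    from ryu.base import app_manager
    from ryu.controller import ofp_event
    from ryu.controller.handler import CONFIG_DISPATCHER, set_ev_cls
    from ryu.ofproto import ofproto_v1_3

    class MinimalController(app_manager.RyuApp):
        """Installs a table-miss rule on every switch that connects, so
        unmatched packets are sent to the centralized controller."""
        OFP_VERSIONS = [ofproto_v1_3.OFP_VERSION]

        @set_ev_cls(ofp_event.EventOFPSwitchFeatures, CONFIG_DISPATCHER)
        def switch_features_handler(self, ev):
            datapath = ev.msg.datapath
            ofproto = datapath.ofproto
            parser = datapath.ofproto_parser
            match = parser.OFPMatch()  # wildcard match: every packet
            actions = [parser.OFPActionOutput(ofproto.OFPP_CONTROLLER,
                                              ofproto.OFPCML_NO_BUFFER)]
            inst = [parser.OFPInstructionActions(
                ofproto.OFPIT_APPLY_ACTIONS, actions)]
            # Priority 0: matched only when no other flow entry applies
            datapath.send_msg(parser.OFPFlowMod(datapath=datapath,
                                                priority=0, match=match,
                                                instructions=inst))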
In addition to network abstraction, the SDN architecture will provide and support a set of APIs that simplifies the implementation of common network services (e.g., slicing, virtualization, routing, multicast, security, access control, bandwidth management, traffic engineering, QoS, processor and/or storage optimization, energy consumption, and various forms of policy management). These capabilities are at the heart of SDN’s promise.
A list of SDN- and OpenFlow-based open source projects and initiatives is compiled in
Table 1. OpenFlow-based SDN has created opportunities to help enterprises build more
deterministic, more manageable and more scalable virtual networks that extend beyond
enterprise on-premises data centers or private clouds, to public IT resources, while ensuring
higher network efficiency to carriers seeking to improve their services profitability by
provisioning more services, with fewer, better-optimized resources.
Table 1: A categorized list of OpenFlow-based Open Source projects
Innovation opportunities
Several architectural groups of techniques can provide virtual networks in cloud infrastructures: plain VLANs (group a), VM-aware networking (group b), vCDNI, VXLAN, and Nicira NVP. The first constraint of VLANs is the 4K limit on the number of VLAN IDs. Second, the MAC addresses of all the VMs are visible in the physical switches of the network. This can fill up the MAC tables of the physical switches, especially if the deployed switches are legacy ones. Typical NICs are able to receive unicast frames for only a few MAC addresses. If the number of VMs exceeds this limit, the NIC has to be put in promiscuous mode, which engages the CPU to handle flooded packets, wasting hypervisor CPU cycles and bandwidth.
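The arithmetic behind the “4K” ceiling is simple; the short sketch below contrasts the 12-bit 802.1Q VLAN ID space with the 24-bit VXLAN segment ID discussed later in this section.

    VLAN_ID_BITS = 12    # IEEE 802.1Q VLAN ID field
    VXLAN_VNI_BITS = 24  # VXLAN segment ID (see below)

    # 4096 IDs (a couple reserved), hence the "4K" VLAN ceiling
    print(2 ** VLAN_ID_BITS)     # 4096
    # The 24-bit segment ID lifts this by a factor of 4096
    print(2 ** VXLAN_VNI_BITS)   # 16777216 possible virtual segments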
VM-aware networking (architectural group b) scales a bit better. The whole idea is that the VLAN list on the link between the physical switch and the hypervisor is dynamically adjusted based on the servers’ needs. This can be done with VM-aware TOR switches (Arista, Force 10, Brocade), a VM-aware network management server (Juniper, Alcatel-Lucent, NEC) that configures the physical switches dynamically, VM-FEX from Cisco, or EVB from IBM. This approach reduces flooding to the servers and CPU utilization, and using proprietary protocols (e.g., QFabric) it is possible to decrease flooding in the physical switches as well. However, MAC addresses are still visible in the physical network, the 4K limitation remains intact, and the transport in the physical network is L2 based, with the associated flooding problems. This approach could be used for large virtualized data centers but not for IaaS clouds.
The main idea behind vCDNI is that there is a virtual distributed switch, isolated from the rest of the network and controlled by vCloud Director, which uses a proprietary MAC-in-MAC encapsulation instead of VLANs. Therefore the VM MAC addresses are not visible in the physical network. Since the vCDNI protocol has a longer header, the 4K limitation of VLANs no longer applies. Although unicast flooding does not exist in this solution, multicast flooding does. Furthermore, it still uses L2 transport.
Conceptually, VXLAN is similar to the vCDNI approach; however, instead of using a proprietary protocol on top of L2, it runs on top of UDP and IP. Inside the hypervisor, port groups are available that are tied to VXLAN framing, which generates UDP packets that go down through the IP stack in the hypervisor and reach the physical IP network. VXLAN segments are virtual layer 2 segments over an L3 transport infrastructure, with a 24-bit segment ID to alleviate the traditional VLAN limitation. L2 flooding is emulated using IP multicast. The only issue with VXLAN is that it does not have a control plane.
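For illustration, the sketch below builds the 8-byte VXLAN header as specified in the VXLAN draft (later RFC 7348): a flags word with the I bit set, followed by the 24-bit segment ID shifted into the upper three bytes. The hypervisor then prepends the outer UDP and IP headers; the VNI value is an arbitrary example.

    import struct

    def vxlan_header(vni: int) -> bytes:
        """Build the 8-byte VXLAN header: a flags word with the I bit
        (0x08) set, then the 24-bit segment ID (VNI) shifted into the
        top 3 bytes of the second word (low byte is reserved)."""
        flags_word = 0x08 << 24            # I flag in the first 32-bit word
        vni_word = (vni & 0xFFFFFF) << 8   # 24-bit VNI
        return struct.pack("!II", flags_word, vni_word)

    hdr = vxlan_header(5001)
    print(hdr.hex())  # 0800000000138900 -> VNI 5001 (0x001389)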
Nicira NVP is very similar to VXLAN but uses a different encapsulation format, namely point-to-point GRE tunnels; the MAC-to-IP mapping is downloaded to Open vSwitch [8] using a centralized OpenFlow controller. This controller removes the need for the flooding that VXLAN requires (via IP multicast). To be precise, this solution utilizes MAC over IP with a control plane. The virtual switches used in this approach are OpenFlow enabled, which means that they can be controlled by an external OpenFlow controller (e.g., NOX). These Open vSwitches use point-to-point GRE tunnels that unfortunately cannot be provisioned by OpenFlow; the tunnels have to be provisioned using other mechanisms, because OpenFlow has no tunnel-provisioning message. The Open vSwitch Database Management Protocol (OVSDB) [9], a provisioning protocol, is used to construct a full mesh of GRE tunnels between the hosts that have VMs from the same tenant: whenever two hosts each have a VM belonging to the same tenant, a GRE tunnel is established between them. Instead of using dynamic MAC learning and multicast, the MAC-to-IP mappings are downloaded as flow forwarding rules through OpenFlow to the Open vSwitches. This approach scales much better than VXLAN, because there is no state to maintain in the physical network. Furthermore, an ARP proxy can be used to stop L2 flooding. This approach requires an OpenFlow controller and an OVSDB controller working in parallel to automatically provision the GRE tunnels.
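A sketch of this full-mesh provisioning step is shown below, under the assumption that the standard ovs-vsctl utility is available on each hypervisor (it speaks OVSDB to the local Open vSwitch). The bridge name and peer addresses are illustrative; a production controller would of course drive OVSDB directly rather than shelling out.

    import subprocess

    def mesh_gre_tunnels(bridge: str, peer_ips: list) -> None:
        """Create one GRE tunnel port per remote host carrying a VM of
        the same tenant, yielding a full mesh."""
        for i, remote_ip in enumerate(peer_ips):
            port = "gre%d" % i
            subprocess.run(
                ["ovs-vsctl", "add-port", bridge, port, "--",
                 "set", "interface", port, "type=gre",
                 "options:remote_ip=%s" % remote_ip],
                check=True)

    # Hypothetical hypervisors hosting this tenant's other VMs
    mesh_gre_tunnels("br-int", ["192.0.2.11", "192.0.2.12", "192.0.2.13"])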
SDN-based Federation
There are general advantages to be realized by enterprises that adopt OpenFlow-
enabled SDN as the connectivity foundation for private and/or hybrid cloud connectivity. A
logically centralized SDN control plane will provide a comprehensive view (abstract view) of
data center and cloud resources and of access network availability. This will ensure that cloud-federation (cloud-extension) workloads are directed to adequately resourced data centers, over links providing sufficient bandwidth and service levels. Using SDN terminology, a high-level description of the key building blocks of an SDN-based cloud federation is:
• OpenFlow-enabled cloud backbone edge nodes, which connect to the enterprise and cloud provider data centers
• OpenFlow-enabled core nodes, which efficiently switch traffic between these edge nodes
• An OpenFlow and/or SDN-based controller to configure the flow forwarding tables in the cloud backbone nodes and to provide a WAN network virtualization application (e.g., Optical FlowVisor [10])
• Hybrid cloud operation and orchestration software to manage the enterprise and provider data center federation, inter-cloud workflow, resource management of compute/storage, and inter-data center network management
SDN-based federation will facilitate multi-vendor networks between enterprise and service provider data centers, helping enterprise customers to choose best-in-class vendors while avoiding vendor lock-in; pick a proper access technology from a wider variety (e.g., DWDM, DSL, HFC, LTE, PON); access dynamic bandwidth for ad-hoc, timely inter-data center workload migration and processing; and eliminate the burden of underutilized, costly high-capacity fixed private leased lines. SDN-enabled bandwidth-on-demand services provide automated and intelligent service provisioning, driven by cloud service orchestration logic and customer requirements.
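As an illustration of such bandwidth-on-demand provisioning, the sketch below posts a path request to an orchestration API. The controller URL, resource path, and payload fields are hypothetical assumptions made for the example; no standard interface for this exists yet, as noted above.

    import requests  # endpoint and payload shapes below are hypothetical

    SDN_CONTROLLER = "https://sdn-controller.example.net/api"  # hypothetical

    def request_inter_dc_bandwidth(src_dc: str, dst_dc: str,
                                   mbps: int, hours: int) -> dict:
        """Ask the (hypothetical) orchestration API for a temporary
        bandwidth-on-demand path between two federated data centers."""
        payload = {"source": src_dc, "destination": dst_dc,
                   "bandwidth_mbps": mbps, "duration_hours": hours}
        r = requests.post(SDN_CONTROLLER + "/v1/bod-paths", json=payload)
        r.raise_for_status()
        return r.json()  # e.g., a path ID the orchestrator can tear down

    # Burst capacity for an overnight VM migration between sites
    path = request_inter_dc_bandwidth("dc-enterprise", "dc-cloud-a", 1000, 8)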
Conclusions
In this article, the Infrastructure as a Service (IaaS) architecture and its key challenges were presented, with a focus on virtual networks and cloud federation. IaaS has provided a flexible model, in
which customers are billed according to their compute usage, storage consumption, and the
duration of usage. Some of the challenges in the existing Cloud Networks are: guaranteed
performance of applications when applications are moved from on-premises to the cloud
facility, flexible deployment of appliances (e.g., deep packet inspection, intrusion detection
systems, or firewalls), and associated complexities to the policy enforcement and topology
dependence. A typical three-layer data center network includes a TOR layer connecting the servers in a rack, an aggregation layer, and a core layer, which provides connectivity to/from the Internet edge. This multi-layer architecture imposes significant complexities in defining
boundaries of L2 domains, L3 forwarding networks and policies, and layer-specific multi-vendor
networking equipment. Applications should run “out of the box” as much as possible, in
particular for IP addresses and for network-dependent failover mechanisms. Network
appliances and servers (e.g., hypervisors) are typically tied to a statically configured physical
network, which implicitly creates a location dependency constraint. SDN architecture in
addition to decoupling the data forwarding and control planes will provide and support a set of
APIs that simplifies the implementation of common network services. VLAN, VM-aware
networking, vCDNI, VXLAN and Nicira NVP are technologies to provide virtual networks in cloud
infrastructures. Nicira NVP, which utilizes MAC-in-IP encapsulation and an external control plane, provides an efficient solution for virtual network implementation. OpenFlow core and edge nodes with a proper OpenFlow controller can be considered a novel cloud federation mechanism. SDN-based federation will facilitate multi-vendor networks between enterprise and service provider data centers, helping enterprise customers to choose best-in-class vendors. Network fabric, a proposal for a network-edge version of OpenFlow, is one of the recent proposals toward extending SDN to increase the simplicity and flexibility of future network designs. What we should make clear is that SDN does not, by itself, solve all the issues
of cloud computing networking. The performance of SDN deployments, scalability, the proper specification of SDN’s northbound interface, the coexistence and/or integration of SDN and network function virtualization, and proper extensions to OpenFlow to make it a viable approach in WAN-based applications (e.g., the EU FP7 SPARC project) are among the topics that need further research and investigation.
References:
[1] P. Mell and T. Grance, “The NIST Definition of Cloud Computing,” September 2011: http://csrc.nist.gov/publications/nistpubs/800-145/SP800-145.pdf, accessed 30 November 2012.
[2] T. Koponen, M. Casado, N. Gude, J. Stribling, L. Poutievski, M. Zhu, R. Ramanathan, Y. Iwata, H. Inoue, T. Hama, and S. Shenker, “Onix: A Distributed Control Platform for Large-scale Production Networks,” in Proc. OSDI, 2010.
[3] R. Niranjan Mysore, A. Pamboris, N. Farrington, N. Huang, P. Miri, S. Radhakrishnan, V. Subramanya, and A. Vahdat, “PortLand: A Scalable Fault-Tolerant Layer 2 Data Center Network Fabric,” ACM SIGCOMM Comput. Commun. Rev., vol. 39, no. 4, pp. 39-50, October 2009.
[4] C. J. Sher Decusatis, A. Carranza, and C. M. Decusatis, “Communication within clouds: open standards and proprietary protocols for data center networking,” IEEE Communications Magazine, vol. 50, no. 9, pp. 26-33, September 2012.
[5] T. Wood, P. Shenoy, K. K. Ramakrishnan, and J. Van der Merwe, “CloudNet: A Platform for Optimized WAN Migration of Virtual Machines,” Technical Report 2010-002, University of Massachusetts, http://people.cs.umass.edu/~twood/pubs/cloudnet-tr.pdf, accessed 30 November 2012.
[6] N. McKeown et al., “OpenFlow: Enabling Innovation in Campus Networks,” OpenFlow white paper, 14 March 2008, available online: http://www.openflow.org//documents/openflow-wp-latest.pdf, accessed 30 November 2012.
[7] OpenFlow Switch Specification, version 1.3.1 (wire protocol 0x04), Open Networking Foundation, 6 September 2012, https://www.opennetworking.org/images/stories/downloads/specification/openflow-spec-v1.3.1.pdf, accessed 30 November 2012.
[8] Open vSwitch, An Open Virtual Switch: http://openvswitch.org/
[9] B. Pfaff and B. Davie, “The Open vSwitch Database Management Protocol,” Internet-Draft, draft-pfaff-ovsdb-proto-00, Nicira Inc., 20 August 2012.
[10] S. Azodolmolky, R. Nejabati, S. Peng, A. Hammad, M. P. Channegowda, N. Efstathiou, A. Autenrieth, P. Kaczmarek, and D. Simeonidou, “Optical FlowVisor: An OpenFlow-based optical network virtualization approach,” in Proc. OFC/NFOEC 2012, paper JTh2A.41, 4-8 March 2012.