CCNP Route119
Disclaimer:
Introduction:
Design Overview:
    Topology Diagrams:
    Core:
    Point of Presence (POP):
    Data Center:
    Addressing:
Network Underlay:
    Core & POP:
    Data Center:
        BGP
            IPv4 Unicast
            BGP Link-State
            Data Center BGP
Segment Routing (SR)
    Traffic Engineering
    Controller Integration (PCE)
        Traffic Steering
            On-Demand Nexthop
            Binding SID Stitching
            Static Route SRTE
            Summary
    TI-LFA
        Operation Examination
        Configuration Options
        Summary
Overlay Services
    L3VPN
        VPNv4 Address Family
    L2VPN
        BGP L2VPN EVPN VPWS
        L2VPN EVPN SR ODN
        EVPN VPLS
    Data Center (VXLAN)
        VXLAN Layer 2 Concepts
        VXLAN Layer 3 Concepts
            Fabric Forwarding
            L3VNI Routing
            External Connectivity
Orchestration (NSO)
    NSO Device Import
    Configuration Monitoring
    NSO Templates
Case Studies
    Enterprise Customer
        HQ
        Remote Site 1
        Data Center Interconnect
        Summary
    Data Center Routing
        Downstream VNI
Project Summary
Disclaimer:
Route 109 is a fictitious network provider that will act as a vessel to explore various
technologies being implemented within today's internet service provider landscape. The goal of
the Route 109 project is strictly educational and does not serve as any meaningful
implementation guide. This project is personal work and holds no affiliation to any external
entities. The following report will only serve as basic documentation.
Introduction:
The purpose of this project was simply to gain a further understanding of service delivery
technologies and associated design methodology. Beyond simple exposure to specialized
technology, the inherent complexity of this lab provides a better understanding of many aspects
of computer networking. The following document is broken into sections that discuss the how and why of each design decision, accompanied by technical detail.
Design Overview:
The Route 109 project consists strictly of Cisco physical and virtual devices performing routing
and switching operations. Besides the traditional networking infrastructure, Linux and Windows
devices will simulate various end users and provide supporting services. A single F5 Virtual
Edition will be used within the data center environment. A complete list of components can be
found below.
F5 BIG-IP VE 17.1.0.3 (quantity: 1)
Topology Diagrams:
Core:
The core has been implemented with redundancy and path options in mind. This design allows
for node and link failure while retaining active paths spanning edge regions. The core is broken
into multiple IGP domains and orchestrated via a path computation element (PCE), which will be discussed further in later sections. The core consists of only IOS-XR devices.
Point of Presence (POP):
The point of presence architecture is relatively straightforward. This design allows equal-cost paths to direct north and south traffic to and from the provider core. Like the core, this realm will be orchestrated via PCE to make adjustments and provide path variation based on operator-specified parameters. The point of presence consists of IOS-XE and IOS-XR devices.
Data Center:
The Data Center module uses a Clos architecture, which allows the provisioning of
VXLAN-based services. This design is optimized for east-west traffic patterns and allows
seamless host mobility. This module will interact with the service provider network via external Border Gateway Protocol (eBGP) to provide external connectivity. The data center consists of
only NX-OS devices within the switching fabric.
Addressing:
In the lab, attempts will be made to retain a predictable addressing scheme. The general practice in use is
10.n1.n2.n
where n1 is the higher-numbered node, n2 is the lower-numbered node, and the final octet identifies the node on which the address resides. A /24 prefix length has been used to keep things simple between nodes. Loopback addressing will be the node number repeated (e.g., 10.10.10.10/32).
The diagram below depicts the addressing scheme.
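As a brief illustration of the convention (the node numbers and interface names here are hypothetical), a link between node 5 and node 2 would use the 10.5.2.0/24 subnet:
! Node 5 side of the link
interface GigabitEthernet0/0/0/0
ipv4 address 10.5.2.5 255.255.255.0
!
! Node 2 side of the link
interface GigabitEthernet0/0/0/0
ipv4 address 10.5.2.2 255.255.255.0
!
! Node 5 loopback (node number repeated)
interface Loopback0
ipv4 address 5.5.5.5 255.255.255.255
Hypothetical addressing example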
On the data center nodes, IP unnumbered has been utilized on multiple links. The data center underlay configuration section will discuss the specific IPs used. The loopback addressing deviates slightly within the point of presence module and uses the 10.40.n.n format.
Network Underlay:
The following section will discuss the underlay networks for all three modules. This section will
only cover basic IGP functions and configuration needs. Additional protocols interacting with the
IGP will be discussed further in their respective sections.
Core & POP:
The core underlay will consist of three separate IGP domains. The separation of domains provides scalability and fault isolation. In this lab, the forwarding (non-ABR) routers in each domain will have no knowledge of routers in the other domains. BGP will provide the necessary reachability for certain endpoints across the domains.
As shown above, the breaking up of domains creates a modular design that provides central
troubleshooting points when investigating end-to-end forwarding. Data plane forwarding will be
handled via segment routing and discussed in future sections. The baseline IGP configurations
are displayed in the following code blocks, accompanied by brief explanations.
!
router ospf 100
distribute link-state instance-id 100
segment-routing mpls
segment-routing forwarding mpls
segment-routing sr-prefer
area 0
prefix-suppression
mpls traffic-eng
interface Loopback0
prefix-sid index 3
!
interface GigabitEthernet0/0/0/5
network point-to-point
!
interface GigabitEthernet0/0/0/6
network point-to-point
!
!
!
IOS-XR node OSPF Configuration
router ospf 40
router-id 10.40.1.1
prefix-suppression
segment-routing mpls
mpls traffic-eng router-id Loopback0
mpls traffic-eng area 0
IOS-XE node OSPF Configuration
All links between nodes are configured as point-to-point networks. Segment routing extensions have been enabled, and link-state information is being distributed. The link-state information will be forwarded via BGP to enable inter-domain traffic engineering. Prefix suppression is being performed to minimize table size.
The IS-IS configuration is similar to OSPF. Metric-style wide has been configured to allow segment routing TLVs to pass properly. Segment routing extensions have also been enabled. Like prefix suppression in OSPF, the advertise passive-only command has been used to limit table size. All links are configured as level 2 only and operate as point-to-point links.
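A minimal sketch of an IOS-XR IS-IS underlay configuration matching the behaviors just described (the NET, prefix-SID index, BGP-LS instance ID, and interface names are illustrative assumptions):
router isis LAB
is-type level-2-only
net 49.0001.0000.0000.0001.00
distribute link-state instance-id 101
address-family ipv4 unicast
metric-style wide
advertise passive-only
segment-routing mpls sr-prefer
!
interface Loopback0
passive
address-family ipv4 unicast
prefix-sid index 1
!
interface GigabitEthernet0/0/0/0
point-to-point
address-family ipv4 unicast
!
Hypothetical IOS-XR IS-IS underlay configuration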
Data Center:
Data center devices will utilize OSPF as an IGP underlay. The interconnections will be
configured as point-to-point IP unnumbered links. The data center module loopback addressing
is shown below:
The OSPF configuration for the data center underlay is simple, as it only serves as a reachability method for the BGP endpoints directing the overlay traffic.
interface Ethernet1/1
no switchport
medium p2p
ip unnumbered loopback0
ip ospf network point-to-point
ip router ospf 100 area 0.0.0.0
ip pim sparse-mode
no shutdown
BGP
It can be argued that BGP will act as both an underlay and an overlay within this network. For this reason, it will be divided into two separate sections. This section will review the underlay needs, while services will be discussed later.
IPv4 Unicast:
The IPv4 Unicast family will tie together the entire infrastructure. The first use case we will
explore is end-to-end reachability throughout the core and POP locations.
The diagram above displays the routers participating in the IPv4 Unicast address family. The routers acting as area border routers and route reflectors perform next-hop-self operations between the domains. In the core layer, the designated route reflector is the relay point for P2, P3, PE11, and PE12. The ABR/RR router acts as the relay point within the OSPF domain.
Through the advertisement of loopback interfaces, an operator can achieve end-to-end
connectivity (Segment Routing/LDP needs to be used for data plane forwarding). A quick
traceroute from PE20 to R4 will demonstrate this ability.
As displayed, the traceroute probe can navigate all three domains by utilizing a single label
towards the ABR. The ABR will then reference its forwarding table, which has entries for both
domains. If necessary, the final ABR will place the final label, and traffic will terminate at the
correct endpoint. This document will now follow the BGP hops taken in the previous traceroute
and explore each device's BGP configuration. The block below will display the BGP
configuration present at PE20.
#PE20 Configuration
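A minimal sketch of a basic PE configuration in this state, reusing the provider ASN (10200) and route-reflector loopback (100.100.100.100) that appear elsewhere in this document (the router ID is an assumption):
router bgp 10200
bgp router-id 20.20.20.20
address-family ipv4 unicast
!
neighbor 100.100.100.100
remote-as 10200
update-source Loopback0
address-family ipv4 unicast
next-hop-self
!
Hypothetical PE20 IPv4 Unicast BGP configuration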
PE20 operates without much configuration. PE20 will act as a basic PE router in this state, only requiring a next-hop change for eBGP-learned routes before forwarding. Moving on to an ABR router, the configuration becomes more involved as the router's responsibility increases.
The ABRs must act as route reflectors to reflect routes into their domains. The route reflector distinction overrides the iBGP loop-prevention behavior. Also, the device must change the next-hop address due to the separated domains. In IOS-XR, the configuration must include ibgp policy out enforce-modifications for the next-hop-self command to have any impact on iBGP routes.
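A minimal sketch of the ABR/RR side implementing the behavior just described (grouping the clients in a neighbor-group is an assumption made for brevity):
router bgp 10200
ibgp policy out enforce-modifications
address-family ipv4 unicast
!
neighbor-group DOMAIN-CLIENTS
remote-as 10200
update-source Loopback0
address-family ipv4 unicast
route-reflector-client
next-hop-self
!
Hypothetical ABR/RR IPv4 Unicast BGP configuration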
BGP Link-State
While segment routing has yet to be discussed, understand that link-state information must be
shared between the domains to control traffic engineering paths. BGP has been adapted to
share this information between domain boundaries. This new address family is known as BGP
Link State (BGP-LS) AFI 16388, SAFI 71. The link state updates are robust and provide all the
necessary details for a traffic controller to map each domain.
Router configuration to share link-state information is minimal. There are two touch points: one in the IGP configuration and another adding the address family to the appropriate neighbors within the BGP configuration.
router ospf 40
distribute link-state instance-id 40
IGP link-state distribution configuration
Each IGP process will be redistributed with an instance number. These numbers need to be
globally unique.
neighbor 100.100.100.100
remote-as 10200
update-source Loopback0
!
address-family link-state link-state
BGP link-state configuration
In the lab network, the controller that will utilize this information resides within the core module.
This allows the ABR to easily share the information for all domains without any involved routing
configuration. For the information to reach its final destination, the route reflector will peer with
the controller utilizing the BGP-LS family. Reviewing the BGP summary command, the controller
has received 98 entries that have been reflected.
The BGP Link-State table is a busy output. Still, it provides the basic information necessary to
verify that the links are properly distributed.
Data Center BGP
The objective of the data center module is to create a fabric. In this deployment, the route reflectors peer with all three leaves, sharing all routes. The peering allows the devices to act as one large switch. The L2VPN EVPN address family allows seamless layer 2 forwarding behavior over the layer 3 underlay. This is known as a spine-leaf architecture. The origin of this architecture comes from a telephone-switching layout developed by Charles Clos. The high-level objective is multiple non-blocking links with predictable traffic patterns. A separate loopback address is used for BGP configuration in this environment. This is due to VXLAN behavior that would cause BGP to go down if the VXLAN tunnel endpoint shared the same loopback. Also, enabling the correct features is essential to configure Cisco Nexus devices properly. Below are the required features and the initial spine configuration.
###Leaf Only
feature fabric forwarding
feature interface-vlan
feature vn-segment-vlan-based
The spine accepts and reflects routes from every leaf peer. Extended communities are
necessary to interpret the L2VPN NLRI correctly. The leaf unit configuration is similar to that of
the spines. It creates a full mesh with the route-reflectors.
Segment Routing (SR)
The SR configuration can be placed in a few lines. The first necessary configuration within this specific lab will be to carve an explicit label block for SR, known as the Segment Routing Global Block (SRGB). While 16000-23999 is the default SRGB on a single device, it is best to define the range explicitly in a multi-domain design so that every domain agrees on the block.
segment-routing
global-block 16000 23999
SR label block configuration
Once the block is allocated, IS-IS and OSPF can be configured to enable segment routing
extensions.
#PE11 Configuration
router isis LAB
address-family ipv4 unicast
mpls traffic-eng level-2-only
mpls traffic-eng router-id Loopback0
segment-routing mpls sr-prefer
!
interface Loopback0
address-family ipv4 unicast
prefix-sid index 11
router ospf 40
segment-routing mpls
segment-routing sr-prefer
area 0
interface Loopback0
prefix-sid index 11
IS-IS and OSPF SR configuration
Under the interface configuration, a prefix-sid is referenced; this is the node identifier for SR.
This is formally known as a node segment identifier (node SID). This SID needs to be globally
significant as it acts as the unique ID for said node. The index specification is the node's position within the global block: index 11 = 16000 + 11, resulting in a node SID for PE11 of 16011.
Adjacency-sid verification
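The manually configured adjacency SID discussed below is set under the IGP interface. A hypothetical sketch (the label value from the SRLB range and the IS-IS context are assumptions; an equivalent command exists under OSPF):
router isis LAB
interface GigabitEthernet0/0/0/1
address-family ipv4 unicast
adjacency-sid absolute 15101
Hypothetical static adjacency SID configuration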
Viewing the MPLS forwarding table, the manually configured SID and the dynamically allocated SID are both visible for Gi0/0/0/1. The static label can now be called in a traffic engineering policy,
forcing the applicable traffic through the specified link. The adjacency SID will also assume a
significant role when implementing fast reroute, which will be discussed in future sections. With
the basic configuration demonstrated in this section, the network should have end-to-end
reachability through SR labels. A traceroute from PE11 to PE20 will verify this claim.
Referring back to the node SID map presented earlier in this section, you can see the path imposes the ABR prefix-SID. Upon reception of that initial packet, the ABR performs a lookup for destination 20.20.20.20, which returns the node SID of PE20 (16020). This label is then imposed
and directed out the correct interface toward the destination.
Currently, the traffic successfully reaches its destination, but the operator has no granular path
control. The network only forwards to the border nodes where the new label is imposed, and the
best IGP path is taken. Traffic Engineering will provide the necessary tools to configure granular
label paths.
Traffic Engineering
Traffic engineering is not a new concept. It has previously been handled through the combination of LDP and RSVP-TE. Segment Routing again provides new options and methods for granular traffic control with less configuration overhead. For intra- and inter-domain traffic,
Segment Routing Traffic Engineering (SRTE) will utilize policy configuration to enforce the
desired path.
Focusing on the simple path depicted in the diagram above, a policy can be constructed to force
all traffic with a next hop of 3.3.3.3 over the specific path. The policy will consist of an explicit
path list directing all traffic over the transport path of P5-P2-P3.
segment-list 20-5-2-3
index 1 mpls label 16005
index 2 mpls label 16002
index 3 mpls label 16003
!
policy 20_to_3
color 3 end-point ipv4 3.3.3.3
autoroute
include ipv4 all
!
candidate-paths
preference 100
explicit segment-list 20-5-2-3
Explicit segment list configuration
The first configuration is the explicit path, which the SRTE policy will reference. In this path
configuration, the indexed instructions will utilize MPLS label values. The label stack imposed will mirror the explicit path (the first hop resolves locally, meaning 16005 is not imposed; traffic is simply sent out the interface toward P5 carrying the remaining labels). The policy
is given a simple name, and the router will also create its own internal name based on the color
and endpoint values. The color value is used to identify traffic intended for specific paths. For
instance, multiple tunnels may exist toward node 3.3.3.3, and an operator could classify traffic
color to utilize a particular tunnel toward the endpoint (e.g., Color X is designated to tunnel one
while Color Y uses tunnel two). The endpoint is the IP value of the tunnel-terminating endpoint.
The autoroute configuration forces applicable traffic down the policy-defined path. In the case of
this intra-domain example, only 3.3.3.3 will be reachable via the tunnel. Autoroute is only
applicable for IGP prefixes. Before committing the policy, a traceroute can confirm that the traffic is still taking the preferred IGP path.
Upon activation, the new path can be observed. Also, the routing table will now display the
SRTE policy as the next hop for traffic destined to 3.3.3.3.
The command show segment-routing traffic-eng policy color <color> can be run to view details and policy status. From this output, an operator can assess the state of the policy and the current path in use.
The preceding intra-area demonstration shows the path specificity achievable with SRTE. Many additional options will be assessed in later sections.
Controller Integration (PCE)
Configuration of the BGP Link State address family provides the PCE node with a rich database of information that can be utilized to delegate SRTE policies based on various specifications. An onboard demonstration can help showcase this ability by forcing the PCE to calculate a path based on known values.
In the example above, the PCE could examine all available paths between two nodes existing in separate domains and determine the optimal path based on latency. The cspf-sr-mpls option in the command specifies that the requested calculation should be performed as a constrained shortest path first (CSPF) SR computation. To push this policy from the PCE to the PCC (PE11), the
configuration needed at the PCE node is similar to the manually configured policy demonstrated
in the previous section.
pce
segment-routing
traffic-eng
peer ipv4 11.11.11.11
policy 11_20_Latency
color 100 end-point ipv4 20.20.20.20
candidate-paths
preference 100
dynamic mpls
metric
type latency
PCE based policy configuration
All traffic engineering configurations conducted on the PCE node will be placed under the
top-level PCE configuration mode. The PCE node must specify a peer target to know where to
send the specific configured policies. Like the manual policy, a name, color, and endpoint must
be set for the policy to be deemed valid. Under the candidate paths, the dynamic path type is selected. This allows the PCE to dynamically choose the path that best matches the metric constraint. In this case, the PCE considers all paths and selects the option with the lowest latency. If this path fails, the PCE will assess the remaining paths and determine a new
low-latency option. Reviewing the SRTE policies at PE11 reveals that the policy has
successfully been pushed from the PCE.
Traffic Steering
SRTE traffic steering can be handled in various ways. While this document will aim to cover
most use cases, the options continue to grow. Within the Route 109 network, both automated
and manual traffic steering options will be utilized and documented.
On-Demand Nexthop
On-demand Nexthop (ODN) allows setting various policies that will be instantiated and torn down as needed. In a multi-domain environment such as this lab, ODN simplifies operations by enabling the PCE to handle the necessary stitching of labels while offloading computation responsibility to a centralized device.
The diagram above demonstrates the high-level overview of the ODN process. L3VPN client
routes are tagged with a specific color upon ingress. These routes are then reflected to other
applicable provider edge nodes. If an existing on-demand policy is present, the traffic will be
steered into said policy based on the color value. The computation for this policy can be
performed locally or via the controller. Within this multi-domain lab, this will be handled by the
PCE. The configuration on the edge nodes is minimal and delegates control of the path to the
PCE.
segment-routing
traffic-eng
on-demand color 500
dynamic
pcep
PCC on-demand color configuration
With the current setup, any traffic received possessing the extended community color of 500 will
trigger a request toward the PCE to compute the necessary path. With no constraints, the path
returned will reflect the existing IGP path through each domain. Reviewing a route received via
VPNv4, the color 500 has triggered the PCE computation. The PCE extracts the next-hop from
the BGP route and calculates a policy for said node.
The traceroute demonstration verified the label stack defined by the PCE had been imposed
with the VPNv4 label residing at the bottom of the stack.
A simple change to the policy could also provide a dynamic low-latency path for customers who decide to purchase the service. Adding the color value of 600 will trigger a computation request toward the PCE (via PCEP) for a low-latency path; a sketch of that policy follows.
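A minimal sketch of that additional on-demand color on the edge nodes (the color value 600 is from the text above; the latency metric constraint mirrors the PCE policy shown earlier):
segment-routing
traffic-eng
on-demand color 600
dynamic
pcep
!
metric
type latency
Hypothetical low-latency ODN color configuration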
The color distinction is handled via VRF export on the IOS-XR edge node. A route policy sets the color extended community on exported routes. The route reflector then reflects the route, carrying the color value, to the other edge nodes.
route-policy VPN-LL
set extcommunity color VPN-LL
end-policy
vrf CXw-24
address-family ipv4 unicast
import route-target
10200:24
export route-policy VPN-LL
export route-target 10200:24
IOS-XR router policy and VRF export configuration
With minimal configuration, operators can maintain various traffic groups providing specific behavior and constraints through the use of color distinctions and PCE-driven policy. On-demand next hop provides a simplified approach to scale with network/customer growth.
Binding SID Stitching
The diagram above displays the separation between policies. The explicit BSID value is pulled from the Segment Routing Local Block (15000-15999). Beginning with Policy-ONE, an explicit segment list will direct traffic over the desired path. The final label within the explicit list will be the BSID value of Policy-TWO. This instruction will cause the traffic to be steered into the following policy.
#PE20 16020
segment-list LIST-POL-1
index 1 mpls label 16006
index 2 mpls label 16005
index 3 mpls label 16002
index 4 mpls label 15009
policy Policy-ONE
color 700 end-point ipv4 2.2.2.2
candidate-paths
preference 100
explicit segment-list LIST-POL-1
policy Policy-TWO
binding-sid mpls 15009
color 700 end-point ipv4 11.11.11.11
candidate-paths
preference 100
explicit segment-list LIST-POL-2
#PE11 16011
segment-list LIST-POL-3
index 1 mpls label 16041
index 2 mpls label 16042
index 3 mpls label 16044
policy Policy-THREE
binding-sid mpls 15019
color 700 end-point ipv4 10.40.4.4
candidate-paths
preference 100
explicit segment-list LIST-POL-3
BSID policy configuration across domains
Through the use of a traceroute, the explicit path can be visualized. The three separate policy paths have been highlighted in the code block below. The bottom label, signaled via the VPNv4 peerings, remains unchanged as the traffic is sent through multiple policies.
Another benefit of stitching policies is control of the label stack depth. Using multiple policies, the label stack is refreshed along the end-to-end path and never grows larger than four labels deep. Without the BSID, the label stack would carry seven separate labels at ingress.
Static Route SRTE
Traffic can also be steered into an SRTE policy by pointing a static route directly at the policy, as shown below.
segment-list RING
index 1 mpls label 16041
index 2 mpls label 16043
index 3 mpls label 16044
index 4 mpls label 16042
policy POL-RING
color 8 end-point ipv4 10.40.2.2
candidate-paths
preference 100
explicit segment-list RING
router static
address-family ipv4 unicast
10.40.2.2/32 sr-policy srte_c_8_ep_10.40.2.2
Static route SRTE policy configuration
The next-hop in the routing table now points directly to the SRTE policy and a traceroute will
verify that the traffic is flowing as intended.
Summary
This section served as the basis for all references going forward regarding SRTE policy creation
and manipulation. SRTE is a broad topic that has many use cases. The preceding demonstrations provided a basic understanding of the technology's capabilities.
TI-LFA
Topology Independent Loop-Free Alternate replaces the previous technologies, Loop-Free Alternate (LFA) and Remote Loop-Free Alternate (RLFA). The high-level objective of these
technologies is to provide the rapid repair of forwarding paths upon failure of either link or node.
TI-LFA leans on segment routing as it is also an extension of the IGP process. Using node and
adjacency SIDs, TI-LFA can adequately protect all topologies, hence the distinction of topology
independence.
Operation Examination
From a configuration standpoint, TI-LFA is only a few lines placed under the IGP configuration. However, it is essential to understand how the technology determines a suitable backup path. There is a specific method used to select the most suitable option.
interface GigabitEthernet0/0/0/6
cost 10
network point-to-point
fast-reroute per-prefix
fast-reroute per-prefix ti-lfa enable
TI-LFA interface configuration
The OSPF routes command output displays the most pertinent information regarding the selection of the backup route.
In this initial topology, node P5 has selected an outgoing interface of Gi0/0/0/1 with a label of 16003 to route around the failure. To understand this behavior, some definitions must be introduced.
Reviewing the diagram again, the P and Q space nodes can be revealed.
These definitions can add unnecessary confusion. To simplify, the P space can be viewed as the set of routers that node P5, the Point of Local Repair (PLR), can reach without traversing the protected link. This determination is based on cost, and documentation states that ECMP paths should also be disqualified. The Q space is the set of nodes that can reach the destination without needing to traverse the protected link. In the diagram above, P3 can reach the destination of 2.2.2.2/32 without ever needing to cross the protected link.
If traffic were to forward down the post-convergence path(s), the ECMP paths within node P6’s
table could potentially blackhole traffic. TI-LFA avoids this issue by taking action faster than IGP
convergence and pushing the correct labels to prevent traffic loss. In the example above, upon
link failure, the node SID of 16003 will be utilized on traffic destined toward 2.2.2.2/32. With this
label, traffic will not be able to create a loop. When the traffic arrives unlabeled at P3, the
destination of 2.2.2.2/32 will utilize Gig0/0/0/2 and have no reason to route back towards the
broken link. When the IGP reconverges to accommodate the newly failed link, TI-LFA will have
already placed the traffic flow on the correct path. This negates the need to swing traffic a second time, which was sometimes necessary with previous fast-reroute technologies.
Changing the link costs slightly demonstrates the flexibility of TI-LFA. These link alterations
present more loop potential in the environment.
With the new example, traffic is forced directly into a loop. Upon failure of the protected link,
traffic will be sent to P6, but without IGP convergence, the traffic will hairpin towards P5. This is
known as a micro loop. To avoid this, TI-LFA will use adjacency SIDs to force the traffic towards the Q space.
When the protected link fails traffic will now be sent with the adjacency SID value of 24000.
Again, once traffic arrives at P3 unlabeled, it will follow the shortest path toward the destination.
If traffic were to arrive at P6 with only the label 16003, the shortest path would still be through
the failed link, again causing a micro loop. The adjacency SID forces traffic to P3, which can
complete the path.
Configuration Options
TI-LFA can be enabled on all IGP interfaces. Operators can place the basic configuration on the
interfaces they intend to protect, and the IGP will handle the rest. The one modifier is the type of protection in place. Although only link protection was used in this demonstration, node and Shared Risk Link Group (SRLG) protection are also available, as sketched below. TI-LFA is supported by both OSPF and IS-IS.
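A minimal sketch of enabling node-protecting and SRLG-disjoint backup preferences under OSPF (the tiebreaker index values and interface are illustrative assumptions):
router ospf 100
area 0
interface GigabitEthernet0/0/0/6
fast-reroute per-prefix
fast-reroute per-prefix ti-lfa enable
fast-reroute per-prefix tiebreaker node-protecting index 100
fast-reroute per-prefix tiebreaker srlg-disjoint index 200
Hypothetical TI-LFA node and SRLG protection configuration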
Summary
TI-LFA provides sub-50 ms path convergence with little to no intervention from the operator. Using Segment Routing labels, traffic can be steered in various ways while avoiding traffic loops and loss. TI-LFA marks another consolidation brought about by Segment Routing while providing a beneficial feature set.
Overlay Services
Overlay services are the heart of provider networks. Customers are not buying lines of service
to peruse the core routing network. The main objective is to pair customers with content, provide
private transit, or enable further services. The overlay service needs seamless connectivity to
resources with little customer involvement. The following section will investigate multiple overlay
options and the services that they enable/provide.
L3VPN
Layer 3 VPNs are one of the most well-established services provided. Layer 3 VPNs enable
private customer connectivity over the provider network's shared medium. The concept is that
customers can interconnect multiple sites across many geographic areas while retaining private
transport. The traffic placed within the VPN is isolated from other customer circuits. L3VPN uses
VRF separation alongside various identifiers to keep traffic segmented throughout the path.
VPNv4 Address Family
VPNv4 peering will utilize the same loopback addresses advertised through the IPv4 unicast
address family. On IOS-XR nodes, no special consideration is needed regarding the extended
communities. On IOS-XE devices, the use of the communities must be explicitly configured to
avoid issues.
# IOS-XR PE Configuration
neighbor 100.100.100.100
remote-as 10200
update-source Loopback0
address-family vpnv4 unicast
next-hop-self
#IOS-XE PE Configuration
address-family vpnv4
neighbor 100.100.100.100 activate
neighbor 100.100.100.100 send-community both
neighbor 100.100.100.100 next-hop-self
IOS-XR and IOS-XE VPNv4 peering configuration
Any route reflector configuration would remove the next-hop-self distinction and specify
route-reflector-client under the configured peers. While the peering will now be up, sharing
routes over the VPNv4 session warrants a discussion regarding VRFs, RDs, and RTs.
vrf CXw-24
address-family ipv4 unicast
import route-target
10200:24
!
export route-policy VPN-LL
export route-target
10200:24
VPNv4 route-target configuration
A route-target extended community is now included within the VPNv4 UPDATE message.
The purpose of the route-target value is to manage the import and export of routes into the customer VRF over the VPNv4 network.
In the demonstrated lab configuration, all VPNv4 routes will be reflected to all peers. If a router does not possess a VRF utilizing the specific import target, that route will be dropped upon ingress. This behavior can be altered with the command retain route-target all, which retains all received VPNv4 routes. This override has specific use cases that are not discussed in this document.
The Route Distinguisher acts as a local identifier at an edge node. The main benefit of the RD is
the ability to implement overlapping address space across customer VRFs.
The RD is retained when advertised to other edge nodes. When customer routes are advertised
from a VRF, an additional label is also sent with the BGP update. This label is used as the
bottom of the stack label when sending traffic to the destination and allows the egress PE to
identify the VRF to which the route belongs. With these attributes, multiple customers can share
the same IP address ranges and be considered unique across the provider network. Below is a
Wireshark capture displaying the advertised RD and bottom of the stack or VPN label.
RD values can be reused across the same customer VPNs on separate edge nodes, but this
method may cause routing inefficiency. For instance, if multiple paths exist to a prefix but
identical RDs are used across nodes, the route-reflectors will only share the best route. With the same RD used across multiple nodes, even if a customer has dual uplinks across PE nodes, only one route will be shared. Using different RD values, the route-reflectors consider all routes
unique.
Unique RD Scenario
Customer peering is achieved utilizing eBGP via VRF-based IPv4 Unicast sessions. These
sessions act identically to a global BGP session, with the difference being the added isolation
enabled by the VRF. The BGP configuration can be completed after the VRF is created and
assigned to an interface.
In IOS-XR, the RD value is set within the BGP VRF configuration, while in IOS-XE the RD is configured within the global VRF definition, as sketched below.
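A minimal sketch contrasting the two platforms, reusing the CXw-24 VRF and route-target from this document (the RD values themselves are illustrative):
! IOS-XR: RD under the BGP VRF context
router bgp 10200
vrf CXw-24
rd 10200:2024
address-family ipv4 unicast
!
! IOS-XE: RD under the global VRF definition
vrf definition CXw-24
rd 10200:4424
address-family ipv4
route-target export 10200:24
route-target import 10200:24
Hypothetical RD placement on IOS-XR and IOS-XE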
Router PE20 (16020) receives a route for 24.24.24.20/32 from a directly attached customer via
eBGP peering.
With this route being received via a configured VRF BGP peer, the route will be pushed into the
VPNv4 address family. This propagation can be examined from the VPNv4 table.
The route will be sent out toward peer 100.100.100.100, the route-reflector.
The route reflector has received the route update from PE20 and will send the route to the
configured reflector clients.
Node 16044 will receive the route and examine the configured import route-targets. With a valid
match, the route will be imported to the appropriate VRFs configured with the specified RT.
Some syntax highlights have been added to emphasize key values in the output above. The
BGP prefix has been received with a label advertisement. This label will be placed at the bottom
of the stack and retained across the provider network. When traffic arrives at the PE node, the
bottom label will be used to forward traffic to the correct VRF instance. The RT import value has
matched the configured VRF value on the local node, allowing the route to be retained. The
Originator and Cluster List values show that the route was sent from PE20 and reflected by 100.100.100.100. Finally, the source RD is also shared within the update. The same route will
be pushed into the IPv4 Unicast VRF table.
The propagation of the route allows end-to-end communication via L3VPN. A traceroute from the customer router connected to node 16044 displays the inter-domain path utilized.
This configuration can be scaled across multiple routers. VPNv4 sessions are all that is needed to share the BGP routes depicted.
To tie this back to earlier discussions, segment routing is also a key factor in allowing L3VPN to operate in the manner demonstrated. The underlay must provide a labeled path so that transit routers can forward this traffic without knowledge of the final destination. In this case, segment routing provides the labeled path used to traverse the provider network.
L2VPN
Layer 2 VPNs (L2VPN) grant customers increased control and simplicity. Unlike the L3VPN
service, which requires provider network interactions, L2VPN provides a wire-like service. The
provider passes traffic transparently to provide the customer with a service that behaves as a
layer 2 link. Previously, L2VPN service creation was configuration intensive and would require the instantiation of many pseudowires. Today, most L2VPN implementations are centered around Ethernet Virtual Private Network (EVPN), specifically the BGP address family L2VPN EVPN. The BGP family allows the sharing of layer 2 addressing over a layer 3 medium. With MPLS/SR labeling, layer 2 connectivity can be achieved across the provider core.
L2VPN allows different options for connection behavior as well. The service can act as a traditional point-to-point Virtual Private Wire Service (VPWS) or as a shared medium known as Virtual Private LAN Service (VPLS).
BGP L2VPN EVPN VPWS
The diagram above depicts the peering configuration that will be utilized for the VPWS
demonstration. BGP configuration enables the address family for each peer, and the proper
distinctions are made for route-reflection and next hop self.
#Route Reflector
!
address-family l2vpn evpn
route-reflector-client
!
#PE
!
address-family l2vpn evpn
next-hop-self
!
Route reflector and PE BGP configuration
With peering established, the session can exchange routes. To generate routes, additional
L2VPN configuration needs to be in place.
l2vpn
xconnect group 20_2_4
p2p xc224
interface GigabitEthernet0/0/0/1.77
neighbor evpn evi 1001 target 10001 source 20001
pw-class ONE
!
interface GigabitEthernet0/0/0/1.77 l2transport
encapsulation dot1q 77
L2VPN configuration
The L2VPN configuration above will enable basic layer 2 forwarding behavior for the specified
interface. The group distinguisher is simply an operator-selected name. The P2P configuration
specifies the cross-connect (xc) name. The included interface(s) are specified, and the EVPN
neighbor statement is defined.
The EVPN instance (EVI) acts as a portion of the route target (both import and export). The RT
creation is handled automatically within IOS-XR. The target and source values specified after the EVI represent the remote and local attachment circuits (AC). An attachment circuit is where
traffic will ingress/egress for a specific group.
The final group configuration specifies a pw-class, which is short for pseudowire class. This
class configuration allows operators to set parameters such as encapsulation type (MPLS) and
preferred paths for the layer 2 traffic. In the case of this lab, the multi-domain nature will force
the use of an SR path, which will be called explicitly in the first example.
#PE20
segment-routing
traffic-eng
segment-list PW-PATH-01
index 1 mpls label 16005
index 2 mpls label 16002
index 3 mpls label 16001
index 4 mpls label 16011
index 5 mpls label 16041
index 6 mpls label 16044
policy PseudoWire_Pol1
color 77 end-point ipv4 10.40.4.4
candidate-paths
preference 100
explicit segment-list PW-PATH-01
l2vpn
pw-class ONE
encapsulation mpls
preferred-path sr-te policy srte_c_77_ep_10.40.4.4
L2VPN explicit SRTE policy
This traffic will need explicit instruction at both ends of the labeled path to function properly.
Once both ends have been configured and the SR policy has become active, the customer
devices will have a continuous layer 2 path between sites.
L2VPN xc status
Adding the detail specification to the show l2vpn xconnect group <name> command will provide
additional output that can be useful during troubleshooting. With the configured interfaces from
the L2VPN group in a UP state, routes will be sent and received over the BGP L2VPN EVPN
session.
Highlighted in red are the automatically derived values that were a manual task within the VPNv4 address family. The values are easy to decipher, as the route-distinguisher is RID:EVI,
and the route target is ASN:EVI. The RD and RTs serve the same purpose as they did in
VPNv4. All routes belonging to EVI 1001 will be imported and exported at each respective PE.
The RID:EVI combination will distinguish overlapping values at the PE.
With bi-directional SR policies in place and assigned to the pseudowire class, the layer 2 traffic
will now be able to flow across the core. This will be demonstrated by establishing an OSPF
peering across the client routers.
As displayed in the Wireshark captures, the node pushes the explicit label path on the traffic
upon ingress. Through this process, the client routers are abstracted from all core routing and
see the link as a direct connection to the remote peer. This service allows clients to link multiple
sites in various ways as the service no longer requires layer 3 interactions with provider routers.
L2VPN EVPN SR ODN
#PE20 Configuration
segment-routing
traffic-eng
on-demand color 74
dynamic
pcep
evpn
evi 1001
bgp
route-policy export EVPN-DYNAMIC
route-policy EVPN-DYNAMIC
if evpn-route-type is 1 then
set extcommunity color EVPN-74
endif
L2VPN EVPN ODN configuration
This configuration is very similar to what has already been performed in previous sections. With
this, routes will be tagged with a specific color upon egress.
With the configuration again mirrored on both edge nodes, pings can flow with the new label
stack handed out by the PCE device.
With the 9000v, it was noticed that the traffic would not flow over the policy automatically. While
the PCC/PCE recognized the specific color value and triggered the calculation process, the
layer 2 traffic would only steer to the policy with explicit configuration in the pw-class.
EVPN VPLS
Virtual Private LAN Service allows a full mesh of connectivity to client sites. This service
operates as a shared segment where client devices will learn the MAC addresses from the
provider-connected interface. Again, this option provides the customer with many options for
interconnecting sites.
Unfortunately, this is again a limitation of the virtualized ASR platform. Below, the generic
configuration will be shared with a Cisco Doc link.
l2vpn
bridge group 800
bridge-domain BD-800
interface GigabitEthernet0/0/0/1.800
evi 8000
l2vpn
bridge group 800
bridge-domain BD-800
interface GigabitEthernet0/0/0/1.800
!!% Invalid argument: VPLS Bridge domains not supported on this platform
!
!
!
!
end
Data Center (VXLAN)
Basic DC fabric
The high-level goal is to configure the overlay service to emulate a single switching domain. In reality, this domain is disaggregated across multiple nodes. This configuration option removes the need to stretch layer 2 domains across the network and leverages EVPN to transport the needed information over BGP sessions. VXLAN also provides multi-tenancy, allowing the single fabric to act as multiple logical fabrics with varying levels of segmentation. The example outlined in this document will examine simple Layer 2 and Layer 3 instances.
VXLAN Layer 2 Concepts
NVE/VTEP clarification
An NVE depends upon a loopback interface to source traffic. A good practice in VXLAN is to not reuse loopbacks between services (BGP, OSPF, IS-IS). NVEs are only present on leaf devices. The configuration displayed in the remainder of this section will be taken exclusively from Cisco NX-OS devices.
#Source Loopback
interface loopback192
description NVE Loopback
ip address 192.168.1.101/32
ip router ospf 100 area 0.0.0.0
ip pim sparse-mode
#NVE
interface nve1
no shutdown
host-reachability protocol bgp
source-interface loopback192
The NVE configuration above specifies the use of loopback 192 as the source interface and
also instructs the NVE to utilize BGP as the control plane protocol. With the configuration displayed, the NVE is not yet fully functional, but it is ready to encapsulate and decapsulate traffic once VNIs are added. The following configuration will examine getting traffic to flow through the NVE.
Another significant feature of VXLAN is the expanded ID space. Using a Virtual Network
Identifier (VNI), the space ranges from 1 to 16777215. It is essential not to confuse this with the
regular configurable VLAN range, which remains the usual 1 to 4094. The VNI is utilized in the
overlay to separate traffic and provide unique values.
As the diagram depicts, the VNI adds another identification layer. With VXLAN, the VLANs local
to the switch are no longer globally significant across the domain as they were in legacy layer 2
data centers. The VNI now acts as the global value. With identical VNI values, hosts attached to
the three separate VLANs can reach each other across the fabric. VNI distinction is made under
the VLAN configuration mode.
vlan 100
vn-segment 10010
VNI configuration
Assigning the VLAN to access ports and/or trunks south of the VXLAN fabric edge does not require any new style of configuration. After a VLAN is assigned to the desired ports, the NVE interface must be made aware that this VNI needs to be advertised across the fabric.
interface nve1
member vni 10010
ingress-replication protocol bgp
VNI association to the NVE
At this time, the MAC address learned on VLAN 100 (VNI 10010) will be picked up by the EVPN
process and forwarded to other interested peers. To address all layer 2 forwarding caveats,
VXLAN must provide a mechanism to handle Broadcast, Unknown Unicast, and Multicast
(BUM) traffic. Two methods have been provided, the first being the addition of
protocol-independent multicast (PIM) in the underlay. In the PIM-driven solution, multiple VNIs
are assigned to a specific multicast group, and BUM traffic is sent to the specified group
address to reach all intended recipients. In the configuration demonstrated, BGP ingress
replication will be utilized. Ingress replication uses a specific VXLAN route type, allowing the
auto-discovery of peer NVEs. When this peering is established, overlay tunnels may form to
handle BUM traffic between interested nodes. Just as VLANs constrain flooding domains,
VNI-specific BUM traffic will also be isolated.
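For contrast, a minimal sketch of the PIM-based BUM option mentioned above, which was not used in this lab (the multicast group address is illustrative), would associate the VNI with a group instead of ingress replication:
interface nve1
member vni 10010
mcast-group 239.1.1.1
Hypothetical multicast-based BUM handling
The ingress-replication route exchange actually used in this lab is examined next.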
Advertised path-id 1
Path type: internal, path is valid, is best path, no labeled nexthop
Imported from
10.65.100.101:32867:[3]:[0]:[32]:[192.168.1.101]/88
AS-Path: NONE, path sourced internal to AS
192.168.1.101 (metric 81) from 10.74.0.201 (10.65.100.201)
Origin IGP, MED not set, localpref 100, weight 0
Extcommunity: RT:65500:10010 ENCAP:8
Originator: 10.65.100.101 Cluster list: 10.65.100.201
PMSI Tunnel Attribute:
flags: 0x00, Tunnel type: Ingress Replication
Label: 10010, Tunnel Id: 192.168.1.101
Ingress replication route type 3
With the VNI configured on the NVE(s) the EVPN routes will be shared across peers. The figure
above shows the type-3 route used to auto-discover and establish BUM traffic-specific
tunneling. The Provider Multicast Service Interface (PMSI) tunnel attribute shares essential
information to distinguish tunnels. The values used are the tunnel type, which indicates ingress replication, and the tunnel label, which will be identical across peers as that value is the configured VNI. Finally, the tunnel ID specifies the leaf where the advertisement originated. After route type 3 information has been shared, leaves where the VNI has been configured will form peer relationships.
The detail command output has been narrowed to display only the peer IP and the VNIs that
have been learned. With the addition of two end hosts at L-102 and L-103, the EVPN MAC advertisements can be further examined.
Beginning at L-102, the directly connected MAC address for host 10.100.0.102 now has an
EVPN entry and is being advertised northbound to the two spine units.
#From L-102
BGP routing table entry for
[2]:[0]:[0]:[48]:[000c.299f.8040]:[0]:[0.0.0.0]/216, version 56
Paths: (1 available, best #1)
Flags: (0x000102) (high32 00000000) on xmit-list, is not in l2rib/evpn
Advertised path-id 1
Path type: local, path is valid, is best path, no labeled nexthop
AS-Path: NONE, path locally originated
192.168.1.102 (metric 0) from 0.0.0.0 (10.65.100.102)
Origin IGP, MED not set, localpref 100, weight 32768
Received label 10010
Extcommunity: RT:65500:10010 ENCAP:8
The route type displayed is an EVPN route-type 2. This route type carries MAC and IP
information and is the most common route seen in VXLAN. Extended communities in VXLAN
are integral to identifying and steering traffic. Highlighted above, the two communities observed
are the route-target and the encapsulation value. The route-target within VXLAN is automatically
derived utilizing ASN:VNI. The encapsulation value of 8 identifies VXLAN encapsulation. Node
L-102 has also learned routes from peer L-103 for the MAC address of its connected host.
#L-102
BGP routing table entry for [2]:[0]:[0]:[48]:[000c.2960.55ac]:[0]:[0.0.0.0]/216, version 67
Paths: (1 available, best #1)
Flags: (0x000212) (high32 00000000) on xmit-list, is in l2rib/evpn, is not in HW
Advertised path-id 1
Path type: internal, path is valid, is best path, no labeled nexthop, in rib
Imported from
10.65.100.103:32867:[2]:[0]:[0]:[48]:[000c.2960.55ac]:[0]:[0.0.0.0]/216
AS-Path: NONE, path sourced internal to AS
192.168.1.103 (metric 81) from 10.74.0.201 (10.65.100.201)
Origin IGP, MED not set, localpref 100, weight 0
Received label 10010
Extcommunity: RT:65500:10010 ENCAP:8
Originator: 10.65.100.103 Cluster list: 10.65.100.201
With the information shared to both necessary peers, two-way communication between the
hosts succeeds. The Wireshark capture displays the stacked nature of VXLAN. The external
portion of the packet identifies the VTEP addresses, while the data beyond the VXLAN header
is the original information.
This initial demonstration has proven the concept of layer 2 VXLAN forwarding. Clients behave as if they were hosted on the same layer 2 segment but are passing through a layer 3 fabric. Through this configuration, tenants can operate independently while utilizing the same shared underlay.
VXLAN Layer 3 Concepts
The L3VNI acts as the connection between all the tenant VLANs that require routing. As mentioned, the fabric acts as a layer 2 switch, but the L3VNI alters that behavior and allows Layer 3 routing. With this change comes additional configuration that will be explored through the example in the diagram below.
L3VNI topology
Tenant-01 will utilize L3VNI 2000 to allow routing across the three respective VLANs. The first
step is to create a VRF and VLAN and then associate the VNI.
vlan 2000
name TEN-01-VNI
vn-segment 2000
interface Vlan2000
no shutdown
vrf member TEN-01
no ip redirects
ip forward
interface nve 1
member vni 2000 associate-vrf
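Note that the lines above reference VRF TEN-01 but do not define it. A minimal sketch of the VRF definition is shown below; the auto route-target keywords are an assumption, though they are consistent with the auto-derived RT:65500:2000 seen later in the BGP output.
vrf context TEN-01
  vni 2000
  rd auto
  address-family ipv4 unicast
    route-target both auto
    route-target both auto evpn
Tenant VRF definition (sketch)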
The configuration above, together with the VRF definition, can be placed at every leaf node. It
allows IP routing information to be propagated throughout the fabric using extended
communities, which will be explored once valid routes have been shared. The empty VLAN
2000 interface enables the passing of routed traffic between all members of the VRF; this will
be explained in further detail in an upcoming section covering the Router MAC (RMAC) attribute
in BGP advertisements. Specific configuration is also needed at each leaf node under every
local VLAN interface placed within Tenant-01’s domain.
#L-101
feature fabric-forwarding
interface Vlan10
description TEN-01 Vlan 10
no shutdown
vrf member TEN-01
ip address 10.10.10.1/24
fabric forwarding mode anycast-gateway
Local VLAN interface with fabric forwarding
The introduction of the fabric forwarding command warrants a separate discussion regarding
this feature set.
Fabric Forwarding
The Fabric Forwarding feature set, in the context of VXLAN, allows host mobility. The concept is
that the VLAN interface(s) share a common MAC address. This means a host can be moved
anywhere within the participating fabric and continue forwarding traffic utilizing the same MAC.
This negates the need to send out any unnecessary ARP requests and allows host transition
without additional configuration. Using an anycast gateway also moves L3 forwarding to every
edge node. There is no longer a need to push all Layer 3 routing up to “the core.” The following
diagram will be explored to demonstrate Anycast forwarding and the accompanying EVPN
behavior. It is important to note that the Anycast Gateway and the Router MAC (RMAC) are two
separate components.
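As a point of reference, the shared gateway MAC is defined once globally on each leaf; the MAC value in the sketch below is purely illustrative.
fabric forwarding anycast-gateway-mac 2020.0000.00aa
Anycast gateway MAC definition (sketch)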
A VM may move for various reasons within the data center compute realm, and the Anycast
gateway aims to remove any network-related concerns around such moves. When VM-01
encounters an issue that warrants a move, the VM platform will initiate a “hot move” to another available
compute host. With Anycast Gateway, all gateway addresses and MACs are identical across
client VLANs, meaning the hosts see no changes in the underlying network. With the burden
removed from the host, the network must absorb this responsibility. In time slot T1, referencing
the route table at node L-102, it can be observed that VM-01’s BGP information is as expected.
The route is being learned from L-101’s VTEP address (192.168.1.101) relayed by S-201
(10.65.100.201).
Advertised path-id 1
Path type: internal, path is valid, is best path, no labeled nexthop
Imported from 10.65.100.101:32777:[2]:[0]:[0]:[48]:[000c.2910.811e]:[32]:[10.10.10.101]/272
AS-Path: NONE, path sourced internal to AS
192.168.1.101 (metric 81) from 10.74.0.201 (10.65.100.201)
Origin IGP, MED not set, localpref 100, weight 0
Received label 10 2000
Extcommunity: RT:65500:10 RT:65500:2000 ENCAP:8 Router MAC:0000.4352.1b08
Originator: 10.65.100.101 Cluster list: 10.65.100.201
Moving forward to time slot T3, after the VM has moved, viewing the entry from L-102 reveals a
new attribute.
Advertised path-id 1
Path type: internal, path is valid, is best path, no labeled nexthop
Imported from 10.65.100.103:32777:[2]:[0]:[0]:[48]:[000c.2910.811e]:[32]:[10.10.10.101]/272
AS-Path: NONE, path sourced internal to AS
192.168.1.103 (metric 81) from 10.74.0.201 (10.65.100.201)
Origin IGP, MED not set, localpref 100, weight 0
Received label 10 2000
Extcommunity: RT:65500:10 RT:65500:2000 ENCAP:8 MAC Mobility Sequence:00:1 Router MAC:0001.2441.1b08
Originator: 10.65.100.103 Cluster list: 10.65.100.201
The MAC mobility attribute is what resolves MAC moves within the VXLAN fabric. In time slot
T4, upon receiving traffic from VM-01, node L-103 will recognize that the MAC previously
learned from node L-101 has moved. This triggers a route advertisement originating from node
L-103 with the MAC Mobility Sequence attribute incremented to 1. The sequence number
represents the most recent version of the advertisement, so with a sequence of 1 the route from
L-103 becomes the newest and is installed in the routing tables of L-101 and L-102. As
demonstrated, the VM requires no additional configuration and can move freely throughout the
fabric. The one basic requirement is that the correct VLAN must be present on the switch to
which the host is migrated; without a valid VLAN interface and fabric forwarding configuration,
the traffic will be dropped.
L3VNI Routing
Returning to the example of Tenant-01 within the VXLAN fabric, adding the Anycast Gateway at
the three respective VLAN interfaces (10, 20, 30) marks the last step before achieving functional
routing. At node L-101, the host routes from both L-102 and L-103 have been received, which
allows end hosts to participate in inter-VLAN routing.
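The received host routes can be confirmed with a quick look at the tenant routing table, for example with the command below (output omitted here):
L-101# show ip route vrf TEN-01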
While end-to-end communication has been established, the routing behavior still warrants
further investigation, particularly the addition of the Router MAC attribute in the BGP EVPN
advertisement.
The Router MAC value shared via EVPN is the MAC address belonging to the SVI of the L3VNI.
In this case, the received Router MAC is from L-103’s VLAN 2000 interface.
The Router MAC is used when forwarding routed traffic to another VTEP, since the
VXLAN-encapsulated packet still requires inner MAC addresses. Traffic traversing from L-101 to
L-103 over L3VNI 2000 therefore uses the Router MAC value received in the BGP update as
the inner destination MAC.
Once the traffic has been received at the remote leaf, a route lookup occurs within the VRF
associated with L3VNI 2000 (TEN-01). After the lookup, traffic is forwarded conventionally
through the leaf to the desired endpoint.
External Connectivity
Thus far, the methods described have only accounted for traffic within the fabric. To
communicate outside the fabric, nodes need to be able to connect and advertise to external
destinations. Node L-103 has been designated as the border leaf, which will act as the exit/entry
point for the fabric. Due to segment routing and fabric forwarding limitations on Nexus devices,
L-103 has been peered directly to an additional IOS-XR router named DC-BORDER.
With BGP as the basis for EVPN, external connectivity is an easy addition to the VXLAN fabric.
The DC-BORDER node can utilize default IPv4 Unicast peerings or steer MPLS L3VPN traffic to
further extend private customer networks into the data center space. The connection between
DC-BORDER and L-103 is a routed link to the Nexus device, and the BGP peerings happen
over sub-interfaces. A generic peering is configured over sub-interface 179 and serves as the
global IPv4 peering. Any tenant VRF can be assigned to a specific sub-interface and retain
separation through the entire path when joined with MPLS L3VPN. To demonstrate this ability,
the existing VPNv4 network for VRF CXw-24 will be integrated into the Tenant-01 VXLAN
configuration.
On node L-103, a subinterface will be configured and associated with the TEN-01 VRF. Once
the basic IP configuration is in place, a new neighbor configuration will point toward the
DC-BORDER node.
interface Ethernet1/9.2000
encapsulation dot1q 2000
vrf member TEN-01
ip address 172.16.200.0/31
no shutdown
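A minimal sketch of the accompanying BGP neighbor statement under the TEN-01 VRF is shown below. The fabric ASN 65500 is taken from the route-target values seen earlier, while the remote ASN for DC-BORDER is purely illustrative.
router bgp 65500
  vrf TEN-01
    neighbor 172.16.200.1
      remote-as 65001
      address-family ipv4 unicast
L-103 VRF peering toward DC-BORDER (sketch)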
There is no further configuration needed from the Nexus side to bring this peering up. The focus
will now turn to the DC-BORDER XR node.
interface GigabitEthernet0/0/0/0.2000
vrf CXw-24
ipv4 address 172.16.200.1 255.255.255.254
encapsulation dot1q 2000
!
The configuration above will establish the peering and advertise routes into the VXLAN VRF.
This traffic will need to flow across multiple domains in the SP core, which requires engaging
the PCE node to calculate on-demand paths. The routes from DC-BORDER will be colored with
the value of 600, as the matching on-demand policies already exist in the network.
After the PCE distributes the needed SRTE policy, VM-01 can ping both hosts across the
domains.
The host routes are shared automatically from the VXLAN domain. Additional routes can be
shared through traditional network advertisements under the IPv4 Unicast family in the BGP
VRF configuration. For instance, if a firewall protected additional networks behind those known
to the VXLAN fabric, a BGP advertisement could be made for those networks.
The common language of BGP makes it easy to integrate external connectivity to and from
VXLAN fabrics. L2VPN EVPN is a powerful address family that has changed how many data
centers handle layer two domains and host mobility.
Orchestration (NSO)
Thus far, the network has been entirely configured via CLI command input. This has worked
without issue, but the configuration is tedious even at this small size. When dealing with a
provider network of thousands of nodes, it is not feasible to configure everything by hand. Not
only is the approach time-consuming, but it is highly prone to error. For the discussion of
orchestration, a brief investigation of Cisco’s Network Services Orchestrator (NSO) will be
outlined in the following section.
NSO is a multi-vendor orchestration tool that can be used to push and maintain configuration
across devices. The platform is relatively involved and combines CLI and GUI-based operation.
The goal of NSO is to deliver a model-driven network utilizing YANG models. In this
demonstration, device import, some simple configuration manipulation, and rollback will be
reviewed.
The installation of NSO is beyond the scope of this document; Cisco DevNet provides
step-by-step instructions for the entire process. This review begins with a basic NSO instance
installed on a Linux host with the services started.
On the top left of the new screen, a blue icon with the “+” sign will be displayed. Clicking that will
open up a new device import workflow.
A shell of the device has now been added. Navigating to the device and switching the GUI
mode to “Edit-config” will allow the input of the needed details.
● Authgroup
The authgroup is where operators can specify credentials to be used for a device.
Using the dropdown, navigate to the “source” option, which allows the creation of a new group.
This opens a new page where another blue “+” symbol is available to create the new object.
Once created, navigate into the new group shell.
Add a new value under the “default-map” section and specify a remote user and password that
will be utilized to log in via SSH to the device(s) within the group.
Use the navigation bar to return to the device configuration, and the new authgroup will be
available.
The next values needed for a minimum configuration are the device address and disabling SSH
host-key verification.
Finally, a device type must be specified. The CLI method will be used in this demonstration, and
the device is an IOS-XR router.
Navigate to the commit manager at the bottom of the page and commit the changes. After the
commit, the device should ping, but an issue will prevent NSO from logging into the device.
This is where the CLI is needed to “unlock” the device. To gain access to the CLI, navigate to
the NSO Linux server via SSH and run the following command in the instance.
ncs_cli -C -u admin
The NSO CLI behaves much like a Cisco device CLI. The device state must be changed to
unlocked to resolve this issue, using the following commands.
config
devices device 10200-PE12 state admin-state unlocked
commit
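For completeness, the same onboarding can be performed entirely from the NSO CLI. The sketch below uses an illustrative management address and authgroup name; the exact ned-id depends on which cisco-iosxr CLI NED package is installed, and the host-key-verification leaf may differ slightly by NSO release.
admin@ncs(config)# devices device 10200-PE12
admin@ncs(config-device-10200-PE12)# address 10.0.0.12
admin@ncs(config-device-10200-PE12)# authgroup LAB-CREDS
admin@ncs(config-device-10200-PE12)# device-type cli ned-id cisco-iosxr-cli-7.49
admin@ncs(config-device-10200-PE12)# ssh host-key-verification none
admin@ncs(config-device-10200-PE12)# state admin-state unlocked
admin@ncs(config-device-10200-PE12)# commit
NSO CLI device onboarding (sketch)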
With the device imported, NSO can retrieve device configuration, which will become vital as
deployments begin to originate from NSO.
Configuration Monitoring
NSO will act as the source of truth for device configuration. An operator can run the sync-from
command in the device menu after importing a device. This will retrieve the current running
configuration and treat it as the trusted iteration.
NSO sync-from
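From the NSO CLI, the equivalent is a single operational command, for example:
admin@ncs# devices device 10200-PE12 sync-from
result true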
With this in place, any changes made via CLI or outside of NSO will be flagged by the
compare-config action. As a simple example, a loopback will be added via CLI.
interface Loopback99
description UNAUTHORIZED LOOPBACK
Unauthorized configuration example
With the change implemented at PE12, a config compare can now be run from NSO, and the
unauthorized change will be flagged.
To remedy this issue, a simple sync-to can be run to normalize the configuration.
The reverse of this operation can also be achieved via the sync-from. If changes are made via
CLI and are authorized, the sync-from command will pull the latest configuration and begin to
reference it as the baseline.
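As a sketch, the same checks can also be driven from the NSO CLI (diff output omitted here):
admin@ncs# devices device 10200-PE12 compare-config
admin@ncs# devices device 10200-PE12 sync-to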
NSO Templates
Templates give NSO its ability to orchestrate across multiple platforms, and operators can build
a personal repository of templates that fits their needs. A simple loopback configuration will be
demonstrated to showcase this ability. It is recommended
that operators be able to pull and edit files from the NSO machine via some form of text editor.
In this lab, WinSCP and SublimeText4 have been utilized.
Service templates are implemented via XML and YANG. The XML file specifies the
configuration with variables, and the YANG model defines each variable's type (string, IP
address, and so on). It is helpful to see the building blocks in practice rather than in theory; the
majority of this configuration will be handled via the CLI and a text editor.
The first step is to create a template shell within the NSO instance. Navigate to the directory of
the active NSO instance and continue to the packages folder, where a make-package command
is run and the template name is specified.
~/nso-directory/nso-instance/packages
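The skeleton is typically generated with the ncs-make-package tool that ships with NSO; a hedged example for this package name would be:
ispadmin@nso:~/ncs-6.1/nso-instance/packages$ ncs-make-package --service-skeleton template looptemplate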
NSO creates the necessary folder and file structure with blank values that can be modified to fit
the template's needs. As seen in the command above, the result is known as a service skeleton.
ispadmin@nso:~/ncs-6.1/nso-instance/packages/looptemplate$ ls
package-meta-data.xml src templates test
Created folder structure from service skeleton
The templates folder is where the XML data is housed, and the src folder is designated for the
YANG model information. The next step is retrieving the configuration structure in XML format.
On most Cisco devices, output can be returned in XML format, but NSO can also run the
configuration against the device model. The NSO-based method will be used via the CLI.
admin@ncs# config
Entering configuration mode terminal
admin@ncs(config)# devices device 10200-PE12
admin@ncs(config-device-10200-PE12)# config
admin@ncs(config-config)# interface Loopback 99
admin@ncs(config-if)# ipv4 address 99.99.99.99 /32
admin@ncs(config-if)# top
admin@ncs(config)# show configuration
devices device 10200-PE12
config
interface Loopback 99
ipv4 address 99.99.99.99 /32
no shutdown
exit
!
!
NSO based configuration
NSO is allowing the configuration to be entered by referencing the device model rather than the
live device. The commit can then be issued as a dry-run, meaning NSO validates the change
without implementing anything on the device, and it is via the dry-run command that an operator
can retrieve the XML format.
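On the NSO CLI, the XML can be produced with the dry-run output-format option, for example:
admin@ncs(config)# commit dry-run outformat xml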
ispadmin@nso:~/ncs-6.1/nso-instance/packages/looptemplate/templates$ ls
looptemplate-template.xml
XML template folder path
Opening the looptemplate-template.xml file in the text editor reveals a blank file with some
descriptions in place to guide users.
<config-template xmlns="http://tail-f.com/ns/config/1.0"
servicepoint="looptemplate">
<devices xmlns="http://tail-f.com/ns/ncs">
<device>
<!--
Select the devices from some data structure in the service
model. In this skeleton the devices are specified in a
leaf-list.
Select all devices in that leaf-list:
-->
<name>{/device}</name>
<config>
<!--
Add device-specific parameters here.
In this skeleton the service has a leaf "dummy"; use that
to set something on the device e.g.:
<ip-address-on-device>{/dummy}</ip-address-on-device>
-->
</config>
</device>
</devices>
</config-template>
Blank NSO XML template
The device section will remain blank, as the device is specified when the service is deployed in
NSO. The XML configuration structure retrieved from the dry-run is then placed within the config
section. Below is the configuration used, with added variable identifiers ({/LOOPNUM}, {/ipadd}).
<config-template xmlns="http://tail-f.com/ns/config/1.0"
servicepoint="looptemplate">
<devices xmlns="http://tail-f.com/ns/ncs">
<device>
<name>{/device}</name>
<config>
<interface xmlns="http://tail-f.com/ned/cisco-ios-xr">
<Loopback>
<id>{/LOOPNUM}</id>
<ipv4>
<address>
<ip>{/ipadd}</ip>
<mask>/32</mask>
</address>
</ipv4>
</Loopback>
</interface>
</config>
</device>
</devices>
</config-template>
Loopback XML configuration with added variables
The variables allow values to be specified at the time of service instantiation. The XML file can
then be saved under its original name and file location on the NSO server. The following
configuration is needed to specify the variable value types in the YANG model.
module looptemplate {
  namespace "http://com/example/looptemplate";
  prefix looptemplate;

  import ietf-inet-types {
    prefix inet;
  }
  import tailf-ncs {
    prefix ncs;
  }

  list looptemplate {
    key name;
    uses ncs:service-data;
    ncs:servicepoint "looptemplate";

    leaf name {
      type string;
    }

    // Device(s) the service is deployed to; referenced as {/device}
    // in the XML template.
    leaf-list device {
      type leafref {
        path "/ncs:devices/ncs:device/ncs:name";
      }
    }

    // Loopback number; referenced as {/LOOPNUM} in the XML template.
    leaf LOOPNUM {
      type string;
    }

    // Loopback IPv4 address; referenced as {/ipadd} in the XML template.
    // The inet:ipv4-address type is assumed from the ietf-inet-types import.
    leaf ipadd {
      type inet:ipv4-address;
    }
  }
}
NSO yang model to be used for XML template
The YANG file defines each variable's data type and instructs NSO on how that information
should be interpreted. It is saved in place under its original name in the src/yang folder. Once
the files are saved on the NSO server, operators must navigate to the src folder under the
template and run the following make command, which compiles the YANG file.
ispadmin@nso:~/ncs-6.1/nso-instance/packages/looptemplate/src$ make
/home/ispadmin/ncs-6.1/bin/ncsc `ls looptemplate-ann.yang > /dev/null 2>&1
&& echo "-a looptemplate-ann.yang"` \
--fail-on-warnings \
\
-c -o ../load-dir/looptemplate.fxs yang/looptemplate.yang
NSO make command
If compilation fails, the output will display more information about the problem in the YANG file.
After the make command completes, a package reload must occur in the NSO CLI.
ispadmin@nso:~/ncs-6.1/nso-instance/packages/looptemplate/src$ ncs_cli -C -u admin
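The reload itself is then issued from the NSO operational prompt:
admin@ncs# packages reload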
Navigating back to the NSO GUI, under the Service Manager menu the new template is
available to deploy.
Provide a user-defined name for the service and confirm the value. This again creates a shell
object that can be navigated into so that configuration values can be specified.
A commit will then deploy the service to PE12. Any failures will be displayed within NSO. At
PE12, the deployment was successful as the device now has loopback 99 configured.
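For reference, the same service instance could also have been created from the NSO CLI; the sketch below uses illustrative values and assumes the variable leafs shown in the YANG model.
admin@ncs(config)# looptemplate LOOP99
admin@ncs(config-looptemplate-LOOP99)# device [ 10200-PE12 ]
admin@ncs(config-looptemplate-LOOP99)# LOOPNUM 99
admin@ncs(config-looptemplate-LOOP99)# ipadd 99.99.99.99
admin@ncs(config-looptemplate-LOOP99)# commit
NSO CLI service instantiation (sketch)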
Within the service manager menu for the specific template and device, the Actions tab can be
utilized to “un-deploy” the change.
Scrolling to the bottom of the following page and selecting the run un-deploy action will remove
the loopback interface.
RP/0/RP0/CPU0:10200-PE12#sh int lo 99
Mon Nov 27 10:03:26.753 UTC
Interface not found (Loopback99)
Loopback removed via NSO un-deploy action
NSO templates provide the ability to model a network change for various devices and
implement it programmatically. Although this example was simple, the principle can be applied
to any configuration.
Case Studies
Multiple technologies and methods have been explored throughout this document. A combined
demonstration will be examined in the form of simple case studies.
Enterprise Customer
In this study, the connectivity requirements of Communicore Corp will be examined. The focus
will be on the provider technologies. The nodes used to emulate Communicore Corp will be
generic IOS-XE routers.
The diagram above illustrates the high-level design as seen by Communicore Corp. The
following discussion will follow how the provider will implement the customer requirements.
HQ
The HQ office will require direct internet access (DIA) and L3VPN service. This will be
implemented via dot1q subinterface peerings, with one interface participating in the global table
and the other confined to a VRF. For all sites, the VRF, extended-community coloring,
on-demand SRTE policy, and BGP peer configuration will be handled by modular NSO
services.
Please reference the GitHub page for the full XML/YANG files for each service. With NSO,
operators can quickly deploy the needed configuration across all included provider edge routers.
After VRF creation, a route-policy is assigned at the VRF export level to apply the color. Then,
an on-demand SR-TE policy is defined to match the color and rely on the PCE to calculate the
inter-domain path. Finally, the BGP VPNv4 and IPv4 Unicast details are pushed via NSO.
What would have previously required four separate logins and 10+ lines of configuration is now
done from a single screen, error-free. With the VRF in place, the route coloring workflow can be
applied to the same PEs.
Native XML that will be placed at the router level from NSO for the route color policy
An on-demand SR-TE policy is then pushed, referencing the color previously applied by the
route-policy.
Currently, all the VRF and SR-TE configuration is in place. The last step within NSO will be
configuring the BGP peerings for the IPv4 and VPNv4 address families.
At this point, the only configuration remaining is the local interface configuration at the
respective PE; all of the BGP signaling and SRTE configuration has been placed via NSO
service templates.
HQ physical topology
Upon configuration of the PE and the HQ router interfaces, both peering sessions are up, and
IPv4 routes are being received.
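A hedged sketch of the PE12 interface side is shown below. The mapping of 16.20.20.1 to the global (internet) subinterface and 10.82.20.1 to the VRF subinterface is inferred from the HQ router configuration that follows, and the physical interface, VLAN IDs, VRF name, and subnet masks are all illustrative.
interface GigabitEthernet0/0/0/2.100
 ipv4 address 16.20.20.1 255.255.255.252
 encapsulation dot1q 100
!
interface GigabitEthernet0/0/0/2.200
 vrf COMMUNICORE
 ipv4 address 10.82.20.1 255.255.255.252
 encapsulation dot1q 200
!
PE12 subinterface peering layout (sketch)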
Communicore HQ will advertise 80.0.0.0/24 and 10.80.100.0/24. Route-maps are applied to
both neighbors to control which routes are advertised toward each service type.
HQ route advertisement
#COMMUNICORE-HQ Router
address-family ipv4
network 80.0.0.0 mask 255.255.255.0
neighbor 10.82.20.1 activate
neighbor 10.82.20.1 allowas-in
neighbor 10.82.20.1 route-map mNET out
neighbor 16.20.20.1 activate
neighbor 16.20.20.1 allowas-in
neighbor 16.20.20.1 route-map iNET out
exit-address-family
HQ Router BGP configuration
PE12 has now received all the advertised routes from the HQ router. At this point, the HQ office
is configured, and the focus can move on to the SITE-1 router.
Remote Site 1
Communicore’s remote site will be another simple implementation of L3VPN and DIA. All the
VRF and SRTE policy configurations are already in place from the push made by NSO. The one
remaining configuration that will be needed from NSO is the IPv4/VPNv4 peering.
SITE-1 will make two route advertisements. Similar to the HQ router, these advertisements will
be scoped down via route-maps to enforce proper separation of the IPv4 and VPNv4 routes.
At this point, the SRTE policy is in an active state, and SITE-1 and HQ can communicate over
both the VPNv4 and IPv4 routes.
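The policy state can be verified on the PE with the standard SR-TE show command, for example (output omitted here):
RP/0/RP0/CPU0:10200-PE12#show segment-routing traffic-eng policy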
Data Centers
Communicore's data centers (DC1 and DC2) require a layer 2 service from the provider: the
handoff must transparently pass any traffic through, as OSPF will utilize multicast to form a
neighbor relationship. Direct peering information is displayed in the diagram below.
Again, NSO will be responsible for deploying IPv4 and VPNv4 peering information on the
provider side. DC1 will also share its IPv4 and VPNv4 Unicast routes toward the provider
network.
The BGP configuration in this example is a repeat of what has already been demonstrated.
Where the DC connections differ is the use of L2VPN. Another NSO template will deploy the
necessary L2VPN information to the involved PE nodes.
l2vpn
pw-class MP
encapsulation mpls
!
!
xconnect group 4_2_11
p2p 4211
interface GigabitEthernet0/0/0/1.200
neighbor evpn evi 1082 target 1182 source 482
pw-class MP
Provider L2VPN configuration
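The configuration shown is from one of the two PE nodes; on the other PE, the EVPN target and source values are mirrored. A sketch, assuming the same pw-class and xconnect group naming and an illustrative attachment-circuit interface:
l2vpn
 pw-class MP
  encapsulation mpls
 !
 xconnect group 4_2_11
  p2p 4211
   interface GigabitEthernet0/0/0/1.200
   neighbor evpn evi 1082 target 482 source 1182
    pw-class MP
Mirrored L2VPN configuration on the remote PE (sketch)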
With the basic L2VPN configuration in place on both PE nodes (PE11 and PE4), the
cross-connect is up, and both DC1 and DC2 can pass traffic as expected.
DC1 and DC2 now have a layer 2 adjacency with OSPF layered over the top.
COMMUNICORE-DC1#traceroute 100.64.200.2
Type escape sequence to abort.
Tracing the route to 100.64.200.2
VRF info: (vrf in name/id, vrf out name/id)
1 100.64.200.2 3 msec * 5 msec
Summary
The enterprise customer is now fully connected across all sites via the provider network. This
demonstration touched on three major connectivity options: direct internet access, L3VPN, and
L2VPN. Segment routing was also featured and proved crucial given the inter-domain nature of
the traffic flows, while NSO increased configuration efficiency and provided standardized
deployment methods.
The app servers will allow inbound connections from the F5 and use a route-leak process to
utilize DNS and NTP from a shared resource in another VRF. The 65.65.65.0/24 route will be
advertised via BGP and will allow reachability from external peers via IPv4 Unicast.
The route-leaking method is similar to the premise of L3VPN route sharing: the process utilizes
route-target extended communities to import/export routes selectively, and a route-map further
defines the import/export action. Before the VRF manipulation, both tables contain only their
local routes. The objective is to extract 10.30.30.103/32 from TEN-01 and import it into the
SERVER-VRF while allowing two-way communication between the VRFs. To control the import
process, a prefix-list will be applied to VRF SERVER-VRF.
The correct extended communities need to be identified to import the proper routes. As with
MPLS L3VPN, these values are explicitly called out to trigger the import/export process.
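A hedged sketch of the import side on the SERVER-VRF is shown below. The TEN-01 route-target 65500:2000 follows the auto-derivation seen earlier; the prefix-list and route-map names are illustrative, and the exact import-map syntax can vary by NX-OS release.
ip prefix-list PL-TEN01-LEAK seq 5 permit 10.30.30.103/32
route-map RM-TEN01-IMPORT permit 10
  match ip address prefix-list PL-TEN01-LEAK
vrf context SERVER-VRF
  address-family ipv4 unicast
    route-target import 65500:2000
    route-target import 65500:2000 evpn
    import map RM-TEN01-IMPORT
Selective route import into SERVER-VRF (sketch)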
With the import specified, the route for 10.30.30.103/32 is now present in the SERVER-VRF
table. The route is identified as an asymmetric entry because the L3VNI carried with the route
(2000, from TEN-01) differs from the L3VNI configured for the SERVER-VRF (2100).
A mirrored operation is now needed within the TEN-01 VRF to allow bidirectional
communication. With the configuration in place for both VRFs, specifically at L-101 and L-103,
pings can now traverse VRF boundaries.
Downstream VNI
The previously demonstrated behavior is known as Downstream VNI. Downstream VNI is
similar to MPLS in that the egress VTEP advertises the route with a defined label. In this case,
the label is the VNI value associated with the VRF.
When traffic is sent from either node destined for the remote VRF, the label used is the received
VNI. The remote VTEP receives the traffic and sends it toward the correct instance. For this
example, traffic from the SERVER-VRF destined toward TEN-01 will use VNI 2000, and the
same process applies in reverse for return traffic. A Wireshark capture confirms this behavior.
Downstream VNI provides flexibility and eases the operational burden of deploying VRFs. The
leaking of routes between VRFs allows tenants to reach shared resources while maintaining
separation from the other tenants. As demonstrated, the separation can be further tuned via
route-maps.
Project Summary
The project aimed to further my understanding of several service provider technologies. It was
also a test of whether I am insane enough to go after a CCIE in this discipline, which I am still
unsure about. If you have made it this far, thank you for reading until the end, and I hope this
adds some value to your learning experience.
Thank you!
Appendix Information
All NSO and Device config files can be found on the GitHub page. The configuration files are
raw and may contain additional configurations unnecessary to the demonstrations reviewed
throughout the document.