lag module: Model peerlink as regular lag link #2084

jbemmel · 2025-03-25T23:24:58Z

Currently the link between MLAG peers is crafted internally within the lag module, using specific internal mechanisms to assign IP addresses and making it impossible to configure features such as OSPF or BGP between the peers.

This PR models the peerlink as a regular lag link with a VLAN trunk, enabling the full suite of Netlab features to be configured between the peers:

- Assign IP addressing to the link; users can use prefix or pool, or leave default to get VLAN lan pool addressing
- Apply module processing (ospf, bgp, gateway to name a few)
- Configure the use of LACP

Fixes #1909

Note: Arubacx templates are modified but will need further adjustments and testing (like defining the vlan that gets created for peering, similar to eos)

Testing: added a test for OSPF over peerlink

NETLAB_DEVICE=cumulus_nvue NETLAB_PROVIDER=libvirt ./device-module-test -v lag
NETLAB_DEVICE=eos NETLAB_PROVIDER=clab ./device-module-test -v lag

Breaking changes:

Dell OS10 - the backup destination IP is now taken from intf.lag.mlag.peer_backup_ip instead of intf.lag.mlag.peer

Implementation notes:

- IP assignment to a routed vlan is achieved by creating a vlan in irb mode with a prefix, overriding the mode to route at the interface level
- All VLANs are allowed on the peerlink by default, unless the user configures an explicit VLAN trunk. A new internal flag, _allow_all, models this

ipspace

Let's start with the obvious one: is there a reason you couldn't add "allocation: p2p" to the global P2P pool?

It is supposed to be used for P2P links anyway, and uses /30 prefix, so there cannot be more than two nodes attached to it. Alternatively, one could define custom pool and use it for peer link VLANs.

Both options are better (IMO) than adding random global attributes.
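
To make the /30 argument concrete, here is a minimal sketch using Python's standard `ipaddress` module. The prefix `10.1.0.0/30` and the helper name `usable_hosts` are illustrative assumptions, not values or code taken from netlab.

```python
import ipaddress


def usable_hosts(prefix: str) -> list[str]:
    """Return the usable host addresses of an IPv4 prefix as strings."""
    net = ipaddress.ip_network(prefix)
    # hosts() skips the network and broadcast addresses of a /30
    return [str(host) for host in net.hosts()]


# A /30 leaves room for exactly two attached nodes
print(usable_hosts("10.1.0.0/30"))
```

With only two usable addresses, a third node on such a link has nothing left to be assigned.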

jbemmel · 2025-03-26T12:06:00Z

> Let's start with the obvious one: is there a reason you couldn't add "allocation: p2p" to the global P2P pool?

I suggested that in the past (#1611) but you disagreed at the time. Happy to revisit

ipspace · 2025-03-26T12:18:10Z

> > Let's start with the obvious one: is there a reason you couldn't add "allocation: p2p" to the global P2P pool?

> I suggested that in the past (#1611) but you disagreed at the time. Happy to revisit

Well, thanks for pointing that out ;) The problem was (and still is) that using P2P allocation strategy on a VLAN link masks the fact that an IRB VLAN has a single prefix. Not specifying the p2p allocation strategy crashes the allocation process, alerting the user to the fact that something weird is going on.

And now we uncovered another gotcha: you're doing too much tweaking. I never got that far because I stumbled on the "new global pool" showstopper.

ipspace · 2025-03-26T12:28:48Z

> This PR models the peerlink as a regular lag link with a VLAN trunk, enabling the full suite of Netlab features to be configured between the peers:

Is there supposed to be a default VLAN to be used on that link for the global IP routing, or is it just a VLAN trunk like any other trunk?

> Note: Arubacx templates are modified but will need further adjustments and testing (like defining the vlan that gets created for peering, similar to eos)

There's a showstopper right there ;)

> IP assignment to a routed vlan is achieved by creating a vlan in irb mode with a prefix, overriding the mode to route at the interface level

There's absolutely no need for that. Peer links are established between devices of the same type anyway, and if a device does not support mixed trunks, then you use IRB for the global IP routing VLAN. Any further device-specific limitations can be addressed in device quirks.

Please (always) try to modify the minimum amount of data structures necessary. We've learned the hard way that doing anything else creates unpredictable effects in the future.

> All vlans need to be allowed on the peerlink, I prefer to do this implicitly rather than explicitly list all vlans used in the topology because the latter may create "unsupported" scenarios (such as mixed trunks on Dell OS10).

I have no problem with this (just skip the VLAN assignment on the peer link), but it has to be documented.

> It would be nicer to have "allow all VLANs" syntax, such that we could avoid lag module specific modifications to the vlan template

I don't see a need for the "allow all" trunks from the topology perspective, but I have no problem with an internal flag like vlan._all_vlans that would be used in VLAN configuration templates instead of checking the peerlink interface name or type.
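
As a rough illustration of the internal-flag idea, the sketch below picks the allowed-VLAN list for a trunk from the interface data alone, without looking at the interface name or type. It is not netlab code: the function name `trunk_allowed_vlans` is hypothetical, and the dictionary layout (a `vlan` dictionary carrying `_all_vlans` and a `trunk_id` list) is an assumption made for the example.

```python
def trunk_allowed_vlans(intf: dict) -> str:
    """Return the allowed-VLAN string a template could render for a trunk.

    Assumed (hypothetical) layout: intf["vlan"]["_all_vlans"] is the internal
    flag, intf["vlan"]["trunk_id"] is the explicit list of VLAN IDs.
    """
    vlan = intf.get("vlan", {})
    if vlan.get("_all_vlans"):
        # Flag set (for example on a peerlink): allow everything
        return "all"
    # Otherwise render the explicit trunk list in ascending order
    return ",".join(str(vlan_id) for vlan_id in sorted(vlan.get("trunk_id", [])))


print(trunk_allowed_vlans({"vlan": {"_all_vlans": True}}))
print(trunk_allowed_vlans({"vlan": {"trunk_id": [20, 10]}}))
```

The point of the design is that the template only tests a flag; which module set the flag, and why, stays out of the VLAN configuration logic.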

> Global p2p pool could not be used due to the lack of a 'p2p' allocation strategy

As you can assign any pool to any VLAN, I fail to see why this is relevant ;)

jbemmel · 2025-03-26T13:02:39Z

> > Global p2p pool could not be used due to the lack of a 'p2p' allocation strategy

> As you can assign any pool to any VLAN, I fail to see why this is relevant ;)

I did try it - it results in an address allocation failure for the topology/input/lag-mlag-m_to_m.yml test case

jbemmel · 2025-03-26T13:22:03Z

> Please (always) try to modify the minimum amount of data structures necessary. We've learned the hard way that doing anything else creates unpredictable effects in the future.

Ironically, this is exactly the reason why I opted for a separate mlag_p2p addressing pool, rather than add allocation: p2p to the existing global p2p one. It results in the minimum amount of change to the installed base

jbemmel · 2025-03-26T13:28:43Z

> Is there supposed to be a default VLAN to be used on that link for the global IP routing, or is it just a VLAN trunk like any other trunk?

All VLANs - including the default VLAN - need to be allowed on the peerlink. On Cumulus the default VLAN is not used for IP routing - no IP address assigned to it - but removing it causes test cases to fail (without clear indications of why)

ipspace · 2025-03-26T16:04:44Z

BTW, there's no need to set allocation: p2p on the P2P pool. You can set prefix.allocation on any link (or VLAN) and still get the IPv4/IPv6 address assigned from the pool. For example:

provider: clab

nodes:
  a:
    device: frr
    id: 33
  b:
    device: linux
    id: 44

links:
- interfaces: [ a,b ]
  prefix.allocation: p2p
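
The node IDs 33 and 44 in the topology above matter because of how addresses get picked within a small prefix. The following is a simplified sketch, not netlab's implementation: it assumes an ID-based strategy uses the node ID as the host offset, while a p2p strategy hands out the first and second host address. The function names and the `10.1.0.0/30` prefix are hypothetical.

```python
import ipaddress


def allocate_id_based(prefix: str, node_ids: dict[str, int]) -> dict[str, str]:
    """Assumed ID-based strategy: host offset within the prefix equals node ID."""
    net = ipaddress.ip_network(prefix)
    result = {}
    for name, node_id in node_ids.items():
        # Valid offsets exclude the network (0) and broadcast (last) address
        if node_id <= 0 or node_id >= net.num_addresses - 1:
            raise ValueError(f"node {name} (id {node_id}) does not fit into {net}")
        result[name] = f"{net.network_address + node_id}/{net.prefixlen}"
    return result


def allocate_p2p(prefix: str, node_ids: dict[str, int]) -> dict[str, str]:
    """Assumed p2p strategy: nodes get the usable addresses in order, IDs ignored."""
    net = ipaddress.ip_network(prefix)
    hosts = list(net.hosts())
    if len(node_ids) > len(hosts):
        raise ValueError(f"too many nodes for {net}")
    return {name: f"{host}/{net.prefixlen}" for name, host in zip(node_ids, hosts)}


nodes = {"a": 33, "b": 44}
print(allocate_p2p("10.1.0.0/30", nodes))
try:
    allocate_id_based("10.1.0.0/30", nodes)
except ValueError as err:
    print("ID-based allocation failed:", err)
```

Under these assumptions, the p2p strategy succeeds on a /30 regardless of node IDs, while the ID-based one fails as soon as an ID exceeds the two usable offsets.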

docs/module/vlan.md

ipspace · 2025-03-26T16:09:25Z

docs/module/lag.md

@@ -113,6 +113,12 @@ links:
    mlag.peergroup: True # (also) used to derive a unique MAC address for this group of MLAG peers
 ```

+### Peerlink configuration


This left me a bit confused. I think it would be nice to have two short examples: one documenting how the peer link can be used as a pure routed link (although we know there are other things going on in the background), and another describing how to use VLANs on the peer link (example use case: VRF Lite).

It's also worth mentioning that you chose to have netlab configure "allow all VLANs" on the peer link. That is not always the case in real-life deployments.

jbemmel · 2025-03-26T17:38:43Z

> You can set prefix.allocation on any link

I did try that first too, but it didn't seem to work. I'll try again

Update: Here's why that didn't work in this case #2090


jbemmel marked this pull request as draft March 25, 2025 23:27

jbemmel force-pushed the lag_peer_vlan branch 2 times, most recently from 96c1995 to 10c28ec Compare March 26, 2025 02:10

jbemmel requested review from ssasso and ipspace March 26, 2025 02:28

jbemmel marked this pull request as ready for review March 26, 2025 02:28

ipspace requested changes Mar 26, 2025


jbemmel marked this pull request as draft March 26, 2025 13:15

jbemmel force-pushed the lag_peer_vlan branch from 27afb0a to bc1a3fe Compare March 26, 2025 13:24

ipspace reviewed Mar 26, 2025


ipspace reviewed Mar 26, 2025


jbemmel added 11 commits March 29, 2025 09:41

LAG: Refactor peerlinks to simply be another instance of a port-channel

ac94537

Modeling the peer VLAN explicitly, to allow configuration of features like OSPF between the mlag peers

Working Cumulus

acfc0cb

Add custom mlag pool to enable ipv6 lla

4186b88

Disable ipv6 configs on peerlink, still not quite working

d805743

Working now

4ca1bb3

Update test results, enable ospf v4

72be25b

Dell: Ignore 'peerlink' interfaces (not the most elegant solution)

610644f

Add peer vlan in 'irb' modei with prefix, downgrade to 'route' at nodes if needed

6faaabc

Use existing untagged vlan if available

fc2bbd1

Fixes

2024d91

Fix STP disable

f4343cb

jbemmel added 20 commits March 29, 2025 09:43

* Define custom pool in YAML

4c9ec0c

* Remove custom IP subnets, use p2p pool instead
* Fix warning about Linux bridge blocking LACP (removed links don't pose an issue)

Need to allow all vlans on the peerlink

a91de2e

Newline

651afa4

Cannot use the default p2p pool because it does not specify 'p2p' allocation

fd8b6e7

Updated test results

ba88f7c

Define mlag feature for 'none' device

6318ad7

Support unnumbered ipv4 too

335ea38

Prefer 'linklocal' if ipv6 is True

5483366

Use global p2p address pool instead of defining a special instance

c5f4370

* Set 'allocation' to 'p2p' for the global pool

Updated test results

5acefff

Update error test result

d3e2150

Add '_allow_all' flag for VLAN trunks

a78487e

Add flag to VLAN attributes

4755db7

Update tests with new VLAN flag

d62b6aa

Undo p2p pool changes, make trunk configurable, add examples

4c2cc4e

Undo error test changes

c7c3667

Use 'all'

123827b

Fix bug ipspace#2090

bc4a88a

Updated test results after fixing VLAN prefix allocation bug

8338f7d

Remove internal flag

d1e9ce9

jbemmel force-pushed the lag_peer_vlan branch from 84ec44f to d1e9ce9 Compare March 29, 2025 14:43

jbemmel added 9 commits March 29, 2025 09:47

Don't configure 'p2p' pool on peerlink VLAN

3b94cfd

Update test results

61f06e6

Update docs

83e79a0

Use _allow_all flag

38b7247

Fix Dell OS10 VLAN script

38561a4

Let '_all_vlans' apply to native VLAN too, avoid creating L3 VLAN when prefix is set to False

dceba30

Skip creation of default vlan

60a542e

Fix check for all vlans

5115e6b

Guard against access_id not being set

95d3435
