Mukesh Chanderia

APIC M1/M2/L1/L2 to M3/L3 Cluster Migration

Both APIC L1/M1 and APIC L2/M2 servers have reached their end-of-sale and end-of-life dates.


APIC servers forming a cluster must all run the same software release. You cannot have different software releases inside one cluster.


To determine which release you are currently running on the APIC M3/L3 server:


Step 1. Power on your APIC M3/L3 and determine which release it is currently running. An out-of-the-box APIC M3/L3 will be running release 4.0(3d). If not, you must upgrade the server to a supported release.


Step 2. You can perform that upgrade by booting directly from the ISO image obtained from cisco.com, through the Cisco Integrated Management Controller (CIMC) virtual console, or using a Serial over LAN (SoL) connection.


Step 3. Make sure you bring the existing cluster to the same release before continuing any further.


How do you replace every existing M1/M2/L1/L2 server with an M3/L3 server model, in service, with no impact to the data plane or the control plane?


Caution:


Before you begin, ensure that you:

- do not decommission more than one APIC at a time

- wait until the cluster reaches the fully fit state before proceeding with a new replacement

- do not leave a decommissioned APIC powered on

- do not introduce a new APIC with a mismatched software release


Step 1. Validate that the existing cluster is fully fit.


Ensure your existing cluster is fully fit before attempting this procedure. You must not upgrade or modify an APIC cluster that is not fully fit.


To verify your existing cluster is fully fit:

a. From the main page, select Controllers.

b. Expand Controllers and select any APIC.

c. Expand the APIC and select 'cluster as seen by node'.
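If you prefer the CLI, you can run a quick check from any APIC (a sketch; the exact columns and wording vary by release):

apic1# show controller

Confirm that every controller in the output reports an operational state of 'available' and a health state of 'fully-fit' before proceeding.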


Step 2. Cable the replacement APIC M3/L3 servers.


Physically install the replacement servers in the data center and cable them to the existing ACI fabric as you would any server.


If necessary, ensure LLDP is disabled at the CIMC NIC level.


Cable the Out of Band (OOB) management connection. The new APIC will simply take over the IP address of the server it is replacing.


Step 3. Power up the replacement APIC M3/L3 servers.


Power up all APIC M3/L3 servers and bring up a virtual keyboard/video/mouse (vKVM) session, a Serial over LAN (SoL) session, or a physical VGA connection so you can monitor their boot process. After a few minutes, you will be prompted to press any key to continue. Do not press a key just yet. Leave the APIC M3/L3 servers at that stage for the time being.


Step 4. Record the name, infra VLAN, TEP pool, and multicast pool of your existing fabric.


You must obtain the exact name of your current fabric, the infra VLAN in use, the TEP address pool, and the multicast IP pool.


To retrieve these values, enter the following commands on any existing APIC:


1) Name of the current fabric domain

apic1# acidiag avread | egrep -i --color 'fabric_domain'

2) Infra VLAN

apic1# ifconfig | grep -i bond0


3) TEP pool

apic1# acidiag avread | grep -o -P 'tep address=.{0,18}'

4) Multicast IP pool

apic1# moquery -c fvBD | grep -E "name|bcastP|dn" | grep -B 2 "infra"
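For illustration, here is roughly what to expect on a fabric built with the defaults (the values below are hypothetical placeholders; yours will differ):

apic1# ifconfig | grep -i bond0
bond0.3967: flags=...

The number after 'bond0.' (here 3967, the common default) is the infra VLAN.

apic1# acidiag avread | grep -o -P 'tep address=.{0,18}'
tep address=10.0.0.1/16

The TEP pool in this example is therefore 10.0.0.0/16. Record all four values exactly as reported; the replacement APICs must be configured with the very same fabric name, infra VLAN, TEP pool, and multicast pool.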


Step 5. Decommission the old APIC.


From the APIC 'cluster as seen by node' view, decommission the last APIC by right-clicking that APIC and selecting Decommission.


Wait for the status to change from 'in service' to 'out of service'.


Now log into the APIC's CIMC, or attach a physical keyboard and monitor to its back, so you can initiate a power-off sequence.


When the old APIC is out of service, power it off:


From CIMC: Power > Power Off System
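Alternatively, from the CIMC CLI (a sketch; prompt text varies slightly across CIMC releases):

C220# scope chassis
C220 /chassis # power off
This operation will change the server's power state. Continue?[y|N] y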


After a minute, the status of this APIC will change to unregistered.


Step 6. Bring in the replacement APIC.

Pick one of the new APIC M3/L3 servers that is sitting at the 'press any key to continue' prompt.


You will be prompted to configure this APIC with the fabric name, infra VLAN, TEP pool, and multicast pool you recorded in Step 4; these values must match the existing fabric exactly.


Once you have entered all parameters, you will be asked whether you want to modify them. Reply 'N' unless you made a mistake that you'd like to correct.
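For illustration, the initial setup dialog looks roughly like the following (a sketch of the standard APIC setup script; prompts and defaults vary by release, and the values shown are placeholders to be replaced with those recorded in Step 4):

Enter the fabric name [ACI Fabric1]: ACI-Fabric1
Enter the fabric ID (1-128) [1]:
Enter the number of active controllers in the fabric (1-9) [3]:
Enter the POD ID (1-9) [1]:
Is this a standby controller? [NO]:
Enter the controller ID (1-3) [3]:
Enter the controller name [apic3]:
Enter address pool for TEP addresses [10.0.0.0/16]: 10.0.0.0/16
Enter the VLAN ID for infra network (2-4094): 3967
Out-of-band management configuration...
Enter the IPv4 address [192.168.10.1/24]: <the OOB IP of the server being replaced>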


Step 7. Register the new APIC for cluster membership.


After roughly 7 to 10 minutes, the new server appears as unregistered in the 'cluster as seen by node' tab in the UI. Right-click the server and commission it. Wait until the health state of the new server, and of all other servers, is fully fit before continuing any further. This usually takes about 5 minutes.


Step 8. Validate the cluster membership.

After 5 minutes or so, you will observe transitions in operational state and health status. The new server first shows a 'data layer partially diverged' state before fully converging.


Shortly after, the new server’s database is fully in sync with the other members of the cluster. This is reflected in a fully fit health state.
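If you prefer to follow this convergence from the CLI, you can periodically re-run the following command on an existing APIC (a sketch; the health encoding differs across releases) and wait until every controller again reports a fully fit state:

apic1# acidiag avread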


Remember that to decommission a controller, you need to perform the operation from another server. If you are logged into APIC-1, for instance, do not decommission APIC-1.



Troubleshooting the New Cluster


In most cases, a new cluster member fails to join the cluster because of incorrect configuration parameters (infra VLAN, TEP pool, fabric name, or multicast pool) or incorrect cabling. Double-check these first.


Keep in mind that it takes some time for a new controller to fully converge; wait at least 10 minutes.


You can always log into a non-ready cluster member using the rescue-user account. No password will be required if the cluster is in discovery mode. If a password is required, use the admin password.
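For example (the address is a placeholder; substitute your APIC's OOB IP):

ssh rescue-user@192.168.10.3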


Step 1. Verify the physical interfaces toward the fabric.


Ensure the interfaces toward the fabric are up by entering the cat /proc/net/bonding/bond0 command. At least one interface must be up; this is a necessary and sufficient condition for establishing cluster membership.


However, if only a single interface is up, a major or critical fault will be raised on the APIC.
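Healthy output follows the standard Linux bonding driver report, roughly like this (a sketch; slave interface names vary by server model):

apic3# cat /proc/net/bonding/bond0
Ethernet Channel Bonding Driver: v3.7.1
Bonding Mode: fault-tolerance (active-backup)
Currently Active Slave: eth2-1
MII Status: up

Slave Interface: eth2-1
MII Status: up

Slave Interface: eth2-2
MII Status: up

If a slave shows 'MII Status: down', recheck that cable and the corresponding leaf port.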


You can also run the acidiag bond0test command to validate cabling.


Step 2. Check the cluster health from the new APIC.


Once at the prompt of the new APIC, using either the console, VGA output, or SSH, use the acidiag avread command to examine this APIC's view of the cluster.


If you do not see the other APIC servers, there is probably a configuration parameter mismatch, a cabling problem, or a software release mismatch.


Step 3. Verify the database consistency.


APIC stores all configuration and runtime data in a distributed database that is broken down into units called shards. Shards are triplicated within the cluster for resiliency. The acidiag rvread command lets you inspect whether the database is fully synchronized across the cluster with a consistent data layer.


Use the acidiag rvread command and ensure no forward or backward slashes appear anywhere in the shard or service ID matrix.
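For example:

apic1# acidiag rvread

A clean matrix, with no slashes in any shard or service cell, confirms a consistent data layer (an interpretation sketch; the exact legend of the output may vary by release).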
