Troubleshooting Tools: Instructions on using tools like the Multi-Site troubleshooting report, API call logs, VM data collection, and logs to verify microservices and policy resolution on Cisco APIC sites. It also covers:
Consistency checker
Docker container information
Execution logs
APIC policy resolution
Consistency Checker
The Consistency Checker is a feature in Cisco ACI Multi-Site Orchestrator that verifies your templates after they have been deployed. It is integrated into the user interface and confirms that cross-site mappings are correct.
You can use this tool on any template that has been deployed across at least two sites and includes at least one of the following policies:
Endpoint Group (EPG)
Virtual Routing and Forwarding (VRF)
Bridge Domain (BD)
External EPG
Verifying a Deployed Template Across Sites
To ensure your deployed templates are consistent across multiple sites, follow these steps:
Before You Begin
Ensure that the template you want to verify:
Has been deployed across at least two sites.
Contains at least one of these policies: EPG, VRF, BD, or External EPG.
Steps
Log In: Access the Multi-Site Orchestrator GUI.
Select the Schema:
Navigate to the Schemas section from the main menu.
On the Schema List page, choose the appropriate schema.
Choose the Template:
Click on the deployed template you wish to verify.
Initiate Verification:
In the top-right corner, click on Unverified.
Run the Consistency Checker:
In the Template Verification Summary dialog box, click VERIFY.
A message will appear: "Consistency verification has been successfully triggered."
Review Verification Status:
The status will update to either:
Verification Successful — No action needed.
Verification Failed — Action required.
If it failed:
Click on Verification Failed.
For the site(s) that failed, click the pencil icon to view a detailed report.
Hover over the red X to see the issue description, which could be:
Not Found — Unable to locate the policy.
Mismatch — Misconfiguration detected.
You can then:
Download the report for the current site.
Verify Template across all sites again.
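You can also script this check against the Orchestrator REST API instead of using the GUI. The following is a minimal sketch only: the login and verification endpoint paths, and the token field name, are assumptions rather than the documented API, so confirm them against the API reference for your Orchestrator release.
Example:
# Hypothetical endpoints -- verify against your MSO API reference before use.
# Log in and capture an auth token (path and response field assumed).
TOKEN=$(curl -sk -X POST "https://<mso-ip>/api/v1/auth/login" \
  -H "Content-Type: application/json" \
  -d '{"username":"admin","password":"<password>"}' | jq -r '.token')
# Trigger consistency verification for a deployed template (path assumed).
curl -sk -X POST "https://<mso-ip>/api/v1/schemas/<schema-id>/templates/<template-name>/verify" \
  -H "Authorization: Bearer $TOKEN"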
Setting Up Scheduled Verification for Deployed Templates
Automate the verification process for every deployed template on a per-tenant basis:
Steps
Log In: Access the Multi-Site Orchestrator GUI.
Access Tenant Settings:
Go to the Tenant section from the main menu.
On the Tenant List page, click Set Schedule for the desired tenant.
Configure the Schedule:
In the Consistency Checker Scheduler Settings, uncheck Disable Schedule.
Select the preferred time and frequency for verification.
Click OK to save your settings.
Troubleshooting Verification Errors
If you encounter errors during verification, here's how to troubleshoot them:
Steps
Log In: Access the Multi-Site Orchestrator GUI.
Open the Schema Health Dashboard:
Navigate to the Dashboard.
In the Schema Health section, click on the schema verification icon in the View By field.
You'll see small squares representing templates within each site, color-coded by status:
Green — Passed verification.
Red — Failed verification.
Yellow — Unverified.
Identify Issues:
Expand any schema containing a red indicator to reveal the problematic templates.
Hover over a red site to see its FAILED status.
View Detailed Report:
Click on a failed site to open a detailed report.
Hover over the red X icons to read descriptions of the issues:
Not Found — The policy couldn't be located.
Mismatch — There's a configuration mismatch.
Take Action:
Choose to Download the report for the current site or Verify Template across all sites.
Check Template Statuses:
Review which templates have passed, failed, or remain unverified.
Optional Actions:
To verify the entire schema, click the ... (ellipsis) next to the schema name and select Verify Schema.
To search for specific policies (EPG, BD, VRF, or External EPG), use the search function to find which schemas contain them.
Downloading System Logs
Generate a troubleshooting report and download system logs for all schemas, sites, tenants, and users managed by Cisco ACI Multi-Site Orchestrator:
Steps
Log In: Access the Multi-Site Orchestrator GUI.
Navigate to Tech Support:
From the main menu, select Operations > Tech Support.
Download Logs:
In the System Logs frame, click the Edit button in the top-right corner.
Select the logs you wish to download.
Click the Download button.
An archive file will be downloaded to your system, containing:
All schemas in JSON format.
All site definitions in JSON format.
All tenant definitions in JSON format.
All user definitions in JSON format.
All container logs in the infra_logs.txt file.
Gathering Docker Container Information
You can access one of the Orchestrator VMs to collect information about Docker services and their logs for specific containers. Here’s how to do it:
1) Checking Docker Container Health
To ensure Docker services are running smoothly, use the following command:
# docker service ls
This command displays the health status of each service. Look at the REPLICAS column to make sure all containers are running as expected. If any container is down, it may indicate an issue that needs attention.
Example Output:
ID NAME MODE REPLICAS [...]
ve5m9lwb1qc4 msc_auditservice replicated 1/1 [...]
bl0op2eli7bp msc_authyldapservice replicated 1/1 [...]
uxc6pgzficls msc_authytacacsservice replicated 1/1 [...]
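To spot a degraded service at a glance, you can filter this output with standard Docker CLI formatting; a quick sketch:
Example:
# docker service ls --format '{{.Name}} {{.Replicas}}' \
    | awk '{ split($2, r, "/"); if (r[1] != r[2]) print $1 " is degraded: " $2 }'
The awk filter splits the REPLICAS column on the slash and prints only services whose running count does not match the desired count.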
2) Finding Container IDs
To list all running container IDs, use:
# docker ps
Example Output:
CONTAINER ID IMAGE COMMAND [...]
05f75d088dd1 msc-ui:2.1.2g "/nginx.sh" [...]
0ec142fc639e msc-authyldap:v.4.0.6 "/app/authyldap.bin" [...]
3) If you need the container ID for a specific service, use:
# docker ps | grep <service-name>
Example:
# docker ps | grep executionengine
Output:
685f54b70a0d msc-executionengine:2.1.2g "bin/executionengine" [...]
4) To include containers that have stopped, use:
# docker ps -a | grep <service-name>
Example:
# docker ps -a | grep executionengine
Output:
685f54b70a0d msc-executionengine:2.1.2g "bin/executionengine" Up 2 weeks (healthy)
3870d8031491 msc-executionengine:2.1.2g "bin/executionengine" Exited (143) 2 weeks ago
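In scripts, it is often easier to capture the container ID directly instead of parsing grep output. A small sketch using the standard docker ps name filter:
Example:
# Capture the ID of the running execution engine container.
CID=$(docker ps -q --filter name=executionengine)
echo $CID
685f54b70a0d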
5) Viewing Container Logs
To see the logs for a specific container, use:
# docker logs <container-id>
Note: Container logs can be very large. Make sure you have sufficient bandwidth and disk space before downloading or copying them.
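To keep the output manageable, docker logs accepts standard options for limiting or following the stream, for example:
Example:
# docker logs --tail 100 <container-id>    # only the last 100 lines
# docker logs --since 30m <container-id>   # entries from the last 30 minutes
# docker logs -f <container-id>            # follow live; press Ctrl-C to stop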
6) Log File Location:
/var/lib/docker/containers/<container-id>/
You might find multiple log files like <container-id>-json.log.
Example:
# cd /var/lib/docker/containers
ls -al
total 140
drwx------. 47 root root 4096 Jul 9 14:25 .
drwx--x--x. 14 root root 4096 May 7 08:31 ..
drwx------. 4 root root 4096 Jun 24 09:58 051cf8e374dd9a3a550ba07a2145b92c6065eb1071060abee12743c579e5472e
drwx------. 4 root root 4096 Jul 11 12:20 0eb27524421c2ca0934cec67feb52c53c0e7ec19232fe9c096e9f8de37221ac3
7) To view logs for a specific container:
cd 051cf8e374dd9a3a550ba07a2145b92c6065eb1071060abee12743c579e5472e/
ls -al
total 48
drwx------. 4 root root 4096 Jun 24 09:58 .
drwx------. 47 root root 4096 Jul 9 14:25 ..
-rw-r-----. 1 root root 4572 Jun 24 09:58 051cf8e374dd9a3a550ba07a2145b92c6065eb1071060abee12743c579e5472e-json.log
drwx------. 2 root root 6 Jun 24 09:58 checkpoints
-rw-------. 1 root root 4324 Jun 24 09:58 config.v2.json
-rw-r--r--. 1 root root 1200 Jun 24 09:58 hostconfig.json
-rw-r--r--. 1 root root 13 Jun 24 09:58 hostname
-rw-r--r--. 1 root root 173 Jun 24 09:58 hosts
drwx------. 3 root root 16 Jun 24 09:58 mounts
-rw-r--r--. 1 root root 38 Jun 24 09:58 resolv.conf
-rw-r--r--. 1 root root 71 Jun 24 09:58 resolv.conf.hash
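Each line in the -json.log file is a single JSON object with log, stream, and time fields. If the jq utility is available on the VM (an assumption; it is not necessarily part of the Orchestrator image), you can strip the JSON wrapper and read just the log text:
Example:
# jq -r '.log' <container-id>-json.log | less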
8) Viewing Docker Networks
To see all Docker networks, use:
# docker network list
NETWORK ID NAME DRIVER SCOPE
c0ab476dfb0a bridge bridge local
79f5e2d63623 docker_gwbridge bridge local
dee475371fcb host host local
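To see the subnet configuration of a network and which containers are attached to it, inspect it by name or ID:
Example:
# docker network inspect docker_gwbridge
# docker network inspect -f '{{range .IPAM.Config}}{{.Subnet}}{{end}}' docker_gwbridge   # subnet only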
Generating the API Call Logs
You can access the Multi-Site Orchestrator API call logs through the Infra Logs in a Troubleshooting Report.
You can also access the Multi-Site API call logs directly on the Orchestrator nodes with the following steps:
Procedure
Step 1
Locate the worker node that has the msc-executionengine service running, as in the following example:
Example:
[root@worker1 ~]# docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
1538a9289381 msc-kong:latest "/docker-entrypoin..." 2 weeks ago Up 2 weeks 7946/tcp, 8000-8001/tcp, 8443/tcp msc_kong.1.ksdw45p0qhb6c08i3c8i4ketc
cc693965f502 msc-executionengine:latest "bin/executionengine" 2 weeks ago Up 2 weeks (healthy) 9030/tcp msc_executionengine.1.nv4j5uj5786yj621wjxsxvgxl
00f627c6804c msc-platformservice:latest "bin/platformservice" 2 weeks ago Up 2 weeks (healthy) 9050/tcp msc_platformservice.1.fw58jr62dfcme4noh67am0s73
In this case, the container ID is cc693965f502 and the image is msc-executionengine:latest. Locate the corresponding <container-id>-json.log file, which contains the API calls from Multi-Site to the APIC controllers.
Step 2
Enter the command in the following example:
Example:
# cd /var/lib/docker/containers/cc693965f5027f291d3af4a6f2706b19f4ccdf6610de3f7ccd32e1139e31e712
# ls
cc693965f5027f291d3af4a6f2706b19f4ccdf6610de3f7ccd32e1139e31e712-json.log checkpoints config.v2.json hostconfig.json hostname
hosts resolv.conf resolv.conf.hash shm
# less \
cc693965f5027f291d3af4a6f2706b19f4ccdf6610de3f7ccd32e1139e31e712-json.log | grep intersite
{"log":" \u003cfvBD name=\"internal\" arpFlood=\"yes\" intersiteBumTrafficAllow=\"yes\" unkMacUcastAct=\"proxy\"
intersiteL2Stretch=\"yes\"\u003e\n","stream":"stdout","time":"2017-07-25T08:41:51.241428676Z"}
{"log":" \"intersiteBumTrafficAllow\" : true,\n","stream":"stdout","time":"2017-07-27T07:17:55.418934202Z"}
Reading the Execution Log
The execution log provides three different kinds of log information:
Websocket refresh information that is printed out every 5 minutes.
2017-07-11 18:02:45,541 [debug] execution.serice.monitor.WSAPicActor - WebSocket connection open
2017-07-11 18:02:45,542 [debug] execution.serice.monitor.WSAPicActor - Client 3 intialized
2017-07-11 18:02:45,551 [debug] execution.serice.monitor.WSAPicActor - WSAPicActor stashing message Monitor Policy(WSMonitorQuery(/api/class/fvRsNodeAtt,?subscript
2017-07-11 18:02:45,551 [debug] execution.serice.monitor.WSAPicActor - WSAPicActor stashing message RefreshClientTokenFailed()
The schema to push and the plan being generated.
Websocket monitoring VNID for cross VNID programming.
Note the following signs of errors:
Log lines that begin with a red error marker.
Stack traces for exceptions.
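A quick way to scan for these error signs without paging through the entire log is to grep the execution engine's container log; a small sketch:
Example:
# docker logs $(docker ps -q --filter name=executionengine) 2>&1 | grep -iE "error|exception"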
Verifying Policy Resolution on APIC Sites
In this task, you use REST API MO queries on the local APIC sites or switches to view the policies resolved on an APIC for a site managed by Cisco ACI Multi-Site.
For diagrams of the managed objects (MO) relationships, see the Cisco APIC Management Information Model Reference (MIM). For example, in the MIM, see the diagram for fv:FabricExtConnP.
Procedure
Step 1
To view details for the logical MOs under the Fabric External Connection Profile (fabricExtConnP), log on to the APIC CLI and enter the following MO query:
Example:
admin@apic1:~> moquery -c fvFabricExtConnP -x "query-target=subtree" | egrep "#|dn"
# fv.IntersiteMcastConnP
dn: uni/tn-infra/fabricExtConnP-1/intersiteMcastConnP
# fv.IntersitePeeringP
dn: uni/tn-infra/fabricExtConnP-1/ispeeringP
# fv.IntersiteConnP
dn: uni/tn-infra/fabricExtConnP-1/podConnP-1/intersiteConnP-[5.5.5.1/32]
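The same class query can also be issued as a raw REST call from the APIC itself, which returns the full JSON payload for each object rather than the summarized moquery view. A sketch using icurl against the local API endpoint:
Example:
admin@apic1:~> icurl 'http://localhost:7777/api/class/fvFabricExtConnP.json?query-target=subtree'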
Step 2
To view the logical MOs for the L3Out used for Multi-Site connections, log on to the APIC CLI and enter an MO query, such as the following:
Example:
admin@apic1:~> moquery -c l3extOut -x "query-target=subtree" | egrep "#|dn.*intersite" | grep -B 1 dn
# bgp.ExtP
dn: uni/tn-infra/out-intersite/bgpExtP
# fv.RsCustQosPol
dn: uni/tn-infra/out-intersite/instP-intersiteInstP/rscustQosPol
# l3ext.InstP
dn: uni/tn-infra/out-intersite/instP-intersiteInstP
# bgp.AsP
dn: uni/tn-infra/out-intersite/lnodep-node-501-profile/infraPeerP-[6.6.6.3]/as
# bgp.RsPeerPfxPol
Step 3
To view the resolved MOs for an APIC local site, log on to the APIC CLI and enter an MO query such as the following:
Example:
admin@apic1:~> moquery -c fvSite -x "query-target=subtree" | egrep "#|dn"
# fv.RemoteBdDef
dn: resPolCont/sitecont/site-6/remotebddef-[uni/tn-msite-tenant-welkin/BD-internal]
# fv.RemoteCtxDef
dn: resPolCont/sitecont/site-6/remotectxdef-[uni/tn-msite-tenant-welkin/ctx-dev]
# fv.RemoteEPgDef
dn: resPolCont/sitecont/site-6/remoteepgdef-[uni/tn-msite-tenant-welkin/ap-Ebiz/epg-data]
Step 4
To view the concrete MOs on a switch for a Multi-Site site, log on to the switch and enter an MO query such as the following:
Example:
spine501# moquery -c dci.LocalSite -x "query-target=subtree" | egrep "#|dn"
# l2.RtToLocalBdSubstitute // (site5 vrf 2195456 -> bd 15794150 is translated to site6 vrf 2326528 -> bd 16449430)
dn: sys/inst-overlay-1/localSite-5/localCtxSubstitute-[vxlan-2195456]/localBdSubstitute-[vxlan-15794150]/rttoLocalBdSubstitute-[sys/inst-overlay-1/remoteSite-6/remoteCtxSubstitute-[vxlan-2326528]/remoteBdSubstitute-[vxlan-16449430]]
# l2.LocalBdSubstitute
dn: sys/inst-overlay-1/localSite-5/localCtxSubstitute-[vxlan-2195456]/localBdSubstitute-[vxlan-15794150]
What to look for: The output displays the data translated between sites. In this example, the original data on the sites was as follows:
site5: vrf msite-tenant-welkin:dev -> vxlan 2195456, bd internal -> vxlan 15794150, epg web: access-encap 200 -> pcTag 49154, access-encap 201 -> pcTag 16387
site6: vrf msite-tenant-welkin:dev -> vxlan 2326528, bd internal -> vxlan 16449430, epg web: access-encap 200 -> pcTag 32770, access-encap 201 -> pcTag 16386
Step 5
To verify the concrete MOs for a remote site, enter an MO query such as the following:
Example:
spine501# moquery -c dci.RemoteSite -x "query-target=subtree" | egrep "#|dn"
# dci.AnycastExtn
dn: sys/inst-overlay-1/remoteSite-6/anycastExtn-[6.6.6.1/32]
// attribute is_unicast is Yes, Unicast ETEP
# dci.AnycastExtn
dn: sys/inst-overlay-1/remoteSite-6/anycastExtn-[6.6.6.2/32]
// attribute is_unicast is No, Multicast ETEP
# l2.RsToLocalBdSubstitute
dn: sys/inst-overlay-1/remoteSite-6/remoteCtxSubstitute-[vxlan-2326528]/remoteBdSubstitute-
[vxlan-16449430]/rsToLocalBdSubstitute
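If you only want to confirm which ETEP entry is the unicast address and which is the multicast address, you can filter the same query for the dn and the unicast attribute. The exact attribute name can vary by release, so treat this as a sketch:
Example:
spine501# moquery -c dci.AnycastExtn | egrep -i "dn|unicast"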