Hi Team,
Today I am going to explain the different issues faced during IOS upgrades and the workarounds
to resolve them.
General Issue:
If, after an upgrade or for any other reason, an RSP goes into the unknown state, first ask the
FE to take a clear snapshot of the LED status.
Jack out and jack in (JOJI) the RSP and observe whether it comes online.
Ask the FE to connect the console cable to the problematic RSP and check whether it is stuck in
rommon> mode.
If it is stuck in rommon mode, try to boot the RSP from bootflash or from a USB drive (for USB,
first copy the image file from the active RSP).
If the standby RSP does not come online, or reloads continuously even after multiple reinsertions
and stays stuck in the standby-cold state followed by continuous booting, follow the steps below:
Collect the logs below before the activity:
sh ip int brief
sh environment
sh platform
sh isis neighbors
sh int description
sh int description | i up
sh int description | i down
sh hw-module all fpd
sh ver
sh mpls l2transport vc
sh ip bgp all summary
sh redundancy
show ip ospf neighbor
show rep topology detail
show standby brief
show vrrp brief
show ip arp
sh running-config
sh tech-support
Jack out the active RSP from slot X.
Jack in the problematic RSP (the one inserted before) at slot Y and observe whether it comes
online.
If it comes online, jack in the previously active RSP at slot X (it will now be standby, while
the slot Y RSP becomes active) and observe until it comes online.
Keep 2-3 spare RSPs on standby in case there is an issue with the previous two RSPs.
Collect all session logs, console logs and LED snapshots.
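As a cross-check on the log collection step, here is a minimal Python sketch that scans a saved session log and reports which commands from the list above were never entered. The function name missing_commands is hypothetical, and the sketch assumes each command was typed exactly as listed, directly after a "#" prompt.

```python
# Commands from the pre-activity checklist above.
REQUIRED_COMMANDS = [
    "sh ip int brief", "sh environment", "sh platform", "sh isis neighbors",
    "sh int description", "sh int description | i up",
    "sh int description | i down", "sh hw-module all fpd", "sh ver",
    "sh mpls l2transport vc", "sh ip bgp all summary", "sh redundancy",
    "show ip ospf neighbor", "show rep topology detail", "show standby brief",
    "show vrrp brief", "show ip arp", "sh running-config", "sh tech-support",
]


def missing_commands(session_log, required=REQUIRED_COMMANDS):
    """Return the checklist commands that never appear after a '#' prompt."""
    typed = set()
    for line in session_log.splitlines():
        # Split "Router#sh ver" into prompt and command at the first '#'.
        _, sep, command = line.partition("#")
        if sep:
            # Collapse repeated spaces so "sh  ver" still counts as "sh ver".
            typed.add(" ".join(command.split()))
    return [cmd for cmd in required if cmd not in typed]
```

Running it on the session log before the jack-out step gives a quick list of anything still to be collected.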
Bulk-sync issue:
If the console/terminal log shows that the standby RSP is reloading due to bulk-sync errors,
do the following.
Run the commands below to check for sync mismatches between the active and standby RPs:
#show redundancy config-sync failures bem
#show redundancy config-sync failures mcl
#show redundancy config-sync failures prc
If you find mismatched lines, there are three ways to work around this:
1- Remove the problematic config lines from the configuration and re-add them after the standby
RSP has booted successfully
2- Run "no policy config-sync lbl prc" and "no policy config-sync bulk"
3- Run the command "#redundancy config-sync ignore mismatched-commands" and reload the RSP
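To spot config-sync problems in a saved console log before running the commands above, a small Python sketch like the one below may help. It relies only on the standard Cisco syslog header layout (%FACILITY-SEVERITY-MNEMONIC) and treats any HA_CONFIG_SYNC message of severity 3 or more severe as an error. The function names are hypothetical, and the exact failure mnemonics are not assumed.

```python
import re

# Cisco syslog header: %FACILITY-SEVERITY-MNEMONIC: message text
SYSLOG_RE = re.compile(
    r"%(?P<facility>[A-Z0-9_]+)-(?P<severity>[0-7])-(?P<mnemonic>[A-Z0-9_]+): ?(?P<text>.*)"
)


def config_sync_events(console_log):
    """Return (severity, mnemonic, text) for every HA_CONFIG_SYNC message."""
    events = []
    for line in console_log.splitlines():
        match = SYSLOG_RE.search(line)
        if match and match.group("facility") == "HA_CONFIG_SYNC":
            events.append(
                (int(match.group("severity")), match.group("mnemonic"), match.group("text"))
            )
    return events


def has_sync_errors(console_log):
    """True if any HA_CONFIG_SYNC message has severity 3 (error) or lower number."""
    return any(severity <= 3 for severity, _, _ in config_sync_events(console_log))
```

A log containing only "%HA_CONFIG_SYNC-6-BULK_CFGSYNC_SUCCEED: Bulk Sync succeeded" is reported as having no sync errors.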
Unable to copy/view files in flash: follow the steps below.
1 If the active RSP has the issue, do an RSP switchover; otherwise start from step 2
2 Soft reboot the problematic RSP (if it was active, it is standby after the switchover)
3 Try to boot it from a USB drive with software copied from the other RSP
4 Check whether the errors shown below still occur when copying the image
5 If they do, hard reboot the RSP and check again
6 If the issue persists, replace the RSP with a spare
7 Collect all console logs, session logs and LED snapshots.
CHCHEVIRCAG02CA903#format bootflash:
Format operation may take a while. Continue? [confirm]
Format operation will destroy all data in "bootflash:". Continue? [confirm]
The system was booted from bootflash: and is running from media.
In order to reformat bootflash: you must boot from consolidated package
or boot from a different media device.
mount: /dev/bootflash1 already mounted or /bootflash busy
mount: according to mtab, /dev/bootflash1 is already mounted on /bootflash
%Error formatting bootflash: (I/O error)
CHCHEVIRCAG02CA903#mkdir bootflash:XE313ES
Create directory filename [XE313ES]?
%Error Creating dir bootflash:/XE313ES (File exists)
Unable to upgrade new image:
If the new image has been copied to bootflash but the expected and calculated checksum values
do not match during the upgrade, change the USB drive and try again.
Compare the checksum of the copied image with the original checksum; they should be identical.
For example:
903#verify /md5 bootflash:/Image/asr903rsp1-universalk9_npe.03.16.01a.S.155-3.S1a-ext.bin
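The same comparison can be done on the laptop before the image is copied to the USB drive. The Python sketch below computes the MD5 of a local file in chunks and compares it with the published value. The function names are hypothetical, and it assumes you have the original MD5 string for the image at hand.

```python
import hashlib


def md5_of_file(path, chunk_size=1024 * 1024):
    """Compute the MD5 hex digest of a file without loading it all into memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_matches(path, published_md5):
    """Compare the local file's MD5 with the published value, ignoring case."""
    return md5_of_file(path) == published_md5.strip().lower()
```

If checksum_matches returns False for the file on the USB drive, the copy itself is suspect and the drive should be changed before attempting the upgrade again.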
Standby RSP in standby-cold state:
If the standby RSP is up but stuck in the standby-cold state, check the redundancy mode, which
should be SSO and not RPR.
CHCHEVIRCAG02CA903#sh redundancy
Redundant System Information :
------------------------------
Available system uptime = 48 weeks, 10 hours, 16 minutes
Switchovers system experienced = 2
Standby failures = 0
Last switchover reason = active unit removed
Hardware Mode = Duplex
Configured Redundancy Mode = sso
Operating Redundancy Mode = sso
Maintenance Mode = Disabled
Communications = Up
If the redundancy mode is rpr, change it to sso in order to bring the standby to the standby-hot
state. The standby RSP will reboot during the change.
CHCHEVIRCAG02CA903(config)#redundancy
CHCHEVIRCAG02CA903(config-red)#mode ?
rpr Route Processor Redundancy
sso Stateful Switchover
CHCHEVIRCAG02CA903(config-red)#mode sso
CHCHEVIRCAG02CA903(config-red)#exit
CHCHEVIRCAG02CA903(config)#exit
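The redundancy mode check can also be run against a saved "sh redundancy" capture. Below is a minimal Python sketch with hypothetical function names. It assumes the output uses the "Configured Redundancy Mode = sso" layout shown above.

```python
def redundancy_modes(show_redundancy_output):
    """Extract the 'Configured' and 'Operating' redundancy modes as a dict."""
    modes = {}
    for line in show_redundancy_output.splitlines():
        key, sep, value = line.partition("=")
        if sep and "Redundancy Mode" in key:
            modes[key.strip()] = value.strip().lower()
    return modes


def needs_sso_change(show_redundancy_output):
    """True when the configured mode is anything other than sso (for example rpr)."""
    modes = redundancy_modes(show_redundancy_output)
    return modes.get("Configured Redundancy Mode") != "sso"
```

For the capture above, needs_sso_change returns False because both the configured and operating modes are already sso.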
Case study:
Observation:-
The RSP went down on router DLDELKHOCAG01CA903.
No logs were taken before the RSP was replaced with a spare.
We therefore arranged a spare chassis and tested the faulty RSP in its standby slot.
We connected a working RSP in the active slot and collected logs from both the active and the
standby RSP.
Initially the faulty RSP came up after a normal boot process, but after some time it went down
again.
Router#sh platform
Chassis type: ASR-903
Slot Type State Insert time (ago)
--------- ------------------- --------------------- -----------------
R0 booting 00:00:47
R1 A903-RSP1B-55 ok, active 00:05:46
F0 unknown 00:00:47
F1 ok, active 00:05:46
P0 A900-PWR550-D ok 00:05:01
P1 A900-PWR550-D ps, fail 00:04:58
P2 Unknown N/A never
Slot CPLD Version Firmware Version
--------- ------------------- ---------------------------------------
R0 N/A N/A
R1 11102133 15.3(2r)S
F0 N/A N/A
F1 11102133 15.3(2r)S
++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Router#sh platform
Chassis type: ASR-903
Slot Type State Insert time (ago)
--------- ------------------- --------------------- -----------------
R0 A903-RSP1B-55 init, standby 00:04:25
R1 A903-RSP1B-55 ok, active 00:09:24
F0 init, standby 00:04:25
F1 ok, active 00:09:24
P0 A900-PWR550-D ok 00:08:39
P1 A900-PWR550-D ps, fail 00:08:36
P2 Unknown N/A never
Slot CPLD Version Firmware Version
--------- ------------------- ---------------------------------------
R0 11102133 15.3(2r)S
R1 11102133 15.3(2r)S
F0 11102133 15.3(2r)S
F1 11102133 15.3(2r)S
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Router#sh platform
Chassis type: ASR-903
Slot Type State Insert time (ago)
--------- ------------------- --------------------- -----------------
R0 A903-RSP1B-55 ok, standby 00:07:14 >>>>>>>>>> RSP came up after normal booting
R1 A903-RSP1B-55 ok, active 00:12:13
F0 ok, standby 00:07:14
F1 ok, active 00:12:13
P0 A900-PWR550-D ok 00:11:28
P1 A900-PWR550-D ps, fail 00:11:25
P2 Unknown N/A never
Slot CPLD Version Firmware Version
--------- ------------------- ---------------------------------------
R0 11102133 15.3(2r)S
R1 11102133 15.3(2r)S
F0 11102133 15.3(2r)S
F1 11102133 15.3(2r)S
Router#sh platform
Chassis type: ASR-903
Slot Type State Insert time (ago)
--------- ------------------- --------------------- -----------------
R0 booting 00:01:09 >>>>>>>>>> Again it went down after some time
R1 A903-RSP1B-55 ok, active 00:20:41
F0 unknown 00:01:09
F1 ok, active 00:20:41
P0 A900-PWR550-D ok 00:19:56
P1 A900-PWR550-D ps, fail 00:19:53
P2 Unknown N/A never
Slot CPLD Version Firmware Version
--------- ------------------- ---------------------------------------
R0 N/A N/A
R1 11102133 15.3(2r)S
F0 N/A N/A
F1 11102133 15.3(2r)S
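The "sh platform" captures above can be compared quickly with a small parser. The Python sketch below pulls the state of each R/F/P slot from one capture and lists the slots that are not "ok". The function names are hypothetical. The list of states in the regular expression is taken only from the outputs in this mail, and the Type column is assumed to contain no spaces.

```python
import re

# One slot row: slot, optional type, a known state, then the insert time.
SLOT_RE = re.compile(
    r"^(?P<slot>[RFP]\d)\s+"
    r"(?:(?P<type>\S+)\s+)?"
    r"(?P<state>ok, active|ok, standby|init, standby|ps, fail|ok|booting|unknown|N/A)\s+"
    r"(?P<ago>\d+:\d\d:\d\d|never)"
)


def slot_states(show_platform_output):
    """Map each slot (R0, F1, P0, ...) to its state string."""
    states = {}
    for line in show_platform_output.splitlines():
        match = SLOT_RE.match(line.strip())
        if match:  # firmware-table rows and headers do not match and are skipped
            states[match.group("slot")] = match.group("state")
    return states


def unhealthy_slots(show_platform_output):
    """Return slots whose state is not 'ok...'; never-inserted (N/A) slots are ignored."""
    return {
        slot: state
        for slot, state in slot_states(show_platform_output).items()
        if not state.startswith("ok") and state != "N/A"
    }
```

Applied to each capture in turn, this makes the R0/F0 flapping easy to follow and also surfaces the failed P1 power supply, which stays in "ps, fail" throughout.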
On the active RSP, we observed "No space left on device" errors:
*Dec 7 08:23:04.641: %IOSXE-5-PLATFORM: F0: kernel: scsi 0:0:0:0: Direct-Access SanDisk Cruzer
Blade 1.00 PQ: 0 ANSI: 6
*Dec 7 08:23:04.641: %IOSXE-5-PLATFORM: F0: kernel: sd 0:0:0:0: Attached scsi generic sg0 type 0
*Dec 7 08:23:04.645: %IOSXE-5-PLATFORM: F0: kernel: sd 0:0:0:0: [sda] 30464000 512-byte logical
blocks: (15.5 GB/14.5 GiB)
*Dec 7 08:23:04.660: %IOSXE-5-PLATFORM: F0: kernel: sd 0:0:0:0: [sda] Attached SCSI removable
disk
*Dec 7 08:23:26.316: %BTRACE_ROTATE-3-ARCHIVE_FAIL: SIP0: btrace_rotate.sh: Error archiving
trace file - cman_fp_F0-0.log.13499.20171207081124.gz(error:'cp: cannot create regular file
`/harddisk/tracelogs/cman_fp_F0-0.log.13499.20171207081124.gz': No space left on device' for file
created:2017-12-07:08:23:26, closed:2017-12-07:08:11:24)
*Dec 7 08:23:36.762: %BTRACE_ROTATE-3-ARCHIVE_FAIL: SIP0: btrace_rotate.sh: Error archiving
trace file - cmcc_0-0.log.13363.20171207081133.gz(error:'cp: cannot create regular file
`/harddisk/tracelogs/cmcc_0-0.log.13363.20171207081133.gz': No space left on device' for file
created:2017-12-07:08:23:36, closed:2017-12-07:08:11:33)
*Dec 7 08:28:35.904: %IOSXE_OIR-6-OFFLINECARD: Card (rp) offline in slot R0
*Dec 7 08:28:35.921: %REDUNDANCY-3-STANDBY_LOST: Standby processor fault
(PEER_NOT_PRESENT)
*Dec 7 08:28:35.921: %REDUNDANCY-3-STANDBY_LOST: Standby processor fault (PEER_DOWN)
*Dec 7 08:28:35.921: %REDUNDANCY-3-STANDBY_LOST: Standby processor fault
(PEER_REDUNDANCY_STATE_CHANGE)
*Dec 7 08:28:36.015: %IOSXE_OIR-6-OFFLINECARD: Card (fp) offline in slot F0
*Dec 7 08:28:35.945: %CMRP-6-FP_HA_STATUS: R1/0: cmand: F1 redundancy state is Active with
no Standby
*Dec 7 08:28:35.962: %CMRP-6-RP_SB_RELOAD_REQ: R1/0: cmand: Reloading Standby RP: initiated
by RF reload message
*Dec 7 08:28:36.968: %IOSXE_OIR-6-REMCARD: Card (rp) removed from slot R0
*Dec 7 08:28:37.079: %IOSXE_OIR-6-REMCARD: Card (fp) removed from slot F0
*Dec 7 08:28:37.185: %IOSXE_OIR-6-REMCARD: Card (cc) removed from slot 0
*Dec 7 08:28:40.268: %IOSXE_OIR-6-INSCARD: Card (rp) inserted in slot R0
*Dec 7 08:28:40.270: %IOSXE_OIR-6-INSCARD: Card (fp) inserted in slot F0
*Dec 7 08:28:40.281: %IOSXE_OIR-6-INSCARD: Card (cc) inserted in slot 0
*Dec 7 08:28:46.553: %RF-5-RF_RELOAD: Peer reload. Reason: EHSA standby down
We replaced the flash card and reinserted the faulty card. It came up after a normal boot but
went down again:
*Dec 7 08:38:58.830: %IOSXE_OIR-6-ONLINECARD: Card (rp) online in slot R0
*Dec 7 08:39:09.540: %IOSXE_OIR-6-ONLINECARD: Card (fp) online in slot F0
*Dec 7 08:39:25.816: %IOSXE_OIR-6-OFFLINECARD: Card (cc) offline in slot 0
*Dec 7 08:39:25.817: %IOSXE_OIR-6-OFFLINECARD: Card (cc) offline in slot 0
*Dec 7 08:39:49.642: %IOSXE_OIR-6-OFFLINECARD: Card (rp) offline in slot R0
*Dec 7 08:40:41.967: %IOSXE_OIR-6-OFFLINECARD: Card (fp) offline in slot F0
*Dec 7 08:45:55.708: %IOSXE_OIR-6-REMCARD: Card (rp) removed from slot R0
*Dec 7 08:45:55.808: %IOSXE_OIR-6-REMCARD: Card (fp) removed from slot F0
*Dec 7 08:45:55.914: %IOSXE_OIR-6-REMCARD: Card (cc) removed from slot 0
*Dec 7 08:49:14.609: %IOSXE_OIR-6-INSCARD: Card (rp) inserted in slot R0
*Dec 7 08:49:14.620: %IOSXE_OIR-6-INSCARD: Card (fp) inserted in slot F0
*Dec 7 08:49:14.621: %IOSXE_OIR-6-INSCARD: Card (cc) inserted in slot 0
*Dec 7 09:00:47.899: %IOSXE_OIR-6-ONLINECARD: Card (rp) online in slot R0
*Dec 7 09:01:00.067: %IOSXE_OIR-6-ONLINECARD: Card (fp) online in slot F0
*Dec 7 09:01:15.313: %IOSXE_OIR-6-ONLINECARD: Card (cc) online in slot 0
*Dec 7 09:01:15.318: %IOSXE_OIR-6-OFFLINECARD: Card (cc) offline in slot 0
*Dec 7 09:01:55.794: %REDUNDANCY-5-PEER_MONITOR_EVENT: Active detected a standby
insertion (raw-event=PEER_FOUND(4))
*Dec 7 09:01:55.794: %REDUNDANCY-5-PEER_MONITOR_EVENT: Active detected a standby
insertion (raw-event=PEER_REDUNDANCY_STATE_CHANGE(5))
*Dec 7 09:01:58.618: %REDUNDANCY-3-IPC: IOS versions do not match.
*Dec 7 09:02:28.478: %CMRP-6-FP_HA_STATUS: R1/0: cmand: F0 redundancy state is Standby
*Dec 7 09:02:31.963: %XDR-6-ISSUCLIENTABSENT: XDR client IPv6 table broker absent on slot 6 (6).
Client functionality may be affected.
*Dec 7 09:02:59.965: %IOSXE_OIR-6-INSCARD: Card (cc) inserted in slot 0
*Dec 7 09:03:40.600: %HA_CONFIG_SYNC-6-BULK_CFGSYNC_SUCCEED: Bulk Sync succeeded
Router#sh platform
Chassis type: ASR-903
Slot Type State Insert time (ago)
--------- ------------------- --------------------- -----------------
R0 A903-RSP1B-55 ok, standby 00:11:42
R1 A903-RSP1B-55 ok, active 00:54:58
F0 ok, standby 00:11:42
F1 ok, active 00:54:58
P0 A900-PWR550-D ok 00:54:12
P1 A900-PWR550-D ps, fail 00:54:10
P2 Unknown N/A never
Slot CPLD Version Firmware Version
--------- ------------------- ---------------------------------------
R0 11102133 15.3(2r)S
R1 11102133 15.3(2r)S
F0 11102133 15.3(2r)S
F1 11102133 15.3(2r)S
The RSP went down again after some time:
*Dec 7 09:07:37.831: %IOSXE_OIR-6-OFFLINECARD: Card (rp) offline in slot R0
*Dec 7 09:07:37.879: %IOSXE_OIR-6-OFFLINECARD: Card (fp) offline in slot F0
*Dec 7 09:07:37.873: %CMRP-6-FP_HA_STATUS: R1/0: cmand: F1 redundancy state is Active with
no Standby
*Dec 7 09:07:37.890: %CMRP-6-RP_SB_RELOAD_REQ: R1/0: cmand: Reloading Standby RP:
initiated by RF reload message
*Dec 7 09:07:37.892: %REDUNDANCY-3-STANDBY_LOST: Standby processor fault
(PEER_NOT_PRESENT)
*Dec 7 09:07:37.893: %REDUNDANCY-3-STANDBY_LOST: Standby processor fault (PEER_DOWN)
*Dec 7 09:07:37.893: %REDUNDANCY-3-STANDBY_LOST: Standby processor fault
(PEER_REDUNDANCY_STATE_CHANGE)
*Dec 7 09:07:38.890: %IOSXE_OIR-6-REMCARD: Card (rp) removed from slot R0
*Dec 7 09:07:39.008: %IOSXE_OIR-6-REMCARD: Card (fp) removed from slot F0
*Dec 7 09:07:39.116: %IOSXE_OIR-6-REMCARD: Card (cc) removed from slot 0
*Dec 7 09:07:42.195: %IOSXE_OIR-6-INSCARD: Card (rp) inserted in slot R0
*Dec 7 09:07:42.197: %IOSXE_OIR-6-INSCARD: Card (fp) inserted in slot F0
*Dec 7 09:07:42.209: %IOSXE_OIR-6-INSCARD: Card (cc) inserted in slot 0
*Dec 7 09:07:46.321: %RF-5-RF_RELOAD: Peer reload. Reason: EHSA standby down
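To quantify how long the faulty RSP stayed up on each attempt, the ONLINECARD/OFFLINECARD messages can be paired up. The Python sketch below does this for card "rp" in slot R0. The function name is hypothetical. The year is not present in the log timestamps, so it is passed in as an assumption (2017, as suggested by the trace file dates), and month names are assumed to be parsed in an English locale.

```python
import re
from datetime import datetime

# Matches e.g. "*Dec 7 08:38:58.830: %IOSXE_OIR-6-ONLINECARD: Card (rp) online in slot R0"
OIR_RE = re.compile(
    r"^\*(?P<stamp>\w{3}\s+\d{1,2} \d\d:\d\d:\d\d\.\d{3}): "
    r"%IOSXE_OIR-6-(?P<event>ONLINECARD|OFFLINECARD): "
    r"Card \((?P<card>\w+)\) (?:online|offline) in slot (?P<slot>\S+)"
)


def online_durations(log_text, card="rp", slot="R0", year=2017):
    """Return a timedelta for every online -> offline period of the given card."""
    durations = []
    online_since = None
    for line in log_text.splitlines():
        match = OIR_RE.match(line.strip())
        if not match or match.group("card") != card or match.group("slot") != slot:
            continue
        # Normalise spacing ("Dec  7" vs "Dec 7") and add the assumed year.
        stamp_text = "%d %s" % (year, " ".join(match.group("stamp").split()))
        stamp = datetime.strptime(stamp_text, "%Y %b %d %H:%M:%S.%f")
        if match.group("event") == "ONLINECARD":
            online_since = stamp
        elif online_since is not None:
            durations.append(stamp - online_since)
            online_since = None
    return durations
```

With the logs above, the R0 card stayed online for roughly 51 seconds after the 08:38:58 boot and a little under 7 minutes after the 09:00:47 boot before going offline again.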
Router#sh platform
Chassis type: ASR-903
Slot Type State Insert time (ago)
--------- ------------------- --------------------- -----------------
R0 unknown 00:06:47
R1 A903-RSP1B-55 ok, active 01:05:21
F0 unknown 00:06:47
F1 ok, active 01:05:21
P0 A900-PWR550-D ok 01:04:36
P1 A900-PWR550-D ps, fail 01:04:33
P2 Unknown N/A never
Slot CPLD Version Firmware Version
--------- ------------------- ---------------------------------------
R0 N/A N/A
R1 11102133 15.3(2r)S
F0 N/A N/A
F1 11102133 15.3(2r)S
We then booted the RSP from USB with the image copied from the active RSP. This time it did not
come up, and we observed a multi-bit error on ECC memory.
rommon 1 > dev
Devices in device table:
id name
bootflash: Internal disk
usb0: USB disk
rommon 2 > dir usb0:
Checking USB devices..
USB EHCI 1.00
scanning USB bus for devices..
2 USB Device(s) found
scanning bus for storage devices...
USB Mass Storage device detected
1 Storage Device(s) found
File System: FAT32
3 293413492 -rw- asr903rsp1-universalk9_npe.03.16.03a.S.155-3.S3a-ext.bin
rommon 3 > boot usb0:asr903rsp1-universalk9_npe.03.16.03a.S.155-3.S3a-ext.bin
Located asr903rsp1-universalk9_npe.03.16.03a.S.155-3.S3a-ext.bin, start cluster is 3
#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#
.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.
#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#
.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.
#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#.#
.#.#.#.#.#.#.#.#.#.#.#.#.#.#.
Image loaded
Boot image size = 293413492 (0x117d2274) bytes
Package header rev 0 structure detected
Calculating SHA-1 hash...done
validate_package: SHA-1 hash:
calculated 3410323c:1c0edcc2:f54e8142:4fd1b6f7:05e60036
expected 3410323c:1c0edcc2:f54e8142:4fd1b6f7:05e60036
Image validated
Passing control to the main image..
%IOSXEBOOT-4-FILESYS_ERRORS_CORRECTED: (rp/0): bootflash 1 contained errors which were auto-
corrected.
%IOSXEBOOT-4-FILESYS_ERRORS_CORRECTED: (rp/0): bootflash 5 contained errors which were auto-
corrected.
%IOSXEBOOT-4-FILESYS_ERRORS_CORRECTED: (rp/0): bootflash 6 contained errors which were auto-
corrected.
%IOSXEBOOT-4-FILESYS_ERRORS_CORRECTED: (rp/0): bootflash 7 contained errors which were auto-
corrected.
%IOSXEBOOT-4-FILESYS_ERRORS_CORRECTED: (rp/0): bootflash 8 contained errors which were auto-
corrected.
Restricted Rights Legend
Use, duplication, or disclosure by the Government is
subject to restrictions as set forth in subparagraph
(c) of the Commercial Computer Software - Restricted
Rights clause at FAR sec. 52.227-19 and subparagraph
(c) (1) (ii) of the Rights in Technical Data and Computer
Software clause at DFARS sec. 252.227-7013.
cisco Systems, Inc.
170 West Tasman Drive
San Jose, California 95134-1706
Cisco IOS Software, ASR903 Software (PPC_LINUX_IOSD-UNIVERSALK9_NPE-M), Version 15.5(3)S3a,
RELEASE SOFTWARE (fc1)
Technical Support: http://www.cisco.com/techsupport
Copyright (c) 1986-2016 by Cisco Systems, Inc.
Compiled Thu 16-Jun-16 02:53 by mcpre
Cisco IOS-XE software, Copyright (c) 2005-2016 by cisco Systems, Inc.
All rights reserved. Certain components of Cisco IOS-XE software are
licensed under the GNU General Public License ("GPL") Version 2.0. The
software code licensed under GPL Version 2.0 is free software that comes
with ABSOLUTELY NO WARRANTY. You can redistribute and/or modify such
GPL code under the terms of GPL Version 2.0. For more details, see the
documentation or "License Notice" file accompanying the IOS-XE software,
or the applicable URL provided on the flyer accompanying the IOS-XE
software.
Kernel panic - not syncing: Multi Bit error on ECC memory: address: 0x95C096D0, proc: chasync.sh
Router#
Router#sh plat
Router#sh platform
Chassis type: ASR-903
Slot Type State Insert time (ago)
--------- ------------------- --------------------- -----------------
R0 A903-RSP1B-55 unknown 00:08:35
R1 A903-RSP1B-55 ok, active 01:40:37
F0 init, standby 00:08:35
F1 ok, active 01:40:37
P0 A900-PWR550-D ok 01:39:51
P1 A900-PWR550-D ps, fail 01:39:49
P2 Unknown N/A never
Slot CPLD Version Firmware Version
--------- ------------------- ---------------------------------------
R0 11102133 15.3(2r)S
R1 11102133 15.3(2r)S
F0 11102133 15.3(2r)S
F1 11102133 15.3(2r)S