kernel:[ 200.038456] NMI watchdog: BUG: soft lockup - CPU#1 stuck for 23s! [swapper/1:0]
by huyuhui from LinuxQuestions.org on (#582X0)
I am facing an "NMI watchdog: BUG: soft lockup" issue. The system hangs and cannot be reached from any terminal or by ping.
The issue happens in a virtual machine.
Host CPU information is below
Code:
# lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 160
On-line CPU(s) list: 0-159
Thread(s) per core: 2
Core(s) per socket: 10
Socket(s): 8
NUMA node(s): 8
Vendor ID: GenuineIntel
CPU family: 6
Model: 47
Stepping: 2
CPU MHz: 1064.000
BogoMIPS: 4800.28
Virtualization: VT-x
L1d cache: 32K
L1i cache: 32K
L2 cache: 256K
L3 cache: 30720K
NUMA node0 CPU(s): 1-10,41-50
NUMA node1 CPU(s): 11-20,51-60
NUMA node2 CPU(s): 21-30,61-70
NUMA node3 CPU(s): 31-40,71-80
NUMA node4 CPU(s): 0,81-89,120-129
NUMA node5 CPU(s): 90-99,130-139
NUMA node6 CPU(s): 100-109,140-149
NUMA node7 CPU(s): 110-119,150-159
Host virtual machine information is below
Code:
# virt-manager --version
0.9.4
Host OS information is below
Code:
# uname -r
3.0.101-0.47.79-default
# cat /etc/SuSE-release
SUSE Linux Enterprise Server 11 (x86_64)
VERSION = 11
PATCHLEVEL = 3
Guest CPU information is below. In virt-manager, I selected "Copy host CPU configuration".
Code:
# lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
Address sizes: 42 bits physical, 48 bits virtual
CPU(s): 4
On-line CPU(s) list: 0-3
Thread(s) per core: 1
Core(s) per socket: 1
Socket(s): 4
NUMA node(s): 1
Vendor ID: GenuineIntel
CPU family: 6
Model: 2
Model name: QEMU Virtual CPU version 1.4.2
Stepping: 3
CPU MHz: 2400.084
BogoMIPS: 4800.16
Hypervisor vendor: KVM
Virtualization type: full
L1d cache: 32K
L1i cache: 32K
L2 cache: 4096K
NUMA node0 CPU(s): 0-3
Flags: fpu de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pse36 clflush mmx fxsr sse sse2 syscall nx lm rep_good nopl cpuid tsc_known_freq pni cx16 popcnt hypervisor lahf_lm pti
Guest OS information
Code:
# uname -r
4.12.14-197.37-default
# lsb_release -a
LSB Version: n/a
Distributor ID: SUSE
Description: SUSE Linux Enterprise Server 15 SP1
Release: 15.1
Codename: n/a
I checked the following information in the guest system.
Code:
# cat /proc/sys/kernel/tainted
0
# cat /proc/sys/kernel/watchdog
1
# cat /proc/sys/kernel/watchdog_thresh
10
# cat /proc/sys/kernel/nmi_watchdog
0
# cat /proc/sys/kernel/soft_watchdog
1
# cat /proc/sys/kernel/softlockup_panic
0
# cat /proc/sys/kernel/unknown_nmi_panic
0
Then I made the following changes in the guest system.
Code:
# echo 0 > /proc/sys/kernel/watchdog
# echo 0 > /proc/sys/kernel/soft_watchdog
# echo 20 > /proc/sys/kernel/watchdog_thresh
The issue is still there, and the challenge is that the host OS cannot be upgraded to a higher SLES version. Could you please advise whether there are any solutions?
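For reference, the kernel documentation gives the soft lockup threshold as 2 * watchdog_thresh seconds, so the default watchdog_thresh of 10 means a 20 second threshold, which is consistent with the "stuck for 23s" in the message. Values written under /proc/sys are also lost on reboot unless they are made persistent. The following is only a minimal sketch for confirming which values are actually in effect in the guest; the function name show_watchdog and its optional directory argument are made up for illustration and are not part of any tool mentioned above.

```shell
#!/bin/sh
# Print the watchdog-related sysctls and the derived soft lockup threshold.
# The optional first argument (hypothetical, for illustration) overrides the
# sysctl directory so the function can also be run against a copy of the files.
show_watchdog() {
    dir="${1:-/proc/sys/kernel}"
    for name in watchdog soft_watchdog nmi_watchdog watchdog_thresh softlockup_panic; do
        if [ -r "$dir/$name" ]; then
            printf '%s = %s\n' "$name" "$(cat "$dir/$name")"
        else
            printf '%s = (not available)\n' "$name"
        fi
    done
    if [ -r "$dir/watchdog_thresh" ]; then
        thresh=$(cat "$dir/watchdog_thresh")
        # Per the kernel sysctl documentation: threshold is 2 * watchdog_thresh.
        printf 'soft lockup threshold = %s s\n' "$((thresh * 2))"
    fi
}

show_watchdog "$@"
```

Running it right after the echo commands, and again after a reboot, would show whether the changed values are still active when the lockup occurs.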

