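A 苏州天剑 (Suzhou Tianjian) service engineer recently resolved a VMware ESXi virtualization failure caused by a bug in HBA card firmware. Based on that case, we recommend that before putting VMware ESXi into production you complete a hardware compatibility check and update firmware and drivers to the versions required by the compatibility list. Doing so improves the reliability of your workloads and avoids foreseeable business risk.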

1. Querying the VMware compatibility list

https://www.vmware.com/resources/compatibility/search.php

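In most cases you can locate your hardware by searching for the model name or by filtering manually, then look up its compatibility requirements. For a more precise match, search by the device's VID (vendor ID) and DID (device ID).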

2. Checking the current firmware and driver versions

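Taking a Dell server as the example: after updating the BIOS and related firmware, check whether the firmware and driver versions of I/O devices such as HBA cards, network adapters, and RAID controllers meet the compatibility requirements. Enable the SSH service on the ESXi host before you start.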

2.1 HBA cards

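esxcfg-scsidevs -a # list all storage adapters (vmhba devices)
esxcli storage san fc list # list Fibre Channel adapter details
vmkchdev -l | grep vmhbaX # show the VID, DID, and related IDs of the HBA (replace X with the adapter number)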
[root@localhost:/tmp] esxcfg-scsidevs -a
vmhba0 lsi_mr3 link-n/a sas.5f4ee0806a463900 (0000:65:00.0) Broadcom / LSI PERC H755 Front
vmhba1 vmw_ahci link-n/a sata.vmhba1 (0000:00:11.5) Intel Corporation Lewisburg SATA AHCI Controller
vmhba2 vmw_ahci link-n/a sata.vmhba2 (0000:00:17.0) Intel Corporation Lewisburg SATA AHCI Controller
vmhba3 lpfc link-up fc.200070b7e401f9fe:100070b7e401f9fe (0000:4b:00.0) Emulex Corporation Emulex LightPulse LPe31000/LPe32000 PCIe Fibre Channel Adapter
vmhba4 lpfc link-up fc.200070b7e401f9f8:100070b7e401f9f8 (0000:98:00.0) Emulex Corporation Emulex LightPulse LPe31000/LPe32000 PCIe Fibre Channel Adapter
vmhba64 lpfc link-up fc.200070b7e401f9fe:100070b7e401f9fe (0000:4b:00.0) Emulex Corporation Emulex LightPulse LPe31000/LPe32000 PCIe Fibre Channel Adapter
vmhba65 lpfc link-up fc.200070b7e401f9f8:100070b7e401f9f8 (0000:98:00.0) Emulex Corporation Emulex LightPulse LPe31000/LPe32000 PCIe Fibre Channel Adapter

[root@localhost:/tmp] esxcli software vib list | grep lpfc
lpfc 14.0.622.0-1OEM.700.1.0.15843807 EMU VMwareCertified 2024-01-31

[root@localhost:/tmp] vmkload_mod -s lpfc | grep Version
Version: 14.0.622.0-1OEM.700.1.0.15843807

[root@localhost:/tmp] esxcli storage san fc list
   Adapter: vmhba3
   Port ID: 0D0500
   Node Name: 20:00:70:b7:e4:01:f9:fe
   Port Name: 10:00:70:b7:e4:01:f9:fe
   Speed: 16 Gbps
   Port Type: NPort
   Port State: ONLINE
   Model Description: Emulex LightPulse LPe31000-M6-D 1-Port 16Gb Fibre Channel Adapter
   Hardware Version: 0000000c
   OptionROM Version: 14.2.566.14
   Firmware Version: 14.2.566.14
   Driver Name: lpfc
   DriverVersion: 14.0.622.0

   Adapter: vmhba4
   Port ID: 170500
   Node Name: 20:00:70:b7:e4:01:f9:f8
   Port Name: 10:00:70:b7:e4:01:f9:f8
   Speed: 16 Gbps
   Port Type: NPort
   Port State: ONLINE
   Model Description: Emulex LightPulse LPe31000-M6-D 1-Port 16Gb Fibre Channel Adapter
   Hardware Version: 0000000c
   OptionROM Version: 14.2.566.14
   Firmware Version: 14.2.566.14
   Driver Name: lpfc
   DriverVersion: 14.0.622.0

   Adapter: vmhba64
   Port ID: 0D0500
   Node Name: 20:00:70:b7:e4:01:f9:fe
   Port Name: 10:00:70:b7:e4:01:f9:fe
   Speed: 16 Gbps
   Port Type: NPort
   Port State: ONLINE
   Model Description: Emulex LightPulse LPe31000-M6-D 1-Port 16Gb Fibre Channel Adapter
   Hardware Version: 0000000c
   OptionROM Version: 14.2.566.14
   Firmware Version: 14.2.566.14
   Driver Name: lpfc
   DriverVersion: 14.0.622.0

   Adapter: vmhba65
   Port ID: 170500
   Node Name: 20:00:70:b7:e4:01:f9:f8
   Port Name: 10:00:70:b7:e4:01:f9:f8
   Speed: 16 Gbps
   Port Type: NPort
   Port State: ONLINE
   Model Description: Emulex LightPulse LPe31000-M6-D 1-Port 16Gb Fibre Channel Adapter
   Hardware Version: 0000000c
   OptionROM Version: 14.2.566.14
   Firmware Version: 14.2.566.14
   Driver Name: lpfc
   DriverVersion: 14.0.622.0

From the output above, the HBA model is Emulex LightPulse LPe31000-M6-D 1-Port 16Gb Fibre Channel Adapter, the firmware version is 14.2.566.14, and the driver version is 14.0.622.0.
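The VID and DID used for the precise compatibility search can be pulled out of the vmkchdev -l output with a small filter. The following is a minimal sketch, not part of the original procedure: the function name pci_ids is hypothetical, the assumed line layout is "PCI address, VID:DID, SVID:SSID, owner, alias", and the sample line is illustrative rather than captured from the host above.

```shell
#!/bin/sh
# Print the PCI IDs of one device alias, reading `vmkchdev -l` output on stdin.
# Assumed line layout: <PCI address> <VID:DID> <SVID:SSID> <owner> <alias>
pci_ids() {
    awk -v dev="$1" '$NF == dev {
        split($2, id, ":")       # vendor ID : device ID
        split($3, subid, ":")    # subsystem vendor ID : subsystem device ID
        printf "%s VID=%s DID=%s SVID=%s SSID=%s\n", dev, id[1], id[2], subid[1], subid[2]
    }'
}

# On the host: vmkchdev -l | pci_ids vmhba3
# Off-host demonstration with an illustrative line:
echo "0000:4b:00.0 10df:e300 10df:e310 vmkernel vmhba3" | pci_ids vmhba3
```

Matching the last field exactly avoids the false hits a plain grep can produce, for example vmhba3 also matching vmhba30.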


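According to the compatibility list, the driver that matches the current firmware version 14.2.566.14 is lpfc version 14.2.567.0. The installed driver (14.0.622.0) is older than that, so the HBA driver must be upgraded. The same reasoning applies to the devices in the following sections.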
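The "installed version is lower than the required version" decision can be scripted when many hosts need checking. The sketch below is one possible way to do it in plain POSIX shell; the function name version_lt is hypothetical, and it assumes every dotted field is numeric once the "-1OEM..." build suffix is stripped.

```shell
#!/bin/sh
# Return 0 (true) when dotted version $1 is strictly lower than $2.
# Anything after the first "-" (e.g. "-1OEM.700.1.0.15843807") is ignored.
version_lt() {
    a=${1%%-*}
    b=${2%%-*}
    while [ -n "$a" ] || [ -n "$b" ]; do
        x=${a%%.*}; y=${b%%.*}                       # leading field of each version
        [ "${x:-0}" -lt "${y:-0}" ] && return 0
        [ "${x:-0}" -gt "${y:-0}" ] && return 1
        case $a in *.*) a=${a#*.} ;; *) a= ;; esac   # drop the field just compared
        case $b in *.*) b=${b#*.} ;; *) b= ;; esac
    done
    return 1                                         # versions are equal
}

# Installed lpfc driver versus the version required by the compatibility list.
if version_lt 14.0.622.0 14.2.567.0; then
    echo "driver upgrade needed"
fi
```

Comparing field by field as integers matters here: a plain string comparison would rank 2.5.11.0 below 2.5.2.0.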

2.2 Network adapters

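esxcli network nic list # list all network adapters
esxcli network nic get -n vmnicX # show details for the specified adapter (replace X with the adapter number)
vmkchdev -l | grep vmnicX # show the VID, DID, and related IDs of the adapter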
[root@localhost:/tmp] esxcli network nic list
Name PCI Device Driver Admin Status Link Status Speed Duplex MAC Address MTU Description
------ ------------ ------ ------------ ----------- ----- ------ ----------------- ---- -----------
vmnic0 0000:04:00.0 ntg3 Up Down 0 Half ec:2a:72:f9:c3:3c 1500 Broadcom Corporation NetXtreme BCM5720 Gigabit Ethernet
vmnic1 0000:04:00.1 ntg3 Up Down 0 Half ec:2a:72:f9:c3:3d 1500 Broadcom Corporation NetXtreme BCM5720 Gigabit Ethernet
vmnic2 0000:31:00.0 i40en Up Up 10000 Full 6c:fe:54:8b:22:8c 1500 Intel(R) Ethernet Controller X710 for 10GbE SFP+
vmnic3 0000:31:00.1 i40en Up Up 10000 Full 6c:fe:54:8b:22:8d 1500 Intel(R) Ethernet Controller X710 for 10GbE SFP+
vmnic4 0000:b2:00.0 i40en Up Up 10000 Full 6c:fe:54:82:4d:50 1500 Intel(R) Ethernet Controller X710 for 10GbE SFP+
vmnic5 0000:b2:00.1 i40en Up Up 10000 Full 6c:fe:54:82:4d:51 1500 Intel(R) Ethernet Controller X710 for 10GbE SFP+
vmnic6 0000:b1:00.0 i40en Up Up 10000 Full 6c:fe:54:82:71:90 1500 Intel(R) Ethernet Controller X710 for 10GbE SFP+
vmnic7 0000:b1:00.1 i40en Up Down 0 Half 6c:fe:54:82:71:91 1500 Intel(R) Ethernet Controller X710 for 10GbE SFP+
[root@localhost:/tmp] esxcli network nic get -n vmnic2
Advertised Auto Negotiation: true
Advertised Link Modes: Auto, 10000BaseSR/Full
Auto Negotiation: true
Cable Type: FIBRE
Current Message Level: 0
Driver Info:
Bus Info: 0000:31:00:0
Driver: i40en
Firmware Version: 9.40 0x8000e9bd 22.5.7
Version: 2.3.4.0
Link Detected: true
Link Status: Up
Name: vmnic2
PHYAddress: 0
Pause Autonegotiate: false
Pause RX: false
Pause TX: false
Supported Ports: FIBRE
Supports Auto Negotiation: true
Supports Pause: true
Supports Wakeon: true
Transceiver:
Virtual Address: 00:50:56:5a:59:3e
Wakeon: MagicPacket(tm)
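Only three fields of the esxcli network nic get output matter for the compatibility check: the driver name, the driver version, and the firmware version. As a rough sketch, they can be condensed into one line per adapter; the function name nic_versions is hypothetical, and the field labels are assumed to match the output shown above.

```shell
#!/bin/sh
# Read `esxcli network nic get -n vmnicX` output on stdin and print a one-line summary.
nic_versions() {
    awk -F': *' '
        {
            key = $1
            sub(/^[ \t]+/, "", key)      # the host indents the fields under "Driver Info:"
        }
        key == "Driver"           { drv = $2 }
        key == "Version"          { ver = $2 }   # driver version
        key == "Firmware Version" { fw  = $2 }
        END { printf "driver=%s version=%s firmware=%s\n", drv, ver, fw }
    '
}

# On the host: esxcli network nic get -n vmnic2 | nic_versions
# Off-host demonstration using lines from the output above:
printf '%s\n' 'Driver Info:' '   Driver: i40en' \
    '   Firmware Version: 9.40 0x8000e9bd 22.5.7' '   Version: 2.3.4.0' | nic_versions
```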

2.3 RAID controllers

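esxcfg-scsidevs -a # list all storage adapters (vmhba devices)
esxcli storage san sas list # show SAS/RAID controller details
vmkchdev -l | grep vmhbaX # show the VID, DID, and related IDs of the RAID controller (replace X with the adapter number)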
[root@localhost:/tmp] esxcli storage san sas list
Device Name: vmhba0
SAS Address: 5f:4e:e0:80:6a:46:39:00
Physical ID: 0
Minimum Link Rate: 0 Mbps
Maximum Link Rate: 0 Mbps
Negotiated Link Rate: 0 Mbps
Model Description: PERC H755 Front
Hardware Version: A
OptionROM Version: 7.26.00.0_0x071A0000
Firmware Version: 52.26.0-5179
Driver Name: lsi_mr3
Driver Version: 7.722.02.00

3. Performing the upgrade

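Still using a Dell server as the example: hardware firmware can be updated through iDRAC. Download the drivers from the official VMware site and upload them to the /tmp directory on the host. Put the ESXi host into maintenance mode before upgrading.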

esxcli system maintenanceMode set --enable yes # enter maintenance mode
[root@localhost:/tmp] esxcli software component apply -d /tmp/Broadcom-lsi-mr3_7.726.02.00-1OEM.700.1.0.15843807_22115906.zip
Installation Result
Components Installed: Broadcom-lsi-mr3_7.726.02.00-1OEM.700.1.0.15843807
Components Removed: Broadcom-lsi-mr3_7.722.02.00-1OEM.700.1.0.15843807
Components Skipped:
Message: The update completed successfully, but the system needs to be rebooted for the changes to be effective.
Reboot Required: true

[root@localhost:/tmp] esxcli software component apply -d /tmp/Broadcom-ELX-lpfc_14.2.567.0-1OEM.700.1.0.15843807_21768986.zip
Installation Result
Components Installed: Broadcom-ELX-lpfc_14.2.567.0-1OEM.700.1.0.15843807
Components Removed: Broadcom-ELX-lpfc_14.0.622.0-1OEM.700.1.0.15843807
Components Skipped:
Message: The update completed successfully, but the system needs to be rebooted for the changes to be effective.
Reboot Required: true

[root@localhost:/tmp] esxcli software component apply -d /tmp/Intel-i40en_2.5.11.0-1OEM.700.1.0.15843807_22757618.zip
Installation Result
Components Installed: Intel-i40en_2.5.11.0-1OEM.700.1.0.15843807
Components Removed: Intel-i40en_2.3.4.0-1OEM.700.1.0.15843807
Components Skipped:
Message: The update completed successfully, but the system needs to be rebooted for the changes to be effective.
Reboot Required: true

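After all required drivers have been updated, restart the physical ESXi host with the reboot command. Once it is back up, repeat the checks above to confirm that each component is at its target version.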


When all upgrade steps are complete, exit maintenance mode.

esxcli system maintenanceMode set --enable no # exit maintenance mode

You can also check the current maintenance mode state of the ESXi host with the following command.

esxcli system maintenanceMode get 
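In a scripted upgrade, the same state check can act as a guard so that drivers are only installed while the host is in maintenance mode. This is a minimal sketch under the assumption that the command prints either Enabled or Disabled; the function name in_maintenance_mode is hypothetical, and the state is hard-coded so the example runs off-host.

```shell
#!/bin/sh
# Succeed only when the given state string is "Enabled".
# Assumption: `esxcli system maintenanceMode get` prints Enabled or Disabled.
in_maintenance_mode() {
    [ "$1" = "Enabled" ]
}

# On the host: state=$(esxcli system maintenanceMode get)
state="Disabled"    # placeholder value for an off-host run
if in_maintenance_mode "$state"; then
    echo "OK to install drivers"
else
    echo "Host is not in maintenance mode; aborting"
fi
```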

4. References

https://www.dell.com/support/kbdoc/zh-cn/000194101/how-to-install-vmware-vsphere-esxi7-0-drivers

https://blog.csdn.net/jkxiaoshao/article/details/120609303

https://blog.csdn.net/fq3758/article/details/107791920

https://blog.csdn.net/fq3758/article/details/107616042

5. Reference case: a failure caused by a firmware version

https://blog.51cto.com/u_15127623/3992566
