Wednesday, 30 December 2020

VMware ESX host lost a path to the storage device

Recently, a customer issue was brought to my notice: one ESX host had lost one of its paths to the storage device, while the same paths were working fine on the other ESX hosts in the cluster.


The ESX host had two standard vSwitches running vmkernel ports on dedicated physical NICs.

Here is what was done to troubleshoot and fix the issue:

1) Identified the two vmkernel IPs assigned to vmk1 (vSwitch1) and vmk2 (vSwitch2) on the faulty host. Identified the storage IP address. Finally, identified the two vmkernel IPs that were assigned to a working ESX host. Here is a pictorial representation of what was found.


2) Ran a ping from the working host to the faulty host's vmkernel IPs, 192.168.10.19 and 192.168.10.20. Both IPs were reachable on the network. The pings to the storage IPs also responded fine. It looked like the physical connectivity was working OK.


3) Ran a ping from the faulty ESX host to the storage IP through each of the two vmkernel interfaces.

vmkping -I vmk2 192.168.10.40 --> ping failed

vmkping -I vmk1 192.168.10.40 --> ping successful

This led us to the conclusion that vmk2 could not reach the storage IP.
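The per-interface check in step 3 can be scripted as a small loop. This is only a minimal sketch, not the exact procedure used here: the function name check_storage_paths and the PING_CMD override are hypothetical names made up for the example, and the ping count of 3 is an arbitrary choice.

```shell
# Sketch: ping a storage target through each vmkernel interface in turn.
# check_storage_paths and PING_CMD are hypothetical names for this example.
# PING_CMD defaults to vmkping, which is only available on an ESXi host.
check_storage_paths() {
    target="$1"
    shift
    failed=0
    for vmk in "$@"; do
        # -I selects the outgoing vmkernel interface, -c limits the ping count
        if ${PING_CMD:-vmkping} -I "$vmk" -c 3 "$target" >/dev/null 2>&1; then
            echo "$vmk -> $target: reachable"
        else
            echo "$vmk -> $target: FAILED"
            failed=1
        fi
    done
    # Non-zero exit status if any interface could not reach the target
    return "$failed"
}
```

On the faulty host this would be called as check_storage_paths 192.168.10.40 vmk1 vmk2, and a non-zero exit status means at least one interface could not reach the storage IP.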


4) Ran the following command on the faulty ESX host to identify the state of the IP connections. The output showed that the vmk2 connections (source IP 192.168.10.20) to the storage target were stuck in the SYN_SENT state, meaning the TCP handshake never completed.

esxcli network ip connection list 

tcp 0 0 192.168.10.20:45796 192.168.10.40:3260 SYN_SENT 66477 newreno vmkiscsid
tcp 0 0 192.168.10.20:45795 192.168.10.40:3260 SYN_SENT 66477 newreno vmkiscsid
tcp 0 0 192.168.10.20:45794 192.168.10.40:3260 SYN_SENT 66477 newreno vmkiscsid
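On a busy host the connection list can be long, so it may help to filter it down to the iSCSI connections that are stuck. The following is a rough sketch under a few assumptions: stuck_iscsi_connections is a hypothetical helper name, the columns are assumed to be in the order shown in the output above, and the target is assumed to listen on the standard iSCSI port 3260.

```shell
# Sketch: list iSCSI connections that never completed the TCP handshake.
# stuck_iscsi_connections is a hypothetical helper. It reads the output of
# "esxcli network ip connection list" on stdin and assumes the column order
# shown above: field 4 = local address, field 5 = foreign address,
# field 6 = connection state.
stuck_iscsi_connections() {
    awk '$1 == "tcp" && $6 == "SYN_SENT" && $5 ~ /:3260$/ { print $4, "->", $5 }'
}
```

It would be used as esxcli network ip connection list | stuck_iscsi_connections. Any line it prints is a session that sent a SYN to the storage target and got no reply.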


5) To fix the issue, changed the vmk2 IP address from 192.168.10.20 to 192.168.10.21. This fixed the issue for me. Later, I was able to change vmk2 back to 192.168.10.20 without any further issue.
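The post does not record which tool was used to change the address. If it is done from the ESXi shell, one possible approach is esxcli network ip interface ipv4 set. The sketch below only prints the two commands for review rather than running them: print_vmk_readdress is a hypothetical helper name, and the netmask is an assumption that must be replaced with the real subnet mask.

```shell
# Sketch: print (not run) the esxcli commands that move a vmkernel
# interface to a temporary address and then back to the original one.
# print_vmk_readdress is a hypothetical helper. The netmask argument is an
# assumption, because the subnet mask is not stated in this post.
print_vmk_readdress() {
    vmk="$1"
    original_ip="$2"
    temporary_ip="$3"
    netmask="$4"
    for ip in "$temporary_ip" "$original_ip"; do
        echo "esxcli network ip interface ipv4 set --interface-name=$vmk --ipv4=$ip --netmask=$netmask --type=static"
    done
}
```

For example, print_vmk_readdress vmk2 192.168.10.20 192.168.10.21 255.255.255.0 prints the command for the temporary address first and the command to restore the original address second. It is worth checking that the path has recovered before running the second one.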


It looks like the network IP connection was in a weird state, which might have been caused at the time of connection negotiation.

