Monday, January 24, 2011

Due to a possible deadlock on the rpmdb, upgrading ESX 4.0 to 4.0 Update 1 can fail or time out and leave the host in an unusable state


Symptoms

When attempting to upgrade ESX 4.0 to ESX 4.0 Update 1 (U1), you may experience these symptoms:
  • The upgrade operation may fail or hang, resulting in an incomplete installation
  • After a reboot, the host that was being upgraded may be left in an inconsistent state and may display a purple diagnostic screen with the following error:

    COS Panic: Int3 @ mp_register_ioapic

Purpose

ESX 4.0 U1 includes an upgrade to glibc version 5.3, which implements a change in the locking mechanism compared to glibc version 5.2 already installed with ESX 4.0. If an rpm command is run during the installation of ESX 4.0 U1, the rpmdb may deadlock. For more information, see Red Hat PR 463921.
 
As a result, upgrading ESX 4.0 to 4.0 U1 can fail or time out and leave the host in an unusable state. 
 
While this issue is not specific to any hardware vendor, it has been reported on HP ProLiant systems when Insight Management Agents are already installed and running on the host being upgraded. Investigation revealed that Insight Management Agents run rpm commands on a regular basis, which triggers the deadlock during the U1 installation. The issue can also occur on systems from other vendors if a process or application runs rpm, or if you manually run an rpm command, such as rpm -qa, while the Update 1 installation is in progress.

Note: The VMware esxupdate tool can be used standalone and is also used by VMware Update Manager and VMware Host Update Utility.

Resolution

Who is affected

  1. Customers using VMware vSphere 4 who are upgrading to ESX 4.0 U1 on HP ProLiant systems with a supported version of HP Insight Management Agents running.
  2. Customers running rpm commands on systems from any vendor while upgrading to ESX 4.0 U1.
This affects all of the following upgrade scenarios:
  • Upgrade using Update Manager
  • Upgrade using esxupdate
  • Upgrade using vSphere Host Update Utility
Note: ESXi is not affected.

Solution

ESX 4.0 Update 1 has been re-released with changes to avoid this issue. The installation process checks for running agents and stops them before proceeding.
 
The re-released ESX 4.0 Update 1 is referred to as ESX 4.0 Update 1a and is available via vSphere Update Manager (VUM) and the VMware Downloads site.
 
Note: The changes in ESX 4.0 Update 1a do not address the issue with the glibc locking mechanism. It is critical that you do not run rpm commands on any host while the ESX 4.0 Update 1a installation is in progress.
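
As an illustration of the note above, the following is a minimal sketch of a pre-upgrade check that looks for rpm processes currently running on a host. It is not part of any VMware tool. The function name find_rpm_processes and its proc_root parameter are hypothetical, and the sketch assumes a Linux-style /proc filesystem such as the one in the ESX service console. It is written for a current Python interpreter, so it may need adjusting for the older Python shipped with ESX 4.0.

```python
import os


def find_rpm_processes(proc_root="/proc"):
    """Return sorted (pid, command line) pairs for processes whose executable is named rpm.

    proc_root is a hypothetical parameter that exists only so the check
    can be pointed at a test directory instead of the real /proc.
    """
    matches = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # only numeric directories are processes
        cmdline_path = os.path.join(proc_root, entry, "cmdline")
        try:
            with open(cmdline_path, "rb") as handle:
                raw = handle.read()
        except (IOError, OSError):
            continue  # the process exited or is not readable
        # /proc/<pid>/cmdline separates arguments with NUL bytes
        args = [a.decode("utf-8", "replace") for a in raw.split(b"\0") if a]
        if args and os.path.basename(args[0]) == "rpm":
            matches.append((int(entry), " ".join(args)))
    return sorted(matches)


if __name__ == "__main__":
    running = find_rpm_processes()
    for pid, cmd in running:
        print("rpm process running: pid %d: %s" % (pid, cmd))
    if not running:
        print("no rpm processes found")
```

A check like this only sees processes that are running at the moment it executes. An agent that starts rpm periodically can pass the check and still run rpm later during the installation, so the check does not replace stopping such agents beforehand.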
 
If you meet one or both of the conditions under Who is affected and you have already run the original ESX 4.0 Update 1 installation but have not rebooted the host, do not reboot the ESX host. Contact VMware Technical Support for assistance. For more information, see How to Submit a Support Request.
 
WARNING: If you reboot the host, it may not be recoverable and may need to be reinstalled.
 
WARNING: If you have virtual machines running on local storage, they may not be retained if you reinstall ESX 4.0 as a result of this issue. Contact VMware Support for assistance before reinstalling.
