Patch Name: PHKL_23611

Patch Description: s700 10.20 LVM cumulative patch, array controller hot-swap

Creation Date: 01/03/16
Post Date:     01/03/28

Warning: 01/10/06 - This Critical Warning has been issued by HP.

   - PHKL_23480 introduced behavior whereby LVM commands may hang. In
     most cases a reboot will clear the hang. However, in some cases,
     most notably with volume group activation, the hang will recur
     after the reboot.

   - This behavior is also exhibited by the superseding patch
     PHKL_23611.

   - If you are already successfully using these patches, it should be
     safe to continue to do so (see the item on volume group importing
     below).

   - If these patches have not yet been installed, it is recommended
     to perform the following check before installing them:

     1. Discover the name of any boot disk for the system:

        # lvlnboot -v vg00 | grep -i "Boot Disk"
        /dev/dsk/c0t6d0 (52.6.0) -- Boot Disk

     2. Run the following xd(1) command on any one of the boot disks:

        # xd -An -tx -j0x218000 -N4 /dev/dsk/c0t6d0
           3b39e3a8

     If the number returned by xd(1) (0x3b39e3a8 in the above example)
     is 0x70000000 or greater, do NOT apply the patches.

   - Volume group importing: If you vgimport(1M) volume groups from
     other systems, perform the above check on those systems as well.
     If the other systems fail this check, then either do not apply
     these patches, remove the patches from the importing system, or
     do not vgimport(1M) their volume groups until the patches are
     replaced by a patch that addresses this behavior.

   - If the check described above fails and an affected volume group
     must be imported, HP recommends removing PHKL_23480 and
     PHKL_23611 from all systems that exhibit this problem. The patch
     should also be removed from software depots used to install
     patches on these systems.

   - Please note that the LVM commands patch PHCO_23437 depends on
     PHKL_23611 to address the problem of LVM commands aborting on
     systems where any PV has more than 3 alternate PV links. If you
     choose to remove PHKL_23611, be aware that this problem will not
     be addressed.

   - The previous LVM patch, PHKL_22528, does not exhibit this
     behavior and is being re-released until a replacement patch is
     available. To ensure that as many known issues as possible are
     addressed, HP recommends that PHKL_22528 be installed after
     PHKL_23480 and PHKL_23611 are removed. If PHKL_22528 was
     installed prior to PHKL_23480 or PHKL_23611, it will
     automatically be restored when they are removed and will not
     need to be reinstalled.

   - PHKL_23611 is included in the following Support Plus Patch
     Bundles:
        Sep 2001: XSW700GR1020,B.10.20.54.1
        Sep 2001: XSW700HW1020,B.10.20.54.6

Warning: 01/10/24 - This Critical Warning has been issued by HP.

   - PHKL_23480 introduced behavior whereby the LVM vgchange(1M)
     command will fail to activate a volume group and the following
     message is displayed:

        Quorum not present, or some physical volume(s) are missing

     This will only occur if vgcfgrestore(1M) has been used to restore
     LVM configuration data to all available physical volumes (PVs) in
     the volume group (VG) before the volume group is activated with
     vgchange(1M).

   - This behavior will not occur if vgcfgrestore(1M) was used to
     restore LVM configuration data to only a subset of the physical
     volumes in the volume group before the volume group is activated.

   - This behavior is also exhibited by the superseding patch
     PHKL_23611.

   - To avoid this behavior, HP recommends removing PHKL_23480 and
     PHKL_23611 from all systems on which they are installed.
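     The following is a minimal sketch (not part of HP's official
     warning text) of how an administrator might check for and remove
     the patches with the standard SD-UX commands. The autoreboot
     option is an assumption (the kernel is relinked when these
     filesets are removed); adjust it to local procedures:

        # Check whether either patch is currently installed:
        swlist PHKL_23480 PHKL_23611

        # If installed, remove the superseding patch first, then the
        # superseded one (a reboot is expected to relink the kernel):
        swremove -x autoreboot=true PHKL_23611
        swremove -x autoreboot=true PHKL_23480

     After removal, swlist can be used again to confirm that
     PHKL_22528 (if it was previously installed) has been restored.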
   - The patches should also be removed from software depots used to
     install patches on these systems.

   - Please note that the LVM commands patch PHCO_23437 depends on
     PHKL_23611 to address the problem of LVM commands aborting on
     systems where any PV has more than 3 alternate PV links. If you
     choose to remove PHKL_23611, be aware that this problem will not
     be addressed.

   - The previous LVM patch, PHKL_22528, does not exhibit this
     behavior. To ensure that as many known issues as possible are
     addressed, HP recommends that PHKL_22528 be installed after
     PHKL_23480 and PHKL_23611 are removed. If PHKL_22528 was
     installed prior to PHKL_23480 and PHKL_23611, it will
     automatically be restored when PHKL_23480 and PHKL_23611 are
     removed and will not need to be reinstalled.

   - PHKL_23611 is included in the following Support Plus Patch
     Bundles:
        Sep 2001: XSW700GR1020,B.10.20.54.1
        Sep 2001: XSW700HW1020,B.10.20.54.6

Hardware Platforms - OS Releases:
     s700: 10.20

Products: N/A

Filesets:
     LVM.LVM-KRN
     OS-Core.CORE-KRN

Automatic Reboot?: Yes

Status: General Superseded With Warnings

Critical: No (superseded patches were critical)
     PHKL_23480: PANIC HANG CORRUPTION
     PHKL_19704: PANIC
     PHKL_20963: HANG
     PHKL_19696: PANIC
     PHKL_19209: OTHER - This patch is essential to improve the
          recovery of FC devices in large configurations.
     PHKL_19166: HANG

Path Name: /hp-ux_patches/s700/10.X/PHKL_23611

Symptoms:
     PHKL_23611:
     (SR: 8606176439 CR: JAGad45677)
     Various LVM commands core dump on systems where any PV has more
     than 3 alternate PV links in addition to the primary link.

     PHKL_23480:
     (SR: 8606142976 CR: JAGad12319)
     LVM commands may hang forever in the kernel lv_sa_config()
     routine (requiring a system reboot to break the hang). Volume
     group activations may fail with indications that the disks do
     not belong to the volume group, or with some other obscure LVM
     error message. Although it has never been reported from the
     field, there is also a remote possibility of data corruption
     caused by LVM using an older copy of the mirror consistency
     record (MCR) or the volume group status area (VGSA). These
     situations are most likely to happen on systems where the system
     time has been advanced far into the future and then changed back
     to the correct time. The hangs can also occur if a volume group
     is exported from one system and imported to another whose system
     time is behind the first. Multiple volume groups on the system
     may be affected simultaneously. Volume groups that are most
     likely to demonstrate the problem have timestamps on the LVM
     on-disk VGSA, VGDA or MWC data structures which exceed the
     current system time (although these timestamps are themselves
     not bad).

     (SR: 8606166039 CR: JAGad35326)
     When media defects are encountered on LVM disks, the system may
     panic while LVM is trying to map the defect to an alternate
     block. The problem is characterized by an I/O path panic
     "wait_for_lock_spinner: Already own this lock!", with a thread
     deadlocked on a spinlock in the kernel lv_defecthash() routine.
     This is similar to a problem discovered on 11.00: JAGad30462.

     (SR: 8606166971 CR: JAGad36258)
     On ServiceGuard OPS clusters, a kernel hang can occur in the
     lv_resyncpv() routine when previously failed devices are
     recovered and are being resynced by LVM. This problem is
     characterized by a situation where a device goes offline and then
     returns, but LVM never recovers the device and never starts using
     it again because LVM has gotten stuck in lv_resyncpv().
     (SR: 8606168136 CR: JAGad37418)
     A kernel Data Page Fault (DPF) panic is possible in the kernel
     lv_resyncpv() routine when a volume group is deactivated while
     disks are offline and LVM is actively trying to recover the
     disks. A similar DPF panic is possible in lv_resyncpv() when
     vgreduce(1M) is used to reduce PVs from a VG while a resync of
     the logical volume(s) containing the PV is in progress.

     (SR: 8606173900 CR: JAGad43153)
     After a system crash, mirrored logical volumes might not be
     resynchronized correctly if the Mirror Write Cache is enabled
     (see lvchange(1M) for details). The result is that the data
     written at the time of a system crash might not be made
     consistent across the mirrors. This is a form of data corruption.

     (SR: 8606161601 CR: JAGad30917)
     Very poor read performance for HFS filesystems on LVM logical
     volumes during an lvmerge(1M) operation. VxFS file systems are
     not affected. Reads may be held off for as long as 30 seconds.

     (SR: 8606124005 CR: JAGac39365)
     During heavy I/O stress the system may panic in the LVM
     lv_unblock() routine due to a spinlock being held too long.

     PHKL_22528:
     (SR: 8606113703 DTS: JAGac07217)
     If a physical volume in an LVM mirrored disk configuration is
     replaced without deactivating the volume group it belongs to, an
     application might see spurious I/O errors, mirror resync
     failures, or data corruption. (The I/O errors and resync failures
     may have been observed prior to replacing the disk.)

     (SR: 8606138886 CR: JAGad08152)
     A shared volume group has one or more LUNs on an active-passive
     disk array (Nike, Galaxy or Optimus), and one of the array's
     Service Processors is hot-swapped without deactivating the volume
     group. Later, a node deactivates the volume group (while it
     remains activated on other nodes), and that node is not able to
     reactivate it.

     (SR: 8606162303 CR: JAGad31619)
     Read performance on HFS filesystems on LVM logical volumes drops
     substantially when lvmerge operations are in progress. VxFS file
     systems are unaffected. Reads and writes from raw logical volumes
     are unaffected. The problem occurs after installing patch
     PHKL_20963.

     PHKL_21369:
     (SR: 8606108373 CR: JAGab78776)
     Reads from a mirrored LV can take a very long time to complete if
     one of the mirrors is unavailable.

     (SR: 8606106798 CR: JAGab76189)
     With PHKL_20963 (recalled) installed, LVM may perform a full
     resync every time a volume group is activated. While the resync
     is in progress, system performance may be degraded. In addition,
     since LVM does not have two valid copies of all data during the
     resync, the system is vulnerable to a disk failure until the
     resync completes.

     PHKL_19704:
     (SR: 1653305987 DTS: JAGab17773)
     If bad block relocation is enabled on a logical volume with
     parallel read support, then any request to a block currently
     being relocated results in a system panic.

     PHKL_20963:
     (SR: 8606128444 CR: JAGac81735)
     It might not be possible to activate a volume group in shared
     mode if any of its physical volumes are on an Optimus Prime Disk
     Array.

     (SR: 8606106637 CR: JAGab75913)
     An LVM deadlock (hang) can occur when LVM commands which operate
     on logical volumes are run at the same time as device query
     operations. The result is that the LVM commands and the query
     operations never complete (and cannot be terminated), and it is
     not possible to run any subsequent LVM commands. Furthermore,
     subsequent device recovery may be delayed indefinitely. The only
     way to restore normal operation is to reboot the system.
     For example, the lvmerge(1M) and lvsplit(1M) commands and
     glance(1), when run together, could cause the commands to
     deadlock, resulting in a situation where they make no forward
     progress and cannot be interrupted or killed. The same result
     could occur when running lvchange(1M) or lvextend(1M) together
     with lvdisplay(1M). The defect was introduced in PHKL_19209,
     which has since been recalled. This new patch supersedes
     PHKL_19696, PHKL_19209, PHKL_20040 and PHKL_20807, and contains
     all the fixes contained in those patches. Customers with any of
     these superseded patches installed should apply this new patch.

     (SR: 8606106012 CR: JAGab74797)
     It is possible for an I/O request to be accepted while a logical
     volume is being closed, causing the operating system to panic.
     Typical actions that close a logical volume are unmounting a
     filesystem and closing a database (or other) application which
     uses raw logical volumes. The panic would likely be a data page
     fault in an LVM ("lv_") routine.

     PHKL_20807:
     (SR: 8606100412 DTS: JAGab31786)
     This is an interim patch to support the Optimus disk array.

     PHKL_20040:
     (SR: 8606100412 DTS: JAGab31786)
     LVM incorrectly treats two volumes within an Optimus disk array
     as alternate paths to a single volume because they have the same
     LUN ID, even though they are distinct volumes and have different
     target IDs.

     PHKL_19696:
     (SR: 8606101971 DTS: JAGab66231)
     If there is a problem with the physical volume to which the
     logical volume is mapped, LVM returns an EIO error for logical
     requests without retrying until the I/O timeout value set on that
     logical volume has elapsed.

     PHKL_19209:
     (SR: 5003437970 DTS: JAGaa40887)
     When multiple physical volumes or paths to physical volumes are
     lost, it can take minutes to recover them. While the PVs for a
     given volume group are being tested, locks are held which delay
     other LVM operations and the opens and closes of logical volumes.
     Prior changes to the device recovery code provided some benefit,
     assuring that device recovery took 1-2 minutes regardless of the
     number of paths or devices to be recovered, but this still was
     not enough. The new device recovery code in this patch reduces
     the recovery time to under 35 seconds, again independent of the
     number of paths or devices offline.

     PHKL_19166:
     (SR: 8606100864 DTS: JAGab39559)
     (SR: 4701424846 DTS: JAGab14452)
     Performance degradation when massively parallel subpage-size
     (<8K) reads are performed (as with Informix).

     (SR: 8606100864 DTS: JAGab39559)
     (SR: 1653289132 DTS: JAGaa67952)
     The system hangs when lvmkd is waiting for a lock obtained
     earlier by an application that performs a vg_create operation.
     The hang does not happen unless there is a powerfailed disk.

     (SR: 8606100864 DTS: JAGab39559)
     (SR: 4701424895 DTS: JAGab14455)
     Optimus Disk Arrays (model number A5277A-HP) are not recognized
     as an ACTIVE/PASSIVE device and consequently are not handled
     properly by the driver.

     PHKL_17546:
     (SR: 1653289553 DTS: JAGaa46305)
     LVM's autoresync after a disk power failure can leave extents
     stale.

Defect Description:
     PHKL_23611:
     (SR: 8606176439 CR: JAGad45677)
     The maximum number of physical volumes, including links, which
     LVM can support was used incorrectly in the code.

     Resolution:
     The problem has been fixed by the LVM command patch PHCO_23437.
     LVM can now support up to 8 links per physical volume (1 primary,
     7 alternates). In this kernel patch, an error message was added
     to reject any LVM command which tries to add an 8th alternate
     link to a physical volume.
     In the future, LVM might be able to support more than 8 links
     when the resources are available.

     PHKL_23480:
     (SR: 8606142976 CR: JAGad12319)
     The cause of the LVM command hang reported in the CR was that the
     LVM in-memory timestamp had overflowed, so that a new timestamp
     had a value older (less) than the prior one applied to the VGSA.
     The potential for data corruption also exists due to the remote
     possibility that the latest LVM on-disk metadata may be stamped
     with an old timestamp, causing LVM not to use the latest metadata
     upon a subsequent activation.

     Resolution:
     The fix in this patch is a redesign of the timestamp algorithm to
     greatly reduce the possibility of a timestamp overflow by
     modifying how timestamps are generated. The new timestamp
     algorithm generates timestamps independently of the system time,
     and generates independent timestamps for each volume group. The
     new algorithm also corrects errors in the code which could cause
     the timestamp to advance faster than it should, and situations
     where the timestamp could be truncated. The patch also includes
     code to correct (roll back) the LVM VG timestamp in a safe way
     should the timestamp on a VG pass a danger threshold approaching
     the overflow point.

     (SR: 8606166039 CR: JAGad35326)
     The panic and potential I/O hang mentioned in the CR were caused
     by a kernel spinlock deadlock in lv_defecthash(): the routine
     incorrectly acquired a spinlock it already held.

     Resolution:
     lv_defecthash() was corrected so that it no longer acquires the
     unnecessary spinlock. This problem was fixed in 11.0; the fix for
     SR: 8606161146 CR: JAGad30462 was migrated to 10.20.

     (SR: 8606166971 CR: JAGad36258)
     The LVM resync hang is most definitively identified by a deadlock
     between a resync process running on the VG server node and the
     processing of a resync request from a client node. The server
     resync process holds the vg_lock while waiting for the
     slvm_resync_lock, and the client resync request process holds the
     slvm_resync_lock while waiting for the vg_lock. The deadlock was
     due to the order in which locks were acquired in lv_resyncpv().
     Another problem within lv_resyncpv() was that a routine called
     during the resync dropped locks that kept the PV from being
     removed or the VG from being deactivated. As a result, the
     physical volume or logical volume data structures could be
     deallocated while they were still being used by lv_resyncpv(),
     resulting in a Data Page Fault.

     Resolution:
     lv_resyncpv() was redesigned for 11.11 to modify how the
     different locks are acquired and to avoid the deadlock and DPF
     problems. The rewritten routine was included in this 10.20 patch.

     (SR: 8606168136 CR: JAGad37418)
     The description of this defect is similar to the one for
     SR: 8606166971 CR: JAGad36258. The kernel LVM lv_resyncpv()
     routine called during device resync dropped locks that kept the
     PV from being removed or the VG from being deactivated. As a
     result, the physical volume or logical volume data structures
     could be deallocated while they were still being used by
     lv_resyncpv(), resulting in a Data Page Fault.

     Resolution:
     The resolution for this defect is similar to the one for
     SR: 8606166971 CR: JAGad36258. lv_resyncpv() was rewritten for
     11.11 to modify how the different locks are acquired and to avoid
     the deadlock and DPF problems. The rewritten routine was included
     in this 10.20 patch.
     (SR: 8606173900 CR: JAGad43153)
     There are two problems in the routine which synchronizes extents
     from the MWC cache entries (lv_recover_ltg()). The first is an
     off-by-one bug which causes the last mirror copy not to be
     considered when looking for a fresh available copy of a given
     extent. To make matters worse, if no fresh data is available to
     read from to perform the MWC resync for a given extent, the
     extent is skipped, leaving the fresh mirror copies marked fresh
     even though their data is possibly inconsistent.

     Resolution:
     The fix for the off-by-one error from PHKL_23127 for 11.00
     (JAGad43153) was included in this patch to fix the first problem.
     Additional code from the 11.00 MR was included in this 10.20
     patch to assure that a failed attempt to resynchronize an extent
     leaves only synchronized copies of the extent marked fresh.

     (SR: 8606161601 CR: JAGad30917)
     The problem was that every HFS read makes a call to the LVM
     lv_readahead_info() routine to determine the optimum readahead
     (presumably to optimize readahead for striped volumes). The LVM
     routine grabs the locks necessary to assure consistency of the
     data structures it accesses, competing for them with the LVM
     merge operation and thereby adversely affecting read performance.

     Resolution:
     This patch eliminates the locking in lv_readahead_info(). This
     works because lv_readahead_info() cannot be called on a closed
     logical volume and the stripe factor cannot be changed once a
     logical volume is created.

     (SR: 8606124005 CR: JAGac39365)
     When an I/O completes, all the requests in the LVM work queue
     containing the request are analyzed to see if any can be
     unblocked, whether or not they could actually be unblocked by the
     completed I/O. The LV spinlock must be held while traversing the
     queue, and it is held for a very long time. The algorithm is time
     consuming because it requires traversing the queue up to each
     request, for each request on the list. The algorithm is wasteful
     because the only requests which can be unblocked by the completed
     request are those directed to the same LTG as the completed
     request, and located after it on the work queue.

     Resolution:
     The fix was migrated from the 11.00 patch PHKL_22233. Instead of
     traversing the list of all prior requests once for each request
     in the list to determine whether any of them can be unblocked,
     the new unblock algorithm scans the list only once and attempts
     to unblock only the requests that can be unblocked by completion
     of the I/O request, reducing the length of time the lock
     protecting the queue is held.

     PHKL_22528:
     (SR: 8606113703 DTS: JAGac07217)
     When a physical volume is replaced without deactivating the
     volume group it belongs to, the operating system does not read
     the bad block directory from the new disk, but continues using
     the old one. This can cause spurious I/O errors or mirror resync
     failures if bad block relocation is disabled, or data corruption
     if bad block relocation is enabled.

     Resolution:
     Whenever a physical volume is replaced, the bad block directory
     is read from the new physical volume.

     (SR: 8606138886 CR: JAGad08152)
     If a shared volume group has one or more LUNs on an
     active-passive disk array (Nike, Galaxy or Optimus), and one of
     the array's Service Processors is hot-swapped without
     deactivating the volume group, then there is no guarantee that
     all the nodes in the cluster will be using the same IDs to refer
     to those LUNs or to particular paths (i.e., PVLinks) to the LUNs.
     This is because each Service Processor has a unique "controller
     signature" which is used to construct these IDs. The LUN and path
     IDs are passed between nodes in PVLinks-related messages. If the
     IDs do not match, the messages will fail. This can lead to
     various symptoms, but the most obvious is that if a node
     deactivates the volume group (while it remains activated on other
     nodes), that node will not be able to reactivate it.

     Resolution:
     The solution is to detect situations where the IDs might be out
     of sync and reconstruct them by reading the controller signatures
     again. If necessary, messages are reprocessed (if the receiver
     was using an old ID) or resent (if the sender was using an old
     ID).

     (SR: 8606162303 CR: JAGad31619)
     Reads directed to HFS file systems on LVM logical volumes can
     take substantially longer to complete during an lvmerge
     operation. The cause is that the LVM lv_readahead_info()
     interface called by HFS during each read is substantially slower
     than it was prior to patch PHKL_20963 (in which significant LVM
     locking improvements were made).

     Resolution:
     lv_readahead_info() performance has been improved by being less
     strict about the locks acquired during the lv_readahead_info()
     operation.

     PHKL_21369:
     (SR: 8606108373 CR: JAGab78776)
     When reading from a mirrored logical volume, LVM might try a disk
     that is known to be offline before it tries another disk which is
     still available. The read is delayed while the first I/O times
     out.

     Resolution:
     When selecting the best mirror to read from, give preference to
     disks that are still available over disks that are known to be
     offline.

     (SR: 8606106798 CR: JAGab76189)
     When a volume group is activated, LVM validates a data structure
     on each physical volume called the Mirror Consistency Record
     (MCR). If the MCR is not valid, LVM performs a full resync of the
     physical volume and should rewrite the MCR. But with PHKL_20963
     installed, LVM does not rewrite the MCR. Instead, if the MCR is
     invalid, LVM performs a full resync every time the volume group
     is activated, rather than just the first time. While the resync
     is in progress, system performance may be degraded. In addition,
     since LVM does not have two valid copies of all data during the
     resync, the system is vulnerable to a disk failure until the
     resync completes. (This problem does not affect performance or
     availability of mirrored logical volumes after the resync has
     completed.) With PHKL_20963 installed, LVM might also display the
     wrong physical volume (PV) number in certain diagnostic messages.
     PHKL_21369 corrects this as well.

     Resolution:
     If the MCR is not valid, rewrite it after performing a full
     resync.

     PHKL_19704:
     (SR: 1653305987 DTS: JAGab17773)
     Currently, LVM does not allow bad block relocation to proceed
     concurrently with parallel read operations, in order to keep bad
     block relocation consistent. So when a bad block relocation is in
     progress and a new request arrives for the same block, the system
     panics. As an enhancement, bad block relocation is now allowed
     with parallel read operations: if the block being accessed is
     being relocated, then depending on its state we either initiate
     the bad block relocation or wait until it completes. If
     REL_DESIRED is set on a block (meaning a read noticed a bad
     block), the relocation for that block is initiated. If
     REL_PENDING or REL_DEVICE is set (meaning relocation is in
     progress), we wait until the relocation is completed and then
     perform the I/O from the new location.
     Resolution:
     Modified lv_hwreloc() and lv_validblk() to take the appropriate
     action for the states mentioned above.

     PHKL_20963:
     (SR: 8606128444 CR: JAGac81735)
     It might not be possible to activate a volume group in shared
     mode if any of its physical volumes are on an Optimus Prime Disk
     Array, because the serial numbers for these devices are truncated
     when they are passed between nodes in a ServiceGuard cluster.

     Resolution:
     Do not truncate Optimus Prime serial numbers. [Although this
     defect was resolved in PHKL_20963, the original documentation did
     not mention it.]

     (SR: 8606106637 CR: JAGab75913)
     The LVM deadlock (hang) occurs due to a defect introduced in
     PHKL_19209 (recalled). In that patch, an easily encountered
     deadlock condition was introduced while attempting to correct
     another, relatively rare deadlock. The problem can be easily
     reproduced by running LVM commands which operate on existing
     logical volumes, such as lvextend(1M), lvsplit(1M) or
     lvmerge(1M), along with commands that query logical volumes, such
     as glance(1). The deadlock occurs roughly 10% of the time, but
     when it does happen the consequences are severe: it becomes
     impossible to complete the operations or to run any other LVM
     commands without rebooting the system.

     Resolution:
     The LVM kernel code was modified. The volume group lock and other
     LVM locks were reordered, and a new volume group data lock was
     added to allow device recovery operations to occur simultaneously
     with command operations, correcting both the old and the newly
     introduced deadlock defects. This patch supersedes the interim
     PHKL_20807 patch. It reintroduces the device recovery changes
     from PHKL_19209 and the bug fixes from PHKL_19696 which were
     purposely excluded from PHKL_20807.

     (SR: 8606106012 CR: JAGab74797)
     Because of a race condition in LVM, it is possible for an I/O
     request to be accepted while the logical volume is being closed.
     Eventually, a data structure that has already been freed (as a
     result of closing the logical volume) is referenced, causing the
     operating system to panic.

     Resolution:
     Eliminate the race condition so that I/O cannot proceed after a
     logical volume has been closed. [Although this defect was
     resolved in PHKL_20963, the original documentation did not
     mention it.]

     PHKL_20807:
     (SR: 8606100412 DTS: JAGab31786)
     This is an interim patch to support the Optimus disk array.

     PHKL_20040:
     (SR: 8606100412 DTS: JAGab31786)
     For some disk arrays, LVM treats all occurrences of the same LUN
     ID as alternate paths to a single volume. This assumption is not
     correct for the Optimus disk array: two distinct volumes may have
     the same LUN ID but different target IDs.

     Resolution:
     To identify a unique volume in an Optimus array, LVM now uses
     both the LUN ID and the target ID.

     PHKL_19696:
     (SR: 8606101971 DTS: JAGab66231)
     If the physical volume to which the logical volume is mapped has
     problems, LVM returns EIO for logical requests before the
     lv_iotimeout value set on the logical volume has elapsed, instead
     of retrying until lv_iotimeout. This is because the start time on
     the request is initialized during scheduling of the request. If
     the PV to which the request is to be scheduled is down, the
     request is appended to the powerfail wait queue without being
     scheduled.
     When the PV comes back, the buffers in the powerfail wait queue
     are resent, and at that time the elapsed time (current time minus
     initial time set) of the request is checked. Since the time on
     the request was never initialized (the request was never
     scheduled), it is zero, which makes the elapsed time appear
     larger than lv_iotimeout. Hence the request is rejected without
     being processed, even though the time actually elapsed is much
     less than the lv_iotimeout value.

     Resolution:
     Initialize the logical buffer start time in lv_strategy(), at the
     time the request is processed, instead of setting it during
     scheduling in lv_schedule().

     PHKL_19209:
     (SR: 5003437970 DTS: JAGaa40887)
     The problem was that some of the LVM device recovery was still a
     serial process.

     Resolution:
     The LVM device recovery code was modified so that all tests of
     devices and paths are conducted in parallel. Devices which are
     available are immediately brought online again, irrespective of
     other failed devices or paths. The changes in this patch assure
     that devices recover within the time it takes to test the
     device/path and to update its data structures. The volume group
     data structures, and the LVM operations that require them (LVM
     commands and opens and closes of logical volumes), should be held
     off no more than 35 seconds.

     PHKL_19166:
     (SR: 8606100864 DTS: JAGab39559)
     (SR: 4701424846 DTS: JAGab14452)
     Informix issues massive numbers of 1K reads in parallel. With an
     8K page size and I/Os serialized within the page, performance
     suffers.

     Resolution:
     Logic was added to allow reads from the same 8K page to proceed
     in parallel when bad block relocation is completely disabled
     (lvchange -r N).

     (SR: 8606100864 DTS: JAGab39559)
     (SR: 1653289132 DTS: JAGaa67952)
     If the holder of the vg_lock is waiting for I/O to finish, and
     the I/O cannot finish until LVM switches to another link, a
     deadlock results.

     Resolution:
     To resolve the deadlock, the code now obtains the lock
     temporarily in order to switch to the alternate link, then
     returns the lock to the original holder so the I/O can finish.

     (SR: 8606100864 DTS: JAGab39559)
     (SR: 4701424895 DTS: JAGab14455)
     The Optimus Array needs to be recognized as an ACTIVE/PASSIVE
     device.

     Resolution:
     Added code to recognize the Optimus Array as an ACTIVE/PASSIVE
     device.

     PHKL_17546:
     (SR: 1653289553 DTS: JAGaa46305)
     lv_syncx() may return with stale extents without actually syncing
     all the extents.

     Resolution:
     Added an additional check to verify that all the extents are
     synced; otherwise an error is returned. lv_syncx() now returns
     SUCCESS only when the syncing is complete. Changes were also made
     in lv_resyncpv() to preserve the error value.
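     As general background (not part of the fixes above), stale
     extents on a mirrored logical volume can be observed and
     resynchronized with the standard LVM commands. This is a minimal
     sketch only; the volume group and logical volume names are
     hypothetical examples:

        # Check for logical extents marked "stale" on a mirrored LV:
        lvdisplay -v /dev/vgXX/lvolN | grep -i stale

        # Resynchronize stale extents for one LV, or for the whole VG:
        lvsync /dev/vgXX/lvolN
        vgsync /dev/vgXX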
SR:
     1653289132   1653289553   1653305987   4701424846   4701424895
     5003437970   8606100412   8606100864   8606101971   8606106637
     8606106798   8606108373   8606113703   8606124005   8606138886
     8606142976   8606161601   8606162303   8606166039   8606166971
     8606168136   8606173900   8606176439

Patch Files:
     /usr/conf/lib/libhp-ux.a(lv_lvm.o)
     /usr/conf/lib/libhp-ux.a(rw_lock.o)
     /usr/conf/lib/liblvm.a(lv_block.o)
     /usr/conf/lib/liblvm.a(lv_cluster_lock.o)
     /usr/conf/lib/liblvm.a(lv_defect.o)
     /usr/conf/lib/liblvm.a(lv_hp.o)
     /usr/conf/lib/liblvm.a(lv_ioctls.o)
     /usr/conf/lib/liblvm.a(lv_lvsubr.o)
     /usr/conf/lib/liblvm.a(lv_mircons.o)
     /usr/conf/lib/liblvm.a(lv_phys.o)
     /usr/conf/lib/liblvm.a(lv_schedule.o)
     /usr/conf/lib/liblvm.a(lv_spare.o)
     /usr/conf/lib/liblvm.a(lv_strategy.o)
     /usr/conf/lib/liblvm.a(lv_subr.o)
     /usr/conf/lib/liblvm.a(lv_syscalls.o)
     /usr/conf/lib/liblvm.a(lv_vgda.o)
     /usr/conf/lib/liblvm.a(lv_vgsa.o)
     /usr/conf/lib/liblvm.a(sh_vgsa.o)
     /usr/conf/lib/liblvm.a(slvm_comm.o)

what(1) Output:
     /usr/conf/lib/libhp-ux.a(lv_lvm.o):
          lv_lvm.c $Date: 2000/04/27 14:02:14 $ $Revision: 1.3.98.5 $
          PATCH_10.20 (PHKL_21369)
     /usr/conf/lib/libhp-ux.a(rw_lock.o):
          rw_lock.c $Date: 2000/10/20 15:32:10 $ $Revision: 1.8.98.7 $
          PATCH_10.20 (PHKL_22528)
     /usr/conf/lib/liblvm.a(lv_block.o):
          lv_block.c $Date: 2001/03/10 14:56:42 $ $Revision: 1.13.98.9 $
          PATCH_10.20 (PHKL_23480)
     /usr/conf/lib/liblvm.a(lv_cluster_lock.o):
          lv_cluster_lock.c $Date: 2000/04/27 13:40:16 $
          $Revision: 1.10.98.9 $ PATCH_10.20 (PHKL_21369)
     /usr/conf/lib/liblvm.a(lv_defect.o):
          lv_defect.c $Date: 2001/03/10 14:56:40 $ $Revision: 1.16.98.9 $
          PATCH_10.20 (PHKL_23480)
     /usr/conf/lib/liblvm.a(lv_hp.o):
          lv_hp.c $Date: 2001/03/10 14:55:21 $ $Revision: 1.18.98.40 $
          PATCH_10.20 (PHKL_23480)
     /usr/conf/lib/liblvm.a(lv_ioctls.o):
          lv_ioctls.c $Date: 2001/03/16 08:54:49 $ $Revision: 1.18.98.29 $
          PATCH_10.20 (PHKL_23611)
     /usr/conf/lib/liblvm.a(lv_lvsubr.o):
          lv_lvsubr.c $Date: 2001/03/10 14:56:29 $ $Revision: 1.15.98.27 $
          PATCH_10.20 (PHKL_23480)
     /usr/conf/lib/liblvm.a(lv_mircons.o):
          lv_mircons.c $Date: 2001/03/10 14:56:04 $ $Revision: 1.14.98.10 $
          PATCH_10.20 (PHKL_23480)
     /usr/conf/lib/liblvm.a(lv_phys.o):
          lv_phys.c $Date: 2000/04/27 13:59:37 $ $Revision: 1.14.98.21 $
          PATCH_10.20 (PHKL_21369)
     /usr/conf/lib/liblvm.a(lv_schedule.o):
          lv_schedule.c $Date: 2000/04/27 13:59:39 $ $Revision: 1.18.98.16 $
          PATCH_10.20 (PHKL_21369)
     /usr/conf/lib/liblvm.a(lv_spare.o):
          lv_spare.c $Date: 2000/04/27 13:59:58 $ $Revision: 1.3.98.12 $
          PATCH_10.20 (PHKL_21369)
     /usr/conf/lib/liblvm.a(lv_strategy.o):
          lv_strategy.c $Date: 2000/04/27 14:00:04 $ $Revision: 1.14.98.18 $
          PATCH_10.20 (PHKL_21369)
     /usr/conf/lib/liblvm.a(lv_subr.o):
          lv_subr.c $Date: 2001/03/10 14:56:09 $ $Revision: 1.18.98.12 $
          PATCH_10.20 (PHKL_23480)
     /usr/conf/lib/liblvm.a(lv_syscalls.o):
          lv_syscalls.c $Date: 2000/04/27 14:00:07 $ $Revision: 1.14.98.13 $
          PATCH_10.20 (PHKL_21369)
     /usr/conf/lib/liblvm.a(lv_vgda.o):
          lv_vgda.c $Date: 2001/03/10 14:56:12 $ $Revision: 1.18.98.9 $
          PATCH_10.20 (PHKL_23480)
     /usr/conf/lib/liblvm.a(lv_vgsa.o):
          lv_vgsa.c $Date: 2001/03/10 14:56:27 $ $Revision: 1.14.98.9 $
          PATCH_10.20 (PHKL_23480)
     /usr/conf/lib/liblvm.a(sh_vgsa.o):
          sh_vgsa.c $Date: 2001/03/10 14:56:21 $ $Revision: 1.3.98.12 $
          PATCH_10.20 (PHKL_23480)
     /usr/conf/lib/liblvm.a(slvm_comm.o):
          slvm_comm.c $Date: 2001/03/10 14:56:44 $ $Revision: 1.3.98.6 $
          PATCH_10.20 (PHKL_23480)

cksum(1) Output:
     29448133    156624  /usr/conf/lib/libhp-ux.a(lv_lvm.o)
     2352419582    7132  /usr/conf/lib/libhp-ux.a(rw_lock.o)
     1935308342    2608  /usr/conf/lib/liblvm.a(lv_block.o)
     2417573758   10592  /usr/conf/lib/liblvm.a(lv_cluster_lock.o)
     590105967    12868  /usr/conf/lib/liblvm.a(lv_defect.o)
     1902605513   91964  /usr/conf/lib/liblvm.a(lv_hp.o)
     2276752510   35088  /usr/conf/lib/liblvm.a(lv_ioctls.o)
     658933363    39148  /usr/conf/lib/liblvm.a(lv_lvsubr.o)
     4006329228   18760  /usr/conf/lib/liblvm.a(lv_mircons.o)
     1289736306    7740  /usr/conf/lib/liblvm.a(lv_phys.o)
     3027240879   26432  /usr/conf/lib/liblvm.a(lv_schedule.o)
     775224458    38920  /usr/conf/lib/liblvm.a(lv_spare.o)
     911903142     7668  /usr/conf/lib/liblvm.a(lv_strategy.o)
     1311226172   15572  /usr/conf/lib/liblvm.a(lv_subr.o)
     4065182126   14080  /usr/conf/lib/liblvm.a(lv_syscalls.o)
     4205964867    9620  /usr/conf/lib/liblvm.a(lv_vgda.o)
     2201638634   12716  /usr/conf/lib/liblvm.a(lv_vgsa.o)
     564404207    42624  /usr/conf/lib/liblvm.a(sh_vgsa.o)
     2471680826   27104  /usr/conf/lib/liblvm.a(slvm_comm.o)

Patch Conflicts: None

Patch Dependencies:
     s700: 10.20: PHKL_16750

Hardware Dependencies: None

Other Dependencies:
     PHKL_21084 should also be installed on systems using Nike,
     Galaxy, or Optimus disk arrays in a Shared LVM configuration to
     improve recovery after the hot-swapping of array Service
     Processors.

Supersedes:
     PHKL_17546 PHKL_19166 PHKL_19209 PHKL_19696 PHKL_19704 PHKL_20040
     PHKL_20807 PHKL_20963 PHKL_21369 PHKL_22528 PHKL_23480

Equivalent Patches:
     PHKL_23612: s800: 10.20

Patch Package Size: 650 KBytes

Installation Instructions:
     Please review all instructions and the Hewlett-Packard
     SupportLine User Guide or your Hewlett-Packard support terms and
     conditions for precautions, scope of license, restrictions, and
     limitations of liability and warranties, before installing this
     patch.
     ------------------------------------------------------------
     1. Back up your system before installing a patch.

     2. Login as root.

     3. Copy the patch to the /tmp directory.

     4. Move to the /tmp directory and unshar the patch:

        cd /tmp
        sh PHKL_23611

     5a. For a standalone system, run swinstall to install the patch:

        swinstall -x autoreboot=true -x match_target=true \
             -s /tmp/PHKL_23611.depot

     By default swinstall will archive the original software in
     /var/adm/sw/patch/PHKL_23611. If you do not wish to retain a copy
     of the original software, you can create an empty file named
     /var/adm/sw/patch/PATCH_NOSAVE.

     WARNING: If this file exists when a patch is installed, the patch
     cannot be deinstalled. Please be careful when using this feature.

     It is recommended that you move the PHKL_23611.text file to
     /var/adm/sw/patch for future reference.

     To put this patch on a magnetic tape and install from the tape
     drive, use the command:

        dd if=/tmp/PHKL_23611.depot of=/dev/rmt/0m bs=2k

Special Installation Instructions:
     New tests run for the first time on this patch revealed a
     preexisting problem with some LVM commands which will be fixed in
     a separate commands patch.
     Commands that are run independently work fine, but when a logical
     volume command (lvchange, lvcreate, lvextend, lvreduce, lvremove,
     lvrmboot) is run at the same time as a volume group command
     (vgchgid, vgexport, vgreduce, vgremove, vgscan), it is possible
     that the 'lvlnboot -R' and vgcfgbackup portions of the logical
     volume command may fail. The best way to avoid the problem is not
     to run the indicated LVM commands simultaneously. If the
     'lvlnboot -R' or vgcfgbackup operation fails, the workaround is
     simply to rerun it manually.

     Similarly, vgdisplay and lvdisplay might fail if they are run
     while the LVM configuration is changing. If this happens, simply
     repeat the vgdisplay or lvdisplay command.

     No inconsistencies are introduced into the LVM configuration file
     or the on-disk LVM data structures by this defect. However, it is
     important to rerun the failed 'lvlnboot -R' command when boot
     logical volumes are changed, and to perform configuration backups
     whenever the volume group configuration is changed.

     This patch depends on base patch PHKL_16750. For successful
     installation, please ensure that PHKL_16750 is in the same depot
     as this patch, or that PHKL_16750 is already installed.

     Due to the number of objects in this patch, the customization
     phase of the update may take more than 10 minutes. During that
     time the system will not appear to make forward progress, but it
     will actually be installing the objects.
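     As a convenience (not part of the official instructions), the
     PHKL_16750 dependency can be verified before installation. This
     is a minimal sketch only, assuming the depot was unshared into
     /tmp as described above; the depot-listing syntax may vary with
     your SD-UX version:

        # Confirm the base patch is already installed on the target:
        swlist PHKL_16750

        # Or confirm it is present in the same depot as PHKL_23611:
        swlist -d @ /tmp/PHKL_23611.depot

        # After installation, confirm the patch is recorded:
        swlist PHKL_23611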