Wednesday, January 13, 2016

Solaris/VxVM - "vxdisk list" shows status of diskgroup as 'online dgdisabled'.

Problem

What to do when "vxdisk list" shows the status of a disk group's disks as 'online dgdisabled'.

Solution


DEVICE       TYPE      DISK         GROUP        STATUS
c0t0d0s2      sliced     disk03         rootdg         online
c1t12d0s2    sliced     disk12         raid5dg        online dgdisabled
c1t13d0s2    sliced     disk13         raid5dg        online dgdisabled
c1t14d0s2    sliced     disk14         raid5dg        online dgdisabled
c1t15d0s2    sliced     disk15         raid5dg        online dgdisabled

This situation can happen when every disk in a disk group becomes inaccessible, for example because of a bad power supply, power being turned off to the disk array, or a disconnected cable.
This can also occur when a disk group consists of only simple and/or nopriv disks and is changed to the enclosure-based naming scheme with VERITAS Volume Manager (VxVM) 3.2.
The correction for this is explained in the VxVM 3.2 System Administrator's Guide, section 'Simple/Nopriv Disks in Non-Root Diskgroups'.

The disk group will not show in the output from vxprint -ht.

The disk group will show as disabled in vxdg list:

NAME         STATE           ID
rootdg       enabled  957541872.1025.scrollsaw
raid5dg      disabled 960304056.1215.scrollsaw
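
The two listings above can also be checked from a script. The following is only a sketch: the function names are made up for illustration, and it assumes the column layout shown in this post (STATE in column 2 of vxdg list, GROUP in column 4 of vxdisk list, with dgdisabled as the last field).

```shell
# disabled_groups (hypothetical helper): print the names of disk groups
# that "vxdg list" reports as disabled.
# Reads the listing on stdin; column 2 is STATE.
disabled_groups() {
    awk 'NR > 1 && $2 == "disabled" { print $1 }'
}

# dgdisabled_groups (hypothetical helper): print each disk group that has
# at least one disk flagged dgdisabled in "vxdisk list".
# Reads the listing on stdin; column 4 is GROUP, the flag is the last field.
dgdisabled_groups() {
    awk 'NR > 1 && $NF == "dgdisabled" { print $4 }' | sort -u
}

# On a live system (assuming the VxVM commands are in PATH):
#   vxdg list   | disabled_groups
#   vxdisk list | dgdisabled_groups
```

With the listings shown in this post, both functions would report raid5dg.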

This is the output of vxdg list raid5dg:

Group:     raid5dg
dgid:      960304056.1215.scrollsaw
import-id: 0.1214
flags:     disabled
version:   0
copies:    nconfig=default nlog=default
config:    seqno=0.1052 permlen=1162 free=1154 templen=4 loglen=176
config disk c1t12d0s2 copy 1 len=1162 state=iofail failed
      config-tid=0.1052 pending-tid=0.1052
      Error: error=Disk write failure
config disk c1t13d0s2 copy 1 len=1162 state=iofail failed
      config-tid=0.1052 pending-tid=0.1052
      Error: error=Disk write failure
config disk c1t14d0s2 copy 1 len=1162 state=iofail failed
      config-tid=0.1052 pending-tid=0.1052
      Error: error=Disk write failure
config disk c1t15d0s2 copy 1 len=1162 state=iofail failed
      config-tid=0.1052 pending-tid=0.1052
      Error: error=Disk write failure
log disk c1t12d0s2 copy 1 len=176 invalid
log disk c1t13d0s2 copy 1 len=176 invalid
log disk c1t14d0s2 copy 1 len=176 invalid
log disk c1t15d0s2 copy 1 len=176 invalid

Once power to the disks has been restored, VxVM still cannot use the disk group, yet it considers the disk group to be imported:

root@scrollsaw# vxvol start raid5vol
vxvm:vxvol: ERROR: raid5vol: Not in any imported disk group
root@scrollsaw# vxdg import raid5dg
vxvm:vxdg: ERROR: Disk group raid5dg: import failed: Disk group exists and is imported
 

This can be remedied by deporting, then importing the disk group:

vxdg deport raid5dg
vxdg import raid5dg
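
If this has to be done for several disk groups, the two commands can be wrapped in a small shell function so that the import is attempted only when the deport succeeded. This is a minimal sketch, not a supported tool: recover_dg is a hypothetical name, and it assumes vxdg is in PATH and that the function is run as root.

```shell
# recover_dg (hypothetical helper): deport a disabled disk group, then
# import it again. Stops at the first command that fails.
recover_dg() {
    dg=$1
    if [ -z "$dg" ]; then
        echo "usage: recover_dg diskgroup" >&2
        return 2
    fi
    vxdg deport "$dg" || return 1
    vxdg import "$dg" || return 1
}

# Example, using the disk group from this post:
#   recover_dg raid5dg
```

Stopping after a failed deport avoids running an import that would only repeat the "Disk group exists and is imported" error shown earlier.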

The disk group now shows in vxprint -ht with the volume and plexes disabled:

dg raid5dg      default      default  79000    960304056.1215.scrollsaw

dm disk12       c1t12d0s2    sliced   1599     17910400 -
dm disk13       c1t13d0s2    sliced   1599     17910400 -
dm disk14       c1t14d0s2    sliced   1599     17910400 -
dm disk15       c1t15d0s2    sliced   1599     17910400 -


v  raid5vol     raid5        DISABLED ACTIVE   409600   RAID      -
pl raid5vol-01  raid5vol     DISABLED ACTIVE   409600   RAID      2/32     RW
sd disk12-01    raid5vol-01  disk12   0        409600   0/0       c1t12d0  ENA
sd disk13-01    raid5vol-01  disk13   0        409600   1/0       c1t13d0  ENA
pl raid5vol-02  raid5vol     DISABLED LOG      1600     CONCAT    -        RW
sd disk14-01    raid5vol-02  disk14   0        1600     0         c1t14d0  ENA
pl raid5vol-03  raid5vol     DISABLED LOG      1600     CONCAT    -        RW
sd disk15-01    raid5vol-03  disk15   0        1600     0         c1t15d0  ENA

Now the volume can be started:

vxvol start raid5vol

v  raid5vol     raid5        ENABLED  ACTIVE   409600   RAID      -
pl raid5vol-01  raid5vol     ENABLED  ACTIVE   409600   RAID      2/32     RW
sd disk12-01    raid5vol-01  disk12   0        409600   0/0       c1t12d0  ENA
sd disk13-01    raid5vol-01  disk13   0        409600   1/0       c1t13d0  ENA
pl raid5vol-02  raid5vol     ENABLED  LOG      1600     CONCAT    -        RW
sd disk14-01    raid5vol-02  disk14   0        1600     0         c1t14d0  ENA
pl raid5vol-03  raid5vol     ENABLED  LOG      1600     CONCAT    -        RW
sd disk15-01    raid5vol-03  disk15   0        1600     0         c1t15d0  ENA
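
To confirm the result without reading the whole listing, the kernel state of each volume can be pulled out of the vxprint -ht output. This is a sketch under the assumption that volume records have the layout shown above (record type v in column 1, name in column 2, kernel state in column 4); volume_kstates is a hypothetical name.

```shell
# volume_kstates (hypothetical helper): read "vxprint -ht"-style output on
# stdin and print "<volume> <kernel state>" for every volume (v) record.
volume_kstates() {
    awk '$1 == "v" { print $2, $4 }'
}

# On a live system (assuming vxprint is in PATH):
#   vxprint -g raid5dg -ht | volume_kstates
```

For the listing above this would print "raid5vol ENABLED"; before the volume was started it would have printed "raid5vol DISABLED".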
 


Note: You may need to verify that no processes are accessing the file systems on the disabled disk group. If processes still have these volumes open, you may need to stop or kill them or, if using Solaris 8 with either a UFS file system or VxFS 3.4 + patch02, force-unmount the file systems. Refer to the man page for umount.
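
One way to make that check is with fuser. The sketch below assumes the Solaris fuser behaviour, where -c reports on the file system containing the given mount point and the PIDs are written to standard output; check_fs_idle and the mount point in the example are hypothetical.

```shell
# check_fs_idle (hypothetical helper): report whether any process is using
# the file system mounted at the given mount point.
# Returns 0 if idle, 1 if in use.
check_fs_idle() {
    mnt=$1
    # fuser prints the PIDs on stdout and the annotations on stderr;
    # the unquoted substitution collapses the whitespace between PIDs.
    pids=$(echo $(fuser -c "$mnt" 2>/dev/null))
    if [ -n "$pids" ]; then
        echo "$mnt is in use by PIDs: $pids"
        return 1
    fi
    echo "$mnt is idle"
}

# Example (the mount point is an assumption, not taken from this post):
#   check_fs_idle /raid5fs
# If it reports PIDs, stop those processes first. A forced unmount with
# "umount -f" is a last resort; see the umount man page.
```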
