I have a Solaris 11.1 machine with some disks attached via an expander (LSISAS2X36) through an LSI 1068 controller. The setup used to work quite decently, but since I added another batch of disks, I see some strange effects:
format hangs after selecting a disk (any disk) unless I set NOINUSE_CHECK=1.
zpool create test c10t20d0 hangs as well, seemingly for the same reason as format. Here the NOINUSE_CHECK variable has no effect, although old news archives suggest that it helped on previous releases of Solaris.
I already tried running devfsadm -Cv to clean up dev entries for non-present devices, but to no avail. I also suspected that invalid partition information on one of the newly added disks might cause the "in use" check to hang, so I ran the fdisk menu for each of the added disks to create a 100% Solaris partition, but this did not help either.
Running truss zpool create test c10t20d0 shows a lot of links being read under /dev/rdsk/ and then stalls at these lines:
readlink("/dev/zvol/rdsk/rpool/dump", "../../../..//devices/pseudo/zfs@0:1,raw", 1023) = 39
lstat("/dev", 0xF8D35310) = 0
lstat("/dev/zvol", 0xF8D35310) = 0
lstat("/dev/zvol/rdsk", 0xF8D35310) = 0
lstat("/dev/zvol/rdsk/rpool", 0xF8D35310) = 0
lstat("/dev/zvol/rdsk/rpool/swap", 0xF8D35310) = 0
readlink("/dev/zvol/rdsk/rpool/swap", "../../../..//devices/pseudo/zfs@0:2,raw", 1023) = 39
open("/devices/pseudo/devinfo@0:devinfo", O_RDONLY) = 7
ioctl(7, DINFOIDENT, 0x00000000) = 57311
ioctl(7, 0x10DF00, 0xF8D36F10) = 380014
ioctl(7, DINFOUSRLD, 0x08D62000) = 380928
close(7) = 0
close(6) = 0
munmap(0xF5FE1000, 4096) = 0
munmap(0xF5FD2000, 20480) = 0
munmap(0xF5FC7000, 24576) = 0
munmap(0xF6014000, 110592) = 0
munmap(0xF6030000, 40) = 0
close(5) = 0
stat64("/opt/VRTSvxvm/lib/libsysevent.so.1", 0xF8D36910) Err#2 ENOENT
stat64("/lib/libsysevent.so.1", 0xF8D36910) = 0
resolvepath("/lib/libsysevent.so.1", "/lib/libsysevent.so.1", 1023) = 21
open("/lib/libsysevent.so.1", O_RDONLY) = 5
mmapobj(5, MMOBJ_INTERPRET, 0xF6040B78, 0xF8D3697C, 0x00000000) = 0
close(5) = 0
mmap(0x00000000, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANON, -1, 0) = 0xF5FE0000
memcntl(0xF6020000, 11280, MC_ADVISE, MADV_WILLNEED, 0, 0) = 0
getuid() = 0 [0]
statvfs("/system/volatile", 0xF8D369B0) = 0
stat("/system/volatile/sysevent_channels", 0xF8D36A50) = 0
mkdir("/system/volatile/sysevent_channels/syseventd_channel", 0755) Err#17 EEXIST
stat("/system/volatile/sysevent_channels/syseventd_channel", 0xF8D368F0) = 0
getuid() = 0 [0]
modctl(MODEVENTS, 0x00000006, 0x08D560EB, 0x00000000, 0xF8D36880) = 0
modctl(MODEVENTS, 0x00000006, 0x08D560EB, 0x00000000, 0xF8D36A40) = 0
unlink("/system/volatile/sysevent_channels/syseventd_channel/59") Err#2 ENOENT
open("/system/volatile/sysevent_channels/syseventd_channel/59", O_RDWR|O_CREAT, 0600) = 5
door_create(0xF6024174, 0x08D56088, DOOR_REFUSE_DESC|DOOR_NO_CANCEL) = 6
getpid() = 22082 [22081]
priocntlsys(1, 0xF8D365B0, 3, 0xF8D366A0, 0) = 22082
priocntlsys(1, 0xF8D36540, 1, 0xF8D36600, 0) = 4
priocntlsys(1, 0xF8D36500, 0, 0xF6575FB8, 0) = 4
mmap(0x00000000, 131072, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANON, -1, 0) = 0xF5FBF000
mmap(0x00000000, 65536, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANON, -1, 0) = 0xF5FA0000
sigaction(SIGCANCEL, 0xF8D366C0, 0x00000000) = 0
sysconfig(_CONFIG_STACK_PROT) = 3
mmap(0x00000000, 1040384, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_NORESERVE|MAP_ANON, -1, 0) = 0xF5EA1000
mmap(0x00010000, 65536, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE|MAP_ANON|MAP_ALIGN, -1, 0) = 0xF5E90000
getcontext(0xF8D36510)
uucopy(0xF8D364D0, 0xF5F9EFEC, 20) = 0
lwp_create(0xF8D36760, LWP_DETACHED|LWP_SUSPENDED, 0xF8D3675C) = 2
/1: lwp_continue(2) = 0
/2: lwp_create() (returning as new lwp ...) = 0
/1: yield() = 0
/2: setustack(0xF5E902A0)
/2: schedctl() = 0xF623B040
/1: umount2("/system/volatile/sysevent_channels/syseventd_channel/59", 0x00000000) Err#22 EINVAL
/1: ioctl(6, I_CANPUT, 0x00000000) Err#89 ENOSYS
/1: door_info(6, 0xF8D36640) = 0
/1: mount(0, "/system/volatile/sysevent_channels/syseventd_channel/59", MS_DATA|MS_NOMNTTAB, "namefs", 0xF8D3663C, 4, 0x00000000, 0) = 0
/1: close(5) = 0
/1: open("/system/volatile/sysevent_channels/syseventd_channel/reg_door", O_RDONLY) = 5
/2: door_return(0x00000000, 0, 0x00000000, 0xF5F9EE00, 1007360) (sleeping...)
/1: door_call(5, 0xF8D369F0) (sleeping...)
^C/1: Received signal #2, SIGINT, in door_call() [default]
Running truss format c10t20d0 produces much the same output towards the end.
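To make it easier to see where such a trace stalls, here is a minimal Python sketch that picks out the calls truss marks as "(sleeping...)". The function name sleeping_calls is made up for illustration, and the sketch assumes the trace was saved as plain text in the same layout as the output above (for example via truss -o). For the trace above it would report door_return on LWP 2 and door_call on LWP 1, the latter on descriptor 5, which was opened on /system/volatile/sysevent_channels/syseventd_channel/reg_door. That hints, without proving it, that the command is waiting on syseventd.

```python
import re

# Matches truss lines such as:
#   /1: door_call(5, 0xF8D369F0) (sleeping...)
# The "/N:" LWP prefix is optional because truss omits it
# while the process is still single-threaded.
SLEEPING = re.compile(
    r"^(?:/(?P<lwp>\d+):\s+)?(?P<call>\w+)\((?P<args>.*)\)\s+\(sleeping\.\.\.\)\s*$"
)


def sleeping_calls(lines):
    """Return (lwp, syscall, args) for each call reported as sleeping.

    Hypothetical helper: it only looks at the textual "(sleeping...)"
    marker and does not know anything else about truss output.
    """
    found = []
    for line in lines:
        match = SLEEPING.match(line.strip())
        if match:
            lwp = match.group("lwp") or "1"  # assume LWP 1 if no prefix
            found.append((lwp, match.group("call"), match.group("args")))
    return found
```

Passing the lines of a saved trace to sleeping_calls() and comparing the result with the preceding open() calls shows which file or door each blocked descriptor refers to.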
Is there anything else I could do to narrow down the possible causes, or anything I could simply try to see whether it helps?
It looks like the system did not handle a pulled disk very well. Although most of it seemed to work correctly, the format and zpool create commands hung even after the missing disk had been re-inserted.
Rebooting the system fixed the problem; a fast reboot was sufficient.