Help. Replaced unavail disk with new disk but still unavail - S11.3 zpool
by everyday from LinuxQuestions.org on (#5H6Q4)
Hi. We have a Solaris 11.3 system that was purchased many years ago. The company is no longer dealing with Solaris installs.
I am the sysadmin but know next to nothing about Solaris, though I do know a bit of Linux.
Our Solaris unit has 36 SAS hard drives. One of them in tank1 has a red light (c15t1d30). I have replaced it with another, exactly the same, brand new drive.
Another slot (c15t1d8) went through the same issue about a year ago and is in the same situation, so I would like to fix that one as well.
My question is: can someone please help me bring c15t1d30 back online? I have followed the Oracle instructions on replacing a drive with a new one in the same slot, and nothing I do seems to bring it back. The drive remains in the UNAVAIL state with the red light on.
Below is my 'zpool status -v' output and a list of the things I have tried:
Code:# zpool status -v
pool: rpool
state: ONLINE
status: The pool is formatted using an older on-disk format. The pool can
still be used, but some features are unavailable.
action: Upgrade the pool using 'zpool upgrade'. Once this is done, the
pool will no longer be accessible on older software versions.
scan: resilvered 34.8G in 6m59s with 0 errors on Mon Nov 4 14:05:41 2013
config:
NAME STATE READ WRITE CKSUM
rpool ONLINE 0 0 0
mirror-0 ONLINE 0 0 0
c8t1d0 ONLINE 0 0 0
c8t0d0 ONLINE 0 0 0
errors: No known data errors
pool: tank1
state: DEGRADED
status: One or more devices are unavailable in response to persistent errors.
Sufficient replicas exist for the pool to continue functioning in a
degraded state.
action: Determine if the device needs to be replaced, and clear the errors
using 'zpool clear' or 'fmadm repaired', or replace the device
with 'zpool replace'.
scan: scrub canceled on Tue Nov 12 17:18:14 2013
config:
NAME STATE READ WRITE CKSUM
tank1 DEGRADED 0 0 0
raidz2-0 ONLINE 0 0 0
c15t1d0 ONLINE 0 0 0
c15t1d1 ONLINE 0 0 0
c15t1d2 ONLINE 0 0 0
c15t1d3 ONLINE 0 0 0
c15t1d4 ONLINE 0 0 0
c15t1d5 ONLINE 0 0 0
raidz2-1 DEGRADED 0 0 0
c15t1d6 ONLINE 0 0 0
c15t1d7 ONLINE 0 0 0
c15t1d8 UNAVAIL 0 0 0
c15t1d9 ONLINE 0 0 0
c15t1d10 ONLINE 0 0 0
c15t1d11 ONLINE 0 0 0
raidz2-2 ONLINE 0 0 0
c15t1d12 ONLINE 0 0 0
c15t1d13 ONLINE 0 0 0
c15t1d14 ONLINE 0 0 0
c15t1d15 ONLINE 0 0 0
c15t1d16 ONLINE 0 0 0
c15t1d17 ONLINE 0 0 0
raidz2-3 ONLINE 0 0 0
c15t1d18 ONLINE 0 0 0
c15t1d19 ONLINE 0 0 0
c15t1d20 ONLINE 0 0 0
c15t1d21 ONLINE 0 0 0
c15t1d22 ONLINE 0 0 0
c15t1d23 ONLINE 0 0 0
raidz2-4 ONLINE 0 0 0
c15t1d24 ONLINE 0 0 0
c15t1d25 ONLINE 0 0 0
c15t1d26 ONLINE 0 0 0
c15t1d27 ONLINE 0 0 0
c15t1d28 ONLINE 8 0 0
c15t1d29 ONLINE 0 0 0
raidz2-5 DEGRADED 0 0 0
c15t1d30 UNAVAIL 0 23 0
c15t1d31 ONLINE 0 0 0
c15t1d32 ONLINE 0 0 0
c15t1d33 ONLINE 0 0 0
c15t1d34 ONLINE 0 0 0
c15t1d35 ONLINE 0 0 0
logs
c9t5000A72B30077A50d0 ONLINE 0 0 0
c11t5000A72B30077A4Cd0 ONLINE 0 0 0
cache
c8t2d0 ONLINE 0 0 0
c8t3d0 ONLINE 0 0 0
c8t4d0 ONLINE 0 0 0
c8t5d0 ONLINE 0 0 0
device details:
c15t1d8 UNAVAIL too many errors
status: FMA has faulted this device.
action: Run 'fmadm faulty' for more information. Clear the errors
using 'fmadm repaired'.
see: http://support.oracle.com/msg/FMD-8000-4M for recovery
c15t1d30 UNAVAIL too many errors
status: FMA has faulted this device.
action: Run 'fmadm faulty' for more information. Clear the errors
using 'fmadm repaired'.
see: http://support.oracle.com/msg/ZFS-8000-FD for recovery
errors: No known data errors
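For anyone skimming the long output above, here is a minimal sketch of a filter that pulls out only the disks that are not ONLINE. The function name `list_unhealthy_disks` is made up for illustration, and it assumes the `zpool status` layout shown in this post (a `config:` line, then NAME STATE READ WRITE CKSUM columns, then `device details:` or `errors:`). It has not been tested on the Solaris box itself.

```shell
# Hypothetical helper: reads 'zpool status' text on stdin and prints
# every cXtYdZ device in a config section whose STATE is not ONLINE.
list_unhealthy_disks() {
    awk '
        # Start of the per-pool device table.
        $1 == "config:" { in_config = 1; next }
        # "device details:" and "errors:" end the device table.
        $1 == "device" || $1 == "errors:" { in_config = 0 }
        # Only leaf disks (cXtYdZ names), skipping pool and raidz rows.
        in_config && $1 ~ /^c[0-9]+t[0-9A-Fa-f]+d[0-9]+$/ && $2 != "ONLINE" {
            print $1, $2, "read=" $3, "write=" $4, "cksum=" $5
        }
    '
}
```

It would be used as `zpool status -v | list_unhealthy_disks`. Note that it only looks at the STATE column, so a disk that is still ONLINE but has a non-zero error count (like c15t1d28 with 8 read errors) is not listed.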
What I have tried:
Code:# zpool offline tank1 c15t1d30
- physically replaced the hard drive with a brand new, exactly the same one, in the same slot.
Code:# zpool replace tank1 c15t1d30
cannot label 'c15t1d30': try using fdisk(1M) and then provide a specific slice
Unable to build pool from specified devices: invalid vdev configuration
- this could be the issue, but I have ZERO idea what it means. Google revealed nothing that helpful.
- zpool status -v showed no change at all, i.e. still unavail
Code:# zpool clear tank1 c15t1d30
- zpool status -v showed no change at all, except the error count is now 0; still unavail
Code:# zpool online tank1 c15t1d30
- zpool status -v showed no change
- put old drive back in then:
Code:# devfsadm -Cv
devfsadm[21586]: verbose: removing file: /dev/dsk/c12t5000A72A30077A50d0s9
devfsadm[21586]: verbose: removing file: /dev/dsk/c15t1d8
...
...
...
- it printed many more of those 'removing file' lines
- replace with new hard drive again
Code:# devfsadm -Cv
- zpool status -v showed no change
Code:# zpool replace tank1 c15t1d30 (again)
cannot label 'c15t1d30': try using fdisk(1M) and then provide a specific slice
Unable to build pool from specified devices: invalid vdev configuration
- zpool status -v showed no change
Code:# fmadm faulty
--------------- ------------------------------------ -------------- ---------
TIME EVENT-ID MSG-ID SEVERITY
--------------- ------------------------------------ -------------- ---------
Apr 29 08:49:13 64ebbc9f-ecca-4a54-bf04-ca3eefb783a6 ZFS-8000-LR Major
Problem Status : open
Diag Engine : zfs-diagnosis / 1.0
System
Manufacturer : unknown
Name : unknown
Part_Number : unknown
Serial_Number : unknown
System Component
Manufacturer : Cisco Systems Inc
Name : UCSC-C24-M3S
Part_Number :
Serial_Number : WZP1709000E
Host_ID : 008457b8
----------------------------------------
Suspect 1 of 1 :
Problem class : fault.fs.zfs.open_failed
Certainty : 100%
Affects : zfs://pool=64513e8f0e484ee2/vdev=ea0091c4caa611ec/pool_name=tank1/vdev_name=id1,sd@n600100404f361ca0a2900c8d00000000/a
Status : faulted and taken out of service
FRU
Status : faulty
FMRI : "zfs://pool=64513e8f0e484ee2/vdev=ea0091c4caa611ec/pool_name=tank1/vdev_name=id1,sd@n600100404f361ca0a2900c8d00000000/a"
Description : ZFS device 'id1,sd@n600100404f361ca0a2900c8d00000000/a' in pool
'tank1' failed to open.
Response : An attempt will be made to activate a hot spare if available.
Impact : Fault tolerance of the pool may be compromised.
Action : Use 'fmadm faulty' to provide a more detailed view of this event.
Run 'zpool status -lx' for more information. Please refer to the
associated reference document at
http://support.oracle.com/msg/ZFS-8000-LR for the latest service
procedures and policies regarding this diagnosis.
Code:# fmadm repaired zfs://pool=64513e8f0e484ee2/vdev=ea0091c4caa611ec/pool_name=tank1/vdev_name=id1,sd@n600100404f361ca0a2900c8d00000000/a
- zpool status -v showed no change
The current state is the same as the original condition, except that the write error count for c15t1d30 is now 0:
Code:# zpool status -v
pool: rpool
state: ONLINE
status: The pool is formatted using an older on-disk format. The pool can
still be used, but some features are unavailable.
action: Upgrade the pool using 'zpool upgrade'. Once this is done, the
pool will no longer be accessible on older software versions.
scan: resilvered 34.8G in 6m59s with 0 errors on Mon Nov 4 14:05:41 2013
config:
NAME STATE READ WRITE CKSUM
rpool ONLINE 0 0 0
mirror-0 ONLINE 0 0 0
c8t1d0 ONLINE 0 0 0
c8t0d0 ONLINE 0 0 0
errors: No known data errors
pool: tank1
state: DEGRADED
status: One or more devices are unavailable in response to persistent errors.
Sufficient replicas exist for the pool to continue functioning in a
degraded state.
action: Determine if the device needs to be replaced, and clear the errors
using 'zpool clear' or 'fmadm repaired', or replace the device
with 'zpool replace'.
scan: scrub canceled on Tue Nov 12 17:18:14 2013
config:
NAME STATE READ WRITE CKSUM
tank1 DEGRADED 0 0 0
raidz2-0 ONLINE 0 0 0
c15t1d0 ONLINE 0 0 0
c15t1d1 ONLINE 0 0 0
c15t1d2 ONLINE 0 0 0
c15t1d3 ONLINE 0 0 0
c15t1d4 ONLINE 0 0 0
c15t1d5 ONLINE 0 0 0
raidz2-1 DEGRADED 0 0 0
c15t1d6 ONLINE 0 0 0
c15t1d7 ONLINE 0 0 0
c15t1d8 UNAVAIL 0 0 0
c15t1d9 ONLINE 0 0 0
c15t1d10 ONLINE 0 0 0
c15t1d11 ONLINE 0 0 0
raidz2-2 ONLINE 0 0 0
c15t1d12 ONLINE 0 0 0
c15t1d13 ONLINE 0 0 0
c15t1d14 ONLINE 0 0 0
c15t1d15 ONLINE 0 0 0
c15t1d16 ONLINE 0 0 0
c15t1d17 ONLINE 0 0 0
raidz2-3 ONLINE 0 0 0
c15t1d18 ONLINE 0 0 0
c15t1d19 ONLINE 0 0 0
c15t1d20 ONLINE 0 0 0
c15t1d21 ONLINE 0 0 0
c15t1d22 ONLINE 0 0 0
c15t1d23 ONLINE 0 0 0
raidz2-4 ONLINE 0 0 0
c15t1d24 ONLINE 0 0 0
c15t1d25 ONLINE 0 0 0
c15t1d26 ONLINE 0 0 0
c15t1d27 ONLINE 0 0 0
c15t1d28 ONLINE 8 0 0
c15t1d29 ONLINE 0 0 0
raidz2-5 DEGRADED 0 0 0
c15t1d30 UNAVAIL 0 0 0
c15t1d31 ONLINE 0 0 0
c15t1d32 ONLINE 0 0 0
c15t1d33 ONLINE 0 0 0
c15t1d34 ONLINE 0 0 0
c15t1d35 ONLINE 0 0 0
logs
c9t5000A72B30077A50d0 ONLINE 0 0 0
c11t5000A72B30077A4Cd0 ONLINE 0 0 0
cache
c8t2d0 ONLINE 0 0 0
c8t3d0 ONLINE 0 0 0
c8t4d0 ONLINE 0 0 0
c8t5d0 ONLINE 0 0 0
device details:
c15t1d8 UNAVAIL too many errors
status: FMA has faulted this device.
action: Run 'fmadm faulty' for more information. Clear the errors
using 'fmadm repaired'.
see: http://support.oracle.com/msg/FMD-8000-4M for recovery
c15t1d30 UNAVAIL too many errors
status: FMA has faulted this device.
action: Run 'fmadm faulty' for more information. Clear the errors
using 'fmadm repaired'.
see: http://support.oracle.com/msg/ZFS-8000-LR for recovery
errors: No known data errors
Phew, a lot of info!
So basically, can anyone help get this drive back online, or offer any insight?
Thanks!
Jono

