Article 5PMYX Hard drive SMART test failure in TrueNAS

by fusion1275, from LinuxQuestions.org (#5PMYX)

Hello all,

I recently purchased a second-hand WD 3TB NAS drive and have sync'd it up with the other three 3TB WD disks in my home NAS. But I have received an alert saying the disk has failed its SMART tests. I am no good at reading these things, but I can see numerous logged errors and what looks like a read failure during a self-test. Am I reading that right?

Here is the output of smartctl -x /dev/ada0:

Code:
smartctl 7.2 2020-12-30 r5155 [FreeBSD 12.2-RELEASE-p9 amd64] (local build)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family: Western Digital Red
Device Model: WDC WD30EFRX-68EUZN0
Serial Number: WD-WCC4N4TU22E2
LU WWN Device Id: 5 0014ee 260bf52fc
Firmware Version: 82.00A82
User Capacity: 3,000,592,982,016 bytes [3.00 TB]
Sector Sizes: 512 bytes logical, 4096 bytes physical
Rotation Rate: 5400 rpm
Device is: In smartctl database [for details use: -P show]
ATA Version is: ACS-2 (minor revision not indicated)
SATA Version is: SATA 3.0, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is: Fri Sep 17 08:17:45 2021 BST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
AAM feature is: Unavailable
APM feature is: Unavailable
Rd look-ahead is: Enabled
Write cache is: Enabled
DSN feature is: Unavailable
ATA Security is: Disabled, frozen [SEC2]
Wt Cache Reorder: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status: (0x00) Offline data collection activity
was never started.
Auto Offline Data Collection: Disabled.
Self-test execution status: ( 113) The previous self-test completed having
the read element of the test failed.
Total time to complete Offline
data collection: (40860) seconds.
Offline data collection
capabilities: (0x7b) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: ( 2) minutes.
Extended self-test routine
recommended polling time: ( 410) minutes.
Conveyance self-test routine
recommended polling time: ( 5) minutes.
SCT capabilities: (0x703d) SCT Status supported.
SCT Error Recovery Control supported.
SCT Feature Control supported.
SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAGS VALUE WORST THRESH FAIL RAW_VALUE
1 Raw_Read_Error_Rate POSR-K 200 200 051 - 42
3 Spin_Up_Time POS--K 183 178 021 - 5825
4 Start_Stop_Count -O--CK 100 100 000 - 176
5 Reallocated_Sector_Ct PO--CK 200 200 140 - 0
7 Seek_Error_Rate -OSR-K 200 200 000 - 0
9 Power_On_Hours -O--CK 063 063 000 - 27379
10 Spin_Retry_Count -O--CK 100 100 000 - 0
11 Calibration_Retry_Count -O--CK 100 100 000 - 0
12 Power_Cycle_Count -O--CK 100 100 000 - 126
192 Power-Off_Retract_Count -O--CK 200 200 000 - 4
193 Load_Cycle_Count -O--CK 200 200 000 - 1026
194 Temperature_Celsius -O---K 123 108 000 - 27
196 Reallocated_Event_Count -O--CK 200 200 000 - 0
197 Current_Pending_Sector -O--CK 200 200 000 - 0
198 Offline_Uncorrectable ----CK 100 253 000 - 0
199 UDMA_CRC_Error_Count -O--CK 200 200 000 - 0
200 Multi_Zone_Error_Rate ---R-- 200 200 000 - 3
||||||_ K auto-keep
|||||__ C event count
||||___ R error rate
|||____ S speed/performance
||_____ O updated online
|______ P prefailure warning

General Purpose Log Directory Version 1
SMART Log Directory Version 1 [multi-sector log support]
Address Access R/W Size Description
0x00 GPL,SL R/O 1 Log Directory
0x01 SL R/O 1 Summary SMART error log
0x02 SL R/O 5 Comprehensive SMART error log
0x03 GPL R/O 6 Ext. Comprehensive SMART error log
0x06 SL R/O 1 SMART self-test log
0x07 GPL R/O 1 Extended self-test log
0x09 SL R/W 1 Selective self-test log
0x10 GPL R/O 1 NCQ Command Error log
0x11 GPL R/O 1 SATA Phy Event Counters log
0x21 GPL R/O 1 Write stream error log
0x22 GPL R/O 1 Read stream error log
0x80-0x9f GPL,SL R/W 16 Host vendor specific log
0xa0-0xa7 GPL,SL VS 16 Device vendor specific log
0xa8-0xb7 GPL,SL VS 1 Device vendor specific log
0xbd GPL,SL VS 1 Device vendor specific log
0xc0 GPL,SL VS 1 Device vendor specific log
0xc1 GPL VS 93 Device vendor specific log
0xe0 GPL,SL R/W 1 SCT Command/Status
0xe1 GPL,SL R/W 1 SCT Data Transfer

SMART Extended Comprehensive Error Log Version: 1 (6 sectors)
Device Error Count: 9
CR = Command Register
FEATR = Features Register
COUNT = Count (was: Sector Count) Register
LBA_48 = Upper bytes of LBA High/Mid/Low Registers ] ATA-8
LH = LBA High (was: Cylinder High) Register ] LBA
LM = LBA Mid (was: Cylinder Low) Register ] Register
LL = LBA Low (was: Sector Number) Register ]
DV = Device (was: Device/Head) Register
DC = Device Control Register
ER = Error register
ST = Status register
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.

Error 9 [8] occurred at disk power-on lifetime: 26519 hours (1104 days + 23 hours)
When the command that caused the error occurred, the device was active or idle.

After command completion occurred, registers were:
ER -- ST COUNT LBA_48 LH LM LL DV DC
-- -- -- == -- == == == -- -- -- -- --
40 -- 51 00 00 00 00 ea f2 67 90 40 00 Error: UNC at LBA = 0xeaf26790 = 3941754768

Commands leading to the command that caused the error were:
CR FEATR COUNT LBA_48 LH LM LL DV DC Powered_Up_Time Command/Feature_Name
-- == -- == -- == == == -- -- -- -- -- --------------- --------------------
60 00 08 00 c0 00 00 00 10 ca 70 40 08 1d+04:51:17.524 READ FPDMA QUEUED
61 00 48 00 b8 00 00 00 21 2a c8 40 08 1d+04:51:17.524 WRITE FPDMA QUEUED
60 00 10 00 b0 00 00 09 1a 18 e8 40 08 1d+04:51:17.524 READ FPDMA QUEUED
60 00 10 00 a8 00 00 09 1a 18 98 40 08 1d+04:51:17.522 READ FPDMA QUEUED
60 00 e0 00 a0 00 00 ea f2 86 00 40 08 1d+04:51:17.520 READ FPDMA QUEUED

Error 8 [7] occurred at disk power-on lifetime: 26519 hours (1104 days + 23 hours)
When the command that caused the error occurred, the device was active or idle.

After command completion occurred, registers were:
ER -- ST COUNT LBA_48 LH LM LL DV DC
-- -- -- == -- == == == -- -- -- -- --
40 -- 51 00 00 00 00 ea f2 67 90 40 00 Error: UNC at LBA = 0xeaf26790 = 3941754768

Commands leading to the command that caused the error were:
CR FEATR COUNT LBA_48 LH LM LL DV DC Powered_Up_Time Command/Feature_Name
-- == -- == -- == == == -- -- -- -- -- --------------- --------------------
60 00 e8 00 e8 00 00 ea f2 67 30 40 08 1d+04:51:13.663 READ FPDMA QUEUED
60 00 50 00 e0 00 00 ea f2 66 e0 40 08 1d+04:51:13.662 READ FPDMA QUEUED
60 00 20 00 d8 00 00 ea f2 66 c0 40 08 1d+04:51:13.661 READ FPDMA QUEUED
60 00 08 00 d0 00 00 ea f2 66 b8 40 08 1d+04:51:13.661 READ FPDMA QUEUED
60 00 08 00 c8 00 00 ea f2 66 b0 40 08 1d+04:51:13.661 READ FPDMA QUEUED

Error 7 [6] occurred at disk power-on lifetime: 26467 hours (1102 days + 19 hours)
When the command that caused the error occurred, the device was active or idle.

After command completion occurred, registers were:
ER -- ST COUNT LBA_48 LH LM LL DV DC
-- -- -- == -- == == == -- -- -- -- --
40 -- 51 00 00 00 00 02 73 ee c0 40 00 Error: UNC at LBA = 0x0273eec0 = 41152192

Commands leading to the command that caused the error were:
CR FEATR COUNT LBA_48 LH LM LL DV DC Powered_Up_Time Command/Feature_Name
-- == -- == -- == == == -- -- -- -- -- --------------- --------------------
60 02 d0 00 90 00 00 02 74 0c 10 40 08 02:45:46.811 READ FPDMA QUEUED
60 04 00 00 88 00 00 02 74 08 10 40 08 02:45:46.805 READ FPDMA QUEUED
60 04 00 00 80 00 00 02 74 04 10 40 08 02:45:46.805 READ FPDMA QUEUED
60 04 00 00 78 00 00 02 74 00 10 40 08 02:45:46.804 READ FPDMA QUEUED
60 00 70 00 70 00 00 00 98 4a 70 40 08 02:45:46.804 READ FPDMA QUEUED

Error 6 [5] occurred at disk power-on lifetime: 26467 hours (1102 days + 19 hours)
When the command that caused the error occurred, the device was active or idle.

After command completion occurred, registers were:
ER -- ST COUNT LBA_48 LH LM LL DV DC
-- -- -- == -- == == == -- -- -- -- --
40 -- 51 00 00 00 00 02 73 ee c0 40 00 Error: UNC at LBA = 0x0273eec0 = 41152192

Commands leading to the command that caused the error were:
CR FEATR COUNT LBA_48 LH LM LL DV DC Powered_Up_Time Command/Feature_Name
-- == -- == -- == == == -- -- -- -- -- --------------- --------------------
60 00 30 00 58 00 00 02 73 ff e0 40 08 02:45:42.986 READ FPDMA QUEUED
60 02 00 00 50 00 00 02 73 fd e0 40 08 02:45:42.986 READ FPDMA QUEUED
60 02 00 00 48 00 00 02 73 fb e0 40 08 02:45:42.986 READ FPDMA QUEUED
60 02 00 00 40 00 00 02 73 f9 e0 40 08 02:45:42.986 READ FPDMA QUEUED
60 02 00 00 38 00 00 02 73 f7 e0 40 08 02:45:42.986 READ FPDMA QUEUED

Error 5 [4] occurred at disk power-on lifetime: 24323 hours (1013 days + 11 hours)
When the command that caused the error occurred, the device was active or idle.

After command completion occurred, registers were:
ER -- ST COUNT LBA_48 LH LM LL DV DC
-- -- -- == -- == == == -- -- -- -- --
40 -- 51 00 00 00 00 02 73 fd 18 40 00 Error: UNC at LBA = 0x0273fd18 = 41155864

Commands leading to the command that caused the error were:
CR FEATR COUNT LBA_48 LH LM LL DV DC Powered_Up_Time Command/Feature_Name
-- == -- == -- == == == -- -- -- -- -- --------------- --------------------
60 00 08 00 b0 00 00 46 50 22 40 40 08 22d+05:53:36.447 READ FPDMA QUEUED
61 00 10 00 a8 00 00 46 50 22 30 40 08 22d+05:53:36.446 WRITE FPDMA QUEUED
60 00 08 00 a0 00 00 02 73 fd 18 40 08 22d+05:53:36.445 READ FPDMA QUEUED
60 00 08 00 98 00 00 02 73 fd 20 40 08 22d+05:53:36.445 READ FPDMA QUEUED
60 00 08 00 90 00 00 02 73 fd 28 40 08 22d+05:53:36.445 READ FPDMA QUEUED

Error 4 [3] occurred at disk power-on lifetime: 24323 hours (1013 days + 11 hours)
When the command that caused the error occurred, the device was active or idle.

After command completion occurred, registers were:
ER -- ST COUNT LBA_48 LH LM LL DV DC
-- -- -- == -- == == == -- -- -- -- --
40 -- 51 00 00 00 00 02 73 fd 18 40 00 Error: UNC at LBA = 0x0273fd18 = 41155864

Commands leading to the command that caused the error were:
CR FEATR COUNT LBA_48 LH LM LL DV DC Powered_Up_Time Command/Feature_Name
-- == -- == -- == == == -- -- -- -- -- --------------- --------------------
60 03 e0 00 e8 00 00 02 73 f9 e0 40 08 22d+05:53:32.674 READ FPDMA QUEUED
60 04 00 00 e0 00 00 02 73 f5 e0 40 08 22d+05:53:32.674 READ FPDMA QUEUED
ea 00 00 00 00 00 00 00 00 00 00 e0 08 22d+05:53:32.635 FLUSH CACHE EXT
61 00 02 00 d0 00 00 00 90 3e e8 40 08 22d+05:53:32.635 WRITE FPDMA QUEUED
ea 00 00 00 00 00 00 00 00 00 00 e0 08 22d+05:53:32.615 FLUSH CACHE EXT

Error 3 [2] occurred at disk power-on lifetime: 24142 hours (1005 days + 22 hours)
When the command that caused the error occurred, the device was active or idle.

After command completion occurred, registers were:
ER -- ST COUNT LBA_48 LH LM LL DV DC
-- -- -- == -- == == == -- -- -- -- --
40 -- 51 00 00 00 00 02 73 d3 f0 40 00 Error: UNC at LBA = 0x0273d3f0 = 41145328

Commands leading to the command that caused the error were:
CR FEATR COUNT LBA_48 LH LM LL DV DC Powered_Up_Time Command/Feature_Name
-- == -- == -- == == == -- -- -- -- -- --------------- --------------------
60 00 08 00 38 00 00 00 3d 6f b8 40 08 14d+17:11:33.713 READ FPDMA QUEUED
60 01 00 00 30 00 00 02 73 db e0 40 08 14d+17:11:33.712 READ FPDMA QUEUED
60 00 18 00 28 00 00 46 4f fa e0 40 08 14d+17:11:33.712 READ FPDMA QUEUED
60 00 08 00 20 00 00 02 73 d3 f0 40 08 14d+17:11:33.712 READ FPDMA QUEUED
ea 00 00 00 00 00 00 00 00 00 00 e0 08 14d+17:11:33.685 FLUSH CACHE EXT

Error 2 [1] occurred at disk power-on lifetime: 24142 hours (1005 days + 22 hours)
When the command that caused the error occurred, the device was active or idle.

After command completion occurred, registers were:
ER -- ST COUNT LBA_48 LH LM LL DV DC
-- -- -- == -- == == == -- -- -- -- --
40 -- 51 00 00 00 00 02 73 d3 f0 40 00 Error: UNC at LBA = 0x0273d3f0 = 41145328

Commands leading to the command that caused the error were:
CR FEATR COUNT LBA_48 LH LM LL DV DC Powered_Up_Time Command/Feature_Name
-- == -- == -- == == == -- -- -- -- -- --------------- --------------------
60 04 00 00 50 00 00 02 73 d7 e0 40 08 14d+17:11:29.947 READ FPDMA QUEUED
60 04 00 00 48 00 00 02 73 d3 e0 40 08 14d+17:11:29.947 READ FPDMA QUEUED
60 04 00 00 40 00 00 02 73 cf e0 40 08 14d+17:11:29.924 READ FPDMA QUEUED
60 04 00 00 38 00 00 02 73 cb e0 40 08 14d+17:11:29.916 READ FPDMA QUEUED
60 04 00 00 30 00 00 02 73 c7 e0 40 08 14d+17:11:29.898 READ FPDMA QUEUED

SMART Extended Self-test Log Version: 1 (1 sectors)
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Short offline Completed: read failure 10% 27371 41141648
# 2 Short offline Completed without error 00% 27323 -
# 3 Extended offline Completed: read failure 90% 27283 41152208
# 4 Short offline Completed: read failure 10% 27275 41141944
# 5 Short offline Completed without error 00% 26459 -
# 6 Extended offline Completed without error 00% 26284 -
# 7 Extended offline Completed without error 00% 26112 -
# 8 Extended offline Completed without error 00% 25986 -
# 9 Extended offline Completed without error 00% 25818 -
#10 Extended offline Completed without error 00% 25651 -
#11 Extended offline Completed without error 00% 25482 -
#12 Extended offline Completed without error 00% 25315 -
#13 Extended offline Completed without error 00% 25147 -
#14 Extended offline Completed without error 00% 24979 -
#15 Extended offline Completed without error 00% 24811 -
#16 Extended offline Completed without error 00% 24643 -
#17 Extended offline Completed without error 00% 24475 -
#18 Extended offline Completed without error 00% 24308 -

SMART Selective self-test log data structure revision number 1
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
1 0 0 Not_testing
2 0 0 Not_testing
3 0 0 Not_testing
4 0 0 Not_testing
5 0 0 Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

SCT Status Version: 3
SCT Version (vendor specific): 258 (0x0102)
Device State: Active (0)
Current Temperature: 27 Celsius
Power Cycle Min/Max Temperature: 25/36 Celsius
Lifetime Min/Max Temperature: 2/42 Celsius
Under/Over Temperature Limit Count: 0/0
Vendor specific:
01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

SCT Temperature History Version: 2
Temperature Sampling Period: 1 minute
Temperature Logging Interval: 1 minute
Min/Max recommended Temperature: 0/60 Celsius
Min/Max Temperature Limit: -41/85 Celsius
Temperature History Size (Index): 478 (163)

Index Estimated Time Temperature Celsius
164 2021-09-17 00:20 28 *********
... ..(227 skipped). .. *********
392 2021-09-17 04:08 28 *********
393 2021-09-17 04:09 27 ********
... ..( 63 skipped). .. ********
457 2021-09-17 05:13 27 ********
458 2021-09-17 05:14 29 **********
... ..( 74 skipped). .. **********
55 2021-09-17 06:29 29 **********
56 2021-09-17 06:30 28 *********
... ..(106 skipped). .. *********
163 2021-09-17 08:17 28 *********

SCT Error Recovery Control:
Read: 70 (7.0 seconds)
Write: 70 (7.0 seconds)

Device Statistics (GP/SMART Log 0x04) not supported

Pending Defects log (GP Log 0x0c) not supported

SATA Phy Event Counters (GP Log 0x11)
ID Size Value Description
0x0001 2 0 Command failed due to ICRC error
0x0002 2 0 R_ERR response for data FIS
0x0003 2 0 R_ERR response for device-to-host data FIS
0x0004 2 0 R_ERR response for host-to-device data FIS
0x0005 2 0 R_ERR response for non-data FIS
0x0006 2 0 R_ERR response for device-to-host non-data FIS
0x0007 2 0 R_ERR response for host-to-device non-data FIS
0x0008 2 0 Device-to-host non-data FIS retries
0x0009 2 12 Transition from drive PhyRdy to drive PhyNRdy
0x000a 2 9 Device-to-host register FISes sent due to a COMRESET
0x000b 2 0 CRC errors within host-to-device FIS
0x000f 2 0 R_ERR response for host-to-device data FIS, CRC
0x0012 2 0 R_ERR response for host-to-device non-data FIS, CRC
0x8000 4 416659 Vendor specific

Any help would be greatly appreciated.

Thank you
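
For anyone reading along, the failing self-tests and the logged UNC addresses can be pulled out of the output above with a short script. This is only a rough Python sketch: the sample rows and hex values are copied by hand from the post, the regular expression is a hypothetical pattern that assumes the whitespace-collapsed layout shown here (not smartctl's real column layout), and the function names are made up for illustration. It does not talk to a drive.

```python
import re

# Values copied from the smartctl output above; nothing here queries a real drive.
LOGICAL_SECTOR_BYTES = 512     # "Sector Sizes: 512 bytes logical, 4096 bytes physical"
PHYSICAL_SECTOR_BYTES = 4096

# First five rows of the extended self-test log, as pasted in the post.
SELF_TEST_LOG = """\
# 1 Short offline Completed: read failure 10% 27371 41141648
# 2 Short offline Completed without error 00% 27323 -
# 3 Extended offline Completed: read failure 90% 27283 41152208
# 4 Short offline Completed: read failure 10% 27275 41141944
# 5 Short offline Completed without error 00% 26459 -
"""

# Distinct "UNC at LBA" values from the extended error log (hex as printed).
ERROR_LOG_LBAS_HEX = ["0xeaf26790", "0x0273eec0", "0x0273fd18", "0x0273d3f0"]

# Hypothetical pattern for the whitespace-collapsed rows shown in this post.
FAILED_ROW = re.compile(
    r"^#\s*(?P<num>\d+)\s+(?P<test>.+?)\s+Completed: read failure\s+"
    r"(?P<remaining>\d+)%\s+(?P<hours>\d+)\s+(?P<lba>\d+)\s*$"
)


def failed_self_tests(log_text):
    """Return one dict per self-test row that ended in a read failure."""
    failures = []
    for line in log_text.splitlines():
        match = FAILED_ROW.match(line.strip())
        if match:
            failures.append({
                "num": int(match["num"]),
                "test": match["test"],
                "hours": int(match["hours"]),
                "lba": int(match["lba"]),
            })
    return failures


def lba_to_byte_offset(lba):
    """Byte offset of a logical block address, using the logical sector size."""
    return lba * LOGICAL_SECTOR_BYTES


def physical_sector(lba):
    """Index of the 4096-byte physical sector that contains this LBA."""
    return lba_to_byte_offset(lba) // PHYSICAL_SECTOR_BYTES


if __name__ == "__main__":
    for failure in failed_self_tests(SELF_TEST_LOG):
        print(failure["num"], failure["test"], failure["hours"], failure["lba"])
    for hex_lba in ERROR_LOG_LBAS_HEX:
        lba = int(hex_lba, 16)
        print(hex_lba, lba, lba_to_byte_offset(lba), physical_sector(lba))
```

Run against the values in this post, the three failed self-tests and three of the four logged UNC addresses all sit within roughly 14,000 sectors of each other around LBA 41.1 million, with one separate address at 3941754768. That only restates what the logs show; it is not a diagnosis.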
External Content
Source RSS or Atom Feed
Feed Location https://feeds.feedburner.com/linuxquestions/latest
Feed Title LinuxQuestions.org
Feed Link https://www.linuxquestions.org/questions/