


How to interpret this smartctl (smartmon) data










17















We have a Linux server that has been in heavy use for 3 years. We're running a number of virtualized servers on it, some of which have not been well behaved, and for a significant time the server's I/O capacity was exceeded, leading to bad iowait. It has four 500 GB Barracuda SATA drives connected to a 3com RAID controller. One drive holds the OS, and the other three are set up as RAID 5.



Now we have a debate as to the condition of the drives and whether they are actively failing.



Here's a portion of the output for 1 of the 4 disks. They all have relatively similar statistics:




SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   118   099   006    Pre-fail  Always       -       169074425
  3 Spin_Up_Time            0x0003   095   092   000    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       26
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000f   077   060   030    Pre-fail  Always       -       200009354607
  9 Power_On_Hours          0x0032   069   069   000    Old_age   Always       -       27856
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       1
 12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       26
184 Unknown_Attribute       0x0032   100   100   099    Old_age   Always       -       0
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
188 Unknown_Attribute       0x0032   100   100   000    Old_age   Always       -       1
189 High_Fly_Writes         0x003a   100   100   000    Old_age   Always       -       0
190 Airflow_Temperature_Cel 0x0022   071   060   045    Old_age   Always       -       29 (Lifetime Min/Max 26/37)
194 Temperature_Celsius     0x0022   029   040   000    Old_age   Always       -       29 (0 21 0 0)
195 Hardware_ECC_Recovered  0x001a   046   033   000    Old_age   Always       -       169074425
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0

SMART Error Log Version: 1
No Errors Logged


My interpretation of this is that we have not had any bad sectors or other indications that any of the drives are actively failing.



However, the high Raw_Read_Error_Rate and Seek_Error_Rate values are being pointed to as indications that the drives are dying.



























  • 1





    There is a good description here (too long to repost, please follow the link): lime-technology.com/wiki/Understanding_SMART_Reports In case the link goes down, some important quotes: "This is an indicator of the current rate of errors of the low level physical sector read operations. In normal operation, there are ALWAYS a small number of errors [...] there is NO issue with the drive." and "PLEASE completely ignore the RAW_VALUE number! Only Seagates report the raw value, which yes, does appear to be the number of raw read errors, but should be ignored, completely."

    – Konrad Gajewski
    Feb 12 '18 at 21:19
























linux smartctl
















asked Sep 20 '11 at 21:28









gview

















6 Answers


















7














In my experience, Seagates have weird numbers for those two SMART attributes. When diagnosing a Seagate I tend to ignore those and look more closely at other fields like Reallocated Sector Count. Of course, when in doubt replace the drive, but even brand new Seagates will have high numbers for those attributes.
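As a rough illustration of "look at other fields", the sketch below scans `smartctl -A` style text for a few counters and reports any whose raw value is not zero. The helper name `nonzero_counters` and the choice of watched attributes are assumptions made for this example, not something smartctl itself defines.

```python
# Hypothetical helper: flag non-zero raw counters that tend to matter more
# than the Seagate error-rate fields. The watched set is an assumption.
WATCHED = {
    "Reallocated_Sector_Ct",
    "Reported_Uncorrect",
    "Current_Pending_Sector",
    "Offline_Uncorrectable",
}


def nonzero_counters(smartctl_output):
    """Return {attribute_name: raw_value} for watched attributes with a non-zero raw value."""
    found = {}
    for line in smartctl_output.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns.
        if len(fields) < 10 or not fields[0].isdigit():
            continue
        name, raw = fields[1], fields[9]
        if name in WATCHED and raw.isdigit() and int(raw) != 0:
            found[name] = int(raw)
    return found


# Rows copied from the output posted in the question.
SAMPLE = """\
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
"""

print(nonzero_counters(SAMPLE))  # empty dict: none of the watched counters has moved
```

For the drive in the question all four counters are zero, which is consistent with the asker's reading that no bad sectors have been recorded.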




































    50














    For Seagate disks (and possibly some old ones from WD too), the Seek_Error_Rate and Raw_Read_Error_Rate raw values are 48-bit numbers, where the most significant 16 bits are an error count and the low 32 bits are a number of operations.



    % python
    >>> 200009354607 & 0xFFFFFFFF
    2440858991
    >>> (200009354607 & 0xFFFF00000000) >> 32
    46


    So your disk has performed 2440858991 seeks, of which 46 failed. My experience with Seagate drives is that they tend to fail when the number of errors goes over 1000. YMMV.
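The same split can be wrapped in a small function and applied to both raw values from the question. This is only a sketch of the layout described in this answer (16-bit error count above a 32-bit operation count); the function name `split_seagate_raw` is made up for the example, and the layout itself is this answer's claim rather than documented vendor behaviour.

```python
def split_seagate_raw(raw_value):
    """Split a 48-bit raw value into (error_count, operation_count).

    Assumes the layout described above: high 16 bits are errors,
    low 32 bits are operations.
    """
    operations = raw_value & 0xFFFFFFFF   # low 32 bits
    errors = (raw_value >> 32) & 0xFFFF   # next 16 bits
    return errors, operations


# Raw values taken from the smartctl output in the question.
for name, raw in [("Seek_Error_Rate", 200009354607),
                  ("Raw_Read_Error_Rate", 169074425)]:
    errors, operations = split_seagate_raw(raw)
    print(f"{name}: {errors} errors in {operations} operations")
```

Under that assumption the Raw_Read_Error_Rate value decodes to 0 errors in 169074425 operations, since it fits entirely in the low 32 bits.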


























    • 5





      Thanks for this, I wish I had that information back when I originally posed the question.

      – gview
      Jan 31 '14 at 17:55











    • This, very useful. Saved me from panic.

      – Halsafar
      Nov 14 '18 at 23:11


















    7














    The "seek error rate" and "raw read error rate" RAW_VALUES are virtually meaningless for anyone but Seagate's support. As others pointed out, raw values of parameters like "reallocated sector count" or entries in the drive's error log are more likely to indicate a higher probability of failure.



    But you can take a look at the interpreted data in the VALUE, WORST and THRESH columns which are meant to be read as gauges:



    ID# ATTRIBUTE_NAME  FLAG   VALUE WORST THRESH
      7 Seek_Error_Rate 0x000f 077   060   030


    Meaning that your seek error rate is currently considered to be "77% good" and is reported as a problem by SMART when it reaches "30% good". It had been as low as "60% good" once, but has magically recovered since. Note that the interpreted values are calculated by the drive's SMART logic internally and the exact calculation may or may not be published by the manufacturer and typically cannot be tweaked by the user.
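The gauge reading can be sketched in a few lines. The function below treats an attribute as failing when the normalized VALUE is at or below THRESH, and as having failed in the past when WORST is at or below THRESH. The function name and the status labels are invented for this example, and the "margin" figure is just VALUE minus THRESH, not a quantity SMART defines.

```python
def gauge_status(value, worst, thresh):
    """Classify a normalized SMART attribute and report its margin above THRESH.

    Assumption for this sketch: an attribute counts as failed when the
    normalized value is less than or equal to the threshold.
    """
    margin = value - thresh
    if value <= thresh:
        return "failing now", margin
    if worst <= thresh:
        return "failed in the past", margin
    return "ok", margin


# Seek_Error_Rate row from the question: VALUE 077, WORST 060, THRESH 030.
status, margin = gauge_status(int("077"), int("060"), int("030"))
print(status, margin)
```

For the row shown above this reports "ok" with a margin of 47 points, and WORST (060) never reached the threshold either.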



    Personally, I consider a drive containing error log entries as "failing" and urge for a replacement as soon as they occur. But all in all, SMART data has turned out to be a rather weak indicator for failure prediction, as a research paper published by Google uncovered.






































      3














      I realize this discussion is a bit old, but I want to add my 2 cents. I have found the SMART information to be quite a good indicator of pre-fail. When a SMART threshold is tripped, replace the drive. That is what those thresholds are for.



      The vast majority of time you will start to see bad sectors. That is a sure sign the drive is starting to fail. SMART has saved me many times. I use software RAID 1 and it's very helpful since you simply replace the failing drive and rebuild the array.



      I also run short and long self-tests weekly.



      smartctl -t short /dev/sda
      smartctl -t long /dev/sda


      Or add it to /etc/smartd.conf and have smartd email you if there are errors:



      /dev/sda -s L/../../3/22 -I 194 -m someemail@somedomain
      /dev/sdb -s L/../../7/22 -I 194 -m someemail@somedomain


      Make sure to install logwatch and redirect root to an email address and check the daily emails from logwatch. SMARTD tripped flags will show up there but it's of no help if nobody is monitoring that regularly.
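To make the `-s` entries above easier to read: as I understand the smartd.conf documentation, smartd builds a string of the form `T/MM/DD/d/HH` (test type, month, day of month, day of week with 1 = Monday to 7 = Sunday, hour) and runs the test when it matches the regular expression. The sketch below imitates that matching in Python; the helper names are made up, and using `re.fullmatch` is a simplification of whatever smartd does internally.

```python
import re
from datetime import datetime


def schedule_string(test_type, when):
    """Build the T/MM/DD/d/HH string for a given test type and time (assumed format)."""
    return f"{test_type}/{when:%m}/{when:%d}/{when.isoweekday()}/{when:%H}"


def is_due(pattern, test_type, when):
    """Return True if the schedule pattern matches the given moment."""
    return re.fullmatch(pattern, schedule_string(test_type, when)) is not None


wednesday_22h = datetime(2011, 9, 21, 22, 0)   # a Wednesday
sunday_22h = datetime(2011, 9, 25, 22, 0)      # a Sunday

print(schedule_string("L", wednesday_22h))
print(is_due("L/../../3/22", "L", wednesday_22h))  # /dev/sda entry
print(is_due("L/../../7/22", "L", sunday_22h))     # /dev/sdb entry
```

Read this way, the two entries schedule a long self-test on /dev/sda on Wednesdays at 22:00 and on /dev/sdb on Sundays at 22:00, so the drives are not tested at the same time.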




































        1














        Yes, those fields look bad, but I no longer trust the info reported by SMART (my test machine has a drive that should have died a long time ago if you believe the data from smartctl).
        The fact is that you have reported high iowait and the drives are 3 years old. That should be reason enough to replace the drives.
























        • 1





          For various reasons we need to maximize our investment in the hardware. The iowait had to do with the ridiculous load, as well as some configuration mistakes we made when setting up the box.

          – gview
          Sep 20 '11 at 22:57


















        0














        Sorry to commit necromancy on this post, but in my experience, the "Raw Read Error Rate" and "Hardware ECC Recovered" fields for a Seagate drive will quite literally go all over the place and increment constantly into the trillions range at which point they'll cycle back around to zero to continue the process again. I've a Seagate ST9750420AS that has had that problem since day one and still works great even after quite a few years and 3500+ hours of use.



        I think those fields can be safely ignored if you're running one in your case. Just make sure the two fields are reporting the same number and in sync constantly. If they're not...well... That actually might mean a problem.
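A quick way to check the "same number" rule of thumb is to pull the raw values of attributes 1 and 195 out of the attribute table and compare them. This is a minimal sketch with invented helper names, and whether a mismatch really signals trouble is this answer's experience rather than anything documented.

```python
def raw_values(smartctl_output, ids=(1, 195)):
    """Return {attribute_id: raw_value} for the requested attribute IDs."""
    values = {}
    for line in smartctl_output.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns.
        if len(fields) >= 10 and fields[0].isdigit() and int(fields[0]) in ids:
            values[int(fields[0])] = int(fields[9])
    return values


def read_errors_in_sync(smartctl_output):
    """True if Raw_Read_Error_Rate (1) and Hardware_ECC_Recovered (195) match."""
    values = raw_values(smartctl_output)
    return 1 in values and 195 in values and values[1] == values[195]


# Rows copied from the output posted in the question.
SAMPLE = """\
  1 Raw_Read_Error_Rate     0x000f   118   099   006    Pre-fail  Always       -       169074425
195 Hardware_ECC_Recovered  0x001a   046   033   000    Old_age   Always       -       169074425
"""

print(read_errors_in_sync(SAMPLE))
```

For the drive in the question both raw values are 169074425, so the check passes.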
















              answered Sep 20 '11 at 22:38









              hwilbanks































                  edited Aug 20 '15 at 1:25









                  Dan Pritts











                  answered Apr 2 '13 at 1:05









                  tsuna







                  • 5





                    Thans for this, I wish I had that information back when I originally posed the question.

                    – gview
                    Jan 31 '14 at 17:55











                  • This, very useful. Saved me from panic.

                    – Halsafar
                    Nov 14 '18 at 23:11












                  • 5





                    Thans for this, I wish I had that information back when I originally posed the question.

                    – gview
                    Jan 31 '14 at 17:55











                  • This, very useful. Saved me from panic.

                    – Halsafar
                    Nov 14 '18 at 23:11







                  5




                  5





                  Thans for this, I wish I had that information back when I originally posed the question.

                  – gview
                  Jan 31 '14 at 17:55





                  Thans for this, I wish I had that information back when I originally posed the question.

                  – gview
                  Jan 31 '14 at 17:55













                  This, very useful. Saved me from panic.

                  – Halsafar
                  Nov 14 '18 at 23:11





                  This, very useful. Saved me from panic.

                  – Halsafar
                  Nov 14 '18 at 23:11











                  7














                  The "seek error rate" and "raw read error rate" RAW_VALUES are virtually meaningless for anyone but Seagate's support. As others pointed out, raw values of parameters like "reallocated sector count" or entries in the drive's error log are more likely to indicate a higher probability of failure.



                  But you can take a look at the interpreted data in the VALUE, WORST and THRESH columns which are meant to be read as gauges:



                  ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH
                  7 Seek_Error_Rate 0x000f 077 060 030


                  Meaning that your seek error rate is currently considered to be "77% good" and is reported as a problem by SMART when it reaches "30% good". It had been as low as "60% good" once, but has magically recovered since. Note that the interpreted values are calculated by the drive's SMART logic internally and the exact calculation may or may not be published by the manufacturer and typically cannot be tweaked by the user.



                  Personally, I consider a drive containing error log entries as "failing" and urge for a replacement as soon as they occur. But all in all, SMART data has turned out to be a rather weak indicator for failure prediction, as a research paper published by Google uncovered.














                      edited Jul 2 '14 at 15:26

























                      answered Sep 20 '11 at 23:08









                      the-wabbit






















                          3














                          I realize this discussion is a bit old, but I want to add my 2 cents. I have found the SMART information to be quite a good indicator of pre-failure. When a SMART threshold is tripped, replace the drive. That is what those thresholds are for.



                          The vast majority of the time you will start to see bad sectors. That is a sure sign the drive is starting to fail. SMART has saved me many times. I use software RAID 1, and it's very helpful since you simply replace the failing drive and rebuild the array.



                          I also run short and long self-tests weekly.



                          smartctl -t short /dev/sda
                          smartctl -t long /dev/sda


                          Or add it to /etc/smartd.conf and have smartd email you if there are errors:



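                          # Long self-test every Wednesday (day 3) at 22:00; ignore attribute 194 (temperature) when tracking changes; mail warnings to the given address
                          /dev/sda -s L/../../3/22 -I 194 -m someemail@somedomain
                          # Same for the second disk, but on Sunday (day 7)
                          /dev/sdb -s L/../../7/22 -I 194 -m someemail@somedomain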
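
                          If the -s pattern looks cryptic: as far as I understand the smartd.conf man page, smartd builds a 12-character string of the form T/MM/DD/d/HH (test type, month, day of month, day of week with Monday as 1, hour) and starts the test when the whole string matches the regular expression. The Python sketch below imitates that matching. The helper names schedule_string and is_scheduled are mine, and Python's re module only stands in for the POSIX extended regex engine smartd uses, so this is an approximation for simple patterns like the ones above.

```python
import re
from datetime import datetime


def schedule_string(test_type: str, when: datetime) -> str:
    """Build the T/MM/DD/d/HH string for a given moment."""
    # isoweekday(): Monday=1 ... Sunday=7, the numbering used in smartd.conf
    return f"{test_type}/{when:%m}/{when:%d}/{when.isoweekday()}/{when:%H}"


def is_scheduled(pattern: str, test_type: str, when: datetime) -> bool:
    """True if the whole schedule string matches the -s pattern."""
    return re.fullmatch(pattern, schedule_string(test_type, when)) is not None


wednesday_22 = datetime(2014, 7, 2, 22, 0)  # 2 July 2014 was a Wednesday
sunday_22 = datetime(2014, 7, 6, 22, 0)     # 6 July 2014 was a Sunday

print(schedule_string("L", wednesday_22))                   # L/07/02/3/22
print(is_scheduled("L/../../3/22", "L", wednesday_22))      # True
print(is_scheduled("L/../../3/22", "L", sunday_22))         # False
print(is_scheduled("L/../../7/22", "L", sunday_22))         # True
```

                          So the first line above runs a long test on /dev/sda every Wednesday at 22:00 and the second does the same for /dev/sdb on Sundays, which keeps the two long tests from running at the same time.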


                          Make sure to install logwatch, redirect root's mail to an email address, and check the daily emails from logwatch. Flags tripped by smartd will show up there, but that is of no help if nobody monitors them regularly.
















                              answered Jul 5 '14 at 14:21









                              Fred Flint






















                                  1














                                  Yes, those fields look bad, but I don't trust the info reported by SMART anymore (my test machine has a drive which should have died a long time ago if you go by the data from smartctl).
                                  The fact is that you have reported high iowait and the drives are 3 years old. That should be enough reason to replace the drives.
















                                  answered Sep 20 '11 at 22:28









                                  migabi








                                  • 1





                                    For various reasons we need to maximize our investment in the hardware. The iowait had to do with the ridiculous load, as well as some configuration mistakes we made when setting up the box.

                                    – gview
                                    Sep 20 '11 at 22:57























                                  0














                                  Sorry to commit necromancy on this post, but in my experience the "Raw Read Error Rate" and "Hardware ECC Recovered" fields for a Seagate drive will quite literally go all over the place, incrementing constantly into the trillions range, at which point they cycle back around to zero and start the process again. I have a Seagate ST9750420AS that has behaved that way since day one and still works great after quite a few years and 3500+ hours of use.



                                  I think those fields can be safely ignored if you're running one of these drives. Just make sure the two fields report the same number and stay in sync. If they don't... well, that might actually mean a problem.
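
                                  A rough way to check that "in sync" condition is to pull the RAW_VALUE column for both attributes out of the attribute table and compare them. The Python sketch below assumes the usual ten-column layout printed by smartctl -A with purely numeric raw values; the function names raw_values and in_sync are mine, and the sample table is made-up illustration data, not output from a real drive.

```python
def raw_values(smartctl_output: str) -> dict:
    """Map attribute name -> RAW_VALUE for rows of `smartctl -A` style output."""
    result = {}
    for line in smartctl_output.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least ten columns;
        # rows whose raw value is not a plain number are skipped.
        if len(fields) >= 10 and fields[0].isdigit() and fields[9].isdigit():
            result[fields[1]] = int(fields[9])
    return result


def in_sync(smartctl_output: str,
            first: str = "Raw_Read_Error_Rate",
            second: str = "Hardware_ECC_Recovered") -> bool:
    """True if both attributes are present and report the same raw value."""
    values = raw_values(smartctl_output)
    return first in values and second in values and values[first] == values[second]


# Made-up sample rows, for illustration only
sample = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   117   099   006    Pre-fail  Always       -       123456789
195 Hardware_ECC_Recovered  0x001a   053   045   000    Old_age   Always       -       123456789
"""

print(in_sync(sample))
```

                                  You would feed it the text captured from smartctl -A for your own device. Whether the two counters are actually meant to track each other on a given model is something I only know from my own drive.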
















                                      answered May 15 at 20:05









                                      Ryan Gandy



























