Is this drive slowly dying?

Haole Boy · Jul 19, 2018

I'm back with yet another question about possible disk failure. Customer called saying the machine was running slow, so I set up an appointment, went over, and the machine seemed normal to me. Did my regular cleanup / update routine and found the following in the SMART output (full gSmartCtl output available here):

ID#  ATTRIBUTE_NAME           FLAGS   VALUE  WORST  THRESH  FAIL  RAW_VALUE
  5  Reallocated_Sector_Ct    PO--CK  200    200    140     -     0
194  Temperature_Celsius      -O---K  094    084    000     -     53
196  Reallocated_Event_Count  -O--CK  200    200    000     -     0
197  Current_Pending_Sector   -O--CK  200    200    000     -     4
198  Offline_Uncorrectable    ----CK  200    200    000     -     0

But the system seemed to run normally at that point. One month later I got another call saying it was taking 10 minutes to log on. When I got there, the system was running normally. I'm *assuming* that this was the result of Windows trying to read a bad sector and it eventually got remapped to a good sector. Here's the SMART output from that visit:

ID#  ATTRIBUTE_NAME           FLAGS   VALUE  WORST  THRESH  FAIL  RAW_VALUE
  5  Reallocated_Sector_Ct    PO--CK  200    200    140     -     0
194  Temperature_Celsius      -O---K  090    084    000     -     57
196  Reallocated_Event_Count  -O--CK  200    200    000     -     0
197  Current_Pending_Sector   -O--CK  200    200    000     -     4
198  Offline_Uncorrectable    ----CK  200    200    000     -     4

So... the Current_Pending_Sector count remained the same, but the Offline Uncorrectable now shows 4 bad sectors. Does this mean that there are now 8 bad sectors - i.e.: the drive is deteriorating and should be replaced? Or are these the same 4 sectors showing up in both categories? And if this really was a bad sector being remapped, why is the Reallocated_Sector_Ct still 0?
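
For anyone who wants to compare the two snapshots mechanically, here is a minimal Python sketch. It assumes rows shaped exactly like the ones pasted above (eight whitespace-separated columns, raw value last). The helper names `parse_smart` and `diff_raw` are made up for illustration and are not part of smartctl or any GUI front end.

```python
def parse_smart(text):
    """Map attribute name -> raw value for rows shaped like the ones in this post."""
    raw = {}
    for line in text.strip().splitlines():
        parts = line.split()
        # Expected columns: ID NAME FLAGS VALUE WORST THRESH FAIL RAW_VALUE
        if len(parts) < 8 or not parts[0].isdigit():
            continue  # skip the header row and anything unexpected
        raw[parts[1]] = int(parts[7])
    return raw


def diff_raw(before, after):
    """Return {name: (old, new)} for attributes whose raw value changed."""
    return {
        name: (before[name], after[name])
        for name in before
        if name in after and before[name] != after[name]
    }


FIRST_VISIT = """
ID# ATTRIBUTE_NAME FLAGS VALUE WORST THRESH FAIL RAW_VALUE
5 Reallocated_Sector_Ct PO--CK 200 200 140 - 0
194 Temperature_Celsius -O---K 094 084 000 - 53
196 Reallocated_Event_Count -O--CK 200 200 000 - 0
197 Current_Pending_Sector -O--CK 200 200 000 - 4
198 Offline_Uncorrectable ----CK 200 200 000 - 0
"""

SECOND_VISIT = """
ID# ATTRIBUTE_NAME FLAGS VALUE WORST THRESH FAIL RAW_VALUE
5 Reallocated_Sector_Ct PO--CK 200 200 140 - 0
194 Temperature_Celsius -O---K 090 084 000 - 57
196 Reallocated_Event_Count -O--CK 200 200 000 - 0
197 Current_Pending_Sector -O--CK 200 200 000 - 4
198 Offline_Uncorrectable ----CK 200 200 000 - 4
"""

changes = diff_raw(parse_smart(FIRST_VISIT), parse_smart(SECOND_VISIT))
for name, (old, new) in changes.items():
    print(f"{name}: {old} -> {new}")
```

On these two snapshots only the temperature and Offline_Uncorrectable raw values differ. The diff says what changed, not whether the 4 sectors in each counter are the same ones.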

Also, should I be concerned about the temperature? (57 Celsius). This is an HP all in one.

Mahalo for your assistance,

Harry Z

coffee · Jul 19, 2018

This could have been taken care of in one service call instead of multiple ones. You should have brought a bootable Linux CD (Linux Mint MATE) and booted from it. Then fired up "Disks" and checked the drive. As soon as you see anything wrong with it, it's time to back up the data and replace the drive.

Haole Boy · Jul 19, 2018

I can't remember if I booted my Linux USB drive to run gSmartCtl or I just ran it from Windows. From reading this Forum for many years, there does not seem to be universal agreement on how many bad sectors there have to be before you replace the drive. Some folks do it with 1 bad sector, others have a tolerance of 10 or 20 bad sectors. Considering who the customer is (elderly, money would be a challenge, occasional use of the PC) I decided to monitor the situation. I realize other folks would have replaced the drive at the first service call.

In this instance, the SMART indicators have changed and I want to understand more about this before recommending replacing the drive.

Mahalo for your feedback,

Harry Z

MichaelBits · Jul 19, 2018

You might try giving this a read:
https://www.backblaze.com/blog/what-smart-stats-indicate-hard-drive-failures/

Also some more basic info:
https://en.wikipedia.org/wiki/S.M.A.R.T.

I had a copy of all the SMART attributes and meanings somewhere... but the wiki above has most of the same info.

coffee · Jul 19, 2018

Haole Boy said:
does not seem to be universal agreement on how many bad sectors there has to be before you replace the drive

One is enough for me. Where there is one, the chances of more are pretty good. Just replace the drive. Otherwise you run the risk of the client coming back soon.

Porthos · Jul 19, 2018

REPLACE the darn drive, it is toast. Stop trying to save/fix the drive. Get it backed up NOW. Don't risk client data.

Markverhyden · Jul 19, 2018

How many bad blocks? 1 in my book. It's not like drives are rare or extremely expensive. Where there is smoke there's certainly going to be some fire. Just because you find 1 or 5 does not mean that is all there is. Personally I don't even need SMART, just a poorly performing computer is enough.

Porthos · Jul 19, 2018

If I see any errors on gsmart or crystal disk I REPLACE the drive. No Exceptions.

backwoodsman · Jul 19, 2018

Haole Boy said:
197 Current_Pending_Sector -O--CK 200 200 000 - 4

If it were my own drive, I'd run a badblocks write test to try to get the drive to reallocate those sectors. If it does, the drive can stay in service; or, if it's a failing drive, you'll end up with a lot more unreadable sectors. But for a customer who's paying by the hour, it's not worth doing that; the testing takes time, and the OS has to be reinstalled anyway, so better to just go to a new drive.

198 Offline_Uncorrectable ----CK 200 200 000 - 4

Usually nothing to worry about; the unreadable sectors are the real problem.

And if this really was a bad sector being remapped, why is the Reallocated_Sector_Ct still 0?

The drive only remaps sectors on a write operation, which will never happen for sectors occupied by stuff that doesn't change. Seems like a bad choice not to have repeated read failures trigger a remap, but what do I know?

Also, should I be concerned about the temperature? (57 Celsius). This is an HP all in one.

That's higher than I like to see, but there may not be anything you can do about it in an AIO.

Alexey · Jul 19, 2018

Haole Boy said:
And if this really was a bad sector being remapped, why is the Reallocated_Sector_Ct still 0?

The sector is not being remapped. It is pending a rewrite attempt. The data cannot be read, so there is no point in remapping the sector. There is nothing to put into a spare sector anyway. The write, if it ever happens, will be verified. If the write succeeds (clearing a transient error) there will be no remap and Current Pending Sector Count will decrease. If the write fails, the sector will be remapped, Current Pending Sector Count will decrease, and Reallocated Sector Count will increase. Until there is a write attempt to this sector, nothing is going to change.
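
The bookkeeping described above can be sketched as a toy model in Python. `ToyDrive` and its methods are hypothetical names for illustration only. Real firmware is far more involved, and the `readable` and `verify_ok` arguments simply stand in for what the hardware would find.

```python
class ToyDrive:
    """Tracks only the two counters discussed in this thread."""

    def __init__(self):
        self.pending = set()      # LBAs that failed a read and await a write
        self.reallocated = set()  # LBAs remapped to a spare sector

    def read(self, lba, readable):
        """A failed read marks the sector pending; it is NOT remapped."""
        if not readable:
            self.pending.add(lba)
        return readable

    def write(self, lba, verify_ok):
        """A write to a pending sector resolves it one way or the other."""
        if lba not in self.pending:
            return
        self.pending.discard(lba)
        if not verify_ok:
            self.reallocated.add(lba)  # write failed verification: remap

    def counters(self):
        return {
            "Current_Pending_Sector": len(self.pending),
            "Reallocated_Sector_Ct": len(self.reallocated),
        }


drive = ToyDrive()
for lba in (100, 101, 102, 103):      # four unreadable sectors
    drive.read(lba, readable=False)
after_reads = drive.counters()

drive.write(100, verify_ok=True)      # transient error: pending drops, no remap
drive.write(101, verify_ok=False)     # genuinely bad: pending drops, remap
after_writes = drive.counters()

print(after_reads)
print(after_writes)
```

In this model, four failed reads leave the pending count at 4 and the reallocated count at 0, which matches the pattern in the OP's output. Only the writes move either counter.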

Porthos · Jul 19, 2018

backwoodsman said:
That's higher than I like to see, but there may not be anything you can do about it in an AIO.

True.
I understand getting into some AIOs to replace the drive is a pain as well.

Alexey · Jul 19, 2018

backwoodsman said:
Seems like a bad choice not to have repeated read failures trigger a remap

This is kind of a difficult question. Let's say the sector does not read. After three attempts we remap it. What data do we put into the remapped sector? The original can't be read (or there wouldn't be a problem). Do we put in zeros? Let's say we do, and now whatever software requests the sector is going to get a bunch of zeros instead of the original data, without any notification that there was a data loss, because the new sector reads OK; it's just that the data is gone. We can work around that and put some tag onto a replacement sector which says "if that sector is requested, return a read error" and then clear this flag after the first write, but that kind of negates the whole point of remapping.

backwoodsman · Jul 19, 2018

Alexey said:
This is kind of a difficult question. Let's say the sector does not read. After three attempts we remap it. What data do we put into the remapped sector? The original can't be read (or there wouldn't be a problem). Do we put in zeros? Let's say we do, and now whatever software requests the sector is going to get a bunch of zeros instead of the original data, without any notification that there was a data loss, because the new sector reads OK; it's just that the data is gone. We can work around that and put some tag onto a replacement sector which says "if that sector is requested, return a read error" and then clear this flag after the first write, but that kind of negates the whole point of remapping.

Makes sense when you explain it like that. I was just thinking from the standpoint of having repeated time-consuming read failures on a known-bad sector.

coffee · Jul 19, 2018

If this was a server RAID set reporting bad sectors on two drives, would you rush out and replace the drives, or would you just go "Ah, it's only a few bad sectors"?

Do your customer a favor and replace the drive and be done with it.

fincoder · Jul 20, 2018

Haole Boy said:
Did my regular cleanup / update routine and found the following in the SMART output

I use Crystal Disk Info and that program interprets the SMART statistics for you, and does it very well. If it says CAUTION or BAD then I replace the drive. If it says GOOD but I still suspect drive issues I do a full surface scan diagnostic (e.g. Seatools long test).
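
Roughly the kind of rule I mean, as a Python sketch. This is my own assumption of a reasonable rule, not CrystalDiskInfo's actual logic; `classify` and `WATCHED` are made-up names, and the input is just the VALUE, THRESH and RAW_VALUE columns from the OP's second visit.

```python
# Attributes where any nonzero raw count is treated as a warning sign (assumed rule).
WATCHED = (
    "Reallocated_Sector_Ct",
    "Current_Pending_Sector",
    "Offline_Uncorrectable",
)


def classify(attrs):
    """attrs maps attribute name -> (value, thresh, raw)."""
    # BAD: a normalized value has fallen to or below a nonzero threshold.
    for value, thresh, _raw in attrs.values():
        if thresh > 0 and value <= thresh:
            return "BAD"
    # CAUTION: any watched sector counter has a nonzero raw value.
    for name in WATCHED:
        if name in attrs and attrs[name][2] > 0:
            return "CAUTION"
    return "GOOD"


second_visit = {
    "Reallocated_Sector_Ct": (200, 140, 0),
    "Temperature_Celsius": (90, 0, 57),
    "Reallocated_Event_Count": (200, 0, 0),
    "Current_Pending_Sector": (200, 0, 4),
    "Offline_Uncorrectable": (200, 0, 4),
}

status = classify(second_visit)
print(status)
```

Under that assumed rule the OP's drive lands on CAUTION because of the pending and uncorrectable counts, even though no normalized value has crossed its threshold.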

YeOldeStonecat · Jul 20, 2018

On a "slightly old" to an "older" rig...if it's running slow, and if it has a spindle drive, (especially a WD Blue spindle)...I don't even waste time or consume a few seconds of brain cells, simply and easily/quickly clone to an SSD.

Before SSD's...I just cloned to a new better spindle.

Rotating hard drives are the one single part of a computer that is guaranteed to degrade in performance over time. They "slow down and lose performance" as they accumulate hours of use.

Don't waste time thinking about it or questioning it. Just clone to new..and make that new one an SSD. Your client will LOVE you for it, they'll love the 12 second bootups and fast program launches....and long life span.

Larry Sabo · Jul 20, 2018

Haole Boy said:
Customer called saying machine was running slow, so I set up an appointment went over and the machine seemed normal to me. Did my regular cleanup / update routine

One of the most common causes of a slow PC is a failing hard drive, so checking SMART is one of the first things I do. Tuning up the PC might just drive it over the edge and ruin any chance to back up the user data, so it would come last, not first.

Markverhyden · Jul 20, 2018

YeOldeStonecat said:
Rotating hard drives are the one single part of a computer that is guaranteed to degrade in performance over time. They "slow down and lose performance" as they accumulate hours of use.

That's the number one rule of troubleshooting failures. The parts that move the most fail the most. I do work for several companies that have interactive kiosks such as building directories, information displays, deposit takers, coupon dispensers, etc. Those types mostly run off of what's in RAM so the HD doesn't get much activity. In those situations motherboard/power supply failures outnumber HD failures like 5 to 1. With the drop in SSD prices many have been moving over to SSDs. For static environments like that 32 GB is fine since most use the Embedded versions of Windoze.

NJW · Jul 20, 2018

YeOldeStonecat said:
Just clone to new

Not my choice – the disk is already saying that it can't read some sectors, so it isn't going to provide a useful clone. You'll just get the same data corruption but unreported. Cue unexplainable software problems coming up.

YeOldeStonecat · Jul 20, 2018

NJW said:
Not my choice – the disk is already saying that it can't read some sectors, so it isn't going to provide a useful clone. You'll just get the same data corruption but unreported. Cue unexplainable software problems coming up.

Based on thousands and thousands of clones...very often you can get right through it and we have crazy numbers of those cloned drives working great in service as I type this. The OP stated the client's computer is running slow. He didn't state it's constantly blue screening or locking up or has Event Viewer with an application log filled with red marks about corrupted programs. Very likely those couple of bad sectors on the source drives are simply marked and not used by the drive. Good clone hardware/software gets right through it. You'd be amazed at how successful even cloning a drive from a computer that was at the point of locking up or blue screening is.

We're mostly MSP, and only biz clients, so when we fix something and it comes back to bite us..that's a loss for us, so we avoid solutions that bring repeat calls for the same problem. If cloning failing drives had a high risk of "cuing unexplainable..problems" we'd have stopped doing this decades ago.
