Last Updated: 2012-11-16 10:03:54
Over the weekend our main web/email system encountered an issue when updating its Linux kernel with a security patch. It's a little tricky working on a hard drive in a SPARC architecture. Fortunately, I have a special PC that can read/write SCSI drives for just such a task. I was able to locate a PCI SCSI controller that has BIOS extensions in it, so functionally the SCSI disk appears as a normal DOS drive. I was able to reset the kernel files and was back up and running in less than half an hour.
Since the systems were down, I decided it was a good time to do a file system check on the drive and some other maintenance work. All went well, so I decided to work on the firewall as well. That's when the trouble started. A fully functional drive that had been running continuously for years was now showing massive numbers of read/write and general I/O errors. On a whim I decided to use a copy of SpinRite on it. Keep in mind this is a 10+ year old SCSI disk.
SpinRite's claim to fame is fixing floppies and ATA/SATA drives; I had never heard a mention of SCSI. SpinRite didn't miss a beat, and after an hour and a half it had managed to slog through about 8 megabytes of corrupted data. The rest of the drive turned out fine, and it completed the remainder of the 4 gigabytes in about 20 minutes. Although it didn't actually recover the bad sector, it got enough of it for the file system to be happy. The drive most likely deallocated the bad sector and moved on, thus eliminating the errors. I only had to replace one file that had gotten corrupted.
Take it from this software developer: this is an amazing piece of software craftsmanship that much of the industry should look to for inspiration. Rarely do you find such a gem that not only balances but excels in reliability, performance and simplicity.
Nice work, Steve!