
Date:  Thu, 26 Feb 2009 15:35:58 -0500
From:  Abdul Rashid Abdullah <webmaster (at mark) muntada.com>
Subject:  [coba-e:15148] Re: Hard Drive Failure
To:  "coba-e (at mark) bluequartz.org" <coba-e (at mark) bluequartz.org>
Message-Id:  <C5CC675E.7145%webmaster (at mark) muntada.com>
In-Reply-To:  <C5CC634B.7141%webmaster (at mark) muntada.com>
X-Mail-Count: 15148

I have gone ahead and submitted an Advance RMA request, since the drive is
still under manufacturer warranty.  The data center is nearby, but I would
rather not reboot until I am there.

Gerald, it looks like sda is the bad drive.  Physically, which drive am I
looking at when I open the Super Micro system?   It was the last purchase I
made from you.  I hardly get time to spend with these systems before I put
them into the data center.
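
For reference, the reading that sda is the failed member comes from the (F)
flags in the /proc/mdstat output quoted below.  A minimal shell sketch of
that check follows; the helper name list_failed_members is made up for
illustration and is not part of mdadm or any other tool:

```shell
# list_failed_members: hypothetical helper, not part of mdadm.
# Reads /proc/mdstat-style text on stdin and prints "array member"
# for every member device flagged (F).
list_failed_members() {
    awk '
        # Array lines look like: "md3 : active raid1 sdb3[1] sda3[2](F)"
        $2 == ":" {
            for (i = 3; i <= NF; i++) {
                if ($i ~ /\(F\)$/) {
                    member = $i
                    sub(/\[.*/, "", member)   # strip the "[2](F)" suffix
                    print $1, member
                }
            }
        }
    '
}

# Usage on a live system:
#   list_failed_members < /proc/mdstat
```

If every flagged member sits on the same disk (sda here), that disk is the
one to replace.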

-Rashid


On 2/26/09 3:18 PM, "Abdul Rashid Abdullah" <webmaster (at mark) muntada.com> wrote:

> Gerald,
> 
> Thanks for the response.  Good info.
> 
> [root@juhfah ~]# smartctl -i -d ata /dev/sda
> smartctl version 5.33 [i686-redhat-linux-gnu] Copyright (C) 2002-4 Bruce Allen
> Home page is http://smartmontools.sourceforge.net/
> 
> Smartctl: Device Read Identity Failed (not an ATA/ATAPI device)
> 
> A mandatory SMART command failed: exiting. To continue, add one or more '-T permissive' options.
> [root@juhfah ~]# smartctl -i -d ata /dev/sdb
> smartctl version 5.33 [i686-redhat-linux-gnu] Copyright (C) 2002-4 Bruce Allen
> Home page is http://smartmontools.sourceforge.net/
> 
> === START OF INFORMATION SECTION ===
> Device Model:     WDC WD5000KS-00MNB0
> Serial Number:    WD-WCANU1500276
> Firmware Version: 07.02E07
> User Capacity:    500,107,862,016 bytes
> Device is:        Not in smartctl database [for details use: -P showall]
> ATA Version is:   7
> ATA Standard is:  Exact ATA specification draft version not indicated
> Local Time is:    Thu Feb 26 12:17:58 2009 PST
> SMART support is: Available - device has SMART capability.
> SMART support is: Enabled
> 
> 
> -Rashid
> 
> 
> On 2/26/09 8:39 AM, "Gerald Waugh" <gwaugh (at mark) frontstreetnetworks.com> wrote:
> 
>> Abdul Rashid Abdullah wrote:
>> 
>>> I have a RAID 1 on my system.  I have a hard drive failure:
>>> 
>>> cat /proc/mdstat
>>> Personalities : [raid1]
>>> md6 : active raid1 sdb1[1] sda1[0]
>>>      104320 blocks [2/2] [UU]
>>> 
>>> md3 : active raid1 sdb3[1] sda3[2](F)
>>>      4192896 blocks [2/1] [_U]
>>> 
>>> md5 : active raid1 sdb5[1] sda5[2](F)
>>>      1052160 blocks [2/1] [_U]
>>> 
>>> md2 : active raid1 sdb6[1] sda6[2](F)
>>>      1052160 blocks [2/1] [_U]
>>> 
>>> md4 : active raid1 sdb7[1] sda7[2](F)
>>>      475692544 blocks [2/1] [_U]
>>> 
>>> md1 : active raid1 sdb2[1] sda2[2](F)
>>>      6289344 blocks [2/1] [_U]
>>> 
>>> 
>>> How do I know which specific drive failed so that I can replace it
>>> correctly, and what best practices should I follow when replacing it?
>>> 
>> 
>> To get the hard drive info on SATA drives,
>> execute "smartctl -i -d ata /dev/sdx"
>> or "smartctl -i -d ata /dev/sdx | grep Serial",
>> where 'x' is 'a' or 'b'.
>> 
>> Pay attention to the serial number in the smartctl output.
>> It will help you find the correct physical drive.
>> After you change out the bad drive, the server should boot from the
>> current good drive.
>> You can change the boot drive in the BIOS.
>> The replacement drive should be clean, or at least have no partition info.
>> This prevents the server from booting from the replacement drive and
>> overwriting the good drive.
>> 
>> Gerald
>> 
>> 
> 
>
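
Gerald's serial-number tip can be wrapped in a small shell helper.  This is
only a sketch: the name serial_of is made up for illustration, and it
assumes the "Serial Number:" line format shown in the smartctl output
above.

```shell
# serial_of: hypothetical helper; reads `smartctl -i` output on stdin
# and prints the value of the "Serial Number:" line, if there is one.
serial_of() {
    sed -n 's/^Serial Number:[[:space:]]*//p'
}

# Usage on a live system (device name as in this thread):
#   smartctl -i -d ata /dev/sdb | serial_of
```

Since sda no longer answers the identify command, the helper prints nothing
for it.  On a two-drive system, the drive whose label does not carry sdb's
serial number would then be the one to pull.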