Thursday, June 18, 2020
MegaCLI To Replace Failed Drive In Hardware RAID Disk Array
Hardware RAID is a form of RAID in which the processing is done by dedicated hardware, the RAID card.
Here we are dealing with a MegaRAID controller, and I will show you how to replace a faulty disk on it without data loss.
The MegaCLI utility needs to be installed on the server first.
First, we need to find out which disk has failed by checking the status of the disks with the following command:
MegaCli -PDList -aALL | grep "Firmware state"
Output
Firmware state: Online
Firmware state: Offline
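On a controller with many disks, a per-state tally is quicker to read than a long list of lines. The following is a minimal sketch, not part of MegaCLI itself: count_states is a name made up for this example, and it assumes the "Firmware state: <value>" line format shown above.

```shell
# Tally physical drives by firmware state.
# Reads `MegaCli -PDList -aALL` output on stdin, so it can also be tried on saved output.
# count_states is a hypothetical helper name, not a MegaCli feature.
count_states() {
    awk -F': *' '/^Firmware state/ { n[$2]++ } END { for (s in n) print n[s], s }' |
        LC_ALL=C sort -k2
}
```

Typical use would be: MegaCli -PDList -aALL | count_states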
The Offline state in the output shows that one of the disks is faulty and needs to be replaced. Before physically removing the disk, we need to carry out three steps on the RAID controller using the MegaCLI utility:
Take the disk offline
Mark the disk as missing
Prepare the disk for removal
To perform these three steps we need some details about the faulty disk: the Enclosure ID and the slot number. Together, these two values identify the disk.
We can get these values using the following command:
MegaCli -PDList -aALL
Output
Adapter #0
Enclosure Device ID: 252
Slot Number: 0
Device Id: 4
Sequence Number: 2
Media Error Count: 0
Other Error Count: 0
Predictive Failure Count: 0
Last Predictive Failure Event Seq Number: 0
Raw Size: 238475MB [0x1d1c5970 Sectors]
Non Coerced Size: 237963MB [0x1d0c5970 Sectors]
Coerced Size: 237464MB [0x1cfcc000 Sectors]
Firmware state: Online
SAS Address(0): 0xb221c046788723f
Connected Port Number: 0(path0)
Inquiry Data: ATA ST3250620AS K 6QE1DRKL
Adapter #0
Enclosure Device ID: 252
Slot Number: 1
Device Id: 5
Sequence Number: 2
Media Error Count: 0
Other Error Count: 1
Predictive Failure Count: 0
Last Predictive Failure Event Seq Number: 0
Raw Size: 238475MB [0x1d1c5970 Sectors]
Non Coerced Size: 237963MB [0x1d0c5970 Sectors]
Coerced Size: 237464MB [0x1cfcc000 Sectors]
Firmware state: Offline
SAS Address(0): 0xb221c046788723f
Connected Port Number: 0(path0)
Inquiry Data: ATA ST3250620AS K 6QE1DRKL
The output above lists every disk connected to the RAID controller, one block of details per disk. The faulty disk is the one whose Firmware state is Offline.
Once you have found the faulty disk, note its Enclosure Device ID and Slot Number (252 and 1 in this example), then perform the following steps.
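Scanning the full listing by eye is error-prone, so the lookup can be scripted. This is a rough sketch under two assumptions: find_bad_drives is a hypothetical helper name, and each disk block lists Enclosure Device ID and Slot Number before Firmware state, as in the sample output above.

```shell
# Print "[E:S] state" for every drive whose firmware state does not start with "Online".
# Reads `MegaCli -PDList -aALL` output on stdin.
# Assumption: Enclosure Device ID and Slot Number precede Firmware state in each block.
find_bad_drives() {
    awk -F': *' '
        /^Enclosure Device ID/ { enc = $2 }
        /^Slot Number/         { slot = $2 }
        /^Firmware state/ && $2 !~ /^Online/ { printf "[%s:%s] %s\n", enc, slot, $2 }
    '
}
```

Run as: MegaCli -PDList -aALL | find_bad_drives. Note that any state other than Online is reported, so hot spares and unconfigured disks will show up too; check the state column before acting on a result.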
Mark offline
MegaCli -PDOffline -PhysDrv [252:1] -aALL
Mark missing
MegaCli -PDMarkMissing -PhysDrv [252:1] -aALL
Prepare for removal
MegaCli -PdPrpRmv -PhysDrv [252:1] -aALL
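Because the three commands must run in this order against the same disk, it can help to wrap them in one function with a dry-run switch. This is only a sketch: prepare_drive_removal, MEGACLI and DRY_RUN are names invented here, the binary path is the install location mentioned in this post, and the adapter defaults to 0 rather than ALL as a cautious assumption.

```shell
# Hypothetical helper: run the three pre-removal steps for one drive, in order,
# stopping at the first failure. MEGACLI and DRY_RUN are names chosen for this
# sketch, not MegaCli features.
MEGACLI=${MEGACLI:-/opt/MegaRAID/MegaCli/MegaCli64}

prepare_drive_removal() {
    enc=$1
    slot=$2
    adapter=${3:-0}          # assumption: single controller, adapter 0
    drv="[$enc:$slot]"
    for step in -PDOffline -PDMarkMissing -PdPrpRmv; do
        if [ "${DRY_RUN:-0}" = 1 ]; then
            # Dry run: only print the command that would be executed.
            echo "$MEGACLI $step -PhysDrv $drv -a$adapter"
        else
            "$MEGACLI" "$step" -PhysDrv "$drv" "-a$adapter" || return 1
        fi
    done
}
```

Setting DRY_RUN=1 before calling prepare_drive_removal 252 1 prints the three commands without touching the controller, which is a cheap way to double-check the enclosure and slot before running it for real.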
Now we can remove the disk from the server.
Before replacing a drive, make sure that the replacement disk is of the same type as the failed drive and that its capacity is equal to or larger than that of the failed drive.
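The capacity check can be done on the sector counts that MegaCLI prints in brackets on the size lines. The sketch below assumes the "[0x... Sectors]" format shown in the output above; sectors_of and replacement_is_big_enough are hypothetical names, and the comparison should be made between like-for-like lines (for example, Coerced Size against Coerced Size).

```shell
# Extract the hexadecimal sector count from a size line, e.g.
# "Coerced Size: 237464MB [0x1cfcc000 Sectors]" -> 0x1cfcc000
sectors_of() {
    printf '%s\n' "$1" | sed -n 's/.*\[\(0x[0-9a-fA-F]*\) Sectors].*/\1/p'
}

# Succeeds (exit status 0) when the second size line has at least as many
# sectors as the first. Fails if either line cannot be parsed.
replacement_is_big_enough() {
    old=$(sectors_of "$1")
    new=$(sectors_of "$2")
    [ -n "$old" ] && [ -n "$new" ] && [ $(($new)) -ge $(($old)) ]
}
```

The first argument is the size line of the failed drive and the second is the size line of the replacement, both copied from MegaCli -PDList output.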
Most current hardware RAID controllers support hot-pluggable drives, so we don't need to power down the server to replace the disk: you can remove the faulty drive while the server is running and attach the new one. Array activity will pause for 1 to 2 seconds while the new drive initializes; once that completes, the rebuild starts automatically.
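Once the rebuild is running, its progress can be queried with MegaCli -PDRbld -ShowProg -PhysDrv [252:1] -a0. The sketch below pulls the percentage out of that output for use in a monitoring script. It assumes the progress line contains text like "Completed 45%", which may vary between MegaCLI versions; rebuild_percent is a made-up helper name.

```shell
# Extract the rebuild percentage from `MegaCli -PDRbld -ShowProg ...` output on stdin.
# Assumption: the progress line contains "Completed NN%"; prints nothing if it does not.
rebuild_percent() {
    sed -n 's/.*Completed \([0-9][0-9]*\)%.*/\1/p'
}
```

For example: MegaCli -PDRbld -ShowProg -PhysDrv [252:1] -a0 | rebuild_percent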
If you have any questions about replacing a failed drive in a hardware RAID disk array using MegaCLI, feel free to contact us.
Command to install MegaCli using rpm
rpm -ivh MegaCli-8.07.14-1.noarch.rpm
Command location (will be installed to /opt/MegaRAID/MegaCli)
/opt/MegaRAID/MegaCli/MegaCli64
Make an alias for easier use
alias megacli='/opt/MegaRAID/MegaCli/MegaCli64'
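Aliases are normally not expanded in non-interactive shell scripts, so for scripted use a small function is a safer equivalent. This is a sketch: MEGACLI_BIN is a variable name invented here so the path can be overridden, and the default is the install location given above.

```shell
# Function equivalent of the alias; unlike an alias, it also works inside scripts.
# MEGACLI_BIN is a hypothetical override variable, not something MegaCli reads.
megacli() {
    "${MEGACLI_BIN:-/opt/MegaRAID/MegaCli/MegaCli64}" "$@"
}
```

Use this instead of the alias rather than alongside it: if an alias called megacli is already defined in an interactive shell, run unalias megacli before defining the function.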
Extra tools
MegaCli does not provide all the information we need in readable form, such as the mapping to Linux devices and the RAID level, so we are going to use some extra tools.
yum install sg3_utils
Useful Commands for Megacli
Controller information
megacli -AdpAllInfo -aALL
megacli -CfgDsply -aALL
megacli -AdpEventLog -GetEvents -f events.log -aALL && cat events.log
Enclosure information
megacli -EncInfo -aALL
Virtual drive information
megacli -LDInfo -Lall -aALL
Physical drive information
megacli -PDList -aALL
megacli -PDInfo -PhysDrv [E:S] -aALL
Battery backup information
megacli -AdpBbuCmd -aALL
Controller management
Silence active alarm
megacli -AdpSetProp AlarmSilence -aALL
Disable alarm
megacli -AdpSetProp AlarmDsbl -aALL
Enable alarm
megacli -AdpSetProp AlarmEnbl -aALL
To see information about the patrol read state and the delay between patrol read runs
megacli -AdpPR -Info -aALL
To find out the current patrol read rate
megacli -AdpGetProp PatrolReadRate -aALL
To reduce patrol read resource usage to 2% in order to minimize the performance impact
megacli -AdpSetProp PatrolReadRate 2 -aALL
To disable automatic patrol read
megacli -AdpPR -Dsbl -aALL
To start a manual patrol read scan
megacli -AdpPR -Start -aALL
To stop a patrol read scan
megacli -AdpPR -Stop -aALL
If your system is not connected to a UPS, you should disable the physical disk cache in order to prevent data loss.
megacli -LDSetProp -DisDskCache -LAll -aALL
To enable it (only do this if you have a UPS and redundant power supplies)
megacli -LDSetProp -EnDskCache -LAll -aALL
Detail about disks
megacli -ShowSummary -aALL
To check for warning and fatal events since the last reboot (patrol read warnings appear here)
megacli -AdpEventLog -GetSinceReboot -warning -fatal -a0
Command to check the RAID level (RAID 0, RAID 1, or RAID 10)
[root@testsrc]# megacli -LDInfo -L0 -a0 | grep -i raid
Name :Raid10
RAID Level : Primary-1, Secondary-0, RAID Level Qualifier-0
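The RAID Level line is not very readable on its own, and a Primary-1 value can mean either RAID 1 or RAID 10. The sketch below turns it into a short label. It rests on assumptions that should be checked against your controller: that the Primary number is the base RAID level, and that RAID 10 appears either as Secondary-3 or as Primary-1 with a "Span Depth" greater than 1 in the full -LDInfo output. raid_label is a hypothetical helper name.

```shell
# Translate the "RAID Level" line (plus "Span Depth", when present) into a short label.
# Reads full `megacli -LDInfo -L0 -a0` output on stdin (do not pre-filter with grep,
# or the Span Depth line is lost).
# Assumption: Primary-N is the base level; Primary-1 with Secondary-3 or Span Depth > 1 is RAID 10.
raid_label() {
    awk '
        /^RAID Level/ {
            if (match($0, /Primary-[0-9]+/)) {
                primary = substr($0, RSTART + 8, RLENGTH - 8) + 0
                found = 1
            }
            if (match($0, /Secondary-[0-9]+/))
                secondary = substr($0, RSTART + 10, RLENGTH - 10) + 0
        }
        /^Span Depth/ { n = split($0, parts, /: */); depth = parts[n] + 0 }
        END {
            if (!found)
                print "unknown"
            else if (primary == 1 && (secondary == 3 || depth > 1))
                print "RAID 10"
            else
                print "RAID " primary
        }
    '
}
```

Run as: megacli -LDInfo -L0 -a0 | raid_label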