Thursday, June 18, 2020

MegaCLI To Replace Failed Drive In Hardware RAID Disk Array

Hardware RAID is a form of RAID in which the processing is done by dedicated hardware, the RAID card. Here we are dealing with a MegaRAID controller, and I will show you how to replace a faulty disk on it without data loss. The MegaCLI utility needs to be installed on the server first.

First of all, we need to find out which disk has failed by checking the status of each disk:

    MegaCli -PDList -aALL | grep "Firmware state"

Output

    Firmware state: Online
    Firmware state: Offline

From the output we can see that one of the disks is faulty and needs to be replaced. Before physically removing it, we need to perform three steps on the RAID controller using the MegaCLI utility:

1. Take the disk offline
2. Mark the disk as missing
3. Prepare the disk for removal

To perform these three steps we need the Enclosure ID and slot number of the faulty disk; the combination of these two values identifies the disk.
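The status check above can be condensed into a per-state count, which is easier to read on controllers with many drives. This is a minimal sketch: the function name summarize_firmware_states is made up for illustration, and it assumes the "Firmware state: ..." line format shown in the sample output.

```shell
#!/bin/sh
# Hypothetical helper: reads `MegaCli -PDList -aALL` output on stdin and
# prints one "<state>: <count>" line per distinct firmware state.
summarize_firmware_states() {
    awk -F': *' '
        { sub(/[ \t\r]+$/, "") }              # drop trailing whitespace / CR
        /Firmware state/ { count[$2]++ }      # $2 is the text after the colon
        END { for (s in count) print s ": " count[s] }
    ' | sort
}

# Intended use on a server with MegaCli installed:
#   MegaCli -PDList -aALL | summarize_firmware_states
```

Any state other than Online in the summary is a cue to look at the full drive listing.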
We can get these values using the following command:

    MegaCli -PDList -aALL

Output

    Adapter #0

    Enclosure Device ID: 252
    Slot Number: 0
    Device Id: 4
    Sequence Number: 2
    Media Error Count: 0
    Other Error Count: 0
    Predictive Failure Count: 0
    Last Predictive Failure Event Seq Number: 0
    Raw Size: 238475MB [0x1d1c5970 Sectors]
    Non Coerced Size: 237963MB [0x1d0c5970 Sectors]
    Coerced Size: 237464MB [0x1cfcc000 Sectors]
    Firmware state: Online
    SAS Address(0): 0xb221c046788723f
    Connected Port Number: 0(path0)
    Inquiry Data: ATA ST3250620AS K 6QE1DRKL

    Adapter #0

    Enclosure Device ID: 252
    Slot Number: 1
    Device Id: 5
    Sequence Number: 2
    Media Error Count: 0
    Other Error Count: 1
    Predictive Failure Count: 0
    Last Predictive Failure Event Seq Number: 0
    Raw Size: 238475MB [0x1d1c5970 Sectors]
    Non Coerced Size: 237963MB [0x1d0c5970 Sectors]
    Coerced Size: 237464MB [0x1cfcc000 Sectors]
    Firmware state: Offline
    SAS Address(0): 0xb221c046788723f
    Connected Port Number: 0(path0)
    Inquiry Data: ATA ST3250620AS K 6QE1DRKL

The output lists every disk connected to the RAID controller, one block per disk. The faulty disk is the one whose Firmware state is Offline. Note the Enclosure Device ID and Slot Number from that block (252 and 1 in this example), then perform the following steps.

Mark offline

    MegaCli -PDOffline -PhysDrv [252:1] -aALL

Mark missing

    MegaCli -PDMarkMissing -PhysDrv [252:1] -aALL

Mark for removal

    MegaCli -PdPrpRmv -PhysDrv [252:1] -aALL

Now we can remove the disk from the RAID controller. Before replacing the drive, make sure that the replacement disk is of the same type as the failed drive and that its capacity is equal to or larger than that of the failed drive.
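Picking the Enclosure ID and slot number out of a long listing by eye is error-prone, so the lookup above can be sketched as a small filter. This is only an illustration: the function names list_offline_drives and print_removal_commands are hypothetical, only the Offline state from the example is matched, and the second function merely prints the three commands for review rather than running them.

```shell
#!/bin/sh
# Hypothetical helper: reads `MegaCli -PDList -aALL` output on stdin and
# prints "enclosure:slot" for every drive whose firmware state is Offline.
list_offline_drives() {
    awk -F': *' '
        { sub(/[ \t\r]+$/, "") }                  # drop trailing whitespace / CR
        /^Enclosure Device ID/ { enc = $2 }       # remember the current enclosure
        /^Slot Number/         { slot = $2 }      # remember the current slot
        /^Firmware state/ && $2 ~ /^Offline/ { print enc ":" slot }
    '
}

# Hypothetical helper: for each "enclosure:slot" on stdin, print (do not run)
# the offline / mark-missing / prepare-for-removal commands from this post.
print_removal_commands() {
    while IFS= read -r drive; do
        printf 'MegaCli -PDOffline -PhysDrv [%s] -aALL\n' "$drive"
        printf 'MegaCli -PDMarkMissing -PhysDrv [%s] -aALL\n' "$drive"
        printf 'MegaCli -PdPrpRmv -PhysDrv [%s] -aALL\n' "$drive"
    done
}

# Intended use on a server with MegaCli installed:
#   MegaCli -PDList -aALL | list_offline_drives | print_removal_commands
```

Printing the commands instead of executing them keeps a human in the loop before a drive is taken out of the array.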
Most current hardware RAID controllers support hot-pluggable drives, so we don't need to power down the server to replace the disk. You can remove the faulty drive while the server is running and attach the new one. Array activity will pause for 1 to 2 seconds while the new drive initializes; once that completes, the rebuild starts automatically.

Command to install MegaCli using rpm

    rpm -ivh MegaCli-8.07.14-1.noarch.rpm

Command location (will be installed to /opt/MegaRAID/MegaCli)

    /opt/MegaRAID/MegaCli/MegaCli64

Make an alias for easier use

    alias megacli='/opt/MegaRAID/MegaCli/MegaCli64'

Extra tools

MegaCli does not provide all the information we need, such as the mapping to Linux devices and a readable RAID level, so we are going to use some extra tools.

    yum install sg3_utils

Useful Commands for Megacli

Controller information

    megacli -AdpAllInfo -aALL
    megacli -CfgDsply -aALL
    megacli -AdpEventLog -GetEvents -f events.log -aALL && cat events.log

Enclosure information

    megacli -EncInfo -aALL

Virtual drive information

    megacli -LDInfo -Lall -aALL

Physical drive information

    megacli -PDList -aALL
    megacli -PDInfo -PhysDrv [E:S] -aALL

Battery backup information

    megacli -AdpBbuCmd -aALL

Controller management

Silence active alarm

    megacli -AdpSetProp AlarmSilence -aALL

Disable alarm

    megacli -AdpSetProp AlarmDsbl -aALL

Enable alarm

    megacli -AdpSetProp AlarmEnbl -aALL

To see information about the patrol read state and the delay between patrol read runs

    megacli -AdpPR -Info -aALL

To find out the current patrol read rate

    megacli -AdpGetProp PatrolReadRate -aALL

To reduce patrol read resource usage to 2% in order to minimize the performance impact

    megacli -AdpSetProp PatrolReadRate 2 -aALL

To disable automatic patrol read

    megacli -AdpPR -Dsbl -aALL

To start a manual patrol read scan

    megacli -AdpPR -Start -aALL

To stop a patrol read scan

    megacli -AdpPR -Stop -aALL

If your system is not connected to a UPS, you should disable the physical disk cache in order to prevent data loss.

    megacli -LDSetProp -DisDskCache -LAll -aALL

To enable it (only do this if you have a UPS and redundant power supplies)

    megacli -LDSetProp -EnDskCache -LAll -aALL

Details about disks

    megacli -ShowSummary -aALL

To check patrol read warnings

    megacli -AdpEventLog -GetSinceReboot -warning -fatal -a0

Command to check the RAID level (RAID0, RAID1, or RAID10)

    [root@testsrc] # megacli -LDInfo -L0 -a0 | grep -i raid
    Name                :Raid10
    RAID Level          : Primary-1, Secondary-0, RAID Level Qualifier-0
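Shell aliases are generally not expanded inside scripts, so for cron jobs or monitoring checks a small wrapper function is a common substitute for the megacli alias. The sketch below is an assumption-laden illustration: MEGACLI_BIN is a hypothetical override variable, the default path is the install location mentioned in this post, and all_drives_online is a made-up helper that only looks for the Offline state.

```shell
#!/bin/sh
# Wrapper usable in scripts where aliases are not expanded.
# MEGACLI_BIN is a hypothetical override; the default is the path from this post.
megacli() {
    "${MEGACLI_BIN:-/opt/MegaRAID/MegaCli/MegaCli64}" "$@"
}

# Hypothetical check: returns 0 if no drive reports "Firmware state: Offline",
# 1 if at least one does, and 2 if the controller could not be queried.
all_drives_online() {
    _pd_list=$(megacli -PDList -aALL) || return 2
    ! printf '%s\n' "$_pd_list" | grep -q 'Firmware state: Offline'
}

# Intended use, for example from cron:
#   all_drives_online || echo "RAID needs attention on $(hostname)"
```

Keeping the binary path in one place also makes it easy to test the check against canned output.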
