SuSE Linux RAID Faulty disk replacement

LinuGeek

from LinuxQuestions.org on 2021-01-09 12:19 (#5CKPW)

Hello Experts,

We have an important database server running SUSE Linux Enterprise Server 12. The previous admin set it up as follows.

4 internal disks :

1+1 --RAID-1 Software RAID --> ROOT Partitions
1+1 --RAID-1 Software RAID --> Data Partitions with Database.

The root partitions have LVM on top of the RAID, which is then sliced into logical volumes for /usr, /boot, etc.

So there are 2 volume groups: 1. System VG and 2. Data VG.

There are 4 disks: sda+sdb and sdc+sdd.

Recently we noticed that one of the disks in the System software RAID group has gone bad, while the server
continued to work without any problem (thanks to RAID 1 mirroring).
As shown below, 3 software RAID arrays are marked as failed/degraded: md0, md1 and md2.
These are the system partitions; md3 is for the database.
So the failed members are sda1, sda2 and sda3.

Code:
# cat /proc/mdstat
Personalities : [raid1]

md0 : active raid1 sdb1[1] sda1[0](F) <<<<<-------------
1051584 blocks super 1.0 [2/1] [_U]
bitmap: 1/1 pages [4KB], 65536KB chunk

md1 : active raid1 sdb2[1] sda2[0](F) <<<<<-------------
18876288 blocks super 1.0 [2/1] [_U]
bitmap: 1/1 pages [4KB], 65536KB chunk

md2 : active raid1 sdb3[1] sda3[0](F) <<<<<-------------
956832576 blocks super 1.0 [2/1] [_U]
bitmap: 2/8 pages [8KB], 65536KB chunk

md3 : active raid1 sdc1[0] sdd1[1]
976760640 blocks super 1.0 [2/2] [UU]
bitmap: 2/8 pages [8KB], 65536KB chunk

unused devices: <none>

We have to replace the faulty disk (sda) so that the original mirror structure is rebuilt.
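
As a rough sketch, the degraded arrays can be picked out of /proc/mdstat automatically instead of by eye. This assumes the layout shown in the output above (an `mdN :` line followed by a `blocks` line ending in a status field such as `[_U]`); the function name `degraded_arrays` is made up for illustration.

```shell
# List md arrays whose member-status field (e.g. [_U]) contains "_",
# meaning at least one mirror half is missing or failed.
# $1 = file to read (defaults to /proc/mdstat).
degraded_arrays() {
    awk '
        /^md[0-9]+ :/ { name = $1 }            # remember the current array name
        /blocks/ && $NF ~ /_/ { print name }   # last field is the status, e.g. [_U]
    ' "${1:-/proc/mdstat}"
}
```

Called without arguments on this server, `degraded_arrays` should print md0, md1 and md2 and leave out md3.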

I have come up with the following plan. Please suggest modifications.

1. Shut down the server, which will also take down the database.
2. Take out the faulty disk.
3. Replace it with a new one.
4. Restart the server.
5. The automatic rebuild that mirrors the new disk from the existing one should start.

This sounds like a mostly automated process.

If this does not work, then we can do a few more steps manually.
Quote:

Question: can we do this at the existing runlevel without any problem?

1. Mark the disk as failed if it is not already marked F by the system.

Code:
# mdadm --manage /dev/md0 --fail /dev/sda1
# mdadm --manage /dev/md1 --fail /dev/sda2
# mdadm --manage /dev/md2 --fail /dev/sda3

To verify that the disk is marked as failed, check /proc/mdstat.

2. Remove the disk with mdadm

Code:
# mdadm --manage /dev/md0 --remove /dev/sda1
# mdadm --manage /dev/md1 --remove /dev/sda2
# mdadm --manage /dev/md2 --remove /dev/sda3

3. Replace the disk
Quote:

Question: how do we identify the faulty disk?
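
One possible approach, sketched under the assumption that the kernel still sees the disk and that udev has populated /dev/disk/by-id: the link names there embed the drive model and serial number, which can be matched against the label on the physical drive. The function name `disk_ids` is made up for illustration.

```shell
# Print the /dev/disk/by-id names (which embed model and serial) that point at a disk.
# $1 = kernel name such as sda; $2 = directory to scan (defaults to /dev/disk/by-id).
disk_ids() {
    dir=${2:-/dev/disk/by-id}
    for link in "$dir"/*; do
        [ -L "$link" ] || continue            # skip anything that is not a symlink
        target=$(readlink "$link")            # e.g. ../../sda
        [ "${target##*/}" = "$1" ] && basename "$link"
    done
    return 0
}
```

If sda no longer shows up at all, running `disk_ids sdb`, `disk_ids sdc` and `disk_ids sdd` gives the serials of the three healthy disks, and the faulty one is the drive that matches none of them.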

4. Copy the partition table to the new disk
(Caution: This sfdisk command will replace the entire partition table on the target disk with that of the source disk - use an alternative command if you need to preserve other partition information)

Code:
# sfdisk -d /dev/sdb | sfdisk /dev/sda

5. Create the mirror of the disk:

Code:
# mdadm --manage /dev/md0 --add /dev/sda1
# mdadm --manage /dev/md1 --add /dev/sda2
# mdadm --manage /dev/md2 --add /dev/sda3

6. To test the setup, enter the command below:

Code:
# /sbin/mdadm --detail /dev/md0

7. The following command will show the current progress of the recovery of the mirror disk:

Code:
# cat /proc/mdstat

System backup is in place.
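
The fail, remove and add steps above repeat one mdadm command over the same three array/partition pairs. A minimal sketch that only prints the commands for review, so nothing is changed by accident; the `PAIRS` variable and the `plan_mdadm` name are made up for illustration, and the pairs are taken from the /proc/mdstat output in this post.

```shell
# Array/partition pairs taken from the /proc/mdstat output in this post.
PAIRS="md0:sda1 md1:sda2 md2:sda3"

# Print (do not run) the mdadm command for one action across all pairs.
# $1 = fail | remove | add
plan_mdadm() {
    for pair in $PAIRS; do
        md=${pair%%:*}      # part before the colon, e.g. md0
        part=${pair##*:}    # part after the colon, e.g. sda1
        echo "mdadm --manage /dev/$md --$1 /dev/$part"
    done
}
```

`plan_mdadm fail` prints the three --fail commands; once the output looks right, it could be piped to `sh` as root, and the same goes for `remove` and `add`.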
Please give your valuable inputs.
Quote:

Is there any better option?

Thank you in advance.

Regards,
Admin


Source	RSS or Atom Feed
Feed Location	https://feeds.feedburner.com/linuxquestions/latest
Feed Title	LinuxQuestions.org
Feed Link	https://www.linuxquestions.org/questions/