How to change the boot drive on a server with RAID 5 software RAID


I have a dedicated server (my own hardware) hosted in a remote datacenter, running CentOS release 6.4 (Final) with GRUB (GNU GRUB 0.97). The server has six 2 TB drives with two software RAID arrays on them: md0, a RAID 1 for system and swap, and md1, a RAID 5 for data. Here is the fdisk and mdstat output:

[root@s3 ~]# fdisk -l

Disk /dev/sda: 2000.4 GB, 2000398934016 bytes
255 heads, 63 sectors/track, 243201 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disk identifier: 0x0004f1ce

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *           1         653     5242880   fd  Linux raid autodetect
Partition 1 does not end on cylinder boundary.
/dev/sda2             653      243202  1948270592   fd  Linux raid autodetect

Disk /dev/sdb: 2000.4 GB, 2000398934016 bytes
255 heads, 63 sectors/track, 243201 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disk identifier: 0x0008177f

   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1   *           1         653     5242880   fd  Linux raid autodetect
Partition 1 does not end on cylinder boundary.
/dev/sdb2             653      243202  1948270592   fd  Linux raid autodetect

Disk /dev/sdc: 2000.4 GB, 2000398934016 bytes
255 heads, 63 sectors/track, 243201 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disk identifier: 0x0008afb7

   Device Boot      Start         End      Blocks   Id  System
/dev/sdc1   *           1         653     5242880   fd  Linux raid autodetect
Partition 1 does not end on cylinder boundary.
/dev/sdc2             653      243202  1948270592   fd  Linux raid autodetect

Disk /dev/sde: 2000.4 GB, 2000398934016 bytes
255 heads, 63 sectors/track, 243201 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disk identifier: 0x00012b28

   Device Boot      Start         End      Blocks   Id  System
/dev/sde1   *           1         653     5242880   fd  Linux raid autodetect
Partition 1 does not end on cylinder boundary.
/dev/sde2             653      243202  1948270592   fd  Linux raid autodetect

Disk /dev/sdd: 2000.4 GB, 2000398934016 bytes
255 heads, 63 sectors/track, 243201 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disk identifier: 0x000df271

   Device Boot      Start         End      Blocks   Id  System
/dev/sdd1   *           1         653     5242880   fd  Linux raid autodetect
Partition 1 does not end on cylinder boundary.
/dev/sdd2             653      243202  1948270592   fd  Linux raid autodetect

Disk /dev/sdf: 2000.4 GB, 2000398934016 bytes
255 heads, 63 sectors/track, 243201 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disk identifier: 0x000b6b9e

   Device Boot      Start         End      Blocks   Id  System
/dev/sdf1   *           1         653     5242880   fd  Linux raid autodetect
Partition 1 does not end on cylinder boundary.
/dev/sdf2             653      243202  1948270592   fd  Linux raid autodetect

Disk /dev/md1: 9974.5 GB, 9974471720960 bytes
2 heads, 4 sectors/track, -1859793536 cylinders
Units = cylinders of 8 * 512 = 4096 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 524288 bytes / 2621440 bytes
Disk identifier: 0x00000000


Disk /dev/md0: 5368 MB, 5368643584 bytes
2 heads, 4 sectors/track, 1310704 cylinders
Units = cylinders of 8 * 512 = 4096 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disk identifier: 0x00000000


Disk /dev/mapper/vg_s3-LogVol01: 12.6 GB, 12582912000 bytes
255 heads, 63 sectors/track, 1529 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 524288 bytes / 2621440 bytes
Disk identifier: 0x00000000


Disk /dev/mapper/vg_s3-LogVol00: 3271 MB, 3271557120 bytes
255 heads, 63 sectors/track, 397 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 524288 bytes / 2621440 bytes
Disk identifier: 0x00000000


Disk /dev/mapper/vg_s3-LogVol02: 9958.6 GB, 9958615678976 bytes
255 heads, 63 sectors/track, 1210732 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 524288 bytes / 2621440 bytes
Disk identifier: 0x00000000

[root@s3 ~]# cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4] [raid1]
md0 : active raid1 sdc1[2] sda1[0] sdd1[3] sdb1[1] sde1[4] sdf1[5]
      5242816 blocks super 1.0 [6/6] [UUUUUU]
      bitmap: 1/1 pages [4KB], 65536KB chunk

md1 : active raid5 sda2[0](F) sdd2[3] sdc2[2] sdb2[1] sde2[4] sdf2[6]
      9740695040 blocks super 1.1 level 5, 512k chunk, algorithm 2 [6/5] [_UUUUU]
      bitmap: 15/15 pages [60KB], 65536KB chunk

unused devices: <none>

As you can see, the md1 array is in a degraded state; the /dev/sda drive is faulty. I have exchanged many faulty drives in other servers before, but here /dev/sda is (probably) also the boot drive. Here is my GRUB device map:

# this device map was generated by anaconda
(hd0)     /dev/sdb
(hd1)     /dev/sdc
(hd2)     /dev/sdd
(hd3)     /dev/sde
(hd4)     /dev/sdf
(hd5)     /dev/sdg

/dev/sda is not listed there for some reason, and it lists /dev/sdg, which according to fdisk does not exist. Here is /boot/grub/grub.conf:

# grub.conf generated by anaconda
#
# Note that you do not have to rerun grub after making changes to this file
# NOTICE:  You do not have a /boot partition.  This means that
#          all kernel and initrd paths are relative to /, eg.
#          root (hd0,0)
#          kernel /boot/vmlinuz-version ro root=/dev/md0
#          initrd /boot/initrd-[generic-]version.img
#boot=/dev/sdb
default=0
timeout=5
splashimage=(hd0,0)/boot/grub/splash.xpm.gz
hiddenmenu
title CentOS (2.6.32-358.el6.x86_64)
    root (hd0,0)
    kernel /boot/vmlinuz-2.6.32-358.el6.x86_64 ro root=UUID=cf9ba269-255e-4650-a095-87f2cdc5e22e rd_NO_LUKS rd_LVM_LV=vg_s3/LogVol01 LANG=en_US.UTF-8 rd_MD_UUID=596e4bea:c4494ac6:b2007529:1c8053a7 SYSFONT=latarcyrheb-sun16 crashkernel=auto rd_MD_UUID=40df07b4:85b88119:f11dabdf:97836f34  KEYBOARDTYPE=pc KEYTABLE=us rd_NO_DM rhgb quiet
    initrd /boot/initramfs-2.6.32-358.el6.x86_64.img

It lists (hd0,0) as the boot drive. I think this is the /dev/sda drive, which is faulty. So if I turned the server off now to change the drive, it would not boot again. I am trying to switch the boot drive to some other drive; I have done a lot of searching online, but I am unable to figure it out. I tried these commands:

[root@s3 ~]# grub-install /dev/sdb
/dev/sda1 does not have any corresponding BIOS drive.
[root@s3 ~]# grub
Probing devices to guess BIOS drives. This may take a long time.


    GNU GRUB  version 0.97  (640K lower / 3072K upper memory)

 [ Minimal BASH-like line editing is supported.  For the first word, TAB
   lists possible command completions.  Anywhere else TAB lists the possible
   completions of a device/filename.]
grub> find /grub/stage1
find /grub/stage1

Error 15: File not found
grub> find /boot/grub/stage1
find /boot/grub/stage1
 (hd0,0)
 (hd1,0)
 (hd2,0)
 (hd3,0)
 (hd4,0)
 (hd5,0)
grub> cat (hd0,0)/grub/grub.conf
cat (hd0,0)/grub/grub.conf

Error 15: File not found
grub> quit
quit 

What changes should I make to GRUB to be able to boot after I swap /dev/sda? The HDD boot order will probably also have to be modified in the BIOS.

centos
boot
software-raid
grub
asked on Server Fault Nov 19, 2016 by Josef

1 Answer


When using software RAID, the selected boot drive depends entirely on the BIOS (and on how it enumerates drives).

To have a bootable machine, simply replace the failed drive and use grub-install to install the bootloader on the BIOS boot drive or, even better, on all the drives participating in the RAID 1 array.
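As a rough sketch of the whole procedure under the question's setup (GRUB 0.97, no separate /boot partition, six drives sda–sdf): the grub shell's `device` command overrides device.map, which sidesteps the "does not have any corresponding BIOS drive" error that `grub-install /dev/sdb` produced above. The drive letters below assume the replacement comes back as /dev/sda and that enumeration does not shift after the swap; verify with fdisk before running anything.

```shell
# Remove the failed drive from both arrays before pulling it.
# sda2 is already marked (F) in md1; sda1 is still active in md0.
mdadm /dev/md1 --remove /dev/sda2
mdadm /dev/md0 --fail /dev/sda1 --remove /dev/sda1

# --- power off, swap the physical drive, boot from another disk ---

# Copy the partition table from a healthy drive to the replacement
# (all six drives use the same MBR layout here).
sfdisk -d /dev/sdb | sfdisk /dev/sda

# Re-add the new partitions to both arrays and let them resync.
mdadm /dev/md0 --add /dev/sda1
mdadm /dev/md1 --add /dev/sda2

# Install GRUB into the MBR of every RAID 1 member, so the BIOS can
# boot from whichever drive it happens to pick. "device" maps (hd0)
# directly, ignoring /boot/grub/device.map.
for d in a b c d e f; do
grub --batch <<EOF
device (hd0) /dev/sd$d
root (hd0,0)
setup (hd0)
EOF
done
```

With stage1 installed on all six drives, any surviving member can be selected as the boot device in the BIOS.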

Please see here for more information.

answered on Server Fault Nov 19, 2016 by shodanshok • edited Apr 13, 2017 by Community

User contributions licensed under CC BY-SA 3.0