Persistent device naming (or binding) for running RAC on Linux (10g R2 clusterware and above)

Installing Oracle Clusterware has become easier with every release. With 10g R1 you struggled with ssh equivalence, wrong directory permissions and script bugs. 10g R2 was better but still quite painful (especially the step from 10.2.0.3.0 to 10.2.0.4.0). With 11g R1 the clusterware installation became fairly robust, and 11g R2 simplified it even further.

One important part of installing the Oracle Clusterware stack is to ensure that the device names for the OCR and voting disks do not change (called “persistent naming” or “persistent binding”). Starting with 11g R2, the OCR and voting disks can also be stored in ASM, which eliminates the need for persistent device names because ASM detects its disks automatically and does not rely on fixed device names.

This post is a short guide to configuring persistent device names for the OCR and voting disks when installing Oracle Clusterware 10g R2 and above.

The experiences outlined here are based on the Oracle white paper “Configuring udev and device mapper for Oracle RAC 10g Release 2 and 11g”, available here.

Environment

This post assumes the following environment:

  • storage attached via fibre channel
  • Operating system: SLES 10/11 or Red Hat 4/5 or Oracle Enterprise Linux 5
  • no custom multipathing software is used (e.g. Powerpath)

For multipathing we will use the Linux native method, “multipath”. It offers persistent device naming, failover, load balancing and more. See the manpage for more information.

Setting up persistent device naming

In order to set up persistent device naming the following steps are required:

  1. Determine the device WWN
  2. Create or edit the file /etc/multipath.conf and assign device aliases
  3. Check
  4. Write a script to correct device permissions on system boot

Step 1 – Determine the device WWN

Running multipath for the first time will produce a list of available devices together with their WWNs. The WWN is a unique number that identifies the LUN for its entire lifetime, so we can safely use it as the identifier.

For example:

host:/ # multipath -l
3600601609e61260002bb4dc9edd4de11  dm-1 DGC,RAID 5
[size=100G][features=0][hwhandler=1 emc]
\_ round-robin 0 [prio=-1][active]
\_ 2:0:0:2 sdc        8:32  [active][undef]
\_ 3:0:0:2 sdg        8:35  [active][undef]


3600601607a702600a0851e9bedc4de11 dm-2 DGC,RAID 5
[size=1.0G][features=0][hwhandler=1 emc]
\_ round-robin 0 [prio=-1][active]
\_ 3:0:0:0 sdf        8:80  [active][undef]
\_ 2:0:0:0 sde        8:70  [active][undef]

Multipath detected two LUNs: one of 100 GB and one of 1 GB. In real-world scenarios there will be more LUNs, but two are more than enough for this article.

The multipath entry can be read like this:

# device WWN                         device created in /dev    vendor,product
3600601609e61260002bb4dc9edd4de11    dm-1                      DGC,RAID 5

# disk size     features        hardware handler (storage type)
[size=100G]     [features=0]    [hwhandler=1 emc]

# balancing algorithm     priority       status
\_ round-robin 0          [prio=-1]      [active]
\_ 2:0:0:2 sdc        8:32  [active][undef]       #<-- first path to the LUN
\_ 3:0:0:2 sdg        8:35  [active][undef]       #<-- second path to the LUN

The most important thing here is the WWN. Write it down, because it is the identifier that has to be entered in /etc/multipath.conf.
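With many LUNs, copying WWNs by hand becomes error-prone. The following is a minimal sketch of a helper that pulls the WWNs out of the listing; the function name `list_wwids` is hypothetical, and it assumes the output layout shown above (WWN followed by `dm-N`), plus the `alias (WWN) dm-N` layout that multipath prints once aliases are configured.

```sh
#!/bin/sh
# Print the WWN of every multipath map found in "multipath -l" style
# output read from stdin. Path and property lines are ignored.
list_wwids() {
    awk '
        # Map without alias: "WWN dm-N vendor,product"
        $2 ~ /^dm-[0-9]+$/ { print $1; next }
        # Map with alias: "alias (WWN) dm-N vendor,product"
        $3 ~ /^dm-[0-9]+$/ { gsub(/[()]/, "", $2); print $2 }
    '
}

# Usage (needs the multipath tools and root privileges):
#   multipath -l | list_wwids
```

Only the map header lines match, so the path lines starting with `\_` never end up in the result.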

Step 2 – Create or edit /etc/multipath.conf

For the two devices shown above, with the WWNs 3600601609e61260002bb4dc9edd4de11 and 3600601607a702600a0851e9bedc4de11, we create the persistent device binding:

multipaths {
    # 1 GB LUN for the OCR
    multipath {
        wwid                  3600601607a702600a0851e9bedc4de11
        alias                 ocr_san_a
        path_grouping_policy  group_by_prio
        path_checker          readsector0
        path_selector         "round-robin 0"
        failback              immediate
        polling_interval      5
        no_path_retry         fail
    }
    # 100 GB LUN for ASM
    multipath {
        wwid                  3600601609e61260002bb4dc9edd4de11
        alias                 asmdisk001
        path_grouping_policy  group_by_prio
        path_checker          readsector0
        path_selector         "round-robin 0"
        failback              immediate
        polling_interval      5
        no_path_retry         fail
    }
}
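Typing one stanza per LUN by hand invites typos and unbalanced braces. As a sketch, the stanzas can be generated from WWN/alias pairs; the function names `multipath_stanza` and `multipaths_section` are made up for this example, and the parameter values are simply the ones used in this post.

```sh
#!/bin/sh
# Print one "multipath { ... }" stanza. $1 = WWN, $2 = alias.
multipath_stanza() {
    cat <<EOF
    multipath {
        wwid                  $1
        alias                 $2
        path_grouping_policy  group_by_prio
        path_checker          readsector0
        path_selector         "round-robin 0"
        failback              immediate
        polling_interval      5
        no_path_retry         fail
    }
EOF
}

# Wrap all stanzas in a single "multipaths { ... }" section.
# Arguments are WWN/alias pairs.
multipaths_section() {
    echo "multipaths {"
    while [ $# -ge 2 ]; do
        multipath_stanza "$1" "$2"
        shift 2
    done
    echo "}"
}

# Usage: review the output before merging it into /etc/multipath.conf
#   multipaths_section 3600601607a702600a0851e9bedc4de11 ocr_san_a \
#                      3600601609e61260002bb4dc9edd4de11 asmdisk001
```

The output goes to stdout on purpose, so nothing is written to /etc/multipath.conf until you have checked it.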

I won't go too deep into the meaning of the parameters, but here are the most important ones:

  • ALIAS: the most important part of the configuration file; it specifies the name of the device that is created for the disk in /dev/mapper
  • FAILBACK: controls when I/O returns to a previously failed path group once it is available again; “immediate” fails back at once, while a number delays failback by that many seconds
  • POLLING_INTERVAL: the interval in seconds between two checks of path availability
  • NO_PATH_RETRY: controls what happens to I/O when all paths have failed; with “fail”, I/O fails immediately and is not queued (Note: Oracle white papers recommend a value of “10” or even “20”, which queues I/O in error cases and thus “halts” the system for a short period of time. You have to test this yourself in your configuration. “fail” works best for me, but this may be different at your site / configuration).

For the remaining parameters refer to the manpage of “multipath.conf”.
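To get a feeling for how long a numeric no_path_retry “halts” I/O, a rough estimate is the number of retries multiplied by the polling interval. The sketch below does that arithmetic; `queue_seconds` is a hypothetical helper, and the result is only an approximation under the assumption that each retry takes one polling interval.

```sh
#!/bin/sh
# Estimate how long I/O stays queued after all paths have failed.
# $1 = no_path_retry value, $2 = polling_interval in seconds.
queue_seconds() {
    case "$1" in
        fail)  echo 0 ;;                 # fail I/O at once, no queueing
        queue) echo unlimited ;;         # queue until a path comes back
        *)     echo $(( $1 * $2 )) ;;    # retries x seconds per check
    esac
}

queue_seconds fail 5   # prints 0
queue_seconds 10 5     # prints 50
queue_seconds 20 5     # prints 100
```

With the polling interval of 5 seconds used in this post, the values “10” and “20” therefore mean roughly 50 and 100 seconds of queued I/O before errors are returned.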

Step 3 – Check

For the changes to take effect, run “/etc/init.d/multipath restart”. Afterwards /dev/mapper should look like this:

host:/dev/mapper # ll
total 0
lrwxrwxrwx 1 root root          16 Nov 20 13:33 control -> ../device-mapper
brw-rw---- 1 asm  dba      253,  5 Nov 20 13:33 asmdisk001
brw-rw---- 1 crs  oinstall 253,  9 Nov 20 13:33 ocr_san_a

As you can see the disks are named according to the alias specified. If a disk does not appear check the log files or run “multipath -v 2”.
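On a cluster this check has to be repeated on every node. A small sketch that reports missing aliases could look like this; `missing_aliases` is a hypothetical name, and the function only tests that an entry exists.

```sh
#!/bin/sh
# Print every expected alias that has no entry in the given directory.
# $1 = directory (normally /dev/mapper), remaining arguments = aliases.
missing_aliases() {
    dir=$1
    shift
    for name in "$@"; do
        # -e only tests existence; use -b to insist on a block device.
        [ -e "$dir/$name" ] || echo "$name"
    done
}

# Usage:
#   missing_aliases /dev/mapper ocr_san_a asmdisk001
```

Empty output means all aliases are present on that node.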

Note: I strongly recommend creating and using partitions on the configured devices, regardless of whether they will be used for ASM, OCR or voting disks. In addition, I recommend configuring aliases for the ASM disks as well.

If you created one partition on each device, /dev/mapper will look like this:

host:/dev/mapper # ll
total 0
lrwxrwxrwx 1 root root          16 Nov 20 13:33 control -> ../device-mapper
brw-rw---- 1 asm  dba      253,  5 Nov 20 13:33 asmdisk001
brw-rw---- 1 asm  dba      253, 17 Nov 23 12:23 asmdisk001-part1
brw-rw---- 1 crs  oinstall 253,  9 Nov 20 13:33 ocr_san_a
brw-rw---- 1 crs  oinstall 253, 16 Nov 23 11:26 ocr_san_a-part1

Device “asmdisk001-part1” is the first partition on device (disk) “asmdisk001”. When labeling the disk as an ASM disk, you specify “asmdisk001-part1”.

Step 4 – Write a script to correct device permissions

Oracle Clusterware and ASM require custom device permissions: the clusterware stack must own the OCR and voting disks, and the ASM instance must own the ASM disks. By default, disk devices are owned by user “root” and group “disk”. Unfortunately, devices created in /dev/mapper cannot be influenced by udev rules (according to the white paper; I did not check this myself). So I wrote a script to correct the device permissions:

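#! /bin/sh
#
# /etc/init.d/ocr_permission
#
### BEGIN INIT INFO
# Provides:          ocr_permission
# Required-Start:    $local_fs multipathd
# Should-Start:
# Required-Stop:
# Default-Start:     2 3 5
# Default-Stop:
# Description:       run ocr and voting disk permission corrections
### END INIT INFO

# SUSE init script helpers (provide rc_reset)
. /etc/rc.status
. /etc/sysconfig/sysctl

rc_reset

# The permissions are corrected regardless of the action passed in $1.
case "$1" in
 *)
 # OCR and voting disks belong to the clusterware owner
 chown crs:oinstall /dev/mapper/ocr*
 chmod 660 /dev/mapper/ocr*

 chown crs:oinstall /dev/mapper/voting*
 chmod 660 /dev/mapper/voting*

 # ASM disks belong to the ASM owner; adjust the pattern to your aliases
 # (an alias such as "asmdisk001" needs /dev/mapper/asmdisk* instead)
 chown asm:dba /dev/mapper/disk*
 chmod 660 /dev/mapper/disk*
 ;;
esac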

Note: user “crs” owns the clusterware installation and user “asm” owns the ASM installation. Therefore the OCR and voting disks are owned by “crs” and the ASM disks are owned by “asm”.

This script is called at system boot to correct the device permissions. It must be started after the multipathing daemon. If you want, you can also call it periodically from cron to correct the permissions of devices added after system boot.
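To confirm after a reboot that the permissions were actually applied, a check along the following lines may help. This is a sketch only: `wrong_perms` is a hypothetical name, and it relies on the GNU coreutils `stat -c` syntax.

```sh
#!/bin/sh
# Print every path whose owner:group or octal mode differs from the
# expected values.
# $1 = expected owner:group, $2 = expected mode, remaining args = paths.
wrong_perms() {
    want_owner=$1
    want_mode=$2
    shift 2
    for path in "$@"; do
        # GNU stat: %U:%G = owner:group, %a = octal mode, -L follows links
        actual=$(stat -L -c '%U:%G %a' "$path") || continue
        [ "$actual" = "$want_owner $want_mode" ] || echo "$path"
    done
}

# Usage, with the users and groups from this post:
#   wrong_perms crs:oinstall 660 /dev/mapper/ocr* /dev/mapper/voting*
#   wrong_perms asm:dba 660 /dev/mapper/disk*
```

Any path that is printed still has the wrong owner, group or mode and should be fixed before starting the clusterware.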

Version and component specific information

Use with Oracle Clusterware 10g R2

Oracle clusterware 10g R2 requires raw devices for OCR and voting disks.

To configure the raw device mapping, edit the file /etc/sysconfig/rawdevices (OEL/Red Hat) or /etc/raw (SuSE) as follows:

# /etc/sysconfig/rawdevices
# SAMPLE FILE
raw1:/dev/mapper/ocr_sana-part1
raw2:/dev/mapper/ocrmirror_sanb-part1
raw3:/dev/mapper/voting1-part1
raw4:/dev/mapper/voting2-part1
raw5:/dev/mapper/voting3-part1

Restart the raw service after changing the file to pick up the changes. Do not forget to add the raw service to the system boot sequence!

For RHEL5, OEL5 and SLES10, correct the device permissions by creating a file /etc/udev/rules.d/99-raw.rules with the following content:

KERNEL=="raw[1-2]*", GROUP="oinstall", MODE="640"
KERNEL=="raw[3-5]*", OWNER="crs", GROUP="oinstall", MODE="660"

For RHEL4, OEL4 or SLES9 refer to the Oracle white paper mentioned above.

When installing the clusterware stack, the OCR location in our example would be “/dev/raw/raw1”.

Use with Oracle Clusterware 11g R1

Note: Oracle itself recommends using the latest clusterware and ASM software stack for new installations. I strongly recommend installing the 11.1.0.7 clusterware and ASM stack (plus PSU if available) for new installations because the installation is easier, the software is more robust and many errors from 10g R2 are fixed (e.g. online addition of voting disks is possible as of 11g R1).

Starting with 11g R1, raw devices are deprecated and the clusterware can use block devices for the OCR and voting disks directly. So besides configuring persistent device naming there is nothing more to do. When installing the clusterware based on this guide you would specify “/dev/mapper/ocr_san_a-part1” as the OCR location.

Use with Oracle Clusterware 11g R2

Starting with 11g R2, the OCR and voting disks can be placed inside ASM. In theory this eliminates the need for persistent device binding / naming because ASM detects the disks automatically, but I recommend using persistent device naming anyway, as outlined below.

What about ASM?

Basically, ASM does not require persistent device naming because it automatically detects devices, including multipath devices. You could therefore label “/dev/dm-*” devices as ASM disks and use them, with the disadvantage that these device names are not consistent across all nodes and can (and will) change.
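To see which “/dev/dm-*” device currently sits behind an alias, the minor number in the /dev/mapper listing can be used, since a device-mapper node with minor number N corresponds to dm-N. A minimal sketch (the function name `alias_to_dm` is hypothetical) that turns an `ls -l /dev/mapper` listing into alias/dm pairs:

```sh
#!/bin/sh
# Read "ls -l /dev/mapper" output on stdin and print "alias dm-N"
# for every block device.
alias_to_dm() {
    awk '
        # Block device lines look like:
        #   brw-rw---- 1 asm dba 253, 5 Nov 20 13:33 asmdisk001
        # Field 6 is the minor number N of /dev/dm-N,
        # the last field is the alias.
        /^b/ { print $NF, "dm-" $6 }
    '
}

# Usage: run on each node and compare the results
#   ls -l /dev/mapper | alias_to_dm
```

Comparing the output between nodes shows that the alias stays the same everywhere, while the dm-N number behind it may differ.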

So if you have already configured persistent device naming for the OCR and voting disks, I recommend doing the same for the ASM disks. The advantages include more readable device names and more granular configuration. Readable device names are important in normal or external redundancy configurations to distinguish the devices from each other without reading the ASM labels.

For normal redundancy configuration (two SAN storages) device mapping can look like this:

host:/dev/mapper # ll
total 0
lrwxrwxrwx 1 root root          16 Nov 20 13:33 control -> ../device-mapper
brw-rw---- 1 asm  dba      253,  0 Nov 20 13:33 disk001_sana
brw-rw---- 1 asm  dba      253, 11 Nov 23 12:41 disk001_sana-part1
brw-rw---- 1 asm  dba      253,  5 Nov 20 13:33 disk001_sanb
brw-rw---- 1 asm  dba      253, 18 Nov 23 12:41 disk001_sanb-part1
brw-rw---- 1 asm  dba      253,  1 Nov 20 13:33 disk002_sana
brw-rw---- 1 asm  dba      253, 19 Nov 23 12:41 disk002_sana-part1
brw-rw---- 1 asm  dba      253,  7 Nov 20 13:33 disk002_sanb
brw-rw---- 1 asm  dba      253, 15 Nov 23 12:41 disk002_sanb-part1
brw-rw---- 1 crs  oinstall 253,  4 Nov 20 13:33 ocr_san_a
brw-rw---- 1 crs  oinstall 253, 13 Nov 23 12:07 ocr_san_a-part1
brw-rw---- 1 crs  oinstall 253,  8 Nov 20 13:33 ocrmirror_san_b
brw-rw---- 1 crs  oinstall 253, 10 Nov 23 12:07 ocrmirror_san_b-part1
brw-rw---- 1 crs  oinstall 253,  3 Nov 20 13:33 voting1_san_a
brw-rw---- 1 crs  oinstall 253, 16 Nov 23 12:41 voting1_san_a-part1
brw-rw---- 1 crs  oinstall 253,  9 Nov 20 13:33 voting2_san_b
brw-rw---- 1 crs  oinstall 253, 12 Nov 23 12:41 voting2_san_b-part1

5 Responses to Persistent device naming (or binding) for running RAC on Linux (10g R2 clusterware and above)

  1. Pingback: Blogroll Report 04/12/2009-11/12/2009 « Coskan’s Approach to Oracle

  2. imcksj says:

    Not so much a comment, but a question.

    What should the value for ORACLEASM_SCANORDER and ORACLE_SCANEXCLUDE be in /etc/sysconfig/oracleasm (sles 10sp3) when using /dev/mapper devices?

    Any input would be greatly appreciated.

    • Ronny Egner says:

      Well, you don't need to set anything for ORACLEASM_SCANEXCLUDE and ORACLEASM_SCANORDER. If you have a lot of devices, scanning them might take some time, which can be shortened by excluding some devices.

  3. sudhir says:

    hi ronnie , quick question.

    i am running 11gr2 oracle rac on rhel 5. i am using device mapper. do i still need to write the script to correct device permissions

    thanks
    sudhir
