Useful ASM scripts and queries

This short post covers some useful ASM queries and scripts.

chk_asm_mapping.sh

The following is based on a script I found somewhere on the web. Unfortunately the original was not entirely correct for newer ASMLib versions. The script itself:

#!/bin/bash
# Map each ASMLib disk label to its underlying block device via the
# major/minor device numbers reported by "oracleasm querydisk -d".
/etc/init.d/oracleasm querydisk -d `/etc/init.d/oracleasm listdisks -d` |
cut -f2,10,11 -d" " | perl -pe 's/"(.*)".*\[(.*), *(.*)\]/$1 $2 $3/g;' |
while read v_asmdisk v_major v_minor
do
  v_device=`ls -la /dev | grep " $v_major, *$v_minor " | awk '{print $10}'`
  echo "ASM disk $v_asmdisk based on /dev/$v_device [$v_major, $v_minor]"
done

When run, it produces the following output:

ASM disk DISK001A based on /dev/sdc1 [8, 33]
ASM disk DISK001B based on /dev/sdb1 [8, 17]
ASM disk DISK002A based on /dev/sdd1 [8, 49]
ASM disk DISK002B based on /dev/sde1 [8, 65]
ASM disk DISK003A based on /dev/sdg1 [8, 97]
ASM disk DISK003B based on /dev/sdf1 [8, 81]

Disk Group Information

set pages 40000 lines 120
col NAME for a15
select GROUP_NUMBER DG#, name, ALLOCATION_UNIT_SIZE AU_SZ, STATE,
TYPE, TOTAL_MB, FREE_MB, OFFLINE_DISKS from v$asm_diskgroup;
DG# NAME  AU_SZ   STATE    TYPE   TOTAL_MB FREE_MB OFFLINE_DISKS
--- ----- ------- -------- ------ -------- ------- -------------
  1 DATA2 4194304 MOUNTED  NORMAL    40952    4168             0
  2 DATA  1048576 MOUNTED  NORMAL    30717   25667             0

ASM Disk Information

set pages 40000 lines 120
col PATH for a30
select DISK_NUMBER,MOUNT_STATUS,HEADER_STATUS,MODE_STATUS,STATE,
PATH FROM V$ASM_DISK;

DISK_NUMBER MOUNT_S HEADER_STATU MODE_ST STATE    PATH
----------- ------- ------------ ------- -------- -------------
0           CLOSED  MEMBER       ONLINE  NORMAL   ORCL:DISK001A
3           CLOSED  MEMBER       ONLINE  NORMAL   ORCL:DISK003B
1           CACHED  MEMBER       ONLINE  NORMAL   ORCL:DISK001B
2           CACHED  MEMBER       ONLINE  NORMAL   ORCL:DISK002A
3           CACHED  MEMBER       ONLINE  NORMAL   ORCL:DISK002B

Combined ASM Disk and ASM Diskgroup information

The following query combines ASM disk and disk group information and can easily be adapted to your needs. If you are interested in ASM files rather than disks, use v$asm_file instead of (or in addition to) v$asm_disk.

col PATH for a15
col DG_NAME for a15
col DG_STATE for a10
col FAILGROUP for a10

select dg.name dg_name, dg.state dg_state, dg.type, d.disk_number dsk_no,
d.path, d.mount_status, d.FAILGROUP, d.state 
from v$asm_diskgroup dg, v$asm_disk d
where dg.group_number=d.group_number
order by dg_name, dsk_no;

DG_NAME  DG_STATE TYPE   DSK_NO PATH          MOUNT_S FAILGROUP  STATE
-------- -------- ------ ------ ------------- ------- ---------- --------
DATA     MOUNTED  NORMAL 1      ORCL:DISK001B CACHED  DISK001B   NORMAL
DATA     MOUNTED  NORMAL 2      ORCL:DISK002A CACHED  DISK002A   NORMAL
DATA     MOUNTED  NORMAL 3      ORCL:DISK002B CACHED  DISK002B   NORMAL
DATA2    MOUNTED  NORMAL 1      ORCL:DISK003A CACHED  DISK003A   NORMAL
DATA2    MOUNTED  NORMAL 2      ORCL:DISK003B CACHED  DISK003B   NORMAL

Monitoring ASM disk operations

select GROUP_NUMBER, OPERATION, STATE, ACTUAL, SOFAR, EST_MINUTES from v$asm_operation;

If an operation is in progress (such as a rebalance), the query returns one row per operation. For instance, for the disk we just added, the query yields:

GROUP_NUMBER OPERA STAT    ACTUAL  SOFAR      EST_MINUTES
------------ ----- ---- ---------- ---------- -----------
2            REBAL RUN           1         49          16

ASM resilvering – or – how to recover your crashed cluster – Test no 2

In this post we will cut the private network interconnect between the two nodes by moving node "rac1"'s LAN interfaces to a different VLAN. The network ports remain physically connected, but connectivity is interrupted.


Oracle 11g Release 2 – AWR emergency purging

During the night my SYSAUX tablespace filled up to its maximum allowed size, so AWR could not save its data. In the database alert.log I found the following:

Tue Oct 06 02:02:38 2009
ORA-1688: unable to extend table SYS.WRH$_ACTIVE_SESSION_HISTORY
  partition WRH$_ACTIVE_270645195_176 by 8 in tablespace SYSAUX
MMON Flush encountered SYSAUX out of space error(1688).
MMON (emergency) purge of WR snapshots (105) and older

It seems that if there is not enough space available for saving AWR data, AWR automatically purges the oldest snapshots to free up space. Nice feature :-)
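
If you do not want to wait for the emergency purge, you can check the current AWR settings and drop old snapshots manually with DBMS_WORKLOAD_REPOSITORY. A minimal sketch, run as SYSDBA; the snapshot IDs are placeholders you have to adapt to your own DBA_HIST_SNAPSHOT contents:

sqlplus -s / as sysdba <<'EOF'
-- current AWR snapshot interval and retention
select snap_interval, retention from dba_hist_wr_control;
-- drop an old snapshot range to free space in SYSAUX (IDs are examples only)
exec dbms_workload_repository.drop_snapshot_range(low_snap_id => 90, high_snap_id => 105);
EOF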


ASM resilvering – or – how to recover your crashed cluster – Test no 1

In the first test we will simply disable one node under normal load. Let's see what the different components (clusterware, ASM instance, ADVM/ACFS) report:


ASM resilvering – or – how to recover your crashed cluster – the Environment

We use the same environment we installed here, which consists of two virtual machines running Oracle Enterprise Linux 5 Update 3 64-bit with Oracle 11g Release 2 (grid infrastructure and Oracle binaries) installed. All data is placed in ASM with normal redundancy. Storage is exported from two NetApp filers via iSCSI.


Is it allowed to quote Metalink Notes completely or partially?

Writing posts for this blog I wondered: am I allowed to reproduce Metalink notes, entirely or partially, in my blog posts?

Searching the web I did not find any information. I could have read the Oracle Support license, but because of its length, and because I am not a lawyer, I refrained from that :-)

So I opened an SR and asked my question. This is the reply I got:

The customer can refer to Document IDs in My Oracle Support, but is NOT allowed to reproduce any Support content which currently requires a login to Oracle Support, doing so would violate the terms of their Oracle Support License.

I guess that answers the question entirely.


ext3 – beware of periodic file system checks

Users running Oracle on Red Hat or Oracle Enterprise Linux will most probably use ext3 as the default file system. It is robust, well maintained, a journaling file system, and supports file system sizes of up to 32 TB.

Database servers must be "highly available", so they are rebooted infrequently, if ever.

Even though ext3 is a journaling file system, there are periodic full file system checks which scan the entire file system, just like ext2 did after a crash. On today's large file systems this can take hours. That is especially bad if you just "quickly" rebooted the server after applying a patch and are now facing a long-running file system check.

Many users have complained about this behavior, arguing that a journaling file system should not need periodic checks. On the other hand, this is a safety feature: even with journaling, corruption can be introduced and remain undetected for a long time.

So let's take a look at the defaults chosen on Oracle Enterprise Linux 5 Update 3 when creating an ext3 file system:

[root@rac1 ~]# mkfs.ext3 /dev/asm/testvol1-132
mke2fs 1.39 (29-May-2006)
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
131072 inodes, 262144 blocks
13107 blocks (5.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=268435456
8 block groups
32768 blocks per group, 32768 fragments per group
16384 inodes per group
Superblock backups stored on blocks:
 32768, 98304, 163840, 229376

Writing inode tables: done
Creating journal (8192 blocks): done
Writing superblocks and filesystem accounting information: done

This filesystem will be automatically checked every 22 mounts or
180 days, whichever comes first.  Use tune2fs -c or -i to override.

The last two lines are the ones of interest: "This filesystem will be automatically checked every 22 mounts or 180 days, whichever comes first." On database servers 22 mounts is a long time, but 180 days is quite short, so administrators should change these values accordingly. Turning the checks off completely reduces data protection but is still possible.
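
For example, disabling both triggers on the volume created above would look like this (only do it if you accept the reduced protection):

# -c 0 disables the mount-count check, -i 0 disables the time-based check
tune2fs -c 0 -i 0 /dev/asm/testvol1-132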

You can check the mount count, maximum mount count and check interval with the following command:

tune2fs -l /dev/asm/testvol1-132
[..]
Mount count:              0
Maximum mount count:      22
[..]
Last checked:             Thu Oct  1 10:44:53 2009
Check interval:           15552000 (6 months)
Next check after:         Tue Mar 30 10:44:53 2010

You are free to change these values with tune2fs:

[root@rac1 ~]# tune2fs -c 100 /dev/asm/testvol1-132
tune2fs 1.39 (29-May-2006)
Setting maximal mount count to 100

[root@rac1 ~]# tune2fs -i 360 /dev/asm/testvol1-132
tune2fs 1.39 (29-May-2006)
Setting interval between checks to 31104000 seconds

…verifying:

tune2fs -l /dev/asm/testvol1-132
Filesystem created:       Thu Oct  1 10:44:53 2009
Last mount time:          n/a
Last write time:          Thu Oct  1 11:02:09 2009
Mount count:              0
Maximum mount count:      100
Last checked:             Thu Oct  1 10:44:53 2009
Check interval:           31104000 (12 months)
Next check after:         Sun Sep 26 10:44:53 2010

So, based on this information:

To which values should the maximum mount count and check interval be set?

The actual values depend entirely on your requirements, but they should be set differently for each volume so that not all volumes become due for a full check at the same time.
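
As a sketch of what that could look like (the volume names and values are just examples, not a recommendation):

# stagger the check limits so the volumes are never all due at once:
# 20 mounts / 200 days, 30 / 300, 40 / 400, ...
count=20
for vol in /dev/asm/testvol1-132 /dev/asm/testvol2-132 /dev/asm/testvol3-132
do
  tune2fs -c $count -i ${count}0d $vol
  count=$((count + 10))
done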


Oracle 11g Release 2 – How to enable SSH equivalence

There is a problem with shared Oracle homes on ACFS, documented here.

However, if you have not configured SSH equivalence before, this short guide shows how to set it up between the nodes "rac1.regner.de" and "rac2.regner.de":

1. Unpack the database installation files

2. Execute:

<PATH_TO_DATABASE_INSTALLATION>/sshsetup/sshUserSetup.sh
  -user ora11p -hosts "rac1.regner.de rac2.regner.de" -shared

Arguments:
-user: the user for which SSH equivalence is to be configured
-hosts: space-separated list of node names
-shared: indicates a shared Oracle home (on ACFS or NFS)

3. Check it

/usr/bin/ssh -o FallBackToRsh=no  -o PasswordAuthentication=no 
  -o StrictHostKeyChecking=yes  -o NumberOfPasswordPrompts=0 rac2 /bin/date
/usr/bin/ssh -o FallBackToRsh=no  -o PasswordAuthentication=no
  -o StrictHostKeyChecking=yes  -o NumberOfPasswordPrompts=0 rac1 /bin/date

The two commands above should return the current date. That's it.


Oracle 11g Release 2 Install Guide – Creating a RAC database

This is the final part of the Oracle 11g Release 2 installation guide. At this point you should have:

  • Oracle database and grid infrastructure binaries installed
  • at least one (or better, two) disk groups configured for:
    • database and binary files
    • flash recovery area

Our database name will be “ORA11P”.

Preparations

  • create directory for cfgtoollogs:
mkdir -p /u01/app/oracle/cfgtoollogs
chown root:dba /u01/app/oracle/cfgtoollogs
chmod 775 /u01/app/oracle/cfgtoollogs
  • Check SSH equivalence once again:
    execute as user "ora11p" (this is the user owning the Oracle database binary installation):
/usr/bin/ssh -o FallBackToRsh=no  -o PasswordAuthentication=no 
  -o StrictHostKeyChecking=yes  -o NumberOfPasswordPrompts=0 rac2 /bin/date
/usr/bin/ssh -o FallBackToRsh=no  -o PasswordAuthentication=no 
  -o StrictHostKeyChecking=yes  -o NumberOfPasswordPrompts=0 rac1 /bin/date

Both commands should return the current date. If you get errors like "connection refused", check:

  • network connectivity between both nodes
  • does a "normal" SSH connection work?:
    • ssh root@rac1-priv
    • ssh root@rac1
    • ssh root@rac2-priv
    • ssh root@rac2

All of these commands should prompt for a password.

If SSH was working before and seems “gone” you might face a problem documented here.

  • add ACFS oracle home to cluster registry
$GRID_HOME/bin/srvctl add filesystem -d /dev/asm/ora11p_home-132
  -v ora11p_home -g DATA2 -m /u01/app/oracle/product/11.2.0/ora11p -u ora11p

Arguments:
-d: path of the ADVM volume device
-v: name of the volume
-g: disk group containing the volume
-m: mount point path
-u: user authorized to mount and unmount the file system

"-u ora11p" is important and must match the owner of the database binary installation; otherwise dbca will raise an error complaining about missing permissions.
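
Once registered, you can start the file system resource and verify its state with srvctl (a sketch, using the same volume device as above):

# start the ACFS file system resource and check its state
$GRID_HOME/bin/srvctl start filesystem -d /dev/asm/ora11p_home-132
$GRID_HOME/bin/srvctl status filesystem -d /dev/asm/ora11p_home-132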

Create the database with dbca

Start dbca

[Screenshots step6_010 through step6_030: dbca steps for creating the RAC database]

Where to go now?

  • Tune the instance, for example:
    • memory_target
    • db_writer_processes
    • sessions
  • Configure, perform and test your backup (a minimal RMAN sketch follows below)
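
As a starting point for the backup, here is a minimal RMAN sketch for a full backup plus archived logs into the flash recovery area; it is not a complete backup concept, so adapt it to your requirements:

rman target / <<'EOF'
# keep controlfile autobackups so backups are self-describing
configure controlfile autobackup on;
# full database backup plus archived redo logs
backup database plus archivelog;
EOF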

Oracle 11g Release 2 – SCAN explained

With 11g Release 2, Oracle introduced a new cluster database connection concept: SCAN, the "Single Client Access Name".

SCAN on the server side

When installing 11g Release 2 grid infrastructure you are asked for a cluster name, which becomes part of the SCAN. The notation for the SCAN is:

<clustername>-scan.<domain>

For instance, if your cluster is named "rac" and the domain is "regner.de", the SCAN name will be "rac-scan.regner.de".

In order to install grid infrastructure successfully, you need to configure your DNS to resolve this name before the installation (hosts file entries will not work!). Oracle requires at least one, ideally three, IP addresses configured for the SCAN name. Your DNS zone file might look like this:

[Screenshot: example DNS zone file (scan_example)]

In the example zone file above we configured three IP addresses for the SCAN: 172.23.15.3, 172.23.15.4 and 172.23.15.5.
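
A quick way to verify the DNS setup before installing is to resolve the SCAN name from one of the cluster nodes; with the example above it should return all three addresses:

# should list 172.23.15.3, 172.23.15.4 and 172.23.15.5
nslookup rac-scan.regner.de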

After installation you will find three SCAN listeners running (distributed across the cluster nodes), as shown in the following figure:

[Screenshot: SCAN listener processes (scan_example)]

The listeners for SCAN2 and SCAN3 are running on node "rac1" and the listener for SCAN1 is running on node "rac2". For each SCAN listener there is a dedicated network interface running with the same IP address as configured in DNS:

[Screenshot: SCAN virtual IP interfaces (scan_interfaces)]
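
You can also query the SCAN configuration and the current listener placement with srvctl from the grid infrastructure home (a sketch; the output depends on your setup):

# show the SCAN VIPs known to the cluster and where each SCAN listener runs
$GRID_HOME/bin/srvctl config scan
$GRID_HOME/bin/srvctl status scan_listener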

Client connections

Connecting to database "RAC11P" via SCAN uses this tnsnames.ora entry:

RAC11P =
 (DESCRIPTION=
 (ADDRESS=(PROTOCOL=tcp)(HOST=rac-scan.regner.de)(PORT=1521))
 (CONNECT_DATA=(SERVICE_NAME=RAC11P))
 )
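
A quick sanity check that the entry resolves and the SCAN listener answers (assuming the Oracle client environment is set up):

tnsping RAC11P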

The “old fashioned” way still works:

RAC11P_old =
 (DESCRIPTION=
 (ADDRESS_LIST=
 (ADDRESS=(PROTOCOL=tcp)(HOST=rac1-vip.regner.de)(PORT=1521))
 (ADDRESS=(PROTOCOL=tcp)(HOST=rac2-vip.regner.de)(PORT=1521))
 )
 (CONNECT_DATA=(SERVICE_NAME=RAC11P))
 )

Connecting to a named instance:

RAC11P =
 (DESCRIPTION=
 (ADDRESS=(PROTOCOL=tcp)(HOST=rac-scan.regner.de)(PORT=1521))
 (CONNECT_DATA=(SERVICE_NAME=RAC11P)
 (INSTANCE_NAME=RAC11P1))
 )