ASM resilvering – or – how to recover your crashed cluster – Test no 4

Test #4: Corrupting the ASM disk with ASM disk group being online and active

After overwriting the ASM disk header while the disk group was offline we will now put some load on the full running cluster and corrupt the asm disk slightly.

Continue reading

Posted in Oracle 11g Release 2, Oracle ASM | 3 Comments

The Oracle Database File System (DBFS)

The following post is a summary of my tests with DBFS on a 11g Release 2 RAC.

DBFS – An Overview

Since the introduction of LOBs the database was used to store all kinds of files, for instance images, audio files, video files or even text files. Starting with Oracle 11g Release 1 introduced SecureFiles which introduced compression and de-duplication.

Oracle Database Filesystem (DBFS) creates a mountable file system which can be used to access files stored in the database as SecureFile LOBs. DBFS is a shared file system like NFS and consists of a server (the database) and a client. Just like a traditional database connection server and client can be on different systems which communicate over SQLNet. The same applies to DBFS as well.

The following figure illustrates DBFS:

dbfs

(Source: http://download.oracle.com/docs/cd/E11882_01/appdev.112/e10645/adlob_fs.htm)

Continue reading

Posted in Oracle 11g Release 2, Oracle in general | 6 Comments

ORA-12157 TNS:internal network communication error when connecting after fresh installation

Recently i came across a problem i thought was fixed since 10g:

After a fresh installation of oracle 10g when trying to connect to the instance (for instance to create the database for the first time) you get the following error:

oracle@host#:> sqlplus /nolog

 SQL*Plus: Release 10.2.0.4.0 - Production on Do Oct 8 08:23:18 2009
 Copyright (c) 1982, 2007, Oracle.  All Rights Reserved.

 SQL> connect / as sysdba
 ERROR:
 ORA-12157: TNS: internal network communication error.

This error is related to the linking phase when installing oracle. In your environment used during installation the variable TMPDIR pointed to a non existing directory. When linking oracle some adapters (like tcp) were not linked correctly. You can verify this with:

oracle@host#:> cd $ORACLE_HOME
oracle@host#:~/bin> adapters oracle

/u01/app/oracle/product/oracle/bin/adapters: line 70:

/u01/app/oracle/product/oracle/temp/naetab18456: no such file or directory

/u01/app/oracle/product/oracle/bin/adapters: line 71:

/u01/app/oracle/product/oracle/temp/nautab18456: no such file or directory
/u01/app/oracle/product/oracle/bin/adapters: line 296:

/u01/app/oracle/product/oracle/temp/exec18456: no such file or directory

Oracle Net transport protocols linked with oracle are:

Oracle Net naming methods linked with oracle are:

Oracle Advanced Security options linked with oracle are:

The solution for this is to relink oracle again. Make sure you create the missing $TMPDIR first or change $TMPDIR to point to a existing and writeable directory:

oracle@host#:> cd $ORACLE_HOME/network/lib
oracle@host#:> make -f ins_net_client.mk ntcontab.o
oracle@host#:> cd $ORACLE_HOME/bin
oracle@host#:> genclntsh
oracle@host#:> cd $ORACLE_HOME/rdbms/lib
oracle@host#:> make -f ins_rdbms.mk install
oracle@host#:> cd $ORACLE_HOME/sqlplus/lib
oracle@host#:> make -f ins_sqlplus.mk install
oracle@host#:> cd $ORACLE_HOME/network/lib
oracle@host#:> make -f ins_net_client.mk install
oracle@host#:> cd $ORACLE_HOME/network/lib
oracle@host#:> make -f ins_net_server.mk install

You can verify it by running “adapters oracle” again:

oracle@host#:> cd $ORACLE_HOME
oracle@host#:~/bin> adapters oracle

Oracle Net transport protocols linked with oracle are:

 IPC
 BEQ
 TCP/IP
 SSL
 SDP/IB
 RAW

Oracle Net naming methods linked with oracle are:

 Local Naming (tnsnames.ora)
 Oracle Directory Naming
 Oracle Host Naming

Oracle Advanced Security options linked with oracle are:

 RC4 40-bit encryption
 RC4 56-bit encryption
 RC4 128-bit encryption
 RC4 256-bit encryption
 DES40 40-bit encryption
 DES 56-bit encryption
 3DES 112-bit encryption
 3DES 168-bit encryption
 AES 128-bit encryption
 AES 192-bit encryption
 AES 256-bit encryption
 MD5 crypto-checksumming
 SHA-1 crypto-checksumming
 Kerberos v5 authentication
 RADIUS authentication
oracle@host#:> cd $ORACLE_HOME
oracle@host#:~/bin> adapters oracle

/u01/app/oracle/product/oracle/bin/adapters: line 70:
/u01/app/oracle/product/oracle/temp/naetab18456: no such file or directory

/u01/app/oracle/product/oracle/bin/adapters: line 71:
/u01/app/oracle/product/oracle/temp/nautab18456: no such file or directory/u01/app/oracle/product/oracle/bin/adapters: line 296:
/u01/app/oracle/product/oracle/temp/exec18456: no such file or directory

Oracle Net transport protocols linked with oracle are:

Oracle Net naming methods linked with oracle are:

Oracle Advanced Security options linked with oracle are:

Posted in Oracle in general | 1 Comment

ASM resilvering – or – how to recover your crashed cluster – Test no 3

Test #3: Overwriting the ASM disk header with ASM disk group being offline

In this post we will dig a little bit deeper with ASM. We will overwrite the ASM disk header on one disk while the disk group is offline, check the results and repair the LUN again.

Overwrite the disk header of disk “DISK003A”

dd if=/dev/random bs=8k count=1 of=/dev/sdg1

Try to mount the disk group

SQL> alter diskgroup DATA2 mount;
alter diskgroup data2 mount
*
ERROR at line 1:
ORA-15032: not all alterations performed
ORA-15040: diskgroup is incomplete
ORA-15042: ASM disk "0" is missing from group number "2"

Continue reading

Posted in Oracle in general | 8 Comments

Useful ASM scripts and queries

This short post covers some useful ASM queries and scripts.

chk_asm_mapping.sh

The following is a script i found somewhere on the web. Unfortunately it was not totally correct for newer asm libraries. The script itself:

/etc/init.d/oracleasm querydisk -d `/etc/init.d/oracleasm listdisks -d` |
cut -f2,10,11 -d" " | perl -pe 's/"(.*)".*\[(.*), *(.*)\]/$1 $2 $3/g;' |
while read v_asmdisk v_minor v_major
do
v_device=`ls -la /dev | grep " $v_minor, *$v_major " | awk '{print $10}'`
echo "ASM disk $v_asmdisk based on /dev/$v_device [$v_minor, $v_major]"
done

When started it produces the following output:

ASM disk DISK001A based on /dev/sdc1 [8, 33]
ASM disk DISK001B based on /dev/sdb1 [8, 17]
ASM disk DISK002A based on /dev/sdd1 [8, 49]
ASM disk DISK002B based on /dev/sde1 [8, 65]
ASM disk DISK003A based on /dev/sdg1 [8, 97]
ASM disk DISK003B based on /dev/sdf1 [8, 81]

Disk Group Information

set pages 40000 lines 120
col NAME for a15
select GROUP_NUMBER DG#, name, ALLOCATION_UNIT_SIZE AU_SZ, STATE,
TYPE, TOTAL_MB, FREE_MB, OFFLINE_DISKS from v$asm_diskgroup;
DG# NAME  AU_SZ   STATE    TYPE   TOTAL_MB FREE_MB OFFLINE_DISKS
--- ----- ------- -------- ------ ------   ------- -------------
 1  DATA2 4194304 MOUNTED  NORMAL 40952    4168    0
 2  DATA  1048576 MOUNTED  NORMAL 30717    25667   0

ASM Disk Information

set pages 40000 lines 120
col PATH for a30
select DISK_NUMBER,MOUNT_STATUS,HEADER_STATUS,MODE_STATUS,STATE,
PATH FROM V$ASM_DISK;

DISK_NUMBER MOUNT_S HEADER_STATU MODE_ST STATE    PATH
----------- ------- ------------ ------- -------- -------------
0           CLOSED  MEMBER       ONLINE  NORMAL   ORCL:DISK001A
3           CLOSED  MEMBER       ONLINE  NORMAL   ORCL:DISK003B
1           CACHED  MEMBER       ONLINE  NORMAL   ORCL:DISK001B
2           CACHED  MEMBER       ONLINE  NORMAL   ORCL:DISK002A
3           CACHED  MEMBER       ONLINE  NORMAL   ORCL:DISK002B

Combined ASM Disk and ASM Diskgroup information

The following query combines ASM disk and diskgroup information. You can edit this query to suit your needs easily. If you use ASM files instead of disks you have to use v$asm_file instead of v$asm_disk. If you use ASM files you have to add v$asm_file to the query.

col PATH for a15
col DG_NAME for a15
col DG_STATE for a10
col FAILGROUP for a10

select dg.name dg_name, dg.state dg_state, dg.type, d.disk_number dsk_no,
d.path, d.mount_status, d.FAILGROUP, d.state 
from v$asm_diskgroup dg, v$asm_disk d
where dg.group_number=d.group_number
order by dg_name, dsk_no;

DG_NAME  DG_STATE TYPE   DSK_NO PATH          MOUNT_S FAILGROUP  STATE
-------- -------- ------ ------ ------------- ------- ---------- --------
DATA     MOUNTED  NORMAL 1      ORCL:DISK001B CACHED  DISK001B   NORMAL
DATA     MOUNTED  NORMAL 2      ORCL:DISK002A CACHED  DISK002A   NORMAL
DATA     MOUNTED  NORMAL 3      ORCL:DISK002B CACHED  DISK002B   NORMAL
DATA2    MOUNTED  NORMAL 1      ORCL:DISK003A CACHED  DISK003A   NORMAL
DATA2    MOUNTED  NORMAL 2      ORCL:DISK003B CACHED  DISK003B   NORMAL

Monitoring ASM disk operations

select GROUP_NUMBER, OPERATION, STATE, ACTUAL, SOFAR, EST_MINUTES from v$asm_operation;

If there is a operating going on (like rebalancing) the query will return some rows. For instance for our just added disk the query yields:

GROUP_NUMBER OPERA STAT    ACTUAL  SOFAR      EST_MINUTES
------------ ----- ---- ---------- ---------- -----------
2            REBAL RUN           1         49          16
Posted in Oracle in general | 5 Comments

ASM resilvering – or – how to recover your crashed cluster – Test no 2

In this post we will terminating the private network interconnect between the two nodes by moving node “rac1” lan interfaces to a different vlan. This will result in still connected network ports but interrupted connectivity.

Continue reading

Posted in Oracle in general | 2 Comments

Oracle 11g Release 2 – AWR emergency purging

During the night my SYSAUX tablespace filled up to the maximum allowed size so AWR could not save it´s data. In the database alert.log i found the following:

Tue Oct 06 02:02:38 2009
ORA-1688: unable to extend table SYS.WRH$_ACTIVE_SESSION_HISTORY
  partition WRH$_ACTIVE_270645195_176 by 8 in tablespace SYSAUX
MMON Flush encountered SYSAUX out of space error(1688).
MMON (emergency) purge of WR snapshots (105) and older

It seems if there is not enough space available for saving AWR data AWR will automatically delete the oldest snapshots to free up space. Nice feature :-)

Posted in Oracle in general | 1 Comment

ASM resilvering – or – how to recover your crashed cluster – Test no 1

In the first test we will just disable one node under normal load. Lets see what the different components (clusterware, asm instance, advm/acfs) report:

Continue reading

Posted in Oracle in general | 2 Comments

ASM resilvering – or – how to recover your crashed cluster – the Environment

We use the same environment we installed here which consists of two virtual machines running Oracle Enterprise Linux 5 Update 3 64-bit with Oracle 11g Release 2 (Grid Infrastructure and Oracle binaries) installed. All data is placed in ASM with normal redundancy. Storage is exported from two Netapp Filers via iSCSI.

Continue reading

Posted in Oracle in general | 1 Comment

Is it allowed to quote Metalink Notes completely or partially?

Writing posts for this blog i wondered: Am i allowed to post Metalink Notes entirely or partially in my blog posts?

Looking at the web i did not find any information. I might have read the Oracle Support License but due to lengthiness – and because i am not a lawyer i refused from this :-)

So i opened a SR and asked my question. This is the reply i got:

The customer can refer to Document IDs in My Oracle Support, but is NOT allowed to reproduce any Support content which currently requires a login to Oracle Support, doing so would violate the terms of their Oracle Support License.

I guess that answers the question entirely.

Posted in Oracle in general | Leave a comment