
Oracle databases with ASM in normal redundancy mode

It's been a while since my last post because I was busy with some projects. One of these projects involved installing a RAC cluster with ASM in normal redundancy mode. My experiences installing this configuration are covered in this article.

The customer requested the installation of a 2-node RAC cluster running a 10g Release 2 database. Storage was attached via Fibre Channel from two EMC AX25 storage systems directly attached to both nodes. The cluster was installed with 11.1.0.7.0 Clusterware and 11.1.0.7.0 ASM. The database to be run was 10.2.0.4.2. By using ASM in normal redundancy mode, both storage arrays were mirrored against each other. Everything worked well – too well. So after installing the whole cluster and setting up the database instance we performed some tests:
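The setup described above amounts to an ASM disk group with one failure group per storage array, so every extent is mirrored across the two EMC boxes. The following is only a sketch – the disk group name and device paths are made up for illustration, not taken from the actual installation:

```sql
-- Sketch only: disk group name and device paths are hypothetical.
-- One failgroup per storage array; with NORMAL redundancy ASM keeps
-- two copies of each extent, one in each failgroup.
CREATE DISKGROUP data NORMAL REDUNDANCY
  FAILGROUP array_a DISK '/dev/emcpower/a1', '/dev/emcpower/a2'
  FAILGROUP array_b DISK '/dev/emcpower/b1', '/dev/emcpower/b2';
```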

The first test was to interrupt the connection (pull the cable!) between storage array A and node A. From my experience with 11g R2 and ASM in normal redundancy mode I expected the database to stay up and running. To my surprise the database on node A crashed. In addition, I was unable to start the instance again. I cannot give the exact error messages because the project ended and I am not allowed to disclose them. Among several ORA-00600 messages I also saw messages saying the database was unable to write to the control file and open the redo logs. That was strange because ASM had already started dropping the missing disks from the disk group. In addition, dropping the missing disks was not that easy. After re-adding the disks to the disk group I tried to delete one LUN from the operating system selectively, thus keeping communication, links and so on intact. The result did not change: the database instance on the affected node crashed and I was unable to start the instance again. After a search on Metalink, which yielded nothing, we decided to give 11.1.0.7.1 a try:
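For readers who want to see what the disk-drop part roughly looks like: the commands below are a hedged sketch, reusing the hypothetical disk group and failgroup names from above; they are not the exact commands from the project:

```sql
-- After the cable pull the disks from the unreachable array show up
-- with mount_status = 'MISSING'; this query makes the state visible:
SELECT name, path, mount_status, mode_status, failgroup
  FROM v$asm_disk;

-- Because the disk headers can no longer be read, FORCE is required
-- to drop them; ASM then rebalances onto the surviving failgroup:
ALTER DISKGROUP data DROP DISKS IN FAILGROUP array_a FORCE;
```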

After installing and creating an 11.1.0.7.1 database and increasing the disk group compatibility from 10.2 to 11.1 we tried again to interrupt the connection between one storage array and one node. This time the instance crashed with an I/O error and was terminated by the log writer, but was restarted fine seconds later – a great improvement over 10.2. After some discussion the customer decided to go with 11.1.0.7.1 instead of 10.2.0.4.2, and we continued our tests, which were completed successfully. These tests involved, for instance, interrupting the communication from one storage array to both nodes and re-establishing the connection again (after waiting 15 minutes).
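The compatibility increase mentioned above is done per disk group. A sketch, again assuming the hypothetical disk group name data (note that raising these attributes is irreversible):

```sql
-- Raise disk group compatibility from 10.2 to 11.1:
ALTER DISKGROUP data SET ATTRIBUTE 'compatible.asm'   = '11.1';
ALTER DISKGROUP data SET ATTRIBUTE 'compatible.rdbms' = '11.1';

-- With 11.1 compatibility ASM can keep a failed disk offline for a
-- grace period instead of dropping it immediately (fast mirror resync):
ALTER DISKGROUP data SET ATTRIBUTE 'disk_repair_time' = '3.6h';
```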

My conclusion: ASM in a mirrored configuration can be used from 11g Release 1 onwards and works well with 11g R2 (see my tests with 11g R2 and ASM in normal redundancy configuration). With 10g, ASM should rely on external redundancy.

As always: please comment! I'm looking forward to hearing about your experiences with ASM in normal or high redundancy configurations and 10g R2 or 11g R1 databases.

Categories: Oracle in general
  1. December 18th, 2009 at 10:46 | #1

    Hello,

    I have performed several stability tests at different customer sites with CRS/ASM/RDBMS 10gR2 (10.2.0.4) and normal redundancy. During these tests, losing one of the two storage arrays should have only minimal impact on the availability of server and database – certainly no crashes or even unstartable instances.

    What's important in the case of two storage arrays is the location of the third voting disk, because of the majority rule.

    However, the availability of the ASM disk group will not be impacted by losing one failgroup.

    Regards,
    Martin

  2. Ronny Egner
    December 19th, 2009 at 11:05 | #2

    Hi,

    I certainly agree with you. The third voting disk was located on a third site accessible via NFS. With the same system, Oracle 11g R1 was far more stable than 10.2.0.4.2 (with PSU 2 installed). I have not had the time to investigate this further. The question here is: did you really test it? It SHOULD be no problem losing one failure group. But in fact 10g R2 crashes.

    Ronny
