It's been a while since my last post because I was busy with several projects. One of them involved installing a RAC cluster with ASM in normal redundancy mode. This article covers my experiences installing that configuration.
The customer requested the installation of a two-node RAC cluster running a 10g Release 2 database. Storage came from two EMC AX25 storage systems, each attached directly to both nodes via Fibre Channel. The cluster was installed with 11.1.0.7.0 Clusterware and 11.1.0.7.0 ASM. The database to be run was 10.2.0.4.2. With ASM in normal redundancy mode, the two storage arrays were mirrored against each other. Everything worked well. Too well. So after installing the whole cluster and setting up the database instance, we performed some tests:
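To make the mirroring layout concrete: in normal redundancy mode ASM keeps the two copies of each extent in different failure groups, so mirroring one array against the other presumably means one failure group per array. The following is a minimal sketch that only builds the corresponding `CREATE DISKGROUP` statement as text. The disk group name, the failure group names and the device paths are hypothetical placeholders, not the values used in this project.

```python
def create_diskgroup_sql(name, failgroups, redundancy="NORMAL"):
    """Build a CREATE DISKGROUP statement with one failure group per storage array.

    failgroups maps a failure group name to the list of disk paths it contains.
    All names and paths passed in are illustrative assumptions.
    """
    # Normal redundancy mirrors across failure groups, so at least two are needed.
    if redundancy == "NORMAL" and len(failgroups) < 2:
        raise ValueError("normal redundancy needs at least two failure groups")

    lines = [f"CREATE DISKGROUP {name} {redundancy} REDUNDANCY"]
    for fg_name, disks in failgroups.items():
        disk_list = ", ".join(f"'{disk}'" for disk in disks)
        lines.append(f"  FAILGROUP {fg_name} DISK {disk_list}")
    return "\n".join(lines) + ";"


# Hypothetical layout: one failure group per storage array.
print(create_diskgroup_sql(
    "DATA",
    {
        "ARRAY_A": ["/dev/mapper/array_a_lun1", "/dev/mapper/array_a_lun2"],
        "ARRAY_B": ["/dev/mapper/array_b_lun1", "/dev/mapper/array_b_lun2"],
    },
))
```

The generated statement would be run in the ASM instance. Keeping all LUNs of one array in the same failure group is what lets the disk group survive the loss of a whole array.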
The first test was to interrupt the connection (pull the cable!) between storage array A and node A. From my experience with 11g R2 and ASM in normal redundancy mode, I expected the database to stay up and running. To my surprise, the database instance on node A crashed. On top of that, I was unable to start the instance again. I cannot give the exact error messages because the project has ended and I am not allowed to disclose them. Among several ORA-00600 errors, I also saw messages saying the database was unable to write to the control file and to open the redo logs. That was strange, because ASM had already started dropping the missing disks from the disk group. Dropping the missing disks was not that easy either. After re-adding the disks to the disk group, I tried selectively deleting a single LUN at the operating system level, leaving the communication paths, links and so on intact. The result did not change: the database instance on the affected node crashed and I was unable to start it again. After a search on Metalink yielded nothing, we decided to give 11.1.0.7.1 a try:
After installing 11.1.0.7.1, creating a database with it and increasing the disk group compatibility from 10.2 to 11.1, we again interrupted the connection between one storage array and one node. This time the instance hit an I/O error and was terminated by the log writer, but it restarted cleanly a few seconds later, which is a great improvement over 10.2. After some discussion, the customer decided to go with 11.1.0.7.1 instead of 10.2.0.4.2, and we continued our tests, all of which completed successfully. These tests included, for instance, interrupting the communication between one storage array and both nodes and re-establishing the connection after waiting 15 minutes.
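Raising the disk group compatibility is done through the disk group attributes `compatible.asm` and `compatible.rdbms`. The text above does not say which of the two were raised, so the sketch below assumes both. It only generates the `ALTER DISKGROUP` statements as text; the disk group name `DATA` is a hypothetical placeholder.

```python
def parse_version(text):
    """Turn a dotted version string such as '11.1' into a comparable tuple."""
    return tuple(int(part) for part in text.split("."))


def compat_statements(diskgroup, asm, rdbms):
    """Build the statements that raise a disk group's compatibility attributes.

    compatible.rdbms may not be set higher than compatible.asm, so the ASM
    attribute is emitted first and the combination is checked up front.
    """
    if parse_version(rdbms) > parse_version(asm):
        raise ValueError("compatible.rdbms cannot be higher than compatible.asm")
    return [
        f"ALTER DISKGROUP {diskgroup} SET ATTRIBUTE 'compatible.asm' = '{asm}';",
        f"ALTER DISKGROUP {diskgroup} SET ATTRIBUTE 'compatible.rdbms' = '{rdbms}';",
    ]


# Hypothetical disk group name; versions taken from the upgrade described above.
for statement in compat_statements("DATA", "11.1", "11.1"):
    print(statement)
```

Keep in mind that advancing these attributes cannot be reversed, and that once `compatible.rdbms` is at 11.1 a 10.2 database instance can no longer use the disk group. That makes this a step to take only after the decision to leave 10.2 behind.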
My conclusion: ASM in a mirrored configuration can be used from 11g Release 1 onwards and works well with 11g R2 (see my tests with 11g R2 and ASM in normal redundancy configuration). With 10g databases, ASM should rely on external redundancy.
As always: please comment! I'm looking forward to hearing about your experiences with ASM in normal or high redundancy configurations and 10g R2 or 11g R1 databases.