Your private tape library – a pre-configured virtual machine with MHVTL and iSCSI export capability

I have been using MHVTL for quite a while now. MHVTL emulates a configurable tape library and stores the data in the file system. Normally, when testing, I install the backup software directly on the machine where MHVTL runs. But that only works for simple configurations.

For complex configurations, for instance with drive sharing, this approach does not work. A few days ago I stumbled across the idea of exporting the tape library and the drives via iSCSI. So I built a VM based on Oracle Enterprise Linux 6 Update 2 plus the most recent MHVTL version plus a self-compiled tgt to test this. And in fact: it works without any major problems.

With this blog post I would like to share my VM with everyone who is interested. You can download the VM here (728 MB). The root password is ‘root’.

Continue reading

Posted in Networker | 7 Comments

RMAN-10008/RMAN-04006: error from auxiliary database: ORA-12537: TNS:connection closed

Just a short note:

When doing a “duplicate from active database” with a large number of channels you need to set the PROCESSES parameter in your parameter file to a reasonably high value (e.g. 500). Otherwise the duplicate will fail with:

RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of Duplicate Db command at 03/30/2012 13:56:22
RMAN-05501: aborting duplication of target database
RMAN-12001: could not open channel ORA_AUX_DISK_60
RMAN-10008: could not create channel context
RMAN-10003: unable to connect to target database
RMAN-04006: error from auxiliary database: ORA-12537: TNS:connection closed
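
A quick way to verify the setting before starting the duplicate could look like the sketch below. It assumes a plain-text parameter file (pfile) with lower-case parameter names; get_pfile_param and check_processes are made-up helper names, and the default threshold of 500 is just the example value from above.

```shell
# get_pfile_param: hypothetical helper that prints the value of a parameter
# from a text parameter file; prints nothing if the parameter is not set.
# Accepts "processes=150" as well as "*.processes = 150". If the parameter
# appears more than once, this helper reports the last line.
get_pfile_param() {
    name=$1 pfile=$2
    sed -n -E "s/^[[:space:]]*(\*\.)?${name}[[:space:]]*=[[:space:]]*([^[:space:]#]+).*/\2/p" "$pfile" | tail -n 1
}

# check_processes: report whether PROCESSES is unset or below the wanted minimum.
check_processes() {
    pfile=$1 min=${2:-500}
    current=$(get_pfile_param processes "$pfile")
    if [ -z "$current" ] || [ "$current" -lt "$min" ]; then
        echo "processes=${current:-unset} is below $min"
        return 1
    fi
    echo "processes=$current is ok"
}
```

Called as check_processes /path/to/init.ora (the path is a placeholder), it returns a non-zero exit code when the value should be raised.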

Posted in Oracle in general | Leave a comment

Oracle announces support for 11g R2 databases on OEL 6

Oracle just released a statement announcing the immediate availability of, and support for, Oracle Database 11g R2 on Oracle Enterprise Linux 6.

You can read the official announcement here.

Posted in Oracle in general | Leave a comment

PRVF-5300: Failed to retrieve active version for CRS on this node when installing 11.2.0.2 DB on 11.2.0.3.0 Grid Infrastructure

I just played with the 11.2.0.3.0 patchset on Linux x86_64 (in my test case Oracle Enterprise Linux 5.6) and tried to install an 11.2.0.2.0 database on top of the 11.2.0.3.0 Grid Infrastructure. It fails with:

PRVF-5300: Failed to retrieve active version for CRS on this node


The error stack in the installation log is:

ID: oracle.install.commons.util.exception.DefaultErrorAdvisor:745
oracle.cluster.verification.VerificationException: An internal error occurred within cluster verification framework

ERRORMSG(linux): PRVF-5300 : Failed to retrieve active version for CRS on this node
        at oracle.cluster.verification.ClusterVerification.getPreReqTasksForSIDBInst(ClusterVerification.java:615)
        at oracle.install.ivw.db.action.PrereqAction.getProductVerificationTasks(PrereqAction.java:111)
        at oracle.install.commons.base.interview.common.action.AbstractPrereqAction.execute(AbstractPrereqAction.java:86)
        at oracle.install.commons.flow.AbstractFlowExecutor.startAction(AbstractFlowExecutor.java:358)
        at oracle.install.commons.flow.AbstractFlowExecutor.enterVertex(AbstractFlowExecutor.java:571)
        at oracle.install.commons.flow.AbstractFlowExecutor.transition(AbstractFlowExecutor.java:333)
        at oracle.install.commons.flow.AbstractFlowExecutor.nextState(AbstractFlowExecutor.java:268)
        at oracle.install.commons.flow.AbstractFlowExecutor.nextViewState(AbstractFlowExecutor.java:227)
        at oracle.install.commons.flow.DefaultFlowNavigator.goForward(DefaultFlowNavigator.java:58)
        at oracle.install.commons.flow.jewt.FlowWizard$1.run(FlowWizard.java:125)
        at oracle.install.commons.flow.jewt.FlowWizard$TransitionManager$1.run(FlowWizard.java:101)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:417)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:269)
        at java.util.concurrent.FutureTask.run(FutureTask.java:123)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:651)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:676)
        at java.lang.Thread.run(Thread.java:595)

The problem

I started the installer with debugging enabled (just add “-debug -logLevel finest >inst1.out 2>inst2.out”). The log files gave some insight:

[Version.getVersion:497]  version String is 11.2.0.3.0
[Version.getVersion:498]  new Version().toString is 11.2.0.2.0
[VerificationUtil.getSIHAReleaseVersionObj:4986]  Configuration Exception:
PRKC-1137 : Unable to find Version object with string value 11.2.0.3.0
[VerificationUtil.getCRSUser:1362]  Active Version = null

The related query command is

"GI_HOME/bin/crsctl query has releaseversion"

Obviously the 11.2.0.2.0 installer cannot handle the version string “11.2.0.3.0”.
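
To see the exact version string the installer gets back, the output of the query command can be reduced to the part in square brackets. This is only a small sketch: it assumes the output has the same shape as the line echoed by the wrapper script in Solution #2, and extract_version is a made-up helper name.

```shell
# extract_version: read crsctl output on stdin and print the text between
# the square brackets, e.g. for a line such as
#   Oracle High Availability Services release version on the local node is [11.2.0.3.0]
extract_version() {
    sed -n 's/.*\[\([0-9.]*\)\].*/\1/p'
}

# Usage (GI_HOME must point to the grid infrastructure home):
#   "$GI_HOME/bin/crsctl" query has releaseversion | extract_version
```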

Solution #1

The simplest approach is to start the installer like this:

./runInstaller -ignorePrereq

With that the installer skips all pre-installation tests.

Solution #2

Another simple approach is to create a wrapper around crsctl that reports a version of 11.2.0.2.0 when queried for the releaseversion:

cd $GRID_HOME/bin
mv crsctl crsctl.orig

Now create an executable script “crsctl” with the following contents:

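#!/bin/sh
# Wrapper around the real crsctl binary, which was renamed to crsctl.orig.
EXEC=/u01/app/oragrid/product/11.2.0.3.0/bin/crsctl.orig
case $1 in
query)
        # Answer every "query" call with 11.2.0.2.0 so the installer accepts it.
        echo "Oracle High Availability Services release version on the local node is [11.2.0.2.0]"
        ;;
*)
        # Pass all other commands through to the real binary unchanged.
        "$EXEC" "$@"
        ;;
esac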

You can now start the database installation. During the verification steps the installer might report the Oracle Restart Registry as invalid. Just ignore it. The installation should now run fine.

Note that this bug is NOT related to OEL 5.6. It is the installer which cannot deal with the version string of the newer grid infrastructure. So you will face this error on OEL 6, RedHat and SuSE as well.

Don't forget to revert the changes after the installation!
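
Reverting simply means moving the original binary back over the wrapper script. A minimal sketch, assuming the wrapper was set up as described in Solution #2; revert_crsctl_wrapper is a made-up helper name.

```shell
# revert_crsctl_wrapper: hypothetical helper that undoes the wrapper setup in
# the given bin directory by moving crsctl.orig back over the wrapper script.
revert_crsctl_wrapper() {
    bindir=$1
    if [ ! -f "$bindir/crsctl.orig" ]; then
        echo "no crsctl.orig in $bindir, nothing to revert" >&2
        return 1
    fi
    # Overwrites the wrapper script with the original binary.
    mv -f "$bindir/crsctl.orig" "$bindir/crsctl"
}

# Usage: revert_crsctl_wrapper "$GRID_HOME/bin"
```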

After the installation finished I was able to create a database using ASM without any problems. Registering the database with Oracle Restart also worked fine.

Posted in Oracle in general | 9 Comments

INFO: task blocked for more than 120 seconds.

When running heavy workloads on UEK kernels on systems with a lot of memory you might see the following errors in /var/log/messages:

INFO: task bonnie++:31785 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
bonnie++      D ffff810009004420     0 31785  11051               11096 (NOTLB)
ffff81021c771aa8 0000000000000082 ffff81103e62ccc0 ffffffff88031cb3
ffff810ac94cd6c0 0000000000000007 ffff810220347820 ffffffff80310b60
00016803dfd77991 00000000001312ee ffff810220347a08 0000000000000001
Call Trace:
[<ffffffff88031cb3>] :jbd:do_get_write_access+0x4f9/0x530
[<ffffffff800ce675>] zone_statistics+0x3e/0x6d
[<ffffffff88032002>] :jbd:start_this_handle+0x2e5/0x36c
[<ffffffff800a28b4>] autoremove_wake_function+0x0/0x2e
[<ffffffff88032152>] :jbd:journal_start+0xc9/0x100
[<ffffffff88050362>] :ext3:ext3_write_begin+0x9a/0x1cc
[<ffffffff8000fda3>] generic_file_buffered_write+0x14b/0x675
[<ffffffff80016679>] __generic_file_aio_write_nolock+0x369/0x3b6
[<ffffffff80021850>] generic_file_aio_write+0x65/0xc1
[<ffffffff8804c1b6>] :ext3:ext3_file_write+0x16/0x91
[<ffffffff800182df>] do_sync_write+0xc7/0x104
[<ffffffff800a28b4>] autoremove_wake_function+0x0/0x2e
[<ffffffff80062ff0>] thread_return+0x62/0xfe
[<ffffffff80016a81>] vfs_write+0xce/0x174
[<ffffffff80017339>] sys_write+0x45/0x6e
[<ffffffff8005d28d>] tracesys+0xd5/0xe0

This is a known bug. By default Linux uses up to 40% of the available memory for caching file system data that has not been written yet (dirty pages). After this mark has been reached the file system flushes all outstanding data to disk, which makes all following IOs synchronous. A task that stays blocked for more than 120 seconds (the default limit) is reported with the message above. In this case the IO subsystem is not fast enough to flush the data within 120 seconds. This happens especially on systems with a lot of memory.
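
A rough back-of-the-envelope calculation shows why a lot of memory makes this worse. The sketch below uses the simplified model from the paragraph above (the whole dirty threshold has to reach the disk within the time limit). The 64 GB machine is an assumed example and flush_mb_per_sec is a made-up helper name.

```shell
# flush_mb_per_sec: sustained write rate (MB/s, rounded down) needed to flush
# dirty_ratio percent of mem_gb GB of memory within timeout_secs seconds.
flush_mb_per_sec() {
    mem_gb=$1 dirty_ratio=$2 timeout_secs=${3:-120}
    echo $(( mem_gb * 1024 * dirty_ratio / (100 * timeout_secs) ))
}

flush_mb_per_sec 64 40    # 64 GB RAM, 40% mark: prints 218
flush_mb_per_sec 64 10    # 64 GB RAM, 10% mark: prints 54
```

So with the 40% mark the storage in this example would have to sustain more than 200 MB/s for two minutes, while with a 10% mark about a quarter of that is enough.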

The problem is solved in later kernels and there is no “fix” from Oracle. I worked around it by lowering the mark for flushing the cache from 40% to 10% by setting “vm.dirty_ratio=10” in /etc/sysctl.conf. This setting does not influence overall database performance since you hopefully use Direct IO and bypass the file system cache completely.
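
The change to /etc/sysctl.conf can be made by hand or scripted. The following is a sketch only: set_sysctl_conf is a made-up helper name, and it treats the key as a regular expression, which is good enough for a key like vm.dirty_ratio.

```shell
# set_sysctl_conf: hypothetical helper that sets "key=value" in a sysctl-style
# config file (default /etc/sysctl.conf), replacing an existing assignment.
set_sysctl_conf() {
    key=$1 value=$2 conf=${3:-/etc/sysctl.conf}
    tmp=$(mktemp) || return 1
    # Keep every line except an existing assignment of this key ...
    grep -v -E "^[[:space:]]*${key}[[:space:]]*=" "$conf" > "$tmp"
    # ... then append the new assignment.
    echo "${key}=${value}" >> "$tmp"
    # Write back in place so ownership and permissions of the file are kept.
    cat "$tmp" > "$conf" && rm -f "$tmp"
}

# Usage (as root):
#   set_sysctl_conf vm.dirty_ratio 10 && sysctl -p
```

Running “sysctl -p” afterwards loads the settings from /etc/sysctl.conf, so no reboot is needed.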

Posted in Oracle in general | 23 Comments

11.2.0.3.0 is out

Oracle released database patchset 11.2.0.3.0 almost two weeks ago.

I am a little bit late with this news. Martin Bach has already posted a lot about it, so let me link to his guides:

At the moment 11.2.0.3.0 is available for Linux (32/64 bit) and Solaris SPARC.

Posted in Oracle in general | Leave a comment

ext4 file systems and the 16 TB limit – how to *solve* it

File systems have limits. That's no surprise. ext3 had a limit of 16 TB file system size. If you needed more space you had to use another file system, for instance XFS or JFS, or split the capacity across multiple mount points.

ext4 was designed to allow far larger file systems than ext3. According to Wikipedia ext4 has a maximum file system size of 1 EiB (approx. one exabyte, i.e. 1024 PiB or 1024*1024 TiB).

Now if you try to create one single file system larger than 16 TB with ext4 on any Linux distribution out there (including OEL 6.1; as of 18th August 2011) you will end up with:

[root@localhost ~]# mkfs.ext4 /dev/iscsi/test
mke4fs 1.41.9 (22-Aug-2009)
mkfs.ext4: Size of device /dev/iscsi/test too big to be expressed in 32 bits using a blocksize of 4096.
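
The 16 TB figure follows directly from the error message: with 32-bit block numbers there can be at most 2^32 blocks, and with a block size of 4096 bytes that is 16 TiB. As a small worked calculation (max_fs_tib is a made-up helper name):

```shell
# max_fs_tib: largest file system size in TiB when block numbers are
# 32 bits wide, for a given block size in bytes.
max_fs_tib() {
    block_size=$1
    echo $(( (1 << 32) * block_size / (1 << 40) ))
}

max_fs_tib 4096    # prints 16
```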

This post is about how to solve the issue.

Continue reading

Posted in UNIX | 28 Comments

When patching is not enough: Oracle 11g R2 on Solaris SPARC requires fresh base installation of Solaris 10 U6

While checking MOS I found an interesting note (ID 964976.1) which states:

Applying a kernel patch or a Solaris patch bundle is not the equivalent
to  installing the specific Solaris 10 "update 6" image. 11gR2 RDBMS software
is  only certified for a base install image of Solaris 10 update 6 or greater.

There is a FAQ (ID 971464.1) on this problem. Here it states:

Oracle/Sun has specifically stated that "installing patches will not bring it to Update 6".

and

It is only certified for a base install image of Solaris 10 Update 6  or greater, or an
upgraded image of an earlier Solaris 10 update to at least  Update 6 or greater. There are
only two methods to accomplish this " image".  Please see Question #9 for more details.

So keep this in mind when installing 11g R2 on Solaris SPARC.

Posted in Oracle in general | Leave a comment

Installing a paravirtualized guest using PXE and Kickstart on Oracle VM 2.2

Over the past few days I started working with Oracle VM 2.2.1.

The task I tried to accomplish was to install Oracle Enterprise Linux in a paravirtualized guest via PXE and Kickstart. That doesn't sound too complicated, does it? If you are familiar with VMware and know how easy this task is there, be advised: with Oracle VM it is complicated.

Continue reading

Posted in Oracle VM 2.2 | Leave a comment

11.2.0.2: two critical bugs

Just found two nice bugs in Metalink for 11.2.0.2.x:

Bug #1 (10205230) [Note ID 1318986.1]: ORA-00600 or DATA CORRUPTION in RAC Environments when using shutdown mode "normal", "transactional" or "immediate" on 11.2.0.2.1 and 11.2.0.2.0.

This bug is fixed in 11.2.0.2 PSU 2.

Now if you install 11.2.0.2 PSU 2 you might find the next major bug:

Bug #2 (12431716): Mutex waits may cause higher CPU usage in 11.2.0.2.2 PSU / GI PSU [ID 12431716.8]

Posted in Oracle in general | 1 Comment