Query for Tablespace usage with Autoextend

Sine i need to check the maximum size of tablespace with autoextend enabled i write the following query:

set linesize 100
set pagesize 100

select
                a.tablespace_name,
                round(SUM(a.bytes)/(1024*1024*1024)) CURRENT_GB,
                round(SUM(decode(b.maxextend, null, A.BYTES/(1024*1024*1024), 
                b.maxextend*8192/(1024*1024*1024)))) MAX_GB,
                (SUM(a.bytes)/(1024*1024*1024) - round(c.Free/1024/1024/1024)) USED_GB,
                round((SUM(decode(b.maxextend, null, A.BYTES/(1024*1024*1024), 
                b.maxextend*8192/(1024*1024*1024))) - (SUM(a.bytes)/(1024*1024*1024) - 
                round(c.Free/1024/1024/1024))),2) FREE_GB,
                round(100*(SUM(a.bytes)/(1024*1024*1024) - 
                round(c.Free/1024/1024/1024))/(SUM(decode(b.maxextend, null, A.BYTES/(1024*1024*1024), 
                b.maxextend*8192/(1024*1024*1024))))) USED_PCT
from
                dba_data_files a,
                sys.filext$ b,
                (SELECT
                               d.tablespace_name ,sum(nvl(c.bytes,0)) Free
                FROM
                               dba_tablespaces d,
                               DBA_FREE_SPACE c
                WHERE
                               d.tablespace_name = c.tablespace_name(+)
                               group by d.tablespace_name) c
WHERE
                a.file_id = b.file#(+)
                and a.tablespace_name = c.tablespace_name
GROUP BY a.tablespace_name, c.Free/1024
ORDER BY tablespace_name;

The result looks like this:

TABLESPACE_NAME                CURRENT_GB     MAX_GB    USED_GB    FREE_GB   USED_PCT
------------------------------ ---------- ---------- ---------- ---------- ----------
SYSAUX                                 16         32          1         31          3
SYSTEM                                 16         32          0         32          0
TEST                                   10         11          0      10.85          0
UNDOTBS1                               16      32768          0      32768          0
UNDOTBS2                               16      32768          0      32768          0
UNDOTBS3                               16      32768          0      32768          0
UNDOTBS4                               16      32768          0      32768          0
UNDOTBS5                               16      32768          0      32768          0
UNDOTBS6                               16      32768          0      32768          0
UNDOTBS7                               16      32768          0      32768          0
UNDOTBS8                               16      32768          0      32768          0
USERS                                   1         32          0         32          0

 

Nexenta: Configuring iSCSI Multipath Target with multiple network cards

This is a short post about how to configure iSCSI multipathing using Nexenta. I let it is neccessary to document it since the documentation is rather spare about it.

In my lab setup i had a machine with two physically separated interfaces:

  • e1000g0: 192.168.1.5/24   and
  • e1000g1: 192.168.10.1/24

As you can see both interfaces are on different subnets. Note that it is a very good idea (i´d call it: “required”) to use a separate subnet for each interface.  It will save you a lot of trouble and problems!

If you do not comply with this simple rule you will end up having problems with so called Arp Flux (also documented here, here, here and so on) which requires further modifications.

Note: If you modify and existing installation you only need to execute Step #2 and Step #3. Everything else should already be there.

Continue reading Nexenta: Configuring iSCSI Multipath Target with multiple network cards

Using ASMLib with Oracle Unbreakable Kernel (UEK)

Starting with the Oracle Unbreakable Enterprise Linux Kernel (UEK) the ASMlib drivers are now included in the kernel. This means that updating the kernel does not require installing the appropriate kernel modules as well.

In addition to the kernel modules you need the following packages:

  • oracleasm-support
  • oracleasmlib

For Oracle Enterprise Linux “oracleasm-support” can be found on the installation media while “oracleasmlib” is not available on the installation media. “oracleasmlib” can be downloaded over the ULN network (more information here).

If you dont have a ULN account or using Red Hat you can get the freely available packages (for OEL and Red Hat) here.

“Exiting Time2Retain handler because session_reinstatement=1” with LIO (linux-iscsi.org iSCSI implementation)

I just migrated my Ubuntu iSCSI storage server to Ubuntu 12.04 which ships with a new iSCSI implementation: LIO (linux-iscsi.org). Aside from the fact that this project lacks a HUGE amount of documentation I want to share this problem with you:

kernel: Exiting Time2Retain handler because session_reinstatement=1

This happens if you access LUNs from more multiple nodes (as required for RAC setups)  while all nodes share the same initiatorname.

The fix is to use different and unique initiatornames for every node.

RMAN-10008/RMAN-04006: error from auxiliary database: ORA-12537: TNS:connection closed

Just a short note:

When doing a “duplicate from active database” with a larger amount of channels you need to set the PROCESSES setting in your parameter file to a reasonable high value (i.e. 500). Otherwise the duplicate will fail with:

RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of Duplicate Db command at 03/30/2012 13:56:22
RMAN-05501: aborting duplication of target database
RMAN-12001: could not open channel ORA_AUX_DISK_60
RMAN-10008: could not create channel context
RMAN-10003: unable to connect to target database
RMAN-04006: error from auxiliary database: ORA-12537: TNS:connection closed

PRVF-5300: Failed to retrieve active version for CRS on this node when installing 11.2.0.2 DB on 11.2.0.3.0 Grid Infrastructure

I just played with 11.2.0.3.0 patchset on Linux x86_64 (in my testcase Oracle Enterprise Linux 5.6) and tried to install a 11.2.0.2.0 database on it. It fails with:

PRVF-5300: Failed to retrieve active version for CRS on this node


The error stack in the installation log is:

ID: oracle.install.commons.util.exception.DefaultErrorAdvisor:745
oracle.cluster.verification.VerificationException: An internal error occurred within cluster
verification framework

ERRORMSG(linux): PRVF-5300 : Failed to retrieve active version for CRS on this node
        at oracle.cluster.verification.ClusterVerification.getPreReqTasksForSIDBInst(ClusterVerification.java:615)
        at oracle.install.ivw.db.action.PrereqAction.getProductVerificationTasks(PrereqAction.java:111)
        at oracle.install.commons.base.interview.common.action.AbstractPrereqAction.execute
        (AbstractPrereqAction.java:86)
        at oracle.install.commons.flow.AbstractFlowExecutor.startAction(AbstractFlowExecutor.java:358)
        at oracle.install.commons.flow.AbstractFlowExecutor.enterVertex(AbstractFlowExecutor.java:571)
        at oracle.install.commons.flow.AbstractFlowExecutor.transition(AbstractFlowExecutor.java:333)
        at oracle.install.commons.flow.AbstractFlowExecutor.nextState(AbstractFlowExecutor.java:268)
        at oracle.install.commons.flow.AbstractFlowExecutor.nextViewState(AbstractFlowExecutor.java:227)
        at oracle.install.commons.flow.DefaultFlowNavigator.goForward(DefaultFlowNavigator.java:58)
        at oracle.install.commons.flow.jewt.FlowWizard$1.run(FlowWizard.java:125)
        at oracle.install.commons.flow.jewt.FlowWizard$TransitionManager$1.run(FlowWizard.java:101)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:417)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:269)
        at java.util.concurrent.FutureTask.run(FutureTask.java:123)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:651)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:676)
        at java.lang.Thread.run(Thread.java:595)

The problem

I started the installer with debug enabled just add “-debug -logLevel finest >inst1.out 2>inst2.out”). The log files gave some insight:

[Version.getVersion:497]  version String is 11.2.0.3.0
[Version.getVersion:498]  new Version().toString is 11.2.0.2.0
[VerificationUtil.getSIHAReleaseVersionObj:4986]  Configuration Exception:
PRKC-1137 : Unable to find Version object with string value 11.2.0.3.0
[VerificationUtil.getCRSUser:1362]  Active Version = null

The related query command is

"GI_HOME/bin/crsctl query has releaseversion"

Obviously 11.2.0.2.0 installer has problems with the string “11.2.0.3.0”.

Solution #1

The most simple approach is to start the installer like this:

./runInstaller -ignorePrereq

With that the installer skips al pre-installation tests.

Solution #2

One simple approach was to created a wrapper around crsctl to report a version of 11.2.0.2.0 when querying releaseversion:

cd $GRID_HOME /bin
mv crsctl crsctl.orig

Now create a script “crsctl” with the following contents:

EXEC=/u01/app/oragrid/product/11.2.0.3.0/bin/crsctl.orig
case $1 in
query)
 echo "Oracle High Availability Services release version on the local node is [11.2.0.2.0]"
;;
*)
        $EXEC $*
;;
esac

 

You can start the database installation. During the verification steps the installer might report the Oracle Restart Registry as invalid. Just ignore it. The installation should now run fine.

Note that this bug is NOT related to OEL 5.6. It is the installer which cannot deal with the version string of the newer grid infrastructure. So you will face this error on OEL 6, RedHat and SuSE as well.

Dont forget to revert the changes after the installation!

After installation finished i was able to create a database using ASM without any problems. Registering the database into Oracle Restart also worked fine.

INFO: task blocked for more than 120 seconds.

When running some high workloads on UEK kernels on systems with a lot of memory you might see the following errors in /var/log/messages:

 

INFO: task bonnie++:31785 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
bonnie++      D ffff810009004420     0 31785  11051               11096 (NOTLB)
ffff81021c771aa8 0000000000000082 ffff81103e62ccc0 ffffffff88031cb3
ffff810ac94cd6c0 0000000000000007 ffff810220347820 ffffffff80310b60
00016803dfd77991 00000000001312ee ffff810220347a08 0000000000000001
Call Trace:
[<ffffffff88031cb3>] :jbd:do_get_write_access+0x4f9/0x530
[<ffffffff800ce675>] zone_statistics+0x3e/0x6d
[<ffffffff88032002>] :jbd:start_this_handle+0x2e5/0x36c
[<ffffffff800a28b4>] autoremove_wake_function+0x0/0x2e
[<ffffffff88032152>] :jbd:journal_start+0xc9/0x100
[<ffffffff88050362>] :ext3:ext3_write_begin+0x9a/0x1cc
[<ffffffff8000fda3>] generic_file_buffered_write+0x14b/0x675
[<ffffffff80016679>] __generic_file_aio_write_nolock+0x369/0x3b6
[<ffffffff80021850>] generic_file_aio_write+0x65/0xc1
[<ffffffff8804c1b6>] :ext3:ext3_file_write+0x16/0x91
[<ffffffff800182df>] do_sync_write+0xc7/0x104
[<ffffffff800a28b4>] autoremove_wake_function+0x0/0x2e
[<ffffffff80062ff0>] thread_return+0x62/0xfe
[<ffffffff80016a81>] vfs_write+0xce/0x174
[<ffffffff80017339>] sys_write+0x45/0x6e
[<ffffffff8005d28d>] tracesys+0xd5/0xe0

This is a know bug. By default Linux uses up to 40% of the available memory for file system caching. After this mark has been reached the file system flushes all outstanding data to disk causing all following IOs going synchronous. For flushing out this data to disk this there is a time limit of 120 seconds by default. In the case here the IO subsystem is not fast enough to flush the data withing 120 seconds. This especially happens on systems with a lof of memory.

The problem is solved in later kernels and there is not “fix” from Oracle. I fixed this by lowering the mark for flushing the cache from 40% to 10% by setting “vm.dirty_ratio=10” in /etc/sysctl.conf. This setting does not influence overall database performance since you hopefully use Direct IO and bypass the file system cache completely.

When patching is not enough: Oracle 11g R2 on Solaris SPARC requires fresh base installation of Solaris 10 U6

While checking MOS i found an interesting note (ID 964976.1) which states:

Applying a kernel patch or a Solaris patch bundle is not the equivalent
to  installing the specific Solaris 10 "update 6" image. 11gR2 RDBMS software
is  only certified for a base install image of Solaris 10 update 6 or greater.

There is a FAQ (ID 971464.1) on this problem. Here it states:

Oracle/Sun has specifically started that "installing patches will not bring it  to Update 6".

and

It is only certified for a base install image of Solaris 10 Update 6  or greater, or an
upgraded image of an earlier Solaris 10 update to at least  Update 6 or greater. There are
only two methods to accomplish this " image".  Please see Question #9 for more details.

So keep this in mind when installing 11g R2 on Solaris SPARC.