Removing NFS Locks on Opensolaris / Nexenta (or handling ORA-27086: unable to lock file – already in use when using Oracle over NFS)

Facts

  • Oracle running on Linux / Solaris
  • Oracle data files and control file are stored on NFS
  • NFS server used Opensolaris or Nexenta

Symptom

ORA-00205: error in identifying controlfile, check alert log for more info
ORA-00210: cannot open the specified  control file
ORA-00202: control file:: '/u02/oradata/ORA11P/control02.ctl'
ORA-27086: unable to lock file - already in use
Linux-x86_64 Error: 11: Resource temporarily unavailable
Additional information: 8

Cause

Due to a database crash or unclean shutdown NFS locks were not properly release on storage side. You have to clean them manually in order to be able to startup the database again.

Solution

Step 1 – Remove NFS locks on Opensolaris / Nexenta side

root@nex2:/volumes# clear_locks oracle11
Clearing locks held for NFS client oracle11 on server nex2
clear of locks held for oracle11 on nex2 returned success

Note: It is NOT sufficient to enter the IP adress. You have to use the HOSTNAME here.

Step 2 – Mount & Open the database

oracle@oracle11:/u02/oradata/LIMSTEST> sqlplus / as sysdba

SQL*Plus: Release 11.2.0.2.0 Production on Thu Apr 28 20:53:19 2011
Copyright (c) 1982, 2010, Oracle.  All rights reserved.
Connected to:
Oracle Database 11g Release 11.2.0.2.0 - 64bit Production

SQL> alter database mount;
Database altered.
SQL> alter database open;
Database altered.
Posted in Oracle in general | 1 Comment

ORA-27154: post/wait create failed / ORA-27301: OS failure message: No space left on device when starting ASM or database instance

Today i came accross a very strange error. After rebooting one cluster node (which ran flawlessly before!) ASM instance came up fine but database instance failed with:

ORA-27154: post/wait create failed
ORA-27300: OS system dependent operation:semget failed with status: 28
ORA-27301: OS failure message: No space left on device
ORA-27302: failure occurred at: sskgpsemsper

It turned out the kernel settings were insufficent. The semaphore settings caused problems.

In /etc/sysctl.conf:

kernel.sem = 250 32000 100 128

This line needs to be changed to:

kernel.sem = 250 32000 100 256

Note the change of the last number from 128 to 256. After that applying the settings is done as root with:

sysctl -p

After that all instances came up just fine.

Posted in Oracle in general | 16 Comments

PSU Bundle Patch 3 for 11.2.0.2.0

Bundle Patch 3 for Oracle 11.2.0.2.0 is available.

For installation refer to these posts here and here.

Patch number is: 10387939

If youre worried about the fact the patch says it is for Exadata read this post from Oracle.

Money Quote:

Officially this Bundle Patch for Oracle Database 11.2.0.2 is titled “Exadata Database recommended patch” and got released yesterday. But I would recommend this one to all customers using 11.2.0.2 Grid Infrastructure, RAC and ASM.

Posted in Oracle in general | Leave a comment

RMAN backup validate database on databases in noarchivelog mode

Starting with 11g RMAN seems to allow a “backup validate database” even if the database is not in archive log mode:

SQL*Plus: Release 11.2.0.2.0 Production on Wed Mar 23 11:18:01 2011
Copyright (c) 1982, 2010, Oracle.  All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
SQL> archive log list;
Database log mode              No Archive Mode
Automatic archival             Disabled
Archive destination            /u01/oradata/ORA11/archive
Oldest online log sequence     140
Current log sequence           143

As you can see from the listing above the database is not in archive log mode. Nevertheless doing a “backup validate database” works:

<hostname>:/tmp# rman target /
Recovery Manager: Release 11.2.0.2.0 - Production on Wed Mar 23 11:18:25 2011
Copyright (c) 1982, 2009, Oracle and/or its affiliates.  All rights reserved.
connected to target database: ORA11(DBID=12345678)
RMAN> backup validate datafile 1;
Starting backup at 23-MAR-11
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=124 device type=DISK
channel ORA_DISK_1: starting full datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
input datafile file number=00001 name=/u01/oradata/ORA11/system01.dbf
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:07
List of Datafiles
=================
File Status Marked Corrupt Empty Blocks Blocks Examined High SCN
---- ------ -------------- ------------ --------------- ----------
1    OK     0              47774        307200          404393656
File Name: <name>
Block Type Blocks Failing Blocks Processed
---------- -------------- ----------------
Data       0              237701
Index      0              19286
Other      0              2439

Doing the same with a 10g database yields:

<hostname>:/tmp# rman target /
Recovery Manager: Release 10.2.0.5.0 - Production on Wed Mar 23 11:19:20 2011
Copyright (c) 1982, 2007, Oracle.  All rights reserved.
connected to target database: ORA10 (DBID=12343456)
RMAN> backup validate datafile 1;
Starting backup at 23-MAR-11
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: sid=201 devtype=DISK
channel ORA_DISK_1: starting full datafile backupset
channel ORA_DISK_1: specifying datafile(s) in backupset
RMAN-03009: failure of backup command on ORA_DISK_1 channel at 03/23/2011 11:19:27
ORA-19602: cannot backup or copy active file in NOARCHIVELOG mode
Posted in Oracle in general | 2 Comments

Networker cumulative hotfixes

EMC offers so called cumulative hotfixes which include all available and tested hotfixes for a specific version. These hotfixes can be found at the following URL:

ftp://ftp.legato.com/pub/NetWorker/Cumulative_Hotfixes
Posted in Oracle in general | Leave a comment

How to corrupt a oracle block within a datafile

For corrupting a block dd can be used. Just adjust the “ibs” argument to match your database block size and adjust “count” to specify the amount of blocks to be corrupted. “Seek” specifies how many blocks are skipped (counted from the beginning of the file) before writing the corrupted data.

dd ibs=8192 seek=100 count=110 if=/dev/zero of=test01.dbf conv=notrunc

Caution: This command WILL corrupt your data file. Use it ONLY on test systems and with a valid backup!


							
Posted in Oracle in general | Leave a comment

I am now certified as MCITP MS DBA for SQL Server 2008

Altough not oracle related i successfully completed my certification as an MS SQL Server DBA. In most companies MS SQL is used for smaller databases in addition to oracle. So from my point of view it is a plus.

I am allowed to wear the title mentioned above from now on. But i guess i wont print it on my business cards…. Way too long :-)

Posted in Oracle in general | Leave a comment

Do you know “chopt” ?

Starting with Oracle 11g Release 2 there is a new tool called “chopt” which lets you enable or disable database features such as partitioning, olap and so on without the need to re-install the bianries.

You can read about the utility here.

Posted in Oracle 11g Release 2, Oracle in general | Leave a comment

How to restore an rman backup without any existing control files

This week a came across the following very interesting scenario:

The customer made an RMAN online backup and saved it to an NFS location. Controlfile autobackup was on. During backup the backupset was written to that NFS location. The control file however was written to the controlfile autobackup location (i.e. a different location).

After that the customer destroyed all disk groups and re-installed the database servers.

Afterwards he tried to restore the database which failed because there was no controlfile in the backups at all. In order to make it worse the customer had no log files from the RMAN backup session, no DBID and no trace files whatever – including the alert.log. This leaved us with the following situation:

  • no autobackup anywhere
  • no older controlfile available
  • no snapshot controlfile available
  • no redo log available
  • no backup logfiles
  • no “backup controlfile to trace” information
  • no DBID
  • no alert.log

Under normal circumstances restoring the database from an rman backup without having any control file is impossible. In this article i will show you an approach you might use to restore the datafile anyway.

Continue reading

Posted in Oracle in general | 6 Comments

Networker: Fixing “NSR peer information” errors

Just because i have to use google to find the solution often here the direct link for fixing nsr peer information error messages.

An example error message can be found below:

39078 19.10.2010 12:07:35  nsrexecd GSS error: There is already a machine using the
name: "clientname.domain.com". Either choose a different name for your machine, or
delete the "NSR peer information" entry for "clientname.domain.com" on
host: "networkerserver.domain.com"

Preston De Guise explained in his blog how to fix the error.

Posted in Networker | Leave a comment