11g Release 2 for Windows – Grid Infrastructure Installation Buggy?

I just tried installing Oracle 11g Release 2 on Windows 2008 64-bit and Windows 2008 R2 64-bit. Both systems ran Standard Edition as virtual machines on a VMware server with four cores, with two cores assigned to each virtual machine.

Until now I have been unable to install it successfully. It always fails with:

[INS-20802] Grid Infrastructure Configuration failed

This is a short blog post about my findings. Maybe someone else has experienced this issue as well and leaves a comment.

After the installation had run for some time, it failed on the second node. Some hours of debugging and re-installation later, I suspect the grid installation is buggy, at least on Windows 2008 and Windows 2008 R2. How so? Look at the following pictures, all taken from the SECOND node. The services on the first node (= the installation node) are fine and running. All log files look fine. On the second node I am completely unable to find any error messages in the log files. Everything looks fine… but the Windows event log shows some failed services, which are most probably causing the error.

During installation on the second node, some services are created by the Oracle installer. The first service to be created is named OracleOUIVC8Service, which seems to install some kind of VC libraries:

Picture #1: Oracle seems to install some VC++ runtime libraries during cluster installation. To do so, a service is created and then started; starting this service is supposed to install the VC runtime library. Unfortunately, the service seems to be unable to start within 30 seconds, causing the following error:

Picture #1: A timeout was reached (30000 milliseconds) while waiting for the OracleOUIVC8Service service to connect

Picture #2: Note that ONE second after the OracleOUIVC8Service failed to start, the Windows Installer entered the running state. What was installed then? The answer is in picture #3.

Picture #3: According to the Windows Installer, the Microsoft Visual C++ 2005 Redistributable was installed. It took 3 seconds from the Windows Installer entering the running state (= start of the installation at 07:27:24, shown in the picture above) to the end of the installation, which was successful. This is shown in the following pictures:

Picture #4 shows the failing installation service for the grid infrastructure home and/or configuration which fails with:

A timeout was reached (30000 milliseconds) while waiting for the OracleOUIOraCrs11g_home1Service service to connect

This does not surprise me at all. If some libraries are missing, the installer will most probably fail. But I guess the point here is that the services are created in an incorrect way. I DO see the java.exe from the installer in the process list, but only AFTER the start of the service failed…

For debugging I tried several things:

  • Install on Windows 2008 R2 64-bit
  • Install on Windows 2008 64-bit
  • Pre-Install the VC Runtime
  • Increase the allowed time to start for a service to 180 seconds

Increasing the service start timeout value

Increasing the maximum allowed time for a service to start from 30 to 180 seconds did not help either. The node itself was completely idle during the service start. But the VC installation started RIGHT AFTER the service start timed out after 180 seconds, and it completed within 4 seconds. This makes me believe there is some kind of error. So I decided to blog about this problem, because I have not yet found anything about it on the web.

In the following picture you can see the service OracleOUIVC8Service trying to start.

The command line being used for that was:

D:\Temp\2\OraInstall2010-04-17_02-20-52AM\ext\bin\vcredist_x64.exe
/q:a /c:"VCREDI~1.EXE /q:a /c:""msiexec /i vcredist.msi  /qn"" "

As you can see the system was completely idle during service startup with plenty of free memory.

An excerpt from the Windows event log shows that the service failed to start after 180 seconds. (The timeout, specified in milliseconds, was increased by adding ServicesPipeTimeout to HKEY_LOCAL_MACHINE\System\CurrentControlSet\Control, so 180000 means 180 seconds.)
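
A minimal sketch of how the ServicesPipeTimeout value described above could be set from a script. It only builds the `reg add` command line and prints it; nothing is written to the registry. The function name is made up for illustration, and as far as I know the value is only read at boot, so a restart would be needed before it takes effect.

```python
def services_pipe_timeout_command(seconds):
    """Build a `reg add` command that sets ServicesPipeTimeout.

    The registry value is expressed in milliseconds, so 180 seconds
    becomes 180000.
    """
    if seconds <= 0:
        raise ValueError("timeout must be positive")
    milliseconds = int(seconds * 1000)
    return [
        "reg", "add",
        r"HKLM\SYSTEM\CurrentControlSet\Control",
        "/v", "ServicesPipeTimeout",
        "/t", "REG_DWORD",
        "/d", str(milliseconds),
        "/f",  # overwrite an existing value without prompting
    ]


if __name__ == "__main__":
    # Print the command instead of running it, so it can be reviewed first.
    print(" ".join(services_pipe_timeout_command(180)))
```

Run the printed command from an elevated prompt on each node if you want to repeat the test.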

Right after the service failed to start, the Windows Installer entered the running state and installed the VC redistributable:

So this also leads me to the assumption that there is some bug in the Grid Infrastructure installer.

Another possibility is that this error is caused by using VMware virtualization. But at the moment I don’t have a reason to think so.

59 thoughts on “11g Release 2 for Windows – Grid Infrastructure Installation Buggy?”

  1. Hey,

    I am facing the same problem too. I am trying a single-node installation, and it still fails with the same error.

    Please keep us updated if you come across any more information.

    Thanks,
    Phani

  2. I tried to install Oracle 11gR2 Grid Infrastructure on a Windows 2008 R2 Hyper-V virtual machine (2 nodes) and got exactly the same error. Still working on it.

  3. Have the same issues on a single node install. Contacted Oracle support. No help a week later from them.

  4. I tried the same on Windows 2003 R2 Enterprise x64 SP2 (2 nodes) with the same result. Let me know if you have a resolution.

  5. Please install Oracle 11g 32-bit in Windows 2003 compatibility mode on the Windows 2008 64-bit system; it will work fine.

    Please install Oracle 11g 64-bit

    1. > Please install Oracle 11g 32-bit in Windows 2003 compatibility mode on the Windows 2008 64-bit system; it will work fine.

      Why should I even do that? Mixing 32-bit and 64-bit is not even supported. By the way, there is no 32-bit 11g R2 Clusterware.

  6. Same problem here. SR opened at Oracle, no answer from them, not even a status report… Did they even try to install Grid Infrastructure on R2? Did anyone here manage to install it?

  7. I ran into the same issue. Make sure that the host name specified in listener.ora matches that of your machine (/etc/hosts on Linux, %WINDIR%\system32\drivers\etc\hosts on Windows).

  8. @juandie
    Not sure that was it… I did 3 installs:
    1- your error
    (disabled UAC)
    2- deinstalled, and then I didn’t hit that error but hit another one while configuring the listener. I thought that something was not properly cleaned up after the first install, so I uninstalled again, but this time I also erased the ASM disks…
    3- I hit the same error again…

    I’m cleaning up… reinstalling again… clueless

    1. I appreciate your tests. I’d like to test as well. Unfortunately, my available time is very limited at the moment.

  9. We encountered a similar problem and found the following:

    – The failure of the “Grid Installation Configuration” step could be caused by many different kinds of errors, some of them silent (no useful information in any install logs)

    – Useful logs to look at are in \app\11.2.0\grid\cfgtoollogs\crsconfig – rootcrs_.log_OUT and rootcrs_.log

    – We saw the same service errors as described in your post, as well as these error messages in the rootcrs_.log file:

    StartService(OracleRemoteExecService) waiting for service to complete
    service status 4, still waiting

    service status 4, still waiting
    service terminated (0x0)
    Warning failed to Stop (OracleRemoteExecService) service. The service has not been started.

    These errors do not seem to be a problem. We eventually were able to install RAC and then the DB successfully, and we still saw those errors. Everything seems to be working fine otherwise.

    – In one case where the “Grid Installation Configuration” step failed, we saw these error messages:

    DiskGroup DATA creation failed with the following message: ORA-15018: diskgroup cannot be createdORA-15307: disk DATA_0000 not discoverable by CSS

    This was because the required disks to be used for OCR/voting ASM volume were not ‘online’ on the remote node.

    – In another case where the “Grid Installation Configuration” step failed, we could see no useful error messages in any of the logs, but it turned out that our node hostnames could not be resolved by DNS, or by hosts file. Fixing the hostname resolution allowed us to install RAC successfully.

    Our conclusion is that the RAC installer is very poor at providing useful error messages :( I was expecting much better from Oracle…

    Anyway, I hope this information helps someone!
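
Since the useful messages are scattered across several logs, a small scan for Oracle-style error codes can help. This is a rough sketch, not part of any Oracle tooling: the function name and the list of prefixes are assumptions based only on the messages quoted in this thread.

```python
import re
from collections import Counter

# Prefixes taken from the messages quoted in this thread; extend as needed.
# No word boundary is required in front of the code, because the installer
# output sometimes glues messages together ("...createdORA-15307").
ERROR_CODE = re.compile(r"(?:ORA|PRKC|PRCR|INS)-\d{4,5}")


def find_error_codes(log_text):
    """Return a Counter of the error codes that appear in log_text."""
    return Counter(ERROR_CODE.findall(log_text))


sample = (
    "DiskGroup DATA creation failed with the following message: "
    "ORA-15018: diskgroup cannot be createdORA-15307: disk DATA_0000 "
    "not discoverable by CSS"
)
print(find_error_codes(sample))
```

To use it on a real log, read the file contents into a string and pass that string to `find_error_codes`.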

  10. Same Problem … Contacted Oracle support … and that was the answer:

    RAC on Windows 2008: Oracle Grid Infrastructure / Clusterware Currently Incompatible with Microsoft Failover Clustering (MSFC) (Doc ID 1157711.1)

    Now the Installation works.

    There is also a problem when IPv6 is enabled!

  11. Hi there,

    stumbling across the same errors. Did you install with advanced options using the new GNS or did you manually set the scan/vip ips?

  12. Hi

    I am facing a similar problem on Windows 2008. All prerequisite checks are fine, but I don’t understand the error. Does it have something to do with Windows 2008 privileges or with Oracle?

    The installation fails at the end while doing the Grid Infrastructure configuration. According to the installer logs and the Windows event logs, creating the registry entries has some issues.

    If there are some answers please update.

    Thanks and Regards

    Diogo

  13. Hi

    I had to give up on Windows 2008 for the moment. It works fine on Windows 2003.

    My installation is not on VMware, but I was getting the same error as yours on 2008 R2.

    Thanks and Regards

    Diogo Fernandes

  14. Hi,

    I am facing exactly the same error. I am trying on Windows 2008 SP2 x64.

    Any help will be appreciated.

    Thanks!
    ~Vivek

  15. @G
    Hi,

    We are having the same error: DiskGroup OCR-Voting creation failed with the following message: ORA-15018: diskgroup cannot be createdORA-15307: disk DATA_0000 not discoverable by CSS. All the disks on the remote node are offline. How did you resolve this, “G”? Did you stop the install, mount the drives, and then restart the install? I don’t see how we can restart the install once this fails.

    Thanks

    1. I’d suggest testing the disks. Are all disks accessible from all nodes in parallel? It seems one node is missing a disk.

  16. Please advise whether the combination of “Windows Server 2008 R2 x64 Edition” and “Oracle 11g Release 1 (11.1.0.7.0)” is compatible. I would appreciate a quick response on this.
    Many Thanks. Lokesh

  17. Finally, I was able to overcome the “[INS-20802] Grid Infrastructure Configuration failed” issue and install Oracle 11g R2 Grid Infrastructure on Windows 2008 R2 Standard Edition. It took more than a week to really get to the bottom of the issue. Before that, I had tried at least 5-6 times with different combinations and options.

    In my case the issue was NIC teaming. After breaking the network team, the result of the cluster utility check also changed, and the “failed” message was reduced to just one place. I had a two-node setup and ran the cluster utility check on both nodes. On node 1 it gave just one “failed” message, and on node 2 none.

    Thanks,
    Gyan

  18. Hi,

    I tried to install Oracle 11gR2 Grid Infrastructure on Windows 2008 R2 x64 and it gives the same error ([INS-20802] Grid Infrastructure Configuration failed).

    I searched for patches or some solution on Metalink and Google, but without success.

    Does anybody have news?

    God bless.

  19. I was finally able to install the Grid Infrastructure successfully.

    1. Make sure that the disk drives are not lettered. After using the diskpart utility to create the partitions, do a refresh in the Disk Management facility of Server Manager to be sure that none of the drives has a letter on any node. If they do, remove the letters and refresh again on all nodes to be sure that the drives are not lettered. Lettered drives prevented us from successfully configuring ASM.
    2. Issue a route delete 0.0.0.0 on all nodes. Caution: be sure that you save your default gateway address before executing this command. You will need to re-enter it afterwards. You may also need to log in from the console, as you will not be able to access the server until you enter the default gateway.
    3. Remove all network adapters, leaving only the private and public interfaces, and remove all references to IPv6 in all of the network adapters. When you issue an ipconfig /all you should only see the private and public interfaces. We had to issue the following commands in order to achieve this.
    netsh Interface 6to4 set state state=disabled
    netsh interface isatap set state disabled
    netsh interface teredo set state disabled
    Plus we had to remove all of the IPv6 references in each of the remaining network adapters. You may need to enter further netsh commands depending on what interfaces are installed on your servers.
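
The three netsh commands above have to be repeated on every node, so they can be wrapped in a small script. This is only a sketch: the function and constant names are made up, the commands are copied as quoted in this comment, and by default nothing is executed (dry run). Running them for real needs an elevated prompt on Windows.

```python
import subprocess

# The three commands exactly as quoted in the comment above.
IPV6_TUNNEL_COMMANDS = [
    ["netsh", "Interface", "6to4", "set", "state", "state=disabled"],
    ["netsh", "interface", "isatap", "set", "state", "disabled"],
    ["netsh", "interface", "teredo", "set", "state", "disabled"],
]


def disable_ipv6_tunnels(dry_run=True):
    """Print each command; execute it only when dry_run is False.

    Returns a list of (command, return code) pairs for the commands
    that were actually executed (empty in dry-run mode).
    """
    results = []
    for command in IPV6_TUNNEL_COMMANDS:
        print(" ".join(command))
        if not dry_run:
            completed = subprocess.run(command, capture_output=True, text=True)
            results.append((command, completed.returncode))
    return results


if __name__ == "__main__":
    disable_ipv6_tunnels(dry_run=True)
```

Check the output of ipconfig /all afterwards; as the comment says, further interfaces may need their own netsh commands.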

  20. Well, I am hit by the same error, but Metalink says the error is not related to the service.
    The service is a dummy service and can be ignored.
    The error may be related to something else.

    WIN: During Installation The Event Log Shows Failure To Start Some Services

    Solution
    OracleOUIVC8Service is a dummy service and it is actually not supposed to start. It is used by the installer to install VC8 binaries on the remote node. It’s not a proper Windows service and it is not expected to start, and in fact the Installer expects and handles the exceptions about the service not starting during the install process.

    The same is true for the OracleOUIOraCrs11g_home1Service service and any other OracleOUI[homename]Service . These are dummy services used by the installer to run an executable and do install work. They are not actually supposed to start. So you can ignore these errors in the Windows event log.

  21. Now
    cluvfy is OK,
    the disks are online on both servers,
    tried removing teaming,
    followed the best-practices installation from Metalink,
    but still no luck.
    Support has been working on it for more than a week now.
    I think I found something new…

  22. I was experiencing the same problem. I am using VMware Workstation and doing a two-node install on Windows 2008 with 11gR2 Clusterware, just like most people in this thread.

    On the second node I set up a Windows DNS server with a forward and two reverse lookup zones (the two reverse lookup zones handled the VIP and private network subnets). At first neither of my nodes was set up to use any kind of generic DNS name. So my hosts were just RAC1.localdomain, RAC1-VIP.localdomain, etc. And the forward lookup zone in my DNS configuration just used localdomain.

    This appeared to be OK for the initial SCAN and node validation that the installer was doing. But the Grid Infrastructure part of the install appeared to have problems resolving the names, as it just used RAC1 and RAC1-vip instead of a fully qualified domain name. And even though I was using a DNS server, my nslookups without the “.localdomain” were not resolving.

    In the %GRID_HOME%\\cfgtoollogs\crsconfig\rootcrs_[node1] log file I saw the following errors:
    PRKC-1023 : Invalid IP address format: rac1-vip
    add nodeapps -n rac1 -A rac1-vip/255.255.255.0/eth0 on node=rac1 … failed
    PRCR-1001 : Resource ora.net1.network does not exist

    So I put both nodes in a domain called cookie.local. Then I reconfigured my DNS so that the forward lookup zone referenced cookie.local. Then I put the entries for all server names back in. Once I did this, everything worked.
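
Several comments in this thread trace the failure back to name resolution, so it may be worth checking both the short and the fully qualified names before starting the installer. The following is a minimal sketch using Python's standard `socket` module; the helper names are made up, and the host and domain names in the usage line are simply the ones from this comment.

```python
import socket


def candidate_names(hosts, domain):
    """Return the short and the fully qualified form of every host name."""
    names = []
    for host in hosts:
        names.append(host)
        names.append(f"{host}.{domain}")
    return names


def resolve_all(names, resolver=socket.gethostbyname):
    """Map each name to its resolved IPv4 address, or None if lookup fails.

    The resolver is a parameter so the function can be tried out
    without a real DNS server.
    """
    results = {}
    for name in names:
        try:
            results[name] = resolver(name)
        except OSError:  # socket.gaierror is a subclass of OSError
            results[name] = None
    return results


if __name__ == "__main__":
    names = candidate_names(["rac1", "rac1-vip"], "cookie.local")
    for name, address in resolve_all(names).items():
        print(name, "->", address if address else "NOT RESOLVED")
```

Any name reported as not resolved on one of the nodes is a candidate for the kind of silent failure described above.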

  23. We have the same error ([INS-20802] Grid Infrastructure Configuration failed) on Windows 2008 R2 64-bit.

    Does anybody have good news?

  24. this was the solution:

    3. remove all network adapters leaving only the private and public interfaces and all references to ipv6 in all of the network adapters. When you issue an ipconfig /all you should only see the private and public interfaces. We had to issue the following commands in order to achieve this.
    netsh Interface 6to4 set state state=disabled
    netsh interface isatap set state disabled
    netsh interface teredo set state disabled

    thx Scott

  25. This solution worked for me! Thanks Thomas!
    Be careful: these interfaces aren’t displayed in the Control Panel.
    Now I have another problem, because setup failed on another step :-(

  26. I ran into this issue as well:

    DiskGroup DATA creation failed with the following message: ORA-15018: diskgroup cannot be createdORA-15307: disk DATA_0000 not discoverable by CSS

    The problem in my case was the binding order of the iSCSI LUNs. The device order was different between the two nodes. I had created 6 x 500GB iSCSI volumes, which appeared to be discovered OK. Then I ran into the installation errors above. I then increased the size of only one volume to 1GB and spotted that it was discovered as DISK5 on node A and as DISK2 on node B. I deleted all RAW devices and re-discovered them.

    After that the installation completed successfully.

    It has only cost me 10 tries. ;-)
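
The check described above (the same volume showing up under different disk numbers on the two nodes) can be sketched as a simple comparison. How the per-node inventory is collected is left open here; the function name and the LUN identifiers in the sample data are hypothetical.

```python
def find_order_mismatches(node_a, node_b):
    """Compare {disk_number: identifier} maps from two nodes.

    Returns a sorted list of (identifier, number_on_a, number_on_b) for
    every identifier that both nodes see under different disk numbers.
    """
    position_on_b = {ident: number for number, ident in node_b.items()}
    mismatches = []
    for number_a, ident in node_a.items():
        number_b = position_on_b.get(ident)
        if number_b is not None and number_b != number_a:
            mismatches.append((ident, number_a, number_b))
    return sorted(mismatches)


# Hypothetical inventory: the LUN names are made up for illustration.
node_a = {2: "lun-b", 5: "lun-a"}
node_b = {2: "lun-a", 5: "lun-b"}
print(find_order_mismatches(node_a, node_b))
```

An empty result means every shared volume has the same disk number on both nodes.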

  27. I have a problem while installing Grid Infrastructure:
    Grid Infrastructure configuration failed, The Plug-in failed in its perform method

  28. For the initial timeouts, I disabled IPv6 in Windows 2008 R2 and added the database IP and name to the hosts file. It has been fine since then. I had four other network cards; I disabled three, leaving only the one I’m using.

    It’s working now, thanks guys. Spent 6 useful hours on that already.

  29. We are facing a problem where the Grid Configuration step fails during the Grid Infrastructure install on a production system using OCFS on Windows 2008 R2 64-bit, as follows:

    [INS-20802] Grid Infrastructure Configuration failed.

    Please help if someone has gone through this.

    thanks,
    Jane

  30. I had a similar problem (unable to configure grid infrastructure) and it was a simple fix. The setup:
    Windows 2008R2 running on Hyper-V
    Oracle 11.2.0.3 grid infrastructure single instance setup

    The error messages in the install log indicated that my user did not have administrative privileges. The privilege was granted through a group. I added my domain user directly to the Administrators group on the server and I was able to successfully configure Grid & ASM.

    HTH!
    -T. J.

  31. I also have problems while installing on 2 VMware nodes, each one running on a different VMware server.

    It seems that there is an issue during installation with the HDD on node 2.

    On each node there is an online raw disk with 15GB; both have the same location (bus and ID). What have I done wrong?

    2012-01-03 15:41:40: Executing cmd: E:\oracle\11.2.0\grid\bin\asmca -silent -diskGroupName DATA -diskList ‘\\.\ORCLDISKDATA0’ -redundancy EXTERNAL -configureLocalASM
    2012-01-03 15:42:05: Command output:
    >
    > Disk Group DATA creation failed with the following message:
    > ORA-15018: diskgroup cannot be created
    > ORA-15031: disk specification ‘\\.\ORCLDISKDATA0’ matches no disks
    >
    >
    >End Command output
    2012-01-03 15:42:05: Configuration of ASM … failed

  32. I met the same issue when installing Grid Infrastructure 11g R2 at step 15/16 under ESXi 4.1 and CentOS. The issue disappeared after disabling IPv6 on all NICs and retrying. Thank you for sharing your experience and good luck!

  33. Yes, this is the answer. I have tried it and succeeded. Thank you very much.

    Scott:
    I was finally able to install the Grid Infrastructure successfully.
    1. Make sure that the disk drives are not lettered. After using the diskpart utility to create the partitions, do a refresh in the Disk Management facility of Server Manager to be sure that none of the drives has a letter on any node. If they do, remove the letters and refresh again on all nodes to be sure that the drives are not lettered. Lettered drives prevented us from successfully configuring ASM.
    2. Issue a route delete 0.0.0.0 on all nodes. Caution: be sure that you save your default gateway address before executing this command. You will need to re-enter it afterwards. You may also need to log in from the console, as you will not be able to access the server until you enter the default gateway.
    3. Remove all network adapters, leaving only the private and public interfaces, and remove all references to IPv6 in all of the network adapters. When you issue an ipconfig /all you should only see the private and public interfaces. We had to issue the following commands in order to achieve this.
    netsh Interface 6to4 set state state=disabled
    netsh interface isatap set state disabled
    netsh interface teredo set state disabled
    Plus we had to remove all of the IPv6 references in each of the remaining network adapters. You may need to enter further netsh commands depending on what interfaces are installed on your servers.

  34. Hi,
    Hope you are well.

    I ran into the following problem while installing Grid Infrastructure (11g Release 2) on Windows 7 64-bit:

    [INS-20802] automatic storage management configuration assistant failed.

    How can I resolve this problem?

    Waiting for your kind reply.

    Thank you.

    1. Look at the install logs, specifically the ASM alert.log. Something with the storage, I suspect.

  35. Most probably you hit Bug 17927204.
    The solution is to install the patch before the configuration is run.
    [INS-20802] Grid Infrastructure failed During Grid Installation On Windows OS (Doc ID 1987371.1)
    has the solution for that.
    Regards.

  36. And one more note:
    Windows User Account Control (UAC) works with tokens, and by default, even if the user is in the Administrators group, the token is a standard one. This blocks many steps, and access denied errors will occur.
    Disable it via https://msdn.microsoft.com/en-us/library/cc232765.aspx?tduid=(f99fe9d46f80bbd34d2119de22575fe9)(256380)(2459594)(TnL5HPStwNw-fTWYCfzmWFFJhGSwG8xyVQ)()

    Key: SOFTWARE\Microsoft\Windows\CurrentVersion\Policies\System
    Value: “EnableLUA”
    Set it to 0 (disabled).

    This is different from Windows asking to “run as admin”.
    Regards, and thank you.

  37. Oracle attempts to use your current Windows domain credentials to authenticate you with the Oracle server.
    This can fail if the Oracle server is not configured to support Windows authentication
    or if the credentials you use to log in to your local machine are not sufficient to allow you to log in to the server.

    I had the same problem while installing Grid 11g R2 (11.2.0.1 - 11.2.0.4). The domain user was added to the local admin group, but
    that does not mean it automatically had admin privileges.
    After 4 days I found an error inside asmca.log (ORA-12638…). Boom… a Kerberos authentication problem…
    I tried to change SQLNET.AUTHENTICATION_SERVICES= (NTS) to SQLNET.AUTHENTICATION_SERVICES= (NONE)…
    But every time I hit the retry button, the installer reverted the value in sqlnet.ora to (NTS)…
    So I decided to run the installer as local administrator…
    BOOM… Finally the installation of Grid 11g R2 ended with no errors!

    Just a couple of days ago I was reading about the same problem (I cannot remember the site), and someone suggested
    adding the DES ciphers to “Network security: Configure encryption types allowed for Kerberos” inside
    “Computer Configuration\Policies\Windows Settings\Security Settings\Local Policies\Security Options”…
    I have not tried that solution yet, but if someone is willing to try, please let us know…
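
Because the installer rewrote sqlnet.ora on every retry, it can help to check what the file actually contains at a given moment. The following is a rough sketch that extracts the SQLNET.AUTHENTICATION_SERVICES setting from the text of a sqlnet.ora file; the function name is made up, and it only handles the simple one-line form quoted in this comment.

```python
import re

# Matches lines such as: SQLNET.AUTHENTICATION_SERVICES= (NTS)
# Lines that start with a comment character are not matched.
AUTH_LINE = re.compile(
    r"^\s*SQLNET\.AUTHENTICATION_SERVICES\s*=\s*\(([^)]*)\)",
    re.IGNORECASE | re.MULTILINE,
)


def authentication_services(sqlnet_text):
    """Return the listed services in upper case, or None if the parameter is absent."""
    match = AUTH_LINE.search(sqlnet_text)
    if match is None:
        return None
    return [item.strip().upper() for item in match.group(1).split(",") if item.strip()]


sample = "SQLNET.AUTHENTICATION_SERVICES= (NTS)\n"
print(authentication_services(sample))
```

Reading the real file into a string and passing it to `authentication_services` before and after pressing retry would show whether the value was reverted.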

  38. I began working with RAC 3 months ago and assumed it was similar to a regular single-instance database in a Linux environment.
    It is not the same technology. I had reinstalled it successfully once in a test environment without issue. After this, I attempted to patch live.

    I spent 20 hours repeatedly reinstalling 11g RAC on Windows 2008 R2 in a PRODUCTION environment after an upgrade failed. Luckily, we had 24 hours of downtime. It was quite terrifying, but persistence paid off. After the upgrade failed, the cluster was rendered ineffective and mutated between versions. I had a tested downgrade plan. This failed. After deinstalling and restarting both nodes, I tried to reinstall. Here was my experience:

    Upgrade 1: Failed because the configuration agent couldn’t dismount ASM disks owing to locking on a registered ASM mount, leading to CRS not starting on node 2.
    Attempt to downgrade failed: CRS was dead and the recommended downgrade steps returned plenty of errors.
    Deinstall 1: ALL Oracle components, registry, inventory, RDBMS etc.
    Trash ASM disks, re-partition
    Install 1 of previous version: Failed because of a leftover ACFS driver or disk-related issue from the previous install on node 2.
    Deinstall 2: everything
    Trash ASM disks, re-partition
    Install 2: Failed because Windows lettered the freshly partitioned volumes after re-partitioning the disks.
    Deinstall 3: everything
    Trash ASM disks, re-partition
    Install 3: Successful, though ora.oc4j is in an unknown state.

    I patched the system with OPatch, installed the RDBMS, and restored from the morning backup to where the system was 20 hours previously.

    Oracle RAC on Windows is unpredictable. I am recovering from the experience and somewhat dumbstruck at the unpredictability of this software on Windows. Be very careful when installing or upgrading RAC 11gR2 on Windows 2008: when it goes wrong, it is spectacular.
    Make sure you have a backup plan other than the software you’re patching. Don’t assume it will all work. It is very buggy.

    1. The problem with Oracle on Windows is that with Group Policy you can restrict so many things without even being aware of it. In combination with ‘smart’ virus scanners, this makes a dangerous mix.
