Monday, May 23, 2011

How to forcefully ‘deconfig’ Grid cluster configuration in 11gR2 Part II of II

Note: This article was written in English to help any technician who has run into this problem.

In the last article we talked about how to un-configure CRS during installation. In this article we will talk about why the error occurred during installation.

One pre-installation requirement is that the shell limits are adjusted as described in the manual "Oracle Grid Infrastructure - 11gR2 Installation Guide".

The values for the oracle operating system user are:

         fsize = -1   (Unlimited)
         core = 2097151
         cpu = -1     (Unlimited)
         data = -1    (Unlimited)
         rss = -1     (Unlimited)
         stack = -1   (Unlimited)
         nofiles = -1 (Unlimited)

They should also be adjusted for the "root" user, because the "root.sh" script needs this configuration when it runs.
The root user requires these settings because the CRS daemon (crsd) runs as root.
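The limits above can be checked before the installer is started. The following is a minimal sketch, not an Oracle-supplied tool: the function names are hypothetical, the flag letters are those of the ksh/bash `ulimit` builtin, and it assumes that a value of -1 in the list is reported by `ulimit` as "unlimited" and that the core value is shown in the same unit as in the list. Run it once as oracle and once as root.

```shell
# Compare one current limit with the value expected by the list above.
compare_limit() {
    # $1 = limit name, $2 = current value, $3 = expected value
    if [ "$2" = "$3" ]; then
        echo "OK $1=$2"
    else
        echo "CHECK $1=$2 (expected $3)"
    fi
}

# Query the soft limits of the current user with the shell's ulimit builtin.
check_shell_limits() {
    compare_limit fsize   "$(ulimit -f 2>/dev/null)" unlimited
    compare_limit core    "$(ulimit -c 2>/dev/null)" 2097151
    compare_limit cpu     "$(ulimit -t 2>/dev/null)" unlimited
    compare_limit data    "$(ulimit -d 2>/dev/null)" unlimited
    compare_limit rss     "$(ulimit -m 2>/dev/null)" unlimited
    compare_limit stack   "$(ulimit -s 2>/dev/null)" unlimited
    compare_limit nofiles "$(ulimit -n 2>/dev/null)" unlimited
}

check_shell_limits
```

Any line that starts with CHECK points at a limit that still differs from the list for the user running the script.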

So even after performing the step described in the previous article (see article: How to forcefully 'deconfig' Grid cluster configuration in 11gR2 - Part I), the error will persist. To resolve it for good, the Grid Infrastructure installation must be rebuilt.

You should abort the installation, remove it manually (the procedure is given below), and reformat the ASM disks, because they were marked during the failed installation.

To remove the installation, follow these steps on ALL NODES.

- Remove the following files:

rm /etc/init.cssd
rm /etc/init.crs
rm /etc/init.crsd
rm /etc/init.evmd
rm /etc/rc.d/rc2.d/K96init.crs
rm /etc/rc.d/rc2.d/S96init.crs
rm -Rf /etc/oracle/scls_scr
rm -Rf /etc/oracle/oprocd
rm /etc/inittab.crs
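Because these files must be removed on every node, it can help to preview the removal first. The sketch below is only an illustration: `remove_if_present` and the `DRY_RUN` variable are hypothetical names, and the paths are the ones listed above. With the default `DRY_RUN=1` nothing is deleted.

```shell
# DRY_RUN=1 (the default here) only reports; set DRY_RUN=0 to really delete.
DRY_RUN=${DRY_RUN:-1}

remove_if_present() {
    for path in "$@"; do
        if [ -e "$path" ] || [ -L "$path" ]; then
            if [ "$DRY_RUN" = "1" ]; then
                echo "would remove $path"
            else
                rm -Rf "$path" && echo "removed $path"
            fi
        else
            echo "not found $path"
        fi
    done
}

# Paths taken from the list above; run as root on each node.
remove_if_present \
    /etc/init.cssd /etc/init.crs /etc/init.crsd /etc/init.evmd \
    /etc/rc.d/rc2.d/K96init.crs /etc/rc.d/rc2.d/S96init.crs \
    /etc/oracle/scls_scr /etc/oracle/oprocd /etc/inittab.crs
```

After reviewing the output, the same call with `DRY_RUN=0` performs the actual removal.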

Check whether any CRS processes are still running. If they are, kill them using the following commands:

ps -ef | grep crs
kill <pid>
ps -ef | grep evm
kill <pid>
ps -ef | grep css
kill <pid>
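The ps/grep step can be sketched as a small filter that extracts only the PIDs and skips the grep process itself. `pids_matching` is a hypothetical helper, and it assumes the usual `ps -ef` layout with a header line and the PID in the second column.

```shell
# Reads `ps -ef` output on stdin and prints the PID (column 2) of every
# process line that contains the text given as $1. The header line and
# the grep/awk processes doing the search are skipped.
pids_matching() {
    awk -v pat="$1" 'NR > 1 && index($0, pat) > 0 && $0 !~ /grep|awk/ { print $2 }'
}

# Example, to be reviewed before killing anything:
#   ps -ef | pids_matching crs
#   ps -ef | pids_matching evm
#   ps -ef | pids_matching css
```

Printing the PIDs first and killing them in a separate step keeps the operator in control of what is actually terminated.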

Remove the files in /var/tmp/.oracle or /tmp/.oracle:

rm -f /var/tmp/.oracle/*
or
rm -f /tmp/.oracle/*

Remove the ocr.loc file, which is usually found in /etc/oracle:

 rm -f /etc/oracle/ocr.loc

De-install the CRS home using the Oracle Universal Installer.

Remove the CRS install location:

 rm -Rf <Install Location>/*

Clean out the OCR and voting files that were written to ASM using "dd" commands,
as in the example below:

 dd if=/dev/zero bs=8k count=1000 of=/dev/asm1
 dd if=/dev/zero bs=8k count=1000 of=/dev/asm2
 dd if=/dev/zero bs=8k count=1000 of=/dev/asm3
 .
 .
 .
 (Run the "dd" command on ALL the disks belonging to ASM.)
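Since the same dd command has to be repeated for every ASM disk, it can be wrapped in a loop with a quick check that the header really was zeroed. This is a sketch only: both function names are hypothetical, the device names are the example names used above, and the commands are destructive, so the loop itself is left commented out.

```shell
# Overwrite the first 1000 blocks of 8 KB on the given device with zeros.
wipe_asm_header() {
    # $1 = ASM disk device
    dd if=/dev/zero of="$1" bs=8k count=1000 2>/dev/null
}

# Succeed only when the first 8 KB block of $1 contains nothing but zero bytes.
header_is_zero() {
    nonzero=$(dd if="$1" bs=8k count=1 2>/dev/null | tr -d '\000' | wc -c | tr -d ' ')
    [ "$nonzero" = "0" ]
}

# Example (as root; replace the list with ALL disks belonging to ASM):
# for disk in /dev/asm1 /dev/asm2 /dev/asm3; do
#     wipe_asm_header "$disk" && header_is_zero "$disk" && echo "$disk wiped"
# done
```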


From this point you can re-install Oracle Grid Infrastructure. First, however, make sure that the shell limits are properly adjusted.

Following these steps the installation proceeds successfully until the end.


Rubens Oliveira
Oracle DBA Consultant
olivert.dba@consultant.com

Monday, May 16, 2011

How to forcefully ‘deconfig’ Grid cluster configuration in 11gR2 Part I of II


Note: This article was written in English to help any technician who has run into this problem.


I was installing 11gR2 RAC with Grid Infrastructure on a two-node AIX cluster (version 6.1). I did all the steps, but my root.sh failed.

# /oracle/grid/11.2.0/root.sh

Running Oracle 11g root.sh script…
The following environment variables are set as:

ORACLE_OWNER= oracle
ORACLE_HOME= /oracle/grid

Enter the full pathname of the local bin directory: [/usr/local/bin]:
The file “dbhome” already exists in /usr/local/bin. Overwrite it? (y/n) [n]: y
Copying dbhome to /usr/local/bin …
The file “oraenv” already exists in /usr/local/bin. Overwrite it? (y/n) [n]: y
Copying oraenv to /usr/local/bin …
The file “coraenv” already exists in /usr/local/bin. Overwrite it? (y/n) [n]: y
Copying coraenv to /usr/local/bin …
Entries will be added to the /etc/oratab file as needed by
Database Configuration Assistant when a database is created
Finished running generic part of root.sh script.
Now product-specific root actions will be performed.
2010-03-02 07:55:55: Parsing the host name
2010-03-02 07:55:55: Checking for super user privileges
2010-03-02 07:55:55: User has super user privileges
Using configuration parameter file: /grid/11.2.0/crs/install/crsconfig_params
Creating trace directory
User ora11gr2 is missing the following capabilities required to run CSSD in realtime:
CAP_NUMA_ATTACH,CAP_BYPASS_RAC_VMM,CAP_PROPAGATE
To add the required capabilities, please run:
/usr/bin/chuser capabilities=CAP_NUMA_ATTACH,CAP_BYPASS_RAC_VMM,CAP_PROPAGATE oracle
CSS cannot be run in realtime mode at /oracle/grid/crs/install/crsconfig_lib.pm line 8119.
So root.sh returned an error and asked me to run the chuser command with the options shown above. After executing this command on both nodes, I ran root.sh again, but it failed with the message “Deconfigure the existing cluster configuration before starting”.
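Before rerunning root.sh, it can be worth confirming on each node that the user really holds the three capabilities named in the error. The helper below is a minimal sketch: `missing_caps` and `REQUIRED_CAPS` are hypothetical names, and it assumes the current capabilities are available as a comma-separated string (on AIX, for example, the value printed by `lsuser -a capabilities oracle`).

```shell
# Capabilities named in the root.sh error message.
REQUIRED_CAPS="CAP_NUMA_ATTACH CAP_BYPASS_RAC_VMM CAP_PROPAGATE"

# Print, comma-separated, the required capabilities absent from $1.
missing_caps() {
    # $1 = comma-separated capability list currently held by the user
    missing=""
    for cap in $REQUIRED_CAPS; do
        case ",$1," in
            *",$cap,"*) ;;                                # already present
            *) missing="${missing:+$missing,}$cap" ;;     # collect what is absent
        esac
    done
    echo "$missing"
}

# Example with a user that only has CAP_PROPAGATE:
missing_caps "CAP_PROPAGATE"
```

An empty result means nothing is missing; otherwise the printed list is what still has to be passed to chuser.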

bash-2.05b# /oracle/grid/root.sh

Running Oracle 11g root.sh script…
The following environment variables are set as:

ORACLE_OWNER= oracle
ORACLE_HOME= /oracle/grid
Enter the full pathname of the local bin directory: [/usr/local/bin]:
The file “dbhome” already exists in /usr/local/bin. Overwrite it? (y/n) [n]:
The file “oraenv” already exists in /usr/local/bin. Overwrite it? (y/n) [n]:
The file “coraenv” already exists in /usr/local/bin. Overwrite it? (y/n) [n]:
Entries will be added to the /etc/oratab file as needed by
Database Configuration Assistant when a database is created
Finished running generic part of root.sh script.
Now product-specific root actions will be performed.
2010-03-02 08:07:32: Parsing the host name
2010-03-02 08:07:32: Checking for super user privileges
2010-03-02 08:07:32: User has super user privileges
Using configuration parameter file: /oracle/grid/crs/install/crsconfig_params
Improper Oracle Clusterware configuration found on this host
Deconfigure the existing cluster configuration before starting
to configure a new Clusterware
run ‘/oracle/grid/crs/install/rootcrs.pl -deconfig’
to configure existing failed configuration and then rerun root.sh

So I tried, but when I executed /oracle/grid/crs/install/rootcrs.pl -deconfig, it errored out, saying it could not communicate with CRS and asking me to start CRS. The funny part is that CRS was not yet configured. In short, I was going around in circles.

In this scenario, the -force option with -deconfig is very handy.

bash-2.05b# /grid/11.2.0/crs/install/rootcrs.pl -deconfig -force -verbose
2010-03-02 08:11:29: Parsing the host name
2010-03-02 08:11:29: Checking for super user privileges
2010-03-02 08:11:29: User has super user privileges
Using configuration parameter file: /oracle/grid/crs/install/crsconfig_params
PRCR-1035 : Failed to look up CRS resource ora.cluster_vip.type for 1
PRCR-1068 : Failed to query resources
Cannot communicate with crsd
PRCR-1070 : Failed to check if resource ora.gsd is registered
Cannot communicate with crsd
PRCR-1070 : Failed to check if resource ora.ons is registered
Cannot communicate with crsd
PRCR-1070 : Failed to check if resource ora.eons is registered
Cannot communicate with crsd
Failure at scls_scr_setval with code 8
Internal Error Information:
Category: -2
Operation: failed
Location: scrsearch3
Other: id doesnt exist scls_scr_setval
System Dependent Information: 2
CRS-4544: Unable to connect to OHAS
CRS-4000: Command Stop failed, or completed with errors.
Successfully deconfigured Oracle clusterware stack on this node

And finally, even though it could not communicate with CRS, it successfully deconfigured the Oracle Clusterware stack.
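The sequence described here (try a plain -deconfig, then fall back to -force) can be sketched as a small wrapper. `deconfig_node` is a hypothetical name, and the sketch assumes that rootcrs.pl returns a non-zero exit status when the plain deconfig fails, which is not confirmed by the output above.

```shell
# Try a normal deconfig first; if it fails, retry with -force -verbose.
deconfig_node() {
    # $1 = full path to rootcrs.pl under the Grid home (varies per install)
    rootcrs=$1
    if "$rootcrs" -deconfig; then
        echo "deconfig completed without -force"
    else
        echo "plain deconfig failed; retrying with -force"
        "$rootcrs" -deconfig -force -verbose
    fi
}

# Example (as root, on the node where root.sh failed):
#   deconfig_node /oracle/grid/crs/install/rootcrs.pl
```

Once the forced deconfig reports success, root.sh can be rerun on that node.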



Rubens Oliveira

Oracle DBA Consultant
olivert.dba@consultant.com