RAC node startup problems after RHEL patch

ENV: Red Hat Enterprise Linux Server release 5.8 (Tikanga), 11.2.0.2 EE

Like most organizations, we have a regular maintenance window, and ours fell on this past Saturday. It usually entails applying RHEL OS patches to our RAC clusters. We don’t usually encounter problems… but we did this time. Normally (actually always) we don’t relink any of our Oracle or clusterware binaries after applying OS patches… and that, apparently, will get you into trouble.

After applying the latest round of RHEL patches on the first node of our test environment, CRS wouldn’t come up. The only processes showing up in a ps were:

ohasd.bin reboot, init.ohasd run, cssdmonitor, & orarootagent.bin

Checking CRS showed this:

[root@nrac01 init.d]# crsctl check crs
CRS-4638: Oracle High Availability Services is online
CRS-4535: Cannot communicate with Cluster Ready Services
CRS-4530: Communications failure contacting Cluster Synchronization Services daemon
CRS-4534: Cannot communicate with Event Manager
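If you want to script that check, here is a minimal sketch. It only looks for the three message codes shown in the output above; `failed_crs_codes` is a name I made up for illustration, not an Oracle tool, and other failure modes may well produce different codes.

```shell
#!/bin/sh
# Hypothetical helper (not part of Oracle's tooling): reads the text of
# `crsctl check crs` on stdin and prints, space separated, whichever of the
# "cannot communicate" codes from the output above appear in it.
failed_crs_codes() {
    awk '/^CRS-(4535|4530|4534):/ {
             sub(/:.*/, "")                      # keep only the code
             codes = codes (codes ? " " : "") $0
         }
         END { print codes }'
}

# Usage on a live node, as root:
#   crsctl check crs | failed_crs_codes
```

An empty result just means none of those three messages were present; it is not proof that the whole stack is healthy.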

To make a long story short, the fix is to relink the binaries by following the steps below:

1. Stop all the instances running on that node

# srvctl stop instance -d <db_unique_name> -i <instance_name> -o immediate

2. Stop the listener

# srvctl stop listener -n <nodename> -l <listener list>

3. As root

# cd $GRID_HOME/crs/install

# perl rootcrs.pl -unlock

4. Relink the Oracle RDBMS as the RDBMS owner

# cd $ORACLE_HOME/bin

# relink all

5. As the Grid Infrastructure owner (with ORACLE_HOME set to the Grid home)

# $GRID_HOME/bin/relink

6. As root again

# cd $GRID_HOME/crs/install

# ./rootadd_rdbms.sh

# perl rootcrs.pl -patch -verbose

Note: you can find the log from the commands above in $GRID_HOME/cfgtoollogs/crsconfig
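To grab the newest file in that directory quickly, something like the sketch below works. `latest_log` is a hypothetical helper name, and I am not assuming anything about what the log files are called; it simply picks the most recently modified entry.

```shell
#!/bin/sh
# Hypothetical helper: print the most recently modified file in a log
# directory. Defaults to the crsconfig directory mentioned above.
latest_log() {
    dir=${1:-$GRID_HOME/cfgtoollogs/crsconfig}
    newest=$(ls -t "$dir" 2>/dev/null | head -n 1)
    # Print nothing and return non-zero if the directory is empty or missing.
    [ -n "$newest" ] && printf '%s/%s\n' "$dir" "$newest"
}

# Usage:
#   tail -50 "$(latest_log)"
```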

6a. At this point, we ran into Bug # 10128494. You’ll know you have hit it if the output of the rootcrs.pl command above looks like this:

 
[root@nrac01 install]# perl rootcrs.pl -patch
Using configuration parameter file: ./crsconfig_params
Undefined subroutine &main::read_file called at crspatch.pm line 86.

If you do have this problem, apply the following workaround:

Modify $CH/crs/install/crsconfig_lib.pm ($CH here is the clusterware home, i.e. $GRID_HOME in the steps above), and change the line from
my @exp_func = qw(check_CRSConfig validate_olrconfig validateOCR
to
my @exp_func = qw(check_CRSConfig validate_olrconfig validateOCR read_file
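If you would rather not hand-edit the file, the change above can be sketched with sed. This is only a sketch: `add_read_file_export` is a name I made up, and it assumes the line in your copy of crsconfig_lib.pm starts exactly as shown above. It keeps a `.bak` copy and does nothing if read_file is already on that line.

```shell
#!/bin/sh
# Hypothetical helper: append read_file to the @exp_func export line of the
# file given as $1 (crsconfig_lib.pm). Assumes the line begins exactly as
# quoted in the workaround above.
add_read_file_export() {
    file=$1
    # Already patched: leave the file alone.
    if grep -q '^my @exp_func = qw(.*read_file' "$file"; then
        return 0
    fi
    # Keep a backup, then rewrite the original in place from the backup.
    cp -p "$file" "$file.bak" || return 1
    sed '/^my @exp_func = qw(check_CRSConfig validate_olrconfig validateOCR/ s/validateOCR/validateOCR read_file/' \
        "$file.bak" > "$file"
}

# Usage, as root:
#   add_read_file_export "$GRID_HOME/crs/install/crsconfig_lib.pm"
```

Diff the file against the `.bak` copy afterwards to confirm that only the one line changed.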

After the fix, run rootcrs.pl again and you should be good to go: CRS should start up for you.
