Thursday, October 16, 2008

Physical Standby out of sync - Missing Datafiles Scenario


1. Primary has 200 datafiles and standby has only 166 datafiles

2. Primary is a 3 node cluster and Standby is a 2 node cluster

3. The DB name is mydb

Problem and Symptoms:

1. When I tried to start the MRP on standby, it reported the following error in alert log:

Errors in file /u01/app/oracle/admin/mydb/bdump/mydb1_mrp0_21189.trc:ORA-01111: name for data file 167 is unknown - rename to correct fileORA-01110: data file 167: '/u01/app/oracle/product/9.2.0/dbs/UNNAMED00167'ORA-01157: cannot identify/lock data file 167 - see DBWR trace fileORA-01111: name for data file 167 is unknown - rename to correct fileORA-01110: data file 167: '/u01/app/oracle/product/9.2.0/dbs/UNNAMED00167'

2. On further investigation, standby’s alert log also contains following errors:

Tue Sep 9 04:05:03 2008Media Recovery Log /u03/oradata/mydb/arc_backup/mydb_2_2173.arcMedia Recovery Log /u03/oradata/mydb/arc_backup/mydb_1_1896.arcWARNING: File being created with same name as in PrimaryExisting file may be overwrittenFile #167 added to control file as 'UNNAMED00167'. Originally created as:'/u07/oradata/mydb/myfile_1.dbf'Recovery was unable to create the file as:'/u07/oradata/mydb/myfile_1.dbf'MRP0: Background Media Recovery terminated with error 1274Tue Sep 9 04:05:06 2008Errors in file /u01/app/oracle/admin/mydb/bdump/mydb1_mrp0_7175.trc:ORA-01274: cannot add datafile '/u07/oradata/mydb/myfile-1.dbf' - file could not be createdORA-01119: error in creating database file '/u07/oradata/mydb/myfile_1.dbf'ORA-27054: Message 27054 not found; product=RDBMS; facility=ORALinux-x86_64 Error: 13: Permission denied

3. On checking the view v$archived_log, there were lot of log sequence# which were APPLIED=NO

4. There is no gap in the sequence#


Parameter db_file_name_convert was not set at standby database. So as long as the files were created on /u02 and /u03 on primary, there was no problem on the standby because standby had /u02 and /u03. But when file#167 was added at /u07 on primary (on Sep 9 04:05:03 2008), it could not map to a /u07 mount point on standby because /u07 does not exists on standby and db_file_name_convert was also not set. As indicated by the alert log, the file#167 was registered in the standby’s control file as “UNANMED00167” at the default location of $ORACLE_HOME/dbs but the file was not created physically on standby database.

Action Plan:

1. At the standby:Please set the db_file_name_convert parameter at the Standby for the /u07 folder at the Primary to the corresponding folder at the Standby.
Since this parameter is a Static parameter, you need to bounce the Standby DB.

As step#1, you can do following instead of the above step:

At the standby:
Create /u07 soft link for /u02, to eliminate the bounce of standby db due to the addition of db_file_name_convert init.ora parameter


2. At the standby :SQL> alter system set standby_file_management=manual;

3. At the Primary for the datafile 167 :

SQL> alter tablespace <> begin backup ;
Copy the Datafile from the Primary to Standby to the correct location.
SQL> Alter tablespace end backup;

4. At the Standby:

SQL> alter database rename file '.......UNNAMED00167' to '<>';

You can skip steps#3 and #4 and instead do following step after #2:

At the Standby:


5. To create the remaining datafiles at the Standby automatically:

SQL> alter system set standby_file_management=auto;

6. Start the MRP at the Standby
SQL> alter database recover managed standby database;

At standby database ensure the MRP is running as expected:
SQL>select process, status , sequence# from v$managed_standby;

When Primary and Standby are RAC databases:

1. On Standby: You can see multiple copies of some or all logs transported and applied on standby when you check the view v$archived_log.

2. On Standby: All sequence# should have APPLIED=YES in v$archived_log for all threads. This ensures that all logs from all threads were transported and applied on standby and hence keeps standby in sync with primary.

3. On Standby: In the view v$archived_log you may not see same number of multiple copies of all logs. For example, if the primary is a 3 node cluster, you may or may not have 3 copies of each log i.e. you may not have the same sequence# log on standby for all 3 threads. Of course the reason is that number of logs generated on all 3 nodes of primary will differ. The current sequence# transported from a node of primary RAC database can be seen by querying v$archived_log on standby:

SQL> select max(sequence#) from v$archived_log where thread#=1;

As explained above, the output will differ for all 3 threads.