Introduction
Recently, I encountered an issue on a DBCS Data Guard standby database where the alert log reported the error ORA-29771. After seeing this error, I checked the Data Guard status and noticed that the standby had developed a significant apply lag, even though redo apply was still running.
Review the Alert Log on the Standby
I started by examining the standby alert log to confirm the ORA-29771 occurrence and identify any related background process failures:
Solution
Increase LMON Communication Timeout
Procedure for Setting Hidden Parameters on ADG Standby
Since patch 31961214 has already been successfully applied to the standby database home directory, I need to configure the following hidden parameter on the database.
1. Set _adg_parselock_timeout
This parameter controls the maximum time (in centiseconds, or 1/100 of a second) an Active Data Guard query will wait for a parse lock before timing out. I changed this parameter to 100 seconds (10000 centiseconds). This accommodates our observed 70-second wait time and provides a 30-second buffer.
ALTER SYSTEM SET "_adg_parselock_timeout" = 10000 SCOPE=BOTH;2. Set _bug31961214_standby_parse_limit_seconds
This parameter is a control introduced by the patch to
address standby query parsing issues. Setting it to a non-zero value enables
the fix. The recommended non-zero value is usually 10 or more seconds.
ALTER SYSTEM SET "_bug31961214_standby_parse_limit_seconds" = 10 SCOPE=BOTH;Now, we can see that these parameters have been configured on the standby database using the following select statement.
SELECT NAME, VALUE FROM V$SPPARAMETER WHERE NAME IN ('_adg_parselock_timeout', '_bug31961214_standby_parse_limit_seconds');
3. Ensure Sufficient Parallelism is Used
If a single MRP worker is trying to apply a massive redo
block, it can lock up CPU resources and trigger the error.
I confirmed that the apply_lag was about 5 hours, indicating
that the standby was significantly behind the primary, as shown below.
I checked the apply-rate as well
I stopped the MRP and then restarted it with a parallel
setting of 4, which has now dramatically increased the apply rate.
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL;
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE DISCONNECT FROM SESSION PARALLEL 4; Conclusion
ORA-29771 indicates that a critical Oracle background process unexpectedly terminated. In a Data Guard standby environment, this can lead to performance degradation and increased apply lag even when redo apply remains active. By reviewing the alert logs, checking Data Guard status, validating system resource health, and restarting MRP when necessary, you can effectively identify the cause and restore normal standby performance.
No comments:
Post a Comment