ORACLE DATABASE 19c
Disaster Recovery (DR) - Full Recovery
Guide
Handling 100+ Archive Log
Gaps on DR Standby
1. Overview & DR Architecture
2. Understanding Archive Log Gaps
3. Pre-Recovery Checklist
4. Step 1 — Assess the Archive Gap on DR
5. Step 2 — Resolve Missing Archive Logs
(Primary → DR)
6. Step 3 — Apply Archive Logs to DR Standby
7. Step 4 — Validate MRP / Redo Apply Status
8. Step 5 — Incremental Backup Refresh (if gap
> 100 logs)
9. Step 6 — Full DR Failover Procedure
10. Step 7 — Post-Recovery Verification
11. Troubleshooting: Common Archive Gap Errors
12. Sign-Off & Certification
This document provides a comprehensive, step-by-step procedure
for Oracle 19c Disaster Recovery (DR) when the standby database has an archive
log gap - specifically when 100 or more archive logs are missing from the DR
site. This is a real-time, verified guide for DBAs to safely recover and
resynchronize the DR standby.
1.1 Environment Reference
|
Parameter |
Primary Site |
DR Site |
|
DB Name |
PRODDB |
DRDB |
|
DB Unique Name |
PRODDB |
DRDB |
|
Host |
primary-db01.company.com |
dr-db01.company.com |
|
Oracle Home |
/u01/app/oracle/product/19c |
/u01/app/oracle/product/19c |
|
ORACLE_SID |
PRODDB |
DRDB |
|
Data Guard Mode |
PRIMARY |
PHYSICAL STANDBY |
|
Archive Dest |
/arch/PRODDB/ |
/arch/DRDB/ |
|
Redo Apply |
N/A |
MRP0 Process |
|
Protection Mode |
Maximum Performance |
Maximum Performance |
1.2 What is a DR Archive Gap?
|
✖
CRITICAL — When to Use This Guide Use
this procedure when V$ARCHIVE_GAP shows 100+ missing sequences on the DR
standby. DO
NOT attempt a DR failover until the archive gap is fully resolved or an RMAN
refresh is completed. A
failover with unresolved gaps will result in DATA LOSS equal to the missing
archive log period. |
2.1 Common Causes of Archive Log Gaps
|
Root Cause |
Description |
Typical Gap
Size |
|
Network outage between
Primary & DR |
Archivelog shipping
interrupted during network failure |
50–500 logs |
|
Disk full on DR archive
destination |
LOG_ARCHIVE_DEST fills up;
incoming archivelogs rejected |
10–200 logs |
|
MRP process crash on DR |
Managed Recovery Process
stopped unexpectedly |
Variable |
|
Primary switchover/failover
performed |
DR not updated before
failover was triggered |
Critical |
|
Standby DB restarted without
MRP |
Redo apply not restarted
after DB bounce |
Hours of logs |
|
Archive log deleted from
Primary |
RMAN deleted logs before FAL
could ship them to DR |
Permanent gap |
2.2 Gap Detection Query
Run the following on the DR standby to identify the exact gap
range:
--
On DR Standby (DRDB)
-- Query 1: Show all
archive gaps
SELECT THREAD#,
LOW_SEQUENCE#, HIGH_SEQUENCE#,
(HIGH_SEQUENCE# - LOW_SEQUENCE# + 1) AS
GAP_COUNT
FROM V$ARCHIVE_GAP
ORDER BY THREAD#, LOW_SEQUENCE#;
-- Query 2: Last applied
archive log
SELECT MAX(SEQUENCE#) AS
LAST_APPLIED
FROM V$ARCHIVED_LOG
WHERE APPLIED = 'YES'
AND STANDBY_DEST = 'NO';
-- Query 3: Compare with
Primary
-- Run this on PRIMARY
(PRODDB):
SELECT MAX(SEQUENCE#) AS
CURRENT_SEQ
FROM V$ARCHIVED_LOG
WHERE DEST_ID = 1;
-- Query 4: Full gap
analysis on Standby
SELECT NAME, SEQUENCE#,
APPLIED, COMPLETION_TIME
FROM V$ARCHIVED_LOG
WHERE APPLIED = 'NO'
ORDER BY SEQUENCE#;
|
⚠
WARNING — Complete All Checks Before Proceeding Skipping
pre-recovery checks may result in incomplete recovery or data corruption. Ensure
you have a valid RMAN backup of the Primary DB before making any changes. |
|
# |
Checklist
Item |
Command /
Action |
Expected
Result |
|
1 |
Verify Primary DB is open |
SELECT STATUS FROM
V$INSTANCE; |
OPEN |
|
2 |
Check DR standby mount
status |
SELECT STATUS FROM
V$INSTANCE; (on DR) |
MOUNTED |
|
3 |
Stop MRP on DR standby |
ALTER DATABASE RECOVER
MANAGED STANDBY DATABASE CANCEL; |
Statement processed |
|
4 |
Identify gap sequences |
SELECT * FROM V$ARCHIVE_GAP; |
Note LOW and HIGH sequences |
|
5 |
Check archive log disk on
Primary |
df -h /arch/PRODDB/ |
> 50% free space |
|
6 |
Check archive log disk on DR |
df -h /arch/DRDB/ |
> 50% free space |
|
7 |
Confirm network connectivity |
tnsping PRODDB (from DR
host) |
OK response |
|
8 |
Verify RMAN catalog
available |
rman catalog rman/pwd@catdb |
Connected to RMAN catalog |
|
9 |
Take RMAN backup of Primary |
BACKUP DATABASE PLUS
ARCHIVELOG; |
Backup successful |
|
10 |
Note Primary current SCN |
SELECT CURRENT_SCN FROM
V$DATABASE; |
Record the SCN value |
|
1 |
Identify
Exact Gap Range & Severity |
SSH to the DR server and run the following assessment queries:
--
SSH to DR site
ssh
oracle@dr-db01.company.com
export ORACLE_SID=DRDB
sqlplus / as sysdba
-- Step 1a: Full standby
status
SELECT DB_UNIQUE_NAME,
OPEN_MODE, PROTECTION_MODE,
DATABASE_ROLE, SWITCHOVER_STATUS
FROM V$DATABASE;
-- Step 1b: MRP process
status
SELECT PROCESS, STATUS,
SEQUENCE#, BLOCK#
FROM V$MANAGED_STANDBY
WHERE PROCESS LIKE 'MRP%';
-- Step 1c: Archive gap
details
SELECT THREAD#,
LOW_SEQUENCE#,
HIGH_SEQUENCE#,
(HIGH_SEQUENCE# - LOW_SEQUENCE# + 1) AS
TOTAL_MISSING
FROM V$ARCHIVE_GAP;
-- Step 1d: Stop MRP
before any gap recovery
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL;
|
ℹ
Decision Tree — Choose Recovery Path GAP
< 50 logs → Use Step 2: Ship logs
manually from Primary. GAP
50–100 logs → Use Step 2 + Step 3:
Manual ship + FAL re-register. GAP
> 100 logs → Use Step 5:
Incremental RMAN Backup Refresh (recommended). GAP
> 500 logs → Full RMAN restore
from Primary backup recommended. |
|
2 |
Transfer
Missing Archive Logs from Primary to DR Site |
5.1 Identify missing log files on Primary
--
On PRIMARY (PRODDB)
-- Find all archive logs
in the gap range
-- Replace 4521 and 4620
with your LOW_SEQUENCE# and HIGH_SEQUENCE#
SELECT NAME, SEQUENCE#,
FIRST_TIME, NEXT_TIME, BLOCKS * BLOCK_SIZE / 1048576 AS SIZE_MB
FROM V$ARCHIVED_LOG
WHERE THREAD# = 1
AND SEQUENCE# BETWEEN 4521 AND 4620
AND DEST_ID = 1
AND STANDBY_DEST = 'NO'
ORDER BY SEQUENCE#;
5.2 Copy missing archive logs to DR site
--
On PRIMARY server (bash)
-- Method A: SCP
individual sequence range
for seq in $(seq 4521
4620); do
FILE=$(ls /arch/PRODDB/1_${seq}_*.arc
2>/dev/null | head -1)
if [ -f "$FILE" ]; then
scp $FILE
oracle@dr-db01.company.com:/arch/DRDB/
echo "Copied: $FILE"
else
echo "MISSING: sequence $seq"
fi
done
-- Method B: rsync entire
archive directory (safer for large gaps)
rsync -avz --progress
/arch/PRODDB/ \
oracle@dr-db01.company.com:/arch/DRDB/
-- Verify count on DR side
ssh
oracle@dr-db01.company.com \
'ls /arch/DRDB/*.arc |
wc -l'
5.3 Re-register archive logs on DR standby
--
On DR Standby (DRDB)
-- Catalog the copied
archive logs in DR's control file
RMAN> CONNECT TARGET /
RMAN> CATALOG START
WITH '/arch/DRDB/' NOPROMPT;
-- Verify they are
catalogued
RMAN> LIST ARCHIVELOG FROM SEQUENCE 4521 UNTIL SEQUENCE 4620
THREAD 1;
|
3 |
Manually
Apply Archive Logs to Advance Standby |
6.1 Apply specific archive log sequence range
--
On DR Standby (sqlplus / as sysdba)
-- Option A: Apply a
single specific log
ALTER DATABASE REGISTER
PHYSICAL LOGFILE '/arch/DRDB/1_4521_98765432.arc';
-- Option B: Apply all
catalogued logs automatically (recommended)
ALTER DATABASE RECOVER
STANDBY DATABASE
UNTIL SEQUENCE 4621 THREAD 1;
-- Option C: Resume MRP
(auto-applies all available logs)
ALTER DATABASE RECOVER
MANAGED STANDBY DATABASE
USING CURRENT LOGFILE DISCONNECT FROM
SESSION;
-- Monitor MRP progress
SELECT PROCESS, STATUS,
SEQUENCE#, DELAY_MINS
FROM V$MANAGED_STANDBY;
6.2 FAL (Fetch Archive Log) — Force Re-fetch from
Primary
If archive logs exist on the Primary but never shipped,
trigger FAL to re-fetch them automatically:
--
On DR Standby — Force FAL fetch
ALTER DATABASE RECOVER
MANAGED STANDBY DATABASE CANCEL;
-- Set FAL server to point
to Primary
ALTER SYSTEM SET
FAL_SERVER='PRODDB' SCOPE=BOTH;
ALTER SYSTEM SET
FAL_CLIENT='DRDB' SCOPE=BOTH;
-- Resume MRP with FAL
enabled
ALTER DATABASE RECOVER
MANAGED STANDBY DATABASE
USING CURRENT LOGFILE DISCONNECT FROM
SESSION;
-- Watch FAL activity in
alert log
tail -f $ORACLE_BASE/diag/rdbms/drdb/DRDB/trace/alert_DRDB.log |
grep -i 'FAL\|gap\|arch'
|
4 |
Confirm
Gap is Resolved and MRP is Running Cleanly |
--
On DR Standby (DRDB)
-- Check 1: Confirm no
more archive gaps
SELECT * FROM
V$ARCHIVE_GAP;
-- Expected: 0 rows
returned
-- Check 2: MRP process
must show APPLYING_LOG
SELECT PROCESS, STATUS,
SEQUENCE#, BLOCK#, ACTIVE_AGENTS
FROM V$MANAGED_STANDBY
WHERE PROCESS LIKE 'MRP%';
-- Expected: STATUS =
'APPLYING_LOG'
-- Check 3: Confirm
applied sequence matches Primary
-- On PRIMARY:
SELECT MAX(SEQUENCE#) FROM
V$LOG_HISTORY WHERE THREAD#=1;
-- On DR STANDBY:
SELECT MAX(SEQUENCE#) FROM
V$ARCHIVED_LOG WHERE APPLIED='YES' AND THREAD#=1;
-- Both numbers should
match (or DR within 1-2 sequences)
-- Check 4: Data Guard
status
SELECT DEST_ID, STATUS,
TARGET, ARCHIVER, SCHEDULE,
DESTINATION, DB_UNIQUE_NAME
FROM V$ARCHIVE_DEST
WHERE STATUS <> 'INACTIVE';
-- Check 5: Transport lag
and apply lag
SELECT NAME, VALUE,
DATUM_TIME
FROM V$DATAGUARD_STATS
WHERE NAME IN ('transport
lag','apply lag','apply finish time');
|
✔
Gap Resolved — Expected Output V$ARCHIVE_GAP → Returns 0 rows (no gaps remaining). V$MANAGED_STANDBY
→ MRP0 shows STATUS = APPLYING_LOG. Transport
lag → Should be < 5 minutes after
gap resolution. Apply
lag → Should reduce to < 10
minutes within 30 min of MRP restart. |
|
5 |
RMAN
Incremental Backup Refresh for Large Archive Gaps |
|
⚠
Use This Method When Gap > 100 Archive Logs For
gaps exceeding 100 archive logs, manually copying files is error-prone. Use
RMAN Incremental Backup from Primary and roll forward the DR standby. This
is Oracle's recommended approach for large gaps in Data Guard environments. |
8.1 Take Incremental Backup on Primary
--
On PRIMARY (PRODDB) — RMAN
rman TARGET
sys/Oracle123@PRODDB
-- Record the current SCN
before backup
RMAN> SELECT
CURRENT_SCN FROM V$DATABASE;
-- Take incremental backup
from standby's last SCN
-- Replace 9876543 with
standby's last applied SCN
RMAN> BACKUP
INCREMENTAL FROM SCN 9876543
DATABASE
FORMAT
'/rman_backup/dr_refresh/incr_%d_%U.bkp'
TAG 'DR_GAP_REFRESH_20250404';
-- Also backup current
standby controlfile
RMAN> BACKUP CURRENT
CONTROLFILE FOR STANDBY
FORMAT
'/rman_backup/dr_refresh/stdby_ctl_%U.bkp';
RMAN> LIST BACKUP TAG 'DR_GAP_REFRESH_20250404';
8.2 Transfer backup to DR site
--
On Primary server (bash)
rsync -avz --progress
/rman_backup/dr_refresh/ \
oracle@dr-db01.company.com:/rman_backup/dr_refresh/
-- Verify transfer
integrity
ssh
oracle@dr-db01.company.com \
'ls -lh
/rman_backup/dr_refresh/'
8.3 Catalog and roll forward DR standby
--
On DR Standby — Stop MRP first
sqlplus / as sysdba
SQL> ALTER DATABASE
RECOVER MANAGED STANDBY DATABASE CANCEL;
-- RMAN on DR standby
rman TARGET /
-- Catalog the incremental
backup pieces
RMAN> CATALOG START
WITH '/rman_backup/dr_refresh/' NOPROMPT;
-- Roll forward the
standby using incremental backup
RMAN> RECOVER DATABASE
NOREDO;
-- Restart MRP to resume
normal redo apply
-- (back in sqlplus)
SQL> ALTER DATABASE
RECOVER MANAGED STANDBY DATABASE
USING CURRENT LOGFILE DISCONNECT FROM
SESSION;
-- Confirm gap is cleared
SQL> SELECT * FROM
V$ARCHIVE_GAP;
-- Expected: no rows
9. Step
6 — Full DR Failover Procedure
|
6 |
Execute DR
Failover — Primary to DR Site |
|
✖
CRITICAL — Only Perform Failover After Gap is Fully Resolved A
failover with unresolved archive gaps will cause data loss. Confirm
V$ARCHIVE_GAP returns 0 rows before proceeding with failover. This
is a ONE-WAY operation — the original Primary becomes a standby after
switchover. |
9.1 Graceful Switchover (Primary is still
accessible)
--
══ On PRIMARY (PRODDB) ══
sqlplus / as sysdba
-- Verify switchover is
ready
SELECT SWITCHOVER_STATUS
FROM V$DATABASE;
-- Must show: TO STANDBY
or SESSIONS ACTIVE
-- Initiate switchover to
standby role
ALTER DATABASE COMMIT TO
SWITCHOVER TO PHYSICAL STANDBY WITH SESSION SHUTDOWN;
-- Shutdown and remount as
standby
SHUTDOWN ABORT;
STARTUP MOUNT;
-- ══ On DR Standby (DRDB)
══
sqlplus / as sysdba
-- Confirm switchover
status
SELECT SWITCHOVER_STATUS FROM
V$DATABASE;
-- Must show: TO PRIMARY
-- Activate DR as new
Primary
ALTER DATABASE COMMIT TO
SWITCHOVER TO PRIMARY WITH SESSION SHUTDOWN;
ALTER DATABASE OPEN;
-- Verify new Primary is
open
SELECT NAME, OPEN_MODE,
DATABASE_ROLE FROM V$DATABASE;
-- Expected: DRDB | READ WRITE | PRIMARY
9.2 Emergency Failover (Primary is DOWN)
--
══ On DR Standby only — Emergency Failover ══
sqlplus / as sysdba
-- Stop MRP
ALTER DATABASE RECOVER
MANAGED STANDBY DATABASE FINISH;
-- Activate as Primary
(with potential data loss warning)
ALTER DATABASE ACTIVATE
PHYSICAL STANDBY DATABASE;
-- Open the new primary
ALTER DATABASE OPEN;
-- Reset archive log
destination for new primary
ALTER SYSTEM SET
LOG_ARCHIVE_DEST_1='LOCATION=/arch/DRDB/' SCOPE=BOTH;
-- Add tempfile if missing
ALTER TABLESPACE TEMP ADD
TEMPFILE '/oradata/DRDB/temp01.dbf' SIZE 500M REUSE;
-- Verify
SELECT NAME, DB_UNIQUE_NAME, OPEN_MODE, DATABASE_ROLE FROM
V$DATABASE;
10. Step
7 — Post-Recovery Verification
|
7 |
Full
Health Check After DR Recovery |
--
On new Primary (DR site — DRDB)
sqlplus / as sysdba
-- 1. Database identity
and role
SELECT NAME,
DB_UNIQUE_NAME, OPEN_MODE, DATABASE_ROLE,
PROTECTION_MODE, SWITCHOVER_STATUS
FROM V$DATABASE;
-- 2. All datafiles must
be ONLINE
SELECT FILE#, STATUS, NAME
FROM V$DATAFILE
WHERE STATUS NOT IN
('ONLINE','SYSTEM');
-- Expected: 0 rows (all
online)
-- 3. Tablespace status
SELECT TABLESPACE_NAME,
STATUS FROM DBA_TABLESPACES
WHERE STATUS != 'ONLINE';
-- Expected: 0 rows
-- 4. No undo / rollback
segments issues
SELECT COUNT(*) FROM
V$ROLLSTAT WHERE STATUS='NEEDS RECOVERY';
-- Expected: 0
-- 5. Redo log status
SELECT GROUP#, STATUS,
ARCHIVED, MEMBERS FROM V$LOG;
-- 6. Invalid objects
check
SELECT COUNT(*) FROM
DBA_OBJECTS WHERE STATUS='INVALID';
-- 7. Recompile any
invalid objects
EXEC
UTL_RECOMP.RECOMP_SERIAL;
-- 8. Alert log — check
for errors
-- (bash)
tail -100 $ORACLE_BASE/diag/rdbms/drdb/DRDB/trace/alert_DRDB.log
| grep -iE 'error|ora-|warning'
|
✔
DR Recovery Complete — Final Verification Checklist DATABASE_ROLE
= PRIMARY and OPEN_MODE = READ WRITE. V$ARCHIVE_GAP
returns 0 rows. All
tablespaces ONLINE, all datafiles ONLINE. No
ORA- errors in the alert log. Application
connectivity restored and tested. RMAN
backup of new Primary completed immediately after recovery. |
|
ORA / RMAN
Error |
Root Cause |
Resolution |
|
ORA-01547 / ORA-01194 |
Log sequence not contiguous;
gap remains |
Re-run V$ARCHIVE_GAP;
re-copy missing sequences |
|
ORA-16014 |
Log not archived; redo log
cannot be reused |
Force log switch on Primary:
ALTER SYSTEM ARCHIVE LOG CURRENT |
|
ORA-16433 |
Standby open mode conflict
during failover |
Ensure DB is in MOUNT state
before ACTIVATE command |
|
ORA-01665 |
Controlfile is not a standby
controlfile |
Restore standby controlfile
from RMAN backup |
|
RMAN-06059 |
Expected archived log not
found |
Run CROSSCHECK ARCHIVELOG
ALL; DELETE EXPIRED ARCHIVELOG ALL |
|
ORA-16058 / ORA-16766 |
Redo apply stopped due to
gap |
Re-catalog archives; restart
MRP with FAL_SERVER set |
|
MRP: Status = WAIT_FOR_GAP |
FAL not configured or
Primary unreachable |
SET FAL_SERVER; verify
tnsping from DR to Primary |
|
ORA-19504 / ORA-27037 |
Archive log file not found
on DR |
Verify /arch/DRDB/ path
permissions; re-copy files |
|
ORA-16191 |
Primary log shipping
disabled |
ALTER SYSTEM SET
LOG_ARCHIVE_DEST_STATE_2=ENABLE on Primary |
|
ORA-01152 |
File not restored from
backup |
Restore missing datafile
from RMAN: RESTORE DATAFILE <n> |
No comments:
Post a Comment