Monday, October 3, 2022

Oracle: Dataguard: Automatic Failover and Switchover

 

An example of how to perform an automatic (fast-start) failover, and later a switchover back, on Oracle Data Guard.


Run show configuration first. As expected, the primary is orclprm.

[oracle@Primary ~]$ dgmgrl /
DGMGRL for Linux: Release 19.0.0.0.0 - Production on Mon Oct 3 14:23:59 2022
Version 19.3.0.0.0

Copyright (c) 1982, 2019, Oracle and/or its affiliates. All rights reserved.

Welcome to DGMGRL, type "help" for information.
Connected to "ORCLPRM"
Connected as SYSDG.
DGMGRL> connect sys@Standby
Password:
Connected to "ORCLSTB"
Connected as SYSDBA.
DGMGRL> show configuration

Configuration - orcl

Protection Mode: MaxPerformance
Members:
orclprm - Primary database
orclstb - Physical standby database

Fast-Start Failover: Disabled

Configuration Status:
SUCCESS (status updated 32 seconds ago)
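
For scripted checks, the same output can be reduced to a few fields. The following is only a minimal sketch in Python: the function name parse_show_configuration and the dictionary layout are made up for illustration, and the sample text is the output shown above.

```python
import re

# Sample copied from the `show configuration` output above.
SAMPLE = """\
Configuration - orcl

Protection Mode: MaxPerformance
Members:
orclprm - Primary database
orclstb - Physical standby database

Fast-Start Failover: Disabled

Configuration Status:
SUCCESS (status updated 32 seconds ago)
"""

# "<name> - [(*) ]<role> database"; (*) marks the fast-start failover target.
MEMBER_RE = re.compile(
    r"^(\S+)\s+-\s+(?:\(\*\)\s+)?"
    r"(Primary|Physical standby|Logical standby|Snapshot standby) database$"
)


def parse_show_configuration(text):
    """Summarize `show configuration` output (hypothetical helper)."""
    summary = {"name": None, "protection_mode": None,
               "members": {}, "fsfo": None, "status": None}
    lines = [line.strip() for line in text.splitlines()]
    for index, line in enumerate(lines):
        if line.startswith("Configuration - "):
            summary["name"] = line.split(" - ", 1)[1]
        elif line.startswith("Protection Mode:"):
            summary["protection_mode"] = line.split(":", 1)[1].strip()
        elif line.startswith("Fast-Start Failover:"):
            summary["fsfo"] = line.split(":", 1)[1].strip()
        elif line == "Configuration Status:":
            # The status word is the first token on the next non-empty line.
            for following in lines[index + 1:]:
                if following:
                    summary["status"] = following.split()[0]
                    break
        else:
            member = MEMBER_RE.match(line)
            if member:
                summary["members"][member.group(1)] = member.group(2)
    return summary


config = parse_show_configuration(SAMPLE)
print(config["members"], config["status"])
```

A check such as config["members"]["orclprm"] == "Primary" can then be run before and after each role change.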

Enabling fast-start failover produces warnings that need to be addressed. One of them complains that the observer has not been started. The observer is a process that monitors both the primary and the standby. Start it, then open another terminal to perform the rest of the test, because the observer keeps running in the DGMGRL session it was started from.

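This step sets the fast-start failover lag limit (FastStartFailoverLagLimit) to 60 seconds and enables fast-start failover. The lag limit is how far, in seconds, the standby may fall behind the primary and still be a failover target. The time the observer waits before failing over is a separate property, FastStartFailoverThreshold, which is left at its default of 30 seconds here.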

DGMGRL> edit configuration set property FastStartFailoverLagLimit=60;
Property "faststartfailoverlaglimit" updated


DGMGRL> enable fast_start failover;
Warning: ORA-16827: Flashback Database is disabled

Enabled in Potential Data Loss Mode.
DGMGRL> show configuration

Configuration - orcl

Protection Mode: MaxPerformance
Members:
orclprm - Primary database
Warning: ORA-16819: fast-start failover observer not started

orclstb - (*) Physical standby database

Fast-Start Failover: Enabled in Potential Data Loss Mode

Configuration Status:
WARNING (status updated 48 seconds ago)


DGMGRL> start observer;
[W000 2022-10-03T14:27:52.856-06:00] FSFO target standby is orclstb
Observer 'Primary.localdomain' started
[W000 2022-10-03T14:27:52.936-06:00] Observer trace level is set to USER
[W000 2022-10-03T14:27:52.936-06:00] Try to connect to the primary.
[W000 2022-10-03T14:27:52.936-06:00] Try to connect to the primary primary.
[W000 2022-10-03T14:27:52.947-06:00] The standby orclstb is ready to be a FSFO target
[W000 2022-10-03T14:27:53.948-06:00] Connection to the primary restored!
[W000 2022-10-03T14:27:55.949-06:00] Disconnecting from database primary.
[W000 2022-10-03T14:29:52.153-06:00] Primary database cannot be reached.
[W000 2022-10-03T14:29:52.153-06:00] Fast-Start Failover threshold has not exceeded. Retry for the next 30 seconds
[W000 2022-10-03T14:29:53.153-06:00] Try to connect to the primary.
[W000 2022-10-03T14:29:54.199-06:00] Primary database cannot be reached.
[W000 2022-10-03T14:29:55.199-06:00] Try to connect to the primary.
[W000 2022-10-03T14:30:20.788-06:00] Primary database cannot be reached.
[W000 2022-10-03T14:30:20.788-06:00] Fast-Start Failover threshold has not exceeded. Retry for the next 2 seconds
[W000 2022-10-03T14:30:21.789-06:00] Try to connect to the primary.
[W000 2022-10-03T14:30:22.839-06:00] Primary database cannot be reached.
[W000 2022-10-03T14:30:22.839-06:00] Fast-Start Failover threshold has expired.
[W000 2022-10-03T14:30:22.839-06:00] Try to connect to the standby.
[W000 2022-10-03T14:30:22.839-06:00] Making a last connection attempt to primary database before proceeding with Fast-Start Failover.
[W000 2022-10-03T14:30:22.839-06:00] Check if the standby is ready for failover.
[S002 2022-10-03T14:30:22.848-06:00] Fast-Start Failover started...
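
The timestamps in the observer output show how long the observer waited before failing over. The following is a minimal sketch in Python that computes that interval from a few of the lines above; the helper names first_timestamp and seconds_between are made up for illustration.

```python
import re
from datetime import datetime

# Lines copied from the observer output above.
LOG = """\
[W000 2022-10-03T14:29:52.153-06:00] Primary database cannot be reached.
[W000 2022-10-03T14:29:52.153-06:00] Fast-Start Failover threshold has not exceeded. Retry for the next 30 seconds
[W000 2022-10-03T14:30:22.839-06:00] Fast-Start Failover threshold has expired.
[S002 2022-10-03T14:30:22.848-06:00] Fast-Start Failover started...
"""

# [<thread> <ISO 8601 timestamp>] <message>
LINE_RE = re.compile(r"^\[(\w+) (\S+)\] (.*)$")


def first_timestamp(log, needle):
    """Timestamp of the first log line whose message contains `needle`."""
    for line in log.splitlines():
        match = LINE_RE.match(line.strip())
        if match and needle in match.group(3):
            return datetime.fromisoformat(match.group(2))
    raise ValueError(f"no log line contains {needle!r}")


def seconds_between(log, start_needle, end_needle):
    """Elapsed seconds between the first matches of two messages."""
    delta = first_timestamp(log, end_needle) - first_timestamp(log, start_needle)
    return delta.total_seconds()


detection = seconds_between(LOG, "cannot be reached", "threshold has expired")
print(f"unreachable -> threshold expired: {detection:.3f} s")
```

For this run the interval is 30.686 seconds, which matches the "Retry for the next 30 seconds" message in the log.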

Once the primary database is shut down with Shutdown Abort, the observer fails over to the standby. After that it keeps pinging the old primary, which is still down, so the last few messages below repeat until that database is reinstated.

2022-10-03T14:30:22.848-06:00
Initiating Fast-Start Failover to database "orclstb"...
[S002 2022-10-03T14:30:22.848-06:00] Initiating Fast-start Failover.
Performing failover NOW, please wait...
Failover succeeded, new primary is "orclstb"
2022-10-03T14:30:50.857-06:00
[S002 2022-10-03T14:30:50.857-06:00] Fast-Start Failover finished...
[W000 2022-10-03T14:30:50.857-06:00] Failover succeeded. Restart pinging.
[W000 2022-10-03T14:30:50.870-06:00] Primary database has changed to orclstb.
[W000 2022-10-03T14:30:50.871-06:00] Try to connect to the primary.
[W000 2022-10-03T14:30:50.871-06:00] Try to connect to the primary standby.
[W000 2022-10-03T14:30:50.962-06:00] The standby orclprm needs to be reinstated
[W000 2022-10-03T14:30:50.962-06:00] Try to connect to the new standby orclprm.
[W000 2022-10-03T14:30:50.962-06:00] Connection to the primary restored!
[W000 2022-10-03T14:30:52.963-06:00] Connection to the new standby restored!
[W000 2022-10-03T14:30:52.963-06:00] Disconnecting from database standby.
[W000 2022-10-03T14:30:53.969-06:00] Failed to ping the new standby.
[W000 2022-10-03T14:30:54.969-06:00] Try to connect to the new standby orclprm.
[W000 2022-10-03T14:30:56.970-06:00] Connection to the new standby restored!
[W000 2022-10-03T14:30:56.983-06:00] Failed to ping the new standby.
[W000 2022-10-03T14:30:57.984-06:00] Try to connect to the new standby orclprm.
[W000 2022-10-03T14:30:59.985-06:00] Connection to the new standby restored!

In a separate SQL*Plus session on the old primary (orclprm), perform a Startup Mount, then perform a Reinstate in dgmgrl.

DGMGRL> reinstate database orclprm
Reinstating database "orclprm", please wait...
Error: ORA-16657: reinstatement of database in progress

Failed.
Database reinstatement for "orclprm" in progress
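
The SQL*Plus side of this test (Shutdown Abort on the primary, then Startup Mount) is not shown in the transcripts. The following is a minimal sketch of how those two steps could be scripted from Python. It assumes sqlplus is on the PATH and that OS authentication (/ as sysdba) works on the database host; the helper names are made up for illustration, and nothing is executed until run_as_sysdba is called.

```python
import subprocess


def sqlplus_script(*statements):
    """Build the text fed to SQL*Plus on stdin: one command per line, then exit."""
    return "\n".join(list(statements) + ["exit"]) + "\n"


def run_as_sysdba(script):
    """Run a script with `sqlplus -S / as sysdba` on the local instance.

    Assumption: OS authentication is configured for the current user.
    """
    return subprocess.run(
        ["sqlplus", "-S", "/", "as", "sysdba"],
        input=script, text=True, capture_output=True, check=False,
    )


# Step 1 (on the primary): simulate the outage that triggers the failover.
CRASH_PRIMARY = sqlplus_script("shutdown abort")

# Step 2 (on the old primary, after the failover): mount it so it can be reinstated.
MOUNT_OLD_PRIMARY = sqlplus_script("startup mount")

print(CRASH_PRIMARY, MOUNT_OLD_PRIMARY, sep="")
```

Calling run_as_sysdba(MOUNT_OLD_PRIMARY) on the orclprm host would correspond to the Startup Mount step described above.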

DGMGRL> show configuration

Configuration - orcl

Protection Mode: MaxPerformance
Members:
orclstb - Primary database
Warning: ORA-16824: multiple warnings, including fast-start failover-related warnings, detected for the database

orclprm - (*) Physical standby database
Error: ORA-16816: incorrect database role

Fast-Start Failover: Enabled in Potential Data Loss Mode

Configuration Status:
ERROR (status updated 31 seconds ago)
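
While the reinstatement is in progress, each member carries its own warning or error. The following is a minimal sketch in Python that groups the ORA- codes by member, using the output above as sample input; the function name diagnostics_by_member is made up for illustration.

```python
import re

# Member section copied from the `show configuration` output above.
OUTPUT = """\
Members:
orclstb - Primary database
Warning: ORA-16824: multiple warnings, including fast-start failover-related warnings, detected for the database

orclprm - (*) Physical standby database
Error: ORA-16816: incorrect database role
"""

MEMBER_RE = re.compile(
    r"^(\S+)\s+-\s+(?:\(\*\)\s+)?"
    r"(?:Primary|Physical standby|Logical standby|Snapshot standby) database$"
)
DIAG_RE = re.compile(r"^(Warning|Error): (ORA-\d+): (.*)$")


def diagnostics_by_member(text):
    """Map each member name to the (severity, code) pairs listed under it."""
    result = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        diag = DIAG_RE.match(line)
        if diag:
            # A diagnostic belongs to the most recent member line.
            if current is not None:
                result[current].append((diag.group(1), diag.group(2)))
            continue
        member = MEMBER_RE.match(line)
        if member:
            current = member.group(1)
            result[current] = []
    return result


print(diagnostics_by_member(OUTPUT))
```

An empty list for every member would correspond to the clean output shown once the reinstatement has finished.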


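As soon as the old primary is in Startup Mount, the observer reinstates it automatically. That is most likely why the manual Reinstate above reported ORA-16657: a reinstatement was already in progress. The observer output then shows the reinstatement succeeding.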

[W000 2022-10-03T14:34:11.161-06:00] Failed to ping the new standby.
[W000 2022-10-03T14:34:12.161-06:00] Try to connect to the new standby orclprm.
[W000 2022-10-03T14:34:13.161-06:00] Connection to the new standby restored!
[W000 2022-10-03T14:34:46.215-06:00] Try to connect to the primary standby.
[W000 2022-10-03T14:34:47.215-06:00] Connection to the primary restored!
[W000 2022-10-03T14:34:47.215-06:00] Wait for new primary to be ready to reinstate.
[W000 2022-10-03T14:34:48.229-06:00] New primary is now ready to reinstate.
[W000 2022-10-03T14:34:49.230-06:00] Issuing REINSTATE command.

2022-10-03T14:34:49.230-06:00
Initiating reinstatement for database "orclprm"...
Reinstating database "orclprm", please wait...
[W000 2022-10-03T14:35:05.258-06:00] The standby orclprm is ready to be a FSFO target
Reinstatement of database "orclprm" succeeded
2022-10-03T14:35:32.952-06:00
[W000 2022-10-03T14:35:33.305-06:00] Successfully reinstated database orclprm.

With everything back in a healthy state, orclstb remains the primary until the switchover takes place.

DGMGRL> show configuration

Configuration - orcl

Protection Mode: MaxPerformance
Members:
orclstb - Primary database
orclprm - (*) Physical standby database

Fast-Start Failover: Enabled in Potential Data Loss Mode

Configuration Status:
SUCCESS (status updated 52 seconds ago)

Perform the switchover back to the original primary (orclprm), after disabling fast-start failover.
DGMGRL> disable fast_start failover;
Disabled.

DGMGRL> connect sys as sysdba
Password:
Connected to "ORCLPRM"
Connected as SYSDBA.
DGMGRL> show configuration

Configuration - orcl

Protection Mode: MaxPerformance
Members:
orclstb - Primary database
orclprm - Physical standby database

Fast-Start Failover: Disabled

Configuration Status:
SUCCESS (status updated 73 seconds ago)

DGMGRL> switchover to orclprm
Performing switchover NOW, please wait...
New primary database "orclprm" is opening...
Operation requires start up of instance "orcl" on database "orclstb"
Starting instance "orcl"...
Connected to "ORCLSTB"
ORACLE instance started.
Connected to "ORCLSTB"
Database mounted.
Database opened.
Connected to "ORCLSTB"
Connected to "ORCLPRM"
Switchover succeeded, new primary is "orclprm"
DGMGRL> show configuration

Configuration - orcl

Protection Mode: MaxPerformance
Members:
orclprm - Primary database
orclstb - Physical standby database

Fast-Start Failover: Disabled

Configuration Status:
SUCCESS (status updated 70 seconds ago)