Subject: Standby Database and 8.1 Type: BULLETIN Status: PUBLISHED Creation Date: 17-SEP-1999 SCOPE & APPLICATION The audience for this paper are professionals responsible for administering an 8.1 database on a Unix platform. It is assumed that the reader is completely familiar with the Oracle Technical report Oracle8i Standby Database [NOTE:76451.1] which details the planning, design, preparation, deployment and maintenance of a standby database under 7.3. OVERVIEW For many years now there has been additional emphasis placed on redundancy and high availability in information systems. Part of Oracle's response to this has been the standby database solution. Oracle 8.1 offers new functionality to simplify and automate the maintenance of the standby database. Previously, it was the responsibility of the DBA to copy archived redo logs to the remote host and to apply them so that the primary database and the standby database remained synchronized. These functions are automated in Oracle 8.1. There are really two features used to provide this functionality; remote archival of the primary database's redo log files to the standby database host, and the subsequent automatic application of these archive files to the standby database. Additional enhancements are increased flexibility in defining local archive destinations and archive completion requirements, and better I/O failure handling and transmission restart capability. A new feature of the standby database is the ability to open this database in read only mode. While the standby database cannot be maintained while in read only mode, this provides the ability to use the processing capacity of the standby host during off hours batch processing for example, and return to maintenance mode in the morning. While the emphasis will be on the new features, and how to use them, the structure of the paper will be based on the process of standby database implementation. The implementation of this feature applies to ARCH, as well as its' foreground equivalent, but ARCH will predominantly be discussed for simplicity. ARCHITECTURAL OVERVIEW The foreground process or ARCH will archive to multiple locations specified by new initSID.ora parameters. One of these must be a local directory, but one or more can be a remote location. A remote destination is specified to the primary by a Net8 Service Name which is in turn defined in the appropriate network configuration files. It is the job of ARCH to accomplish the transmission to the remote site. Once ARCH has something to transmit, it invokes a non-local connection at the standby host called a Remote File Server(RFS). The listener at the remote host accepts the connection requests from ARCH and passes the connection request to RFS. The RFS process consumes the I/O from ARCH. It is the job of RFS to create and populate the remote archived logs at the remote site, and to update the controlfile of the standby database. RFS does not update the controlfile until it has completely written the archive log. Once the controlfile has been updated, the standby database will be automatically updated if it has been placed in the sustained recovery mode. Alternatively, the database can be maintained manually as before. This remote archive process requires uninterrupted connectivity over TCP/IP, and the standby database must be mounted as a standby database, open read only, or in a sustained or manual maintenance mode. Connectivity to the standby database will have to be reestablished if broken. Implementation involves configuring the primary database, the network, and the standby database. CONFIGURING THE PRIMARY OVERVIEW Beyond setting up the database to run in archivelog mode, configuring the primary to perform remote archiving involves using new initialization parameters for setting archive log destinations and associated states, and these changes affect both remote and local destinations as well. This discussion, therefore, encompasses the larger question of parameter usage, not just remote archival. Log_archive_dest_n initialization parameters for defining archive destinations replace earlier log_archive_dest and log_archive_duplex_dest parameters. These allow up to five archive log destinations, including 1 mandatory local destination and up to 4 local or remote destinations. A new log_archive_dest_state_n parameter defines an initial state for these destinations. These parameters may be dynamically changed via "alter system set" or "alter session set" commands. Alter session commands, however, would only affect foreground operations of the session entering the command. System commands would affect the life of the instance and apply to all sessions. The "n" suffix of the new parameters can have a value of 1 through 5. This number makes each parameter discrete. Log_archive_dest_state_3 applies to log_archive_dest_3, for example, and fixed-table queries differentiate the various parameters by number. This suffix is known as the "handle". DESTINATION STATUS Before delving into parameter details, it is important to understand the concept of "state". How Oracle treats a particular destination with respect to current and future archive activities is driven by the "state" of that destination. The "state" in turn is determined by the combination of values of three non-persistent characteristics. These characteristics are: valid/invalid - is the log_archive_dest_n destination specified? enabled/disabled - is what is the log_archive_dest_state_n value? active/inactive - has there been an error to this destination during the archive operation? The characteristics above combine to result in a state shown in the table below where "T" means true and "F" means false. The current state of a destination is contained in v$archive_dest. Valid Enabled Active State Remark ----- ------- ------ ------ ---------- F N/A N/A Inactive Destination was not defined - not used for archive T T T Valid Destination defined, enabled, no current errors T F T Deferred Set or initialized as "defer"- not used for future archive T F F Disabled Manually disabled following error - not used for archive T T F Error Error during archive - not used for future archive Log_archive_dest_state_n The enabled/disabled characteristic noted above is defined by the log_archive_dest_state_n parameter. The possible values are "enable" or "defer", with the default value being enable. Enable could be interpreted to mean for Oracle to use the destination, and defer to mean for Oracle not to use it. This parameter can be changed dynamically at any time via "alter system" to get Oracle to start using the destination. This is an attractive way of pre-defining an emergency local destination. For example, if log_archive_dest_state_3=defer was defined in the initSID.ora, and the log_archive_dest_3 destination was defined as a local directory, it would not be used for archiving, but would show up in v$archive_dest in deferred state. However, entering "alter system set log_archive_dest_state_3=enable" and starting an archive after failure would cause Oracle to write to this destination and would help to fulfill log_archive_min_dest_succeed requirements only. Log_archive_dest_n The other user controlled characteristic in the state table shown above is valid/invalid, and this is determined by defining a log_archive_dest_n parameter. As with log_archive_dest_state_n, there are five available parameters to use, represented by the 1 through 5 values of the n suffix. This parameter a space delimited string that is parsed at startup. The string must contain a destination, and that may be followed by optional attributes. The general format is log_archive_dest_n="destination-clause attribute1-clause attribute2-clause". If a log_archive_dest_n is defined in initSID.ora, it must contain a destination option. There are two forms of this option, location= and service=. The service= option denotes a remote standby database host, and should be a Net8 Service Name. The location= option denotes a local destination and should be a valid directory in which Oracle has write permissions. There is no default value for the destination clause. If it is not supplied, an "ORA-16027: parameter log_archive_dest_n is missing a destination option" is received during startup. For example, log_archive_dest_3="service=FARHOST" would define a remote destination. The Net8 Service Name "FARHOST" would have to be defined in the network files to provide the necessary network support. This is discussed later. In addition to the required destination option, log_archive_dest_n accepts three different optional attributes: reopen=, and retry=, and optional/mandatory. Reopen= n sets the number of seconds before Oracle will retry the archive from the beginning to a failed destination. This parameter applies to problems with creating the file, local or remote. The default behavior is to put the destination into "error" status and not retry the archive. If retry is successful, the destination is taken out of error status. Retry= sets the number of times a failed I/O should be retried. This parameter applies to I/O errors after the file has been created and data is being written to the file. The default behavior is to put the destination into "error" status and not retry the I/O. The optional/mandatory option sets archival success to a destination as a condition for reuse of a redo log group. The default is optional, that is, success is not required. Any log_archive_dest_n with an attribute of "mandatory" MUST complete for the redo log to be reused, regardless of any other parameter or any other log_archive_dest_n that completes. For remote destinations, the optional/mandatory attribute of log_archive_dest_n is the only factor Oracle uses with respect to that destination when determining if a redo log can be used again. But for local destinations, Oracle also considers the log_archive_min_succeed initialization parameter. This parameter, which defaults to 1, has been reinterpreted to mean the number of LOCAL destinations that must succeed to allow reuse of a redo log group. If there is a logical discrepancy between the number of local destinations defined as mandatory and the number of local successes required by log_archive_min_succeed, the most strict condition applies. In other words, a log_archive_min_succeed parameter may cause a local destination defined as "optional" to become "mandatory", but it may not cause a local "mandatory" destination to become "optional". Log_archive_min_succeed may not be set higher than the number of local defined destinations, and it may not be less than 1. In archivelog mode, Oracle requires that one local archive succeed no matter what. Examples Example 1 A typical implementation might be to define several local destinations to provide local archive log duplexing, and one remote destination to supply an archive for a standby database. log_archive_dest_1="location=/u02 /archdata reopen=300 retry=5" log_archive_dest_2="location=/u05/archdata reopen=300 retry=5" log_archive_dest_3="service=STBYHOST reopen=300 retry=5" log_archive_min_succeed=1 Here there are two local destinations defined and one remote. All destinations are enabled by default as there are no log_archive_dest_state_n entries. All destinations are defined as "optional" by default. However, one or the other of the two local destinations must complete to reuse the redo log group because log_archive_min_succeed is 1, and in any case, Oracle requires that at least one local destination succeed in archivelog mode. If log_archive_min_succeed had been set to 2, then both of the local "optional" destinations would have to successfully archive to allow redo log reuse and would therefore become "mandatory". Log_archive_min_dest_succeed could not have been set to 0, and it could not have been set to 3 since there are not that many local destinations available. The local destinations should preferably point to directories located on separate disks off of separate controllers in separate cabinets. The point would be to avoid a single point of failure. Since both destinations are defined as "optional" and only one must succeed, the possibility of having the database "hang" do to archive failure is remote. Log_archive_3 is defined as a remote destination because of the service being used in the destination option instead of location. STBYHOST is a Net8 Service Name(a.k.a. alias in tnsnames.ora), that points to the host where the standby database resides. This destination is "optional", and, since log_archive_min_succeed does not affect it will have no affect on redo log file reuse. In view of the variables involved in sending files to a remote, such as status of the network and the remote host, it would seem prudent to make remote destinations "optional" to prevent the database from hanging after running out of redo log files. Example 2 Not too practical, but good for discussion. log_archive_dest_1="location=/u02/archdata mandatory" log_archive_dest_2="location=/u05/archdata mandatory" log_archive_dest_3="location=/tmp/archdata" log_archive_dest_4="location=/u03/archdata" log_archive_dest_5="service=STBYHOST mandatory" log_archive_min_succeed=2 log_archive_dest_3=defer log_archive_min_succeed only requires that two local destinations succeed, so if destination 2 and destination 4 succeeded, its' requirements would be fulfilled. But destinations 1 and 5 would still have to succeed because they are defined as mandatory. Destination 3 is deferred and will not be used for archiving until it is dynamically enabled by "alter system set" or "alter session set". If enabled and an archive started, it could be used to fulfill log_archive_min_dest_succeed requirements, but destinations defined as "mandatory"(1,2 and 5) would still have to succeed! Configuring the Network Remote destinations must have their Net8 Service Name defined in the appropriate network files. Specifically, the local tnsnames.ora must be configured, and on the standby database host, the listener.ora must be configured. A discussion of Net8 is beyond the scope of this paper. Examples are provided as a guide for configuring these files. The local tnsnames.ora will be in $ORACLE_HOME/network/admin or in the directory pointed to by the $TNS_ADMIN environmental parameter. To support the STBYHOST Net8 Service Name used in the examples above, a tnsnames.ora entry similar to the following would be appropriate. Here the SID and dbname are a813b. STBYHOST, the Net8 Service Name, previously known as a tns alias, must be mapped to an SID so that the proper controlfile is updated during the remote archive process. STBYHOST = (DESCRIPTION = (ADDRESS_LIST = (ADDRESS = ( PROTOCOL = TCP)(HOST = otcsol1.us.oracle.com)(PORT =1521) ) (CONNECT_DATA = (SERVICE_NAME = a813b)) On the remote host where the standby database resides, this listener.ora should be configured. Remember to bounce the listener after editing the listener.ora. LISTENER = (ADDRESS_LIST = (ADDRESS = (PROTOCOL = TCP)(HOST = otcsol1.us.oracle.com)(PORT = 1521)) (ADDRESS = (PROTOCOL = IPC)(KEY = EXTPROC0)) ) SID_LIST_LISTENER = (SID_LIST = (SID_DESC = (SID_NAME = PLSExtProc) (ORACLE_HOME = /u01/app/oracle/product/8.1.3) (PROGRAM = extproc) ) (SID_DESC = (SID_NAME = a813b) ) ) Configuring the Standby Database Create the Standby Database In the briefest possible terms, a suggested procedure to create a standby database is shown below. For a detailed explanation, refer to the document mentioned in the introduction. * Primary host backup datafiles, redo log files, and initialization files create standby controlfile - alter database create standby controlfile as 'xxxx' shutdown primary database * Transfer files and archived redo logs to a remote host with the same OS level and Oracle version installed as the primary host * Standby host Modify the initSID.ora file as required Startup nomount the standby database Mount the standby database * Start the primary database * Maintain the standby database The modification of initSID.ora and standby database maintenance will be discussed in detail. Modify Standby Database InitSID.ora The initSID.ora for the standby database is copied over from the primary as part of standby DB creation. The remote archival implementation utilizes a new initSID.ora parameter for the standby database, standby_archive_dest which tells RFS where to create and populate the remote archive log files. This must be a local directory to which Oracle has write permissions. It this parameter is not supplied, it defaults to $ORACLE_HOME/dbs. Some additional parameters, while not new, that may be required are db_file_name_convert and log_file_name_convert. When the directory structure of the standby host does not permit the directory structure of the primary host to be mirrored, these parameters will translate the full path names in the binary standby controlfile to the correct path on the standby host. The initSID.ora file must also reflect the correct path as well as the correct name of the standby controlfile given to it when it was created on the primary. Example Partial initSID.ora is shown control_files = /beta/home/oradata/stbyctl.ctl db_file_name_convert="/u01/oradata", "/beta/home/oradata" log_file_name_convert="/u01/oradata", "/beta/home/oradata" standby_archive_dest=/beta/home/oradata/archives The four lines above are the only additions/modifications required to the original initSID.ora. The standby controlfile was created on the primary with "alter database create standby controlfile as 'stbyctl.ctl'" and this was brought over to /beta/home/oradata on the standby host. The datafiles and redo log files were also brought over to the /beta/home/oradata directory on the standby host, but they were originally in /u01/oradata on the primary host. The convert parameters cure this problem. When remote archiving begins, RFS will put the archive logs in /beta/home/oradata/archives. . Maintain the Standby Database With Oracle 8.1, a new sustained maintenance mode is available. During remote archival of redo logs, RFS updates the standby database controlfile log history records. Sustained maintenance mode is then able to apply these archived redo logs to the database automatically this keeping the standby in sync with the primary. Thus, sustained maintenance mode can only be used together with remote archival, not with manual copying of archive logs from the primary as the binary controlfile is not updated. Manual maintenance mode of the standby can still be used regardless of the archive transmission mechanism. Manual Mode Manual mode may be required for recovery purposes so the steps are shown here briefly. First, manually bring missing archive log files over to the standby host if necessary. Then, startup nomount pfile='/xx/xx/initSID.ora' alter database mount standby database recover standby database The recover command will suggest an archived redo log file to apply. The default directory that the recover command will look for archive files is the directory specified in the first log_archive_dest_n that is "local", but you can reply with the full pathname of where you put the logs. This command will end when you reply "cancel" or when it cannot find a subsequent log to apply. Sustained Mode Sustained mode takes advantage of remote archival feature by automatically applying the archived redo logs. To put the database in this mode: startup nomount pfile='/xx/xx/initSID.ora' alter database mount standby database recover managed standby database Normally, this command does not end and the sqlplus or svrmgrl prompt does not return. Rather it applies archive logs RFS has named in its' controlfile and waits for the next archive to arrive. This recover command must be ended from a different session with "recover managed standby database cancel". There is a timeout=n option to the recovery command which will cause recovery to end if a new archive log is not received in n minutes. For example, "recover managed standby database timeout=10" will start the recovery command, but if there is a 10 minute wait between the arrival of any two archive logs, the command will end. One possible way of handling recovery might be to put the recovery command in a shell as a Unix "here" document and nohup the shell putting it in the background. Example A very unsophisticated script to start sustained recovery might be: #vi rcov.script svrmgrl << END connect internal startup nomount pfile='/xx/yy/initSID.ora' alter database mount standby database; recover managed standby database; exit; END and this could be executed. # nohup rcov.script 1>/tmp/rcov.log & The alert_SID.log of the standby database will show the following during sustained recovery: Media Recovery Start: Managed Standby Recovery Media Recovery Log Media Recovery Waiting for thread 1 seq# 468 Media Recovery Log /beta/home/oradata/archives/arch000000468.arc Media Recovery Waiting for thread 1 seq# 469 Media Recovery Log /beta/home/oradata/archives/arch000000469.arc Media Recovery Waiting for thread 1 seq# 470 When recovery is canceled from a separate session ORA-00283: recovery session canceled due to errors ORA-16037: user requested cancel of sustained recovery operation When recovery times out due to timeout parameter Wait timeout: thread 1 sequence# 470 Media recovery failed with error 16016 ORA-283 signaled during: ALTER DATABASE RECOVER  Errors and Recovery The scope of changes to initialization parameters involved in remote archival expand the discussion of errors to include local archive recovery as well as remote archive recovery. The main source of information about an error is the alert_SID.log on the primary. Also, an understanding of the display of the status of archive destinations on the primary and their meaning is imperative. Scope of Archive Recovery In Oracle 8.1, a redo log that has been successfully archived cannot be re-archived, and a destination that has successfully archived cannot be re-archived. A redo log has been successfully archived when all mandatory destinations have archived and log_archive_min_dest_succeed requirements have been fulfilled. The redo log is marked as reusable. An error on an archive destination may or may not have stopped the archive process. "Recovery" for an optional destination differs from mandatory destination recovery. Recovery of an "optional" destination may involve manually copying a missing archive log to a desired location, resolving the problem that caused the error initially, and manually resetting or changing the destination status. Oracle will not archive the missing log to that destination because the archive was successful. Recovery of a mandatory destination may mean resolving the initial problem in some way, and reinitiating the archive process so that the required archival can take place. Oracle will reset the status of the destination upon completion of the required archive. Clearing the error status is a requirement for future archiving operations to that destination. If the error status has been cleared but the initial cause not resolved, the destination will return to error status. Displaying Status The usual displays about archivelog mode like archive log list, v$log, v$archived_log, and v$log_history apply. The most important display to show complete information about archive destinations, and to help analyze errors is v$archive_dest. See the Oracle 8 reference for a complete description. Some of these are deserve further discussion here. Status - As discussed in a previous section, this can be valid, inactive, deferred, error, or disabled Binding - Is the definition optional or mandatory Destination - Local directory or Service name Fail_date - Date of last failure Fail_sequence - Log sequence of last error Error - Text of last error Name_space - Scope of the display. Session if this has been altered via "alter session" command, otherwise System. The Status and what combinations of characteristics create a status were discussed in an earlier session. If there has been a problem with an archive destination, this column should be in "error". The Fail_date, Fail_sequence, and Error fields should be populated if an archive has erred. Fail_sequence gives the sequence number that failed, and Error provides an actual ORA message indicating the cause of error. The Binding column is important because it tells what destinations have been explicitly defined as mandatory. As stated previously mandatory destinations must complete. More specifically, if log_archive_dest_3 is defined as mandatory, log_archive_dest_3 must complete. Subsequently adding another destination, say log_archive_dest_5, will help satisfy the log_archive_min_dest_succeed requirement, but not the requirement for log_archive_dest_3 to complete. Name_space will tell whether this is a session level or system level display. The log_archive_dest_state_n and log_archive_dest_n parameters can be dynamically redefined by "alter system set" and "alter session set" commands. The alter session command only modifies the parameters for that session, and would only be used by foreground processing initiated from that session only. The real time parameters remain intact and ARCH would use these. The alter system command would last for the life of the instance and be used by foreground and background processing in that session and all others. It is more straightforward to use the "alter system" commands. Resetting an Error Condition 1) Fix the original problem. For "mandatory" destinations, fix the original problem and re-archive. For example, if permissions on a local directory were not correct, archive would fail. To clear the error and complete archival: chmod 777 /destdirectory archive log start If the incorrect directory was specified in the first place, dynamically re-specify the destination and initiate archival alter system set log_archive_dest_2="location=/u01/archdata"; archive log start For "optional" destinations, fix the problem and dynamically re-specify the destination. Since Oracle will not re-archive a successful destination, so you must clear the error. (You might "fix the error" by dynamically changing the destination to a new directory) chmod 777 /destdirectory alter system set log_archive_dest_2="location=/destdirectory" 2) Fix the error and bounce the instance Bouncing the instance will clear errors against all destinations. Required archival will be performed during startup. But, the cause of the original error must be fixed first. 3) Reopen threshold reached. If a reopen parameter has been set for the destination, once that time threshold has been reached, destination archival is retried. This will reset the error status if successful. Examples of Errors Here are some examples of common errors and how to fix them. Notice that in the primary alert_SID.log, all archive failures are accompanied by the message "Error creating archivelog file.". Also, if the failure is a mandatory destination, the message "ARCH: Archival Stopped, error occurred" will appear. Example 1 Problem: The standby database is down, destination is optional log_archive_dest_3="service=servicename" in initSID.ora Symptoms: Alert_SID.log of the primary shows: ARCH0: Beginning to archive Standby archivelog instance not available at host 'servicename' code 3113 Error creating archivelog file 'servicename' Action: Startup and mount the standby database. Startup nomount pfile='/xx/xx/initSID.ora' Alter database mount standby database Reset the destination status at the primary Alter system set log_archive_dest_3="service=servicename" Discussion: The initial problem has been resolved and the destination reset at the primary so remote archival should resume during archival of the next redo log group. But since this was an optional destination, Oracle will not go back and re-archive the logs that are now missing on the standby. You will have to decide.