Moved from andy/shelves to main WebMIP documentation area.
git-svn-id: http://locode01.ad.dom/svn/WEBMIP/trunk@6471 248e525c-4dfb-0310-94bc-949c084e9493
247
Documentation/SupportingDocumentation/webmipdatabase.tex
Normal file
@@ -0,0 +1,247 @@
\documentclass[10pt,a4paper,twoside]{report}
\usepackage[latin1]{inputenc}
\usepackage{glossaries}
\usepackage{listings}
\lstloadlanguages{command.com,SQL}
\lstset{frame=tb,framesep=5pt,basicstyle=\small,
  numbers=left, numberstyle=\tiny, numbersep=5pt,
  showstringspaces=true,
  breaklines=true
}
\lstset{escapeinside={(*@}{@*)}}
\usepackage{svn-multi}
\svnidlong
{$HeadURL$}
{$LastChangedDate$}
{$LastChangedRevision$}
{$LastChangedBy$}
\svnid{$Id$}
\author{Andrew Hardy}
\title{webMIP database - backup, recovery and failover procedures}
\newglossaryentry{SLA}
{name=Service Level Agreement,
first={Service Level Agreement (SLA)},
description={A service level agreement (frequently SLA) is that part of a service contract where the level of service is formally defined. In practice, the term SLA is sometimes used incorrectly in the context of contracted delivery time (of the service) or performance.}
}
\newglossaryentry{DNS}
{name=Domain Name Server or Service,
first={Domain Name Server (DNS)},
description={An Internet service that translates domain names into IP addresses.}
}
\newglossaryentry{COLD}
{name=Cold backup,
description={A cold backup, also called an offline backup, is a database backup taken while the database is offline and thus not accessible for updating.}
}
\newglossaryentry{HOT}
{name=Hot backup,
description={A hot backup, also called an online backup, is a backup performed on data while it is actively accessible to users and may be in the process of being updated.}
}
\newglossaryentry{AL}
{name=ARCHIVELOG,
description={As Oracle rotates through its \gls{REDO} groups, it will eventually overwrite a group to which it has already written. Once overwritten, that redo is no longer available for recovery. To prevent this, a database can be run in archive log mode. In archive log mode, the database makes sure that online \gls{REDO}[s] are not overwritten before they have been archived.}
}
\newglossaryentry{REDO}
{name=Redo log,
description={Before Oracle changes data in a datafile it writes these changes to the redo log. If something happens to one of the datafiles, a backed-up datafile can be restored and the redo written since that backup applied, bringing the datafile to the state it had before it became unavailable.
The same technique is used in a standby database environment: one database (the primary database) records all changes and sends them to the standby database(s). The standby databases in turn apply the received redo, which keeps them synchronized with the primary database.}
}
\newglossaryentry{STDBY}
{name=standby database,
description={A standby database is a maintained duplicate of the \gls{PRIMARY}.}
}
\newglossaryentry{PRIMARY}
{name=primary database,
description={The primary database is the database accessed by users under normal conditions.}
}
\makeglossaries
\makeindex
\begin{document}
\maketitle
\begin{abstract}
This document outlines the procedures used with the webMIP database with regard to backup and recovery.
\end{abstract}
\tableofcontents
\lstlistoflistings
\chapter{Overview}
\section{Requirements}
The primary requirements for the backup and recovery of the webMIP system are:
\begin{enumerate}
\item Support the customer \gls{SLA} by limiting data loss to less than 4 hours within the working day (8am to 5pm);
\item \gls{DNS};
\end{enumerate}

\section{System Architecture}
\subsection{Primary and standby databases}
\subsection{Internet access}
\chapter{Backup}
\section{COLD backup}
\section{HOT backup}\label{sec:HOTBackup}
A \gls{HOT} is performed against an actively running database, i.e. one in which users are connected and transactions are occurring. To use \gls{HOT}[s] you must operate the database in \gls{AL} mode (see listing \ref{lst:archivelog.sql}). Recovery from a \gls{HOT} relies on restoring the datafiles and replaying all archive logs from the point at which the original backup started onwards.

\lstset{language=command.com}
\begin{lstlisting}[caption={Invoking SQL*Plus},label=lst:invoke,frame=tb]
REM Identify the database connection as
REM bequeath by using the database SID
REM Invoke SQL*Plus without logging in
set oracle_sid=WEBMIP
sqlplus /nolog
\end{lstlisting}

\lstset{language=sql,caption={Prepare for archivelog}}
\lstinputlisting[label=lst:archivelog.sql]{DatabaseScripts/archivelog.sql}
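
The contents of `archivelog.sql' are maintained separately. As a rough sketch only, and assuming a single-instance database that can briefly be taken offline, switching to archive log mode from the SQL*Plus session started in listing \ref{lst:invoke} typically looks like the following; the actual script may differ.

\begin{lstlisting}[language=SQL,caption={Sketch: enabling archive log mode (illustrative, not the project script)}]
-- Assumption: connected as SYSDBA over the bequeath connection
conn / as sysdba
-- The log mode can only be changed while the database is
-- mounted but not open
shutdown immediate;
startup mount;
alter database archivelog;
alter database open;
-- Confirm the change: the reported log mode should be Archive Mode
archive log list;
\end{lstlisting}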

Once in \gls{AL} mode, the datafiles and control files are made available to be backed up at the operating system level.
The generated archive logs from the point at which the backup started are also backed up. In listing \ref{lst:hotbackup.sql}, the first `archive log list;' reports the oldest online log sequence number at the start of the backup, whilst the second `archive log list;' reports the \textit{current} log sequence number at the end of the backup.
To recover the database to a state consistent with the end of the backup, you \emph{must} include all the archived log files from the oldest sequence number to the current sequence number as part of the backup\footnote{Recovery will start from the point of the `current' log; however, as a precautionary measure you should keep all archived log files starting from the \textit{oldest online log sequence} number.}.
The `alter system switch logfile;' command forces a log switch so that the \textit{current} log is archived\footnote{The creation of archive logs lags the online redo logs.} and so made available to be backed up.
You should \emph{not} back up and restore the online redo logs, as this will prevent the application of the archive logs during recovery.
You \emph{cannot} back up and restore files associated with temporary tablespaces. Instead, these files are recreated using the `create\_tempfiles.sql' script generated by the hot backup script\footnote{Need to confirm this}.

\lstset{language=sql,caption={Perform HOT backup}}
\lstinputlisting[label=lst:hotbackup.sql] {DatabaseScripts/hotbackup.sql}
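
The script itself lives in `DatabaseScripts/hotbackup.sql' and is not reproduced here. The outline below is a sketch of the sequence described above, not the project script: the use of `alter database begin backup', the datafile location and the name and location of the standby control file are assumptions made for illustration, and the generation of the tempfile script is omitted.

\begin{lstlisting}[language=SQL,caption={Sketch: outline of a hot backup (illustrative, not the project script)}]
conn / as sysdba
-- Note the oldest online log sequence before copying starts
archive log list;
-- Assumption: put all datafiles into backup mode in one step
alter database begin backup;
-- Assumption: hypothetical datafile location; copy at OS level
host copy c:\oracle\oradata\webmip\*.dbf s:\orabackup\webmip\files
alter database end backup;
-- Note the current log sequence at the end of the backup
archive log list;
-- Force the current log to be archived so it can be backed up
alter system switch logfile;
-- Control file and text parameter file for a standby database
alter database create standby controlfile as 's:\orabackup\webmip\files\standby.ctl';
create pfile='s:\orabackup\webmip\files\initstandby.ora' from spfile;
\end{lstlisting}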

You also need a copy of the password file that allows SYSDBA access\footnote{The password file should be re-copied whenever the passwords on the primary database are altered.}.
This should be located in the `database' directory of the Oracle installation, e.g. `C:$\backslash$oracle$\backslash$product$\backslash$11.1.0$\backslash$db\_1$\backslash$database', as `PWDwebmip.ora'.
\lstset{language=command.com}
\begin{lstlisting}[caption={Copying the password file to the backup location}]
copy c:\oracle\product\11.1.0\db_1\database\PWDwebmip.ora s:\orabackup\webmip\files
\end{lstlisting}

At this point you have a set of datafiles, archive logs and control files that can be used to:
\begin{enumerate}
\item Recover the primary database (see section \ref{sec:HOTRecovery});
\item Create a standby database (see section \ref{sec:CreateStandby}).
\end{enumerate}
These files should be stored outside of the server environment.

\chapter{Recovery}
\section{COLD recovery}
\section{HOT recovery}\label{sec:HOTRecovery}
\chapter{Standby database}
A \gls{STDBY} is a maintained duplicate, or `standby', of the production or `primary' database for recovering from disasters at the primary site. The intention is to be able to switch over from the primary database to the standby database in the event of a disaster, in the least time and with as little recovery as possible.
A hot standby database is a backup copy of the primary database that is maintained on a separate machine. A \gls{HOT} or \gls{COLD} is made of the primary database and copied to the standby server. The standby database is mounted but not opened.
The archive logs are copied from the primary to the standby server and applied at regular intervals. This means that the standby database is always a few log files (at least one log file) behind the primary database and is always in the `mounted but not open' state.

When the primary database fails, the standby database can be opened and all users switched to the standby server. After such a switch, the standby database becomes the primary database. A new standby database will then be required.
\section{Creation}
\subsection{Preparation on the primary database}
The primary database is prepared by placing it into \gls{AL} mode and performing a \gls{HOT} (see section \ref{sec:HOTBackup}).
The `alter database create standby controlfile...' command in listing \ref{lst:hotbackup.sql} produces the control file that will be used by the standby database, whilst the `create pfile...' command produces a text version of the database initialization file.
\subsection{Creation of the standby database}\label{sec:CreateStandby}
The standby database is created from a \gls{HOT} of the primary database (see section \ref{sec:HOTBackup}).
The datafiles are copied to the standby server.
The standby server should have the same directory structure as the primary server in order to minimize the number of changes to be made on the standby server (check the contents of the `initstandby.ora' created in listing \ref{lst:hotbackup.sql} for the location of the `db\_recovery\_file\_dest', as this is generally only created when databases are created using the Oracle Database Configuration Assistant).
Do not forget to copy the `PWDwebmip.ora' password file to the correct `database' location (listing \ref{lst:copy_pwd}).

\lstset{language=command.com}
\begin{lstlisting}[caption={Copying the password file to the standby location},label=lst:copy_pwd]
copy s:\orabackup\webmip\files\PWDwebmip.ora c:\oracle\product\11.1.0\db_1\database\PWDwebmip.ora
\end{lstlisting}

The `oradim' command (listing \ref{lst:oradim_inst.cmd}) is used to create the Windows service instance to support the standby database. The service is created with a `manual' start mode - this means that the database instance will only become available when started manually.

\lstset{language=command.com,caption={Creation of Windows service for standby database}}
\lstinputlisting[label=lst:oradim_inst.cmd] {DatabaseScripts/oradim_inst.cmd}

The `Oracle Net Manager' tool is used to create a `listener' for the database, and the listener is then restarted to reflect the changes (see listing \ref{lst:lsnrctl}).

\lstset{language=command.com}
\begin{lstlisting}[caption={Restarting the standby listener},label=lst:lsnrctl]
lsnrctl stop
lsnrctl start
\end{lstlisting}

Where there are differences between the configuration of the primary and standby servers, the `initstandby.ora' database initialization file generated by listing \ref{lst:hotbackup.sql} is altered. The `standby.ctl' file (also created by listing \ref{lst:hotbackup.sql}) is copied to the appropriate directory.
It is recommended to make multiple copies of this control file for greater resilience. Ensure that the names used for the control files are consistent with those listed in the `initstandby.ora'.

The standby database is now physically present, but the instance is not running. The `oradim' command (see listing \ref{lst:oradim_start.cmd}) is used to start the database service, but \emph{not} the instance. The `ORACLE\_SID' environment variable is used to identify the database instance that we are about to start. Again we connect to SQL*Plus with the `nolog' option due to the difficulties in creating the connection string from the Windows command line.

\lstset{language=command.com,caption={Starting the standby service}}
\lstinputlisting[label=lst:oradim_start.cmd] {DatabaseScripts/oradim_start.cmd}


We can now connect to the `idle' instance and start it ready to receive archive logs from the primary server (listing \ref{lst:standby_startup.sql}\footnote{If you get a permissions error, connect with the sys password i.e. conn sys/`webmip' as sysdba}). We create the database `spfile' from the modified `initstandby.ora' file and start up the database as a physical standby, but without recovering\footnote{\gls{HOT}[s] always require recovery using the archive logs that were generated during the period of the backup.}.

\lstset{language=SQL}
\begin{lstlisting}[caption={Starting the standby database},label=lst:standby_startup.sql]
conn / as sysdba
create spfile from pfile='s:\orabackup\webmip\files\initstandby.ora';
startup nomount;
alter database mount standby database;
\end{lstlisting}

At this point the standby database is ready to be `maintained' (see section \ref{sec:maintain_standby}).
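
As an optional check that is not part of the original procedure, the role and state of the newly mounted instance can be queried from the same SQL*Plus session. A minimal sketch, assuming the standard `v\$database' view:

\begin{lstlisting}[language=SQL,caption={Sketch: confirming the standby is mounted (illustrative)}]
-- For a mounted physical standby, expect a role of PHYSICAL STANDBY,
-- an open mode of MOUNTED and a control file type of STANDBY
select database_role, open_mode, controlfile_type from v$database;
\end{lstlisting}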

\section{Maintaining standby database}\label{sec:maintain_standby}
The standby database is physically present with the datafiles as copied from the primary database. The database has been started up, but recovery has not taken place.
Recovery requires the application of archive logs to the database: if the archive logs generated during the original backup were applied, then the recovered database would be consistent with the state of the primary database as it was at the end of the backup period.
A standby database extends this recovery mechanism by applying not only the archive logs generated during the backup, but some or all of the archive logs generated by the primary database since the time of the backup. This is achieved by keeping the standby database mounted but not open and periodically attempting recovery using archive logs delivered from the primary database.
\subsection{Transfer of archive logs from primary database}
A mechanism is required to periodically transfer archive logs generated by the primary database to the archive log destination of the standby server. The VBScript `logship.vbs' (listing \ref{lst:logship.vbs}) does this by comparing the archive logs available on the primary server with those available on the standby and transferring any missing files.

\lstset{language=VBScript,caption={Transferring archive logs}}
\lstinputlisting[label=lst:logship.vbs] {DatabaseScripts/advwb1_logship.vbs}

Windows `Scheduled Tasks' are used to invoke this VBScript on a regular basis. The period between runs of this script largely determines the lag between the primary and standby databases in the case of a disaster.

\subsection{Application of archive logs on standby database}
The standby database is mounted in standby mode, so users cannot access the database to perform queries, etc.
Archive logs from the primary server are applied to the standby database through the standard `recovery' mechanism - simulating the recovery of a \gls{HOT}, but \emph{without} ending the recovery cycle.
Listing \ref{lst:applylog} shows the method of placing the database into recovery mode. In this mode, the database automatically applies the archive logs shipped from the primary server.
If the database determines that it requires a missing archive log (one that has not been shipped), it will raise an error. If the database is able to apply all required archive logs, it will continue until the following line where we `cancel' the recovery - this allows recovery to continue at a later date.

\lstset{language=SQL,caption={Applying archive logs}}
\lstinputlisting[label=lst:applylog] {DatabaseScripts/applylog.sql}
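
The script `applylog.sql' is held separately. One plausible form of the recover-then-cancel sequence described above is sketched here; the choice of the `alter database recover' statements is an assumption, and the project script may use a different syntax.

\begin{lstlisting}[language=SQL,caption={Sketch: applying shipped archive logs (illustrative, not the project script)}]
conn / as sysdba
-- Apply the archive logs currently available in the archive
-- destination; an error is reported once the next required
-- log cannot be found
alter database recover automatic standby database;
-- End this recovery session without opening the database,
-- leaving it ready for the next scheduled run
alter database recover cancel;
\end{lstlisting}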

Windows `Scheduled Tasks' are used to invoke the application of the archived logs on a regular basis through a Windows command file (see listing \ref{lst:applylog.cmd}).

\lstset{language=command.com,caption={Applying archive logs - command file}}
\lstinputlisting[label=lst:applylog.cmd]{DatabaseScripts/applylog.cmd}

The `currency' of the application of the archive logs can be checked by reviewing the database view v\$log\_history on both the primary and standby databases (see listing \ref{lst:applylog_status.sql}).

\lstset{language=SQL,caption={Checking the application of the archive logs}}
\lstinputlisting[label=lst:applylog_status.sql]{DatabaseScripts/applylog_status.sql}
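
A minimal sketch of such a check, assuming a single redo thread and that the query is run on both servers and the results compared by hand; the project script may select different columns.

\begin{lstlisting}[language=SQL,caption={Sketch: comparing log sequence numbers (illustrative)}]
-- Run on the primary and on the standby; the difference between
-- the two values is the number of log sequences by which the
-- standby is behind
select max(sequence#) as last_sequence from v$log_history;
\end{lstlisting}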

\subsection{Restart of standby server}\label{sec:restart_of_standby_server}
Whenever the standby server is restarted, the following scripts should be run:

\lstset{language=command.com,caption={Restarting the standby service}}
\lstinputlisting[label=lst:oradim_restart.cmd] {DatabaseScripts/oradim_start.cmd}

\lstset{language=SQL}
\begin{lstlisting}[caption={Restarting the standby database},label=lst:restart_standby.sql]
conn / as sysdba
startup nomount;
alter database mount standby database;
\end{lstlisting}

At this point the standby database is ready to be `maintained'.

\chapter{Failover}
\section{Activation of standby database}
\subsection{Preparation of the primary database}
\subsection{Preparation of the standby database}
\subsection{Activating the standby database}
The SQL commands in listing \ref{lst:activatestandby.sql} first activate the database, then shut it down cleanly before restarting and opening it.
\lstset{language=SQL,caption={Activating standby database}}
\lstinputlisting[label=lst:activatestandby.sql]{DatabaseScripts/activatestandby.sql}
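
The activation script is held in `DatabaseScripts/activatestandby.sql'. As a hedged sketch of the activate, shut down and reopen sequence described above (the project script may differ in detail):

\begin{lstlisting}[language=SQL,caption={Sketch: activating the standby (illustrative, not the project script)}]
conn / as sysdba
-- Convert the mounted standby into a primary database; after this
-- the database can no longer be used as a standby
alter database activate standby database;
-- Clean shutdown, then a normal startup opens the database
shutdown immediate;
startup;
\end{lstlisting}
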
On completion, the database is open for read-write access and is a duplicate of the primary database up to the point at which the last archive log from the primary database was applied. The standby database will generate a new set of archive logs.
\section{Redirection of DNS}
\chapter{Failback}
\section{HOT failback}
\subsection{Preparation of the primary database}
\subsection{Backup of the standby database}
\subsection{Transfer of files from standby to primary databases}
\subsection{Maintaining primary database as standby}
\subsection{Re-activation of primary database}
\subsection{Redirection of DNS}
\subsection{Recreation of standby database}
\section{COLD failback}
\subsection{Preparation of the primary database}
\subsection{Backup of the standby database}
\subsection{Transfer of files from standby to primary databases}
\subsection{Re-activation of primary database}
\subsection{Redirection of DNS}
\subsection{Recreation of standby database}
\printglossary
\end{document}