Oracle Checkpoint and SCN


This note contains almost everything about checkpoints and SCN in Oracle

A database checkpoint event represents the moment in time when the
database is in a consistent state. Checkpoints are important because,
following instance failure, only those transactions that occurred after the last
checkpoint have to be recovered during instance recovery. There are two
types of checkpoints:
incremental and full.
----------------------
When an incremental checkpoint
occurs, the following actions take place:
The CKPT background process updates the control file headers
immediately.
The CKPT background process updates the control file headers (again)
and datafile headers at the new Redo Log switch (ie for the next full checkpoint).

When a full checkpoint event occurs, the following actions occur in the
database:
All buffers in the Database Buffer Cache that contain committed
transactions are written to disk by the DBW0 background process.
All the contents of the Redo Log Buffer are written to the Online Redo
Log by the LGWR background process.
Database control files and datafile headers are updated by the CKPT
background process to indicate that the checkpoint event has
occurred.

Database checkpoints occur in the database whenever:
-------------------------------------------------------
The instance is shutdown using any method except ABORT.
When the Online Redo Log switches from the current log to the next
Redo Log in the sequence.
When the DBA issues the ALTER SYSTEM CHECKPOINT command.
When a tablespace is placed into hot backup mode using the ALTER
TABLESPACE … BEGIN BACKUP command or when a tablespace is taken
offline. This checkpoint is a special form of the complete checkpoint
referred to as a full tablespace checkpoint. During this type of checkpoint,
only the dirty buffers belonging to the tablespace in hot backup
mode are written to disk.
When the value specified by the init.ora parameter FAST_START_
MTTR_TARGET is exceeded.


*******************************************************************************************
How and where the checkpoint scn is stored?
*****************************************************************

THE SYSTEM CHECKPOINT SCN:
After a checkpoint completes, Oracle stores the system
checkpoint SCN in the control file. You can access the
checkpoint SCN using the following SQL:

SQL> select checkpoint_change# from v$database;
CHECKPOINT_CHANGE#
--------------------
292767


The Datafile Checkpoint SCN:
After a checkpoint completes, Oracle stores the SCN
individually in the control file for each datafile. The
following SQL shows the datafile checkpoint SCN for a
single datafile in the control file:


SQL> select name,checkpoint_change# from v$datafile
where name like '%users01%';
NAME                                 CHECKPOINT_CHANGE#
-----------------------------------  -------------------
d:\oracle\oradata\data\users01.dbf   292767


THE START SCN:
Oracle stores the checkpoint SCN value in the header
of each datafile. This is referred to as the start SCN
because it is used at instance startup time to check if
recovery is required. The following SQL shows the
checkpoint SCN in the datafile header for a single
datafile:

SQL> select name,checkpoint_change# from
v$datafile_header where name like '%users01%';
NAME                                  CHECKPOINT_CHANGE#
-----------------------------------   ---------------
d:\oracle\oradata\data\users01.dbf    292767


THE STOP SCN:
The stop SCN is held in the control file for each
datafile. The following SQL shows the stop SCN for a
single datafile when the database is open for normal
use:

SQL> select name,last_change# from v$datafile where
name like '%users01%';
NAME                                  LAST_CHANGE#
-----------------------------------   -------------------
d:\oracle\oradata\data\users01.dbf

During normal database operation, the stop SCN is NULL
for all datafiles that are online in read-write mode.
SCN Values while the Database Is Up Following a
checkpoint while the database is up and open for use,
the system checkpoint in the control file, the datafile
checkpoint SCN in the control file, and the start SCN
in each datafile header all match. The stop SCN for
each datafile in the control file is NULL. SCN after a
Clean Shutdown After a clean database shutdown
resulting from a SHUTDOWN IMMEDIATE or SHUTDOWN NORMAL
of the database, followed by STARTUP MOUNT, the
previous queries on v$database and v$datafile return
the following:

SQL> select checkpoint_change# from v$database;
CHECKPOINT_CHANGE#
--------------------
293184

SQL> select name,checkpoint_change#,last_change# from
v$datafile where name like '%user%';
NAME                                CHECKPOINT_CHANGE#   LAST_CHANGE#
-------------------------------------------------------  --------------
d:\oracle\oradata\data\users01.dbf  293184               293184


SQL> select name,checkpoint_change# from
v$datafile_header where name like '%users01%';
NAME                               CHECKPOINT_CHANGE#
---------------------------------  ---------------------
d:\oracle\oradata\data\users01.dbf   293184

During a clean shutdown, a checkpoint is performed and
the stop SCN for each datafile is set to the start SCN
from the datafile header. Upon startup, Oracle checks
the start SCN in the file header with the datafile
checkpoint SCN. If they match, Oracle checks the start
SCN in the datafile header with the datafile stop SCN
in the control file. If they match, the database can be
opened because all block changes have been applied, no
changes were lost on shutdown, and therefore no
recovery is required on startup. After the database is
opened, the datafile stop SCN in the control file once
again changes to NULL to indicate that the datafile is
open for normal use.


SCN AFTER AN INSTANCE CRASH:
The previous example showed the behavior of the SCN
after a clean shutdown. To demonstrate the behavior of
the checkpoints after an instance crash, the following
SQL creates a table (which performs an implicit commit)
and inserts a row of data into it without a commit:

create table x(x number) tablespace users;
insert into x values(100);

If the instance is crashed by using SHUTDOWN ABORT, the
previous queries on v$database and v$datafile return
the following after the database is started up in mount
mode:


SQL> select checkpoint_change# from v$database;
CHECKPOINT_CHANGE#
--------------------
293185

SQL> select name,checkpoint_change#,last_change# from
v$datafile where name like '%users01%';
NAME                                CHECKPOINT_CHANGE#   LAST_CHANGE#
---------------------------------  ------------------   ------------
d:\oracle\oradata\data\users01.dbf  293185

SQL> select name,checkpoint_change# from
v$datafile_header where name like '%users01%';
NAME                                CHECKPOINT_CHANGE#
---------------------------------   ---------------------
d:\oracle\oradata\data\users01.dbf   293185

In this case, the stop SCN is not set, which is
indicated by the NULL value in the LAST_CHANGE# column.
This information enables Oracle, at the time of the
next startup, to determine that the instance crashed
because the checkpoint on shutdown was not performed.
If it had been performed, the LAST_CHANGE# and
CHECKPOINT_CHANGE# values would match for each datafile
as they did during a clean shutdown. If an instance
crashes at shutdown, then instance crash recovery is
required the next time the instance starts up.


RECOVERY FROM AN INSTANCE CRASH:
Upon the next instance startup that takes place after
SHUTDOWN ABORT or a DBMS crash, the Oracle DBMS detects
that the stop SCN for datafiles is not set in the
control file during startup. Oracle then performs crash
recovery. During crash recovery, Oracle applies redo
log records from the online redo logs in a process
referred to as roll forward to ensure that all
transactions committed before the crash are applied to
the datafiles. Following roll forward, active
transactions that did not commit are identified from
the rollback segments and are undone before the blocks
involved in the active transactions can be accessed.
This process is referred to as roll back. In our
example, the following transaction was active but not
committed at the time of the SHUTDOWN ABORT, so it
needs to be rolled back:
SQL> insert into x values(100);

After instance startup, the X table exists, but remains
empty. Instance recovery happens automatically at
database startup without database administrator (DBA)
intervention. It may take a while because of the need
to apply large amounts of outstanding redo changes to
data blocks for transactions that completed and those
that didn’t complete and require roll back.


RECOVERY FROM A MEDIA FAILURE:
Up until this point, the checkpoint start SCN in the
datafile header has always matched the datafile
checkpoint SCN number held in the control file. This is
reasonable because during a checkpoint, the datafile
checkpoint SCN in the control file and the start SCN in
the datafile header are both updated, along with the
system checkpoint SCN. The following SQL shows the
start SCN from the datafile header and datafile
checkpoint SCN from the control file for the same file:

SQL> select 'controlfile' "SCN
location",name,checkpoint_change# from v$datafile where
name like '%users01%'
union
select 'file header',name,checkpoint_change# from
v$datafile_header where name like '%users01%';
SCNlocation    NAME                              CHECKPOINT_CHANGE#
-------------- ------------------------------    ----------------------
controlfile   d:\oracle\oradata\data\users01.dbf   293188
file header   d:\oracle\oradata\data\users01.dbf   293188


Unlike the v$datafile view, there is no stop SCN column
in the v$datafile_header view because v$datafile_header
is not used at instance startup time to indicate that
an instance crash occurred. However, the
v$datafile_header does provide the Oracle DBMS with the
information it requires to perform media recovery. At
instance startup, the datafile checkpoint SCN in the
control file and the start SCN in the datafile header
are checked for equality. If they don’t match, it is a
signal that media recovery is required.
For example, media recovery is required if a media
failure has occurred and the original datafile has been
replaced with a backup copy. In this case, the start
SCN in the backup copy is less than the checkpoint SCN
value in the control file, and Oracle requests archived
redo logs—generated at the time of previous log
switches—in order to reapply the changes required to
bring the datafile up to the current point in time.
In order to recover the database from a media failure,
you must run the database in ARCHIVELOG mode to ensure
that all database changes from the online redo logs are
stored permanently in archived redo log files. In order
to enable ARCHIVELOG mode, you must run the command
ALTERDATABASE ARCHIVELOG when the database is in a
mounted state.
You can identify files that need recovery after you
have replaced a datafile with an older version by
starting the instance in mount mode and running the
following SQL:
SQL> select file#,change# from v$recover_file;
FILE#      CHANGE#
---------- ----------
4          313401


In this example, file 4 is the datafile in the USERS
tablespace. By re executing the previous SQL to display
the datafile checkpoint SCN in the control file and the
start SCN in the datafile header, you can see that the
start SCN is older due to the restore of the backup
datafile that has taken place:

SQL>select 'controlfile' "SCN
location",name,checkpoint_change# from v$datafile
where name like '%users01%'
union
select 'file header',name,checkpoint_change#
from v$datafile_header where name like '%users01%';
SCNlocation     NAME                               CHECKPOINT_CHANGE#
-------------- ----------------------------------- --------------------
controlfile    d:\oracle\oradata\data\users01.dbf    313551
file header    d:\oracle\oradata\data\users01.dbf    313401


If you were to attempt to open the database, you would
receive errors like the following:
ORA-01113: file 4 needs media recovery
ORA-01110: datafile 4: 'd:\oracle\oradata\data\users01.dbf'


You can recover the database by issuing RECOVER
DATABASE from SQL*Plus while the database is in a
mounted state. If the changes needed to recover the
database to the point in time before the crash are in
an archived redo log, then you will be prompted to
accept the suggested name:

ORA-00279: change 313401 generated at 31/12/2007
17:50:23 needed for thread
ORA-00289: suggestion :
d:\dataachive\arch_1_79.ARC
ORA-00280:change 313401 for thread 1 is in sequence #72
Specify log: {<RET>=suggested | filename | AUTO |
CANCEL}

If you respond to the prompt using AUTO, Oracle applies
any archived redo logs it needs, followed by any
necessary changes in the online redo logs, to bring the
database right up to the last committed transaction
before the media failure that caused the requirement
for the restore.
So far, we’ve considered recovery scenarios where the
goal is to recover the database to the most recent
transaction. This is known as complete recovery. The
RECOVER DATABASE command has several other options that
enable you to recover from a backup to a point in time
before the most recent transaction by rolling forward
and then stopping the application of the redo log
changes at a specified point. This is known as
incomplete recovery. You can specify a time or an SCN
as the recovery point. For example,
recover database until time '2007-12-31:17:52:00';
recover database until change 313459;
Before you perform incomplete recovery, it’s
recommended that you restore a complete database backup
first. After incomplete recovery, you must open the
mounted database with ALTER DATABASE OPEN RESETLOGS.
This creates a new incarnation of the database and
clears the contents of the existing redo logs to make
sure they can’t be applied.
RECOVERY FROM A MEDIA FAILURE USING A BACKUP CONTROL
FILE:
In the previous example, we had access to a current
control file at the time of the media failure. This
means that none of the start SCN values in the datafile
headers exceeded the system checkpoint SCN number in
the control file. To recap, the system checkpoint
number is given by the following:

SQL> select checkpoint_change# from v$database;

You might be wondering why Oracle needs to maintain the
last system checkpoint value in the control file as
well as checkpoint SCNs in the control file for each
datafile (as used in the previous example). There are
two reasons for this. The first is that you might have
read-only tablespaces in your database. In this case,
the database checkpoint SCN increases, and the
checkpoint SCN for the datafiles in the read-only
tablespace remains frozen in the control file.
The following SQL report output shows a database with a
read-write tablespace (USERS) and read-only tablespace
(TEST). The start SCN in the file header and the
checkpoint SCN in the control file for TEST are less
than the system checkpoint value. Once a tablespace is
read only, checkpoints have no effect on the files in
it. The other read-write tablespace has checkpoint
values that match the system checkpoint:


SCNlocation            NAME                                 CHECKPOINT_CHANGE#
------------        ------------------------------------  ---------------------
controlfile           SYSTEM checkpoint                      355390
file header           d:\oracle\oradata\data\users01.dbf     355390
file in controlfile   d:\oracle\oradata\data\users01.dbf     355390
file header           d:\oracle\oradata\data\test01.dbf      355383
file in controlfile   d:\oracle\oradata\data\test01.dbf      355383


The second reason for the maintenance of multiple
checkpoint SCNs in the control file is that you might
not have a current control file available at recovery
time. In this case, you need to restore an earlier
control file before you can perform a recovery. The
system checkpoint in the control file may indicate an
earlier change than the start SCN in the datafile
headers.
The following SQL shows an example where the system


checkpoint SCN and datafile checkpoint SCN indicate an
earlier change than the start SCN in the datafile
header:

SQL>select 'controlfile' "SCN location",'SYSTEM
checkpoint' name,checkpoint_change# from v$database
union
select 'file in controlfile',name,checkpoint_change#
from v$datafile where name like 'users01%'
union
select 'file header',name,checkpoint_change#
from v$datafile_header where name like '%users01%';
SCN location         NAME                                 CHECKPOINT_CHANGE#
------------------- ------------------------------------- -------------
controlfile         SYSTEM checkpoint                          333765
file header         d:\oracle\oradata\data\users01.dbf         355253
file in controlfile d:\oracle\oradata\data\users01.dbf         333765


If try you to recover a database in the usual way in
this situation, Oracle detects that the control file is
older than some of the datafiles, as indicated by the
checkpoint SCN values in the datafile headers, and
reports the following message:
SQL> recover database
ORA-00283: recovery session canceled due to errors
ORA-01610: recovery using the BACKUP CONTROLFILE option
must be done
If you want to proceed with recovery in this situation,
you need to indicate to Oracle that a non current
control file—possibly containing mismatches in the SCN
values identified by the previous error messages—is
about to be specified for recovery by using the
following command: recover database using BACKUP
CONTROLFILE;

No comments:

Post a Comment