Subject: OERR: 600 2662 Block SCN is ahead of Current SCN Type: REFERENCE Status: PUBLISHED Content Type: TEXT/PLAIN Creation Date: 21-OCT-1997 Last Revision Date: 28-JAN-2000 Language: USAENG ORA-600 [2662] [a] [b] [c] [d] [e] Versions: 7.0.16 - 8.0.5 Source: kcrf.h =========================================================================== Meaning: There are 3 forms of this error. 4/5 argument forms - The SCN found on a block (dependant SCN) was ahead of the current SCN. See below for this 1 Argument (before 7.2.3): Oracle is in the process of writing a block to a log file. If the calculated block checksum is less than or equal to 1 (0 and 1 are reserved) ORA-600 [2662] is returned. This is a problem generating an offline immediate log marker (kcrfwg). *NOT DOCUMENTED HERE* --------------------------------------------------------------------------- Argument Description: Until version 7.2.3 this internal error can be logged for two separate reasons, which we will refer to as type I and type II. The two types can be distinguished by the number of arguments: Type I has four or five arguments after the [2662]. Type II has one argument after the [2662]. From 7.2.3 onwards type II no longer exists. Type I ~~~~~~ a. Current SCN WRAP b. Current SCN BASE c. dependant SCN WRAP d. dependant SCN BASE e. Where present this is the DBA where the dependant SCN came from. From kcrf.h: If the SCN comes from the recent or current SCN then a dba of zero is saved. If it comes from undo$ because the undo segment is not available then the undo segment number is saved, which looks like a block from file 0. If the SCN is for a media recovery redo (i.e. block number == 0 in change vector), then the dba is for block 0 of the relevant datafile. If it is from another database for distribute xact then dba is DBAINF(). If it comes from a TX lock then the dba is really usn<<16+slot. Type II ~~~~~~~ a. checksum -> log block checksum - zero if none (thread # in old format) --------------------------------------------------------------------------- Diagnosis: ~~~~~~~~~~ In addition to different basic types from above, there are different situations and coherences where ORA-600 [2662] type 'I' can be raised. For diagnosis we can split up startup-issues and no-startup-issues. Usually the startup-issues are more critical. Getting started: ~~~~~~~~~~~~~~~~ (1) is the error raised during normal database operations (i.e. when the database is up) or during startup of the database? (2) what is the SCN difference [d]-[b] ( subtract argument 'b' from arg 'd')? (3) is there a fifth argument [e] ? If so convert the dba to file# block# Is it a data dictionary object? (file#=1) If so find out object name with the help of reference dictionary from second database (4) What is the current SQL statement? (see trace) Which table is refered to? Does the table match the object you found in step before? Be careful at this point: there may be no relationship between DBA in [e] and real source of problem (blockdump). Deeper analysis: ~~~~~~~~~~~~~~~~ - investigate trace file this will be a user trace file normally but could be an smon trace too - search for: 'buffer' ("buffer dba" in Oracle7 dumps, "buffer tsn" in Oracle8 dumps) this will bring you to a blockdump which usually represents the 'real' source of OERI:2662 WARNING: There may be more than one buffer pinned to the process so ensure you check out all pinned buffers. -> does the blockdump match the dba from e.? -> what kind of blockdump is it? (a) rollbacksegment header (b) datablock (c) other SEE BELOW for EXAMPLES which demonstrate the sort of output you may see in trace files and the things to check. Check list and possible causes ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - If Parallel Server check both nodes are using the same lock manager instance & point at the same control files. - If not Parallel Server check that 2 instances haven't mounted the same database (Is there a second PMON process around ?? - shut down any other instances to be sure) Possible causes: - doing an open resetlogs with _ALLOW_RESETLOGS_CORRUPTION enabled - a hardware problem, like a faulty controller, resulting in a failed write to the control file or the redo logs - restoring parts of the database from backup and not doing the appropriate recovery - restoring a control file and not doing a RECOVER DATABASE USING BACKUP CONTROLFILE - having _DISABLE_LOGGING set during crash recovery - problems with the DLM in a parallel server environment - a bug Solutions: - if the SCNs in the error are very close: Attempting a startup several times will bump up the dscn every time we open the database even if open fails. The database will open when dscn=scn. - ** You can bump the SCN on open using See [NOTE:30681.1] Be aware that you should really rebuild the database if you use this option. - Once this has occurred you would normally want to rebuild the database via exp/rebuild/imp as there is no guarantee that some other blocks are not ahead of time. Articles: ~~~~~~~~~ Solutions: [NOTE:30681.1] Details of the ADJUST_SCN Event [NOTE:1070079.6] alter system checkpoint Possible Causes: [NOTE:1021243.6] CHECK INIT.ORA SETTING _DISABLE_LOGGING [NOTE:74903.1] How to Force the Database Open (_ALLOW_RESETLOGS_CORRUPTION) [NOTE:41399.1] Forcing the database open with `_ALLOW_RESETLOGS_CORRUPTION` [NOTE:851959.9] OERI:2662 DURING CREATE SNAPSHOT AT MASTER SITE Known Bugs: ~~~~~~~~~~~ Fixed In. Bug No. Description ---------+------------+---------------------------------------------------- 7.0.14 BUG:153638 7.1.5 BUG:229873 7.1.3 Bug:195115 Miscalculation of SCN on startup for distributed TX ? 7.1.6.2.7 Bug:297197 Port specific Solaris OPS problem 7.3 Bug:336196 Port specific IBM SP AIX problem -> dlm issue 7.3.4.5 Bug:851959 OERI:2662 possible from distributed OPS select --------------------------------------------------------------------------- --------------------------------------------------------------------------- Examples: ~~~~~~~~ Below are some examples of this type of error and the information you will see in the trace files. ~~~~~~~~~~ CASE (a) ~~~~~~~~~~ blockdump should look like this: *** buffer dba: 0x05000002 inc: 0x00000001 seq: 0x0001a9c6 ver: 1 type: 1=KTU UNDO HEADER Extent Control Header ----------------------------------------------------------------- Extent Control:: inc#: 716918 tsn: 4 object#: 0 *** -> interpret: dba: 0x05000002 -> 83886082 (0x05000002) = 5,2 XXX tsn: 4 -> this is rollback segment 4 tsn: 4 -> this rollback segment is in tablespace 4 ORA-00600: Interner Fehlercode, Argumente: [2662], [0], [71183], [0], [71195], [83886082], [], [] -> [e] > 0 and represents dba from block which is in trace -> [d]-[b] = 71195 - 71183 = 12 -> convert [b] to hex: 71195 = 0x1161B so this value can be found in blockdump: *** TRN TBL:: index state cflags wrap# uel scn dba ------------------------------------------------------------------ ... 0x4e 9 0x00 0x00d6 0xffff 0x0000.0001161b 0x00000000 ... *** -> possible cause so in this case the CURRENT SCN is LOWER than the SCN on this transaction ie: The current SCN looks like it has decreased !! This could happen if the database is opened with the _allow_resetlogs_corruption parameter -> If some recovery steps have just been performed review these steps as the mismatch may be due to open resetlogs with _allow_resetlogs_corruption enabled or similar. See for information on this parameter. ------------------------------------------------------------------ ~~~~~~~~~~ CASE (b) ~~~~~~~~~~ blockdump looks like this: *** buffer dba: 0x0100012f inc: 0x00000815 seq: 0x00000d48 ver: 1 type: 6=trans data Block header dump: dba: 0x0100012f Object id on Block? Y seg/obj: 0xe csc: 0x00.5fed6 itc: 2 flg: O typ: 1 - DATA fsl: 0 fnx: 0x0 Itl Xid Uba Flag Lck Scn/Fsc 0x01 0x0000.00b.0000036c 0x0100261c.0138.04 --U- 1 fsc 0x0000.0005fed7 0x02 0x0000.00a.0000037b 0x0100261d.0138.01 --U- 1 fsc 0x0000.0005fed4 data_block_dump =============== ... *** interpret: dba: 0x0100012f -> 8,10 ==> 16777519 (0x0100012f) = 1,303 (0x1 0x12f) *** SVRMGR> SELECT SEGMENT_NAME, SEGMENT_TYPE FROM DBA_EXTENTS 2> WHERE FILE_ID = 1 AND 303 BETWEEN BLOCK_ID AND 3> BLOCK_ID + BLOCKS - 1; SEGMENT_NAME SEGMENT_TYPE ---------------------------------------------------------- ----------------- UNDO$ TABLE 1 row selected. *** -> current sql-statement (trace): *** update undo$ set name=:2,file#=:3,block#=:4,status$=:5,user#=:6, undosqn=:7,xactsqn=:8,scnbas=:9,scnwrp=:10,inst#=:11 where us#=:1 ksedmp: internal or fatal error ORA-00600: internal error code, arguments: [2662], [0], [392916], [0], [392919], [0], [], [] *** -> e. = 0 info not available -> d-b = 392919 - 392916 = 3 -> dba from blockdump matches the object from current sql statement -> convert b. to hex: = 0x5FED7 so this value can be found in blockdump -> see ITL slot 0x01! --------------------------------------------------------------------------- --------------------------------------------------------------------------- --------------------------------------------------------------------------- Some more internals: ~~~~~~~~~~~~~~~~~~~~ I will try to give another example in oder to answer question if current SCN is decreased or dependant SCN increase. hypothesis: current SCN decreased Evidence: reproduced ORA-600 [2662] by aborting tx and using _allow_resetlog_corruption while open resetlogs. check database SCN before! Prerequisits: _allow_resetlogs_corruption = true in init.ora shutdown/startup db *** BEGIN TESTCASE SVRMGR> drop table tx; Statement processed. SVRMGR> create table tx (scn# number); Statement processed. SVRMGR> insert into tx values( userenv('COMMITSCN') ); 1 row processed. SVRMGR> select * from tx; SCN# ---------- 392942 1 row selected. ************ another session ************** SQL> connect scott/tiger Connected. SQL> update emp set sal=sal+1; 13 rows processed. SQL> -- no commit here ******************************************* SVRMGR> insert into tx values( userenv('COMMITSCN') ); 1 row processed. SVRMGR> select * from tx; SCN# ---------- 392942 392943 2 rows selected. -- so current SCN will be 392943 SVRMGR> shutdown abort ORACLE instance shut down. -- this breaks tx SVRMGR> startup mount pfile=e:\jv734\initj734.ora ORACLE instance started. Total System Global Area 11018952 bytes Fixed Size 35760 bytes Variable Size 7698200 bytes Database Buffers 3276800 bytes Redo Buffers 8192 bytes Database mounted. SVRMGR> recover database until cancel; ORA-00279: Change 392925 generated at 10/26/99 17:13:03 needed for thread 1 ORA-00289: Suggestion : e:\jv734\arch\arch_2.arc ORA-00280: Change 392925 for thread 1 is in sequence #2 Specify log: {=suggested | filename | AUTO | CANCEL} cancel Media recovery cancelled. SVRMGR> alter database open resetlogs; alter database open resetlogs * ORA-00600: internal error code, arguments: [2662], [0], [392928], [0], [392931], [0], [], [] *** END TESTCASE because we know current SCN before (392943) we see, that current SCN has decreased after solving the problem with: shutdown abort/startup -> works SVRMGR> drop table tx; Statement processed. SVRMGR> create table tx (scn# number); Statement processed. SVRMGR> insert into tx values( userenv('COMMITSCN') ); 1 row processed. SVRMGR> select * from tx; SCN# ---------- 392943 1 row selected. so we have exactly reached the current SCN from before 'shutdown abort' So current SCN was bumpt up from 392928 to 392942.