解决Oracle undo_management参数不一致错误

环境Linux 5.8 10.2.0.5 RAC,两个节点只能一个节点mount,如果尝试mount另外节点就报ORA-01105和ORA-01606错误。
查看数据库版本:
SQL> select * from v$version;

BANNER
----------------------------------------------------------------
Oracle Database 10g Enterprise Edition Release 10.2.0.5.0 - 64bi
PL/SQL Release 10.2.0.5.0 - Production
CORE 10.2.0.5.0 Production
TNS for Linux: Version 10.2.0.5.0 - Production
NLSRTL Version 10.2.0.5.0 - Production

crs资源情况
代码如下:
[oracle@node1 dbs]$ $ORA_CRS_HOME/bin/crs_stat -t
Name Type Target State Host
------------------------------------------------------------
ora....D1.inst application OFFLINE OFFLINE
ora....D2.inst application ONLINE ONLINE node2
ora.PROD.db application ONLINE ONLINE node2
ora....SM1.asm application ONLINE ONLINE node1
ora....E1.lsnr application ONLINE ONLINE node1
ora.node1.gsd application ONLINE ONLINE node1
ora.node1.ons application ONLINE ONLINE node1
ora.node1.vip application ONLINE ONLINE node1
ora....SM2.asm application ONLINE ONLINE node2
ora....E2.lsnr application ONLINE ONLINE node2
ora.node2.gsd application ONLINE ONLINE node2
ora.node2.ons application ONLINE ONLINE node2
ora.node2.vip application ONLINE ONLINE node2

节点1 mount报错
代码如下:
SQL> startup
ORACLE instance started.

Total System Global Area 171966464 bytes
Fixed Size 2094832 bytes
Variable Size 113248528 bytes
Database Buffers 50331648 bytes
Redo Buffers 6291456 bytes
ORA-01105: mount is incompatible with mounts by other instances
ORA-01606: gc_files_to_locks not identical to that of another mounted instance
Error: ORA 1105
Text: mount is incompatible with mounts by other instances
-------------------------------------------------------------------------------
Cause: An attempt was made to mount the database, but another instance has already mounted
a database by the same name, and the mounts are not compatible.
dditional messages will accompany this message to report why the mounts are incompatible.
Action: See the accompanying messages for the appropriate action to take.

Error: ORA 1606
Text: GC_FILES_TO_LOCKS not identical to that of another mounted instance
-------------------------------------------------------------------------------
Cause: The initialization parameter GC_FILES_TO_LOCKS is not the same as
another instance mounted in parallel mode.
This parameter must be the same as that for all shared instances.
Action: Modify the parameter to be compatible with the other instances, then
shut down and restart the instance.

根据这个错误提示,查询两个节点的gc_files_to_locks参数,均为空值(默认值),也就是值相同
代码如下:
SQL> show parameter gc_files_to_locks;

NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
gc_files_to_locks string

检查两个节点的gc相关隐含参数,发现所有值也均一致
代码如下:
NAME DESCRIPTION VALUE
------------------------------ --------------------------------------------------------------------- -------
_gc_affinity_limit dynamic affinity limit 50
_gc_affinity_minimum dynamic affinity minimum activity per minute 6000
_gc_affinity_time if non zero, enable dynamic object affinity 10
_gc_async_memcpy if TRUE, use async memcpy FALSE
_gc_check_bscn if TRUE, check for stale blocks TRUE
_gc_coalesce_recovery_reads if TRUE, coalesce recovery reads TRUE
_gc_defer_time how long to defer down converts for hot buffers 3
_gc_dissolve_undo_affinity if TRUE, dissolve undo affinity after an offline FALSE
_gc_dynamic_affinity_locks if TRUE, get dynamic affinity locks TRUE
_gc_element_percent global cache element percent 103
_gc_global_lru turn global lru off, make it automatic, or turn it on AUTO
_gc_initiate_undo_affinity if TRUE, initiate undo affinity after an online TRUE
_gc_integrity_checks set the integrity check level 1
_gc_keep_recovery_buffers if TRUE, make recovery buffers current TRUE
_gc_latches number of latches per LMS process 8
_gc_maximum_bids maximum number of bids which can be prepared 0
_gcs_fast_reconfig if TRUE, enable fast reconfiguration for gcs locks TRUE
_gcs_latches number of gcs resource hash latches to be allocated per LMS process 64
_gcs_pkey_history number of pkey remastering history 4000
_gcs_process_in_recovery if TRUE, process gcs requests during instance recovery TRUE
_gcs_resources number of gcs resources to be allocated
_gcs_shadow_locks number of pcm shadow locks to be allocated
_gc_statistics if TRUE, kcl statistics are maintained TRUE
_gcs_testing GCS testing parameter 0
_gc_tsn_undo_affinity if TRUE, use TSN undo affinity TRUE
_gc_undo_affinity if TRUE, enable dynamic undo affinity TRUE
_gc_undo_affinity_locks if TRUE, get affinity locks for undo TRUE
_gc_use_cr if TRUE, allow CR pins on PI and WRITING buffers TRUE
_gc_vector_read if TRUE, vector read current buffers TRUE

仔细对比数据库参数,发现undo异常
代码如下:
--节点1
SQL> show parameter undo

NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
undo_management string MANUAL
undo_retention integer 900
undo_tablespace string SYSTEM


--节点2
SQL> show parameter undo

NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
undo_management string AUTO
undo_retention integer 900
undo_tablespace string UNDOTBS1

这里已经明确,因为两个节点的undo_*相关参数配置不正确,导致数据库只能一个节点mount。进一步定位问题发现,原来是因为dba粗心在编辑节点1的参数文件的时候把undo_*相关的参数给弄丢了,从而数据库使用了默认值undo_management=manual,undo_tablespace=system

冷备份是数据库文件的物理备份,通常在数据库通过一个shutdown normal或shutdown immediate 命令正常关闭后进行。当数据库关闭时,其使用的各个文件都可以进行备份。这些文件构成一个数 ...