RAC Script Checks

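Availability (RAC)
  PISAORA_R.B.1 Interconnect network availability
Interconnect several NICs on the private network.
Benefits of configuring more than one: load balancing, failover, and higher private-network bandwidth.
oifcfg getif
OK: Redundant configuration for the interconnect network and switch using IPMP, APA, etc.
NO: No redundancy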
[Phenomenon] The DB RAC interconnect is not configured redundantly
[Problem] An interconnect network failure caused one DB server to reboot
[Improvement] Make the interconnect redundant with HP's APA (Auto Port Aggregation)
[Note] Oracle interfaces
lan1 185.191.120.0 global cluster_interconnect
lan4 17.91.220.0 global public => lan4 (Active) / lan5 (Standby)
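
The oifcfg check above can be scripted. The following is a minimal sketch, assuming the `oifcfg getif` output keeps the four-column layout shown in the note (interface, subnet, scope, type). The function name `check_interconnect_redundancy` is hypothetical. It only counts interfaces registered as cluster_interconnect, so an OS-level aggregate (APA, IPMP, bonding) that appears as a single interface still has to be verified by hand.

```shell
#!/bin/sh
# Hypothetical helper: reads `oifcfg getif` output on stdin and reports
# how many interfaces are registered as cluster_interconnect.
check_interconnect_redundancy() {
    # Column 4 is assumed to hold the interface type.
    count=$(awk '$4 ~ /cluster_interconnect/ { n++ } END { print n + 0 }')
    if [ "$count" -ge 2 ]; then
        echo "OK: $count cluster_interconnect interfaces"
    else
        echo "CHECK: $count cluster_interconnect interface(s); verify OS-level redundancy (APA, IPMP, bonding)"
    fi
}

# Usage on a cluster node (assumes oifcfg is on PATH):
#   oifcfg getif | check_interconnect_redundancy
```

A result of CHECK is not a failure by itself; it only means redundancy cannot be confirmed from the oifcfg output alone.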
  PISAORA_R.B.2
ok
[>=10g] Dynamic Resource Mastering (DRM) Disable
(_gc_affinity_time & _gc_undo_affinity)
(11g: _gc_policy_time)
SELECT max(ksppinm) name, max(ksppstvl) value
  FROM x$ksppi a, x$ksppsv b
 WHERE a.indx = b.indx
   AND ksppinm IN ('_gc_affinity_time', '_gc_policy_time');

SELECT max(ksppinm) name, max(ksppstvl) value
  FROM x$ksppi a, x$ksppsv b
 WHERE a.indx = b.indx
   AND ksppinm = '_gc_undo_affinity';

Bug 6960699 (fixed in 10.2.0.5)
"latch: cache buffers chains" contention / ORA-481 / kjfcdrmrfg: SYNC TIMEOUT / OERI[kjbldrmrpst:!master]

OK: DRM disabled
    10g: _gc_affinity_time = 0, _gc_undo_affinity = false
    11g: _gc_policy_time = 0, _gc_undo_affinity = false
NO: DRM enabled (*default)

** Online change is not possible.
The parameter values must be the same on all RAC nodes.

[Phenomenon] The RAC Dynamic Resource Mastering (DRM) feature is in use
[Problem] Functional and performance problems caused by DRM-related bugs
[Improvement] 10g: _gc_affinity_time = 0, _gc_undo_affinity = false
              11g: _gc_policy_time = 0, _gc_undo_affinity = false

[Note] DRM is effective when the application is partitioned by node, that is, when the objects in use are separated and block access is mostly separated between nodes. For applications in which the nodes access the same objects and blocks, DRM increases the performance load. In that case, disabling DRM is recommended.
  PISAORA_R.B.3 [~11gR1] IP=FIRST in listener.ora
The (IP=FIRST) statement makes the listener create a listening endpoint on the IP address to which the given HOST resolves. By default, without (IP=FIRST), the listener listens on all network interfaces (i.e. INADDR_ANY).
OK: IP=FIRST is specified in listener.ora
NO: A hostname is used in the "HOST=" clause but "IP=FIRST" is not specified in listener.ora

** Not needed from 11g because CRS sets it automatically.

[Phenomenon] IP=FIRST is not set in listener.ora.
[Problem] Without it, the listener uses INADDR_ANY and accepts connections on every network interface of the host.
[Improvement] Set IP=FIRST in listener.ora.

[Note] What is IP=FIRST in the LISTENER.ORA file ? [ID 300729.1]

???: ??? ?? ???
listener.ora?
ADDRESS=???
IP=ADDRESS??
(??)
LISTENER =
  (DESCRIPTION_LIST =
    (DESCRIPTION =
      (ADDRESS_LIST =
        (ADDRESS = (PROTOCOL = TCP)(HOST = racnode1-vip)(PORT = 1521)(IP = FIRST))
        (ADDRESS = (PROTOCOL = TCP)(HOST = racnode1)(PORT = 1521)(IP = FIRST))
      )
    )
  )

[
???? ?? ?? ?????
???]
1? ??? NIC ??? ??
VIP1 ? 2? ??? ???? ??, Client-Side ? CTF ? ??? ??? ?? ?? ???, 2? ???
LISTENER ? VIP1 ? ???? ??? ???? ??. ?? ??, VIP1 ? ?? 1? ??? ????? ???? ?, 2? ????
VIP1 ? ??? ??? ?
Oracle Shadow Process ?
Client ?? Network
Connection ? ????? ??, ????(tcp_keepalive_interval
+ tcp_ip_abort_interval)?? Resource & Lock ? ?? ??? ???? ??.

  PISAORA_R.B.4
ok
DB parameters consistency between instances
select name, max(value) max_value, min(value) min_value
  from gv$parameter
 group by name
having max(value) <> min(value)
   and name not in ('audit_file_dest', 'instance_name', 'instance_number', 'local_listener', 'parallel_instance_group', 'undo_tablespace', 'user_dump_dest', 'background_dump_dest', 'core_dump_dest', 'service_names', 'thread')
;
OK: No rows selected
    Same value for each instance
    (Exception: intentionally different settings, such as SGA size, because of different capacity)
NO: Rows returned
[??] 2?
instance? ?? $ORACLE_HOME/dbs/initDWDB2.ora?? spfile=?? ?? paramter?? ?? ??
[???] instance? initSID.ora? spfile ?? ??? ??? ??, ???
parameter ??? ? ???? ?? ?? ??
[????] 2?
instance? ?? instance? ????? spfile?
???? init file?? spfile? ??
——————————————————-
[
??*] RAC???? Primary Server? Standby Server? DB parameter ??? ??
[???*] Failover? 2? ?? ?? ??
[????*] Primary Server? ?? ??? Standby Server? ?? ???? ???? ??? ??
  PISAORA_R.B.5 IPC protocol setting in the listener.ora
IPC: inter-process communication mechanism
In RAC, communication between Cache Fusion memory areas is faster over IPC
listener.ora

[Note: MOS ID 403743.1] VIP Failover Take Long Time After Network Cable Pulled

OK: The IPC protocol address is listed above the TCP protocol address
NO: No setting for the IPC protocol, OR TCP is listed above the IPC protocol
[Phenomenon] The IPC (Interprocess Communication) protocol is not set in the listener
[Problem] When the network cable is pulled or the public network fails, failover takes a long time (3~4?)
[Improvement] Add the IPC protocol setting to listener.ora
  PISAORA_R.B.6 Sequence Cache Size
SQL> select CACHE_SIZE
       from dba_sequences
      where SEQUENCE_NAME in ('AUDSES$', 'IDGEN1$');

[Note] High SQ Enqueue Contention with LOBs or Advanced Replication [ID 432508.1]
The cache size for IDGEN1$ is increased to 1000 by bug 7694580 (fixed in 11.2).

OK: AUDSES$ 10000 (the default cache size is 20; here it is changed to 10000)
    IDGEN1$ 1000 (~11gR1)
NO: Smaller value than the "OK" specification

alter sequence sys.audses$ cache 10000;
alter sequence sys.idgen1$ cache 1000;

[Phenomenon] Some Oracle sequences still use the default cache size
[Problem] Contention can occur during a connection storm or LOB inserts
[Improvement] Set the cache of the AUDSES$ and IDGEN1$ sequences to 10,000 and 1,000 respectively, using the alter sequence statements above.
 

  PISAORA_R.B.7 Instance_groups (avoid unintended internode parallelism)
SELECT *
  FROM (SELECT version,
               to_number(replace(version, '.', '')) AS version_number,
               parallel
          FROM v$instance) ins,
       --
       (SELECT max(ksppinm) name, max(ksppstvl) value
          FROM x$ksppi a, x$ksppsv b
         WHERE a.indx = b.indx
           AND ksppinm = 'parallel_instance_group') par1,
       --
       (SELECT max(ksppinm) name, max(ksppstvl) value
          FROM x$ksppi a, x$ksppsv b
         WHERE a.indx = b.indx
           AND ksppinm = 'instance_groups') par2,
       --
       (SELECT max(ksppinm) name, max(ksppstvl) value
          FROM x$ksppi a, x$ksppsv b
         WHERE a.indx = b.indx
           AND ksppinm = 'parallel_force_local') par3
/
OK: [10g] parallel_instance_group and instance_groups are set
    [11g] parallel_force_local = true
NO: Not set

[Phenomenon] instance_groups is not specified
[Problem] When a parallel query runs, its query processes are unintentionally executed on all RAC nodes, which causes performance problems.
[Improvement]
[10g] Specify parallel_instance_group and instance_groups

parallel_instance_group: can be changed online
instance_groups: cannot be changed online

[11g] Set parallel_force_local = true

  PISAORA_R.B.8 [Only 10gR2 10.2.0.5 ~11gR1] Prevent Unexpected VIP relocation
# $GRID_HOME/bin/racgvip check
# timeout of ping in number of loops (1 sec)
PING_TIMEOUT="

If the network has a problem, the "ping" command above takes longer than 1 s, and this leads to VIPs going offline unexpectedly and relocating to another node.

In the VIP trace file under "CRS_HOME/log/node02", ora.nodename.vip.log contains:
2011-02-18 15:30:39.481: [ RACG][1] [4587556][1][ora.node02.vip]: Fri Feb 18 15:30:37 GMT+08:00 2011 [ 8257768 ] About to execute command: /usr/sbin/ping -S 192.168.220.36 -c 1 -w 1 192.168.220.33
Fri Feb 18 15:30:39 GMT+08:00 2011 [ 8257768 ] IsIfAlive: RX packets checked if=en1 failed
In other words, the command "/usr/sbin/ping -S <local public IP> -c 1 -w 1 <gateway IP>" sends one packet (64 bytes by default) from the local IP to the gateway address.
When the reply takes longer than the 1-second timeout, the check returns failure. Oracle then treats the public IP as unreachable and relocates the VIP, which in turn takes the local listener offline.

OK: PING_TIMEOUT=" -c 1 -w 3"
NO: PING_TIMEOUT=" -c 1 -w 1"

# $GRID_HOME/bin/racgvip check
# timeout of ping in number of loops (1 sec)
PING_TIMEOUT=" -c 1 -w 1"
==>
# timeout of ping in number of loops (3 sec)
PING_TIMEOUT=" -c 1 -w 3"
srvctl start nodeapps -n

[Phenomenon] The VIP check uses a 1-second timeout
[Problem] Under system load the VIP check can exceed the timeout, so the VIP is relocated even though nothing is actually wrong.
[Improvement] Change the VIP check script timeout from 1 second to 3 seconds

[ID 1297867.1] VIPs Often Go Offline Unexpectedly and Relocate to Another
Node 
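
The PING_TIMEOUT check can be automated along these lines. This is a minimal sketch, assuming the racgvip script contains a single uncommented `PING_TIMEOUT=` line in the form shown above. The function name `check_ping_timeout` is hypothetical, and treating any wait of 3 seconds or more as OK is an assumption that goes slightly beyond the exact value given in the checklist.

```shell
#!/bin/sh
# Hypothetical helper: extracts the -w (wait seconds) value from the first
# uncommented PING_TIMEOUT= line of a racgvip script and compares it with 3.
check_ping_timeout() {
    script=$1
    wait_secs=$(sed -n 's/^[[:space:]]*PING_TIMEOUT=.*-w[[:space:]]*\([0-9][0-9]*\).*/\1/p' "$script" | head -n 1)
    if [ -z "$wait_secs" ]; then
        echo "UNKNOWN: no PING_TIMEOUT with -w found in $script"
    elif [ "$wait_secs" -ge 3 ]; then
        echo "OK: ping wait is ${wait_secs}s"
    else
        echo "NO: ping wait is ${wait_secs}s"
    fi
}

# Usage (the path is taken from the check above):
#   check_ping_timeout "$GRID_HOME/bin/racgvip"
```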

  PISAORA_R.B.9 Synchronization of Default OS TimeZone and oracle user TZ environment
As the oracle user, grid user, and root user:

env | grep TZ
OK: Same TZ for the oracle, grid, and root users
NO: TZ differs among the users

[Phenomenon] The OS default time zone and the users' TZ do not match
[Problem] The DB server uses the time zone it picked up at boot, so when that differs from the OS users' environment variables, the system log and other logs record times in different time zones
[Improvement] Set the OS default time zone (/etc/default/tz) and the users' TZ to the same value
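
Comparing the three TZ values by eye is error-prone, so here is a minimal sketch of the comparison. The function name `check_tz_consistency` and the `user=TZ` argument format are hypothetical; how the values are collected (for example with `su - <user> -c 'echo $TZ'` as root) is left to the operator.

```shell
#!/bin/sh
# Hypothetical helper: takes "user=TZ" pairs as arguments and reports
# whether every TZ value is identical.
check_tz_consistency() {
    distinct=$(for pair in "$@"; do echo "${pair#*=}"; done | sort -u | wc -l | tr -d ' ')
    if [ "$distinct" -le 1 ]; then
        echo "OK: same TZ for all users"
    else
        echo "NO: TZ differs among users: $*"
    fi
}

# Usage as root (user names are site-specific assumptions):
#   check_tz_consistency "oracle=$(su - oracle -c 'echo $TZ')" \
#                        "grid=$(su - grid -c 'echo $TZ')" "root=$TZ"
```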
  PISAORA_R.B.10 CRS Version
crsctl query crs activeversion
crsctl query crs softwareversion
OK: Same as or higher than the DB version
NO: Lower than the DB version
[Phenomenon] The CRS version is lower than the DB version
[Problem] The CRS version must be the same as or higher than the DB version.
[Improvement] Upgrade the CRS version.
  PISAORA_R.B.11 [Except for Exadata] Interconnect Network Switch
OK: A switch is used for the interconnect network
NO: Cross cable without a switch
[Phenomenon] The CRS interconnect is connected directly
[???] interconnect? direct??? ?????
???? ?? ??.
              ?? Lan Card??? interconnect ??? ??? ??  CRS? ??               Server
? Lan Card? ??? ??? ???? ??.
[Improvement] Configure the interconnect through a switch.

[Note] For the interconnect network, Oracle officially supports only configurations that use a switch; direct connections are not supported.
  PISAORA_R.B.12 [Except for Exadata] Jumbo frame for Interconnect
For new systems that support jumbo frames, set the jumbo frame size on the private interconnect to 9000 bytes. The switch and the private NICs must be configured with the same jumbo frame size.
(HP) netstat -in, (other platforms) ifconfig -a (to check the frame size)
default: 1500, jumbo frame MTU: 9000
OK: Jumbo Frame is used (MTU: 9000)
NO: Jumbo Frame is NOT used (MTU: 1500)

*** Only for new systems, that is, pre-production systems. Changing it on production systems is not recommended.
[??] Interconnect Network? MTU?
1500 ?? ??
[???] DB block? 8k, 16k, 32k ??? ???
MTU? ?? ??? ???? ?? Merge?? ??? ??? ??? ?? ??. 
[
????] ?? Switch? ???? MTU?
9000?? ???? ?? ??.
[
??] ? Open?? ???? ???? ?? ?? ??.
?? ??? ??? ??
  PISAORA_R.B.13 [Except for ASM] CRS Auto Start
HP/Solaris): cat /var/opt/oracle/scls_scr/$host/root/crsstart
==> check "disable"
IBM/Linux) : cat /etc/oracle/scls_scr/`hostname`/root/crsstart
==> check "disable"
(From 11gR2 check ohasdstr instead of crsstart)
OK: CRS Auto Start Disable
NO: Not set (*default)

# crsctl disable crs

[??]  CRS auto start enable
[???] OS? Cluster ??? ????? Strat?? ?? CRS? ?? start???? ?? CRS? Resource?
????? ???? ?? ? ??.
[????] CRS Auto Start? disable??
               
os>crsctl disable
crs
  PISAORA_R.B.14 [Except for ASM] Voting Disk Configuration
CRS user>
os>crsctl query css votedisk
OK: # of Voting Disk is 3 or 5
NO: # of Voting Disk is 1
[??] Voting Disk? ??? ?? ??
??
[???] 
Voting File? ??? ??? ??
[????] Voting File? ???  (????? ???
?? ??? ???? ?? ??)
  PISAORA_R.B.15 [Except for ASM] File System separation for each voting file (place each voting file on its own file system)
CRS user>
os>crsctl query css votedisk
OK: Each file is located on a separate file system
NO: Located in the same directory
[??] ?? Voting File? ??
Directory? ??
[???] ????? ?? Voting File? ?? ?? ??? ??
[????] Voting File? ?? ?? Directory? ??
               (????? ??? ?? ??? ????
?? ??)
  PISAORA_R.B.16 OCR Disk Configuration
CRS user>
os>ocrcheck
OK: # of OCR Disk >= 2
NO: # of OCR Disk = 1
[??] OCR Disk? ??? ?? ?? ??
[???]  OCR Disk
? ??? ??? ??
[????] OCR?
??? 
(????? ??? ?? ??? ???? ?? ??)
  PISAORA_R.B.17 File System separation for each OCR Disk
CRS user>
os>ocrcheck
OK: Each file is located on a separate file system
NO: Located in the same directory OR the same ASM Disk Group
[??] ?? OCR File? ?? Directory?
??
[???] ????? ?? OCR File? ?? ?? ??? ??
[????] OCR File? ?? ?? Directory? ??
               (????? ??? ?? ??? ????
?? ??)
  PISAORA_R.B.19 CRS Network Heartbeat misscount
The default misscount is 60 s on Linux and 30 s on other platforms; with third-party vendor clusterware it is 600 s.
$ crsctl get css misscount
The private-network ping must complete within the misscount (MC) time.
OK: Less than 200 seconds

If the third-party vendor (OS) clusterware can detect a heartbeat failure in less than 200 seconds, it is OK for the Oracle misscount to be larger than 200.

default: 10g: CRS only: 30 (Linux: 60), vendor cluster: 600
         11g: CRS only: 30, vendor cluster: 600

Steps To Change CSS Misscount, Reboottime and Disktimeout [ID 284752.1]
[??] css misscount? ???? ???
???? ?? ??.
[???] False Detection ?? ?? ????? ??.
[????] css misscount? 200 ??? ??? ????.
  PISAORA_R.B.20 [10gR1~11gR1] CRS diagwait
CRS user>
os>crsctl get css diagwait
Gives OPROCD up to 13 s to respond; beyond that, the system is rebooted.
That is, a CPU hang of more than 10 s plus a CSS restart of more than 3 s adds up to more than 13 s, at which point the system is rebooted.

** Online change is not allowed

OK: 13
NO: Not set

No need to set from 11gR2 [ID 559365.1]

/ora_crs/bin/crsctl stop crs
/ora_crs/bin/crsctl get css diagwait
/ora_crs/bin/crsctl set css diagwait 13 -force
/ora_crs/bin/crsctl get css diagwait

[??] CRS Logging ?? ???? ?
Default ??
[???] CRS?
?? ?? Reboot? Log ??? ?? ???? ?? ?? ??
[????] diagwait ????  0
?? 13?? ??
[??] 2??
DB ??? ?? ??? CRS Stop ?? ?? ??
??
??? CRS??
????? ???? ?? ??
OCR Corruption? ??? ? ????
Rollable ???
  PISAORA_R.B.21 CRS client log Management
ls -al /ora_crs/log/`hostname`/client | wc -l
OK: Less than 1000
NO: More than 1000

** You have to remove them regularly
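
A minimal sketch of the same count as a reusable check follows. The function name `check_client_log_count` is hypothetical, and the 1000-entry threshold is the one from the checklist. It uses `ls -A`, so unlike `ls -al | wc -l` it does not count the "total" line or the `.` and `..` entries.

```shell
#!/bin/sh
# Hypothetical helper: counts the entries in a CRS client log directory and
# compares the count with a threshold (default 1000, as in the checklist).
check_client_log_count() {
    dir=$1
    limit=${2:-1000}
    n=$(ls -A "$dir" | wc -l | tr -d ' ')
    if [ "$n" -lt "$limit" ]; then
        echo "OK: $n entries"
    else
        echo "NO: $n entries"
    fi
}

# Usage (the path is taken from the check above):
#   check_client_log_count "/ora_crs/log/$(hostname)/client"
```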

[??]
$ORA_CRS_HOME//log/`hostname`/client ? ?? file? ??.
[???] svrctl command? ??? ???.
             
???? log ??? ?? file
limit ?? ? log scan? ?? CRS
             
command ?? ??
[????] $ORA_CRS_HOME//log/`hostname`/client ? 1000? ??? file?
              ????? ????. (????? ??)
  PISAORA_R.B.22 OCR/OLR Auto Backup Status
CRS user>
For OCR backup:
os>ocrconfig -showbackup

For OLR backup:
os>ocrconfig -local -showbackup

OK: OCR backup within 1 week, OR OLR backup
NO: No backup for OCR or OLR
 
  PISAORA_R.B.23 [If Switch is used as VLAN] Interconnect Network Switch Configuration
How should RAC be deployed on VLANs? Can't a single switch be virtualized into three networks (public, private, and SAN)?
[NOTE: MOS ID 220970.1] RAC: Frequently Asked Questions
If deploying the interconnect on a VLAN, there should be a 1:1 mapping of the VLAN to a non-routable subnet and the VLAN should not span multiple VLANs (tagged) or multiple switches.
OK: 1:1 mapping of the VLAN to a non-routable subnet
NO: The interlink is shared using tagging

** Consult with Network Administration

[??] Interconnect Switch? VLAN??
??.
[
???] Cluster
interconnect? vlan? non-routerble IP? ????? ?.
[
????] ?? interlink? ???? 1:1 mapping? ??? ??.
[Note] [ID 220970.1] RAC: Frequently Asked Questions
If deploying the interconnect on a VLAN, there should be a 1:1 mapping of the VLAN to a non-routable subnet and the VLAN should not span multiple VLANs (tagged) or multiple switches
  PISAORA_R.B.24 [11gR2, XA or DB link Case] _clusterwide_global_transactions
SQL> SELECT max(ksppinm) name, max(ksppstvl) value
       FROM x$ksppi a, x$ksppsv b
      WHERE a.indx = b.indx
        AND ksppinm = '_clusterwide_global_transactions';
OK: _clusterwide_global_transactions=false
NO: _clusterwide_global_transactions=true (*default)

** ONLINE change is not permitted

[??] clusterwide global transaction? ??? ??
[???] 11g??
XA? DB link??? ?? TX?
RAC? ?? ????? ?? ????? clusterwide global TX? ??????, ???
bug ??? ?? ???? hang??
lock? ?? ??
[????] _clusterwide_global_transactions=false

[Note] [ID 1361615.1] High rdbms ipc reply and DFS lock handle in 11gR2 RAC With XA Enabled Application
[ID 8588540.8] Bug 8588540 - Corruption / ORA-8102 in RAC with loopback DB links between instances
[ID 13605839.8] Bug 13605839 - ORA-600 [ktbsdp1] ORA-600 [kghfrempty:ds]. Corruption in Rollback with Clusterwide Global Transactions in RAC
  PISAORA_R.B.25 Owner/permission
of OCR/Voting
  OK:
**Recommended Value
NO: Other value

** Recommended Value
OCR Disk: root:oinstall – 640
Voting Disk: oracle:oinstall – 644 OR 640
For Group, oinstall and dba are both OK

[??] OCR ?
voting file? owner:group? ?? ??? ??.
[???] Oracle Clusterware (Grid Infrastructure) ??? ??? ??
[????] OCR? Voting Disk ? owner ? ??? ??? ?? ??
              
OCR – root:dba – 640
              
Voting Disk –
oracle:dba – 644
  PISAORA_R.B.28 [10gR2~] OS Watcher Configuration
$ ps -ef | grep OSW
OK: OSW is being used
NO: Not configured

(How to run)
cd ./OSW/OSWbb
nohup ./startOSWbb.sh 30 720 &

[??]  OS Watcher ???? ??
[???]  ??? ????? ???? ??? ??? ?
CPU, MEMORY, IO, Network? ??? ?? ??.
[
????] OS Watcher ?? ??
[??] OS Watcher ????? ??
Technical Memo ??

- ??? ?? OS??? ??? ???? ??? ?? OS Platform? ?????
??? ??? ??? ????
Script
– RAC
? ???? ????? ??? ?? ???.

[
??] Technical
Memo-OSW(OS_Watcher)_Black_Box.doc
  PISAORA_R.B.29 [>= VERITAS CFS (cluster file system) 4.1] ODM (Oracle Disk Manager) library with VERITAS CFS
ODM improves file system performance so that a file system can reach raw-device performance. However, ODM requires the third-party vendor to provide the corresponding interface, for example the ODM library provided by Veritas.
Check whether $ORACLE_HOME/lib/libodm* is soft-linked to the Veritas ODM library

# ls -l $ORACLE_HOME/lib/libodm*
/opt/VRTSodm/lib/libodm.sl - HP PA Systems
/opt/VRTSodm/lib/libodm.so - HP IA Systems

OK: VERITAS ODM library is linked
NO: Not linked with VERITAS ODM
[??] VERITAS CFS??? ???? ODM library ???? ??
[???] Oracle Library? ??? ???? ?? I/O? ??? ?? ?? ???
[????] ????
Library? ????? ????.
             
VERITAS CFS
?? ???? ODM (Oracle Disk Manager)
library ????? ?? ??
(?? ??)
1. Login as Oracle user
2. Shutdown database
3. Link the Oracle Disk Manager library into Oracle home
For Oracle 10g on HP 9000 Systems:
$ rm ${ORACLE_HOME}/lib/libodm10.sl
$ ln -s /opt/VRTSodm/lib/libodm.sl ${ORACLE_HOME}/lib/libodm10.sl
For Oracle 10g on Integrity Systems:
$ rm ${ORACLE_HOME}/lib/libodm10.so
$ ln -s /opt/VRTSodm/lib/libodm.so ${ORACLE_HOME}/lib/libodm10.so
For Oracle 11g on HP 9000 Systems:
$ rm ${ORACLE_HOME}/lib/libodm11.sl
$ ln -s /opt/VRTSodm/lib/libodm.sl ${ORACLE_HOME}/lib/libodm11.sl
For Oracle 11g on Integrity Systems:
$ rm ${ORACLE_HOME}/lib/libodm11.so
$ ln -s /opt/VRTSodm/lib/libodm.so ${ORACLE_HOME}/lib/libodm11.so
4. Start Oracle database

  PISAORA_R.B.30 [11.1 ~ ] _gc_bypass_readers
SQL> show parameter _gc_bypass_readers

[Note] When one instance crashes, the other instance hangs on "Buffer Busy" waits under high workload (real case)
OK: _gc_bypass_readers=false
NO: _gc_bypass_readers=true (*default)
[Phenomenon] The 11g system runs with _gc_bypass_readers = true.
[Problem] Errors in the reader bypass feature cause hangs or slow recovery.
[Improvement] Disable the new feature.
              _gc_bypass_readers = false.

[Note] How to apply:
SQL> alter system set "_gc_bypass_readers" = false;
The parameter can be changed online in a running environment (a rolling change is also possible). If the value is changed in initSID.ora instead, it takes effect at the restart during maintenance.

** ONLINE change available, ROLLABLE
** If changed online, sessions that are already connected stop using reader bypass immediately and switch to the block lock type, so their work may wait temporarily. In a test under load this took 1 to 2 seconds.

  PISAORA_R.B.31 [11.2. ~ ] _gc_read_mostly_locking
11g new feature: it improves performance for reads, but not for writes
(see "DRM and read-mostly locking")
OK: _gc_read_mostly_locking=false
NO: _gc_read_mostly_locking=true (*default)

[Phenomenon] The 11g system runs with _gc_read_mostly_locking = true.
[Problem] In 11g, instance crashes or internal errors occur because of failures in this feature
[Improvement] Disable the new feature.
              _gc_read_mostly_locking = false.
** ONLINE change not available, not ROLLABLE
[Note]
Read-mostly: an 11g new feature that applies read-mostly locking to objects that are mainly read and rarely changed.
S locks are acquired faster and interconnect traffic can be reduced.
However, if the object is changed frequently, it causes inefficient I/O.

Bug 13457582 - ora-600 [kclantilock_8] [ID 13457582.8]
Instance crash.

  PISAORA_R.B.32 [9.2 ~ 10.2.x] _gc_integrity_checks
[Note] With 11g, the "Fusion Assert" issue is related to "_gc_integrity_checks" = 2, that is, there is no problem with the default value (1)
(9i~10g: Fusion Assert is related to value 1)

OK: _gc_integrity_checks=0
NO: _gc_integrity_checks=1 (*default)
[??] _gc_integrity_checks = true? ??.
[
???] 1. cross ??? ??? restart??? ??
              2. interconnect
??? ???
fusion asset ??? ?? ?? ??
[????] ??? ???? ??? ?.
             
_gc_integrity_checks =
false.

** ONLINE
??,
ROLLABLE ??
1. RAC ???? cross ??? ???
restart??? ??
( RAC cross ???? oracle??? not support ?? ???)

2.
??? resource? ?? ???
process? ????? threshold??? ?? operation?? ????
assert code? ????, ?? process? ???? ????? F/G?, ??? ??? ?? ??? LMS process? fusion assert  ??? ????. ?? ??, RAC ?? ??? ??? ? ??.

Fusion assert
??
block status message ???? ??
RAC?? ??? ??? ? ????, disable(0) ??? ????? ??
(??)
  PISAORA_R.B.33 [11.2.0 ~ ] _disable_system_state
OK: _disable_system_state=10
NO: _disable_system_state=4294967294 (*default)
[Phenomenon] The 11g system runs with _disable_system_state = 4294967294.
[Problem] When Oracle internally performs a System State Dump in an abnormal situation, the dump level is set high, which causes heavy load and large dump files.
          In particular, the diag directory file system can become full.
[Improvement] Disable the new feature.
              alter system set "_disable_system_state" = 10 scope = both;
** ONLINE change available, ROLLABLE
[Note]
Defines the level limit for System State Dumps, whether called explicitly or performed automatically by Oracle code. (Only dumps below the specified level are performed.)

Case from another site: prevents the "latch: gc element" contention and disk I/O bottlenecks and delays caused by a level 267 System State Dump.

  PISAORA_R.B.34 Parameter Check to handle the failover connections
? Guess: the connection limits are too low to absorb failover connections
SQL> select 'Parameter ==> '||resource_name||' inst#: '||inst_id||' current: '||current_utilization||' max: '||max_utilization||' limit: '||limit_value
       from gv$resource_limit
      where resource_name in ('processes', 'sessions')
      order by resource_name, inst_id;
OK: The processes and sessions parameters are set larger than the sum of the "sessions high water mark (max)" of the other instance

NO: The processes and sessions parameters are not large enough to take the failover connections (based on the sessions high water mark (max))
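
The OK/NO rule above boils down to simple arithmetic. The sketch below is one reading of it, labeled as an assumption: the surviving instance's limit must exceed the combined max_utilization of every instance whose connections it may have to carry, including its own. The function name `check_failover_headroom` is hypothetical, and the values are assumed to be plain integers copied from the gv$resource_limit output (not UNLIMITED).

```shell
#!/bin/sh
# Hypothetical helper: the first argument is the limit_value of the surviving
# instance; the remaining arguments are max_utilization values, one per instance.
check_failover_headroom() {
    limit=$1
    shift
    total=0
    for max in "$@"; do
        total=$((total + max))
    done
    if [ "$limit" -gt "$total" ]; then
        echo "OK: limit $limit > combined max $total"
    else
        echo "NO: limit $limit <= combined max $total"
    fi
}

# Usage with values read from the query above:
#   check_failover_headroom 1000 300 400
```

Run it once for processes and once for sessions, and once per instance that might be the survivor.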
 
[??]
[???]
[????]
  PISAORA_R.B.35 [11.2.0.1 to 11.2.0.3] VIP Intermediate Attribute
Because the network is monitored every second, a congested public network causes unnecessary listener offline events and VIP failovers.
crsctl stat res ora.rac1.vip -p | grep STOP_DEPENDENCIES
OK: STOP_DEPENDENCIES=hard(intermediate:ora.net1.network)

NO: STOP_DEPENDENCIES=hard(ora.net1.network)
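
A minimal sketch of this check as a filter, assuming the `crsctl stat res <vip> -p` output contains one `STOP_DEPENDENCIES=` line in the form shown in the OK/NO values. The function name `check_vip_stop_dependency` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: reads `crsctl stat res <vip> -p` output on stdin and
# classifies the STOP_DEPENDENCIES attribute.
check_vip_stop_dependency() {
    line=$(grep '^STOP_DEPENDENCIES=' | head -n 1)
    case "$line" in
        "") echo "UNKNOWN: STOP_DEPENDENCIES not found" ;;
        *"hard(intermediate:"*) echo "OK: $line" ;;
        *) echo "NO: $line" ;;
    esac
}

# Usage (resource name taken from the check above):
#   crsctl stat res ora.rac1.vip -p | check_vip_stop_dependency
```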
[??] CRS Resource ?? Listener, Service? Stop ?.

[
???] Check Timeout? ?? ??
Resource? Unknown ??? ???? ??
Online? ?, ?? Dependency? ??
Resource? Stop?? Online ? Failover?? ?? ??.

[
????] GI(CRS) PSU
11.2.0.3.3 ?? ?? ? VIP Stop Dependency? ‘immediate’ ?? ??.

1. Check the current setting
$ crsctl stat res ora.bjmcspd1.vip -p | grep STOP_DEPENDENCIES
--> Current setting: STOP_DEPENDENCIES=hard(ora.net1.network)

2. Change (VIP/SCAN VIP)
$ crsctl modify res ora.bjmcspd1.vip -attr "STOP_DEPENDENCIES=hard(intermediate:ora.net1.network)"
$ crsctl modify res ora.bjmcspd2.vip -attr "STOP_DEPENDENCIES=hard(intermediate:ora.net1.network)"
$ crsctl modify res ora.scan1.vip -attr "STOP_DEPENDENCIES=hard(intermediate:ora.net1.network)"

3. Restart nodeapps/VIP

[Note] [ID 1333165.1] VIP, SCAN VIP/Listener Fails Over and Listener Stops After Short Public Network Hiccup
When reposting, please credit the original source: Harries Blog™ » RAC脚本检查