Exadata in a Nutshell

HealthCheck
HealthCheck scripts run from /usr/oracle/healthcheck. The healthcheck binaries can be copied from frame to frame by tarring up all the files and copying the archive to the new location with scp.
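The copy described above could be scripted roughly as follows. This is only a sketch: the helper name pack_healthcheck, the archive path, and the target host in the usage note are placeholders and not part of the original procedure.

```shell
#!/bin/sh
# Sketch: bundle a healthcheck directory into one gzip'd tarball.
# pack_healthcheck is a hypothetical helper name.
pack_healthcheck() {
    src_dir=$1      # e.g. /usr/oracle/healthcheck
    archive=$2      # e.g. /tmp/healthcheck.tar.gz
    # -C switches to the parent directory so the archive stores relative paths.
    tar -C "$(dirname "$src_dir")" -czf "$archive" "$(basename "$src_dir")"
}
```

Possible usage: run `pack_healthcheck /usr/oracle/healthcheck /tmp/healthcheck.tar.gz`, copy the archive with `scp /tmp/healthcheck.tar.gz root@<new-host>:/tmp/`, then unpack it on the target with `tar -C /usr/oracle -xzf /tmp/healthcheck.tar.gz`.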

Change all Cell/DBNode passwords at once
    PASSWORD=<value>
    dcli -l root -g ~/all_group "echo ${PASSWORD} | passwd --stdin root"

Status of Cluster
    /usr/oracle/grid/product/ora11gR2/bin/crsctl stat res -t

Hardware sensors
Status or detailed sensor info:
    ipmitool sdr | grep -v ok
    ipmitool sensor
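One caveat with `grep -v ok` is that it also hides any line where "ok" happens to appear in the sensor name or reading. A slightly stricter filter can compare only the status column. The sketch below assumes the usual three-column `ipmitool sdr` layout (name | reading | status); the function name sdr_not_ok is made up for illustration.

```shell
#!/bin/sh
# Sketch: print only sensors whose status column is not exactly "ok".
# Expects `ipmitool sdr` style lines: name | reading | status
sdr_not_ok() {
    awk -F'|' 'NF >= 3 {
        status = $3
        gsub(/^[ \t]+|[ \t]+$/, "", status)   # trim surrounding blanks
        if (status != "ok") print
    }'
}
```

Possible usage: `ipmitool sdr | sdr_not_ok`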

Gather all diagnostics information
    /opt/oracle.SupportTools/onecommand/diagget.sh

Hardware Profile
    /opt/oracle.SupportTools/CheckHWnFWProfile -S > /tmp/CheckHWnFWProfile_dm10db01.cbp.dhs.gov.txt

InfiniBand/RDS Status
    ibstatus
    rds-info -n

Status of network connections
    ip n s
    sar -n DEV 1 3
    ip -s link show

Confirm Storage Cell Status
    dcli -l root -g ~/cell_group 'cellcli -e list cell'
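On a full rack the output is easier to scan if only the cells that are not online are shown. The sketch below assumes each dcli line looks like "cel01: cel01   online", that is, a host prefix followed by the cell name and its status; the function name cells_not_online is hypothetical.

```shell
#!/bin/sh
# Sketch: list cells whose status (last column) is not "online".
# Assumed input format, one line per cell: "cel01: cel01   online"
cells_not_online() {
    awk 'NF >= 2 && $NF != "online" {
        sub(/:$/, "", $1)   # strip the trailing colon dcli adds to the host
        print $1
    }'
}
```

Possible usage: `dcli -l root -g ~/cell_group 'cellcli -e list cell' | cells_not_online`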

Additional Reading (MOS Notes)
    888828.1 – Database Machine and Exadata Storage Server 11g Release 2 (11.2) Supported Versions
    1078889.1 – Exadata calibrate reports substandard IOPS on SAS drives
    1093890.1 – Steps To Shutdown/Startup The Exadata & RDBMS Services and Cell/Compute Nodes
    757552.1 – Oracle Exadata Best Practices
    1120955.1 – Exadata V2: How To Startup or Shutdown An Exadata Or Compute Node Server Thru ILOM?
    735323.1 – Exadata Storage Server Diagnostic Collection Guide
    1053498.1 – Network Diagnostics information for Oracle Database Machine Environments
    1072676.1 – Exadata General FAQ
    1070954.1 – Oracle Exadata Database Machine exachk or HealthCheck
    1071221.1 – Oracle Sun Database Machine Backup and Recovery Best Practices
    359395.1 – Remote Diagnostic Agent (RDA) 4 – RAC Cluster Guide
    085606 – Yum repo – http://www.oracle.com/technetwork/topics/linux/yum-repository-setup-085606.html
    1317159.1 – Changing IP addresses on Exadata Database Machine
    361468.1 – HugePages on Oracle Linux 64-bit
    Best Practices: http://www.oracle.com/technetwork/database/features/availability/exadata-maa-best-practices-155385.html

 

DB NODES

Automatic Diagnostic Repository (ADR) Report
    adrci> show homes
    adrci> set homepath <diag path for the instance that generated the incident>
    adrci> ips create package incident <incident number>
    adrci> ips generate package 1 in /tmp

Reset BMC
    dcli -l root -g ~/dbs_group ipmitool bmc reset cold

Display Disks Status
    dcli -l root -g ~/cell_group 'cellcli -e list griddisk attributes name,size,status,asmmodestatus'
    dcli -l root -g ~/cell_group 'cellcli -e list celldisk'

Display Sensor Status
    ipmitool sensor

Temperature
    dcli -l root -g /root/all_group ipmitool sensor get T_AMB | grep -i Reading
    dcli -l root -g /root/cell_group 'cellcli -e list cell detail' | grep -i Reading
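To pick out only the warm hosts from the first command, the readings can be compared against a limit. The sketch below assumes dcli prefixes each line with the host name, so a line looks like "db01:  Sensor Reading        : 22 (+/- 0) degrees C". The function name hot_hosts and the limit of 30 in the usage note are illustrative choices, not recommended values.

```shell
#!/bin/sh
# Sketch: print "host reading" for hosts whose ambient reading exceeds a limit.
# Assumed input format per line:
#   db01:  Sensor Reading        : 22 (+/- 0) degrees C
hot_hosts() {
    limit=$1
    awk -F':' -v limit="$limit" 'NF >= 3 {
        split($3, parts, " ")        # parts[1] is the numeric reading
        if (parts[1] + 0 > limit + 0) print $1, parts[1]
    }'
}
```

Possible usage: `dcli -l root -g /root/all_group ipmitool sensor get T_AMB | grep -i Reading | hot_hosts 30`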

Restart OS Watcher
    /opt/oracle.oswatcher/osw/stopOSW.sh
    /opt/oracle.cellos/vldrun -script oswatcher

 

INFINIBAND

Show IB ports and status on the IB switch
    listlinkup

IB/RDS Status
    ibstatus
    rds-info -n

Status of network connections
    ip n s
    sar -n DEV 1 3
    ip -s link show

 

STORAGE CELLS

Capture all alerts from cell history based on a time stamp
    dcli -l root -g ~/cell_group cellcli -e "list alerthistory where begintime \> \'2011-05-25T00:00:00-05:00\'"
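The timestamp in the filter can be generated instead of typed by hand. The sketch below relies on GNU date (the -d option and the %:z format), which is an assumption about the host; the function name cell_timestamp is hypothetical.

```shell
#!/bin/sh
# Sketch: print a timestamp in the same layout as the alerthistory filter,
# e.g. 2011-05-25T00:00:00-05:00.
# Assumes GNU date: -d parses the expression, %:z prints the offset as +hh:mm.
cell_timestamp() {
    date -d "$1" '+%Y-%m-%dT%H:%M:%S%:z'
}
```

Possible usage: `since=$(cell_timestamp '24 hours ago')`, then `dcli -l root -g ~/cell_group cellcli -e "list alerthistory where begintime \> \'$since\'"`.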

Check/change default disk timeout for cells (cell failure)
The default DISK_REPAIR_TIME is 3.6 hours. The REPAIR_TIMER column is in seconds and shows 0 when no repair timer is running.
    SQL> SELECT name, repair_timer FROM v$asm_disk;
    SQL> ALTER DISKGROUP DATA SET ATTRIBUTE 'DISK_REPAIR_TIME'='8.5H';
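Because the attribute is set in hours while the timer column is reported in seconds, a quick conversion helps when comparing the two. A minimal sketch, with hours_to_seconds as a made-up helper name:

```shell
#!/bin/sh
# Sketch: convert a repair time given in hours to seconds,
# the unit used by the repair timer column.
hours_to_seconds() {
    awk -v h="$1" 'BEGIN { printf "%.0f\n", h * 3600 }'
}
```

For example, `hours_to_seconds 3.6` gives 12960 and `hours_to_seconds 8.5` gives 30600.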

Get cell disk info
    cellcli -e list griddisk attributes name,size,offset,status

Help with cellcli
    cellcli -e help alter cell

Reset BMC
    dcli -l root -g ~/cell_group cellcli -e "alter cell restart bmc"

Restart cell services
    cellcli -e 'alter cell restart services all'

Flashcache Disk Resurrection Procedure
    1. Disable alerts & monitoring
    2. Cell shutdown
    3. init 6
    4. Cell startup
    5. Run the command below until all disks are "active online yes":
    cellcli -e list griddisk attributes name,status,asmmodestatus,asmdeactivationoutcome
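The final "run until all disks are active online yes" step can be automated with a small polling loop. This is a sketch under stated assumptions: the function names are hypothetical, the 30-second interval is arbitrary, and the check assumes one grid disk per line with whitespace-separated columns in the order name, status, asmmodestatus, asmdeactivationoutcome.

```shell
#!/bin/sh
# Sketch: succeed only if every grid disk line reads "active ONLINE Yes".
# Assumed input: name status asmmodestatus asmdeactivationoutcome
all_griddisks_ready() {
    awk 'NF > 0 {
        seen++
        if (NF < 4 || tolower($2) != "active" || toupper($3) != "ONLINE" || tolower($4) != "yes")
            bad++
    }
    END {
        if (seen > 0 && bad == 0) exit 0   # at least one disk, none pending
        exit 1
    }'
}

# Poll every 30 seconds (an arbitrary interval) until all disks are ready.
wait_for_griddisks() {
    until cellcli -e list griddisk attributes name,status,asmmodestatus,asmdeactivationoutcome | all_griddisks_ready; do
        sleep 30
    done
}
```

Possible usage on the cell after startup: `wait_for_griddisks && echo "all grid disks active"`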

Flash Cache Status/Creation
    dcli -l root -g ~/cell_group cellcli -e 'list flashcache'
    cellcli -e 'drop celldisk all flashdisk force'
    cellcli -e 'create celldisk all flashdisk'
    cellcli -e 'create flashcache all'

List total I/O error counts
A non-zero error count can indicate a pending proactive/predictive failure.
    dcli -l root -g ~/cell_group cellcli -e "list griddisk where errorCount \> 0 detail"