Commonly Used Logfiles on Exadata

treasure_map

When treasure hunters looking for lost treasure, they are using treasure map. On Exadata,
when you are hunting down issues, logfiles are your friend. There are tons of logfiles on Exadata and I list the frequently used ones as follows in two major categories, database node and cell node.

Database Node

Database alert logfile
$ORACLE_BASE/diag/rdbms/{DBNAME}/{sid}/trace/alert_{sid}.log
Example: /u01/app/oracle/diag/rdbms/dbm/dbm1/trace/alert_dbm1.log

Listener logfile
$ORACLE_BASE/diag/tnslsnr/{node name}/listener/trace/listener.log
Example: /u01/app/oracle/diag/tnslsnr/enkx3db01/listener/trace/listener.log

ASM alert logfile
$ORACLE_BASE/diag/asm/+asm/+ASM{instance number}/trace/ alert_+ASM{instance number}.log
Example: /u01/app/oracle/diag/asm/+asm/+ASM1/trace/alert_+ASM1.log

CRS alert logfile
$GRID_HOME/log/{node name}/alert{node name}.log
Example: /u01/app/11.2.0.3/grid/log/enkx3db01/alertenkx3db01.log

Note: there are also many CRS related logfiles under this $GRID_HOME/log/{node name} directory

[enkx3db01:oracle:+ASM1] /u01/app/11.2.0.3/grid/log/enkx3db01
> ls -l /u01/app/11.2.0.3/grid/log/enkx3db01
total 1176
drwxr-xr-x 2 root   root    4096 Apr 24 11:53 acfs
drwxr-x--- 2 oracle dba     4096 Mar 12  2013 acfslog
drwxr-x--- 2 oracle dba     4096 Mar 12  2013 acfsrepl
drwxr-x--- 2 root   dba     4096 Mar 12  2013 acfsreplroot
drwxr-xr-x 2 root   dba     4096 Mar 12  2013 acfssec
drwxr-x--- 2 oracle dba     4096 Mar 12  2013 admin
drwxrwxr-t 4 root   dba     4096 Mar 12  2013 agent
-rw-rw-r-- 1 oracle dba  1096949 Oct 14 15:16 alertenkx3db01.log
drwxr-x--x 2 oracle dba    12288 Oct 14 06:48 client
drwxr-x--- 2 root   dba     4096 Oct  8 15:51 crflogd
drwxr-x--- 2 root   dba     4096 Sep 10 12:54 crfmond
drwxr-x--- 2 root   dba     4096 Oct 11 10:39 crsd
drwxr-x--- 2 oracle dba     4096 Oct 13 21:19 cssd
drwxr-x--- 2 root   dba     4096 Oct 12 12:07 ctssd
drwxr-x--- 4 oracle dba     4096 Mar 12  2013 cvu
drwxr-x--- 2 oracle dba     4096 Oct  9 14:11 diskmon
drwxr-x--- 2 oracle dba     4096 Mar 12  2013 evmd
drwxr-x--- 2 oracle dba     4096 Oct 14 10:48 gipcd
drwxr-x--- 2 root   dba     4096 Mar 12  2013 gnsd
drwxr-x--- 2 oracle dba     4096 Oct 14 15:14 gpnpd
drwxr-x--- 2 oracle dba     4096 Sep  6 12:22 mdnsd
drwxr-x--- 2 root   dba     4096 Oct 13 19:14 ohasd
drwxrwxr-t 5 oracle dba     4096 Mar 12  2013 racg
drwxr-x--- 2 oracle dba     4096 Mar 12  2013 srvm

The following chart shows the hierarchy of CRS related log directories.
crs_log_directories

OS Watcher output files
All OS Watcher files are under /opt/oracle.oswatcher/osw/archive/ directory.

[root@enkx3db01 ~]# cd /opt/oracle.oswatcher/osw/archive
[root@enkx3db01 archive]# ls -l
total 252
drwxr----- 2 root root 24576 Oct 14 15:14 ExadataOSW
drwxr----- 2 root root 20480 Oct 14 15:14 ExadataRDS
drwxr----- 2 root root  4096 Feb 20  2013 oswcellsrvstat
drwxr----- 2 root root 24576 Oct 14 15:15 oswdiskstats
drwxr----- 2 root root 20480 Oct 14 15:15 oswiostat
drwxr----- 2 root root 20480 Oct 14 15:15 oswmeminfo
drwxr----- 2 root root 24576 Oct 14 15:15 oswmpstat
drwxr----- 2 root root 20480 Oct 14 15:15 oswnetstat
drwxr----- 2 root root  4096 Feb 20  2013 oswprvtnet
drwxr----- 2 root root 24576 Oct 14 15:15 oswps
drwxr----- 2 root root 24576 Oct 14 15:15 oswslabinfo
drwxr----- 2 root root 20480 Oct 14 15:14 oswtop
drwxr----- 2 root root 24576 Oct 14 15:15 oswvmstat

For example, if we want to get OS watcher data for October 11, 2013, just run the folowing
To get OS watcher data for a specific date :
cd /opt/oracle.oswatcher/osw/archive
find . -name ‘*13.10.11*’ -print -exec zip /tmp/osw_`hostname`.zip {} \;
where 12- year 01- Month 13-day

[root@enkx3db01 tmp]# cd /opt/oracle.oswatcher/osw/archive
[root@enkx3db01 archive]# find . -name ‘*13.10.11*’ -print -exec zip /tmp/osw_`hostname`.zip {} \;

It will create a zip file with the name of osw_enkx3db01.enkitec.com.zip for all data happened on October 11, 2013. If we want just one hour of data, for example, 3pm’s data, we just make a little change in the above command

[root@enkx3db01 archive]# find . -name ‘*13.10.11.1400*’ -print -exec zip /tmp/osw_`hostname`.zip {} \;
./oswps/enkx3db01.enkitec.com_ps_13.10.11.1400.dat.bz2
updating: oswps/enkx3db01.enkitec.com_ps_13.10.11.1400.dat.bz2 (deflated 12%)
./ExadataRDS/enkx3db01.enkitec.com_ExadataRDS_13.10.11.1400.dat.bz2
updating: ExadataRDS/enkx3db01.enkitec.com_ExadataRDS_13.10.11.1400.dat.bz2 (deflated 6%)
./oswdiskstats/enkx3db01.enkitec.com_diskststs_13.10.11.1400.dat.bz2
updating: oswdiskstats/enkx3db01.enkitec.com_diskststs_13.10.11.1400.dat.bz2 (deflated 1%)
./oswtop/enkx3db01.enkitec.com_top_13.10.11.1400.dat.bz2
updating: oswtop/enkx3db01.enkitec.com_top_13.10.11.1400.dat.bz2 (deflated 7%)
./oswnetstat/enkx3db01.enkitec.com_netstat_13.10.11.1400.dat.bz2
updating: oswnetstat/enkx3db01.enkitec.com_netstat_13.10.11.1400.dat.bz2 (deflated 11%)
./oswslabinfo/enkx3db01.enkitec.com_slabinfo_13.10.11.1400.dat.bz2
updating: oswslabinfo/enkx3db01.enkitec.com_slabinfo_13.10.11.1400.dat.bz2 (deflated 1%)
./oswmpstat/enkx3db01.enkitec.com_mpstat_13.10.11.1400.dat.bz2
updating: oswmpstat/enkx3db01.enkitec.com_mpstat_13.10.11.1400.dat.bz2 (stored 0%)
./oswvmstat/enkx3db01.enkitec.com_vmstat_13.10.11.1400.dat.bz2
adding: oswvmstat/enkx3db01.enkitec.com_vmstat_13.10.11.1400.dat.bz2 (stored 0%)
./oswmeminfo/enkx3db01.enkitec.com_meminfo_13.10.11.1400.dat.bz2
adding: oswmeminfo/enkx3db01.enkitec.com_meminfo_13.10.11.1400.dat.bz2 (deflated 0%)
./oswiostat/enkx3db01.enkitec.com_iostat_13.10.11.1400.dat.bz2
adding: oswiostat/enkx3db01.enkitec.com_iostat_13.10.11.1400.dat.bz2 (deflated 1%)

[root@enkx3db01 archive]# ls -l /tmp/osw_`hostname`.zip
-rw-r–r– 1 root root 93306416 Oct 14 19:47 /tmp/osw_enkx3db01.enkitec.com.zip

OS message logfile
/var/log/messages

Secure logfile
/var/log/secure file records all SSH connections into the server. There are also two more log files related to /var/log/secure, /var/log/faillog and /var/log/tallylog. /var/log/faillog records the failed attempts to login while /var/log/tallylog is used as the “counter” for PAM and this counter is used to determine whether to lock a user account. But only /var/log/secure is “human readable” and it is difficult to manually edit faillog or tallylog files.
Here is a sample of secure file
[root@enkx3db01 log]# tail -30 /var/log/secure
Oct 14 16:53:19 enkx3db01 sshd[121950]: pam_unix(sshd:session): session opened for user oracle by (uid=0)
Oct 14 16:53:20 enkx3db01 sshd[121955]: Accepted password for oracle from 192.168.9.89 port 62530 ssh2
Oct 14 16:53:20 enkx3db01 sshd[121955]: pam_unix(sshd:session): session opened for user oracle by (uid=0)
Oct 14 16:53:20 enkx3db01 sshd[121959]: subsystem request for sftp
Oct 14 17:04:29 enkx3db01 su: pam_unix(su-l:session): session closed for user root
Oct 14 17:04:31 enkx3db01 sshd[16252]: Received disconnect from 192.168.9.111: 11: disconnected by user
Oct 14 17:04:31 enkx3db01 sshd[15720]: pam_unix(sshd:session): session closed for user oracle
Oct 14 17:35:59 enkx3db01 sshd[29330]: pam_unix(sshd:session): session closed for user oracle
Oct 14 19:03:20 enkx3db01 sshd[120214]: pam_unix(sshd:session): session closed for user oracle
Oct 14 19:04:35 enkx3db01 sshd[121955]: pam_unix(sshd:session): session closed for user oracle
Oct 14 19:09:42 enkx3db01 sshd[121950]: pam_unix(sshd:session): session closed for user oracle
Oct 14 19:23:49 enkx3db01 sshd[120198]: pam_unix(sshd:session): session closed for user oracle
Oct 14 20:25:29 enkx3db01 sshd[82077]: pam_unix(sshd:auth): authentication failure; logname= uid=0 euid=0 tty=ssh ruser= rhost=192.168.9.154 user=oracle
Oct 14 20:25:30 enkx3db01 sshd[82077]: Failed password for oracle from 192.168.9.154 port 50358 ssh2
Oct 14 20:25:36 enkx3db01 sshd[82077]: Failed password for oracle from 192.168.9.154 port 50358 ssh2
Oct 14 20:25:38 enkx3db01 sshd[82077]: Accepted password for oracle from 192.168.9.154 port 50358 ssh2
Oct 14 20:25:38 enkx3db01 sshd[82077]: pam_unix(sshd:session): session opened for user oracle by (uid=0)
Oct 14 20:25:38 enkx3db01 sshd[82350]: Accepted password for oracle from 192.168.9.154 port 50359 ssh2
Oct 14 20:25:38 enkx3db01 sshd[82350]: pam_unix(sshd:session): session opened for user oracle by (uid=0)
Oct 14 20:25:38 enkx3db01 sshd[82352]: subsystem request for sftp
Oct 14 20:26:28 enkx3db01 sshd[83425]: Accepted password for oracle from 192.168.9.154 port 50361 ssh2
Oct 14 20:26:28 enkx3db01 sshd[83425]: pam_unix(sshd:session): session opened for user oracle by (uid=0)
Oct 14 20:26:28 enkx3db01 sshd[83428]: Accepted password for oracle from 192.168.9.154 port 50362 ssh2
Oct 14 20:26:28 enkx3db01 sshd[83428]: pam_unix(sshd:session): session opened for user oracle by (uid=0)
Oct 14 20:26:29 enkx3db01 sshd[83430]: subsystem request for sftp
Oct 14 20:49:46 enkx3db01 sshd[107563]: Accepted password for root from 192.168.9.54 port 58377 ssh2
Oct 14 20:49:46 enkx3db01 sshd[107563]: pam_unix(sshd:session): session opened for user root by (uid=0)
Oct 14 20:57:57 enkx3db01 sshd[115745]: Accepted password for oracle from 192.168.9.145 port 57148 ssh2
Oct 14 20:57:57 enkx3db01 sshd[115745]: pam_unix(sshd:session): session opened for user oracle by (uid=0)
Oct 14 20:58:01 enkx3db01 su: pam_unix(su-l:session): session opened for user root by oracle(uid=500)

In addition to secure file, last command is frequently used to identify the login/logout history for a certain user. For example, to find out the login/logout history for oracle user, just run the following:
[root@enkx3db01 log]# last oracle
oracle pts/0 192.168.9.89 Mon Oct 14 15:35 still logged in
oracle pts/1 192.168.9.111 Mon Oct 14 15:16 still logged in
oracle pts/0 192.168.9.89 Mon Oct 14 15:11 – 15:35 (00:24)
oracle pts/5 192.168.9.245 Mon Oct 14 10:32 – crash (04:37)
oracle pts/3 192.168.9.89 Mon Oct 14 10:32 – crash (04:38)
oracle pts/0 192.168.9.89 Mon Oct 14 09:24 – crash (05:46)
oracle pts/5 192.168.9.233 Sun Oct 13 23:29 – 02:27 (02:58)
oracle pts/4 192.168.9.233 Sun Oct 13 23:28 – 02:08 (02:40)
oracle pts/2 192.168.9.233 Sun Oct 13 22:06 – 02:09 (04:02)
oracle pts/0 enk02.enkitec.co Sun Oct 13 21:26 – 01:26 (03:59)
oracle pts/4 192.168.9.233 Sun Oct 13 18:15 – crash (02:58)
oracle pts/2 192.168.9.233 Sun Oct 13 17:38 – crash (03:36)
oracle pts/0 192.168.9.233 Sun Oct 13 17:15 – crash (03:59)
oracle pts/8 192.168.9.245 Fri Oct 11 12:44 – 17:55 (05:11)
oracle pts/5 192.168.9.89 Fri Oct 11 09:42 – 18:08 (08:25)
oracle pts/2 192.168.9.89 Fri Oct 11 08:28 – 18:30 (10:01)
oracle pts/1 192.168.9.89 Fri Oct 11 08:28 – 18:31 (10:03)
oracle pts/0 192.168.9.89 Fri Oct 11 08:06 – 16:40 (08:34)
oracle pts/6 192.168.9.245 Thu Oct 10 15:30 – 18:43 (03:13)
oracle pts/5 192.168.9.245 Thu Oct 10 14:54 – 19:05 (04:11)
oracle pts/4 192.168.9.89 Thu Oct 10 07:54 – 17:03 (09:08)
oracle pts/1 192.168.9.89 Thu Oct 10 07:53 – 17:03 (09:10)
oracle pts/0 192.168.9.89 Thu Oct 10 07:51 – 17:03 (09:12)
oracle pts/1 192.168.9.234 Wed Oct 9 12:37 – 20:22 (07:45)
. . . .

Print Message buffer of kernel
The dmesg command is used to write the kernel messages to standard output (which by default is the display screen). The output of dmesg is maintained in the log file /var/log/dmesg file, and it can be easily viewed by vi or cat command.

[root@enkx3db01 log]# dmesg
. . . .
RPC: Registered udp transport module.
RPC: Registered tcp transport module.
RPC: Registered tcp NFSv4.1 backchannel transport module.
Slow work thread pool: Starting up
Slow work thread pool: Ready
FS-Cache: Loaded
FS-Cache: Netfs ‘nfs’ registered for caching
svc: failed to register lockdv1 RPC service (errno 97).
i2c /dev entries driver
megasas: fasync_helper was not called first
warning: `ntpdate’ uses 32-bit capabilities (legacy support in use)
eth0: no IPv6 routers present
mpt2sas version 12.105.11.00 loaded
RDS/IB: connected to 192.168.12.2 version 3.1
ib0: Unicast, no dst: type 0000, QPN 060000 0020:0800:1404:0001:8000:0048:fe80:0000
ib0: Unicast, no dst: type 0000, QPN 060000 0020:0800:1404:0001:8000:0048:fe80:0000
ib0: Unicast, no dst: type 0000, QPN 060000 0020:0800:1404:0001:8000:0048:fe80:0000
ib0: packet len 2144 (> 2048) too long to send, dropping
ib0: packet len 2144 (> 2048) too long to send, dropping
ib0: packet len 2144 (> 2048) too long to send, dropping
ib0: Unicast, no dst: type 0000, QPN 060000 0020:0800:1404:0001:8000:0048:fe80:0000
ib0: Unicast, no dst: type 0000, QPN 060000 0020:0800:1404:0001:8000:0048:fe80:0000
ib0: Unicast, no dst: type 0000, QPN 060000 0020:0800:1404:0001:8000:0048:fe80:0000
ib0: Unicast, no dst: type 0000, QPN 060000 0020:0800:1404:0001:8000:0048:fe80:0000
ib0: Unicast, no dst: type 0000, QPN 060000 0020:0800:1404:0002:0000:0000:0000:0000
RDS/IB: connected to 169.254.87.194 version 3.1
RDS/IB: connected to 169.254.87.194 version 3.1
RDS/IB: connected to 169.254.97.245 version 3.1

Disk controller firmware logs
[root@enkx3db01 log]# /opt/MegaRAID/MegaCli/MegaCli64 -fwtermlog -dsply -a0
Firmware Term Log Information on controller 0:
= 0 internalIndex = 0 externalIndex = ff
T4: setPhyMap: phy = 1 internalIndex = 0 externalIndex = ff
T4: setPhyMap: phy = 2 internalIndex = 0 externalIndex = ff
T4: setPhyMap: phy = 3 internalIndex = 0 externalIndex = ff
T4: setPhyMap: phy = 4 internalIndex = 1 externalIndex = ff
T4: setPhyMap: phy = 5 internalIndex = 1 externalIndex = ff
T4: setPhyMap: phy = 6 internalIndex = 1 externalIndex = ff
T4: setPhyMap: phy = 7 internalIndex = 1 externalIndex = ff
T4: setPhyMap: phy = 8 internalIndex = ff externalIndex = ff
T4: setPhyMap: phy = 9 internalIndex = ff externalIndex = ff
T4: setPhyMap: phy = a internalIndex = ff externalIndex = ff
T4: setPhyMap: phy = b internalIndex = ff externalIndex = ff
T4: setPhyMap: phy = c internalIndex = ff externalIndex = ff
T4: setPhyMap: phy = d internalIndex = ff externalIndex = ff
T4: setPhyMap: phy = e internalIndex = ff externalIndex = ff
T4: setPhyMap: phy = f internalIndex = ff externalIndex = ff
. . . .

10/14/13 15:19:06: InhibitCharging : i2cExpander_GPIOstatus = 0xde
10/14/13 15:19:06: i2BufferBatDis : i2cExpander_GPIOstatus = 0xce
10/14/13 15:19:06: EVT#06367-10/14/13 15:19:06: 151=Battery relearn started
10/14/13 15:20:11: EVT#06368-10/14/13 15:20:11: 148=Battery is discharging
10/14/13 15:20:11: EVT#06369-10/14/13 15:20:11: 152=Battery relearn in progress
10/14/13 16:32:46: i2BufferBatDis : i2cExpander_GPIOstatus = 0xde
10/14/13 16:32:46: Learn completed successfully
10/14/13 16:32:46: EVT#06370-10/14/13 16:32:46: 153=Battery relearn completed
10/14/13 16:32:46: Next Learn will start on 11 13 2013
10/14/13 16:32:46: Given BBUMode 7, is the same as Current BBUMode

10/14/13 16:32:46: *** BATTERY FEATURE PROPERTIES ***
10/14/13 16:32:46: _________________________________________________

10/14/13 16:32:46: Auto Learn Period : 30 days
10/14/13 16:32:46: Next Learn Time : 437675566
10/14/13 16:32:46: Battery ID : 414a28fe
10/14/13 16:32:46: bbuMode : 7
10/14/13 16:32:46: Delayed Learn Interval: 0 hours from scheduled time
10/14/13 16:32:46: AutoLearnMode : 0
10/14/13 16:32:46: Next Learn scheduled on: 11 13 2013 16:32:46
10/14/13 16:32:46: _________________________________________________

10/14/13 16:32:46: i2cEepromWriteEnable : i2cExpander_GPIOstatus = 0x9e
10/14/13 16:32:49: i2cEepromWriteEnable : i2cExpander_GPIOstatus = 0xde

Exit Code: 0x00

Cell Node

Cell alert logfile
There are two possible locations for the cell alert logfile and both are the same
/opt/oracle/cell/log/diag/asm/cell/{node name}/trace/alert.log or $CELLTRACE/alert.log
[root@enkx3cel01 trace]# ls -l /opt/oracle/cell/log/diag/asm/cell/enkx3cel01/trace/alert.log
-rw-rw—- 1 root celladmin 776278 Oct 14 18:16 /opt/oracle/cell/log/diag/asm/cell/enkx3cel01/trace/alert.log
[root@enkx3cel01 trace]# ls -l $CELLTRACE/alert.log
-rw-rw—- 1 root celladmin 776278 Oct 14 18:16 /opt/oracle/cell11.2.3.2.1_LINUX.X64_130109/log/diag/asm/cell/enkx3cel01/trace/alert.log

MS logfile
Similar MS logfile has two possible locations
[root@enkx3cel01 trace]# ls -l /opt/oracle/cell/log/diag/asm/cell/enkx3cel01/trace/ms-odl.log
-rw-r—– 1 root celladmin 1184111 Oct 14 19:35 /opt/oracle/cell/log/diag/asm/cell/enkx3cel01/trace/ms-odl.log
[root@enkx3cel01 trace]# ls -l $CELLTRACE/ms-odl.log
-rw-r—– 1 root celladmin 1184111 Oct 14 19:35 /opt/oracle/cell11.2.3.2.1_LINUX.X64_130109/log/diag/asm/cell/enkx3cel01/trace/ms-odl.log

Cell patching issues related logfiles
There are many cell patching related logfiles under /var/log/cellos directory

[root@enkx3cel01 cellos]# ls -lart /var/log/cellos
total 1708
-rw-r-----  1 root root       25726 Apr 29 12:41 do_image.sh.log
-rw-r-----  1 root root       67099 Apr 29 13:04 install.post.10commonos.install.sh.log
-rw-r-----  1 root root        5509 Apr 29 13:04 install.post.30kernel.install.sh.log
-rw-r-----  1 root root         957 Apr 29 13:05 install.post.40debugos.install.sh.log
-rw-r-----  1 root root       17907 Apr 29 13:05 install.post.50ofed.install.sh.log
-rw-r-----  1 root root         955 Apr 29 13:05 install.post.60cellrpms.install.sh.log
-rw-r-----  1 root root        5216 Apr 29 13:06 install.post.70sunutils.install.sh.log
-rw-r-----  1 root root         598 Apr 29 13:06 install.post.90last.install.sh.log
-rw-r-----  1 root root         323 Apr 29 13:07 install.post.status
-rw-r-----  1 root root        3288 Apr 29 13:07 cell.bin.install.log
-rw-r-----  1 root root          63 Apr 29 13:11 vldrun.reimage_boot.log
-rw-r-----  1 root root          59 Apr 29 13:38 vldrun.first_boot.log
drwxr-xr-x  2 root root        4096 Apr 29 13:38 validations
drwxr-xr-x  2 root root        4096 Apr 29 13:38 saved
drwxr-xr-x  2 root root        4096 Apr 29 14:18 sosreports
-rw-r-----  1 root root       20061 Jul 17 11:56 ipconf.log
-rw-r-----  1 root root       11754 Oct 14 13:10 cellFirstboot.log
-rw-r-----  1 root celladmin 408910 Oct 14 13:17 checkdeveachboot.log
-rw-r-----  1 root root         944 Oct 14 13:17 exachkcfg.log
-rw-r-----  1 root root         670 Oct 14 13:17 checkconfigs.alert.txt
-rw-r--r--  1 root root       85976 Oct 14 13:17 SerialNumbers
-rw-rw----  1 root celladmin  23069 Oct 14 13:17 CheckHWnFWProfile.log
-rw-r-----  1 root root         406 Oct 14 13:19 vldrun.each_boot.log
-rw-r-----  1 root root       58416 Oct 14 13:19 validations.log
-rw-r-----  1 root root      674195 Oct 14 13:19 checkconfigs_configuration_summaries.txt
drwxr-xr-x 12 root root        4096 Oct 14 13:19 ..
drwxr-xr-x  5 root root        4096 Oct 14 13:20 .
-rw-r-----  1 root root      138088 Oct 14 16:47 start_oswatcher.log

Among these logfiles, validations.log is a useful one
[root@enkx3cel01 cellos]# tail -50 validations.log
2013-10-05 04:02:18 vldrun.pl: Run validation oswatcher – 2013-10-05 04:02:18 vldrun.pl: [PASSED]
2013-10-06 04:02:19 vldrun.pl: Logging started to /var/log/cellos/validations.log
2013-10-06 04:02:19 vldrun.pl: Command line is /opt/oracle.cellos/validations/bin/vldrun.pl -script oswatcher
2013-10-06 04:02:19 vldrun.pl: Run validation oswatcher – 2013-10-06 04:02:19 vldrun.pl: [PASSED]
2013-10-07 04:02:19 vldrun.pl: Logging started to /var/log/cellos/validations.log
2013-10-07 04:02:19 vldrun.pl: Command line is /opt/oracle.cellos/validations/bin/vldrun.pl -script oswatcher
2013-10-07 04:02:19 vldrun.pl: Run validation oswatcher – 2013-10-07 04:02:19 vldrun.pl: [PASSED]
2013-10-08 04:02:18 vldrun.pl: Logging started to /var/log/cellos/validations.log
2013-10-08 04:02:18 vldrun.pl: Command line is /opt/oracle.cellos/validations/bin/vldrun.pl -script oswatcher
2013-10-08 04:02:18 vldrun.pl: Run validation oswatcher – 2013-10-08 04:02:18 vldrun.pl: [PASSED]
2013-10-09 04:02:19 vldrun.pl: Logging started to /var/log/cellos/validations.log
2013-10-09 04:02:19 vldrun.pl: Command line is /opt/oracle.cellos/validations/bin/vldrun.pl -script oswatcher
. . . .
2013-10-14 04:02:24 vldrun.pl: Logging started to /var/log/cellos/validations.log
2013-10-14 04:02:24 vldrun.pl: Command line is /opt/oracle.cellos/validations/bin/vldrun.pl -script oswatcher
2013-10-14 04:02:24 vldrun.pl: Run validation oswatcher – 2013-10-14 04:02:24 vldrun.pl: [PASSED]
2013-10-14 13:12:38 vldrun.pl: Logging started to /var/log/cellos/validations.log
2013-10-14 13:12:38 vldrun.pl: Command line is /opt/oracle.cellos/validations/bin/vldrun.pl -quiet -all
2013-10-14 13:12:38 vldrun.pl: Run validation ipmisettings – 2013-10-14 13:13:01 vldrun.pl: [PASSED]
2013-10-14 13:13:01 vldrun.pl: Run validation misceachboot – 2013-10-14 13:13:36 vldrun.pl: [PASSED]
2013-10-14 13:13:36 vldrun.pl: Run validation celldstatus – 2013-10-14 13:13:51 vldrun.pl: [PASSED]
2013-10-14 13:13:51 vldrun.pl: Run validation sosreport – 2013-10-14 13:13:51 vldrun.pl: [NOHUP RUN]
2013-10-14 13:13:51 vldrun.pl: Run validation oswatcher – 2013-10-14 13:13:51 vldrun.pl: [PASSED]
2013-10-14 13:13:51 vldrun.pl: Run validation biosbootorder – 2013-10-14 13:14:06 vldrun.pl: [PASSED]
2013-10-14 13:14:06 vldrun.pl: Run validation checkconfigs – 2013-10-14 13:19:53 vldrun.pl: [FAILED]
2013-10-14 13:19:53 vldrun.pl: Check log in /var/log/cellos/validations/checkconfigs.log
2013-10-14 13:19:53 vldrun.pl: Run validation saveconfig – 2013-10-14 13:19:53 vldrun.pl: [BACKGROUND RUN]

The followings are similar as the ones on database node and I am not going to repeat them to give more detail.
OS Watcher output files
/opt/oracle.oswatcher/osw/archive/

OS message logfile
/var/log/messages

Disk controller firmware logs
/opt/MegaRAID/MegaCli/MegaCli64 -fwtermlog -dsply -a0

For more information, you can also check Oracle’s document – Location of Different Logfiles in Exadata Environment (Doc ID 1326382.1)

Advertisements

Leave a Reply

Please log in using one of these methods to post your comment:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s