
I like good tools. CELLCLI is a useful tool for administering Exadata Storage Servers. Starting with Exadata Storage Server Release 12.1.2.1.0, there is another utility, DBMCLI, for configuring and monitoring Exadata Database Servers. DBMCLI replaces the /opt/oracle.cellos/compmon/exadata_mon_hw_asr.pl Perl script, and its usage is similar to that of CELLCLI.
With DBMCLI, we can start and stop services, list alert history, configure SMTP, and monitor hardware components. Here are a few examples run on our X3 box in the lab. I removed some repetitive content from the output to save space.
[root@enkx3db01 ~]# imageinfo
Kernel version: 2.6.39-400.248.3.el6uek.x86_64 #1 SMP Wed Mar 11 18:04:34 PDT 2015 x86_64
Image version: 12.1.2.1.1.150316.2
Image activated: 2015-05-13 07:35:16 -0500
Image status: success
System partition on device: /dev/mapper/VGExaDb-LVDbSys2
[root@enkx3db01 ~]# dbmcli
DBMCLI: Release - Production on Sat Oct 03 07:55:10 CDT 2015
Copyright (c) 2007, 2014, Oracle. All rights reserved.
DBMCLI> list dbserver
enkx3db01 online
DBMCLI> list dbserver detail
name: enkx3db01
bbuStatus: normal
coreCount: 16
cpuCount: 32
diagHistoryDays: 7
fanCount: 16/16
fanStatus: normal
id: 1302FML051
interconnectCount: 2
ipaddress1: 192.168.12.1/24
kernelVersion: 2.6.39-400.248.3.el6uek.x86_64
locatorLEDStatus: off
makeModel: Oracle Corporation SUN FIRE X4170 M3
metricHistoryDays: 7
msVersion: OSS_12.1.2.1.1_LINUX.X64_150316.2
powerCount: 2/2
powerStatus: normal
releaseImageStatus: success
releaseVersion: 12.1.2.1.1.150316.2
releaseTrackingBug: 20240049
status: online
temperatureReading: 18.0
temperatureStatus: normal
upTime: 108 days, 16:29
msStatus: running
rsStatus: running
DBMCLI> list physicaldisk
252:0 NLV9ZD normal
252:1 NGXRZF normal
252:2 NLV1JD normal
252:3 NH06DD normal
DBMCLI> list physicaldisk detail
name: 252:0
deviceId: 11
diskType: HardDisk
enclosureDeviceId: 252
errMediaCount: 0
errOtherCount: 0
makeModel: "HITACHI H106030SDSUN300G"
physicalFirmware: A3D0
physicalInsertTime: 2015-05-13T07:32:53-05:00
physicalInterface: sas
physicalSerial: NLV9ZD
physicalSize: 279.39677238464355G
slotNumber: 0
status: normal
name: 252:1
deviceId: 10
diskType: HardDisk
enclosureDeviceId: 252
errMediaCount: 0
errOtherCount: 0
makeModel: "HITACHI H106030SDSUN300G"
physicalFirmware: A3D0
physicalInsertTime: 2015-05-13T07:32:53-05:00
physicalInterface: sas
physicalSerial: NGXRZF
physicalSize: 279.39677238464355G
slotNumber: 1
status: normal
name: 252:2
deviceId: 9
diskType: HardDisk
enclosureDeviceId: 252
errMediaCount: 0
errOtherCount: 0
makeModel: "HITACHI H106030SDSUN300G"
physicalFirmware: A3D0
physicalInsertTime: 2015-05-13T07:32:53-05:00
physicalInterface: sas
physicalSerial: NLV1JD
physicalSize: 279.39677238464355G
slotNumber: 2
status: normal
name: 252:3
deviceId: 8
diskType: HardDisk
enclosureDeviceId: 252
errMediaCount: 0
errOtherCount: 0
makeModel: "HITACHI H106030SDSUN300G"
physicalFirmware: A3D0
physicalInsertTime: 2015-05-13T07:32:53-05:00
physicalInterface: sas
physicalSerial: NH06DD
physicalSize: 279.39677238464355G
slotNumber: 3
status: normal
DBMCLI> list lun
0_0 0_0 normal
DBMCLI> list lun detail
name: 0_0
diskType: HardDisk
id: 0_0
lunSize: 835.3940000003204G
lunUID: 0_0
raidLevel: 5
lunWriteCacheMode: "WriteBack, ReadAheadNone, Direct, No Write Cache if Bad BBU"
status: normal
DBMCLI> list ibport
HCA-1:1 Active
HCA-1:2 Active
DBMCLI> list ibport detail
name: HCA-1:1
activeSlave: TRUE
dataRate: "40 Gbps"
hcaFWVersion: 2.11.2012
id: 0x0010e0000128ce65
lid: 45
linkDowned: 0
linkIntegrityErrs: 0
linkRecovers: 0
physLinkState: LinkUp
portNumber: 1
rcvConstraintErrs: 0
rcvData: 1448988236094
rcvErrs: 0
rcvRemotePhysErrs: 0
status: Active
symbolErrs: 0
vl15Dropped: 0
xmtConstraintErrs: 0
xmtData: 237626900021
xmtDiscards: 0
name: HCA-1:2
activeSlave: FALSE
dataRate: "40 Gbps"
hcaFWVersion: 2.11.2012
id: 0x0010e0000128ce66
lid: 46
linkDowned: 0
linkIntegrityErrs: 0
linkRecovers: 0
physLinkState: LinkUp
portNumber: 2
rcvConstraintErrs: 0
rcvData: 22172909573667
rcvErrs: 0
rcvRemotePhysErrs: 0
status: Active
symbolErrs: 0
vl15Dropped: 0
xmtConstraintErrs: 0
xmtData: 18855706123959
xmtDiscards: 0
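The detail listings above all share the same "attribute: value" layout, with each object starting at its name attribute. As a rough sketch of how such output could be post-processed, the Python below splits a captured listing into one dictionary per object. The function name parse_detail and the shortened sample are my own illustration, not part of DBMCLI, and the sketch assumes every attribute fits on one line.

```python
def parse_detail(text):
    """Split 'list ... detail' style output into one dict per object.

    Assumption: each object begins with a 'name:' line and every
    attribute occupies a single 'key: value' line.
    """
    records = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or ":" not in line:
            continue
        # Split on the first colon only, so values such as 252:0 stay intact.
        key, _, value = line.partition(":")
        key = key.strip()
        value = value.strip().strip('"')
        if key == "name":
            records.append({})  # a new object starts here
        if records:
            records[-1][key] = value
    return records


# Shortened sample taken from the 'list physicaldisk detail' output.
sample = """\
name: 252:0
deviceId: 11
makeModel: "HITACHI H106030SDSUN300G"
physicalSerial: NLV9ZD
status: normal
name: 252:1
deviceId: 10
makeModel: "HITACHI H106030SDSUN300G"
physicalSerial: NGXRZF
status: normal
"""

disks = parse_detail(sample)
for disk in disks:
    print(disk["name"], disk["physicalSerial"], disk["status"])
```

The same function should work on the dbserver, lun, and ibport detail output, since they follow the same layout.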
Here is the result from list alerthistory.
DBMCLI> list alerthistory
5_1 2015-06-16T12:39:32-05:00 critical "A power supply component is suspected of causing a fault with a 100 certainty. Component Name : /SYS/PS1 Fault class : fault.chassis.power.ext-fail Fault message : http://www.sun.com/msg/SPX86-8003-73"
5_2 2015-07-15T15:59:20-05:00 clear "A power supply component fault has been cleared. Component Name : /SYS/PS1 Trap Additional Info : fault.chassis.power.ext-fail"
6_1 2015-06-25T19:36:44-05:00 critical "File system "/" is 80% full, which is above the 80% threshold. Accelerated space reclamation has started. This alert will be cleared when file system "/" becomes less than 75% full. Top three directories ordered by total space usage are as follows: /opt : 7.79G /home : 7.38G /usr : 3.06G"
7_1 2015-07-06T14:33:51-05:00 info "The HDD disk controller battery is performing an unscheduled learn cycle. All disk drives have been placed in WriteThrough caching mode. The flash drives are not affected. Battery Serial Number : 10494 Battery Type : ibbu08 Battery Temperature : 39 C Full Charge Capacity : 1368 mAh Relative Charge : 98% Ambient Temperature : 19 C"
7_2 2015-07-06T15:45:31-05:00 clear "All disk drives are in WriteBack caching mode. Battery Serial Number : 10494 Battery Type : ibbu08 Battery Temperature : 39 C Full Charge Capacity : 1368 mAh Relative Charge : 72% Ambient Temperature : 17 C"
8_1 2015-07-15T15:59:35-05:00 critical "A power supply component is suspected of causing a fault with a 100 certainty. Component Name : /SYS/PS1 Fault class : fault.chassis.power.ext-fail Fault message : http://www.sun.com/msg/SPX86-8003-73"
8_2 2015-07-15T16:01:37-05:00 clear "A power supply component fault has been cleared. Component Name : /SYS/PS1 Trap Additional Info : fault.chassis.power.ext-fail"
9_1 2015-08-04T22:04:16-05:00 critical "File system "/u01" is 80% full, which is above the 80% threshold. This alert will be cleared when file system "/u01" becomes less than 75% full. Top three directories ordered by total space usage are as follows: /u01/app : 147.43G /u01/lost+found : 16K /u01/stage : 4K"
If you want to see the details of the alert history, run the following.
DBMCLI> list alerthistory detail
name: 5_1
alertDescription: "A power supply component suspected of causing a fault"
alertMessage: "A power supply component is suspected of causing a fault with a 100 certainty. Component Name : /SYS/PS1 Fault class : fault.chassis.power.ext-fail Fault message : http://www.sun.com/msg/SPX86-8003-73"
alertSequenceID: 5
alertShortName: Hardware
alertType: Stateful
beginTime: 2015-06-16T12:39:32-05:00
endTime: 2015-07-15T15:59:20-05:00
examinedBy:
metricObjectName: /SYS/PS1_FAULT
notificationState: 0
sequenceBeginTime: 2015-06-16T12:39:32-05:00
severity: critical
alertAction: "For additional information, please refer to http://www.sun.com/msg/SPX86-8003-73"
name: 5_2
alertDescription: "A power supply component fault cleared"
alertMessage: "A power supply component fault has been cleared. Component Name : /SYS/PS1 Trap Additional Info : fault.chassis.power.ext-fail"
alertSequenceID: 5
alertShortName: Hardware
alertType: Stateful
beginTime: 2015-07-15T15:59:20-05:00
endTime: 2015-07-15T15:59:20-05:00
examinedBy:
metricObjectName: /SYS/PS1_FAULT
notificationState: 0
sequenceBeginTime: 2015-06-16T12:39:32-05:00
severity: clear
alertAction: Informational.
name: 6_1
alertDescription: "File system "/" is 80% full"
alertMessage: "File system "/" is 80% full, which is above the 80% threshold. Accelerated space reclamation has started. This alert will be cleared when file system "/" becomes less than 75% full. Top three directories ordered by total space usage are as follows: /opt : 7.79G /home : 7.38G /usr : 3.06G"
alertSequenceID: 6
alertShortName: Software
alertType: Stateful
beginTime: 2015-06-25T19:36:44-05:00
examinedBy:
metricObjectName: /
notificationState: 0
sequenceBeginTime: 2015-06-25T19:36:44-05:00
severity: critical
alertAction: "MS includes a file deletion policy that is triggered when file system utilitization is high. Deletion of files is triggered when file utilization reaches 80%. For the / file system, 1) files in metric history directory will be deleted using a policy based on the file modification time stamp. Files older than the number of days set by the metricHistoryDays attribute value will be deleted first, then successive deletions will occur for earlier files, down to files with modification time stamps older than or equal to 10 minutes, or until file system utilization is less than 75%. 2) files in the ADR base directory and LOG_HOME directory will be deleted using a policy based on the file modification time stamp. Files older than the number of days set by the diagHistoryDays attribute value will be deleted first, then successive deletions will occur for earlier files, down to files with modification time stamps older than or equal to 10 minutes, or until file system utilization is less than 75%. The renamed alert.log files and ms-odl generation files that are over 5 MB, and older than the successively-shorter age intervals are also deleted. Crash files that are over 5 MB and older than one day will be deleted.Try to delete more recent files, or files not being automatically purged, to free up space if needed."
. . . .
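If you just want to see the critical alerts, first use describe to find the attribute names, then filter on severity. Note that alertType holds values such as Stateful, so filtering it for 'critical' returns nothing.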
DBMCLI> describe alerthistory
name
alertDescription
alertMessage
alertSequenceID
alertShortName
alertType
beginTime
endTime
examinedBy modifiable
failedMail
failedSNMP
metricObjectName
metricValue
notificationState
sequenceBeginTime
severity
alertAction
DBMCLI> list alerthistory where alertType='critical';
DBMCLI> list alerthistory where severity='critical'
5_1 2015-06-16T12:39:32-05:00 critical "A power supply component is suspected of causing a fault with a 100 certainty. Component Name : /SYS/PS1 Fault class : fault.chassis.power.ext-fail Fault message : http://www.sun.com/msg/SPX86-8003-73"
6_1 2015-06-25T19:36:44-05:00 critical "File system "/" is 80% full, which is above the 80% threshold. Accelerated space reclamation has started. This alert will be cleared when file system "/" becomes less than 75% full. Top three directories ordered by total space usage are as follows: /opt : 7.79G /home : 7.38G /usr : 3.06G"
8_1 2015-07-15T15:59:35-05:00 critical "A power supply component is suspected of causing a fault with a 100 certainty. Component Name : /SYS/PS1 Fault class : fault.chassis.power.ext-fail Fault message : http://www.sun.com/msg/SPX86-8003-73"
9_1 2015-08-04T22:04:16-05:00 critical "File system "/u01" is 80% full, which is above the 80% threshold. This alert will be cleared when file system "/u01" becomes less than 75% full. Top three directories ordered by total space usage are as follows: /u01/app : 147.43G /u01/lost+found : 16K /u01/stage : 4K"
DBMCLI> list alerthistory where severity='critical' and beginTime > '2015-08-01T01:00:01-07:00'
9_1 2015-08-04T22:04:16-05:00 critical "File system "/u01" is 80% full, which is above the 80% threshold. This alert will be cleared when file system "/u01" becomes less than 75% full. Top three directories ordered by total space usage are as follows: /u01/app : 147.43G /u01/lost+found : 16K /u01/stage : 4K"
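If the alert history has already been saved to a file, the same severity and beginTime filter can be reproduced offline. The Python below is only a sketch of that idea: parse_alert is a hypothetical helper of mine, the messages are shortened from the output above, and it assumes each alert is a single line of the form name, timestamp, severity, quoted message.

```python
from datetime import datetime


def parse_alert(line):
    """Parse one 'list alerthistory' line into a dict.

    Assumption: the first three whitespace-separated fields are the
    name, beginTime, and severity; the rest is the quoted message.
    """
    name, begin, severity, message = line.split(None, 3)
    return {
        "name": name,
        "beginTime": datetime.fromisoformat(begin),  # keeps the UTC offset
        "severity": severity,
        "message": message.strip().strip('"'),
    }


# Messages shortened from the listing above for readability.
lines = [
    '5_1 2015-06-16T12:39:32-05:00 critical "A power supply component is suspected of causing a fault"',
    '5_2 2015-07-15T15:59:20-05:00 clear "A power supply component fault has been cleared."',
    '9_1 2015-08-04T22:04:16-05:00 critical "File system "/u01" is 80% full"',
]

alerts = [parse_alert(line) for line in lines]

# Same condition as: where severity='critical' and beginTime > '2015-08-01T01:00:01-07:00'
cutoff = datetime.fromisoformat("2015-08-01T01:00:01-07:00")
recent_critical = [
    a for a in alerts
    if a["severity"] == "critical" and a["beginTime"] > cutoff
]

for alert in recent_critical:
    print(alert["name"], alert["beginTime"].isoformat(), alert["message"])
```

Because the timestamps carry their UTC offsets, the comparison stays correct even though the cutoff is written in -07:00 and the alerts in -05:00.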
We can also look at the metric history.
DBMCLI> list metrichistory
DS_TEMP enkx3db01 18.0 C 2015-10-03T07:00:39-05:00
DS_FANS enkx3db01 16 2015-10-03T07:00:40-05:00
DS_BBU_TEMP enkx3db01 35.0 C 2015-10-03T07:00:41-05:00
DS_CPUT enkx3db01 2.6 % 2015-10-03T07:00:41-05:00
DS_CPUT_MS enkx3db01 0.0 % 2015-10-03T07:00:41-05:00
DS_FSUT / 87 % 2015-10-03T07:00:41-05:00
DS_FSUT /boot 7 % 2015-10-03T07:00:41-05:00
DS_FSUT /mnt/oldroot 80 % 2015-10-03T07:00:41-05:00
DS_FSUT /u01 87 % 2015-10-03T07:00:41-05:00
DS_MEMUT enkx3db01 82 % 2015-10-03T07:00:41-05:00
DS_MEMUT_MS enkx3db01 0.4 % 2015-10-03T07:00:41-05:00
DS_MEMUT_MS enkx3db01 0.4 % 2015-10-03T07:00:41-05:00
DS_RUNQ enkx3db01 1.6 2015-10-03T07:00:41-05:00
DS_SWAP_IN_BY_SEC enkx3db01 4.7 KB/sec 2015-10-03T07:00:41-05:00
DS_SWAP_OUT_BY_SEC enkx3db01 0.0 KB/sec 2015-10-03T07:00:41-05:00
DS_SWAP_USAGE enkx3db01 2 % 2015-10-03T07:00:41-05:00
DS_VIRTMEM_MS enkx3db01 4,708 MB 2015-10-03T07:00:41-05:00
DS_VIRTMEM_MS enkx3db01 4,708 MB 2015-10-03T07:00:41-05:00
N_HCA_MB_RCV_SEC enkx3db01 1.818 MB/sec 2015-10-03T07:00:42-05:00
N_HCA_MB_TRANS_SEC enkx3db01 2.075 MB/sec 2015-10-03T07:00:42-05:00
N_IB_MB_RCV_SEC HCA-1:1 0.233 MB/sec 2015-10-03T07:00:42-05:00
N_IB_MB_RCV_SEC HCA-1:2 1.585 MB/sec 2015-10-03T07:00:42-05:00
N_IB_MB_TRANS_SEC HCA-1:1 0.216 MB/sec 2015-10-03T07:00:42-05:00
N_IB_MB_TRANS_SEC HCA-1:2 1.859 MB/sec 2015-10-03T07:00:42-05:00
N_IB_UTIL_RCV HCA-1:1 0.0 % 2015-10-03T07:00:42-05:00
N_IB_UTIL_RCV HCA-1:2 0.0 % 2015-10-03T07:00:42-05:00
N_IB_UTIL_TRANS HCA-1:1 0.0 % 2015-10-03T07:00:42-05:00
N_IB_UTIL_TRANS HCA-1:2 0.1 % 2015-10-03T07:00:42-05:00
N_NIC_KB_RCV_SEC enkx3db01 0.5 KB/sec 2015-10-03T07:00:42-05:00
N_NIC_KB_TRANS_SEC enkx3db01 0.4 KB/sec 2015-10-03T07:00:42-05:00
DS_TEMP enkx3db01 18.0 C 2015-10-03T07:01:39-05:00
DS_FANS enkx3db01 16 2015-10-03T07:01:40-05:00
DS_BBU_TEMP enkx3db01 35.0 C 2015-10-03T07:01:41-05:00
DS_CPUT enkx3db01 3.5 % 2015-10-03T07:01:41-05:00
DS_CPUT_MS enkx3db01 0.0 % 2015-10-03T07:01:41-05:00
DS_FSUT / 87 % 2015-10-03T07:01:41-05:00
DS_FSUT /boot 7 % 2015-10-03T07:01:41-05:00
DS_FSUT /mnt/oldroot 80 % 2015-10-03T07:01:41-05:00
DS_FSUT /u01 87 % 2015-10-03T07:01:41-05:00
DS_MEMUT enkx3db01 82 % 2015-10-03T07:01:41-05:00
DS_MEMUT_MS enkx3db01 0.4 % 2015-10-03T07:01:41-05:00
DS_MEMUT_MS enkx3db01 0.4 % 2015-10-03T07:01:41-05:00
DS_RUNQ enkx3db01 1.3 2015-10-03T07:01:41-05:00
DS_SWAP_IN_BY_SEC enkx3db01 0.0 KB/sec 2015-10-03T07:01:41-05:00
DS_SWAP_OUT_BY_SEC enkx3db01 0.0 KB/sec 2015-10-03T07:01:41-05:00
DS_SWAP_USAGE enkx3db01 2 % 2015-10-03T07:01:41-05:00
DS_VIRTMEM_MS enkx3db01 4,708 MB 2015-10-03T07:01:41-05:00
DS_VIRTMEM_MS enkx3db01 4,708 MB 2015-10-03T07:01:41-05:00
. . . .
DS_TEMP enkx3db01 18.0 C 2015-10-03T08:01:39-05:00
DS_FANS enkx3db01 16 2015-10-03T08:01:40-05:00
DS_BBU_TEMP enkx3db01 35.0 C 2015-10-03T08:01:41-05:00
DS_CPUT enkx3db01 3.8 % 2015-10-03T08:01:41-05:00
DS_CPUT_MS enkx3db01 0.0 % 2015-10-03T08:01:41-05:00
DS_FSUT / 87 % 2015-10-03T08:01:41-05:00
DS_FSUT /boot 7 % 2015-10-03T08:01:41-05:00
DS_FSUT /mnt/oldroot 80 % 2015-10-03T08:01:41-05:00
DS_FSUT /u01 87 % 2015-10-03T08:01:41-05:00
DS_MEMUT enkx3db01 82 % 2015-10-03T08:01:41-05:00
DS_MEMUT_MS enkx3db01 0.4 % 2015-10-03T08:01:41-05:00
DS_MEMUT_MS enkx3db01 0.4 % 2015-10-03T08:01:41-05:00
DS_RUNQ enkx3db01 1.5 2015-10-03T08:01:41-05:00
DS_SWAP_IN_BY_SEC enkx3db01 0.0 KB/sec 2015-10-03T08:01:41-05:00
DS_SWAP_OUT_BY_SEC enkx3db01 0.0 KB/sec 2015-10-03T08:01:41-05:00
DS_SWAP_USAGE enkx3db01 2 % 2015-10-03T08:01:41-05:00
DS_VIRTMEM_MS enkx3db01 4,708 MB 2015-10-03T08:01:41-05:00
DS_VIRTMEM_MS enkx3db01 4,708 MB 2015-10-03T08:01:41-05:00
N_HCA_MB_RCV_SEC enkx3db01 0.833 MB/sec 2015-10-03T08:01:43-05:00
N_HCA_MB_TRANS_SEC enkx3db01 0.708 MB/sec 2015-10-03T08:01:43-05:00
N_IB_MB_RCV_SEC HCA-1:1 0.095 MB/sec 2015-10-03T08:01:43-05:00
N_IB_MB_RCV_SEC HCA-1:2 0.737 MB/sec 2015-10-03T08:01:43-05:00
N_IB_MB_TRANS_SEC HCA-1:1 0.081 MB/sec 2015-10-03T08:01:43-05:00
N_IB_MB_TRANS_SEC HCA-1:2 0.627 MB/sec 2015-10-03T08:01:43-05:00
N_IB_UTIL_RCV HCA-1:1 0.0 % 2015-10-03T08:01:43-05:00
N_IB_UTIL_RCV HCA-1:2 0.0 % 2015-10-03T08:01:43-05:00
N_IB_UTIL_TRANS HCA-1:1 0.0 % 2015-10-03T08:01:43-05:00
N_IB_UTIL_TRANS HCA-1:2 0.0 % 2015-10-03T08:01:43-05:00
N_NIC_KB_RCV_SEC enkx3db01 0.6 KB/sec 2015-10-03T08:01:43-05:00
N_NIC_KB_TRANS_SEC enkx3db01 0.4 KB/sec 2015-10-03T08:01:43-05:00
If you just want to know the average number of processes in the run queue, run the following.
DBMCLI> list metrichistory DS_RUNQ
DS_RUNQ enkx3db01 1.5 2015-10-04T08:00:41-05:00
DS_RUNQ enkx3db01 1.3 2015-10-04T08:01:42-05:00
DS_RUNQ enkx3db01 1.3 2015-10-04T08:02:41-05:00
DS_RUNQ enkx3db01 1.6 2015-10-04T08:03:41-05:00
DS_RUNQ enkx3db01 1.6 2015-10-04T08:04:41-05:00
DS_RUNQ enkx3db01 1.7 2015-10-04T08:05:41-05:00
DS_RUNQ enkx3db01 1.8 2015-10-04T08:06:41-05:00
DS_RUNQ enkx3db01 1.4 2015-10-04T08:07:41-05:00
DS_RUNQ enkx3db01 1.3 2015-10-04T08:08:41-05:00
DS_RUNQ enkx3db01 1.7 2015-10-04T08:09:41-05:00
DS_RUNQ enkx3db01 1.6 2015-10-04T08:10:41-05:00
DS_RUNQ enkx3db01 1.6 2015-10-04T08:11:41-05:00
DS_RUNQ enkx3db01 1.7 2015-10-04T08:12:41-05:00
DS_RUNQ enkx3db01 2.2 2015-10-04T08:13:41-05:00
DS_RUNQ enkx3db01 1.8 2015-10-04T08:14:42-05:00
DS_RUNQ enkx3db01 1.6 2015-10-04T08:15:42-05:00
DS_RUNQ enkx3db01 1.4 2015-10-04T08:16:41-05:00
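Each DS_RUNQ line is one per-minute sample, so a quick summary over a time window has to be computed outside DBMCLI. The sketch below does that in Python for a few of the lines above. The helper parse_metric is hypothetical, and it assumes the layout seen in this output: metric name, object name, value, an optional unit, and the timestamp last.

```python
from statistics import mean


def parse_metric(line):
    """Parse one 'list metrichistory' line.

    Assumption: fields are metric name, object name, value,
    an optional unit (such as % or KB/sec), and the timestamp last.
    """
    parts = line.split()
    return {
        "metric": parts[0],
        "object": parts[1],
        "value": float(parts[2].replace(",", "")),  # handles values like 4,708
        "unit": " ".join(parts[3:-1]),
        "timestamp": parts[-1],
    }


# Five consecutive samples copied from the DS_RUNQ listing above.
lines = [
    "DS_RUNQ enkx3db01 1.6 2015-10-04T08:10:41-05:00",
    "DS_RUNQ enkx3db01 1.6 2015-10-04T08:11:41-05:00",
    "DS_RUNQ enkx3db01 1.7 2015-10-04T08:12:41-05:00",
    "DS_RUNQ enkx3db01 2.2 2015-10-04T08:13:41-05:00",
    "DS_RUNQ enkx3db01 1.8 2015-10-04T08:14:42-05:00",
]

samples = [parse_metric(line) for line in lines]
values = [s["value"] for s in samples]
peak = max(samples, key=lambda s: s["value"])

print(f"samples: {len(values)}")
print(f"mean run queue: {mean(values):.2f}")
print(f"peak run queue: {peak['value']} at {peak['timestamp']}")
```

The same parser should also cope with metrics that carry a unit, such as DS_CPUT or N_HCA_MB_RCV_SEC, since the unit is simply whatever sits between the value and the timestamp.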
Similarly, we can check the history of other metrics, such as:
DS_MEMUT: The percentage of total physical memory used on the server.
DS_SWAP_IN_BY_SEC: The number of swap pages read in KB per second.
DS_SWAP_OUT_BY_SEC: The number of swap pages written in KB per second.
DS_SWAP_USAGE: The percentage of swap space used.
N_HCA_MB_RCV_SEC: The number of MB received by the InfiniBand interfaces per second.
N_HCA_MB_TRANS_SEC: The number of MB transmitted by the InfiniBand interfaces per second.
N_NIC_KB_RCV_SEC: The number of KB received by the Ethernet interfaces per second.
N_NIC_KB_TRANS_SEC: The number of KB transmitted by the Ethernet interfaces per second.
Another useful feature is that DBMCLI can configure email notifications for the database server. I did not perform the following steps myself; the examples come from the Oracle documentation.
Configure SMTP on database server
DBMCLI> ALTER DBSERVER smtpServer='my_mail.example.com', -
smtpFromAddr='john.doe@example.com', -
smtpFrom='John Doe', -
smtpToAddr='jane.smith@example.com', -
snmpSubscriber=((host=host1),(host=host2)), -
notificationPolicy='clear', -
notificationMethod='mail,snmp'
Validate e-mail on a database server
DBMCLI> ALTER DBSERVER VALIDATE MAIL
Change the email format
DBMCLI> ALTER DBSERVER emailFormat='text'
DBMCLI> ALTER DBSERVER emailFormat='html'
For more details about DBMCLI, see the documentation at http://docs.oracle.com/cd/E50790_01/doc/doc.121/e51951/app_dbmcli.htm#DBMMN22053.