Use OEM 13c R2 to Discover Oracle BDA

OEM 13c Cloud Control is a powerful monitoring tool, not only for Exadata and Oracle Database, but also for Oracle Big Data Appliance (BDA). There are many articles and blogs about Exadata Discovery using OEM 12c or 13c, but few discuss OEM BDA Discovery, especially with the new version of OEM, 13c Cloud Control. In this blog, I am going to discuss the steps to discover BDA using OEM 13c R2.

First, do not use OEM 13c R1 for BDA Discovery. It is very time consuming and very likely not going to work. OEM 13c R2 is much better; at least I have been able to complete the BDA Discovery on all of the BDAs I have worked on.

Secondly, unlike OEM Exadata Discovery, BDA Discovery usually requires one extra step before the manual OEM BDA Discovery: running the bdacli enable em command first. Theoretically, if it works, I don’t need to do anything in the manual BDA discovery process. Unfortunately, I have never run into this perfect situation in any BDA environment and always get some kind of error at the end.

There are a few useful notes about OEM BDA Discovery.
1) Instructions to Install BDA Plug-in on Oracle Big Data Appliance (BDA) V2.*/V3.0.*/V3.1/V4.* (Doc ID 1682558.1)
2) BDA Credentials for Enterprise Manager 13.x Plugin (Doc ID 2206111.1)
3) Instructions to Enable / Disable the 13.x BDA Enterprise Manager Plug-in on Oracle Big Data Appliance (BDA) V4.5-V4.7 (Doc ID 2206207.1)

Execute bdacli command
Run bdacli enable em. For BDA versions below 4.5, run bdacli enable em --force. I can almost guarantee you won’t see a successful completion message from this command. For example, I got the following error at the end.

INFO: Running: /opt/oracle/emcli_home/emcli discover_bda_cluster -hostname=enkx4bda1node01.enkitec.local -cloudera_credential=BDA_ENKX4BDA_CM_CRED -host_credential=BDA_ENKX4BDA_HOSTS_CRED -cisco_credential=BDA_ENKX4BDA_CISCO_CRED -ilom_credential=BDA_ENKX4BDA_ILOM_CRED -infiniband_credential=BDA_ENKX4BDA_IB_CRED -pdu_credential=BDA_ENKX4BDA_PDU_CRED -cisco_snmp_string="snmp_v3;;SNMPV3Creds;authUser:none;authPwd:none;authProtocol:none;privPwd:none" -pdu_snmp_string="snmp_v1v2_v3;;SNMPV1Creds;COMMUNITY:none" -switch_snmp_string="snmp_v1v2_v3;;SNMPV3Creds;authUser:none;authPwd:none;authProtocol:none;privPwd:none"
ERROR: Syntax Error: Unrecognized argument -cisco_snmp_string #Step Syntax Error: Unrecognized argument -pdu_snmp_string#
Are you sure you want to completely cleanup em and lose all related state ?

When you see the above message, always type N and do not roll back the changes. Basically, the OEM agent is already deployed; you just need to figure out which node to use as the starting point for the manual OEM BDA Discovery.

On each node, run the following command:

[root@enkx4bda1node06 ~]# java -classpath /opt/oracle/EMAgent/agent_13.*:/opt/oracle/EMAgent/agent_13.* oracle.sysman.bda.discovery.pojo.GetHadoopClusters http://enkx4bda1node03.enkitec.local:7180/api/v1/clusters admin admin_password

On many nodes, the command fails with the error below.

Apr 10, 2017 10:14:44 AM com.sun.jersey.api.client.ClientResponse getEntity
SEVERE: A message body reader for Java class [Loracle.sysman.bda.discovery.pojo.Items;, and Java type class [Loracle.sysman.bda.discovery.pojo.Items;, and MIME media type text/html was not found
Apr 10, 2017 10:14:44 AM com.sun.jersey.api.client.ClientResponse getEntity
SEVERE: The registered message body readers compatible with the MIME media type are:
*/* ->

On one of the nodes, the command completes successfully instead of raising this error.


In my case, it is node 2, so I will use node 2 for the manual BDA Discovery in the following steps.
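If you capture the output of the probe command from every node, picking the candidate node can be scripted. The following is only a minimal sketch: the node names, the placeholder text for a successful run, and the rule "any line starting with SEVERE: means failure" are my assumptions for illustration, not something the BDA tooling documents.

```python
def probe_failed(output: str) -> bool:
    """Assumption: a probe failed if any output line starts with 'SEVERE:'."""
    return any(line.startswith("SEVERE:") for line in output.splitlines())


# Failing output, shortened from the error shown earlier in this post.
FAILED_SAMPLE = """\
Apr 10, 2017 10:14:44 AM com.sun.jersey.api.client.ClientResponse getEntity
SEVERE: A message body reader for Java class [Loracle.sysman.bda.discovery.pojo.Items;, and Java type class [Loracle.sysman.bda.discovery.pojo.Items;, and MIME media type text/html was not found
"""

# Hypothetical captured output per node. The text for the successful
# run is a placeholder, not real output from the discovery class.
outputs = {
    "node01": FAILED_SAMPLE,
    "node02": "placeholder for output without SEVERE lines",
    "node03": FAILED_SAMPLE,
}

# Nodes whose probe did not log a SEVERE line are discovery candidates.
candidates = [node for node, out in outputs.items() if not probe_failed(out)]
print(candidates)
```

Any node left in candidates can serve as the starting point for the manual discovery.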

Manual OEM BDA Discovery
Log on to OEM as the sysman user. Select Add Target -> Add Targets Manually.

Select Add Targets Using Guided Process

Select Oracle Big Data Appliance

The Add Targets Manually page shows up. Select node2 from the list. Click Next.

After it completes, it shows the discovered hardware information. Click Next.

The Hardware Credentials screen shows up. If all Host credentials show a green sign, you don’t need to do anything for Host. Go to the next one, for example, IB Switch. Select Set Credentials -> All Infiniband Switches, then set the SNMP credential type and community string. Most of the time, the community string is public. Then click OK.

If successful, it shows the green check.

Follow a similar procedure for all other hardware components, such as ILOM, PDU and Cisco Switch. At the end, the screen lists the credential status of every component.
One interesting note about the PDU: the PDU component always behaves in a weird way during the discovery. In this case, it showed a green check, but later on OEM showed the PDUs in DOWN status after the discovery. In my other discovery work on different BDA environments, the green check never showed up on this page, but the PDUs showed UP status after the discovery. So the result is inconsistent.

Click Next. The screen for Cloudera Manager shows up. Click Edit and verify the credentials of the admin user for Cloudera Manager. Then click Next.

The Software page shows up. Click Next.

The Review page shows up. Click Submit.

If successful, you will see a confirmation message. Click OK.

The BDA Discovery is completed.
You might notice the new BDA cluster is called BDA Network 1. This is not a good way to name a cluster, especially when you have multiple BDAs managed by the same OEM. I don’t understand why it doesn’t use the BDA’s cluster name or Cloudera Manager’s cluster name; either one would be much better than this naming. Even worse, you can change a lot of target names in OEM, but not this one. I have another blog (Change BDA Cluster Name in OEM Cloud Control 13c) discussing a partial workaround for this issue.

You can also view the details of a host target.

The presentation looks better than OEM 12c. In general, OEM 13c for BDA is a good one. But pay attention to the following points; otherwise you will spend a lot of additional time.
1) Before performing OEM BDA Discovery, make sure you have changed all of your default passwords on BDA. It’s easier to use default passwords during the discovery, but it is a huge pain after you change the passwords of user accounts used in BDA discovery. Basically, updating the Named Credentials is not enough; you have to delete the whole BDA target in OEM and redo the discovery.

2) Similarly, if you configure TLS with Cloudera Manager after BDA Discovery, you will have to remove the BDA target and redo the discovery. It is clearly a bug in OEM, and it had not been fixed at the time of writing this blog.

3) Sometimes you might see tons of alerts from almost every port in the Cisco switch. If they came from a few ports, I might believe it. But for almost every port, there is no way these alerts are right. As a matter of fact, Oracle Support confirmed they appear to be false alerts. When I had to do the BDA rediscovery after configuring TLS with Cloudera Manager, I happened to notice all Cisco port alerts were gone after the rediscovery.

4) Both the Oracle documentation and Oracle Support say OEM 13c R2 supports BDA v4.5+ and that any version below it is not supported. It’s true that lower BDA versions run into additional issues, but I managed to find workarounds and make it work for BDA v4.3.


Change BDA Cluster Name in OEM Cloud Control 13c

Oracle OEM Cloud Control 13c has some improvements over OEM 12c. But for BDA, the weirdest thing after OEM BDA Discovery is the target name of the BDA cluster. By default, the target name is BDA Network 1 for the first BDA cluster and BDA Network 2 for the second BDA cluster. Think about it this way: Oracle BDA already has a BDA cluster name that differs from Cloudera Manager’s cluster name, and now OEM comes up with yet another cluster name. If we have two BDAs and use OEM 13c to discover the DR BDA first, the DR BDA will take the BDA Network 1 name, and the primary BDA will then be discovered as BDA Network 2. It’s a really annoying new feature in OEM 13c. Ideally, I want to change the BDA Network name to something meaningful, as BDA Network 1 is a useless naming standard. Mapping to either the BDA cluster name or Cloudera Manager’s cluster name is fine with me. In this blog, I am going to discuss whether I can change this name to something I like.

There are two types of targets in OEM: Repository Side Targets and Agent Side Targets. Each managed target in OEM has a Display Name and a Target Name. So I wondered which category the BDA Network target falls into.

Run the query for Repository Side targets:

set lines 200
set pages 999
col ENTITY_TYPE for a30
col ENTITY_NAME for a45
col DISPLAY_NAME for a40

ORDER  BY 1,2;
------------------------------ ----------------------------------- --------------------------------------------- ----------------------------------------
j2ee_application_cluster       Clustered Application Deployment    /EMGC_GCDomain/GCDomain/BIP_cluster/bipublish bipublisher(11.1.1)

j2ee_application_cluster       Clustered Application Deployment    /EMGC_GCDomain/GCDomain/BIP_cluster/ESSAPP	 ESSAPP
oracle_em_service	       EM Service			   EM Jobs Service				 EM Jobs Service
oracle_emsvrs_sys	       EM Servers System		   Management_Servers				 Management Servers
oracle_si_netswitch	       Systems Infrastructure Switch	   enkx4bda1sw-ib2				 enkx4bda1sw-ib2
oracle_si_netswitch	       Systems Infrastructure Switch	   enkx4bda1sw-ip				 enkx4bda1sw-ip
oracle_si_netswitch	       Systems Infrastructure Switch	   enkx4bda1sw-ib3				 enkx4bda1sw-ib3
oracle_si_server_map	       Systems Infrastructure Server	   enkbda1node08-ilom				 enkbda1node08-ilom
oracle_si_server_map	       Systems Infrastructure Server	   enkx4bda1node06-ilom 			 enkx4bda1node06-ilom
oracle_si_server_map	       Systems Infrastructure Server	   enkbda1node17-ilom				 enkbda1node17-ilom
oracle_si_server_map	       Systems Infrastructure Server	   enkbda1node11-ilom				 enkbda1node11-ilom
oracle_si_server_map	       Systems Infrastructure Server	   enkx4bda1node04.enkitec.local/server 	 enkx4bda1node04.enkitec.local/server
oracle_si_server_map	       Systems Infrastructure Server	   enkbda1node18-ilom				 enkbda1node18-ilom
oracle_si_server_map	       Systems Infrastructure Server	   enkx4bda1node05-ilom 			 enkx4bda1node05-ilom
oracle_si_server_map	       Systems Infrastructure Server	   enkbda1node15-ilom				 enkbda1node15-ilom
oracle_si_server_map	       Systems Infrastructure Server	   enkbda1node14-ilom				 enkbda1node14-ilom
oracle_si_server_map	       Systems Infrastructure Server	   enkbda1node02-ilom				 enkbda1node02-ilom
oracle_si_server_map	       Systems Infrastructure Server	   enkbda1node01-ilom				 enkbda1node01-ilom
oracle_si_server_map	       Systems Infrastructure Server	   enkbda1node13-ilom				 enkbda1node13-ilom
oracle_si_server_map	       Systems Infrastructure Server	   enkbda1node06-ilom				 enkbda1node06-ilom
oracle_si_server_map	       Systems Infrastructure Server	   enkbda1node09-ilom				 enkbda1node09-ilom
oracle_si_server_map	       Systems Infrastructure Server	   enkx4bda1node03-ilom 			 enkx4bda1node03-ilom
oracle_si_server_map	       Systems Infrastructure Server	   enkbda1node16-ilom				 enkbda1node16-ilom
oracle_si_server_map	       Systems Infrastructure Server	   enkbda1node03-ilom				 enkbda1node03-ilom
oracle_si_server_map	       Systems Infrastructure Server	   enkbda1node10-ilom				 enkbda1node10-ilom
oracle_si_server_map	       Systems Infrastructure Server	   enkx4bda1node01-ilom 			 enkx4bda1node01-ilom
oracle_si_server_map	       Systems Infrastructure Server	   enkbda1node04-ilom				 enkbda1node04-ilom
oracle_si_server_map	       Systems Infrastructure Server	   enkbda1node05-ilom				 enkbda1node05-ilom
oracle_si_server_map	       Systems Infrastructure Server	   enkbda1node12-ilom				 enkbda1node12-ilom
oracle_si_server_map	       Systems Infrastructure Server	   enkx4bda1node02-ilom 			 enkx4bda1node02-ilom
weblogic_cluster	       Oracle WebLogic Cluster		   /EMGC_GCDomain/GCDomain/BIP_cluster		 BIP_cluster

31 rows selected.

It is not found in this category. Try the Agent Side targets.

set lines 200
set pages 999
col ENTITY_TYPE for a30
col ENTITY_NAME for a35
col DISPLAY_NAME for a35
col EMD_URL for a60

ORDER  BY 1,2,3;

------------------------------ ------------------------------ ----------------------------------- ----------------------------------- ------------------------------------------------------------
host			       Host			      enkx4bda1node01.enkitec.local	  enkx4bda1node01.enkitec.local       https://enkx4bda1node01.enkitec.local:1830/emd/main/
host			       Host			      enkx4bda1node02.enkitec.local	  enkx4bda1node02.enkitec.local       https://enkx4bda1node02.enkitec.local:1830/emd/main/
host			       Host			      enkx4bda1node03.enkitec.local	  enkx4bda1node03.enkitec.local       https://enkx4bda1node03.enkitec.local:1830/emd/main/
host			       Host			      enkx4bda1node04.enkitec.local	  enkx4bda1node04.enkitec.local       https://enkx4bda1node04.enkitec.local:1830/emd/main/
host			       Host			      enkx4bda1node05.enkitec.local	  enkx4bda1node05.enkitec.local       https://enkx4bda1node05.enkitec.local:1830/emd/main/
host			       Host			      enkx4bda1node06.enkitec.local	  enkx4bda1node06.enkitec.local       https://enkx4bda1node06.enkitec.local:1830/emd/main/
host			       Host			      enkx4bdacli02.enkitec.local	  enkx4bdacli02.enkitec.local	      https://enkx4bdacli02.enkitec.local:3872/emd/main/
oracle_apache		       Oracle HTTP Server	      /EMGC_GCDomain/GCDomain/ohs1	  ohs1				      https://enkx4bdacli02.enkitec.local:3872/emd/main/
oracle_bda_cluster	       BDA Network		      BDA Network 1			  BDA Network 1 		      https://enkx4bda1node02.enkitec.local:1830/emd/main/
oracle_beacon		       Beacon			      EM Management Beacon		  EM Management Beacon		      https://enkx4bdacli02.enkitec.local:3872/emd/main/
oracle_big_data_sql	       Oracle Big Data SQL	      bigdatasql_enkx4bda		  bigdatasql_enkx4bda		      https://enkx4bda1node01.enkitec.local:1830/emd/main/
oracle_cloudera_manager        Cloudera Manager 	      Cloudera Manager - enkx4bda	  Cloudera Manager - enkx4bda	      https://enkx4bda1node01.enkitec.local:1830/emd/main/
oracle_emd		       Agent			      enkx4bda1node01.enkitec.local:1830  enkx4bda1node01.enkitec.local:1830  https://enkx4bda1node01.enkitec.local:1830/emd/main/
oracle_emd		       Agent			      enkx4bda1node02.enkitec.local:1830  enkx4bda1node02.enkitec.local:1830  https://enkx4bda1node02.enkitec.local:1830/emd/main/
oracle_emd		       Agent			      enkx4bda1node03.enkitec.local:1830  enkx4bda1node03.enkitec.local:1830  https://enkx4bda1node03.enkitec.local:1830/emd/main/
oracle_emd		       Agent			      enkx4bda1node04.enkitec.local:1830  enkx4bda1node04.enkitec.local:1830  https://enkx4bda1node04.enkitec.local:1830/emd/main/
oracle_emd		       Agent			      enkx4bda1node05.enkitec.local:1830  enkx4bda1node05.enkitec.local:1830  https://enkx4bda1node05.enkitec.local:1830/emd/main/
oracle_emd		       Agent			      enkx4bda1node06.enkitec.local:1830  enkx4bda1node06.enkitec.local:1830  https://enkx4bda1node06.enkitec.local:1830/emd/main/
oracle_emd		       Agent			      enkx4bdacli02.enkitec.local:3872	  enkx4bdacli02.enkitec.local:3872    https://enkx4bdacli02.enkitec.local:3872/emd/main/
oracle_emrep		       OMS and Repository	      Management Services and Repository  Management Services and Repository  https://enkx4bdacli02.enkitec.local:3872/emd/main/
oracle_hadoop_cluster	       Hadoop Cluster		      enkx4bda				  enkx4bda			      https://enkx4bda1node01.enkitec.local:1830/emd/main/
oracle_hadoop_datanode	       Hadoop DataNode		      DN_enkx4bda1node01_enkx4bda	  DN_enkx4bda1node01_enkx4bda	      https://enkx4bda1node01.enkitec.local:1830/emd/main/
oracle_hadoop_datanode	       Hadoop DataNode		      DN_enkx4bda1node02_enkx4bda	  DN_enkx4bda1node02_enkx4bda	      https://enkx4bda1node02.enkitec.local:1830/emd/main/
oracle_hadoop_datanode	       Hadoop DataNode		      DN_enkx4bda1node03_enkx4bda	  DN_enkx4bda1node03_enkx4bda	      https://enkx4bda1node03.enkitec.local:1830/emd/main/
oracle_hadoop_datanode	       Hadoop DataNode		      DN_enkx4bda1node04_enkx4bda	  DN_enkx4bda1node04_enkx4bda	      https://enkx4bda1node04.enkitec.local:1830/emd/main/
oracle_hadoop_datanode	       Hadoop DataNode		      DN_enkx4bda1node05_enkx4bda	  DN_enkx4bda1node05_enkx4bda	      https://enkx4bda1node05.enkitec.local:1830/emd/main/
oracle_hadoop_datanode	       Hadoop DataNode		      DN_enkx4bda1node06_enkx4bda	  DN_enkx4bda1node06_enkx4bda	      https://enkx4bda1node06.enkitec.local:1830/emd/main/
oracle_hadoop_failoverctl      Hadoop Failover Controller     FC_NNA_enkx4bda			  FC_NNA_enkx4bda		      https://enkx4bda1node02.enkitec.local:1830/emd/main/
oracle_hadoop_failoverctl      Hadoop Failover Controller     FC_NNB_enkx4bda			  FC_NNB_enkx4bda		      https://enkx4bda1node01.enkitec.local:1830/emd/main/
oracle_hadoop_hdfs	       Hadoop HDFS		      hdfs_enkx4bda			  hdfs_enkx4bda 		      https://enkx4bda1node01.enkitec.local:1830/emd/main/
oracle_hadoop_historyserver    Hadoop Job History Server      JHS_enkx4bda			  JHS_enkx4bda			      https://enkx4bda1node03.enkitec.local:1830/emd/main/
oracle_hadoop_hive	       Hadoop Hive		      hive_enkx4bda			  hive_enkx4bda 		      https://enkx4bda1node01.enkitec.local:1830/emd/main/
oracle_hadoop_hive_metaserver  Hadoop Hive Metastore Server   Metastore_enkx4bda		  Metastore_enkx4bda		      https://enkx4bda1node04.enkitec.local:1830/emd/main/
oracle_hadoop_hive_server      Hadoop Hive Server2	      HiveServer2_enkx4bda		  HiveServer2_enkx4bda		      https://enkx4bda1node04.enkitec.local:1830/emd/main/
oracle_hadoop_hive_webhcat     Hadoop Hive WebHCat Server     WebHCat_enkx4bda			  WebHCat_enkx4bda		      https://enkx4bda1node04.enkitec.local:1830/emd/main/
oracle_hadoop_hue	       Hadoop Hue		      hue_enkx4bda			  hue_enkx4bda			      https://enkx4bda1node01.enkitec.local:1830/emd/main/
oracle_hadoop_impala	       Hadoop Impala		      impala_enkx4bda			  impala_enkx4bda		      https://enkx4bda1node01.enkitec.local:1830/emd/main/
oracle_hadoop_impala_demon     Hadoop Impala Daemon	      ImpalaD_enkx4bda1node01_enkx4bda	  ImpalaD_enkx4bda1node01_enkx4bda    https://enkx4bda1node01.enkitec.local:1830/emd/main/
oracle_hadoop_impala_demon     Hadoop Impala Daemon	      ImpalaD_enkx4bda1node02_enkx4bda	  ImpalaD_enkx4bda1node02_enkx4bda    https://enkx4bda1node02.enkitec.local:1830/emd/main/
oracle_hadoop_impala_demon     Hadoop Impala Daemon	      ImpalaD_enkx4bda1node03_enkx4bda	  ImpalaD_enkx4bda1node03_enkx4bda    https://enkx4bda1node03.enkitec.local:1830/emd/main/
oracle_hadoop_impala_demon     Hadoop Impala Daemon	      ImpalaD_enkx4bda1node04_enkx4bda	  ImpalaD_enkx4bda1node04_enkx4bda    https://enkx4bda1node04.enkitec.local:1830/emd/main/
oracle_hadoop_impala_demon     Hadoop Impala Daemon	      ImpalaD_enkx4bda1node06_enkx4bda	  ImpalaD_enkx4bda1node06_enkx4bda    https://enkx4bda1node06.enkitec.local:1830/emd/main/
oracle_hadoop_impala_server_cat Hadoop Impala Server Catalogue ImpalaCatSrv_enkx4bda		  ImpalaCatSrv_enkx4bda 	      https://enkx4bda1node06.enkitec.local:1830/emd/main/
oracle_hadoop_impala_statestore Hadoop Impala State Store      StateStore_enkx4bda		  StateStore_enkx4bda		      https://enkx4bda1node06.enkitec.local:1830/emd/main/
oracle_hadoop_journalnode      Hadoop Journal Node	      JN_enkx4bda1node01_enkx4bda	  JN_enkx4bda1node01_enkx4bda	      https://enkx4bda1node01.enkitec.local:1830/emd/main/
oracle_hadoop_journalnode      Hadoop Journal Node	      JN_enkx4bda1node02_enkx4bda	  JN_enkx4bda1node02_enkx4bda	      https://enkx4bda1node02.enkitec.local:1830/emd/main/
oracle_hadoop_journalnode      Hadoop Journal Node	      JN_enkx4bda1node03_enkx4bda	  JN_enkx4bda1node03_enkx4bda	      https://enkx4bda1node03.enkitec.local:1830/emd/main/
oracle_hadoop_kerberos	       Kerberos 		      kerberos_enkx4bda 		  kerberos_enkx4bda		      https://enkx4bda1node01.enkitec.local:1830/emd/main/
oracle_hadoop_mysql	       MySql			      mysql_enkx4bda			  mysql_enkx4bda		      https://enkx4bda1node01.enkitec.local:1830/emd/main/
oracle_hadoop_namenode	       Hadoop NameNode		      NNA_enkx4bda			  NNA_enkx4bda			      https://enkx4bda1node01.enkitec.local:1830/emd/main/
oracle_hadoop_namenode	       Hadoop NameNode		      NNB_enkx4bda			  NNB_enkx4bda			      https://enkx4bda1node02.enkitec.local:1830/emd/main/
oracle_hadoop_nodemgr	       Hadoop NodeManager	      NM_enkx4bda1node01_enkx4bda	  NM_enkx4bda1node01_enkx4bda	      https://enkx4bda1node01.enkitec.local:1830/emd/main/
oracle_hadoop_nodemgr	       Hadoop NodeManager	      NM_enkx4bda1node02_enkx4bda	  NM_enkx4bda1node02_enkx4bda	      https://enkx4bda1node02.enkitec.local:1830/emd/main/
oracle_hadoop_nodemgr	       Hadoop NodeManager	      NM_enkx4bda1node03_enkx4bda	  NM_enkx4bda1node03_enkx4bda	      https://enkx4bda1node03.enkitec.local:1830/emd/main/
oracle_hadoop_nodemgr	       Hadoop NodeManager	      NM_enkx4bda1node04_enkx4bda	  NM_enkx4bda1node04_enkx4bda	      https://enkx4bda1node04.enkitec.local:1830/emd/main/
oracle_hadoop_nodemgr	       Hadoop NodeManager	      NM_enkx4bda1node05_enkx4bda	  NM_enkx4bda1node05_enkx4bda	      https://enkx4bda1node05.enkitec.local:1830/emd/main/
oracle_hadoop_nodemgr	       Hadoop NodeManager	      NM_enkx4bda1node06_enkx4bda	  NM_enkx4bda1node06_enkx4bda	      https://enkx4bda1node06.enkitec.local:1830/emd/main/
oracle_hadoop_oozie	       Hadoop Oozie		      oozie_enkx4bda			  oozie_enkx4bda		      https://enkx4bda1node01.enkitec.local:1830/emd/main/
oracle_hadoop_oozie_server     Hadoop Oozie Server	      OozieServer_enkx4bda		  OozieServer_enkx4bda		      https://enkx4bda1node04.enkitec.local:1830/emd/main/
oracle_hadoop_resourcemgr      Hadoop ResourceManager	      RMA_enkx4bda			  RMA_enkx4bda			      https://enkx4bda1node04.enkitec.local:1830/emd/main/
oracle_hadoop_resourcemgr      Hadoop ResourceManager	      RMB_enkx4bda			  RMB_enkx4bda			      https://enkx4bda1node03.enkitec.local:1830/emd/main/
oracle_hadoop_solr	       Hadoop Solr		      solr_enkx4bda			  solr_enkx4bda 		      https://enkx4bda1node01.enkitec.local:1830/emd/main/
oracle_hadoop_solr_server      Hadoop Solr Server	      SolrServer_enkx4bda		  SolrServer_enkx4bda		      https://enkx4bda1node03.enkitec.local:1830/emd/main/
oracle_hadoop_spark_on_yarn    Hadoop Spark On Yarn	      spark_on_yarn_enkx4bda		  spark_on_yarn_enkx4bda	      https://enkx4bda1node01.enkitec.local:1830/emd/main/
oracle_hadoop_yarn	       Hadoop Yarn		      yarn_enkx4bda			  yarn_enkx4bda 		      https://enkx4bda1node01.enkitec.local:1830/emd/main/
oracle_hadoop_zookeeper        Hadoop ZooKeeper 	      zookeeper_enkx4bda		  zookeeper_enkx4bda		      https://enkx4bda1node01.enkitec.local:1830/emd/main/
oracle_hadoop_zookeeper_server Hadoop ZooKeeper Server	      ZKS_enkx4bda1node01_enkx4bda	  ZKS_enkx4bda1node01_enkx4bda	      https://enkx4bda1node01.enkitec.local:1830/emd/main/
oracle_hadoop_zookeeper_server Hadoop ZooKeeper Server	      ZKS_enkx4bda1node02_enkx4bda	  ZKS_enkx4bda1node02_enkx4bda	      https://enkx4bda1node02.enkitec.local:1830/emd/main/
oracle_hadoop_zookeeper_server Hadoop ZooKeeper Server	      ZKS_enkx4bda1node03_enkx4bda	  ZKS_enkx4bda1node03_enkx4bda	      https://enkx4bda1node03.enkitec.local:1830/emd/main/


oracle_oms		       Oracle Management Service      enkx4bdacli02.enkitec.local:4889_Ma enkx4bdacli02.enkitec.local:4889_Ma https://enkx4bdacli02.enkitec.local:3872/emd/main/
							      nagement_Service			  nagement_Service
oracle_oms_console	       OMS Console		      enkx4bdacli02.enkitec.local:4889_Ma enkx4bdacli02.enkitec.local:4889_Ma https://enkx4bdacli02.enkitec.local:3872/emd/main/
							      nagement_Service_CONSOLE		  nagement_Service_CONSOLE
oracle_oms_pbs		       OMS Platform		      enkx4bdacli02.enkitec.local:4889_Ma enkx4bdacli02.enkitec.local:4889_Ma https://enkx4bdacli02.enkitec.local:3872/emd/main/
							      nagement_Service_PBS		  nagement_Service_PBS
oracle_si_pdu		       Systems Infrastructure PDU     enkx4bda1-pdua			  enkx4bda1-pdua		      https://enkx4bda1node02.enkitec.local:1830/emd/main/
oracle_si_pdu		       Systems Infrastructure PDU     enkx4bda1-pdub			  enkx4bda1-pdub		      https://enkx4bda1node02.enkitec.local:1830/emd/main/

164 rows selected

The following query shows just oracle_bda_cluster type of target.

col ENTITY_TYPE for a20
col ENTITY_NAME for a16
col DISPLAY_NAME for a16
col EMD_URL for a55
ENTITY_TYPE = 'oracle_bda_cluster'
-------------------- -------------------- ---------------- ---------------- -------------------------------------------------------
oracle_bda_cluster   BDA Network	  BDA Network 1    BDA Network 1    https://enkx4bda1node02.enkitec.local:1830/emd/main/

Ok, we can see entity_type is oracle_bda_cluster for BDA Network. Both target name and display name are BDA Network 1.

Next, I will check whether I can rename the target name of BDA Network 1. I have used the emcli rename_target command in the past to rename OEM targets, and it usually works. So I ran the following commands:

[oracle@enkx4bdacli02 ~]$ emcli show_bda_clusters
BDA Network 1 : enkx4bda

[oracle@enkx4bdacli02 ~]$ emcli get_targets -targets="oracle_bda_cluster"
Status  Status           Target Type           Target Name                        
-9      N/A              oracle_bda_cluster    BDA Network 1  

[oracle@enkx4bdacli02 ~]$ emcli rename_target -target_type="oracle_bda_cluster" -target_name="BDA Network 1" -new_target_name="X4BDA"
Rename not supported for the given Target Type.

No luck. It doesn’t work. If renaming the target name does not work, let me try to change the display name.

[oracle@enkx4bdacli02 ~]$ emcli modify_target -type="oracle_bda_cluster" -name="BDA Network 1" -display_name="X4BDA"
Target "BDA Network 1:oracle_bda_cluster" modified successfully

It works. Rerun the query to check oracle_bda_cluster type.

-------------------- -------------------- ---------------- ---------------- -------------------------------------------------------
oracle_bda_cluster   BDA Network	  BDA Network 1    X4BDA	    https://enkx4bda1node02.enkitec.local:1830/emd/main/

Well, it works partially. On some screens, it works perfectly.

But on some other screens, it still shows the same annoying name.

Another lesson I learned recently is that you need to be very careful about using default passwords when setting up BDA. Once BDA is set up with default passwords and OEM BDA Discovery uses these default passwords for Named Credentials, you will run into issues after you change the default passwords later on. In the worst case, such as a Cloudera Manager password change, it requires removing the current BDA target and redoing the BDA Discovery. I may write about this topic in a different blog if I have time.

AWR is not Enough to Track Down IO Problem on Exadata

Recently we ran into an interesting performance issue at one of our clients. They reported significant slowdowns in the system during the daytime for some time. Their X2 Exadata does not host many databases: three production databases and a few test/QA databases. Of the three production databases, one, which I will give the fake name CRDB, is the most important and critical. It is mainly OLTP with some reporting activities. The other two production databases are tiny, less important and have little activity. In other words, the majority of database activity happens in the CRDB database.

The slowdowns were mysterious, occurred randomly during the day and did not seem to follow a pattern. When a slowdown happened, the active sessions on CRDB shot up from an average of 2~4 to at least 30, sometimes reaching 100. At the same time, there were massive slowdowns in all other databases on the same Exadata. To track down the issue, we requested a few AWR reports from the client for the CRDB database.

In the AWR report, the top event was cell smart table scan. For example, on db node 1 alone, from 10:30am to 2:30pm on August 6, 2013, the total waits for cell smart table scan were 574,218 with an average wait time of 106 ms, equal to 60,980 seconds of DB time, or 78.20% of DB time. Other than that, the AWR report did not show anything else that should cause a performance issue. Therefore, we mainly focused on IO and smart scan related operations.
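As a quick sanity check on those AWR figures, the arithmetic can be sketched in Python. The variable names are mine. Because AWR rounds the average wait to 106 ms, the product lands slightly below the reported 60,980 seconds, and the total DB time is only inferred from the 78.20% share rather than read from the report.

```python
# Figures quoted from the AWR report for db node 1 (10:30am to 2:30pm).
total_waits = 574_218        # cell smart table scan waits
avg_wait_ms = 106            # average wait, as rounded in the report
reported_wait_s = 60_980     # time waited, as reported
pct_db_time = 78.20          # share of DB time, as reported
window_s = 4 * 3600          # 10:30am to 2:30pm

# Waits x average wait gives the approximate time waited in seconds.
estimated_wait_s = total_waits * avg_wait_ms / 1000

# Total DB time implied by the reported share (an inference, not an AWR figure).
implied_db_time_s = reported_wait_s / (pct_db_time / 100)

# Average number of sessions waiting on this event over the window.
avg_sessions_waiting = reported_wait_s / window_s

print(f"estimated wait time:  {estimated_wait_s:,.0f} s")
print(f"implied DB time:      {implied_db_time_s:,.0f} s")
print(f"avg sessions waiting: {avg_sessions_waiting:.1f}")
```

The last figure is the useful one: on this node alone, a little over four sessions on average were waiting on smart scans for the whole four-hour window.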

On Exadata, cell smart table scan is the scan on cell storage for large IOs. Many offloading activities involve cell smart table scan. As the client said there had been no recent application code change and the data volume remained similar to the past, we assumed the total logical and physical IO for the CRDB database should be at a similar level as before. They also mentioned that the performance issue began a few days after an Exadata patch.

Their resource plan gives 50% allocation to CRDB and another 50% to the rest of the databases, with an objective of LOW_LATENCY. This resource plan has been in production for over a year and has managed IO quite well.
Luckily, while we were tracking down the issue, I was attending our company’s 2013 Exadata conference (E4). It’s an excellent Exadata and Big Data specific conference; not only does it have many great Exadata/Big Data speakers, but also many Exadata experts and users from around the world. I happened to attend one session about IORM presented by Oracle’s Sue Lee. She is a Director of Development in the Oracle RDBMS division and responsible for the Oracle Resource Manager. Her presentation about IORM was an excellent one. She will also give a similar session at OpenWorld. If you work on Exadata and attend OpenWorld this year, I highly recommend attending her session, Using Resource Manager for Database Consolidation with Oracle Database 12c (Session ID: CON8884), where you will gain in-depth knowledge about IORM and Exadata internals.

During her presentation, she mentioned a tool, the iorm_metrics script (Doc ID 1337265.1), and said this tool is frequently used in her group to track down IORM performance problems. It sounded interesting, so I immediately downloaded the script and asked our client to run the following on all of the cell nodes.

./ "where collectionTime > '2013-08-06T11:30:00-06:00' and collectionTime < '2013-08-06T13:20:00-06:00'" > metric_iorm_`hostname`.log

The result was quite interesting. Here is one snapshot from cell node 1 as an example:

Time: 2013-08-06T12:33:03-05:00
Database: CRDB
Utilization:     Small=4%    Large=7%
Flash Cache:     IOPS=226
Disk Throughput: MBPS=120
Small I/O's:     IOPS=258    Avg qtime=1.1ms
Large I/O's:     IOPS=114    Avg qtime=3026ms
	Utilization:     Small=0%    Large=0%
	Flash Cache:     IOPS=0.1
	Disk Throughput: MBPS=0
	Small I/O's:     IOPS=0.3    Avg qtime=0.0ms
	Large I/O's:     IOPS=0.0    Avg qtime=0.0ms
	Consumer Group: OTHER_GROUPS
	Utilization:     Small=0%    Large=7%
	Flash Cache:     IOPS=150
	Disk Throughput: MBPS=116
	Small I/O's:     IOPS=1.8    Avg qtime=0.8ms
	Large I/O's:     IOPS=112    Avg qtime=3077ms
	Utilization:     Small=1%    Large=0%
	Flash Cache:     IOPS=23.8
	Disk Throughput: MBPS=1
	Small I/O's:     IOPS=82.1    Avg qtime=2.7ms
	Large I/O's:     IOPS=0.0    Avg qtime=0.0ms
	Utilization:     Small=3%    Large=0%
	Flash Cache:     IOPS=51.6
	Disk Throughput: MBPS=2
	Small I/O's:     IOPS=174    Avg qtime=0.3ms
	Large I/O's:     IOPS=1.9    Avg qtime=66.1ms

Utilization:     Small=10%    Large=9%
Flash Cache:     IOPS=89.7
Disk Throughput: MBPS=142
Small I/O's:     IOPS=504    Avg qtime=0.5ms
Large I/O's:     IOPS=134    Avg qtime=4137ms
	Utilization:     Small=0%    Large=0%
	Flash Cache:     IOPS=0.1
	Disk Throughput: MBPS=0
	Small I/O's:     IOPS=0.4    Avg qtime=0.2ms
	Large I/O's:     IOPS=0.0    Avg qtime=0.0ms
	Consumer Group: OTHER_GROUPS
	Utilization:     Small=0%    Large=9%
	Flash Cache:     IOPS=42.8
	Disk Throughput: MBPS=139
	Small I/O's:     IOPS=0.8    Avg qtime=0.6ms
	Large I/O's:     IOPS=134    Avg qtime=4138ms
	Utilization:     Small=0%    Large=0%
	Flash Cache:     IOPS=16.9
	Disk Throughput: MBPS=0
	Small I/O's:     IOPS=56.2    Avg qtime=3.5ms
	Large I/O's:     IOPS=0.0    Avg qtime=0.0ms
	Utilization:     Small=9%    Large=0%
	Flash Cache:     IOPS=29.8
	Disk Throughput: MBPS=2
	Small I/O's:     IOPS=447    Avg qtime=0.2ms
	Large I/O's:     IOPS=0.0    Avg qtime=53.0ms


Cell Total Utilization:     Small=14%    Large=16%
Cell Total Flash Cache:     IOPS=315.7
Cell Total Disk Throughput: MBPS=249.454
Cell Total Small I/O's:     IOPS=762.2
Cell Total Large I/O's:     IOPS=245.6
Cell Avg small read latency:  11.24 ms
Cell Avg small write latency: 2.64 ms
Cell Avg large read latency:  13.35 ms
Cell Avg large write latency: 4.47 ms

From the above result, we can see that the average queue time for large I/Os (cell smart scans) was over 3,000 ms for CRDB and over 4,000 ms for OTHER_DATABASE. The normal range should be < 30 ms. The disk throughput for OTHER_DATABASE was 142 MB/second while CRDB was 120 MB/second. This indicates that disk I/O was saturated.
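To scan a whole log file for this symptom instead of reading it by eye, a small parser helps. The following is only a sketch: it assumes the report lines look exactly like the snapshot above, and it uses the 30 ms threshold mentioned in the text. The function name and the sample text are made up for illustration.

```python
import re

# Threshold taken from the text: large I/O queue time should normally be < 30 ms.
QTIME_LIMIT_MS = 30.0

# Matches lines such as "Large I/O's:     IOPS=114    Avg qtime=3026ms".
# The "Cell Total Large I/O's" line has no qtime, so it is not matched.
LARGE_IO = re.compile(r"Large I/O's:\s+IOPS=([\d.]+)\s+Avg qtime=([\d.]+)ms")


def slow_large_io(report_text, limit_ms=QTIME_LIMIT_MS):
    """Return (iops, qtime_ms) for every Large I/O line above limit_ms."""
    hits = []
    for line in report_text.splitlines():
        match = LARGE_IO.search(line)
        if match:
            iops, qtime = float(match.group(1)), float(match.group(2))
            if qtime > limit_ms:
                hits.append((iops, qtime))
    return hits


# A few lines copied from the snapshot above (hypothetical excerpt).
sample = """\
Database: CRDB
Small I/O's:     IOPS=258    Avg qtime=1.1ms
Large I/O's:     IOPS=114    Avg qtime=3026ms
	Large I/O's:     IOPS=0.0    Avg qtime=0.0ms
	Large I/O's:     IOPS=1.9    Avg qtime=66.1ms
"""

for iops, qtime in slow_large_io(sample):
    print(f"Large I/O qtime {qtime:.1f} ms at {iops:.1f} IOPS exceeds {QTIME_LIMIT_MS:.0f} ms")
```

Feeding it the contents of each metric_iorm_`hostname`.log file would list every interval where large I/O was queued longer than the limit.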

It is possible that CRDB’s large volume of I/O caused OTHER_DATABASE to slow down, but in that case the disk throughput of OTHER_DATABASE should be small while its wait time grows longer. On the contrary, the high I/O throughput from OTHER_DATABASE indicated that something was not right. So I zoomed into the other databases and compared the I/O throughput between databases.

The following charts show that the throughput for the CRDB database was around 300 MB/second before 3:30pm.
At the same time, the throughput of another much smaller and less used database was much higher than that of CRDB. It was mostly between 500 MB and 600 MB/second, with a few peaks over 1,200 MB/second.


It is normal to see CRDB doing a lot of large I/Os during the day, as it is the major database with a lot of activity. However, it is unusual to see a tiny database issue a significant amount of large I/Os. This prompted me to investigate this small database further.

From the 12c OEM SQL Monitor screen, we can see many executions of the query with SQL ID d0af9yxrrrvy5, each doing a lot of I/O (over 80 GB per execution) and running for a long time. Multiple instances of the same query were executing during the timeframe when the slowdown happened.


The 12c OEM Cloud Control also shows that the Average Throttle Time for Disk I/Os for both CRDB and OTHER_DATABASE shot up to 1,500~2,000 milliseconds on the afternoons of August 6 and 7. This was exactly the time the query was executing in one of the OTHER_DATABASE databases.


After this query was stopped, the system returned to normal. So the CRDB database was not the cause of the problem but the victim of I/O throttling caused by another database.

That seems like the end of the story. Actually, not yet.

Although the system looked normal, there were two queries running about 2~3 times slower than in the past. Enkitec’s awr_plan_change.sql shows almost identical LIO and PIO for each execution, but the timing changed significantly after the patch date of July 15. This is another interesting issue.

SYS@CRDB1> @awr_plan_change
Enter value for sql_id: 4zbfxnv733dzb
Enter value for instance_number:
---------- ------ -------------------------------- ----- --------  ------------   ------------ -------------- --------------
9700	1 28-MAY-13 AM	 4zbfxnv733dzb   1	 288.559   55,876,213.0   55,875,801.0
9748	2 29-MAY-13 AM	 4zbfxnv733dzb   1	 334.543   55,876,213.0   55,875,801.0
9796	3 30-MAY-13 AM	 4zbfxnv733dzb	 1	 315.035   55,876,333.0   55,875,801.0
11956	3 14-JUL-13 AM	 4zbfxnv733dzb	 1	 258.629   55,876,269.0   55,875,804.0
12000	2 15-JUL-13 AM	 4zbfxnv733dzb	 1 1,549.712   43,115,159.0   43,107,149.0
12001	2 15-JUL-13 AM	 4zbfxnv733dzb	 0	 993.135   12,778,387.0   12,768,812.0
12047	1 16-JUL-13 AM	 4zbfxnv733dzb	 1	 565.923   55,876,638.0   55,875,801.0
12096	1 17-JUL-13 AM	 4zbfxnv733dzb	 1 1,148.289   55,878,883.0   55,875,923.0
12143	3 18-JUL-13 AM	 4zbfxnv733dzb	 1	 567.586   55,876,013.0   55,875,803.0
13057	1 06-AUG-13 AM	 4zbfxnv733dzb	 1	 645.235   55,876,538.0   55,875,821.0
13105	2 07-AUG-13 AM	 4zbfxnv733dzb	 1	 986.482   55,877,223.0   55,875,823.0
13153	3 08-AUG-13 AM	 4zbfxnv733dzb	 1	 587.454   55,875,957.0   55,875,801.0
13201	1 09-AUG-13 AM	 4zbfxnv733dzb	 1	 594.734   55,876,423.0   55,875,801.0
13249	3 10-AUG-13 AM	 4zbfxnv733dzb	 1	 515.732   55,877,880.0   55,875,801.0
13297	1 11-AUG-13 AM	 4zbfxnv733dzb	 1	 477.941   55,875,965.0   55,875,802.0
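The slowdown in the listing above can be put into a single number. The sketch below assumes that the sixth column of the output is the average elapsed time in seconds per execution (the header row is missing from the listing, so this is an assumption), and it leaves out the two 15-JUL rows because that run is split across two snapshots.

```python
from datetime import date
from statistics import mean

PATCH_DATE = date(2013, 7, 15)  # patch date mentioned in the text

# (snapshot date, elapsed seconds) copied from the listing above.
# Assumption: the sixth column is average elapsed seconds per execution.
runs = [
    (date(2013, 5, 28), 288.559),
    (date(2013, 5, 29), 334.543),
    (date(2013, 5, 30), 315.035),
    (date(2013, 7, 14), 258.629),
    (date(2013, 7, 16), 565.923),
    (date(2013, 7, 17), 1148.289),
    (date(2013, 7, 18), 567.586),
    (date(2013, 8, 6), 645.235),
    (date(2013, 8, 7), 986.482),
    (date(2013, 8, 8), 587.454),
    (date(2013, 8, 9), 594.734),
    (date(2013, 8, 10), 515.732),
    (date(2013, 8, 11), 477.941),
]

before = [secs for day, secs in runs if day < PATCH_DATE]
after = [secs for day, secs in runs if day > PATCH_DATE]

ratio = mean(after) / mean(before)
print(f"before patch: {mean(before):.1f}s  after patch: {mean(after):.1f}s  ratio: {ratio:.2f}x")
```

With these rows the average roughly doubles after the patch while LIO and PIO stay flat, which matches the 2~3 times slowdown described above.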

So I compared two AWR reports, one from before the patch and one from after, covering the same period (7am~8:30am) when both queries run. Although there are some differences here and there, I focused on the differences in I/O waits, especially large scan waits.


From the above chart, we can see that for the 2nd line, cell smart table scan, the average wait time jumped from 7 ms to 38 ms and the total wait time jumped from 2,083 seconds to 3,000 seconds. At the same time, the average wait time dropped 24% for cell single block physical read, which is small I/O. At this point, I realized that IORM’s internal handling logic seemed to have changed after the patch.

Checking the Oracle Support site, I found something interesting related to IORM.
For cell version –, IORM is disabled by default. To disable IORM, run “alter iormplan objective=off;” command. is cell version our client used before the patch.

For cell version and above, IORM is enabled by default and use basic objective. To disable IORM, run “alter iormplan objective=basic;”. is our client’s current version.

The above indicates there were some changes involving IORM. The client’s current objective is low-latency. Low-latency is good for fast response to OLTP requests, with lower wait time for small I/O. In exchange, large I/O is sacrificed and has a longer wait time per request. Another possibility is that IORM changed the way it throttles I/O. These two queries generate over 110 million physical I/Os within 10 minutes. I saw that the throttle wait time for CRDB was between 100~200 ms when these two queries ran in the morning.

So LOW-LATENCY might not be a good objective for the client. I remembered that Sue Lee recommended using the AUTO objective as a best practice and, if it doesn’t work, trying the others next. Obviously, this is something we wanted to follow. So the client made just a one-line change to switch from the LOW-LATENCY to the AUTO objective.

Here is the command to make the change to AUTO.
# dcli -g ~/cell_group -l root 'cellcli -e alter iormplan objective = auto'
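
If the objective needs to be switched more than once while testing, it can help to generate the command line rather than retype it. This is a minimal sketch that only builds the same dcli command string shown above. The helper name is made up, and the list of objective names is an assumption that should be checked against the CellCLI documentation for your cell version.

```python
import shlex

# Objective names for "alter iormplan objective"; listed here as an assumption,
# so verify them against the CellCLI documentation for your cell version.
VALID_OBJECTIVES = {"off", "basic", "low_latency", "balanced", "high_throughput", "auto"}


def iormplan_command(objective, group_file="~/cell_group", user="root"):
    """Build the dcli command line that sets the IORM objective on all cells."""
    objective = objective.lower()
    if objective not in VALID_OBJECTIVES:
        raise ValueError(f"unknown IORM objective: {objective!r}")
    cellcli = f"cellcli -e alter iormplan objective = {objective}"
    # shlex.quote wraps the cellcli part in single quotes for the shell.
    return f"dcli -g {group_file} -l {user} {shlex.quote(cellcli)}"


print(iormplan_command("auto"))
```

The function only returns a string, so nothing is changed on the cells until the printed command is actually run as root.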

Here is the result after the objective was changed to AUTO.

---------- ------ -------------------------------- ----- --------  ------------   ------------ -------------- --------------
13633	3 18-AUG-13 AM	 4zbfxnv733dzb	 1	 471.955   55,875,957.0   55,875,801.0
13681	3 19-AUG-13 AM	 4zbfxnv733dzb	 1	 556.433   55,877,846.0   55,875,863.0
13729	3 20-AUG-13 AM	 4zbfxnv733dzb	 1	 537.509   55,877,821.0   55,875,861.0
13777	2 21-AUG-13 AM	 4zbfxnv733dzb   1	 166.424   55,876,181.0   55,875,806.0
13825	2 22-AUG-13 AM	 4zbfxnv733dzb	 1	 175.517   55,875,957.0   55,875,801.0

The result was amazing. Not only was there no negative impact on the system, but the running time of the two queries also immediately returned to the normal range, even better than before. The query above dropped from 500+ seconds to less than 200 seconds.

During a few emails back and forth with Sue Lee about this topic, she added a few good points as follows.

When you look at the disk utilization charts, you should keep in mind that IORM is only actively regulating when you see IOs being throttled. So you should only see that the allocations are “kicking in” when the utilization is pretty high.

If you look at slide 59, you’ll see that with LOW LATENCY objective, we don’t allow the disks to run at their full utilization. This is why you got better results with AUTO objective for throughput-intensive applications like these reports. This is basically what you already said in the attached email.

When I checked her presentation slides again, of course, they explained everything. For Low-Latency, the Peak Disk Utilization for scans is the lowest, only 40%; next is Balanced with 90%, then High Throughput with 100%. No wonder the disk throttle time was that high.

My colleague, Carlos Sierra, the author of the famous SQLT tool, also did an excellent analysis of the same issue from the SQLT perspective, and here is the link to his analysis.

I must say I am very happy that I attended the E4 conference and could immediately apply the knowledge to a real production issue. In case you didn’t attend, here is one photo I took from the conference. This year we had a larger room than last year’s E4 conference, with more people coming. I am sure next year it will have an even larger room for a bigger audience.

Default Port Numbers Used on Exadata: Port Numbers for OEM Part 3 of 3


In Part 1, I showed the default port numbers for general use.

Part 2 shows the port numbers related to ILOM. This post shows the port numbers for 12c Oracle Enterprise Manager (OEM) Cloud Control.

Port Name                Normal Ranges    Exadata Default Value
EM Upload HTTP Port      4889-4898        4889
EM Upload HTTPS Port     1159,4899-4908   1159

Node Mgr HTTPS Port      7401-7500        Check
Managed Server HTTP Port 7201-7300        Check
EM Console HTTP Port     7788-7798        7788
EM Console HTTPS Port    7799-7809        7799

Management Agent Port    3872,1830-1849   Check
Admin Server HTTP Port   7001             Check
Admin Server HTTPS Port  7101-7200        Check
Managed Server HTTPS Port 7301-7400       Check

“Check” means the port number should be checked against the value configured during the installation.
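
When a firewall sits between the agents and the OMS, it is handy to test whether these ports are actually reachable. The sketch below uses only the fixed default values from the table above and a plain TCP connect. The function names are made up for illustration, and the ports marked Check would have to be added with the values from your own installation.

```python
import socket

# Fixed defaults from the table above; entries marked "Check" depend on the
# installation and are deliberately not included here.
DEFAULT_PORTS = {
    "EM Upload HTTP": 4889,
    "EM Upload HTTPS": 1159,
    "EM Console HTTP": 7788,
    "EM Console HTTPS": 7799,
}


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def check_ports(host, ports=DEFAULT_PORTS):
    """Map each port name to True (reachable) or False (blocked or not listening)."""
    return {name: port_is_open(host, port) for name, port in ports.items()}
```

Calling check_ports with the OMS host name from a compute node returns a dictionary of port names and reachability flags. A False value only says the TCP connect failed, not whether the firewall or the service itself is the reason.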

A few ports do not necessarily need to go through the firewall between Exadata and the OEM OMS.

EM Repository DB Port                 1521

There are also some optional OEM-related ports, which are only needed when the corresponding components are used.

Port Name                             Port Number
JVM Diagnostics Managed Server	      3800
JVM Diagnostics Managed Server (SSL)  3801

ADP RMI Registry Port	              51099
ADP Java Provider Port	              55003
ADP Remote Service Controller Port	  55000
ADP Listen                            4210
ADP Listen Port (SSL)                 4211
BI Publisher HTTP                     9701
BI Publisher HTTPS                    9702
Secure web connection to   443

Note: Port 443 is an outgoing HTTPS connection from OMS, used for communication with Oracle for OCM, MOS, Patching, Self-Updates, and ASR.

To verify the details of the ports used by OMS, you can run the following:

[oracle@gc12c bin]$ emctl status oms -details
Oracle Enterprise Manager Cloud Control 12c Release 2
Copyright (c) 1996, 2012 Oracle Corporation. All rights reserved.
Enter Enterprise Manager Root (SYSMAN) Password :
Console Server Host :
HTTP Console Port : 7789
HTTPS Console Port : 7801
HTTP Upload Port : 4890
HTTPS Upload Port : 4901
EM Instance Home : /u01/app/oracle/oms12c/gc_inst/em/EMGC_OMS1
OMS Log Directory Location : /u01/app/oracle/oms12c/gc_inst/em/EMGC_OMS1/sysman/log
OMS is not configured with SLB or virtual hostname
Agent Upload is locked.
OMS Console is locked.
Active CA ID: 1
Console URL:
Upload URL:

WLS Domain Information
Domain Name : GCDomain
Admin Server Host:

Managed Server Information
Managed Server Instance Name: EMGC_OMS1
Managed Server Instance Host:
WebTier is Up
Oracle Management Server is Up

I could not find a command that shows which port numbers are used for components such as Node Manager and Managed Server. But I did find a way to get this information from a temp file created during the initial installation. The file is MIDDLEWARE_HOME/.gcinstall_temp/staticports.ini on the OMS host.

[oracle@gc12c oracle]$ cat /u01/app/oracle/oms12c/.gcinstall_temp/staticports.ini
Enterprise Manager Upload Http Port=4890
Enterprise Manager Upload Http SSL Port=4901
Enterprise Manager Central Console Http SSL Port=7801
Node Manager Http SSL Port=7405
Managed Server Http Port=7203
Enterprise Manager Central Console Http Port=7789
Oracle Management Agent Port=3872
Admin Server Http SSL Port=7102
Managed Server Http SSL Port=7302
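
The file is simple Name=port text, so it is easy to load into a script, for example to build a firewall request list. This is a small sketch, assuming every useful line has the same Name=number form as the listing above. The function name and the sample text are made up for illustration.

```python
def parse_static_ports(text):
    """Parse 'Name=port' lines from staticports.ini into a {name: port} dict."""
    ports = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines, comments, and anything that is not Name=value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, _, value = line.partition("=")
        if value.strip().isdigit():
            ports[name.strip()] = int(value.strip())
    return ports


# A few lines copied from the listing above.
sample = """\
Enterprise Manager Upload Http Port=4890
Node Manager Http SSL Port=7405
Managed Server Http Port=7203
"""

for name, port in sorted(parse_static_ports(sample).items(), key=lambda item: item[1]):
    print(f"{port:>5}  {name}")
```

On the OMS host the same function could be fed the contents of the staticports.ini file instead of the sample string.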

The following chart shows firewall configurations for OEM components.


Related Posts:

Default Port Numbers Used on Exadata: Port Numbers for General Use Part 1 of 3

Default Port Numbers Used on Exadata: Port Numbers for ILOM Part 2 of 3