Over the last few years, Oracle has dedicated itself to cloud computing, and it is in a very tough race with its competitors. To stand out in this race, Oracle provides more services day by day. One of the services Oracle offers to end users is “Oracle Big Data Cloud Service – Compute Edition”. I examined this service by creating a trial account, and I decided to write a series of blog posts for those who would like to use it.
In my opinion, the most difficult part of creating a big data ecosystem is running many open source software projects together and integrating them with one another. There are three major players on the market that help end users build an integrated and tested solution for big data: Cloudera, Hortonworks and MapR. Oracle has partnered with Cloudera to build the Oracle Big Data Appliance and Oracle Big Data Cloud Service. It also offers “Oracle Big Data Cloud Service – Compute Edition”, which is based on Hortonworks. Creating an “Oracle Big Data Cloud Service – Compute Edition” instance is simple. You get a ready-to-use big data cluster about 15 minutes after providing basic information such as the name of the cluster, the number of servers (nodes), the CPU and disk sizes for each node, and the administrator password.
First, let’s create an “Oracle Big Data Cloud Service – Compute Edition” instance. After you create a trial account for Oracle Cloud, you log in to the “Oracle Cloud” dashboard. From this dashboard you can see all your services and add new ones.
When we click Create Service and select “Big Data – Compute Edition”, we land on the “Oracle Big Data Cloud Service – Compute Edition” service page. On this page, we can click “Create Service” to launch a very simple wizard that creates our big data cluster in only 3 steps.
In the first step, we give the cluster a name and click Next. I entered “bilyonveri” as the name of my cluster.
In the second step, we enter the details of the service (number of nodes, CPU, etc.). We can use a total of 6 CPUs on a trial account, and since a node must have at least 2 CPUs, we can create a cluster with a maximum of 3 nodes. Since the total disk size we can use is about 500 GB (I think it will be replaced with “read request”), we can give up to 80 GB of HDFS storage per node. I recommend selecting the “full” installation type. If you select “basic”, you get only the most basic components, which will only allow you to work with Spark.
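The sizing arithmetic for the trial account can be sketched in a few lines of Python. This is only an illustration of the numbers quoted in this post: the `TrialQuota` class and the helper functions are hypothetical names I made up, not part of any Oracle API, and the limits on your account may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrialQuota:
    """Trial-account limits as quoted in this post (assumed, may differ for you)."""
    total_cpus: int = 6
    min_cpus_per_node: int = 2
    total_disk_gb: int = 500  # approximate figure


def max_nodes(quota: TrialQuota) -> int:
    """Largest node count when every node gets the minimum CPU allocation."""
    return quota.total_cpus // quota.min_cpus_per_node


def hdfs_total_gb(nodes: int, hdfs_gb_per_node: int) -> int:
    """HDFS capacity requested across the whole cluster."""
    return nodes * hdfs_gb_per_node


quota = TrialQuota()
nodes = max_nodes(quota)
requested = hdfs_total_gb(nodes, 80)
print(f"max nodes: {nodes}")
print(f"HDFS requested: {requested} GB of about {quota.total_disk_gb} GB")
```

With the values above, this gives 3 nodes and 240 GB of requested HDFS storage in total.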
Although we create a cloud storage container, our big data cluster does not use it.
In the last step, we confirm the settings and click the “Create” button.
When the service is created, you can see the overall status of your service by clicking on the service name. This page also allows you to stop and restart the service, set access permissions, and reach the Oracle “big data cluster console”.
The following services are installed on our cluster:
- HDFS (2.7.1.2.4)
- YARN + MapReduce2 (2.7.1.2.4)
- Tez (0.7.0.2.4)
- Hive (1.2.1.2.4)
- Pig (0.15.0.2.4)
- ZooKeeper (3.4.6.2.4)
- Spark (1.6.x.2.4)
- Zeppelin Notebook (0.6.0)
- Alluxio (1.2.0)
- BDCSCE Logstash Agent (0.0.1)
- Nginx Reverse Proxy (0.0.1)
- Spark Cloud Service UI (0.5.0)
- Spocs Fabric Service (0.1)
These services may be different for you (if you create the service a month or two after I publish this blog post). Big data technologies (and the services related to big data) are changing very fast. Today, Oracle uses Hortonworks Data Platform (HDP) 2.4.2, while the current HDP release (2.6) comes with different services in its default setup. So these services will probably change when Oracle updates the version of HDP it uses. In my next article, I will explain what these services are used for.
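Several of the version strings in the list above end in “.2.4”. As far as I can tell, this suffix is the HDP release and the part before it is the upstream component version. Here is a small Python sketch that splits the two. The `split_hdp_version` function is a hypothetical helper, and the assumption that the last two fields are always the HDP release is mine.

```python
from typing import Tuple


def split_hdp_version(version: str, hdp_fields: int = 2) -> Tuple[str, str]:
    """Split 'upstream.hdp' into (upstream version, HDP release).

    Assumes the last `hdp_fields` dot-separated fields are the HDP release.
    """
    parts = version.split(".")
    if len(parts) <= hdp_fields:
        raise ValueError(f"no upstream version found in {version!r}")
    return ".".join(parts[:-hdp_fields]), ".".join(parts[-hdp_fields:])


# Version strings taken from the service list in this post.
for name, version in [("Tez", "0.7.0.2.4"), ("Pig", "0.15.0.2.4")]:
    upstream, hdp = split_hdp_version(version)
    print(f"{name}: upstream {upstream}, HDP {hdp}")
```

Services without this suffix, such as Zeppelin Notebook (0.6.0) or Alluxio (1.2.0), do not follow the pattern, so the helper should not be applied to them.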
To monitor our cluster, click on “big data cluster console”.
The console not only shows the general status of the system, but also allows us to see the files in the HDFS file system, access the Zeppelin notebook (I will write a separate blog post about it) and modify the settings of the resource manager.
This is enough for this post. I will continue from where I left off in my next post.