If 2017 was the year of Docker, 2018 is the year of Kubernetes. Kubernetes makes container management easy. It does not manage containers directly; it manages pods. A pod is a deployable unit made up of one or more tightly coupled containers. Kubernetes also supports horizontal autoscaling of pods: when an application is accessed by a large number of users, you can instruct Kubernetes to replicate your pods to balance the load. As you would expect, Spark can be deployed on Kubernetes, and there are currently a few ways to do it.
1. Standalone Spark Cluster
Spark Standalone Mode is a nice way to quickly start a Spark cluster without using YARN or Mesos. With this approach you don’t have to use HDFS to store huge datasets. Instead you can keep the data in cloud storage and decouple the Spark cluster from its storage. A Spark cluster has one pod for the Spark master and multiple pods for the Spark workers. When you want to run a job, deploy the Spark master and create a master service, then deploy multiple Spark workers. Once the job completes, delete all the pods from the Kubernetes workload.
Actually, this is the way I would recommend running jobs against big datasets in the cloud. You don’t need a 200-node Spark cluster running all the time; just start one whenever you need to run a job. This can save significantly on cloud cost. The standalone Spark cluster is not the topic of this blog, and I may cover it in a different one.
2. Spark on Kubernetes
Spark on Kubernetes is another interesting mode for running a Spark cluster. It uses the native Kubernetes scheduler for resource management of the Spark cluster. Here is the architecture of Spark on Kubernetes.
There is a blog, Apache Spark 2.3 with Native Kubernetes Support, which goes through the steps to run the basic Pi example. However, when I followed the steps it did not work; many steps and details are missing. After some research, I figured out the correct steps to run it on Google Cloud Platform (GCP). This blog walks through how to run the Pi example on Kubernetes.
Download Apache Spark 2.3
One of the major changes in this release is the inclusion of the new Kubernetes scheduler backend. The software can be downloaded at http://spark.apache.org/releases/spark-release-2-3-0.html or http://spark.apache.org/downloads.html. After downloading the software, extract the archive on the local machine.
Build Docker Image
Spark on Kubernetes requires you to specify an image for its driver and executors. I could get a Spark image from somewhere else, but I like to build the image myself so I can easily customize it in the future. There is a Dockerfile under the spark-2.3.0-bin-hadoop2.7/kubernetes/dockerfiles/spark directory.
    [root@docker1 spark]# cat Dockerfile
    #
    # Licensed to the Apache Software Foundation (ASF) under one or more
    # contributor license agreements. See the NOTICE file distributed with
    # this work for additional information regarding copyright ownership.
    # The ASF licenses this file to You under the Apache License, Version 2.0
    # (the "License"); you may not use this file except in compliance with
    # the License. You may obtain a copy of the License at
    #
    # http://www.apache.org/licenses/LICENSE-2.0
    #
    # Unless required by applicable law or agreed to in writing, software
    # distributed under the License is distributed on an "AS IS" BASIS,
    # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    # See the License for the specific language governing permissions and
    # limitations under the License.
    #

    FROM openjdk:8-alpine

    ARG spark_jars=jars
    ARG img_path=kubernetes/dockerfiles

    # Before building the docker image, first build and make a Spark distribution following
    # the instructions in http://spark.apache.org/docs/latest/building-spark.html.
    # If this docker file is being used in the context of building your images from a Spark
    # distribution, the docker build command should be invoked from the top level directory
    # of the Spark distribution. E.g.:
    # docker build -t spark:latest -f kubernetes/dockerfiles/spark/Dockerfile .

    RUN set -ex && \
        apk upgrade --no-cache && \
        apk add --no-cache bash tini libc6-compat && \
        mkdir -p /opt/spark && \
        mkdir -p /opt/spark/work-dir \
        touch /opt/spark/RELEASE && \
        rm /bin/sh && \
        ln -sv /bin/bash /bin/sh && \
        chgrp root /etc/passwd && chmod ug+rw /etc/passwd

    COPY ${spark_jars} /opt/spark/jars
    COPY bin /opt/spark/bin
    COPY sbin /opt/spark/sbin
    COPY conf /opt/spark/conf
    COPY ${img_path}/spark/entrypoint.sh /opt/
    COPY examples /opt/spark/examples
    COPY data /opt/spark/data

    ENV SPARK_HOME /opt/spark

    WORKDIR /opt/spark/work-dir

    ENTRYPOINT [ "/opt/entrypoint.sh" ]
Pay close attention to the line COPY examples /opt/spark/examples. The jar file for the Pi example is in the examples directory. Remember to use the path inside the image, /opt/spark/examples, instead of the path on the local machine that runs the job submission. I ran into a SparkPi class not found error because I had given the path to the jar file on my local computer instead of the path in the Docker image.
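One way to guard against this mistake is to confirm that the jar exists inside the image and to build the application URI from the in-image path. The following is only a minimal sketch: list_image_jars and app_jar_uri are hypothetical helper names, not part of the Spark distribution, and the first one assumes Docker is available and the image has ls on its PATH.

```shell
# Hypothetical helpers, not part of the Spark distribution.

# List the example jars baked into an image. The entrypoint is overridden
# so the container runs ls and exits instead of starting Spark.
# Not called here because it needs Docker and a built image.
list_image_jars() {
    docker run --rm --entrypoint ls "$1" /opt/spark/examples/jars
}

# Turn an absolute path inside the image into a local:// URI for spark-submit.
app_jar_uri() {
    case "$1" in
        /*) printf 'local://%s\n' "$1" ;;
        *)  echo "expected an absolute path inside the image: $1" >&2
            return 1 ;;
    esac
}

app_jar_uri /opt/spark/examples/jars/spark-examples_2.11-2.3.0.jar
```

The local:// scheme tells Spark that the jar is already present in the container's file system, so a path from the submitting machine will not resolve.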
I have a Docker VM that I use for all Docker-related operations. Log on to the Docker VM and run the following to download and extract the software:
    [root@docker1 ]# mkdir spark-2.3
    [root@docker1 ]# cd spark-2.3
    [root@docker1 spark-2.3]# wget http://www-eu.apache.org/dist/spark/spark-2.3.0/spark-2.3.0-bin-hadoop2.7.tgz
    --2018-04-24 19:11:09-- http://www-eu.apache.org/dist/spark/spark-2.3.0/spark-2.3.0-bin-hadoop2.7.tgz
    Resolving www-eu.apache.org (www-eu.apache.org)... 195.154.151.36, 2001:bc8:2142:300::
    Connecting to www-eu.apache.org (www-eu.apache.org)|195.154.151.36|:80... connected.
    HTTP request sent, awaiting response... 200 OK
    Length: 226128401 (216M) [application/x-gzip]
    Saving to: ‘spark-2.3.0-bin-hadoop2.7.tgz’

    100%[===========================================================================================================================================>] 226,128,401 26.8MB/s in 8.8s

    2018-04-24 19:11:18 (24.6 MB/s) - ‘spark-2.3.0-bin-hadoop2.7.tgz’ saved [226128401/226128401]

    [root@docker1 spark-2.3]# ls -l
    total 220856
    -rw-r--r--. 1 root root     22860 Apr 24 19:10 spark-2.3.0-bin-hadoop2.7.tgz
    -rw-r--r--. 1 root root 226128401 Feb 22 19:54 spark-2.3.0-bin-hadoop2.7.tgz.1
    [root@docker1 spark-2.3]# tar -xzf spark-2.3.0-bin-hadoop2.7.tgz
Build the image and push it to my private Google Container Registry.
    [root@docker1 spark-2.3.0-bin-hadoop2.7]# bin/docker-image-tool.sh -r gcr.io/wz-gcptest-357812 -t k8s-spark-2.3 build
    Sending build context to Docker daemon 256.4MB
    Step 1/14 : FROM openjdk:8-alpine
    8-alpine: Pulling from library/openjdk
    ff3a5c916c92: Pull complete
    5de5f69f42d7: Pull complete
    fd869c8b9b59: Pull complete
    Digest: . . . .
    Step 13/14 : WORKDIR /opt/spark/work-dir
    Removing intermediate container ed4b6fe3efd6
     ---> 69cd2dd1cae8
    Step 14/14 : ENTRYPOINT [ "/opt/entrypoint.sh" ]
     ---> Running in 07da54b9fd34
    Removing intermediate container 07da54b9fd34
     ---> 9c3bd46e026d
    Successfully built 9c3bd46e026d
    Successfully tagged gcr.io/wz-gcptest-357812/spark:k8s-spark-2.3

    [root@docker1 spark-2.3.0-bin-hadoop2.7]# bin/docker-image-tool.sh -r gcr.io/wz-gcptest-357812 -t k8s-spark-2.3 push
    The push refers to repository [gcr.io/wz-gcptest-357812/spark]
    e7930b27b5e2: Pushed
    6f0480c071be: Pushed
    d7e218db3d89: Pushed
    8281f673b660: Pushed
    92e162ecfbe3: Pushed
    938ba54601ba: Pushed
    dc1345b437d9: Pushed
    4e3f1d639db8: Pushed
    685fdd7e6770: Layer already exists
    c9b26f41504c: Layer already exists
    cd7100a72410: Layer already exists
    k8s-spark-2.3: digest: sha256:2f865bf17985317909c866d036ba7988e1dbfc5fe10440a95f366264ceee0518 size: 2624

    [root@docker1 ~]# docker image ls
    REPOSITORY                       TAG             IMAGE ID       CREATED        SIZE
    gcr.io/wz-gcptest-357812/spark   k8s-spark-2.3   9c3bd46e026d   3 days ago     346MB
    ubuntu                           16.04           c9d990395902   2 weeks ago    113MB
    hello-world                      latest          e38bc07ac18e   2 weeks ago    1.85kB
    openjdk                          8-alpine        224765a6bdbe   3 months ago   102MB
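The -r and -t options combine into the image name reported in the build output, and the same name is needed again at job submission. As a rough sketch of that naming, the hypothetical image_ref helper below only mirrors what the output above shows; it is not the script's actual code.

```shell
# Hypothetical helper: compose <repo>/spark:<tag>, matching the name that
# appears in the "Successfully tagged" line of the build output.
image_ref() {
    repo="$1"
    tag="$2"
    # ${repo%/} drops one trailing slash, if present, to avoid a double slash.
    printf '%s/spark:%s\n' "${repo%/}" "$tag"
}

SPARK_IMAGE="$(image_ref gcr.io/wz-gcptest-357812 k8s-spark-2.3)"
echo "$SPARK_IMAGE"
```

Keeping the name in one variable avoids a mismatch between the image that was pushed and the image named at submission time.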
Check Google Container Registry. It shows the image with the correct tag k8s-spark-2.3.
Configure RBAC
I already have a Kubernetes cluster up and running with 3 nodes. I have to set up Role-Based Access Control (RBAC) to allow Spark on Kubernetes to work. Otherwise it throws the following error during job execution:
Caused by: io.fabric8.kubernetes.client.KubernetesClientException: Failure executing: GET at: https://kubernetes.default.svc/api/v1/namespaces/default/pods/spark-pi-449efacd5a4a386ca31177faddb8eab4-driver. Message: Forbidden!Configured service account doesn’t have access. Service account may have been revoked. pods “spark-pi-449efacd5a4a386ca31177faddb8eab4-driver” is forbidden: User “system:serviceaccount:default:default” cannot get pods in the namespace “default”: Unknown user “system:serviceaccount:default:default”.
Check the existing service accounts and cluster role bindings.
    weidong.zhou:@macpro spark-2.3.0-bin-hadoop2.7 > kubectl get serviceaccount
    NAME      SECRETS   AGE
    default   1         5m
    weidong.zhou:@macpro spark-2.3.0-bin-hadoop2.7 > kubectl get clusterrolebinding
    NAME                                        AGE
    cluster-admin                               5m
    event-exporter-rb                           5m
    gce:beta:kubelet-certificate-bootstrap      5m
    gce:beta:kubelet-certificate-rotation       5m
    heapster-binding                            5m
    kube-apiserver-kubelet-api-admin            5m
    kubelet-cluster-admin                       5m
    npd-binding                                 5m
    system:basic-user                           5m
    system:controller:attachdetach-controller   5m
    . . . .
    system:controller:statefulset-controller    5m
    system:controller:ttl-controller            5m
    system:discovery                            5m
    system:kube-controller-manager              5m
    system:kube-dns                             5m
    system:kube-dns-autoscaler                  5m
    system:kube-scheduler                       5m
    system:node                                 5m
    system:node-proxier                         5m
Create the spark service account and cluster role binding.
    weidong.zhou:@macpro spark-2.3.0-bin-hadoop2.7 > kubectl create serviceaccount spark
    serviceaccount "spark" created
    weidong.zhou:@macpro spark-2.3.0-bin-hadoop2.7 > kubectl create clusterrolebinding spark-role --clusterrole=edit --serviceaccount=default:spark --namespace=default
    clusterrolebinding "spark-role" created
    weidong.zhou:@macpro spark-2.3.0-bin-hadoop2.7 > kubectl get serviceaccount
    NAME      SECRETS   AGE
    default   1         1h
    spark     1         56m
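Before submitting a job, it can be worth confirming that the new binding took effect. The sketch below builds the service account's user name in the same form the error message uses, then asks the API server whether that user may work with pods. Both function names are hypothetical, and the list of verbs is my assumption about what the driver needs. kubectl auth can-i with --as is a standard kubectl subcommand, but it only works if your own account is allowed to impersonate.

```shell
# Build the user name Kubernetes uses for a service account,
# e.g. system:serviceaccount:default:spark.
sa_user() {
    printf 'system:serviceaccount:%s:%s\n' "$1" "$2"
}

# Ask the API server whether the service account may perform pod operations.
# The verb list is an assumption. Not called here; it needs a live cluster.
check_spark_rbac() {
    ns="$1"
    sa="$2"
    for verb in get list watch create delete; do
        printf '%s pods: ' "$verb"
        kubectl auth can-i "$verb" pods --namespace "$ns" --as "$(sa_user "$ns" "$sa")"
    done
}

sa_user default spark
```

Against a live cluster, check_spark_rbac default spark should print yes for each verb once the spark-role binding is in place.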
Run Spark Application
You might need to set SPARK_LOCAL_IP. You also need to find out MASTER_IP by running kubectl cluster-info | grep master | awk '{print $6}'. Use the following commands to set up the environment.
    export PROJECT_ID="wz-gcptest-357812"
    export ZONE="us-east1-b"
    export KUBE_CLUSTER_NAME="wz-kube1"
    gcloud config set project ${PROJECT_ID}
    gcloud config set compute/zone ${ZONE}
    gcloud container clusters get-credentials ${KUBE_CLUSTER_NAME}
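Once the credentials are in place, the master address from kubectl cluster-info can be turned into the value for spark-submit's --master option. This is a sketch under assumptions: the helper name is hypothetical, and it expects the "Kubernetes master is running at URL" line format, which other kubectl versions may word differently. It strips ANSI color codes first, since some kubectl versions colorize this output.

```shell
# Hypothetical helper: read `kubectl cluster-info` output on stdin and print
# the k8s:// master URL for spark-submit.
k8s_master_from_cluster_info() {
    esc="$(printf '\033')"
    # Strip ANSI color codes, then take field 6 of the first "master" line.
    sed "s/${esc}\[[0-9;]*m//g" | awk '/master/ { print "k8s://" $6; exit }'
}

# Usage against a live cluster (commented out so the sketch runs offline):
# MASTER="$(kubectl cluster-info | k8s_master_from_cluster_info)"

printf 'Kubernetes master is running at https://104.136.128.109\n' |
    k8s_master_from_cluster_info
# prints k8s://https://104.136.128.109
```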
Finally I can run the job. I intentionally passed a parameter of 1000000 to make the job run for a long time.
    bin/spark-submit \
        --master k8s://https://104.136.128.109 \
        --deploy-mode cluster \
        --name spark-pi \
        --class org.apache.spark.examples.SparkPi \
        --conf spark.executor.instances=2 \
        --conf spark.app.name=spark-pi \
        --conf spark.kubernetes.authenticate.driver.serviceAccountName=spark \
        --conf spark.kubernetes.container.image=gcr.io/wz-gcptest-357812/spark:k8s-spark-2.3 \
        local:///opt/spark/examples/jars/spark-examples_2.11-2.3.0.jar 1000000
If you check GCP’s Kubernetes Workload screen, you will see one Spark driver and two executors running.
Monitor the Spark Job
If the job runs long enough, you will see the screen below when checking the pod details. It shows CPU, memory and disk usage, which is usually good enough for monitoring purposes.
But how do I get to the Spark UI? There is no resource manager like YARN in the picture. For now I need to use port forwarding to access the Spark UI. Find the driver pod and then set up port forwarding.
    weidong.zhou:@macpro ~ > kubectl get pods
    NAME                                               READY   STATUS    RESTARTS   AGE
    spark-pi-6e2c3b5d707531689031d3259f57b2ea-driver   1/1     Running   0          7m
    spark-pi-6e2c3b5d707531689031d3259f57b2ea-exec-1   1/1     Running   0          7m
    spark-pi-6e2c3b5d707531689031d3259f57b2ea-exec-2   1/1     Running   0          7m
    weidong.zhou:@macpro ~ > kubectl port-forward spark-pi-6e2c3b5d707531689031d3259f57b2ea-driver 4040:4040
    Forwarding from 127.0.0.1:4040 -> 4040
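Because the driver pod name contains a generated hash, copying it by hand gets tedious. A small sketch of picking it out of the kubectl get pods listing follows; the helper name is hypothetical, and it relies on the -driver suffix seen in the pod names above.

```shell
# Hypothetical helper: read `kubectl get pods` output on stdin and print the
# name of the first running pod whose name ends in -driver.
running_driver_pod() {
    awk '$1 ~ /-driver$/ && $3 == "Running" { print $1; exit }'
}

# Usage against a live cluster (commented out so the sketch runs offline):
# kubectl port-forward "$(kubectl get pods | running_driver_pod)" 4040:4040

running_driver_pod <<'EOF'
NAME                                               READY   STATUS    RESTARTS   AGE
spark-pi-6e2c3b5d707531689031d3259f57b2ea-driver   1/1     Running   0          7m
spark-pi-6e2c3b5d707531689031d3259f57b2ea-exec-1   1/1     Running   0          7m
spark-pi-6e2c3b5d707531689031d3259f57b2ea-exec-2   1/1     Running   0          7m
EOF
# prints spark-pi-6e2c3b5d707531689031d3259f57b2ea-driver
```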
Find out the IP for the pod.
    weidong.zhou:@macpro mytest_gcp > kubectl get pod -o wide
    NAME                                               READY   STATUS    RESTARTS   AGE   IP          NODE
    spark-pi-6e2c3b5d707531689031d3259f57b2ea-driver   1/1     Running   0          10m   10.44.0.8   gke-wz-kube1-default-pool-2aac262a-thw0
    spark-pi-6e2c3b5d707531689031d3259f57b2ea-exec-1   1/1     Running   0          10m   10.44.2.8   gke-wz-kube1-default-pool-2aac262a-09vt
    spark-pi-6e2c3b5d707531689031d3259f57b2ea-exec-2   1/1     Running   0          10m   10.44.1.6   gke-wz-kube1-default-pool-2aac262a-23gk
Now we can see the familiar Spark UI through the forwarded port at http://localhost:4040.
To check the logs from the driver pod, run the following:
    weidong.zhou:@macpro mytest_gcp > kubectl -n=default logs -f spark-pi-6e2c3b5d707531689031d3259f57b2ea-driver
    2018-04-27 20:40:02 INFO TaskSetManager:54 - Starting task 380242.0 in stage 0.0 (TID 380242, 10.44.1.6, executor 2, partition 380242, PROCESS_LOCAL, 7865 bytes)
    2018-04-27 20:40:02 INFO TaskSetManager:54 - Finished task 380240.0 in stage 0.0 (TID 380240) in 3 ms on 10.44.1.6 (executor 2) (380241/1000000)
    2018-04-27 20:40:02 INFO TaskSetManager:54 - Starting task 380243.0 in stage 0.0 (TID 380243, 10.44.2.8, executor 1, partition 380243, PROCESS_LOCAL, 7865 bytes)
    2018-04-27 20:40:02 INFO TaskSetManager:54 - Finished task 380241.0 in stage 0.0 (TID 380241) in 5 ms on 10.44.2.8 (executor 1) (380242/1000000)
    2018-04-27 20:40:02 INFO TaskSetManager:54 - Starting task 380244.0 in stage 0.0 (TID 380244, 10.44.1.6, executor 2, partition 380244, PROCESS_LOCAL, 7865 bytes)
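The Finished task lines end with a finished/total counter, so the driver log doubles as a rough progress indicator. Here is a sketch of pulling out the latest counter; the helper name is hypothetical, and it assumes the log line format shown above.

```shell
# Hypothetical helper: read driver log lines on stdin and print the most
# recent finished/total task counter from the "Finished task" lines.
task_progress() {
    sed -n 's/.*Finished task .*(\([0-9][0-9]*\)\/\([0-9][0-9]*\))$/\1\/\2/p' |
        tail -n 1
}

# Usage against a live cluster (commented out so the sketch runs offline):
# kubectl logs spark-pi-6e2c3b5d707531689031d3259f57b2ea-driver | task_progress

task_progress <<'EOF'
2018-04-27 20:40:02 INFO TaskSetManager:54 - Finished task 380240.0 in stage 0.0 (TID 380240) in 3 ms on 10.44.1.6 (executor 2) (380241/1000000)
2018-04-27 20:40:02 INFO TaskSetManager:54 - Starting task 380243.0 in stage 0.0 (TID 380243, 10.44.2.8, executor 1, partition 380243, PROCESS_LOCAL, 7865 bytes)
2018-04-27 20:40:02 INFO TaskSetManager:54 - Finished task 380241.0 in stage 0.0 (TID 380241) in 5 ms on 10.44.2.8 (executor 1) (380242/1000000)
EOF
# prints 380242/1000000
```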
Killing Executor and Driver
What happens if I kill one of the executors?
    weidong.zhou:@macpro mytest_gcp > kubectl get pods
    NAME                                               READY   STATUS    RESTARTS   AGE
    spark-pi-6e2c3b5d707531689031d3259f57b2ea-driver   1/1     Running   0          23m
    spark-pi-6e2c3b5d707531689031d3259f57b2ea-exec-1   1/1     Running   0          23m
    spark-pi-6e2c3b5d707531689031d3259f57b2ea-exec-2   1/1     Running   0          23m
    weidong.zhou:@macpro mytest_gcp > kubectl delete pod spark-pi-6e2c3b5d707531689031d3259f57b2ea-exec-1
    pod "spark-pi-6e2c3b5d707531689031d3259f57b2ea-exec-1" deleted
    weidong.zhou:@macpro mytest_gcp > kubectl get pods
    NAME                                               READY   STATUS    RESTARTS   AGE
    spark-pi-6e2c3b5d707531689031d3259f57b2ea-driver   1/1     Running   0          25m
    spark-pi-6e2c3b5d707531689031d3259f57b2ea-exec-2   1/1     Running   0          25m
After 30 seconds, check again. A new executor starts.
    weidong.zhou:@macpro mytest_gcp > kubectl get pods
    NAME                                               READY   STATUS    RESTARTS   AGE
    spark-pi-6e2c3b5d707531689031d3259f57b2ea-driver   1/1     Running   0          26m
    spark-pi-6e2c3b5d707531689031d3259f57b2ea-exec-2   1/1     Running   0          25m
    spark-pi-6e2c3b5d707531689031d3259f57b2ea-exec-3   1/1     Running   0          19s
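Rather than checking by hand after 30 seconds, the executor count can be polled until the replacement is up. This is a minimal sketch: the helper name is hypothetical, and it assumes the -exec-N naming shown in the listings above.

```shell
# Hypothetical helper: read `kubectl get pods` output on stdin and print how
# many executor pods (names ending in -exec-<number>) are in Running status.
count_running_executors() {
    awk '$1 ~ /-exec-[0-9]+$/ && $3 == "Running" { n++ } END { print n + 0 }'
}

# Usage against a live cluster (commented out so the sketch runs offline):
# until [ "$(kubectl get pods | count_running_executors)" -ge 2 ]; do sleep 5; done

count_running_executors <<'EOF'
NAME                                               READY   STATUS    RESTARTS   AGE
spark-pi-6e2c3b5d707531689031d3259f57b2ea-driver   1/1     Running   0          26m
spark-pi-6e2c3b5d707531689031d3259f57b2ea-exec-2   1/1     Running   0          25m
spark-pi-6e2c3b5d707531689031d3259f57b2ea-exec-3   1/1     Running   0          19s
EOF
# prints 2
```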
The Spark UI shows the executor changes.
This is actually what I expected. OK, what happens if I kill the driver?
    weidong.zhou:@macpro mytest_gcp > kubectl get pods
    NAME                                               READY   STATUS    RESTARTS   AGE
    spark-pi-6e2c3b5d707531689031d3259f57b2ea-driver   1/1     Running   0          31m
    spark-pi-6e2c3b5d707531689031d3259f57b2ea-exec-2   1/1     Running   0          31m
    spark-pi-6e2c3b5d707531689031d3259f57b2ea-exec-3   1/1     Running   0          5m
    weidong.zhou:@macpro mytest_gcp > kubectl delete pod spark-pi-6e2c3b5d707531689031d3259f57b2ea-driver
    pod "spark-pi-6e2c3b5d707531689031d3259f57b2ea-driver" deleted
    weidong.zhou:@macpro mytest_gcp > kubectl get pods
    No resources found, use --show-all to see completed objects.
So killing the driver pod is actually the way to stop a Spark application during execution.
The nice thing about Spark on Kubernetes is that the executor pods are cleaned up whether the Spark job completes by itself or is killed (a driver that finishes normally stays behind in Completed state but no longer consumes compute resources). This frees up resources automatically. Overall, Spark on Kubernetes is an easy way to quickly run a Spark application on Kubernetes.