Apache Spark: Client vs. Cluster Mode


Whenever a user submits a Spark application, it can be difficult to decide which deployment mode to use. Deployment mode determines where the driver program runs, and with it a large part of how your Spark application behaves. So, don't worry, today we will discuss exactly that: the difference between Client and Cluster mode, and which one is the better choice in production.

But before getting to deployment modes, we should first understand how Spark executes a job.

How Spark Executes a Job

Whenever a user runs a Spark application, it is launched through spark-submit. spark-submit submits your job to the cluster, and it also provides an option to specify the deployment mode. This is where the user chooses between Client mode and Cluster mode. And if the same scenario is implemented over YARN, these become YARN-client mode and YARN-cluster mode.
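As a rough illustration (assuming Spark on YARN, with com.example.MyApp and my-app.jar as placeholder names), the mode is chosen with the --deploy-mode flag of spark-submit:

```
# Client mode (the default): the driver runs where spark-submit runs.
spark-submit \
  --master yarn \
  --deploy-mode client \
  --class com.example.MyApp \
  /path/to/my-app.jar

# Cluster mode: the driver is launched on one of the cluster's nodes.
spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --class com.example.MyApp \
  /path/to/my-app.jar
```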

So, when a user submits a job, two kinds of processes get spawned: the Driver Program and the Executor Programs. The Driver Program is the process that drives your complete Spark job and is spawned on one of the machines/nodes of the cluster. The Driver Program then spawns Executor Programs on the worker nodes, i.e. on different nodes of the cluster. These Executor Programs are the actual workers that do the data processing, and they are controlled by the Driver Program: all transformations and actions are executed by the Executors. As said, the Driver Program is the program that drives the Spark job; it is effectively the main program, so if the Driver Program exits for any reason, the whole job terminates.
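Because the driver and the executors are separate processes, their resources are also requested separately at submit time. A minimal sketch (the sizes below are placeholders, not recommendations):

```
# The driver and the executors are distinct processes,
# so they are sized independently of each other.
spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --driver-memory 2g \
  --num-executors 4 \
  --executor-memory 4g \
  --executor-cores 2 \
  --class com.example.MyApp \
  /path/to/my-app.jar
```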

So, this is how your Spark job is executed. 

Now let's discuss what happens when we execute Spark in Client Mode versus Cluster Mode.

Client Mode 


So, let's say a user submits a job. Initially, this job goes to the Edge Node, which is where your spark-submit runs. The Edge Node is basically a gateway to your cluster.

When we do spark-submit, the Driver Program launches. In client mode, the Driver Program is spawned on the same node/machine where spark-submit is running, in our case the Edge Node, whereas the executors, which are spawned by the Driver Program, launch on other nodes of the cluster.

Also, the Driver Program will consume the Edge Node's own resources, such as memory and CPU.
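As a sketch of what such a client-mode submission might look like (placeholder names and sizes again): in client mode the driver JVM starts on the edge node the moment spark-submit runs, so its memory has to be set on the command line or in spark-defaults.conf rather than in the application code, and it is carved out of the edge node's own RAM.

```
# Client mode: the driver JVM starts on this (edge) node at submit time,
# so --driver-memory comes out of the edge node's memory.
spark-submit \
  --master yarn \
  --deploy-mode client \
  --driver-memory 2g \
  --class com.example.MyApp \
  /path/to/my-app.jar
```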

So, let's say there are 3 users running 3 jobs. In Client Mode, all 3 Driver Programs will launch on the single Edge Node, and all 3 will compete for that one node's resources. As more jobs pile up, the Edge Node's resources eventually get exhausted and the drivers start throwing Out of Memory (OOM) exceptions. And once a Driver Program exits or fails, its whole job terminates.

Client Mode is typically chosen when there are only a limited number of jobs and users (for example, interactive testing and debugging). Even then you can still face OOM exceptions, because you can't always predict how many users will be working on your Spark cluster at the same time. So, go with Client Mode only when your requirements are limited.

Cluster Mode

In Cluster mode, when we do spark-submit the job is still submitted from the Edge Node. But in this mode the Driver Program is not launched on the Edge Node; instead, the Edge Node hands the job over to the cluster manager, which spawns the Driver Program on one of the available nodes of the cluster.

So, let's say machine1 is chosen to host the Driver Program based on resource availability. The driver will run on machine1, and the Driver Program will then launch executors on other machines. Now the Driver Program is not using the Edge Node's resources; it has a node of its own, and resources are used more efficiently because the cluster manager places the driver on a node that actually has resources available.
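On the command line, a cluster-mode submission looks almost the same; only the --deploy-mode value changes. One practical consequence, sketched below with placeholder names, is that the spark-submit process on the edge node can even be allowed to return right after submission while the driver keeps running inside the cluster (on YARN this is controlled by spark.yarn.submit.waitAppCompletion):

```
# Cluster mode: the driver is launched on a cluster node by the cluster manager.
spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --driver-memory 2g \
  --class com.example.MyApp \
  /path/to/my-app.jar

# Optional (YARN only): let spark-submit return immediately after submitting
# instead of waiting for the application to finish.
#   --conf spark.yarn.submit.waitAppCompletion=false
```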

Now consider 3 users again. In this scenario, machine1 might be chosen for the first user's Driver Program, and that driver will then use some of the other machines for its executors. Similarly, the other 2 users will get separate Driver Programs on 2 different machines, each followed by its own executors. That means each Driver Program has its own resources and the load is well distributed across the cluster, so there is little to no chance of an OOM exception caused by drivers piling up on a single node.

And it is generally good practice to use Cluster Mode in production because of this distributed nature.

Hope this makes both Client Mode and Cluster Mode clear.

If you like this blog, please do show your appreciation by hitting the like button and sharing it. Also, drop a comment about the post and any improvements if needed. Till then, HAPPY LEARNING.
