How can I set up a multi-node job for a Spark application through a batch file?
I tried the following script, but it looks like there are some errors. Can someone tell me if I am missing something in the configuration below?
#!/bin/bash
#SBATCH --nodes=2
#SBATCH --time=00:20:00
#SBATCH --mem=80G
#SBATCH --cpus-per-task=4
#SBATCH --ntasks-per-node=2
#SBATCH --output=sparkjob-%j.out
#SBATCH --mail-type=ALL
#SBATCH --error=/project/6008168/moudi/error6_hours.out
#SBATCH --exclusive
## --------------------------------------
## 0. Preparation
## --------------------------------------
# load the Spark module
module load spark/2.3.0
module load python/3.7.0
source "/home/moudi/ENV3.7.0/bin/activate"
# identify the Spark cluster with the Slurm jobid
export SPARK_IDENT_STRING=$SLURM_JOBID
# prepare directories
export SPARK_WORKER_DIR=${SPARK_WORKER_DIR:-$HOME/.spark/2.3.0/worker}
export SPARK_LOG_DIR=${SPARK_LOG_DIR:-$HOME/.spark/2.3.0/logs}
export SPARK_LOCAL_DIRS=${SPARK_LOCAL_DIRS:-/tmp/spark}
mkdir -p $SPARK_LOG_DIR $SPARK_WORKER_DIR
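These defaults rely on Bash's `${VAR:-default}` parameter expansion, and the braces are mandatory: writing `$VAR:-default` instead expands `$VAR` and then appends the literal text `:-default`. A minimal standalone check of the expansion (using the same log-directory default as the script):

```shell
# start from a clean slate so the default is taken
unset SPARK_LOG_DIR
# ${VAR:-default} substitutes the default only when VAR is unset or empty
export SPARK_LOG_DIR=${SPARK_LOG_DIR:-$HOME/.spark/2.3.0/logs}
echo "$SPARK_LOG_DIR"
```

If `SPARK_LOG_DIR` were already exported by the environment or the Spark module, the expansion would leave it untouched.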
## --------------------------------------
## 1. Start the Spark cluster master
## --------------------------------------
start-master.sh
sleep 5
MASTER_URL=$(grep -Po '(?=spark://).*' $SPARK_LOG_DIR/spark-$SPARK_IDENT_STRING-org.apache.spark.deploy.master*.out)
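The master URL is scraped out of the master's log file; the `(?=spark://)` lookahead anchors the match so that everything from `spark://` to the end of the line is captured. A self-contained check against a hypothetical log line (the host name `node1` and timestamp are made up):

```shell
# a line shaped like the one start-master.sh writes to its log
line='19/05/01 14:30:00 INFO Master: Starting Spark master at spark://node1:7077'
# keep everything from spark:// onward, as in the script's grep
MASTER_URL=$(printf '%s\n' "$line" | grep -Po '(?=spark://).*')
echo "$MASTER_URL"
```

Note that if the master is restarted within the same job, several log files can match the glob, so the grep may return more than one line; taking only the most recent log file would make the extraction more robust.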
## --------------------------------------
## 2. Start the Spark cluster workers
## --------------------------------------
# get the resource details from the Slurm job
export SPARK_WORKER_CORES=${SLURM_CPUS_PER_TASK:-1}
export SLURM_SPARK_MEM=$(printf "%.0f" $((SLURM_MEM_PER_NODE * 95 / 100)))
export SPARK_DAEMON_MEMORY=${SLURM_SPARK_MEM}M
export SPARK_WORKER_MEMORY=${SLURM_SPARK_MEM}M
export SPARK_EXECUTOR_MEMORY=${SLURM_SPARK_MEM}M
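Slurm exports `SLURM_MEM_PER_NODE` in megabytes, and Spark's memory settings expect a unit suffix, so the 95% headroom computation should end up as a value like `77824M`. A standalone sketch of that arithmetic (the `81920` value is what a hypothetical `--mem=80G` allocation would give):

```shell
# Slurm reports memory per node in MB; --mem=80G would yield 81920
SLURM_MEM_PER_NODE=81920
# reserve 5% for the OS and Slurm daemons; integer arithmetic truncates
SLURM_SPARK_MEM=$((SLURM_MEM_PER_NODE * 95 / 100))
# Spark wants an explicit unit suffix
SPARK_DAEMON_MEMORY=${SLURM_SPARK_MEM}M
echo "$SPARK_DAEMON_MEMORY"
```

Referencing an unset variable here (for example a misspelled `$SPARK_MEM`) would silently produce an empty memory setting, which is one of the easier mistakes to make in this kind of script.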
# start the workers on each node allocated to the job
export SPARK_NO_DAEMONIZE=1
srun --output=$SPARK_LOG_DIR/spark-%j-workers.out --label start-slave.sh $MASTER_URL &
## --------------------------------------
## 3. Submit a task to the Spark cluster
## --------------------------------------
spark-submit --master $MASTER_URL --total-executor-cores $((SLURM_NTASKS * SLURM_CPUS_PER_TASK)) /project/6008168/moudi/mainold.py
## --------------------------------------
## 4. Clean up
## --------------------------------------
# stop the workers
scancel $SLURM_JOBID.0
# stop the master
stop-master.sh
hpc multi-threading mpio slurm
asked May 1 at 14:30 by moudi