torque reports error when posting job to client nodes
The system has two machines: one (macondo02) runs pbs_server and pbs_sched, and the other (macondo01) runs pbs_mom. I have confirmed that the host can see the guest:
$ pbsnodes -a
macondo01
     state = free
     np = 64
     ntype = cluster
     status = rectime=1403183300,varattr=,jobs=,state=free,netload=1102560564743,gres=,loadave=0.00,ncpus=64,physmem=131988228kb,availmem=263457400kb,totmem=266160896kb,idletime=705,nusers=6,nsessions=17,sessions=2817 59201 59937 18341 21924 27356 30089 31663 32133 32934 34374 7341 42678 58843 59605 59606 59741,uname=Linux macondo01 3.2.0-38-generic #61-Ubuntu SMP Tue Feb 19 12:18:21 UTC 2013 x86_64,opsys=linux
However, whenever I submit a job through qsub, the job does not run, and I get an error message in the pbs_server log:
06/19/2014 23:00:19;0040;PBS_Server;Svr;macondo02.edu.au;Scheduler was sent the command new
06/19/2014 23:00:19;0008;PBS_Server;Job;54.macondo02.edu.au;Job Modified at request of Scheduler@macondo02.uq.edu.au
06/19/2014 23:00:19;0008;PBS_Server;Job;54.macondo02.edu.au;Job Run at request of Scheduler@macondo02.uq.edu.au
06/19/2014 23:00:19;0040;PBS_Server;Svr;macondo02.edu.au;Scheduler was sent the command recyc
06/19/2014 23:00:20;0010;PBS_Server;Job;54.macondo02.uq.edu.au;Exit_status=0 resources_used.cput=00:00:00 resources_used.mem=7680kb resources_used.vmem=23876kb resources_used.walltime=00:00:01
06/19/2014 23:00:24;000d;PBS_Server;Job;54.macondo02.uq.edu.au;Post job file processing error; job 54.macondo02.uq.edu.au on host macondo01/0
06/19/2014 23:00:24;0100;PBS_Server;Job;54.macondo02.uq.edu.au;dequeuing from batch, state COMPLETE
06/19/2014 23:00:24;0040;PBS_Server;Svr;macondo02.uq.edu.au;Scheduler was sent the command term
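The server log records above are semicolon-delimited, so the failing record can be picked out mechanically. The following is only a minimal sketch: the field names (`timestamp`, `event_code`, and so on) are labels chosen here for illustration, not names taken from the Torque documentation, and the two sample lines are copied from the log above.

```python
from typing import Iterable, List, NamedTuple


class LogRecord(NamedTuple):
    # Field names are illustrative labels, not official Torque terminology.
    timestamp: str
    event_code: str
    daemon: str
    object_type: str
    object_id: str
    message: str


def parse_record(line: str) -> LogRecord:
    """Split one server log line into its six semicolon-separated fields."""
    # maxsplit=5 keeps any ';' that appears inside the message itself.
    parts = line.rstrip("\n").split(";", 5)
    if len(parts) != 6:
        raise ValueError(f"unexpected record: {line!r}")
    return LogRecord(*parts)


def find_errors(lines: Iterable[str], needle: str = "error") -> List[LogRecord]:
    """Return the records whose message mentions `needle` (case-insensitive)."""
    return [r for r in map(parse_record, lines) if needle in r.message.lower()]


# Two records copied from the log in the question.
SAMPLE = [
    "06/19/2014 23:00:20;0010;PBS_Server;Job;54.macondo02.uq.edu.au;"
    "Exit_status=0 resources_used.cput=00:00:00 resources_used.mem=7680kb "
    "resources_used.vmem=23876kb resources_used.walltime=00:00:01",
    "06/19/2014 23:00:24;000d;PBS_Server;Job;54.macondo02.uq.edu.au;"
    "Post job file processing error; job 54.macondo02.uq.edu.au on host macondo01/0",
]

if __name__ == "__main__":
    for rec in find_errors(SAMPLE):
        print(rec.object_id, "->", rec.message)
```

Read this way, the job itself finished with Exit_status=0 and the only record flagged is the post-job file processing step on macondo01/0.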
Apparently the failure comes from posting the job from the host (i.e. macondo02) to the guest (i.e. macondo01).
I have several ideas in mind:
1. I know it is necessary to set up passwordless SSH between the host and the guest using NFS. I have done that for my own normal user, and I use this user to submit the job with qsub, but the error still occurs.
2. In the error log I saw another user called Scheduler@macondo02.uq.edu.au. However, I can neither find any information about this user with cat /etc/groups nor give it passwordless access to macondo01.
Any suggestions would be appreciated!
torque pbs
asked Jun 19 '14 at 13:26
Chenming Zhang
1 Answer
Try checking /var/log/syslog or the PBS log files on the machine where the job was running, which was host macondo01.
You're looking for something like this, probably an error while copying the job's log file:
pbs_mom: LOG_ERROR::sys_copy, command '/usr/bin/scp -rpB /var/spool/torque/spool...
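If the log is large, a small filter can pull out just these records. This is a rough sketch, not a Torque tool: the pattern is built from the truncated line quoted above, the second sample line is made up for contrast, and reading /var/log/syslog usually needs elevated permissions.

```python
import re
from typing import Iterable, List

# Pattern derived from the example line above; the closing quote is optional
# because that example is truncated.
SYS_COPY = re.compile(r"pbs_mom: LOG_ERROR::sys_copy, command '(?P<command>[^']*)")


def find_copy_failures(lines: Iterable[str]) -> List[str]:
    """Return the copy command from every sys_copy error line."""
    hits = []
    for line in lines:
        match = SYS_COPY.search(line)
        if match:
            hits.append(match.group("command"))
    return hits


def scan_file(path: str = "/var/log/syslog") -> List[str]:
    """Scan a log file on the pbs_mom host (macondo01 in the question)."""
    with open(path, errors="replace") as handle:
        return find_copy_failures(handle)


SAMPLE = [
    # Copied from the (truncated) example above.
    "pbs_mom: LOG_ERROR::sys_copy, command '/usr/bin/scp -rpB /var/spool/torque/spool...",
    # Hypothetical unrelated line, only here to show it is skipped.
    "cron: an unrelated message",
]

if __name__ == "__main__":
    for command in find_copy_failures(SAMPLE):
        print(command)
```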
You can find the actual log from that run in /var/spool/torque/undelivered/.
The problem might be with the PBS_SCP command, which requires passwordless SSH access to the destination machine. Typically it runs a command like this:
$PBS_SCP -rpB <path to source> <user>@<destination.host>:<path to destination>
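To check whether that copy works outside of Torque, the same command can be assembled and run by hand as the job owner on macondo01. The sketch below only mirrors the template above; the scp path comes from the log line quoted earlier, while the user name `alice` and both file paths are hypothetical placeholders. Since `-B` puts scp in batch mode, a setup that would still ask for a password fails instead of prompting.

```python
import shlex
import subprocess
from typing import List, Tuple


def build_copy_command(source: str, user: str, host: str, destination: str,
                       scp: str = "/usr/bin/scp") -> List[str]:
    """Assemble the argv for: scp -rpB <source> <user>@<host>:<destination>."""
    return [scp, "-rpB", source, f"{user}@{host}:{destination}"]


def try_copy(source: str, user: str, host: str, destination: str,
             timeout: int = 30) -> Tuple[int, str]:
    """Run the copy and return (exit code, stderr); non-zero means it failed."""
    cmd = build_copy_command(source, user, host, destination)
    result = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
    return result.returncode, result.stderr.strip()


if __name__ == "__main__":
    # Hypothetical values: replace the user and both paths with real ones.
    cmd = build_copy_command("/tmp/job.OU", "alice", "macondo02",
                             "/home/alice/job.o54")
    print(shlex.join(cmd))
```

If `try_copy` returns a non-zero exit code for the submitting user, the stderr text should say whether the cause is authentication, host key checking, or a missing destination directory.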
edited Apr 22 '15 at 15:56
answered Apr 22 '15 at 14:42
Tombart