[Pegasus-users] pegasus-monitord took 30G memory
Yu Huang
polyactis at gmail.com
Thu May 3 13:58:53 PDT 2012
The error message looks like this:
Held 00:01   PEAlignmentByBWA.sh_ID0009937
  Error from slot1 at n6231.S1335855707.27@n6231: Failed to open
'/u/home/eeskin2/polyacti/NetworkData/vervet/vervetPipeline/work/ShortRead2Al..
The folder above is a symbolic link to a local folder on the central manager,
n6223, so slot1 at n6231 just hits a dead link.
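
One way to confirm this from a computing node is to test whether the path is a link whose target is missing on that host. This is only a minimal sketch in Python; the helper name is made up for illustration and the path has to be supplied by hand.

```python
import os

def is_dangling_symlink(path):
    """Return True if path is a symlink whose target does not exist on this host."""
    # os.path.islink looks at the link itself, while os.path.exists follows it,
    # so a link pointing at a missing target is a link that does not "exist".
    return os.path.islink(path) and not os.path.exists(path)
```

Run on n6231 against the work directory, it should return True if the link really points at storage that only the central manager can see.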
yu
On Thu, May 3, 2012 at 1:54 PM, Yu Huang <polyactis at gmail.com> wrote:
>
>
> On Thu, May 3, 2012 at 1:46 PM, Karan Vahi <vahi at isi.edu> wrote:
>
>> Hi Yu
>>
>> Hi Karan,
>>
>> I stopped one workflow and moved the submit directory to a local
>> filesystem. However, after I restarted it, the jobs on the computing nodes
>> could not see the submit directory (because it is local to the central
>> manager), and they need to write to the *out.000 files. Is there any way
>> around it?
>>
>>
>> The jobs on the computing nodes don't need to see the submit directory
>> where Condor writes out the job.out.** files
>>
>> Are you saying that you planned a workflow, started it, stopped it, and
>> moved the submit directory to a local filesystem?
>>
>> Pegasus writes absolute paths to the output and error files into the
>> submit files. If you moved the submit directory, the Condor submit files
>> now have the wrong paths.
>> You might want to do a perl replace on the paths in the submit files, or
>> plan a new workflow and launch it (the --dir option should be a directory
>> on the local filesystem).
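
The path replacement suggested above could be sketched as follows. This is a Python stand-in for the perl one-liner, not anything shipped with Pegasus; the function name, the "*.sub" file pattern, and both path prefixes are assumptions to be adapted to the actual submit directory. It is safest to try it on a copy of the directory while the workflow is stopped.

```python
import pathlib

def rewrite_submit_paths(submit_dir, old_prefix, new_prefix, pattern="*.sub"):
    """Replace old_prefix with new_prefix in matching files; return the files changed."""
    changed = []
    # rglob walks the submit directory recursively; "*.sub" is an assumed
    # naming pattern for the Condor submit files.
    for sub_file in sorted(pathlib.Path(submit_dir).rglob(pattern)):
        text = sub_file.read_text()
        if old_prefix in text:
            sub_file.write_text(text.replace(old_prefix, new_prefix))
            changed.append(sub_file)
    return changed
```

The returned list makes it easy to check which submit files were touched before restarting the workflow.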
>>
> I made a symbolic link to the new path. Is that not enough? I attached my
> pegasusrc file.
> Is this entry causing some of the problem?
>
> pegasus.dir.storage.deep=false
>
>
>> Also, have you considered clustering your workflow?
>> https://pegasus.isi.edu/wms/docs/4.0/reference.php#job_clustering
>>
> Yes. In this case, I thought each job would take a long time. I will
> enable it next time.
>
>> Cheers
>> Karan
>>
>>
>> Checking the options of pegasus-plan, there doesn't seem to be one for
>> that.
>>
>> thanks,
>> yu
>>
>>
>>> It is important to note that there are 4 slashes (/) after sqlite: in
>>> the URL. With only 3, sqlalchemy picks it up as a relative path.
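
The slash count described above can be illustrated without sqlalchemy itself. This is a rough sketch using plain string handling; the helper names and the database paths are made up for illustration.

```python
import posixpath

SQLITE_PREFIX = "sqlite:///"  # the "sqlite:" scheme followed by three slashes

def sqlite_url(db_path):
    """Build a SQLite URL; an absolute path brings its own leading slash."""
    return SQLITE_PREFIX + db_path

def url_path_is_absolute(url):
    """Return True if the file path embedded in the SQLite URL is absolute."""
    if not url.startswith(SQLITE_PREFIX):
        raise ValueError("not a sqlite:/// URL: " + url)
    return posixpath.isabs(url[len(SQLITE_PREFIX):])

# Hypothetical database locations, for illustration only.
absolute_url = sqlite_url("/tmp/monitord.db")   # four slashes after "sqlite:"
relative_url = sqlite_url("monitord.db")        # three slashes after "sqlite:"
```

Here absolute_url is "sqlite:////tmp/monitord.db" and relative_url is "sqlite:///monitord.db"; only the first names an absolute path.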
>>>
>>> 2) For the next Pegasus release, 4.1.0, we have put in a newer version
>>> of sqlalchemy, 0.7.2. Pegasus 4.0 has 0.6.4.
>>> You can try installing Pegasus 4.1.0cvs:
>>> http://download.pegasus.isi.edu/wms/download/4.1/nightly/
>>>
>>> Thanks
>>> Karan
>>>
>>>
>>>
>>
>
>
> --
> Yu Huang
> Postdoc in Nelson Freimer Lab,
> Center for Neurobehavioral Genetics, UCLA
> Office Phone:* +1.310-794-9598*
> Skype ID: crocea
> http://www-scf.usc.edu/~yuhuang
>