Message boards : Number crunching : Will frequently pausing a long run task increase the chance of a computation error?

DavidVR
Joined: 20 Jan 11
Posts: 6
Credit: 10,705,495
RAC: 0
Message 21636 - Posted: 7 Jul 2011 | 20:33:47 UTC

That is, if I pause a long run task a few times a day, will this make it more likely I'll get a computation error or invalid result?

Profile Carlesa25
Joined: 13 Nov 10
Posts: 328
Credit: 72,619,453
RAC: 13
Message 21637 - Posted: 7 Jul 2011 | 22:45:07 UTC - in response to Message 21636.

DavidVR wrote: "That is, if I pause a long run task a few times a day, will this make it more likely I'll get a computation error or invalid result?"

Hello: Yes, that is possible.

To reduce the risk, this is what I do when I have to stop work:

A) In the options for memory and disk usage, uncheck the option that leaves applications in memory while suspended.
B) First suspend the task, and once it is suspended, exit BOINC Manager.

Doing this, I have so far not lost any work. Greetings.

Profile skgiven
Volunteer moderator
Volunteer tester
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Message 21638 - Posted: 8 Jul 2011 | 1:00:47 UTC - in response to Message 21637.

Carlesa25 wrote: "A) In the options for memory and disk usage, uncheck the option that leaves applications in memory while suspended."

I just use Snooze or Snooze GPU, and these work fine with LAIM (leave applications in memory).

LAIM does not work quite the same for GPU tasks as it does for CPU tasks. Some CPU tasks can literally pick up where they left off, without any loss. A GPU task tends to pick up at the last checkpoint (written every 5 min, or after a certain percentage of the task has been completed). Thus pausing should not cause errors in itself, but it could if you had a faulty RAM module or an error on your HDD.

It is worth noting that the first checkpoint can be some way into the task, especially on slow GPUs, so I would suggest you look at the task properties and see just when it last checkpointed before closing, suspending or snoozing.
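
To put rough numbers on the checkpoint behaviour described above, here is a minimal Python sketch. It is only an illustration, not how BOINC or any project application actually works: the function names `resume_point` and `work_lost` are made up, and it assumes checkpoints are written at a fixed time interval after a first checkpoint, which real tasks may not do.

```python
def resume_point(elapsed_s, checkpoint_interval_s, first_checkpoint_s=None):
    """Elapsed time (seconds) a task would restart from if it is stopped
    at elapsed_s and has to reload from its last checkpoint.

    Assumption: checkpoints are written at first_checkpoint_s and then
    every checkpoint_interval_s after that.
    """
    if first_checkpoint_s is None:
        first_checkpoint_s = checkpoint_interval_s
    if elapsed_s < first_checkpoint_s:
        # No checkpoint written yet: the task starts again from zero.
        return 0
    # Whole checkpoint intervals completed since the first checkpoint.
    intervals = (elapsed_s - first_checkpoint_s) // checkpoint_interval_s
    return first_checkpoint_s + intervals * checkpoint_interval_s


def work_lost(elapsed_s, checkpoint_interval_s, first_checkpoint_s=None):
    """Seconds of computation repeated after resuming from the checkpoint."""
    return elapsed_s - resume_point(
        elapsed_s, checkpoint_interval_s, first_checkpoint_s
    )


if __name__ == "__main__":
    # 5 minute (300 s) checkpoints, task stopped after 1000 s.
    print(work_lost(1000, 300))
    # Slow GPU: first checkpoint only after 900 s, task stopped at 600 s.
    print(work_lost(600, 300, first_checkpoint_s=900))
```

Under these assumptions a pause costs at most one checkpoint interval of repeated work, except before the first checkpoint, where everything done so far is repeated. That is why it is worth checking the last checkpoint time in the task properties first.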

PS. Some CPU projects also have LAIM as the recommended setting.
