Message boards : Number crunching : Checkpoint interval

jjwhalen
Joined: 23 Nov 09
Posts: 29
Credit: 17,591,899
RAC: 0
Message 18386 - Posted: 23 Aug 2010 | 0:49:01 UTC
Last modified: 23 Aug 2010 | 1:14:09 UTC

I have my GPU host set to "Write to disk at most every 300 seconds". Monitoring ACEMD2 6.11 with BoincTasks (which has a checkpoint counter/timer) I find that it is checkpointing approximately every 50 seconds, +/- the BT refresh rate of 3 seconds.

This is on an Intel Core2 Quad Q9400/GTX 465SC machine running BOINC 6.11.4 under WinVista SP2.

Are GPU tasks exempt from the core client's write to disk setting? Other GPU projects I crunch either don't checkpoint at all (e.g., DNETC) or are close to the 5 minute interval I have set.
____________

Toni
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 9 Dec 08
Posts: 1006
Credit: 5,068,599
RAC: 0
Message 18388 - Posted: 23 Aug 2010 | 10:32:17 UTC - in response to Message 18386.
Last modified: 23 Aug 2010 | 10:32:42 UTC

I suspect that setting applies only to saving the BOINC client state, not to the applications (which have their own checkpointing mechanism).

Saenger
Joined: 20 Jul 08
Posts: 134
Credit: 23,657,183
RAC: 0
Message 18390 - Posted: 23 Aug 2010 | 14:18:30 UTC

Officially, checkpointing of the science apps is exactly what this preference setting is meant for. Look in the unofficial Wiki.
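
For illustration, here is a minimal sketch of how an application that honors the preference might behave. In the BOINC C API the handshake is, as far as I know, `boinc_time_to_checkpoint()` followed by `boinc_checkpoint_completed()`; the Python class and function names below are hypothetical stand-ins, and the 50 s and 300 s figures are simply the ones mentioned in this thread.

```python
class CheckpointGate:
    """Toy stand-in for the client-side timer; NOT the real BOINC API."""

    def __init__(self, min_interval_s):
        self.min_interval_s = min_interval_s
        self.last_checkpoint_s = 0

    def time_to_checkpoint(self, now_s):
        # True once the user's minimum interval has elapsed.
        return now_s - self.last_checkpoint_s >= self.min_interval_s

    def checkpoint_completed(self, now_s):
        # The app reports that it has finished writing its state.
        self.last_checkpoint_s = now_s


def run(total_s, safe_point_every_s, min_interval_s):
    """Return the times (seconds) at which the app writes a checkpoint."""
    gate = CheckpointGate(min_interval_s)
    written = []
    now = safe_point_every_s
    while now <= total_s:
        # The app only asks at points where its state is safe to save.
        if gate.time_to_checkpoint(now):
            written.append(now)
            gate.checkpoint_completed(now)
        now += safe_point_every_s
    return written


# An app that reaches a safe point every 50 s, with the preference at 300 s.
cooperative = run(total_s=900, safe_point_every_s=50, min_interval_s=300)
# The same app if it ignores the preference (modeled as a 0 s minimum).
ignoring = run(total_s=900, safe_point_every_s=50, min_interval_s=0)

print(cooperative)    # [300, 600, 900]
print(len(ignoring))  # 18
```

The point of the sketch is only that the preference is a lower bound the application has to ask about; an application that never asks will checkpoint on its own schedule.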

____________
Gruesse vom Saenger

For questions about Boinc look in the BOINC-Wiki

jjwhalen
Message 18391 - Posted: 24 Aug 2010 | 13:34:29 UTC - in response to Message 18390.
Last modified: 24 Aug 2010 | 13:51:41 UTC

Saenger wrote: "Officially, checkpointing of the science apps is exactly what this preference setting is meant for. Look in the unofficial Wiki."


Exactly, TG. And with this in mind, I reiterate my question ;) (BTW, "unofficial" WIKI doesn't = "uninformed.")

Some users, myself included, have found it useful to limit how often certain applications checkpoint, because this can improve their performance, at the risk of losing more work in the event of a crash or deliberate restart. This is why Fred Melgert added the checkpoint counter/timer to his BoincTasks utility.
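
To put rough numbers on that trade-off, here is a small back-of-the-envelope sketch. It assumes a one-hour stretch of run time and compares the observed 50-second interval with the 300-second preference; the function name is made up for illustration, and the cost of each individual write is not modeled because it was not measured here.

```python
def checkpoint_tradeoff(run_time_s, interval_s):
    """Return (number of checkpoint writes, worst-case work lost in seconds)."""
    writes = run_time_s // interval_s
    # A crash just before the next checkpoint loses almost one full interval.
    worst_case_loss_s = interval_s
    return writes, worst_case_loss_s


one_hour = 3600
observed = checkpoint_tradeoff(one_hour, 50)     # what BoincTasks showed
preferred = checkpoint_tradeoff(one_hour, 300)   # the "write to disk" setting

print(observed)   # (72, 50)
print(preferred)  # (12, 300)
```

So over an hour, the longer interval means 60 fewer disk writes, in exchange for up to 300 seconds of lost work per interruption instead of up to 50.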
____________

GDF
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Joined: 14 Mar 07
Posts: 1957
Credit: 629,356
RAC: 0
Message 18392 - Posted: 24 Aug 2010 | 14:57:56 UTC - in response to Message 18391.

Our application does not use the checkpoint interval given by the BOINC client, because for us the interval must be a multiple of the output file frequency. This way everything works nicely when you stop and restart the application.

gdf
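
A minimal sketch of the constraint described above, as one possible reading of it: if checkpoints only ever fall on output boundaries, a restart resumes at a step where the output file is already complete, so no frames are duplicated or half-written. The function names and step counts are hypothetical; the actual ACEMD values are not given in this thread.

```python
def checkpoint_steps(total_steps, output_every, outputs_per_checkpoint):
    """Steps at which a checkpoint is written, always on an output boundary."""
    checkpoint_every = output_every * outputs_per_checkpoint
    return list(range(checkpoint_every, total_steps + 1, checkpoint_every))


def restart_step(crash_step, output_every, outputs_per_checkpoint):
    """Step the run resumes from after a crash: the last checkpoint written."""
    checkpoint_every = output_every * outputs_per_checkpoint
    return (crash_step // checkpoint_every) * checkpoint_every


# Assumed numbers: output every 100 steps, checkpoint every 2 outputs.
steps = checkpoint_steps(total_steps=1000, output_every=100,
                         outputs_per_checkpoint=2)
resume = restart_step(crash_step=750, output_every=100,
                      outputs_per_checkpoint=2)

print(steps)   # [200, 400, 600, 800, 1000]
print(resume)  # 600
```

An arbitrary wall-clock interval from the client would generally not land on such a boundary, which is consistent with the application keeping its own schedule.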

jjwhalen
Message 18393 - Posted: 25 Aug 2010 | 2:02:13 UTC - in response to Message 18392.

GDF wrote: "Our application does not use the checkpoint interval given by the BOINC client, because for us the interval must be a multiple of the output file frequency. This way everything works nicely when you stop and restart the application."

OK gdf, thanks for the clarification :)

____________
