Message boards : Graphics cards (GPUs) : pausing BOINC for 1 hour (but resuming it after a few minutes) gives computation error?

Joined: 12 Apr 20
Posts: 2
Credit: 4,525,000
RAC: 111,525
Message 61495 - Posted: 10 May 2024 | 21:11:16 UTC

I don't know whether this is a one-time thing or whether it happens to other people too.

I got a BOINC WU which says
0,9 CPU + 1 nVIDIA GPU.
It had been running for a few hours and had about 20 hours remaining.
I paused BOINC for 1 hour (from the taskbar) and BOINC correctly showed
"paused by user" on all my tasks.
After a few minutes I unchecked the "pause for 1 hour" menu item and all projects initially resumed correctly,
BUT the nVIDIA GPU WU was uploaded and then cancelled with
"computation error".
Does this mean that pausing generally does not work or is unreliable, or could this be a one-time thing?
In any case, the system now seems offended :-) because I get no new GPU work :-)

Keith Myers
Joined: 13 Dec 17
Posts: 1302
Credit: 5,676,271,959
RAC: 7,144,883
Message 61503 - Posted: 14 May 2024 | 20:04:15 UTC - in response to Message 61495.
Last modified: 14 May 2024 | 20:05:40 UTC

Most of the GPUGrid apps are not capable of being stopped or paused without causing instant errors upon resume.

So don't do it.

AFAIK, at this stage of the project, only the old acemd3 tasks actually checkpoint and can be stopped and resumed.

Also, if I remember correctly, the Quantum Chemistry tasks can be stopped and resumed, but all progress is lost and the tasks restart from scratch each time they are stopped.

The Python apps, Python on GPU and ATM tasks just error out and are reported, if I remember correctly.

Been too long since I've seen any of those so may not remember if this is still the case.
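
To illustrate the difference between a task that checkpoints and one that restarts from scratch, here is a toy sketch in Python. It is not how acemd3 or any GPUGrid app is implemented; the function name `run_task`, the JSON checkpoint file, and the `stop_after` parameter (which simulates a user suspending the task) are all hypothetical.

```python
import json
import os

def run_task(total_steps, checkpoint_path, stop_after=None):
    """Toy task that saves its progress after every step.

    stop_after simulates the task being suspended after that many
    steps in the current session. Returns the total steps completed.
    """
    step = 0
    # On start, resume from the checkpoint if one exists.
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            step = json.load(f)["step"]

    done_this_session = 0
    while step < total_steps:
        if stop_after is not None and done_this_session >= stop_after:
            break  # simulated suspend
        step += 1  # stand-in for one unit of real work
        done_this_session += 1
        # Write to a temporary file, then rename, so an interruption
        # during the write cannot leave a half-written checkpoint.
        tmp_path = checkpoint_path + ".tmp"
        with open(tmp_path, "w") as f:
            json.dump({"step": step}, f)
        os.replace(tmp_path, checkpoint_path)
    return step
```

A task without the "resume from the checkpoint" part would begin again at step 0 after every suspend, which matches the restart-from-scratch behaviour described for the Quantum Chemistry tasks.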

Emmanuel Mar
Joined: 14 May 24
Posts: 10
Credit: 29,830,000
RAC: 654,619
Message 61516 - Posted: 18 May 2024 | 22:12:07 UTC - in response to Message 61495.

Instead of suspending the specific task, suspend the project the task belongs to, and allow a few minutes to pass before disconnecting the computer.

In BOINC, go to Options > Processing > "Leave tasks in memory suspended" and uncheck this option.
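
Suspending a whole project, as suggested above, can also be done from the command line with `boinccmd --project URL suspend`. Below is a minimal sketch in Python that builds and runs that command. It assumes `boinccmd` is on the PATH and can authenticate to the local client (it may need to be run from the BOINC data directory or given `--passwd`). The project URL in the example is an assumption; `boinccmd --get_project_status` shows the exact URL your client uses. The helper names `project_command` and `run_project_command` are hypothetical.

```python
import subprocess

# Operations accepted by boinccmd's "--project URL <op>" form
# (a subset; see the boinccmd documentation for the full list).
PROJECT_OPS = {"suspend", "resume", "nomorework", "allowmorework", "update"}

def project_command(project_url, op):
    """Build the boinccmd argument list for a project-level operation."""
    if op not in PROJECT_OPS:
        raise ValueError(f"unsupported project operation: {op}")
    return ["boinccmd", "--project", project_url, op]

def run_project_command(project_url, op):
    """Run the command; raises CalledProcessError if boinccmd fails."""
    subprocess.run(project_command(project_url, op), check=True)

if __name__ == "__main__":
    # Assumed URL: replace it with the one your client reports.
    url = "https://www.gpugrid.net/"
    # Print the command instead of running it, so the sketch is safe to try.
    print(" ".join(project_command(url, "suspend")))
```

Calling `run_project_command(url, "suspend")` before shutting down and `run_project_command(url, "resume")` afterwards would mirror the manual steps. Note that this does not make a non-checkpointing task safe to stop.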
