Hello community!
I looked in BOINC and the progress (%) of the GPUGRID WU didn't change. I watched for ~5 mins. The estimated time kept increasing.
I aborted the WU.
3 GPUGRID WUs were calculated successfully on this GPU before.
It is a manufacturer-overclocked GTX260-216.
ACEMD2: GPU molecular dynamics v6.05 (cuda)
resultid=3186839
SWAN : FATAL : Failure executing kernel sync [nb_k_nt_p_tp_pme] [700]
Assertion failed: 0, file swanlib_nv.cpp, line 121
This application has requested the Runtime to terminate it in an unusual way.
Please contact the application's support team for more information.
(I don't know whether this message is just a result of aborting.)
Is it 'normal' that the progress sometimes doesn't change?
Is this an error that can happen sometimes?
What should I do?
Restart BOINC, the PC, ..?
Abort the WU?
Thanks!
____________
skgiven (Volunteer moderator, Volunteer tester)
Joined: 23 Apr 09 Posts: 3968 Credit: 1,995,359,260 RAC: 0
That specific task seemed to fail after about 3h, but some of the KKi tasks are unusually long. For a GTX260 about 5h to 8h is about right, depending on the task.
|
The other 'KKi' WUs ran for under 8 hours (~28,000 secs).
When I look in BOINC, I normally see the % progress change every few seconds.
But this time I watched for ~5 mins and the % progress didn't change. The elapsed time increased, and so did the estimated time.
Looking at the wingmen (wuid=2012518), the 1st (a 9400M) and the 3rd (a GTX295) got a computation error.
My result, the 2nd, was 'strange'.
The 4th result will come from a GTX275.
Could it be that this WU is 'buggy'?
____________
skgiven (Volunteer moderator, Volunteer tester)
Yes, I would say the WU had a problem in this case, rather than your setup.
KKi tasks have known issues that occasionally prevent some tasks from running (they fail very early), so it is possible that this is one such task that nevertheless managed to run on your system.
|
The old WU in question has now been calculated successfully on a Linux GTX275.
Maybe it's a problem with the Windows app?
This morning I again saw a (new) WU with no change in % progress: wuid=2028574.
I suspended computation in BOINC, waited a few seconds, and after I resumed it the WU in question was immediately marked as a computation error.
What is the problem?
The Windows app?
My system?
The old nVIDIA driver 190.38? S@h and MW@h run well.
I would like to crunch GPUGRID from time to time, but not if a WU has problems every few times..
What would happen if I didn't notice this in BOINC?
Would the WU stay in calculation mode, crunch for days and days in an idle loop, and block the GPU?
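To make the "no % progress for ~5 mins" symptom concrete, here is a minimal sketch of how such a stall could be spotted automatically. It is only an illustration: the function name `is_stalled`, the 300-second window, and the idea of collecting (time, fraction done) samples by hand or by script are my assumptions, and nothing here is part of BOINC or the GPUGRID app.

```python
from typing import Iterable, Tuple


def is_stalled(samples: Iterable[Tuple[float, float]],
               window_s: float = 300.0) -> bool:
    """Return True if the progress value has not changed for at least
    `window_s` seconds.

    `samples` are (timestamp_seconds, fraction_done) pairs in
    chronological order. How they are collected is left open
    (hypothetical: noted by hand or read by a script).
    """
    samples = list(samples)
    if not samples:
        return False

    last_t, last_p = samples[-1]
    stall_start = last_t
    # Walk backwards to find the earliest sample in the trailing run
    # that already shows the most recent progress value.
    for t, p in reversed(samples):
        if p != last_p:
            break
        stall_start = t
    return (last_t - stall_start) >= window_s


if __name__ == "__main__":
    # Progress moved once, then stayed at 11% for 6 minutes.
    observed = [(0, 0.10), (60, 0.11), (120, 0.11), (420, 0.11)]
    print("stalled" if is_stalled(observed) else "still progressing")
```

A check like this only tells you that the percentage stopped moving; whether the task should then be suspended, restarted or aborted is a separate decision.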
____________
I see now that both WUs were calculated on GPU #2.
Maybe that GPU is buggy with GPUGRID? Other projects run fine.
____________