Message boards : Number crunching : Bad batch of TONI-AGGd tasks
Author | Message |
---|---|
Both my active hosts - GTX470 and GTX670 - are showing "Energies have become nan" errors for all recent tasks in this batch (short queue). | |
ID: 27194 | |
Same for me. The problem started November 1st around 22:00 UTC. I had to suspend GPUGRID temporarily, as I am receiving only WUs from this faulty "Toni" batch. | |
ID: 27195 | |
I just had to download 5 before I got a NOELIA that would run. That brings my total to 8 failed TONI WUs. It sucks because GPUGRID doesn't know that a task has failed, so one of my video cards sits idle for some time. | |
ID: 27199 | |
Yes, | |
ID: 27201 | |
Sorry guys, I cancelled them. I was fooled because some of them went OK. Those which fail appear to do so at the start. | |
ID: 27202 | |
Thanks, Toni, for your quick action. They seem to be running fine now on all my GPUs. | |
ID: 27237 | |
Thanks to you. It took a week to debug, but was very instructive in the end. Fixed workunits are called "AGGd2", and should have high GPU usage. | |
ID: 27251 | |
Doesn't appear to be working for me. Just noticed my industrial quantity of errors. Two slightly different water-cooled 580s running at stock speeds. | |
ID: 27315 | |
I also get TONI WUs that error out, either with "Energies have become nan" errors or, after a short period of crunching, with "output file absent" errors. This is really starting to annoy me, all the more so as I also get NOELIA WUs that run to the end only to report "output file absent". | |
ID: 27401 | |
I'm also getting TONI and NOELIA WUs failing after several hours. I'm setting to no new work for a few days to see how this shakes out. | |
ID: 27413 | |