Message boards : Number crunching : Dozens of Failed Tasks
Author | Message |
---|---|
My systems are currently failing more tasks than ever. It looks like most of the work units failed on other systems as well. | |
ID: 23764 | Rating: 0 | |
Hi, from a quick look it seems that only one of your hosts http://www.gpugrid.net/results.php?hostid=119703 has recent failed tasks. | |
ID: 23765 | Rating: 0 | |
Thank you for the quick reply. When I spot-check the tasks that failed, most of them look like they failed on other computers as well. | |
ID: 23771 | Rating: 0 | |
Your issue is most likely with the 295.73 driver. | |
ID: 23775 | Rating: 0 | |
It looks like 275.33 fixed everything. Now back to crunching!! | |
ID: 23789 | Rating: 0 | |
Never touch a running system =) good to have a cruncher for real science back :) | |
ID: 23791 | Rating: 0 | |
I am back to the 275.33 drivers, but downclocking has become an issue. I never saw this problem before I upgraded and then downgraded the drivers. 295.73 appears to fix the downclocking but does not work with GPUGRID WUs. 275.33 will downclock, but WUs continue to run. | |
ID: 23796 | Rating: 0 | |
Some sort of 'threadsafe' exit code might do it, though it might also change the code to the extent that the research methods alter; something that reviewers might not be so keen on. Obviously the recent app updates, adding instructions to enable some task types, didn't include this 'threadsafe' code. Maybe next time. | |
ID: 23802 | Rating: 0 | |
Can someone look at this WU http://www.gpugrid.net/workunit.php?wuid=3261113 and provide some insight into the failure? It really hurts when they run for 13,000 seconds and fail. | |
ID: 23922 | Rating: 0 | |
The failure was: ERROR: # Energies have become nan | |
ID: 23923 | Rating: 0 | |
I am suddenly having a bunch of acemd2 tasks fail. This is the first time this has happened since I started running GPUGRID. | |
ID: 23941 | Rating: 0 | |
This computer http://www.gpugrid.net/show_host_detail.php?hostid=119703 failed a few WUs in a row, followed by at least 2 successful WUs. I did not change anything. My GPU typically never exceeds 71C. Is my GPU just running a little too fast? I had about 10 successful WUs prior to these errors. | |
ID: 23947 | Rating: 0 | |