Message boards : Number crunching : Error on ethTRYP
Author | Message |
---|---|
I recently returned to GPUGrid after a long absence and haven't been having any problems (that I know of) until this ethTRYP workunit errored out overnight. Stderr out is: | |
ID: 23543 | Rating: 0 | |
From the cruncher's perspective, it's just a fact that you get the odd task failure. There are some generic things you can do to deal with an increasing number of failures, but one failure in, say, 50 isn't a big issue; overall performance is reduced by no more than 2%. | |
ID: 23545 | Rating: 0 | |
Unfortunately, the failure rate for these tasks is 22%. There are several causes, from tasks aborted via the GUI to unknown app errors. | |
ID: 23566 | Rating: 0 | |
Thanks for the feedback, you guys. | |
ID: 23567 | Rating: 0 | |
Just an FYI in case it's worthwhile, the WU I referenced in my original post resulted in compute errors for two other hosts before being successfully completed. | |
ID: 23599 | Rating: 0 | |
Just an FYI in case it's worthwhile, the WU I referenced in my original post resulted in compute errors for two other hosts before being successfully completed. Yeah, I was one of them :(. My second error out of 60-odd tasks. I didn't mind this one too much, as it failed after only 6500 seconds. This one (http://www.gpugrid.net/workunit.php?wuid=3174532), though, failed after 37000 seconds :/ It took 5 errors and 1 abort before finally being completed by the 6th user. | |
ID: 23601 | Rating: 0 | |
Some tasks end up corrupted, error out on one host after another, and eventually die. Again, the causes are a mystery, but the impact on the batch overall is very small. | |
ID: 23604 | Rating: 0 | |