Message boards : Number crunching : 'Energies have become nan' error
Author | Message |
---|---|
This error affects most of us here at least once. It happens regardless of card type, OS, driver or clocking. Certain types of WU are more prone to it than others. | |
ID: 19879 | Rating: 0 | |
I've also had this error on a GTX 295-based system. I don't do anything like overclocking, and I also use TThrottle to manage the heat on my CPUs and GPUs, so I suspect it is not a hardware problem. It is also accompanied by the message 'MDIO ERROR: cannot open file "restart.coor"'. | |
ID: 19947 | Rating: 0 | |
Yeah, the scientists continuously review their applications and when they see errors they change things. | |
ID: 19950 | Rating: 0 | |
I'm not that concerned about the odd error or a single nan error. I'm more frustrated by seeing the GPUs work for hours only to fail at the end of a WU. Some of my failures have run times similar to other users' runs, yet theirs succeed. The 'cannot open file "restart.coor"' error is a pretty common message in the failures I have looked at. | |
ID: 19955 | Rating: 0 | |
The restart.coor message is not an error; it is also reported for successful tasks. I think this is just a file that is continuously opened during task runs, and which the application then tries to open again after task completion, although it does not need to. | |
ID: 19957 | Rating: 0 | |
"I don't think there is a high suggestion implementation rate here." No kidding... Just stopped by to say MERRY CHRISTMAS to all! | |
ID: 20020 | Rating: 0 | |