Message boards : Number crunching : All of my NOELIA_20AMINOACID Betas are crashing

mwgiii
Joined: 22 Jan 09
Posts: 8
Credit: 988,332,833
RAC: 0
Level
Glu
Message 26810 - Posted: 7 Sep 2012 | 0:52:33 UTC

All of my NOELIA_20AMINOACID Betas are crashing and freezing my system.

I have canceled the couple of others in my queue.

Vista x64, Nvidia 570, driver 301.42, BOINC 7.0.28 x64

Stderr output

<core_client_version>7.0.28</core_client_version>
<![CDATA[
<message>
The system cannot find the path specified. (0x3) - exit code 3 (0x3)
</message>
<stderr_txt>
MDIO: cannot open file "output.restart.coor"
SWAN : FATAL : Cuda driver error 999 in file 'swanlibnv2.cpp' in line 1574.
Assertion failed: a, file swanlibnv2.cpp, line 59

This application has requested the Runtime to terminate it in an unusual way.
Please contact the application's support team for more information.

</stderr_txt>
]]>


and


Stderr output

<core_client_version>7.0.28</core_client_version>
<![CDATA[
<message>
- exit code 98 (0x62)
</message>
<stderr_txt>
MDIO: read error for file "input.coor", byte number 4: number of atoms (-377094144) != (51695) expected
ERROR: file mdioload.cpp line 80: Unable to read bincoordfile

called boinc_finish

</stderr_txt>
]]>




____________

Old man
Joined: 24 Jan 09
Posts: 42
Credit: 16,676,387
RAC: 0
Level
Pro
Message 26816 - Posted: 7 Sep 2012 | 12:24:41 UTC

Same here:

Name: structure_288_replica5-NOELIA_20AMINOACIDequ-0-1-RND2200_2

http://www.gpugrid.net/workunit.php?wuid=3697900

Exit status 3 (0x3)

Stderr output

<core_client_version>7.0.28</core_client_version>
<![CDATA[
<message>
The system cannot find the path specified. (0x3) - exit code 3 (0x3)
</message>
<stderr_txt>
MDIO: cannot open file "output.restart.coor"
SWAN : FATAL : Cuda driver error 2 in file 'swanlibnv2.cpp' in line 639.
Assertion failed: a, file swanlibnv2.cpp, line 59

This application has requested the Runtime to terminate it in an unusual way.
Please contact the application's support team for more information.

</stderr_txt>
]]>

Next wu: http://www.gpugrid.net/workunit.php?wuid=3698153

Exit status 98 (0x62)

<core_client_version>7.0.28</core_client_version>
<![CDATA[
<message>
- exit code 98 (0x62)
</message>
<stderr_txt>
MDIO: read error for file "input.coor", byte number 4: number of atoms (-377094144) != (51695) expected
ERROR: file mdioload.cpp line 80: Unable to read bincoordfile

called boinc_finish
MDIO: read error for file "input.coor", byte number 4: number of atoms (-377094144) != (51695) expected
ERROR: file mdioload.cpp line 80: Unable to read bincoordfile

called boinc_finish

</stderr_txt>
]]>
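
The "number of atoms (-377094144) != (51695) expected" message suggests the application reads an atom count from the start of "input.coor" and compares it with the count it expects. A minimal sketch of that kind of check, assuming the file begins with a 4-byte little-endian signed integer (an assumption inferred from the error text, not confirmed by the project); `read_atom_count` and `header_matches` are hypothetical helper names:

```python
import struct


def read_atom_count(header: bytes) -> int:
    """Interpret the first 4 bytes as a little-endian signed 32-bit integer.

    Assumption: the coordinate file starts with the atom count in this form.
    """
    if len(header) < 4:
        raise ValueError("header shorter than 4 bytes")
    (count,) = struct.unpack("<i", header[:4])
    return count


def header_matches(header: bytes, expected_atoms: int) -> bool:
    """Return True when the stored atom count equals the expected one."""
    return read_atom_count(header) == expected_atoms


# Synthetic headers built from the two numbers in the log above.
good_header = struct.pack("<i", 51695)
bad_header = struct.pack("<i", -377094144)

print(header_matches(good_header, 51695))
print(header_matches(bad_header, 51695))
```

A negative atom count can never be valid, so a value like this points to a damaged or incomplete input file rather than a different molecular system.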

Waldo
Joined: 21 Sep 12
Posts: 1
Credit: 58,107,070
RAC: 0
Level
Thr
Message 27013 - Posted: 25 Sep 2012 | 16:48:14 UTC - in response to Message 26816.
Last modified: 25 Sep 2012 | 16:55:35 UTC

I am also getting a lot of these on 6xx series cards.
Driver version 301.42, BOINC 6.12.34 x64.

Here is one of the results:

<core_client_version>6.12.34</core_client_version>
<![CDATA[
<message>
- exit code 98 (0x62)
</message>
<stderr_txt>
MDIO: cannot open file "restart.coor"
ERROR: file deven.cpp line 1106: # Energies have become nan

called boinc_finish

</stderr_txt>
]]>

If someone has some input that would be great!
____________

Retvari Zoltan
Joined: 20 Jan 09
Posts: 2343
Credit: 16,206,655,749
RAC: 261,147
Level
Trp
Message 27021 - Posted: 25 Sep 2012 | 22:31:41 UTC - in response to Message 27013.

The MDIO: cannot open file "restart.coor" line is a false error message; it appears in every result, even the successful ones.
See the "Energies have become nan" thread.
The definition of NaN on Wikipedia explains a lot.
These kinds of errors are basically caused by overclocking, overheating, an insufficient PSU, or a faulty GPU.
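
To illustrate why a single bad value is enough to end a run, here is a small sketch of how NaN behaves in ordinary IEEE 754 floating-point arithmetic. It is not the project's code; `total_energy` and the numbers are made up for the example:

```python
import math


def total_energy(terms):
    """Sum a list of energy contributions (hypothetical helper)."""
    return sum(terms)


healthy = [-1200.5, 340.25, 15.0]
corrupted = [-1200.5, float("nan"), 15.0]  # one term damaged, e.g. by a hardware fault

print(total_energy(healthy))
print(math.isnan(total_energy(corrupted)))

# NaN also appears when an operation has no defined result.
undefined = float("inf") - float("inf")
print(math.isnan(undefined))

# NaN never compares equal to anything, including itself.
print(undefined == undefined)
```

Once one term is NaN, every sum that includes it is NaN as well, so the simulation cannot recover and has to stop.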
