Message boards : Number crunching : acemd.2562.cuda42 - Application Error

Author: skgiven
Volunteer moderator
Volunteer tester
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Message 26133 - Posted: 3 Jul 2012 | 11:30:06 UTC
Last modified: 3 Jul 2012 | 11:38:29 UTC

I had just completed a MJHARVEY task and started to run another MJHARVEY task when I got an Application Error message:


    acemd.2562.cuda42 - Application Error

    The exception unknown software exception (0x40000015) occurred in the application at location 0x00413c9b.

    Click on OK to terminate the program
    Click on CANCEL to debug the program



Name 10zx330-MJHARVEY_MJH120523-4-5-RND7312_3
Workunit 3526445
Created 3 Jul 2012 | 8:19:14 UTC
Sent 3 Jul 2012 | 10:15:33 UTC
Received 3 Jul 2012 | 11:24:32 UTC
Server state Over
Outcome Computation error
Client state Compute error
Exit status 3 (0x3)
Computer ID 125945
Report deadline 8 Jul 2012 | 10:15:33 UTC
Run time 382.06
CPU time 2.41
Validate state Invalid
Credit 0.00
Application version Long runs (8-12 hours on fastest card) v6.16 (cuda42)


Stderr output

<core_client_version>7.0.28</core_client_version>
<![CDATA[
<message>
The system cannot find the path specified. (0x3) - exit code 3 (0x3)
</message>
<stderr_txt>
MDIO: cannot open file "restart.coor"
SWAN : FATAL : Cuda driver error 702 in file 'swanlibnv2.cpp' in line 1574.
Assertion failed: a, file swanlibnv2.cpp, line 59

This application has requested the Runtime to terminate it in an unusual way.
Please contact the application's support team for more information.

</stderr_txt>
]]>
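
For anyone comparing several failed tasks, the fatal line in the stderr output above can be pulled out with a small script. This is only a minimal sketch: the regular expression is modeled on the single SWAN line quoted in this post, and the function name `find_swan_errors` is made up for illustration. Other SWAN messages may be formatted differently.

```python
import re

# Pattern modeled on the one SWAN fatal line quoted in this post;
# this is an assumption about the format, not a documented one.
SWAN_FATAL = re.compile(
    r"SWAN\s*:\s*FATAL\s*:\s*Cuda driver error (?P<code>\d+) "
    r"in file '(?P<file>[^']+)' in line (?P<line>\d+)"
)


def find_swan_errors(stderr_text):
    """Return (error code, source file, source line) for each SWAN fatal line."""
    return [
        (int(m.group("code")), m.group("file"), int(m.group("line")))
        for m in SWAN_FATAL.finditer(stderr_text)
    ]


# Lines copied from the stderr output of the failed task.
sample = """MDIO: cannot open file "restart.coor"
SWAN : FATAL : Cuda driver error 702 in file 'swanlibnv2.cpp' in line 1574.
Assertion failed: a, file swanlibnv2.cpp, line 59
"""

print(find_swan_errors(sample))
```

Running the same function over the stderr of the other failed results for this workunit would show whether they all stop at the same driver error and source line.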

I was running 6 yoyo tasks on the CPU (i7-2600) at the time. The next task, a NATHAN long task, is running OK.

Just going by the fact that it failed on one other 'good system', it might just be a bad task:

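Task, Computer, Sent, Time reported or deadline, Status, Run time (sec), CPU time (sec), Credit, Application:
5543661 110513 28 Jun 2012 | 3:03:42 UTC 3 Jul 2012 | 3:03:42 UTC Timed out - no response 0.00 0.00 --- Long runs (8-12 hours on fastest card) v6.16 (cuda31)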
5563435 100662 3 Jul 2012 | 3:03:51 UTC 3 Jul 2012 | 7:50:40 UTC Error while computing 25.93 2.95 --- Long runs (8-12 hours on fastest card) v6.16 (cuda42)
5564217 128832 3 Jul 2012 | 7:52:18 UTC 3 Jul 2012 | 8:19:09 UTC Error while computing 14.86 2.46 --- Long runs (8-12 hours on fastest card) v6.16 (cuda42)
5564307 125945 3 Jul 2012 | 10:15:33 UTC 3 Jul 2012 | 11:24:32 UTC Error while computing 382.06 2.41 --- Long runs (8-12 hours on fastest card) v6.16 (cuda42)
5564779 82076 3 Jul 2012 | 11:24:40 UTC 8 Jul 2012 | 11:24:40 UTC In progress --- --- --- Long runs (8-12 hours on fastest card) v6.16 (cuda31)

Alas it's now been sent to a truly evil NVIDIA GeForce GT 330M (986MB) driver: 197.56. It's doomed!
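
The "bad task" guess can be checked roughly by tallying the results listed above. The sketch below copies the task IDs, computer IDs, statuses and plan classes from that list. The helper names and the `min_hosts=2` threshold are purely illustrative assumptions, not anything the project itself uses.

```python
from collections import Counter

# (task ID, computer ID, status, plan class), copied from the task list above.
tasks = [
    (5543661, 110513, "Timed out - no response", "cuda31"),
    (5563435, 100662, "Error while computing", "cuda42"),
    (5564217, 128832, "Error while computing", "cuda42"),
    (5564307, 125945, "Error while computing", "cuda42"),
    (5564779, 82076, "In progress", "cuda31"),
]


def failing_hosts(rows):
    """Sorted list of distinct computers that returned a compute error."""
    return sorted({host for _, host, status, _ in rows
                   if status == "Error while computing"})


def looks_like_bad_task(rows, min_hosts=2):
    """Heuristic: errors on several different hosts point at the task itself.

    min_hosts is a hypothetical threshold chosen for this example.
    """
    return len(failing_hosts(rows)) >= min_hosts


# Count outcomes per plan class to see whether one app version stands out.
by_plan_class = Counter((plan, status) for _, _, status, plan in tasks)

print(failing_hosts(tasks))
print(looks_like_bad_task(tasks))
print(by_plan_class)
```

With only five results this is suggestive at best. So far every cuda42 result errored, while neither cuda31 host has returned a result, so the app version cannot be ruled out from this list alone.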