Message boards : Number crunching : Unhandled Exception Detected...

Mad Matt
Message 18401 - Posted: 26 Aug 2010 | 17:44:15 UTC

It looks like this WU caused a complete system error, including artefacts on screen. I finally aborted it after noticing more than 4 hours of run time and 0 progress. Could you have a look into this?

http://www.gpugrid.net/result.php?resultid=2882059

Cheers



<core_client_version>6.10.56</core_client_version>
<![CDATA[
<message>
aborted by user
</message>
<stderr_txt>
# Using device 0
# There are 3 devices supporting CUDA
# Device 0: "GeForce GTX 295"
# Clock rate: 1.51 GHz
# Total amount of global memory: 939327488 bytes
# Number of multiprocessors: 30
# Number of cores: 240
# Device 1: "GeForce GTX 295"
# Clock rate: 1.51 GHz
# Total amount of global memory: 939196416 bytes
# Number of multiprocessors: 30
# Number of cores: 240
# Device 2: "GeForce 9800 GT"
# Clock rate: 1.81 GHz
# Total amount of global memory: 1073545216 bytes
# Number of multiprocessors: 14
# Number of cores: 112
MDIO ERROR: cannot open file "restart.coor"
# Using device 1
# There are 3 devices supporting CUDA
# Device 0: "GeForce GTX 295"
# Clock rate: 1.24 GHz
# Total amount of global memory: 939327488 bytes
# Number of multiprocessors: 30
# Number of cores: 240
# Device 1: "GeForce GTX 295"
# Clock rate: 1.24 GHz
# Total amount of global memory: 939196416 bytes
# Number of multiprocessors: 30
# Number of cores: 240
# Device 2: "GeForce 9800 GT"
# Clock rate: 1.50 GHz
# Total amount of global memory: 1073545216 bytes
# Number of multiprocessors: 14
# Number of cores: 112
MDIO ERROR: read error for file "restart.vel", byte number 4: number of atoms (0) != (43825) expected
# Using device 1
# There are 3 devices supporting CUDA
# Device 0: "GeForce GTX 295"
# Clock rate: 1.24 GHz
# Total amount of global memory: 939327488 bytes
# Number of multiprocessors: 30
# Number of cores: 240
# Device 1: "GeForce GTX 295"
# Clock rate: 1.24 GHz
# Total amount of global memory: 939196416 bytes
# Number of multiprocessors: 30
# Number of cores: 240
# Device 2: "GeForce 9800 GT"
# Clock rate: 1.50 GHz
# Total amount of global memory: 1073545216 bytes
# Number of multiprocessors: 14
# Number of cores: 112
MDIO ERROR: read error for file "restart.coor", byte number 4: number of atoms (0) != (43825) expected
No heartbeat from core client for 30 sec - exiting
# Using device 1
# There are 3 devices supporting CUDA
# Device 0: "GeForce GTX 295"
# Clock rate: 1.51 GHz
# Total amount of global memory: 939327488 bytes
# Number of multiprocessors: 30
# Number of cores: 240
# Device 1: "GeForce GTX 295"
# Clock rate: 1.51 GHz
# Total amount of global memory: 939196416 bytes
# Number of multiprocessors: 30
# Number of cores: 240
# Device 2: "GeForce 9800 GT"
# Clock rate: 1.81 GHz
# Total amount of global memory: 1073545216 bytes
# Number of multiprocessors: 14
# Number of cores: 112
MDIO ERROR: read error for file "restart.coor", byte number 4: number of atoms (0) != (43825) expected
# Using device 1
# There are 3 devices supporting CUDA
# Device 0: "GeForce GTX 295"
# Clock rate: 1.51 GHz
# Total amount of global memory: 939327488 bytes
# Number of multiprocessors: 30
# Number of cores: 240
# Device 1: "GeForce GTX 295"
# Clock rate: 1.51 GHz
# Total amount of global memory: 939196416 bytes
# Number of multiprocessors: 30
# Number of cores: 240
# Device 2: "GeForce 9800 GT"
# Clock rate: 1.81 GHz
# Total amount of global memory: 1073545216 bytes
# Number of multiprocessors: 14
# Number of cores: 112
# Using device 1
# There are 3 devices supporting CUDA
# Device 0: "GeForce GTX 295"
# Clock rate: 1.51 GHz
# Total amount of global memory: 939327488 bytes
# Number of multiprocessors: 30
# Number of cores: 240
# Device 1: "GeForce GTX 295"
# Clock rate: 1.51 GHz
# Total amount of global memory: 939196416 bytes
# Number of multiprocessors: 30
# Number of cores: 240
# Device 2: "GeForce 9800 GT"
# Clock rate: 1.81 GHz
# Total amount of global memory: 1073545216 bytes
# Number of multiprocessors: 14
# Number of cores: 112
MDIO ERROR: read error for file "restart.coor", byte number 4: number of atoms (0) != (43825) expected


Unhandled Exception Detected...

- Unhandled Exception Record -
Reason: Breakpoint Encountered (0x80000003) at address 0x7D61002D

Engaging BOINC Windows Runtime Debugger...



********************


BOINC Windows Runtime Debugger Version 6.7.0


Dump Timestamp : 08/26/10 19:00:00
Install Directory :
Data Directory : C:\Documents and Settings\All Users\Application Data\BOINC
Project Symstore :
LoadLibraryA( C:\Documents and Settings\All Users\Application Data\BOINC\dbghelp.dll ): GetLastError = 126
Loaded Library : dbghelp.dll
LoadLibraryA( C:\Documents and Settings\All Users\Application Data\BOINC\symsrv.dll ): GetLastError = 126
LoadLibraryA( symsrv.dll ): GetLastError = 126
LoadLibraryA( C:\Documents and Settings\All Users\Application Data\BOINC\srcsrv.dll ): GetLastError = 126
LoadLibraryA( srcsrv.dll ): GetLastError = 126
LoadLibraryA( C:\Documents and Settings\All Users\Application Data\BOINC\version.dll ): GetLastError = 126
Loaded Library : version.dll
Debugger Engine : 4.0.5.0
Symbol Search Path: C:\Documents and Settings\All Users\Application Data\BOINC\slots\8;C:\Documents and Settings\All Users\Application Data\BOINC\projects\www.gpugrid.net;srv*C:\Documents and Settings\All Users\Application Data\BOINC\projects\www.gpugrid.netsymbols*http://msdl.microsoft.com/download/symbols;srv*C:\Documents and Settings\All Users\Application Data\BOINC\projects\www.gpugrid.netsymbols*http://boinc.berkeley.edu/symstore


SymGetModuleInfo(): GetLastError = 87
ModLoad: 00000000 00000000 ( Symbols Loaded)
Linked PDB Filename :

SymGetModuleInfo(): GetLastError = 87
ModLoad: 00000000 00000000 ( Symbols Loaded)
Linked PDB Filename :

SymGetModuleInfo(): GetLastError = 87
ModLoad: 00000000 00000000 ( Symbols Loaded)
Linked PDB Filename :

SymGetModuleInfo(): GetLastError = 87
ModLoad: 00000000 00000000 ( Symbols Loaded)
Linked PDB Filename :

SymGetModuleInfo(): GetLastError = 87
ModLoad: 00000000 00000000 ( Symbols Loaded)
Linked PDB Filename :

SymGetModuleInfo(): GetLastError = 87
ModLoad: 00000000 00000000 ( Symbols Loaded)
Linked PDB Filename :

SymGetModuleInfo(): GetLastError = 87
ModLoad: 00000000 00000000 ( Symbols Loaded)
Linked PDB Filename :

SymGetModuleInfo(): GetLastError = 87
ModLoad: 00000000 00000000 ( Symbols Loaded)
Linked PDB Filename :

SymGetModuleInfo(): GetLastError = 87
ModLoad: 00000000 00000000 ( Symbols Loaded)
Linked PDB Filename :

SymGetModuleInfo(): GetLastError = 87
ModLoad: 00000000 00000000 ( Symbols Loaded)
Linked PDB Filename :

SymGetModuleInfo(): GetLastError = 87
ModLoad: 00000000 00000000 ( Symbols Loaded)
Linked PDB Filename :

SymGetModuleInfo(): GetLastError = 87
ModLoad: 00000000 00000000 ( Symbols Loaded)
Linked PDB Filename :

SymGetModuleInfo(): GetLastError = 87
ModLoad: 00000000 00000000 ( Symbols Loaded)
Linked PDB Filename :

SymGetModuleInfo(): GetLastError = 87
ModLoad: 00000000 00000000 ( Symbols Loaded)
Linked PDB Filename :

SymGetModuleInfo(): GetLastError = 87
ModLoad: 00000000 00000000 ( Symbols Loaded)
Linked PDB Filename :

SymGetModuleInfo(): GetLastError = 87
ModLoad: 00000000 00000000 ( Symbols Loaded)
Linked PDB Filename :

SymGetModuleInfo(): GetLastError = 87
ModLoad: 00000000 00000000 ( Symbols Loaded)
Linked PDB Filename :

SymGetModuleInfo(): GetLastError = 87
ModLoad: 00000000 00000000 ( Symbols Loaded)
Linked PDB Filename :

SymGetModuleInfo(): GetLastError = 87
ModLoad: 00000000 00000000 ( Symbols Loaded)
Linked PDB Filename :

SymGetModuleInfo(): GetLastError = 87
ModLoad: 00000000 00000000 ( Symbols Loaded)
Linked PDB Filename :

SymGetModuleInfo(): GetLastError = 87
ModLoad: 00000000 00000000 ( Symbols Loaded)
Linked PDB Filename :

SymGetModuleInfo(): GetLastError = 87
ModLoad: 00000000 00000000 ( Symbols Loaded)
Linked PDB Filename :

SymGetModuleInfo(): GetLastError = 87
ModLoad: 00000000 00000000 ( Symbols Loaded)
Linked PDB Filename :

SymGetModuleInfo(): GetLastError = 87
ModLoad: 00000000 00000000 ( Symbols Loaded)
Linked PDB Filename :

SymGetModuleInfo(): GetLastError = 87
ModLoad: 00000000 00000000 ( Symbols Loaded)
Linked PDB Filename :



*** Dump of the Process Statistics: ***

- I/O Operations Counters -
Read: 4116, Write: 0, Other 1065

- I/O Transfers Counters -
Read: 0, Write: 429, Other 0

- Paged Pool Usage -
QuotaPagedPoolUsage: 107776, QuotaPeakPagedPoolUsage: 121936
QuotaNonPagedPoolUsage: 12512, QuotaPeakNonPagedPoolUsage: 12512

- Virtual Memory Usage -
VirtualSize: 149254144, PeakVirtualSize: 150306816

- Pagefile Usage -
PagefileUsage: 96882688, PeakPagefileUsage: 97615872

- Working Set Size -
WorkingSetSize: 88481792, PeakWorkingSetSize: 89223168, PageFaultCount: 42125

*** Dump of thread ID 3576 (state: Waiting): ***

- Information -
Status: Wait Reason: UserRequest, , Kernel Time: 0.000000, User Time: 0.000000, Wait Time: 36199.000000

- Unhandled Exception Record -
Reason: Breakpoint Encountered (0x80000003) at address 0x7D61002D

- Registers -
eax=00000000 ebx=00000000 ecx=00677782 edx=01ee613c esi=00000001 edi=00000000
eip=7d61002d esp=01eefba4 ebp=01eeffec
cs=0023 ss=002b ds=002b es=002b fs=0053 gs=002b efl=00000246

- Callstack -
ChildEBP RetAddr Args to Child
01eeffec 00000000 004af870 00000000 00000000 00000000 !DbgBreakPoint+0x0 SymGetModuleInfo(): GetLastError = '87' Address = '7d61002d'

*** Dump of thread ID 3548 (state: Waiting): ***

- Information -
Status: Wait Reason: UserRequest, , Kernel Time: 11250000.000000, User Time: 185000000.000000, Wait Time: 36199.000000

- Registers -
eax=05a20278 ebx=002c9eac ecx=002c9eac edx=000000ff esi=05a20290 edi=01c094d8
eip=10014b8b esp=002c9e80 ebp=01c094dc
cs=0023 ss=002b ds=002b es=002b fs=0053 gs=002b efl=00000216

- Callstack -
ChildEBP RetAddr Args to Child
01c094dc 01c094dc 05a216e8 00000000 00000000 00000000 !+0x0 SymGetModuleInfo(): GetLastError = '87' Address = '10014b8b'

StackWalk(): GetLastError = 87

*** Dump of thread ID 3592 (state: Waiting): ***

- Information -
Status: Wait Reason: UserRequest, , Kernel Time: 0.000000, User Time: 0.000000, Wait Time: 29805.000000

- Registers -
eax=00000000 ebx=022dfef0 ecx=00000000 edx=00000000 esi=00000002 edi=00000000
eip=7d61d06f esp=022dfea8 ebp=022dff48
cs=0023 ss=002b ds=002b es=002b fs=0053 gs=002b efl=00000202

- Callstack -
ChildEBP RetAddr Args to Child
022dff48 7d63f9a8 00000002 022dff70 00000000 000493e0 !ZwWaitForMultipleObjects+0x0 SymGetModuleInfo(): GetLastError = '87' Address = '7d61d06f'
022dffb8 7d4dfe37 00000000 00000000 00000000 00000000 !RtlSetEnvironmentStrings+0x0 SymGetModuleInfo(): GetLastError = '87' Address = '7d63f9a8'
022dffec 00000000 7d63f806 00000000 00000000 00000000 !FlsSetValue+0x0 SymGetModuleInfo(): GetLastError = '87' Address = '7d4dfe37'


*** Debug Message Dump ****


*** Foreground Window Data ***
Window Name :
Window Class :
Window Process ID: 0
Window Thread ID : 0

Exiting...

</stderr_txt>
]]>
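
For anyone triaging a dump like the one above, here is a minimal sketch of how the interesting lines could be pulled out of the stderr text. It is not part of BOINC or the GPUGRID application; the function name `summarize_stderr` and the small lookup table are my own assumptions for illustration. The two Win32 codes in the table are the ones that appear in this log: 126 is ERROR_MOD_NOT_FOUND (the optional debugger DLLs were not found) and 87 is ERROR_INVALID_PARAMETER.

```python
import re

# Standard Win32 meanings for the two GetLastError codes seen in this dump.
WIN32_ERRORS = {
    87: "ERROR_INVALID_PARAMETER",
    126: "ERROR_MOD_NOT_FOUND",
}

# Hypothetical patterns, written only against the log format shown in this post.
MDIO_RE = re.compile(r"^MDIO ERROR: (.+)$")
LAST_ERROR_RE = re.compile(r"GetLastError = '?(\d+)'?")


def summarize_stderr(text):
    """Return (list of MDIO error messages, {GetLastError code: count})."""
    mdio_errors = []
    last_errors = {}
    for line in text.splitlines():
        line = line.strip()
        match = MDIO_RE.match(line)
        if match:
            mdio_errors.append(match.group(1))
        # A single callstack line can carry its own GetLastError value.
        for code in LAST_ERROR_RE.findall(line):
            code = int(code)
            last_errors[code] = last_errors.get(code, 0) + 1
    return mdio_errors, last_errors


if __name__ == "__main__":
    import sys

    # Usage: python summarize.py stderr.txt  (the file name is an assumption)
    with open(sys.argv[1], encoding="utf-8", errors="replace") as handle:
        mdio, codes = summarize_stderr(handle.read())
    for message in mdio:
        print("MDIO:", message)
    for code, count in sorted(codes.items()):
        print(code, WIN32_ERRORS.get(code, "unknown"), "x", count)
```

Run against this result, it would list the `restart.coor` and `restart.vel` read errors first. Those zero-atom restart files look like the more relevant symptom here than the repeated symbol-lookup failures, though that reading is only a guess from the log.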
