Message boards : Number crunching : acemd2_6.15_windows_intelx86__cuda31 fault caused my Windows to crash

Author Message
tim
Joined: 19 Jun 11
Posts: 7
Credit: 272,000,767
RAC: 0
Message 21870 - Posted: 22 Aug 2011 | 5:34:19 UTC

Hi,

The machine did not reboot; it just halted and I had to reset it this morning. The Windows reliability centre reported the cause as:

Description
Faulting Application Path: C:\ProgramData\BOINC\projects\www.gpugrid.net\acemd2_6.15_windows_intelx86__cuda31

Problem signature
Problem Event Name: APPCRASH
Application Name: acemd2_6.15_windows_intelx86__cuda31
Application Version: 0.0.0.0
Application Timestamp: 4e11af1b
Fault Module Name: acemd2_6.15_windows_intelx86__cuda31
Fault Module Version: 0.0.0.0
Fault Module Timestamp: 4e11af1b
Exception Code: 40000015
Exception Offset: 0003af9a
OS Version: 6.1.7601.2.1.0.256.1
Locale ID: 2057
Additional Information 1: 97d3
Additional Information 2: 97d3f4d93e2afbff856d53f1a89d983e
Additional Information 3: bd47
Additional Information 4: bd47a27867296d73243181d29511c41e

Extra information about the problem
Bucket ID: 2516797008
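
A few fields in the report above are plain hexadecimal and can be decoded offline. The sketch below is only an illustration, not anything produced by the project: it assumes that `Application Timestamp` is the executable's link time in seconds since the Unix epoch (the usual convention for Windows executables) and that `Exception Code` is an NTSTATUS value. The helper names `parse_signature` and `decode_timestamp` are hypothetical, and the sample text is a subset of the report.

```python
from datetime import datetime, timezone

# Subset of the "Problem signature" lines copied from the report above.
REPORT = """\
Problem Event Name: APPCRASH
Application Name: acemd2_6.15_windows_intelx86__cuda31
Application Timestamp: 4e11af1b
Exception Code: 40000015
Exception Offset: 0003af9a
OS Version: 6.1.7601.2.1.0.256.1
Locale ID: 2057
"""


def parse_signature(text):
    """Split 'Key: value' lines into a dict (hypothetical helper)."""
    fields = {}
    for line in text.splitlines():
        # Split on the first colon only, so values such as C:\... stay intact.
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def decode_timestamp(hex_seconds):
    """Interpret a hex string as seconds since the Unix epoch (assumption)."""
    return datetime.fromtimestamp(int(hex_seconds, 16), tz=timezone.utc)


fields = parse_signature(REPORT)
build_time = decode_timestamp(fields["Application Timestamp"])
exception_code = int(fields["Exception Code"], 16)

print(build_time.isoformat())  # 2011-07-04T12:16:27+00:00
print(hex(exception_code))     # 0x40000015
```

Under those assumptions the application was built in early July 2011. As far as I know, 0x40000015 is listed in Microsoft's NTSTATUS table as STATUS_FATAL_APP_EXIT, which would mean the application aborted itself rather than hitting an access violation. Neither value says anything about why the whole system halted.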

Profile skgiven
Volunteer moderator
Volunteer tester
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Message 21871 - Posted: 22 Aug 2011 | 8:54:38 UTC - in response to Message 21870.

What 'just halted': the operating system or the application?
Was there a Windows error message with that message?

I take it you have W7. Is it up to date?
What GPU and what driver?

tim
Joined: 19 Jun 11
Posts: 7
Credit: 272,000,767
RAC: 0
Message 21878 - Posted: 23 Aug 2011 | 11:30:37 UTC - in response to Message 21871.

Hi,

My whole system halted; I had to press reset when I got back into the office in the morning. The error message I posted is what I got from the Windows 7 reliability application after my system restarted.

My set-up is:

Nvidia driver 275.33
Windows 7 (64bit) all updates installed
AMD CPU
1 x Zotac GTX 275
1 x Zotac GTX 570
1 x Zotac GTX 590

At the bottom of the fault description it mentions a 'Bucket ID', but I do not know where to get this information.

Profile skgiven
Volunteer moderator
Volunteer tester
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Message 21889 - Posted: 26 Aug 2011 | 12:32:11 UTC - in response to Message 21878.
Last modified: 26 Aug 2011 | 12:32:50 UTC

This sounds like a one-off error, so it is probably nothing to worry about; one-off errors happen on most science projects.
The app crashed because of a bug in the app, the operating system (which tends not to blame itself for anything), the environment (too warm, a power dip), a hardware glitch, or an issue with the driver.

If the problem is with the app, there is nothing you or I can do about it, but there was a recent announcement about the pending release of a new app (CC1.3 and above only).
If the problem was with the operating system, all you can do is keep it maintained (up to date, defragmented, and the disk scanned).
I always recommend using a tool such as MSI Afterburner to control temperatures via the fan speed. Keeping the GPU cooler prevents problems, increases longevity, and can reduce power usage slightly (less leakage at lower temps).
You might want to try the latest driver from NVIDIA, though I don't know of any specific fix in 280, for Windows, that overcomes problems in 275:

http://www.nvidia.com/object/win7-winvista-64bit-280.26-whql-driver.html
http://www.nvidia.com/Download/index.aspx?lang=en-us

GL
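
On the temperature point above, a scripted check is one alternative to watching a GUI tool. This is a minimal sketch and not something the thread itself used: it assumes the `nvidia-smi` command-line tool that ships with the NVIDIA driver is on the PATH and supports the `--query-gpu` option (older drivers such as 275 may not). The function names, the 80 °C limit, and the sample readings are all made up for illustration.

```python
import subprocess

# nvidia-smi query asking for one CSV line per GPU: index, name, temperature.
QUERY = [
    "nvidia-smi",
    "--query-gpu=index,name,temperature.gpu",
    "--format=csv,noheader,nounits",
]


def parse_temperatures(csv_text):
    """Turn 'index, name, temp' lines into (index, name, temp_c) tuples."""
    readings = []
    for line in csv_text.splitlines():
        if not line.strip():
            continue
        # Peel off the first and last fields; the name is whatever is left.
        index, rest = line.split(",", 1)
        name, temp = rest.rsplit(",", 1)
        readings.append((int(index), name.strip(), int(temp)))
    return readings


def hot_gpus(readings, limit_c=80):
    """Return the readings at or above limit_c (an arbitrary example limit)."""
    return [r for r in readings if r[2] >= limit_c]


def read_temperatures():
    """Run nvidia-smi and parse its output; needs the tool to be installed."""
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_temperatures(result.stdout)


# Made-up sample output, used so the parsing can be tried without a GPU.
SAMPLE = "0, GeForce GTX 570, 83\n1, GeForce GTX 275, 71\n"
print(hot_gpus(parse_temperatures(SAMPLE)))
```

On a machine with a supported driver, `hot_gpus(read_temperatures())` would list the cards running at or above the chosen limit. The parser does not handle cards that report a non-numeric temperature.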

Xavier Zepherious
Joined: 9 Dec 11
Posts: 1
Credit: 1,809,076
RAC: 0
Message 22743 - Posted: 20 Dec 2011 | 3:47:26 UTC - in response to Message 21889.
Last modified: 20 Dec 2011 | 3:56:41 UTC

I had a bunch of them.

Killed 2, saved 1.

I didn't understand the issue with the first two (couldn't get them to work), so I had to kill them (i.e. ABORT),

even though I had done a bunch with no issues prior (I figured it was a few bad units).

It seems there is a memory issue at play here:

if I use too much memory for other tasks, this unit seems to fail.

Exact WU:

IIR249-NATHAN-FA5-31-100-RDH7308

Long runs (8-12 hours on fastest card) 6.15 (cuda31)

However, if I suspend it, clean up my projects, free up memory, and resume, it works.

The last one failed immediately upon running it (in fact 6 errors, all the same, upon running).

(I was using a big chunk of RAM for graphics work, i.e. photo work.)

Once I freed up memory the WU worked, and it hasn't failed yet.
