Message boards : Graphics cards (GPUs) : Is there a problem with acemd 6.69?

Dave_In_Oz
Joined: 13 Jul 09
Posts: 32
Credit: 287,042,950
RAC: 0
Message 12644 - Posted: 23 Sep 2009 | 2:16:54 UTC

I have had 4 work units in a row fail with error 107374. When I had a bit of a look I noticed that the last successful units were running v6.67.

Hardware Intel i7, 4GB RAM, NVIDIA GTX295 Driver 190.62

I reset the project after the first two failures, but the next pair of work units failed with the same computation error, 107374.

Any ideas?

Dave_In_Oz
Message 12660 - Posted: 23 Sep 2009 | 4:49:42 UTC - in response to Message 12644.

More have failed with the same error, 107374.

I'm running 64-bit Vista, a GTX295 with driver 190.62, and BOINC 6.6.36.

Error details :

Problem signature:
Problem Event Name: APPCRASH
Application Name: acemd_6.69_windows_intelx86__cuda.exe
Application Version: 0.0.0.0
Application Timestamp: 4ab753ae
Fault Module Name: cudart.dll!cudaRuntimeGetVersion
Fault Module Version: 6.0.6002.18005
Fault Module Timestamp: 49e03824
Exception Code: c0000139
Exception Offset: 0006f04e
OS Version: 6.0.6002.2.2.0.256.1
Locale ID: 3081
Additional Information 1: 9d13
Additional Information 2: 1abee00edb3fc1158f9ad6f44f0f6be8
Additional Information 3: 9d13
Additional Information 4: 1abee00edb3fc1158f9ad6f44f0f6be8
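
A note for readers trying to connect the numbers: the exception code c0000139 above is the Windows NTSTATUS value STATUS_ENTRYPOINT_NOT_FOUND, which would fit a fault module field that names a cudart.dll entry point. Read as a signed 32-bit integer, that code is -1073741511, and the "107374" quoted in this thread matches its leading digits. That link is an inference, not something the posters confirm. A minimal sketch of the conversion, with an illustrative function name:

```python
def ntstatus_to_exit_code(hex_code: str) -> int:
    """Interpret a 32-bit NTSTATUS hex string as a signed integer."""
    value = int(hex_code, 16)
    # Values with the top bit set are negative in two's complement.
    if value >= 0x80000000:
        value -= 0x100000000
    return value


print(ntstatus_to_exit_code("c0000139"))  # -1073741511
```

This only shows how the hexadecimal and decimal forms relate; it does not diagnose which entry point was missing.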

RyanChen
Joined: 2 Dec 08
Posts: 5
Credit: 3,027,593
RAC: 0
Message 12666 - Posted: 23 Sep 2009 | 6:39:21 UTC - in response to Message 12660.

Did you solve the problem?
I encountered a similar problem; resetting the project and rebooting did not help!

I tried replacing the cudart.dll and cufft.dll files in the GPUGRID project folder with the versions from the CUDA 2.3 runtime library, which I downloaded from http://tinyurl.com/lwr6av
Amazingly, it now runs without the error exit code! I'm not sure whether this is the right way to solve the problem, so remember to back up the files first and do it at your own risk.
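
For anyone who does try the swap described above, the backup step can be scripted. The following is a minimal sketch, not a recommended fix: the function name and both directory arguments are hypothetical, and the actual location of the GPUGRID project folder depends on the BOINC installation.

```python
import shutil
from pathlib import Path


def replace_with_backup(project_dir, replacement_dir,
                        names=("cudart.dll", "cufft.dll")):
    """Copy each named file from replacement_dir into project_dir,
    saving any existing copy as <name>.bak first."""
    project_dir = Path(project_dir)
    replacement_dir = Path(replacement_dir)

    # Check every replacement exists before touching anything,
    # so a missing file cannot leave the folder half-updated.
    for name in names:
        source = replacement_dir / name
        if not source.is_file():
            raise FileNotFoundError(source)

    backups = []
    for name in names:
        target = project_dir / name
        if target.exists():
            backup = target.with_name(target.name + ".bak")
            shutil.copy2(target, backup)
            backups.append(backup)
        shutil.copy2(replacement_dir / name, target)
    return backups


# Hypothetical usage; both paths are placeholders:
# replace_with_backup(r"<BOINC data dir>\projects\<gpugrid folder>", r"<folder with CUDA 2.3 DLLs>")
```

Restoring is then a matter of copying each .bak file back over its original name. Note that a project reset will normally re-download the project's own files anyway.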

Dave_In_Oz
Message 12669 - Posted: 23 Sep 2009 | 7:50:46 UTC - in response to Message 12666.

Sounds like you have a similar problem to mine. Resets and reboots did not fix it for me.

I will try the CUDA 2.3 runtime and see what happens, although I suspect there is something wrong with the code being delivered by GPUGRID. Or are we just tangled up in the transition from 2.2 to 2.3?

GDF
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Joined: 14 Mar 07
Posts: 1957
Credit: 629,356
RAC: 0
Message 12672 - Posted: 23 Sep 2009 | 8:16:29 UTC - in response to Message 12669.

New application coming out in a couple of hours.
gdf

Dave_In_Oz
Message 12673 - Posted: 23 Sep 2009 | 8:24:32 UTC - in response to Message 12672.

Cool.

Will that mean I can leave my toy running the current drivers from NVIDIA, i.e. 190.62?

Dave_In_Oz
Message 12675 - Posted: 23 Sep 2009 | 8:34:49 UTC - in response to Message 12666.


Just for those thinking about playing with CUDA versions: I downloaded the CUDA V2.3 driver from CUDA Zone (I won't use drivers from sources other than the manufacturer's web site) and it continued to get the same error, so it is probably not worthwhile.

I was going to download the 2.2 driver, then saw the msg from GDF so I will wait and see what happens a bit later when he has organised his magic :)

I have a preference for running science related to health rather than things like Milkyway or SETI. I do still run SETI, though not on my GPUs, because I have been running it for most of 10 years.

GDF
Message 12679 - Posted: 23 Sep 2009 | 9:04:05 UTC - in response to Message 12675.
Last modified: 23 Sep 2009 | 9:04:25 UTC

New application out. I am not sure that it will solve the problem, but it is worth a try.

acemd_6.71


gdf

Dave_In_Oz
Message 12681 - Posted: 23 Sep 2009 | 9:39:17 UTC - in response to Message 12679.

Hi GDF,

I have reset the project and let it schedule work, and I am now getting:

AECMD beta version is not available for this computer
Full-atom molecular dynamics on Cell Processor is not available for your type of computer

Dave

ignasi
Joined: 10 Apr 08
Posts: 254
Credit: 16,836,000
RAC: 0
Message 12683 - Posted: 23 Sep 2009 | 9:39:46 UTC - in response to Message 12681.

Try again now.
ignasi

Dave_In_Oz
Message 12685 - Posted: 23 Sep 2009 | 9:48:56 UTC - in response to Message 12683.

Hi Ignasi,

Looking good, I reset project again and it successfully downloaded and started 2 tasks on the GTX295. They are approaching 1% done which is further than they were getting earlier.

Thanks for your help

Dave

dataman
Joined: 18 Sep 08
Posts: 36
Credit: 100,352,867
RAC: 0
Message 12688 - Posted: 23 Sep 2009 | 16:28:13 UTC

The 6.71 app seems to be working fine now on 6.10.6 and GTX260 cards. :)

Win7 64-bit, BOINC 6.10.6, nVidia 190.62, CUDA 2.3

Thanks!

GDF
Message 12689 - Posted: 23 Sep 2009 | 16:31:14 UTC - in response to Message 12688.

Wow. I saw that you have many GTX260s. Did you have any problems with the Nvidia FFT bug?

gdf

dataman
Message 12690 - Posted: 23 Sep 2009 | 16:40:25 UTC - in response to Message 12689.
Last modified: 23 Sep 2009 | 17:31:23 UTC

Not really. Until 6.69 came out yesterday, I had maybe 1 failed WU per day across 7 GTX260s, and it happened on the GTX285 too. When 6.69 came out, every WU failed on every machine. I have XP SP3 running OK now, plus 1 Win7 machine. I will convert the Vista 64-bit machine today.

In all, I have 6 Win7, 1 Vista and 1 XP machines, with 7 GTX260s and 1 GTX285.

Thanks for the quick resolution!!

EDIT: The Vista machine is working fine now also. :) Now I will put the rest of the machines on the config I posted earlier.

Rabinovitch
Joined: 25 Aug 08
Posts: 143
Credit: 64,937,578
RAC: 0
Message 12698 - Posted: 24 Sep 2009 | 2:19:19 UTC

Very good! I will try to get a couple of WUs for testing too, since my PC also runs Win 7 x64 and has a GTX260 (unfortunately, only one). Now I need to wait for the Collatz WUs to finish... :-)
____________
From Siberia with love!

Rabinovitch
Message 12700 - Posted: 24 Sep 2009 | 4:30:42 UTC

The first WU under the 6.71 app failed on Win 7 x64 (GTX260):

24.09.2009 9:23:55 GPUGRID Starting task 838-GIANNI_BIND001-36-100-RND2858_2 using acemd version 671
24.09.2009 11:04:23 GPUGRID Computation for task 838-GIANNI_BIND001-36-100-RND2858_2 finished
24.09.2009 11:04:23 GPUGRID Output file 838-GIANNI_BIND001-36-100-RND2858_2_1 for task 838-GIANNI_BIND001-36-100-RND2858_2 absent
24.09.2009 11:04:23 GPUGRID Output file 838-GIANNI_BIND001-36-100-RND2858_2_2 for task 838-GIANNI_BIND001-36-100-RND2858_2 absent
24.09.2009 11:04:23 GPUGRID Output file 838-GIANNI_BIND001-36-100-RND2858_2_3 for task 838-GIANNI_BIND001-36-100-RND2858_2 absent
24.09.2009 11:04:30 GPUGRID Started upload of 838-GIANNI_BIND001-36-100-RND2858_2_0
24.09.2009 11:04:30 GPUGRID Started upload of 838-GIANNI_BIND001-36-100-RND2858_2_4
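
For anyone checking their own BOINC messages for the same symptom, the "absent" output files can be pulled out of the log with a short script. This is a rough sketch that assumes the message wording is exactly as in the lines above; the function name is illustrative and other BOINC versions may word the message differently.

```python
import re

# Pattern assumed from the log lines quoted in this post.
ABSENT = re.compile(r"Output file (\S+) for task (\S+) absent")


def absent_outputs(log_lines):
    """Map each task name to the list of output files reported absent."""
    missing = {}
    for line in log_lines:
        match = ABSENT.search(line)
        if match:
            output_file, task = match.groups()
            missing.setdefault(task, []).append(output_file)
    return missing


sample = [
    "24.09.2009 11:04:23 GPUGRID Computation for task 838-GIANNI_BIND001-36-100-RND2858_2 finished",
    "24.09.2009 11:04:23 GPUGRID Output file 838-GIANNI_BIND001-36-100-RND2858_2_1 for task 838-GIANNI_BIND001-36-100-RND2858_2 absent",
    "24.09.2009 11:04:23 GPUGRID Output file 838-GIANNI_BIND001-36-100-RND2858_2_2 for task 838-GIANNI_BIND001-36-100-RND2858_2 absent",
]
print(absent_outputs(sample))
```

A task that finishes with several output files absent, as here, is what the poster is reporting as a failure.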


Paul D. Buck
Joined: 9 Jun 08
Posts: 1050
Credit: 37,321,185
RAC: 0
Message 12701 - Posted: 24 Sep 2009 | 6:20:00 UTC

I guess I am just slow to react, but I just noticed that I have quite a few failed tasks with this version in the last two days, across, I think, all of my systems. Though I should note that it looks like I have one in progress that is going to complete, as it is more than 90% done ...

So, the following tasks had the 139 error:

Computer 19410: GTX 295 cards

- 757-GIANNI_BIND001-30-100-RND7538_2
- 492-GIANNI_BINDT1-7-10-RND6252_1
- m35000-IBUCH_1_pYEEI_2209-0-20-RND5275_1
- 72-KASHIF_HIVPR_sub_so_ba1-33-100-RND2448_1
- PMEcube25-OTTO_HERG4-4-40-RND3779_2
- PME191-TONI_HERG2-6-40-RND1676_1
- 131-KASHIF_HIVPR_sub_so_ba2-34-100-RND4021_1
- 171-GIANNI_BIND001-27-100-RND7217_1
- 778-GIANNI_BIND001-44-100-RND0745_2
- 717-GIANNI_BIND001-38-100-RND3246_2
- 196-KASHIF_HIVPR_sub_so_ba2-37-100-RND7121_1
- 19-GIANNI_CLOZ-16-50-RND8348_2


Computer 36139: GTX 260 cards

- 570-GIANNI_BIND001-35-100-RND1972_2
- PMEcube45-OTTO_HERG5-8-40-RND4347_2
- 100-GIANNI_VIL-0-100-RND1180_1
- PMEcube60-TONI_HERG2-10-40-RND5555_1

I will note that I have not seen a string of failures like this in, like, forever ...



GDF
Message 12704 - Posted: 24 Sep 2009 | 7:42:04 UTC - in response to Message 12701.

Hi Paul,
you are filing on the previous application (the first release).
Since then we have uploaded another two: 6.70 and 6.71.

gdf

Paul D. Buck
Message 12719 - Posted: 24 Sep 2009 | 15:56:48 UTC - in response to Message 12704.

Only because the deaths were on the date that I was filing the report, yesterday. And I still had one in hand from that version.

I knew it was probably a waste of time, but one never knows ... erring on the side of pointing out problems is not a bad rule ...
