
Message boards : Graphics cards (GPUs) : WU crash on 9800 GT/GTX, but not on GTX 260

Dennis-TW
Joined: 15 Jan 09
Posts: 6
Credit: 113,514,591
RAC: 0
Level
Cys
Message 13301 - Posted: 29 Oct 2009 | 6:26:04 UTC
Last modified: 29 Oct 2009 | 6:29:23 UTC

In the last 6 days I had two work units crash, after 50,000 and 73,000 seconds respectively.

http://www.gpugrid.net/workunit.php?wuid=865833

http://www.gpugrid.net/workunit.php?wuid=869267

Both were running on my NVIDIA GeForce 9800 GT (1024MB), driver: 19062, using BOINC 6.6.36.

Apparently I wasn't the first to have these units crash (the others were also on 9800 GT/GTX cards with BOINC 6.6.36). However, I was the last, since both units ran fine on a GTX 260 afterwards.

The error message in all cases was Cuda error: Kernel [pme_fill_charges_accumulate] failed in file 'fillcharges.cu' in line 73 : unknown error.

Any suggestions on what to do? Apart from these two work units, none of my other units had any problems. It would just be nice to avoid losing more than 100k seconds of GPU time within one week... Thanks.
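For anyone collecting these failures across several hosts, the error line quoted above has a regular shape that can be picked apart automatically. The following is only a rough sketch in Python: the pattern is inferred from the single message in this post, and the function name `parse_cuda_error` is made up for illustration, so other GPUGrid error lines may not match it.

```python
import re

# Pattern inferred from the one message quoted in this thread (an assumption,
# not an official GPUGrid log format).
CUDA_ERROR_RE = re.compile(
    r"Cuda error: Kernel \[(?P<kernel>[^\]]+)\] failed in file "
    r"'(?P<file>[^']+)' in line (?P<line>\d+) : (?P<reason>.+)"
)


def parse_cuda_error(text):
    """Return kernel, file, line and reason from an error line, or None."""
    match = CUDA_ERROR_RE.search(text)
    if match is None:
        return None
    info = match.groupdict()
    info["line"] = int(info["line"])
    # Drop surrounding whitespace and the trailing full stop of the sentence.
    info["reason"] = info["reason"].strip().rstrip(".")
    return info


message = (
    "Cuda error: Kernel [pme_fill_charges_accumulate] failed in file "
    "'fillcharges.cu' in line 73 : unknown error."
)
print(parse_cuda_error(message))
```

Grouping the parsed results by kernel name and card model would show quickly whether the same kernel is failing on every 9800-series host.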

Dennis-TW
Joined: 15 Jan 09
Posts: 6
Credit: 113,514,591
RAC: 0
Level
Cys
Message 13359 - Posted: 2 Nov 2009 | 16:21:08 UTC

And yet another one bit the dust... that makes 3 failures out of the last 13.

Any comment on this issue, or should I look for another CUDA project?

zpm
Joined: 2 Mar 09
Posts: 159
Credit: 13,639,818
RAC: 0
Level
Pro
Message 13360 - Posted: 2 Nov 2009 | 18:35:15 UTC - in response to Message 13359.

There will be some bad apples every once in a while unless someone really screws up the compiler. Just the other day I had my first WU error since July... 4 months without a single error is lucky.

Dennis-TW
Joined: 15 Jan 09
Posts: 6
Credit: 113,514,591
RAC: 0
Level
Cys
Message 13362 - Posted: 3 Nov 2009 | 5:56:17 UTC - in response to Message 13360.

Sure, errors may occur once in a while, but this one seems to be systematic, since the 9800 GT tends to crash while the GTX 260 has no problem with the same WU. Apart from not wanting to waste lots of GPU hours, I would really like to help by reporting things like this, but it would also be nice to get some feedback.

This is also the first time since July that I have run into errors, but 3 out of 13 is quite a lot of bad apples at one time...

Daniel.Ahlborn
Joined: 12 Jan 09
Posts: 5
Credit: 3,359,168
RAC: 0
Level
Ala
Message 13702 - Posted: 26 Nov 2009 | 10:46:22 UTC - in response to Message 13362.

It seems that all of a sudden the G92-based and older cards are computing bullshit.

I can report a similar problem here: my GT200b machine does the job just fine, but my G92-based machine has recently been producing a 100% error rate.

I am thinking of switching that one to another project, but I have no idea which one yet.
Collatz is out of order most of the time, Milkyway does not support my G92, there is no Linux CUDA client available for SETI, and Aqua has stepped back from GPU processing altogether. It seems I am stuck on this project.

Does anyone know an alternative project we could join instead of going crazy trying to figure out what's the matter here (once again)?

To be honest, I am really getting sick of that crap every now and then.

skgiven
Volunteer moderator
Volunteer tester
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Message 13751 - Posted: 1 Dec 2009 | 19:56:19 UTC - in response to Message 13702.

I'm experiencing similar problems with my G92 cards: about 40% success over the last 2 weeks, and even this has dropped way down since 28 Nov 09. I expect this might be work unit related, but there also seems to be a move away from supporting the G92 core, which is causing more failures. Perhaps the very long run time is to blame; it has been creeping up over the last 6 months. The longer a work unit runs, the greater the chance of an error.
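The point about run time can be made concrete with a rough sketch. It assumes, purely for illustration, that fatal errors arrive independently at a constant rate per hour; the 0.02 rate below is a made-up number, not a measurement from any card in this thread.

```python
import math


def failure_probability(hourly_error_rate, runtime_hours):
    """Chance of at least one fatal error during a run.

    Assumes independent errors at a constant hourly rate (a hypothetical
    model, not something measured on GPUGrid).
    """
    return 1.0 - math.exp(-hourly_error_rate * runtime_hours)


# Hypothetical rate of 0.02 fatal errors per hour, for illustration only.
for hours in (6, 12, 24):
    chance = failure_probability(0.02, hours)
    print(f"{hours:2d} h run: {chance:.1%} chance of failing")
```

Under this model the chance of losing a work unit keeps climbing as the run gets longer, which is consistent with the observation that failures became more frequent as run times crept up.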

You might want to look into hooking your G92 cards up to Folding@home.
It is not a BOINC project, so it has its own client, but there is always work to be done and it seems to be more stable than the other projects you mentioned.

On several occasions running Aqua I lost huge amounts of work when it failed. The last time I lost over 450 hours; that was 19 days of non-stop computing lost about 85% of the way in! I'm not surprised they gave up.

You could also try Einstein. This does use the BOINC client. It has only recently begun to use CUDA, so don't expect too much from it.

I'm going to give my two G92 cards a few more days before I pull them away from here, and I will be trying to make them both as stable as possible in the meantime.

I have my doubts about pushing for G200 and above here.

Zydor
Joined: 8 Feb 09
Posts: 252
Credit: 1,309,451
RAC: 0
Level
Ala
Message 13834 - Posted: 8 Dec 2009 | 16:19:20 UTC - in response to Message 13751.

I was having lots of problems with my 9800GTX, and a while back I had to stop running GPUGrid WUs as they were crashing too often. The card had crunched very well in the past, and others certainly had 9800GTXs running with no issues, so it was all a little strange, but I couldn't take the blue screens anymore. I suspected the driver but could not say so definitively. I am paranoid about cleaning out old driver bits, so leftovers were not the cause.

I kept up with new drivers as they came out, with little change for me until now. Then I changed to 195.62. It was a whole new world... blue screens disappeared and reliability returned, along with my sanity...

I have run it for a couple of weeks now with no changes: rock solid, with one blue screen, and that was my fault. I will probably get back to crunching here again, as I miss doing these WUs for many reasons.

The new drivers are worth a shot. They have worked for me so far; I have yet to do the acid test with a GPUGrid WU, but it is looking good.

Regards
Zy

skgiven
Volunteer moderator
Volunteer tester
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Message 13837 - Posted: 8 Dec 2009 | 18:12:37 UTC - in response to Message 13834.

Some of the work units were not compatible with my G92 cards:
TONI-HERG is bad on G92 cards.
