Message boards : Graphics cards (GPUs) : blast of fast failures...

Skip Da Shu
Joined: 13 Jul 09
Posts: 38
Credit: 41,955,165
RAC: 0
Message 11990 - Posted: 21 Aug 2009 | 1:05:43 UTC

Does anybody know what this might be suggesting?

<core_client_version>6.4.5</core_client_version>
<![CDATA[
<message>
process exited with code 1 (0x1, -255)
</message>
<stderr_txt>
# Using CUDA device 0
# Device 0: "GeForce 9600 GSO"
# Clock rate: 1.62 GHz
# Total amount of global memory: 401932288 bytes
# Number of multiprocessors: 12
# Number of cores: 96
MDIO ERROR: cannot open file "restart.coor"
Cuda error: Kernel [pme_fill_charges_overflow] failed in file 'fillcharges.cu' in line 96 : unspecified launch failure.

</stderr_txt>

I error'd out a whole slew of these (a few seconds each) after completing a good WU.

http://www.gpugrid.net/result.php?resultid=1137861
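
For anyone triaging a batch of failed results like this one, here is a minimal sketch of pulling the device name and the error lines out of a `stderr_txt` excerpt. The function name `summarize_stderr` and the rule "any line containing the word error" are assumptions made for illustration; they are not part of BOINC or the GPUGRID application.

```python
import re


def summarize_stderr(stderr_text):
    """Pull the device name and any error lines out of a stderr_txt excerpt."""
    device = None
    errors = []
    for line in stderr_text.splitlines():
        line = line.strip()
        # Device lines look like: # Device 0: "GeForce 9600 GSO"
        match = re.match(r'#\s*Device\s+(\d+):\s*"([^"]+)"', line)
        if match:
            device = match.group(2)
        # Assumption: any line mentioning the word "error" is a failure hint.
        elif re.search(r"\berror\b", line, re.IGNORECASE):
            errors.append(line)
    return {"device": device, "errors": errors}


# Lines copied from the log posted above.
SAMPLE = '''# Using CUDA device 0
# Device 0: "GeForce 9600 GSO"
MDIO ERROR: cannot open file "restart.coor"
Cuda error: Kernel [pme_fill_charges_overflow] failed in file 'fillcharges.cu' in line 96 : unspecified launch failure.
'''

if __name__ == "__main__":
    summary = summarize_stderr(SAMPLE)
    print(summary["device"])
    for err in summary["errors"]:
        print(err)
```

Run over the excerpt above, this reports two error lines: the `restart.coor` message and the CUDA launch failure.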
____________
- da shu @ HeliOS,
"A child's exposure to technology should never be predicated on an ability to afford it."

GDF
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Joined: 14 Mar 07
Posts: 1947
Credit: 629,356
RAC: 0
Message 12014 - Posted: 21 Aug 2009 | 13:36:07 UTC - in response to Message 11990.

Try updating the client to 6.6.36.

Several bugs have been fixed since 6.4.5.

gdf

Skip Da Shu
Message 12019 - Posted: 21 Aug 2009 | 15:30:53 UTC - in response to Message 12014.

All my other machines are running 6.4.5 w/o issues, including one with the same mobo, same CPU and same video card. The only oddities about this machine are:
1) It runs Xubuntu v8.04 vs the Ubuntu v9.04 the others are running.
2) It has a more recent driver installed from the NVIDIA site (v185.xx, I believe) vs the Debian v180.44 the others are running.

What bothers me is that it ran one WU to a good end of job, so I thought it was good to go.
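
The comparison above can be sketched as a small configuration diff, which makes it easier to see which variables actually differ between a failing and a working machine. The values come from this thread; the dictionary keys and the helper name `config_diff` are hypothetical labels chosen for illustration.

```python
def config_diff(a, b):
    """Return {key: (a_value, b_value)} for every setting that differs."""
    keys = sorted(set(a) | set(b))
    return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}


# Values as reported in this thread; key names are illustrative only.
failing = {
    "boinc_client": "6.4.5",
    "os": "Xubuntu 8.04",
    "nvidia_driver": "185.xx",
    "gpu": "GeForce 9600 GSO",
}
working = {
    "boinc_client": "6.4.5",
    "os": "Ubuntu 9.04",
    "nvidia_driver": "180.44",
    "gpu": "GeForce 9600 GSO",
}

if __name__ == "__main__":
    for key, (bad, good) in config_diff(failing, working).items():
        print(f"{key}: failing={bad} working={good}")
```

Only the OS and the driver differ here, so those are the first candidates to rule out before blaming the client version.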

I'll look at doing a manual 6.6.36 install tonight.
____________
- da shu @ HeliOS,
"A child's exposure to technology should never be predicated on an ability to afford it."

Skip Da Shu
Message 12029 - Posted: 22 Aug 2009 | 0:15:22 UTC

Well, so far it's looking better. I installed the Berkeley v6.6.36 x64 over top of the Debian executables (so I can still use the Debian/Ubuntu scripts) and got 2 new WUs (did y'all change the limit from 5 to 2 today?). One has been running for nearly 20 minutes, instead of erroring out in under a minute as they were before. I'm hopeful.

I had also backed off the GPU core clock a bit, but I really think it was the v6.6.36 install. Not sure I can fathom why, but working is working.

Thanx, Skip
____________
- da shu @ HeliOS,
"A child's exposure to technology should never be predicated on an ability to afford it."

BarryAZ
Joined: 16 Apr 09
Posts: 163
Credit: 919,609,146
RAC: 72,669
Message 12062 - Posted: 23 Aug 2009 | 17:53:43 UTC - in response to Message 12014.

Interesting -- from what I have seen (along with a number of other reports), 6.6.36 introduced its own set of issues as well.

What makes it difficult to pin down is the number of variables in the mix: the BOINC client version, the Nvidia driver version, the OS, and the specific video adapter.

One area where 6.6.36 can be problematic (and I fear this may be the case with subsequent updates) is the handling of CPU/GPU mixes where most projects are still CPU only.


ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Message 12116 - Posted: 25 Aug 2009 | 21:41:29 UTC - in response to Message 12062.

"One area where the 6.6.36 can be problematic (and I fear this may be the case with subsequent updates) is the handling of CPU/GPU mixes"

I fear this has been the problem ever since the first GPU was used for BOINC ;)

MrS
____________
Scanning for our furry friends since Jan 2002
