Message boards : Graphics cards (GPUs) : GPU problem

Temujin
Joined: 12 Jul 07
Posts: 100
Credit: 21,848,502
RAC: 0
Message 1279 - Posted: 19 Jul 2008 | 22:46:08 UTC

I'm getting a lot of errors as below

Sat 19 Jul 2008 23:15:13 BST|PS3GRID|Starting Gn30779-TEST12-0-5-acemd_0
Sat 19 Jul 2008 23:15:13 BST|PS3GRID|Starting task Gn30779-TEST12-0-5-acemd_0 using acemd version 625
Sat 19 Jul 2008 23:15:26 BST|PS3GRID|Computation for task Gn30779-TEST12-0-5-acemd_0 finished
Sat 19 Jul 2008 23:15:26 BST|PS3GRID|Output file Gn30779-TEST12-0-5-acemd_0_1 for task Gn30779-TEST12-0-5-acemd_0 absent
Sat 19 Jul 2008 23:15:26 BST|PS3GRID|Output file Gn30779-TEST12-0-5-acemd_0_2 for task Gn30779-TEST12-0-5-acemd_0 absent
Sat 19 Jul 2008 23:15:26 BST|PS3GRID|Output file Gn30779-TEST12-0-5-acemd_0_3 for task Gn30779-TEST12-0-5-acemd_0 absent


This usually happens when the 2nd WU of a download batch runs (or the 3rd/4th). I don't think my rig has ever successfully processed a full batch of WUs.
The 1st WU of a batch normally runs to completion, so I set my "connect every" to 0.1 with 0 cache to try to download only 1 WU at a time, but the above error came from a single-WU download (this one).
The WU can process for anything from 13 seconds (as above) to 4 hours (this one) before failing.

Is anyone else getting errors like these?
Any ideas why it's happening?
Anyone else using an 8800GS successfully?


Fedora 7, Q6600 (running 3xseti & 1xps3grid), Asus 8800GS running 173.14 drivers (also happened with 169.09)
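
For anyone wanting to pull failures like this out of a longer BOINC message log, here is a minimal Python sketch. It assumes the `timestamp|project|message` layout shown in the log above and only the three message shapes that appear in this thread; the function and pattern names are made up for illustration and are not part of BOINC.

```python
import re
from datetime import datetime

# Sample copied from the message log in the post above.
LOG = """\
Sat 19 Jul 2008 23:15:13 BST|PS3GRID|Starting Gn30779-TEST12-0-5-acemd_0
Sat 19 Jul 2008 23:15:13 BST|PS3GRID|Starting task Gn30779-TEST12-0-5-acemd_0 using acemd version 625
Sat 19 Jul 2008 23:15:26 BST|PS3GRID|Computation for task Gn30779-TEST12-0-5-acemd_0 finished
Sat 19 Jul 2008 23:15:26 BST|PS3GRID|Output file Gn30779-TEST12-0-5-acemd_0_1 for task Gn30779-TEST12-0-5-acemd_0 absent
Sat 19 Jul 2008 23:15:26 BST|PS3GRID|Output file Gn30779-TEST12-0-5-acemd_0_2 for task Gn30779-TEST12-0-5-acemd_0 absent
Sat 19 Jul 2008 23:15:26 BST|PS3GRID|Output file Gn30779-TEST12-0-5-acemd_0_3 for task Gn30779-TEST12-0-5-acemd_0 absent
"""

# Hypothetical patterns, written only against the lines quoted in this thread.
STARTED = re.compile(r"^(?:Starting|Restarting) task (\S+) using ")
FINISHED = re.compile(r"^Computation for task (\S+) finished$")
ABSENT = re.compile(r"^Output file \S+ for task (\S+) absent$")


def parse_time(stamp):
    # "Sat 19 Jul 2008 23:15:13 BST": drop the weekday and zone name, parse the rest.
    parts = stamp.split()
    return datetime.strptime(" ".join(parts[1:5]), "%d %b %Y %H:%M:%S")


def summarise(log_text):
    """Return {task: {"seconds": wall-clock run time or None, "absent": missing outputs}}."""
    started = {}
    tasks = {}
    for line in log_text.splitlines():
        fields = line.split("|", 2)
        if len(fields) != 3:
            continue
        stamp, _project, message = fields
        when = parse_time(stamp)
        m = STARTED.match(message)
        if m:
            started[m.group(1)] = when
            tasks.setdefault(m.group(1), {"seconds": None, "absent": 0})
            continue
        m = FINISHED.match(message)
        if m:
            entry = tasks.setdefault(m.group(1), {"seconds": None, "absent": 0})
            if m.group(1) in started:
                entry["seconds"] = (when - started[m.group(1)]).total_seconds()
            continue
        m = ABSENT.match(message)
        if m:
            tasks.setdefault(m.group(1), {"seconds": None, "absent": 0})["absent"] += 1
    return tasks


for task, info in summarise(LOG).items():
    print(task, info)
```

The run time here is only the gap between the last start (or restart) line and the finish line, so it says nothing about earlier sessions of a task that was suspended and resumed.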

GDF
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Joined: 14 Mar 07
Posts: 1957
Credit: 629,356
RAC: 0
Message 1282 - Posted: 20 Jul 2008 | 8:26:11 UTC - in response to Message 1279.

Thanks for the accurate description. We knew there was a problem with a WU failing at start after a successful one, but this makes it much clearer.

Keep in touch. We hope to fix it soon.

GDF

UBT - NaRyan
Joined: 16 Jul 08
Posts: 68
Credit: 1,242,980
RAC: 0
Message 1283 - Posted: 20 Jul 2008 | 9:29:24 UTC

I also have had 1 workunit fail right at the start.
Task ID 38220, and by the looks of things it's one of the ones Temujin had fail on him too (but after 924 Seconds)

Mine was moaning about "error while loading shared libraries: libcudart.so: cannot open shared object file: No such file or directory"

Temujin
Message 1296 - Posted: 21 Jul 2008 | 9:21:56 UTC - in response to Message 1282.

While waiting for a fix, do you know of any tricks to reduce the number of failures as I've hit my 4 WU/day limit today with 4 failures :(
I've tried restarting boinc but that didn't work. Would restarting the machine help?
Is there anything I can run to clean things up?

I don't mean to sound ungrateful/impatient, but any idea how long "soon" will be?
Are we talking days or weeks?

How has the take up of the GPU app been?
Any idea how many GPU users you have?

Stefan Ledwina
Joined: 16 Jul 07
Posts: 464
Credit: 135,911,881
RAC: 68
Message 1297 - Posted: 21 Jul 2008 | 9:40:20 UTC

I guess the fix will be implemented very soon... ;)

I also have had issues with one of my cards (driver problems plus some failing tasks like you described them), and hit the max. of 4 WUs per day. Unfortunately there is nothing you can do to reset it...


____________

pixelicious.at - my little photoblog

GDF
Message 1299 - Posted: 21 Jul 2008 | 15:19:26 UTC - in response to Message 1296.
Last modified: 21 Jul 2008 | 15:19:46 UTC

Hi,
We are looking into it. The problem is that we cannot replicate it here. It does happen to others, but much less frequently. At the moment your machine and Stefan's account for 90% of all errors; otherwise things are going well. It could be a driver problem for both. I hope that we can do something within days.



GDF

Temujin
Message 1300 - Posted: 21 Jul 2008 | 16:13:57 UTC - in response to Message 1299.

> At the moment your machine and Stefan's are summing up 90% of all errors
Oops

> I hope that we can do something in days.
Many thanks

UBT - NaRyan
Message 1302 - Posted: 21 Jul 2008 | 16:46:26 UTC

Just got the same error: Task ID: 38629

Mon 21 Jul 2008 17:32:53 BST|PS3GRID|Restarting task xD30815-TEST12-1-5-acemd_0 using acemd version 625
Mon 21 Jul 2008 17:32:54 BST|PS3GRID|Computation for task xD30815-TEST12-1-5-acemd_0 finished
Mon 21 Jul 2008 17:32:54 BST|PS3GRID|Output file xD30815-TEST12-1-5-acemd_0_0 for task xD30815-TEST12-1-5-acemd_0 absent
Mon 21 Jul 2008 17:32:54 BST|PS3GRID|Output file xD30815-TEST12-1-5-acemd_0_1 for task xD30815-TEST12-1-5-acemd_0 absent
Mon 21 Jul 2008 17:32:54 BST|PS3GRID|Output file xD30815-TEST12-1-5-acemd_0_2 for task xD30815-TEST12-1-5-acemd_0 absent
Mon 21 Jul 2008 17:32:54 BST|PS3GRID|Output file xD30815-TEST12-1-5-acemd_0_3 for task xD30815-TEST12-1-5-acemd_0 absent

Mind you, BOINC had been jumping about the workunits like a demented flea yesterday, because it thought it would not reach the workunit deadline.
That workunit was listed as 0% and had just started after one had just finished.

I don't know if it's in any way related, but a workunit for the project I run alongside gpugrid had also just finished.

Stefan Ledwina
Message 1303 - Posted: 21 Jul 2008 | 16:59:49 UTC - in response to Message 1299.

> ...
> It could be a driver problem for both. I hope that we can do something in days.
>
> GDF


Well, I'm not sure if I can believe that it is a driver problem only...

I run an 8800GT with Ubuntu 7.10 and 169.14 drivers,
a 9800GTX with Fedora 9 and 173.14.09 drivers
and a GTX 260 with Ubuntu 8.04 and 177.13 drivers...

Three different machines, three different OSes and three different driver versions, and all show the same errors from time to time...

The error with the WUs at start after a successful one with the error

"process exited with code 127 (0x7f, -129)
acemd_6.25_x86_64-pc-linux-gnu__cuda: error while loading shared libraries: libcudart.so: cannot open shared object file: No such file or directory"

and the second error

"process exited with code 1 (0x1, -255)"

I know that the 177.13 driver for the GTX260 is really crap because the PowerMizer does not work and it slows down the core clock speed of the card after the first successful WU, but the other two computers (drivers) too?

I really hope you can find out what's going wrong.
____________

pixelicious.at - my little photoblog

Temujin
Message 1304 - Posted: 21 Jul 2008 | 17:19:35 UTC - in response to Message 1303.

"process exited with code 127 (0x7f, -129)
acemd_6.25_x86_64-pc-linux-gnu__cuda: error while loading shared libraries: libcudart.so: cannot open shared object file: No such file or directory"
I've only had this error once. In my boinc directory I only have libcudart32.so and libcudart64.so, i.e. no libcudart.so, so would it be worth trying
ln -s libcudart64.so libcudart.so
??

and the second error "process exited with code 1 (0x1, -255)"
That's the one I get on all but 1 of my fails :(

Stefan Ledwina
Message 1305 - Posted: 21 Jul 2008 | 17:50:08 UTC - in response to Message 1304.

"process exited with code 127 (0x7f, -129)
acemd_6.25_x86_64-pc-linux-gnu__cuda: error while loading shared libraries: libcudart.so: cannot open shared object file: No such file or directory"
libcudart.so was downloaded from ps3grid/gpugrid with your first WU and it should be in the BOINC/projects/www.ps3grid.net directory.
____________

pixelicious.at - my little photoblog
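
A quick way to check where the CUDA runtime files actually ended up is to walk the BOINC data directory. The sketch below is only an illustration: `find_libs` is a made-up helper, and the directory you pass in is whatever your own BOINC installation uses (the thread only names the `projects/www.ps3grid.net` subdirectory).

```python
from pathlib import Path


def find_libs(root, pattern="libcudart*.so*"):
    """Return sorted paths under root whose file name matches pattern."""
    return sorted(
        p for p in Path(root).rglob(pattern)
        if p.is_file() or p.is_symlink()
    )


if __name__ == "__main__":
    # "." is a placeholder; point this at your own BOINC data directory.
    for path in find_libs("."):
        print(path)
```

Run against the BOINC directory, this should list libcudart32.so and libcudart64.so at the top level (as Temujin found) and libcudart.so inside the project folder, if the files are where Stefan describes.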

Temujin
Message 1306 - Posted: 21 Jul 2008 | 17:55:55 UTC - in response to Message 1305.

> libcudart.so was downloaded from ps3grid/gpugrid with your first WU and it should be in the BOINC/projects/www.ps3grid.net directory.

Yep, you're right
I didn't think to look in there

Stefan Ledwina
Message 1307 - Posted: 21 Jul 2008 | 18:09:36 UTC
Last modified: 21 Jul 2008 | 18:11:02 UTC

Hmm, the only thing I can see is that we both use Quadcore CPUs...
Gianni, could these problems be related to Quadcores, or is this only a coincidence?
What are the CPU types of the other computers which throw out these errors?
____________

pixelicious.at - my little photoblog

Temujin
Message 1308 - Posted: 21 Jul 2008 | 18:41:27 UTC - in response to Message 1307.
Last modified: 21 Jul 2008 | 18:59:38 UTC

> What are the CPU types of the other computers which throw out these errors?
You may be on to something.
I know of the following GPU users

UBT - NaRyan
Q6600, 5 good, 2 errors, 1 abort
Athlon 6000+, 11 good

sneakysaurus
Q6600, 6 good, 5 errors

JG4KEZ(Koichi Soraku)
Xeon X3360, 7 good, 1 error
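
To put the counts above on a common footing, here is a small Python sketch that turns them into error rates. The numbers are copied from the list; the choice to count an aborted task in the total but not as an error is an assumption of this sketch, and with so few tasks per host the percentages are only a rough indication.

```python
# Counts copied from the list above: (user, CPU, good, errors, aborts).
HOSTS = [
    ("UBT - NaRyan", "Q6600", 5, 2, 1),
    ("UBT - NaRyan", "Athlon 6000+", 11, 0, 0),
    ("sneakysaurus", "Q6600", 6, 5, 0),
    ("JG4KEZ(Koichi Soraku)", "Xeon X3360", 7, 1, 0),
]


def error_rate(good, errors, aborts=0):
    """Fraction of all listed tasks that ended in an error (aborts count in the total only)."""
    total = good + errors + aborts
    return errors / total if total else 0.0


for user, cpu, good, errors, aborts in HOSTS:
    print(f"{user:24} {cpu:14} {error_rate(good, errors, aborts):6.1%}")
```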

Stefan Ledwina
Message 1310 - Posted: 21 Jul 2008 | 19:00:44 UTC

Looks obvious that there's something wrong with Quads, but who knows...
Let's see what G is saying.


____________

pixelicious.at - my little photoblog

GDF
Message 1311 - Posted: 21 Jul 2008 | 19:02:44 UTC - in response to Message 1308.
Last modified: 21 Jul 2008 | 19:03:17 UTC

The vast majority of errors are at start-up. We have submitted a series of very fast WUs to check it now.
If you go over quota for the day, let me have your hostid.


GDF

Stefan Ledwina
Message 1312 - Posted: 21 Jul 2008 | 19:07:28 UTC

Ok, but my queue is pretty full with ps3grid WUs, I doubt I'll get new WUs until tomorrow.
But I'll try to stop running tasks, maybe I can get some new ones...
____________

pixelicious.at - my little photoblog

Temujin
Message 1313 - Posted: 21 Jul 2008 | 19:08:54 UTC - in response to Message 1311.

> If you go over quota for the day, let me have your hostid.
PM sent

GDF
Message 1314 - Posted: 21 Jul 2008 | 19:18:36 UTC - in response to Message 1308.




Hi,

Your card seems to be overclocked, which makes it unstable and causes the errors!
Is that right?

GDF

Temujin
Message 1315 - Posted: 21 Jul 2008 | 19:21:45 UTC - in response to Message 1314.




who, me?

not as far as I know, I've certainly not tweaked anything.

nvclock gives the following
-- General info --
Card: Unknown Nvidia card
Architecture: G92 A2
PCI id: 0x606
GPU clock: 601.712 MHz
Bustype: PCI-Express

-- Shader info --
Clock: 1674.000 MHz
Stream units: 96 (1b)
ROP units: 12 (1b)
-- Memory info --
Amount: 384 MB
Type: 128 bit DDR3
Clock: 899.996 MHz

-- PCI-Express info --
Current Rate: 16X
Maximum rate: 16X

-- Sensor info --
Sensor: GPU Internal Sensor
GPU temperature: 18C

-- VideoBios information --
Version: 62.92.29.00.00
Signon message: ASUS EN8800GS TOP VGA BIOS Ver 62.92.29.00.AS13
Performance level 0: gpu 600MHz/shader 1700MHz/memory 900MHz/0.00V/100%
VID mask: 3
Voltage level 0: 0.95V, VID: 0
Voltage level 1: 1.00V, VID: 1
Voltage level 2: 1.05V, VID: 2
Voltage level 3: 1.10V, VID: 3

Temujin
Message 1316 - Posted: 21 Jul 2008 | 19:31:56 UTC - in response to Message 1313.

> If you go over quota for the day, let me have your hostid.

got 3,
2 normal & 1 shortie, running the shortie now

thanks

GDF
Message 1317 - Posted: 21 Jul 2008 | 19:34:04 UTC - in response to Message 1315.




Maybe it was overclocked by the vendor.
According to
http://en.wikipedia.org/wiki/GeForce_8_Series
the shader should be clocked at 1375 MHz.

You could try to reduce it.
GDF
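
To see how far the reported shader clock sits above that figure, here is a minimal Python sketch. It parses a trimmed excerpt of the nvclock output Temujin posted and compares it with the 1375 MHz value quoted above; the section and field names are taken from that output, while the helper names are invented for this example.

```python
import re

# Trimmed excerpt of the nvclock output posted earlier in the thread.
NVCLOCK = """\
-- General info --
GPU clock: 601.712 MHz

-- Shader info --
Clock: 1674.000 MHz

-- Memory info --
Clock: 899.996 MHz
"""

REFERENCE_SHADER_MHZ = 1375.0  # the figure GDF quotes from the Wikipedia table


def section_clock(text, section):
    """Return the first 'Clock: N MHz' value (as a float) inside the named section."""
    current = None
    for line in text.splitlines():
        header = re.match(r"^-- (.+?) --$", line.strip())
        if header:
            current = header.group(1)
            continue
        m = re.match(r"^Clock:\s*([\d.]+)\s*MHz$", line.strip())
        if m and current == section:
            return float(m.group(1))
    return None


def percent_over(actual, reference):
    """How far actual lies above reference, in percent."""
    return (actual - reference) / reference * 100.0


shader = section_clock(NVCLOCK, "Shader info")
print(f"shader at {shader} MHz, {percent_over(shader, REFERENCE_SHADER_MHZ):.1f}% over reference")
```

On these numbers the shader runs roughly a fifth faster than the reference value, which fits the factory-overclock explanation without proving that it causes the failures.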

MJH
Project administrator
Project developer
Project scientist
Joined: 12 Nov 07
Posts: 696
Credit: 27,266,655
RAC: 0
Message 1318 - Posted: 21 Jul 2008 | 19:37:36 UTC
Last modified: 21 Jul 2008 | 19:43:05 UTC

Hi,

According to the Asus website, at http://www.asus.co.nz/news_show.aspx?id=9578, your card is a factory overclocked 8800GS. Whilst overclocking is often acceptable for games, because the GPUGRID application works the card very hard it is quite possible that it is becoming unstable. I suggest that you try reducing the clock back to the standard settings shown in the Wikipedia article given by GDF.

Before you can change the clock frequencies, you may need to add the following to the screen or device section of /etc/X11/xorg.conf:

Option "Coolbits" "1"

Then restart X. The nvidia-settings program will then have a panel called "clock settings".

MJH

Stefan Ledwina
Message 1319 - Posted: 21 Jul 2008 | 19:40:11 UTC
Last modified: 21 Jul 2008 | 19:41:56 UTC

If you meant me with the oc'd card - no they aren't overclocked...

My 9800GTX is an EVGA 9800 GTX SC (super clocked) and is a little bit overclocked by default, but it was stable 24/7 over one week when I was running the Folding@home GPU Client under Windows - Haven't had one error on this card at FAH. And the other two cards do show the same errors with ps3grid and they are for sure not oc'd - not by me and not by the vendor...

-----
Just had another error after nine hours, with a new error code.
The WU was http://www.ps3grid.net/PS3GRID/result.php?resultid=38719, and the error was

<core_client_version>6.3.5</core_client_version>
<![CDATA[
<message>
process exited with code 193 (0xc1, -63)
</message>
<stderr_txt>

</stderr_txt>
]]>
____________

pixelicious.at - my little photoblog
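
The three exit lines quoted in this thread (codes 127, 1 and 193) all follow the same shape: the decimal code, the same value in hex, and a negative number that equals the code minus 256. The Python sketch below reproduces that formatting. It is inferred only from the lines posted here, not from the BOINC source, so treat the third field in particular as an observation rather than a documented rule.

```python
def describe_exit(code):
    """Rebuild the 'process exited with code N (0xHH, -M)' line for a one-byte exit status."""
    if not 0 <= code <= 255:
        raise ValueError("exit status must fit in one byte")
    # Third field: the code minus 256, which matches every line quoted in the thread.
    return f"process exited with code {code} ({code:#x}, {code - 256})"


for code in (127, 1, 193):
    print(describe_exit(code))
```

The practical point is that the hex and negative values carry no extra information, so when comparing failures only the first number matters. In this thread, 127 is the one that comes with the libcudart.so loader message.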

Temujin
Message 1320 - Posted: 21 Jul 2008 | 19:59:25 UTC - in response to Message 1318.

> Hi,
>
> According to the Asus website, at http://www.asus.co.nz/news_show.aspx?id=9578, your card is a factory overclocked 8800GS. Whilst overclocking is often acceptable for games, because the GPUGRID application works the card very hard it is quite possible that it is becoming unstable. I suggest that you try reducing the clock back to the standard settings shown in the Wikipedia article given by GDF.
Yep, that would make sense.
Must admit, I didn't realise it was an overclocked card, I just bought it because it was a cheap 8800.
As far as temps go, it's 16C at idle and a max of 28C while running WUs.

> Before you can change the clock frequencies, you may need to add the following to the screen or device section of /etc/X11/xorg.conf:
>
> Option "Coolbits" "1"
>
> Then restart X. The nvidia-settings program will then have a panel called "clock settings".
Yep, done that and up pops the extra panel, but I can only adjust the GPU (at its default of 600 MHz) and Memory (at its default of 900 MHz) settings. There's no access to the Shader settings.

I'm willing to knock both GPU & Memory down if that will help, any suggestions what to set it to?

I've also done a "nvclock -r" (reset) but it didn't seem to change the output from "nvclock -i -f"

Temujin
Message 1322 - Posted: 21 Jul 2008 | 20:23:43 UTC - in response to Message 1320.
Last modified: 21 Jul 2008 | 20:26:19 UTC

OK, I've gone for GPU @ 550 and Memory @ 850.
The shader is still at 1700 though; anyone know how to adjust that?

Edit:
Oops, nope, it's dropped down to 1566 MHz.
I'll have a play around with the GPU & Mem settings.

UBT - NaRyan
Message 1323 - Posted: 21 Jul 2008 | 20:39:29 UTC
Last modified: 21 Jul 2008 | 20:53:21 UTC

My 8800GT is factory overclocked. It should be 600MHz Core, 1500MHz Shader & 1800MHz Memory.
However, it runs at 700MHz Core, 1700MHz Shader & 2000MHz Memory.

And it's the computer that's so far been 100% stable.

I do however have the fan set to 100% on it.

*EDIT*
oops forgot it was Quad systems having probs not dual core ones *ahem*

Stefan Ledwina
Message 1324 - Posted: 21 Jul 2008 | 20:49:45 UTC

Ok, that's your Dualcore without problems, but the Quadcore has also some WUs with errors...

Any word from GDF or MJH on the fact that those errors seem to appear only on Quadcore computers?
____________

pixelicious.at - my little photoblog

GDF
Message 1325 - Posted: 21 Jul 2008 | 21:31:11 UTC - in response to Message 1324.

I don't see any reason why quad cores could create problems.

GDF

UBT - NaRyan
Message 1328 - Posted: 21 Jul 2008 | 22:22:47 UTC - in response to Message 1325.

Thing is, looking at the Top Computers every single Quad Core has them :(
Need someone with an AMD Quad to join to see if that also has the same probs.

Stefan Ledwina
Message 1330 - Posted: 21 Jul 2008 | 23:12:06 UTC
Last modified: 21 Jul 2008 | 23:14:26 UTC

Ok, let's look a little bit further at the top hosts list...

Computers with computation errors:

stefan@home Intel Quad Core
stefan@home Intel Quad Core
JG4KEZ(Koichi Soraku) Intel Quad Core Xeon
sneakysaurus Intel Quad Core
Anonymous user Intel Quad Core
UBT-NaRyan Intel Quad Core
stefan@home Intel Quad Core
Anonymous user Intel Dual Core !
[AF>Linux>Gentoo] elgrande71 Intel Dual Core !

Computers without computation errors:

UBT-NaRyan AMD Dual Core

OK, these are all the GPU computers I could find (except GDF's computer, which I have excluded because it also had computation errors, but they were dated before the official start of gpugrid)... It seems the errors are not only related to Quad Cores, but the only computer without errors is an AMD Dual Core...
Don't know what it means, but at least all Intel computers show these errors...
____________

pixelicious.at - my little photoblog

Temujin
Message 1333 - Posted: 22 Jul 2008 | 8:12:52 UTC - in response to Message 1316.

> got 3, 2 normal & 1 shortie, running the shortie now

The shortie turned out to be not a shortie :(

1st one crashed again but that was before I underclocked the card.
Current WU is now 12 hours in with an estimated 1.5 hours to go.

UBT - NaRyan
Message 1334 - Posted: 22 Jul 2008 | 8:21:24 UTC - in response to Message 1333.
Last modified: 22 Jul 2008 | 8:22:37 UTC

The shorties have the name "xxxxxxx-FASTTEST-x-x-xxxxx" and for me it took about 47 Seconds to complete for 1.99 credits :)
____________

Down with the Kredit Kops!!!
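
Going by the naming pattern NaRyan describes, a short test task can be told apart from a normal one by the batch tag in its name. A minimal Python sketch, assuming the tag is always the literal `FASTTEST` between hyphens (the third name below is hypothetical, built only to follow that pattern):

```python
def is_short_test(task_name):
    """True when the task name carries the FASTTEST batch tag described above."""
    return "FASTTEST" in task_name.split("-")


names = [
    "Gn30779-TEST12-0-5-acemd_0",    # task name quoted earlier in the thread
    "xD30815-TEST12-1-5-acemd_0",    # task name quoted earlier in the thread
    "abc1234-FASTTEST-0-1-acemd_0",  # hypothetical name following the pattern
]

for name in names:
    print(name, "short test" if is_short_test(name) else "normal")
```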

Temujin
Message 1336 - Posted: 22 Jul 2008 | 12:27:10 UTC - in response to Message 1334.

> The shorties have the name "xxxxxxx-FASTTEST-x-x-xxxxx" and for me it took about 47 Seconds to complete for 1.99 credits :)
I didn't get any of them then :(

Temujin
Message 1351 - Posted: 24 Jul 2008 | 10:52:26 UTC
Last modified: 24 Jul 2008 | 11:27:14 UTC

I think I may have sorted my GPU problems.

I run my own little boinc stats database for my team and mysqld gets a tad busy during updates for about 10 minutes.
I'm also in the process of upgrading my machine from Fedora 7 to Fedora 9.
To do that, I've borrowed a machine from work and have moved the stats over to that machine while I do the upgrade.
I'm waiting a couple of days before upgrading the main machine until I'm sure everything is running OK on the temp machine.

Since moving the stats, all GPU WUs have completed successfully.
OK, it's only done 2 and a bit WUs, but it's never managed that before; I was lucky if I had 1 in 5 succeed and never 2 sequentially.

It's a bit early to claim 100% victory but it's looking good, maybe, touch wood :D

UBT - NaRyan
Message 1354 - Posted: 25 Jul 2008 | 2:46:22 UTC

I just got another error on my quad. Task ID 39247

Last one I had on the Quad was 4 days ago, so can't moan about it.
And the AMD dual core is still plodding along as happy as can be :)
____________

Down with the Kredit Kops!!!

Nightlord
Joined: 22 Jul 08
Posts: 61
Credit: 5,461,041
RAC: 0
Message 1366 - Posted: 26 Jul 2008 | 18:57:36 UTC
Last modified: 26 Jul 2008 | 18:59:20 UTC

Hi guys! A couple of error types to report.

I loaded up two boxes with 8800GT's (the first box ran a 8600GT as a test bed for a couple of days).

One box (this one) seems a bit unstable. Lots of compute errors. I've reduced the clock using nvclock, as discussed earlier in the thread, and we'll see what happens.

On the other box, I have some missing libcudart.so errors. Was there a fix found for the missing libcudart.so discussed earlier? This host seems to do that on every second WU - immediately it tries to start up the second WU after completing the first, it fails for missing libcudart. I've checked and the file is present in the /projects/ps3grid.net folder, so stumped really.

Both boxes are running ubuntu 8.04, with 8800GT's fed from dual core E4300's, which have been fine on boinc up till now.

GDF
Message 1376 - Posted: 28 Jul 2008 | 9:47:42 UTC - in response to Message 1366.

We have a workaround for the missing libcudart problem. It seems to happen in a strange way, which we cannot replicate on our Fedora box.
The workaround is to install the Nvidia toolkit (on the same page as the driver) and add the following command to your .bashrc file:

export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:/usr/local/cuda/lib

This should not be needed in the future. I will keep you updated when we find a solution.

gdf
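
To check whether the workaround has taken effect in a given shell, one can look for libcudart.so in each directory listed in LD_LIBRARY_PATH. The Python sketch below does only that; `dirs_providing` is a made-up helper, and it deliberately ignores the other places the dynamic loader searches (system default directories and the loader cache), so an empty result does not by itself mean the library cannot be loaded.

```python
import os


def dirs_providing(lib_name, search_path):
    """Return the entries of a colon-separated search path that contain lib_name."""
    hits = []
    for entry in search_path.split(":"):
        # Empty entries (e.g. from a leading ':') are skipped in this sketch.
        if entry and os.path.exists(os.path.join(entry, lib_name)):
            hits.append(entry)
    return hits


current = os.environ.get("LD_LIBRARY_PATH", "")
found = dirs_providing("libcudart.so", current)
print(found if found else "libcudart.so not found via LD_LIBRARY_PATH")
```

Note that variables set in .bashrc only reach programs started from a shell that has read that file, so it is worth running the check from the same environment that launches the BOINC client.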


Nightlord
Message 1380 - Posted: 28 Jul 2008 | 17:38:52 UTC - in response to Message 1376.



Thanks for your tip, I'll keep a watch on the boxes and patch if needed.

I reduced the clocks slightly on the machine that appeared to have a stability problem, it seems fine now. The other machine has not encountered a libcudart.so fail since Saturday evening.


____________

Temujin
Message 1409 - Posted: 4 Aug 2008 | 12:20:18 UTC
Last modified: 4 Aug 2008 | 12:22:48 UTC

Hi

I had a WU running for ~9 hours when the benchmarks kicked in.
The WU resumed after the benchmarks and promptly failed with error 1

I know my card isn't the most stable but this failure looked to be caused by the benchmarks.

If a benchmark is due within the estimated WU run period is there any way that the benchmark could be run before starting the WU?

GDF
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Joined: 14 Mar 07
Posts: 1957
Credit: 629,356
RAC: 0
Message 1410 - Posted: 4 Aug 2008 | 14:18:34 UTC - in response to Message 1409.

Hi

I had a WU running for ~9 hours when the benchmarks kicked in.
The WU resumed after the benchmarks and promptly failed with error 1

I know my card isn't the most stable but this failure looked to be caused by the benchmarks.

If a benchmark is due within the estimated WU run period is there any way that the benchmark could be run before starting the WU?


The only thing we saw is that you were first running at the factory default frequency:

# Clock rate: 1674000 kilohertz

After the restart, you were running at the underclocked frequency, and then it crashed:

# Clock rate: 1350000 kilohertz

It would have been less surprising the other way round...
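
To compare the clock rates a task saw before and after a restart, the "# Clock rate" lines can be pulled out of the task's stderr text. A minimal sketch, assuming the stderr output has been saved to a file and that the lines look exactly like the two quoted above; the function name and the file name stderr.txt are made up for illustration.

```shell
# Print every "# Clock rate" value read from stdin, converted from kHz to MHz.
# Assumption: lines have the form "# Clock rate: <number> kilohertz", as quoted in this post.
clock_rates_mhz() {
    awk '/^# Clock rate:/ { printf "%d MHz\n", $4 / 1000 }'
}

# Usage (stderr.txt is a hypothetical file holding the task's stderr text):
#   clock_rates_mhz < stderr.txt
```

For the two lines quoted above this gives 1674 MHz and 1350 MHz, in the order they appear in the log.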

gdf

Filipe
Joined: 28 Apr 08
Posts: 3
Credit: 1,994,582
RAC: 0
Message 1411 - Posted: 4 Aug 2008 | 14:31:12 UTC

Hi all,

Keep up the good work! I look forward to progressing with the first BOINC project utilizing GPUs.

I've started 4 GPU test work units and 3 have failed (compute error). System details are below.

1 x 9600GT, slightly OC'ed (by manufacturer)
Ubuntu Hardy 8.04
CUDA driver 177.13
BOINC client 6.3.5
Pentium D processor

Out of 4 units, 3 failed with a stderr error of

"process exited with code 1 (0x1, -255)"

but the 4th work unit passed with exit code 0 (0x0).

The 3 failed units ran for 35k, 384 (yes, small) and 63k CPU seconds before the compute error. I know that the 1st failure occurred when, at 35k, I hit the sleep button (ridiculously placed on the keyboard); when the system came out of sleep mode, the compute error occurred. The others failed on their own... really. My questions are below. Please excuse any that may seem simple on Linux, as I'm still picking it up after a few years off and getting used to the Ubuntu differences.

1) It seems to me that the logging of actual GPU results as they are processed (or at set points) is a little buggy. After sleep mode the unit just didn't restart, and for the 384-second unit I closed BOINC Manager normally, opened it again, and the compute error happened. So does the data get stored correctly in this beta version as it's being processed? Are there known issues here? Or have I not done something right?

2) Do I need to set the BOINC files location in my PATH or library path? I have to start the BOINC client and then BOINC Manager to start the project. Starting BOINC Manager by itself doesn't start the process. That's where I had "connection refused" coming up, before I realised that running the client created the files needed for BOINC Manager to connect to client 6.3.5 and start processing.

3) My dual-core CPU shows both cores at 90-100% all the time. I saw from other posts that only 1 core should be close to 100% because of polling. No other apps were running.

4) The messages section of BOINC Manager says it can't download any more files because of a 1 CPU limit. I guess this should say 1 GPU limit? Or, if it really is a CPU limit, that could also explain 3) above.

Other comments:
After a compute error the claimed credit says 1987.41 no matter what the computation time was before the error occurred, i.e. 384 s or 63,000 s. I'm not too worried if I don't get any credit for these errors, as really this is all about science and testing a new system for future advancement... but it's fun :-)

Cheers,
Fil.
P.S. Sorry about any spelling mistakes... it's late and I'm going to sleep now.

GDF
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Joined: 14 Mar 07
Posts: 1957
Credit: 629,356
RAC: 0
Message 1413 - Posted: 4 Aug 2008 | 14:41:28 UTC - in response to Message 1411.

[quote]Hi all,[/quote]

1) Restart is usually rock solid. What went wrong here is that sleep mode caused the GPU program to crash and report a compute error, so the client did not even try to restart. Even using the desktop heavily could cause the application to crash, because the display has priority over CUDA runs.

2) There is a problem when running the manager directly. This is a BOINC problem related to SELinux that was present even before the GPU application. Workaround: first start the client with boinc -daemon and then start the graphical interface.

3) We use only 1 CPU to sync. What are the processes using the other one?

4) At the moment, BOINC checks the number of CPUs. If you guys collect proven BOINC issues in a thread, I will present them to D. Anderson when he comes to Barcelona in September.
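
The workaround in point 2) could be wrapped in a small script along these lines. This is only a sketch under assumptions: that the client is on the PATH as boinc and the manager as boincmgr (binary names differ between installs), and that a 5-second pause is enough for the client to come up, which is an arbitrary guess rather than a documented value. The function name is made up for illustration.

```shell
# Start the BOINC client first, then the manager, as described in point 2).
# Assumptions: binaries are named boinc and boincmgr and are on PATH;
# the 5-second wait is a guess, not a documented requirement.
start_boinc_then_manager() {
    command -v boinc >/dev/null 2>&1 || { echo "boinc client not found on PATH" >&2; return 1; }
    command -v boincmgr >/dev/null 2>&1 || { echo "boincmgr not found on PATH" >&2; return 1; }
    boinc -daemon          # run the client in the background
    sleep 5                # give the client time to create the files the manager connects through
    boincmgr &             # then open the graphical interface
}

# Usage: start_boinc_then_manager
```

If either binary is missing, the function prints a message and returns 1 without starting anything.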

gdf

Temujin
Joined: 12 Jul 07
Posts: 100
Credit: 21,848,502
RAC: 0
Message 1414 - Posted: 4 Aug 2008 | 14:50:24 UTC - in response to Message 1410.

Hi

I had a WU running for ~9 hours when the benchmarks kicked in.
The WU resumed after the benchmarks and promptly failed with error 1

I know my card isn't the most stable but this failure looked to be caused by the benchmarks.

If a benchmark is due within the estimated WU run period is there any way that the benchmark could be run before starting the WU?


The only thing we saw is that you were first running with the factory default frequency
# Clock rate: 1674000 kilohertz

After the restart, you were running with the underclocked values and crashed
# Clock rate: 1350000 kilohertz

It would have been less surprising the other way round...

gdf

Ahh, that's interesting.
I upgraded my machine to Fedora 9 yesterday, installed the 173 CUDA drivers, downloaded a WU and it crashed straight away.
I checked nvclock and the card had reset to the factory overclock, so I underclocked the card back down to 450 & 800 and then downloaded the WU in question.

The NVIDIA X utility and nvclock both reported the lower clock values before that WU started, but BOINC saw the higher values until after the benchmarks, almost 9 hours later ??
me=puzzled :D


Filipe
Joined: 28 Apr 08
Posts: 3
Credit: 1,994,582
RAC: 0
Message 1421 - Posted: 6 Aug 2008 | 13:09:17 UTC - in response to Message 1413.

Thanks gdf for your comments,

In relation to point 3), the client seems to be using 48-50% of my dual-core CPU, as reported by the Processes tab of the System Monitor in Ubuntu.

However, I previously mentioned that 100% of both cores was being used. This is true when viewing the Resources tab, which reports both cores at 100% (most of the time). Looking on the web, I found that there is a bug with the Compiz plugins 0.7.4 for Ubuntu (Hardy 8.04 included). Actually, there is a whole list of issues to be addressed with the Compiz plugins, as reported at
https://bugs.launchpad.net/ubuntu/+source/compiz/+bug/218726

I unloaded all the Compiz plugins but I still get the problem. I guess it's not a PS3GRID/BOINC issue but an Ubuntu/Linux issue.

cheers,
Fil.



[quote]Hi all,

1) Restart is usually rock solid. What went wrong here is that sleep mode caused the GPU program to crash and report a compute error, so the client did not even try to restart. Even using the desktop heavily could cause the application to crash, because the display has priority over CUDA runs.

2) There is a problem when running the manager directly. This is a BOINC problem related to SELinux that was present even before the GPU application. Workaround: first start the client with boinc -daemon and then start the graphical interface.

3) We use only 1 CPU to sync. What are the processes using the other one?

4) At the moment, BOINC checks the number of CPUs. If you guys collect proven BOINC issues in a thread, I will present them to D. Anderson when he comes to Barcelona in September.

gdf[/quote]

Bender10
Joined: 3 Dec 07
Posts: 167
Credit: 8,368,897
RAC: 0
Message 1434 - Posted: 10 Aug 2008 | 22:34:38 UTC


Hi, I just added my AMD Quad to the mix.

Nothing here OC'd

Evga 8800GS
AMD 9550
Gigabyte GA-M78SM-S2H mb, with 4 gig ram
Ubuntu 8.04
Boinc 6.3.5
Nvidia 173.14 drivers

I have been running other BOINC project WUs for about a week, just to test out the setup (with BOINC 6.3.5), and everything has been fine until today...

I just got it running on PS3Grid a few minutes ago, and went through 4 failed WUs with this error:

"process exited with code 193 (0xc1, -63)"

I'm not sure what is going on. And I am a Linux noob....
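
For reading these messages: in both exit messages quoted in this thread, "code 1 (0x1, -255)" and "code 193 (0xc1, -63)", the second number is the code in hexadecimal and the third is the code minus 256. The sketch below reproduces that pattern. It is inferred only from these two examples, not from any documented BOINC format, and the function name is made up for illustration.

```shell
# Reproduce the "N (0xHEX, M)" triple seen in the exit messages quoted in this thread.
# Assumption: M = N - 256, inferred only from the two examples (1 -> -255, 193 -> -63).
format_exit_code() {
    code="$1"
    printf '%d (0x%x, %d)\n' "$code" "$code" "$((code - 256))"
}

format_exit_code 1
format_exit_code 193
```

So the three numbers in each message carry the same single value, and only the first needs to be compared between reports.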

____________


Consciousness: That annoying time between naps......

Experience is a wonderful thing: it enables you to recognize a mistake every time you repeat it.

Bender10
Joined: 3 Dec 07
Posts: 167
Credit: 8,368,897
RAC: 0
Message 1435 - Posted: 10 Aug 2008 | 23:37:26 UTC


Ooops...Here are the tasks...

http://www.ps3grid.net/results.php?hostid=5914



Nightlord
Joined: 22 Jul 08
Posts: 61
Credit: 5,461,041
RAC: 0
Message 1436 - Posted: 11 Aug 2008 | 11:18:54 UTC
Last modified: 11 Aug 2008 | 11:29:29 UTC

I had one of these overnight too.

Running on a 3GHz P4, not overclocked, with Ubuntu 8.04, a slow 8600GT and the 6.3.8 client.

Link to WU: http://www.ps3grid.net/result.php?resultid=42123

It failed right at the start of the run. I noticed some error messages in the message tab - Sorry, I'm not at the box now, so I can't be certain, but it was something like "result file not found". I'll post the exact message later today if anyone needs it.

/edit/ I should add that other boxes were converted over to 6.3.8 last night without issue.
____________

Bender10
Joined: 3 Dec 07
Posts: 167
Credit: 8,368,897
RAC: 0
Message 1437 - Posted: 11 Aug 2008 | 11:22:20 UTC


I installed the new 6.3.8 over the 6.3.5 version. It runs PrimeGrid WUs fine. I'll try PS3Grid WUs again this morning.


[AF>HFR>RR] Jim PROFIT
Joined: 3 Jun 07
Posts: 107
Credit: 31,331,137
RAC: 0
Message 1438 - Posted: 11 Aug 2008 | 15:34:03 UTC
Last modified: 11 Aug 2008 | 15:35:35 UTC

So after some tests, I think I have a problem, but I can't solve it.

Only one WU has finished without problems!
All the others ended with errors.

I don't understand why, but sometimes, after BOINC finishes a WU for another project and starts PS3Grid, my PC freezes!
And not always at the same time.
Yesterday it was near 2H, but I saw it again this morning!

And when I restart, the WU is dead with computation errors!
As I can't be in front of the PC all the time, I think I will not try to crunch another WU. And after 4 WUs with errors, I can't download another one.
If I try other projects, I don't have any problems, so I think it's not Ubuntu that has a problem. I don't use any other applications on it; it is just for testing PS3Grid and the GPU.

For information:
Ubuntu 8.04.1
Nvidia 177.13
Boinc 6.3.8
Q9450 @ 3.4Ghz
GTX 260 not OC
8 GB DDR2

Maybe I will try one more time, but I am eagerly waiting for the Windows version.
I don't have any problems with Windows, and I continue to think that Linux is not so stable.

Nightlord
Joined: 22 Jul 08
Posts: 61
Credit: 5,461,041
RAC: 0
Message 1439 - Posted: 11 Aug 2008 | 17:22:39 UTC
Last modified: 11 Aug 2008 | 17:22:50 UTC

I found on one of my 8800GT boxes that a heavy O/C on the CPU made it unstable on this project. It has been 100% stable on CPU projects before. I reduced the O/C on the CPU by about 10% and it has crunched without fail since.

Hope that helps.
____________

GDF
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Joined: 14 Mar 07
Posts: 1957
Credit: 629,356
RAC: 0
Message 1440 - Posted: 11 Aug 2008 | 18:18:02 UTC - in response to Message 1439.

I found on one of my 8800GT boxes that a heavy O/C on the CPU made it unstable on this project. It has been 100% stable on CPU projects before. I reduced the O/C on the CPU by about 10% and it has crunched without fail since.

Hope that helps.



What is O/C?

gdf

Nightlord
Joined: 22 Jul 08
Posts: 61
Credit: 5,461,041
RAC: 0
Message 1441 - Posted: 11 Aug 2008 | 18:29:34 UTC
Last modified: 11 Aug 2008 | 18:30:51 UTC

Sorry.....O/C = OverClock

His CPU is heavily overclocked and on one of my machines that made the GPU card unstable on this project. Just a 10% reduction in the CPU overclock brought it back - might be the same for him?
____________

Bender10
Joined: 3 Dec 07
Posts: 167
Credit: 8,368,897
RAC: 0
Message 1442 - Posted: 12 Aug 2008 | 0:50:06 UTC
Last modified: 12 Aug 2008 | 1:47:17 UTC


OK, I tried changing the default GPU (550) and MEM (800) settings to 500 and 700. I picked up 2 new WUs (http://www.ps3grid.net/results.php?hostid=5914). So far the 1st WU has been running for ~10 minutes. I also have another BOINC project running on the other 3 CPUs.

If this works, I will bump up the settings a bit.

Time will tell....


Evga 8800GS
AMD 9550
Gigabyte GA-M78SM-S2H mb, with 4 gig ram
Ubuntu 8.04
Boinc 6.3.5
Nvidia 173.14 drivers


Stefan Ledwina
Joined: 16 Jul 07
Posts: 464
Credit: 135,911,881
RAC: 68
Message 1444 - Posted: 12 Aug 2008 | 5:38:22 UTC - in response to Message 1438.
Last modified: 12 Aug 2008 | 5:39:43 UTC

So after some tests, I think I have a problem, but I can't solve it.

Only one WU has finished without problems!
All the others ended with errors.

I don't understand why, but sometimes, after BOINC finishes a WU for another project and starts PS3Grid, my PC freezes!
And not always at the same time.
Yesterday it was near 2H, but I saw it again this morning!

And when I restart, the WU is dead with computation errors!
As I can't be in front of the PC all the time, I think I will not try to crunch another WU. And after 4 WUs with errors, I can't download another one.
If I try other projects, I don't have any problems, so I think it's not Ubuntu that has a problem. I don't use any other applications on it; it is just for testing PS3Grid and the GPU.

For information:
Ubuntu 8.04.1
Nvidia 177.13
Boinc 6.3.8
Q9450 @ 3.4Ghz
GTX 260 not OC
8 GB DDR2

Maybe I will try one more time, but I am eagerly waiting for the Windows version.
I don't have any problems with Windows, and I continue to think that Linux is not so stable.


I have had the same problems with a GTX260 and the 177.13 drivers!
It seems these are driver-related problems...
I sent a bug report to NVIDIA a few weeks ago but got no answer, so I had to install Vista on this PC to crunch for Folding@home...
Hope there will be a Windows version of the GPUGRID application very soon! ;-)
____________

pixelicious.at - my little photoblog
