Message boards : Graphics cards (GPUs) : Computation Error

Fatbob
Joined: 1 Jan 09
Posts: 1
Credit: 0
RAC: 0
Message 7267 - Posted: 8 Mar 2009 | 12:34:39 UTC
Last modified: 8 Mar 2009 | 12:45:07 UTC

I have tried to run the Nvidia client several times.

I have 1 x 8800 GTX and 1 x 8800 GT, non-SLI (obviously),
running under Vista 64-bit with the latest WHQL drivers.

I have tried several BOINC Windows clients.

The first time, the WU ran for several hours and then errored in the final few percent. Now every WU I run errors out.

How do I get this to work?
SETI@home runs great with CUDA, so why not GPUGrid?

This is an example of what is reported for a WU:

<core_client_version>6.6.12</core_client_version>
<![CDATA[
<message>
Incorrect function. (0x1) - exit code 1 (0x1)
</message>
<stderr_txt>
# Using CUDA device 0
# Device 0: "GeForce 8800 GTX"
# Clock rate: 1350000 kilohertz
# Total amount of global memory: 805306368 bytes
# Number of multiprocessors: 16
# Number of cores: 128
# Device 1: "GeForce 8800 GT"
# Clock rate: 1500000 kilohertz
# Total amount of global memory: 268435456 bytes
# Number of multiprocessors: 14
# Number of cores: 112
Cuda error in file 'nonbonded.cu' in line 189 : invalid device symbol.

</stderr_txt>
]]>

Stefan Ledwina
Joined: 16 Jul 07
Posts: 464
Credit: 135,911,881
RAC: 892
Level: Cys
Message 7269 - Posted: 8 Mar 2009 | 12:50:54 UTC - in response to Message 7267.

GPU FAQ: Overview of cards that run Cuda 2.0 compiled applications
The 8800GTX is not supported by GPUGRID - the 8800GT is supported...
It seems BOINC switched between the two cards during computation of the task and that's probably why it errored out...
____________

pixelicious.at - my little photoblog

uBronan
Joined: 1 Feb 09
Posts: 139
Credit: 575,023
RAC: 0
Level: Gly
Message 7271 - Posted: 8 Mar 2009 | 13:40:33 UTC

Well, I have the same error and I only have 1 card, so that's not it.
Until now I never had an error on the CUDA applications, but today I got my first ever.

<core_client_version>6.5.0</core_client_version>
<![CDATA[
<message>
Onjuiste functie. (0x1) - exit code 1 (0x1)
</message>
<stderr_txt>
# Using CUDA device 0
# Device 0: "GeForce 9600 GT"
# Clock rate: 1800000 kilohertz
# Total amount of global memory: 536543232 bytes
# Number of multiprocessors: 8
# Number of cores: 64
MDIO ERROR: cannot open file "restart.coor"
# Using CUDA device 0
# Device 0: "GeForce 9600 GT"
# Clock rate: 1800000 kilohertz
# Total amount of global memory: 536543232 bytes
# Number of multiprocessors: 8
# Number of cores: 64

</stderr_txt>
]]>

Stefan Ledwina
Message 7274 - Posted: 8 Mar 2009 | 14:05:19 UTC - in response to Message 7271.

Actually it's not the same error. ;)

Yours is also "Incorrect function (0x1) - exit code 1", but in Fatbob's stderr.out there's also:
Cuda error in file 'nonbonded.cu' in line 189 : invalid device symbol.


____________

pixelicious.at - my little photoblog

[boinc.at] Fireman69
Joined: 8 Oct 08
Posts: 15
Credit: 29,603,934
RAC: 0
Level: Val
Message 7275 - Posted: 8 Mar 2009 | 14:52:32 UTC

Same problem on my new GTX260-216.

<core_client_version>6.4.7</core_client_version>
<![CDATA[
<message>
Unzulässige Funktion. (0x1) - exit code 1 (0x1)
</message>
<stderr_txt>
# Using CUDA device 0
# Device 0: "GeForce GTX 260"
# Clock rate: 1350000 kilohertz
# Total amount of global memory: 939196416 bytes
# Number of multiprocessors: 27
# Number of cores: 216
MDIO ERROR: cannot open file "restart.coor"

</stderr_txt>
]]>

Boinc-Client: 6.4.7
OS: MS Windows XP Pro/32Bit, SP3 (05.01.2600.00)
Coprocessor: Gainward GeForce GTX260-216/895MB (620MHz Core, 1242MHz Shader Clock, 896MB 2200MHz GDDR3 Memory)
Nvidia driver: 182.08

All WUs crashed on this machine. On the others, with GTX 280s, there is currently no problem. The machine has no problems when playing games.

____________

uBronan
Message 7277 - Posted: 8 Mar 2009 | 16:14:43 UTC

OK, agreed, but it seems every unit now ends in an error right after it starts. Does anyone have any idea what is going on, or how to fix it?!

<core_client_version>6.5.0</core_client_version>
<![CDATA[
<message>
Onjuiste functie. (0x1) - exit code 1 (0x1)
</message>
<stderr_txt>
# Using CUDA device 0
# Device 0: "GeForce 9600 GT"
# Clock rate: 1800000 kilohertz
# Total amount of global memory: 536543232 bytes
# Number of multiprocessors: 8
# Number of cores: 64
MDIO ERROR: cannot open file "restart.coor"

</stderr_txt>
]]>

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Level: Met
Message 7278 - Posted: 8 Mar 2009 | 16:45:45 UTC - in response to Message 7277.

Reboot? And/or power the machine off and remove the power cord for >10 min.

Your computers are hidden, so I can't check it myself. Did you post the entire error message or just part of it? The line "Onjuiste functie. (0x1) - exit code 1 (0x1)" is the general error category and doesn't tell us what's happening. For example, in Fatbob's case "Cuda error in file 'nonbonded.cu' in line 189 : invalid device symbol." was the actual error message.
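
The difference between the generic exit code and the specific line in the stderr section can be sketched in a few lines of Python. This is only an illustration: the function name `stderr_diagnostics` is hypothetical, and it assumes a task report shaped like the ones pasted in this thread, where the device banner lines start with `#`.

```python
import re


def stderr_diagnostics(report: str) -> list[str]:
    """Return the non-banner lines found inside the <stderr_txt> section."""
    # Stop at the closing tag, or at the end of the text if the paste was cut off.
    match = re.search(r"<stderr_txt>(.*?)(?:</stderr_txt>|\Z)", report, re.DOTALL)
    if match is None:
        return []
    lines = (line.strip() for line in match.group(1).splitlines())
    # Lines starting with '#' only describe the device (name, clock, memory).
    return [line for line in lines if line and not line.startswith("#")]


# Shortened copy of Fatbob's report from the first post.
fatbob_report = """<core_client_version>6.6.12</core_client_version>
<![CDATA[
<message>
Incorrect function. (0x1) - exit code 1 (0x1)
</message>
<stderr_txt>
# Using CUDA device 0
# Device 0: "GeForce 8800 GTX"
# Number of cores: 128
Cuda error in file 'nonbonded.cu' in line 189 : invalid device symbol.

</stderr_txt>
]]>"""

print(stderr_diagnostics(fatbob_report))
```

For Fatbob's report this prints a list holding only the "invalid device symbol" line, which is the part worth quoting when asking for help.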

MrS
____________
Scanning for our furry friends since Jan 2002

uBronan
Message 7281 - Posted: 8 Mar 2009 | 19:03:21 UTC

Changed the setting back to show my computers :)
I'll try a reboot and see what happens.

ExtraTerrestrial Apes
Message 7283 - Posted: 8 Mar 2009 | 19:46:43 UTC - in response to Message 7281.

OK, you did post all relevant information and there's nothing else in the task output. But it was worth taking a look anyway.

MrS
____________
Scanning for our furry friends since Jan 2002

uBronan
Message 7285 - Posted: 8 Mar 2009 | 21:54:58 UTC
Last modified: 8 Mar 2009 | 22:00:56 UTC

I guess Stefan is right. I think that somehow it tried to switch to the nonbonded device; it's probably like switching between different computers.
Meanwhile I took the advice, rebooted, and shut my machine down for a few minutes to see if that solves the issue.
If so, we would have to reboot once every few days, probably because of memory leaks in the applications.
But which one is not clear to me; it could be either, or it could be the combination of GPUGrid and SETI.

Strangely, I haven't had any problems on any previous units and never needed a reboot, other than about 2 months ago; my machine normally runs 24/7.
Thanks for trying to help anyway, but I fear it's out of our hands now.


PS: Sadly, no solution. The newly received unit crashed within 30 seconds.
So it's something other than my machine or BOINC, since all other projects run without errors.

ExtraTerrestrial Apes
Message 7287 - Posted: 8 Mar 2009 | 22:05:39 UTC - in response to Message 7285.

Normally we don't need to reboot for GPU-Grid; it runs just fine. But if *something* happened and WUs error out in a row, it could be that the PC went into some strange state (which the reboot / power off would solve).

uBronan wrote:
I guess Stefan is right i think that somehow it tried to switch to the nonbonded device, its probably like switching between different computers


Sorry to tell you, but you're on completely the wrong track here. The error message involving the file "nonbonded.cu" appeared because Fatbob's BOINC tried to run GPU-Grid on a card that does not support the features it needs, i.e. the card cannot recognize some commands which the GPU-Grid team put into nonbonded.cu. This is not something you could trigger (even if you wanted to), nor something that just happens; it only happens if you use *incapable* hardware. That's why your errors are completely different from Fatbob's, except for the general error code 0x1.

MrS
____________
Scanning for our furry friends since Jan 2002

uBronan
Message 7288 - Posted: 8 Mar 2009 | 22:24:34 UTC
Last modified: 8 Mar 2009 | 22:32:27 UTC

Lol, you misunderstood me. I probably stink at English, but you said exactly what I meant.

As for my own problem with GPUGrid, I did something nasty :(
I downclocked my GPU speeds drastically to see what happens with the new units. Now my perfect record of error-free runs is over.
I think the GPUGrid admin could not stand me being error-free ;)
Anyway, I have no clue why it keeps crashing. If it crashes again, I'll switch to projects that run well for a while, until the problems are solved.
I don't want to spend another 24 hours and then get nothing because it errors out again.

uBronan
Message 7321 - Posted: 10 Mar 2009 | 15:32:21 UTC

Not sure if anybody wants to know, but when I had both SETI and GPUGrid active as CUDA projects, GPUGrid crashed.
And only 3 cores out of 4 were active in BOINC.
I now have only GPUGrid active on CUDA and it seems to run without errors.
So let's see if it stays running.

ExtraTerrestrial Apes
Message 7327 - Posted: 10 Mar 2009 | 19:22:44 UTC - in response to Message 7321.

Did they try to run at the same time on your single card, or was it just that you had both projects activated and BOINC would switch between them normally?

In that case we can say for sure that seti is not leaving the machine in a "clean" state.

MrS
____________
Scanning for our furry friends since Jan 2002

uBronan
Message 7378 - Posted: 12 Mar 2009 | 15:46:40 UTC - in response to Message 7327.

ExtraTerrestrial Apes wrote:
Did they try to run at the same time on your single card or was it just that you had projects activated and BOINC would switch between them normally?

In that case we can say for sure that seti is not leaving the machine in a "clean" state.

MrS


Well, I tested with SETI disabled in my projects (also because I am not very happy with that project (probably fake results)).

I must admit that GPUGrid runs nicely again, although I also changed some hardware settings.
First, I slowed down my DRAM. It was set to run at 4-4-4-12 CR1, which I fear may be too fast, so I switched it back to CR2, but I am not sure this was needed.
Second, I downclocked my video card to below this card's default OC settings and now run it at 100% stock speeds for the 9600 GT model.
By default the card was overclocked by EVGA to 675 MHz.

I also saw that sometimes 2 or 3 SETI CUDA units ran simultaneously with GPUGrid, and for a while this gave NO errors.
After all, I have 4 cores ;)
After I reset SETI CUDA and installed the KWSN optimized app, it ran 1 CUDA unit together with GPUGrid, also without any errors.

But then suddenly all the errors appeared. I cleaned everything up and set it back to low/standard settings, and now the project runs "normally" (slowly) again.

It finishes units as it should, but I am not sure whether all of these changes were needed, because the weird thing is that it worked for weeks without problems and then all of a sudden everything started to fail.

It's of course kind of hard to find the real culprit in these situations. Maybe the units themselves were nasty; I have no clue yet, but I am glad it runs normally again.
Or maybe the damn Windows updates were to blame ;) Who knows :D

Joe
Joined: 1 Sep 08
Posts: 37
Credit: 5,864,088
RAC: 0
Level: Ser
Message 7415 - Posted: 13 Mar 2009 | 11:19:49 UTC

I have a fresh Win XP install with all updates and the newest Nvidia driver for my GTX 295, but I can't complete any WU with GPUGrid. I tried resetting the project and using the GTX in SLI and in two-core mode... nothing works... It's the same error every time:

MDIO ERROR: cannot open file "restart.coor"

<core_client_version>6.6.15</core_client_version>
<![CDATA[
<message>
Unzulässige Funktion. (0x1) - exit code 1 (0x1)
</message>
<stderr_txt>
# Using CUDA device 0
# Device 0: "GeForce GTX 295"
# Clock rate: 1242000 kilohertz
# Total amount of global memory: 939261952 bytes
# Number of multiprocessors: 30
# Number of cores: 240
# Device 1: "GeForce GTX 295"
# Clock rate: 1242000 kilohertz
# Total amount of global memory: 939196416 bytes
# Number of multiprocessors: 30
# Number of cores: 240
MDIO ERROR: cannot open file "restart.coor"

</stderr_txt>
]]>

The GTX is OK; in a Vista system I can complete the WUs...

Can you help me with this problem, please?

Alain Maes
Joined: 8 Sep 08
Posts: 63
Credit: 1,437,484,959
RAC: 69,868
Level: Met
Message 7416 - Posted: 13 Mar 2009 | 11:54:44 UTC - in response to Message 7415.

This is actually not a problem. The message [MDIO ERROR: cannot open file "restart.coor"] always occurs at the start of a new WU for the simple reason that it has to start from scratch and can not fall back on a previously saved restart.coor, such as will be the case after a shutdown and restart of the PC in the middle of crunching a WU.
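
As a rough illustration of that point, the sketch below filters a stderr listing down to the lines that might explain a failure. The `BENIGN_MESSAGES` tuple and the function name `informative_lines` are hypothetical, chosen for this example; the only message treated as harmless is the `restart.coor` one described above, and that choice is an assumption based on this thread, not an official GPUGRID list.

```python
# Messages this thread describes as expected at the start of a fresh WU.
# Assumption for illustration only, not an official GPUGRID list.
BENIGN_MESSAGES = ('MDIO ERROR: cannot open file "restart.coor"',)


def informative_lines(stderr_lines):
    """Drop blank lines, device-banner lines ('#') and known-benign messages."""
    kept = []
    for raw in stderr_lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if line in BENIGN_MESSAGES:
            continue
        kept.append(line)
    return kept


# Lines copied from two reports posted in this thread.
joe_lines = [
    "# Using CUDA device 0",
    '# Device 0: "GeForce GTX 295"',
    'MDIO ERROR: cannot open file "restart.coor"',
]
rob_lines = [
    "# Using CUDA device 0",
    'MDIO ERROR: cannot open file "restart.coor"',
    "Failed to set low-cpu sync mode",
    "Cuda error in file 'deviceQuery.cu' in line 59 : initialization error.",
]

print(informative_lines(joe_lines))
print(informative_lines(rob_lines))
```

An empty result, as for the first listing, means the output carries no specific cause beyond the generic exit code; the second listing still leaves two lines worth looking at.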

Your machine also reports two devices as it should be for a GTX 295. So two GPUGRID WUs will be running together.

Please check your result status and you should see that everything is fine.

PS: if you want any more help, try unhiding your PCs so that fellow crunchers can have a look at them.

Kind regards.

Alain

Joe
Message 7418 - Posted: 13 Mar 2009 | 14:02:15 UTC - in response to Message 7416.

Hi Alain,

This is the machine: http://www.gpugrid.net/results.php?hostid=29092 It's unhidden now.

The PC runs the whole day; there is no restart. Sometimes the error comes a few seconds after starting a new WU, sometimes after a few hours of work... The result ends with a client error and a compute error...

Hope you can help me a little more...

Kind regards

Joe

PS: Here are some facts about the machine:

13.03.2009 14:37:09 Starting BOINC client version 6.6.15 for windows_intelx86
13.03.2009 14:37:09 log flags: task, file_xfer, sched_ops
13.03.2009 14:37:09 Libraries: libcurl/7.19.4 OpenSSL/0.9.8j zlib/1.2.3
13.03.2009 14:37:09 Data directory: D:\BOINC\Data
13.03.2009 14:37:09 Running under account Jörg
13.03.2009 14:37:09 Milkyway@home Found app_info.xml; using anonymous platform
13.03.2009 14:37:09 SETI@home Found app_info.xml; using anonymous platform
13.03.2009 14:37:09 Processor: 2 GenuineIntel Intel(R) Core(TM)2 Duo CPU E6750 @ 2.66GHz [x86 Family 6 Model 15 Stepping 11]
13.03.2009 14:37:09 Processor features: fpu tsc sse sse2 mmx
13.03.2009 14:37:09 OS: Microsoft Windows XP: Professional x86 Editon, Service Pack 3, (05.01.2600.00)
13.03.2009 14:37:09 Memory: 2.00 GB physical, 3.85 GB virtual
13.03.2009 14:37:09 Disk: 107.42 GB total, 100.79 GB free
13.03.2009 14:37:09 Local time is UTC +1 hours
13.03.2009 14:37:09 CUDA device: GeForce GTX 295 (driver version 18208, CUDA version 1.3, 896MB, est. 106GFLOPS)
13.03.2009 14:37:09 Not using a proxy
13.03.2009 14:37:09 GPUGRID URL: http://www.gpugrid.net/; Computer ID: 29092; location: (none); project prefs: default
13.03.2009 14:37:09 Reading preferences override file
13.03.2009 14:37:09 Preferences limit memory usage when active to 1023.46MB
13.03.2009 14:37:09 Preferences limit memory usage when idle to 1842.23MB
13.03.2009 14:37:09 Preferences limit disk usage to 53.71GB
...

13.03.2009 14:45:13 GPUGRID Sending scheduler request: To fetch work.
13.03.2009 14:45:13 GPUGRID Requesting new tasks
13.03.2009 14:45:41 GPUGRID Computation for task sM24328-SH2_US_8-0-10-SH2_US_8620000_0 finished
13.03.2009 14:45:41 GPUGRID Output file sM24328-SH2_US_8-0-10-SH2_US_8620000_0_1 for task sM24328-SH2_US_8-0-10-SH2_US_8620000_0 absent
13.03.2009 14:45:41 GPUGRID Output file sM24328-SH2_US_8-0-10-SH2_US_8620000_0_2 for task sM24328-SH2_US_8-0-10-SH2_US_8620000_0 absent
13.03.2009 14:45:41 GPUGRID Output file sM24328-SH2_US_8-0-10-SH2_US_8620000_0_3 for task sM24328-SH2_US_8-0-10-SH2_US_8620000_0 absent

...

Alain Maes
Message 7419 - Posted: 13 Mar 2009 | 14:34:22 UTC - in response to Message 7418.

OK, there is indeed a problem. All your results have the error code [Unzulässige Funktion. (0x1) - exit code 1 (0x1)]. Unfortunately this tells us little and gives no real clues.

Things worth trying in such cases:
1. Check the version of your video driver. Make sure you have the latest one, currently 180.29 if I am not mistaken.
2. Verify your GPU temperature.
3. Did you overclock your video card? If so, try easing back.
4. Also, did you try a simple restart?

Hope one of these helps; sorry I cannot be more specific.

Kind regards

Alain

Phoneman1
Joined: 25 Nov 08
Posts: 51
Credit: 980,186
RAC: 0
Level: Gly
Message 7420 - Posted: 13 Mar 2009 | 14:41:37 UTC - in response to Message 7419.

I'd add another question:

5. Do you have GPU tasks ticked in your Seti options on your account @ Seti? The default is yes.

In theory it should be possible to get GPU projects to share resources with other GPU projects on the same machine, but I've not been able to get that to work yet.

Phoneman1

Joe
Message 7421 - Posted: 13 Mar 2009 | 15:06:18 UTC - in response to Message 7420.

1. Check the version of your video driver.
GeForce Release 182.08 WHQL

2. Verify your GPU temperature
without GPU work:
GPU1: Graphics processor (GPU) 53 °C (127 °F)
GPU1: GPU memory 45 °C (113 °F)
GPU1: GPU ambient 48 °C (118 °F)
GPU1: GPU VRM 50 °C (122 °F)
GPU2: Graphics processor (GPU) 56 °C (133 °F)
GPU2: GPU ambient 50 °C (122 °F)
GPU2: GPU VRM 51 °C (124 °F)

with GPU work:
GPU1: Graphics processor (GPU) 78 °C (172 °F)
GPU1: GPU ambient 65 °C (149 °F)
GPU1: GPU VRM 61 °C (142 °F)
GPU2: Graphics processor (GPU) 76 °C (169 °F)
GPU2: GPU memory 68 °C (154 °F)
GPU2: GPU ambient 67 °C (153 °F)
GPU2: GPU VRM 60 °C (140 °F)

3. Did you overclock your videocard?
No, it's at stock settings
GPU clock (Geometric Domain) 576 MHz (Original: 576 MHz)
GPU clock (Shader Domain) 1242 MHz (Original: 1242 MHz)
Bus width 448 bit
Actual clock 999 MHz (DDR) (Original: 999 MHz)
Effective clock 1998 MHz
Bandwidth 109.3 GB/s

4. Also, did you try a simple restart?
Yes, but it had no positive effect

5. Do you have GPU tasks ticked in your Seti options on your account @ Seti?
No, it's disabled, because I use CUDA only for GPUGrid. It works on all my other machines...

Kind regards

Joe

Alain Maes
Message 7422 - Posted: 13 Mar 2009 | 15:16:09 UTC - in response to Message 7421.

Ohoh. If all that does not work, maybe your GPU is broken?

Check it out
1. with e.g. GPU-Z
2. in another machine if possible

OK, I have to drive back home to Belgium now.
Hope to find some positive news on this after the weekend.

Take care

Kind regards

Alain

Joe
Message 7423 - Posted: 13 Mar 2009 | 15:33:54 UTC - in response to Message 7422.

There are no problems with Everest or GPU-Z.
I tested this card in an XP machine: the same problem with GPUGrid. Then in a Vista PC: all OK. I put it in the next XP PC, and the card won't work... A problem with XP???

Kind regards

Joe

ExtraTerrestrial Apes
Message 7429 - Posted: 13 Mar 2009 | 21:11:39 UTC
Last modified: 13 Mar 2009 | 21:14:46 UTC

Hi Joe,

That looks really strange! It's not that GPU-Grid wouldn't run under XP in general. I installed 182.08 myself yesterday and it's running just fine. Since it works under Vista your hardware is fine, so you seem to have a weird software problem. Did you customize your install via nLite? Or is there some rather uncommon software present on all your machines?
Edit: do you use remote desktop to access the machines?

MrS
____________
Scanning for our furry friends since Jan 2002

Joe
Message 7477 - Posted: 15 Mar 2009 | 16:00:49 UTC - in response to Message 7429.

Hi MrS,

Yes, the hardware is OK. I believe that XP is the problem... It's a 2-day-old XP install with all MS, Nvidia and motherboard updates. And on my other (older) XP PC I have the same problems... I tried GPUGrid with the GTX 295 on a bare XP install: nothing but problems...
No, there is no remote desktop, only a normal network. Only XP, Office and Photoshop CS3 are installed.
I tested another GTX 295 in both XP PCs: again the same problem. In Vista all is OK!!!
I tried SETI CUDA work on the XP machines and it works very well, no problems.
Perhaps the next Nvidia driver will work better with my XP...

Kind regards

Joe

PS: If you have any idea or solution for me, please write to me!!!!!


Joe
Message 7478 - Posted: 15 Mar 2009 | 16:05:48 UTC - in response to Message 7477.

PS: I can't switch to Vista, because I have problems working with Photoshop CS on Vista when the network print server runs on XP. Some Photoshop functions run VERY slowly...

Zydor
Joined: 8 Feb 09
Posts: 252
Credit: 1,309,451
RAC: 0
Level: Ala
Message 7489 - Posted: 15 Mar 2009 | 18:51:24 UTC - in response to Message 7478.

Are the XP PCs updated to SP3 and the latest .NET ?

Regards
Zy

Joe
Message 7492 - Posted: 15 Mar 2009 | 19:23:44 UTC - in response to Message 7489.

Yes, SP3 incl. updates and .NET 3.5 SP1...
There are no more updates for my XP...

Kind regards

Joe

Ulf Ohlsson
Joined: 1 Jan 09
Posts: 20
Credit: 616,384
RAC: 0
Level: Gly
Message 7497 - Posted: 15 Mar 2009 | 20:18:46 UTC

I also got a corrupted WU, the first one for me. Could anyone please tell me what's wrong?

QW11771-SH2_US_8-4-10-SH2_US_8880000_0
Workunit 305555
Created 13 Mar 2009 17:28:14 UTC
Sent 13 Mar 2009 19:04:44 UTC
Received 14 Mar 2009 19:36:59 UTC
Server state Over
Outcome Client error
Client state Compute error
Exit status 1 (0x1)
Computer ID 21408
Report deadline 17 Mar 2009 19:04:44 UTC
CPU time 6095.469
stderr out <core_client_version>6.6.15</core_client_version>
<![CDATA[
<message>
Felaktig funktion. (0x1) - exit code 1 (0x1)
</message>
<stderr_txt>

</stderr_txt>
]]>


Validate state Invalid
Claimed credit 4214.27546296296
Granted credit 0
application version 6.62

ExtraTerrestrial Apes
Message 7505 - Posted: 15 Mar 2009 | 21:19:18 UTC - in response to Message 7497.

Also got a corrupted WU, first one for me, could anyone please tell whats wrong!


Sorry, no. There's no useful information in the task output, and if there are no further errors I'd just forget about it. You could keep an eye on the failed WU, though, and see if others are successful.

MrS
____________
Scanning for our furry friends since Jan 2002

jrobbio
Joined: 13 Mar 09
Posts: 59
Credit: 324,366
RAC: 0
Message 7508 - Posted: 15 Mar 2009 | 23:05:38 UTC - in response to Message 7505.

http://www.gpugrid.net/result.php?resultid=401624


<core_client_version>6.4.5</core_client_version>
<![CDATA[
<message>
Incorrect function. (0x1) - exit code 1 (0x1)
</message>
<stderr_txt>
# Using CUDA device 0
# Device 0: "GeForce GTS 250"
# Clock rate: 1890000 kilohertz
# Total amount of global memory: 1073414144 bytes
# Number of multiprocessors: 16
# Number of cores: 128
MDIO ERROR: cannot open file "restart.coor"
Failed to set low-cpu sync mode
# Using CUDA device 0
Cuda error in file 'deviceQuery.cu' in line 59 : initialization error.

</stderr_txt>
]]>


My first WU worked perfectly, but the second one failed on my Windows XP Home SP3 machine when I switched user profiles while the other profile was still logged in.

Regards,

Rob

David Saum
Joined: 13 Jan 09
Posts: 2
Credit: 1,008,604
RAC: 0
Level: Ala
Message 7518 - Posted: 16 Mar 2009 | 16:09:38 UTC

One of my computers that has been crunching GPUGrid quite happily suddenly started giving errors. Suggestions? Here is a typical error:

http://www.gpugrid.net/result.php?resultid=410378

<core_client_version>6.4.5</core_client_version>
<![CDATA[
<message>
The system cannot find the path specified. (0x3) - exit code 3 (0x3)
</message>
<stderr_txt>
# Using CUDA device 0
# Device 0: "GeForce 9600 GSO"
# Clock rate: 1350000 kilohertz
# Total amount of global memory: 402325504 bytes
# Number of multiprocessors: 12
# Number of cores: 96
Cuda error in file '..\cuda/cutil.h' in line 305 : out of memory.
Memory usage: host: bytes device: bytes
Assertion failed: 0, file ..\cuda/cutil.h, line 305

This application has requested the Runtime to terminate it in an unusual way.
Please contact the application's support team for more information.

uBronan
Message 7523 - Posted: 16 Mar 2009 | 17:31:01 UTC

Yesterday it all started over again: all GPUGrid units error out.
It's not the video card, since S@H runs fine. The only other thing active is the CPDN project on 3 cores.
But now I also have a problem getting new units; somehow it's not accepting new downloads or something. Not sure what is going on.
The sad thing is that a unit seems to run fine for a while and then suddenly crashes.

Jeremy
Joined: 15 Feb 09
Posts: 55
Credit: 3,542,733
RAC: 0
Level: Ala
Message 7526 - Posted: 16 Mar 2009 | 19:12:28 UTC

After doing a good bit of work to re-ensure the stability and reliability of my machine, I set it working last night. My first GPUgrid task finished with an error, all other projects are completing without issue. Seems as though there might be something up with the WUs from the project atm, but I have no way to confirm that.

GDF
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Joined: 14 Mar 07
Posts: 1957
Credit: 629,356
RAC: 0
Level: Gly
Message 7527 - Posted: 16 Mar 2009 | 19:22:31 UTC - in response to Message 7526.

Most of the errors are usually due to overclocking either by the factory or by the user.
In your case the GTX 260-216 should be clocked at 1242 MHz, so you might be too high.

http://en.wikipedia.org/wiki/GeForce_200_Series

Try to reduce it to standard clock rates and see if it works.
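
A quick way to see whether a card is running above the stock rate is to compare the clock printed in the task's stderr banner with a reference value. The sketch below is only an illustration under stated assumptions: the `STOCK_SHADER_MHZ` table holds just the two 1242 MHz figures quoted in this thread, the function name `clock_offsets` is hypothetical, and it assumes the banner's "Clock rate" line is the shader clock (it matches the shader-domain figure Joe posted for his GTX 295). The banner may not always agree with what other tools report.

```python
import re

# Stock shader clocks in MHz, taken only from figures quoted in this thread.
# Assumption: an illustrative table, not an authoritative list.
STOCK_SHADER_MHZ = {
    "GeForce GTX 260": 1242,
    "GeForce GTX 295": 1242,
}

# A device line directly followed by its clock-rate line, as in the stderr banner.
DEVICE_RE = re.compile(r'# Device \d+: "([^"]+)"\s*\n# Clock rate: (\d+) kilohertz')


def clock_offsets(stderr_text):
    """Return (name, reported MHz, MHz above stock or None) for each device."""
    results = []
    for name, khz in DEVICE_RE.findall(stderr_text):
        reported = int(khz) / 1000
        stock = STOCK_SHADER_MHZ.get(name)
        offset = None if stock is None else reported - stock
        results.append((name, reported, offset))
    return results


# Banner lines copied from a report posted earlier in this thread.
sample = (
    "# Using CUDA device 0\n"
    '# Device 0: "GeForce GTX 260"\n'
    "# Clock rate: 1350000 kilohertz\n"
    "# Total amount of global memory: 939196416 bytes\n"
)

for name, reported, offset in clock_offsets(sample):
    print(name, reported, offset)
```

A positive offset suggests a factory or user overclock worth backing off before looking for other causes; `None` only means the card is not in the small table above.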

gdf

ExtraTerrestrial Apes
Message 7533 - Posted: 16 Mar 2009 | 22:03:20 UTC

David,

Cuda error in file '..\cuda/cutil.h' in line 305 : out of memory.


This is a typical error, which is resolved by a reboot. It is caused either by a memory leak in older drivers under XP64, or by an app / game reserving lots of memory while GPU-Grid was running in the background.

webbie wrote:
Its not the vc since s@h runs fine

That's definitely wrong, take a look at what Paul and I wrote here. You are ~175 MHz above stock speed. It's naive to assume that this could not possibly cause your errors.

MrS
____________
Scanning for our furry friends since Jan 2002

uBronan
Message 7542 - Posted: 17 Mar 2009 | 10:00:28 UTC
Last modified: 17 Mar 2009 | 10:02:41 UTC

I tested with lower speeds: I set the card to the lowest settings possible, even lower than those listed for the 9600 GT on the Nvidia site.

The EVGA default settings for this card are indeed much higher than what Nvidia lists as default, but nevertheless I lowered them all.

But while checking, I also stopped the CPDN units and saw something I had not seen before: after stopping the CPDN units, the drive LEDs stopped flashing.
So I checked what happens when I turn the units on again, and the LEDs started flashing again without stopping.
The answer came from the CPDN site: these units use 1.5 GB of memory apiece, so running 3 or 4 of them seems to be too much.
Even if I had booted into my Win7 x64 install, it would not have had enough memory to run them all, since 8 GB isn't enough to run 4 of them.
Hence the continuous use of swap space.
After I turned off the CPDN units and let the other projects run as usual on 3 cores, everything seems to run fine again.
So I guess it's a good idea to check which other projects run together with GPUGrid.
I'm starting to wonder whether it is safe to run S@H or CPDN together with GPUGrid.
I'll test setting CPDN to use only low-memory units after the current ones are done.

jrobbio
Message 7543 - Posted: 17 Mar 2009 | 10:20:51 UTC - in response to Message 7508.

I think I may have fixed this issue:
If you have Windows XP Home and want to switch profiles while BOINC is running, follow the steps below, found here:
http://setiathome.berkeley.edu/forum_thread.php?id=50929


Windows XP Home, huh? Then you don't have the User Account control panel, only that stupid (sorry ;-)) simple file sharing.

What you could try is the following:
- Restart computer into Windows Safe Mode (Keep hammering F8 after you cleared the BIOS and before you see the Windows Logo, when upon the menu choose Safe Mode).
- Select Start > Run
- Type %allusersprofile% in the Open text box. Then click OK.
- Right-click the Shared Documents folder, and select Properties.
- Click the Security tab.
- Select Everyone in Group Or User Names, and then select Allow next to Full Control in the Permissions. (if possible, check that the admin account you're using is a member of the boinc_admins, boinc_projects and boinc_users groups. Normal accounts should only be members of the latter two groups)
- Click Advanced, and select Replace Permission Entries On All Child Objects With Entries Shown Here That Apply To Child Objects.
- Click OK, click Yes, and then click OK to close the Shared Documents Properties dialog box.
- Select Start > Run.
- Type msconfig in the Open text box. Then click OK. The System Configuration Utility will open.
- In the System Configuration Utility, select the General tab.
- Select Normal Startup - Load All Device Drivers And Services. Then click OK.
- Restart the computer.
- Reinstall BOINC.


I didn't even know you could get to the NTFS permissions on XP Home.

Regards,

Rob

uBronan
Joined: 1 Feb 09
Posts: 139
Credit: 575,023
RAC: 0
Level
Gly
Message 7547 - Posted: 17 Mar 2009 | 12:37:17 UTC
Last modified: 17 Mar 2009 | 12:42:21 UTC

Speeds set with the EVGA Precision tool:
core 525 MHz, shaders linked to core, showing 1302 MHz
memory 900 MHz
NVIDIA stock speeds: core 650 MHz, shaders not mentioned
memory 900 MHz
But I guess I can set them higher again, since all works fine when S@H or CPDN aren't jumping in as running projects.

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Level
Met
Message 7568 - Posted: 17 Mar 2009 | 22:02:27 UTC - in response to Message 7547.

There are issues when running s@h and GPU-Grid together. I don't know everything, but it seems that if seti errors out the PC will need a reboot to use CUDA again (GPU-Grid also errors). But there may be more.

Stock shaders for 9600GT are 1.625 GHz. I'd put the clocks at NV stock for one WU or 2 and switch to the old values afterwards [if successful ;) ].
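
As a quick sanity check on the clocks quoted in this thread, here is a small Python sketch. It assumes that "linked" shaders simply keep the stock shader-to-core ratio (1625 MHz / 650 MHz); that proportional rule is my assumption for illustration, not something the tool documents here.

```python
STOCK_CORE_MHZ = 650.0     # stock core clock quoted in the thread
STOCK_SHADER_MHZ = 1625.0  # stock shader clock quoted in the thread


def linked_shader_mhz(core_mhz,
                      stock_core=STOCK_CORE_MHZ,
                      stock_shader=STOCK_SHADER_MHZ):
    """Shader clock expected if the shader:core ratio stays at its stock value."""
    return core_mhz * stock_shader / stock_core


if __name__ == "__main__":
    print(linked_shader_mhz(525.0))  # core clock reported earlier in the thread
```

Under that assumption a 525 MHz core gives 1312.5 MHz, close to the 1302 MHz the tool reported; the small gap may just be the card snapping to its nearest available clock step.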

MrS
____________
Scanning for our furry friends since Jan 2002

Profile [AF>Libristes] Dudumomo
Joined: 30 Jan 09
Posts: 45
Credit: 425,620,748
RAC: 0
Level
Gln
Message 7574 - Posted: 17 Mar 2009 | 23:35:44 UTC

Hello.
A friend has just tried the 185.13 drivers on 64-bit Debian,
and all his WUs end in computation errors.
The error given is:
<core_client_version>6.6.15</core_client_version>
<![CDATA[
<message>
process exited with code 193 (0xc1, -63)
</message>
<stderr_txt>
# Using CUDA device 0
SIGSEGV: segmentation violation
Stack trace (17 frames):
acemd_6.59_x86_64-pc-linux-gnu__cuda[0x4baac9]
/lib/libc.so.6[0x7f48eab1af60]
/usr/lib/libcuda.so.1[0x7f48eba0bce0]
/usr/lib/libcuda.so.1[0x7f48eba11a44]
/usr/lib/libcuda.so.1[0x7f48eb9d79df]
/usr/lib/libcuda.so.1[0x7f48eb65f9cb]
/usr/lib/libcuda.so.1[0x7f48eb6702cb]
/usr/lib/libcuda.so.1[0x7f48eb6580c1]
/usr/lib/libcuda.so.1(cuCtxCreate+0xaa)[0x7f48eb65224a]
../../projects/www.gpugrid.net/libcudart.so.2[0x7f48ebc8cd58]
../../projects/www.gpugrid.net/libcudart.so.2[0x7f48ebc8d2a9]
../../projects/www.gpugrid.net/libcudart.so.2(cudaThreadSynchronize+0x1d)[0x7f48ebc7374d]
acemd_6.59_x86_64-pc-linux-gnu__cuda[0x414253]
acemd_6.59_x86_64-pc-linux-gnu__cuda(sin+0x16ac)[0x408a3c]
acemd_6.59_x86_64-pc-linux-gnu__cuda(sin+0x31b)[0x4076ab]
/lib/libc.so.6(__libc_start_main+0xe6)[0x7f48eab071a6]
acemd_6.59_x86_64-pc-linux-gnu__cuda(sinh+0x49)[0x407489]

Exiting...

</stderr_txt>
]]>

Any ideas?
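
In case it helps with reading traces like the one above, here is a minimal Python sketch that splits each frame line into module, symbol and address and counts frames per module. The regular expression is my own guess at the frame format shown in this stderr output, not an official parser.

```python
import re
from collections import Counter

# Matches lines such as "/usr/lib/libcuda.so.1(cuCtxCreate+0xaa)[0x7f48eb65224a]"
# or "acemd_6.59_x86_64-pc-linux-gnu__cuda[0x4baac9]" (format assumed from the post).
FRAME_RE = re.compile(
    r"^(?P<module>[^\[(]+)(?:\((?P<symbol>[^)]*)\))?\[(?P<addr>0x[0-9a-f]+)\]$"
)


def parse_frame(line):
    """Return (module, symbol or None, address), or None if not a frame line."""
    match = FRAME_RE.match(line.strip())
    if match is None:
        return None
    return match.group("module"), match.group("symbol"), match.group("addr")


def frames_per_module(lines):
    """Count how many stack frames belong to each module."""
    counts = Counter()
    for line in lines:
        frame = parse_frame(line)
        if frame is not None:
            counts[frame[0]] += 1
    return counts
```

Run over the trace above, this would show a run of frames inside `/usr/lib/libcuda.so.1` directly under `cuCtxCreate`, which suggests the crash happens inside the driver library while the CUDA context is being created.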

uBronan
Joined: 1 Feb 09
Posts: 139
Credit: 575,023
RAC: 0
Level
Gly
Message 7578 - Posted: 18 Mar 2009 | 1:49:20 UTC - in response to Message 7568.
Last modified: 18 Mar 2009 | 1:50:35 UTC

There are issues when running s@h and GPU-Grid together. I don't know everything, but it seems that if seti errors out the PC will need a reboot to use CUDA again (GPU-Grid also errors). But there may be more.

Stock shaders for 9600GT are 1.625 GHz. I'd put the clocks at NV stock for one WU or 2 and switch to the old values afterwards [if successful ;) ].

MrS


Thanks, I've put it on these to see if it helps as well.
I am slowly raising the core clock, but I guess it won't matter much until it reaches the normal clock of 650 MHz.
I'll keep stepping it up until stock speeds are reached on the shaders as well; as I understand it, memory speed doesn't do much, so I'm keeping it close to stock.
I should have started BOINC under x64 Vista instead of under normal XP :D
Then I could have used the full 8 GB of memory xD
Otherwise I have to run the machine totally dry, but that's going to take a while since CPDN has downloaded a large one.

jrobbio
Joined: 13 Mar 09
Posts: 59
Credit: 324,366
RAC: 0
Message 7582 - Posted: 18 Mar 2009 | 8:53:43 UTC - in response to Message 7543.

I think I may have fixed this issue:
If you have Windows XP Home and you want to switch profiles whilst using Boinc, follow the steps below found here:
http://setiathome.berkeley.edu/forum_thread.php?id=50929

I didn't even know you could get to the NTFS permissions on XP Home.

Regards,

Rob


Turns out that this didn't work for me. The GPU task runs for a few seconds and then fails out followed by any queued tasks for that day.

I have reinstalled Boinc as a service so changing profile should not interfere with the tasks. I'll update this thread with my findings.

Regards,

Rob

Joe
Joined: 1 Sep 08
Posts: 37
Credit: 5,864,088
RAC: 0
Level
Ser
Message 7587 - Posted: 18 Mar 2009 | 11:57:16 UTC - in response to Message 7582.

I think I may have fixed this issue:
If you have Windows XP Home and you want to switch profiles whilst using Boinc, follow the steps below found here:
http://setiathome.berkeley.edu/forum_thread.php?id=50929

I didn't even know you could get to the NTFS permissions on XP Home.

Regards,

Rob


Turns out that this didn't work for me. The GPU task runs for a few seconds and then fails out followed by any queued tasks for that day.

I have reinstalled Boinc as a service so changing profile should not interfere with the tasks. I'll update this thread with my findings.

Regards,

Rob


It's the same for me... Perhaps we have to wait for the next CUDA version???

Kind regards

Joe

Scott Brown
Joined: 21 Oct 08
Posts: 144
Credit: 2,973,555
RAC: 0
Level
Ala
Message 7589 - Posted: 18 Mar 2009 | 12:21:55 UTC - in response to Message 7582.


I have reinstalled Boinc as a service so changing profile should not interfere with the tasks. I'll update this thread with my findings.


I don't think you can run the GPU tasks as a service (a limitation of CUDA itself I believe).

Profile Michael Goetz
Joined: 2 Mar 09
Posts: 124
Credit: 46,573,744
RAC: 837,894
Level
Val
Message 7590 - Posted: 18 Mar 2009 | 13:53:25 UTC - in response to Message 7589.

I don't think you can run the GPU tasks as a service (a limitation of CUDA itself I believe).


That's correct. The Windows installation process for both 6.4.5 and 6.4.7 explicitly states that you can not install BOINC as a service if you want to use CUDA.

I have BOINC running as a service on all my Windows machines *except* for the one where I'm running CUDA. Can't install it as a service there.

Mike

Joe
Joined: 1 Sep 08
Posts: 37
Credit: 5,864,088
RAC: 0
Level
Ser
Message 7598 - Posted: 18 Mar 2009 | 17:09:41 UTC - in response to Message 7590.

I'm testing 6.6.16 Beta http://boinc.berkeley.edu/dl/boinc_6.6.16_windows_intelx86.exe and it seems to work now with XP Prof 32-bit. It has been working for 4 hours without an error... I'll see more tomorrow.

Kind regards

Joe

jrobbio
Joined: 13 Mar 09
Posts: 59
Credit: 324,366
RAC: 0
Message 7599 - Posted: 18 Mar 2009 | 17:18:13 UTC - in response to Message 7590.

I don't think you can run the GPU tasks as a service (a limitation of CUDA itself I believe).


That's correct. The Windows installation process for both 6.4.5 and 6.4.7 explicitly states that you can not install BOINC as a service if you want to use CUDA.

I have BOINC running as a service on all my Windows machines *except* for the one where I'm running CUDA. Can't install it as a service there.

Mike


Mike,

I am running 6.6.15 and BOINC is running CUDA whilst installed as a service.

They must have resolved whatever issue there was with the stable editions.

Rob

Profile Stefan Ledwina
Joined: 16 Jul 07
Posts: 464
Credit: 135,911,881
RAC: 892
Level
Cys
Message 7602 - Posted: 18 Mar 2009 | 18:58:13 UTC - in response to Message 7599.

The issue is only with Vista.
On XP it worked all the time with BOINC installed as a service...
____________

pixelicious.at - my little photoblog

Profile Michael Goetz
Joined: 2 Mar 09
Posts: 124
Credit: 46,573,744
RAC: 837,894
Level
Val
Message 7603 - Posted: 18 Mar 2009 | 19:14:30 UTC - in response to Message 7599.
Last modified: 18 Mar 2009 | 19:23:03 UTC

I am running 6.6.15 and BOINC is running CUDA whilst installed as a service.


Sweet!

In that case, I'm certainly looking forward to the release of a stable 6.6.x client. I much prefer running BOINC as a service.

I prefer not to run the beta versions of the client, so I'm sticking to the stable release versions. I like running CPDN, and with the length of those work units, too much work gets lost if they error out. Yeah, I know I can backup/restore those WUs, but then I have to remember to back them up, and restoration is less than a pleasant process when you're running lots of projects.

Mike

Edit:

The issue is only with Vista.
On XP it worked all the time with BOINC installed as a service...


Well, maybe not so sweet then...

Joe
Joined: 1 Sep 08
Posts: 37
Credit: 5,864,088
RAC: 0
Level
Ser
Message 7625 - Posted: 19 Mar 2009 | 10:54:04 UTC - in response to Message 7603.

A few minutes before finishing the WU I got "incorrect function (0x1) exit code 1" with 6.6.16... Now I'm testing 6.6.17...

Profile [AF>Libristes] Dudumomo
Joined: 30 Jan 09
Posts: 45
Credit: 425,620,748
RAC: 0
Level
Gln
Message 7641 - Posted: 19 Mar 2009 | 16:38:28 UTC

Any ideas about error 193 with the 185.13 drivers on 64-bit Debian?

Clownius
Joined: 19 Feb 09
Posts: 37
Credit: 30,657,566
RAC: 0
Level
Val
Message 7656 - Posted: 20 Mar 2009 | 2:01:52 UTC

Just had 4 errors on my Vista machine... looks like my fellow crunchers have errored out too. Is there a bad batch of WUs out there at the moment? Just thought it's worth checking, as my GTX 295 is a factory-overclocked card and that could possibly cause some errors.

http://www.gpugrid.net/workunit.php?wuid=318412
http://www.gpugrid.net/workunit.php?wuid=317639
http://www.gpugrid.net/workunit.php?wuid=317457
This one was completed by someone else but had an error at the same time as one of the other WUs... possibly killed when the other one went... I don't know
http://www.gpugrid.net/workunit.php?wuid=317194

jrobbio
Joined: 13 Mar 09
Posts: 59
Credit: 324,366
RAC: 0
Message 7672 - Posted: 20 Mar 2009 | 21:14:35 UTC - in response to Message 7603.

I am running 6.6.15 and BOINC is running CUDA whilst installed as a service.


Sweet!

In that case, I'm certainly looking forward to the release of a stable 6.6.x client. I much prefer running BOINC as a service.

I thought installing it as a service would fix the problem, but my tasks again failed out when logging into the machine as two users simultaneously on XP Home.

20/03/2009 16:34:22 GPUGRID Computation for task up108704-pYEpYI_US530000-0-10-ignasi_1 finished
20/03/2009 16:34:22 GPUGRID Starting WS10117-SH2_US_8-0-10-SH2_US_8270000_0
20/03/2009 16:34:23 GPUGRID Starting task WS10117-SH2_US_8-0-10-SH2_US_8270000_0 using acemd version 662
20/03/2009 16:34:24 GPUGRID Computation for task WS10117-SH2_US_8-0-10-SH2_US_8270000_0 finished
20/03/2009 16:34:24 GPUGRID Output file WS10117-SH2_US_8-0-10-SH2_US_8270000_0_1 for task WS10117-SH2_US_8-0-10-SH2_US_8270000_0 absent
20/03/2009 16:34:24 GPUGRID Output file WS10117-SH2_US_8-0-10-SH2_US_8270000_0_2 for task WS10117-SH2_US_8-0-10-SH2_US_8270000_0 absent
20/03/2009 16:34:24 GPUGRID Output file WS10117-SH2_US_8-0-10-SH2_US_8270000_0_3 for task WS10117-SH2_US_8-0-10-SH2_US_8270000_0 absent
20/03/2009 16:34:25 GPUGRID Started upload of up108704-pYEpYI_US530000-0-10-ignasi_1_0
20/03/2009 16:34:25 GPUGRID Started upload of up108704-pYEpYI_US530000-0-10-ignasi_1_1

Any suggestions how I can fix this? Should I disable the option on install to allow all users to run Boinc? Anything else?
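
For anyone comparing logs, here is a minimal Python sketch that pulls the "Output file ... absent" lines out of a BOINC message log and groups them by task. The pattern is based only on the lines pasted above; the function and variable names are made up for illustration.

```python
import re
from collections import defaultdict

# Pattern assumed from the log lines in this post.
ABSENT_RE = re.compile(r"Output file (\S+) for task (\S+) absent")


def absent_outputs(log_lines):
    """Map each task name to the output files the client reported as absent."""
    missing = defaultdict(list)
    for line in log_lines:
        match = ABSENT_RE.search(line)
        if match:
            output_file, task = match.groups()
            missing[task].append(output_file)
    return dict(missing)


SAMPLE_LOG = [
    "20/03/2009 16:34:23 GPUGRID Starting task WS10117-SH2_US_8-0-10-SH2_US_8270000_0 using acemd version 662",
    "20/03/2009 16:34:24 GPUGRID Computation for task WS10117-SH2_US_8-0-10-SH2_US_8270000_0 finished",
    "20/03/2009 16:34:24 GPUGRID Output file WS10117-SH2_US_8-0-10-SH2_US_8270000_0_1 for task WS10117-SH2_US_8-0-10-SH2_US_8270000_0 absent",
    "20/03/2009 16:34:24 GPUGRID Output file WS10117-SH2_US_8-0-10-SH2_US_8270000_0_2 for task WS10117-SH2_US_8-0-10-SH2_US_8270000_0 absent",
    "20/03/2009 16:34:24 GPUGRID Output file WS10117-SH2_US_8-0-10-SH2_US_8270000_0_3 for task WS10117-SH2_US_8-0-10-SH2_US_8270000_0 absent",
]

if __name__ == "__main__":
    for task, files in absent_outputs(SAMPLE_LOG).items():
        print(task, "is missing", len(files), "output files")
```

A task that "finishes" one second after starting and leaves all of its output files absent has crashed at startup rather than completed, which matches what the log shows here.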

Rob

computerguy09
Joined: 20 Aug 08
Posts: 10
Credit: 15,789,768
RAC: 0
Level
Pro
Message 7673 - Posted: 20 Mar 2009 | 22:07:51 UTC - in response to Message 7641.
Last modified: 20 Mar 2009 | 22:09:06 UTC

Any ideas about error 193 with the 185.13 drivers on 64-bit Debian?


Just a check on your environment. I just brought up GPUGRID on my Ubuntu 8.10 64-bit machine. Brand new install. And every WU would error out in seconds. No error 193, but it gave me the "output file absent" message.

Check to see if the 32-bit runtime libraries are installed. If not, try that on the next set of WU's. After I installed the IA32 libraries, and the "microcode.ctl" package, the next set of WU's are running fine.

I'm not saying this will fix your problem, but it's something to check.

I'm running nvidia 180.11 drivers, since that's what Ubuntu "likes". And trying to upgrade to the released versions from Nvidia's website just wasn't going too well, so I reverted to "stock".

Also, I'm running an 8600GTS card, and BOINC 6.6.15.

Mark

Profile Paul D. Buck
Joined: 9 Jun 08
Posts: 1050
Credit: 37,321,185
RAC: 0
Level
Val
Message 7693 - Posted: 21 Mar 2009 | 11:03:41 UTC

A long string of compute errors to report (task ids):

426788 4 errors
426924 3 errors
427001 4 errors, 1 canceled
427056 2 errors
427583 3 errors
425600 1 error

The error counts are as of the posting of this note ... obviously these may rise by tomorrow or whenever you all at the project look at these ...

These all look like "new" and "improved" tasks so ... maybe back to the drawing board ...

ignasi
Joined: 10 Apr 08
Posts: 254
Credit: 16,836,000
RAC: 0
Level
Pro
Message 7702 - Posted: 21 Mar 2009 | 12:58:31 UTC - in response to Message 7693.

Certainly one of the batches sent yesterday was corrupted.
*pYIpYV1*
I am canceling them out.

sorry about that,
ignasi

Profile Zydor
Joined: 8 Feb 09
Posts: 252
Credit: 1,309,451
RAC: 0
Level
Ala
Message 7705 - Posted: 21 Mar 2009 | 13:24:12 UTC - in response to Message 7702.

The aborts came through ok - it picked up an additional two bad ones in the queue lurking from the same batch as the three that bombed out earlier this morning UTC time.

Many Thanks - Crunch On :)

Regards
Zy

David Saum
Joined: 13 Jan 09
Posts: 2
Credit: 1,008,604
RAC: 0
Level
Ala
Message 7708 - Posted: 21 Mar 2009 | 16:31:22 UTC

Help! I have 3 identical winXP AMD 4200 x2 systems with evga 9600gso 384mb cards that have been running GPUGRID happily for at least a month, and suddenly about a week ago ALL of them started giving errors simultaneously. All WUs now crash within seconds of starting. I have tried lots of fixes: going back to stock clocking, underclocking, rebooting, updating drivers, etc. Nothing makes any difference. There have been no changes to my system other than WinXP auto updates. My conclusion is that there was a change in GPUGRID WUs and the problem is not on my side. Here is my typical crash:

http://www.gpugrid.net/result.php?resultid=410378

<core_client_version>6.4.5</core_client_version>
<![CDATA[
<message>
The system cannot find the path specified. (0x3) - exit code 3 (0x3)
</message>
<stderr_txt>
# Using CUDA device 0
# Device 0: "GeForce 9600 GSO"
# Clock rate: 1350000 kilohertz
# Total amount of global memory: 402325504 bytes
# Number of multiprocessors: 12
# Number of cores: 96
Cuda error in file '..\cuda/cutil.h' in line 305 : out of memory.
Memory usage: host: bytes device: bytes
Assertion failed: 0, file ..\cuda/cutil.h, line 305

This application has requested the Runtime to terminate it in an unusual way.
Please contact the application's support team for more information.
----

TIA for any help.
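
To put the "Total amount of global memory" figure in more familiar units, here is a tiny Python sketch converting the byte count from the stderr output to MiB. Only the byte value comes from the log; the helper name is made up.

```python
def bytes_to_mib(num_bytes):
    """Convert a byte count to mebibytes (1 MiB = 2**20 bytes)."""
    return num_bytes / 2**20


REPORTED_GLOBAL_MEMORY = 402325504  # bytes, from the stderr output above

if __name__ == "__main__":
    print(bytes_to_mib(REPORTED_GLOBAL_MEMORY), "MiB")
```

That works out to 383.6875 MiB, slightly under the nominal 384 MB of the card. The sketch only converts units; it says nothing about how much of that memory was actually free when the WU started.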

ignasi
Joined: 10 Apr 08
Posts: 254
Credit: 16,836,000
RAC: 0
Level
Pro
Message 7711 - Posted: 21 Mar 2009 | 16:50:15 UTC - in response to Message 7708.

My conclusion is that there was a change in GPUGRID WUs and the problem is not on my side. Here is my typical crash:

http://www.gpugrid.net/result.php?resultid=410378


Nope. These WUs haven't changed at all...

i

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Level
Met
Message 7714 - Posted: 21 Mar 2009 | 17:53:06 UTC - in response to Message 7708.
Last modified: 21 Mar 2009 | 18:01:00 UTC

Why would it say "out of memory" even after a fresh reboot? That's very strange. Do you know how to use RivaTuner to watch the video memory usage? It's been discussed here a long time ago. If the search engine can't find it I could write it down again.

Edit: others are able to complete your WUs just fine. That tells us that the error should be on the side of your machines. Among your wingmen there was no one with less than 512 MB, so that comparison doesn't tell us anything.

MrS
____________
Scanning for our furry friends since Jan 2002

Joe
Joined: 1 Sep 08
Posts: 37
Credit: 5,864,088
RAC: 0
Level
Ser
Message 7767 - Posted: 23 Mar 2009 | 12:47:02 UTC - in response to Message 7714.

WOW! Now I have 3 complete WUs with 6.6.17/WinXP and only one error...
Kind regards
Joe

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Level
Met
Message 7778 - Posted: 23 Mar 2009 | 20:36:30 UTC - in response to Message 7767.

Did you change anything?

MrS
____________
Scanning for our furry friends since Jan 2002

Joe
Joined: 1 Sep 08
Posts: 37
Credit: 5,864,088
RAC: 0
Level
Ser
Message 7779 - Posted: 23 Mar 2009 | 21:27:25 UTC - in response to Message 7778.

>Did you change anything?
Yes, I have a second GPU (GeForce 8800 GT) inside and updated to 6.6.17. But at the moment it is 3:3... See here: http://www.gpugrid.net/results.php?hostid=29092
I believe the problem is inside XP! The same card works in Vista http://www.gpugrid.net/results.php?hostid=24499 without problems, and the card that works in Vista produces errors in XP...

Kind regards
Joe

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Level
Met
Message 7780 - Posted: 23 Mar 2009 | 21:41:07 UTC - in response to Message 7779.

I believe the problem is inside XP!


XP itself is not the problem, as thousands (?) of users are running with it just fine. However, something you do on your XP is special and causes these issues. Too bad I have run out of ideas...

MrS
____________
Scanning for our furry friends since Jan 2002

uBronan
Joined: 1 Feb 09
Posts: 139
Credit: 575,023
RAC: 0
Level
Gly
Message 7787 - Posted: 23 Mar 2009 | 23:54:05 UTC
Last modified: 23 Mar 2009 | 23:56:49 UTC

Hmm, sadly an error again, and the nasty part is that the WU had finished; the error came when the upload started.
No reason was given for why it crashed, either.
this one
So I guess there are still some problems.

jrobbio
Joined: 13 Mar 09
Posts: 59
Credit: 324,366
RAC: 0
Message 7793 - Posted: 24 Mar 2009 | 8:53:02 UTC - in response to Message 7779.

>Did you change anything?
Yes, I have a second GPU (GeForce 8800 GT) inside and updated to 6.6.17. But at the moment it is 3:3... See here: http://www.gpugrid.net/results.php?hostid=29092
I believe the problem is inside XP! The same card works in Vista http://www.gpugrid.net/results.php?hostid=24499 without problems, and the card that works in Vista produces errors in XP...

Kind regards
Joe


Thoughts:

  • Have you used driver cleaner before installing a new driver? Driver Cleaner Professional Free. Newer paid version is here: http://www.drivercleaner.net/
  • If it is XP home log in to safe mode and check the security permissions to your Boinc folders
  • I'm a bit confused as to why the system is reporting that you have 2 x GTX295's when you say it is a Geforce 8800GT. Is that normal?
  • Run a CPU/GPU stress test on your XP machine and see how it fares. The new one that people are recommending is OCCT
  • Check your memory



Regards,

Rob

Joe
Joined: 1 Sep 08
Posts: 37
Credit: 5,864,088
RAC: 0
Level
Ser
Message 7805 - Posted: 24 Mar 2009 | 14:02:40 UTC - in response to Message 7793.

Now I have completed 2 more WUs with XP.

  • Have you used driver cleaner before installing a new driver?

No, because the newest Nvidia driver was the first driver on a new XP install.

  • If it is XP home log in to safe mode and check the security permissions to your Boinc folders

It's XP Prof. 32-bit. The permissions are OK and the directory is excluded from Antivir.

  • I'm a bit confused as to why the system is reporting that you have 2 x GTX295's when you say it is a Geforce 8800GT. Is that normal?

Yes, there is a GTX295 (not in SLI mode) and an 8800GT in this system.

  • Run a CPU/GPU stress test on your XP machine and see how it fares. The new one that people are recommending is OCCT
  • Check your memory

All components in this system are OK...

But in the Vista system I had a new error:
Cuda error: Kernel [scale_coordinate_kernel] failed in file 'ComputePme_kernels.cu' in line 23 : unknown error.

See more here: http://www.gpugrid.net/result.php?resultid=435480

Kind regards

Joe
