
Message boards : Graphics cards (GPUs) : GPUGrid doesn't like me!

Gregg
Joined: 27 Apr 08
Posts: 15
Credit: 45,805,690
RAC: 0
Level
Val
Message 6853 - Posted: 22 Feb 2009 | 3:25:50 UTC

I attached to the GPUGrid project. Files were downloaded and work started on crunching numbers. Things went smoothly for about 1 hour. Then suddenly, BOINC manager reset the project and then detached me from it.

I reattached. Same smooth start. Same result: reset and detached from the project. I'm running BOINC version 6.4.5 and an 8600 GT GPU. Has anyone else been rejected by the project for no apparent reason?

Paul D. Buck
Joined: 9 Jun 08
Posts: 1050
Credit: 37,321,185
RAC: 0
Level
Val
Message 6857 - Posted: 22 Feb 2009 | 4:33:33 UTC

No, happily running on 3 systems ...

Can you post the relevant portions of the message log, say 20 lines above and below the detach?

Gregg
Joined: 27 Apr 08
Posts: 15
Credit: 45,805,690
RAC: 0
Level
Val
Message 6876 - Posted: 22 Feb 2009 | 15:34:23 UTC - in response to Message 6857.

No, happily running on 3 systems ...

Can you post the relevant portions of the message log, say 20 lines above and below the detach?


Thanks for your response. Here it is:

2/21/2009 10:25:03 PM|GPUGRID|Sending scheduler request: To fetch work. Requesting 45413 seconds of work, reporting 0 completed tasks
2/21/2009 10:25:08 PM|GPUGRID|Scheduler request completed: got 0 new tasks
2/21/2009 10:25:08 PM|GPUGRID|Message from server: No work sent
2/21/2009 10:25:08 PM|GPUGRID|Message from server: (reached per-CPU limit of 1 tasks)
2/21/2009 10:25:08 PM|GPUGRID|Message from server: (Project has no jobs available)
2/21/2009 10:27:33 PM|GPUGRID|Sending scheduler request: To fetch work. Requesting 45370 seconds of work, reporting 0 completed tasks
2/21/2009 10:27:38 PM|GPUGRID|Scheduler request completed: got 0 new tasks
2/21/2009 10:27:38 PM|GPUGRID|Message from server: No work sent
2/21/2009 10:27:38 PM|GPUGRID|Message from server: (reached per-CPU limit of 1 tasks)
2/21/2009 10:27:38 PM|GPUGRID|Message from server: (Project has no jobs available)
2/22/2009 12:51:34 AM|GPUGRID|Resetting project
2/22/2009 12:51:35 AM|GPUGRID|Detaching from project


The program got into a loop of requesting work every minute so I set it to "No New Tasks" which solved that problem. Sadly, the end result was the same.

Dieter Matuschek
Joined: 28 Dec 08
Posts: 58
Credit: 231,884,297
RAC: 0
Level
Leu
Message 6878 - Posted: 22 Feb 2009 | 15:48:40 UTC - in response to Message 6853.

@Gregg

Isn't a GeForce 8600 GT with only 32 shader units way too slow for this project? Please see this post.

Gregg
Joined: 27 Apr 08
Posts: 15
Credit: 45,805,690
RAC: 0
Level
Val
Message 6879 - Posted: 22 Feb 2009 | 16:07:44 UTC - in response to Message 6878.
Last modified: 22 Feb 2009 | 16:10:38 UTC

@Gregg

Isn't a GeForce 8600 GT with only 32 shader units way too slow for this project? Please see this post.


I don't know. It is on the list as "It Works" although it is in white letters and the thread seems to directly address only gray, red and green letters. Maybe the 8600 GT was supposed to be in gray, not white letters.

If someone could state definitively that it doesn't have sufficient capacity, that would clear up the mystery.

Thanks.

Dieter Matuschek
Joined: 28 Dec 08
Posts: 58
Credit: 231,884,297
RAC: 0
Level
Leu
Message 6880 - Posted: 22 Feb 2009 | 16:15:11 UTC - in response to Message 6879.

Perhaps this discussion is helpful.

Gregg
Joined: 27 Apr 08
Posts: 15
Credit: 45,805,690
RAC: 0
Level
Val
Message 6881 - Posted: 22 Feb 2009 | 16:50:30 UTC - in response to Message 6880.

I think I'm seeing a pattern emerge. It appears that for GPUGRID purposes, my little old 8600 GT "may" work, but most likely won't. I will wait to reconnect to the project until I get an appropriate video card. Thanks to all for the help.

Scott Brown
Joined: 21 Oct 08
Posts: 144
Credit: 2,973,555
RAC: 0
Level
Ala
Message 6882 - Posted: 22 Feb 2009 | 16:56:21 UTC

An 8600GT will complete work within the 4-day deadline, but the problem will come in as follows:

1. Though shorter workunits will complete in less than two days, the 8600GT will take more than 48 hours to complete the longer variety of workunits, which occur frequently (thus requiring a 24/7 runtime to begin with).

2. GPUGRID workunits download based on the number of CPU cores, not GPUs. Thus, on your machine it will always try to keep two workunits downloaded.

3. If the combination is of two shorter units or one shorter unit and one longer unit, you should be fine completing both in under 4 days (or very close to that). The problem comes in when you get 2 longer workunits in a row (or more...3+ in a row is quite possible). When the latter happens, the amount of time that the card is late in returning work will increase, and you will have to abort one or more workunits to get back into an "on-time" cycle.

4. You can reduce this problem by overclocking somewhat (especially the shader speed), but no stable overclock will be fast enough on the 32-shader cards to be able to "set and forget" things...some degree of babysitting will always be needed.

A final note: a majority of the 8600 series cards have only 256 MB (this is not the case with yours, which has 512 MB). Though not a problem now, GDF (a project admin) has indicated that in the future GPUGRID apps may require more memory.
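
For anyone who wants to play with the numbers, here is a rough sketch of the queueing arithmetic in points 1 to 3. It assumes a simplified model that is not taken from BOINC itself: the client always holds two workunits, the GPU runs them one at a time around the clock, and a replacement is downloaded the moment one finishes. The 36-hour and 56-hour runtimes are illustrative assumptions (all that is stated above is that short units take under 48 hours and long ones over 48); the 4-day deadline is the one discussed here.

```python
DEADLINE_HOURS = 4 * 24  # the 4-day deadline discussed in this thread


def turnaround_hours(runtimes, queue_size=2):
    """Hours from download to completion for each workunit.

    Simplified, assumed model: `queue_size` workunits are kept on hand,
    they run back to back 24/7, and a new one is downloaded whenever
    one finishes.
    """
    download_times = []
    finish_times = []
    clock = 0.0
    for index, runtime in enumerate(runtimes):
        if index < queue_size:
            downloaded = 0.0  # the initial batch arrives at the start
        else:
            # downloaded when the unit `queue_size` places earlier finished
            downloaded = finish_times[index - queue_size]
        clock += runtime
        download_times.append(downloaded)
        finish_times.append(clock)
    return [done - got for done, got in zip(finish_times, download_times)]


def late_by_hours(runtimes, deadline=DEADLINE_HOURS, queue_size=2):
    """Hours past the deadline for each workunit (0.0 means on time)."""
    return [max(0.0, t - deadline)
            for t in turnaround_hours(runtimes, queue_size)]


# Assumed runtimes for a 32-shader card, in hours (illustrative only).
SHORT, LONG = 36.0, 56.0

if __name__ == "__main__":
    print(late_by_hours([SHORT, SHORT, LONG, LONG, LONG]))
```

In this model each workunit's turnaround is its own runtime plus that of the unit ahead of it in the queue, so two long units back to back exceed 96 hours while a short/long pair does not.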

Phoneman1
Joined: 25 Nov 08
Posts: 51
Credit: 980,186
RAC: 0
Level
Gly
Message 6883 - Posted: 22 Feb 2009 | 16:56:25 UTC - in response to Message 6879.

@Gregg
If someone could state definitively that it doesn't have sufficient capacity, that would clear up the mystery.


My 8600GTS has the same number of shaders (32) as your 8600GT, so you shouldn't have a problem. It won't be quick though: typical run time is around 36 hours, but there are WUs of different lengths and credits around. Leave it for 24 hours and make sure it is over 50% done. Check on it every day or so for the first few days if you like. As you have a Core 2 Duo, BOINC will only let you have one GPU task running and one queued, so you should be able to do both within the 4-day deadline.

Phoneman1

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Level
Met
Message 6885 - Posted: 22 Feb 2009 | 17:35:13 UTC - in response to Message 6879.

An 8600GT can finish a WU within the deadline, but your PC will be almost unusable while it runs, and if you don't run 24/7 you might miss the deadlines.

MrS
____________
Scanning for our furry friends since Jan 2002

Gregg
Joined: 27 Apr 08
Posts: 15
Credit: 45,805,690
RAC: 0
Level
Val
Message 6887 - Posted: 22 Feb 2009 | 17:38:48 UTC - in response to Message 6883.

Thanks Scott and Phoneman. Your comments were helpful. It appears that my card is in the gray area of being able to handle the load.

That said, I still don't understand why, after only a few hours of computing, the server detaches me from the project. It's not waiting a few days to see the status of the WU. Maybe it is keeping tabs on my progress and sees that I can't keep up.

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Level
Met
Message 6889 - Posted: 22 Feb 2009 | 17:53:06 UTC - in response to Message 6887.

That said, I still don't understand why, after only a few hours of computing, the server detaches me from the project.


I've never seen such behaviour before. Which BOINC version do you use? I'm a little worried about BOINC deciding to reset a project itself. Do that at the wrong moment and one may lose many WUs. That's not something I'd (yet) trust a program to decide.

MrS
____________
Scanning for our furry friends since Jan 2002

Gregg
Joined: 27 Apr 08
Posts: 15
Credit: 45,805,690
RAC: 0
Level
Val
Message 6891 - Posted: 22 Feb 2009 | 18:14:34 UTC - in response to Message 6889.

That said, I still don't understand why, after only a few hours of computing, the server detaches me from the project.


I've never seen such behaviour before. Which BOINC version do you use? I'm a little worried about BOINC deciding to reset a project itself. Do that at the wrong moment and one may lose many WUs. That's not something I'd (yet) trust a program to decide.

MrS


GPUGRID is the only project that has done that to me in the many years that I have been involved with BOINC. I only started running GPUGRID yesterday. I was running 6.4.5 and it did it to me twice. I researched the forum and didn't find anything resembling my problem, but did see that some people were having trouble with aspects of 6.4.5 and it was suggested that 6.5.0 might fix their problem. I realize it is a pre-release version but decided to see if that would fix my problem. Just as with 6.4.5, 6.5.0 ran the project for about 2-3 hours and then detached me.

Scott Brown
Joined: 21 Oct 08
Posts: 144
Credit: 2,973,555
RAC: 0
Level
Ala
Message 6893 - Posted: 22 Feb 2009 | 19:02:41 UTC - in response to Message 6891.


How did you install BOINC? In "protected mode", etc.?

Were you doing anything on the machine when it detached or was it just running?

Paul D. Buck
Joined: 9 Jun 08
Posts: 1050
Credit: 37,321,185
RAC: 0
Level
Val
Message 6899 - Posted: 22 Feb 2009 | 20:31:19 UTC

What I find most interesting is that there is a communication and then *2 HOURS LATER* the detach message ...

Why the two hour delay? What else might have happened, what else may be running on the system?

I also wonder why there are no other messages for two hours ... I know I have busier systems, but I don't think I have two-hour gaps in the messages ... ever ... even on my slowest systems ...
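
If it helps, the size of that gap can be measured straight from a saved message log. The sketch below assumes the `date|project|message` layout and US-style timestamps seen in the lines Gregg posted; the function names are made up for illustration and are not part of BOINC.

```python
from datetime import datetime

# Timestamp layout as it appears in the posted log lines (assumed to be stable).
LOG_TIME_FORMAT = "%m/%d/%Y %I:%M:%S %p"


def parse_line(line):
    """Split one 'date|project|message' line into (datetime, project, message)."""
    stamp, project, message = line.split("|", 2)
    return datetime.strptime(stamp, LOG_TIME_FORMAT), project, message


def gap_before(lines, keyword):
    """Time between the first message containing `keyword` and the line before it.

    Returns None if the keyword never appears or appears on the first line.
    """
    previous_time = None
    for line in lines:
        when, _project, message = parse_line(line)
        if keyword in message and previous_time is not None:
            return when - previous_time
        previous_time = when
    return None


# Three lines copied from the log posted earlier in this thread.
sample = [
    "2/21/2009 10:27:38 PM|GPUGRID|Message from server: (Project has no jobs available)",
    "2/22/2009 12:51:34 AM|GPUGRID|Resetting project",
    "2/22/2009 12:51:35 AM|GPUGRID|Detaching from project",
]

if __name__ == "__main__":
    print(gap_before(sample, "Resetting project"))
```

Run over the full, unfiltered log instead of only the GPUGRID lines, the same check would show whether any other project was talking just before the reset.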

Gregg
Joined: 27 Apr 08
Posts: 15
Credit: 45,805,690
RAC: 0
Level
Val
Message 6903 - Posted: 22 Feb 2009 | 22:41:05 UTC - in response to Message 6893.


How did you install BOINC? In "protected mode", etc.?

Were you doing anything on the machine when it detached or was it just running?


I've never installed anything in "protected mode." I just click the .exe file and let the installer do its thing. It has flushed me a total of four times. I was probably just doing the usual internet browsing during that time.

Gregg
Joined: 27 Apr 08
Posts: 15
Credit: 45,805,690
RAC: 0
Level
Val
Message 6905 - Posted: 22 Feb 2009 | 22:55:04 UTC - in response to Message 6899.

What I find most interesting is that there is a communication and then *2 HOURS LATER* the detach message ...

Why the two hour delay? What else might have happened, what else may be running on the system?

I also wonder why there are no other messages for two hours ... I know I have busier systems, but I don't think I have two-hour gaps in the messages ... ever ... even on my slowest systems ...


Regarding the lack of messages for two hours, I only posted the messages that were specific to the GPUGRID project. I was receiving normal message traffic regarding the other projects I was running during the two hours between attaching and getting detached. And, BOINC Manager showed that GPUGRID was running the entire time.

One last note. As you know, when the GPU is working on a project, its temperature jumps significantly (but stays well below a critical level). I noticed the last time I tried to get the project to run, that after running for about 20 minutes, the temperature on the GPU dropped back to its normal low-usage level. However, the BOINC Manager showed that the project was still running. It could not have been running given the magnitude of the temperature change of the GPU.

Scott Brown
Joined: 21 Oct 08
Posts: 144
Credit: 2,973,555
RAC: 0
Level
Ala
Message 6907 - Posted: 23 Feb 2009 | 0:11:51 UTC - in response to Message 6905.

One last note. As you know, when the GPU is working on a project, its temperature jumps significantly (but stays well below a critical level). I noticed the last time I tried to get the project to run, that after running for about 20 minutes, the temperature on the GPU dropped back to its normal low-usage level. However, the BOINC Manager showed that the project was still running. It could not have been running given the magnitude of the temperature change of the GPU.


Now this sounds familiar as a similar thing has happened to me with my 9500GT. On it, the GPUGRID work would show running in BOINC manager, but the temperature monitor showed it as being basically idle and the progress bar stopped counting. A reboot got the machine working correctly again briefly, but something else went wrong and it dumped a full day's 8 workunits yesterday. Haven't checked it this evening yet to see if it has self corrected...though mine was just hung and not resetting or detaching??? It is a 32-bit Vista machine and the hang-ups happened when my kids switched users in an attempt to guess a password.




Paul D. Buck
Joined: 9 Jun 08
Posts: 1050
Credit: 37,321,185
RAC: 0
Level
Val
Message 6916 - Posted: 23 Feb 2009 | 5:13:16 UTC

@Gregg,

Well, it is not supposed to happen ... but the reason we wanted the messages, all the messages, is to see what is happening on your system. What is happening on some other project may be causing the detach on GPU Grid. In other words, a cross-project bug ... project A does something that triggers the detach from project B ... it should not happen, and I am not saying it happens here ... but ... that gap is important ... what happened to GPU Grid does not look to me like something done by the GPU Grid project ...

Alain Maes
Joined: 8 Sep 08
Posts: 63
Credit: 1,437,484,959
RAC: 9,643
Level
Met
Message 6920 - Posted: 23 Feb 2009 | 8:52:47 UTC

OK, just a long shot here. Any chance these gentlemen/women are attached to a BOINC account manager like BAM?

If GPUGRID or any other project for that matter is not yet part of the projects they subscribed to in the relevant host, it will be automatically detached after the daily (or manually requested) update contacting the account manager.

This happened to me in the past for other projects.

Kind regards

Alain

Gregg
Joined: 27 Apr 08
Posts: 15
Credit: 45,805,690
RAC: 0
Level
Val
Message 6957 - Posted: 24 Feb 2009 | 3:33:08 UTC - in response to Message 6916.

@Gregg,

Well, it is not supposed to happen ... but the reason we wanted the messages, all the messages, is to see what is happening on your system. What is happening on some other project may be causing the detach on GPU Grid. In other words, a cross-project bug ... project A does something that triggers the detach from project B ... it should not happen, and I am not saying it happens here ... but ... that gap is important ... what happened to GPU Grid does not look to me like something done by the GPU Grid project ...


Paul - I'm certainly not an expert, but the message traffic in BOINC Manager didn't look any different from any other time. Unfortunately, I can't send you the entire message list for the period in question, as I have rebooted my computer since then and the message list has started over. I'd be happy to recreate the problem if you would like. In the interim, I have just given up and ordered a new GPU - one that is listed as being capable of running GPUGRID.

Alain - I use BAM. However, BAM appears to support GPUGRID. It is listed in the projects list that you use to attach to a project through BAM.

Dieter Matuschek
Joined: 28 Dec 08
Posts: 58
Credit: 231,884,297
RAC: 0
Level
Leu
Message 6958 - Posted: 24 Feb 2009 | 4:48:24 UTC - in response to Message 6957.

Alain - I use BAM. However, BAM appears to support GPUGRID. It is listed in the projects list that you use to attach to a project through BAM.

Your answer shows that you haven't attached the GPUGRID project at BAM.
So Alain is right. :- )

At every contact between BOINC manager and BAM you get detached from GPUGRID.
Please go to the BAM web site and attach GPUGRID there.

Paul D. Buck
Joined: 9 Jun 08
Posts: 1050
Credit: 37,321,185
RAC: 0
Level
Val
Message 6961 - Posted: 24 Feb 2009 | 5:06:45 UTC

Ah yes ...

Over the holiday season I sent a long list of my experiences with BAM to Willy, to which he replied (thank you Willy) and though we did not completely agree on all things, he did make some changes to BAM.

The AM to client and back again communication was one of the areas I noted did not work well ... though I concentrated on the NNT aspects, this would seem to be another. That if you attach on the client and NOT on BAM, it does not communicate that information UP to BAM ...

Gregg
Joined: 27 Apr 08
Posts: 15
Credit: 45,805,690
RAC: 0
Level
Val
Message 6972 - Posted: 25 Feb 2009 | 0:19:36 UTC - in response to Message 6958.

Alain - I use BAM. However, BAM appears to support GPUGRID. It is listed in the projects list that you use to attach to a project through BAM.

Your answer shows that you haven't attached the GPUGRID project at BAM.
So Alain is right. :- )

At every contact between BOINC manager and BAM you get detached from GPUGRID.
Please go to the BAM web site and attach GPUGRID there.

Oops. I really misspoke. I meant to say that I used BOINCstats to attach to GPUGRID. However, GPUGRID is in the list of projects that you can sign up for on BOINCstats. I'm soooo confused!

Alain Maes
Joined: 8 Sep 08
Posts: 63
Credit: 1,437,484,959
RAC: 9,643
Level
Met
Message 6976 - Posted: 25 Feb 2009 | 7:33:01 UTC - in response to Message 6972.

To attach successfully to a (new) project under BAM one needs to do the following: go to your BAM account and

1. under tab sign-up for projects: select create account (first time) or find account (rejoining) for the project(s) you want to run.
2. under tab project resources: default is 100, change it if you like. Here you can also choose to attach all your new hosts by default.
3. under tab host list: when you attach your (new) host to BAM a host name will be created and added to this list. Select the relevant host. A list is shown of all projects you signed up for. Make sure the button Attach? is selected for the projects you want to run on this particular host (omitting to do this will lead to this host being detached from the project the next time your host contacts BAM).

A bit unwieldy I know, but it works fine if you follow the rules above.
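
To make the detach rule in step 3 concrete, here is a toy model of what a sync with the account manager amounts to, as described in this thread. It is only an illustration of that description, not BOINC or BAM code; the function name and the "OtherProject" entry are hypothetical.

```python
def projects_after_sync(client_projects, manager_attach_flags):
    """Return (projects the host ends up with, projects that get detached).

    client_projects: names of projects currently attached on the host.
    manager_attach_flags: project name -> True if "Attach?" is selected
        for this host in the account manager. A project missing from this
        mapping is treated the same as one with the flag off.
    """
    wanted = {name for name, attach in manager_attach_flags.items() if attach}
    current = set(client_projects)
    detached = current - wanted   # attached locally, not flagged in the manager
    remaining = wanted            # the manager's list wins after the sync
    return remaining, detached


if __name__ == "__main__":
    # GPUGRID was attached on the client only, so the next sync drops it.
    remaining, detached = projects_after_sync(
        {"GPUGRID", "OtherProject"},
        {"OtherProject": True},
    )
    print(sorted(remaining), sorted(detached))
```

The point is simply that the manager's list is treated as the source of truth, so a project attached only on the client disappears at the next contact.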

Kind regards

Alain
