Message boards : Graphics cards (GPUs) : CUDA does not suspend

Controlix
Joined: 13 Dec 08
Posts: 3
Credit: 136,056
RAC: 0
Message 5542 - Posted: 12 Jan 2009 | 19:31:46 UTC

When I leave my laptop for a couple of minutes and the CUDA task starts working, then I come back to work, it takes 50% of my cpu performance on my dual core cpu. Screen updates lag, so it is practically impossible to work.
In the boinc manager I see that the GPUGRID task still takes cpu time.
When I suspend the task manually, it still continues running.
The only thing that helps is to shut down the boinc manager. That stops all tasks and I can work normally again.
Any idea what goes wrong or what I can do to make this work? Is there any additional info that can help you?

Bender10
Joined: 3 Dec 07
Posts: 167
Credit: 8,368,897
RAC: 0
Message 5550 - Posted: 12 Jan 2009 | 20:26:28 UTC - in response to Message 5542.

If you would like some help, we will need more information about your system (OS, GPU, etc.), and the BOINC and NVIDIA driver versions you are using.



Sadly, other people (most likely from the Public Relations Dept...) will be around shortly who will not need that information, as they will most likely just be telling you that 'their' system runs fine, and they are racking up credits like crazy, and it's too bad you have a problem...


Disclaimer: The opinions expressed here are the views of the writer and do not necessarily reflect
the views and opinions of the GPUgrid project, or any of the volunteer members that have already posted....

____________


Consciousness: That annoying time between naps......

Experience is a wonderful thing: it enables you to recognize a mistake every time you repeat it.

Paul D. Buck
Joined: 9 Jun 08
Posts: 1050
Credit: 37,321,185
RAC: 0
Message 5554 - Posted: 12 Jan 2009 | 22:05:55 UTC
Last modified: 12 Jan 2009 | 22:06:19 UTC

Not sure if I belong to the public relations department or not ... :)

But ...

There are many issues with CUDA and BOINC... the major one is that it is new ... and being new, there are problems ... this is another ...

The first part is probably because you are running Windows (XP or Vista), and there is a problem in the 6.55 application that makes it take more CPU than it should.

The second half of your problem is one of controls on the CUDA execution and that is a new one ...

I did post the following on the BOINC Dev mailing list:


On the GPU Grid board the following was posted:

When I leave my laptop for a couple of minutes and the CUDA task starts working, then I come back to work, it takes 50% of my cpu performance on my dual core cpu. Screen updates lag, so it is practically impossible to work.
In the boinc manager I see that the GPUGRID task still takes cpu time.
When I suspend the task manually, it still continues running.
The only thing that helps is to shut down the boinc manager. That stops all tasks and I can work normally again.


Which is indicating an additional problem with the control of tasks running on the GPU. These tasks are apparently not correctly suspending when so directed by the BOINC Manager. It is possible that the stopping of BOINC is short-stopping the suspension of the task and that had the participant waited the suspension would have occurred at the next interaction between the BOINC Manager and the CUDA processor.

But waiting for the next CPU-to-GPU interaction is not proper behavior for a suspend, which should be immediate. I am not sure whether checkpointing and a CPU-to-GPU interaction to save the state should occur, or even whether the GPU should be cleared.

Perhaps others (John) might have a better perspective on this than I ...


and someone that has the ability to open a Trac ticket might hopefully enter something along this line there ...

A couple more things to add to the question list though ...

If you suspend the project does that stop the interference?
If you suspend BOINC does that work?

In other words, there are at least these two additional levels that may help in the meantime ... so, your assignment, should you decide to accept it, is to add some additional details and we shall try to work with you ...

Oh, and Yes, I have not had any problems recently ... I guess I have to say that too ... :)

Well, except for too much CPU usage, the board not saving my preferences, etc ...
____________
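
To illustrate the hypothesis in the post above (that a suspend request is only noticed at the next CPU-to-GPU interaction), here is a minimal Python sketch. It is an assumption for illustration only, not the actual BOINC or GPUGRID code; `simulate_suspend` and its parameters are hypothetical names. It models a task that can only look at the suspend request between fixed-length chunks of GPU work.

```python
def simulate_suspend(request_at, chunk_seconds, total_seconds):
    """Return how long a suspend request waits before the task notices it.

    Hypothetical model: the task runs GPU work in uninterruptible chunks of
    chunk_seconds and only checks for a suspend request between chunks.
    request_at is the time (in seconds) at which the user asks to suspend.
    Returns None if the task finishes before it ever sees the request.
    """
    if chunk_seconds <= 0:
        raise ValueError("chunk_seconds must be positive")
    clock = 0
    while clock < total_seconds:
        # Poll point: the only moment the task can react to the request.
        if clock >= request_at:
            return clock - request_at
        # Stand-in for one uninterruptible batch of GPU work.
        clock += chunk_seconds
    return None


if __name__ == "__main__":
    # Short chunks: the request is honoured within a few seconds.
    print(simulate_suspend(request_at=7, chunk_seconds=5, total_seconds=100))
    # Long chunks: the same request can wait many minutes.
    print(simulate_suspend(request_at=30, chunk_seconds=600, total_seconds=3600))
```

Under this model the worst-case delay equals the chunk length, so a task that rarely returns control to the CPU would look as if it ignores the suspend request.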

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Message 5557 - Posted: 12 Jan 2009 | 22:14:53 UTC

I just tried with 6.5.0: set BOINC to suspend if user is active.
"suspended" -> wait -> "active" -> mouse movement -> "suspended" all projects, including GPU-Grid

So if you have any version older than 6.5.0 you might want to upgrade.

MrS
____________
Scanning for our furry friends since Jan 2002

javichu77
Joined: 24 Nov 08
Posts: 4
Credit: 50,712
RAC: 0
Message 5563 - Posted: 13 Jan 2009 | 1:22:04 UTC - in response to Message 5542.

When I leave my laptop for a couple of minutes and the CUDA task starts working, then I come back to work, it takes 50% of my cpu performance on my dual core cpu. Screen updates lag, so it is practically impossible to work.
In the boinc manager I see that the GPUGRID task still takes cpu time.
When I suspend the task manually, it still continues running.
The only thing that helps is to shut down the boinc manager. That stops all tasks and I can work normally again.
Any idea what goes wrong or what I can do to make this work? Is there any additional info that can help you?

(Sorry for my terrible English.) I had the same problem as you. I have an 8600GT with passive cooling (575/DDR2-400/1300) overclocked to 653/469/1641, and the CPU is a Dual Core E2180 2 GHz overclocked to 3.15 GHz. BOINC version 6.4.5 takes more than 33 hours to complete each work unit, running a WCG work unit at the same time. The GPUGRID work units use nearly one complete core. As my card seems to be near the minimum requirements for this project, any movement on the screen freezes while a GPUGRID work unit is running (there is no graphics power left for anything other than the CUDA work). Laptops usually have low-end graphics cards, integrated into the motherboard in most cases, so I imagine that you may suffer from the same problem I do: a low-powered graphics card.
I used to do the same thing you described: pause the BOINC manager. Problems: besides the obvious loss of computing work (I don't care about credits; it is solidarity work), I lost part of the work done on the GPUGRID work unit when I resumed it. (I did not lose work on the WCG projects, because the BOINC manager can keep the data in memory (4 GB, 3.5 GB for 32-bit WinXP) while the work is paused.)
Suggestions for the programmers: it would be good if this could be done with GPUGRID too; and it would be immensely great if we could limit the GPU load, in the same way that we can limit the CPU load (I imagine that it has not been done because of "technical difficulties", but I try...).
What I do now when I need to work with the computer is to suspend the GPUGRID project. The BOINC manager stops the GPU-CPU work unit and starts another, CPU-only one. That way I can run 2 CPU work units (WCG in my case), limit their use of the cores and, at the same time, use my computer. Result: more work done for the computing grid. When I finish my work with the computer I resume the GPUGRID project, and the BOINC manager stops one of the CPU work units and continues with the GPU-CPU one.

When I want to PAUSE the BOINC application (and PAUSE all the BOINC tasks), very often the GPUGRID task doesn't pause and keeps consuming CPU time and, what is worse, nearly all the GPU time. I have to resume the application and pause it again. After several attempts I always manage to pause the GPUGRID task. But, as I said before, I now prefer to SUSPEND the GPUGRID project or the task.

Besides that, I can only get new GPUGRID work units if I wait until the last one I have is completed, suspend the project, exit the BOINC application, launch it again and resume the GPUGRID project. This results in a loss of computing time for the GPUGRID project. Even more, I now have a GPUGRID work unit that I know I can't finish in time (I need 33 hours to complete one), so I have tried to cancel it and get a new one. Every time I tried to cancel it, the manager crashed. When I launched the BOINC manager again, the GPUGRID work unit was still there. I have to say that this is the first time I have tried to cancel a GPUGRID work unit, and I don't know whether it is a general problem or it happens only to me. I have suspended the GPUGRID project; I will wait for the deadline to arrive, resume the project and see what happens. I hope "the system" will hand this work unit to another computer and send me another one.

Another question for this endless post: is it possible to get a BOINC version newer than the 6.4.5 I am using now? I've read that they are much faster, especially the 6.5.6. My slow computer (or graphics card) would really appreciate it.

Paul D. Buck
Joined: 9 Jun 08
Posts: 1050
Credit: 37,321,185
RAC: 0
Message 5566 - Posted: 13 Jan 2009 | 3:12:19 UTC

The BOINC Manager version 6.5.0 is no faster than your version. It only has a fix that mainly seems to make it easier to run a CUDA task along with the CPU tasks.

The faster application, which has a really similar version number, is the GPU Grid application 6.56, which was pulled because of a bug. The project is trying to get a new version out "real soon now", as Jerry Pournelle used to say (for those that used to read Byte Magazine ... I know, I am dating myself) ...

Anyway, getting that application will be automatic when it is released ... you will not have to do anything ...
____________

javichu77
Joined: 24 Nov 08
Posts: 4
Credit: 50,712
RAC: 0
Message 5567 - Posted: 13 Jan 2009 | 4:02:04 UTC - in response to Message 5566.

Thank you, Paul D. Buck, for your quick answer.

Paul D. Buck
Joined: 9 Jun 08
Posts: 1050
Credit: 37,321,185
RAC: 0
Message 5569 - Posted: 13 Jan 2009 | 4:26:17 UTC - in response to Message 5567.

Thank you, Paul D. Buck, for your quick answer.


Some of us don't have much of a life ... :)

And you are very welcome ...
____________

MarkJ
Volunteer moderator
Volunteer tester
Joined: 24 Dec 08
Posts: 738
Credit: 200,909,904
RAC: 0
Message 5571 - Posted: 13 Jan 2009 | 10:40:50 UTC - in response to Message 5554.
Last modified: 13 Jan 2009 | 10:46:11 UTC

and someone that has the ability to open a Trac ticket might hopefully enter something along this line there ...


I'll wait and see what BOINC 6.6.0 offers. Windows version is due out tomorrow.

The one thing they have mentioned so far is that it addresses Trac #808, which is to do with leaking thread handles.

Paul D. Buck
Joined: 9 Jun 08
Posts: 1050
Credit: 37,321,185
RAC: 0
Message 5578 - Posted: 13 Jan 2009 | 12:08:32 UTC - in response to Message 5571.

and someone that has the ability to open a Trac ticket might hopefully enter something along this line there ...


I'll wait and see what BOINC 6.6.0 offers. Windows version is due out tomorrow.

The one thing they have mentioned so far is that it addresses Trac #808, which is to do with leaking thread handles.


I hate it when that happens ...

You wind up with thread handles all over the floor and towels soaked in them ... and the smell ... ugh ...

As to not mentioning things, yeah, that too is common in BOINC. I struggled for years trying to improve that and in the end failed ...

It kinda makes it hard to test when you don't know what the changes are ... and to my mind, limited as it is, I really don't see issuing a new version for a single problem fix ... even leaks like that ...
____________

MarkJ
Volunteer moderator
Volunteer tester
Joined: 24 Dec 08
Posts: 738
Credit: 200,909,904
RAC: 0
Message 5670 - Posted: 16 Jan 2009 | 11:12:32 UTC - in response to Message 5542.

In the boinc manager I see that the GPUGRID task still takes cpu time.
When I suspend the task manually, it still continues running.


This seems to work correctly in BOINC 6.5.0, in that it will kill the science app. A resume will start it up again.

I haven't bothered getting 6.6.0 (it's being alpha-tested at the moment). Maybe one of the guys with it could try suspend/resume on a CUDA task and see what it does.

javichu77
Joined: 24 Nov 08
Posts: 4
Credit: 50,712
RAC: 0
Message 5761 - Posted: 18 Jan 2009 | 18:14:14 UTC

The problems I wrote about in a prior post were using 6.5.0 WUs. Now I'm crunching my second 6.55 WU without problems. I have paused them several times with an immediate reaction from the BOINC manager. It seems that the work units are done in less time (despite my ultra-slow system).
Another good thing for my slow system: half an hour ago, the server sent a message saying that it will not send me new work because I would not finish it on time. This is true, and it avoids receiving a work unit that I have to suspend, with the problems I had with that before.
Besides that, and more importantly, they are the first two WUs that I received automatically, without having to suspend the GPUGRID project and resume it manually. This increases the GPUGRID work done and relieves me of the "babysitter" work.

I am eager to see the performance of my machine with v6.60, but I imagine that I have to wait until I finish the GPUGRID WU I'm working on; an eternity (33 hours, more or less).

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Message 5792 - Posted: 19 Jan 2009 | 22:08:01 UTC - in response to Message 5761.

Yes, the need for babysitting was resolved at the beginning of January. Apart from that, you're still mixing up the version number of the GPU-Grid application (6.55 is the current one for Windows) with the BOINC manager version, which is totally independent of the former (as long as it supports CUDA) and does not really affect speed. Recent BOINC manager versions have been e.g. 6.3.21, 6.4.x, 6.5.0, 6.6.0.

MrS
____________
Scanning for our furry friends since Jan 2002
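
The distinction MrS draws above between the two version numbers can be sketched in a few lines of Python. This is only a rough heuristic based on the numbers quoted in this thread (BOINC client versions written with three parts, such as 6.4.5, and GPUGRID application versions with two, such as 6.55); the function names are hypothetical, and the filename pattern is assumed from the single executable name mentioned in the thread, not from any BOINC specification.

```python
import re

# Assumed pattern, derived from "acemd_6.62_windows_intelx86__cuda.exe":
# <app>_<version>_<platform>[__<plan class>].exe
APP_FILE = re.compile(
    r"^(?P<app>[A-Za-z0-9]+)_(?P<version>\d+\.\d+)_"
    r"(?P<platform>.+?)(?:__(?P<plan>[A-Za-z0-9]+))?\.exe$"
)


def parse_app_filename(filename):
    """Split a science-application executable name into its parts."""
    match = APP_FILE.match(filename)
    if match is None:
        raise ValueError(f"unrecognised application filename: {filename}")
    return (match["app"], match["version"], match["platform"], match["plan"])


def version_kind(version):
    """Guess what a version string refers to, using the thread's convention."""
    parts = version.split(".")
    if not all(part.isdigit() for part in parts):
        raise ValueError(f"not a numeric version: {version}")
    if len(parts) == 3:
        return "BOINC client"
    if len(parts) == 2:
        return "science application"
    raise ValueError(f"unexpected number of parts: {version}")


if __name__ == "__main__":
    print(parse_app_filename("acemd_6.62_windows_intelx86__cuda.exe"))
    print(version_kind("6.4.5"))
    print(version_kind("6.55"))
```

The heuristic cannot resolve ambiguous spellings: a number written as "6.5.6" is classified as a client version even if the writer meant application 6.56.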

Controlix
Joined: 13 Dec 08
Posts: 3
Credit: 136,056
RAC: 0
Message 6398 - Posted: 4 Feb 2009 | 17:00:56 UTC
Last modified: 4 Feb 2009 | 17:01:50 UTC

Strange. I am running cuda v6.62 (acemd_6.62_windows_intelx86__cuda.exe) and I'm experiencing the same problem: the task keeps running and consumes all my gpu-power.

I'm running on a Core2 Duo @ 2.53GHz and a Nvidia GeForce 9800M GTS.
My operating system is Vista Ultimate 64bit.

The only way to stop the CUDA task is to stop the BOINC manager, thereby terminating the BOINC server app.

Paul D. Buck
Joined: 9 Jun 08
Posts: 1050
Credit: 37,321,185
RAC: 0
Message 6399 - Posted: 4 Feb 2009 | 17:05:03 UTC - in response to Message 6398.

Strange. I am running cuda v6.62 (acemd_6.62_windows_intelx86__cuda.exe) and I'm experiencing the same problem: the task keeps running and consumes all my gpu-power.


um, that is what is supposed to happen...

The CPU tasks will use all of the available CPU power, a GPU task will use all available GPU power. Because this is new technology the ability to be less intrusive is not there yet (and may never really get there because of limitations of the GPU cards themselves) ...

But, if it is consuming all resources then things are as they should be ...

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Message 6407 - Posted: 4 Feb 2009 | 20:19:04 UTC - in response to Message 6398.

Controlix, could you be a bit more specific? I've seen you made the first post in this thread, but it got quite carried away, so I'm not sure what you mean when you say "the same problem".

Is it still that BOINC doesn't suspend your CUDA task, either when you tell it manually to suspend or when you resume activity [and have it set to "suspend when user is active"]? Which BOINC version?

MrS
____________
Scanning for our furry friends since Jan 2002

Controlix
Joined: 13 Dec 08
Posts: 3
Credit: 136,056
RAC: 0
Message 6507 - Posted: 8 Feb 2009 | 19:48:11 UTC - in response to Message 6407.

It is indeed the case that CUDA does not suspend its work when I'm active and continues taking up almost all the GPU resources.
I've seen that it does eventually suspend, but it can take several minutes (more than 10) before that happens. I haven't verified this, but it might well be that it suspends when the active WU is finished.

My version of BOINC is 6.4.5 on Vista x64.
