Message boards : Graphics cards (GPUs) : nanosleep for Windows for Windows gurus

Profile GDF
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Joined: 14 Mar 07
Posts: 1957
Credit: 629,356
RAC: 0
Message 2646 - Posted: 27 Sep 2008 | 9:03:56 UTC
Last modified: 27 Sep 2008 | 9:04:42 UTC

Does anybody have source code for a nanosleep implementation on Windows? Please contact me by PM.

gdf

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Message 2649 - Posted: 27 Sep 2008 | 11:23:14 UTC

Hi GDF,

Do you know for certain that the current Windows sleep implementation causes problems, or is it still a supposition?

If it's the latter, it might be a good idea to test it by writing the time at which the thread actually wakes up into a log file and analysing the variance of these times. The test machine would have to be "loaded normally" though, with at least ncpus-1 BOINC tasks running.
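
A minimal sketch of such a test, assuming Python's standard `time` and `statistics` modules rather than the project's own code. The function names `measure_sleep_overshoot` and `summarize` are made up for illustration. It requests a 1 ms sleep many times and reports how late the thread actually woke up; to be meaningful it would have to run while the machine is loaded as described above.

```python
import statistics
import time


def measure_sleep_overshoot(requested_s=0.001, samples=200):
    """Return how much longer than requested each sleep took, in milliseconds."""
    overshoots_ms = []
    for _ in range(samples):
        start = time.perf_counter()
        time.sleep(requested_s)
        elapsed = time.perf_counter() - start
        overshoots_ms.append((elapsed - requested_s) * 1000.0)
    return overshoots_ms


def summarize(overshoots_ms):
    """Mean, spread and worst case of the overshoot samples."""
    return {
        "mean_ms": statistics.fmean(overshoots_ms),
        "stdev_ms": statistics.pstdev(overshoots_ms),
        "max_ms": max(overshoots_ms),
    }


if __name__ == "__main__":
    stats = summarize(measure_sleep_overshoot())
    for name, value in stats.items():
        print(f"{name}: {value:.3f}")
```

A large `stdev_ms` or `max_ms` compared with the requested 1 ms would support the supposition; values close to zero would argue against it.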

MrS
____________
Scanning for our furry friends since Jan 2002

Profile GDF
Message 2651 - Posted: 27 Sep 2008 | 14:56:19 UTC - in response to Message 2649.

It is likely, as it is the only difference. We may change the way we do this. We want to use the extra CPU as well, but we cannot lose performance on the GPU.

g

ExtraTerrestrial Apes
Message 2670 - Posted: 30 Sep 2008 | 18:48:03 UTC

I asked a programmer friend about this. He took a look at how nanosleep is implemented in Cygwin and concluded that they round up to full milliseconds (which may well be more than 1 ms), and that this seems to be a limitation of the Windows scheduler / kernel.
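
To illustrate what "rounding up to full ms" means for a nanosecond request, here is a tiny sketch. It is not Cygwin's actual code, and `round_up_to_ms` is a hypothetical helper; it only shows the arithmetic.

```python
def round_up_to_ms(requested_ns):
    """Round a requested sleep in nanoseconds up to whole milliseconds."""
    ns_per_ms = 1_000_000
    # Ceiling division on integers: any remainder costs one more millisecond.
    return -(-requested_ns // ns_per_ms)


if __name__ == "__main__":
    for ns in (1, 500_000, 1_000_000, 1_000_001):
        print(f"{ns} ns requested, {round_up_to_ms(ns)} ms slept at minimum")
```

With this kind of rounding, any request shorter than 1 ms still becomes a sleep of at least 1 ms.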

MrS

Profile [XTBA>XTC] ZeuZ
Joined: 15 Jul 08
Posts: 60
Credit: 108,384
RAC: 0
Message 2743 - Posted: 3 Oct 2008 | 13:18:12 UTC

Any news about this?

Thanks

ExtraTerrestrial Apes
Message 2807 - Posted: 5 Oct 2008 | 12:43:06 UTC
Last modified: 5 Oct 2008 | 12:46:05 UTC

I've been thinking about this issue a bit more.

To my knowledge the problem looks like this: you tell your GPU-managing thread to sleep for 1 ms; after that time it is woken up and polls the GPU to see whether it's ready yet. This "1 ms sleep" is implemented by the Windows-intrinsic sleep function. The problem is that this sleep has a minimum resolution of 1 ms and, depending on the other running threads, may be even less accurate. The reason no better solution seems to exist is that waking the thread up has to be managed by the scheduler, and from what my friend told me it seems Windows may not be able to handle times shorter than 1 ms. That's why there is no proper nanosleep for Windows (yet).
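
The sleep-and-poll pattern described above could be sketched roughly as follows. This is an illustration, not the GPU-Grid code: `is_done` stands for whatever call asks the GPU whether the step has finished, and `make_fake_gpu` is a hypothetical stand-in that simply becomes "done" after a fixed time.

```python
import time


def wait_with_sleep_polling(is_done, sleep_s=0.001):
    """Sleep between polls until is_done() returns True; return the poll count."""
    polls = 1
    while not is_done():
        time.sleep(sleep_s)
        polls += 1
    return polls


def make_fake_gpu(step_s):
    """Hypothetical stand-in for a GPU step that finishes after step_s seconds."""
    finish_at = time.perf_counter() + step_s
    return lambda: time.perf_counter() >= finish_at


if __name__ == "__main__":
    step_s = 0.010
    start = time.perf_counter()
    polls = wait_with_sleep_polling(make_fake_gpu(step_s))
    late_ms = (time.perf_counter() - start - step_s) * 1000.0
    print(f"polls: {polls}, noticed completion {late_ms:.3f} ms late")
```

The time between the step finishing and the loop noticing it is GPU idle time, and it can be as long as one full (possibly overshooting) sleep.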

If I'm right, we'd need some other solution, and it would be in nVidia's interest to solve this issue, as most GP-GPU programs would eventually run into this bottleneck: either use a CPU core completely or suffer a substantial performance loss.

And 1 ms is really critical. If you increased step times to ~100 ms (not talking about GPU-Grid specifically), the performance loss due to the sleep issue would be small, percentage-wise. But then you'd get a totally choppy / laggy Windows experience; the computer would not be usable any more.
The ideal 25 fps corresponds to 40 ms per frame, so we'd need step times of 40 ms or less. At 40 ms, losing 2 or 3 ms every now and then due to sleep already starts to hurt, and the overhead only gets worse as cards become faster.
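
A quick worked example of these percentages. `overhead_percent` is a hypothetical helper, and it assumes the worst case where the loss happens on every step rather than only every now and then; the 20 ms row is just an assumed faster card.

```python
def overhead_percent(step_ms, lost_ms):
    """Time lost to late wake-ups, as a percentage of the step time."""
    return 100.0 * lost_ms / step_ms


if __name__ == "__main__":
    for step_ms in (100, 40, 20):
        for lost_ms in (2, 3):
            pct = overhead_percent(step_ms, lost_ms)
            print(f"step {step_ms} ms, lost {lost_ms} ms: {pct:.1f} %")
```

The same 2 to 3 ms costs a larger share of each step as the step time shrinks, which is why faster cards are hit harder.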

So what could be done? Some ideas:

- MS or you: somehow implement an accurate sleep / nanosleep function

- MS and/or nVidia: provide a means for the GPU to actively tell the CPU that it's finished; the need to poll the GPU seems fundamentally wrong

- nVidia: somehow modify the driver so that CUDA code could really run in parallel with the conventional GPU operation, avoiding the screen refresh issue completely (which in turn would enable one to use step times of e.g. seconds, as opposed to the current milliseconds). Probably difficult: handling the GPU-internal scheduling and resource share dynamically for several concurrent apps.

- you: implement a user-controlled switch to return to the old method of constant polling. Could be beneficial for users with an OC'ed GTX 280 and the upcoming 30% performance increase.

- you: if you use the CPU core for some calculations anyway, it may be much easier to implement proper GPU polling and to avoid a performance loss on the GPU

- you: let's say you observe that on an interactively used Windows system a typical GPU needs on average 60 ms per step and 99.9% of all steps need more than, say, 50 ms. Then you could use the regular sleep until 50 ms have passed and then start polling the GPU constantly. Benefit: very low loss in GPU performance. Trade-off: more CPU usage than 6.45. Possibly difficult: determining the spread of step times accurately enough and making the algorithm robust enough that it's not confused by things like GPU-Grid + gaming.
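
The last idea could be sketched like this. It is only an illustration under my own assumptions, not what the project implements: `pick_sleep_budget_ms` and `wait_hybrid` are hypothetical names, and `is_done` again stands for whatever call asks the GPU whether the step has finished.

```python
import time


def pick_sleep_budget_ms(step_times_ms, allowed_early_fraction=0.001):
    """Pick a budget that at most allowed_early_fraction of past steps undercut."""
    ordered = sorted(step_times_ms)
    # At most `index` samples lie below ordered[index].
    index = int(len(ordered) * allowed_early_fraction)
    return ordered[index]


def wait_hybrid(is_done, sleep_budget_s, sleep_s=0.001):
    """Sleep in coarse slices until the budget is used up, then poll constantly.

    Returns "sleep" or "spin" depending on the phase that saw completion.
    """
    start = time.perf_counter()
    # Phase 1: cheap but imprecise; the step is not expected to finish yet.
    while time.perf_counter() - start + sleep_s <= sleep_budget_s:
        if is_done():
            return "sleep"
        time.sleep(sleep_s)
    # Phase 2: burn CPU so that completion is noticed almost immediately.
    while not is_done():
        pass
    return "spin"


if __name__ == "__main__":
    history_ms = [58.0, 60.5, 61.0, 59.2, 62.3, 60.1, 57.4, 63.0]
    budget_ms = pick_sleep_budget_ms(history_ms)
    finish_at = time.perf_counter() + 0.060  # pretend the step takes 60 ms
    phase = wait_hybrid(lambda: time.perf_counter() >= finish_at,
                        budget_ms / 1000.0)
    print(f"budget {budget_ms} ms, completion seen in phase: {phase}")
```

The CPU core is only fully used during the second phase, so the extra load depends on how close the budget is to the real step time. Keeping the step-time history fresh (for example when a game starts) is the part flagged above as possibly difficult.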

MrS

Profile GDF
Message 2812 - Posted: 5 Oct 2008 | 18:09:29 UTC - in response to Message 2807.

We have now implemented the last solution for Windows (not yet uploaded) and have done some tests on it with Stefan.

GDF

ExtraTerrestrial Apes
Message 2815 - Posted: 5 Oct 2008 | 19:08:15 UTC - in response to Message 2812.

Very nice, my card says thanks in advance ;)

MrS
