Message boards : Number crunching : Can't Crunch!

tomba
Joined: 21 Feb 09
Posts: 497
Credit: 700,690,702
RAC: 0
Message 35833 - Posted: 23 Mar 2014 | 18:53:31 UTC

A big "oops" here...

With the dearth of GPUGrid WUs this morning I switched to Einstein. Ran out of WUs so switched to Milky Way. Got several completed. Noticed some available GPUGrid WUs so switched back. Got seven GIANNIs, all of which failed within seconds with:



Went back to Milky Way. Same failures. Switched the beast off.

Just restarted. Got two SANTIs. Both failed in seconds; same message.

What to do??



Jozef J
Joined: 7 Jun 12
Posts: 112
Credit: 1,118,845,172
RAC: 0
Message 35835 - Posted: 23 Mar 2014 | 20:09:23 UTC

All you can do is cancel the incoming tasks, but you have to catch them the moment they start, which leaves only a few seconds.
It happens to me sometimes too. We do not know whether it is an Nvidia driver problem or errors in the tasks.

mikey
Joined: 2 Jan 09
Posts: 291
Credit: 2,045,966,115
RAC: 10,180,104
Message 35836 - Posted: 23 Mar 2014 | 22:37:55 UTC - in response to Message 35833.

I got the same message a while back, except while using Firefox, and all the Google results said it is a 'Windows timing problem'. They suggested some fixes but none worked. I ended up trading for an AMD GPU and had no more problems.

Jacob Klein
Joined: 11 Oct 08
Posts: 1127
Credit: 1,901,927,545
RAC: 0
Message 35837 - Posted: 23 Mar 2014 | 23:26:59 UTC - in response to Message 35833.

From my experience... Sometimes the drivers can get into a wonky state. It especially happens, for me, when watching a Flash video (like YouTube) while crunching on the GPU. Usually, a restart will get it non-wonky again.

However, if you prefer to take a more pro-active approach, you can report the problem to nVIDIA Driver Feedback. Be sure to include the .dmp dump files within the C:\Windows\LiveKernelReports\WATCHDOG directory, which is where the GPU "Display driver stopped responding" dump files get stored.

I haven't seen the problem in a long long time, but coincidentally, I did have it happen this week. It was apparently a conflict with watching a YouTube video while crunching one of the new POEM OpenCL 2.0 tasks. They don't play nicely.

Regards,
Jacob Klein
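
For anyone gathering the dump files described above before filing a driver report, here is a minimal Python sketch that lists the .dmp files in a directory, oldest first. The directory path is the one given in the post; the function name list_dump_files is made up for illustration, and reading that directory may require administrator rights on some systems.

```python
from datetime import datetime
from pathlib import Path

# Directory named in the post above; pass a different one if your system differs.
WATCHDOG_DIR = Path(r"C:\Windows\LiveKernelReports\WATCHDOG")


def list_dump_files(directory=WATCHDOG_DIR):
    """Return the .dmp files in `directory`, oldest first (empty list if it is missing)."""
    directory = Path(directory)
    if not directory.is_dir():
        return []
    dumps = [
        p for p in directory.iterdir()
        if p.is_file() and p.suffix.lower() == ".dmp"
    ]
    # Sort by last-modified time so the earliest crash comes first.
    return sorted(dumps, key=lambda p: p.stat().st_mtime)


if __name__ == "__main__":
    dumps = list_dump_files()
    print(f"{len(dumps)} dump file(s) found in {WATCHDOG_DIR}")
    for p in dumps:
        stamp = datetime.fromtimestamp(p.stat().st_mtime)
        print(f"{stamp:%Y-%m-%d %H:%M:%S}  {p.name}")
```

The timestamps make it easier to match each dump against the time a task failed.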

Matt
Joined: 11 Jan 13
Posts: 216
Credit: 846,538,252
RAC: 0
Message 35838 - Posted: 24 Mar 2014 | 1:00:36 UTC

I've also had this happen to me while on YouTube or other streaming video sites and crunching with the GPU and CPU. After this happened a couple times, I tried lowering the number of CPU cores by one whenever I was going to be watching video for more than a minute or two. I don't know if that's a real fix, but it hasn't happened since.

This has also happened when crunching other GPU projects and switching to GPUGrid tasks that are queued up. The typical example is running PrimeGrid PPS Sieve on each GPU and 5/8 cores on the CPU. The crash comes when one of the PrimeGrid tasks finishes and a GPUGrid task starts up, so that one GPU is on PrimeGrid and the other on GPUGrid, with 5/8 cores still active on CPU tasks. GPUGrid always eats an extra core, so if I set 62.5% of cores for CPU tasks, with GPUGrid on each GPU I will only get 50% of my CPU cores active on CPU tasks. I think somehow this switch between GPU projects and the CPU utilization difference causes the crash. At least, that's what I've observed several times.

I don't know why tasks would continue to fail after reboot, however. I guess my next step would be to try a clean re-installation of the driver.
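
The 5/8, 62.5% and 50% figures above can be checked with a little arithmetic. This is a minimal sketch that assumes the "use at most N% of the processors" setting simply scales the core count and rounds down to a whole core (an assumption, not something confirmed in this thread); both function names are made up for illustration.

```python
import math


def cores_for_percentage(total_cores, percent):
    """Whole cores allowed at a given percentage (assumed: round down, never below 1)."""
    return max(1, math.floor(total_cores * percent / 100))


def percentage_for_cores(total_cores, cores):
    """Percentage setting that corresponds to a given number of active cores."""
    return 100 * cores / total_cores


print(cores_for_percentage(8, 62.5))  # 5
print(percentage_for_cores(8, 4))     # 50.0
```

On an 8-core machine, 62.5% corresponds to 5 cores and 50% to 4, so the drop described above amounts to one CPU task fewer.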

mikey
Message 35855 - Posted: 24 Mar 2014 | 11:44:44 UTC - in response to Message 35838.

What browser do you use to view the YouTube videos? I now use Chrome, as one of the forums I am on won't even show the YouTube link in Firefox. I still use Firefox for most things, like this, but use Chrome for that one forum and for YouTube videos. NO crashes since I changed. I have no clue whether that is the real answer, though.

Matt
Message 35858 - Posted: 24 Mar 2014 | 13:51:04 UTC - in response to Message 35855.

mikey wrote: "What browser do you use to view the YouTube videos?"


I use Waterfox (64-bit version of Firefox).


tomba
Message 35868 - Posted: 24 Mar 2014 | 17:32:03 UTC - in response to Message 35837.

Jacob Klein wrote: "Be sure to include the .dmp dump files within the C:\Windows\LiveKernelReports\WATCHDOG directory, which is where the GPU "Display driver stopped responding" dump files get stored."

Thanks for that Jacob. I have now accumulated 127 dump files!

I identified the potentially offending GPU and installed it into a different mobo slot. Same problem. I installed it into a spare PC with a different Nvidia driver. Same problem.

I've made a warranty claim against PNY.


mikey
Message 35884 - Posted: 25 Mar 2014 | 11:12:48 UTC - in response to Message 35858.

Matt wrote: "I use Waterfox (64-bit version of Firefox)."


I have never heard of that, I will check it out, thanks.

mikey
Message 35885 - Posted: 25 Mar 2014 | 11:14:55 UTC - in response to Message 35868.

I'm thinking that the same failure in a different PC means a bad card too.
