Why am I getting TOO MANY WU's?

Rick A. Sponholz
Joined: 20 Jan 09
Posts: 52
Credit: 2,518,707,115
RAC: 0
Level: Phe
Message 32828 - Posted: 8 Sep 2013 | 15:29:43 UTC

I can't seem to get GPUGRID to follow my BOINC local preferences. I've got the Minimum Work Buffer set to 0.02 days and the Max Additional Work Buffer set to 0.15 days, yet I still get seven WUs for my 4-GPU-core machines, even when there are 5-9 hours left on the 4 WUs that are running. The resource share is only 62.14%, and my app_config is set to run no more than 4 WUs at a time. Am I missing something? Hope someone can help. Thanks in advance, Rick
____________

skgiven
Volunteer moderator
Volunteer tester
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level: His
Message 32830 - Posted: 8 Sep 2013 | 16:01:12 UTC - in response to Message 32828.
Last modified: 8 Sep 2013 | 16:12:50 UTC

The estimated remaining runtime could well be incorrect (due to the recent short and long test work units all being described as 5M Gflops server side).

Any Network Usage restrictions in place?

I can't see your app_config or settings.
So, is your i7-4770's Intel HD4600 GPU being seen by BOINC (it would say on the first page of your BOINC log file)?

If so, then you can exclude the intel_gpu from being 'used' by GPUGrid (you may have already done this).

If the intel_gpu is being seen by BOINC and you have not excluded it from the GPUGrid project, your buffer settings will mean you cache up to 4h of work, for 5 GPUs.

This might help you work out what's going on:
http://www.gpugrid.net/host_app_versions.php?hostid=157727

You can see what your different app performances are listed as for that system.
Should be going by 'Average turnaround time 0.66 days'
____________
FAQ's

HOW TO:
- Opt out of Beta Tests
- Ask for Help

Jacob Klein
Joined: 11 Oct 08
Posts: 1127
Credit: 1,901,927,545
RAC: 0
Level: His
Message 32832 - Posted: 8 Sep 2013 | 16:10:16 UTC - in response to Message 32830.
Last modified: 8 Sep 2013 | 16:12:45 UTC

I thought Intel GPUs were listed as a resource type different from NVIDIA, and so they are naturally not included in any work fetch requests to GPUGrid.

Another possible cause is that the server-side speed estimates aren't yet calculated. It takes 10 completed tasks for a given host (e.g. Computer X) on a given app's (e.g. Long-Run) given plan class (e.g. Cuda 5.5) before the server knows just how fast the host can do the tasks for that host/app/planclass combination.

Until that happens, estimates of speed are... often wrong.

PS: <max_concurrent> in app_config.xml currently limits the number of tasks of that app that can be running at a time, but it does NOT have any effect on how much work is requested. The best way to limit the number of tasks queued up is to limit your work fetch buffers.
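
For anyone who hasn't set one up, a minimal app_config.xml sketch of the kind of limit described here might look like the following. The app name acemdlong is an assumption (check the <app> entries in your own client_state.xml for the real names), and the value 4 simply mirrors Rick's setup. This caps running tasks only; it does not reduce how much work is fetched.

```xml
<!-- Hypothetical app_config.xml, placed in the GPUGrid project folder. -->
<app_config>
  <app>
    <!-- Assumed app name; verify it against your own client_state.xml. -->
    <name>acemdlong</name>
    <!-- At most 4 tasks of this app run at once; work fetch is unaffected. -->
    <max_concurrent>4</max_concurrent>
  </app>
</app_config>
```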

Mine are set to:
Minimum 0.05 days
Max Additional: 0.15 days
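
The same two values can also be written to a local override file instead of being set through the BOINC Manager dialog. This is only a sketch, assuming a global_prefs_override.xml in the BOINC data directory; the numbers are the ones quoted above.

```xml
<!-- Sketch of global_prefs_override.xml (BOINC data directory). -->
<global_preferences>
  <!-- Minimum work buffer: 0.05 days is about 1.2 hours of work. -->
  <work_buf_min_days>0.05</work_buf_min_days>
  <!-- Max additional work buffer: 0.15 days is about 3.6 hours more. -->
  <work_buf_additional_days>0.15</work_buf_additional_days>
</global_preferences>
```

Together that is a target of roughly 0.2 days, or about 4.8 hours, of queued work, which only holds if the runtime estimates are accurate.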

skgiven
Message 32835 - Posted: 8 Sep 2013 | 16:24:58 UTC - in response to Message 32832.
Last modified: 8 Sep 2013 | 16:37:17 UTC

Maybe development versions of Boinc handle the Intel GPU's differently, but I have an exclude_gpu setting for my intel_gpu for GPUGrid in my cc_config file:

    <cc_config>
      <options>
        <exclude_gpu>
          <url>http://www.gpugrid.net/</url>
          <type>intel_gpu</type>
        </exclude_gpu>
      </options>
    </cc_config>


It's been there for a while, so I guess I thought I needed it at some stage (maybe I didn't actually need it though and just read in some forum that I needed it).
____________

Jacob Klein
Message 32836 - Posted: 8 Sep 2013 | 16:29:50 UTC - in response to Message 32835.

I'd be curious what happens when you take that out, and restart BOINC. If it doesn't behave correctly, maybe we can look at it further to see if a bug existed/exists.

Rick A. Sponholz
Message 32837 - Posted: 8 Sep 2013 | 16:41:44 UTC - in response to Message 32830.
Last modified: 8 Sep 2013 | 16:52:11 UTC

Thanks for the feedback, Jacob, & skgiven.
BOINC is only recognizing my 4 NVIDIA processors, NOT the Intel one (btw, I didn't even know my computer had an Intel GPU. Is there a way to use that one too?)

I do not have any network restrictions. My wireless network is connected full time to a cable modem, with backup to a second wireless network connected to my backup DSL modem.

I'll try to answer any additional questions posted. Thanks, Rick
____________

skgiven
Message 32838 - Posted: 8 Sep 2013 | 16:44:34 UTC - in response to Message 32836.
Last modified: 8 Sep 2013 | 16:49:29 UTC

I just deleted my intel_gpu exclusion and immediately got:
GPUGRID - Requesting new tasks for intel_gpu

- Obviously I wouldn't get Intel work, but I thought it might change the scheduler, causing it to cache up more work in some wonky way. Unfortunately I can't test for this because the 2 WUs I have running are 3h and 6h into a run and BOINC thinks they will take another 60h and 70h (due to the estimates).

Rick, you can download the Intel drivers and use it at Einstein.
____________

Jacob Klein
Message 32839 - Posted: 8 Sep 2013 | 16:47:57 UTC - in response to Message 32838.
Last modified: 8 Sep 2013 | 16:58:47 UTC

Right, skgiven: you have an "intel_gpu" resource that GPUGrid doesn't explicitly say it doesn't support (in sched_reply_www.gpugrid.net.xml), and so BOINC asks for work for it.

When it doesn't get any work for it, it should create an incrementing resource backoff, up to a 1-day backoff. You can see that backoff in both the project properties and the work_fetch_debug log statements (if you have that flag set in cc_config.xml).

For CPU, GPUGrid explicitly says "we don't do that resource" (<no_cpu_apps>1</no_cpu_apps> in sched_reply_www.gpugrid.net.xml), and so BOINC never asks GPUGrid for CPU work. That's why the project properties can say "Project has no apps for CPU". But GPUGrid doesn't yet do that for the intel_gpu resource type.

You can keep the GPU Exclusion in place if you'd like.
But I believe BOINC is doing its job of
- periodically asking GPUGrid for work for intel_gpu (since the project may start having tasks for it, since it doesn't explicitly say that it doesn't have apps for it!)
- creating an incrementing resource backoff when it gets no work (since the project may be temporarily out of work for that resource type).

If GPUGrid wanted to save a bit of RPC bandwidth for its users that have intel_gpu resources, I think they could add a line in the sched_reply_www.gpugrid.net.xml file to do so.

Hopefully that makes sense?

PS: You can fix your estimates by closing BOINC, opening client_state.xml, finding the GPUGrid duration correction factor, and setting it to something reasonable like 1.6
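
As a rough guide to what to look for, the entry sits inside the GPUGrid <project> block of client_state.xml. The fragment below is a sketch with all other elements omitted; only <duration_correction_factor> is meant to be edited, and only while BOINC is fully shut down.

```xml
<!-- Fragment of client_state.xml; many sibling elements omitted. -->
<project>
  <master_url>http://www.gpugrid.net/</master_url>
  <project_name>GPUGRID</project_name>
  <!-- Replace the inflated value with something reasonable, e.g. 1.6 -->
  <duration_correction_factor>1.600000</duration_correction_factor>
</project>
```

Keeping a backup copy of client_state.xml before editing is a sensible precaution, since a malformed file can cost you the tasks in your queue.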

Rick A. Sponholz
Message 32841 - Posted: 8 Sep 2013 | 16:55:36 UTC - in response to Message 32838.

Jacob,
Where can I get the Intel drivers? I'd love to get more out of my machines. Thanks in advance, Rick
____________

Jacob Klein
Message 32842 - Posted: 8 Sep 2013 | 16:56:35 UTC - in response to Message 32841.
Last modified: 8 Sep 2013 | 16:57:49 UTC

I think skgiven would have to answer that one.
I'd wager, though, that you go to the Intel website, and look for graphics drivers there.

skgiven
Message 32843 - Posted: 8 Sep 2013 | 16:57:02 UTC - in response to Message 32839.

Makes sense.

Maybe I thought I needed it in place when I was also using app_config (which is an anonymous platform), didn't want my logs saturated, or was just being overcautious. I certainly put some exclusions in due to POEM's needs.
____________

skgiven
Message 32844 - Posted: 8 Sep 2013 | 17:00:27 UTC - in response to Message 32841.

Rick, see the following thread in Einstein,
http://einstein.phys.uwm.edu/forum_thread.php?id=10215&nowrap=true#125627
____________

Rick A. Sponholz
Message 32863 - Posted: 9 Sep 2013 | 20:02:55 UTC - in response to Message 32844.

Thanks for pointing me in the right direction, skgiven. Unfortunately, I guess you can't run the Intel HD processor if you already have two dual processor video cards in your box. It cost me $60 and an 80 mile trip to the computer store to find that out. But thanks for trying to help. Regards, Rick
____________

skgiven
Message 32870 - Posted: 10 Sep 2013 | 14:26:21 UTC - in response to Message 32863.
Last modified: 10 Sep 2013 | 14:34:15 UTC

Don't see why that would be the situation. Did they mention why, or just charge you $60 and say it can't be done?
Have you tried to use the IHD4600 to support your primary monitor?
While I don't have two dual GPU's in any system, I do have 3 discrete GPU's in one system and I can use my iHD4000 GPU to run Einstein.
____________

Rick A. Sponholz
Message 32871 - Posted: 10 Sep 2013 | 14:38:07 UTC - in response to Message 32870.

The techs just told me they could only get two of the three to work at one time (I think they spent most of their time undoing the mess I made trying to do it myself). I only have an HDMI output for the IHD4600, and don't know how to connect it to my KVM switch. I used to be knowledgeable about these things in the 1960's & 70's, when you had to rewire computers to get them to complete a task. Now technology is moving faster than this old man can keep up with, but I'll keep on trying. Thanks for the nuggets of info. I'll keep following the trail until I figure this out. Regards, Rick
____________

Rick A. Sponholz
Message 32878 - Posted: 11 Sep 2013 | 2:12:17 UTC - in response to Message 32870.

Skgiven,
I can't thank you enough for the help getting my IHD to work. Your suggestion to use the IHD for my primary monitor did the trick, and I'm now crunching Einstein@home on it! Now more NVIDIA GPU time for your GPUGRID. Now on to my other two i7 machines to try to get them to work too. Thanks again, Rick
____________

MarkJ
Volunteer moderator
Volunteer tester
Joined: 24 Dec 08
Posts: 738
Credit: 200,909,904
RAC: 0
Level: Leu
Message 32881 - Posted: 11 Sep 2013 | 6:27:08 UTC

I believe you need a load on the Intel VGA (or DVI) port. One way is to plug a monitor in. Another way would be to use a dummy VGA plug. The way most BIOSes are set up, if they don't detect a device on the iGPU and you have another graphics card, they simply disable the iGPU.

Also, if you get the Intel GPU driver through Windows Update, it's not usable for crunching. You need to get the 'full' package from the Intel download centre.

Probably a bit late but maybe it will help someone else.
____________
BOINC blog

Rick A. Sponholz
Message 32962 - Posted: 15 Sep 2013 | 0:29:24 UTC - in response to Message 32881.

Thanks for the input, MarkJ. Luckily, I installed the correct (full) driver directly from Intel's site. I was using the Intel port for my monitor (that was the only way I could get the Intel GPU to work), but maybe I need the dummy plug for drive 0 on my GTX690? I've ordered 3 VGA dummy plugs (1 for each of my i7 machines), and hope to receive them by next weekend. It would be great if that's all that's needed. Thanks again, Rick
____________

werdwerdus
Joined: 15 Apr 10
Posts: 123
Credit: 1,004,473,861
RAC: 0
Level: Met
Message 32984 - Posted: 16 Sep 2013 | 1:51:43 UTC

i have 4 machines running with no monitors and no dummy plugs. 3 of the machines have 2 nvidia cards each and 1 has 1 amd card. they all work fine. i think only the integrated gpu (intel) needs a dummy plug.
____________
XtremeSystems.org - #1 Team in GPUGrid

skgiven
Message 32991 - Posted: 16 Sep 2013 | 10:50:29 UTC - in response to Message 32984.

I have my monitor on the iHD4000, and don't use a dummy plug for the 3 discrete GPU's in the same system (660Ti, 660, HD5850). I can still use the iHD4000 at Einstein and the 3 discrete cards to crunch. The only issue is that I cannot launch the NVidia control panel (but I don't really need to anyway - things are working well enough as is).
____________

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Level: Met
Message 33021 - Posted: 16 Sep 2013 | 21:30:58 UTC

Yes.. only Intel did not yet get their heads around the fact that if they want their GPUs to be used as coprocessors (do they?) they should consider the remote possibility that someone wants to OpenCL-crunch some numbers on them without displaying the result on an attached display.

In short: only Intel requires a display or dummy.

MrS
____________
Scanning for our furry friends since Jan 2002

MJH
Project administrator
Project developer
Project scientist
Joined: 12 Nov 07
Posts: 696
Credit: 27,266,655
RAC: 0
Level: Val
Message 33027 - Posted: 16 Sep 2013 | 21:58:24 UTC - in response to Message 33021.


That sort of thing is usually down to the BIOS - some systems will insist on disabling the iGPU if there's a discrete card present, others let you choose which has priority. Still others let you choose but then bugger things up anyway.

Matt

ExtraTerrestrial Apes
Message 33074 - Posted: 18 Sep 2013 | 20:05:31 UTC - in response to Message 33027.

That's not what I'm talking about. I'm starting from the point where all GPUs are crunching happily ever after.. until you remove the display from the iGPU. At this point, or when the next WU is supposed to start, BOINC can't detect the card as an OpenCL device any more because Intel's driver sent it to bed. The BIOS has nothing to say in this if you're already in Windows and both can run simultaneously in principle.

MrS
____________

Rick A. Sponholz
Message 33151 - Posted: 22 Sep 2013 | 20:37:18 UTC
Last modified: 22 Sep 2013 | 20:37:47 UTC

I am happy to report successful use of two iGPUs (both in 4770's). For me, it required the BIOS setting for the iGPU to be set to "Always On", my monitor attached to my GTX690's, AND a VGA dummy plug attached to the iGPU. They have been running for 5 hrs now without a problem. So far only SETI Beta WUs have run, but I hope to get Einstein WUs soon. I'll report when I get some successfully run. Getting my 3770 iGPU working using the same techniques was a dismal failure. Will keep on trying though. BTW, I'm still having the problem of getting too many GPUGRID WUs at a time (usually 6 WUs for my 4 cores). Hope someone can help me with that too. Thanks to all for the help. Regards, Rick
____________

Jacob Klein
Message 33153 - Posted: 22 Sep 2013 | 21:16:51 UTC - in response to Message 33151.
Last modified: 22 Sep 2013 | 21:27:07 UTC

I think, by default, you'll get "up to 2 GPUGrid tasks per GPU".

The best approach to keep BOINC work fetch from fetching too many tasks is to:
- Make sure you are using the latest supported version of BOINC (7.0.64 currently I believe)
- Limit the buffer settings (My settings are: 0.05 days minimum buffer, 0.15 days max additional buffer)
- Use settings within cc_config.xml (in your data directory) to ensure you specifically exclude any GPUs that you do not want to do GPUGrid work. See http://boinc.berkeley.edu/wiki/Client_configuration. Regarding GPUGrid, this really should only be needed if you have any nVidia GPUs that you do not want to do GPUGrid work on.

For reference, here's my cc_config.xml file, which shows how I have excluded GPUGrid from GPU #2 (in addition to several other GPU exclusions across the 3 GPUs). Note that I have some sections commented out using xml comment blocks.



<cc_config>


<log_flags>
<!-- The 3 flags that are on by default are: file_xfer, sched_ops, task -->

<file_xfer>1</file_xfer>
<file_xfer_debug>0</file_xfer_debug>

<sched_ops>1</sched_ops>
<sched_op_debug>0</sched_op_debug>

<task>1</task>
<task_debug>0</task_debug>

<unparsed_xml>1</unparsed_xml>

<work_fetch_debug>0</work_fetch_debug>
<rr_simulation>0</rr_simulation>
<rrsim_detail>0</rrsim_detail>

<cpu_sched>0</cpu_sched>
<cpu_sched_debug>0</cpu_sched_debug>
<cpu_sched_status>0</cpu_sched_status>
<coproc_debug>1</coproc_debug>

<mem_usage_debug>0</mem_usage_debug>
<checkpoint_debug>1</checkpoint_debug>

<http_debug>0</http_debug>
<http_xfer_debug>0</http_xfer_debug>
<network_status_debug>0</network_status_debug>

<scrsave_debug>1</scrsave_debug>
<notice_debug>0</notice_debug>

<app_msg_receive>0</app_msg_receive>
<app_msg_send>0</app_msg_send>
<async_file_debug>0</async_file_debug>
<benchmark_debug>0</benchmark_debug>
<dcf_debug>0</dcf_debug>
<disk_usage_debug>0</disk_usage_debug>
<priority_debug>0</priority_debug>
<gui_rpc_debug>0</gui_rpc_debug>
<heartbeat_debug>0</heartbeat_debug>
<poll_debug>0</poll_debug>
<proxy_debug>0</proxy_debug>
<slot_debug>0</slot_debug>
<state_debug>0</state_debug>
<statefile_debug>0</statefile_debug>
<suspend_debug>0</suspend_debug>
<time_debug>0</time_debug>
<trickle_debug>0</trickle_debug>

</log_flags>




<options>
<!-- =================================================== TESTING OPTIONS =================================================== -->
<!--
<start_delay>20</start_delay>
<ncpus>8</ncpus>
<exclusive_app>NotepadTest01.exe</exclusive_app>
<exclusive_gpu_app>NotepadTest02.exe</exclusive_gpu_app>
-->

<!-- =================================================== REGULAR OPTIONS =================================================== -->
<report_results_immediately>0</report_results_immediately>
<fetch_on_update>0</fetch_on_update>
<max_event_log_lines>0</max_event_log_lines>

<max_file_xfers>10</max_file_xfers>
<max_file_xfers_per_project>4</max_file_xfers_per_project>

<exclusive_app>iRacingSim.exe</exclusive_app>
<exclusive_app>iRacingSim64.exe</exclusive_app>
<exclusive_app>Aces.exe</exclusive_app>
<exclusive_app>TmForever.exe</exclusive_app>
<exclusive_app>TmForeverLauncher.exe</exclusive_app>

<!-- ===================================================== SETUP GPUS ====================================================== -->
<use_all_gpus>1</use_all_gpus>

<!-- =========================================== SETUP GPU 0: GeForce GTX 660 Ti =========================================== -->
<!--
<ignore_nvidia_dev>0</ignore_nvidia_dev>
-->

<!-- Exclude World Community Grid's "Help Conquer Cancer" GPU app (hcc1) on main display - makes graphics slow, even on 660 Ti -->
<!-- Commenting out, for now, since this round of hcc1 is completed, and next round may not exhibit the issue. -->
<!--
<exclude_gpu>
<url>http://www.worldcommunitygrid.org</url>
<device_num>0</device_num>
<app>hcc1</app>
</exclude_gpu>
-->

<!-- Exclude Einstein/Albert, since work from other GPU projects should give enough work to keep this GPU busy. -->
<exclude_gpu>
<url>http://einstein.phys.uwm.edu/</url>
<device_num>0</device_num>
</exclude_gpu>
<exclude_gpu>
<url>http://albert.phys.uwm.edu/</url>
<device_num>0</device_num>
</exclude_gpu>

<!-- Exclude SETI/Beta, since work from other GPU projects should give enough work to keep this GPU busy. -->
<exclude_gpu>
<url>http://setiathome.berkeley.edu/</url>
<device_num>0</device_num>
</exclude_gpu>
<exclude_gpu>
<url>http://setiweb.ssl.berkeley.edu/beta/</url>
<device_num>0</device_num>
</exclude_gpu>

<!-- Exclude Milkyway@Home, since work from other GPU projects should give enough work to keep this GPU busy. -->
<exclude_gpu>
<url>http://milkyway.cs.rpi.edu/milkyway/</url>
<device_num>0</device_num>
</exclude_gpu>

<!-- =========================================== SETUP GPU 1: GeForce GTX 460 =========================================== -->
<!--
<ignore_nvidia_dev>1</ignore_nvidia_dev>
-->

<!-- Exclude POEM's "POEM++ OpenCL version" GPU app (poemcl) from a second heterogeneous GPU, since it does not work properly -->
<!-- Note: Although 320.18 drivers successfully run smalltest_3, the drivers still do not work right with POEM. -->
<!-- Note: Also, it appears that running POEM only on the GTX 460, does not work. So, it must run on the GTX 660 Ti! -->
<exclude_gpu>
<url>http://boinc.fzk.de/poem/</url>
<device_num>1</device_num>
<app>poemcl</app>
</exclude_gpu>

<!-- Reminder: For GPUGrid.net, if going to run 2-tasks-on-1-GPU, exclude this GPU (it only has 1 GB memory) -->
<!--
<exclude_gpu>
<url>http://www.gpugrid.net</url>
<device_num>1</device_num>
</exclude_gpu>
-->

<!-- Exclude Einstein/Albert, since work from other GPU projects should give enough work to keep this GPU busy. -->
<exclude_gpu>
<url>http://einstein.phys.uwm.edu/</url>
<device_num>1</device_num>
</exclude_gpu>
<exclude_gpu>
<url>http://albert.phys.uwm.edu/</url>
<device_num>1</device_num>
</exclude_gpu>

<!-- Exclude SETI/Beta, since work from other GPU projects should give enough work to keep this GPU busy. -->
<exclude_gpu>
<url>http://setiathome.berkeley.edu/</url>
<device_num>1</device_num>
</exclude_gpu>
<exclude_gpu>
<url>http://setiweb.ssl.berkeley.edu/beta/</url>
<device_num>1</device_num>
</exclude_gpu>

<!-- Exclude Milkyway@Home, since work from other GPU projects should give enough work to keep this GPU busy. -->
<exclude_gpu>
<url>http://milkyway.cs.rpi.edu/milkyway/</url>
<device_num>1</device_num>
</exclude_gpu>

<!-- =========================================== SETUP GPU 2: GeForce GTS 240 =========================================== -->
<!--
<ignore_nvidia_dev>2</ignore_nvidia_dev>
-->

<!-- Exclude World Community Grid's Help Conquer Cancer GPU app -->
<!-- GPU not supported per https://secure.worldcommunitygrid.org/help/viewTopic.do?shortName=GPU#610 -->
<exclude_gpu>
<url>http://www.worldcommunitygrid.org</url>
<device_num>2</device_num>
<app>hcc1</app>
</exclude_gpu>

<!-- Exclude POEM's "POEM++ OpenCL version" GPU app (poemcl) from a second heterogeneous GPU, since it does not work properly -->
<!-- Also, GPU is not supported, as all tasks immediately error out -->
<exclude_gpu>
<url>http://boinc.fzk.de/poem/</url>
<device_num>2</device_num>
<app>poemcl</app>
</exclude_gpu>

<!-- Exclude GPUGrid.net -->
<!-- GPU not supported per http://www.gpugrid.net/forum_thread.php?id=2507 -->
<exclude_gpu>
<url>http://www.gpugrid.net/</url>
<device_num>2</device_num>
</exclude_gpu>

<!-- Exclude Milkyway@Home -->
<!-- GPU not supported, as all tasks immediately error out -->
<exclude_gpu>
<url>http://milkyway.cs.rpi.edu/milkyway/</url>
<device_num>2</device_num>
</exclude_gpu>

</options>


</cc_config>



If done successfully, you'll see the exclusions listed towards the beginning of your Event Log when you restart BOINC. For instance, mine says:

9/21/2013 9:13:51 PM | Einstein@Home | Config: excluded GPU. Type: all. App: all. Device: 0
9/21/2013 9:13:51 PM | Albert@Home | Config: excluded GPU. Type: all. App: all. Device: 0
9/21/2013 9:13:51 PM | SETI@home | Config: excluded GPU. Type: all. App: all. Device: 0
9/21/2013 9:13:51 PM | SETI@home Beta Test | Config: excluded GPU. Type: all. App: all. Device: 0
9/21/2013 9:13:51 PM | Milkyway@Home | Config: excluded GPU. Type: all. App: all. Device: 0
9/21/2013 9:13:51 PM | Poem@Home | Config: excluded GPU. Type: all. App: poemcl. Device: 1
9/21/2013 9:13:51 PM | Einstein@Home | Config: excluded GPU. Type: all. App: all. Device: 1
9/21/2013 9:13:51 PM | Albert@Home | Config: excluded GPU. Type: all. App: all. Device: 1
9/21/2013 9:13:51 PM | SETI@home | Config: excluded GPU. Type: all. App: all. Device: 1
9/21/2013 9:13:51 PM | SETI@home Beta Test | Config: excluded GPU. Type: all. App: all. Device: 1
9/21/2013 9:13:51 PM | Milkyway@Home | Config: excluded GPU. Type: all. App: all. Device: 1
9/21/2013 9:13:51 PM | World Community Grid | Config: excluded GPU. Type: all. App: hcc1. Device: 2
9/21/2013 9:13:51 PM | Poem@Home | Config: excluded GPU. Type: all. App: poemcl. Device: 2
9/21/2013 9:13:51 PM | GPUGRID | Config: excluded GPU. Type: all. App: all. Device: 2
9/21/2013 9:13:51 PM | Milkyway@Home | Config: excluded GPU. Type: all. App: all. Device: 2

Good luck!

Rick A. Sponholz
Message 33160 - Posted: 23 Sep 2013 | 1:31:50 UTC - in response to Message 33153.

Thanks for sharing your setup, Jacob. I too use a cc_config to try to exclude my iGPU from the calculations BOINC makes to get GPUGRID work. My buffer settings are similar to yours: .02 minimum days, .23 additional days. I do use project-specific app_config.xml files to allow multiple projects to run simultaneously on my GTX690's. Other than those, I let BOINC get work for all my projects and allow BOINC access to all my computer capability. I still wish I'd only get 1 GPUGRID WU per CPU, with no new fetch until I drop to the .02 minimum shown in my preferences, and that when it does get work it would only get 1 WU (because 1 long WU is longer than the .25 days of work also shown in my preferences). Oh well, I'll keep trying, but I hate aborting GPUGRID WU's because they've been sitting in my queue unable to run because I got more than 4 WU's. Thanks again, Regards, Rick
____________

Jacob Klein
Joined: 11 Oct 08
Posts: 1127
Credit: 1,901,927,545
RAC: 0
Level
His
Message 33161 - Posted: 23 Sep 2013 | 3:25:28 UTC - in response to Message 33160.

If you want me to do some more research into it, I can. What I'd need you to do is to turn on work_fetch_debug in the cc_config log_flags, then capture a segment where you believe BOINC fetched work erroneously.
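For reference, a minimal sketch of what turning on that flag might look like in cc_config.xml. The work_fetch_debug flag sits inside log_flags; the comment about existing exclude_gpu entries is only a placeholder for whatever is already in the file, not a recommendation to change it.

```xml
<cc_config>
  <log_flags>
    <!-- Logs the client's work-fetch decisions to the Event Log -->
    <work_fetch_debug>1</work_fetch_debug>
  </log_flags>
  <options>
    <!-- Placeholder: keep any existing <exclude_gpu> entries here unchanged -->
  </options>
</cc_config>
```

The client only picks up the change after it re-reads its config files or is restarted. The flag is quite verbose, so it is probably worth setting it back to 0 once the segment has been captured.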

I'm well-versed in reading the BOINC work fetch log files.

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Level
Met
Message 33269 - Posted: 29 Sep 2013 | 20:25:57 UTC

I figured out a reason for overfetch in my BOINC 7.1.18:

I have GPU-Grid set up as backup project, with POEM being primary and providing sporadic work supply. The cache setting is relatively low (0.05 + 0.35 days) and long runs take ~11 hours (0.46 days) on my GPU.

Yet, when POEM ran out, BOINC got 2 GPU-Grids. I increased the time to switch between apps to 15h to make it finish the 1st WU, which had already started, but the other one may sit there for days without starting once POEM supplies work again.

So there I sat thinking why the **** does BOINC fetch 22h worth of work when it's supposed to get at most 1 for a backup project anyway? Why not wait until the 1st WU is almost finished, or at least when the expected remaining runtime falls below the cache setting? This made me switch to short tasks, where the entire problem is just not as large.

Today I think I figured out what's happening: I switched to long-runs again and got a beta WU, where this didn't happen (only fetched one, as it should). The difference on my side between beta and long run? The former is set to 1 GPU, whereas the latter is set to 0.51 to allow up to 3 POEMs alongside the GPU-Grid task, increasing GPU utilization from ~85% to 98%.

Apparently BOINC sees an nVidia GPU that is not fully utilized when the backup project kicks in (correct, it's only 0.51), but does not yet factor in that fetching another WU of 0.51 won't help in this case. On the other hand this behaviour is not too bad, since with correct backup-project behaviour I'd have to wait for the download of the new WU when the current one finishes. And I'd rather miss some early-return bonus than risk an idle GPU ;)

Long story short: do you have less than 1 GPU set per GPU-Grid task?

MrS
____________
Scanning for our furry friends since Jan 2002

Rick A. Sponholz
Joined: 20 Jan 09
Posts: 52
Credit: 2,518,707,115
RAC: 0
Level
Phe
Message 33271 - Posted: 30 Sep 2013 | 3:20:21 UTC - in response to Message 33269.

Yes, I do have GPUGRID set at .75 GPU via app_config, so I can get better utilization from my GTX690's. So, you think BOINC can't figure out how to properly feed the GPU's? Interesting perspective. Thanks for your thoughts MrS. Regards, Rick
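For readers following along, a setup like this (0.75 GPU per task, no more than 4 tasks at once) might look roughly like the app_config.xml sketch below. The app name acemdlong and the cpu_usage value are assumptions for illustration and were not posted in this thread; the actual app names can be read from client_state.xml.

```xml
<app_config>
  <app>
    <!-- Assumed app name for illustration; check client_state.xml for the real one -->
    <name>acemdlong</name>
    <!-- Run at most 4 tasks of this app at the same time -->
    <max_concurrent>4</max_concurrent>
    <gpu_versions>
      <!-- Each task claims 0.75 of a GPU, so a second GPUGRID task cannot share that GPU -->
      <gpu_usage>0.75</gpu_usage>
      <!-- Assumed value: reserve one CPU core per GPU task -->
      <cpu_usage>1.0</cpu_usage>
    </gpu_versions>
  </app>
</app_config>
```

With gpu_usage below 1, the leftover 0.25 of each GPU is too small for another GPUGRID task, which is the situation MrS suspects the work-fetch logic does not account for.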
____________

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Level
Met
Message 33310 - Posted: 1 Oct 2013 | 20:51:00 UTC

Rick, just curious: why are you setting 0.75? Any other project running besides GPU-Grid, like POEM in my case?

SK: to be fair, scheduling is in this case based on "completely utilize the GPUs before anything else", which is in principle just what I want. The failure seems to be a minor bug / unwanted behaviour in that the current logic only says "GPU not fully utilized, so get more work" without taking into account that the new WU won't be able to run due to my settings (and in fact shouldn't start to run before the other one).

MrS
____________
Scanning for our furry friends since Jan 2002
