Message boards : Graphics cards (GPUs) : HOWTO use more the GPU

Barraud Denis
Joined: 2 Sep 08
Posts: 15
Credit: 36,207,656
RAC: 0
Message 23668 - Posted: 26 Feb 2012 | 18:50:29 UTC

I found how to make more use of my GPU:

- Stop BOINC.
- Add SWAN_SYNC=0 to your environment variables.

My cc_config:

<cc_config>
<options>
<ncpus>-1</ncpus>
<report_results_immediately>1</report_results_immediately>
<use_all_gpus>1</use_all_gpus>
<save_stats_days>672</save_stats_days>
</options>
</cc_config>

WARNING: maximize the GPU cooling before using this.

- Restart BOINC.
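
One way to confirm that the variable is visible after the restart is to inspect the environment from a newly started process. The sketch below is only an illustration: `describe_swan_sync` is a hypothetical helper, not part of BOINC or GPUGrid, and it assumes it is run in a session that was started after SWAN_SYNC was set.

```python
import os


def describe_swan_sync(env=None):
    """Return a short status string for the SWAN_SYNC environment variable.

    Hypothetical helper for illustration; it only inspects the environment
    and does not talk to BOINC.
    """
    env = os.environ if env is None else env
    value = env.get("SWAN_SYNC")
    if value is None:
        return "SWAN_SYNC is not set"
    if value.strip() == "0":
        return "SWAN_SYNC=0 (the setting described in this post)"
    return "SWAN_SYNC is set to %r, not 0" % value


if __name__ == "__main__":
    print(describe_swan_sync())
```

A process only sees the environment it was started with, which is why BOINC is stopped before the change and restarted afterwards.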

For my GTX 560 Ti 448:
732 MHz GPU core clock (Gainward factory settings)
950 MHz GPU memory clock
1464 MHz shader clock

70% fan speed (for the noise),
56°C GPU,
PC external intake air at 15°C
Now: 82% GPU load, but 12/12 CPU cores fully used
Before: 73% GPU load, but only 2/12 cores of my Core i7-950

On Fermi cards and GTX 2xx cards, this allows better synchronization between CPU and GPU.

You will also need an app_info.xml file like this:

<app_info>
<app>
<name>acemdlong</name>
<user_friendly_name>Long runs (8-12 hours on fastest card)</user_friendly_name>
<non_cpu_intensive>0</non_cpu_intensive>
<download_url>http://www.ps3grid.net/PS3GRID/download/acemdlong_6.15_windows_intelx86__cuda31</download_url>
</app>

<file_info>
<name>acemdlong_6.15_windows_intelx86__cuda31</name>
<executable/>
</file_info>

<file_info>
<name>cudart32_31_9.dll</name>
<executable/>
</file_info>
<file_info>
<name>cufft32_31_9.dll</name>
<executable/>
</file_info>
<file_info>
<name>tcl85.dll</name>
<executable/>
</file_info>

<app_version>
<app_name>acemdlong</app_name>
<version_num>615</version_num>
<platform>windows_intelx86</platform>
<avg_ncpus>1</avg_ncpus>
<max_ncpus>1</max_ncpus>
<flops>197218977547.224270</flops>
<plan_class>cuda31</plan_class>
<api_version>6.7.0</api_version>
<coproc>
<type>CUDA</type>
<count>1</count>
</coproc>
<gpu_ram>603979776.000000</gpu_ram> <!-- for 576 MB -->
<file_ref>
<file_name>acemdlong_6.15_windows_intelx86__cuda31</file_name>
<main_program/>
</file_ref>
<file_ref>
<file_name>cudart32_31_9.dll</file_name>
<open_name>cudart32_31_9.dll</open_name>
<copy_file/>
</file_ref>
<file_ref>
<file_name>cufft32_31_9.dll</file_name>
<open_name>cufft32_31_9.dll</open_name>
<copy_file/>
</file_ref>
<file_ref>
<file_name>tcl85.dll</file_name>
<open_name>tcl85.dll</open_name>
<copy_file/>
</file_ref>
</app_version>


<app>
<name>acemd2</name>
<user_friendly_name>ACEMD2: GPU molecular dynamics</user_friendly_name>
<non_cpu_intensive>0</non_cpu_intensive>
<download_url>http://www.ps3grid.net/PS3GRID/download/acemd2_6.15_windows_intelx86__cuda31</download_url>
</app>

<file_info>
<name>acemd2_6.15_windows_intelx86__cuda31</name>
<executable/>
</file_info>

<app_version>
<app_name>acemd2</app_name>
<version_num>615</version_num>
<platform>windows_intelx86</platform>
<avg_ncpus>1</avg_ncpus>
<max_ncpus>1</max_ncpus>
<flops>60076679705.760513</flops>
<plan_class>cuda31</plan_class>
<api_version>6.7.0</api_version>
<coproc>
<type>CUDA</type>
<count>1</count>
</coproc>
<gpu_ram>603979776.000000</gpu_ram> <!-- for 576 MB -->
<file_ref>
<file_name>acemd2_6.15_windows_intelx86__cuda31</file_name>
<main_program/>
</file_ref>
<file_ref>
<file_name>tcl85.dll</file_name>
<open_name>tcl85.dll</open_name>
<copy_file/>
</file_ref>
<file_ref>
<file_name>cudart32_31_9.dll</file_name>
<open_name>cudart32_31_9.dll</open_name>
<copy_file/>
</file_ref>
<file_ref>
<file_name>cufft32_31_9.dll</file_name>
<open_name>cufft32_31_9.dll</open_name>
<copy_file/>
</file_ref>
</app_version>

</app_info>
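
A hand-written app_info.xml is easy to get subtly wrong, so a small consistency check can help before restarting BOINC. The sketch below works on a trimmed-down sample in the same shape as the file above. It assumes that every `<file_ref>` should have a matching `<file_info>` entry (my assumption about what BOINC expects, not something stated in this thread), and it converts `<gpu_ram>` from bytes to MiB. The helper names `undeclared_files` and `gpu_ram_mib` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Trimmed-down sample in the same shape as the app_info.xml above.
# The cudart32_31_9.dll file_info entry is left out on purpose.
SAMPLE = """
<app_info>
  <file_info><name>acemd2_6.15_windows_intelx86__cuda31</name><executable/></file_info>
  <file_info><name>tcl85.dll</name><executable/></file_info>
  <app_version>
    <app_name>acemd2</app_name>
    <gpu_ram>603979776.000000</gpu_ram>
    <file_ref><file_name>acemd2_6.15_windows_intelx86__cuda31</file_name><main_program/></file_ref>
    <file_ref><file_name>tcl85.dll</file_name></file_ref>
    <file_ref><file_name>cudart32_31_9.dll</file_name></file_ref>
  </app_version>
</app_info>
"""


def undeclared_files(xml_text):
    """Return file_ref names that have no matching file_info entry."""
    root = ET.fromstring(xml_text)
    declared = {fi.findtext("name") for fi in root.iter("file_info")}
    referenced = [fr.findtext("file_name") for fr in root.iter("file_ref")]
    return [name for name in referenced if name not in declared]


def gpu_ram_mib(xml_text):
    """Return each app_version's gpu_ram converted from bytes to MiB."""
    root = ET.fromstring(xml_text)
    return [float(av.findtext("gpu_ram")) / (1024 * 1024)
            for av in root.iter("app_version")]


if __name__ == "__main__":
    print("undeclared:", undeclared_files(SAMPLE))
    print("gpu_ram (MiB):", gpu_ram_mib(SAMPLE))
```

To check a real file, read its contents with `open(path).read()` and pass the text to the same two functions.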
____________

mikey
Joined: 2 Jan 09
Posts: 286
Credit: 572,655,776
RAC: 118,127
Message 23675 - Posted: 27 Feb 2012 | 15:56:22 UTC
Last modified: 27 Feb 2012 | 15:57:43 UTC

I have a question for you then. I also have a 560 Ti, but mine is only a 1 GB card, so similar but not identical. Here is my question: when I look at one of the units you completed, it seems to take almost as much, or even more, of your CPU to crunch a unit as it does your GPU.

5012411 | 3203986 | 26 Feb 2012 16:51:18 UTC | 27 Feb 2012 2:33:04 UTC | Completed and validated | GPU time 17,018.22 | CPU time 17,023.00 | credits 35,811.00 | Long runs (8-12 hours on fastest card), Anonymous platform (NVIDIA GPU)

When I do a similar unit (I do NOT have an app_info file yet), I get this:
5012803 | 3204275 | 118607 | 26 Feb 2012 17:43:47 UTC | 27 Feb 2012 4:58:23 UTC | Completed and validated | GPU time 28,343.70 | CPU time 1,676.01 | credits 35,811.00 | Long runs (8-12 hours on fastest card) v6.15 (cuda31)

As you can see, I crunch much longer on the GPU than you do, but use much less CPU, and we both get similar credits. Is that what SWAN_SYNC did: sync the two together to crunch faster? If so, how does that work when I am also crunching CPU workunits from another project? Would I be required to use one less CPU so GPUGrid can use it?
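
The comparison above can be put into numbers. This is a minimal sketch using only the two GPU/CPU time pairs quoted in this post; `cpu_share` is a hypothetical helper name. The two tasks ran on different cards, so the run-time ratio is not a measurement of SWAN_SYNC alone.

```python
def cpu_share(gpu_time, cpu_time):
    """Return CPU time as a fraction of GPU time for one task."""
    return cpu_time / gpu_time


# Values copied from the two task rows quoted above.
with_swan_sync = cpu_share(gpu_time=17018.22, cpu_time=17023.00)
without_swan_sync = cpu_share(gpu_time=28343.70, cpu_time=1676.01)
run_time_ratio = 28343.70 / 17018.22

if __name__ == "__main__":
    print("CPU share with SWAN_SYNC=0: {:.1%}".format(with_swan_sync))
    print("CPU share without it:       {:.1%}".format(without_swan_sync))
    print("GPU time ratio (slower / faster): {:.2f}".format(run_time_ratio))
```

In other words, the first task kept a CPU thread busy for essentially its whole run, while the second used a CPU thread for only a small fraction of its run.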

Retvari Zoltan
Joined: 20 Jan 09
Posts: 2335
Credit: 16,178,080,749
RAC: 1,989
Message 23676 - Posted: 27 Feb 2012 | 16:54:46 UTC - in response to Message 23675.
Last modified: 27 Feb 2012 | 16:58:59 UTC

I have a question for you then. I also have a 560 Ti, but mine is only a 1 GB card, so similar but not identical...

NVIDIA gave two very different cards the same name. Yours is a CC 2.1 card, so GPUGrid can use only 2/3 of its shaders. The other poster's card is CC 2.0, so all of its shaders can be used.
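
As a rough worked example of that 2/3 figure: the shader counts below (384 for the GTX 560 Ti and 448 for the GTX 560 Ti 448) come from NVIDIA's published specifications rather than from this thread, and `usable_shaders` is a hypothetical helper. Clock speeds and other differences are ignored.

```python
from fractions import Fraction

# (total shaders, fraction usable by GPUGrid according to the post above)
CARDS = {
    "GTX 560 Ti (CC 2.1)": (384, Fraction(2, 3)),
    "GTX 560 Ti 448 (CC 2.0)": (448, Fraction(1)),
}


def usable_shaders(total, usable_fraction):
    """Return the number of shaders the application can keep busy."""
    return int(total * usable_fraction)


if __name__ == "__main__":
    for name, (total, fraction) in CARDS.items():
        print(name, "->", usable_shaders(total, fraction), "of", total)
```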

As you can see, I crunch much longer on the GPU than you do, but use much less CPU, and we both get similar credits. Is that what SWAN_SYNC did: sync the two together to crunch faster?

Yes. Your card will not crunch as fast as the other one even with the SWAN_SYNC=0 setting, because yours is CC 2.1.

If so how does that work when I would also be crunching cpu workunits from another project? Would I be required to use one less cpu so gpugrid can use it?

Yes. When you set the SWAN_SYNC environment variable to 0, GPUGrid will use one full CPU thread, no matter how many CPU tasks are running. In this case, it is advisable to reduce the number of CPU cores used by the BOINC client by the number of GPU tasks running (or more: hyperthreaded CPUs can handle twice as many CPU threads as they have FPUs. Depending on the project, this could overwhelm the CPU, or the system could run out of RAM). See the FAQ section.

mikey
Joined: 2 Jan 09
Posts: 286
Credit: 572,655,776
RAC: 118,127
Message 23687 - Posted: 28 Feb 2012 | 13:19:15 UTC - in response to Message 23676.

VERY helpful, thank you!!!

MarkJ
Volunteer moderator
Volunteer tester
Joined: 24 Dec 08
Posts: 738
Credit: 200,909,904
RAC: 0
Message 23689 - Posted: 28 Feb 2012 | 17:53:31 UTC - in response to Message 23668.

Do not use the <ncpus> option.

You have told BOINC that your machine is a single-core machine. That works real well on an 8-core machine: only one usable core for BOINC. If you want to restrict the number of usable CPUs, use the percentage setting under CPU prefs.

We suggest that people free up one CPU core for each model they are running, so on my 8-core machine I simply set the percentage to 87.5, which frees a CPU core for the GPU app to use. You don't need the app_info file at all (unless you want to run more than one task per GPU, which isn't what yours specifies). Also, with an app_info file you do not get updated apps from the project; keeping them up to date is your responsibility.

If you want the most productive box for GPUGrid, all you need to do is:
- Free one CPU core per running instance of GPUGrid, using the percentage of CPUs
- Set the SWAN_SYNC environment variable
- Set your cache to a low setting; mine runs on 0.01 days

There is no need for cc_config or app_info files, and you can run other projects on your remaining CPU cores.
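
The percentage that frees one core per GPU task is simple arithmetic. A minimal sketch follows; `cpu_percentage` is a hypothetical helper, and the result is the value to enter in the processor usage preference.

```python
def cpu_percentage(total_cores, gpu_tasks):
    """Return the CPU percentage that leaves one core free per GPU task."""
    if not 0 <= gpu_tasks < total_cores:
        raise ValueError("gpu_tasks must be between 0 and total_cores - 1")
    return 100.0 * (total_cores - gpu_tasks) / total_cores


if __name__ == "__main__":
    # The 8-core machine from the post above, running one GPUGrid task.
    print(cpu_percentage(total_cores=8, gpu_tasks=1))
```

For the 8-core machine with one GPUGrid task this gives the 87.5 mentioned above.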


Barraud Denis
Joined: 2 Sep 08
Posts: 15
Credit: 36,207,656
RAC: 0
Message 23693 - Posted: 28 Feb 2012 | 21:05:22 UTC - in response to Message 23689.

That should be read as minus one:

<ncpus>-1</ncpus>

This tells BOINC to use all CPU cores.

____________

MarkJ
Volunteer moderator
Volunteer tester
Joined: 24 Dec 08
Posts: 738
Credit: 200,909,904
RAC: 0
Message 23700 - Posted: 29 Feb 2012 | 10:27:31 UTC - in response to Message 23693.
Last modified: 29 Feb 2012 | 10:28:22 UTC

The default is to use all cores anyway, so there is no reason to set it.

You should free up a CPU core using the percentage setting (i.e. Tools -> Computing Prefs -> Processor usage tab). That way, if you aren't running GPUGrid work and want to run all processors for some other project, you can easily set it back to 100%. That is much easier than editing a cc_config file.

Basically, when you set the SWAN_SYNC variable, it tells the app to poll to see whether the card has completed the last instruction. Because it is polling, it takes some CPU power. You'll find your run times will drop compared to using all CPU cores for other stuff.
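
The CPU cost of polling can be illustrated with a small analogy. This is not how the GPUGrid application is implemented. It is a sketch in which a timer thread stands in for "the card has finished", and the two wait functions compare busy polling with a blocking wait. All names are hypothetical.

```python
import threading
import time


def cpu_time_while_waiting(wait, delay=0.2):
    """Return process CPU seconds consumed while waiting for a simulated task.

    `wait` receives a threading.Event that a timer sets after `delay`
    seconds, standing in for "the GPU has finished".
    """
    done = threading.Event()
    timer = threading.Timer(delay, done.set)
    start = time.process_time()
    timer.start()
    wait(done)
    used = time.process_time() - start
    timer.join()
    return used


def spin_wait(done):
    # Polling: keep asking "finished yet?" without ever sleeping.
    while not done.is_set():
        pass


def blocking_wait(done):
    # Blocking: sleep until notified, using almost no CPU meanwhile.
    done.wait()


if __name__ == "__main__":
    print("polling  CPU s:", round(cpu_time_while_waiting(spin_wait), 3))
    print("blocking CPU s:", round(cpu_time_while_waiting(blocking_wait), 3))
```

The polling variant burns CPU time for the whole wait, which matches the full CPU thread per GPU task described above.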

skgiven
Volunteer moderator
Volunteer tester
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Message 23833 - Posted: 8 Mar 2012 | 20:17:41 UTC - in response to Message 23700.
Last modified: 9 Mar 2012 | 23:33:16 UTC

<report_results_immediately>1</report_results_immediately>


At GPUGrid this is still recommended for BOINC versions 6.10.60 and 6.12.34.

Although a server-side configuration asking clients to report tasks automatically was implemented several months ago, this function does not work with the recommended BOINC clients. On recent alpha BOINC clients (7.0.3 and above) it appears to work, but unless you are a tester you really shouldn't be running these alpha clients.

candido
Joined: 12 Jun 11
Posts: 12
Credit: 131,379,999
RAC: 0
Message 24216 - Posted: 2 Apr 2012 | 19:31:04 UTC - in response to Message 23833.
Last modified: 2 Apr 2012 | 19:32:31 UTC

Yes, it is very much recommended. If you want to get the bonus points, reporting immediately can help a great deal!
