HOWTO use more the GPU

Barraud Denis Joined: 2 Sep 08 Posts: 15 Credit: 36,207,656 RAC: 0	Message 23668 - Posted: 26 Feb 2012 | 18:50:29 UTC
	I found out how to make more use of my GPU:
	- Stop BOINC.
	- Add SWAN_SYNC=0 to your environment variables.
	- Restart BOINC.
	WARNING: maximize the GPU cooling before using this.
	My cc_config:
	<cc_config>
	  <options>
	    <ncpus>-1</ncpus>
	    <report_results_immediately>1</report_results_immediately>
	    <use_all_gpus>1</use_all_gpus>
	    <save_stats_days>672</save_stats_days>
	  </options>
	</cc_config>
	For my GTX 560 Ti 448:
	- 732 MHz GPU core clock (Gainward factory settings)
	- 950 MHz GPU memory clock
	- 1464 MHz shader clock
	- 70% fan speed (for the noise), 56°C GPU, PC external intake air at 15°C
	Now: 82% GPU load, but 12/12 = a full CPU core.
	Before: 73% GPU load, but only 2/12 of a core on my Core i7-950.
	For Fermi cards and GTX 2xx cards, this allows better synchronization between the CPU and the GPU.
	You will also need an app_info.xml file like this:
	<app_info>
	  <app>
	    <name>acemdlong</name>
	    <user_friendly_name>Long runs (8-12 hours on fastest card)</user_friendly_name>
	    <non_cpu_intensive>0</non_cpu_intensive>
	    <download_url>http://www.ps3grid.net/PS3GRID/download/acemdlong_6.15_windows_intelx86__cuda31</download_url>
	  </app>
	  <file_info>
	    <name>acemdlong_6.15_windows_intelx86__cuda31</name>
	    <executable/>
	  </file_info>
	  <file_info>
	    <name>cudart32_31_9.dll</name>
	    <executable/>
	  </file_info>
	  <file_info>
	    <name>cufft32_31_9.dll</name>
	    <executable/>
	  </file_info>
	  <file_info>
	    <name>tcl85.dll</name>
	    <executable/>
	  </file_info>
	  <app_version>
	    <app_name>acemdlong</app_name>
	    <version_num>615</version_num>
	    <platform>windows_intelx86</platform>
	    <avg_ncpus>1</avg_ncpus>
	    <max_ncpus>1</max_ncpus>
	    <flops>197218977547.224270</flops>
	    <plan_class>cuda31</plan_class>
	    <api_version>6.7.0</api_version>
	    <coproc>
	      <type>CUDA</type>
	      <count>1</count>
	    </coproc>
	    <gpu_ram>603979776.000000</gpu_ram> <!-- for 576 MB -->
	    <file_ref>
	      <file_name>acemdlong_6.15_windows_intelx86__cuda31</file_name>
	      <main_program/>
	    </file_ref>
	    <file_ref>
	      <file_name>cudart32_31_9.dll</file_name>
	      <open_name>cudart32_31_9.dll</open_name>
	      <copy_file/>
	    </file_ref>
	    <file_ref>
	      <file_name>cufft32_31_9.dll</file_name>
	      <open_name>cufft32_31_9.dll</open_name>
	      <copy_file/>
	    </file_ref>
	    <file_ref>
	      <file_name>tcl85.dll</file_name>
	      <open_name>tcl85.dll</open_name>
	      <copy_file/>
	    </file_ref>
	  </app_version>
	  <app>
	    <name>acemd2</name>
	    <user_friendly_name>ACEMD2: GPU molecular dynamics</user_friendly_name>
	    <non_cpu_intensive>0</non_cpu_intensive>
	    <download_url>http://www.ps3grid.net/PS3GRID/download/acemd2_6.15_windows_intelx86__cuda31</download_url>
	  </app>
	  <file_info>
	    <name>acemd2_6.15_windows_intelx86__cuda31</name>
	    <executable/>
	  </file_info>
	  <app_version>
	    <app_name>acemd2</app_name>
	    <version_num>615</version_num>
	    <platform>windows_intelx86</platform>
	    <avg_ncpus>1</avg_ncpus>
	    <max_ncpus>1</max_ncpus>
	    <flops>60076679705.760513</flops>
	    <plan_class>cuda31</plan_class>
	    <api_version>6.7.0</api_version>
	    <coproc>
	      <type>CUDA</type>
	      <count>1</count>
	    </coproc>
	    <gpu_ram>603979776.000000</gpu_ram> <!-- for 576 MB -->
	    <file_ref>
	      <file_name>acemd2_6.15_windows_intelx86__cuda31</file_name>
	      <main_program/>
	    </file_ref>
	    <file_ref>
	      <file_name>tcl85.dll</file_name>
	      <open_name>tcl85.dll</open_name>
	      <copy_file/>
	    </file_ref>
	    <file_ref>
	      <file_name>cudart32_31_9.dll</file_name>
	      <open_name>cudart32_31_9.dll</open_name>
	      <copy_file/>
	    </file_ref>
	    <file_ref>
	      <file_name>cufft32_31_9.dll</file_name>
	      <open_name>cufft32_31_9.dll</open_name>
	      <copy_file/>
	    </file_ref>
	  </app_version>
	</app_info>
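Two details in the post above are easy to get wrong: whether SWAN_SYNC=0 is actually visible in the environment, and whether the gpu_ram value matches the "576" in the comment. The following is a minimal Python sketch for checking both. The helper names `swan_sync_enabled` and `mib_to_bytes` are made up for illustration and are not part of BOINC or GPUGrid. Note that it only inspects the environment of the process that runs it; BOINC itself sees the variable only if it was started after the variable was set.

```python
import os


def swan_sync_enabled(env=None):
    """Return True if SWAN_SYNC is set to "0" in the given environment mapping.

    Hypothetical helper: defaults to the environment of the current process.
    """
    env = os.environ if env is None else env
    return env.get("SWAN_SYNC") == "0"


def mib_to_bytes(mib):
    """Convert mebibytes to bytes (1 MiB = 1024 * 1024 bytes)."""
    return mib * 1024 * 1024


if __name__ == "__main__":
    print("SWAN_SYNC=0 visible to this process:", swan_sync_enabled())
    # The app_info.xml above uses 603979776 for a "576" megabyte limit.
    print("576 MiB in bytes:", mib_to_bytes(576))
```

The byte count shows that the value in the file corresponds to 576 binary megabytes (576 x 1024 x 1024).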

mikey Joined: 2 Jan 09 Posts: 292 Credit: 2,374,953,615 RAC: 13,680,231	Message 23675 - Posted: 27 Feb 2012 | 15:56:22 UTC Last modified: 27 Feb 2012 | 15:57:43 UTC
	I have a question for you then. I also have a 560 Ti, but mine is only a 1 GB card, so similar but not identical. Here is my question: when I look at one of the units you completed, it seems to take almost as much of your CPU, or even more, to crunch a unit as it does your GPU:
	5012411 3203986 26 Feb 2012 | 16:51:18 UTC 27 Feb 2012 | 2:33:04 UTC Completed and validated (gpu time) 17,018.22 (cpu time) 17,023.00 (credits) 35,811.00 Long runs (8-12 hours on fastest card) Anonymous platform (NVIDIA GPU)
	When I do a similar unit (I do NOT have an app_info file yet), I get this:
	5012803 3204275 118607 26 Feb 2012 | 17:43:47 UTC 27 Feb 2012 | 4:58:23 UTC Completed and validated (gpu time) 28,343.70 (cpu time) 1,676.01 (credits) 35,811.00 Long runs (8-12 hours on fastest card) v6.15 (cuda31)
	As you can see, I crunch much longer on the GPU than you do, but use much less CPU, and we both get similar credits. Is that what SWAN_SYNC does: sync the two together to crunch faster? If so, how does that work when I am also crunching CPU work units from another project? Would I be required to use one less CPU so GPUGrid can use it?
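The two results quoted above can be put side by side with a little arithmetic. This is only a rough Python sketch using the four times from the post; the variable and function names are made up for illustration. Because the two cards are similar but not identical, the run-time ratio does not isolate the effect of SWAN_SYNC.

```python
def cpu_share(gpu_seconds, cpu_seconds):
    """CPU time spent per second of GPU run time."""
    return cpu_seconds / gpu_seconds


# Times in seconds, copied from the two task results in the post.
with_swan_sync = {"gpu": 17018.22, "cpu": 17023.00}
without_swan_sync = {"gpu": 28343.70, "cpu": 1676.01}

# How much longer the second task ran on the GPU.
run_time_ratio = without_swan_sync["gpu"] / with_swan_sync["gpu"]

share_with = cpu_share(with_swan_sync["gpu"], with_swan_sync["cpu"])
share_without = cpu_share(without_swan_sync["gpu"], without_swan_sync["cpu"])

if __name__ == "__main__":
    print(f"run time ratio (without / with): {run_time_ratio:.2f}")
    print(f"CPU seconds per GPU second, with SWAN_SYNC=0: {share_with:.2f}")
    print(f"CPU seconds per GPU second, without:          {share_without:.2f}")
```

In the first result the CPU was busy for essentially the whole run; in the second it was busy for only a small fraction of it.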

Retvari Zoltan Joined: 20 Jan 09 Posts: 2343 Credit: 16,201,255,749 RAC: 470	Message 23676 - Posted: 27 Feb 2012 | 16:54:46 UTC - in response to Message 23675. Last modified: 27 Feb 2012 | 16:58:59 UTC
	"I have a question for you then. I also have a 560 Ti, but mine is only a 1 GB card, so similar but not identical..."
	NVIDIA gave two very different cards the same name. Yours is a CC 2.1 card, so GPUGrid can use only 2/3 of its shaders. The other guy's card is CC 2.0, so all of its shaders can be used.
	"As you can see, I crunch much longer on the GPU than you do, but use much less CPU, and we both get similar credits. Is that what SWAN_SYNC does: sync the two together to crunch faster?"
	Yes. Your card will not crunch as fast as the other one even with the SWAN_SYNC=0 setting, because yours is CC 2.1.
	"If so, how does that work when I am also crunching CPU work units from another project? Would I be required to use one less CPU so GPUGrid can use it?"
	Yes. When you set the SWAN_SYNC environment variable to 0, GPUGrid will use one full CPU thread, no matter how many CPU tasks are running. In this case, it is advisable to reduce the number of CPU cores used by the BOINC client by the number of GPU tasks running (or by more: hyperthreaded CPUs can handle twice as many CPU threads as they have FPUs, so depending on the project, running a task on every thread could overwhelm the CPU, or the system could run out of RAM). See the FAQ section.
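To put the 2/3 figure into numbers, here is a small Python sketch. The shader counts are an assumption taken from NVIDIA's published specifications (384 for the regular GTX 560 Ti, 448 for the GTX 560 Ti 448); they are not stated in this thread. The 2/3 fraction for CC 2.1 comes from the post above, and `usable_shaders` is a made-up helper name.

```python
from fractions import Fraction

# Fraction of shaders GPUGrid can use, per the post above.
USABLE_FRACTION = {
    "2.0": Fraction(1),      # CC 2.0: all shaders usable
    "2.1": Fraction(2, 3),   # CC 2.1: only 2/3 usable
}


def usable_shaders(total_shaders, compute_capability):
    """Number of shaders the app can use on a card (illustrative only)."""
    return int(total_shaders * USABLE_FRACTION[compute_capability])


if __name__ == "__main__":
    # Assumed shader counts: 384 (GTX 560 Ti), 448 (GTX 560 Ti 448).
    print("GTX 560 Ti (CC 2.1):    ", usable_shaders(384, "2.1"))
    print("GTX 560 Ti 448 (CC 2.0):", usable_shaders(448, "2.0"))
```

Under these assumptions the CC 2.1 card ends up with far fewer usable shaders than the CC 2.0 card, even though the model names are almost the same.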

mikey Joined: 2 Jan 09 Posts: 292 Credit: 2,374,953,615 RAC: 13,680,231	Message 23687 - Posted: 28 Feb 2012 | 13:19:15 UTC - in response to Message 23676.
	VERY helpful, thank you!!!

MarkJ Volunteer moderator Volunteer tester Joined: 24 Dec 08 Posts: 738 Credit: 200,909,904 RAC: 0	Message 23689 - Posted: 28 Feb 2012 | 17:53:31 UTC - in response to Message 23668.
	Do not use the <ncpus> option. You have told BOINC that your machine is a single-core machine. That works real well on an 8-core machine: only one usable core for BOINC.
	If you want to restrict the number of usable CPUs, use the percentage setting under CPU prefs. We suggest that people free up one CPU core for each model they are running, so on my 8-core machine I simply set the percentage to 87.5, which frees a CPU core that the GPU app will be able to use.
	You don't need the app_info file at all (unless you wanted to run more than one task per GPU, which isn't what yours says). Also, with an app_info you do not get updated apps from the project; it's your responsibility to update them.
	If you want the most productive box for GPUGrid, all you need to do is:
	- Free one CPU core per running instance of GPUGrid, using the percentage of CPUs setting
	- Set the SWAN_SYNC environment variable
	- Set your cache to a low setting; mine run on 0.01 days
	No need for cc_config or app_info files, and you can run other projects on your remaining CPU cores.
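The 87.5 figure follows from freeing one core out of eight. A minimal Python sketch of that arithmetic, with a made-up helper name (`cpu_percentage` is not a BOINC function), could look like this:

```python
def cpu_percentage(logical_cpus, gpu_tasks):
    """Percentage of CPUs to allow so that one core stays free per GPU task."""
    if not 0 <= gpu_tasks < logical_cpus:
        raise ValueError("gpu_tasks must be between 0 and logical_cpus - 1")
    return 100.0 * (logical_cpus - gpu_tasks) / logical_cpus


if __name__ == "__main__":
    # 8 cores, 1 GPUGrid task: the example from the post above.
    print(cpu_percentage(8, 1))
    # 8 cores, 2 GPUGrid tasks (for example, two cards).
    print(cpu_percentage(8, 2))
```

The result is the value to enter for the percentage of processors in the computing preferences.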

Barraud Denis Joined: 2 Sep 08 Posts: 15 Credit: 36,207,656 RAC: 0	Message 23693 - Posted: 28 Feb 2012 | 21:05:22 UTC - in response to Message 23689.
	It should be read as <ncpus>-1</ncpus> (minus one); this is to use all CPU cores.

MarkJ Volunteer moderator Volunteer tester Joined: 24 Dec 08 Posts: 738 Credit: 200,909,904 RAC: 0	Message 23700 - Posted: 29 Feb 2012 | 10:27:31 UTC - in response to Message 23693. Last modified: 29 Feb 2012 | 10:28:22 UTC
	"It should be read as <ncpus>-1</ncpus>; this is to use all CPU cores."
	The default is to use all cores anyway, so there is no reason to use it. You should free up a CPU core using the percentage setting (i.e. Tools -> Computing Prefs -> Processor usage tab). That way, if you aren't running GPUGrid work and want to run all processors for some other project, you can easily set it back to 100%. That is much easier than editing a cc_config file.
	Basically, when you set the SWAN_SYNC variable, it tells the app to poll to see whether the card has completed the last instruction. Because it is polling, it takes some CPU power to do. You'll find your run times will drop compared to using all CPU cores for other stuff.
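The trade-off described above (polling costs CPU time but notices completion sooner) can be illustrated with a toy Python sketch. This is not how the GPUGrid app is implemented; it only simulates "the card has finished" with a timer thread, and every name in it is made up for illustration.

```python
import threading
import time


def busy_poll(done, timeout=5.0):
    """Spin until `done` is set, as a polling app would. Returns the check count."""
    checks = 0
    deadline = time.monotonic() + timeout
    while not done.is_set():
        checks += 1
        if time.monotonic() > deadline:
            raise TimeoutError("simulated work never finished")
    return checks


def blocking_wait(done, timeout=5.0):
    """Sleep until `done` is set; uses almost no CPU while waiting."""
    if not done.wait(timeout):
        raise TimeoutError("simulated work never finished")


def simulate(wait_fn, work_seconds=0.05):
    """Run one simulated 'GPU instruction' and measure CPU and wall time spent waiting."""
    done = threading.Event()
    timer = threading.Timer(work_seconds, done.set)  # stands in for the GPU
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    timer.start()
    result = wait_fn(done)
    cpu_used = time.process_time() - cpu_start
    wall_used = time.perf_counter() - wall_start
    timer.join()
    return result, cpu_used, wall_used


if __name__ == "__main__":
    checks, cpu_busy, wall_busy = simulate(busy_poll)
    _, cpu_block, wall_block = simulate(blocking_wait)
    print(f"busy poll:     {checks} checks, {cpu_busy:.3f} s CPU, {wall_busy:.3f} s wall")
    print(f"blocking wait: {cpu_block:.3f} s CPU, {wall_block:.3f} s wall")
```

On a typical run the busy-poll variant uses CPU time roughly equal to the wall time of the wait, which matches the "one full CPU thread" behaviour described in this thread, while the blocking variant uses close to none.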

skgiven Volunteer moderator Volunteer tester Joined: 23 Apr 09 Posts: 3968 Credit: 1,995,359,260 RAC: 0	Message 23833 - Posted: 8 Mar 2012 | 20:17:41 UTC - in response to Message 23700. Last modified: 9 Mar 2012 | 23:33:16 UTC
	"<report_results_immediately>1</report_results_immediately>"
	At GPUGrid this is still recommended for the 6.10.60 and 6.12.34 versions of BOINC. Although server-side configurations asking for tasks to be reported automatically were implemented several months ago, this function does not work with the recommended BOINC clients. On recent alpha BOINC clients (7.0.3 and above) it appears to work, but unless you are a tester you really shouldn't be running these alpha BOINC clients.
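To confirm that a cc_config.xml actually contains this option, the file can be parsed rather than read by eye. The following is a minimal Python sketch using only the standard library; `option_enabled` is a made-up helper, and the XML string here is a cut-down stand-in for a real cc_config.xml, whose location depends on the BOINC installation.

```python
import xml.etree.ElementTree as ET

# Cut-down example; in practice, read the text of your own cc_config.xml.
CC_CONFIG = """<cc_config>
  <options>
    <report_results_immediately>1</report_results_immediately>
  </options>
</cc_config>"""


def option_enabled(xml_text, name):
    """Return True if <options><name> exists and is set to 1."""
    root = ET.fromstring(xml_text)
    node = root.find(f"options/{name}")
    return node is not None and (node.text or "").strip() == "1"


if __name__ == "__main__":
    print(option_enabled(CC_CONFIG, "report_results_immediately"))
```

A parse error from `ET.fromstring` is also a useful signal: it means the file is not well-formed XML.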

candido Joined: 12 Jun 11 Posts: 12 Credit: 150,069,999 RAC: 0	Message 24216 - Posted: 2 Apr 2012 | 19:31:04 UTC - in response to Message 23833. Last modified: 2 Apr 2012 | 19:32:31 UTC
	Yes, it is very much recommended. If you want to get the bonus points, reporting immediately can help a great deal!
