Message boards : Number crunching : Custom app_info.xml for these pesky low-utilisation 3EKO_paola tasks

Author Message
Luke Formosa
Joined: 11 Jul 12
Posts: 32
Credit: 33,298,777
RAC: 0
Message 26697 - Posted: 25 Aug 2012 | 12:46:36 UTC

In the previous thread (this one: http://www.gpugrid.net/forum_thread.php?id=3116#26688) we established that the 3EKO Paola tasks have low GPU utilisation. In the case of my GTX 670 on Windows 7, I'm only seeing 30% load, whereas the Nathan tasks, for example, return 92% load.

Can someone help me make a custom app_info.xml to run multiple instances of 3EKO Paola tasks on each GPU? Ideally we'd make it so that it only changes the <coproc> count of Paola tasks, without affecting Nathan, Noelia, or normal Paola tasks (the non-3EKO ones that use the whole GPU properly on their own), and without affecting the short (acemd) tasks either.

Ideally, please post a complete app_info.xml - I need a starting point - but once I get it running I'll be happy to run some experiments and post back to this thread on which parameters give the greatest throughput.

Luke Formosa
Joined: 11 Jul 12
Posts: 32
Credit: 33,298,777
RAC: 0
Message 26705 - Posted: 25 Aug 2012 | 23:19:33 UTC

Hmm. Silence. OK, I did all the thinking and testing myself. Here's a working app_info.xml that will run 2 tasks on one GPU. You're welcome :)

<app_info>
<app>
<name>acemdlong</name>
<user_friendly_name>Long runs (8-12 hours on fastest card)</user_friendly_name>
</app>
<file_info>
<name>acemd.2562.cuda42</name>
<executable/>
</file_info>
<file_info>
<name>cudart32_42_9.dll</name>
<executable/>
</file_info>
<file_info>
<name>cufft32_42_9.dll</name>
<executable/>
</file_info>
<file_info>
<name>tcl85.dll</name>
<executable/>
</file_info>
<app_version>
<app_name>acemdlong</app_name>
<version_num>615</version_num>
<platform>windows_intelx86</platform>
<avg_ncpus>0.36</avg_ncpus>
<max_ncpus>1.000000</max_ncpus>
<flops>6.0e11</flops>
<plan_class>cuda42</plan_class>
<api_version>6.7.0</api_version>
<coproc>
<type>CUDA</type>
<count>.5</count>
</coproc>
<gpu_ram>1073741824.00000</gpu_ram>
<file_ref>
<file_name>acemd.2562.cuda42</file_name>
<main_program/>
</file_ref>
<file_ref>
<file_name>cudart32_42_9.dll</file_name>
<open_name>cudart32_42_9.dll</open_name>
<copy_file/>
</file_ref>
<file_ref>
<file_name>cufft32_42_9.dll</file_name>
<open_name>cufft32_42_9.dll</open_name>
<copy_file/>
</file_ref>
<file_ref>
<file_name>tcl85.dll</file_name>
<open_name>tcl85.dll</open_name>
<copy_file/>
</file_ref>
</app_version>
</app_info>

Now, I tried to test it, but sadly I kept being given only Nathan tasks. Nathan tasks DO NOT WORK WELL when running two at a time with this app_info. They make the entire system very, very slow and unresponsive, and the GPU load is actually only 40%!! Running one Nathan alone uses 90% GPU. So if you receive Nathan tasks, you'll have to stop two from running at the same time: either suspend one of them in BOINC, or exit BOINC, open app_info.xml, change the line <count>.5</count> to <count>1</count>, save and close app_info.xml, and start BOINC again. You won't lose data - BOINC only discards tasks the first time it finds an app_info.xml; changing values within the file (while BOINC is not running) doesn't make it discard tasks.
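
For reference, the only part of the file you ever need to touch when switching modes is this block; as I understand BOINC's scheduling, the fractional count is what tells it each task reserves half a GPU, so it overlaps two:

<coproc>
<type>CUDA</type>
<count>.5</count> <!-- each task claims half the GPU; set to 1 for one task per GPU -->
</coproc>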

I have yet to receive any more 3EKO Paola tasks, so I can't test this app_info.xml with them. But if anyone reading this could try, please report back to us if things improve.

flashawk
Joined: 18 Jun 12
Posts: 297
Credit: 3,572,627,986
RAC: 0
Message 26708 - Posted: 26 Aug 2012 | 18:15:48 UTC - in response to Message 26705.

How did you make out? I noticed that they started sending out another batch of 3EKO Paola tasks.

I had one take over 27 hours on my GTX550Ti. This must be close to what it was like years ago on this project (I'm assuming) with the older cards - it could have taken days to finish a task for not many points, whereas now the new cards rip through them, in some cases in just a few hours.
____________

Luke Formosa
Joined: 11 Jul 12
Posts: 32
Credit: 33,298,777
RAC: 0
Message 26709 - Posted: 27 Aug 2012 | 0:40:58 UTC - in response to Message 26708.

Heh. Well, using the posted app_info does indeed run two tasks at the same time, but it's a disaster with Nathan tasks. I never actually received any Paola tasks to test it with. And yesterday I basically migrated to a project that gives me much, much more points per hour on my GTX 670 (once you make a few modifications, that is!). But thanks to your message I'm going to try to resume GPUgrid and get two Paola tasks.

I'll report back if I manage. Most likely, though, I'm not going to wait for them to finish; I'll time how long it takes to reach, say, 5% or 10% and extrapolate from there to estimate the total completion time.

Luke Formosa
Joined: 11 Jul 12
Posts: 32
Credit: 33,298,777
RAC: 0
Message 26710 - Posted: 27 Aug 2012 | 0:52:55 UTC - in response to Message 26708.

OK, I got one Nathan and one Paola. Running them both at the same time, with no other projects, I get:

CPU usage 20%
GPU load 99%

After 5 minutes:
Nathan progress: 1.546% (estimated 323 minutes or 5.39 hours to completion - usually takes me 4.84 hours when running alone)
Paola progress: 0.386% (estimated 1295 minutes or 21.5 hours to completion - usually takes me 11.51 hours when running it alone)

Now I'll try to get two Paola tasks running at the same time.

nate
Joined: 6 Jun 11
Posts: 124
Credit: 2,928,865
RAC: 0
Message 26711 - Posted: 27 Aug 2012 | 9:39:55 UTC - in response to Message 26710.

Thanks for keeping up on this and trying a few ideas, Luke. I'm looking at this again today and hopefully I can come up with some good news. :-/

Luke Formosa
Joined: 11 Jul 12
Posts: 32
Credit: 33,298,777
RAC: 0
Message 26712 - Posted: 27 Aug 2012 | 10:04:59 UTC - in response to Message 26711.

You're welcome :)

I couldn't get a second Paola to try. I kept aborting Nathans and updating in the hopes of being given a second Paola, but the server kept sending me more Nathan tasks. I aborted a total of 3 of them and gave up for now.

flashawk
Joined: 18 Jun 12
Posts: 297
Credit: 3,572,627,986
RAC: 0
Message 26719 - Posted: 27 Aug 2012 | 19:31:51 UTC - in response to Message 26712.

I would try backing up both BOINC folders; you can reuse the same ones as many times as you like without aborting new tasks.

neilp62
Joined: 23 Nov 10
Posts: 14
Credit: 7,871,750,536
RAC: 0
Message 26725 - Posted: 28 Aug 2012 | 6:21:47 UTC

Luke:

Thank you for your efforts on this issue!

I'd received three 3EKO Paola tasks almost in a row, so I adapted your app_info.xml to my 4GB GTX 680 setup early this morning. Before, my single Paola task's average CPU/GPU util was ~70%/32%, with an average GPU runtime of ~17.5 hours (over four Paolas), regardless of whether my other three CPU cores were processing other BOINC WUs or idle (the GTX 680 runs at a factory clock of 1188MHz but typically underclocks itself; the CPU is an older Intel Quad Q9300 @ 2.5GHz). After restarting BOINC with GPUGRID <count>.5</count>, I received two 3EKO Paolas. I am now about 70% complete on both Paola tasks, and am averaging ~80% CPU core util per Paola, with total GPU util of ~56%. Total GPU runtime is projected at ~16.75 hrs per Paola, with total GPU mem usage of ~1400 MB.

Looks like my setup sees a slight runtime improvement with two 3EKO Paola GPU tasks in parallel. I'm a lot happier with the effective throughput than with one at a time, and will tolerate having to swap app_info <count> values as needed. I only wish the project would permit three 3EKO Paolas at once :)

Let me know if there are any other data points I can provide that might be helpful. Thanks again!

Luke Formosa
Joined: 11 Jul 12
Posts: 32
Credit: 33,298,777
RAC: 0
Message 26726 - Posted: 28 Aug 2012 | 10:39:18 UTC - in response to Message 26719.

> I would try backing up both BOINC folders; you can reuse the same ones as many times as you like without aborting new tasks.


I never thought of that. Does it work for you? Which two folders do you mean when you say "both" BOINC folders?

I don't know if something will go wrong at upload time because of the external files BOINC keeps its task lists in. But your suggestion would be good to try for someone who doesn't mind potentially losing two tasks' worth of work (lately I've become one of those people, because in the course of testing I've been aborting more tasks than I've been completing hehe).

Luke Formosa
Joined: 11 Jul 12
Posts: 32
Credit: 33,298,777
RAC: 0
Message 26727 - Posted: 28 Aug 2012 | 10:54:17 UTC - in response to Message 26725.

> Luke:
>
> Thank you for your efforts on this issue!
>
> I'd received three 3EKO Paola tasks almost in a row, so I adapted your app_info.xml to my 4GB GTX 680 setup early this morning. Before, my single Paola task's average CPU/GPU util was ~70%/32%, with an average GPU runtime of ~17.5 hours (over four Paolas), regardless of whether my other three CPU cores were processing other BOINC WUs or idle (the GTX 680 runs at a factory clock of 1188MHz but typically underclocks itself; the CPU is an older Intel Quad Q9300 @ 2.5GHz). After restarting BOINC with GPUGRID <count>.5</count>, I received two 3EKO Paolas. I am now about 70% complete on both Paola tasks, and am averaging ~80% CPU core util per Paola, with total GPU util of ~56%. Total GPU runtime is projected at ~16.75 hrs per Paola, with total GPU mem usage of ~1400 MB.
>
> Looks like my setup sees a slight runtime improvement with two 3EKO Paola GPU tasks in parallel. I'm a lot happier with the effective throughput than with one at a time, and will tolerate having to swap app_info <count> values as needed. I only wish the project would permit three 3EKO Paolas at once :)
>
> Let me know if there are any other data points I can provide that might be helpful. Thanks again!


Welcome :). What adaptations did you have to make to run it on your 680? Would be good to compare notes.

The throughput does rise considerably when you run multiple tasks. This is why I hate GPUgrid's enforced limit of 2 tasks. In fact, I'm no longer participating in GPUgrid. I'm running a project right now that gives me 10 points per second with an experimental app_info.xml; I've tried up to 9 concurrent tasks on my single GTX 670, but the CPU becomes the limiting factor. GPUgrid was giving me 1, 2 or 4 points per second depending on the task. My plan is to keep this up until I take the #1 spot on the country-wide Malta leaderboard (should only take a few days now; I've been #2 for a month because the #1 guy earns 300,000 points a day and I was only earning 375,000 with GPUgrid, whereas I'm now earning 800,000 a day just by changing project!!). Then I'll temporarily switch back to running GPUGrid until I get my 10 million point badge (should take 13 days). And then I'll either give up GPUgrid for good or split the time between the two projects.

I wonder if the two task limit is per-GPU? Would someone with many GPUs in their computer tell us? Because if so, we could perhaps trick BOINC into reporting a GPU count of 2 so that we could get four tasks. Another idea would be to move all the tasks out of the BOINC project folder, then do a scheduler request so that BOINC downloads two new tasks, then close BOINC and move the two old tasks back into the folder - hopefully leaving you with four tasks in the queue.

Perhaps there's also a way to run many instances of BOINC on the same PC, where each one would act as a separate host with an individual limit of 2 tasks, so the maximum would be 2 x the number of BOINC instances.
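
If BOINC's command-line switches work the way I think they do, a second client instance could be started along these lines (untested by me; the second instance would need its own data directory and its own GUI RPC port, and the folder D:\BOINC2 here is just a made-up example):

boinc.exe --allow_multiple_clients --dir D:\BOINC2 --gui_rpc_port 31417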

What I know for sure is that the 2 task limit is not something they're willing to change. I had posted about this a few weeks ago and that was the answer I got. They need speedy returns of tasks (because each one builds on the results of the previous one), so that's how they make it happen.

Snow Crash
Joined: 4 Apr 09
Posts: 450
Credit: 539,316,349
RAC: 0
Message 26731 - Posted: 28 Aug 2012 | 14:33:59 UTC

> I wonder if the two task limit is per-GPU?

Yes, it is 2 WUs per GPU.

Perhaps the project would consider using the BETA queue for PAOLA tasks so we could isolate these types and get multiple WUs going per GPU to help speed up the research? I know this requires a bit of work both for the project and for crunchers, only some of whom will understand enough to get an app_info working properly. I for one would gladly participate on my 3 GPUs (GTX670, GTX660Ti, GTX480). I might even consider resurrecting my GTX295 just for these tasks :-)

Side note: my PC crashed hard last night and managed to drop the entries for GPUGrid entirely, so while GPUGrid thinks I have tasks in process, I actually don't. I think I may have to wait until they clear (no reply) before I can get more, as I don't feel like changing the machine name just to trick the project into giving me more right away.
____________
Thanks - Steve

Luke Formosa
Joined: 11 Jul 12
Posts: 32
Credit: 33,298,777
RAC: 0
Message 26732 - Posted: 28 Aug 2012 | 14:47:51 UTC - in response to Message 26731.


> Perhaps the project would consider using the BETA queue for PAOLA tasks so we could isolate these types and get multiple WUs going per GPU to help speed up the research? I know this requires a bit of work both for the project and for crunchers, only some of whom will understand enough to get an app_info working properly. I for one would gladly participate on my 3 GPUs (GTX670, GTX660Ti, GTX480). I might even consider resurrecting my GTX295 just for these tasks :-)


I don't know if putting Paola tasks in the beta queue is such a good idea, because they'd lose a lot of throughput. I don't think beta is activated by default when someone signs up, and I suspect many crunchers don't tick that checkbox because of the higher risk of getting a computation error and receiving no points for beta tasks.

The ideal solution is for them to create a third queue (right alongside acemd2 and acemdlong) and set the <coproc> count to 0.5 from THEIR side. Then everyone would be running two Paolas without having to mess around with app_info.xml files themselves!

Luke Formosa
Joined: 11 Jul 12
Posts: 32
Credit: 33,298,777
RAC: 0
Message 26733 - Posted: 28 Aug 2012 | 15:05:43 UTC

Just tried removing the currently running tasks from the project folder in BOINC and restarting BOINC to get it to download more tasks. It didn't work: BOINC just re-downloads the two tasks I removed from the folder (presumably because client_state.xml still lists them). We need to find another way of getting more than 2 tasks downloaded.

Snow Crash
Joined: 4 Apr 09
Posts: 450
Credit: 539,316,349
RAC: 0
Message 26735 - Posted: 28 Aug 2012 | 17:37:21 UTC

I'm not part of the project team, and I'm not saying I don't think you have good ideas - I really would like to see better efficiency - but I'm aiming for an approach that I think may actually have a chance of happening.

I have been convinced over many conversations that the administrative overhead associated with adding another queue (beyond the effort just to get it up and running) and the general slowdown of the entire project from increasing the max WU count are a combination that simply isn't in the best interest of the long-term goals of the project. This run of WUs is limited and will be through the pipeline in the near future, but decisions taken to handle specific edge conditions will need to be managed for the long haul, and they make any future upgrade all that more difficult to manage too.

I suggested using BETA as that queue already exists, and I believe its users are the very ones more likely to be running high-end cards and to have the ability to use a custom app_info file successfully. If this was in place and we brought it back to our respective teams and actively recruited high-end cards / team members, we CAN get the WUs processed for Paola.
____________
Thanks - Steve

flashawk
Joined: 18 Jun 12
Posts: 297
Credit: 3,572,627,986
RAC: 0
Message 26738 - Posted: 28 Aug 2012 | 19:27:07 UTC - in response to Message 26726.

> > I would try backing up both BOINC folders; you can reuse the same ones as many times as you like without aborting new tasks.
>
> I never thought of that. Does it work for you? Which two folders do you mean when you say "both" BOINC folders?
>
> I don't know if something will go wrong at upload time because of the external files BOINC keeps its task lists in. But your suggestion would be good to try for someone who doesn't mind potentially losing two tasks' worth of work (lately I've become one of those people, because in the course of testing I've been aborting more tasks than I've been completing hehe).


BOINC creates 2 main folders when you install it: one in Program Files, and the other's location depends on your OS, but it's on the same drive and labeled Data. I copy and paste everything out of those 2 folders into 2 other folders that I created, compress them, and store them on my data server. If anything goes wrong, you can copy and paste it all back as many times as you like.
____________

Luke Formosa
Joined: 11 Jul 12
Posts: 32
Credit: 33,298,777
RAC: 0
Message 26740 - Posted: 28 Aug 2012 | 20:02:34 UTC - in response to Message 26738.

> > > I would try backing up both BOINC folders; you can reuse the same ones as many times as you like without aborting new tasks.
> >
> > I never thought of that. Does it work for you? Which two folders do you mean when you say "both" BOINC folders?
> >
> > I don't know if something will go wrong at upload time because of the external files BOINC keeps its task lists in. But your suggestion would be good to try for someone who doesn't mind potentially losing two tasks' worth of work (lately I've become one of those people, because in the course of testing I've been aborting more tasks than I've been completing hehe).
>
> BOINC creates 2 main folders when you install it: one in Program Files, and the other's location depends on your OS, but it's on the same drive and labeled Data. I copy and paste everything out of those 2 folders into 2 other folders that I created, compress them, and store them on my data server. If anything goes wrong, you can copy and paste it all back as many times as you like.


Ahh, you mean the BOINC program and BOINC data folders. Yes, that might work! Thanks for the idea.

Luke Formosa
Joined: 11 Jul 12
Posts: 32
Credit: 33,298,777
RAC: 0
Message 26741 - Posted: 28 Aug 2012 | 20:11:44 UTC - in response to Message 26735.

> I'm not part of the project team, and I'm not saying I don't think you have good ideas - I really would like to see better efficiency - but I'm aiming for an approach that I think may actually have a chance of happening.
>
> I have been convinced over many conversations that the administrative overhead associated with adding another queue (beyond the effort just to get it up and running) and the general slowdown of the entire project from increasing the max WU count are a combination that simply isn't in the best interest of the long-term goals of the project. This run of WUs is limited and will be through the pipeline in the near future, but decisions taken to handle specific edge conditions will need to be managed for the long haul, and they make any future upgrade all that more difficult to manage too.
>
> I suggested using BETA as that queue already exists, and I believe its users are the very ones more likely to be running high-end cards and to have the ability to use a custom app_info file successfully. If this was in place and we brought it back to our respective teams and actively recruited high-end cards / team members, we CAN get the WUs processed for Paola.


Of course, there's always option 3 - they could let us have more than two workunits in the queue! Maybe just add a tickbox somewhere, and explain why it's not on by default and why they want it limited to 2 tasks... but at least give us the option for extraordinary circumstances such as these :)

neilp62
Joined: 23 Nov 10
Posts: 14
Credit: 7,871,750,536
RAC: 0
Message 26747 - Posted: 29 Aug 2012 | 7:08:35 UTC - in response to Message 26727.

> What adaptations did you have to make to run it on your 680? Would be good to compare notes.


Just small changes specific to my rig based on my GPUGRID sched_request file, i.e. values for <avg_ncpus>, <flops>, <gpu_ram>, etc. I have not attempted any further optimization.
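
To illustrate the kind of edits I mean, these are the lines in question (the values here are placeholders rather than my actual numbers - take your own from the GPUGRID sched_request file in your BOINC data folder):

<avg_ncpus>0.80</avg_ncpus>
<flops>1.0e12</flops>
<gpu_ram>2147483648.000000</gpu_ram>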

spout23
Joined: 8 May 12
Posts: 2
Credit: 304,211,223
RAC: 0
Message 26802 - Posted: 6 Sep 2012 | 16:49:18 UTC

Luke,

Where does the app_info.xml file go in the ProgramData directory that the GPUGRID program works from?

I know it does not work in the cc_config file in the BOINC program folder.

So in other words, where are you or other people putting the file to have 2 WUs working at once?

Thanks
spout23

dskagcommunity
Joined: 28 Apr 11
Posts: 456
Credit: 817,865,789
RAC: 0
Message 26803 - Posted: 6 Sep 2012 | 17:12:37 UTC

It should go in the right project folder, "gpugrid", two folders down from the folder where cc_config is contained, if I remember right.
____________
DSKAG Austria Research Team: http://www.research.dskag.at



Luke Formosa
Joined: 11 Jul 12
Posts: 32
Credit: 33,298,777
RAC: 0
Message 26804 - Posted: 6 Sep 2012 | 18:19:22 UTC - in response to Message 26802.

My BOINC data folder is a little different because I have my main windows installation on an SSD and all data-related things on a hard disk, but the directory is something like this for me:

D:\HDDProgramFiles\BOINC Data\projects\gpugrid\app_info.xml

What you need to do is the following:
Since BOINC discards all tasks the first time it finds an app_info, you need to finish your current ones (assuming you don't want to just abort them). To do this:
1. Open BOINC, click on GPUgrid in the "Projects" tab, and click "No new tasks"
2. Wait for all current tasks to end
3. Exit BOINC

Then, you need to make the app_info file itself. So:
1. Open Notepad
2. Copy and paste the contents from my first post into Notepad
3. Save it as app_info.xml - with a .xml extension. Notepad saves with a .txt extension by default; worst case, if you save it as .txt, you can just rename the file, removing the ".txt" and making it ".xml".

Save the file within the project folder.

In a default install, the BOINC data folder will be:

Windows 98/SE/ME: C:\Windows\All Users\BOINC\ or C:\Windows\Profiles\All Users\BOINC\ (*)
Windows 2000/XP: C:\Documents and Settings\All Users\Application Data\BOINC\ (*)
Windows Vista/Windows 7: C:\ProgramData\BOINC\ (*)
Linux: wherever you unpack it/BOINC/
Macintosh OS X: /Library/Application Support/BOINC/

(*) This directory may well be hidden, so either put the path to it directly into Windows Explorer, or instruct Windows Explorer to show hidden files and folders.

Once you open the BOINC data folder, you'll find a folder called "projects". Open that, and you should have a folder for every BOINC project running on that PC. Open the gpugrid folder, and put the app_info.xml in there. Then start up BOINC, set it to allow new tasks, and look at the notices and task list to check that it was a success.
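
So on a default Windows 7 install, the final layout ends up something like this (the project folder name can vary - on some installs it's named after the project URL, e.g. "www.gpugrid.net"):

C:\ProgramData\BOINC\
  cc_config.xml
  projects\
    gpugrid\
      app_info.xml
      (application and task files)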

Luke Formosa
Joined: 11 Jul 12
Posts: 32
Credit: 33,298,777
RAC: 0
Message 26805 - Posted: 6 Sep 2012 | 18:22:27 UTC - in response to Message 26802.

> Luke,
>
> Where does the app_info.xml file go in the ProgramData directory that the GPUGRID program works from?
>
> I know it does not work in the cc_config file in the BOINC program folder.
>
> So in other words, where are you or other people putting the file to have 2 WUs working at once?
>
> Thanks
> spout23


My previous message was aimed at someone who didn't know where the BOINC data folder was. Since you already know where cc_config is, the short answer for you is: open the "projects" folder, which is in the same directory as cc_config, then open the "gpugrid" folder.

In other words: BOINC_data_folder\projects\gpugrid\app_info.xml

spout23
Joined: 8 May 12
Posts: 2
Credit: 304,211,223
RAC: 0
Message 26809 - Posted: 6 Sep 2012 | 21:50:57 UTC

Well, what can I say - the file does work in the place you said. But I would recommend that only the most expert people among us on the grid do this procedure. I'm a superuser on the computer, but I'm not an expert.

Thanks for your Help.

Snow Crash
Joined: 4 Apr 09
Posts: 450
Credit: 539,316,349
RAC: 0
Message 26825 - Posted: 8 Sep 2012 | 11:00:27 UTC

It appears there are 2 app versions being distributed on the LONG queue ... 615 and 616. I added a new app_version section for 616 and it seems to be working fine.
____________
Thanks - Steve

Luke Formosa
Joined: 11 Jul 12
Posts: 32
Credit: 33,298,777
RAC: 0
Message 26833 - Posted: 8 Sep 2012 | 17:32:46 UTC - in response to Message 26825.

> It appears there are 2 app versions being distributed on the LONG queue ... 615 and 616. I added a new app_version section for 616 and it seems to be working fine.


Hmm. So you duplicated the entire section between <app_version> and </app_version>? Could you post the complete app_info?

I always wonder whether it would be possible to have separate sections of the app_info.xml for Nathans and Paolas, so that we could run only one Nathan and two Paolas automatically. Any idea how this could be done?

Snow Crash
Joined: 4 Apr 09
Posts: 450
Credit: 539,316,349
RAC: 0
Message 26834 - Posted: 8 Sep 2012 | 17:50:13 UTC - in response to Message 26833.

The app_info only knows about the application; it has no idea what types of WUs it is running.

I've now started running this on my i7-920 Win7 x64 in conjunction with CPU tasks from other projects. I left one more core free than necessary, and something odd is happening: instead of splitting the core in half for the 2 instances of ACEMD, it is using a bit more than 1/2 for each, and I'm currently running at a pace to complete 2 in 16 hours total!!! I have no idea what's going on, but I'm not going to question it too much.

Here is the app_info I am currently using. A couple of points to note: I upped the MEM by simply changing the leading 1 to a 2, which is not exact but close enough, and I have also upped the avg num cpus. I was using this same file on my i7-980 Win7 x64 rig with the 660Ti, and while I was getting better efficiency overall, it took 30 hours to complete 2 WUs.

This morning, after discarding quite a few NATE WUs to capture more PAOLAs for the 660, BOINC started saying I had a bad app_info??? Whatever ... it's working great on the 670, so I'm going to leave it alone.

<app_info>
<app>
<name>acemdlong</name>
<user_friendly_name>Long runs (8-12 hours on fastest card)</user_friendly_name>
</app>
<file_info>
<name>acemd.2562.cuda42</name>
<executable/>
</file_info>
<file_info>
<name>cudart32_42_9.dll</name>
<executable/>
</file_info>
<file_info>
<name>cufft32_42_9.dll</name>
<executable/>
</file_info>
<file_info>
<name>tcl85.dll</name>
<executable/>
</file_info>
<app_version>
<app_name>acemdlong</app_name>
<version_num>615</version_num>
<platform>windows_intelx86</platform>
<avg_ncpus>0.50</avg_ncpus>
<max_ncpus>1.000000</max_ncpus>
<flops>6.0e11</flops>
<plan_class>cuda42</plan_class>
<api_version>6.7.0</api_version>
<coproc>
<type>CUDA</type>
<count>.5</count>
</coproc>
<gpu_ram>2073741824.00000</gpu_ram>
<file_ref>
<file_name>acemd.2562.cuda42</file_name>
<main_program/>
</file_ref>
<file_ref>
<file_name>cudart32_42_9.dll</file_name>
<open_name>cudart32_42_9.dll</open_name>
<copy_file/>
</file_ref>
<file_ref>
<file_name>cufft32_42_9.dll</file_name>
<open_name>cufft32_42_9.dll</open_name>
<copy_file/>
</file_ref>
<file_ref>
<file_name>tcl85.dll</file_name>
<open_name>tcl85.dll</open_name>
<copy_file/>
</file_ref>
</app_version>
<app_version>
<app_name>acemdlong</app_name>
<version_num>616</version_num>
<platform>windows_intelx86</platform>
<avg_ncpus>0.50</avg_ncpus>
<max_ncpus>1.000000</max_ncpus>
<flops>6.0e11</flops>
<plan_class>cuda42</plan_class>
<api_version>6.7.0</api_version>
<coproc>
<type>CUDA</type>
<count>.5</count>
</coproc>
<gpu_ram>2073741824.00000</gpu_ram>
<file_ref>
<file_name>acemd.2562.cuda42</file_name>
<main_program/>
</file_ref>
<file_ref>
<file_name>cudart32_42_9.dll</file_name>
<open_name>cudart32_42_9.dll</open_name>
<copy_file/>
</file_ref>
<file_ref>
<file_name>cufft32_42_9.dll</file_name>
<open_name>cufft32_42_9.dll</open_name>
<copy_file/>
</file_ref>
<file_ref>
<file_name>tcl85.dll</file_name>
<open_name>tcl85.dll</open_name>
<copy_file/>
</file_ref>
</app_version>
</app_info>
____________
Thanks - Steve

Snow Crash
Joined: 4 Apr 09
Posts: 450
Credit: 539,316,349
RAC: 0
Message 26843 - Posted: 9 Sep 2012 | 21:05:36 UTC

So ... I figured out what the "bad app_info" message really was (I should have read the message more closely) ... it was really telling me that I had already pulled 18 WUs for the day, which means I won't get any more. I just had this happen on my 670, so I'm giving up on this venture and going back to crunching 1 at a time :sad_panda:
____________
Thanks - Steve

Retvari Zoltan
Joined: 20 Jan 09
Posts: 2343
Credit: 16,201,255,749
RAC: 7,520
Message 26844 - Posted: 9 Sep 2012 | 21:37:36 UTC
Last modified: 9 Sep 2012 | 21:38:42 UTC

I'm receiving these PAOLA_3EKO tasks less and less frequently lately, so this problem will be gone soon.
I hope the upcoming long tasks won't have similar problems.

flashawk
Joined: 18 Jun 12
Posts: 297
Credit: 3,572,627,986
RAC: 0
Message 26845 - Posted: 10 Sep 2012 | 0:08:02 UTC - in response to Message 26844.

> I'm receiving these PAOLA_3EKO tasks less and less frequently lately, so this problem will be gone soon.
> I hope the upcoming long tasks won't have similar problems.


I've gotten 2 additional PAOLAs with different names acting like the 3EKOs; it looks as though there are 2 more runs coming out of the gate.
____________

Luke Formosa
Joined: 11 Jul 12
Posts: 32
Credit: 33,298,777
RAC: 0
Message 26856 - Posted: 10 Sep 2012 | 17:58:15 UTC - in response to Message 26834.

> The app_info only knows about the application; it has no idea what types of WUs it is running.
>
> I've now started running this on my i7-920 Win7 x64 in conjunction with CPU tasks from other projects. I left one more core free than necessary, and something odd is happening: instead of splitting the core in half for the 2 instances of ACEMD, it is using a bit more than 1/2 for each, and I'm currently running at a pace to complete 2 in 16 hours total!!! I have no idea what's going on, but I'm not going to question it too much.
>
> Here is the app_info I am currently using. A couple of points to note: I upped the MEM by simply changing the leading 1 to a 2, which is not exact but close enough, and I have also upped the avg num cpus. I was using this same file on my i7-980 Win7 x64 rig with the 660Ti, and while I was getting better efficiency overall, it took 30 hours to complete 2 WUs.
>
> This morning, after discarding quite a few NATE WUs to capture more PAOLAs for the 660, BOINC started saying I had a bad app_info??? Whatever ... it's working great on the 670, so I'm going to leave it alone.

> [app_info.xml snipped - identical to the file posted in Message 26834 above]


Thanks for that. Today I finally reached the #1 spot on the Malta leaderboard (using a more time-profitable project), so as of now I have once again started running GPUGRID. I'm using your app_info - changing the avg_ncpus was a good idea. BOINC now runs two GPUgrid tasks and 7 CPU tasks at once on my computer, which makes sense, as the two GPU tasks at 0.50 avg_ncpus each reserve one full core between them (before, it used to try to run 8 CPU tasks, but one of them was always running slower than the other 7 - likely due to the way Windows schedules CPU time between tasks when it has an excess of active threads...).

I'm currently running a Nathan and a Paola simultaneously. The GPU load is 97-99% (a record for me). Computer is just as responsive as usual, which is good. GPU RAM usage is 1700MB, which is also a record as it's usually under 700MB with other BOINC GPU projects.

I'll try to remember to report back when the tasks complete, with the completion times and the calculated points rate. Then we can compare the points per second of running two at once against the points per second of running just one Nathan alone.
