Message boards : Number crunching : IBUCH WU

Bedrich Hajek
Joined: 28 Mar 09
Posts: 485
Credit: 10,659,123,466
RAC: 15,258,829
Message 19694 - Posted: 27 Nov 2010 | 15:15:55 UTC

Why are the IBUCH work units so much slower (50% longer, often more) than the KASHIF and GIANNI units?

skgiven
Volunteer moderator
Volunteer tester
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Message 19700 - Posted: 27 Nov 2010 | 17:57:16 UTC - in response to Message 19694.

More CPU-bound scripting operations are performed on the IBUCH tasks. This can't be stopped mid-run/experiment but in the future such scripting will be cut down.

Bedrich Hajek
Joined: 28 Mar 09
Posts: 485
Credit: 10,659,123,466
RAC: 15,258,829
Message 19708 - Posted: 29 Nov 2010 | 1:51:54 UTC - in response to Message 19700.

Interesting! On a 480 card the IBUCH units use about 30% of the card's capacity, while the other units use about 50%. Would reducing the scripting operations across the board increase speed (reduce completion time) for all the units?

skgiven
Volunteer moderator
Volunteer tester
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Message 19709 - Posted: 29 Nov 2010 | 12:25:00 UTC - in response to Message 19708.

On multi-GPU systems I think so, but it depends on the number of cards and CPU cores. I have 3 IBUCH tasks and 1 GIANNI_DHFR task running, and all 4 GPUs are at between 66% and 70% GPU utilization. The last GIANNI_DHFR task ran at over 80% when paired with 2 KASHIF and 1 IBUCH.

If I stop crunching 1 IBUCH task, the utilization rises to 72% for the other IBUCH tasks and 80% for the GIANNI_DHFR.
When I free up another IBUCH task, the GIANNI_DHFR goes up to 85% and the remaining IBUCH task rises slightly to 75%.

Phenom II quad, 4GB RAM, Vista, swan_sync=0, no CPU tasks running.

This would not be the case on your Phenom(tm) II X6 1090T, so long as you did not use CPU cores for crunching. In fact you could probably use 3 without any issues.

Joined: 23 May 09
Posts: 121
Credit: 327,240,386
RAC: 66,341
Message 19720 - Posted: 30 Nov 2010 | 20:41:20 UTC

My last two IBUCH WUs finished successfully, but the output file shows this entry:
MDIO ERROR: cannot open file "restart.coor"
One of them is
I've seen that I cannot shut down my PC; the WU starts from the beginning. Is that correct?

Retvari Zoltan
Joined: 20 Jan 09
Posts: 2343
Credit: 16,201,255,749
RAC: 0
Message 19723 - Posted: 30 Nov 2010 | 21:35:23 UTC - in response to Message 19720.
Last modified: 30 Nov 2010 | 21:41:16 UTC

My last two IBUCH WUs finished successfully, but the output file shows this entry:
MDIO ERROR: cannot open file "restart.coor"

This MDIO error message appears in every task. I guess this is not a real error.

One of them is
I've seen that I cannot shut down my PC; the WU starts from the beginning. Is that correct?

I can shut down my PC, and the next time the WU's processing starts from the last checkpoint. It only loses a couple of minutes of calculating time. If your WU starts from the beginning, there is some problem with your BOINC client installation. (I guess that Win 7's User Account Control interferes with the BOINC client.)

Bedrich Hajek
Joined: 28 Mar 09
Posts: 485
Credit: 10,659,123,466
RAC: 15,258,829
Message 19775 - Posted: 5 Dec 2010 | 4:38:32 UTC

Just finished several IBUCH long WUs on both of my computers. I have pretty fast cards, a 285 and a 480, and they took about 16 hours to complete these units, with a lot of scripting operations. The main issue is that if you download 2 of these units back to back on a computer with a single card, or 3 units with 2 cards, the combined completion time of 16 hours plus 16 hours is greater than 24 hours, so there goes the 50% bonus for the second unit. If the scripting were reduced, or there were no back-to-back downloads, then this would not be an issue. I don't usually complain about something this trivial, but I think this should be addressed.
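
The arithmetic behind the lost bonus can be sketched roughly as below. This is only an illustration, assuming every task is downloaded at the same moment, each card works through its queue one task at a time, and the bonus window is the 24 hours mentioned above. The function name and the greedy card assignment are hypothetical and are not taken from GPUGrid's or BOINC's actual scheduling or credit code.

```python
def turnaround_hours(run_hours, cards=1):
    """Return hours from download to completion for each queued task.

    Assumes all tasks are downloaded at time 0 and that each task goes
    to whichever card becomes idle first (a simple greedy assignment).
    """
    free_at = [0.0] * cards                  # time at which each card is idle
    finished = []
    for hours in run_hours:
        card = free_at.index(min(free_at))   # next card to become free
        free_at[card] += hours
        finished.append(free_at[card])
    return finished


BONUS_WINDOW_HOURS = 24  # window quoted in this thread for the 50% bonus

# Two 16-hour tasks on one card, then three 16-hour tasks on two cards.
for cards, queue in [(1, [16, 16]), (2, [16, 16, 16])]:
    for n, done in enumerate(turnaround_hours(queue, cards), start=1):
        status = "within" if done <= BONUS_WINDOW_HOURS else "misses"
        print(f"{cards} card(s), task {n}: done after {done:.0f} h, {status} the window")
```

In both cases the last task in the queue only finishes 32 hours after download, which is why a smaller cache or shorter run times would avoid the problem.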

skgiven
Volunteer moderator
Volunteer tester
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Message 19778 - Posted: 5 Dec 2010 | 9:36:11 UTC - in response to Message 19775.
Last modified: 5 Dec 2010 | 10:02:44 UTC

I have already suggested that people reduce their cache, even Fermi crunchers.
My recommended GPUGrid cache settings are between 0.10 and 0.01 days, rather than the BOINC default of 0.25 or 0.50 days.

    Open BOINC Manager (6.10.58) in Advanced mode.
    From the Advanced drop-down menu, click Preferences to open the BOINC Manager Preferences window.
    Click the network usage tab, and set the additional work buffer to between 0.10 and 0.01 days.
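
For anyone who prefers editing a file to clicking through the Manager, the same setting can be sketched as follows. This assumes your client reads a global_prefs_override.xml file from the BOINC data directory and that the additional work buffer corresponds to the work_buf_additional_days element; please check both against the documentation for your BOINC version. The helper name and the range check are hypothetical, and the script only prints the XML rather than writing any file.

```python
import xml.etree.ElementTree as ET


def work_buffer_override(additional_days):
    """Return override XML text that sets only the additional work buffer.

    The 0.01 to 0.10 day range is the recommendation from this post,
    not a limit enforced by BOINC itself.
    """
    if not 0.01 <= additional_days <= 0.10:
        raise ValueError("recommended range here is 0.01 to 0.10 days")
    root = ET.Element("global_preferences")
    ET.SubElement(root, "work_buf_additional_days").text = f"{additional_days:.2f}"
    return ET.tostring(root, encoding="unicode")


xml_text = work_buffer_override(0.05)
print(xml_text)
```

After changing an override file the client normally has to re-read its local preferences or be restarted before the new value takes effect.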

Bedrich Hajek,
I would suggest you leave a CPU core free for each GPU and use swan_sync=0, as your tasks could then run faster.
That said, both your GTX480 and GTX285 are now completing these long tasks in around 10h and getting full credit.

GTX480, 3394160 3393769 3393416
GTX285, 3394258 3393616

Bedrich Hajek
Joined: 28 Mar 09
Posts: 485
Credit: 10,659,123,466
RAC: 15,258,829
Message 19803 - Posted: 7 Dec 2010 | 23:41:50 UTC

I set swan_sync=0 on both my computers, and it looks like it improved the performance (reduced completion time) by an average of at least 20%. The IBUCH long WUs were completed, on average, about 2 to 3 hours faster, going from about 16 hours to about 13 to 14 hours. The GPU usage on the 480 card under Windows 7 increased from about 30% to over 40% for an IBUCH long unit. On the other units, on the same machine, usage increased from 50% to almost 70%. These are the results that were produced on my computers.

The other thing that I didn't see mentioned: if you have a multiprocessor system, under BOINC Manager version 6.10.58 you have to go to Advanced, then Preferences, then processor usage, and make sure the "On multiprocessor system use at most _____ % of the processor" setting is not 100%. If you have a dual processor with one video card, set it to 50%; with 6 processors and 2 video cards, set it to 66.67%. Otherwise the system will run 2 CPU jobs on one processor, or 6 CPU jobs on 4 processors. It will do all those CPU jobs without any errors (my machines did), but their completion times will be long.
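
The two percentages above follow from leaving one core free per video card. A small sketch of that calculation, assuming exactly one reserved core per GPU (the function name is hypothetical and nothing here calls BOINC):

```python
def cpu_percent_for_gpus(cpu_cores, gpus):
    """Percentage of cores left for CPU jobs when each GPU gets one free core."""
    if cpu_cores <= 0 or gpus < 0:
        raise ValueError("need at least one core and a non-negative GPU count")
    free_for_cpu_jobs = max(cpu_cores - gpus, 0)   # never go below zero cores
    return round(100.0 * free_for_cpu_jobs / cpu_cores, 2)


# The two cases from this post: 2 cores with 1 card, 6 cores with 2 cards.
for cores, cards in [(2, 1), (6, 2)]:
    print(f"{cores} cores, {cards} card(s): use at most {cpu_percent_for_gpus(cores, cards)}%")
```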

Thank you for your advice.

