Message boards : Number crunching : Duplicated work

Profile Carlesa25
Joined: 13 Nov 10
Posts: 328
Credit: 72,619,453
RAC: 206
Message 20891 - Posted: 9 Apr 2011 | 18:34:56 UTC
Last modified: 9 Apr 2011 | 18:41:32 UTC

Hello: I would like to raise a matter that I do not understand.

I see that there are tasks that TWO users have done, both completed, validated and credited; i.e. we have duplicated the work. Best regards

Betting Slip
Joined: 5 Jan 09
Posts: 670
Credit: 2,498,095,550
RAC: 0
Message 20892 - Posted: 9 Apr 2011 | 20:34:52 UTC - in response to Message 20891.

You'll find that the first WU is over 2 days old without a result so it is issued to a second host.

The reason they do this is because they require fast and reliable completion as the next WU in a job depends on the first one being completed and can't be sent out before.
____________
Radio Caroline, the world's most famous offshore pirate radio station.
Great music since April 1964. Support Radio Caroline Team -
Radio Caroline

Profile Carlesa25
Joined: 13 Nov 10
Posts: 328
Credit: 72,619,453
RAC: 206
Message 21152 - Posted: 4 May 2011 | 17:43:44 UTC - in response to Message 20892.

Hello: For the past few days a number of tasks have been performed in duplicate (by me and another volunteer), and not because two days passed between the two sends; in these latest cases only a few hours passed.

If there is a coordination problem on the server, it seems a waste of labor. Greetings.

Toni
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 9 Dec 08
Posts: 1006
Credit: 5,068,599
RAC: 0
Message 21154 - Posted: 4 May 2011 | 22:13:35 UTC - in response to Message 21152.

Hi, if you can, please report the task numbers which seem duplicate.

Profile Carlesa25
Joined: 13 Nov 10
Posts: 328
Credit: 72,619,453
RAC: 206
Message 21155 - Posted: 4 May 2011 | 22:33:58 UTC - in response to Message 21154.

Hello: So far I see this one. There were two more, but they now show a computation error ...? I will report if I find others. Greetings.

http://www.gpugrid.net/workunit.php?wuid=2462615

Profile skgiven
Volunteer moderator
Volunteer tester
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Message 21157 - Posted: 5 May 2011 | 8:20:58 UTC - in response to Message 21152.

After the first task sent out was not returned for about 3 days, another task was sent out. The first task then returned with an error. I presume this triggered a second task to be sent out, at which time there would have been 2 tasks in progress. Now one of these tasks has been returned, but there is still one in progress.

Given the nature of the project (dependency on fast turn around of tasks) this is an acceptable situation; if several tasks are slow to return or fail it slows down the overall research really badly.

Profile Carlesa25
Joined: 13 Nov 10
Posts: 328
Credit: 72,619,453
RAC: 206
Message 21321 - Posted: 6 Jun 2011 | 17:01:32 UTC - in response to Message 21157.
Last modified: 6 Jun 2011 | 17:26:47 UTC

Hello: Once again a task has been sent to another user before the first user had been processing it for two days ...!

I am going to cancel the task because continuing with it is wasted work, but the time I have already lost on it is unfortunate ...! Greetings.

Work: http://www.gpugrid.net/workunit.php?wuid=2514023

Profile Carlesa25
Joined: 13 Nov 10
Posts: 328
Credit: 72,619,453
RAC: 206
Message 21392 - Posted: 12 Jun 2011 | 12:21:03 UTC - in response to Message 21321.

Hi, once again a task has been sent to two users on the same day (three hours apart). I think it is worth reporting, in case it is a server failure or some special circumstance of the task ...? Greetings.

Workunit 2521888

http://www.gpugrid.net/workunit.php?wuid=2521888

Profile skgiven
Volunteer moderator
Volunteer tester
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Message 21393 - Posted: 12 Jun 2011 | 15:08:45 UTC - in response to Message 21392.

The work unit was sent out.
The host did not complete the task within 2 days, so it was resent.
The first host then returned the task as an Error. This prompted the server to send out an additional WU. Subsequently both resends returned as Valid results.
This is normal server behavior.
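
As a rough illustration, the sequence above can be replayed in a few lines of Python. This is only a sketch of the behaviour described in this thread, not the project's actual server code: the `replay` function, the event names and the return times after day 2 are made up for the example.

```python
# Toy replay of the resend rules described in this thread (not real server code).
RESEND_AFTER_DAYS = 2.0  # early-resend threshold mentioned above


def replay(events):
    """Return the day on which each copy of one workunit was issued.

    events is a list of (day, kind) pairs, where kind is one of:
      "timeout": a copy was not returned within RESEND_AFTER_DAYS
      "error":   a copy came back as an error
      "valid":   a copy came back as a valid result
    A timeout or an error each trigger one extra copy; a valid result does not.
    """
    issued = [0.0]  # the original task goes out on day 0
    for day, kind in sorted(events):
        if kind in ("timeout", "error"):
            issued.append(day)
    return issued


# Hypothetical timeline matching the post: no result after 2 days, then the
# first host reports an error, then both resends come back valid.
events = [
    (RESEND_AFTER_DAYS, "timeout"),
    (2.5, "error"),
    (3.0, "valid"),
    (3.2, "valid"),
]
copies = replay(events)
print(f"{len(copies)} copies issued on days {copies}")
```

With this timeline three copies go out for a single workunit, and two of them come back as valid results, which is the duplication being discussed.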

Ram Raider
Joined: 17 Jul 07
Posts: 2
Credit: 332,470
RAC: 0
Message 21406 - Posted: 13 Jun 2011 | 8:19:23 UTC - in response to Message 21393.

At the moment I suspect that a lot of work is being wasted due to this policy. Don't forget that BOINC is a multi-project environment and the scheduler takes into account the project resource share and the deadline when scheduling tasks. The problem is made worse in that BOINC can take up to a day to send the results back.

A much better solution would be to make the deadline 2 days and let BOINC work as it was designed. You will get more complaints about the short deadline but it's better to be open about it instead of wasting resources. The users can then decide for themselves whether GPU is for them.

It's only recently I've become aware of this policy and it's very frustrating to realise that probably every one of my tasks has been re-issued. What a waste!

Dirk
Joined: 10 Oct 08
Posts: 18
Credit: 39,100,916
RAC: 0
Message 21407 - Posted: 13 Jun 2011 | 8:33:00 UTC

Hmm I agree with making the deadline 2 days instead of 5. I often get sent tasks which are still running on slower GPUs, my WU will complete and it'll still be running on the other host, just a waste of resources if you ask me. Would also, hopefully, stop boinc downloading 2 GPUGRID tasks (my buffer is set to 1 day because I need enough einstein WUs to feed my GPU which runs 4 of those at once)

Profile skgiven
Volunteer moderator
Volunteer tester
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Message 21411 - Posted: 13 Jun 2011 | 9:39:03 UTC - in response to Message 21407.

I almost agree in principle, though I would opt for a 3 day cutoff point and change the credit system slightly (150% for <1day return, 100% for <2day return, 50% for <3day return). Anyway, it's really down to the scientists to work out what is best for the project overall, not us. They know things we don't.
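
To make the suggested tiers concrete, here is a minimal sketch. The function name is hypothetical, and returning zero credit after the 3-day cutoff is my assumption; the post only gives the three tiers.

```python
def credit_multiplier(turnaround_days: float) -> float:
    """Credit factor for a result returned after turnaround_days.

    Tiers are the ones suggested in the post: 150% under 1 day,
    100% under 2 days, 50% under 3 days. Zero credit after the
    3-day cutoff is an assumption, not something stated in the thread.
    """
    if turnaround_days < 0:
        raise ValueError("turnaround cannot be negative")
    if turnaround_days < 1:
        return 1.5
    if turnaround_days < 2:
        return 1.0
    if turnaround_days < 3:
        return 0.5
    return 0.0


# Example: a hypothetical task worth 1000 base credits, returned after 1.5 days.
base_credit = 1000
awarded = base_credit * credit_multiplier(1.5)
print(awarded)
```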

One low deadline concern would be getting tasks in the first place; Boinc's lack of up-time could prevent you getting a task.

There is a recommendation to use report tasks immediately (See the FAQ section).

It's important to remember that the next batch of tasks relies on the completion of all existing tasks, so following a failure, it is essential to have another task sent out; to expedite the overall science, not just the one task.

There have been plenty of suggestions on how to improve the science project. For example, if a task is completed by one person, send a signal out to any other crunchers to stop crunching that task and credit them for the percentage already completed. The problem is implementation, and many suggestions are just not doable with present Boinc versions.

Running more than one GPU project often results in problems. You would be better off crunching GPUGrid tasks for a while and then Einstein tasks for a while (if you must), rather than both at the same time. A low cache is the recommended setting at GPUGrid.

PS. I would have thought running 4 Einstein tasks on a GTX480 would be suboptimal (2 or perhaps 3, but not 4).

Dirk
Joined: 10 Oct 08
Posts: 18
Credit: 39,100,916
RAC: 0
Message 21415 - Posted: 13 Jun 2011 | 10:02:04 UTC - in response to Message 21411.
Last modified: 13 Jun 2011 | 10:02:25 UTC

Yes, I usually don't run both GPU projects at the same time. Also use report tasks immediately. I myself have no trouble completing WUs in time.

Running 4 einstein WUs works great btw. If I'm not using the pc it completes them in about 95 mins (with 4 CPU tasks running alongside). Running 3 or 2 only makes them complete a bit faster. But not enough to cancel out running 1 or 2 tasks less. Annoyingly enough even 4 einstein WUs only get GPU usage up to a max of about 80%.

Ram Raider
Joined: 17 Jul 07
Posts: 2
Credit: 332,470
RAC: 0
Message 21427 - Posted: 13 Jun 2011 | 16:47:01 UTC

By all means reward users for a fast turnaround, but I don't think it's a good idea to penalise users from returning tasks within the deadline period.

Not sure I understand what the issues are around BOINC's lack of up-time, however the last thing you want is for users to stock up their buffer above 2 days.

For info I've never had any problems running multiple GPU projects at the same time. BOINC takes care of this rather well and ensures tasks are (almost) always completed within the deadline, which brings us back to the original problem.

Dagorath
Joined: 16 Mar 11
Posts: 509
Credit: 179,005,236
RAC: 0
Message 21431 - Posted: 13 Jun 2011 | 21:50:52 UTC


What's causing the waste here is the fact that tasks are resent before the first computer misses the deadline. The concept is simple...decide how soon you need results back and set the deadline to that many days. If they want them back in 2 days then grow a pair, draw the line and set a 2 day deadline. If they feel 3 days is adequate then make the deadline 3 days.

Setting the deadline to 5 and then resending tasks after 2 days is just plain dumb because it's inevitably going to end up producing a lot of wasted (duplicated) effort. It's better for systems that can't return a task in 2 days to crunch somewhere else rather than waste their time duplicating work here.

Profile Retvari Zoltan
Joined: 20 Jan 09
Posts: 2343
Credit: 16,201,255,749
RAC: 6,169
Message 21434 - Posted: 13 Jun 2011 | 22:18:26 UTC - in response to Message 21431.

I agree with Dagorath on the deadline.

Profile skgiven
Volunteer moderator
Volunteer tester
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Message 21435 - Posted: 13 Jun 2011 | 22:38:42 UTC - in response to Message 21431.
Last modified: 13 Jun 2011 | 22:49:14 UTC

Boinc calculates the ratio of time the computer is on and running Boinc. If Boinc thinks it is not on long enough to finish a task in 2 days then Boinc will not ask for a task. Crunching for other GPU projects would also impact upon asking for GPUGrid tasks as would things like task failures, your Boinc configurations such as use GPU while computer is in use...
2 days would be long enough for many people for any tasks, but not long enough for many people trying to run long WU's. Boinc sees different tasks as all part of the same GPUGrid project, so just trying to run one long task might throw it out and prevent someone that would be capable of running normal tasks from getting any. Then there is different Boinc versions (some might behave slightly differently to others) and what would happen if you had a poor GPU and replaced it with a high end GPU? Basically I think a bit more leeway is called for than a 2 day cutoff. Ideally Boinc would better understand separate task types and their requirements and even allow the user to more exactly specify what type of task and when these tasks should run. So GPUGrid has to be a bit more flexible and try to facilitate more people and operate within Boinc's restrictions.
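
A back-of-the-envelope version of that check might look like the sketch below. It is not Boinc's actual work-fetch code; the function name, the 20-hour task and the one-third on-time figure are hypothetical and only illustrate the idea of scaling run time by the fraction of time the machine is crunching.

```python
def would_finish_in_time(est_runtime_hours: float,
                         on_fraction: float,
                         deadline_days: float) -> bool:
    """Rough check: can a task finish before the deadline?

    est_runtime_hours: estimated crunching time if the GPU ran non-stop.
    on_fraction: share of wall-clock time the host is on and crunching (0..1].
    deadline_days: deadline the server attaches to the task.
    """
    if not 0 < on_fraction <= 1:
        raise ValueError("on_fraction must be in (0, 1]")
    wall_clock_hours = est_runtime_hours / on_fraction
    return wall_clock_hours <= deadline_days * 24


# Hypothetical host: a 20-hour task on a machine that crunches a third of the time.
print(would_finish_in_time(20, 0.33, deadline_days=2))  # about 61 h needed vs 48 h
print(would_finish_in_time(20, 0.33, deadline_days=5))  # about 61 h needed vs 120 h
```

Under these made-up numbers the host would be shut out by a 2 day deadline but not by a 5 day one.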

The research team also spent a lot of time and effort getting to where we are now. At some stage (now) they have to concentrate more on the Science and maintenance (new GPUs/drivers/CUDA versions) and less on the project setup and development. Also, changes always annoy someone.

Dagorath
Joined: 16 Mar 11
Posts: 509
Credit: 179,005,236
RAC: 0
Message 21437 - Posted: 14 Jun 2011 | 2:31:22 UTC - in response to Message 21435.

Boinc calculates the ratio of time the computer is on and running Boinc. If Boinc thinks it is not on long enough to finish a task in 2 days then Boinc will not ask for a task.


How can that be when the deadline is currently 5 days? Was the 2 above a typo? Or do I not understand the scheduler?

2 days would be long enough for many people for any tasks, but not long enough for many people trying to run long WU's.


Then make the deadline 3 days. Or 4 days. Or 100 days, I don't care. But stop resending tasks before the deadline is up.

Boinc sees different tasks as all part of the same GPUGrid project, so just trying to run one long task might throw it out and prevent someone that would be capable of running normal tasks from getting any.


If they shortened the deadline to 2 days some people would find their ACEMD long tasks would get status Abandoned or No Response or whatever this server gives when a task goes over deadline. They would soon learn to go to their preferences and deselect ACEMD long so they get only ACEMD standard tasks. To make the transition less painful they could advertise in the threads and on the home page that the deadline is about to change and that those with slower GPUs and those who don't run BOINC 24/7 should deselect the long tasks.

Then there is different Boinc versions (some might behave slightly differently to others) and what would happen if you had a poor GPU and replaced it with a high end GPU?


I don't see any problem with that. If they've deselected the long tasks because they have a slower GPU and then they get a faster GPU they can easily select the long tasks. Seems simple enough. Did I miss something?

Basically I think a bit more leeway is called for than a 2 day cutoff.


I mentioned 2 days in my last post because that seems to be the return time the admins want. I could live with a 3 day deadline. Or 4 days. Or 100 days. Whatever the admins want is fine with me. Just stop resending tasks before the deadline expires.

Ideally Boinc would better understand separate task types and their requirements and even allow the user to more exactly specify what type of task and when these tasks should run. So GPUGrid has to be a bit more flexible and try to facilitate more people and operate within Boinc's restrictions.


Indeed projects should be flexible and facilitate and operate within BOINC's restrictions. However, I think project admins' primary obligation is to use donated resources as efficiently as possible. That's not happening now.

The research team also spent a lot of time and effort getting to where we are now. At some stage (now) they have to concentrate more on the Science and maintenance (new GPUs/drivers/CUDA versions) and less on the project setup and development. Also, changes always annoy someone.


As I understand it, changing the deadline is as simple as changing a number in a config file. Resending tasks before the deadline expires seems to me to be custom code by GPUgrid devs but it shouldn't be too hard to go back to the standard BOINC server way.

As for people being annoyed by changes...well...there are people annoyed now by the current policy of resending tasks before the deadline expires. So what's an admin to do? In my mind there is no contest. They MUST live up to their primary obligation to use donated resources as efficiently as possible. That means appease the people who want to eliminate duplicated work and to heck with those who support the status quo.

Remember, less duplicated work means more throughput for the project. Who doesn't want more throughput?

Profile skgiven
Volunteer moderator
Volunteer tester
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Message 21441 - Posted: 14 Jun 2011 | 10:54:44 UTC - in response to Message 21437.

I was explaining what might happen if we had a 2-day deadline, speaking hypothetically, rather than explaining the existing 5-day system. I agree that the system should be reviewed periodically, and the deadline probably reduced, but it's not my call and I don't have all the info.

Resends are always going to happen (tasks fail, as do GPU's, computers, routers).
Even if the deadline was 2 or 3 days, tasks would still fail/not be returned in time and have to be resent. The number of resends might actually rise if the deadline was changed. I don't have the figures so I can't work it out, but the team could do some stats to work out in advance if it would expedite the project overall, or hinder it. If they were really up for the challenge they could write a program to continuously analyze returns and self-regulate the deadline within boundaries, but I expect Boinc would backfire.

Dagorath
Joined: 16 Mar 11
Posts: 509
Credit: 179,005,236
RAC: 0
Message 21455 - Posted: 15 Jun 2011 | 0:26:40 UTC - in response to Message 21441.

I was explaining what might happen if we had a 2-day deadline, speaking hypothetically, rather than explaining the existing 5-day system. I agree that the system should be reviewed periodically, and the deadline probably reduced, but it's not my call and I don't have all the info.


I realize it's not your call but maybe the project admins can be persuaded through a discussion here.

Resends are always going to happen (tasks fail, as do GPU's, computers, routers).


Resends for the reasons you state will always be with us but that doesn't mean we should set a deadline of 5 days but resend if the result hasn't returned in 2 days.

Even if the deadline was 2 or 3 days, tasks would still fail/not be returned in time and have to be resent.


Indeed tasks would still fail but that's extraneous to discussion of the deadline. Tasks not returned in time is what we're focused on here.

The number of resends might actually rise if the deadline was changed.


No, the number would not rise, because the tasks are being resent now after 2 days anyway. How can reducing the real deadline to match the artificial deadline induce more late returns? There is no cause-effect relationship there that I can see.

If the real deadline (now 5 days) were reduced to match the artificial deadline (now 2 days) then volunteers who do not return tasks in 2 days would get some feedback that their results are not returning in time. They might notice that their RAC decreases or they might see many "Abandoned" or "No Reply" outcomes and zero credits awarded in their list of tasks on the website. They'll figure it out for themselves or they'll inquire in the forums as to what's going wrong. Ideally someone would create a sticky thread and put an item on the home page news warning folks of an upcoming reduction in real deadline a month or more in advance and advise users what to watch for and do.

Another mechanism would kick in too. BOINC would warn users that task XYZ is about to miss its deadline. That would get users asking questions too.

One way or another, volunteers who cannot return a task in 2 days would come to realize that they need to select only the short tasks in their preferences or perhaps move their resources to a different project. With the way things are now they just keep missing the 2 day artificial deadline over and over but get credits anyway and no feedback that they've missed the artificial deadline. They have no reason/motivation to change. Therefore, in the long run, reducing the real deadline from 5 days to match the artificial deadline of 2 days will actually reduce the number of resends. As I said, that won't happen immediately. It will happen as volunteers become aware of the change. If we leave things as they are then there is almost zero chance of ever reducing the number of resends.
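
To put some numbers on that reasoning, here is a small sketch. The turnaround times are invented for illustration (the project's real figures are not in this thread), and `early_resends` is a hypothetical helper, not anything in BOINC.

```python
REAL_DEADLINE_DAYS = 5.0      # deadline shown to the volunteer
ARTIFICIAL_DEADLINE_DAYS = 2.0  # point at which the server resends


def early_resends(turnarounds_days, resend_after_days):
    """Turnarounds that trigger a resend yet still meet the real deadline.

    Each of these produces a duplicate: the slow host still returns a
    credited result while a second host crunches the same work.
    """
    return [t for t in turnarounds_days
            if resend_after_days < t <= REAL_DEADLINE_DAYS]


# Invented turnaround times (days) for eight first-issue tasks.
turnarounds = [0.5, 0.8, 1.0, 1.2, 1.5, 2.5, 3.0, 4.5]

# Today: resend at 2 days, deadline at 5 days.
dupes_now = early_resends(turnarounds, ARTIFICIAL_DEADLINE_DAYS)

# Long-run assumption from the post: hosts that cannot return in 2 days
# eventually pick short tasks only or leave, so only fast hosts remain.
fast_only = [t for t in turnarounds if t <= ARTIFICIAL_DEADLINE_DAYS]
dupes_later = early_resends(fast_only, ARTIFICIAL_DEADLINE_DAYS)

print(len(dupes_now), "of", len(turnarounds), "tasks duplicated now")
print(len(dupes_later), "of", len(fast_only), "tasks duplicated later")
```

With these made-up numbers, three of eight tasks are duplicated under the double deadline, and none once the slow hosts have adjusted. Resends caused by genuine failures are left out on purpose, since they happen under either policy.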

I don't have the figures so I can't work it out, but the team could do some stats to work out in advance if it would expedite the project overall, or hinder it.


It seems to me pertinent stats won't exist until the real deadline is reduced to match the artificial deadline. I don't see how they can predict from the stats they have now. At present we can only reason our way to the outcome based on what we know (or assume) about volunteer behavior.

If they were really up for the challenge they could write a program to continuously analyze returns and self-regulate the deadline within boundaries, but I expect Boinc would backfire.


Perhaps I don't fully understand what you're getting at but it sounds like that would result in a floating deadline. I think that could only confuse users in the sense that some would find their results return in time today but not the next day. I can't see that as being a good thing. Decide what the project needs for a return time (it seems to be 2), draw one line in the sand and stick to it. Don't fudge around with artificial deadlines or floating deadlines.

Profile Retvari Zoltan
Joined: 20 Jan 09
Posts: 2343
Credit: 16,201,255,749
RAC: 6,169
Message 21456 - Posted: 15 Jun 2011 | 8:54:23 UTC

At least long workunits should not be sent to slow hosts. They have no chance to return them even in 5 days. I'm still getting resent tasks, which are still being processed on GT 9500 or GT 9600 cards. I don't feel like warning these users one by one. I wrote about it in March.

Profile Carlesa25
Joined: 13 Nov 10
Posts: 328
Credit: 72,619,453
RAC: 206
Message 21463 - Posted: 15 Jun 2011 | 19:12:59 UTC - in response to Message 21456.


Hi: Following the points made here, I also think that if the project needs the results in two days, it makes no sense to confuse users with the five-day figure and generate duplication.

The five-day deadline means that users whose GPUs need three or four days to complete a task (and who are not aware that it is wanted in two days) take part only to waste time, money and enthusiasm on these tasks.

A) Identify which GPU is installed; if it is not a recommended one, simply do not accept it into this project.

B) When the time allowed to run the task is up, if it has not been completed, cancel it automatically (Boinc can do this perfectly well) and only then send it to another user.

Once a task of yours has been cancelled once or twice, you find out what the problem is and what the project you are participating in requires. Greetings.

Profile skgiven
Volunteer moderator
Volunteer tester
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Message 21475 - Posted: 16 Jun 2011 | 12:03:33 UTC - in response to Message 21463.
Last modified: 16 Jun 2011 | 23:40:55 UTC

Present GIANNI_KKFREE tasks now use less GPU (87% compared to 98%) and do not cause any Lag on either of my Fermi systems.

There were internal tests done prior to doing Beta tests (for those that run Betas). Some problems were identified and fixed. Then more Betas were run, and reported on by many users. It was not until the tasks were run by many users with many different systems that the extent of the lag was fully realized. The main concern was to see if the tasks were stable, rather than if each user's system is fine with the new tasks. The feedback from the crunching contributors has prompted the modifications that have been implemented. Ergo the system works.

Most of the suggestions made in this thread had already been suggested, some time ago, along with others. Some things were looked at but not implemented. Other suggestions have been implemented, and the scientists came up with plenty of ideas of their own.

The project is in the main production phase; it spent a long time in development, now it's trying to get through some research, and just keep up with driver/cuda and GPU changes.

Please appreciate that GPUGrid is a small team with limited resources (especially manpower) and I know the team spent considerable time in the recent past trying to modify some systems unsuccessfully. This is a working project, and while there are problems it's not broken. It's very much down to the cruncher to accommodate this project; it's not a hook up and forget project (never was and never can be). You need to have an appropriate GPU, install the correct drivers, configure your system to crunch GPU tasks, use fan controlling software (hopefully), and modify things on occasions (SWAN_SYNC, Priorities, CPU usage).

While I did see some lag it was not as bad as others have described. I think it was (might still be?) impacting on some systems more than others. Perhaps W7 and Linux are on the whole more affected and perhaps the GTX460 is more affected than the CC2.0 cards. Lag might also depend whether or not Aero is installed/what flavor of Linux you are using, and on other hardware and setups (cores used for example), as well as what tasks are being crunched.

Post up what you are crunching and if you are experiencing any lag or not.

Profile GDF
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Joined: 14 Mar 07
Posts: 1957
Credit: 629,356
RAC: 0
Message 21477 - Posted: 16 Jun 2011 | 12:42:10 UTC - in response to Message 21475.

We have increased the duplication time from 2 to 3 days.

gdf

Profile Carlesa25
Joined: 13 Nov 10
Posts: 328
Credit: 72,619,453
RAC: 206
Message 21583 - Posted: 3 Jul 2011 | 13:56:05 UTC - in response to Message 21477.

We have increased the duplication time from 2 to 3 days.
gdf


Hello: I think the increase from 2 to 3 days was positive; I have not been getting so many duplicate tasks ... until today.

The two tasks that I received recently were already being run by other users. One of them was finished by the other user within minutes, so the work I was doing on it was useless; I cancelled the other one, and we will see ...

I sincerely believe that the best solution to this question is simply to match the return time of the tasks to the maximum time allowed, whether two, three, etc. days, and nothing more. Greetings.

Profile Carlesa25
Joined: 13 Nov 10
Posts: 328
Credit: 72,619,453
RAC: 206
Message 21657 - Posted: 9 Jul 2011 | 21:07:36 UTC - in response to Message 21583.

Hi: The two tasks that I am running right now are also in progress with another user.

One of those users has seven tasks in progress (with an NVIDIA Quadro 4000), all downloaded on the 6th, so they have already been resent to as many other users ... !!!

I do not know much, I admit, but I keep insisting that it makes no sense to keep two dates for completing the tasks, the famous 3 and 5 days. Greetings.

Profile Carlesa25
Joined: 13 Nov 10
Posts: 328
Credit: 72,619,453
RAC: 206
Message 21680 - Posted: 16 Jul 2011 | 16:10:54 UTC - in response to Message 21657.

Hello: I understand the policy of resending half-finished tasks less and less ... Now I have one that was sent to another user 24 hours ago ... so not even the 3 days are being respected. Greetings.

Task: http://www.gpugrid.net/workunit.php?wuid=2579334

Profile skgiven
Volunteer moderator
Volunteer tester
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Message 21681 - Posted: 16 Jul 2011 | 18:04:29 UTC - in response to Message 21680.

Thought I already explained that situation?

Anyway,
The 1st resend was after 3 days - the task was not returned, this is normal.
The 2nd resend occurred after that because the task failed (user aborted in this case) - this is also normal; a failure also triggers a resend, even if a resend is already in progress.

Profile Carlesa25
Joined: 13 Nov 10
Posts: 328
Credit: 72,619,453
RAC: 206
Message 21682 - Posted: 16 Jul 2011 | 18:19:51 UTC - in response to Message 21681.
Last modified: 16 Jul 2011 | 18:22:07 UTC

Thought I already explained that situation?

Anyway,
The 1st resend was after 3 days - the task was not returned, this is normal.
The 2nd resend occurred after that because the task failed (user aborted in this case) - this is also normal; a failure also triggers a resend, even if a resend is already in progress.


Hello: Yes, you did explain this and I do understand it, but in my opinion it still makes little sense.

The first user does not finish within 3 days and the task is resent, fine, but the first user keeps running it (no doubt still believing they have 5 days ...).

The second user receives the task and has been running it normally for about 24 hours.

Back to the first user, who now aborts the task they were running (perhaps having noticed that it has been resent and that it makes no sense to stick with it ...?). Fine.

What I do not understand is why the first user's cancellation triggers yet another send of the same task, when it is already running normally on another user's host and still within its 3 days.

If this is not duplicating work without any need, then tell me what is. Greetings.

Profile skgiven
Volunteer moderator
Volunteer tester
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Message 21684 - Posted: 16 Jul 2011 | 18:56:12 UTC - in response to Message 21682.

Perhaps this system needs to be revised, but the importance of returning the task increases following a failure. As to why the user aborted, who knows. Usually they just think it will take too long, but they might have a problem with it, which would make it more important to return the task. The server cannot know that the first resend was/is running smoothly. Should one of the resends finish before the other starts (when someone has their cache too high) the other resend will not start (assuming intervening communications).

Suggested revision:
First resends only to be sent to CC2.0 cards
Second resends to be sent internally

Dagorath
Joined: 16 Mar 11
Posts: 509
Credit: 179,005,236
RAC: 0
Message 21687 - Posted: 16 Jul 2011 | 20:18:07 UTC - in response to Message 21684.
Last modified: 16 Jul 2011 | 20:25:00 UTC

The system certainly does need to be revised but let's keep it simple rather than make it more complicated. The suggestion to issue resends only to CC 2.0 hosts would require new code on the server side, I think.

The simple, easy to implement solution is to get rid of the double deadline that now exists. Having a 3 day deadline plus another 5 day deadline leads to waste as has been demonstrated numerous times in this thread. Have 1 deadline and 1 deadline only. If you think I'm asking to have the 3 day deadline increased then read the post again because that's NOT what I want.

BOINC server code already has an option to issue resends to fast, reliable hosts. The code to do that was tested and proven several years ago. Why not use that instead of writing new code to issue resends to CC 2.0 hosts?
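
To illustrate the idea of sending resends only to fast, reliable hosts, here is a minimal sketch. It does not use BOINC's actual code or option names; the `Host` fields, the threshold values, and both function names are assumptions made purely for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Host:
    """Hypothetical per-host statistics a server might track."""
    name: str
    avg_turnaround_hours: float  # average time between send and report
    error_rate: float            # fraction of tasks that failed


def is_reliable(host: Host,
                max_turnaround_hours: float = 24.0,
                max_error_rate: float = 0.05) -> bool:
    # Thresholds are illustrative, not values used by any real project.
    return (host.avg_turnaround_hours <= max_turnaround_hours
            and host.error_rate <= max_error_rate)


def pick_resend_host(hosts: list) -> Optional[Host]:
    # Only hosts that pass the reliability test may receive a resend;
    # among those, prefer the one with the fastest average turnaround.
    reliable = [h for h in hosts if is_reliable(h)]
    if not reliable:
        return None
    return min(reliable, key=lambda h: h.avg_turnaround_hours)


candidates = [
    Host("slow", avg_turnaround_hours=60.0, error_rate=0.01),
    Host("flaky", avg_turnaround_hours=10.0, error_rate=0.30),
    Host("fast", avg_turnaround_hours=12.0, error_rate=0.00),
]
print(pick_resend_host(candidates).name)  # fast
```

The point of the sketch is that the filter is based on a host's track record rather than on the card's compute capability, so no GPU-specific logic is needed.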

Profile Carlesa25
Joined: 13 Nov 10
Posts: 328
Credit: 72,619,453
RAC: 206
Message 21688 - Posted: 16 Jul 2011 | 20:36:24 UTC - in response to Message 21684.

Perhaps this system needs to be revised, but the importance of returning the task increases following a failure. As to why the user aborted, who knows. Usually they just think it will take too long, but they might have a problem with it, which would make it more important to return the task. The server cannot know that the first resend was/is running smoothly. Should one of the resends finish before the other starts (when someone has their cache too high) the other resend will not start (assuming intervening communications).

Suggested revision:
First resends only to be sent to CC2.0 cards
Second resends to be sent internally



Hi. Allow me a suggestion on the topic:

1 - Allow only 3 days to complete a task and drop the dual 3-day/5-day limits. This would avoid many resends and duplicate tasks, and reduce the load on the server.

2 - The server knows whether a task is still running before it resends it, so there would be no problem: the copy on the previous user's host would be cancelled automatically before the resend.

The examples I cite in these comments are only the ones that affect me personally and that I know of for certain, but I fear the overall volume of duplicate tasks is very high. Greetings.

Profile Carlesa25
Joined: 13 Nov 10
Posts: 328
Credit: 72,619,453
RAC: 206
Message 21716 - Posted: 21 Jul 2011 | 18:10:45 UTC - in response to Message 21688.
Last modified: 21 Jul 2011 | 18:48:06 UTC

Hello: As promised, here is a link to another duplicated task.

http://www.gpugrid.net/workunit.php?wuid=2589619

If the time allowed to complete a task were a firm 3 days, this situation would not occur.

A fellow cruncher with a GT9600, which needs at least 44 hours (nonstop) to run one task, is working in the hope of doing something useful when in reality he is not: all of his tasks have been duplicated. Greetings.

Profile Carlesa25
Joined: 13 Nov 10
Posts: 328
Credit: 72,619,453
RAC: 206
Message 21890 - Posted: 26 Aug 2011 | 14:36:56 UTC - in response to Message 21716.


Hello: More duplicate tasks are arriving again.

My last four tasks (including the two I am running right now) are also in progress on the hosts of other users who have exceeded the initial three days.

Honestly, this is a topic I will never tire of raising, because of the wasted effort (and money) it represents... regrettable. Greetings.

Profile Carlesa25
Joined: 13 Nov 10
Posts: 328
Credit: 72,619,453
RAC: 206
Message 21894 - Posted: 27 Aug 2011 | 14:44:33 UTC - in response to Message 21890.


Hello: Same again; the two tasks I have just received are also duplicates.

This time one of them was completed by someone else about an hour after I started the same task. It was cancelled and I requested a new one (this one has not been issued to anyone else... thank goodness).

I keep arguing, apparently without success, that if the project needs the results within three days, then three days is the real deadline for completing the task and nothing more... it's simple, no? Greetings.

Profile GDF
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Joined: 14 Mar 07
Posts: 1957
Credit: 629,356
RAC: 0
Message 21974 - Posted: 7 Sep 2011 | 14:12:01 UTC - in response to Message 21894.
Last modified: 7 Sep 2011 | 14:23:32 UTC

I have just removed the duplication mechanism as announced in another thread.
It will take a few days before all duplicated workunits disappear.
Deadline for now stays at 5 days.

gdf

Profile Saenger
Joined: 20 Jul 08
Posts: 134
Credit: 23,657,183
RAC: 0
Message 21975 - Posted: 7 Sep 2011 | 15:37:30 UTC - in response to Message 21974.

I have just removed the duplication mechanism as announced in another thread.
It will take a few days before all duplicated workunits disappear.
Deadline for now stays at 5 days.

gdf

Thank you very much!

So now it's like this:
less than 24h (except PYRT): bonus of 100%
less than 48h: bonus of 50%
before deadline: normal credits, no redundant results
after deadline: new result sent.
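
As a rough sketch, the schedule above could be written like this. The function name and return convention are my own; the PYRT exception is ignored, and since the thread does not say what credit a late result earns, the sketch returns `None` for that case rather than guessing.

```python
from typing import Optional

DEADLINE_HOURS = 5 * 24  # the 5-day deadline stated in the thread


def credit_multiplier(hours_to_return: float) -> Optional[float]:
    """Multiplier applied to the base credit (PYRT tasks not modelled)."""
    if hours_to_return < 24:
        return 2.0   # bonus of 100%
    if hours_to_return < 48:
        return 1.5   # bonus of 50%
    if hours_to_return < DEADLINE_HOURS:
        return 1.0   # normal credit, no redundant result issued
    return None      # past the deadline: a new result is sent; credit not stated


print(credit_multiplier(20))   # 2.0
print(credit_multiplier(44))   # 1.5
print(credit_multiplier(100))  # 1.0
```

For example, the 44-hour run time quoted earlier in the thread would fall in the 50% bonus band.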

(BTW: What's the 100%-time for PYRT?)
____________
Gruesse vom Saenger

For questions about Boinc look in the BOINC-Wiki

Profile GDF
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Joined: 14 Mar 07
Posts: 1957
Credit: 629,356
RAC: 0
Message 21976 - Posted: 7 Sep 2011 | 16:04:27 UTC - in response to Message 21975.

I have just removed the duplication mechanism as announced in another thread.
It will take a few days before all duplicated workunits disappear.
Deadline for now stays at 5 days.

gdf

Thank you very much!

So now it's like this:
less than 24h (except PYRT): bonus of 100%
less than 48h: bonus of 50%
before deadline: normal credits, no redundant results
after deadline: new result sent.

(BTW: What's the 100%-time for PYRT?)



I still have to talk with Ignasi about PYRT because we just came back from Berlin. But it should be valid for all workunits.

gdf

Profile Carlesa25
Joined: 13 Nov 10
Posts: 328
Credit: 72,619,453
RAC: 206
Message 21977 - Posted: 7 Sep 2011 | 16:35:32 UTC - in response to Message 21974.

Hello. Thanks for the change.
