Message boards : Number crunching : tasks aborted by project

Author Message
Angelique
Joined: 27 Oct 12
Posts: 14
Credit: 29,337,200
RAC: 0
Level
Val
Message 36629 - Posted: 25 Apr 2014 | 9:25:59 UTC

I was just wondering why 5 tasks within the last 20ish hours were aborted by the project... or at least that's what it said when I saw the tasks go away in BOINC Manager. Also, do those tasks have any scientific value that helps the project, or are they just completely scrapped? I'd like to think the time I spent working on them has some sort of use. Thanks.

Task     Work unit  Computer  Sent                        Reported                    Status                   Run time (s)  CPU time (s)  Credit     Application
9331176  6561433    157034    24 Apr 2014 | 16:30:49 UTC  24 Apr 2014 | 20:37:09 UTC  Abandoned                0.00          0.00          ---        Short runs (2-3 hours on fastest card) v8.41 (cuda60)
9331033  6560107    157034    24 Apr 2014 | 16:30:49 UTC  24 Apr 2014 | 20:37:09 UTC  Abandoned                0.00          0.00          ---        Short runs (2-3 hours on fastest card) v8.41 (cuda60)
9330997  6560755    157034    24 Apr 2014 | 16:30:12 UTC  24 Apr 2014 | 19:39:41 UTC  Completed and validated  10,470.45     1,401.31      16,650.00  Short runs (2-3 hours on fastest card) v8.41 (cuda60)
9330959  6561320    157034    24 Apr 2014 | 16:30:12 UTC  24 Apr 2014 | 20:37:09 UTC  Abandoned                0.00          0.00          ---        Short runs (2-3 hours on fastest card) v8.41 (cuda60)
9329542  6540411    157034    24 Apr 2014 | 10:05:58 UTC  24 Apr 2014 | 15:36:18 UTC  Abandoned                0.00          0.00          ---        Short runs (2-3 hours on fastest card) v8.41 (cuda60)
9329492  6549925    157034    24 Apr 2014 | 10:05:58 UTC  24 Apr 2014 | 15:36:18 UTC  Abandoned                0.00          0.00          ---        Short runs (2-3 hours on fastest card) v8.41 (cuda60)

TJ
Joined: 26 Jun 09
Posts: 815
Credit: 1,470,385,294
RAC: 0
Level
Met
Message 36636 - Posted: 25 Apr 2014 | 12:00:50 UTC - in response to Message 36629.

It is a bit hard to answer as your computers are hidden. Not a big deal, but we cannot see the times your system reports back.
If your WUs have not started and a wingman/wingwoman has done theirs, then the server deletes your WU, as they have their result. And they have high throughput here, as they really need their WUs as soon as possible.

Hope this helps a bit.

____________
Greetings from TJ

Angelique
Joined: 27 Oct 12
Posts: 14
Credit: 29,337,200
RAC: 0
Level
Val
Message 36639 - Posted: 25 Apr 2014 | 12:19:49 UTC - in response to Message 36636.

I'll show my computers for the sake of knowing what's happening... As for the aborted work units, I was at least 1 hour and 30 minutes (about halfway done) into a couple of them when they were aborted by the project. I couldn't care less about aborted WUs if they haven't started, nor would I mind them being aborted halfway in, so long as some sort of scientific progress comes out of it to help the project, whether that's for testing purposes or for showing problems with the way things are currently working.

Again, thanks for your input.

Angelique
Joined: 27 Oct 12
Posts: 14
Credit: 29,337,200
RAC: 0
Level
Val
Message 36644 - Posted: 25 Apr 2014 | 15:34:02 UTC - in response to Message 36639.
Last modified: 25 Apr 2014 | 15:37:16 UTC

I have even had a few more tasks aborted by the project since my last post.
Here is an example of one of them that was over an hour and a half into processing.

25/04/2014 17:33:31 | GPUGRID | Result 692x-SANTI_MAR423cap310-68-84-RND7388_0 is no longer usable
25/04/2014 17:33:32 | GPUGRID | Computation for task 692x-SANTI_MAR423cap310-68-84-RND7388_0 finished

TJ
Joined: 26 Jun 09
Posts: 815
Credit: 1,470,385,294
RAC: 0
Level
Met
Message 36646 - Posted: 25 Apr 2014 | 15:53:27 UTC - in response to Message 36644.

As far as I know, Abandoned is used when the deadline has passed, which is not the case with your problem. People with more in-depth knowledge of BOINC need to answer this. We have to wait a few hours until they start reading your post.
____________
Greetings from TJ

mikey
Joined: 2 Jan 09
Posts: 298
Credit: 6,644,776,185
RAC: 15,204,627
Level
Tyr
Message 36667 - Posted: 26 Apr 2014 | 12:07:55 UTC - in response to Message 36629.
Last modified: 26 Apr 2014 | 12:10:07 UTC

I was just wondering why 5 tasks within the last 20ish hours were aborted by the project... or at least that's what it said when I saw the tasks go away in BOINC Manager...

It could be as simple as they pulled the units back due to problems in the units themselves, cutting their losses, and yours too.

Toni
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 9 Dec 08
Posts: 1006
Credit: 5,068,599
RAC: 0
Level
Ser
Message 36721 - Posted: 29 Apr 2014 | 19:33:25 UTC - in response to Message 36667.
Last modified: 29 Apr 2014 | 19:34:40 UTC

No, the WU was not cancelled. It seems a case of "Client detached [7]". I have no clue about the cause.

Some discussions pointed to "cpid mismatch" and the use of account managers.

Richard Haselgrove
Joined: 11 Jul 09
Posts: 1626
Credit: 9,362,966,723
RAC: 18,909,192
Level
Tyr
Message 36729 - Posted: 29 Apr 2014 | 21:50:43 UTC - in response to Message 36721.

No, the WU was not cancelled. It seems a case of "Client detached [7]". I have no clue about the cause.

Some discussions pointed to "cpid mismatch" and the use of account managers.

I'd certainly agree that 'abandoned' != 'aborted'.

This is a server-only problem which rears its head from time to time - I tried doing some code-walking once when it appeared at SETI, but without cooperation from a server admin with access to the server logs, I gave up in frustration. Host CPID is certainly a player in the game, together with (I'm pretty sure) rpc_seqno. The closest I could get was a comms glitch resulting in RPC requests either arriving, or being processed, in the wrong order, and triggering a deliberate anti-cheating mechanism in the server code (there are only two short code segments which can trigger the subroutine which marks tasks as 'abandoned').
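
To make the suspected failure mode concrete, here is a minimal sketch of how an out-of-order request could trip a sequence-number check. This is not the actual BOINC server code: `HostRecord`, `handle_rpc`, and the rule "a lower rpc_seqno than the last one seen marks in-progress tasks as abandoned" are all assumptions made for illustration, and the host CPID comparison is left out entirely.

```python
from dataclasses import dataclass, field


@dataclass
class HostRecord:
    """Hypothetical server-side view of one host (not BOINC's real schema)."""
    rpc_seqno: int                                  # last sequence number seen
    in_progress: list = field(default_factory=list)
    abandoned: list = field(default_factory=list)


def handle_rpc(host: HostRecord, request_seqno: int) -> bool:
    """Process one scheduler request; return True if tasks were abandoned.

    Assumed rule: a request whose sequence number is lower than the last
    one recorded looks like a detached or cloned client, so every task
    still in progress on that host is marked abandoned.
    """
    out_of_order = request_seqno < host.rpc_seqno
    if out_of_order:
        host.abandoned.extend(host.in_progress)
        host.in_progress.clear()
    host.rpc_seqno = request_seqno
    return out_of_order


host = HostRecord(rpc_seqno=10, in_progress=["task_a", "task_b"])
handle_rpc(host, 12)   # a later request is processed first: nothing happens
handle_rpc(host, 11)   # the delayed earlier request now looks suspicious
print(host.abandoned)
```

Under these assumptions, nothing on the client needs to go wrong: two legitimate requests processed in swapped order are enough for the host to lose its running tasks.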

At SETI, there was no mention of the use of account managers, and I didn't consider that possibility. Judging by the comments over the years from Willy @ BAM, that sounds like a possibly fruitful lode for bug-hunting, but I'd avoid jumping too far in that direction at this stage - keep an open mind, and follow the evidence. I hope you have more success than I did.
