Message boards : Graphics cards (GPUs) : Failed tasks starting 2 June

Ken Florian
Joined: 4 May 12
Posts: 56
Credit: 1,832,989,878
RAC: 0
Message 25503 - Posted: 5 Jun 2012 | 1:42:45 UTC

I moved a GTX 550 Ti from one computer to another. It picked up and completed a long-run task but has failed every task since.

There were zero changes to the machine until a few minutes ago, when I installed the NVIDIA 301.42 driver.

Here is the relevant portion of the log after a reboot and a BOINC "Update":

6/4/2012 8:34:38 PM | GPUGRID | Sending scheduler request: Requested by user.
6/4/2012 8:34:38 PM | GPUGRID | Requesting new tasks for NVIDIA
6/4/2012 8:34:40 PM | GPUGRID | Scheduler request completed: got 0 new tasks
6/4/2012 8:34:40 PM | GPUGRID | No tasks sent
6/4/2012 8:34:40 PM | GPUGRID | This computer has finished a daily quota of 7 tasks
6/4/2012 8:38:09 PM | GPUGRID | update requested by user
6/4/2012 8:38:14 PM | GPUGRID | Sending scheduler request: Requested by user.
6/4/2012 8:38:14 PM | GPUGRID | Requesting new tasks for NVIDIA
6/4/2012 8:38:16 PM | GPUGRID | Scheduler request completed: got 0 new tasks
6/4/2012 8:38:16 PM | GPUGRID | No tasks sent
6/4/2012 8:38:16 PM | GPUGRID | No tasks are available for ACEMD2: GPU molecular dynamics
6/4/2012 8:38:16 PM | GPUGRID | No tasks are available for ACEMD beta version
6/4/2012 8:38:16 PM | GPUGRID | This computer has finished a daily quota of 7 tasks
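
For anyone who wants to check a longer log for the same quota message, here is a minimal sketch in Python. It assumes the event log lines keep the `timestamp | project | message` layout shown above and that timestamps are in US month/day/year format with AM/PM. The names `parse_line` and `daily_quota` are made up for illustration and are not part of BOINC.

```python
import re
from datetime import datetime

# Assumed layout, taken from the log excerpt above:
#   6/4/2012 8:34:40 PM | GPUGRID | This computer has finished a daily quota of 7 tasks
LOG_LINE = re.compile(r"^(?P<ts>\S+ \S+ [AP]M) \| (?P<project>[^|]*?) \| (?P<msg>.*)$")
QUOTA = re.compile(r"finished a daily quota of (\d+) tasks")


def parse_line(line):
    """Split one log line into (timestamp, project, message), or None if it does not match."""
    match = LOG_LINE.match(line.strip())
    if match is None:
        return None
    timestamp = datetime.strptime(match.group("ts"), "%m/%d/%Y %I:%M:%S %p")
    return timestamp, match.group("project").strip(), match.group("msg").strip()


def daily_quota(lines):
    """Return the most recently reported daily quota, or None if no line mentions one."""
    quota = None
    for line in lines:
        parsed = parse_line(line)
        if parsed is None:
            continue
        hit = QUOTA.search(parsed[2])
        if hit:
            quota = int(hit.group(1))
    return quota
```

Feeding the lines above to `daily_quota` would report the quota of 7 that the scheduler mentions, which is why no new tasks were sent.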


5pot
Joined: 8 Mar 12
Posts: 411
Credit: 2,083,882,218
RAC: 0
Message 25504 - Posted: 5 Jun 2012 | 2:40:15 UTC

For starters, disallow beta tasks and untick the test applications box. They're broken; everyone is failing those.

Now, I don't know why you're failing the others; someone with more knowledge should take a look. They show the "failed to enumerate device" error 3.

Someone with more knowledge will post; give it some time.

Cheers

robertmiles
Joined: 16 Apr 09
Posts: 503
Credit: 729,045,933
RAC: 113,743
Message 25517 - Posted: 5 Jun 2012 | 21:42:34 UTC

I just looked at the listing of my recent workunits.

ACEMD beta version v6.45 (cuda40) always gave Error while computing, on both of my desktop computers.

ACEMD beta version v6.43 (cuda42) also only gave errors, but only one of the computers ran any.

Long runs (8-12 hours on fastest card) v6.16 (cuda31) usually gave Completed and validated; only one failure.

Are the most recent applications unable to run correctly on a GTS 450 or a GT 440?

At least all the failed runs took less than 30 seconds each.

I'm using a 301.42 driver on one computer, but I just noticed that the other is still on 285.62.

skgiven
Volunteer moderator
Volunteer tester
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Message 25518 - Posted: 5 Jun 2012 | 21:51:43 UTC - in response to Message 25517.
Last modified: 5 Jun 2012 | 21:53:19 UTC

I think 285 supports CUDA 4.0 and I know 301.42 definitely supports CUDA 4.2, but you have to expect failures from any Beta Apps.
Both non-Beta apps should still work with the driver you were using last week.

robertmiles
Joined: 16 Apr 09
Posts: 503
Credit: 729,045,933
RAC: 113,743
Message 25523 - Posted: 6 Jun 2012 | 5:21:22 UTC - in response to Message 25518.

Then you may want to add a new option - allow only as many beta tasks as will not interfere with getting a normal flow of non-beta tasks.

That way, users could still help with beta testing, without taking much time away from the non-beta tasks they do.

robertmiles
Joined: 16 Apr 09
Posts: 503
Credit: 729,045,933
RAC: 113,743
Message 25600 - Posted: 9 Jun 2012 | 22:42:25 UTC

Something I've seen on other BOINC projects that you might consider: if a beta test workunit fails, don't count that against the reliability of that host, especially if all other hosts that try copies of that workunit also fail.
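
To make the suggestion concrete, here is a rough sketch in Python of how such a rule could look. This is not how the BOINC server actually tracks host reliability; the `Result` record, its fields, and both function names are hypothetical and exist only to illustrate the idea.

```python
from dataclasses import dataclass


@dataclass
class Result:
    """Hypothetical record of one host's attempt at one workunit copy."""
    host_id: int
    workunit_id: int
    is_beta: bool
    succeeded: bool


def failed_everywhere(workunit_id, results):
    """True if at least one copy of the workunit was tried and every copy failed."""
    copies = [r for r in results if r.workunit_id == workunit_id]
    return bool(copies) and not any(r.succeeded for r in copies)


def counts_against_host(result, results, excuse_all_beta=True):
    """Decide whether this result should lower its host's reliability."""
    if result.succeeded:
        return False  # successes never count against a host
    if not result.is_beta:
        return True  # ordinary failures always count
    if excuse_all_beta:
        return False  # simple form: no beta failure is held against the host
    # Stricter form: excuse a beta failure only when every host failed that workunit.
    return not failed_everywhere(result.workunit_id, results)
```

With `excuse_all_beta=True` no beta failure is held against a host; setting it to `False` excuses only the failures where every host that tried the workunit failed too.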
