
Message boards : Number crunching : No work being sent

Author Message
Spatzthecat
Joined: 26 Nov 09
Posts: 33
Credit: 1,282,387,913
RAC: 0
Level
Met
Message 23606 - Posted: 22 Feb 2012 | 12:21:02 UTC

Is there a problem? No work is being sent; apparently there are no long units, and my machines have reached their daily limit.

mikey
Joined: 2 Jan 09
Posts: 292
Credit: 2,352,078,615
RAC: 13,749,380
Level
Phe
Message 23608 - Posted: 22 Feb 2012 | 15:55:45 UTC - in response to Message 23606.

Is there a problem? No work is being sent; apparently there are no long units, and my machines have reached their daily limit.


Your PCs are hidden, so I can't look. Are you turning in invalid work? If so, your daily limit is reduced until you start doing valid work again.

Spatzthecat
Joined: 26 Nov 09
Posts: 33
Credit: 1,282,387,913
RAC: 0
Level
Met
Message 23610 - Posted: 22 Feb 2012 | 16:54:30 UTC - in response to Message 23608.

I have been reporting 10-12 valid Nathan CB1 WUs (total) per day across 3 machines,
but I am not receiving any WUs at all.
Is it because I have aborted or cancelled WUs that I do not wish to process?

Spatzthecat
Joined: 26 Nov 09
Posts: 33
Credit: 1,282,387,913
RAC: 0
Level
Met
Message 23611 - Posted: 22 Feb 2012 | 17:19:55 UTC

I have just checked my recent tasks, and over 140 WUs have errored out. I updated the Nvidia drivers to the latest version, 295.73 (CUDA version 4020), yesterday and assume that this is the reason.
Rolling back drivers.

Profile Retvari Zoltan
Joined: 20 Jan 09
Posts: 2343
Credit: 16,201,255,749
RAC: 573
Level
Trp
Message 23612 - Posted: 22 Feb 2012 | 18:13:28 UTC - in response to Message 23611.

I'm using the 295.73 drivers on WinXP x64. It's working well.

Spatzthecat
Joined: 26 Nov 09
Posts: 33
Credit: 1,282,387,913
RAC: 0
Level
Met
Message 23613 - Posted: 22 Feb 2012 | 21:24:16 UTC
Last modified: 22 Feb 2012 | 21:27:06 UTC

It was the 295.73 drivers that caused my problem, so I have gone back to 285.62,
but I am using Win 7 x64 Ultimate.

Profile skgiven
Volunteer moderator
Volunteer tester
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Message 23614 - Posted: 22 Feb 2012 | 21:32:35 UTC - in response to Message 23613.

I have installed it and I am using it now, but on a 2003R2 x64 server. This is similar to XP. So far no issues.
It may be the case that it causes problems on W7 or just on some GTX500 cards?
Would take more reports to know for sure.
I'm testing as I have come across several reports of issues with 295.x and I want to know if it's any faster or slower. It would take a few days to assess performance, but I am not expecting any gain.
____________
FAQ's

HOW TO:
- Opt out of Beta Tests
- Ask for Help

Spatzthecat
Joined: 26 Nov 09
Posts: 33
Credit: 1,282,387,913
RAC: 0
Level
Met
Message 23615 - Posted: 22 Feb 2012 | 23:10:43 UTC

All 3 machines run Win 7 x64 Ultimate.
GPUs: GTX 275, 285 and 570. All 3 suffered the same fate when the drivers were updated.
i7 920 with the 285
i7 975 with the 275
i7 990 with the 570
No issues before the update.
I have set my preferences to show computers.

Ilkka Forsblom
Joined: 14 Jan 09
Posts: 1
Credit: 532,662,048
RAC: 0
Level
Lys
Message 23634 - Posted: 24 Feb 2012 | 20:10:04 UTC - in response to Message 23615.

Same problem here. Was also looking at joining Einstein@home for GPU calc but their forums also have messages about CUDA problems with newest NVIDIA drivers: apparently, CUDA stops working if a DVI connected monitor is used and the monitor goes/is put to sleep, and I believe there are some other problems as well.

I'm running Windows 7 x64 version (intel i7, nvidia gtx 570). Worked fine with the older nvidia WHQL driver.

Toni
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 9 Dec 08
Posts: 1006
Credit: 5,068,599
RAC: 0
Level
Ser
Message 23635 - Posted: 24 Feb 2012 | 21:37:02 UTC - in response to Message 23634.

We'll put up more WUs ASAP.

Profile skgiven
Volunteer moderator
Volunteer tester
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Message 23639 - Posted: 25 Feb 2012 | 0:52:54 UTC - in response to Message 23635.

For those crunching Long tasks only, I would suggest choosing to accept normal tasks if no long tasks are available.
At present there are only 20 Long tasks waiting to be sent.

GPUGrid Preferences,

If no work for selected applications is available, accept work from other applications? yes

Spatzthecat
Joined: 26 Nov 09
Posts: 33
Credit: 1,282,387,913
RAC: 0
Level
Met
Message 23714 - Posted: 2 Mar 2012 | 11:46:55 UTC - in response to Message 23639.

For those crunching Long tasks only, I would suggest choosing to accept normal tasks if no long tasks are available.
At present there are only 20 Long tasks waiting to be sent.

GPUGrid Preferences,

If no work for selected applications is available, accept work from other applications? yes



When are we likely to see some Long tasks?
I am only receiving ACEMD beta version 6.42 at the moment.

Profile nate
Joined: 6 Jun 11
Posts: 124
Credit: 2,928,865
RAC: 0
Level
Ala
Message 23717 - Posted: 2 Mar 2012 | 16:48:06 UTC

I'll put some new ones on soon, while removing some old ones. Others will add additional WUs in a week or so. Simulations take time to prepare, and it's better if we wait an extra week or two to be sure we have everything correct than sending wrong stuff and having to retract them in a week. Please be patient, and crunch for short tasks in the meantime. They are also important.

Profile skgiven
Volunteer moderator
Volunteer tester
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Message 23720 - Posted: 2 Mar 2012 | 18:35:43 UTC - in response to Message 23717.

Some of you may have noticed a few betas by Ignasi and Toni.
So far I have not experienced any failures. Toni's tasks ran at ~90% GPU utilization when my system was optimized.
my beta's

Spatzthecat
Joined: 26 Nov 09
Posts: 33
Credit: 1,282,387,913
RAC: 0
Level
Met
Message 23723 - Posted: 2 Mar 2012 | 22:49:50 UTC - in response to Message 23717.

I'll put some new ones on soon, while removing some old ones. Others will add additional WUs in a week or so. Simulations take time to prepare, and it's better if we wait an extra week or two to be sure we have everything correct than sending wrong stuff and having to retract them in a week. Please be patient, and crunch for short tasks in the meantime. They are also important.



I totally understand that all tasks are important, but there doesn't seem to be any consistency. The ACEMD beta version 6.42 takes a similar time to the NATHAN CB1. What constitutes a long task?

I will admit that I have no idea what is involved in preparing the units, but from my perspective, when I have everything running smoothly at my end (which is a task in itself), consistency is important.

Top end cards are not cheap. Would they not be better utilised for the "top end tasks"? Not everyone can invest in such hardware but still wish to contribute to this ever so worthwhile project, so horses for courses would be apt, to get the best out of us.

This is my "chosen" project and whatever your response is, will not deter me from continuing to support it. However I do feel that it would be much more efficient to take full advantage of a GPU's capability.

I will be investing in 2 more GPU's in the near future and would be delighted if the project were to utilise them to the max.

Profile nate
Joined: 6 Jun 11
Posts: 124
Credit: 2,928,865
RAC: 0
Level
Ala
Message 23737 - Posted: 4 Mar 2012 | 14:21:02 UTC - in response to Message 23723.

Spatz, thanks for your comments. You certainly have a point, and I will try to keep these things in mind. It's not lost on us how difficult it can be to get your systems up and running. It is one of the less ideal aspects of high performance computing. So many configurations, so many little details that can go wrong. We wish we could offer more support in that area. C'est la vie.

And you certainly are correct that long tasks should be longer to fully utilize the best cards, and we're conscious of that. My mistake with the "NATHAN_CB1" tasks was not optimizing the time for the faster cards, but rather averaging for all types. In the future we will be setting task length for the faster cards in the long queue. Of course, slower cards will still be welcome.

Again, thanks for giving your perspective.

Profile skgiven
Volunteer moderator
Volunteer tester
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Message 23739 - Posted: 4 Mar 2012 | 15:38:54 UTC - in response to Message 23737.

Over time the task length, in terms of steps, can increase with the improvements of GPU's. This is probably more important for the long tasks than the normal tasks, especially those that have to stay the same length. Recently, some of the 'Long tasks' took ~3 1/2h on a GTX580:

Task: I13R26-NATHAN_CB1_1-84-125-RND3000_1
Work unit: 3225042
Sent: 3 Mar 2012 | 20:13:56 UTC
Reported: 4 Mar 2012 | 2:38:21 UTC
Status: Completed and validated
Run time (sec): 12,561.16
CPU time (sec): 12,561.16
Credit: 35,811.00
Application: Long runs (8-12 hours on fastest card) v6.14 (cuda31)

I think this is too short. It's too close to normal length tasks, makes other Long tasks look too long, and raises concern over the credit system as a whole. While it has allowed many to crunch with mid-range GPU's and get the full credit bonus, that's not the purpose of long tasks.
I thought the idea was that the Long tasks would consistently take ~8h on the top GPU's (and up to 12h on similar cards, down to say a GTX470). Clearly you could double the 3.5h and they would still be a reasonable length. In time, when GF600's turn up, and presuming they are faster, the steps should increase again, but runtime for the top GPU's (GF680, or whatever) would still be ~8h. Obviously this would mean tasks take longer on a GTX580 and the like, but this would not happen immediately after release of the GF600; perhaps several months later. Before that there probably would not be enough of these GPU's to make longer runs feasible, and the research would need to be adapted to these cards, which takes time (batches of runs have to finish).
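
To put rough numbers on that, here is a small Python sketch of the arithmetic. It converts the 12,561.16-second run time quoted above into hours and estimates how much the step count would need to grow to reach an ~8h target. It assumes run time scales linearly with the number of steps, which is only an approximation, and the function names are made up for illustration; they are not part of any GPUGrid or BOINC tool.

```python
# Rough arithmetic only; assumes run time grows linearly with step count.

def hours(seconds: float) -> float:
    """Convert a run time in seconds to hours."""
    return seconds / 3600.0


def step_scale_factor(observed_seconds: float, target_hours: float) -> float:
    """Factor by which to multiply the step count to hit the target run time."""
    return target_hours / hours(observed_seconds)


observed = 12561.16  # run time (sec) of the NATHAN_CB1 task quoted above

print(round(hours(observed), 2))                   # 3.49
print(round(step_scale_factor(observed, 8.0), 2))  # 2.29
```

So under that linear assumption, a bit more than doubling the steps would bring this task to roughly 8h on the same card.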

Profile Retvari Zoltan
Joined: 20 Jan 09
Posts: 2343
Credit: 16,201,255,749
RAC: 573
Level
Trp
Message 23757 - Posted: 4 Mar 2012 | 20:39:40 UTC - in response to Message 23739.

Recently, some of the 'Long tasks' took ~3 1/2h on a GTX580. I think this is too short. It's too close to normal length tasks, makes other Long tasks look too long, and raises concern over the credit system as a whole. While it has allowed many to crunch with mid-range GPU's and get the full credit bonus, that's not the purpose of long tasks.

I share your concern, but it made me see this problem from another perspective. The so-called long tasks should be called fast returned tasks. I guess the GPUGrid staff already use the long queue in this way. It should be made "official" and in the fast return queue the deadline should be 24h, and 50% bonus credit should be awarded for a task returned in less than 8h and 25% bonus credit for under 12h.
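
As a sketch only, the proposed scheme could be written down like this in Python. The thresholds (50% under 8h, 25% under 12h, 24h deadline) come from the suggestion above; the function names, the strict "less than" comparisons, and the treatment of anything returned after 12h as no bonus are assumptions for illustration, not how the project's credit system actually works.

```python
# Hypothetical illustration of the proposed "fast return" bonus, not project code.

DEADLINE_HOURS = 24.0  # proposed deadline for the fast return queue


def bonus_fraction(turnaround_hours: float) -> float:
    """Bonus as a fraction of base credit for a given turnaround time."""
    if turnaround_hours < 0:
        raise ValueError("turnaround time cannot be negative")
    if turnaround_hours < 8.0:
        return 0.50   # returned in less than 8h
    if turnaround_hours < 12.0:
        return 0.25   # returned in under 12h
    return 0.0        # assumption: no bonus beyond 12h


def credited(base_credit: float, turnaround_hours: float) -> float:
    """Base credit plus any fast-return bonus."""
    return base_credit * (1.0 + bonus_fraction(turnaround_hours))


def meets_deadline(turnaround_hours: float) -> bool:
    """True if the task came back within the proposed 24h deadline."""
    return turnaround_hours <= DEADLINE_HOURS


print(credited(1000.0, 3.5))   # 1500.0
print(credited(1000.0, 10.0))  # 1250.0
print(meets_deadline(30.0))    # False
```

With these numbers, a task returned in about 3.5h would get the full 50% bonus, which is the case discussed earlier in the thread.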
