Message boards : Number crunching : ATI vs NVIDIA GPUs

GDF
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Joined: 14 Mar 07
Posts: 1957
Credit: 629,356
RAC: 0
Message 7550 - Posted: 17 Mar 2009 | 13:24:02 UTC
Last modified: 17 Mar 2009 | 16:56:20 UTC

I will try to answer some of the questions that I see in the forum.

ATI and Nvidia have been head-to-head for a long time, with pros and cons for each vendor.

ATI FireStream and other top ATI cards deliver more or less the same single-precision FLOPS as top Nvidia cards.

http://en.wikipedia.org/wiki/GeForce_200_Series
http://en.wikipedia.org/wiki/Comparison_of_ATI_Graphics_Processing_Units

So far, Nvidia seems better for GPGPU programming, thanks to CUDA and its GPGPU hardware support. For complex code, this is the best solution until OpenCL comes out and ATI adapts its hardware.

So far, ATI is better in double-precision performance, but this is still much lower than single-precision performance anyway.



gdf

Gipsel
Joined: 17 Mar 09
Posts: 12
Credit: 0
RAC: 0
Message 7554 - Posted: 17 Mar 2009 | 16:47:08 UTC - in response to Message 7550.
Last modified: 17 Mar 2009 | 16:48:00 UTC

> So far, Nvidia seems better for GPGPU programming, thanks to CUDA and its GPGPU hardware support. For complex code, this is the best solution until OpenCL comes out and ATI adapts its hardware.
>
> So far, ATI is better in double-precision performance, but this is still much lower than single-precision performance anyway.

That is right. CUDA is currently a much more mature SDK than Stream. Implementing a complex algorithm with CUDA should be much more straightforward than on ATI hardware at the moment, which makes life easier for the developer. Furthermore, some of the hardware functionality needed to speed up some algorithms is only implemented in ATI's HD4000 series (with support in high-level languages like Brook arriving only slowly), whereas on the Nvidia side such features are already available in earlier cards.
We will have to wait another 3 months or so before one can start to compare the OpenCL implementations of both manufacturers.

Concerning the single/double precision performance of ATI and Nvidia, you are also right. In single precision, Nvidia and ATI are roughly on par (ATI claims only slightly higher theoretical values). But when going to doubles, Nvidia performance drops to about 8% of the single-precision throughput, while ATI is able to sustain 20% of its advertised single-precision peak performance. That means an HD3850 has about the same theoretical double-precision performance as a GTX285, and an HD4800 series card is significantly faster for such things.
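
As a rough check of the comparison above, here is a minimal Python sketch. The 8% and 20% ratios come from the post; the single-precision peak figures are an assumption (approximate vendor numbers in GFLOPS from public spec sheets, not from this thread), and the HD4870 is used only as a representative HD4800 series card. The names are purely illustrative.

```python
# Rough check of the double-precision estimates quoted above.
# ASSUMPTION: these single-precision peaks (GFLOPS) are approximate vendor
# figures from public spec sheets; they are not given in the thread.
SP_PEAK_GFLOPS = {
    "GTX285": 1063.0,  # Nvidia
    "HD3850": 429.0,   # ATI
    "HD4870": 1200.0,  # ATI, stand-in for "an HD4800 series card"
}

# Double/single throughput ratios as stated in the post.
DP_FRACTION = {"nvidia": 0.08, "ati": 0.20}

VENDOR = {"GTX285": "nvidia", "HD3850": "ati", "HD4870": "ati"}


def dp_peak_gflops(card: str) -> float:
    """Theoretical double-precision peak = SP peak * vendor DP fraction."""
    return SP_PEAK_GFLOPS[card] * DP_FRACTION[VENDOR[card]]


if __name__ == "__main__":
    for card in SP_PEAK_GFLOPS:
        print(f"{card}: ~{dp_peak_gflops(card):.0f} GFLOPS double precision")
```

Under these assumed peaks, the GTX285 and HD3850 land within a few GFLOPS of each other, and the HD4870 comes out well ahead. These are theoretical peaks only; sustained performance depends on the code.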

GDF
Message 7555 - Posted: 17 Mar 2009 | 17:02:17 UTC - in response to Message 7554.

Hi,
ACEMD uses only single precision on the GPU for maximum performance.
As you say, the double-precision performance of ATI is still 5 times lower than the single-precision performance of any ATI or NVIDIA GPU.

Fortunately, for molecular dynamics applications it is OK to use single precision.

gdf



Gipsel
Message 7557 - Posted: 17 Mar 2009 | 17:37:13 UTC - in response to Message 7555.

> Hi,
> ACEMD uses only single precision on the GPU for maximum performance.
> As you say, the double-precision performance of ATI is still 5 times lower than the single-precision performance of any ATI or NVIDIA GPU.
>
> Fortunately, for molecular dynamics applications it is OK to use single precision.

I know. I only wanted to quantify that "much less than single-precision performance".

By the way, could you have a look in the credit calculation thread?

GDF
Message 7560 - Posted: 17 Mar 2009 | 19:41:50 UTC - in response to Message 7557.

> By the way, could you have a look in the credit calculation thread?

DONE.

TomaszPawel
Joined: 18 Aug 08
Posts: 121
Credit: 59,836,411
RAC: 0
Message 11050 - Posted: 9 Jul 2009 | 6:49:28 UTC - in response to Message 7560.

Theory is beautiful but life is brutal:

In GPUGRID, with 3x GTX260 SP216 you can get 55,000 credits per day.

In MilkyWay, with 2x HD4870 you can get 110,000 credits per day...

So where is the balance?

In the MilkyWay project you can run 2 WUs per GPU.
In GPUGRID you can run only 1 WU per GPU.

I really like GPUGRID, and that is the only reason why, when I recently upgraded my computers, I bought 3x GTX260 and not 3x HD4870.

Still, I am not happy when I see such big differences in the scoring.
____________
POLISH NATIONAL TEAM - Join! Crunch! Win!

GDF
Message 11051 - Posted: 9 Jul 2009 | 7:40:53 UTC - in response to Message 11050.
Last modified: 9 Jul 2009 | 7:43:25 UTC

Hi,
thanks for your preference.
The 260s are not the fastest cards, but I understand what you mean.

gdf

TomaszPawel
Message 11052 - Posted: 9 Jul 2009 | 8:43:33 UTC - in response to Message 11051.

Another horrifying statistic:

In MilkyWay...

This host

can crunch almost 400,000 credits per day!!!

2x HD4870X2...

In GPUGRID the fastest computer, with 3x GTX295, makes only almost 100,000 per day :(

NO COMMENT...
____________
POLISH NATIONAL TEAM - Join! Crunch! Win!

TomaszPawel
Message 11055 - Posted: 9 Jul 2009 | 14:17:51 UTC - in response to Message 11052.
Last modified: 9 Jul 2009 | 14:20:27 UTC

I have a favour to ask of the administrators: please standardize the scoring so that there are no such irrational differences.

Is the MilkyWay@Home project exempt from the principles that Seti@Home and GPUGRID apply?
____________
POLISH NATIONAL TEAM - Join! Crunch! Win!

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Message 11084 - Posted: 9 Jul 2009 | 19:23:17 UTC - in response to Message 11055.

Milkyway released their source code and people wrote optimized apps. Some of the improvements made their way into the standard client, but the last batch of optimizations (about half a year old) didn't. The optimized clients for CPUs are about a factor of 4 faster than the stock client. For the stock client the credits are OK, but the optimized ones get proportionally more.

My 4870 has almost reached 80k RAC there. If you divided this by 4 the number would be somewhere in the realm of sanity..

MrS
____________
Scanning for our furry friends since Jan 2002

TomaszPawel
Message 11101 - Posted: 10 Jul 2009 | 6:01:21 UTC - in response to Message 11084.
Last modified: 10 Jul 2009 | 6:12:08 UTC

When can we expect such optimized apps for GPUGRID?

If it is possible for MilkyWay, it must be possible for GPUGRID.
____________
POLISH NATIONAL TEAM - Join! Crunch! Win!

GDF
Message 11102 - Posted: 10 Jul 2009 | 7:03:50 UTC - in response to Message 11101.
Last modified: 10 Jul 2009 | 7:05:35 UTC

The problem is not the optimization but the multiplication. An ATI card in double precision cannot perform better than Nvidia in single precision.

gdf

TomaszPawel
Message 11103 - Posted: 10 Jul 2009 | 7:35:58 UTC - in response to Message 11102.
Last modified: 10 Jul 2009 | 8:16:56 UTC

OK, so is it possible to make such optimized apps for GeForce using CUDA?
(2-4 times faster than the current application, as MilkyWay users did for ATI cards, or an application which will generate 4 times more credit.
Is that impossible here for GeForce?)
____________
POLISH NATIONAL TEAM - Join! Crunch! Win!

TomaszPawel
Message 11104 - Posted: 10 Jul 2009 | 8:39:40 UTC - in response to Message 11103.
Last modified: 10 Jul 2009 | 8:42:30 UTC

And 2 images:

GPUGRID:

MilkyWay:


It is clear that at the beginning of June, when an optimized program was published, the project doubled its computing power and its scoring.
____________
POLISH NATIONAL TEAM - Join! Crunch! Win!

ExtraTerrestrial Apes
Message 11206 - Posted: 20 Jul 2009 | 18:57:17 UTC

Thomasz,

GDF and his team are actually very good at optimizations and have been caring about this for a long time. At Milkyway the stock code was terribly inefficient - that's why they released it, to get help from the public. So the problem is not that GPU-Grid does not have an "optimized" app; the problem is that MW's stock app is not yet optimized enough. And that they base their credits on CPU computation time compared to SETI (the old standard), not on flops.

At GPU-Grid you can already get more credits per time from your nVidia card than at seti, because GPU-Grid is more optimized (and may have more GPU-friendly code).

The rapid increase of RAC you're seeing at MW is not due to the release of an optimized app, it's actually the point when Travis (MW commander in chief) decided to let the work generator run more frequently, so that all the ATI cards wouldn't run dry any more but actually get new WU when they requested them.

MrS
____________
Scanning for our furry friends since Jan 2002

RalphEllis
Joined: 11 Dec 08
Posts: 43
Credit: 2,216,617
RAC: 0
Message 11278 - Posted: 23 Jul 2009 | 21:01:55 UTC - in response to Message 11206.

In the end, the credits are meaningless. The only way to have any concrete gain from distributed computing is to run Folding@home for the EVGA team, so you can get EVGA Bucks to contribute to the purchase of an EVGA card.
I like the challenge of optimizing the system to run faster, but in the end the point totals only exist to encourage people to participate. 10 points a day and 100,000 points per day are really the same thing.
The real bonus is that the computations are going to help medical science or astronomy or mathematics.

popandbob
Joined: 18 Jul 07
Posts: 67
Credit: 40,277,822
RAC: 0
Message 11280 - Posted: 24 Jul 2009 | 0:24:44 UTC - in response to Message 11206.

> At GPU-Grid you can already get more credits per time from your nVidia card than at seti, because GPU-Grid is more optimized (and may have more GPU-friendly code)


With the CUDA 2.3 drivers and DLLs it is possible to obtain more credits per time at SETI than at GPUGRID, or very close to the same.
I'm getting a crazy ~10 credits per minute on my single GTX Core 216, which works out to about 14,000 RAC.
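
The conversion behind that estimate can be sketched in a few lines of Python. This assumes the card runs around the clock at a steady rate, in which case RAC (a decaying average of credit per day) settles near the daily total. The function name is illustrative.

```python
# Convert a steady credit rate per minute into credits per day.
# ASSUMPTION: the GPU crunches 24 hours a day at a constant rate, so the
# long-run RAC approaches the daily credit total.
MINUTES_PER_DAY = 60 * 24


def credits_per_day(credits_per_minute: float) -> float:
    """Daily credit earned at a constant per-minute rate."""
    return credits_per_minute * MINUTES_PER_DAY


if __name__ == "__main__":
    print(credits_per_day(10))
```

At 10 credits per minute this gives 14,400 credits per day, consistent with the "about 14,000 RAC" figure.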

Bob

ExtraTerrestrial Apes
Message 11635 - Posted: 3 Aug 2009 | 19:47:20 UTC - in response to Message 11278.

> In the end, the credits are meaningless. The only way to have any concrete gain from distributed computing is to run Folding@home for the EVGA team, so you can get EVGA Bucks to contribute to the purchase of an EVGA card.


You imply that "not being meaningless" equals "concrete gain", something which you seem to define as "getting something of material value" (or something like that).

For me I'd say I get lots of concrete gain due to DC. It's a hobby, I can enjoy spending time on it whenever I want to (and other duties permit it). That's nothing to touch, but it's a very real "gain" for me.

> ... but in the end the point totals only exist to encourage people to participate. 10 points a day and 100,000 points per day are really the same thing.
> The real bonus is that the computations are going to help medical science or astronomy or mathematics.


The credits also reflect the amount of contribution. Therefore they also exist to show people how good their systems are at what they're doing.

You're right that just by itself 10 or 100,000 credits don't mean anything, as it could be any arbitrary number / scaling factor. However, there's not just one of these numbers. The different amounts of credits enable comparisons between systems and projects (how valid these are is another question) and 10 credits do not equal 100,000 any more when your co-cruncher gets 1000.

MrS
____________
Scanning for our furry friends since Jan 2002

TomaszPawel
Message 11668 - Posted: 5 Aug 2009 | 12:36:08 UTC - in response to Message 11635.

Explain this paradox to me...

857-GIANNI_BIND001-0-100-RND8741_3
Approximate elapsed time for entire WU: 31806.654 s
Granted credit 5664.88715277777

460-GIANNI_DOPc-0-25-RND4727_1
Approximate elapsed time for entire WU: 29039.251 s
Granted credit 6379.86834490741

In my opinion, the WU that takes longer to compute should be granted more points.

Assuming that the second WU was scored correctly, the first one was under-credited.
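
To put numbers on the discrepancy, here is a minimal Python sketch that converts the two results above into credit per hour. The elapsed times and granted credits are copied from the post; the tuple layout and function name are illustrative only.

```python
# Credit rate for the two work units quoted above.
# Each entry: (work unit name, elapsed seconds, granted credit).
WORK_UNITS = [
    ("857-GIANNI_BIND001-0-100-RND8741_3", 31806.654, 5664.88715277777),
    ("460-GIANNI_DOPc-0-25-RND4727_1", 29039.251, 6379.86834490741),
]


def credit_per_hour(elapsed_s: float, credit: float) -> float:
    """Granted credit divided by elapsed time, scaled to one hour."""
    return credit / elapsed_s * 3600.0


if __name__ == "__main__":
    for name, elapsed_s, credit in WORK_UNITS:
        print(f"{name}: {credit_per_hour(elapsed_s, credit):.1f} credits/hour")
```

The first WU comes out at roughly 641 credits per hour and the second at roughly 791, so the longer WU earned noticeably less per unit of run time on this host.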
____________
POLISH NATIONAL TEAM - Join! Crunch! Win!

ExtraTerrestrial Apes
Message 11676 - Posted: 5 Aug 2009 | 20:54:28 UTC - in response to Message 11668.

Which has what to do with "ATI vs nVidia GPUs"?

MrS
____________
Scanning for our furry friends since Jan 2002

TomaszPawel
Message 11679 - Posted: 5 Aug 2009 | 21:45:56 UTC - in response to Message 11676.
Last modified: 5 Aug 2009 | 21:48:48 UTC

> > In the end, the credits are meaningless. The only way to have any concrete gain from distributed computing is to run Folding@home for the EVGA team, so you can get EVGA Bucks to contribute to the purchase of an EVGA card.
>
> You imply that "not being meaningless" equals "concrete gain", something which you seem to define as "getting something of material value" (or something like that).
>
> For me I'd say I get lots of concrete gain due to DC. It's a hobby, I can enjoy spending time on it whenever I want to (and other duties permit it). That's nothing to touch, but it's a very real "gain" for me.
>
> > ... but in the end the point totals only exist to encourage people to participate. 10 points a day and 100,000 points per day are really the same thing.
> > The real bonus is that the computations are going to help medical science or astronomy or mathematics.
>
> The credits also reflect the amount of contribution. Therefore they also exist to show people how good their systems are at what they're doing.
>
> You're right that just by itself 10 or 100,000 credits don't mean anything, as it could be any arbitrary number / scaling factor. However, there's not just one of these numbers. The different amounts of credits enable comparisons between systems and projects (how valid these are is another question) and 10 credits do not equal 100,000 any more when your co-cruncher gets 1000.
>
> MrS

> Which has what to do with "ATI vs nVidia GPUs"?
>
> MrS


Good Question

If we want to be that strict, must we start another thread?
____________
POLISH NATIONAL TEAM - Join! Crunch! Win!

Beyond
Joined: 23 Nov 08
Posts: 1112
Credit: 6,162,416,256
RAC: 0
Message 12983 - Posted: 2 Oct 2009 | 17:35:10 UTC
Last modified: 2 Oct 2009 | 17:35:50 UTC

Are there any new developments on the ATI GPU front for GPUGRID?
Are there apps based on OpenCL planned in the near future?
Thanks!

Paul D. Buck
Joined: 9 Jun 08
Posts: 1050
Credit: 37,321,185
RAC: 0
Message 12985 - Posted: 2 Oct 2009 | 18:29:55 UTC - in response to Message 12983.
Last modified: 2 Oct 2009 | 18:30:16 UTC

> Are there any new developments on the ATI GPU front for GPUGRID?
> Are there apps based on OpenCL planned in the near future?
> Thanks!

GDF said there would be ... I don't know, ooops yes I do: which thread... but it is about here someplace...

Einstein hinted that they have OpenCL in the works too, but it is lagging behind the CUDA version they are working on ... I do not know that there will be a mad rush ... or that this will lead to Mac OpenCL apps at the same time ... but I can hope, because I have a Mac that I could run GPUs in too ...

bloodrain
Joined: 11 Dec 08
Posts: 32
Credit: 748,159
RAC: 0
Message 17268 - Posted: 24 May 2010 | 22:19:26 UTC - in response to Message 7550.

You nailed it, gdf. That is what I was going to post.

Fred J. Verster
Joined: 1 Apr 09
Posts: 58
Credit: 35,833,978
RAC: 0
Message 21234 - Posted: 24 May 2011 | 23:53:54 UTC - in response to Message 17268.

And SETI Beta is also testing a MultiBeam app (rev. 177 and rev. 234(?)) and an
AstroPulse app (rev. 521) for ATI cards. The speed gain on AstroPulse is about
10 times less runtime (CPU time strongly depends on the % of radar blanking):
~10% CPU time with 0% radar blanking.

And the long runs here (I just finished 2 of them) also take 10-12 hours, but
more credit is given compared to SETI CUDA.

I think the GPUGrid app makes better use of the more advanced architecture of
Fermi, i.e. compute capability 2.0 or 2.1.



____________

Knight Who Says Ni N!
