ATI vs NVIDIA GPUs

Message boards : Number crunching : ATI vs NVIDIA GPUs

Author	Message
GDF Volunteer moderator Project administrator Project developer Project tester Volunteer developer Volunteer tester Project scientist Send message Joined: 14 Mar 07 Posts: 1957 Credit: 629,356 RAC: 0 Level Scientific publications	Message 7550 - Posted: 17 Mar 2009 \| 13:24:02 UTC Last modified: 17 Mar 2009 \| 16:56:20 UTC
	I will try to answer some of the doubts that I see in the forum. ATI and Nvidia have been head-to-head for a long time, with pros and cons for each vendor. ATI FireStream and other top ATI cards have more or less the same single-precision flops as top Nvidia cards. http://en.wikipedia.org/wiki/GeForce_200_Series http://en.wikipedia.org/wiki/Comparison_of_ATI_Graphics_Processing_Units So far, Nvidia seems better for GPGPU programming, thanks to CUDA and its GPGPU hardware support. For complex code, this is the best solution until OpenCL comes out and ATI adapts its hardware. ATI is so far better in double-precision performance, but this is still much lower than single-precision performance anyway. gdf

Gipsel Send message Joined: 17 Mar 09 Posts: 12 Credit: 0 RAC: 0 Level Scientific publications	Message 7554 - Posted: 17 Mar 2009 \| 16:47:08 UTC - in response to Message 7550. Last modified: 17 Mar 2009 \| 16:48:00 UTC
	> Nvidia seems so far better in GPGPU programming with CUDA and GPGPU hardware support. For a complex code, this is the best solution until OpenCL comes out and ATI adapts their hardware. ATI is so far better in double precision performance, but this is still much less than single-precision performance anyway.
	That is right. CUDA is currently a much more mature SDK than Stream. Implementing a complex algorithm with CUDA should be much more straightforward than on ATI hardware at the moment, which makes life easier for the developer. Furthermore, some of the hardware functionality needed to speed up some algorithms is only implemented in ATI's HD4000 series (with support in high-level languages like Brook only slowly arriving), whereas on the nvidia side such features are already available in earlier cards. We will have to wait another 3 months or so before one can start to compare the OpenCL implementations of both manufacturers.
	Concerning the single/double precision performance of ATI and nvidia you are also right. With single precision, nvidia and ATI are roughly on par (ATI claims only slightly higher theoretical values). But when going to doubles, the nvidia performance drops to about 8% of the single-precision throughput, while ATI is able to sustain 20% of its advertised single-precision peak performance. That means a HD3850 has about the same theoretical double-precision performance as a GTX285, and a HD4800 series card is significantly faster for such things.
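	To make the ratios above concrete, here is a minimal Python sketch. The 8% and 20% fractions are the ones quoted in the post; the single-precision peak figures are approximate values assumed purely for illustration and do not come from this thread.

```python
# Fractions quoted in the post: double-precision throughput as a share of
# the advertised single-precision peak.
DP_FRACTION = {"nvidia": 0.08, "ati": 0.20}


def theoretical_dp_gflops(sp_peak_gflops, vendor):
    """Estimate theoretical double-precision GFLOPS from a single-precision peak."""
    return sp_peak_gflops * DP_FRACTION[vendor]


# ASSUMED approximate single-precision peaks in GFLOPS (not from this thread).
ASSUMED_SP_PEAK = {
    "HD3850": (430.0, "ati"),
    "GTX285": (1060.0, "nvidia"),
    "HD4870": (1200.0, "ati"),
}

for card, (sp_peak, vendor) in ASSUMED_SP_PEAK.items():
    dp = theoretical_dp_gflops(sp_peak, vendor)
    print(f"{card}: ~{dp:.0f} GFLOPS double precision (theoretical)")
```

	Under these assumed peaks, the HD3850 and the GTX285 land within a few percent of each other, while the HD4870 comes out well ahead, which is the pattern the post describes.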

GDF Volunteer moderator Project administrator Project developer Project tester Volunteer developer Volunteer tester Project scientist Send message Joined: 14 Mar 07 Posts: 1957 Credit: 629,356 RAC: 0 Level Scientific publications	Message 7555 - Posted: 17 Mar 2009 \| 17:02:17 UTC - in response to Message 7554.
	Hi, ACEMD uses only single precision on the GPU for maximum performance. As you say, the double-precision performance of ATI is still 5 times lower than the single-precision performance of any ATI or NVIDIA GPU. Fortunately, for molecular dynamics applications it is OK to use single precision. gdf

Gipsel Send message Joined: 17 Mar 09 Posts: 12 Credit: 0 RAC: 0 Level Scientific publications	Message 7557 - Posted: 17 Mar 2009 \| 17:37:13 UTC - in response to Message 7555.
	> Hi, ACEMD uses only single precision on the GPU for maximum performance. As you say, the double-precision performance of ATI is still 5 times lower than the single-precision performance of any ATI or NVIDIA GPU. Fortunately, for molecular dynamics applications it is OK to use single precision.
	I know. I only wanted to quantify that "much less than single-precision performance". By the way, could you have a look at the credit calculation thread?

GDF Volunteer moderator Project administrator Project developer Project tester Volunteer developer Volunteer tester Project scientist Send message Joined: 14 Mar 07 Posts: 1957 Credit: 629,356 RAC: 0 Level Scientific publications	Message 7560 - Posted: 17 Mar 2009 \| 19:41:50 UTC - in response to Message 7557.
	> By the way, could you have a look in the credit calculation thread?
	DONE.

TomaszPawel Send message Joined: 18 Aug 08 Posts: 121 Credit: 59,836,411 RAC: 0 Level Scientific publications	Message 11050 - Posted: 9 Jul 2009 \| 6:49:28 UTC - in response to Message 7560.
	Theory is beautiful but life is brutal: in GPUGRID with 3x GTX260 SP216 you can get 55,000 credits per day; in MilkyWay with 2x HD4870 you can get 110,000 credits per day... So where is the balance? In the MilkyWay project you can run 2 WUs per GPU. In GPUGRID you can run only 1 WU per GPU. I really like GPUGRID, and that is the only reason why, when I recently upgraded my computers, I bought 3 GTX260s and not 3 HD4870s..... Still, I am not happy when I see such big differences in the scoring. ____________ POLISH NATIONAL TEAM - Join! Crunch! Win!
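	Normalizing the figures in this post per GPU makes the gap easier to see. The following is only a small Python sketch of that arithmetic, using the host totals quoted above; the function name is made up for illustration.

```python
def credits_per_gpu(credits_per_day, gpu_count):
    """Average daily credit earned by each GPU in a host."""
    return credits_per_day / gpu_count


# Host totals quoted in the post.
gpugrid = credits_per_gpu(55_000, 3)    # 3x GTX260 at GPUGRID
milkyway = credits_per_gpu(110_000, 2)  # 2x HD4870 at MilkyWay

print(f"GPUGRID:  {gpugrid:,.0f} credits/day per GPU")
print(f"MilkyWay: {milkyway:,.0f} credits/day per GPU")
print(f"Ratio:    {milkyway / gpugrid:.1f}x")
```

	With these numbers each HD4870 at MilkyWay earns about three times what each GTX260 earns at GPUGRID.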

GDF Volunteer moderator Project administrator Project developer Project tester Volunteer developer Volunteer tester Project scientist Send message Joined: 14 Mar 07 Posts: 1957 Credit: 629,356 RAC: 0 Level Scientific publications	Message 11051 - Posted: 9 Jul 2009 \| 7:40:53 UTC - in response to Message 11050. Last modified: 9 Jul 2009 \| 7:43:25 UTC
	Hi, thanks for your preference. The 260s are not the fastest cards, but I understand what you mean. gdf

TomaszPawel Send message Joined: 18 Aug 08 Posts: 121 Credit: 59,836,411 RAC: 0 Level Scientific publications	Message 11052 - Posted: 9 Jul 2009 \| 8:43:33 UTC - in response to Message 11051.
	Another horrifying statistic: in MilkyWay... this host can crunch almost 400,000 credits per day!!! 2x HD4870X2.... In GPUGRID the fastest computer, 3x GTX295, makes.... only..... almost 100,000 per day :( NO COMMENT.... ____________ POLISH NATIONAL TEAM - Join! Crunch! Win!

TomaszPawel Send message Joined: 18 Aug 08 Posts: 121 Credit: 59,836,411 RAC: 0 Level Scientific publications	Message 11055 - Posted: 9 Jul 2009 \| 14:17:51 UTC - in response to Message 11052. Last modified: 9 Jul 2009 \| 14:20:27 UTC
	I have a favour to ask of the administrators: standardize the scoring so that there are no such irrational differences.... Is the MilkyWay@Home project exempt from the principles that Seti@Home and GPUGRID apply? ____________ POLISH NATIONAL TEAM - Join! Crunch! Win!

ExtraTerrestrial Apes Volunteer moderator Volunteer tester Send message Joined: 17 Aug 08 Posts: 2705 Credit: 1,311,122,549 RAC: 0 Level Scientific publications	Message 11084 - Posted: 9 Jul 2009 \| 19:23:17 UTC - in response to Message 11055.
	Milkyway released their source code and people wrote optimized apps. Some of the improvements made their way into the standard client, but the last batch of optimizations (about half a year old) didn't. The optimized clients for CPUs are about a factor of 4 faster than the stock client. For the stock client the credits are OK, but optimized ones get proportionally more. My 4870 has almost reached 80k RAC there. If you divided this by 4, the number would be somewhere in the realm of sanity.. MrS ____________ Scanning for our furry friends since Jan 2002

TomaszPawel Send message Joined: 18 Aug 08 Posts: 121 Credit: 59,836,411 RAC: 0 Level Scientific publications	Message 11101 - Posted: 10 Jul 2009 \| 6:01:21 UTC - in response to Message 11084. Last modified: 10 Jul 2009 \| 6:12:08 UTC
	When can we expect such optimized apps for GPUGRID? If it is possible for MilkyWay it must be possible for GPUGRID. ____________ POLISH NATIONAL TEAM - Join! Crunch! Win!

GDF Volunteer moderator Project administrator Project developer Project tester Volunteer developer Volunteer tester Project scientist Send message Joined: 14 Mar 07 Posts: 1957 Credit: 629,356 RAC: 0 Level Scientific publications	Message 11102 - Posted: 10 Jul 2009 \| 7:03:50 UTC - in response to Message 11101. Last modified: 10 Jul 2009 \| 7:05:35 UTC
	The problem is not the optimization but the multiplication. An ATI card in double precision cannot perform better than Nvidia in single precision. gdf

TomaszPawel Send message Joined: 18 Aug 08 Posts: 121 Credit: 59,836,411 RAC: 0 Level Scientific publications	Message 11103 - Posted: 10 Jul 2009 \| 7:35:58 UTC - in response to Message 11102. Last modified: 10 Jul 2009 \| 8:16:56 UTC
	OK, so is it possible to make such optimized apps for GeForce using CUDA (2-4 times faster than the current application, as MilkyWay users did for ATI cards, or an application which will generate 4 times more credit)? Is it impossible here for GeForce? ____________ POLISH NATIONAL TEAM - Join! Crunch! Win!

TomaszPawel Send message Joined: 18 Aug 08 Posts: 121 Credit: 59,836,411 RAC: 0 Level Scientific publications	Message 11104 - Posted: 10 Jul 2009 \| 8:39:40 UTC - in response to Message 11103. Last modified: 10 Jul 2009 \| 8:42:30 UTC
	And 2 images: [image: GPUGRID] [image: MilkyWay] It is clear that at the beginning of June, when an optimized program was published, the project doubled its computing power and its scoring. ____________ POLISH NATIONAL TEAM - Join! Crunch! Win!

ExtraTerrestrial Apes Volunteer moderator Volunteer tester Send message Joined: 17 Aug 08 Posts: 2705 Credit: 1,311,122,549 RAC: 0 Level Scientific publications	Message 11206 - Posted: 20 Jul 2009 \| 18:57:17 UTC
	Tomasz, GDF and his team are actually very good at optimizations and have been caring about this for a long time. At Milkyway the stock code was terribly inefficient; that's why they released it, to get help from the public. So the problem is not that GPU-Grid does not have an "optimized" app, the problem is that MW's stock app is not yet optimized enough. And that they base their credits on CPU computation time compared to seti (the old standard), not flops. At GPU-Grid you can already get more credits per time from your nVidia card than at seti, because GPU-Grid is more optimized (and may have more GPU-friendly code). The rapid increase of RAC you're seeing at MW is not due to the release of an optimized app. It's actually the point when Travis (MW commander in chief) decided to let the work generator run more frequently, so that all the ATI cards wouldn't run dry any more but actually get new WUs when they requested them. MrS ____________ Scanning for our furry friends since Jan 2002

RalphEllis Send message Joined: 11 Dec 08 Posts: 43 Credit: 2,216,617 RAC: 0 Level Scientific publications	Message 11278 - Posted: 23 Jul 2009 \| 21:01:55 UTC - in response to Message 11206.
	In the end, the credits are meaningless. The only way to have any concrete gain from distributed computing is to run folding@home for the EVGA team, where you can get EVGA Bucks to contribute to the purchase of an EVGA card. I like the challenge of optimizing the system to run faster, but in the end the point totals only exist to encourage people to participate. 10 points a day and 100,000 points per day are really the same thing. The real bonus is that the computations are going to help medical science or astronomy or mathematics.

popandbob Send message Joined: 18 Jul 07 Posts: 67 Credit: 40,277,822 RAC: 0 Level Scientific publications	Message 11280 - Posted: 24 Jul 2009 \| 0:24:44 UTC - in response to Message 11206.
	> At GPU-Grid you can already get more credits per time from your nVidia card than at seti, because GPU-Grid is more optimized (and may have more GPU-friendly code)
	The CUDA 2.3 drivers and DLLs make it possible to obtain more credits per time at SETI than at GPUGRID, or very close to the same. I'm getting a crazy ~10 credits per minute on my single gtx core 216, which works out to about 14000 RAC. Bob
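	The conversion from credits per minute to a daily figure is simple arithmetic. As a rough Python sketch, assuming the card crunches around the clock at a steady rate (in which case RAC should settle near the daily total):

```python
MINUTES_PER_DAY = 60 * 24


def credits_per_day(credits_per_minute):
    """Daily credit for a card running 24 hours at a constant rate."""
    return credits_per_minute * MINUTES_PER_DAY


# ~10 credits per minute is the rate quoted in the post.
print(credits_per_day(10))
```

	That gives 14,400 credits per day, consistent with the "about 14000 RAC" estimate once some idle time is allowed for.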

ExtraTerrestrial Apes Volunteer moderator Volunteer tester Send message Joined: 17 Aug 08 Posts: 2705 Credit: 1,311,122,549 RAC: 0 Level Scientific publications	Message 11635 - Posted: 3 Aug 2009 \| 19:47:20 UTC - in response to Message 11278.
	> In the end, the credits are meaningless. The only way to have any concrete gain from distributed computing is to run folding@home for the Evga team and you can get EVGA Bucks to contribute to the purchase of an EVGA card.
	You imply that "not being meaningless" equals "concrete gain", something which you seem to define as "getting something of material value" (or something like that). For me I'd say I get lots of concrete gain due to DC. It's a hobby, I can enjoy spending time on it whenever I want to (and other duties permit it). That's nothing to touch, but it's a very real "gain" for me.
	> ... but in the end the point totals only exist to encourage people to participate. 10 points a day and 100,000 point per day are really the same thing. The real bonus is that the computations are going to help medical science or astronomy or mathematics.
	The credits also reflect the amount of contribution. Therefore they also exist to show people how good their systems are at what they're doing. You're right that just by itself 10 or 100,000 credits don't mean anything, as it could be any arbitrary number / scaling factor. However, there's not just one of these numbers. The different amounts of credits enable comparisons between systems and projects (how valid these are is another question) and 10 credits do not equal 100,000 any more when your co-cruncher gets 1000. MrS ____________ Scanning for our furry friends since Jan 2002

TomaszPawel Send message Joined: 18 Aug 08 Posts: 121 Credit: 59,836,411 RAC: 0 Level Scientific publications	Message 11668 - Posted: 5 Aug 2009 \| 12:36:08 UTC - in response to Message 11635.
	Explain this paradox to me...
	857-GIANNI_BIND001-0-100-RND8741_3: Approximate elapsed time for entire WU: 31806.654 s, Granted credit 5664.88715277777
	460-GIANNI_DOPc-0-25-RND4727_1: Approximate elapsed time for entire WU: 29039.251 s, Granted credit 6379.86834490741
	In my opinion, the WU that takes longer to compute should be granted more points. Assuming that the second WU was credited correctly, the score for the first one is too low. ____________ POLISH NATIONAL TEAM - Join! Crunch! Win!
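	To quantify the discrepancy between the two work units quoted above, one can divide the granted credit by the elapsed time. This is only a small Python sketch of that arithmetic; it uses the figures from the post and says nothing about how GPUGRID actually assigns credit.

```python
# (work unit name, elapsed seconds, granted credit) as quoted in the post.
work_units = [
    ("857-GIANNI_BIND001-0-100-RND8741_3", 31806.654, 5664.88715277777),
    ("460-GIANNI_DOPc-0-25-RND4727_1", 29039.251, 6379.86834490741),
]


def credit_per_hour(elapsed_seconds, granted_credit):
    """Granted credit normalized to one hour of elapsed time."""
    return granted_credit / elapsed_seconds * 3600.0


rates = {name: credit_per_hour(t, c) for name, t, c in work_units}
for name, rate in rates.items():
    print(f"{name}: {rate:.0f} credits/hour")
```

	The longer-running BIND work unit comes out at a noticeably lower hourly rate than the DOPc one, which is the mismatch the post is asking about.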

ExtraTerrestrial Apes Volunteer moderator Volunteer tester Send message Joined: 17 Aug 08 Posts: 2705 Credit: 1,311,122,549 RAC: 0 Level Scientific publications	Message 11676 - Posted: 5 Aug 2009 \| 20:54:28 UTC - in response to Message 11668.
	Which has what to do with "ATI vs nVidia GPUs"? MrS ____________ Scanning for our furry friends since Jan 2002

TomaszPawel Send message Joined: 18 Aug 08 Posts: 121 Credit: 59,836,411 RAC: 0 Level Scientific publications	Message 11679 - Posted: 5 Aug 2009 \| 21:45:56 UTC - in response to Message 11676. Last modified: 5 Aug 2009 \| 21:48:48 UTC
	> > In the end, the credits are meaningless. The only way to have any concrete gain from distributed computing is to run folding@home for the Evga team and you can get EVGA Bucks to contribute to the purchase of an EVGA card.
	> You imply that "not being meaningless" equals "concrete gain", something which you seem to define as "getting something of material value" (or something like that). For me I'd say I get lots of concrete gain due to DC. It's a hobby, I can enjoy spending time on it whenever I want to (and other duties permit it). That's nothing to touch, but it's a very real "gain" for me.
	> > ... but in the end the point totals only exist to encourage people to participate. 10 points a day and 100,000 point per day are really the same thing. The real bonus is that the computations are going to help medical science or astronomy or mathematics.
	> The credits also reflect the amount of contribution. Therefore they also exist to show people how good their systems are at what they're doing. You're right that just by itself 10 or 100,000 credits don't mean anything, as it could be any arbitrary number / scaling factor. However, there's not just one of these numbers. The different amounts of credits enable comparisons between systems and projects (how valid these are is another question) and 10 credits do not equal 100,000 any more when your co-cruncher gets 1000. MrS
	> Which has what to do with "ATI vs nVidia GPUs"? MrS
	Good question. If we want to be so pure, must we start another thread? ____________ POLISH NATIONAL TEAM - Join! Crunch! Win!

Beyond Send message Joined: 23 Nov 08 Posts: 1112 Credit: 6,162,416,256 RAC: 0 Level Scientific publications	Message 12983 - Posted: 2 Oct 2009 \| 17:35:10 UTC Last modified: 2 Oct 2009 \| 17:35:50 UTC
	Are there any new developments on the ATI GPU front for GPUGRID? Are there apps based on OpenCL planned in the near future? Thanks!

Paul D. Buck Send message Joined: 9 Jun 08 Posts: 1050 Credit: 37,321,185 RAC: 0 Level Scientific publications	Message 12985 - Posted: 2 Oct 2009 \| 18:29:55 UTC - in response to Message 12983. Last modified: 2 Oct 2009 \| 18:30:16 UTC
	> Are there any new developments on the ATI GPU front for GPUGRID? Are there apps based on OpenCL planned in the near future? Thanks!
	GDF said there would be ... I don't know, ooops yes I do: which thread... but it is about here someplace... Einstein hinted that they have OpenCL in the works too, but it is lagging behind the CUDA version they are working on ... I do not know that there will be a mad rush ... or that this will lead to Mac OpenCL apps at the same time ... but I can hope, because I have a Mac that I could run GPUs in too ...

bloodrain Send message Joined: 11 Dec 08 Posts: 32 Credit: 748,159 RAC: 0 Level Scientific publications	Message 17268 - Posted: 24 May 2010 \| 22:19:26 UTC - in response to Message 7550.
	You nailed it, gdf, on what I was going to post.

Fred J. Verster Send message Joined: 1 Apr 09 Posts: 58 Credit: 35,833,978 RAC: 0 Level Scientific publications	Message 21234 - Posted: 24 May 2011 \| 23:53:54 UTC - in response to Message 17268.
	And SETI Beta is also testing a MultiBeam app, rev177 & rev.234(?), and an AstroPulse app, rev521, for ATI cards. The speed gain on AstroPulse is about 10 times less runtime (CPU time strongly depends on % radar blanking): ~10% CPU time with 0% radar blanking. And the Longruns here (I just finished 2 of them) also take 10-12 hours, but more credit is given compared to SETI CUDA. I think the GPUGrid app makes better use of the more advanced architecture of FERMI, or compute capability 2.0 or 2.1. ____________ Knight Who Says Ni N!
