Message boards : Graphics cards (GPUs) : Fermi

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Message 15301 - Posted: 18 Feb 2010 | 17:00:20 UTC

http://www.nvidia.com/object/fermi_architecture.html

Snow Crash
Send message
Joined: 4 Apr 09
Posts: 450
Credit: 539,316,349
RAC: 0
Level
Lys
Message 15305 - Posted: 18 Feb 2010 | 17:25:13 UTC - in response to Message 15301.

I'm getting at least one and maybe two. We just need to have them available so we can see how quickly they process WUs!

____________
Thanks - Steve

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Level
Met
Message 15316 - Posted: 18 Feb 2010 | 21:42:08 UTC

Nice link with that video explaining the basics! Just be sure to check its price and power supply requirements before you buy a couple of them ;)

MrS
____________
Scanning for our furry friends since Jan 2002

Snow Crash
Send message
Joined: 4 Apr 09
Posts: 450
Credit: 539,316,349
RAC: 0
Level
Lys
Message 15366 - Posted: 22 Feb 2010 | 17:12:46 UTC

http://www.nvidia.com/object/paxeast.html
____________
Thanks - Steve

Profile Zydor
Send message
Joined: 8 Feb 09
Posts: 252
Credit: 1,309,451
RAC: 0
Level
Ala
Message 15382 - Posted: 23 Feb 2010 | 15:59:09 UTC - in response to Message 15366.
Last modified: 23 Feb 2010 | 15:59:28 UTC

Don't waste your money - Fermi is a dead duck.

It will be released in late March or early April, but in very limited quantities, each one sold at a huge loss and each knocking hard against the 300W PCIe certification limit. It will be a PR stunt to keep them going; by May / June you will not get one for love or money, because NVidia cannot afford to subsidise more. The current design is grossly uneconomical, and can be wiped out in a performance sense by current ATI cards, let alone the ATI cards due for release this summer.

NVidia will not fix its silicon engineering problems until it can go to 28nm production, and that will not be until 2011. On top of that, a viable production-level NVidia DX11 part is just a marketing dream at present; NVidia is not capable of producing production-standard DX11 silicon for sustained mass deployment until it goes to 28nm. Even if it took the suicidal route of sticking with a "fix" for the current Fermi design, it would take a minimum of 6 months to get to market, and the final fix would be uneconomic - it would break an already theoretically insolvent NVidia.

By the time Fermi2 comes out in 2011, ATI will be two generations ahead and NVidia will be strategically left for dust (which, frankly, it already is). 2010 will be a year in which NVidia internally positions itself for a re-branding and market repositioning. By 2011 it will be finished in its current guise, and a rump of its former self.

The 275, 280 and 285 are already formally EOL, and the 295 is all but EOL - OEMs will not get any further 295s - there are no shots left in the NVidia armoury.

I have been an NVidia fan for many years, but the party's over... I will be changing to ATI for my next PC build.

NVidia did not listen to warnings given about engineering architecture issues in 2007, and will now pay the price for that.

Regards
Zy

Snow Crash
Send message
Joined: 4 Apr 09
Posts: 450
Credit: 539,316,349
RAC: 0
Level
Lys
Message 15383 - Posted: 23 Feb 2010 | 16:18:26 UTC - in response to Message 15382.

Can you reveal your sources?

NVidia is still profitable: http://www.istockanalyst.com/article/viewarticle/articleid/3838318
____________
Thanks - Steve

Profile Zydor
Send message
Joined: 8 Feb 09
Posts: 252
Credit: 1,309,451
RAC: 0
Level
Ala
Message 15384 - Posted: 23 Feb 2010 | 16:49:55 UTC - in response to Message 15383.
Last modified: 23 Feb 2010 | 17:07:40 UTC

That's based on the first three quarters of 2009, when they still had four mainstream cards to sell - they don't now... and will not until Fermi is out. NVidia is not only about consumer-level graphics cards, and I am not suggesting it will go out of business, but it will retreat from the classic PC graphics card market; it has nothing left there. It will go into specialist graphics applications in 2011 - it can't compete with ATI.

Fermi is a disaster: the returns from the final runs of silicon at the end of January had a mere 2-5% yield - the silicon engineering is wrong. To fix it without a total redesign they have to increase the core voltage; that pushes up the watts needed to get the performance they want, and that in turn butts them up against the PCIe certification limit. They must, must, come in under 300W, because without PCIe certification OEMs will not touch them. To come in under 300W they will have to disable parts of the GPU when running at full speed.

Source - my opinion from searching the Web. Sit down one day and google real, real hard along the lines of:

GTX480, GF100 design, GF100 silicon, Fermi silicon, Fermi engineering, NVidia architecture, TSMC silicon NVidia, etc.

Keep away from "card reviews" and the like - they just regurgitate NVidia specs. Delve into the comments at a pure engineering level. The story is not a happy one.

Regards
David

Profile GDF
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Send message
Joined: 14 Mar 07
Posts: 1957
Credit: 629,356
RAC: 0
Level
Gly
Message 15385 - Posted: 23 Feb 2010 | 17:21:50 UTC - in response to Message 15384.

Fermi launch date:
http://www.theinquirer.net/inquirer/news/1593186/fermi-finally-appear

gdf

cenit
Send message
Joined: 11 Nov 09
Posts: 23
Credit: 668,841
RAC: 0
Level
Gly
Message 15386 - Posted: 23 Feb 2010 | 17:42:05 UTC - in response to Message 15385.

There is some really harsh news about Fermi out there...

Look at this one:
The chip is unworkable, unmanufacturable, and unfixable.

It also contains a lot of links and technical data to support the statement... ugh!

Profile Zydor
Send message
Joined: 8 Feb 09
Posts: 252
Credit: 1,309,451
RAC: 0
Level
Ala
Message 15387 - Posted: 23 Feb 2010 | 17:54:02 UTC - in response to Message 15385.

They will be out of stock by 1 Jul, maybe earlier depending on how fast take-up is from existing users. After that they can't produce more - they don't have the cash to subsidise the fix they need to apply to the silicon. It's the 250 spin saga all over again, but this time there is nothing like the 295 producing cash flow to make up for it. The current cards are just a PR stunt pre-Fermi; they can't stand up to ATI on their own performance.

There are just too many indicators. Another one: why are ATI/AMD dragging their heels on an FFT fix? They say they have one and that it will be deployed in the next OpenCL release "in a few months". A few months?? When both ATI and NVidia are (at least publicly) shouting the virtues of OpenCL. Look at it from the other angle: with NVidia's engineering problems, why should ATI accelerate OpenCL development and fixes? Without NVidia in its current form there is no real need for ATI to push OpenCL.

All speculative, for sure - it's only an opinion. However there are far too many indicators supporting this and, equally significant, no retraction of the main issues from NVidia or their partners/supporters in response to some very strong reviews and statements made about the silicon issue.

Time will tell, I guess, but there is certainly enough to say "hold for now". If the Fermi production reviews pick this up, and they will, or the cards suddenly become scarce "due to extraordinary demand" at the end of June, then we will know. However, I suspect the industry experts and analysts will have blown this open before then, once the production card is out there and its engineering can be picked apart. Then the next phase will begin, with spin covering the delay over Fermi2, but I don't think they will get away with it this time.

Regards
Zy

Profile Zydor
Send message
Joined: 8 Feb 09
Posts: 252
Credit: 1,309,451
RAC: 0
Level
Ala
Message 15388 - Posted: 23 Feb 2010 | 17:59:42 UTC - in response to Message 15386.
Last modified: 23 Feb 2010 | 18:03:45 UTC

There is some really harsh news about Fermi out there...

Look at this one:
The chip is unworkable, unmanufacturable, and unfixable.

It also contains a lot of links and technical data to support the statement... ugh!


Charlie is a known Nvidia hater - he would spit on their grave given the chance, so we have to take some of his ire with a pinch of salt. However even if only half of that report is true, NVidia are sunk without trace.

In fairness to him, he has been right with his predictions on NVidia for the last 2 years, especially on the engineering aspects.

I'm trying to keep an open mind ... really am ... but it's really hard to with what's out there.

Regards
Zy

Profile GDF
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Send message
Joined: 14 Mar 07
Posts: 1957
Credit: 629,356
RAC: 0
Level
Gly
Message 15389 - Posted: 23 Feb 2010 | 18:25:15 UTC - in response to Message 15388.

Luckily we need to wait only 1 month now to know what is true.

gdf

Profile liveonc
Avatar
Send message
Joined: 1 Jan 10
Posts: 292
Credit: 41,567,650
RAC: 0
Level
Val
Message 15410 - Posted: 24 Feb 2010 | 20:49:02 UTC - in response to Message 15384.
Last modified: 24 Feb 2010 | 20:49:55 UTC

That's based on the first three quarters of 2009, when they still had four mainstream cards to sell - they don't now... and will not until Fermi is out. NVidia is not only about consumer-level graphics cards, and I am not suggesting it will go out of business, but it will retreat from the classic PC graphics card market; it has nothing left there. It will go into specialist graphics applications in 2011 - it can't compete with ATI.

Fermi is a disaster: the returns from the final runs of silicon at the end of January had a mere 2-5% yield - the silicon engineering is wrong. To fix it without a total redesign they have to increase the core voltage; that pushes up the watts needed to get the performance they want, and that in turn butts them up against the PCIe certification limit. They must, must, come in under 300W, because without PCIe certification OEMs will not touch them. To come in under 300W they will have to disable parts of the GPU when running at full speed.

Source - my opinion from searching the Web. Sit down one day and google real, real hard along the lines of:

GTX480, GF100 design, GF100 silicon, Fermi silicon, Fermi engineering, NVidia architecture, TSMC silicon NVidia, etc.

Keep away from "card reviews" and the like - they just regurgitate NVidia specs. Delve into the comments at a pure engineering level. The story is not a happy one.

Regards
David


Stupid question, but I'll ask anyway. If the GTX470/480 comes out clocked at 676MHz to meet the PCIe certification limits, and Nvidia gave the option to go beyond 300W (at your own risk), would they get that "edge" people are looking for? Overclocking always voids warranties anyway, so why not give the option (at your own risk) and package that extra PCIe 12V connector in a way that voids all warranty if unsealed...
____________

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Level
Met
Message 15413 - Posted: 24 Feb 2010 | 22:32:02 UTC - in response to Message 15410.

You will probably be able to OC the Fermi cards, just like the previous cards. However, almost any attempt at doing so will get you past 300 W. So nVidia don't have to do much to allow this ;)

And it's not like your system suddenly fails if you draw more than that. At least in quality hardware there are always some safety margins built in. A nice example of this is ATI's 2-chip 5970: it draws almost exactly 300 W under normal conditions, but when OCed goes up to ~400 W. It works, but the heat and noise are unpleasant to say the least. And the power supply circuitry is quite challenged by such a load and can become a limiting factor (this could be helped by a more expensive board design).

Is that the "edge"? It depends.. I wouldn't like to draw about twice the power of an ATI for a few tens of percent more performance. For me it would have to be at least ~80% more performance for 100% more power. Oh, and don't forget that you can OC the ATIs as well ;)
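
To put a number on that trade-off (my own illustrative break-even calculation, nothing official):

# Relative performance per watt if a card delivers 80% more performance for 100% more power.
perf_ratio = 1.8
power_ratio = 2.0
print(perf_ratio / power_ratio)  # 0.9 - i.e. I'd accept up to ~10% worse perf/W for the faster card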

MrS
____________
Scanning for our furry friends since Jan 2002

Profile robertmiles
Send message
Joined: 16 Apr 09
Posts: 503
Credit: 727,920,933
RAC: 15,982
Level
Lys
Message 15419 - Posted: 25 Feb 2010 | 2:46:27 UTC - in response to Message 15413.

Would you like to order some Fermi cards before they are built, and before the power requirements and other specs are released? You can.

http://techreport.com/discussions.x/18515#jazz

I'll let you decide if it looks worth it.

Profile robertmiles
Send message
Joined: 16 Apr 09
Posts: 503
Credit: 727,920,933
RAC: 15,982
Level
Lys
Message 15422 - Posted: 25 Feb 2010 | 6:37:35 UTC

Seems that my last post in this thread has turned out to be questionable - when I found the site that lists them for sale again, it now shows both GTX4xx models as out of stock.

Also I found an article saying that the amount of memory shown on that web page is questionable for the Fermi design.

http://www.brightsideofnews.com/news/2010/2/21/nvidia-geforce-gtx-480-shows-up-for-preorder-and-the-specs-look-fishy.aspx

Also an article showing Nvidia's announcement plans for that series:

http://www.nvidia.com/object/paxeast.html

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Message 15428 - Posted: 25 Feb 2010 | 13:22:46 UTC - in response to Message 15413.
Last modified: 25 Feb 2010 | 14:17:49 UTC

NVidia has taken huge steps over the last year towards the low and mid range card market, for many reasons, none of them being ATI. This should not be seen as a migration away from the top end market, but rather as a necessary defence and improvement of their low to mid range market. This is where most GPUs are sold, so it is the main battlefield, and NVidia face a bigger threat than ATI in this war.

NVidia's mid range cards such as the GT220 and GT240 offer much lower power usage than previous generations, making them very attractive to the occasional gamer, home media centre users (with their HDMI interface), the office environment, and of course to GPUGrid. The GT240 offers similar performance to an 8800GS or 9600GT in out-and-out GPU processing, but incorporates new features and technologies. Yet, where the 8800GS and 9800GT use around 69W when idle and 105W under high usage, the GT240 uses about 10W idle and up to 69W (typically 60 to 65W) in high use. As these cards do not require any special power connectors, are quiet and are not oversized, they fit into more systems and are more ergonomically friendly.

WRT crunching GPUGrid, the GT240 will significantly outperform these and many more powerful, and power hungry, CC1.1 cards (such as a GTS 250, 150W TDP) because the GT240 is CC1.2, and it is much, much more reliable (GT215 core rather than G92)!

NVidia has also entered the extra-PC market, creating chips for things such as TVs and mobiles. However, this is not just an attempt to stave off Intel's offensive on the low end GPU market. NVidia have found allies in the form of ARM; Intel has also started to take the low end CPU fight to ARM, with the release of Atom and the continued development of small CPUs of limited power. However, ARM is not an easy target, not by a long way: well over a billion ARM-based CPUs ship every year, in various devices including just about every mobile on the planet. So NVidia have found very solid support from ARM (especially for Tegra 2, NVidia's computer-on-a-chip built around the ARM Cortex-A9, which could be making its way to a TV, game console, PDA or netbook near you any time soon).

So Intel’s recent move towards low end CPUs and GPUs, and more importantly their development of single chip CPU and GPUs naturally makes them the common enemy of NVidia and ARM. If NVidia failed to respond to this challenge they would have their market place threatened not just by ATI & AMD but by Intel on the main battlefield (where most cards are).


A few things to note about PCIe technologies:
PCIe 1.x signals at 2.5 GT/s
PCIe 2.0 signals at 5 GT/s
PCIe 2.1 signals at 5 GT/s
PCIe 3.0 will signal at 8 GT/s, but it has not yet been ratified - rather, it has been delayed until the second quarter of 2010, and perhaps then some.
It is no surprise that it has not been decided upon - Fermi has not been released!

This highlights the secretive nature of GPU manufacturers, too scared to reveal what they are up to even if it hampers their own sales. It lends weight to the argument that the Fermi cards will be an NVidia loss leader; however, we are not privy to the internal cost analysis, just the speculation, and you could argue that Fermi is a trail-blazing technology: a flagship for newly developed technology, which will lend itself to new mid range cards over the next year or so and set a new bar for performance.

It is worth noting that Intel developed PCIe, so they might want to try to scupper the technology to nobble both ATI and NVidia. Seeing as Intel don't have a top end GPU, they might want to deny other GPU manufacturers the opportunity to compete on something they cannot control.

It strikes me as likely that people who buy a Fermi in a month's time might have to buy a new motherboard sometime later in the year to fully appreciate the card. Today it would have to work on a PCIe 2.0 (or 2.1) based motherboard, as these are the best there are, but I would not be surprised if better performance is brought by a future PCIe 3.0 slotted motherboard:
PCIe 2.0 uses an 8b/10b encoding scheme (at the data transmission layer), which results in a 20% overhead. PCIe 3.0 will use a 128b/130b encoding scheme, resulting in a <2% overhead, but this is only one area of improvement.
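
As a rough illustration of what those encoding overheads mean for usable bandwidth (a back-of-the-envelope sketch using the commonly quoted signalling rates, not an official spec calculation):

# Per-lane bandwidth estimate from signalling rate and encoding overhead.
# Real-world throughput is lower still because of protocol overheads.
def lane_bandwidth_mb_s(gigatransfers_per_s, payload_bits, total_bits):
    usable_bits_per_s = gigatransfers_per_s * 1e9 * payload_bits / total_bits
    return usable_bits_per_s / 8 / 1e6  # bits -> bytes -> MB/s

print(lane_bandwidth_mb_s(5, 8, 10))     # PCIe 2.0, 8b/10b:    ~500 MB/s per lane
print(lane_bandwidth_mb_s(8, 128, 130))  # PCIe 3.0, 128b/130b: ~985 MB/s per lane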

Profile [AF>Libristes] Dudumomo
Send message
Joined: 30 Jan 09
Posts: 45
Credit: 425,620,748
RAC: 0
Level
Gln
Message 15429 - Posted: 25 Feb 2010 | 13:40:35 UTC - in response to Message 15428.

Thanks SKGiven.
Interesting post.

Profile Beyond
Avatar
Send message
Joined: 23 Nov 08
Posts: 1112
Credit: 6,162,416,256
RAC: 0
Level
Tyr
Message 15430 - Posted: 25 Feb 2010 | 14:45:49 UTC - in response to Message 15428.

NVidia has taken huge steps over the last year towards the low and mid range card market, for many reasons, none of them being ATI. This should not be seen as a migration away from the top end market, but rather as a necessary defence and improvement of their low to mid range market. This is where most GPUs are sold, so it is the main battlefield, and NVidia face a bigger threat than ATI in this war.

The highest profit area, however, is the high end, and the truth is that NVidia has made numerous missteps in the high end market. AMD/ATI is executing much better at the moment. NVidia spent a lot of money developing the G200 series and now those cards are becoming more expensive and harder to find as they reach EOL status. Fermi sounds like it might be in trouble. It's too bad, as we need strong competition to spur advances and hold down prices.

It strikes me as likely that people who buy a Fermi in a month's time might have to buy a new motherboard sometime later in the year to fully appreciate the card. Today it would have to work on a PCIe 2.0 (or 2.1) based motherboard, as these are the best there are, but I would not be surprised if better performance is brought by a future PCIe 3.0 slotted motherboard:
PCIe 2.0 uses an 8b/10b encoding scheme (at the data transmission layer), which results in a 20% overhead. PCIe 3.0 will use a 128b/130b encoding scheme, resulting in a <2% overhead, but this is only one area of improvement.

Interesting but not very relevant to our needs. Even the slowest versions of PCIe presently are overkill as far as interfacing the GPU and CPU for the purpose of GPU computing. Maybe gaming is a different matter, but NVidia seems to be moving away from the high end gaming market anyway.

As far as Intel goes, over the years they've made several attempts to take over the graphics card market and have failed miserably every time. It seems that some of their much-ballyhooed technologies have recently been scrapped this time too. Still, Intel is huge, has many resources, and has never been one to let the law stand in its way when attempting to dominate a market. As always, time will tell.

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Message 15439 - Posted: 25 Feb 2010 | 18:35:40 UTC - in response to Message 15430.

Interesting but not very relevant to our needs.

Probably, but I was talking generally because I wanted to make a point about the politics behind the situation; if preventing PCIe 3.0 means another company can't release a significantly better product, one that could be used here, some will see that as a battle won.

I expect gaming would improve in some ways with a faster PCIe interface; otherwise there would be no need for PCIe 3.0. I doubt that all the hold-ups are due to the fab problems.

NVidia seems to be moving away from the high end gaming market anyway.


Well, since the GTX295 that has been true, because the battle was on a different front, but Fermi will be a competitive top end card. We will see in a month, or two, hopefully!

Intel have been trying desperately for years to be all things to all users, but their flood-the-market strategy often means the biggest competitor to a new product is one of their own existing products. I don't see Intel as a graphics chip developer and I think most people would feel the same.
I think most of their attempts to take over the graphics card market have been through ownership and control manoeuvres rather than engineering, so it is no wonder those attempts failed miserably; they never actually made a good GPU.

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Message 15442 - Posted: 25 Feb 2010 | 19:46:41 UTC - in response to Message 15439.

SemiAccurate reported the following, with some caution.
GTX480: 512 shaders @ 600MHz, or 1200MHz hot clock
GTX470: 448 shaders @ 625MHz, or 1250MHz hot clock
If correct, the GTX470 will be about 10% slower than the GTX480, but cost 40% less (going by another speculative report)!
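
A quick sanity check of that ~10% figure, using shader count times hot clock as a crude throughput proxy (it ignores memory, ROPs and everything else):

# Speculative specs from above; shader count * hot clock (MHz) only.
gtx480 = 512 * 1200
gtx470 = 448 * 1250
print(gtx470 / gtx480)  # ~0.91, i.e. the GTX470 would be roughly 9-10% slower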

I am not interested in how it compares to ATI cards playing games, just the cost, and that they don't crash tasks!

Don't like the sound of only 5000 to 8000 cards being made. You could sell that many in one country, and our wee country might be down the list a bit ;)

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Level
Met
Message 15446 - Posted: 25 Feb 2010 | 22:27:16 UTC

We started another discussion on Fermi in the ATI thread, so I'm going to reply here:

me wrote:
They desperately wanted to become the clear number 1 again, but currently it looks more like "one bird in the hand would have been better than two on the roof".

SK wrote:
If a bird in the hand is worth two in the bush, NVidia are releasing two cards, the GTX 480 and GTX 470, costing $679.99 and $479.99, how much is the bush worth?


I was thinking along the lines of
- Fermi is just too large for TSMC's 40 nm process, so even if they can get the yield up and get the via reliability under control, they can't fix the power consumption. So the chip will always be expensive and power constrained, i.e. it will be slowed down by low clock speeds due to the power limit.
- Charlie reports 448 shaders at 1.2 GHz for the current top bin of the chip.
- Had they gone for ~2.2 billion transistors instead of 3.2 billion they'd get ~352 shaders and a much smaller chip, which is (a) easier to get out of the door at all and (b) yields considerably better. Furthermore they wouldn't be as limited by power consumption (they'd end up at ~200 W at the same voltage and clock speed as the current Fermi), so they could drive clock speed and voltage up a bit. At a very realistic 1.5 GHz they'd achieve 98% of the performance of the current Fermi chips (see the quick check below).
- However, at this performance level they'd probably trade blows with ATI instead of dominating them, like a full 512 shader Fermi at 1.5 GHz probably would have been able to.
- The full 512 shader Fermi is the two birds on the roof. A 2.2 billion transistor chip is the one in the hand.. or at least within catching range.
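
Just to show where that 98% comes from (shader count times clock as a rough proxy, with the same caveats as any back-of-the-envelope number):

# Reported top bin vs the hypothetical slimmed-down Fermi, shader count * shader clock only.
current_bin = 448 * 1.2   # shaders * GHz
slim_fermi  = 352 * 1.5
print(slim_fermi / current_bin)  # ~0.98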

I didn't read the recent posts and haven't had the time yet to reply to the ones I did read. Will probably find the time over the weekend :)

MrS
____________
Scanning for our furry friends since Jan 2002

Profile robertmiles
Send message
Joined: 16 Apr 09
Posts: 503
Credit: 727,920,933
RAC: 15,982
Level
Lys
Message 15453 - Posted: 26 Feb 2010 | 3:00:53 UTC

Looks like Nvidia has planned another type of Fermi card - the Tesla 20 series.

http://www.nvidia.com/object/cuda_home_new.html

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Level
Met
Message 15457 - Posted: 26 Feb 2010 | 9:04:48 UTC - in response to Message 15453.

You've just got to love their consistent naming schemes. But there's still hope the actual products will be called 20x0, isn't there?

MrS
____________
Scanning for our furry friends since Jan 2002

Profile liveonc
Avatar
Send message
Joined: 1 Jan 10
Posts: 292
Credit: 41,567,650
RAC: 0
Level
Val
Message 15463 - Posted: 26 Feb 2010 | 18:12:54 UTC - in response to Message 15428.

Thanks for the interesting post SKGiven, on the politics of chip manufacturers. I'm not making any claims of better knowledge; just reading my other posts will reveal that I'm just a n00b with his own POV on things.

Some might point out that the largest profit comes from high end GPU sales, but even a slight profit across the vast low to mid range would result in a higher overall gross profit, and niche markets can even be expanded, as Apple has done. What the future brings is what the future brings.

AMD & Intel sell off their less perfect chips as cheaper models and the close-to-perfect ones as extreme editions. Even if Nvidia does turn out crappier-than-desired Fermis, surely they can still find a use for them...

I'm an Nvidia fan, and don't feel the need to apologize for being one, but I'm thankful that ATI is around to deal out blows on behalf of Nvidia consumers.
____________

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Message 15469 - Posted: 27 Feb 2010 | 9:26:13 UTC - in response to Message 15463.

NVidia has stopped producing its 55nm GTX285 chips. Let's hope they simply use existing improvements in technology, tweak the design a bit, and move it to 40nm with GDDR5. If Asus can stick two GTX285 chips onto one card, I think everyone would be able to if the chips were 40nm instead of 55nm. The cards would run cooler and clock better. A 20% improvement on two 1081 GFlops chips (MADD+MUL, not Boinc figures) might not make it the fastest graphics card out there, but it would still be a top end card and, more importantly, sellable and therefore profitable.

Fermi is too fat, and NVidia made the mistake of thinking fabrication equipment could be redesigned and rebuilt as if it were a GPU core design. It was bad management to let the project continue as it did. Fermi should have been shelved, and NVidia should have been trying to make competitive and profitable cards rather than an implausible monster card.

The concept of expanding the die to incorporate 3.2 billion transistors is in itself daft. Perhaps for 28nm technology, but not 40nm. It is about scalability. The wider the chip, the greater the heat, and the more difficult it is to manufacture. Hence the manufacturing problems, and the low clocks (600-625MHz). Let's face it, one of my GT240s is factory clocked to 600MHz and it is just about a mid-range card. A GTX280 built on 65nm technology clocks in at 602MHz, from June 2008! A GTS250 clocks at 738MHz. If you go forward in one way but backwards in another, there is no overall progress.

NVidia should have gone for something less bulky; scale could have been achieved through the number of GPU chips on the card. You don't see Intel or AMD trying to make such jumps between two levels of chip fab technology. Multi-chip is the way of the future, not fat chips.

Why not 3 or 4 GTX285-class chips at 40nm on one card with GDDR5? It's not as if they would have to rebuild the fabrication works.
Fermi won't be anything until it can be made at 28nm.

Snow Crash
Send message
Joined: 4 Apr 09
Posts: 450
Credit: 539,316,349
RAC: 0
Level
Lys
Message 15471 - Posted: 27 Feb 2010 | 13:29:06 UTC - in response to Message 15469.

I also want faster cards to crunch GPUGrid with, but just because Nvidia has had major delays doesn't mean the sky is falling :-)

Nvidia does not own the fab equipment, nor is it the one making changes to it. TSMC, the fab, is currently producing 40 nm wafers for ATI, so while on one hand I agree that making a very large die is more difficult, the fab plant and equipment are the same.

Was Nvidia aggressive in pursuing their vision? Yes.
Did they encounter delays because they were so aggressive? Yes.
They already shrunk the 200 arch and they already produced a 2-chip card; I think it is time for a fresh start.

Your comparison of core speeds leaves me scratching my head, because I know that a GTX 285 at 648 MHz crunches faster than a GTS 250 at 702 MHz.

Which manufacturers are pursuing a multi-chip approach?

Are you suggesting that no one buy an Nvidia card based on the Fermi arch until they shrink it to 28 nm?
____________
Thanks - Steve

Profile Beyond
Avatar
Send message
Joined: 23 Nov 08
Posts: 1112
Credit: 6,162,416,256
RAC: 0
Level
Tyr
Message 15472 - Posted: 27 Feb 2010 | 13:55:32 UTC - in response to Message 15469.

You don’t see Intel or AMD trying to make such jumps between two levels of chip fab technology.

Intel tried it with the Itanium and ended up losing technology leadership to AMD for a number of years. If they hadn't been able to manipulate and in some cases buy the benchmark companies in order to make the P4 look decent they would have lost much more.

Why not 3 or 4 GTX285-class chips at 40nm on one card with GDDR5? It's not as if they would have to rebuild the fabrication works.
Fermi won't be anything until it can be made at 28nm.

The 285 shrink would still need a major redesign to support new technologies, and even at 40nm you'd probably be hard pressed to get more than 2 on a card without pushing power and heat limits. Then would it even be competitive with the ATI 5970 for most apps? By the time NVidia got this shrink and update done, where would AMD/ATI be? I think you have a good idea but the time for it has probably passed. It will be interesting to see if Fermi is as bad as SemiAccurate says. We should know within the next few months about speed, power, heat and availability.

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Message 15473 - Posted: 27 Feb 2010 | 14:25:56 UTC - in response to Message 15471.
Last modified: 27 Feb 2010 | 14:28:44 UTC

I read that the fabrication plant's equipment is actually different; they had to redesign some of it.
NVidia has produced at least 2 dual-GPU cards, so why stop or leave it at that? The CPU manufacturers have demonstrated the way forward by moving to quad, hex and even 8 core CPUs. This is scaling done correctly! Trying to stick everything on the one massive chip is backwards, more so for GPUs.
http://www.pcgameshardware.com/aid,705523/Intel-Core-i7-980X-Extreme-Edition-6-core-Gulftown-priced/News/
http://www.pcgameshardware.com/aid,705032/AMD-12-core-CPUs-already-on-sale/News/
I was not comparing a GTX 285 to a GTS 250. I was highlighting that a card almost 2 years old has the same clock speed as Fermi, to show that they have not made any progress on this front; and when compared to a GTS250 at 702 MHz, you could argue that Fermi has lost ground in this area, which deflates the other advantages of the Fermi cards.

multi-chip approach?

    ATI:
    Gecube X1650 XT Gemini, HD 3850 X2, HD 3870 X2, HD 2850 X2, 4870 X2...
    NVidia:
    Geforce 6800 GT Dual, Geforce 6600 dual versions, Geforce 7800 GT Dual, Geforce 7900 GX2, Geforce 9800 GX2, GTX295 and Asus’s limited ed. variant with 2 GTX 285 cores... None?!?
    Quantum also made a single board Sli 3dfx Voodoo2.



Are you suggesting that no one buy an Nvidia card based on the Fermi arch until they shrink it to 28 nm?

No, but I would suggest that the best cannot be achieved from Fermi technology if it is made at 40nm; it could (and perhaps will) be seen at 28nm (as 28nm research is in progress).

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Message 15474 - Posted: 27 Feb 2010 | 14:47:31 UTC - in response to Message 15472.

Nvidia could at least have produced 40nm GTX 285 type chips for a dual-GPU board, but they chose to concentrate on low to mid range cards for mass production and to keep researching Fermi technologies. To some extent this all makes sense. There were no issues with the lesser cards, so they wasted no time in pushing them out, especially to OEMs. The GT200 and GT200b cards, on the other hand, did have manufacturing issues, enough to deter them from expanding the range. It would have been risky to go to 40nm. Who would really buy a slightly tweaked 40nm version of a GTX 260 (i.e. a GTX 285 with shaders switched off due to production issues)?

But perhaps now that they have familiarised themselves with 40nm chip production (GT 210, 220 and 240), more powerful versions will start to appear, if only to fill out NVidia's future mid range card line-up. Let's face it, a lot of their technologies are reproductions of older ones. So the better-late-than-never (if it's profitable) approach may be applied.

The last thing gamers and crunchers need is for a big GPU manufacturer to diminish into insignificance. Even a complete move out of one arena would be bad for competition.

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Level
Met
Message 15475 - Posted: 27 Feb 2010 | 14:58:19 UTC

Guys.. you need to differentiate between the GF100 chip and the Fermi architecture!

The GF100 chip is clearly:
- too large for 40 nm
- thereby very difficult to manufacture
- thereby eats so much power that its clock speed has to be throttled, making it slower than expected

However, the design itself is a huge improvement over GT200! Besides DX 11 and various smaller or larger tweaks, it brings a huge increase in geometry power, a vastly improved cache system, and it doesn't need to waste extra transistors on special double precision hardware running at 1/8th the single precision speed. Instead it uses the existing execution units in a clever way and thereby achieves 1/2 the single precision performance in DP, even better than ATI at 2/5th. Fermi even supports IEEE-standard DP, the enhanced programming models (in C), etc.

All of this is very handy and you don't need 512 shader processors to use it! All they need to do is use these features in a chip of a sane size. And since they have already worked all of this out (the first batch of GF100 chips came back last autumn, so the design must have been ready since last summer), it would be downright stupid not to use it in new chips - and to waste engineering resources adding features to GT200 and shifting it to 40 nm (which was the starting point of the Fermi design anyway).

Like ATI, they must first have been informed about the problems with TSMC's 40 nm process back in the beginning of 2008. At that time they could have scaled Fermi down a little.. just remove a few shader clusters, set the transistor budget to not much more than 2 billion, and get a chip which would actually have been manufacturable. But apparently they chose to ignore the warning, whereas ATI chose to be careful.

But now that "the child has already fallen into the well", as the Germans say, it's moot discussing these issues. What nVidia really has to do is get a mainstream version of Fermi out of the door.

Oh, and forget about 28 nm: if TSMC has got that much trouble at 40 nm, why would they fare any better at 28, which is a much more demanding target? Sure, they'll try hard not to make the same mistakes again.. but that's far more complicated than rocket science.

MrS
____________
Scanning for our furry friends since Jan 2002

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Level
Met
Message 15477 - Posted: 27 Feb 2010 | 15:34:05 UTC

A few more words on why GF100 is too large:

Ever since G80, nVidia has designed its comparatively simple shader processors to run at high clock speeds. That is a smart move, as it gives you more performance out of the same number of transistors. The G92 chip is a very good example of this: on the 65 nm process it reached 1.83 GHz at the high end at ~150 W, whereas it could also be used in the 9800GT Green at 1.38 GHz and 75 W. These numbers are actually from the later 55 nm versions, but those provided just a slight improvement.

This was a good chip, because its power requirements allowed the same silicon to be run as a high performance chip and as an energy efficient one (= lower voltages and clocks).

G92 featured "just" 750 million transistors. Then nVidia went from there to 1.4 billion transistors for the GT200 chip on practically the same process node. The chip became so big and power hungry (*) that its clock speeds had to be scaled back considerably; nVidia no longer had the option of building a high performance version. The initial GTX260 with 192 cores cost them at least twice as much as a G92 (double the number of transistors, and the actual cost is higher still, as yield drops off quickly with area), yet provided just the same single precision performance and about similar performance in games: 128 * 1.83 = 192 * 1.25, simple as that. And even worse: the new chip did not even consume less power under load, as would have been expected for a "low clock & voltage" efficient version. That's due to the leakage from twice the number of transistors.
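
A toy defect-density yield model shows why a die with twice the transistors costs well over twice as much (the defect density and die areas below are my own illustrative assumptions, not measured G92/GT200 figures):

import math

# Simple Poisson yield model: yield = exp(-D * A), D = defects per cm^2, A = die area in cm^2.
D = 0.2                            # assumed defect density
area_small, area_big = 3.0, 6.0    # cm^2: a G92-class die vs one with roughly twice the transistors

def cost_per_good_die(area_cm2):
    yield_fraction = math.exp(-D * area_cm2)
    return area_cm2 / yield_fraction   # silicon area spent per good die

print(cost_per_good_die(area_big) / cost_per_good_die(area_small))  # ~3.6x, i.e. well over 2x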

(*) Cooling a GPU with stock coolers becomes difficult (=loud and expensive) at about 150 W and definitely painful at 200 W.

At 55 nm the GT200 eventually reached up to 1.5 GHz, but that's still considerably lower than the previous generation - due to the chip being very big and running into the power limit. ATI, on the other hand, could drive their 1 billion transistor chip quite hard (1.3 V compared to 1.1 V for nVidia), extract high clock speeds and thereby more performance from cheaper silicon. They lose power efficiency this way, but proved to be more competitive in the end.

That's why nVidia stopped making GT200 - because it's too expensive to produce for what you can sell it for. Not because they wouldn't be interested in the high end market any more - that's total BS. Actually the opposite is true: GF100 was designed very aggressively to banish ATI from the high end market.

From a full process shrink (130-90-65-45-32 nm etc.) you could traditionally expect double the number of transistors at comparable power consumption. However, recent shrinks have fallen somewhat short of this mark. With GF100, nVidia took an already power-limited design (GT200 at 1.4 billion transistors) and stretched this rule quite a bit (GF100 at 3.2 billion transistors). And they had to deal with a process which performs much worse than the already-not-applicable-any-more rule. Put it all together and you run into real trouble. I.e. even if you can fix the yield issues, your design will continue to be very power limited. That means you can't use the clock speed potential of your design at all, you get less performance and have to sell at a lower price. At the same time you have to deal with the high production cost of a huge chip. That's why I thought they had gone mad when I first heard about the 3.2 billion transistors back in September..
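
Putting numbers on how far GF100 stretches that rule, using just the transistor counts quoted above:

# GT200 -> GF100 transistor growth vs the traditional "2x per full node" expectation.
gt200 = 1.4e9
gf100 = 3.2e9
print(gf100 / gt200)  # ~2.3x - beyond the usual 2x, and 55 nm -> 40 nm is only a half-node shrink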

MrS
____________
Scanning for our furry friends since Jan 2002

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Level
Met
Message 15478 - Posted: 27 Feb 2010 | 15:48:52 UTC

SK wrote:
The CPU manufacturers have demonstrated the way forward by moving to quad, hex and even 8 core CPUs. This is scaling done correctly! Trying to stick everything on the one massive chip is backwards, more so for GPUs.


Sorry, that's just not true.

One CPU core is a hand-crafted, highly optimized piece of hardware. Expanding it is difficult and requires lots of time, effort and ultimately money. Furthermore, we've run into the region of diminishing returns here: there's only so much you can do to speed up the execution of one thread. That's why they started to put more cores onto single chips. Note that they try to avoid using multiple chips, as doing so adds power consumption and a performance penalty (added latency for off-chip communication, even if you can get the same bandwidth) when you have to signal from chip to chip. The upside is lower manufacturing cost, which is why e.g. AMD's upcoming 12 core chips will be 2 chips with 6 cores each.

For a GPU the situation is quite different: each shader processor (512 for Fermi, 320 for Cypress) is (at least I hope) a highly optimized and possibly hand-crafted piece of hardware. But all the other shader processors are identical. So you could refer to these chips as 512- or 320-core processors. That wouldn't be quite true, as these single "cores" can only be used in larger groups, but I hope you get the point: GPUs are already much more "multi core" than CPUs. Spreading these cores over several chips increases yield, but at the same time reduces performance for the same reasons it hurts CPUs (power, latency, bandwidth). That's why scaling in SLI or Crossfire is not perfect.

MrS
____________
Scanning for our furry friends since Jan 2002

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Level
Met
Message 15480 - Posted: 27 Feb 2010 | 16:25:41 UTC

SK wrote:
NVidia's mid range cards such as the GT220 and GT240 offer much lower power usage than previous generations, making them very attractive to the occasional gamer, home media centre users (with their HDMI interface), the office environment, and of course to GPUGrid. The GT240 offers similar performance to an 8800GS or 9600GT in out-and-out GPU processing, but incorporates new features and technologies. Yet, where the 8800GS and 9800GT use around 69W when idle and 105W under high usage, the GT240 uses about 10W idle and up to 69W (typically 60 to 65W) in high use. As these cards do not require any special power connectors, are quiet and are not oversized, they fit into more systems and are more ergonomically friendly.


IMO you're being a little too optimistic here ;)

First off: the new compute capability 1.2 cards are great for GPU-Grid, no question. However, the applications these cards' performance is normally judged by are games, and those don't really benefit from CC 1.2 or DX 10.1. Performance-wise the GT240 trades blows with the 9600GT. I'd say overall it's a little slower, but nothing to worry about. And performance-wise it can't touch either a 9800GT or its Green edition. Power consumption is quite good, but please note that the idle power of a 9600GT or 9800GT lies between 20 and 30 W, not 70 W! BTW: the numbers XBit-Labs measures are lower than what you'll get from the usual total-system power measurements, as they measure the card directly, so power conversion inefficiency in the PSU is eliminated.

If the 9600GT were the only alternative to the GT240, it would be a nice improvement. However, the 9800GT Green is its real competitor. In Germany you can get either one or a GT240 GDDR5 for 75€. The 9800GT Green is faster and doesn't have a PCIe connector either, so both are around 70 W max under load. The GT240 wins by 10 - 15 W at idle, whereas the 9800GT Green wins on performance outside of GPU-Grid.

I'm not saying the GT240 is a bad card. It's just not as much of an improvement over the older cards as you made it sound.

I've got a nice suggestion for nVidia to make better use of these chips: produce a GT250. Take the same chip, give the PCB an extra 6-pin PCIe connector and let it draw 80 - 90 W under load. Current shader clocks of these cards already reach 1.7 GHz easily, so with a slight voltage bump they'd get to ~1.8 GHz at the power levels I mentioned. That would give them 34% more performance, would take care of the performance debate compared to the older generation, and would still be more power efficient than the older cards. nVidia would need a slight board modification, a slightly larger cooler and possibly some more MHz on the memory. Overall cost might increase by 10€. Sell it for 20€ more and everyone's happy.
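
The 34% presumably comes from comparing against the GT240's stock shader clock, which I'm assuming is 1340 MHz for the GDDR5 model (my assumption, not stated above):

# Assumed stock shader clock of a GT240 GDDR5 vs the suggested ~1.8 GHz.
stock_mhz = 1340
boosted_mhz = 1800
print(boosted_mhz / stock_mhz - 1)  # ~0.34, i.e. roughly 34% more shader throughput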

MrS
____________
Scanning for our furry friends since Jan 2002

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Level
Met
Message 15482 - Posted: 27 Feb 2010 | 17:24:09 UTC

Regarding the Article from SemiAccurate: the basic facts Charlie states are probably true, but I really disagree with his analysis and conclusions.

He states that:
(1) GF100 went from revision A1 to A2 and is now at A3
(2) A-revisions are usually to fix logic errors in the metal interconnect network
(3) GF100 has huge yield issues
(4) and draws too much power, resulting in low clock speeds, resulting in lower performance
(5) nVidia should use a B revision, i.e. changes to the silicon, to combat (4)
(6) since they didn't they used the wrong tool too adress the problem and are therefore stupid

(1-5) are probably true. But he also states (correctly, according to Anand, AMD and nVidia) that there were serious problems with the reliability of the vias. These are the little metal wires used to connect the transistors. These are part of the metal interconnects. Could it be that nVidia used the A-revisions to fix or at least relieve those problems in order to improve yields? Without further knowledge I'd rather give nVidia the benefit of the doubt here and assume they're not total idiots rather than claiming (6).

(7) nvidia designed GF100 to draw more amperes upon voltage increases compared to Cypress

That's just a complicated way of saying: every transistor consumes power. Increasing the average power draw of every transistor by, say, a factor of 1.2 (e.g. due to a voltage increase) takes a 100 W chip to 120 W, whereas a 200 W chip faces a larger increase, of 40 W, to 240 W.

(8) Fermi groups shader processors into groups of 32
(9) this is bad because you can only disable them in full groups (to improve yields from bad chips)

G80 and G92 used groups of 16 shaders, whereas GT200 went to 24-shader clusters. Fermi just takes this natural evolution towards more shading power per texture unit and fixed function hardware one step further. Going with a smaller granularity also means more control logic (shared by all shaders of one group), which means you need more transistors and get a larger, more expensive chip which runs hotter, but may extract more performance from the same number of shaders for certain code.
Charlie's words are "This level of granularity is bad, and you have to question why that choice was made in light of the known huge die size." I'd say the choice was made because of the known large die size, not in spite of it. Besides, arguing this way would paint AMD in an even worse light, as they can "only" disable shaders in groups of 80 in the 5000 series.

(10) nVidia should redesign the silicon layer to reduce the impact of the transistor variances

I'm not aware of any "magic" design tricks one can use to combat this, except changing the actual process, but that would be TSMC's business. One could obviously use fewer transistors (at least one of the measures AMD chose), but that's a chip redesign, not a revision. One could also combat transistor gate leakage by using a thicker gate oxide (resulting in lower clock speeds), but that addresses neither the variances nor the other leakage mechanisms.
If anyone can tell me what else could be done, I'd happily listen :)

(11) The only way to rescue Fermi is to switch to 28 nm

While I agree that GF100 is just too large for TSMC's 40 nm and will always be very power constrained, I wouldn't count on 28 nm as the magic fairy (as I said somewhere above). They'd rather get the high-end to mainstream variants of the Fermi design out of the door as soon as possible. If they haven't been pushing that option for a couple of months already, the term "idiots" would be quite a euphemism..

MrS
____________
Scanning for our furry friends since Jan 2002

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Message 15483 - Posted: 27 Feb 2010 | 17:35:58 UTC - in response to Message 15480.

AMD did not suddenly jump to 2x6 cores. They moved from single to dual core (on one die, unlike Intel, who preferred to use glue), then moved to quads, hex and now 12 cores. Along the way they moved from 90nm to 45nm, step by step. In fact Intel will be selling a 6-core 32nm CPU soon.

Unfortunately, NVidia tried to cram too much onto one chip too early.
As you say MrS, an increase in transistor count results in energy loss (leakage in the form of heat), and it probably rises with the square too. The more transistors you have, the higher the heat loss and the lower you can clock the GPU.

There are only two ways round this: thinner wafers, or breaking the total number of transistors up into cool, manufacturable sizes. I know this may not scale well, but at least it scales.

The 65nm G200 cards had issues with heat, and thus did not clock well - the first GTX260 sp192 (65nm) clocked to just 576MHz, poor compared to some G92 cores. The GTX 285 managed to clock to 648MHz, about 12.5% faster in itself. An excellent result given its 240:80:32 core config compared to the GTX 260's 192:64:28. This demonstrated that with design improvements, and by dropping from 65 to 55nm, excess leakage could be overcome. But to try to move from 1.4 billion transistors to 3 billion, even on a 40nm core, was madness; not least because 40nm was untested technology and a huge leap from 55nm.

They "bit off more than they could chew"!

If they had designed a 2 billion transistor GPU that could be paired with another on one board, and also dropped to 40nm, they could have made a very fast card, and one we could have been using for some time - it would have removed some manufacturing problems and vastly increased yields at 40nm, making the project financially viable in itself.

I think they could simply have reproduced a GTX 285 core at 40nm that could have been built into a dual-GPU card, called it a GTX 298 or something, and it would have yielded at least a 30% performance gain over a GTX 295. An opportunity missed.

I still see a quad GPU card as being a possibility, even more so with Quad Sli emerging. If it is not on any blueprint drafts I would be surprised.

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Level
Met
Message 15485 - Posted: 27 Feb 2010 | 19:12:07 UTC - in response to Message 15483.

AMD did not suddenly jump to 2x6 cores. They moved from single to dual core (on one die, unlike Intel, who preferred to use glue), then moved to quads, hex and now 12 cores. Along the way they moved from 90nm to 45nm, step by step. In fact Intel will be selling a 6-core 32nm CPU soon.


I know, but what's the point? What I was trying to say is that the CPU guys are trying to avoid multi-chip packages. They lead to larger chips (additional circuitry is needed for communication), higher power consumption (from that additional circuitry), lower performance (due to higher latency) and higher costs due to the more complex packaging. The only upsides are improved yields on the smaller chips and faster time to market if you already have them. So they only use it when either (i) the yields on one large chip would be so low that they offset all the penalties of going multi-chip, or (ii) the cost of a separate design, mask set etc. for the large chip would be too high compared to the expected profit from it, so in the end using 2 existing chips is more profitable and good enough.

There is only 2 ways round this; thinner wafers, or to break up the total number of transistors into cool manufacturable sizes. I know this may not scale well, but at least it scales.


Thinner wafers?

I think they could simply have reproduced a GTX 285 core at 40nm that could have been built into a dual-GPU card, called it a GTX 298 or something, and it would have yielded at least a 30% performance gain over a GTX 295. An opportunity missed.


And I'd rather see this chip based on the Fermi design ;)

I still see a quad GPU card as being a possibility, even more so with Quad Sli emerging.


Quad SLI: possibly, just difficult to make it scale without microstutter.

Quad chips: a resounding no. Not as an add-in card for ATX or BTX cases. Why? Simple: a chip like Cypress at ~300 mm² generally gets good enough yield and die harvesting works well. However, push a full Cypress a little and you easily approach 200 W. Use 2 of them and you're at 400 W, or just short of 300 W and downclocked as in the case of the 5970. What could you gain by trying to put 3 Cypress onto one board / card? You'd have to downclock further and thereby reduce the performance benefit, while at the same time struggling to stay within power and space limits. The major point is that producing a Cypress-class chip is not too hard, and you can't put more than 2 of these onto a card. And using 4 Junipers is going to be slower and more power hungry than 2 Cypress due to the same reasons AMD is not composing their 12 core CPU from 12 single dies.
I could imagine an external box with its own PSU, massive fans and 800+ W of power draw though. But I'm not sure there's an immediate large market for this ;)
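For anyone who wants to put rough numbers on that, here is a small back-of-the-envelope sketch (plain C++, no GPU needed). It assumes one Cypress-class chip draws about 200 W at full clock and that power scales roughly with the cube of clock speed (f x V², with voltage tracking frequency) - both are simplifying assumptions for illustration, not measured figures.

```cpp
// multi_chip_power.cpp - rough sketch: why 3 or 4 big GPUs on one card buy little.
// Assumptions (not measured data): one Cypress-class chip ~200 W at full clock,
// a 300 W board limit, and dynamic power ~ f * V^2 with V tracking f, i.e. P ~ f^3.
#include <cmath>
#include <cstdio>

int main() {
    const double chip_full_power_w = 200.0;  // assumed full-clock draw of one chip
    const double board_limit_w     = 300.0;  // PCIe add-in card ceiling

    for (int chips = 1; chips <= 4; ++chips) {
        double budget_per_chip = board_limit_w / chips;
        double clock_factor = std::cbrt(budget_per_chip / chip_full_power_w);
        if (clock_factor > 1.0) clock_factor = 1.0;        // can't clock above stock here
        double relative_performance = chips * clock_factor; // ideal scaling assumed
        std::printf("%d chip(s): clocks at %3.0f%% -> ~%.2fx single-chip throughput\n",
                    chips, clock_factor * 100.0, relative_performance);
    }
    return 0;
}
```

Even with perfectly ideal scaling, the third and fourth chips buy less and less extra throughput because everything has to be downclocked to stay under 300 W - which is exactly the problem described above.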

MrS
____________
Scanning for our furry friends since Jan 2002

Profile robertmiles
Send message
Joined: 16 Apr 09
Posts: 503
Credit: 727,920,933
RAC: 15,982
Level
Lys
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 15486 - Posted: 27 Feb 2010 | 19:28:25 UTC - in response to Message 15483.

I still see a quad GPU card as being a possibility, even more so with Quad SLI emerging. If it is not on any blueprint drafts I would be surprised.


Sounds reasonable, IF they can reduce the amount of heat per GPU enough that the new boards don't produce any more heat than the current ones. How soon do you expect that?

Or do you mean boards that occupy four slots instead of just one or two?

If they decide to do it by cutting the clock rate in half instead, would you buy the result?

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 15487 - Posted: 27 Feb 2010 | 21:14:23 UTC - in response to Message 15486.
Last modified: 27 Feb 2010 | 21:54:26 UTC

Yeah, I was technically inaccurate to say that the 9800 GT uses 69W when idle; I looked it up and that is what I found! It would be more accurate, albeit excessive, to say that system usage with the GPU installed and sitting idle can increase by up to 69W, with the GPU itself perhaps using about 23W (the accurate bit); the additional losses come from the components whose consumption depends on the GPU - the PSU and the motherboard (the vague and variable bit). Of course if I say this for the 9800GT I would need to take the same approach with the GT240 (6W idle), and mention that in reality idle power consumption also depends on your motherboard's ability to turn power off, your PSU, and the general efficiency of the chipset and motherboard. Oh, and the operating system's power settings, but again that is excessive for a generalisation.

- XbitLabs compared a GT240 to a 9800GT (p4) and then replaced the 9800GT with a GTS250 (p6)! Crunching on GPUGrid, and overclocked by between 12 and 15%, my temps on 3 different types of GT240 stay around 49 degrees C. The only exception is one card that sits a bit too close to a chipset heatsink in an overclocked i7 system. How they got 75 degrees at stock clocks is beyond me. Even my hot (58 degree) card's fan is at 34%.

Anyway, I think the GT240 is a good all-round midrange card for its many features, and an excellent card for GPUGrid. For the occasional light gamer who does not use the system as a media centre or for GPUGrid, you are correct, there are other better options, but those cards are of less use here, and I don't think gamers who don't crunch are too interested in this GPUGrid forum.

I've got a nice suggestion for nVidia to make better use of these chips: produce a GT250. Take the same chip, give the PCB an extra 6-pin PCIe connector and let it draw 80 - 90 W under load. Current clock speeds of these cards already reach 1.7 GHz easily, so with a slight voltage bump they'd get to ~1.8 GHz at the power levels I mentioned. That would give them 34% more performance, would take care of the performance debate compared to the older generation and would still be more power efficient than the older cards. nVidia would need a slight board modification, a slightly larger cooler and possibly some more MHz on the memory. Overall cost might increase by 10€. Sell it for 20€ more and everyone's happy.


Your suggested card would actually be a decent gaming card, competitive with ATI's HD 5750. NVidia has no modern cards (just G92s) to fill the vast gap between the GT240 and the 55nm GTX 260 sp216. A GTX 260 is about two or two and a half times as fast on GPUGrid. Perhaps something is on its way? A GTS 240 or GTS 250 could find its way onto a 2xx 40nm die and into the mainstream market (it's presently an OEM card and likely to stay that way as is). Anything higher would make it a top end card.

I could see two or four 40nm GPUs on one board, if they had a smaller transistor count. Certainly not Fermi (with 3 billion), at least not until 28nm fabrication is feasible, and that's probably 2+ years away, by which time there will be other GPU designs. I think the lack of limitations for a GPU card offers up more possibility than 2 or 4 separate cards, and even though the circuitry would increase, it would be less than the circuitry of 4 cards together. I expect there will be another dual card within the next 2 years.

Fermi could do with fewer transistors, even if that means a chip redesign!
A design that allows NVidia to use more imperfect chips, by switching areas off, might make them more financially viable. We know AMD can disable cores if they are not up to standard. Could NVidia take this to a new level? If it means 4 or 5 cards with progressively fewer ROPs & shaders, surely that's better for the consumer too? GTX 480, 470, 460, 450, 440, 430. More choice, better range, better prices.

NVidia and AMD want to cram more and more transistors into a chip. But this increase in density increases heat. Could chips be designed to include spaces, like firebreaks?

Profile liveonc
Avatar
Send message
Joined: 1 Jan 10
Posts: 292
Credit: 41,567,650
RAC: 0
Level
Val
Scientific publications
watwatwatwatwatwat
Message 15488 - Posted: 27 Feb 2010 | 23:41:05 UTC - in response to Message 15485.
Last modified: 27 Feb 2010 | 23:55:45 UTC

Quad chips: a resounding no. Not as an add-in card for ATX or BTX cases. Why? Simple: a chip like Cypress at ~300 mm² generally gets good enough yield and die harvesting works well. However, push a full Cypress a little and you easily approach 200 W. Use 2 of them and you're at 400 W, or just short of 300 W and downclocked as in the case of the 5970. What could you gain by trying to put 3 Cypress onto one board / card? You'd have to downclock further and thereby reduce the performance benefit, while at the same time struggling to stay within power and space limits. The major point is that producing a Cypress-class chip is not too hard, and you can't put more than 2 of these onto a card. And using 4 Junipers is going to be slower and more power hungry than 2 Cypress due to the same reasons AMD is not composing their 12 core CPU from 12 single dies.
I could imagine an external box with its own PSU, massive fans and 800+ W of power draw though. But I'm not sure there's an immediate large market for this ;)


Sounds fascinating with an external option. I'm thinking of the ExpressBox that was tried as an external GPU box to boost laptops through the ExpressCard slot. The problem was that it was only x1 PCIe & the idea was too costly & produced too little benefit.

But what about an x16 PCIe 2.0 external extender riser card connected to an external GPU? GPUs are already making so much heat & taking up so much space, so why not just separate them once and for all? No need for a redesign of the motherboard... If there was an option to stack external GPUs in a way that could use SLI, an external GPU option could give enthusiasts the opportunity to spend insane amounts of money on external GPUs & stack them on top of each other w/o running into heat or power issues. Even if a Fermi requires 800W, if it's external & has its own dedicated PSU, you'd still be able to stack 3 on top of each other to get that Fermi tri-SLI.
____________

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 15489 - Posted: 27 Feb 2010 | 23:45:23 UTC - in response to Message 15487.

nVidia can already disable shader clusters as they wish. They could sell a GF100 with only 32 shaders active, if that made any sense. If I remember correctly the first cards to employ these techniques were ATI's X800 Pro and nVidia's 6800GS. On newer cards they may also have more flexibility regarding the ROPs, memory controllers etc., so I think that's quite good already.

What do you mean by "lack of limitations for a GPU card"? I think it does have severe limitations: the maximum length is defined by ATX cases (we're already pushing the limits here), the height is also limited by the regular cases and power draw and cooling are severely limited. Sure, you could just ask people to plug even more 6 and 8 pin PCIe connectors into the cards, but somehow you also have to remove all that heat.

I could see two or four 40nm GPUs on one board,


2 chips definitely (HD5970), but 4? What kind of chips would you use? Since we don't know enough about Fermi I'd prefer to discuss this based on ATI chips. As I said, even going for 3 Cypress (HD58x0) is almost impossible, or at least unfeasible. You could use 2 x 2 Juniper (HD 57x0), which is exactly half a Cypress. Performance in Crossfire is surprisingly good compared to the full configuration, but you need an additional 30 - 40 W for the CF tandems compared to the single-chip high end cards (see next page). That would turn a 300 W HD5970 into a 360 - 380 W card, if you used 4 Junipers. It would have to be noticeably cheaper to justify that.

Could chips be designed to include spaces, like firebreaks?


They could, but the manufacturing cost is the same regardless of whether you use the area for transistors or leave it blank. So you reduce power density somewhat and reduce the number of chips per wafer considerably. And power density at the chip level is not the real problem for GPUs, as the dies are large enough to transfer the heat to the cooler. Their main problem is that the form factor forbids the use of larger fans, so removing the heat from the heatsink is the actual problem.
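To put a rough number on that: die temperature is roughly ambient plus power times the total thermal resistance from die to air. The resistances below are assumed, illustrative values (not datasheet figures for any real card), just to show which term dominates.

```cpp
// gpu_thermals.cpp - die temperature ~ ambient + power * total thermal resistance.
// The resistances are assumed, illustrative values, not datasheet figures.
#include <cstdio>

int main() {
    const double ambient_c       = 25.0;   // case air temperature
    const double power_w         = 200.0;  // heat the GPU dumps into the cooler
    const double r_die_to_sink   = 0.10;   // K/W - a large die spreads heat well
    const double r_sink_blower   = 0.20;   // K/W - cramped dual-slot blower
    const double r_sink_big_fan  = 0.12;   // K/W - roomy 120/140 mm fan cooler

    std::printf("Blower cooler:    ~%.0f C at the die\n",
                ambient_c + power_w * (r_die_to_sink + r_sink_blower));
    std::printf("Large-fan cooler: ~%.0f C at the die\n",
                ambient_c + power_w * (r_die_to_sink + r_sink_big_fan));
    // The die-to-heatsink term is small either way; the heatsink-to-air term,
    // set by fan size and airflow, dominates - which is the point above.
    return 0;
}
```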

BTW: thanks for discussing :)

MrS
____________
Scanning for our furry friends since Jan 2002

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 15490 - Posted: 28 Feb 2010 | 0:00:33 UTC - in response to Message 15488.

@liveonc:

We'd need at least one PCIe 2.0 16x connection routed to the external box. That would mean it would have to be very close to the PC, but that's no show stopper.

What I could imagine: use a common PCB for 4 (or "x" if you prefer) GPU chips, or separate PCBs for each of them (cheaper, more flexible, but you'd have to connect them). Make them approximately square and cover each of them with a large heatsink. There'd probably have to be 2 or 3 heatpipes for the GPU core, whereas memory and power circuitry could be cooled by the heatsink itself. Then put a 120 or 140 mm fan on top of each of these "GPU blocks". Leave the sides of the device free enough so that the hot air can flow out. You'd end up with a rather flat device which covers some surface area, so ideally you'd place it like a tower parallel to your.. ehm, PC tower. It could also be snapped onto the front or back, so you could easily use 2 of these modules for one PC and both PCIe links would only have to cover ~10 cm to the motherboard I/O panel. For BOINC this would absolutely rock, whereas for games we might need faster and / or wider PCIe. One could also replace the individual fans with a single really large one.. but then there'd be a large dead spot beneath its axis. And if only selected GPUs are loaded one couldn't speed fans up selectively. BTW: the fans would have easily accessible dust filters.

Oh wait.. did I just write all of this? Let me call my patent attorney first ;)

MrS
____________
Scanning for our furry friends since Jan 2002

Profile liveonc
Avatar
Send message
Joined: 1 Jan 10
Posts: 292
Credit: 41,567,650
RAC: 0
Level
Val
Scientific publications
watwatwatwatwatwat
Message 15492 - Posted: 28 Feb 2010 | 0:07:48 UTC - in response to Message 15490.
Last modified: 28 Feb 2010 | 1:01:26 UTC

e-SATA has been around for a long time; e-PCIe should be the next thing, why not? Nvidia wanted to go nuts, so let them. If space, 300W & cooling are no longer a hindrance, Nvidia could go into the external GPU market, the PSU market & the VGA cooling market. Plus they'd get their Fermi.

BTW, do you know if a serial repeater is possible, in case the distance from the PC case to the external GPU has to be a bit further, & would there be any sense in making x1, x4, x8 & x16 PCIe cables, given that PCIe 2.0 runs at 5.0 GT/s while the first version operates at 2.5 GT/s, if, for example, the card is PCIe 1.0?
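For reference, the raw link numbers behind that question are simple to work out: PCIe 1.x signals at 2.5 GT/s per lane, PCIe 2.0 at 5 GT/s, both with 8b/10b encoding, so only 80% of the bits are payload. A quick sketch (generic arithmetic, nothing vendor-specific):

```cpp
// pcie_bandwidth.cpp - per-direction bandwidth of PCIe 1.x and 2.0 links.
// 8b/10b encoding means only 8 of every 10 bits on the wire carry payload.
#include <cstdio>

double link_bandwidth_gbs(double gtransfers_per_s, int lanes) {
    const double payload_fraction = 8.0 / 10.0;            // 8b/10b line code
    double bits_per_s = gtransfers_per_s * 1e9 * payload_fraction * lanes;
    return bits_per_s / 8.0 / 1e9;                          // GB/s per direction
}

int main() {
    const int widths[] = {1, 4, 8, 16};
    for (int w : widths) {
        std::printf("x%-2d  PCIe 1.x: %.2f GB/s   PCIe 2.0: %.2f GB/s\n",
                    w, link_bandwidth_gbs(2.5, w), link_bandwidth_gbs(5.0, w));
    }
    return 0;
}
```

A PCIe 1.0 card in a 2.0 slot (or cable) simply trains down to 2.5 GT/s, so the cable mainly has to care about signal integrity over its length, not about the generation.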

This is what I'm getting at, but much simpler, & only meant to house a single Fermi using x16 PCIe 2.0 with an oversized GPU cooler & dedicated PSU: http://www.magma.com/expressbox7x8.html
____________

Profile liveonc
Avatar
Send message
Joined: 1 Jan 10
Posts: 292
Credit: 41,567,650
RAC: 0
Level
Val
Scientific publications
watwatwatwatwatwat
Message 15494 - Posted: 28 Feb 2010 | 1:45:04 UTC - in response to Message 15490.
Last modified: 28 Feb 2010 | 1:50:00 UTC

Here's another one: http://www.pcper.com/comments.php?nid=8222 but it's, as I "suspect", much like the Asus XG Station: http://www.engadget.com/2007/01/08/hands-on-with-the-asus-xg-station-external-gpu, & if it is, it's limited to x1 PCIe.

Magma has the right idea with their ExpressBox7x8: http://www.magma.com/expressbox7x8.html - the iPass cable used looks as if it's at least 5m in length. But unlike Magma, I was thinking of just 1 Fermi, in its own box, with its own PSU, lying on its back at the bottom of the box (like a motherboard lies in a PC casing), with a giant heatsink & fan. If nVidia wants their monster Fermi to use 800W it can do so outside, with a monster GPU cooler & dedicated PSU. If they want enthusiasts to be able to afford multiple cards, they could make it possible (& attractive) if there is a way to stack one Fermi box on top of the other to get that external SLI.

Everybody's talking as if nVidia went too far, but if they don't have to think about space limitations, power consumption or heat dissipation, they can make their monster (& get away with it). I remember all the critics of Intel laughing when Intel's original Pentium overheated, only to have Intel come back (with 40 belly dancers) when they placed a heatsink & fan on top of their Pentium.
____________

Profile robertmiles
Send message
Joined: 16 Apr 09
Posts: 503
Credit: 727,920,933
RAC: 15,982
Level
Lys
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 15495 - Posted: 28 Feb 2010 | 3:11:26 UTC
Last modified: 28 Feb 2010 | 3:12:24 UTC

Read the ads with links at the bottoms of threads - some of them offer rack-mounted GPU boxes (mostly Nvidia-based, with a CPU in the box also, but often with 4 GPUs); I've even seen one where a single Xeon CPU is expected to control up to 8 GPUs (GTX 295 type, if you prefer). Rack-mounted systems should be able to get taller than a stack with no frame to support the higher units.

No Fermi boards yet, though.

Jeremy
Send message
Joined: 15 Feb 09
Posts: 55
Credit: 3,542,733
RAC: 0
Level
Ala
Scientific publications
watwatwatwatwatwatwatwatwatwat
Message 15496 - Posted: 28 Feb 2010 | 3:34:50 UTC - in response to Message 15494.

Disclaimer: I do not know as much about chip design as I'd like, and I'm holding off any commentary on Fermi until it's actually in reviewers' and end users' hands. Anything else is pure gossip and conjecture as far as I'm concerned.

That said, much has been made about Intel's vs AMD's vs nVidia's design decisions.

AMD's vs Intel's approach in the CPU world intrigues me. For Intel, they were content to put two dies into a single package to make their initial quad core offerings. It worked quite well.

What prevents GPU makers from doing the same? Two GT200b(c?) dies on a single GPU package. It'd be a GTX295 that wouldn't require doubling up the vRAM. Make it at 40nm. It seems to make so much sense (at least to me), but nobody is doing it and there has to be a reason. Does anybody know what that reason is?
____________
C2Q, GTX 660ti

Profile Beyond
Avatar
Send message
Joined: 23 Nov 08
Posts: 1112
Credit: 6,162,416,256
RAC: 0
Level
Tyr
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 15497 - Posted: 28 Feb 2010 | 6:08:29 UTC - in response to Message 15496.

What prevents GPU makers from doing the same? Two GT200b(c?) dies on a single GPU package. It'd be a GTX295 that wouldn't require doubling up the vRAM. Make it at 40nm. It seems to make so much sense (at least to me), but nobody is doing it and there has to be a reason. Does anybody know what that reason is?

Even at 40nm two GT200b cores would be huge and most likely wouldn't yield well. Huge dies and poor yields = no profit. In addition, GPUs are very different from CPUs. The shaders are basically cores. For instance, in Collatz and MW a 1600-shader 5870 does twice the work of an 800-shader 4870.

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 15498 - Posted: 28 Feb 2010 | 12:11:31 UTC

@Robert: the problems with the current rack mounted devices are numerous

- they're meant for professionals and thus feature expensive Teslas
- I don't have a 19" rack at home (and am not planning to buy one)
- the air flow is front to back, with the units designed to be stacked
-> the cooling can't be quiet this way
- a separate Xeon to drive them? I want to crunch with GPUs so that I don't have to buy expensive CPUs!

@liveonc: that cable in the Magma-box looks really long, so it seems like PCIe extension is no big deal :)

@Jeremy: you have to feed data to these chips. 256 bit memory interfaces are fine for single chips, where the entire space around them can be used. 384 and 512 bit quickly get inconvenient and expensive. In order to place two GT200 class chips closer together you'd have to essentially create a 768 bit memory interface. The longer and more curved the wires to the memory chips are, the more problematic it becomes to clock them high. If your wire layout (and some other stuff) is not good, you cannot even clock the memory as high as the chips themselves are specified for.
So without any serious redesign you create quite some trouble by placing two of these complex chips closer together - for zero benefit. You could get some benefit if there was a high bandwidth, low latency communication interface between the chips, similar to what AMD planned to include in Cypress as Sideport, but had to scrap to keep the chip "small". But that's a serious redesign. And I suppose they'd rather route this communication interface over a longer distance than cram the memory chips into an even smaller area.
A single Cypress already pushes the limits of what can be done with 256 bit GDDR5 quite hard: the HD4890 was not memory bandwidth limited, while the HD5870 increased raw power by a factor of 2 and memory bandwidth only by a factor of 1.3, and as a result could have used even faster memory. Yet they had to be very careful about the bus to the memory chips, its timing, and error detection and correction to reach even these speeds.
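The bandwidth side of that is just bus width times effective data rate; here is a small sketch with the commonly quoted figures (approximate, for illustration only):

```cpp
// gddr_bandwidth.cpp - memory bandwidth = (bus width / 8) * effective data rate.
// The bus widths and data rates are the commonly quoted figures, approximate only.
#include <cstdio>

double bandwidth_gbs(int bus_bits, double data_rate_gbps) {
    return bus_bits / 8.0 * data_rate_gbps;   // GB/s
}

int main() {
    double hd4870 = bandwidth_gbs(256, 3.6);  // ~115 GB/s
    double hd5870 = bandwidth_gbs(256, 4.8);  // ~154 GB/s
    std::printf("HD 4870: ~%.0f GB/s\n", hd4870);
    std::printf("HD 5870: ~%.0f GB/s (%.2fx)\n", hd5870, hd5870 / hd4870);
    // Shader count doubled (800 -> 1600) while bandwidth grew only ~1.33x,
    // which is the imbalance described above.
    return 0;
}
```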

MrS
____________
Scanning for our furry friends since Jan 2002

Profile robertmiles
Send message
Joined: 16 Apr 09
Posts: 503
Credit: 727,920,933
RAC: 15,982
Level
Lys
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 15500 - Posted: 28 Feb 2010 | 15:59:51 UTC - in response to Message 15496.

Disclaimer: I do not know as much about chip design as I'd like, and I'm holding off any commentary on Fermi until it's actually in reviewers' and end users' hands. Anything else is pure gossip and conjecture as far as I'm concerned.

That said, much has been made about Intel's vs AMD's vs nVidia's design decisions.

AMD's vs Intel's approach in the CPU world intrigues me. For Intel, they were content to put two dies into a single package to make their initial quad core offerings. It worked quite well.

What prevents GPU makers from doing the same? Two GT200b(c?) dies on a single GPU package. It'd be a GTX295 that wouldn't require doubling up the vRAM. Make it at 40nm. It seems to make so much sense (at least to me), but nobody is doing it and there has to be a reason. Does anybody know what that reason is?


I have years of experience in some of the early stages of chip design, mostly logic simulation, but not the stages that would address putting multiple chips in one package. I'd expect two such GPU chips in the same package to generate too much heat unless the chips were redesigned first to generate perhaps only half as much heat, or the clock rate for the pair of chips was reduced about 50% leaving them doing about as much work as just one chip but at a more normal clock rate.

Also, I suspect that the chips are not already designed to let two of them share the same memory, even if both chips are in the same package, so you'd need about twice as many pins for the package to allow it to use two separate sets of memory.

Profile robertmiles
Send message
Joined: 16 Apr 09
Posts: 503
Credit: 727,920,933
RAC: 15,982
Level
Lys
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 15501 - Posted: 28 Feb 2010 | 17:24:01 UTC - in response to Message 15498.

@Robert: the problems with the current rack mounted devices are numerous

- they're meant for professionals and thus feature expensive Teslas
- I don't have a 19" rack at home (and am not planning to buy one)
- the air flow is front to back, with the units designed to be stacked
-> the cooling can't be quiet this way
- a separate Xeon to drive them? I want to crunch with GPUs so that I don't have to buy expensive CPUs!

MrS


As best I can tell, the current application designs don't work with just a GPU portion of the program; they need a CPU portion to allow them to reach BOINC and then the server. The more GPUs you use, the faster a CPU you need to do this interfacing for them. Feel free to ask the rack-mounted equipment companies to allow use of a fast but cheap CPU, if you can find one, though. It also looks worth checking which BOINC versions allow the CPU interface programs for separate GPUs to run on separate CPU cores.

However, I'd expect you to know more than me about just how fast this interfacing needs to be in order to avoid slowing down the GPUs. Putting both the GPUs and the CPU on the same board, or at least in the same cabinet, tends to make the interface faster than putting them in separate cabinets.
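To make that "CPU portion" concrete, here is a stripped-down sketch of the host side of a typical CUDA application (not GPUGrid's actual code - just the general pattern): the CPU stages data, launches kernels, and then either blocks or busy-polls until the GPU finishes, which is why each GPU tends to want a CPU core of its own.

```cpp
// host_feeder.cu - minimal sketch of the CPU side of a GPU app (not GPUGrid's code).
// Compile with nvcc. The kernel is a placeholder; the point is the host loop.
#include <cstdio>
#include <cuda_runtime.h>

__global__ void dummy_step(float* data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] = data[i] * 1.0001f + 0.5f;   // stand-in for real work
}

int main() {
    const int n = 1 << 20;
    float* d_data = nullptr;
    cudaMalloc(&d_data, n * sizeof(float));
    cudaMemset(d_data, 0, n * sizeof(float));

    for (int step = 0; step < 1000; ++step) {
        dummy_step<<<(n + 255) / 256, 256>>>(d_data, n);
        // Option A: block until the GPU finishes (frees the CPU, adds wake-up latency):
        //   cudaDeviceSynchronize();
        // Option B: busy-poll so the next launch goes out immediately - this is the
        // "one CPU core per GPU" behaviour crunchers see:
        while (cudaStreamQuery(0) == cudaErrorNotReady) { /* spin */ }
    }

    cudaFree(d_data);
    std::printf("done\n");
    return 0;
}
```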

The rack mounted devices I've looked at put the GPUs and the CPU on the same board, and therefore don't use Teslas. I haven't checked whether they also use enough memory to drive up the price to Tesla levels.

I agree with your other objections, though.

Profile robertmiles
Send message
Joined: 16 Apr 09
Posts: 503
Credit: 727,920,933
RAC: 15,982
Level
Lys
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 15502 - Posted: 28 Feb 2010 | 17:34:57 UTC

Jeremy, if the heat generated and the package size weren't problems, you could take your idea even further by putting the video memory chips in the same package as the GPU, and therefore speed up the GPU to memory interface.

As it is, consider how much more you'd need to slow down the GPU clocks to avoid overheating the memory chips with this approach.

Profile liveonc
Avatar
Send message
Joined: 1 Jan 10
Posts: 292
Credit: 41,567,650
RAC: 0
Level
Val
Scientific publications
watwatwatwatwatwat
Message 15503 - Posted: 28 Feb 2010 | 18:29:07 UTC - in response to Message 15501.
Last modified: 28 Feb 2010 | 18:30:57 UTC

However, I'd expect you to know more than me about just how fast this interfacing needs to be in order to avoid slowing down the GPUs. Putting both the GPUs and the CPU on the same board, or at least in the same cabinet, tends to make the interface faster than putting them in separate cabinets.


But Intel has already stepped on nVidia's toes by not allowing them to build boards for the Core i7.

nVidia already wanted to promote supercomputing via Fermi. If they start selling external GPU kits, even an ITX board with a PCIe slot should "theoretically" be able to use a monster GPU. Taking it one step further, if both nVidia & ATI start selling external GPU kits with oversized GPUs unable to fit in any traditional casing, they could start making GPU boards & offer to sell the GPU chip & memory separately. PSU, GPU cooling & RAM manufacturers would be delighted, wouldn't they? But of course, Intel will probably object...
____________

Snow Crash
Send message
Joined: 4 Apr 09
Posts: 450
Credit: 539,316,349
RAC: 0
Level
Lys
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 15504 - Posted: 28 Feb 2010 | 19:54:05 UTC - in response to Message 15503.

But Intel has already stepped on nVidia's toes by not allowing them to build boards for the Core i7.


What are you talking about?


____________
Thanks - Steve

Profile liveonc
Avatar
Send message
Joined: 1 Jan 10
Posts: 292
Credit: 41,567,650
RAC: 0
Level
Val
Scientific publications
watwatwatwatwatwat
Message 15505 - Posted: 28 Feb 2010 | 20:12:26 UTC - in response to Message 15504.

Only Intel makes chipsets for the LGA1156 & LGA1366 sockets; I heard a "rumor" that it wasn't because nVidia couldn't.
____________

Snow Crash
Send message
Joined: 4 Apr 09
Posts: 450
Credit: 539,316,349
RAC: 0
Level
Lys
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 15508 - Posted: 28 Feb 2010 | 21:54:39 UTC

I was confused because you said boards.
The 1366 socket chips integrated the functionality of the nvidia chipset into the cpu directly and I think nvidia announced they were getting out of the chipset market entirely (not making them for amd motherboards either) before 1156 was launched.
____________
Thanks - Steve

Profile robertmiles
Send message
Joined: 16 Apr 09
Posts: 503
Credit: 727,920,933
RAC: 15,982
Level
Lys
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 15511 - Posted: 1 Mar 2010 | 0:41:04 UTC - in response to Message 15505.
Last modified: 1 Mar 2010 | 1:10:15 UTC

Only Intel makes chipsets for the LGA1156 & LGA1366 sockets; I heard a "rumor" that it wasn't because nVidia couldn't.


Shouldn't matter unless the CPU and GPU share the same socket; the socket type is not critical otherwise. Intel is moving in the direction of putting both the GPU and the CPU on the same chip, and therefore into the same socket, but I've seen no sign that Nvidia has designed a CPU to put on the same chip as the GPU. A few problems with Intel's approach: I've seen no sign of a BOINC version that can make any use of a GPU from Intel yet. There are no monster Intel GPUs yet, and I expect both heat problems and VRAM pin-count problems from trying to put both a monster GPU and any of Intel's current CPU designs into the same package. I suppose that the GPU could share the same memory as the CPU, IF you're willing to give up the usual speed of the GPU-to-VRAM interface, but is that fast enough to bother making any? Alternatively, you could wait a few years for chips and power requirements to shrink to the point that a CPU, a monster GPU, and the entire VRAM all fit into one package without heat or size problems.

As for Nvidia giving up the chipset market, isn't that most of their product line now? Why would they be willing to risk making such a large change unless the alternative was bankruptcy? Or do you, for example, mean the part of the chipset market for chips not related to graphics?

Profile MJH
Project administrator
Project developer
Project scientist
Send message
Joined: 12 Nov 07
Posts: 696
Credit: 27,266,655
RAC: 0
Level
Val
Scientific publications
watwat
Message 15512 - Posted: 1 Mar 2010 | 1:06:01 UTC - in response to Message 15511.

Robert,

I've seen no sign of a BOINC version that can make any use of a GPU from Intel yet.


All current Intel GPUs are low-performance, fixed-function units, quite useless for computation.


As for Nvidia giving up the chipset market, isn't that most of their product line now? Why would they be willing to risk making such a large change unless the alternative was bankruptcy?


Nvidia do not have a license for QPI, only the FSB used by older processors (and Atom).

MJH

Profile robertmiles
Send message
Joined: 16 Apr 09
Posts: 503
Credit: 727,920,933
RAC: 15,982
Level
Lys
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 15513 - Posted: 1 Mar 2010 | 3:05:20 UTC - in response to Message 15508.

I was confused because you said boards.
The 1366 socket chips integrated the functionality of the nvidia chipset into the cpu directly and I think nvidia announced they were getting out of the chipset market entirely (not making them for amd motherboards either) before 1156 was launched.


Looks to me more like Nvidia has decided that they'll have to fight some legal battles with Intel, and start some new chip designs at an early stage, then wait until those designs are finished, before they can offer any products compatible with the new types of board slots designed to make use of the new features of the Core i5 and i7 CPUs; for now they will continue making products for computers using the older FSB interface. See here for a report that looks somewhat biased, but at least agrees that Nvidia is not planning a full exit from the chipset market:

http://www.brightsideofnews.com/news/2009/10/9/nvidia-temporarily-stops-development-of-chipsets-to-intels-i5-and-i7.aspx

Intel appears to have designed the new QPI interface to do two things:

1. Give speedups by separating the main memory interface from the peripherals interface. This means, for example, that benchmarks which use memory heavily but don't touch peripherals may have trouble showing much difference between FSB and QPI results.

2. I'd be surprised if they aren't also trying to restrict the ability of either Nvidia or AMD/ATI to enter the graphics card market for machines using the new features of the Core i5 and i7 by refusing them QPI licenses.

http://www.intel.com/technology/quickpath/index.htm?iid=tech_arch_nextgen+body_quickpath_bullet

AMD has developed an interconnect called HyperTransport for their newer CPUs; I expect them to try to restrict the ability of either Nvidia or Intel to enter the graphics card market for any computers using that feature by also denying licenses.

I suspect that as a result, Nvidia will have problems offering products for any new machines designed to make full use of the new Intel or AMD features, and for now will have to concentrate on offering products for older types of motherboard slots. Somewhat later, they could offer products that put a CPU on the same board as the GPU, if they can find a CPU company that wants to increase sales of a suitable CPU but does not offer any competing graphics products; hopefully one that already has a BOINC version available. I'd expect some of these products to be designed for cases that don't try to fit the ATX guidelines.

This is likely to give crunchers problems, for at least some time, in choosing computers that give the highest performance for both GPU and CPU projects and are still produced in large enough quantities to keep prices down. Not as many problems for crunchers interested mainly in GPU projects, though.

Profile robertmiles
Send message
Joined: 16 Apr 09
Posts: 503
Credit: 727,920,933
RAC: 15,982
Level
Lys
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 15521 - Posted: 1 Mar 2010 | 6:50:25 UTC

I've found an article about efforts to design a dual-Fermi board, although with no clear sign that a significant number of them will ever be offered for sale or that it won't overheat with the normal spacing between cards.

http://www.brightsideofnews.com/news/2010/2/26/asus-to-introduce-dual-fermi-on-computex-2010.aspx

Profile Beyond
Avatar
Send message
Joined: 23 Nov 08
Posts: 1112
Credit: 6,162,416,256
RAC: 0
Level
Tyr
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 15525 - Posted: 1 Mar 2010 | 9:09:53 UTC - in response to Message 15513.

2. I'd be surprised if they aren't also trying to restrict the ability of either Nvidia or AMD/ATI to enter the graphics card market for machines using the new features of the Core i5 and i7 by refusing them QPI licenses.

Intel partnered with NVidia and then cut them out, copied from the MS playbook. When Intel settled the lawsuit with AMD, part of the settlement was cross-licensing of technologies so AMD should be OK in this regard. Various governments still have litigation against Intel so it might be in their best interest to play a bit nicer than they're accustomed to.

Snow Crash
Send message
Joined: 4 Apr 09
Posts: 450
Credit: 539,316,349
RAC: 0
Level
Lys
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 15532 - Posted: 1 Mar 2010 | 15:00:46 UTC - in response to Message 15525.

The GPUs that NVidia produces are an entirely different product from their system chipsets.

There is nothing that Intel is doing to stop dedicated GPU cards created by Nvidia or any other company from working in our PCs.

While some Intel CPUs do have integrated graphics, so you don't need a dedicated graphics card, this really doesn't matter to us.

Larrabee failed and currently Intel has no public plans to create a dedicated graphics card solution.
____________
Thanks - Steve

Profile Zydor
Send message
Joined: 8 Feb 09
Posts: 252
Credit: 1,309,451
RAC: 0
Level
Ala
Scientific publications
watwatwatwat
Message 15537 - Posted: 1 Mar 2010 | 16:46:02 UTC

That was an interesting week's discussion ..... :)

Putting aside technical possibilities that may or may not present themselves to NVidia for averting the impending Fermi disaster - and that's still my personal view - there are the very real and more pertinent commercial realities that have to be taken into account. ATI is already one step ahead of NVidia; they were mid last year, let alone now. By the end of this year they will be two generations ahead, and the Fermi hassles will not be resolved until early 2011 (aka Fermi2).

If we assume they have found a hidden bag of gold to fund their way out of this, and we assume they will not repeat the crass technical directions they have taken for the last three years in terms of architecture - and that's a huge leap of faith given their track record over that period - then by mid 2011 ATI will be bringing out cards three generations ahead. It's only the 28nm process that will give NVidia breathing space, not only architecturally, but also in terms of implementing DX11. At present they are nowhere near DX11 capability, and will not be for most of 2010.

So come 2011, why should anyone buy a product that is 3 steps behind the competition, has immature new DX11 abilities, and still has no prospect - at least none we know of - of adjusting its core architecture direction fast enough, and in such a huge leap in one bound, to be competitive? Then to cap it all, NVidia will be strapped for cash; 2010 will be a disaster year for income from GPU cards.

Technically there may be an outside chance of them playing catch-up - personally I doubt it - but that aside, they are in a mega investment black hole, and I can't see them reversing out of it. With such a huge technology gap between ATI and NVidia about to unfold, I really can't see mass take-up of a poorly performing Fermi card, even if they could afford to subsidise them all year.

There is the age-old saying "When you are in a hole, stop digging". NVidia is not in a hole, it's in a commercial ravine. Technology possibilities are one thing, especially given their abysmal track record in recent years; commercial reality is quite another.

Regards
Zy

Profile liveonc
Avatar
Send message
Joined: 1 Jan 10
Posts: 292
Credit: 41,567,650
RAC: 0
Level
Val
Scientific publications
watwatwatwatwatwat
Message 15538 - Posted: 1 Mar 2010 | 17:14:30 UTC - in response to Message 15537.
Last modified: 1 Mar 2010 | 17:23:44 UTC

Indeed it's been interesting. But most of the things discussed were things that nVidia can choose to go with, in the future.

That Fermi is in trouble because it requires lots of power, producing lots of heat, & that only 4% of their 40nm wafers are usable with such high requirements, makes me think that maybe the 4% is of "extreme edition" grade, & that the real problem might be the power requirement & therefore also the heat produced.

The only workaround is placing the GPU outside, so that space won't be an issue, & therefore neither will power or heat dissipation. If nVidia can only use & sell their purest chips (at a loss), they will empty out their coffers, & unless they find a leprechaun at the end of the rainbow with his pot of gold, they will be in trouble.

If nVidia takes these 4% & uses them for workstation GPUs, & salvages as much of the rest as possible to make external mainstream GPUs, they might be able to cut losses until they get to 28nm.

I like nVidia, but I also like Ati pressuring nVidia.
____________

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 15539 - Posted: 1 Mar 2010 | 18:58:10 UTC - in response to Message 15538.

The anticipated supply limitations for Fermi's will mean that the cards will not be available in Europe for some time.
I expect the cards will first be available in the USA, Canada, Japan and a few hotspots in Asia. They may well make their way to Australia before Europe. No doubt they will arrive in the more affluent parts of Europe first too. So Germany, France and the UK may also see them a few weeks before Spain, Portugal and the more Easterly countries, if they even make it that far. As the numbers are expected to be so few, 5K to 8K, very few cards will make it to GPUGrid crunchers in the near future. So if anyone in the USA or Japan gets one early on, please post the specs, performance and observations as it will help other crunchers decide whether to get one or not (should they ever become available in other areas). If they do turn up, and you want one, grab it quickly (as long as it gets a good review)!

Let us all hope they get really bad reviews, or the gamers will buy them up and very few will make it to GPUGrid!

Profile Beyond
Avatar
Send message
Joined: 23 Nov 08
Posts: 1112
Credit: 6,162,416,256
RAC: 0
Level
Tyr
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 15545 - Posted: 1 Mar 2010 | 20:47:06 UTC - in response to Message 15539.

Let us all hope they get really bad reviews, or the gamers will buy them up and very few will make it to GPUGrid!

While I'm sure you're just kidding, if this does happen it will simply be another big nail in NVidia's coffin.
We need 2 viable GPU companies to keep advances coming and prices down.

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 15547 - Posted: 1 Mar 2010 | 21:20:05 UTC - in response to Message 15545.
Last modified: 1 Mar 2010 | 21:28:17 UTC

We need 2 high end competitors! Only that way will new technologies advance progressively. For the time being ATI does not have a competitor! So will an average review change anything? A bad review might be just what they need to jolt them back on track. Let's face it, Fermi is a loss leader that will depend on a gimmicky sales pitch that it outperforms ATI in some irrelevant areas of gaming.
The problem is that anyone can compete at the low end, even Intel! But if there are several low-end manufacturers eating up the profit margins of the high-end GPU manufacturers, the latter start to look over their shoulder and perhaps panic, rather than looking to the future.

By the way, nice link RobertMiles.
http://www.brightsideofnews.com/news/2010/2/26/asus-to-introduce-dual-fermi-on-computex-2010.aspx
Asus are going to limit their super Republic of Gamers HD5970 Ares dual GPU card to 1000 units. If they try a similar trick with Fermi they may find it difficult to get their hands on enough GPUs!

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 15549 - Posted: 1 Mar 2010 | 22:14:21 UTC

Guys, I don't get it: why are (some of) you talking as if shrinking GF100 to 28 nm was nVidia's only option? Just because Charlie said so? Like I said before: IMO all they need is a Fermi design of a sane size, a little over 2 billion transistors and with high speed 256 bit GDDR5.

Take a look at the GT200 on the GTX285: it's one full technology node behind Cypress and features 50% fewer transistors, so it should be at least 50% slower. How often do you see Cypress with more than a 50% advantage? It happens, but it's not the average. I'd say that's a good starting point for a Fermi-based mainstream chip of about Cypress' size. nVidia said they'd have something mainstream in summer. I'm still confident they won't screw this one up.

BTW: the 4% yield is probably for fully functional chips. And is probably already outdated. Yields change weekly, especially when the process is not yet mature.

MrS
____________
Scanning for our furry friends since Jan 2002

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 15550 - Posted: 1 Mar 2010 | 23:38:19 UTC - in response to Message 15549.

Some time ago I read 9% yield.
Shrink the transistors and the number of candidate dies per wafer goes up roughly with the square of the linear shrink, and each smaller die is also less likely to catch a defect, so yield improves twice over. Fermi's low yield is the result of 3 main issues: a much larger die design, the jump to 40nm, and 3 billion transistors.
If they made smaller dies with a reduced transistor count, mainstream cards would flow out of the factories.
My argument was that if they had done this, even with the GTX 285 chip design (6 months ago), they would have been producing something competitive. Obviously they should not do this now, because they have the Fermi technologies, but I agree they have to shrink the transistor count, now! In 18 months or 2 years they may be able to use 28nm, but by then even Fermi will be dated (mind you, that never stopped NVidia from using dated designs in the past).
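The usual back-of-the-envelope tool here is a Poisson-style yield model: the chance of a die having no killer defect falls off exponentially with die area times defect density, and smaller dies also mean more candidates per wafer. The defect density and die areas below are assumed, illustrative values, not TSMC figures:

```cpp
// die_yield.cpp - toy Poisson yield model: yield = exp(-defect_density * die_area).
// The defect density and die areas are assumed, illustrative values only.
#include <cmath>
#include <cstdio>

int main() {
    const double defects_per_cm2 = 0.5;                    // assumed immature 40 nm process
    const double wafer_area_cm2  = 3.14159 * 15.0 * 15.0;  // 300 mm wafer, edge losses ignored

    struct Die { const char* name; double area_cm2; };
    const Die dies[] = {
        {"~3.3 cm2 (Cypress-class)", 3.3},
        {"~5.3 cm2 (GF100-class)",   5.3},
    };

    for (const Die& d : dies) {
        double yield     = std::exp(-defects_per_cm2 * d.area_cm2);
        double good_dies = wafer_area_cm2 / d.area_cm2 * yield;
        std::printf("%-26s yield ~%4.1f%%, ~%3.0f good dies per wafer\n",
                    d.name, yield * 100.0, good_dies);
    }
    return 0;
}
```

Crude as it is (it ignores die harvesting and wafer-edge losses), it shows why cutting the die size pays off twice: more candidate dies per wafer and a better yield on each one.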

Profile liveonc
Avatar
Send message
Joined: 1 Jan 10
Posts: 292
Credit: 41,567,650
RAC: 0
Level
Val
Scientific publications
watwatwatwatwatwat
Message 15551 - Posted: 1 Mar 2010 | 23:49:20 UTC - in response to Message 15549.
Last modified: 2 Mar 2010 | 0:19:18 UTC

Guys, I don't get it: why are (some of) you talking as if shrinking GF100 to 28 nm was nVidia's only option? Just because Charlie said so? Like I said before: IMO all they need is a Fermi design of a sane size, a little over 2 billion transistors and with high speed 256 bit GDDR5.

Take a look at the GT200 on the GTX285: it's one full technology node behind Cypress and features 50% fewer transistors, so it should be at least 50% slower. How often do you see Cypress with more than a 50% advantage? It happens, but it's not the average. I'd say that's a good starting point for a Fermi-based mainstream chip of about Cypress' size. nVidia said they'd have something mainstream in summer. I'm still confident they won't screw this one up.

BTW: the 4% yield is probably for fully functional chips. And is probably already outdated. Yields change weekly, especially when the process is not yet mature.

MrS


But if they've already gone into the production phase, it's too late, isn't it, to modify the design one way or the other? The only thing they can do now is make chips & put them on bigger boards. 28nm is the future, but isn't changing the design of Fermi also in the future? I don't have the knowledge that you guys have & you can read in my wording that I don't. But I do know that when a product goes into the production phase, you can't go back to the drawing board & redesign everything. The only thing you can do is find a way to make what you've got work. They're already doing the same thing with Fermi as they did with the GTX 260, making a less powerful GTX 280. If nVidia wants to launch the low-mid range before they can make a workable high end, it's just stalling, & they can't stall for too long. But if redesigning the PCB is easier & faster, then IMO it's the only way to get the original Fermi they wanted from the very start.

Here I'm thinking about 3 things. The first is to place the GPU outside the PC casing. The second is to go with water & bundle Fermi cards with "easy" water cooling kits that function like the Corsair Hydro: http://www.corsair.com/products/h50/default.aspx Or thirdly, make a huge 3-4 slot GPU card with 80mm fans blowing cool air from the front of the casing out the back, instead of the blower they're using now, & using a tower heatsink. (but it's upside down in most casings, isn't it?)
____________

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 15553 - Posted: 2 Mar 2010 | 0:13:27 UTC - in response to Message 15551.
Last modified: 2 Mar 2010 | 0:25:13 UTC

NVidia design chips, it's what they do, and they are good at it.
Unfortunately they overstretched themselves and started to pursue factory mods to manufacture larger numbers of 40nm GPUs. They tried to kill 2 birds with one stone, and missed lol
NVidia can, and I think will, produce a modified version of Fermi - with a reduced transistor count (it's what they do) - but in my opinion they should have done this earlier.
NVidia have already produced low to mid range products: GT210 & Ion (low and very low), GT 220 (low to mid), GT240 (mid range), but then there is a gap before you reach the end-of-line GTX 260s (excluding any older G92-based GPUs that are still being pushed out). By the way, I have tried a GTS 240; it uses G92b (like the GTS 250) and basically does not work well on GPUGrid, ditto for several other mid-high end G92-based cards. I'm not a gamer.

When you see Asus redesign PCBs and add a bigger, and in this case better, heatsink and fan, it is no wonder they get much better results. So there is more to be had! They have made 2 vastly more powerful GPUs and both create less noise! Well done, twice.
Tackle Fermi too, please. NVidia need that sort of help.

Profile robertmiles
Send message
Joined: 16 Apr 09
Posts: 503
Credit: 727,920,933
RAC: 15,982
Level
Lys
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 15554 - Posted: 2 Mar 2010 | 4:57:44 UTC - in response to Message 15537.

That was an interesting week's discussion ..... :)

Putting technical possibilities that may or may not present themselves to NVidia for averting the impending fermi disaster - and thats still my personal view - there is the very real and more pertinent commercials that have to be taken into account. ATI is already one step ahead of NVidia, they were mid last year, let alone now. They will by end of this year be two generations ahead, the Fermi hassles will not be resolved until early 2011 (aka Fermi2).

Regards
Zy


I'd say it's more like ATI is one step ahead in making GPU chips for graphics applications, but at least one step behind in making GPU chips for computation purposes and in offering the software needed to use them well for computation. This could lead to a splitting of the GPU market, though, with commercial consequences almost as bad as you predict.

It could also mean a major setback for GPU BOINC projects that aren't already using ATI graphics cards at least as well as Nvidia graphics cards, or well on the path to getting there; and the same for all commercial operations depending on the move to GPU computing.

Zy, I suppose that you'd automatically agree with all of ATI's rather biased opinions of the Fermi, if you haven't done this already:

http://www.brightsideofnews.com/news/2010/2/10/amd-community-has-29-tough-gf100-questions-for-nvidia.aspx

Profile robertmiles
Send message
Joined: 16 Apr 09
Posts: 503
Credit: 727,920,933
RAC: 15,982
Level
Lys
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 15555 - Posted: 2 Mar 2010 | 5:51:28 UTC - in response to Message 15551.

But if they've already gone into the production phase, it's too late, isn't it, to modify the design one way or the other? The only thing they can do now is make chips & put them on bigger boards. 28nm is the future, but isn't changing the design of Fermi also in the future? I don't have the knowledge that you guys have & you can read in my wording that I don't. But I do know that when a product goes into the production phase, you can't go back to the drawing board & redesign everything. The only thing you can do is find a way to make what you've got work. They're already doing the same thing with Fermi as they did with the GTX 260, making a less powerful GTX 280. If nVidia wants to launch the low-mid range before they can make a workable high end, it's just stalling, & they can't stall for too long. But if redesigning the PCB is easier & faster, then IMO it's the only way to get the original Fermi they wanted from the very start.

Here I'm thinking about 3 things. The first is to place the GPU outside the PC casing. The second is to go with water & bundle Fermi cards with "easy" water cooling kits that function like the Corsair Hydro: http://www.corsair.com/products/h50/default.aspx Or thirdly, make a huge 3-4 slot GPU card with 80mm fans blowing cool air from the front of the casing out the back, instead of the blower they're using now, & using a tower heatsink. (but it's upside down in most casings, isn't it?)


Production phase of THAT CHIP and too late to change the design of THAT CHIP in time to help.

But are you sure that they aren't already working on another chip design that cuts down the number of GPU cores enough to use perhaps half as many transistors, but is similar otherwise?

I'd think more along the lines of, for many products, putting the GPU outside the computer's main cabinet, but also adding a second CPU and second main memory close to it to provide a fast way of interfacing with it. This second cabinet should not even try to follow the ATX guidelines.

For some others, don't even try to make the board plug into all the card slots it blocks other cards from using; this should leave more space for a fan and a heatsink. Or, alternatively, make some of the cards small enough that they use the card slots they plug into for nothing more than a way to get extra power from the backplane. These products could still plug into normal motherboards, even if they don't fully follow the guidelines for cards for those motherboards. Your plans sound good otherwise.

As for using Fermi chips that don't have all the GPU cores usable, are you sure they don't have a plan to disable the unusable cores in a way that stops them from producing heat, then use them in a board family similar to the GTX 260 boards, which also used chips with many of the GPU cores disabled? That is likely to mean a problem for some software written assuming that all GPU cores are usable, though, and therefore difficult to use in creating programs for GPU BOINC projects.

Profile robertmiles
Send message
Joined: 16 Apr 09
Posts: 503
Credit: 727,920,933
RAC: 15,982
Level
Lys
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 15556 - Posted: 2 Mar 2010 | 6:16:57 UTC

An idea for Nvidia to try for any newer boards meant to use chips with some GPU cores disabled: add a ROM to the board design that holds information about which GPU cores are disabled (if the Fermi chip design doesn't already include such a ROM), then make their software read this ROM just before making a last-minute decision about which GPU cores to distribute the GPU program among on that board. Require any other providers of software for the new boards to state, in their public offerings of the software, whether it can also handle a last-minute decision of which GPU cores to use.

This could mean a new Nvidia application program specifically for taking GPU programs compiled in a way that does not specify which GPU cores are usable, and then distributing them among the parts of the GPU chip that actually are usable.
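For what it's worth, the CUDA runtime already reports how many multiprocessors are actually enabled on a given board, so software can make that kind of last-minute decision at run time without a separate ROM. A minimal query sketch using standard CUDA runtime calls (nothing board-specific is assumed):

```cpp
// query_sms.cu - report the number of enabled multiprocessors on each CUDA device.
// Compile with nvcc. Standard CUDA runtime calls only; nothing board-specific.
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    int device_count = 0;
    if (cudaGetDeviceCount(&device_count) != cudaSuccess || device_count == 0) {
        std::printf("No CUDA devices found.\n");
        return 1;
    }
    for (int d = 0; d < device_count; ++d) {
        cudaDeviceProp prop;
        cudaGetDeviceProperties(&prop, d);
        std::printf("Device %d: %s, %d multiprocessors, %.0f MHz core clock\n",
                    d, prop.name, prop.multiProcessorCount, prop.clockRate / 1000.0);
        // A work distributor could size its grid from prop.multiProcessorCount here,
        // so the same binary adapts to a 448-shader part or a 512-shader part.
    }
    return 0;
}
```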

Profile robertmiles
Send message
Joined: 16 Apr 09
Posts: 503
Credit: 727,920,933
RAC: 15,982
Level
Lys
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 15557 - Posted: 2 Mar 2010 | 6:20:23 UTC

Do you suppose that Intel's efforts against Nvidia are at least partly because Intel plans to buy Nvidia and convert them into a GPU division of Intel, but wants to drive down the price first?

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 15560 - Posted: 2 Mar 2010 | 9:20:29 UTC

Of course they're disabling non-functional shaders, just as in all previous high end chips. The GTX470 is rumored to have 448 shaders instead of 512 (2 of 16 blocks disabled).

And as mighty as Intel is, I don't think they've got anything to do with the current Fermi problems. Denying the QPI license yes, but nothing else.

MrS
____________
Scanning for our furry friends since Jan 2002

Michael Karlinsky
Avatar
Send message
Joined: 30 Jul 09
Posts: 21
Credit: 7,081,544
RAC: 0
Level
Ser
Scientific publications
watwatwatwatwatwatwatwat
Message 15566 - Posted: 2 Mar 2010 | 15:54:27 UTC

News from cebit:

heise.de (in German)

In short:

- one 6 and one 8 pin power connector
- GTX 480 might have less than 512 shaders
- problems with driver
- availability: end of April

Michael
____________
Team Linux Users Everywhere

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 15567 - Posted: 2 Mar 2010 | 19:37:49 UTC - in response to Message 15566.

Thanks! That means:

- 225 - 300 W power consumption, as 2 x 6-pin (75 W from the slot + 2 x 75 W = 225 W) isn't enough, while 6-pin + 8-pin allows up to 300 W (not very surprising..)
- the early performance numbers are to be taken with a grain of salt, as the drivers are not yet where they should be

MrS
____________
Scanning for our furry friends since Jan 2002

Profile Zydor
Send message
Joined: 8 Feb 09
Posts: 252
Credit: 1,309,451
RAC: 0
Level
Ala
Scientific publications
watwatwatwat
Message 15576 - Posted: 2 Mar 2010 | 23:42:25 UTC - in response to Message 15554.
Last modified: 2 Mar 2010 | 23:45:33 UTC

...... Zy, I suppose that you'd automatically agree with all of ATI's rather biased opinions of the Fermi ......


I would expect ATI Marketing to go on a feeding frenzy over this; it would be surprising if they did not.

No, I don't automatically go with the ATI viewpoint, mainly because I have been an NVidia fan for years - since they first broke into the market. I am the last one to want them to back out now. However, NVidia fan or not, they have to come up with the goods and be competitive; it's not a charity out there.

As for ATI articles etc, I treat those the same as NVidia articles - all pre-cleared by marketing and not worth the paper they are printed on until reality appears in the marketplace. Corporate press releases and articles these days are comparable to political statements released by mainstream political parties - the truth is only coincidental to the main objective of misleading the less well informed.

In other words, I now view both genres as automatic lies until corroborated by other sources or methods. As GDF wisely stated above, we will know in a month or so what's true .....

If NVidia come up with a good design on general release (not a short term subsidised PR stunt), they get my £s; if they don't, ATI will for the first time since NVidia appeared in the consumer/gamer end of the GPU market.

Regards
Zy

Profile liveonc
Avatar
Send message
Joined: 1 Jan 10
Posts: 292
Credit: 41,567,650
RAC: 0
Level
Val
Scientific publications
watwatwatwatwatwat
Message 15586 - Posted: 3 Mar 2010 | 19:43:34 UTC

For all those curious, here's a look at the goodies: http://www.theinquirer.net/inquirer/blog-post/1594686/nvidia-nda-broken-topless-gtx480-pics
____________

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 15588 - Posted: 3 Mar 2010 | 22:23:17 UTC - in response to Message 15586.

They should be ashamed of stripping this poor little GTX naked and revealing its private parts to the public!

MrS
____________
Scanning for our furry friends since Jan 2002

Profile liveonc
Avatar
Send message
Joined: 1 Jan 10
Posts: 292
Credit: 41,567,650
RAC: 0
Level
Val
Scientific publications
watwatwatwatwatwat
Message 15591 - Posted: 3 Mar 2010 | 23:11:15 UTC - in response to Message 15588.

The Internet was meant to show everybody naked, including nVidia. Don't blame people for being curious, or those with sight for not being blind. A German site I checked out yesterday used inches instead of cm and got me really confused. Either that or Google Translate didn't work... Isn't that worse? I sat in silence when I read that the Fermi would be 4.2 x 4.2 inches!
____________

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 15593 - Posted: 3 Mar 2010 | 23:21:25 UTC - in response to Message 15591.

Is that 12 x 256MB RAM I see?

Snow Crash
Send message
Joined: 4 Apr 09
Posts: 450
Credit: 539,316,349
RAC: 0
Level
Lys
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 15596 - Posted: 4 Mar 2010 | 1:20:18 UTC - in response to Message 15593.

I think it is 12 * 128 = 1536 (480) and 10 * 128 = 1280 (470)
____________
Thanks - Steve

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 15598 - Posted: 4 Mar 2010 | 8:59:27 UTC - in response to Message 15591.

Well, that's true.. but didn't anyone consider the feelings of this poor chip? If he's not an exhibitionist he'll be seriously ashamed by now and will probably ask himself the same question over and over again: "why, oh why did they do it?" Like the Rumpelwichte in Ronja Räubertochter. No wonder he's too distracted to run any world records now!

MrS
____________
Scanning for our furry friends since Jan 2002

Profile liveonc
Avatar
Send message
Joined: 1 Jan 10
Posts: 292
Credit: 41,567,650
RAC: 0
Level
Val
Scientific publications
watwatwatwatwatwat
Message 15599 - Posted: 4 Mar 2010 | 9:15:59 UTC - in response to Message 15598.

Did I see what I think I saw? On YouTube?! Somehow I don't think it was a coincidence that the scene ended where it ended. I'm more a fan of stupid humor than sick humor. This is more me: http://www.youtube.com/watch?v=VpZXhR1ibj8 Just pretend not to speak German, & it'll be like reading an Arabic translation on CNN ;-)
____________

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 15607 - Posted: 4 Mar 2010 | 19:17:14 UTC - in response to Message 15596.
Last modified: 4 Mar 2010 | 19:39:57 UTC

I think it is 12 * 128 = 1536 (480) and 10 * 128 = 1280 (470)


You may be talking about the shaders, which are on the main GPU (under the tin), or saying that there is 1.5GB DDR for the GeForce versions?
Anyway, I was referring to the 12 small dark chips surrounding the core; I think they are GDDR5 ECC RAM chips.

I just checked and Fermi will have up to 6GB GDDR5, so the 12 RAM chips could each be up to 512MB. If there are 3GB & 1.5GB versions as well, then they could use 256MB (or 128MB, as you said) chips. Mind you, if they are going to have 3GB or 1.5GB versions, perhaps they could leave chips off (6x512MB, or 3x512MB). The PCB would have to change a bit for that, but in turn it would reduce power consumption slightly.


http://www.reghardware.co.uk/2009/10/01/nvidia_intros_fermi/
Explains why it will be good for double-precision.

Snow Crash
Send message
Joined: 4 Apr 09
Posts: 450
Credit: 539,316,349
RAC: 0
Level
Lys
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 15609 - Posted: 4 Mar 2010 | 20:12:23 UTC - in response to Message 15607.

I was referring to memory ...
http://www.xtremesystems.org/forums/showthread.php?t=244211&page=66
http://en.wikipedia.org/wiki/Comparison_of_NVIDIA_graphics_processors#GeForce_400_Series
____________
Thanks - Steve

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 15611 - Posted: 4 Mar 2010 | 22:12:02 UTC - in response to Message 15609.

Thanks Steve,
So the GTX 470 will have 448 shaders @ perhaps 1296MHz(?), the GPU will be at 625MHz and the GDDR RAM @ 1600MHz (3200MHz effective). It will use 2555MB RAM total and, because it has 1280MB onboard, that tells us that the shaders depend directly on the RAM (and therefore bus width too) - disable a shader and you disable some RAM as well. I expect this means it has to use 12 RAM chips or disable accordingly?

Asus - an opportunity beckons!

The 220W TDP is a lot more attractive than 300W!

Not too many will be able to accommodate two 512-shader cards, but two 448-shader cards is doable (a good 750W PSU should do the trick).

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 15625 - Posted: 5 Mar 2010 | 23:59:19 UTC - in response to Message 15611.

So the GTX 470 will have 448 shaders @ perhaps 1296MHz(?), the GPU will be at 625MHz


The "slow core" will run at 1/2 the shader clock.

It will use 2555MB RAM total and because it uses 1280MB onboard


Where is the other half of that 2.5 GB if it's not on board?

that tells us that the shaders depend directly on the RAM (and therefore bus width too) – disable a shader and you disable some RAM as well.


No - correlation does not imply causality ;)

I expect this means it has to use 12 RAM chips or disable accordingly?


For a full 384 bit bus - yes. Unless someone makes memory chips with a 64 bit interface rather than 32 bit.

MrS
____________
Scanning for our furry friends since Jan 2002

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 15639 - Posted: 7 Mar 2010 | 15:31:12 UTC - in response to Message 15625.

If a GTX480 ships with 1.5GB RAM onboard and can use 2.5GB, then it would need to be using 1GB of system RAM to get to 2.5GB.

OK, so for a full 384 bit bus all 12 RAM chips are needed.
This would suggest that the bus for a GTX470 will actually be 320-bit, as two RAM chips will be missing (unless they are going to ship cards with RAM attached for no good reason).
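A quick back-of-the-envelope check of that chip counting (a sketch assuming 32-bit GDDR5 chips of the rumoured 128MB density - speculation, not confirmed specs):

// Rough sketch: each GDDR5 chip contributes 32 bits of bus width and,
// by the rumours, 128 MB of capacity. Numbers are speculation, not specs.
#include <cstdio>

int main() {
    const int bits_per_chip = 32;
    const int mb_per_chip   = 128;
    const int configs[]     = {12, 10};   // rumoured GTX 480 / GTX 470 chip counts

    for (int chips : configs) {
        printf("%2d chips -> %3d-bit bus, %4d MB\n",
               chips, chips * bits_per_chip, chips * mb_per_chip);
    }
    // Expected output:
    // 12 chips -> 384-bit bus, 1536 MB   (GTX 480 rumour)
    // 10 chips -> 320-bit bus, 1280 MB   (GTX 470 rumour)
    return 0;
}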

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 15644 - Posted: 7 Mar 2010 | 19:13:58 UTC - in response to Message 15639.

If a GTX480 will ship with 1.5GB RAM onboard the card, and can use 2.5GB, then it would need to be using 1GB system RAM to get to 2.5GB.


Adding the system RAM available to the card to the actual GPU memory is a bad habit OEMs use to post bigger numbers. Any modern card can use lots of system memory, but it does not really matter how much, as it's too slow anyway. It's like saying your PC has 2 TB of RAM because you just plugged in that 2 TB Green HDD.

This would suggest that the bus for a GTX470 will actually be 320bit, as two RAM chips will be missing.


Yes. Technically it's the other way around: it has 10 chips because of the 320 bit bus, but never mind. (disclaimer: if the specs are correct)

MrS
____________
Scanning for our furry friends since Jan 2002

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 15645 - Posted: 8 Mar 2010 | 9:00:49 UTC - in response to Message 15644.



Adding the system RAM available to the card to the actual GPU memory is a bad habit of the OEMs to post bigger numbers. Any modern card can use lots of system memory, but it does not really matter how much as it's too slow anyway. It's like saying your PC has 2 TB of RAM because you just plugged in that 2 TB Green HDD.

MrS


I know it is more of an advertisement scam than a reality, but I think games still tend to get loaded into system RAM (not that I play games), rather than sit on a DVD or the hard drive.

The Green HD's are better as a second drive ;) or if your systems stay on most of the time. The last one I installed had 64MB onboard RAM.

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 15650 - Posted: 8 Mar 2010 | 22:34:20 UTC - in response to Message 15645.

Sure, the game gets loaded into system mem - otherwise the CPU couldn't process anything. However, if the GPU has to swap to system memory, that's entirely different. It's a mapping of the private address space of the GPU into system memory, so logically it can access this memory just as if it was local. What it stores there is different from what's on the HDD and different from what the CPU processes, though. And as soon as the GPU has to use system mem your frame rate / performance takes a huge hit on everything but the lowest of the low end cards.. that's why using system mem for the GPU is "practically forbidden". Consider this: on high end GPUs we've got 100 - 150 GB/s of bandwidth, whereas an i7 with 3 DDR3 channels can deliver ~15 GB/s, more than would be available over PCIe 16x.
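To put rough numbers on that (an illustrative sketch using the bandwidths quoted above, not measurements):

// Sketch: how long moving a 256 MB working set takes at each bandwidth.
#include <cstdio>

int main() {
    const double gpu_local  = 120.0; // GB/s, typical high-end GDDR5 (100-150 GB/s)
    const double system_mem = 15.0;  // GB/s, i7 with 3 DDR3 channels, as delivered
    const double pcie2_x16  = 8.0;   // GB/s, PCIe 2.0 x16 per direction

    const double gb_to_move = 256.0 / 1024.0; // 256 MB working set, in GB
    printf("local VRAM : %.1f ms\n", 1000.0 * gb_to_move / gpu_local);
    printf("system RAM : %.1f ms\n", 1000.0 * gb_to_move / system_mem);
    printf("over PCIe  : %.1f ms\n", 1000.0 * gb_to_move / pcie2_x16);
    return 0;
}

Roughly 2 ms from local VRAM versus well over 15 ms once the data has to come from the host side, which is why the frame rate collapses.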

MrS
____________
Scanning for our furry friends since Jan 2002

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 15651 - Posted: 8 Mar 2010 | 23:42:52 UTC - in response to Message 15650.

I see your point - and that's about 3 times present PCIe (2.0 & 2.1). I don't think PCIe 3 would make too much difference either!

Profile liveonc
Avatar
Send message
Joined: 1 Jan 10
Posts: 292
Credit: 41,567,650
RAC: 0
Level
Val
Scientific publications
watwatwatwatwatwat
Message 15749 - Posted: 14 Mar 2010 | 4:38:19 UTC
Last modified: 14 Mar 2010 | 4:52:12 UTC

GALAXY GTX470 http://www.tcmagazine.com/comments.php?shownews=33185&catid=2
Supermicro, To Kill for! http://www.computerbase.de/bildstrecke/28637/1/
On UR March, get GTX470/480, 26th GO! http://www.fudzilla.com/content/view/17952/1/
____________

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 15753 - Posted: 14 Mar 2010 | 11:35:19 UTC

Looks like the fan on the 470 could use a diameter boost.

MrS
____________
Scanning for our furry friends since Jan 2002

Profile liveonc
Avatar
Send message
Joined: 1 Jan 10
Posts: 292
Credit: 41,567,650
RAC: 0
Level
Val
Scientific publications
watwatwatwatwatwat
Message 15841 - Posted: 19 Mar 2010 | 22:25:11 UTC
Last modified: 19 Mar 2010 | 22:26:31 UTC

Might not be such a flop. http://www.fudzilla.com/content/view/18147/65/ The GeForce GTX 480 has 480 stream processors, works at 700MHz for the core and 1401MHz for the shaders, and features 1536MB of memory that works at 1848MHz, paired with a 384-bit memory interface.
If they can get an approx 80% improvement in performance, e.g. 9800GTX+ vs GTX280 (http://www.tomshardware.com/charts/gaming-graphics-cards-charts-2009-high-quality-update-3/Sum-of-FPS-Benchmarks-1920x1200,1702.html - Sum of FPS Benchmarks 1920x1200),
then the GTX480 "could" score 315FPS, which is what the ATI Radeon HD 5970 scores. It's too rich for me though; I'm going to wait for a GTX460-SP432 (whenever that comes out)...
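For what it's worth, the theoretical throughput implied by those rumoured specs can be worked out directly (a sketch assuming 2 FLOPs per shader per clock for Fermi's FMA units, and that the 1848MHz figure is the double-pumped clock, i.e. ~3.7 GT/s effective GDDR5 data rate):

// Sketch: theoretical single-precision FLOPS and memory bandwidth from
// the rumoured GTX 480 numbers; assumptions noted above, not official specs.
#include <cstdio>

int main() {
    const int    shaders    = 480;
    const double shader_mhz = 1401.0;
    const double mem_gtps   = 2 * 1.848;   // ~3.696 GT/s effective GDDR5 rate
    const int    bus_bytes  = 384 / 8;     // 384-bit interface in bytes

    double gflops = shaders * shader_mhz * 2 / 1000.0;  // ~1345 GFLOPS SP
    double gbps   = mem_gtps * bus_bytes;                // ~177 GB/s

    printf("GTX 480 (rumoured): ~%.0f GFLOPS single precision, ~%.0f GB/s\n",
           gflops, gbps);
    return 0;
}

Whether real GPUGRID throughput tracks those theoretical numbers is another matter entirely.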
____________

Profile Zydor
Send message
Joined: 8 Feb 09
Posts: 252
Credit: 1,309,451
RAC: 0
Level
Ala
Scientific publications
watwatwatwat
Message 15851 - Posted: 20 Mar 2010 | 10:09:41 UTC - in response to Message 15841.
Last modified: 20 Mar 2010 | 10:13:47 UTC

It's already clear that a 5970 will outperform a 480 in its current incarnation, and that's without letting the 5970 loose to its full legs - the latter has another 25-30% performance in it above stock. One of the 480's problems is power draw, and it could be interesting to see how they get round that issue when going for a Fermi x 2 next year.

For now though, it's looking more like whether or not they have produced a card that gives sufficient real world advantage above a 295 to be commercially viable. The pricing will be constrained by the clear real world lead of the 5970 (a lead it will have for at least 12 months, and that's without thinking about a 5970 successor), the de facto 480 floor price of 295's and 5870's, and Fermi's voracious power needs. The real battle will be next year with a re-engineered Fermi2 and whatever the 5970 successor looks like.

Meanwhile I personally find myself caught waiting; I am in the market to replace my stalwart 9800GTX+. The only reason I am still waiting on what the reality of Fermi turns out to be is the need to run CUDA apps. In my case it's the new GPUGrid project about to start up; for that I need CUDA, and I am reluctant to go down the road of a 295. I tend to buy a card and use it for years, jumping generations. Without that need for CUDA, that box would be ATI already. So in a sense, another saving grace for NVidia is the CUDA dimension. Many people/organisations will have far more serious CUDA needs than me out there in the real commercial world - outside of benchmarks and ego trips - but the motivation will be similar, the need to run CUDA (for now).

NVidia have no chance of overtaking ATI on this release, but I hope they have got a sufficiently enticing real world performance/price improvement for Fermi1 to be seen as a worthy 295 successor.

Regards
Zy

Profile GDF
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Send message
Joined: 14 Mar 07
Posts: 1957
Credit: 629,356
RAC: 0
Level
Gly
Scientific publications
watwatwatwatwat
Message 15852 - Posted: 20 Mar 2010 | 10:45:07 UTC - in response to Message 15851.
Last modified: 20 Mar 2010 | 10:45:34 UTC

I don't think it is as clear as you depict it right now. It is likely that Fermi will be a factor of two faster than a GTX 285 for what matters to GPUGRID. For gaming, I would think that the 480 will be the fastest single GPU card out there. Let's wait another week and see.

gdf


ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 15853 - Posted: 20 Mar 2010 | 10:55:13 UTC - in response to Message 15841.

That's certainly an improvement compared to the earlier rumors! And especially the price of the GTX470. That could bring the HD5870 down to the $300 it was supposed to cost half a year ago :p

MrS
____________
Scanning for our furry friends since Jan 2002

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 15857 - Posted: 20 Mar 2010 | 13:49:29 UTC - in response to Message 15853.

I'm looking forward to the possibility of consolidating several systems into one with either a GTX470 or a GTX480, but only if the more recent prices and performance figures are close to correct. £300 to £400 is reasonable for such a card. However I would like to see a real review first!

Profile Zydor
Send message
Joined: 8 Feb 09
Posts: 252
Credit: 1,309,451
RAC: 0
Level
Ala
Scientific publications
watwatwatwat
Message 15861 - Posted: 20 Mar 2010 | 18:19:04 UTC - in response to Message 15852.

I don't think that is so clear as you depict it right now. It is likely that Fermi will be a factor two faster compared to a GTX 285 for what it matters GPUGRID. For gaming, I would think that the 480 will be the fastest single GPU card out there. Let's wait another week and see.


As a single GPU - it will be the fastest, for now. It's the real world incarnation of that compared to ATI's top card that will decide the day for 2010, and it will be significantly slower than a 5970, by an order of magnitude. The 5970 is of course - essentially - two 5870's, but the "average" consumer could care less, a card is a card etc etc, and a Fermi2 will not be upon us until 2011. Let's hope Fermi2 continues to play catch-up, because ATI will not stand still; I have no doubt ATI will release a "Fermi2 killer" single GPU card this year.

I hope NVidia sort out their engineering and production troubles by 2011; they need to, because at present ATI have a huge lead in a real world commercial sense. I have my fingers crossed that NVidia price Fermi1 to alleviate the huge power draw, and can sustain production - we all need competition out there - else ATI will do a re-run of 15 years ago, when they got complacent and let NVidia in the back door to wipe their face. ATI will not make that mistake again; it's just a case of whether or not NVidia can keep up. It's a funny ol' world - this time it's NVidia that's made the big strategic mistake (2007).

Probably going to be a couple of weeks before we really know, as the first reviews will be from "selected" reviewers friendly to NVidia marketing. If it is significantly better than a 295, I'm in, and will try to get one - it's likely the last one I will get from NVidia for a long time, and even then it's only because I want to support the new GPUGrid project; I would not even be considering it otherwise - and that is painful to me, as I have been an NVidia fan since they hit the consumer graphics market.

Regards
Zy

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 15862 - Posted: 20 Mar 2010 | 21:07:10 UTC - in response to Message 15861.

ATI cards have no use here. Leave it alone. This is a Fermi thread. There are ATI threads elsewhere.

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 15863 - Posted: 20 Mar 2010 | 21:26:37 UTC

In my world an order of magnitude is a factor of 10 ;)

And if ATI tried to build a Fermi-1-killer (i.e. be significantly faster, not "just" more economic at a comparable performance level) they'd run into the same problems Fermi faces. They'd have more experience with the 40 nm process, but they couldn't avoid the cost / heat / clock speed problems. The trick is not to try to build the biggest chip.

MrS
____________
Scanning for our furry friends since Jan 2002

CTAPbIi
Send message
Joined: 29 Aug 09
Posts: 175
Credit: 259,509,919
RAC: 0
Level
Asn
Scientific publications
watwatwatwatwatwatwatwatwatwat
Message 15864 - Posted: 20 Mar 2010 | 23:51:19 UTC

ATI will issue new cards in Q3 this year. Basically it's the same 58xx and 59xx series, but on the 28nm process available from GlobalFoundries later this summer. I'm considering buying a 5990 or 5970 (if I'm not patient enough :-) ) this summer for the milkyway@home project. Q3 2011 will bring a new generation of ATI cards.

If the GTX470 is enough of an improvement over the GTX275 I'll take one ($350 is OK for me), but if not... I was an nvidia fan for years, but maybe it's time to say "good luck, guys"
____________

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 15865 - Posted: 21 Mar 2010 | 0:03:17 UTC - in response to Message 15864.

I dare say the GTX470 will be twice as fast as a GTX275!
The GTX480 will be twice as fast as a GTX285, and that's before any consideration of architecture over and above the shader counts. You will know for sure next week ;)

Profile liveonc
Avatar
Send message
Joined: 1 Jan 10
Posts: 292
Credit: 41,567,650
RAC: 0
Level
Val
Scientific publications
watwatwatwatwatwat
Message 15866 - Posted: 21 Mar 2010 | 1:52:42 UTC

I'll not be the first; I hope others are willing. I'm going to wait until they work out the drivers, which usually aren't the best at launch. I'll wait until the stocks refill, the prices drop, they come out with a revision 2, and Fermis are sold at a discount. So, all that said, I expect to get my first Fermi in 2011 ;-)
____________

CTAPbIi
Send message
Joined: 29 Aug 09
Posts: 175
Credit: 259,509,919
RAC: 0
Level
Asn
Scientific publications
watwatwatwatwatwatwatwatwatwat
Message 15868 - Posted: 21 Mar 2010 | 3:58:03 UTC - in response to Message 15865.

I dare say the GTX470 will be twice as fast as a GTX275!

Hm... hope you are right :-) If it's twice as fast as my GTX275, priced at $350, and good at OCing (allowing it to reach at least stock GTX480 performance), I'll take it :-) OR - I'll wait for the 495 till early summer.

____________

Hans-Ulrich Hugi
Send message
Joined: 1 Jul 09
Posts: 2
Credit: 598,355,471
RAC: 0
Level
Lys
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 15880 - Posted: 21 Mar 2010 | 16:20:53 UTC
Last modified: 21 Mar 2010 | 16:24:12 UTC

It seems that the final specs and prices of both Fermi cards have gone public - and it seems that most of the (negative) rumours are true. The cards are slower than many hoped, and their price indicates that the performance isn't great.
Source:
http://www.heise.de/newsticker/meldung/Preise-und-Spezifikationen-der-Fermi-Grafikkarten-959845.html (in German)
Linked to:
http://vr-zone.com/articles/nvidia-geforce-gtx-480-final-specs--pricing-revealed/8635.html

Profile liveonc
Avatar
Send message
Joined: 1 Jan 10
Posts: 292
Credit: 41,567,650
RAC: 0
Level
Val
Scientific publications
watwatwatwatwatwat
Message 15882 - Posted: 21 Mar 2010 | 16:54:58 UTC - in response to Message 15880.

If, as they say, the performance of the GTX480 is on par with the HD5870 and the GTX470 is like the HD5850, then looking at the Sum of FPS Benchmarks 1920x1200 on Tom's Hardware: http://www.tomshardware.com/charts/gaming-graphics-cards-charts-2009-high-quality-update-3/Sum-of-FPS-Benchmarks-1920x1200,1702.html that "could" mean the GTX480 gains roughly 40% against the GTX285 and the GTX470 gains 25%. If that chart can be converted to compute capability, that "could" mean there is a 25-40% gain in performance. And that's before they make improvements to their drivers...
____________

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 15883 - Posted: 21 Mar 2010 | 17:26:41 UTC

Just a couple more days :)

I suppose the performance in games will be approximately what they're currently showing.. but the architecture itself should be capable of much more. Just consider GT200: initially it was just as fast as the compute capability 1.1 cards at GPU-Grid (at similar theoretical FLOPS). After some time we got an improvement of 40%, at the beginning of the year we got 30 - 40% more and now it's another 30 - 40% faster.
I'd expect something approximately similar with Fermi (the initial performance could be higher).

MrS
____________
Scanning for our furry friends since Jan 2002

Profile liveonc
Avatar
Send message
Joined: 1 Jan 10
Posts: 292
Credit: 41,567,650
RAC: 0
Level
Val
Scientific publications
watwatwatwatwatwat
Message 15884 - Posted: 21 Mar 2010 | 18:04:50 UTC - in response to Message 15883.
Last modified: 21 Mar 2010 | 18:28:21 UTC

I "guess", that it won't just be Workstation & Mainstream GPU's, Compute deserves it's own line. Not now, but soon...

And if ATI tried to build a Fermi-1-killer (i.e. be significantly faster, not "just" more economic at a comparable performance level) they'd run into the same problems Fermi faces. They'd have more experience with the 40 nm process, but they couldn't avoid the cost / heat / clock speed problems. The trick is not to try to build the biggest chip.


If Nvidia concentrates on Compute with Fermi & redoes Mainstream/Workstation separately... They're already ahead with Compute...

I'm also "curious" as to when a GPU can do without the CPU. Is an Nvidia GPU RISC or CISC? Would Linux be able to run a GPU-only PC?
____________

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 15885 - Posted: 21 Mar 2010 | 18:07:24 UTC - in response to Message 15883.
Last modified: 21 Mar 2010 | 18:16:52 UTC

February's increase was 60% for CC1.3 cards.
I expect Fermi to be CC1.4 (or some new CUDA capability numbering to be started). My concern would be that by moving to CC1.4, people with CC1.3 and lesser cards will eventually start seeing task failures, as has recently happened following the application updates; several people report only being able to successfully crunch using the older application. Mind you, it will be a while before any new apps are written for Fermi. I think the Fermi driver is still beta, and no doubt there will be a few revisions there!

I am also concerned that these new cards may not actually be especially good at crunching on GPUGrid. The first GTX 200 cards used 65nm cores and were much more error prone. The cores then shrank to 55nm and the GT200 was revised to B1 and so on. Perhaps these Fermi cards will follow that same path - so there could be lots of task errors with these early cards, and the cards may not really start to perform until there have been several revisions, perhaps not until some 28nm version makes it to the shops, and that is a long way off.

Profile liveonc
Avatar
Send message
Joined: 1 Jan 10
Posts: 292
Credit: 41,567,650
RAC: 0
Level
Val
Scientific publications
watwatwatwatwatwat
Message 15886 - Posted: 21 Mar 2010 | 20:02:57 UTC
Last modified: 21 Mar 2010 | 20:13:13 UTC

I finally found what I was looking for! http://www.lucidlogix.com/product-adventure2000.html One of these babies can put that nasty Fermi outside the case & give me 2 PCIe x16 or 4 PCIe x8 for every one PCIe slot on my mobo. If a cheap Atom is enough, one of these on the mobo would make it possible to use 2-4 high end CUDA GPUs to play crunchbox: http://www.lucidlogix.com/products_hydra200.html But I don't make PCs, I buy them. So maybe if the price ain't bad, & I can find out where I can get an Adventure 2000, I'd be able to run an external multi-Fermi GPU box...

Maybe if VIA supplies the CPU, Nvidia can do the rest, with or w/o Lucid.
____________

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 15892 - Posted: 21 Mar 2010 | 22:12:48 UTC

I "guess", that it won't just be Workstation & Mainstream GPU's, Compute deserves it's own line. Not now, but soon...


They do: more memory, stronger cooling, more testing and a higher price. Otherwise the chip is just the same; anything else would be ridiculously expensive now and in the coming couple of years.

Developing a GPU architecture is so expensive that you don't just redesign it for more compute or texture power. What you can do is create a modular design, where you can easily choose the number of shader clusters, memory controllers etc. And that's just what they've been doing for years, and that's how the mainstream chips will be built: from the same fundamental blocks, just fewer of them. And since each shader cluster contains the shaders (compute, shaders in games) and the texture and other fixed function units (gaming only), there won't be any shifts in performance between "compute" and "gaming".

Would Linux be able to run a GPU only PC?


That would probably require starting from scratch and would end up being quite slow. Not impossible, though ;)

@SK: compatibility is going to be a problem for GPUs in the long run. Currently they're still changing so much, so quickly, that supporting old hardware is going to become a pain. Ideally the driver should take care of this - but just how long are nVidia (and ATI) going to support old cards which are not sold any more? Drivers are not perfect and need expensive debugging as well. Which, btw, is the cause of the "current" problems with initial GT200 chips: a driver bug which nVidia can't or doesn't want to fix. This should calm your concerns regarding the first Fermis, though: all nVidia has to do is get the driver done properly..

MrS
____________
Scanning for our furry friends since Jan 2002

M J Harvey
Send message
Joined: 12 Feb 07
Posts: 9
Credit: 0
RAC: 0
Level

Scientific publications
wat
Message 15893 - Posted: 22 Mar 2010 | 0:06:15 UTC - in response to Message 15886.

liveonc

I finally found what I was looking for!


http://www.colfax-intl.com/ms_tesla.asp?M=102 would involve less faffing about with sheet metal. Ain't cheap, though.

MJH

Profile liveonc
Avatar
Send message
Joined: 1 Jan 10
Posts: 292
Credit: 41,567,650
RAC: 0
Level
Val
Scientific publications
watwatwatwatwatwat
Message 15898 - Posted: 22 Mar 2010 | 5:15:57 UTC - in response to Message 15893.
Last modified: 22 Mar 2010 | 5:36:51 UTC

Too damn rich for me! Besides, it runs Workstation GPU's. I'm thinking about cost reduction. I bet that I could get an i7 mobo with 3 PCIe x16 running 3 Lucid Adventure 2000 boxes running 2-4 GTX295 each (depending on if you want 2 x16 PCIe or 4 x8 PCIe), costing less than a single Colfax CXT8000. That would be 1 CPU & 12-24 GPU's VS 2 CPU's & 8 GPU's.

It's all about "affordable" Supercomputing, for even the most unpopular, ill-funded, mad scientist out there (or helping one out)...
____________

Profile MJH
Project administrator
Project developer
Project scientist
Send message
Joined: 12 Nov 07
Posts: 696
Credit: 27,266,655
RAC: 0
Level
Val
Scientific publications
watwat
Message 15900 - Posted: 22 Mar 2010 | 10:45:16 UTC - in response to Message 15898.

bet that I could get an i7 mobo with 3 PCIe x16 running 3 Lucid Adventure 2000 boxes running 2-4 GTX295 each (depending on if you want 2 x16 PCIe or 4 x8 PCIe), costing less than a single Colfax CXT8000.


If you can, do let us know! I've not seen the Adventure board productised anywhere...

MJH

Profile liveonc
Avatar
Send message
Joined: 1 Jan 10
Posts: 292
Credit: 41,567,650
RAC: 0
Level
Val
Scientific publications
watwatwatwatwatwat
Message 15903 - Posted: 22 Mar 2010 | 10:56:13 UTC - in response to Message 15900.

I haven't seen one either yet, but for now there is this: http://eu.msi.com/index.php?func=proddesc&maincat_no=1&cat2_no=170&prod_no=1979 It isn't what I had in mind, but it's there...

Powered by Hydra Engine, Big Bang Fuzion can offer the most flexible upgradability on 3D performance, allowing users to install cross-vender GPUs in a single system. The technology can perform scalable rendering to deliver near-linear gaming performance by load-balancing graphics processing tasks.
____________

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 15904 - Posted: 22 Mar 2010 | 10:56:29 UTC - in response to Message 15898.



The above ships with 4 Teslas and costs as much as a new car.

You would be better off just buying a good PSU and getting a couple of Fermi cards (or a new system, or two, or three...).
Two GTX480 cards will probably do as much work as 4 or 5 GTX295s, i.e. more than 4 Teslas!

Profile liveonc
Avatar
Send message
Joined: 1 Jan 10
Posts: 292
Credit: 41,567,650
RAC: 0
Level
Val
Scientific publications
watwatwatwatwatwat
Message 15905 - Posted: 22 Mar 2010 | 11:33:20 UTC - in response to Message 15904.
Last modified: 22 Mar 2010 | 12:01:31 UTC



The Adventure is a PCIe expansion board based on a derivative of the HYDRA 100 ASIC (LT12102). The system allows easy, high speed connection to any PC or Workstation platform using the PCIe bus. The system is optimized for GPUs and provides superb connectivity to different PCIe devices such as SSDs, HDs and other peripherals. The board has four PCIe slots and is powered by a standard PC power supply. The Adventure is best for your multi displays, broadcasting, digital signage and storage solution. The Adventure board is typically deployed within a 4U rack mount case together with a standard power supply and up to four slots.

That's why I'm interested in a Lucid Adventure! A 1000W PSU and x2 GTX480 for every PCIe slot I've got in my PC. GPUGRID runs WUs that use 1 core per GPU; that would require a Core i9, then we're crunching!
____________

Profile MJH
Project administrator
Project developer
Project scientist
Send message
Joined: 12 Nov 07
Posts: 696
Credit: 27,266,655
RAC: 0
Level
Val
Scientific publications
watwat
Message 15906 - Posted: 22 Mar 2010 | 11:43:00 UTC - in response to Message 15904.

It's actually a rebranded Tyan FT72B7015 http://www.tyan.com/product_SKU_spec.aspx?ProductType=BB&pid=412&SKU=600000150. Difficult to source though.

For you keen GPUGRID crunchers, it might be better to find the minimum-cost host system for a GPU. CPU performance isn't an issue for our apps, so a very cheap motherboard-processor combination would do, perhaps in a wee box like this one: http://www.scan.co.uk/Product.aspx?WebProductId=982116

MJH

Profile robertmiles
Send message
Joined: 16 Apr 09
Posts: 503
Credit: 727,920,933
RAC: 15,982
Level
Lys
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 15907 - Posted: 22 Mar 2010 | 12:21:09 UTC - in response to Message 15884.

I "guess", that it won't just be Workstation & Mainstream GPU's, Compute deserves it's own line. Not now, but soon...

And if ATI tried to build a Fermi-1-killer (i.e. be significantly faster, not "just" more economic at a comparable performance level) they'd run into the same problems Fermi faces. They'd have more experience with the 40 nm process, but they couldn't avoid the cost / heat / clock speed problems. The trick is not to try to build the biggest chip.


If Nvidia concentrate on Compute with Fermi & Redo Mainstream/Workstation. They're already ahead with Compute...

I'm also "curious" to when a GPU can do without the CPU. Is an Nvidia a RISC or a CISC? Would Linux be able to run a GPU only PC?


Is the Compute market big enough for Nvidia to earn enough on it without a very large reduction in their sales?

If I understand the GPU architectures correctly, they do NOT include the capability of reaching memory or peripherals off the graphics boards. Therefore, they cannot reach any BOINC projects.

Profile robertmiles
Send message
Joined: 16 Apr 09
Posts: 503
Credit: 727,920,933
RAC: 15,982
Level
Lys
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 15909 - Posted: 22 Mar 2010 | 12:33:47 UTC - in response to Message 15886.

I finally found what I was looking for! http://www.lucidlogix.com/product-adventure2000.html One of these baby's can put that nasty Fermi outside the case & give me 2 PCIe x16 or 4 PCIe x8 for every one PCIe on my mobo. If a cheap Atom is enough, one of these on the mobo would make it possible to use 2-4 High End CUDA GPU's to play crunchbox: http://www.lucidlogix.com/products_hydra200.html But I don't make PC's, I buy them. So maybe if the price ain't bad, & I can find out where i can get an Adventure 2000, I'd be able to run an external multi Fermi GPU box...

Maybe if VIA supplies the CPU, Nvidia can do the rest, with or w/o Lucid.


Looks like a good idea, if GPUGRID decides to rewrite their application to require less communication between the CPU section and the GPU section.

Profile liveonc
Avatar
Send message
Joined: 1 Jan 10
Posts: 292
Credit: 41,567,650
RAC: 0
Level
Val
Scientific publications
watwatwatwatwatwat
Message 15911 - Posted: 22 Mar 2010 | 13:50:57 UTC - in response to Message 15907.
Last modified: 22 Mar 2010 | 14:45:17 UTC

Are you 100% sure about this? Lucid claims that:

The Adventure is a powerful PCIe Gen2.0 expansion board for heavy graphic computing environment. The platform allows connection of multiple PCIe based devices to a standard PC or server rack.The Adventure 2000 series is based on the different derivative of the HYDRA 200 series targeting a wide range of performance market segments. It provides a solution for wide range of applications such as gaming (driver required), GPGPU, high performance computing, mass storage, multi-display, digital and medical imaging.

Adventure 2500
The Adventure 2500 platform is based on Lucid’s LT24102 SoC, connecting up to 4 heterogeneous PCIe devices to a standard PC or Workstation using standard PCIe extension cable. The Adventure 2500 overcomes the imitations of hosting high-end PCIe devices inside their Workstation/PC systems. Those limitations are usually associated with space, power, number of PCIe ports and cooling.
The solution is seamless to the application and GPU vendor in order to meet the needs of various computing and storage markets.


That said, what would be required is:

x1 Lucid Adventure 2000 http://www.lucidlogix.com/product-adventure2000.html

x1 4U rack mount

x1 850W-1000W PSU

x2 GTX480

With that in place, you're supposed to just connect the thing to one of your PCIe slots within your PC. That's "maybe" $2000 a pop...

I'm also thinking that if x8 PCIe 2.0 doesn't affect performance, x4 GTX480 & maybe a 1500W PSU might be possible for "maybe" $3000 a pop...
____________

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 15917 - Posted: 22 Mar 2010 | 15:26:22 UTC - in response to Message 15905.
Last modified: 22 Mar 2010 | 15:27:25 UTC

GPUGRID runs WU's that use 1 core pr GPU, that would require an Core i9, then we're crunching!


That’s not the case:
GPUGrid tasks do not require the CPU component of a process to be explicitly associated with one CPU core (or logical core in the case of the i7). So, a low-cost single core CPU could support two (present) high end GPUs!

As quad core i7's use Hyperthreading, they have 8 logical cores. So, even if GPUGrid tasks were exclusively associated with one core, an i7 could support 8 GPUs! Remember: the faster the CPU, the less CPU time required!
I normally crunch CPU tasks on my i7-920, leaving one logical core free for my two GT240s. Overall my system only uses about 91% of the CPU, so bigger cards would be fine!

As my two GT240s (equivalent to one GTX260 sp216) only use 3.5% of my CPU, a GTX 295 would use about 7% and two GTX 295s about 14%.
Therefore two GTX480s would use about 33% - so an i7-920 could support 6 Fermi cards crunching on GPUGrid!
Most people would be better off with a highly clocked dual core CPU (3.33GHz) than say a Q6600 (at only 2.4GHz), or could just overclock it and leave a core or two free.

PS. There is no i9; Intel ended up calling it the i7-980X. But fortunately you don’t need it, as it costs £855. Better to build a dual GTX470 based system for that.
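A rough sketch of the headroom estimate above (the per-card CPU share is extrapolated from the GT240 figure, a guess rather than a measurement):

// Sketch: how many GTX 480s one i7-920 could feed, assuming the whole CPU
// is given over to GPU support and ~16.5% of it is needed per card.
#include <cstdio>

int main() {
    const double cpu_per_gtx480 = 16.5;  // % of the i7-920, extrapolated guess
    const double cpu_budget     = 100.0; // leave nothing for CPU projects

    int max_cards = static_cast<int>(cpu_budget / cpu_per_gtx480);
    printf("~%d GTX 480s per i7-920 before the CPU becomes the limit\n", max_cards);
    return 0;
}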

Profile liveonc
Avatar
Send message
Joined: 1 Jan 10
Posts: 292
Credit: 41,567,650
RAC: 0
Level
Val
Scientific publications
watwatwatwatwatwat
Message 15918 - Posted: 22 Mar 2010 | 15:38:28 UTC - in response to Message 15917.
Last modified: 22 Mar 2010 | 15:47:36 UTC

But can you answer whether it's robertmiles or Lucid who's right? Is it possible to build an "affordable" supercomputer consisting of 4x 4U rack mounts, one for a single or dual CPU i7 & 3 for 2-4 GTX GPUs each? If my "guess" is right, such a system would cost $8,000-12,000, & nobody says that you have to use it for crunching GPUGRID.net projects...

Do you know if there's still prize money going to whoever finds the largest prime number? If so, can CUDA help find it, if putting together that "affordable" supercomputer is possible?
____________

Snow Crash
Send message
Joined: 4 Apr 09
Posts: 450
Credit: 539,316,349
RAC: 0
Level
Lys
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 15919 - Posted: 22 Mar 2010 | 16:01:44 UTC - in response to Message 15918.

On the list of things to investigate to get your supercomputer project off the ground, I would suggest that you look into how many GPUs in one system the NVidia drivers will properly support. Are there OS limits? How about motherboard BIOS limits?

Talk to the people with multiple GTX295 cards and you will see they had to do unconventional things regarding drivers and BIOS.

skgiven ... I think that the current Linux version of the GPUGrid app is using up a full CPU core per GPU. That's probably the single biggest reason I have not tried Linux yet. I have seen some fast runtimes, which always interest me, but I am just not willing to take that much away from WCG.
____________
Thanks - Steve

Profile liveonc
Avatar
Send message
Joined: 1 Jan 10
Posts: 292
Credit: 41,567,650
RAC: 0
Level
Val
Scientific publications
watwatwatwatwatwat
Message 15920 - Posted: 22 Mar 2010 | 16:34:46 UTC - in response to Message 15919.
Last modified: 22 Mar 2010 | 16:36:43 UTC

It's still pretty raw, but this is an article I found from PC Perspective: http://www.pcper.com/article.php?aid=815&type=expert&pid=1

They showed & tested the Hydra 200 & the Adventure. There was a mention of folding@home.
____________

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 15923 - Posted: 22 Mar 2010 | 20:14:11 UTC - in response to Message 15919.

Snow Crash, you're right. 6.04 is still running over 90% CPU time on Linux, but it is saving 50min per task - which means 5h rather than 5h 50min, or about 16.5% better in terms of results/points.
My GTX 260 brings back about 24500 points per day, so under Linux I would get about 28500 per day (~4000 more). The quad CPU on my system only gets around 1300 BOINC points (my GTX 260 does almost 20 times the work). So using Linux I would get 2600 more points per day on that one system, assuming I could not run any other CPU tasks.
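The arithmetic behind that estimate, spelled out (a sketch using the run times and points quoted above):

// Sketch: throughput gain from a shorter per-task run time.
#include <cstdio>

int main() {
    const double win_hours   = 5.0 + 50.0 / 60.0;  // 5 h 50 min per task
    const double linux_hours = 5.0;                // 5 h per task

    double gain = win_hours / linux_hours - 1.0;   // ~0.17, i.e. ~17% more tasks/day
    printf("throughput gain: ~%.1f%%\n", 100.0 * gain);
    printf("points/day: %.0f -> ~%.0f\n", 24500.0, 24500.0 * (1.0 + gain));
    return 0;
}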

Liveonc, I would say that you would really have to talk to someone who has one to know what it can do - if the Lucid Hydra is even available yet? You would also, as Snow Crash said, need to look into the GPU drivers, other software (the application), and especially the motherboard's limitations. You would also need to be very sure what you want it for: crunching, gaming, rendering or display.

I guess that a direct PCIe cable would allow the whole GPU box to function 'basically' like a single GPU device, and the Hydra is essentially an unmanaged GPU Layer 1&2 switch. The techs here might be able to tell you if it could at least theoretically work.
Although I design and build bespoke systems and servers, this is all new and rather expensive kit. It is a bit of a niche technology, so it will be expensive and of limited use. That said, if it could be used for GPUGrid, I am sure the techs would be very interested in it, as it would allow them to run their own experiments internally and develop new research techniques.

For most, one Fermi in a system will be plenty! Two will be for the real hard core enthusiast or gamer. For the rare motherboards that might actually support 3 Fermis you really are looking at a 1200W PSU (or 2 PSUs) in a very well ventilated tower system (£2000+).

Profile liveonc
Avatar
Send message
Joined: 1 Jan 10
Posts: 292
Credit: 41,567,650
RAC: 0
Level
Val
Scientific publications
watwatwatwatwatwat
Message 15924 - Posted: 22 Mar 2010 | 20:50:03 UTC - in response to Message 15923.
Last modified: 22 Mar 2010 | 21:02:15 UTC

Sorry for asking too many questions that can't really be answered unless you actually had the thing. I'm just fascinated by the possible potential of the Hydra, and not just for use with GPUGRID.net; I myself "sometimes" like to play games on my PC. The fact that most of the GPUs used on GPUGRID.net are mainstream GPUs, and that Hydra allows mixing Nvidia with ATI cards, and Nvidia & ATI GPUs of different types, brought to mind someone in another thread who looked at a GTX275 with a GTS250 physics card. Hydra "might" allow mixing different GPU chips on the same card, and "might" also enable dual GPU chip cards with e.g. a Cypress & a Fermi, instead of putting the Hydra on the mainboard or going external.

Also, Nvidia abandoned the idea of Hybrid-SLI GeForce® Boost, & decided to go with NVIDIA Optimus instead. If notebooks had a Hydra, they "might" be able to use both integrated & discrete graphics, instead of just switching between the two.
____________

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 15926 - Posted: 22 Mar 2010 | 21:50:24 UTC - in response to Message 15924.
Last modified: 22 Mar 2010 | 21:53:55 UTC

Hydra is very interesting - it has unknown potential ;) I am sure many event organisers (DJs) would love to see it in a laptop, or a box that could be attached to a laptop, for performances! It could even become a must-have for ATI and NVidia alike when demonstrating new GPUs to audiences.


Back to Fermi.

I was thinking about the difference between DDR3 and GDDR5. In itself it should mean about a 20 to 25% difference. So with improved Fermi architecture (20%) and slightly better clocks (~10%) I think the GTX480 will be at least 2.6 times as fast as a GTX295 but perhaps 2.8 times as fast.

I think they are keeping any chips with 512 working shaders aside, for monster dual cards. Although the clocks are not likely to be so high, a card that could do 4.5 times the work of a GTX295 would be a turn up for the books.

Should any lesser versions of Fermi turn up with DDR3, for whatever reason, avoid them at all costs!

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 15927 - Posted: 22 Mar 2010 | 22:36:53 UTC

Hydra: it's not there yet for games; they still have to sort out the software side. Load balancing similar GPUs is difficult enough (see SLI and Crossfire), but with very different GPUs it becomes even more challenging. They'd also have to teach their software how crunching with Hydra would work.

Back to Fermi: there won't be any DDR3 GF100 chips - that would be totally insane. Really bad publicity, and still a very expensive board which hardly anyone would buy. Manufacturers play these tricks a lot with low end cards (which are mostly bought by people who don't know / care about performance), and sometimes mid-range cards.

I was thinking about the difference between DDR3 and GDDR5. In itself it should mean about a 20 to 25% difference.


No, it doesn't. If you take a well balanced chip (anything else doesn't get out of the door at the mid to high end range anyway) and increase its raw power by a factor of 2, you'll get anything between 0 and 100% as a speedup, depending on the application. In games probably more like 30 - 70%.
If you double raw power and double memory bandwidth (and keep relative latency constant) then you'll see a 100% speedup across all applications. The point is: higher memory speed doesn't make you faster by definition, because the memory bandwidth requirements scale with performance.
So if, for example, Fermi got 3 times as much raw crunching power as GT200 and only 2 times the bandwidth, the memory is not going to speed things up, regardless of it being GDDR5 or whatever.

However, Fermi is more than just an increase of raw power: there's also the new caching system, which should alleviate the need for memory bandwidth somewhat, but doesn't change the fundamental issue.
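A toy model of that balance argument (illustrative only; real kernels land somewhere between the two extremes):

// Sketch: a scaled-up chip's speedup is capped by whichever resource grew
// the least. Purely compute-bound work gets the full compute scaling,
// purely bandwidth-bound work only gets the bandwidth scaling.
#include <algorithm>
#include <cstdio>

int main() {
    const double compute_scale   = 3.0; // e.g. hypothetical raw FLOPS growth
    const double bandwidth_scale = 2.0; // e.g. memory bandwidth growth

    double best  = std::max(compute_scale, bandwidth_scale);
    double worst = std::min(compute_scale, bandwidth_scale);
    printf("speedup between %.1fx and %.1fx depending on the kernel\n", worst, best);
    return 0;
}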

MrS
____________
Scanning for our furry friends since Jan 2002

CTAPbIi
Send message
Joined: 29 Aug 09
Posts: 175
Credit: 259,509,919
RAC: 0
Level
Asn
Scientific publications
watwatwatwatwatwatwatwatwatwat
Message 15928 - Posted: 23 Mar 2010 | 1:54:38 UTC - in response to Message 15926.
Last modified: 23 Mar 2010 | 1:59:42 UTC

I think the GTX480 will be at least 2.6 times as fast as a GTX295 but perhaps 2.8 times as fast.

I think they are keeping any chips with 512 working shaders aside, for monster dual cards. Although the clocks are not likely to be so high, a card that could do 4.5 times the work of a GTX295 would be a turn up for the books.


Where did you get this from? AFAIK from various forums, here are some "facts":
- GTX480 and GTX470 are faster than the GTX285 by 40% and 25% respectively
- GTX480 is weaker than the 5870 in Vantage
- OCing is really poor and at the same time makes the card "critically hot"
- VERY noisy
- if the quantity of good 512-core chips is enough, they will be for Tesla only
- Fermi2 will be no earlier than summer 2011, most probably fall 2011, even if they start redesigning today
- there are plans for a GTX495. BUT: it will be based on the 275 chip (retail name GTX470 with 448 cores), not on the 375 (GTX480 - 480 cores). Another problem is PCIe certification (300W), which will be really hard to meet. Power consumption is gonna be "mind blowing".

Sure, let's wait for a week and see if all this is true or not
____________

Profile robertmiles
Send message
Joined: 16 Apr 09
Posts: 503
Credit: 727,920,933
RAC: 15,982
Level
Lys
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 15929 - Posted: 23 Mar 2010 | 7:07:24 UTC - in response to Message 15911.
Last modified: 23 Mar 2010 | 7:18:31 UTC

Are you 100% sure about this? Lucid claims that:

The Adventure is a powerful PCIe Gen2.0 expansion board for heavy graphic computing environment. The platform allows connection of multiple PCIe based devices to a standard PC or server rack.The Adventure 2000 series is based on the different derivative of the HYDRA 200 series targeting a wide range of performance market segments. It provides a solution for wide range of applications such as gaming (driver required), GPGPU, high performance computing, mass storage, multi-display, digital and medical imaging.

Adventure 2500
The Adventure 2500 platform is based on Lucid’s LT24102 SoC, connecting up to 4 heterogeneous PCIe devices to a standard PC or Workstation using standard PCIe extension cable. The Adventure 2500 overcomes the imitations of hosting high-end PCIe devices inside their Workstation/PC systems. Those limitations are usually associated with space, power, number of PCIe ports and cooling.
The solution is seamless to the application and GPU vendor in order to meet the needs of various computing and storage markets.


That said, what would be required are:

x1 Lucid Adventure 2000 http://www.lucidlogix.com/product-adventure2000.html

x1 4U rack mount

x1 850W-1000W PSU
x2 GTX480

With that in place, you're supposed to just connect the thing to one of your PCIe slots within your PC. That's "maybe" $2000 a pop...

I'm also thinking that if x8 PCIe 2.0 doesn't affect performance, x4 GTX480s & maybe a 1500W PSU might be possible for "maybe" $3000 a pop...


Is who 100% sure about what?

Looks like the GPUGRID project scientists and programmers need to say more about just how fast the CPU-GPU communications needs to be, and how powerful a CPU is needed to keep up with the needs of two or four GTX480s.

Profile GDF
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Send message
Joined: 14 Mar 07
Posts: 1957
Credit: 629,356
RAC: 0
Level
Gly
Scientific publications
watwatwatwatwat
Message 15930 - Posted: 23 Mar 2010 | 8:43:29 UTC - in response to Message 15929.



Are you 100% sure about this? Lucid claims that:


Looks like the GPUGRID project scientists and programmers need to say more about just how fast the CPU-GPU communications needs to be, and how powerful a CPU is needed to keep up with the needs of two or four GTX480s.


It does not matter at all. The code always runs on the GPU only, apart from I/O.
gdf

Profile liveonc
Avatar
Send message
Joined: 1 Jan 10
Posts: 292
Credit: 41,567,650
RAC: 0
Level
Val
Scientific publications
watwatwatwatwatwat
Message 15932 - Posted: 23 Mar 2010 | 8:56:15 UTC - in response to Message 15930.
Last modified: 23 Mar 2010 | 9:07:50 UTC

Well, I "guess" that maybe after Lucid sorts itself out, it "might" just be the One Chip to rule them all, One Chip to find them, One Chip to bring them all and in the darkness bind them In the Land of Mordor where the Shadows lie. ;-) With that said, I hope the Orcs get it before the Hobits destroys it. BTW, I read somewhere that Intel has a stake in Lucid.
____________

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 15935 - Posted: 23 Mar 2010 | 11:12:06 UTC - in response to Message 15932.

I bet it's a pointy stake with a lawyer behind it.
ATI and NVidia will hardly be bending over backwards to make their kit work with Hydra then!

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 15948 - Posted: 23 Mar 2010 | 21:53:33 UTC - in response to Message 15928.

AFAIK from various forums, here's some "facts":
- GTX480 and GTX470 are faster then GTX285 on 40% and 25% respectively
...
- if the quantity of good 512 core will be enough, it will be for Tesla only


I second that: reserving fully functional chips for ultra expensive Teslas is a smart move (if you don't have many of them ;)

Regarding the performance estimate: I don't doubt someone measured this with some immature hardware and driver, but I don't think it's going to be the final result. Consider this:

When ATI went from DX10 to DX11 they required 1.10 times as many transistors per shader (2200/1600 : 1000/800) and got 11% less performance per shader per clock (see 4870 1 GB vs. 5830).

If the 40% for the GTX480 versus the GTX285 were true, that would mean nVidia spent 1.07 times as many transistors per shader (3200/512 : 1400/240) and got 0.74 times as much performance per shader per clock, i.e. about 26% less [140%/(480*1400) : 100%/(240*1476)]. nVidia's strength so far has been design, so I don't think they'd screw this one up so badly - especially since they've already screwed up manufacturing and dimensioning.
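
For anyone who wants to re-run that arithmetic, here is the calculation spelled out (same figures as in the post; the 140% is the rumoured GTX480-vs-GTX285 speedup, not a measurement):

#include <cstdio>

int main()
{
    // transistors per shader, DX11 chip versus DX10 chip
    double ati = (2200.0 / 1600.0) / (1000.0 / 800.0);   // ~1.10
    double nv  = (3200.0 /  512.0) / (1400.0 / 240.0);   // ~1.07
    // performance per shader per clock, rumoured GTX480 (140%) vs GTX285 (100%)
    double perf = (1.40 / (480 * 1400.0)) / (1.00 / (240 * 1476.0));  // ~0.74
    printf("ATI %.2f  nVidia %.2f  perf/shader/clock %.2f\n", ati, nv, perf);
    return 0;
}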

MrS
____________
Scanning for our furry friends since Jan 2002

CTAPbIi
Send message
Joined: 29 Aug 09
Posts: 175
Credit: 259,509,919
RAC: 0
Level
Asn
Scientific publications
watwatwatwatwatwatwatwatwatwat
Message 15957 - Posted: 24 Mar 2010 | 20:22:14 UTC - in response to Message 15948.


Regarding the performance estimate: I don't doubt someone measured this with some immature hardware and driver, but I don't think it's going to be the final result.
MrS


I understand what you are talking about. But remember - this time nvidia changed the "dispatching scheme": now they are sending out cards and drivers directly themselves, not through the manufacturers as before. In fact, nvidia has already sent cards and the latest driver to some reporters close to them, and these results came from one of them.

Frankly speaking, I personally expected (and nvidia promised) far better performance and frequencies, so if this is true I'm going to stay with my GTX275; maybe I'll try to get a second one for cheap. But no Fermi for me until Fermi2 is available, sorry…


____________

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 15962 - Posted: 24 Mar 2010 | 21:05:10 UTC - in response to Message 15957.

Patience is a virtue ;)

Generally I'm not very keen on advising people to spend their money on something which is a hobby for us, after all. BOINC is meant to use spare computation time, not to make people invest a fortune into new hardware ;)
However, if someone wants to buy I'll gladly try to tell them what makes sense from my point of view. Which occasionally is "don't buy".

MrS
____________
Scanning for our furry friends since Jan 2002

MarkJ
Volunteer moderator
Volunteer tester
Send message
Joined: 24 Dec 08
Posts: 738
Credit: 200,909,904
RAC: 0
Level
Leu
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 15970 - Posted: 25 Mar 2010 | 11:59:33 UTC

Leaked specs and pricing from Toms hardware

GeForce GTX 480 : 480 SP, 700/1401/1848MHz core/shader/mem, 384-bit, 1536MB, 295W TDP, US$499

GeForce GTX 470 : 448 SP, 607/1215/1674MHz core/shader/mem, 320-bit, 1280MB, 225W TDP, US$349

Looks to me like the GTX470 is the better bang for the buck. Presumably the Teslas get the fully functional chips (i.e. the full 512 SP), but no idea on pricing for them.

There is some mention of a GF104 chip to come out Q2 or Q3 2010 that is meant to be a lower-cost version (and presumably lower spec'ed) of the GF100 chip. It may be worthwhile waiting for these.
____________
BOINC blog

CTAPbIi
Send message
Joined: 29 Aug 09
Posts: 175
Credit: 259,509,919
RAC: 0
Level
Asn
Scientific publications
watwatwatwatwatwatwatwatwatwat
Message 15971 - Posted: 25 Mar 2010 | 12:28:09 UTC - in response to Message 15962.

ExtraTerrestrial Apes,

Patience is a virtue ;)

Generally I'm not very keen on advising people to spend their money on something which is a hobby for us, after all. BOINC is meant to use spare computation time, not to make people invest a fortune into new hardware ;)
However, if someone wants to buy I'll gladly try to tell them what makes sense from my point of view. Which occasionally is "don't buy".

MrS

100% agree, man :-) The only thing is, I built my rigs for BOINC :-) Otherwise I wouldn't need such powerful PCs :-)

MarkJ
agree about the GTX470. It's just a little bit less powerful, but pricewise it's OK.

But personally (if Fermi is really good) I will wait for the GTX495 to appear sometime in Q2 or early Q3 this year
____________

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 16040 - Posted: 29 Mar 2010 | 8:57:54 UTC - in response to Message 15971.

Name p14-IBUCH_chall_pYEEI_100301-26-40-RND0782_0
Workunit 1302269
Created 29 Mar 2010 1:18:10 UTC
Sent 29 Mar 2010 1:18:53 UTC
Received 29 Mar 2010 7:17:30 UTC
Server state Over
Outcome Success
Client state Done
Exit status 0 (0x0)
Computer ID 67855
Report deadline 3 Apr 2010 1:18:53 UTC
Run time 19101.678711
CPU time 4336.531
stderr out

<core_client_version>6.10.18</core_client_version>
<![CDATA[
<stderr_txt>
# Using CUDA device 0
# There is 1 device supporting CUDA
# Device 0: "GeForce GTX 480"
# Clock rate: 0.81 GHz
# Total amount of global memory: 1576468480 bytes
# Number of multiprocessors: 15
# Number of cores: 120
MDIO ERROR: cannot open file "restart.coor"
# Using CUDA device 0
# There is 1 device supporting CUDA
# Device 0: "GeForce GTX 480"
# Clock rate: 0.81 GHz
# Total amount of global memory: 1576468480 bytes
# Number of multiprocessors: 15
# Number of cores: 120
MDIO ERROR: cannot open file "restart.coor"
# Using CUDA device 0
# There is 1 device supporting CUDA
# Device 0: "GeForce GTX 480"
# Clock rate: 0.81 GHz
# Total amount of global memory: 1576468480 bytes
# Number of multiprocessors: 15
# Number of cores: 120
# Time per step: 30.239 ms
# Approximate elapsed time for entire WU: 18899.089 s
called boinc_finish

</stderr_txt>
]]>

Validate state Valid
Claimed credit 3977.21064814815
Granted credit 5965.81597222223
application version Full-atom molecular dynamics v6.71 (cuda23)

MarkJ
Volunteer moderator
Volunteer tester
Send message
Joined: 24 Dec 08
Posts: 738
Credit: 200,909,904
RAC: 0
Level
Leu
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 16042 - Posted: 29 Mar 2010 | 10:32:23 UTC - in response to Message 16040.

Name p14-IBUCH_chall_pYEEI_100301-26-40-RND0782_0
[...]
# Device 0: "GeForce GTX 480"
# Clock rate: 0.81 GHz
# Number of multiprocessors: 15
# Number of cores: 120
[...]
application version Full-atom molecular dynamics v6.71 (cuda23)


Where's the 480 cuda-cores?
____________
BOINC blog

TomaszPawel
Send message
Joined: 18 Aug 08
Posts: 121
Credit: 59,836,411
RAC: 0
Level
Thr
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 16043 - Posted: 29 Mar 2010 | 10:35:44 UTC - in response to Message 16042.

OMG LOL
____________
POLISH NATIONAL TEAM - Join! Crunch! Win!

Richard Haselgrove
Send message
Joined: 11 Jul 09
Posts: 1576
Credit: 5,806,011,851
RAC: 9,768,357
Level
Tyr
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 16045 - Posted: 29 Mar 2010 | 10:48:11 UTC - in response to Message 16043.

OMG LOL

And it's the only task they've got to run at all: All tasks for host 67855.

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 16046 - Posted: 29 Mar 2010 | 10:57:43 UTC - in response to Message 16045.
Last modified: 29 Mar 2010 | 11:30:12 UTC

It cannot run ACEMD V2 tasks because of the way that app was compiled – for G200 cards, not Fermi cards!

Either the app cant see all the cores or it is just not reporting them.

MarkJ
Volunteer moderator
Volunteer tester
Send message
Joined: 24 Dec 08
Posts: 738
Credit: 200,909,904
RAC: 0
Level
Leu
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 16048 - Posted: 29 Mar 2010 | 11:42:10 UTC - in response to Message 16046.

It cannot run ACEMD V2 tasks because of the way that app was compiled – for G200 cards, not Fermi cards!

Either the app cant see all the cores or it is just not reporting them.


You have the 197.33 driver - and which version of the DLLs did you use? Perhaps the developers need to compile a CUDA 3.0 app for you to see all 480 CUDA cores. But still, if it only used a quarter of them and managed 30 ms/step, that's good.
____________
BOINC blog

Profile MJH
Project administrator
Project developer
Project scientist
Send message
Joined: 12 Nov 07
Posts: 696
Credit: 27,266,655
RAC: 0
Level
Val
Scientific publications
watwat
Message 16049 - Posted: 29 Mar 2010 | 11:42:28 UTC - in response to Message 16042.

Where's the 480 cuda-cores?


It's just a reporting artifact. The device query code assumes 8 cores/SM (15 × 8 = 120). Fermis actually have 32 (15 × 32 = 480).
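
As an illustration of where that factor lives (a minimal sketch, not the actual GPUGRID or BOINC code): cudaGetDeviceProperties() only returns the multiprocessor count and the compute capability, so the cores-per-SM figure is a lookup table each application has to maintain itself, and pre-Fermi tables simply stop at 8.

#include <cstdio>
#include <cuda_runtime.h>

// Cores per multiprocessor by compute capability; older tools hard-coded 8.
static int coresPerSM(int major, int minor)
{
    if (major == 2 && minor == 0) return 32;  // Fermi GF100
    if (major == 1) return 8;                 // G80 / G200
    return 8;                                 // fallback used by pre-Fermi code
}

int main()
{
    cudaDeviceProp p;
    cudaGetDeviceProperties(&p, 0);
    printf("%s: %d multiprocessors, compute capability %d.%d\n",
           p.name, p.multiProcessorCount, p.major, p.minor);
    printf("old 8-per-SM assumption: %d cores\n", p.multiProcessorCount * 8);
    printf("with the Fermi factor  : %d cores\n",
           p.multiProcessorCount * coresPerSM(p.major, p.minor));
    return 0;
}

On a GTX 480 the device reports 15 multiprocessors, so the old assumption prints 120 and the corrected factor prints 480.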

MJH

Profile MJH
Project administrator
Project developer
Project scientist
Send message
Joined: 12 Nov 07
Posts: 696
Credit: 27,266,655
RAC: 0
Level
Val
Scientific publications
watwat
Message 16050 - Posted: 29 Mar 2010 | 11:44:17 UTC - in response to Message 16046.

It cannot run ACEMD V2 tasks because of the way that app was compiled – for G200 cards, not Fermi cards!


That's right. We need to rebuild the ACEMD V2 and beta apps to include kernel code compiled for Fermi. The older app will work because it's built in a different way.

MJH

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 16052 - Posted: 29 Mar 2010 | 12:17:25 UTC - in response to Message 16050.
Last modified: 29 Mar 2010 | 12:53:53 UTC

After comparing the IBUCH task that ran on the GTX480 to a similar task that ran on a GTX275 (240 shaders), it really looks like the GTX480 was only running on 120 shaders. This would mean the GTX480 is 9% faster when only using 120 shaders! (anyone?)

So, if it could use all 480 shaders it could be 4.36 times as fast as a GTX275. This makes more sense than just a reporting error, given the card's architecture, and would tie in well with other CUDA findings, such as folding@home (non-BOINC).

http://www.gpugrid.net/workunit.php?wuid=1298681

p14-IBUCH_chall_pYEEI_100301-26-40-RND0782 completed in 19,101sec (5h18min) claimed credit was 3,977, granted credit was 5,965.

p20-IBUCH_5Ans_pYEEI_100216-35-40-RND4473 completed on a GTX275 in 20868sec (5h48min) claimed 3,977 credit and was granted 5,965 credit.

Boinc (6.10.18) is not reading the card correctly, so some problems are stemming from that.

By the way it is CC2.0
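
Putting the two task times quoted above into numbers (the "only 120 of 480 shaders active" part is still an assumption at this point, not something the logs prove):

#include <cstdio>

int main()
{
    double gtx275 = 20868.0;  // seconds, p20-IBUCH task on a GTX275 (240 shaders)
    double gtx480 = 19101.0;  // seconds, p14-IBUCH task on the GTX480
    double ratio  = gtx275 / gtx480;                           // ~1.09, i.e. ~9% faster
    printf("speed ratio as measured      : %.2fx\n", ratio);
    printf("if all 480 shaders were used : %.2fx\n", ratio * 4.0);  // ~4.4x a GTX275
    return 0;
}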

CTAPbIi
Send message
Joined: 29 Aug 09
Posts: 175
Credit: 259,509,919
RAC: 0
Level
Asn
Scientific publications
watwatwatwatwatwatwatwatwatwat
Message 16060 - Posted: 29 Mar 2010 | 16:52:19 UTC

skgiven,
not a bad result with the GTX480. I'm keen to see results for the recompiled apps. If it is that fast on 480 cores, I'm ready to change my mind and take a GTX470 :-)
____________

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 16064 - Posted: 29 Mar 2010 | 20:32:14 UTC - in response to Message 16060.

I think the techs are building an app for Fermi.
Fermi won't hit the shops until 12th Apr apparently, so that gives them time to work on it. The drivers need work too, but it was known in advance that there would be compatibility issues with the applications, and the drivers will no doubt have many revisions by NVidia over the next couple of years. As it's the drivers that are reporting the card details, a driver update might allow the card to use the correct amount of shaders on an ACEMD task. Obviously NVidia would want the drivers to work before general release, so it would be reasonable to expect the main driver related problems to be gone within a couple of weeks and the techs to have working apps fairly soon.



Source

MarkJ
Volunteer moderator
Volunteer tester
Send message
Joined: 24 Dec 08
Posts: 738
Credit: 200,909,904
RAC: 0
Level
Leu
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 16065 - Posted: 29 Mar 2010 | 20:47:40 UTC - in response to Message 16064.
Last modified: 29 Mar 2010 | 20:53:50 UTC

I think the techs are building an app for Fermi.
Fermi won't hit the shops until 12th Apr apparently, so that gives them time to work on it. The drivers need work too, but it was known in advance that there would be compatibility issues with the applications, and the drivers will no doubt have many revisions by NVidia over the next couple of years. As it's the drivers that are reporting the card details, a driver update might allow the card to use the correct amount of shaders on an ACEMD task. Obviously NVidia would want the drivers to work before general release, so it would be reasonable to expect the main driver related problems to be gone within a couple of weeks and the techs to have working apps fairly soon.



Source


Here's hoping they will be fixed soon. If you search on the nvidia web site it doesn't list any drivers for the GTX470 or 480. Their driver page only goes up to the Geforce 300 family.

Have you tried a later BOINC version? 6.10.43 isn't too bad. It would be interesting to see what it reports for the card. I suspect that it too may need to be tweaked to correctly report things.

I was hassling my computer shop for pricing on the GTX470. They came back with $520 AUD. By my reckoning it should have been $380 AUD. I took the recommended price in USD and converted to AUD, so they are all trying to make big bucks at the moment. Too pricey for me.
____________
BOINC blog

CTAPbIi
Send message
Joined: 29 Aug 09
Posts: 175
Credit: 259,509,919
RAC: 0
Level
Asn
Scientific publications
watwatwatwatwatwatwatwatwatwat
Message 16066 - Posted: 29 Mar 2010 | 22:05:40 UTC - in response to Message 16065.

I was hassling my computer shop for pricing on the GTX470. They came back with $520 AUD. By my reckoning it should have been $380 AUD. I took the recommended price in USD and converted to AUD, so they are all trying to make big bucks at the moment. Too pricey for me.

AFAIK computer components are pretty pricey in Australia...
____________

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 16073 - Posted: 29 Mar 2010 | 22:37:15 UTC - in response to Message 16049.

Where's the 480 cuda-cores?


It's just a reporting artifact. The device query code assumes 8 cores/sm (15.8=120). Fermis actually have 32 (15.32=480).


Regarding the shader detection issue I'd trust that MJH knows what he's talking about. So it shouldn't affect crunching speed and it's the GPU-Grid client which needs an update to report the correct number. It's neither the nVidia driver nor BOINC.

Furthermore it's clear that the actual crunching kernel needs to be updated. So I'm not sure if we can take any more from the current discussion than "don't rush to buy a Fermi for GPU-Grid". Which we couldn't anyway, even if we wanted to ;)

MrS
____________
Scanning for our furry friends since Jan 2002

MarkJ
Volunteer moderator
Volunteer tester
Send message
Joined: 24 Dec 08
Posts: 738
Credit: 200,909,904
RAC: 0
Level
Leu
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 16109 - Posted: 31 Mar 2010 | 21:25:09 UTC - in response to Message 16065.
Last modified: 31 Mar 2010 | 21:27:16 UTC

Have you tried a later BOINC version? 6.10.43 isn't too bad. It would be interesting to see what it reports for the card. I suspect that it too may need to be tweaked to correctly report things.


Apparently BOINC also assumes 8 shaders/processor. Compute capability 1.x cards have 8 shaders/processor and 2.x (Fermi reports compute capability of 2.0) have 32 shaders/processor. They are changing it now.
____________
BOINC blog

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 16117 - Posted: 1 Apr 2010 | 20:12:38 UTC - in response to Message 16109.
Last modified: 1 Apr 2010 | 21:12:04 UTC

UK Fermi online prices:

The GTX 480 will be from about £439 including VAT (£375 to £381 ex VAT).
The GTX 470 will be from about £299 including VAT (£260 to £282 ex VAT).

Gigabyte, Asus, Zotac and Gainward versions will be available from the 12th Apr.

So far only default clocks.

Profile GDF
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Send message
Joined: 14 Mar 07
Posts: 1957
Credit: 629,356
RAC: 0
Level
Gly
Scientific publications
watwatwatwatwat
Message 16118 - Posted: 1 Apr 2010 | 21:43:38 UTC - in response to Message 16117.
Last modified: 1 Apr 2010 | 21:52:54 UTC

Just a digression. We had tried some Gainward GTX275 cards in the past and they did not work at all due to very poor cooling.

gdf

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 16119 - Posted: 1 Apr 2010 | 21:46:51 UTC - in response to Message 16117.
Last modified: 1 Apr 2010 | 21:56:27 UTC

Also MSI and EVGA...

Can't see any special card designs, so I would suggest you buy based on price and warranty.

GTX470 (manufacturer, warranty, price):

Gainward 24 months £299
VGA 24 months £316
Asus 36 months £319
MSI 24 months £328
Gigabyte 24 months £333
EVGA 9999 months?!? £334

I am not sure about the EVGA 9999months (lifetime warranty perhaps); will they exist in 10 or 20 years, do you need a warranty that long, what does their small print say?

The Gainward price looks good but their design reputation is poor (personally I have bad experiences here too).
The Asus warranty looks reasonably attractive for the price.
Obviously the EVGA warranty, if valid, is better; but how many of us would want such a card for more than 3years, and is it worth it?
I see no reason to consider the others, other than the reputation of Gigabyte (but not for £1 less than EVGA)!

The GTX 480 cards are similarly warrantied.

CTAPbIi
Send message
Joined: 29 Aug 09
Posts: 175
Credit: 259,509,919
RAC: 0
Level
Asn
Scientific publications
watwatwatwatwatwatwatwatwatwat
Message 16120 - Posted: 1 Apr 2010 | 22:22:11 UTC

Gainward usually sucks (I've personally experienced that), but as long as all of them use the reference design it makes no difference at all.
____________

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 16122 - Posted: 1 Apr 2010 | 22:29:19 UTC - in response to Message 16120.

Perhaps Gainward went out of their way to find the worst capacitors around, again. I Won't Go Near Another Gain-Ward!

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 16127 - Posted: 2 Apr 2010 | 8:02:26 UTC - in response to Message 16122.

All the first run of cards are actually built by NVIDIA (via sub-contract), so all the cards are the same.
So it’s down to warranty really, and EVGA are likely to be doing a 10-year or lifetime warranty. As the temperatures are high (90°C+) it has been suggested that it would be better to get one with a lifetime warranty!

Profile Beyond
Avatar
Send message
Joined: 23 Nov 08
Posts: 1112
Credit: 6,162,416,256
RAC: 0
Level
Tyr
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 16130 - Posted: 2 Apr 2010 | 12:27:15 UTC - in response to Message 16127.

EVGA are likely to be doing a 10year or lifetime warranty. As the temperatures are high (90deg+) it has been suggested that it would be better to get one with a lifetime warranty!

XFX will probably have their usual double lifetime warranty. A bit of silliness with the lifetime warranties in a way since the usefulness of the card will most likely be nil within 5 years or so. What's nice about the XFX warranty though is that it transfers to the 2nd owner.

CTAPbIi
Send message
Joined: 29 Aug 09
Posts: 175
Credit: 259,509,919
RAC: 0
Level
Asn
Scientific publications
watwatwatwatwatwatwatwatwatwat
Message 16203 - Posted: 8 Apr 2010 | 16:33:47 UTC
Last modified: 8 Apr 2010 | 16:36:42 UTC

IMPORTANT! Nvidia's 400 series crippled by Nvidia

In an effort to increase the sales of their Tesla C2050 and Tesla C2070 cards, Nvidia has intentionally crippled the FP64 compute ability by a whopping 75%. If left alone, the GTX 480 would have performed FP64 at 672.708 GFLOPS, almost 119 GFLOPS more than ATI's 5870. Instead, the GTX 480 comes in at 168.177 GFLOPS. Here is a link to Nvidia's own forum where we have been discussing this. You will see there is also a CUDA-Z performance screenshot confirming it, on top of the confirmation by Nvidia's own staff. Nvidia is not publicly making anyone aware that this is the case. Anandtech also snuck the update onto page 6 of a 20-page review of the GTX 480/470.


proof:
http://milkyway.cs.rpi.edu/milkyway/forum_thread.php?id=1662#38378
http://forums.nvidia.com/index.php?showtopic=164417&st=0
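
As a back-of-the-envelope check on those figures (assuming 480 shaders at a 1401 MHz shader clock, an FMA counted as 2 FLOPs, native FP64 at half the FP32 rate, and the GeForce cap at one quarter of that native FP64 rate):

#include <cstdio>

int main()
{
    double sp     = 480 * 1.401 * 2;  // ~1345 GFLOPS single precision
    double dp     = sp / 2.0;         // ~672 GFLOPS uncapped double precision
    double capped = dp / 4.0;         // ~168 GFLOPS as shipped on the GeForce
    printf("SP %.1f  DP(native) %.1f  DP(capped) %.1f GFLOPS\n", sp, dp, capped);
    return 0;
}

The capped figure is one eighth of the single precision rate, which matches the "crippled by 75%" wording above.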

the worst "dreams" came true...

Sure, it's not my business, but IMO the project should speed up OpenCL development in order to be able to use ATI cards.
____________

Profile liveonc
Avatar
Send message
Joined: 1 Jan 10
Posts: 292
Credit: 41,567,650
RAC: 0
Level
Val
Scientific publications
watwatwatwatwatwat
Message 16204 - Posted: 8 Apr 2010 | 17:52:10 UTC - in response to Message 16203.
Last modified: 8 Apr 2010 | 18:16:49 UTC

This is the crummy side of Capitalism. It's no longer survival of the fittest, it's life support. I'm not a Communist, but when Companies purposefully cripple their own baby, with the intent of making their favorite child look better, it gives me a bitter taste in the mouth & sends chills down my spine.

So the mainstream GPU isn't the favorite child, but instead of just giving it all the love & support to make it grow strong & lead the way, they have to shoot it in the leg so that the favorite child can win the race.

Even if they had to sell 10 GTX480s for every Tesla, they'd get their money back & generate lots of sales, inspire confidence in Nvidia. But no, they had to sell their Teslas, even if it runs the risk of taking the whole Fermi family down (& possibly also Nvidia).

I've seen a Country crumble under the weight of 3 generations of Nepotism & Collusion, which bred Corruption. It was family first, those who knew the first family, & what you had to do to get to know the first family. Inbreeding led to the fall of Rome, stupid wars in Europe, & The Mad King is dead, long live the Mad King! They'd spend 1 Billion on a project that only cost 500 Million & didn't even care if they ever got their money back. They didn't need the highway you see, they WANTED the highway! They say jump, you say how high & you NEVER asked why you had to do all that jumping...
____________

CTAPbIi
Send message
Joined: 29 Aug 09
Posts: 175
Credit: 259,509,919
RAC: 0
Level
Asn
Scientific publications
watwatwatwatwatwatwatwatwatwat
Message 16205 - Posted: 8 Apr 2010 | 18:08:11 UTC

liveonc,
+1, exactly what I was thinking about...

right now I see 2 options for myself:
- try to get a GTX295 built on one PCB for cheap
- wait for GPUGRID being able to use ATI cards for crunching and meanwhile continue to use GTX275


____________

Profile GDF
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Send message
Joined: 14 Mar 07
Posts: 1957
Credit: 629,356
RAC: 0
Level
Gly
Scientific publications
watwatwatwatwat
Message 16206 - Posted: 8 Apr 2010 | 19:15:47 UTC - in response to Message 16205.
Last modified: 8 Apr 2010 | 19:16:48 UTC

Double precision as well as ECC is not needed for gaming. That's behind their choice. They don't expect this to have any impact on the real market of a GTX480.

As far as computing is concerned, double precision is at best half the speed of single precision anyway, but the code must be written to use it.

GPUGRID uses single precision because it is coded for it and it is faster. [SETI is the same.]

Long term they might relax the double precision/single precision restriction, maybe with just a driver update. I don't think that the restriction is in the hardware.

gdf

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 16208 - Posted: 8 Apr 2010 | 20:31:22 UTC
Last modified: 8 Apr 2010 | 20:32:53 UTC

For BOINC this is a really bad move. There are projects which could really benefit from widespread, fast double precision capabilities. But for the existing projects it's not that bad:

GPU-Grid
We're using single precision, so crippling dp doesn't change Fermi performance at all.

Milkyway
It runs almost at peak efficiency on ATIs, so previously any nVidia GPU or CPU was much less energy efficient. These shouldn't run MW, since their crunching power can be put to much better uses (projects the ATIs / nVidias can't run). Fermi could have been the first nVidia GPU to actually be of use at MW, but not in this crippled form.

Collatz
It runs very well on ATIs without dp (all but the top chips) and on compute capability 1.0 and 1.1 cards, whereas 1.2 and 1.3 cards are not any faster per theoretical FLOPS. This project should be left to the cards which can't be used anywhere else efficiently [and as a backup for the MW server..] .. i.e. all the smaller ATIs and nVidias which are too slow for GPU-Grid (or others). CC uses integers, so crippled dp on Fermi doesn't change the fact that it shouldn't be used here anyway.

Einstein
Their GPU app only mildly assists the CPU. Credit-wise it's absolutely not worth it. Science-wise I think we'd need proper load balancing to make it feasible on GPUs. That's a software problem, so I don't know how Fermi hardware affects this.

SETI
EDIT: just read GDFs post that SETI is using "only" sp as well.

BTW: ATI does not support dp on anything but their highest end chips. They're free to do this, but consider the following: Juniper (HD57x0) is almost exactly half a Cypress (HD58x0). Everything is cut in half, except some fixed function hardware like display driving circuitry which is present in both chips. Therefore one would expect the transistor counts to be in about the same ratio. And that's indeed the case: 2154 million versus 1040 million. 2 Junipers would need 2080 million transistors. That means it would have cost them about (2154-2080)/2 = 37 million transistors to put dp support into Juniper. That would have increased its transistor count, and thereby area and cost, by approximately 3.5%. And leaving the feature in would not have cost them any development effort: just reuse the same shaders for the entire product family. Removing the feature did require a redesign and thus caused cost.

It's a legitimate business decision: reducing cost on all of their mainstream chips does count, even if it's as small as ~4%. And enabling dp on Redwood and Cedar would probably have provided very little benefit anyway. I would have preferred it in Juniper, though.

MrS
____________
Scanning for our furry friends since Jan 2002

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 16209 - Posted: 8 Apr 2010 | 20:41:44 UTC - in response to Message 16206.

Double precision as well as ECC is not needed for gaming. That's behind their choice. They don't expect this to have any impact on the real market of a GTX480.


That's what they say.. but, seriously, it's an effort on their side to disable it. Disabling it costs them something, whereas just leaving it enabled would have required no action at all.

Their motivation is clear: pushing Tesla sales.

And I doubt it's a simple driver issue. Anyone buying non-ECC Geforces over Teslas for number crunching is running non-mission-critical apps and would probably just use a driver hack (which would undoubtedly follow soon after the cards are released).
I read nVidia went to some lengths to disable the installation of Quadro drivers (better CAD performance) on recent Geforce models the hard way. Not sure if a BIOS flash could help, though.

MrS
____________
Scanning for our furry friends since Jan 2002

CTAPbIi
Send message
Joined: 29 Aug 09
Posts: 175
Credit: 259,509,919
RAC: 0
Level
Asn
Scientific publications
watwatwatwatwatwatwatwatwatwat
Message 16210 - Posted: 8 Apr 2010 | 21:10:26 UTC

ouch... if GPUGRID uses sp, that's a different story :-) but it's a real pity that if one day GPUGRID wants to use dp I'll have to spend $$$
____________

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 16211 - Posted: 8 Apr 2010 | 21:30:43 UTC - in response to Message 16210.

In the absolute best case you're "only" 50% slower in dp than in sp. The only reason to use dp is if the precision provided by sp is not enough - a circumstance GPU-Grid can luckily avoid. Otherwise they wouldn't have been CUDA pioneers ;)

MrS
____________
Scanning for our furry friends since Jan 2002

Profile GDF
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Send message
Joined: 14 Mar 07
Posts: 1957
Credit: 629,356
RAC: 0
Level
Gly
Scientific publications
watwatwatwatwat
Message 16212 - Posted: 8 Apr 2010 | 21:34:09 UTC - in response to Message 16211.

We don't really avoid it. We do use double precision emulation via software in some specific parts of the code where it is needed. It is just implemented in terms of single precision.
Of course, if you do have to use it all the time, then it makes no sense.
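
For the curious, here is a minimal sketch of the kind of double-single emulation being described - a value is kept as an unevaluated sum of two floats and pairs are added using only single-precision hardware. This is the textbook Knuth/Dekker two-sum construction, not GPUGRID's actual kernel code:

struct dsfloat { float hi, lo; };  // value = hi + lo, with |lo| much smaller than |hi|

__device__ dsfloat ds_add(dsfloat a, dsfloat b)
{
    // Knuth two-sum of the high parts: s + e equals a.hi + b.hi exactly
    float s = a.hi + b.hi;
    float v = s - a.hi;
    float e = (a.hi - (s - v)) + (b.hi - v);

    // fold in the low-order parts, then renormalise back into (hi, lo)
    e += a.lo + b.lo;
    dsfloat r;
    r.hi = s + e;
    r.lo = e - (r.hi - s);
    return r;
}

In real code the additions would typically be written with __fadd_rn() so that fast-math style optimisations cannot simplify away the rounding-error terms.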

gdf

Profile liveonc
Avatar
Send message
Joined: 1 Jan 10
Posts: 292
Credit: 41,567,650
RAC: 0
Level
Val
Scientific publications
watwatwatwatwatwat
Message 16214 - Posted: 8 Apr 2010 | 22:14:16 UTC - in response to Message 16209.

Double precision as well as ECC is not needed for gaming. That's behind their choice. They don't expect this to have any impact on the real market of a GTX480.


That's what they say.. but, seriously, it's an effort on their side to disable it. Disabling it costs them something, whereas just leaving it enabled would have required no action at all.

Their motivation is clear: pushing Tesla sales.

And I doubt it's a simple driver issue. Anyone buying non-ECC Geforces over Teslas for number crunching is running non mission critical app and would probably just use a driver hack (which would undoubtfully follow soon after the cards are released).
I read nVidia went to some length to disable the installation of Quadro drivers (better CAD performance) on recent Geforce models the hard way. Not sure if a bios flash could help, though.

MrS


It's not the first time & it's getting really annoying. Everything from rebranding to disabling/crippling with the intent to push Teslas. As if it wasn't hard enough for them to justify asking people to pay $500 for a GTX480 that eats lots of electricity, produces tonnes of heat & only slightly outperforms a much cheaper Ati. They're shooting themselves in the foot every time they try to pull these tricks. I'm really considering Ati, but I'd hate to see Nvidia go away. One person is no loss, but I'm not the only one thinking these thoughts.
____________

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 16215 - Posted: 8 Apr 2010 | 22:39:00 UTC - in response to Message 16214.

Again, double precision is not required to run GPUGrid tasks; accuracy & statistical reliability of the models is achieved in different ways here.

That said, I broadly agree with what you are saying.
If NVidia had the sense to create a single precision core design it would have been less expensive to design, manufacture, sell and might have been in the shops last Nadal. This would also have separated gaming cards from professional cards. This still needs to be done.

All your eggs in one basket strategies always have their downfalls.

Profile Beyond
Avatar
Send message
Joined: 23 Nov 08
Posts: 1112
Credit: 6,162,416,256
RAC: 0
Level
Tyr
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 16217 - Posted: 8 Apr 2010 | 23:00:30 UTC

Yet it seems that AMD/ATI has no problem at all implementing fast double precision in their consumer grade cards.

fractal
Send message
Joined: 16 Aug 08
Posts: 87
Credit: 1,248,879,715
RAC: 0
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 16218 - Posted: 9 Apr 2010 | 4:01:02 UTC

From http://forums.nvidia.com/index.php?showtopic=165055

- Tesla products are built for reliable long running computing applications and
undergo intense stress testing and burn-in. In fact, we create a margin in
memory and core clocks (by using lower clocks) to increase reliability and long life.

ok, did you catch that.. Increase reliability by using lower clocks!

Now, if you were making a consumer card where all the market information including statements from your competition say that the consumer market does not need double precision floating point, what would you choose to increase reliability ... reduce clocks all around, or just reduce clocks on double precision floating point, a feature not needed by the market? I know which I would select. And we know what ATI selected when they removed double precision floating point from all but the top end models of their mainstream line of GPUs.

Maybe they will work out the technical hitches in the future so they can crank up the clock speed on double precision floating point. But for now, all these conspiracy theories are making me remember the grassy knoll..

Profile liveonc
Avatar
Send message
Joined: 1 Jan 10
Posts: 292
Credit: 41,567,650
RAC: 0
Level
Val
Scientific publications
watwatwatwatwatwat
Message 16219 - Posted: 9 Apr 2010 | 5:02:13 UTC - in response to Message 16218.
Last modified: 9 Apr 2010 | 5:39:37 UTC

It's not about WHY Ati removed double precision, but HOW Nvidia crippled the GTX470/480. If the logic behind Teslas being so great because of "memory and core clocks (by using lower clocks) to increase reliability and long life" were true, why didn't they choose to cripple the Teslas instead?

GPUs have warranties, some even longer than people care to keep the cards. Personally, I'd use a GPU for 2-3 years and I even OC them; sure, they are prone to errors, but GPUGRID.net is just a hobby for me.
____________

Profile GDF
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Send message
Joined: 14 Mar 07
Posts: 1957
Credit: 629,356
RAC: 0
Level
Gly
Scientific publications
watwatwatwatwat
Message 16221 - Posted: 9 Apr 2010 | 7:45:53 UTC - in response to Message 16219.

A question about ATI removing double precision in the lower cards.

As far as I know, Fermi and the 5000 series ATI cards use the same compute cores to compute single and double precision, just processing the data differently. There is no specific hardware for double precision (maybe a little extra control logic). I don't think there is much to gain by removing double precision in terms of saving transistors. So maybe ATI is doing marketing as well when they remove double precision.

gdf

Profile Beyond
Avatar
Send message
Joined: 23 Nov 08
Posts: 1112
Credit: 6,162,416,256
RAC: 0
Level
Tyr
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 16226 - Posted: 9 Apr 2010 | 17:31:33 UTC
Last modified: 9 Apr 2010 | 17:32:22 UTC

ATI said that they made the decision not to include double precision in their lower end 57xx series in order to save die space and because they didn't feel there was a demand for it in those cards. The only real change was that the 4770 did include double precision, but that was in many ways a unique product in that it was made to test the viability and design of their new 40nm process. All the 48xx, 58xx and 59xx series consumer cards support fully enabled double precision and do it very efficiently compared to anything else on the market.

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 16230 - Posted: 9 Apr 2010 | 22:03:41 UTC

GDF, you're right: ATI does dp using the same shaders as sp. However, to couple them for dp some clever control logic is needed. That's what they're saving by not including it in the mainstream chips. It's an economic decision as well as a marketing decision to drive crunchers to the faster cards. Which are more energy efficient at this job anyway.

Fractal, there's no need for a "conspiracy theory" here. To begin with, if anything is labelled like that it's considered wrong by definition.. which is not the case here ;)
But on a more serious note: it's just a business decision made by nVidia, without technical reasons or advantages. Their official marketing stance on it is pure BS and the real reason is pushing Tesla sales. There's no conspiracy here, they're free to cripple their product as much as they want. It's just that we crunchers think it's a bad move, short and long term.

Regarding the issue of increased reliability due to slower clock speeds: sp and dp are done by the same shaders. And the GTX 4x0 chips don't have dp disabled, it's just that fewer of their shaders are allowed to run in dp mode at any time. So whichever clock speed penalty might exist in dp mode, the Geforce models also receive it.
But there's more. As I explained here, chips do degrade over time. They seldom break suddenly, but they are always affected by continuous decay, i.e. the maximum clock speed they can reach drops. With this knowledge it's easy to see how nVidia increases reliability by using lower clock speeds: if you take similar chips and run them under similar conditions (same voltage & temperature), they'll decay in the same way. Assume your chips can do 1.4 GHz and lose 0.1 GHz per year on average (*); then you could run them at 1.2 GHz for 2 years until you start to receive major RMAs because the cards start to fail. If you run the chips at 1.1 GHz you'll get another year, so you can either extend your warranty or reduce the return rate considerably.

MrS

(*) This number would be different for every single chip and the decay rate wouldn't be constant over time, so this is just a Gedankenexperiment.
____________
Scanning for our furry friends since Jan 2002

Profile Beyond
Avatar
Send message
Joined: 23 Nov 08
Posts: 1112
Credit: 6,162,416,256
RAC: 0
Level
Tyr
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 16283 - Posted: 13 Apr 2010 | 15:15:55 UTC
Last modified: 13 Apr 2010 | 15:32:41 UTC

Here's a GTX 470 at Collatz (integer math):

http://boinc.thesonntags.com/collatz/show_host_detail.php?hostid=20977

Looks like it's about 55% of the speed of the HD 5870 (around the speed of a $150 HD 5770). Someone else with a GTX 470 confirmed the same results and also tried a GTX 480, OCed it's about 90% of the speed of a stock HD 5850. Here's the thread:

http://boinc.thesonntags.com/collatz/forum_thread.php?id=378&nowrap=true#7162

Edit: Here's a GTX 480 on GPUGRID, so far it's running a little slower (and less reliably) than my GTX 260:

http://www.gpugrid.net/show_host_detail.php?hostid=35174

Edit 2: Two guys are trying to get their GTX 470 cards to run at MilkyWay (double precision), so far with no luck. Will post an update when they get them working.

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 16284 - Posted: 13 Apr 2010 | 16:21:40 UTC - in response to Message 16283.
Last modified: 13 Apr 2010 | 16:25:40 UTC

Horses for courses!
A speedboat is fast on the water and an F1 car is fast on a track. They don’t swap well, and Fermi is just out of the garage for its first spin.
It will work well here when the team releases more ACEMD (V1) tasks, or writes, tests and releases an app for Fermi cards. We already know ACEMD V2 tasks will not work and why (the app is specifically/rigidly compiled for the earlier GPU core architecture), and we already know ACEMD (V1) tasks will work, as they are not built that way (albeit at the expense of some speed improvements introduced with V2). So there are the short term and long term solutions in a nutshell.

Profile GDF
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Send message
Joined: 14 Mar 07
Posts: 1957
Credit: 629,356
RAC: 0
Level
Gly
Scientific publications
watwatwatwatwat
Message 16286 - Posted: 13 Apr 2010 | 16:59:58 UTC - in response to Message 16284.

I have now added the cuda3 beta application again for Fermi based cards only.

gdf

ftpd
Send message
Joined: 6 Jun 08
Posts: 152
Credit: 328,250,382
RAC: 0
Level
Asp
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 16287 - Posted: 13 Apr 2010 | 17:47:33 UTC - in response to Message 16286.

I have tried for 30 minutes, making a connection to gpugrid every 8 seconds to download a beta3 WU.
The reply is: no work available!!!

Can I do anything about it?

Thanks for helping out!

Will I have the same problems with my new GTX 470 card, which I want to install tomorrow? Should I wait to install it??

Ton (ftpd)
____________
Ton (ftpd) Netherlands

Profile GDF
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Send message
Joined: 14 Mar 07
Posts: 1957
Credit: 629,356
RAC: 0
Level
Gly
Scientific publications
watwatwatwatwat
Message 16288 - Posted: 13 Apr 2010 | 18:06:42 UTC - in response to Message 16287.

The only application that has any chance of working on Fermi is the beta application. You should select, in the preferences, to crunch only that at the moment.

Until we get a Fermi here we cannot really debug it.

Do you have the latest driver installed?

I have submitted 50 beta WUs but they're all out now, so just wait to get one.

gdf

ftpd
Send message
Joined: 6 Jun 08
Posts: 152
Credit: 328,250,382
RAC: 0
Level
Asp
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 16290 - Posted: 13 Apr 2010 | 19:38:35 UTC - in response to Message 16288.

I installed driver 197.41

I will try again tomorrow morning!
____________
Ton (ftpd) Netherlands

ftpd
Send message
Joined: 6 Jun 08
Posts: 152
Credit: 328,250,382
RAC: 0
Level
Asp
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 16291 - Posted: 13 Apr 2010 | 19:51:51 UTC

just downloaded 1 beta-WU (cuda30) - # 1359859.
Cancelled after 15 seconds.
____________
Ton (ftpd) Netherlands

CTAPbIi
Send message
Joined: 29 Aug 09
Posts: 175
Credit: 259,509,919
RAC: 0
Level
Asn
Scientific publications
watwatwatwatwatwatwatwatwatwat
Message 16325 - Posted: 16 Apr 2010 | 13:25:19 UTC

I wonder if someone who has got a Fermi can post its compute capability (in TFLOPS) here.
____________

Richard Haselgrove
Send message
Joined: 11 Jul 09
Posts: 1576
Credit: 5,806,011,851
RAC: 9,768,357
Level
Tyr
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 16326 - Posted: 16 Apr 2010 | 13:33:45 UTC - in response to Message 16325.

I wonder if some1 who got fermi, can put here compute capability (in TFLOPS).

They've already been posted in the next-door thread:

NVIDIA GPU 0: GeForce GTX 470 (driver version 19741, CUDA version 3000, compute capability 2.0, 1280MB, 1089 GFLOPS peak)
NVIDIA GPU 0: GeForce GTX 480 (driver version 19741, CUDA version 3000, compute capability 2.0, 1536MB, 1345 GFLOPS peak)
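
For what it's worth, both peak figures are consistent with the simple product shaders × shader clock × 2 (one fused multiply-add per shader per clock); this is a reading of how the estimate is formed, not a quote of the BOINC source:

#include <cstdio>

int main()
{
    // shaders x shader clock (GHz) x 2 FLOPs per clock
    printf("GTX 470: %.0f GFLOPS peak\n", 448 * 1.215 * 2);  // ~1089
    printf("GTX 480: %.0f GFLOPS peak\n", 480 * 1.401 * 2);  // ~1345
    return 0;
}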

Profile GDF
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Send message
Joined: 14 Mar 07
Posts: 1957
Credit: 629,356
RAC: 0
Level
Gly
Scientific publications
watwatwatwatwat
Message 16327 - Posted: 16 Apr 2010 | 14:05:53 UTC - in response to Message 16326.
Last modified: 16 Apr 2010 | 14:06:26 UTC

I wonder if some1 who got fermi, can put here compute capability (in TFLOPS).

They've already been posted in the next-door thread:

NVIDIA GPU 0: GeForce GTX 470 (driver version 19741, CUDA version 3000, compute capability 2.0, 1280MB, 1089 GFLOPS peak)
NVIDIA GPU 0: GeForce GTX 480 (driver version 19741, CUDA version 3000, compute capability 2.0, 1536MB, 1345 GFLOPS peak)



The FLOPS values reported by BOINC mean close to nothing. They are indicative only within the same GPU generation (G200, Fermi). Across GPU core versions, and compared to ATI cards, they don't mean anything.

gdf

Profile GDF
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Send message
Joined: 14 Mar 07
Posts: 1957
Credit: 629,356
RAC: 0
Level
Gly
Scientific publications
watwatwatwatwat
Message 16328 - Posted: 16 Apr 2010 | 14:07:15 UTC - in response to Message 15904.

We have ordered one of these hosts without GPUs.

gdf



The above ships with 4 Teslas and costs as much as a new car.

You would be better off just buying a good PSU and getting a couple of Fermi cards (or a new system, or two, or three...).
Two GTX480 cards will probably do as much work as 4 or 5 GTX295s, i.e. more than 4 Teslas!

Snow Crash
Send message
Joined: 4 Apr 09
Posts: 450
Credit: 539,316,349
RAC: 0
Level
Lys
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 16329 - Posted: 16 Apr 2010 | 14:44:55 UTC - in response to Message 16328.

Have you considered the cost effectiveness of using 470s instead of 480s?

I know having the best is going to return the fastest results, but at US price ratios you can get 8 x 470s for the price of 5 1/2 x 480s.
____________
Thanks - Steve

Profile GDF
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Send message
Joined: 14 Mar 07
Posts: 1957
Credit: 629,356
RAC: 0
Level
Gly
Scientific publications
watwatwatwatwat
Message 16332 - Posted: 16 Apr 2010 | 16:41:15 UTC - in response to Message 16329.
Last modified: 16 Apr 2010 | 19:04:07 UTC

Once you have paid for the host, there is no point in saving on the GPUs.
Plus, this is meant to go as fast as possible, and the cooling also seems better to me on the GTX480.

GDF

CTAPbIi
Send message
Joined: 29 Aug 09
Posts: 175
Credit: 259,509,919
RAC: 0
Level
Asn
Scientific publications
watwatwatwatwatwatwatwatwatwat
Message 16335 - Posted: 16 Apr 2010 | 17:47:25 UTC - in response to Message 16326.

I wonder if some1 who got fermi, can put here compute capability (in TFLOPS).

They've already been posted in the next-door thread:

NVIDIA GPU 0: GeForce GTX 470 (driver version 19741, CUDA version 3000, compute capability 2.0, 1280MB, 1089 GFLOPS peak)
NVIDIA GPU 0: GeForce GTX 480 (driver version 19741, CUDA version 3000, compute capability 2.0, 1536MB, 1345 GFLOPS peak)

THX :-)

____________

Profile Paul D. Buck
Send message
Joined: 9 Jun 08
Posts: 1050
Credit: 37,321,185
RAC: 0
Level
Val
Scientific publications
watwatwatwatwatwatwatwatwatwat
Message 16386 - Posted: 18 Apr 2010 | 1:02:38 UTC - in response to Message 16208.

For BOINC this is a really bad move. There are projects which could really benefit from wide spread fast double precision capabilities. But for the existing projects it's not that bad:

GPU-Grid...

Your list forgot to include DNETC@Home which supports both ATI and Nvidia.

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 16395 - Posted: 18 Apr 2010 | 11:04:13 UTC - in response to Message 16386.

That's probably because I don't find their science very compelling ;)
Do you know if they use integer or single or double fp?

MrS
____________
Scanning for our furry friends since Jan 2002

Profile Microcruncher*
Avatar
Send message
Joined: 12 Jun 09
Posts: 4
Credit: 185,737
RAC: 0
Level

Scientific publications
watwat
Message 16397 - Posted: 18 Apr 2010 | 11:34:44 UTC - in response to Message 16395.
Last modified: 18 Apr 2010 | 11:39:30 UTC

That's probably because I don't find their science very compelling ;)
Do you know if they use integer or single or double fp?

MrS


The CUDA variant uses integer instructions. There should be no need for floating point calculations in an RC5-72 known-plaintext attack.

To prevent a misconception: a known-plaintext attack doesn't mean that the message is known.

Profile Zydor
Send message
Joined: 8 Feb 09
Posts: 252
Credit: 1,309,451
RAC: 0
Level
Ala
Scientific publications
watwatwatwat
Message 16409 - Posted: 18 Apr 2010 | 16:26:32 UTC - in response to Message 16395.
Last modified: 18 Apr 2010 | 16:27:28 UTC

That's probably because I don't find their science very compelling ;)
Do you know if they use integer or single or double fp?



DNETC are Single Precision

Regards
Zy

Profile Paul D. Buck
Send message
Joined: 9 Jun 08
Posts: 1050
Credit: 37,321,185
RAC: 0
Level
Val
Scientific publications
watwatwatwatwatwatwatwatwatwat
Message 16426 - Posted: 19 Apr 2010 | 12:07:58 UTC - in response to Message 16395.

That's probably because I don't find their science very compelling ;)

Yes, well, you included SaH and they are not doing science at all ... they are doing exploration ...

The only point I was trying to raise is that whether we like a particular project or not, or find its purpose interesting or not, when we write summaries of the world of BOINC we should be as comprehensive as possible, because someone else may find the project's focus compelling even when we think it is a waste of time.

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 16438 - Posted: 19 Apr 2010 | 19:52:25 UTC - in response to Message 16426.

Sorry if I wasn't clear: I'm grateful you corrected the list (or at least made it more complete). I didn't leave it out intentionally, I really forgot it. And my last statement was supposed to be taken humorously. I admit, not exactly my best post.. :p

MrS
____________
Scanning for our furry friends since Jan 2002

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 16439 - Posted: 19 Apr 2010 | 19:59:50 UTC - in response to Message 16438.

This is a Fermi thread on GPUGRID. Let's keep it that way, please.

Profile Paul D. Buck
Send message
Joined: 9 Jun 08
Posts: 1050
Credit: 37,321,185
RAC: 0
Level
Val
Scientific publications
watwatwatwatwatwatwatwatwatwat
Message 16440 - Posted: 19 Apr 2010 | 20:36:39 UTC - in response to Message 16438.

To keep this thread about Fermi cards ... I don't have one ... :)

Sorry if I wasn't clear: I'm grateful you corrected the list (or at least made it more complete). I didn't leave it out intentionally, I really forgot it. And my last statement was supposed to be taken humorously. I admit, not exactly my best post.. :p

Sorry, I missed the humor ... :(

My fault entirely ...

And I still don't have a Fermi card ...

Snow Crash
Send message
Joined: 4 Apr 09
Posts: 450
Credit: 539,316,349
RAC: 0
Level
Lys
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 16517 - Posted: 24 Apr 2010 | 12:49:35 UTC
Last modified: 24 Apr 2010 | 13:01:53 UTC

While GPUGrid is clear on the fact that their app is not quite ready to run on a GTX480, I thought I would exercise my new card to get a ballpark feel for it. I will be running a couple of other projects, but I'm not going to do any *optimized app* installations as I am a simple kind of cruncher.

Test system is an i7-920 w/HT on @4.2 GHz running Win7 64, BOINC 6.10.43.
Asus P6T, EVGA GTX480, Corsair 2x3 1600 C7.

Milkyway does not work without an opti app.

SETI works well with shaders up to 1732 (1803 is a little shaky and 1823 grey-screens).
It is hilarious to watch WUs crank through in 15-17 SECONDS instead of hours; the uploads and downloads can hardly keep up :big-grin:
SETI has some WUs that are calculating signals from low angles, and they run slower on a GPU than on a CPU.
It's a known issue. There is an opti app that will switch to your CPU when these WUs are detected.

Collatz: stock is good, 1532 is good, 1632 is good, 1732 good.
Collatz @ 1732 shaders pushes GPU usage to about 85%, which is about the same as GPUGrid does on my GTX285 and GTX295 cards with the same OS/CPU/RAM configuration.

With an ambient temperature of ~ 23-24c I manually set the fan at 70% and running temperature is ... 73c. Not quite the furnace we had been told to expect. This is in an enclosed full tower case that has excellent air flow (front, side, top, and back case fans). Everything looks pretty good to me so far, temperature and fan noise are well within my personal tolerance levels.
I have not put an extra 120 mm fan blowing on the backside (like I do on the GTX295) because it just doesn't need it.

When GPUGrid is ready, which I have every confidence will be soon, I will be switching back here where I believe I belong.
____________
Thanks - Steve

Richard Haselgrove
Send message
Joined: 11 Jul 09
Posts: 1576
Credit: 5,806,011,851
RAC: 9,768,357
Level
Tyr
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 16518 - Posted: 24 Apr 2010 | 13:22:02 UTC - in response to Message 16517.

SETI works well with shaders up to 1732 (1803 is a little shaky and 1823 grey-screens).
It is hilarious to watch WUs crank through in 15-17 SECONDS instead of hours; the uploads and downloads can hardly keep up :big-grin:
SETI has some WUs that are calculating signals from low angles, and they run slower on a GPU than on a CPU.
It's a known issue. There is an opti app that will switch to your CPU when these WUs are detected.

Are you sure (please check) that those 15-17 second SETI tasks are producing validated results?

There is a common failure mode at SETI where older CUDA cards suffer - I believe - memory corruption following an error, and return all tasks with:

SETI@Home Informational message -9 result_overflow
NOTE: The number of results detected exceeds the storage space allocated.

This is a valid outcome for some tasks (roughly 5%), but if you get it on every task, it's an error, and damaging to the project because of all those downloads.

But if you can link me to the host, and demonstrate that the tasks are validating correctly and getting full credit, I've still got time to nip into town and pick up a Fermi this afternoon... ;-)

Snow Crash
Send message
Joined: 4 Apr 09
Posts: 450
Credit: 539,316,349
RAC: 0
Level
Lys
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 16519 - Posted: 24 Apr 2010 | 14:18:09 UTC - in response to Message 16518.

Just checked in on the results I sent in earlier this morning.
Only one has been processed and it is in fact the Invalid situation you posted :-(
Fortunately I had already stopped my SETI testing so am no longer pushing the servers to feed me. I'll post back if any of the others validate properly.
____________
Thanks - Steve

Richard Haselgrove
Send message
Joined: 11 Jul 09
Posts: 1576
Credit: 5,806,011,851
RAC: 9,768,357
Level
Tyr
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 16520 - Posted: 24 Apr 2010 | 15:21:56 UTC - in response to Message 16519.
Last modified: 24 Apr 2010 | 15:39:26 UTC

Just as well they were out of stock, then! "More expected on 28/04/10" - or 13 May, for some brands.

Edit - if you do want to test out the performance of the card on SETI tasks, the SETI Beta project has a Fermi-compatible application under test. Work supply is a bit erratic at the moment, though, because they're testing other things as well.

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 17062 - Posted: 15 May 2010 | 11:12:18 UTC

Some good news about Fermi here. They already shipped 400k units, which puts some weight behind their claim that the yield problems are solved by now. We should also see a 512 shader card at some point.
I'm still not interested in buying one, but an ultra high priced, unavailable Fermi wouldn't have done us any good.

MrS
____________
Scanning for our furry friends since Jan 2002

Profile Paul D. Buck
Send message
Joined: 9 Jun 08
Posts: 1050
Credit: 37,321,185
RAC: 0
Level
Val
Scientific publications
watwatwatwatwatwatwatwatwatwat
Message 17072 - Posted: 15 May 2010 | 16:26:46 UTC

My local Frys had about 6 of them on the shelves as a mix of 470 and 480s ...

Sadly, no 5870 or I might have been tempted to buy another ...

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 17075 - Posted: 15 May 2010 | 17:14:19 UTC - in response to Message 17072.

I would not read too much into xbitlabs' interpretation of the financial state of NVidia.
Firstly, the cards only started to ship towards the end of the first quarter, so they could not have made a huge impact on first-quarter finances!
Secondly, companies tend to offset losses from one quarter against others (even over a year), for tax/financial stability reasons.

I was surprised to hear that a 448-core Fermi popped up in the Tesla C2050, and that report sort of suggested they had sold a lot of these. Well, you can be sure they made no money there; they are not due for release until Q3!
C2050

darwincollins
Send message
Joined: 6 Jan 10
Posts: 22
Credit: 105,944,936
RAC: 0
Level
Cys
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwat
Message 17166 - Posted: 20 May 2010 | 5:05:58 UTC - in response to Message 17072.

Similar here regarding Frys (their 25-year anniversary sale was last weekend). I was shopping for a new computer for my kid. Basically, he wanted to run Starcraft 2 and I wanted a strong GPU to run GPUGrid. I first tried a name-brand (pre-built) computer but found nothing with video cards that were listed on GPUGrid.

So I went 'clone'. I knew that I was going to make some heat, so I got a Cooler Master HAF case. The only thing on the shelves that was listed on GPUGrid was the GTX480. It was listed as 'recommended' on the forum page. (I didn't realize till now that there are issues with it on GPUGrid)

Profile GDF
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Send message
Joined: 14 Mar 07
Posts: 1957
Credit: 629,356
RAC: 0
Level
Gly
Scientific publications
watwatwatwatwat
Message 17168 - Posted: 20 May 2010 | 9:13:44 UTC - in response to Message 17166.

Similar here regarding Frys (their 25-year anniversary sale was last weekend). I was shopping for a new computer for my kid. Basically, he wanted to run Starcraft 2 and I wanted a strong GPU to run GPUGrid. I first tried a name-brand (pre-built) computer but found nothing with video cards that were listed on GPUGrid.

So I went 'clone'. I knew that I was going to make some heat, so I got a Cooler Master HAF case. The only thing on the shelves that was listed on GPUGrid was the GTX480. It was listed as 'recommended' on the forum page. (I didn't realize till now that there are issues with it on GPUGrid)



There are no issues with a GTX480. It is the fastest card you can buy for GPUGRID, almost a factor of two faster than the other cards.

gdf

Profile Beyond
Avatar
Send message
Joined: 23 Nov 08
Posts: 1112
Credit: 6,162,416,256
RAC: 0
Level
Tyr
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 17178 - Posted: 20 May 2010 | 15:04:13 UTC - in response to Message 17168.

I've been watching the Fermis a bit trying to decide if it's worth buying one. So far most of the 470s around are running faster on comparable WUs than the 480s. Don't know why. I did find one 480 that was faster than any of the others and it's running at just a bit less credits/hour than a fast GTX 295 (which of course has 2 cores) and 1.63 times the speed of my GTX 260. So they are the fastest single core card, but not sure if the price is justified yet.

Profile GDF
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Send message
Joined: 14 Mar 07
Posts: 1957
Credit: 629,356
RAC: 0
Level
Gly
Scientific publications
watwatwatwatwat
Message 17179 - Posted: 20 May 2010 | 16:01:04 UTC - in response to Message 17178.

I've been watching the Fermis a bit trying to decide if it's worth buying one. So far most of the 470s around are running faster on comparable WUs than the 480s. Don't know why. I did find one 480 that was faster than any of the others and it's running at just a bit less credits/hour than a fast GTX 295 (which of course has 2 cores) and 1.63 times the speed of my GTX 260. So they are the fastest single core card, but not sure if the price is justified yet.


The price is not justified; it is clearly too expensive. But with CUDA 3.1 the performance is there. We cannot release the new version until 3.1 is out.
The GTX480 is way faster than the GTX470 (at least in Linux).


gdf

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 17182 - Posted: 20 May 2010 | 16:53:23 UTC - in response to Message 17178.
Last modified: 20 May 2010 | 17:41:08 UTC

GDF is looking at his GTX480 running on Linux and using CUDA 3.1. Natively a GTX480 is 21% faster than a GTX470.
Perhaps you are looking at an exceptionally reliable and overclocked GTX295 and comparing it to one of several GTX480s on Vista or Win7 that are suffering around a 60% speed loss. This is driver related.

Most GTX295s only RAC between 40K and 70K per day.

My GTX470 is clocked at the same speed as a GTX480, and my card is on Win XP x86. It does have one CPU thread set aside for the GPU, with swan_sync=0 in use.

It could (going by tasks) get 70K per day, and is moving steadily towards that (no failures on that system to date, 6 days).
A native GTX480 should be about 10% faster, having 10% more shaders...
So a native GTX480 on XP could get 77K per day.
On Linux that would be about 81K and with the 11% benefit of CUDA 3.1 (over CUDA 3.0) that would be about 90K per day.
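For anyone who wants to check how those numbers compound, here is a minimal sketch in Python; the +10% and +11% factors are the ones quoted above, while the ~5% Linux-over-XP step is my own assumption to bridge the 77K and 81K figures.

gtx470_xp = 70_000                         # overclocked GTX470 on XP (the estimate above)
gtx480_xp = gtx470_xp * 1.10               # ~10% more shaders, as claimed above
gtx480_linux = gtx480_xp * 1.05            # assumed ~5% Linux advantage over XP (my bridge from 77K to 81K)
gtx480_linux_cuda31 = gtx480_linux * 1.11  # +11% expected from CUDA 3.1

print(round(gtx480_xp), round(gtx480_linux), round(gtx480_linux_cuda31))
# -> 77000 80850 89744, i.e. roughly the 77K, 81K and 90K quoted above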

A GTX295 might get that, perhaps even slightly more, but it is unlikely to be as stable and is not as future-proofed with respect to crunching GPUGrid tasks.

That said, the Fermis are too expensive.
I would expect that to relax somewhat over the next 2 or 3 months, especially when more variants are released, including the GTX 465 (June 9th is its release date, but don't expect a date for the lesser cards; they may just turn up unannounced):

Suggested GTX465 prices are $250 to $300.
Although the GTX 465 only has 1GB of GDDR5, that is enough for here. Its 607MHz core and 1,215MHz shaders are the same clocks as the GTX470.
The downside is that it only has 352 stream processors; 96 fewer than the GTX470!
So, it looks more affordable but a poorer performer (27% less than a GTX470). Overall it should be reasonable value, if you want a card that crunches here.

If it is really to compete with the Radeon 5850, the price will have to drop a bit more, but that usually comes with time; release prices usually fall.

There are several other suggested Fermis such as the GTS430 with 192 shaders. The expected retail price is between $175 and $200.
The 440 and 450 variants are set to bridge the gap between the GTS430 and the GTX465, with the 450 having 256 shaders.

These cards should rival existing high-end cards such as the GTX260 and GTX275 for performance, but not the GTX295 without significant code enhancement.
Anyone with a GTX275 should rethink what sort of card they have; it's no longer high end, it's now mid-range!

My advice for anyone thinking about buying a card to crunch with is to either get a Fermi or a GT 240, or preferably wait for a month (mid June):
There will be more Fermis to choose from, and CUDA 3.1 (+11%) is due for release about that time.
My guess is that within a few months we will no longer be recommending GT 240s as GPUGrid entry-level cards, but perhaps they will hold their own for a while yet.

Profile Beyond
Avatar
Send message
Joined: 23 Nov 08
Posts: 1112
Credit: 6,162,416,256
RAC: 0
Level
Tyr
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 17185 - Posted: 20 May 2010 | 17:52:24 UTC
Last modified: 20 May 2010 | 17:54:34 UTC

Writing long speculative ramblings isn't very useful. If you spent the time actually looking at the results various machines are returning, you'd be better off. Here are some comparisons for the GTX 295, GTX 480 and GTX 470. All are running WinXP.

Here's the first GTX 295 I looked at:

http://www.gpugrid.net/results.php?hostid=62542

It averages around 13,800 seconds for a 6,755.61 credit TONI_HERGunb WU. Since it has 2 cores running that's a WU every 6,900 seconds.


Here's by far the fastest GTX 480 that I've found:

http://www.gpugrid.net/results.php?hostid=35174

It averages around 7,800 seconds for a 6,755.61 credit TONI_HERGunb WU.


Here's your GTX 470. It's running faster than any GTX 480 that I've been able to find except the one listed above:

http://www.gpugrid.net/results.php?hostid=71363

It averages around 8,650 seconds for a 6,755.61 credit TONI_HERGunb WU.


For comparison my GTX 260, again running XP, is averaging 12,675 seconds for a 6,755.61 credit TONI_HERGunb WU.

I did spend some time looking for GTX 480 cards. There may be more out there performing as well but I haven't found them.

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 17187 - Posted: 20 May 2010 | 20:14:13 UTC - in response to Message 17185.

I did look at many results, especially from top contributors. My speculation was limited to forthcoming GTS400 cards; the GTX465 specs are published.

Going by the best XP GTX295 you could find, running 6.72 (the fastest) tasks, it could theoretically get a RAC of 2 * (86400/13807) * 6755.61 ≈ 85K (counting both cores):
2363335 1495413 20 May 2010 7:06:09 UTC 20 May 2010 14:50:24 UTC Completed and validated 13,807.71 737.55 4,503.74 6,755.61 Full-atom molecular dynamics v6.72 (cuda)
However that system actually gets a RAC of 63K!

Going by the tasks for Ton’s GTX480:
(86400/6416.86) * 6642.02 ≈ 90K
2363322 1495382 20 May 2010 7:53:42 UTC 20 May 2010 13:13:37 UTC Completed and validated 6,416.86 7,896.70 4,428.01 6,642.02 Full-atom molecular dynamics v6.73 (cuda30)
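To make that arithmetic explicit, here is a minimal sketch of the calculation; it is purely theoretical, assuming identical WUs crunched back to back with no failures (the caveat raised further down the thread).

SECONDS_PER_DAY = 86_400

def theoretical_rac(seconds_per_wu, credit_per_wu, gpus=1):
    # Daily credit if the card returned identical WUs non-stop with no failures.
    return gpus * (SECONDS_PER_DAY / seconds_per_wu) * credit_per_wu

print(theoretical_rac(13_807.71, 6_755.61, gpus=2))  # GTX295, both cores -> ~84,500
print(theoretical_rac(6_416.86, 6_642.02))           # the GTX480 above   -> ~89,400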

Unfortunately it is still too early to look at system RACs for Fermis as the systems need to be up for quite some time to equilibrate.

My GTX470 is reasonably well optimized, but nothing special. I have also recently been having system RAM issues; I only have 2GB in that system and I was running 7 CPU threads on a native i7-920 and some CPU tasks were using over 400MB each. My page file is over 5GB, but it is a testing system!
I will eventually get more RAM and I have just upped the CPU to 3.5GHz, so the times should eventually fall to below 8K sec.

Again, the card is only clocked to the same core speed as a GTX480 and it does not yet use CUDA 3.1, so there is more to come.
Lots of people are struggling with the poor Win7 drivers.
Give me a GTX480 and I could easily get 110K per day!

Anyway, wait a month and see how things pan out before getting a Fermi; there will be a GTX465 version and possibly/hopefully a 512-shader Fermi.

In my opinion, as an experienced buyer and seller of IT equipment, if you have a GTX275, GTX280, GTX285 or GTX295, SELL IT NOW and get a FERMI next month.
Within a month your GTX200 cards will be worth much less!

Snow Crash
Send message
Joined: 4 Apr 09
Posts: 450
Credit: 539,316,349
RAC: 0
Level
Lys
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 17191 - Posted: 20 May 2010 | 23:04:41 UTC - in response to Message 17187.
Last modified: 20 May 2010 | 23:05:11 UTC

However that system actually gets a RAC of 63K!

Determining how well a card produces is difficult. The average time for a 6.72 HERG WU was used for the theoretical calculations on the best 295, yet the note quoted above does not mention that the card is still running many 6.03 tasks.

For example, my 295 is not returning each WU quite so fast but because I switched over to mostly 6.72 two days ago my RAC is over 76K and climbing.
http://www.gpugrid.net/results.php?hostid=56900
____________
Thanks - Steve

Profile Beyond
Avatar
Send message
Joined: 23 Nov 08
Posts: 1112
Credit: 6,162,416,256
RAC: 0
Level
Tyr
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 17192 - Posted: 20 May 2010 | 23:46:38 UTC - in response to Message 17191.

However that system actually gets a RAC of 63K!

Determining how well a card produces is difficult. The average time for a 6.72 HERG WU was used for the theoretical calculations on the best 295, yet the note quoted above does not mention that the card is still running many 6.03 tasks.

For example, my 295 is not returning each WU quite so fast but because I switched over to mostly 6.72 two days ago my RAC is over 76K and climbing.
http://www.gpugrid.net/results.php?hostid=56900

Exactly, on v6.72 tasks the above GTX 295 is outproducing any GTX 480 to date. SK babbles on about future potential RAC but it's really very easy to determine which card is faster. For some reason he's just not getting it, and it's getting a little frustrating.

Profile GDF
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Send message
Joined: 14 Mar 07
Posts: 1957
Credit: 629,356
RAC: 0
Level
Gly
Scientific publications
watwatwatwatwat
Message 17195 - Posted: 21 May 2010 | 8:19:28 UTC - in response to Message 17192.

In terms of RAC, I would expect a dual-core GTX295 to be on the same level as a single-core GTX480.

gdf

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 17197 - Posted: 21 May 2010 | 9:59:20 UTC - in response to Message 17195.
Last modified: 21 May 2010 | 10:02:33 UTC

That's along the same lines as my babbling:
"A GTX480 could get 90K" and "a GTX295 could get 85K".

As I also explained, that GTX480 is a lab card, so a Fermi using CUDA 3.0 would be about 11% slower (82K).

I don't think it is babbling to point out to people that CUDA 3.1 will improve things by 11% and that it will be released next month.

As for that GTX295 you say "is outproducing any GTX 480 to date"

http://www.gpugrid.net/results.php?hostid=56900

2368139 1498742 21 May 2010 3:53:39 UTC 21 May 2010 8:41:48 UTC Error while computing 10.95 8.80 0.07 --- Full-atom molecular dynamics v6.72 (cuda)
2368121 1498726 21 May 2010 3:53:39 UTC 21 May 2010 3:56:37 UTC Error while computing 3.09 3.00 0.02 --- Full-atom molecular dynamics v6.72 (cuda)
2368111 1498720 21 May 2010 3:53:39 UTC 21 May 2010 3:56:37 UTC Error while computing 9.92 8.72 0.06 --- Full-atom molecular dynamics v6.72 (cuda)
2368109 1498719 21 May 2010 3:49:36 UTC 21 May 2010 3:53:39 UTC Error while computing 11.02 8.83 0.07 --- Full-atom molecular dynamics v6.72 (cuda)
2368098 1498712 21 May 2010 3:49:36 UTC 21 May 2010 3:53:39 UTC Error while computing 10.06 8.78 0.07 --- Full-atom molecular dynamics v6.72 (cuda)
2368090 1498706 21 May 2010 3:49:36 UTC 21 May 2010 3:53:39 UTC Error while computing 10.06 8.69 0.06 --- Full-atom molecular dynamics v6.72 (cuda)
2368050 1498678 21 May 2010 3:35:32 UTC 21 May 2010 3:38:56 UTC Error while computing 10.63 8.84 0.07 --- Full-atom molecular dynamics v6.72 (cuda)
2368049 1498677 21 May 2010 3:35:32 UTC 21 May 2010 3:38:56 UTC Error while computing 9.94 8.73 0.06 --- Full-atom molecular dynamics v6.72 (cuda)
2368043 1498674 21 May 2010 3:34:54 UTC 21 May 2010 3:38:56 UTC Error while computing 10.81 8.75 0.06 --- Full-atom molecular dynamics v6.72 (cuda)
2368020 1498659 21 May 2010 3:30:04 UTC 21 May 2010 3:33:14 UTC Error while computing 10.55 8.75 0.06 --- Full-atom molecular dynamics v6.72 (cuda)
2368017 1498656 21 May 2010 3:30:04 UTC 21 May 2010 3:33:14 UTC Error while computing 10.97 8.84 0.07 --- Full-atom molecular dynamics v6.72 (cuda)
2368003 1498649 21 May 2010 3:23:13 UTC 21 May 2010 3:26:44 UTC Error while computing 10.94 8.80 0.07 --- Full-atom molecular dynamics v6.72 (cuda)
2367976 1498629 21 May 2010 3:22:34 UTC 21 May 2010 3:24:36 UTC Error while computing 9.98 8.67 0.06 --- Full-atom molecular dynamics v6.72 (cuda)
2367960 1498617 21 May 2010 3:23:13 UTC 21 May 2010 3:26:44 UTC Error while computing 9.95 8.77 0.06 --- Full-atom molecular dynamics v6.72 (cuda)
2367933 1498599 21 May 2010 3:16:32 UTC 21 May 2010 3:20:26 UTC Error while computing 10.11 8.86 0.07 --- Full-atom molecular dynamics v6.72 (cuda)
2367905 1498576 21 May 2010 3:20:26 UTC 21 May 2010 3:22:34 UTC Error while computing 9.94 8.77 0.06 --- Full-atom molecular dynamics v6.72 (cuda)
2367895 1498567 21 May 2010 3:16:32 UTC 21 May 2010 3:20:26 UTC Error while computing 10.78 8.69 0.06 --- Full-atom molecular dynamics v6.72 (cuda)
2367865 1498546 21 May 2010 3:16:32 UTC 21 May 2010 3:20:26 UTC Error while computing 11.00 8.63 0.06 --- Full-atom molecular dynamics v6.72 (cuda)
2367814 1498513 21 May 2010 2:31:10 UTC 21 May 2010 2:34:10 UTC Error while computing 10.28 8.81 0.07 --- Full-atom molecular dynamics v6.72 (cuda)
2367795 1498502 21 May 2010 2:31:10 UTC 21 May 2010 2:34:10 UTC Error while computing 11.00 8.84 0.07 --- Full-atom molecular dynamics v6.72 (cuda)

Unfortunately, this demonstrates my point quite well; reliability is a significant factor in RAC.

- That 75K RAC won't last long going at that rate.

http://www.gpugrid.net/results.php?hostid=71363
No errors here!

As for my low RAC:
again, you need to be running the project for weeks for the RAC to average out.
My Fermi RAC is still under 40K as it has only been in that box for a week.
As I only have a GTX470, it could only reach about 70K anyway.

roundup
Send message
Joined: 11 May 10
Posts: 57
Credit: 1,587,870,193
RAC: 8,075,381
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 17198 - Posted: 21 May 2010 | 10:53:34 UTC - in response to Message 17185.


For comparison my GTX 260, again running XP, is averaging 12,675 seconds for a 6,755.61 credit TONI_HERGunb WU.

Wow, my GTX 260 is averaging 16,200 seconds for a 6,755.61 credit TONI_HERGunb WU, but under Vista 64.
What are the best clock settings for an optimized yet stable output from a GTX 260 (216)?

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 17200 - Posted: 21 May 2010 | 12:08:45 UTC - in response to Message 17198.
Last modified: 21 May 2010 | 12:10:16 UTC

The reference shader clock for a GTX260sp216 is 1.242GHz.
The card you are referring to has its shaders at 1.55GHz, making it about 24% faster than reference.
Can't see what your shaders are at. Perhaps your card is slightly factory overclocked?

There is also a difference between using Vista/Win7 and using XP (especially XP SP2). Basically, XP is slightly faster, which would at least account for the extra 4% being achieved by that card.

roundup
Send message
Joined: 11 May 10
Posts: 57
Credit: 1,587,870,193
RAC: 8,075,381
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 17202 - Posted: 21 May 2010 | 12:31:38 UTC - in response to Message 17200.
Last modified: 21 May 2010 | 12:39:25 UTC

My card was clocked at 694 (GPU) / 1215 (memory) / 1512 (shader) MHz. It ran absolutely stable. I just tried 718/1548/1550 MHz. After I applied the settings for the GPU clock and shaders everything looked normal; after the new setting for the memory, the application crashed. Factory clocking was 576/1000/1242MHz.
What settings should I select? Was it a mistake to overclock the memory?

Profile Beyond
Avatar
Send message
Joined: 23 Nov 08
Posts: 1112
Credit: 6,162,416,256
RAC: 0
Level
Tyr
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 17203 - Posted: 21 May 2010 | 14:48:53 UTC - in response to Message 17202.

My card was clocked at 694 (GPU) / 1215 (memory) / 1512 (shader) MHz. It ran absolutely stable. I just tried 718/1548/1550 MHz. After I applied the settings for the GPU clock and shaders everything looked normal; after the new setting for the memory, the application crashed. Factory clocking was 576/1000/1242MHz.
What settings should I select? Was it a mistake to overclock the memory?

That's my card. It's an MSI, factory OCed to 655 core. All I did was increase the shaders to 1550, so it's at 655/1550/1050. A big difference is that it's running XP. Vista and Win7 are slower in GPUGRID since XP runs the GPU at a higher percentage of usage. Strangely, I've found no other projects that seem to be affected by this slowdown.

Snow Crash
Send message
Joined: 4 Apr 09
Posts: 450
Credit: 539,316,349
RAC: 0
Level
Lys
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 17208 - Posted: 21 May 2010 | 17:44:07 UTC

@skgiven ... The numerous errors you posted are a very small view of that card's performance. It crunched 6.03 without error for a very long time. Since switching to 6.72 it was returning 80% successfully, until last night when those errors occurred. That leads me to believe we need more guidance from the project staff on what REALLY works for their application, because the only thing I have done to that card recently is turn the clocks down. Please consider taking more time to do a thorough and impartial analysis instead of posting incomplete data.

@GDF - Can you please tell us which card series, with which driver + OS combinations, you have successfully tested the non-Fermi version on in the lab?

I know you are all working very hard at testing new technologies, optimizing the applications, trying to raise funds and publish papers, along with running a public project, but I imagine you will see a substantial drop in throughput now that the new version has become the primary app.

<sigh> ... I guess I'll have to stop posting for a while ... </sigh>
____________
Thanks - Steve

Profile GDF
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Send message
Joined: 14 Mar 07
Posts: 1957
Credit: 629,356
RAC: 0
Level
Gly
Scientific publications
watwatwatwatwat
Message 17210 - Posted: 21 May 2010 | 18:40:56 UTC - in response to Message 17208.

The new application uses more GPU RAM and your card seems to run out of it for some reason. A GTX295 should have sufficient RAM to crunch everything.
Are you doing anything special while crunching?

gdf


@skgiven ... The numerous errors you posted are a very small view of that card's performance. It crunched 6.03 without error for a very long time. Since switching to 6.72 it was returning 80% successfully, until last night when those errors occurred. That leads me to believe we need more guidance from the project staff on what REALLY works for their application, because the only thing I have done to that card recently is turn the clocks down. Please consider taking more time to do a thorough and impartial analysis instead of posting incomplete data.

@GDF - Can you please tell us which card series, with which driver + OS combinations, you have successfully tested the non-Fermi version on in the lab?

I know you are all working very hard at testing new technologies, optimizing the applications, trying to raise funds and publish papers, along with running a public project, but I imagine you will see a substantial drop in throughput now that the new version has become the primary app.

<sigh> ... I guess I'll have to stop posting for a while ... </sigh>

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 17211 - Posted: 21 May 2010 | 19:28:29 UTC - in response to Message 17208.
Last modified: 21 May 2010 | 19:31:13 UTC

I am still having problems with the KASHIF_HIVPR tasks,
so I would second that request for a list of working drivers.

I get this a lot,
ERROR: file ntnbrlist.cpp line 63: Insufficent memory available for pairlists. Set pairlistdist to match the cutoff.
called boinc_finish

Steve is getting the same error, and I have seen this same error on several different cards and for several different users.

For me, 6.10.56, Vista x64, 19062 works for most tasks, just not KASHIF_HIVPR.

I have tried many versions of BOINC and many drivers. The success is not consistent across platforms and cards. This suggests that the choice of driver will be platform-dependent and card-dependent. It could come down to firmware.

For example,
I have XP x86 with a GT240, using driver 19745 and Boinc 6.10.51 (yes, the one that is supposed to stop crunching GPU tasks randomly, but works fine for me on one system). No failures of any type ever!
With the 19745 driver I could not crunch any 6.72 tasks on a GT240 Vista x64 system, or any 6.72 tasks on a GTX260 Win7 x64 system. It is also failing on Steve's system (XP SP3) with a GTX295 that is also a known good card.
A confusing picture that needs to be cleared up.

PS. It should be insufficient

Snow Crash
Send message
Joined: 4 Apr 09
Posts: 450
Credit: 539,316,349
RAC: 0
Level
Lys
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 17221 - Posted: 22 May 2010 | 12:39:34 UTC - in response to Message 17210.

The new application uses more GPU RAM and your card seems to run out of it for some reason. A GTX295 should have sufficient RAM to crunch everything.
Are you doing anything special while crunching?

gdf

Sounds like something is leaking memory or is just not releasing it quickly/properly before the next WU starts executing. Maybe introduce a small delay at the beginning of the app, or do something at the end of the app to try to force it to release/recover memory better?

The machine in question is dedicated to GPUGrid and WCG. I don't even have a monitor or mouse hooked up to it so rebooting regularly (which usually helps memory issues) is a pain because the motherboard (x58 EVGA SLI LE) will not boot without a monitor.

With that in mind I have moved my GTX480 (which never had problems on Win7 64) onto that machine (WinXP 32) and will monitor it for "insufficient memory" issues. On my second machine, due to the summertime heat coming soon and the added electrical cost of air conditioning, I am now only going to run a GTX285.
____________
Thanks - Steve

Profile Bikermatt
Send message
Joined: 8 Apr 10
Posts: 37
Credit: 3,839,902,185
RAC: 0
Level
Arg
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 17461 - Posted: 30 May 2010 | 15:25:50 UTC

So I just noticed a few GTX 465s on Newegg for US $280. I thought I was going to be really tempted by this card, but now I am reading reports of a 200W TDP and I am suddenly very unimpressed.
If I am going to have a 200W TDP I would rather go for one of the 470s with a better heatsink so it runs cooler. I was hoping for <150W, so I guess I will wait for the 450s.

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 17467 - Posted: 30 May 2010 | 18:12:16 UTC - in response to Message 17461.

The card is in spirit rather similar to the 5830: you get the fat chip, but with quite a few units disabled so that some of the bad chips can be recycled. And since it's recycling, they don't bin the chips for low voltage requirements / high clock speed / low leakage, hence the high power consumption. And they don't make them too cheap, so they don't generate too much demand for something which is essentially an unwanted byproduct.

MrS
____________
Scanning for our furry friends since Jan 2002

Profile liveonc
Avatar
Send message
Joined: 1 Jan 10
Posts: 292
Credit: 41,567,650
RAC: 0
Level
Val
Scientific publications
watwatwatwatwatwat
Message 17469 - Posted: 30 May 2010 | 19:01:17 UTC - in response to Message 17467.

Reminds me of my first Celeron (I call it Sillyron), but they dumped those; they don't dump these. Even though I'm a great fan of the GTX260 (216), the original one didn't impress me. But isn't it just a matter of time before they mount better (more expensive) heatsinks/fans & drop the prices on these "byproducts"? The original hardcore OC'ers used Celerons & got them to do quite well, despite being handicapped, & the Intel chips that give the best bang for the buck are at the lower end of the price spectrum. Maybe a GTX465 with a Coolermaster V10 or a Corsair H50 GPU alternative might get that sick kitten up & running?
____________

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 17472 - Posted: 30 May 2010 | 21:37:41 UTC - in response to Message 17469.

You can switch the heatsink and clock the card up, but you can't make up for the disabled shaders and/or the voltage and leakage. And before you put an expensive heatsink onto a GTX465, it might be better to go for a GTX470 with stock cooling ;)
For CPUs it's different, because you can't disable parts of them in as fine-grained a way as you can with GPUs, and because they're not as power-consumption-bound as high-end GPUs.

MrS
____________
Scanning for our furry friends since Jan 2002

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 17474 - Posted: 30 May 2010 | 23:07:37 UTC - in response to Message 17469.
Last modified: 30 May 2010 | 23:20:57 UTC

Good point about the original GTX260 being a poor product, and the later GTX260s (55nm) being much better.
My last task with my GTX260sp216 (now on XP) suggests that I could get about 40K a day with that card (with modestly overclocked shaders):
2419426 1523141 30 May 2010 10:48:28 UTC 30 May 2010 19:29:12 UTC Completed and validated 14,236.77 472.39 4,428.01 6,642.02 ACEMD2: GPU molecular dynamics v6.05 (cuda)

(86400/14236)*6642=40300

The reported GTX465 TDP of 200W may (or may not) be slightly higher than what is actually observed; there will no doubt be some GTX465s that are better than others. So it will be hit and miss whether you get a good GPU or not. Lots of leakage: unlucky, high TDP. Not so much: lucky, lower TDP, and more reasonable value for money.

When you think that a GTX260 sp216 (55nm) has a TDP of 171W, and a GTX285 (240 shaders) has a TDP of 204W, the GTX465 with 352 shaders at 200W is not that bad!
Basically, it should perform 25 to 30% slower than a GTX470, as it has 352 shaders, 96 fewer than the GTX470.
We will have to wait and see if the RAM is the same, or if any difference in the RAM affects performance here.
The thermal threshold for this 40nm chip is 105C, the same as the GTX470 and GTX480 (they are all Fermis).
As I mentioned before, the release retail price is likely to fall over a period of a couple of months. It is in NVidia's interest to have as few of these cards as possible, as they are basically reject 512, 480 and 470 cards. Frankly, I don't see why NVidia did not just release 512, 480, 448, 416, 384, 352, 320, 288 and 256 cards (with the numbers being the shader count). Perhaps they are trying to defy reality, or perhaps the other numbers will appear within the evils of OEM systems? I'm guessing an OEM Fermi will be one to especially avoid!
As with the original GTX260, I expect many of these cards will be viewed with disdain in the future, but such hindsight might be some way off; I can't see a GTX465 Ver 2 being released for many months, perhaps a year! So if you are considering one now, don't let the prospect of a Ver 2 card deter you.
Given that my GTX470 (overclocked to 704MHz) could get 70K per day (or about 60K stock), a GTX465 could only get about 50 to 55K per day if similarly overclocked (or 45 to 50K stock).
When CUDA 3.1 is released in a couple of weeks, that will rise by about 11%, which would make it 55 to 60K (overclocked); about 45% faster than my overclocked GTX260sp216.
In terms of the shader count, I think this performance is a bit shy! Hopefully Global Foundries will improve yields, prices will drop, there will be slight core mods, and card design restrictions will be lifted to encourage competition and performance. For now, these generic cards have to use the (presently) relatively slower CUDA 3.0, and the apps are only beginning to be honed towards these new cards, so it is way too early to judge this or any other Fermi card in terms of its crunching ability here; it would be like judging the GTX200 range by looking at the original GTX260's performance on the original ACEMD app!

The present question should be:
is the GTX465 worth the money compared to the GTX470?
Well, if it is 33% less in terms of price and performs 25 to 30% less, then there is not much in it (a slight positive). The slight negative is that it might use close to 200W, while a GTX470 only has a TDP of 215W.

My general (ball-park) advice is that if you are going to spend £300 on a base unit, then £200 on a GPU is about right. If you are going to spend more on the base unit, then get a GTX470 or 480 if you want to crunch here, now. If you buy a base unit for around £200, a GT240 would be the best option, or possibly a cheap second-hand GTX260sp216 (55nm) if you come across one.
Remember, a high-frequency dual core is much better than a sluggish quad (a 3.3GHz Intel C2D is better than a 2.1GHz quad Opteron, for example), as long as you keep a core/thread free for the GPU and optimise (swan_sync=0)!
I would really like NVidia to produce some other non-Fermi 40nm cards, but it seems they have put all their eggs into the same basket and intend to just push out more shader-reduced cards such as the GTS430.
Unless that card uses significantly less power (under 120W, and no more than one power connector), it will be a flop if managed well, and if not, a disaster (the loss of the lower and mid-range market sector); hence my wish for an alternative!

As MrS just said, get a GTX470 before you try to pimp up a GTX465 with a specialized heatsink and fan; you will not be able to make up for the missing 96 shaders!

Profile liveonc
Avatar
Send message
Joined: 1 Jan 10
Posts: 292
Credit: 41,567,650
RAC: 0
Level
Val
Scientific publications
watwatwatwatwatwat
Message 17476 - Posted: 31 May 2010 | 0:35:45 UTC - in response to Message 17472.

HOT, HOT, HOT! http://www.hexus.net/content/item.php?item=24299&page=8 I once thought there was hope for a dual Fermi using one of these: http://www.coolitsystems.com/index.php/en/omni.html but it was struggling with a 16% OC'ed GTX480. Still sub-90s, but I can't see how it could handle a dual card even at stock.
____________

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 17478 - Posted: 31 May 2010 | 10:46:29 UTC - in response to Message 17476.
Last modified: 31 May 2010 | 10:58:12 UTC

Zotac are about to launch their own-design versions of the GTX480 and GTX470:


GTX480 756MHz GPU, 1512MHz shaders
GTX470 656MHz GPU, 1312MHz shaders

Hexus were also able to show that the actual observed power draw of the GTX465 is about 35W less than that of the GTX470 (not quite as bad as the suggested 20W TDP difference).

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 17481 - Posted: 31 May 2010 | 20:06:47 UTC - in response to Message 17478.
Last modified: 31 May 2010 | 20:52:57 UTC

This looks like the best GTX470 deal in the UK at £285.



The Palit's dual-fan system keeps it 10 degrees C cooler and 4dB quieter, and it's also about £14 less expensive than the next cheapest card, which uses the original NVidia design!
My Asus card was and still is £37 more expensive :(

This Inno3D GTX 470 Hawk is set to take temps down by as much as 20deg C and 8dB.


http://wccftech.com/2010/05/30/gainward-releases-geforce-gtx-470-gs-golden-sample/

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 17482 - Posted: 31 May 2010 | 21:29:35 UTC

GTX465 is an interesting beast: Anand reports that nVidia can use a range of voltages for their chips. They did this before, but it's becoming more important with a huge chip like Fermi. To illustrate the point: Anand's 470 and 480 run at 0.96 V, whereas their 465 gets 1.025 V delivered. It may not sound like much, but it would result in a 14% power consumption increase (assuming the usual P ~ V^2 scaling). The result is that their 465 draws even a little more power than their 470! Other reviews show more of a difference, so I assume they got cards which are able to hit the target clock speed at lower voltages.
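A quick back-of-the-envelope check of that 14% figure, assuming only the simple P ~ V^2 scaling mentioned (i.e. ignoring clock, temperature and leakage differences):

v_470_480 = 0.96   # volts, Anand's GTX470/480 samples (quoted above)
v_465 = 1.025      # volts, Anand's GTX465 sample (quoted above)

increase = (v_465 / v_470_480) ** 2 - 1
print(f"{increase:.1%}")   # -> 14.0%, matching the figure above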

-> If you're in the market for a GTX465 you could save yourself some money (15 - 30W of power) if you can get a card which runs at a lower voltage. Not sure how to find that out, though, unless you're buying used / from a private person.

@SK: they didn't go for 512 shaders because they're still stocking up on them, until they have enough to finally launch such a card. And they probably didn't go for the smaller steps because they didn't want to confuse the consumer. Well, not so much avoid confusing him as avoid him noticing ;)

And regarding smaller chips based on the Fermi architecture: the usual birds are singing that we should see the first one in about a month. They've been pretty accurate regarding the 465. Regarding a better 465: I could imagine them enabling one more shader cluster (as they did with the GTX260), but instead of a major redesign I'd rather expect the high-end model of the next smaller chip to take over this spot. It's probably "only" got 256 shaders but should be able to clock them quite a bit higher (I'd expect 1.5+ GHz). And it should be more balanced for games, i.e. have more texturing units.

MrS
____________
Scanning for our furry friends since Jan 2002

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 17493 - Posted: 1 Jun 2010 | 20:29:17 UTC - in response to Message 17482.

I think it may be an open design with this one:
the builders will do what they like.

The 200W TDP could be well out for any given card; green versions and black editions?
Best to read the small print on the box for all the info.

The GF100 with half the GPU disabled would not be competitive: too much leakage; better off in the bin!
There are too many rumours about the GF104. I heard 384 shaders, 750MHz and 130W, which would do nicely doubled up for the fall, but then I also saw a Photoshopped image of a stretched GF100!

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 17495 - Posted: 1 Jun 2010 | 21:25:27 UTC - in response to Message 17493.

The GF100 with half the GPU disabled would not be competitive: too much leakage; better off in the bin!
There are too many rumours about the GF104. I heard 384 shaders, 750MHz and 130W, which would do nicely doubled up for the fall, but then I also saw a Photoshopped image of a stretched GF100!


That's why I said "next smaller chip" ;)
And a 750 MHz core clock matches the 1.5 GHz shader clock I've read about (and the >1.5 GHz I'd expect) very well. However, I'd rather expect 256 shaders at 150 W and double the number of texture units than 384 shaders at 130 W.
Based on the numbers for the GTX480 I actually estimate 143 W for "my" card, neglecting the TMU increase and any voltage increase needed to reach the higher clock. 384 shaders at 130 W is daydreaming - it's the same design on the same process.
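For reference, one way to reproduce an estimate like that 143 W is to scale the GTX480's power linearly with shader count times shader clock. The GTX480 baseline figures used here (250 W TDP, 480 shaders at 1401 MHz) are nVidia's reference specs rather than numbers from this thread, and the linear scaling is of course only a rough approximation:

gtx480_tdp_w = 250                      # reference GTX480 TDP (assumed, not from this thread)
gtx480_shaders, gtx480_mhz = 480, 1401  # reference GTX480 shader count and shader clock

gf104_shaders, gf104_mhz = 256, 1500    # the speculated GF104 top model described above

estimate_w = gtx480_tdp_w * (gf104_shaders * gf104_mhz) / (gtx480_shaders * gtx480_mhz)
print(round(estimate_w))                # -> 143 (W), before any TMU or voltage adjustments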

MrS
____________
Scanning for our furry friends since Jan 2002

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 17496 - Posted: 2 Jun 2010 | 10:55:37 UTC - in response to Message 17495.
Last modified: 2 Jun 2010 | 11:02:13 UTC

This thread suggests 336 and 384 shaders:
336 shaders, 768MB, 675/1350/1800 ??
384 shaders, 1GB, 750/1500 ??
Wishful thinking perhaps!
It's more likely to be 256 shaders, and perhaps 1GB at 750/1500/1800 (though 800MHz versions might turn up).
The 130W suggestion was probably for the GTS450 or GTS440. We can only guess whether these have the full complement of 256 shaders (at 675MHz and 625MHz, for example); with the GTS430 only having 192 shaders, it might be possible to have 224 shaders.

150W is being bounced about for the GTX460, and there is talk of a dual GTX460 (GTX490). 295W seems more likely than 375W (a dual GTX480), but if a GF102 Fermi turns up with 512 shaders, it would be competing against itself! So perhaps there is something in the 336/384 shader speculation, even if it is a different product and many months away.

Although they were correct about the GTX465 specs, it's still speculation until the cards are announced by NVidia.

This is my rough guess as to what might turn up, and when.
- GTX 495 - 2xGF104 295 W (long way off)
- GTX 485 - GF102 - 245 W (3 or 4 months)
- GTX 475 - GF102 - 215 W (3 or 4 months)
- GTX 465 - GF102 - 200 W (released)
- GTX 460 - GF104 - 150 W (within 2 months)
- GTS 450 - GF106 - 140 W (within 2 months)
- GTS 440 - GF106 - 130 W (within 2 months)
- GTS 430 - GF108 - 120 W (within 2 months)

As for what turns up under the hood, who knows?
It is probably not even worth speculating on, because the TDP and frequencies are likely to vary, with open-design cards probably becoming the norm. A few of us here suggested this was the way forward for NVidia: the chips get built by NVidia and the card makers design the boards and cooling, as they do a better job of this anyway!

What's clear is that there will be more efficient Fermis with fewer shaders, targeting the middle market.
A competitive range of cards with GPUGrid performance between that of a GT240 and a GTX285, for between £100 and £250, would be very good news for the project!

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 17498 - Posted: 2 Jun 2010 | 13:39:34 UTC - in response to Message 17496.

Galaxy have a single slot GTX470,
http://www.anandtech.com/show/3731/news-a-single-slot-gtx-470-from-galaxy

Might be of interest to a few people with many slots on their boards, and the techs!


http://www.evga.com/articles/00501/

Profile Beyond
Avatar
Send message
Joined: 23 Nov 08
Posts: 1112
Credit: 6,162,416,256
RAC: 0
Level
Tyr
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 17502 - Posted: 2 Jun 2010 | 18:56:51 UTC

SK, that looks like a good board for 4x of those single slot GTX 470 cards. Leaves good airflow space between the cards.

Here's Tom's take on the GTX 465:

http://www.tomshardware.com/reviews/nvidia-geforce-gtx-465-fermi-gf100,2642.html

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 17503 - Posted: 2 Jun 2010 | 20:42:25 UTC - in response to Message 17496.
Last modified: 2 Jun 2010 | 20:48:47 UTC

I meant that the rumors I read were right about the release date of the GTX465, so I'll give them the benefit of the doubt and assume they'll also be right about the "beginning of July" release of the next smaller Fermi-architecture chip (whether they call it GF104 or Miss Piggy, I don't really care ;)

Regarding your first link: that looks quite bad to me. They show 8 memory modules, and they even write that it's 8 of them, yet they claim it's got a 192-bit memory bus. The latter would require 6 of the usual 32-bit GDDR5 memory chips, or 24-bit GDDR5 chips - something I haven't heard of.

Furthermore, they give the card 67% of the raw shader performance of the GTX480 but only 49% of the memory bandwidth, which would yield an unbalanced design (the GTX 480/470 don't have that much bandwidth to spare).

And 256-bit memory hasn't been that expensive in recent years. Plus this GF104 is supposed to be the fastest of 3 mainstream chips - which need to be smaller, cheaper and slower.

Why are their benchmark screenshots so small that you can't recognize anything? You only do this if you've got something to hide, don't you?

Actually, now that I've taken a closer look at the die shots, it appears to me the entire post is complete BS. Take a closer look: the "GTX460 die" is exactly the same, just squeezed vertically to half the original size. Even the dark line (some photographic / lighting artefact) from the bottom left to the upper right is similar.
However, GF100 is designed in blocks. NVidia can easily take these blocks, reduce their number and call it a day. Or a new Geforce. So what you'll see in a real chip is a smaller number of (at least very similar) blocks, not the same number of smaller blocks (which would be bad for numerous reasons). Plus they can't scale down things like the UVD engine and each of the 64-bit memory controllers.

Enough said.. and I don't see much point in speculating on the remaining lineup before we know what the next card / chip will actually be ;)

MrS

BTW: my suggested GF104 top model would have 57% of the shader power and 58% of the memory bandwidth of the GTX480, at a modest 800 MHz memory clock.
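Those percentages can be cross-checked with a minimal sketch; the GTX480 figures assumed here (480 shaders at 1401 MHz, 384-bit GDDR5 at 924 MHz, i.e. 3696 MT/s effective) are nVidia's reference numbers rather than anything stated in this thread:

def gddr5_bandwidth_gbs(bus_bits, mem_mhz):
    # Bandwidth in GB/s: bus width in bytes times the 4x effective GDDR5 data rate.
    return (bus_bits / 8) * (mem_mhz * 4) / 1000

gtx480_shader_power = 480 * 1401          # shaders x shader clock (MHz)
gf104_shader_power = 256 * 1500           # the suggested GF104 top model above

print(f"shader power: {gf104_shader_power / gtx480_shader_power:.0%}")                    # -> 57%
print(f"bandwidth: {gddr5_bandwidth_gbs(256, 800) / gddr5_bandwidth_gbs(384, 924):.0%}")  # -> 58%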
____________
Scanning for our furry friends since Jan 2002

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 17506 - Posted: 3 Jun 2010 | 0:59:41 UTC - in response to Message 17503.
Last modified: 3 Jun 2010 | 1:24:34 UTC

I saw early July as well, but they may stagger the releases according to regional time-of-year events (back to school); so it might appear in the US in early July, but not turn up in the UK until the end of July or early August.
I know the names don't really matter much, but NVidia like to keep using the same numbering systems (with a few twists). So in some respects the GTX480 is a reflection of the GTX280. We are missing the GTX460, 75, 85, 95 and the lower GTS cards for a full Fermi flush. Most of these should appear with the GF104.
Power usage is probably key for the lesser GF104 cards, if they are to compete with ATI (especially for OEM systems). However, if they are open design, there could be some special GTX460 cards with performance matching that of a standard GTX465 while using about 50W less.
You read too much into that first link. I did say it was probably incorrect and that it is more likely to have 256 shaders, and I earlier mentioned the Photoshopped image of a stretched GF100!
We will know the core counts and frequencies soon enough. Not knowing does not take away from the fact that there will be a range of GF104 cards, which means more choice for GPUGrid crunchers, and hopefully more participants.

The single-slot Galaxy cards are almost there, but not quite (better than the liquid-cooled cards that are 2mm too wide to be single slot). The Galaxy cards have no ability to exhaust heat via the back plate, so it just blows around the case!
To be able to use 7 of those single-slot GTX 470 Galaxy cards, you would need an extended server-style case with at least 2 large side extractor fans, probably opposite door intake fans (if it's a full tower case) and heavy-duty front intake fans. I would take the back plates off to allow heat to escape that way - not that I could afford 7 Fermis, that board, two 1500W PSUs and the rest of what would be a £3000+ system.

PS. A dual GF100 would require two 8-pin power connectors, making it only usable with very expensive PSUs (750W or over). At 295 or 300W, a more reasonable 650W PSU might suffice.
The GTX465 RAM is at 802MHz (x4) rather than the 837MHz of the GTX470 - not that it makes too much difference here.
The GTX465 also reports the A3 revision, the same as my GTX470, so no changes there.

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 17509 - Posted: 3 Jun 2010 | 10:07:27 UTC - in response to Message 17506.

Is there anything else of interest in this article besides the "probably made up specs" of GF104 / GTX460?

And I'm not sure a more open design would really improve things. NVidia still needs to come up with a good reference design, that's for sure. Otherwise every partner would have to do the same work again. And they can already choose alternative cooling solutions, clock speeds (factory OC) and voltages (e.g. Asus Matrix). And I'm not impressed by the custom PCB designs of the HD5870: despite using expensive high quality circuitry they mostly run at only slightly higher clock speeds, and at the cost of substantially increased power consumption (Anand had a review of 3 recently). So I don't think there's much to be gained from such a move.. apart from the obvious benefits of better / different coolers.

MrS
____________
Scanning for our furry friends since Jan 2002

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 17511 - Posted: 3 Jun 2010 | 10:22:13 UTC - in response to Message 17509.
Last modified: 3 Jun 2010 | 10:56:07 UTC

A more open design framework would create greater variety and thus more choice for the end user. Circuitry changes would allow different lengths and widths of cards to fit more cases, different temperatures with new heatsink designs (e.g. vapour chamber coolers) for quieter environments or loud crunching farms, and of course different prices.
There may be green varieties that only require one 6-pin power connector (for lesser PSUs) and a range of top end dual card versions. These GF104 cards may even turn up in small form factor cases (media centre systems).
I see lots of excellent potential for innovation with an open design and I think there is a good chance more people will crunch.
If we're left stuck with NVidia's default GPU-in-a-shoebox design, none of this can happen!

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 17521 - Posted: 3 Jun 2010 | 16:19:55 UTC - in response to Message 17511.

Also consider that there are two sides to the coin: more freedom in actual card design also means more manufacturers trying to screw people by selling them inferior stuff. A reference design also stands for guaranteed quality and driver support.
I think the shoebox is flexible already. And isn't it you who posts these links about Fermi cards with non-reference coolers in this thread? It's been just 2 months since the cards became available!

MrS
____________
Scanning for our furry friends since Jan 2002

Profile liveonc
Avatar
Send message
Joined: 1 Jan 10
Posts: 292
Credit: 41,567,650
RAC: 0
Level
Val
Scientific publications
watwatwatwatwatwat
Message 17522 - Posted: 3 Jun 2010 | 17:04:16 UTC - in response to Message 17521.

But just like Bush Jr. "U fooled me once, shame on U, U fool me again, shame on ME!"

Consumers aren't stupid, they'll know when they're getting cheated, eventually...

It's against free competition to forbid the sale of inferior products, as long as they don't cause bodily harm. But how many cheap-o-crap-o brands can get away with junk before it affects their pricing? If ASUS, for example, wants to shoot their own feet, they just need to sell junk, again, & again, & again. Very quickly people will start calling them @SSUS.
____________

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 17523 - Posted: 3 Jun 2010 | 17:52:20 UTC - in response to Message 17522.

But it's not exactly in nVidia's interest to let customers find out for themselves over time which design is good and which one isn't. The argument would be like "Should I get nVidia or ATI?" "Well, if you want to take a gamble on something unproven, go and choose one of the experimental xxx designs. If you want reliable quality, go ATI. On the other hand, I heard the Hawazuzi XXX1234XTXTXT Superpro Very Limited Edition Ultra Clocked Monster (TM) is not such a bad model, if you can find one manufactured in the first half of week 34."

Example: considering the power draw of Fermi cards you really don't want them to use inferior capacitors to drive prices down a couple of $...

MrS
____________
Scanning for our furry friends since Jan 2002

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 17524 - Posted: 3 Jun 2010 | 18:09:44 UTC - in response to Message 17522.
Last modified: 3 Jun 2010 | 18:36:52 UTC

I have no doubt that cheap inferior products will turn up anyway, they are called OEM systems!
If Galaxy, XFX, BFG, Palit, Gigabyte, Asus... sell you a Fermi with a 1 year warranty and your card fails after 13 months, hard luck. You should have bought a card with a 3, 5, 10 year or lifetime warranty!
If they sell cheap parts and the cards start failing after 6 months, then the manufacturer also has a problem. They have to repair the cards, and will lose face (customers).
Nice to see some Green oldies, but one size does not fit all when we move on from G92.

Your argument that we can't have a Green Fermi, a shorter card, a narrower card, a dual card or any other type of bespoke card because someone in China might build a cheap GPU makes little sense. It is what OEMs do, so why should I not be able to buy a thinner version of a Fermi from a reputable dealer to put into my slim line case? NVidia could require a minimal reference design. You don't open a packet at both ends!

It is up to Europe to determine what is brought into the continent, and the same applies everywhere else. If we don't want cheap rubbish, legislate rather than blockade. If NVidia are worried about having their reputation tarnished then it's up to them to set minimum standards.
Heatsink design is not GPU design - they don't make tyres in an engine factory.
Besides, if a manufacturer can alter the card by using a better heatsink and fan, then why limit the board dimensions or voltages?
Why can Asus use better capacitors to allow more OC room? Why not use smaller or fewer capacitors on a Green board that uses lower voltages?
It's not as if such cards would no longer fit a PCIE slot.

Profile liveonc
Avatar
Send message
Joined: 1 Jan 10
Posts: 292
Credit: 41,567,650
RAC: 0
Level
Val
Scientific publications
watwatwatwatwatwat
Message 17525 - Posted: 3 Jun 2010 | 18:11:30 UTC - in response to Message 17523.

But hearsay is how the net works. It's a jungle of knowledge & ignorance, spin, & lies. Just like consumers like it. Otherwise it would be different. Nothing comes from nothing, & it's just a matter of time before a full tank in a good car runs out & the car stops, becomes useless, & needs bailing out.

Here in Denmark, retailers must guarantee a card for at least 2 years. If something is too cheap, it'll be a disaster of never ending RMAs. Before I buy something though, I like to google for reviews of the card I'm interested in. Sure, there can be spin & manipulation here too, but then the reputation of the site that reviews GPUs gets hurt along with its credibility.

I'm just saying, let it loose! I lived in Indonesia where there once was a choice of 20+ different flavors of Fanta; now there are 3-4. If there's a greater choice of Coca Cola in the USA than in the EU, that's because there's a market; otherwise it'll sort itself out in time.
____________

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 17529 - Posted: 5 Jun 2010 | 18:22:32 UTC - in response to Message 17524.

Your argument that we can't have a Green Fermi, a shorter card, a narrower card, a dual card or any other type of bespoke card because someone in China might build a cheap GPU makes little sense.


That's because this is not what I said. Or not the point I wanted to make. The point is that nVidia needs a solid reference design, and anything in their official lineup has to guarantee a certain minimum level of performance and quality.

As to why you can't buy all the different flavors of Fermi today which you'd like to see.. well, I'd say it all comes down to money.

First a very interesting option: a "Green Edition" Fermi. Why not? First let's consider what such a product can do and what it can not. The operating voltage of Fermi is already set quite close to what it needs to reach its current stock speeds. So simply lowering the voltage is not an option, as there's just not enough room left. What's left is lowering clock speed and voltage together. This does save power and increases efficiency, but you also lose performance. So your Green Fermi will (a) be slower or (b) need more hardware units enabled to make up for the loss in clock speed.
Graphics card prices are still determined by their gaming performance, and I dare say the crunching / HPC market is still too small to change that. So in case of option (a) nVidia would have to sell chips at a lower price than if they were regular GF100s. That's not what they're going to do if they can still sell every functional chip. And option (b) is financially unattractive as well: let's assume you'd take a GTX480, reduce its clock speed to ~1 GHz and lower the voltage a little more. You'd get about GTX470 performance at lower power consumption. However, now you have to sell a chip which could have been sold as a GTX480 for the price of a GTX470, because it's slower. NVidia would lose profit this way if they could still sell all those chips as GTX480s instead (which currently is the case).
BTW: you can't cut the power consumption of a massive chip like Fermi arbitrarily by lowering clocks & voltage. The reason is subthreshold leakage: every transistor consumes a small amount of power regardless of whether it is switching or not. This gives you a certain baseline of minimum power consumption. At idle they can keep this in check by power gating entire chip regions completely off the supply voltage, but under load you can't do that.
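
To illustrate the point, here's a toy model in Python (made-up coefficients, not measured Fermi data): dynamic power scales roughly with V²·f, while the leakage term only shrinks with voltage, so it sets a floor under load:

# Toy model only - illustrative numbers, not real measurements.
# Dynamic power ~ C * V^2 * f; subthreshold leakage does not vanish
# when you lower clocks, so a big chip keeps a power floor under load.

def load_power(v, f_ghz, dyn_coeff=130.0, leak_at_nominal=60.0, v_nominal=1.0):
    dynamic = dyn_coeff * v**2 * f_ghz            # switching power
    leakage = leak_at_nominal * (v / v_nominal)   # crude: leakage scales ~ with V
    return dynamic + leakage

print(f"stock-ish point (1.00 V, 1.4 GHz): {load_power(1.00, 1.40):.0f} W")
p = load_power(0.90, 1.20)
print(f"lower V & f (0.90 V, 1.2 GHz):     {p:.0f} W, of which {60.0 * 0.90:.0f} W is leakage")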

The dual card is easy: they're experimenting with it and landed at 430 W so far. Neither easy nor very attractive (no PCIe spec for that).

A shorter card might be nice, but PCBs already cost something, so it's actually in ATI's and nVidia's interest to keep their cards as short as possible before performance suffers. The current high end ATIs use less power than Fermis, have fewer memory channels to route, and are longer. Yet I can't see any custom design (which are allowed) that "fixes" this. One would think it should be much easier to shave off some length there than with a Fermi. Why not? I can't say for sure, but routing the high speed GDDR5 lines is certainly not trivial. If you limit the amount of space your engineers can work with, your signal quality will suffer and hence the obtainable memory clocks. So you'd have to pay for additional development and probably sell the card at a slightly lower price, for a limited audience. This is not set in stone, but I can see solid reasons for a company not to do this, even if it was allowed.

And by narrower card you mean a low profile one? If yes, then how do you route 384 bits of memory traces in such a limited area? And the fan in the cooler would have to be much smaller, so cooling really suffers. These are probably the reasons why low profile cards have traditionally been limited to 64 and 128 bit memory bus widths.

It is what OEMs do, so why should I not be able to buy a thinner version of a Fermi from a reputable dealer to put into my slim line case? NVidia could require a minimal reference design. You don't open a packet at both ends!

It is up to Europe to determine what is brought into the continent, and the same applies everywhere else. If we don't want cheap rubbish, legislate rather than blockade. If NVidia are worried about having their reputation tarnished then it's up to them to set minimum standards.
Heatsink design is not GPU design - they don't make tyres in an engine factory.
Besides, if a manufacturer can alter the card by using a better heatsink and fan, then why limit the board dimensions or voltages?
Why can Asus use better capacitors to allow more OC room? Why not use smaller or fewer capacitors on a Green board that uses lower voltages?
It's not as if such cards would no longer fit a PCIE slot.

And to make one thing clear: I don't think the current Fermi offerings are the best we can and should get. I also want higher efficiency, more choices and better cooling. But I also think you're barking up the wrong tree here - the reason for the currently limited amount of choices is not nVidia being stupid or evil, but rather them being profit oriented, so they try to get the most money out of the chips that they have.

MrS
____________
Scanning for our furry friends since Jan 2002

Profile liveonc
Avatar
Send message
Joined: 1 Jan 10
Posts: 292
Credit: 41,567,650
RAC: 0
Level
Val
Scientific publications
watwatwatwatwatwat
Message 17530 - Posted: 5 Jun 2010 | 19:06:36 UTC - in response to Message 17529.

The question then is what is green? Some say what shade of green, but I'd like to ask what green is. Is green cheap? How can green be cheap if it's supposed to be an investment, something that'll make up for the "added costs" of being green? If it's tomorrow that you'll save, then today you'll pay more. Efficient PSUs cost more than inefficient PSUs. So when a "green" GPU costs less than a "non-green" GPU, it makes me wonder. Just because it's slow doesn't make it green, & if the power savings mean more time spent on my side, then my CPU, HDD, RAM, & the rest of my PC isn't going to be "greener" just because my GPU is "slower".

Did they use recycled parts to make that "green" GPU? Is there any "green" funeral plans for my GPU when it dies? Do "green" GPU's go to Heaven?

So as far as I see, a really "green" GPU is: Expensive, made of high quality/high efficiency "recycled" components that in turn itself can be "recycled", & will only live as long as it's best for me to have it around, because an old senior GPU isn't going to be very "green".
____________

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 17531 - Posted: 5 Jun 2010 | 21:26:49 UTC - in response to Message 17530.

The question then is what is green?


To me that's quite clear: a green edition uses less power and achieves a higher power efficiency. It's got nothing to do with its price and the rest of the PC. If it's any different from other PCBs regarding recycling then it's probably inferior. So I wouldn't want that, but in the end it's not us who define which card a manufacturer declares as green. Currently they don't give a ***** about that ;)
And regarding high efficiency components: they're already using them, at least on the high end. The reason is simple: both ATI's and nVidia's current high end chips are power constrained, i.e. the designs could take higher voltages & clocks and thus get more performance out of the same silicon (same production cost), but power consumption, heat & noise would rise painfully. And nVidia is being beaten quite a bit for the high power consumption of Fermi. So if it was as simple as using 1$ more expensive voltage converters - I'm almost sure they'd already have done so. The custom HD5870 designs, which presumably use even higher quality components, all feature worse energy efficiency.

MrS
____________
Scanning for our furry friends since Jan 2002

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 17533 - Posted: 6 Jun 2010 | 16:43:54 UTC - in response to Message 17531.
Last modified: 6 Jun 2010 | 16:53:14 UTC

When I spoke of Green cards, I was only talking about using the forthcoming GF104 flavours of Fermi, with potentially 256 shaders. Not the GF100 monsters! The potential GTX440 could be a candidate.

PS. Galaxy made a dual GTX470 prototype,

http://hothardware.com/News/Computex-2010-Dual-GPU-GTX-470-Videocard-Spotted/
Potentially a good card for GPUGrid clustering; 896 shaders.

The 8-phase VRM says a lot!

Doing it Wrong

Two wrongs...

Profile liveonc
Avatar
Send message
Joined: 1 Jan 10
Posts: 292
Credit: 41,567,650
RAC: 0
Level
Val
Scientific publications
watwatwatwatwatwat
Message 17534 - Posted: 6 Jun 2010 | 17:28:41 UTC - in response to Message 17533.
Last modified: 6 Jun 2010 | 17:29:09 UTC

Looks like Galaxy didn't want to wait for Nvidia. Looks like they linked it with a Lucid Hydra alternative. If they use this, it just might work.
____________

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 17535 - Posted: 6 Jun 2010 | 19:55:15 UTC - in response to Message 17534.
Last modified: 6 Jun 2010 | 20:19:18 UTC

Several manufacturers are now producing motherboards with 7 PCI-E slots,


http://www.atomicmpc.com.au/Gallery/171976,gigabyte-x58a-ud9-motherboard.aspx
http://www.gigabyte.com.tw/products/product-page.aspx?pid=3434#dl

Now all we need are inexpensive, cool, low power, slim-line (1 slot wide) GF104 Fermis to fill up the board with :p

PS. That dual GTX470 might actually be a reference design!
I'm guessing you will see a dual GTX460 before the dual GTX470; they may produce a dual GTX460 with 512 cores, as these will be able to use higher shader and RAM clocks first, and be NVidia's fastest card, before going for a dual GTX470 or even 480 when they move to GF102, should they manage to reduce the heat & power somehow. But they will need to be quick! So don't disregard the possibility of a dual GTX460 within 6 weeks, or a dual GTX470 (which would outperform a reference 5970 in many ways) before the end of the year.
It is strange that there is so little confirmed info on the GTX460, given that it is about a month away from release. One report I read said 240 shaders, and then confirmed it when questioned. Most say 256, but some even say 300-odd with redesigned ratios, so who knows; they can't all be correct!

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 17539 - Posted: 6 Jun 2010 | 21:54:41 UTC - in response to Message 17533.

When I spoke of Green cards, I was only talking about using the forthcoming GF104 flavours of Fermi


LOL! Well, then the discussion was a little senseless. I fully agree that GF104 (whatever it ends up being) is the chip to use in a power consumption optimized card. Production cost & quantity should be much less critical for this one, and cards with a single 6 pin PCIe connector should well be possible. However, the chip is not even announced yet, so there's not much point discussing whether nVidia forbids their partners green versions of it or not :p

And, yes, that dual GPU monster is what I saw, together with a notion of 430 W TDP. No thanks. For me, anyway.

MrS
____________
Scanning for our furry friends since Jan 2002

Profile Beyond
Avatar
Send message
Joined: 23 Nov 08
Posts: 1112
Credit: 6,162,416,256
RAC: 0
Level
Tyr
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 17725 - Posted: 28 Jun 2010 | 15:33:29 UTC

Here's a provocative yet interesting article concerning the GF104 - GF108 chips. Take it with a grain of salt as SemiAccurate sometimes lives up to its name:

http://www.semiaccurate.com/2010/06/21/what-are-nvidias-gf104-gf106-and-gf108/

Profile liveonc
Avatar
Send message
Joined: 1 Jan 10
Posts: 292
Credit: 41,567,650
RAC: 0
Level
Val
Scientific publications
watwatwatwatwatwat
Message 17726 - Posted: 28 Jun 2010 | 16:50:42 UTC - in response to Message 17725.
Last modified: 28 Jun 2010 | 16:55:20 UTC

I beg to differ. Becoming general purpose does lose out on specific tasks. But just as PCI-E Cell processors are on the way out, general purpose GPUs are on the way in. Skgiven even posted about a PCI-E 1X 220GT, with plans for a PCI-E 1X 240GT. As CUDA matures & more people use it, they'd have a market for more than just graphics &, BTW, where is ATI??? Sooner or later there might not even be a point in having that VGA, DVI, or HDMI in the back, because nobody buying these cards uses them.
____________

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 17727 - Posted: 28 Jun 2010 | 17:43:25 UTC - in response to Message 17726.

That report is a bit crazy, even for SemiAccurate.
There is little point starting a GF104 report by comparing GF100 architectures to ATI (it’s been done, repeatedly).

“Cutting down an inefficient architecture into smaller chunks does not change the fundamental problem of it being inefficient compared to the competition”.
How could they know? They don't have one to compare, and they don't even know the GF104 architecture! I think NVidia did a little more than trim the transistor count. Smaller chips, faster frequencies, less leakage; the GF104 should do well, as will their lesser cards in the mid-range market.

Saying things like, “the GF104/106/108 chips are a desperate stopgap that simply won't work” is just wrong. So as suggested, I will take that report with a pinch of salt.

I’m expecting a competitive variety of smaller Fermi cards with higher frequencies and at lower prices. We are soon to see the second phase of Fermi. No doubt over time there will be more revisions (improvements) and increased variety.
In the very long run (perhaps a year or two), a modified Fermi design will move to 28 or 32nm. This will inherently overcome many of the present off-putting aspects of the architecture, should yields be sufficient.
Anyway, I look forward to the first test review of a GF104 card.

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 17738 - Posted: 28 Jun 2010 | 21:14:48 UTC
Last modified: 28 Jun 2010 | 21:32:43 UTC

OMG... every time I read an article about nVidia by Charlie I get an urge to punch him in the face. Real hard. Afterwards I want him to write "I will use my brain before I post articles on the internet" on every blackboard I can find. A hundred times, at least.

But seriously.. this report is so bad and apparently so full of blind hatred that it's not even worth going into detail about where he's wrong. I'm not sure anyone had the patience to read it through anyway.

He does point out something worthwhile, though: things are not looking too good for the smaller Fermis. By now we should have seen more of them at Computex. And if nVidia really didn't fix the via issue.. well, they're (apparently) pretty dumb, since ATI showed them how to do it a long time ago.

That report is a bit crazy, even for SemiAccurate.

Yeah.. what he actually reports is fine, but all the "blubb blubb" and "blah blah" he adds around it is so full of misunderstandings, or maybe deliberate misinterpretations.. it's disgusting.

Edit: read it again. Well.. maybe it's not as bad as I said. Some statements are actually more careful than they appear at first sight - if one takes a closer look.

MrS

(disclaimer: my main work horse is an ATI)
____________
Scanning for our furry friends since Jan 2002

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 17748 - Posted: 29 Jun 2010 | 8:05:16 UTC - in response to Message 17738.
Last modified: 29 Jun 2010 | 9:04:34 UTC

I stopped reading before I got half way through the assault. It does improve, a bit, but he also contradicts himself several times; so he is only wrong half the time :) You just have to work out which half! You might as well do the report yourself, without the shouting and crazy hatred. I guess NVidia didn't send him a free Fermi, so now he does not want a cut-down version, and he is telling everyone, loudly! We can only speculate on how right he is about NVidia's finances and the yields, but both are generally considered to be not great. This has been known for some time.

There is an ‘alleged’ pic of the GF104 chip here, http://forums.techpowerup.com/showthread.php?t=124943
Several sites are now reporting 336 shaders, higher clocks (675MHz is about 11% over the 608MHz of the GTX470 and GTX465) and potential design improvements. Should they manage to get the TDP down to 130-150W then a dual card would make perfect sense (though I think these might come a bit later).

http://www.tomshardware.co.uk/gtx-460-fermi-gf104,news-33758.html
I expect these cards will be excellent for GPUGrid, especially if the shaders overclock to anywhere near the reported 1660MHz. Two of them should outperform a GTX480 for games and probably cost about the same. For most people the existing Fermi cards are too expensive to purchase, and to run, so these cards would be welcomed by many people wanting to crunch here. If a GTX460 can match the performance of a GTX285 (TDP 204W) and does turn up with a TDP of 130-150W, then that is still a big improvement for NVidia. Compared to ATI cards, they might not be very competitive, but some competition is better than none!

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 17752 - Posted: 29 Jun 2010 | 10:19:28 UTC - in response to Message 17748.

This looks interesting,
http://www.gnd-tech.com/main/content.php/238-ASUS-GTX-465-SoftMod

Profile liveonc
Avatar
Send message
Joined: 1 Jan 10
Posts: 292
Credit: 41,567,650
RAC: 0
Level
Val
Scientific publications
watwatwatwatwatwat
Message 17758 - Posted: 29 Jun 2010 | 16:19:45 UTC - in response to Message 17752.
Last modified: 29 Jun 2010 | 16:43:05 UTC

Looks very interesting. Nvidia can't shoot themselves in the foot, but Asus can shoot Nvidia in the foot. Not legally, but since Asus sells both Nvidia & Ati, the bottom line being to sell Asus cards, as long as Nvidia can't sue Asus for shooting them in the foot, then it's OK??? ;-) It's good for consumers, if it's true, & I hope other manufacturers follow in creative ways of shooting both Nvidia & Ati in their feet.

So the SoftMod increased the amount of CUDA cores from 352 to 448. This caused an increase in pixel fillrate and texture fillrate. Normally doing such a thing would be impossible since the additional clusters of CUDA cores/texture units/raster operators are filled with silicon, rendering them unusable. This is clearly not the case with the ASUS GTX 465 and many other GTX 465s.

The ASUS GTX 465 also comes with 10 memory chips, only 8 of which are activated. Normal GTX 465s have the additional chips removed. The 8 chips mean there is 1024 MB of usable VRAM. If you unlock all 10 memory chips by flashing the BIOS, you will have 1280 MB VRAM just like the GTX 470.


Just call me paranoid & a nut, I don't care. I even said that the only builds that fall nicely are those you plan to take down, & that poppies are worth much more than oil.

The only problem with shooting everybody in the feet is if Intel is the one doing all the shooting of Nvidia & Ati in their feet, so often & so much, that only Intel survives. Then everybody wins, only so that everybody loses in the end.
____________

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 17761 - Posted: 29 Jun 2010 | 20:09:31 UTC - in response to Message 17758.

Liveonc, you really don't like feet, do you? :D

Regarding the unlocking: very interesting that this can be done via software. I wonder if nVidia allows it deliberately to boost sales and the perception of the GTX465 and Fermi in general? And if Asus went full force they'd actually give us a tool where you can switch each shader cluster on/off individually. If the first 96 don't work, just try the other ones. Or just 64 - better than none at all. There might be technical reasons forbidding this, though.

@SK: there's also the rumor that the 336 shaders are an artefact of the identification tool not knowing the chip and that it's actually 384. And that chip should definitely be able to hit 150 W and a little below, though I don't suppose the fully fledged, "high"-clocked (1.35 GHz) version will. We should see in due time!
And a dual chip card should definitely be possible. The only question is clocks and therefore performance, as the TDP will surely scratch 300 W.

MrS
____________
Scanning for our furry friends since Jan 2002

Profile liveonc
Avatar
Send message
Joined: 1 Jan 10
Posts: 292
Credit: 41,567,650
RAC: 0
Level
Val
Scientific publications
watwatwatwatwatwat
Message 17762 - Posted: 29 Jun 2010 | 20:26:58 UTC - in response to Message 17761.
Last modified: 29 Jun 2010 | 20:49:08 UTC

But I do beg to differ. If Asus officially supported this, they'd have to guarantee at least a possibility of success, & would Nvidia be interested in selling GTX470 chips to be sold as GTX465s? If Asus guarantees anything & it doesn't deliver, there'll be RMA hell. If Nvidia does it unofficially, what's the point in keeping it shady, & where do they save or gain from using 10 memory chips instead of 8 when only 8 are used? Of course they didn't put silicon in to block the unused clusters, but is it even up to Nvidia to do it, if some partners do it while others don't? Where would they gain from increased sales of the GTX465 if it decreases sales of the GTX470/480, unless they have too many GTX465s & too few GTX470/480s?

Of course it could all be just a lie designed to promote sales of the Asus GTX465. I don't own an Asus GTX465, so I can't know if it's possible or not, but just reading that someone says it's possible, might motivate me to buy an Asus GTX465, if I can save money, & then I'll know if it's true or BS...
____________

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 17763 - Posted: 29 Jun 2010 | 21:17:44 UTC

I understood the article to mean that the soft mod was provided by Asus, but obviously without guarantee. This would be the only way that putting 2 more memory chips on there made any sense.

But the article does not actually state this and is suspiciously vague about the origin of the soft mod. They could have just edited the screenshot, made up the benchmarks and shown a photo of a naked GTX470. I couldn't tell the difference.

MrS
____________
Scanning for our furry friends since Jan 2002

Profile liveonc
Avatar
Send message
Joined: 1 Jan 10
Posts: 292
Credit: 41,567,650
RAC: 0
Level
Val
Scientific publications
watwatwatwatwatwat
Message 17764 - Posted: 29 Jun 2010 | 21:54:08 UTC - in response to Message 17763.
Last modified: 29 Jun 2010 | 21:56:22 UTC

I guess you're right that it was Asus. I didn't bother to follow the Chinese link http://translate.google.com/translate?js=y&prev=_t&hl=da&ie=UTF-8&layout=1&eotf=1&u=http%3A%2F%2Fwww.chiphell.com%2Fportal.php%3Fmod%3Dview%26aid%3D57%26page%3D7&sl=zh-CN&tl=en It was Asus itself that offered two different versions of the GTX465. So the extra memory is paid for, & so is the card with the potential. The last time I heard of SoftMod being used was to change an 8800GT to a QUADRO FX 3700 or to change an 8800GT to a 9800GT. I didn't understand why they said you wouldn't lose your warranty, as it's riskier to SoftMod than to flash. But it's only the software part of unlocking that the Chinese site talks about, & not flashing a GTX470 BIOS to the GTX465.
____________

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 17765 - Posted: 29 Jun 2010 | 22:50:35 UTC - in response to Message 17764.
Last modified: 29 Jun 2010 | 23:09:29 UTC

I think a software mod is normally much less risky than a flash; if a flash goes wrong your card may not work at all! But to use all the memory chips you would need to flash, and they did not report any findings from a flashed card, but rather suggested it would be the same as a GTX470.

I expect there are many GPUs that don't quite make the cut as GTX470s and end up as GTX465s. Intel have underclocked CPUs in the past just to sell them as lesser CPUs for market segmentation reasons, so I guess that is a possibility (to create a range of cards), as is upping the reputation of the GTX465 as modifiable. This also reminds me of some AMD CPUs that could have their disabled cores switched back on (X3 to X4).

A while back I read reports of 384 cores for the GTX460, but a GTX465 only has 352 cores! Even for NVidia it would be awkward to have a GTX460 that completely outperforms a GTX465. If the GF104 does have 384 shaders, perhaps they will disable it down to 336 CUDA cores for the GTX460, and keep the best chips for a dual card; to square up to ATI's 5970 dragon (it also breathes fire), or whatever turns up in the near future. We are of course still missing a GTX475, but I expect that might not turn up until late next year (similar to the 65nm to 55nm move).

With only 2 weeks to go before the release starts, we will not have to wait long for confirmation. Should be rather interesting...

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 17773 - Posted: 30 Jun 2010 | 16:44:54 UTC - in response to Message 17765.

WRT the GTX 460 (release date 12th July), and crunching here at GPUGrid, how much difference would the extra memory bandwidth of the 256-bit GDDR5 version (roughly a third more than the 192-bit version) actually make?

If the price difference is not too much, I would say get the wider version, as it will have more potential to be sold on, at a later date.
If the price difference is significant, say 25%, and the extra RAM makes little difference when crunching here, then it would probably be best to keep the extra money in your pocket.

It will be interesting to find out why the 192-bit version actually exists. I hope it is not just to have a 150W version, rather than the 'now speculated' 160W of the wider card.

When you think that a GT240 with only 96 shaders has at least half a gig of RAM to support it, a Fermi with 336 cores only being supported by 768MB seems rather weak.
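
For the bandwidth question above, the arithmetic is simple enough to sketch in Python; this assumes both boards run their GDDR5 at the 900 MHz quoted earlier and ignores the extra ROPs and L2 of the 1GB card:

# Bandwidth of the two GTX460 flavours, assuming 900 MHz GDDR5 on both.
def gddr5_bandwidth_gb_s(bus_bits, mem_mhz):
    return bus_bits * mem_mhz * 4 / 8 / 1000   # quad-pumped, bits -> bytes

bw_192 = gddr5_bandwidth_gb_s(192, 900)   # ~86.4 GB/s (768 MB card)
bw_256 = gddr5_bandwidth_gb_s(256, 900)   # ~115.2 GB/s (1 GB card)
print(f"{bw_256 / bw_192 - 1:.0%} more bandwidth on the 256-bit card")  # ~33%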

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 17853 - Posted: 3 Jul 2010 | 20:21:03 UTC

Giving it 384 cores and disabling at least one shader cluster probably makes sense to improve yields.

And it's hard to predict the impact of a 256 vs. 192 bit memory bus. They probably did it because some chips have defects within the back end containing the memory controllers. It's odd, though, that both the 192 and 256 bit versions are said to feature the same high-end version of the GF104 chip. Normally you'd want to balance raw horse power and memory bandwidth, so you'd pair the 192 bit version with either lower clocked or cut-down chips.

And the amount of memory doesn't really matter, as long as it's high enough. And low end chips have traditionally had lots of RAM due to simple reasons:
- they use slow RAM, which is cheaper
- the target audience is more likely to be impressed by big numbers

The high end cards need more memory to push higher resolutions and to some extent to push more demanding games. But more shaders do not automatically require more RAM, luckily - otherwise we'd be screwed ;)

The thing is that anything GPUs can do well has to be massively parallel, i.e. you can cut the task into many independent pieces, throw them all at the shaders and collect the results at some point. And for memory usage it does not really matter how many of these pieces you can compute at any single time, i.e. how many shaders you have (it matters for the caches, though), as all results have to be collected in memory at some point anyway.

This argument loses validity if you start to execute different programs simultaneously - something Fermi can do, but which is so far completely unknown in the BOINC world.

MrS
____________
Scanning for our furry friends since Jan 2002

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 17856 - Posted: 3 Jul 2010 | 23:26:31 UTC - in response to Message 17853.

On my GTX470 I ran a Folding WU and a GPUGrid WU at the same time! Just one of each, mind you :) Then I realised GPUGrid was back online. The Folding WU was a Beta app run from a command console. So it's not just theory. Both tasks finished OK.

I read that there are two grades of GDDR5; one that is only rated for up to 4000MHz and one that goes up to 5000MHz (effective, so quartered really). I wonder if NVidia will use the 4000 grade with the 192-bit version and the 5000 grade with the 256-bit version of the GTX460 (the OC'ers card). We will know in about 8 days.

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 17873 - Posted: 4 Jul 2010 | 20:15:31 UTC - in response to Message 17856.
Last modified: 4 Jul 2010 | 20:17:30 UTC

Oops, you're right: I only had in mind that Fermi can split shader clusters between kernels / programs. Which might at some point lead to much better overall GPU utilization and the inclusion of several of the "low utilization" projects like Einstein besides heavy hitters like GPU-Grid. In that case you really do need to keep them all in memory simultaneously.
However, you can currently also run several programs. They share time slices between each other and in sum are not any faster (if not slower) than if run one after the other. You don't have to run them like that.. but it doesn't reduce memory requirements if you do :p

MrS
____________
Scanning for our furry friends since Jan 2002

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 17916 - Posted: 10 Jul 2010 | 17:22:50 UTC - in response to Message 17873.

The GTX460 is due out in 2 days and some sneak peeks are starting to appear,
http://www.overclockers.com/wp-content/uploads/2010/07/slide2.jpg
http://www.overclockers.com/wp-content/uploads/2010/07/msi_gtx460.jpg

Chip clock: 675MHz • Memory clock: 900MHz • Shader clock: 1350MHz • Chip: GF104 • Memory interface: 192-bit • Stream processors: 336 • Texture units: 42 • Manufacturing process: 40nm • Max. power consumption: 150W • DirectX: 11 • Shader model: 5.0 • Construction: dual-slot • Special features: HDCP support, SLI

If the prices are correct (From 175 Euro inc) then it looks like a much better card than the GTX465 in terms of value for money. The performance is about 3% shy of the GTX465, but it uses 50W less power at stock and should clock much higher.

I would still prefer the 1GB version, and I am interested in seeing if these will actually work here at GPUGrid, straight out of the box (given the architecture changes).

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 17925 - Posted: 11 Jul 2010 | 17:15:05 UTC - in response to Message 17916.

So much for Reference speeds,

Palit NE5X460HF1102 GeForce GTX 460

Chipset Manufacturer NVIDIA
GPU GeForce GTX 460 (Fermi)
Core Clock 800MHz
Shader Clock 1600MHz
Stream Processors 336 Processor Cores

That factory OC'd card is 18.5% faster than the reference design, which would of course make it about 15% faster than a GTX465 (going by some reviews).
This card design reminds me of some of the better GT240 designs.

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 17926 - Posted: 11 Jul 2010 | 18:50:30 UTC - in response to Message 17925.

Haha - that clock sure sounds nice! I'd expect it to exceed 150 W at these settings, but it's probably well worth it. It's also nice that the card is the 1 GB 256 bit version. I suppose this is really the one to get.

And I'm curious about the number of TMUs. I'd expect there to be 50+ rather than 42, but it all depends on what nVidia decided some months ago.

MrS
____________
Scanning for our furry friends since Jan 2002

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 17930 - Posted: 12 Jul 2010 | 7:46:56 UTC

The GTX460 review is up! The abstract sounds positive for a 200$ card. Too bad I don't have time to read it right now, gotta go to work *doh*

And it's got 56 TMUs :)

MrS
____________
Scanning for our furry friends since Jan 2002

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 17931 - Posted: 12 Jul 2010 | 12:08:15 UTC - in response to Message 17930.

Ryan Smith's review, http://www.anandtech.com/show/3809/nvidias-geforce-gtx-460-the-200-king

Profile liveonc
Avatar
Send message
Joined: 1 Jan 10
Posts: 292
Credit: 41,567,650
RAC: 0
Level
Val
Scientific publications
watwatwatwatwatwat
Message 17935 - Posted: 12 Jul 2010 | 15:23:15 UTC - in response to Message 17931.

I know what "I'd" do with an 8-SM-enabled GTX460! I'd launch a GTX465 (384) LOL!
____________

michaelius
Send message
Joined: 13 Apr 10
Posts: 5
Credit: 2,204,945
RAC: 0
Level
Ala
Scientific publications
watwatwatwatwatwat
Message 17937 - Posted: 12 Jul 2010 | 19:24:14 UTC - in response to Message 17935.

Anyone know how much faster an 800 MHz GTX 460 is compared to a GTX 260?

Theoretically that's a 25% increase from GPU clocks and 50% more SPs.

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 17939 - Posted: 12 Jul 2010 | 23:05:28 UTC - in response to Message 17937.

The GTX 460 design is somewhat different from the previous Fermi cards, so we don't really have an accurate reference to go by. Compared to a GTX260, my GTX470 @ 707MHz (WinXP) can get about 65K credits per day, while my GTX260 (WinXP) could in theory only get about 40K per day, overclocked (core 625MHz, shaders 1525MHz, RAM 1100MHz).
My guess is that a GTX460 @ 800MHz could reach about 55K per day, all being well. That is smack in the middle of the 25% to 50% you mentioned. However no-one has tested one here yet, so we don't even know for sure that it will work! In theory it should, but there could be issues; the shaders are controlled in three groups, so it might work more like a card with 256 shaders than a card with 336 shaders (under crunching conditions), or not?
It is also a bit early to be benchmarking Fermi cards here. They will eventually benefit from CUDA updates, and there is another Beta CUDA version for developers released for assessment (3.2). Perhaps in a month or two we will see driver and app improvements that will make the Fermi cards look more attractive.
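
For what it's worth, here is the naive shader-count × shader-clock scaling as a quick sketch; it assumes a 216-shader GTX260 at the overclocked 1525 MHz quoted above (an assumption on the shader count), and the result should be read strictly as an upper bound, since architecture and scheduler differences will pull the real number down:

# Naive peak-throughput scaling only - real GPUGrid output will land below
# this, because the GF104 scheduler has to keep 3 shader blocks fed.
def relative_peak(shaders, shader_mhz):
    return shaders * shader_mhz

gtx260 = relative_peak(216, 1525)   # assumed 216-shader card at the OC above
gtx460 = relative_peak(336, 1600)   # 800 MHz core -> 1600 MHz shaders

scale = gtx460 / gtx260
print(f"naive scaling factor: {scale:.2f}x")                        # ~1.63x
print(f"upper bound from ~40K/day: {40_000 * scale:,.0f} credits")  # ~65K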

bigtuna
Volunteer moderator
Send message
Joined: 6 May 10
Posts: 80
Credit: 98,784,188
RAC: 0
Level
Thr
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwat
Message 17946 - Posted: 14 Jul 2010 | 3:13:59 UTC - in response to Message 17939.

I'm expecting a GTX 460-768-OC in a couple of days.

They were out of the 1GB version so the baby 768MB version is on the way.

The 1GB version has a wider memory bus, more RAM and more ROPs (whatever they are), so it is said to be worth the extra cost.

Reviews of the card have been very positive.

I'm torn between GPUGRID and Folding...

The 9800GT cards are Folding because they are better at Folding.

The ATI card are Folding because you can't GRID with ATI cards.

The GT240 cards are GRIDing because they are better at GRIDing.

The GTX460 is going to replace an ATI HD4850 which is currently Folding. If the 460 is used for Folding more CPU time will be available for Rosetta (since nVidia cards use less CPU time than ATI cards). Folding and Rosetta scores would both go up.

Suggestions?

Wait till the 1GB version is available and buy one for GRIDing and one for Folding you say?

Good idea!

Snow Crash
Send message
Joined: 4 Apr 09
Posts: 450
Credit: 539,316,349
RAC: 0
Level
Lys
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 17947 - Posted: 14 Jul 2010 | 5:26:53 UTC - in response to Message 17946.

I would suggest GPUGrid all the way! If you could at least run GPUGrid for a few days so we can get a good idea of how they perform for real that would be great.
____________
Thanks - Steve

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 17954 - Posted: 14 Jul 2010 | 13:42:48 UTC - in response to Message 17947.

I can run Folding and GPUGrid tasks at the same time on my GTX470.

Perhaps you will be able to do the same on your GTX460.

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 17962 - Posted: 14 Jul 2010 | 20:58:01 UTC

Definitely try both.. or take a look around to see where they're more efficient. Depending on the code, GF104 can perform like 336 shaders or like 224 shaders.

MrS
____________
Scanning for our furry friends since Jan 2002

Profile GDF
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Send message
Joined: 14 Mar 07
Posts: 1957
Credit: 629,356
RAC: 0
Level
Gly
Scientific publications
watwatwatwatwat
Message 17964 - Posted: 14 Jul 2010 | 22:15:58 UTC - in response to Message 17962.

Actually, I hope that since 4 of the 7 execution units can be used at once (instead of 2 of 6 on GF100), GF104 may also have better sustained performance relative to a GTX480 for well optimized code.

gdf

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 17974 - Posted: 15 Jul 2010 | 7:40:02 UTC - in response to Message 17964.
Last modified: 15 Jul 2010 | 7:51:57 UTC

Getting a sustained 2-way superscalar execution shouldn't be too hard, especially factoring in loads & stores. GPU-Grid would use the 3 blocks of 16 shaders, one block of 16 load/store units and sometimes the one block of 8 SFUs, right?

Edit: so for every load/store op you'd need to find at least (3 shader ops) or (2 shader ops plus 1 special function) and not more than (3 shader ops plus one special function). If there are no pipeline stalls (all data fetched into caches early enough) this could lead to sustained 4 ops/clock operation :)
All of these ops taken from 2 different warps, of course.
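
As a rough sketch of that bound (my own simplified model, not a simulator: it assumes the 16-wide blocks above take 2 clocks per warp-instruction, the 8-wide SFU block takes 4, and up to 4 warp-instructions can be dispatched per clock; the real scheduler is more involved):

# Back-of-the-envelope issue-rate bound for a GF104 SM:
# 3x16 ALU blocks, 1x16 load/store block, 1x8 SFU block.
def issue_bound(f_alu, f_ldst, f_sfu):
    # fractions of warp-instructions by type (must sum to 1)
    assert abs(f_alu + f_ldst + f_sfu - 1.0) < 1e-9
    bounds = [4.0]                               # dispatch limit per clock
    if f_alu:  bounds.append(3 / (f_alu * 2))    # 3 ALU blocks, 2 clocks per warp
    if f_ldst: bounds.append(1 / (f_ldst * 2))   # 1 LD/ST block, 2 clocks per warp
    if f_sfu:  bounds.append(1 / (f_sfu * 4))    # 1 SFU block (8 wide), 4 clocks
    warp_ipc = min(bounds)                       # warp-instructions per clock
    busy = warp_ipc * (f_alu * 2 + f_ldst * 2 + f_sfu * 4)  # busy blocks per clock
    return warp_ipc, busy

print(issue_bound(0.75, 0.25, 0.0))   # 3 shader ops per load/store -> (2.0, 4.0), i.e. 4 busy blocks/clock
print(issue_bound(0.50, 0.25, 0.25))  # add SFU work: the narrow SFU block limits it -> (1.0, 2.5)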

MrS
____________
Scanning for our furry friends since Jan 2002

Profile GDF
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Send message
Joined: 14 Mar 07
Posts: 1957
Credit: 629,356
RAC: 0
Level
Gly
Scientific publications
watwatwatwatwat
Message 17977 - Posted: 15 Jul 2010 | 13:53:36 UTC - in response to Message 17974.

Getting a sustained 2-way superscalar execution shouldn't be too hard, especially factoring in loads & stores. GPU-Grid would use the 3 blocks of 16 shaders, one block of 16 load/store units and sometimes the one block of 8 SFUs, right?


Yes.


gdf

bigtuna
Volunteer moderator
Send message
Joined: 6 May 10
Posts: 80
Credit: 98,784,188
RAC: 0
Level
Thr
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwat
Message 18003 - Posted: 16 Jul 2010 | 20:36:44 UTC

So it looks like Revenarius could not get GPUGRID to work with a 460:

http://www.gpugrid.net/forum_thread.php?id=2227#18002

Has anyone else tried or is it known that the 460 is not going to be able to GRID at this time?

Profile Beyond
Avatar
Send message
Joined: 23 Nov 08
Posts: 1112
Credit: 6,162,416,256
RAC: 0
Level
Tyr
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 18006 - Posted: 16 Jul 2010 | 21:01:02 UTC - in response to Message 18003.

Yep, tried 3 projects. See the GTX 460 thread for more info...

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 18246 - Posted: 5 Aug 2010 | 14:55:59 UTC - in response to Message 18006.

A Fermi with all 512 shaders?

http://en.expreview.com/2010/08/04/exclusive-the-design-card-and-testing-of-geforce-gtx-480-preview/8878.html

This is supposed to be a GTX480 with 512 shaders. Although this report could be a fake, which would discredit the website, it's just as likely to be a sample card, with GPU-Z simply not having the card in its database to reference. Although a 512-shader GF100 is expected, it's more likely to be called a GTX485, and it's worth noting that not all sample cards make it to production, and when they eventually do get mass produced there can be differences. If this was destined for the more expensive version of the Tesla, then it's lacking some GDDR, which would also need to be ECC.

While there has been nothing official, I can't see NVidia wanting to make a song and dance about a GF100 with 512 shaders - the words late and hot would once again make headlines; better to slip it out the back door as a golden sample or limited edition, possibly along with some other release.

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 18253 - Posted: 5 Aug 2010 | 22:34:20 UTC

Fake or not, it makes a lot of sense: if the cards are different, nVidia will give them the same number (GTX260, GTX460 etc.). They save the different numbers for similar cards ;)

MrS
____________
Scanning for our furry friends since Jan 2002

michaelius
Send message
Joined: 13 Apr 10
Posts: 5
Credit: 2,204,945
RAC: 0
Level
Ala
Scientific publications
watwatwatwatwatwat
Message 18265 - Posted: 7 Aug 2010 | 9:55:57 UTC

I've heard this PCB might be the MSI Lightning version of the 480.

Now it would be really cool if nvidia allowed some partners to make limited runs of 512 SP cards without doing an official release.

MarkJ
Volunteer moderator
Volunteer tester
Send message
Joined: 24 Dec 08
Posts: 738
Credit: 200,909,904
RAC: 0
Level
Leu
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 18266 - Posted: 7 Aug 2010 | 10:25:05 UTC - in response to Message 18003.

So it looks like Revenarius could not get GPUGRID to work with a 460:

http://www.gpugrid.net/forum_thread.php?id=2227#18002

Has anyone else tried or is it known that the 460 is not going to be able to GRID at this time?


Works alright for me, if a little slow.
____________
BOINC blog

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 18275 - Posted: 7 Aug 2010 | 18:29:58 UTC - in response to Message 18265.

I've heard this pcb might be msi lightning version of 480.


That would make sense since 6 chips on the front and 6 on the back match well with a 384 bit bus.

@MarkJ: that post was a little old. So far the GTX460 runs, but not very efficiently. Possible improvements are expected in September, if everything goes well.

MrS
____________
Scanning for our furry friends since Jan 2002

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 19114 - Posted: 30 Oct 2010 | 12:09:08 UTC - in response to Message 18275.

No thanks to a 3-month driver gap, the GTX460 still runs inefficiently on Windows, and, along with the GTS450, not at all on Linux. However this will hopefully change next week with the new software, and the release of 2 drivers, a week apart. A bit of skulduggery perhaps, but these drivers don't work well with the 200-series cards, so avoid them unless you have a Fermi. Whatever the motive, it looks like there is a push to get things working properly for the Fermi cards.

To allude to a possible positive motive, the 261.00 driver seems to have been pulled for revealing a series name jump in the form of a GTX580, along with other disconcerting naming/sequencing info; apparently the GTS455 will have fewer CUDA cores (shaders) than the GTS450!? We'll see.

Let's hope these were typos, but despite the names another couple of non-OEM Fermis might be useful here, especially if the GF110 revision brings any performance improvements. As always I would suggest people wait until these are confirmed as being well supported and their crunching performance reported before buying one. Release dates are supposed to be in Nov.

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 19116 - Posted: 30 Oct 2010 | 13:28:43 UTC - in response to Message 19114.

LOL .. that naming mess would be so nVidia!

MrS
____________
Scanning for our furry friends since Jan 2002

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 19409 - Posted: 10 Nov 2010 | 11:32:37 UTC - in response to Message 19116.
Last modified: 10 Nov 2010 | 11:34:22 UTC

NVidia seem to have released the GTX580 out of the blue, to compete against the top ATI cards about to be released. This sudden change of direction may have come at the expense of other planned releases. To my dismay I read a suggestion that the long-awaited GTX475 is not coming in the predicted form or timeframe. Instead it is to be renamed the GTX560 and, according to blogspot, it will be GF114-based rather than GF104-based.

If this is correct it would mean no GTX475 and a wait for the GTX560. Let's hope they are wrong and NVidia releases both, but if not, some manufacturers may still want to push out a few golden-sample GTX460s with a full complement of cores (shaders). A pair of such cards would just about match a GTX580, and one might match a GTX560; something you might think NVidia would not be too keen on, but if there is a limited supply of GTX580s they might allow it - and many retailers want you to pre-order the GTX580.

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 19436 - Posted: 11 Nov 2010 | 21:47:00 UTC

Difficult to say, but I have a hard time imagining nVidia rushing to replace GF104. It's not broken in the way GF100 was/is.

MrS
____________
Scanning for our furry friends since Jan 2002

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 19513 - Posted: 16 Nov 2010 | 19:02:48 UTC - in response to Message 19436.

Sounds like a GTX570 is on its confusing way:

It is reported to be turning up with 480 shaders, so it's really a GF110 version of a GTX480, with a better cooler.

http://www.fudzilla.com/graphics/item/20814-geforce-gtx-570-comes-soon

Profile Fred J. Verster
Send message
Joined: 1 Apr 09
Posts: 58
Credit: 35,833,978
RAC: 0
Level
Val
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 19516 - Posted: 16 Nov 2010 | 21:47:18 UTC - in response to Message 19513.
Last modified: 16 Nov 2010 | 21:48:02 UTC

It's about time NVidia got clear about what it's doing, or not.
This is all very confusing and looks like an attempt to keep ahead of ATI and the 6000-series GPUs; they have to make up their minds :(
____________

Knight Who Says Ni N!

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 19518 - Posted: 16 Nov 2010 | 23:11:13 UTC - in response to Message 19516.

Indeed, that's their game.

Not sure how their GeForce GTX 460 SE fits into the picture; maybe it's just a case of selling the chaff too. Just so people know, it only has 288 shaders and is relatively underclocked compared to the GTX 460. Even with the additional multiprocessor disabled it's no cheaper to run than a 768MB GTX460. Make your own mind up what the S stands for.

Still, for crunching here people are better off with a 448-CUDA-core GTX470 for £189 than with any GTX460, and when the GTX570 turns up it should outperform the GTX480.


ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 19532 - Posted: 17 Nov 2010 | 20:11:02 UTC

Well, nVidia decided that a small bump in performance and a somewhat larger bump in efficiency are worth a new generation tag. If you accept this, then the rumored GTX570 neither adds confusion nor will it be a bad card. It's going to be slower than a GTX580, probably by roughly the amount one would expect from the numbering, and cheaper. And it should outperform the GTX480.

Personally I don't see a problem with this, apart from the whole "5 series = 4 series" issue. But given it's nVidia, I'm actually glad they don't mix DX10 chips into the 4 and 5 series.

MrS
____________
Scanning for our furry friends since Jan 2002

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 19539 - Posted: 18 Nov 2010 | 0:36:07 UTC - in response to Message 19532.
Last modified: 18 Nov 2010 | 6:46:29 UTC

The GTX 560 will be GF114-based, with eight 48-shader multiprocessors (384 shaders in total), higher clocks (725MHz core, 1450MHz shaders) and faster RAM. Basically it will be the much-anticipated GTX475, but cooler and faster. Still, the GTX580 and GTX570 will do more work.

Profile GDF
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Send message
Joined: 14 Mar 07
Posts: 1957
Credit: 629,356
RAC: 0
Level
Gly
Scientific publications
watwatwatwatwat
Message 19550 - Posted: 19 Nov 2010 | 9:00:40 UTC - in response to Message 19539.

Yes,
remember that GPUs based on 48 cores per multiprocessor, like the 460, do not perform well. Most likely the most cost-effective GPU for running GPUGRID will be the GTX570.

gdf

Hypernova
Send message
Joined: 16 Nov 10
Posts: 22
Credit: 24,712,746
RAC: 0
Level
Pro
Scientific publications
watwatwatwatwatwat
Message 19551 - Posted: 19 Nov 2010 | 9:25:30 UTC - in response to Message 19550.

Excuse my ignorance, but say I put two Fermi boards in my machine without any SLI bridge - I mean just plug the two boards into their respective PCI-Express slots as two independent boards (no gaming use, just crunching). Will this be seen as two GPUs, and will I get 4 WUs at a time on my machine (instead of 2 with one GPU)? Is this correct?

The CPU share is displayed as 0.30, 0.20, etc. But if I have a multi-core CPU like the 980X (hexacore, 12 threads), does the GPU really need 30%, which is about 2 full cores, to manage one board? Then if I put in two boards do I monopolise 4 physical cores? That would be overkill. So what is the reality of the CPU share?

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 19557 - Posted: 19 Nov 2010 | 19:28:28 UTC - in response to Message 19551.
Last modified: 19 Nov 2010 | 20:16:25 UTC

You will get 4 tasks with two cards, because at GPUGrid you can only have 2 per card. Note that only one runs at a time per card; the others sit in the queue. You don't need an SLI bridge (in fact using one is generally not recommended).

The amount of CPU the GPU app needs varies with the application, the task being run and your system, in particular your CPU.

Generally, for a Fermi it is recommended to free up one CPU thread per GPU in BOINC Manager and to use swan_sync=0. For your W7 x64 system with 12 threads this would mean giving up 2 threads. If you don't do this, your would-be dual-Fermi setup would run about 40% less efficiently. As you are new to GPUGrid I would recommend sticking to this setup for now.

For a GTX260 this is not the case; you don't need to free up a CPU core or use swan_sync=0, and the share is around 0.3 threads/cores per GPU. In practice even that is an overestimate: with top-end CPUs like yours you will use less than 0.10 of a CPU thread per GPU.
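For anyone wondering how "free up one CPU thread per GPU" translates into BOINC's "use at most N% of the processors" preference, here is a minimal sketch of the arithmetic. It assumes the one-thread-per-GPU rule above; the function name and the rounding choice are purely illustrative, not anything built into BOINC.

    # Illustrative arithmetic only: convert "reserve one CPU thread per GPU"
    # into BOINC's "use at most N% of the processors" preference.

    def boinc_cpu_percentage(total_threads: int, gpus: int, threads_per_gpu: int = 1) -> int:
        """Return the largest whole percentage that still leaves
        threads_per_gpu CPU threads free for each GPU."""
        reserved = gpus * threads_per_gpu
        usable = max(total_threads - reserved, 0)
        return usable * 100 // total_threads   # round down so we never over-commit

    # Example from the post: a 12-thread i7-980X driving two Fermi cards.
    print(boinc_cpu_percentage(12, 2))   # 83 -> set "use at most 83% of processors"
    # Single Fermi in the same machine:
    print(boinc_cpu_percentage(12, 1))   # 91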

vaio
Avatar
Send message
Joined: 6 Nov 09
Posts: 20
Credit: 10,781,505
RAC: 0
Level
Pro
Scientific publications
watwatwatwatwatwatwat
Message 19560 - Posted: 19 Nov 2010 | 21:55:51 UTC - in response to Message 19557.
Last modified: 19 Nov 2010 | 21:57:10 UTC

Another naive cruncher chipping in.

From reading the above am I correct in thinking I could run 2 x GTX-460's independently without using SLI?

Does that mean any board with dual pci-e slots will do and doesn't need to be an SLI certified board?

Also, would a thread from an X4 620 or i3 530 be adequate?

As you can see I know little.....I just like to crunch!
____________
Join Here
Team Forums

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 19561 - Posted: 19 Nov 2010 | 22:09:58 UTC - in response to Message 19560.

From reading the above am I correct in thinking I could run 2 x GTX-460's independently without using SLI?


Yes. And it's actually the other way around: you couldn't make them work together on one WU even if you wanted to. SLI is meant for games; computing works in a very different way.

Does that mean any board with dual pci-e slots will do and doesn't need to be an SLI certified board?


Yes. Your main problem will more likely be supplying enough power to those beasts and cooling them. A PCIe slot which mechanically provides 16 lanes but "only" 4 electrically connected lanes is fine for GPU-Grid, even at PCIe 1.x (half the per-lane bandwidth of PCIe 2.0). Avoid x1 slots, though.
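As a rough illustration of why a x4 electrical slot is enough, here is a back-of-the-envelope bandwidth sketch. The per-lane figures (roughly 250 MB/s for PCIe 1.x and 500 MB/s for PCIe 2.0, per direction) are nominal numbers added for illustration, not something taken from the post.

    # Rough, nominal per-direction bandwidth of a PCIe link (MB/s per lane).
    PER_LANE_MB_S = {"PCIe 1.x": 250, "PCIe 2.0": 500}

    def link_bandwidth(gen: str, lanes: int) -> int:
        """Nominal one-direction bandwidth of a PCIe link in MB/s."""
        return PER_LANE_MB_S[gen] * lanes

    for gen in PER_LANE_MB_S:
        for lanes in (1, 4, 16):
            print(f"{gen} x{lanes}: ~{link_bandwidth(gen, lanes)} MB/s")
    # A PCIe 1.x x4 link (~1000 MB/s) is still plenty for GPUGRID-style
    # workloads, which keep most of the data on the card; x1 gets marginal.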

Also, would a thread from an X4 620 or i3 530 be adequate?


Yes.

MrS
____________
Scanning for our furry friends since Jan 2002

vaio
Avatar
Send message
Joined: 6 Nov 09
Posts: 20
Credit: 10,781,505
RAC: 0
Level
Pro
Scientific publications
watwatwatwatwatwatwat
Message 19562 - Posted: 19 Nov 2010 | 22:27:05 UTC - in response to Message 19561.

Thank you ETA.

Looks like I need to add another Corsair 850 to my shopping wishlist.
____________
Join Here
Team Forums

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 19565 - Posted: 20 Nov 2010 | 1:01:12 UTC - in response to Message 19562.
Last modified: 29 Nov 2010 | 17:15:45 UTC

With the release of the GTX580 and the planned release of the GTX570 (7th Dec), we are likely to see further price drops for the GTX480, GTX470 and GTX465. People should be aware that for GPUGrid the GTX470 will outperform the planned GTX560. So when a GTX560 does turn up it should work here, but not nearly as well as a GTX470, never mind a GTX570 or GTX580; a GTX460 gets about half the credits of a GTX480.

The GTX 570 has a 732MHz reference frequency, about 5% higher than the 700MHz reference of a GTX 480. However the GTX570 will use less power (225W vs 250W), will be quieter, and with a better-designed cooler it should overclock better. Its release price will also be lower than the release price of the GTX480, and will no doubt fall over time.

I remember seeing a dual-GTX470 prototype and think it is likely this will emerge in some form as a dual GTX570 (a GTX595 perhaps) next year, to challenge the big dual-GPU ATI cards of course. I have heard no mention of a GTX565, but I expect these could turn up in OEM form, with a further core disabled.

Expect the GTX480, GTX470 and GTX465 to be phased out over the next six months.

Hypernova
Send message
Joined: 16 Nov 10
Posts: 22
Credit: 24,712,746
RAC: 0
Level
Pro
Scientific publications
watwatwatwatwatwat
Message 19570 - Posted: 20 Nov 2010 | 12:36:16 UTC - in response to Message 19557.
Last modified: 20 Nov 2010 | 12:36:49 UTC

Thanks skgiven. All is clear now. I have two Asus Fermi GTX580 boards on order and am anxious to see how they will perform. I have one machine with a Corsair 1000W 80 Plus Silver rated PSU and an Asus Rampage III Extreme board that should handle two 580 boards very well. The CPU is a 980X with water cooling, plus 6GB of Corsair 2000MHz Dominator DDR3 memory. The disk is a Crucial C300 SATA 3, 256GB SSD. It will be my most powerful machine to date.
I will post when it is up and running.

Profile Beyond
Avatar
Send message
Joined: 23 Nov 08
Posts: 1112
Credit: 6,162,416,256
RAC: 0
Level
Tyr
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 19573 - Posted: 20 Nov 2010 | 18:46:45 UTC - in response to Message 19560.

Also, would a thread from an X4 620 or i3 530 be adequate?

An Athlon X4 620 would easily be more than enough and still have power left to drive CPU projects. I have an X4 620 in a machine with a GTX 460 and an HD 5770 (both heavily overclocked), running 2 GPU projects at 99% and 4 CPU projects at full speed; the whole box uses ~375 watts. It's been running on an Antec EarthWatts 500-watt PSU with no problems at all.

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 19713 - Posted: 29 Nov 2010 | 19:01:34 UTC - in response to Message 19573.
Last modified: 29 Nov 2010 | 19:05:28 UTC

Details of the GTX570 have been released/leaked:

    480 CUDA cores
    GPU clock @ 732 MHz
    Shaders @ 1464 MHz
    320-bit memory interface
    1280MB GDDR5
    RAM @ 3800 MHz
    TDP of 225W

So by clocks alone it's about 4.6% faster than a GTX480, and it uses 11% less power. It should also be cooler and quieter.
For comparison with a stock GTX470, it should be (732/607) x (15/14) times as fast - the clock ratio times the 15:14 multiprocessor ratio - which works out to about 29% faster, at the expense of roughly 4.6% more power.
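The 29% figure is simply clock scaled by the multiprocessor count (15 multiprocessors / 480 shaders on the GTX570 and GTX480, 14 / 448 on the GTX470). Here is a minimal sketch of that estimate, using the reference clocks quoted in this thread; treat it as a rough model, not a measurement.

    # Crude relative-throughput estimate: shader count x clock.
    # Clocks are the reference core clocks quoted in this thread (Fermi shader
    # clocks are 2x core, so the ratios come out the same either way).
    cards = {
        "GTX470": {"shaders": 448, "core_mhz": 607},
        "GTX480": {"shaders": 480, "core_mhz": 700},
        "GTX570": {"shaders": 480, "core_mhz": 732},
    }

    def relative_speed(card: str, baseline: str) -> float:
        a, b = cards[card], cards[baseline]
        return (a["shaders"] * a["core_mhz"]) / (b["shaders"] * b["core_mhz"])

    print(f"GTX570 vs GTX480: {relative_speed('GTX570', 'GTX480'):.3f}")  # ~1.046 (+4.6%)
    print(f"GTX570 vs GTX470: {relative_speed('GTX570', 'GTX470'):.3f}")  # ~1.292 (+29%)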

The release date is supposed to be the 7th of December. While this suggests there will not be a market flood (tomorrow is the peak e-sales day in the UK, not next week), NVidia also wanted to get the card out before the 6950 and 6970, due out on the 13th of December.

As for the price, it should slot in between the GTX580 and the GTX480 (lowest card prices from one UK site):

    GTX580 £380
    GTX570 £3??
    GTX480 £320
    GTX470 £180
    GTX465 £170

Going by the retail prices above, it would be fair to say the GTX570 will be released at around £350, but the GTX480 will likely fall somewhat over the next week or so, which could pull the GTX570 down to around £320.

The GTX470 is clearly still the best deal in terms of crunching power per £, and this is likely to remain the case until the GTX570 drops below £250, which might be six months from now, going by the GTX470's price drops. However, I would not write off a GTX565 either; given their track record I can't see NVidia binning chips with 14 good multiprocessors, and if the GTX470 is to be phased out there would be a gap in the market between the GTX570 and a GTX560. This could be filled by a GTX565 with 448 shaders. The project could do with a good replacement card for under £200.

While a GTX595 is unlikely to appear until well into next year, some manufacturers are building single-slot-width GTX580s. In theory, someone (with a spare £5K) could fill all 7 PCIe slots of the best motherboard and have a system which would get close to 700K credits per day. Or, put another way, if one task could be run in parallel over 7 cards, the time per step would drop to under 0.5ms on a single computer.

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 19714 - Posted: 29 Nov 2010 | 21:11:54 UTC - in response to Message 19713.

That GTX570 looks nice and there's not much reason to doubt these specs. And remember that the GTX480 tended to draw way more than its 250W TDP suggested, so the power-consumption advantage of the GTX570 over the GTX480 may be even bigger.

And I would also expect a further cut-down version; I won't speculate on specs, though. Alternatively they could use many of the partly-defective GF110 GPUs in Teslas and Quadros (they sell because of the software anyway) and in the mobile space, where the demand for "unreasonably power-hungry GPUs" appears to be surprisingly high.

MrS
____________
Scanning for our furry friends since Jan 2002

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 19791 - Posted: 7 Dec 2010 | 11:49:55 UTC - in response to Message 19714.
Last modified: 8 Dec 2010 | 20:25:32 UTC

Just checked some UK prices for the GTX570 and was surprised to see one on offer for £260. That's a good price; in terms of performance per pound it is only about 10% shy of a GTX470 at £180.
Also of note is the SuperClocked EVGA GTX 570 @ 797MHz (slightly higher than a GTX580 @ 772MHz), available for £300. I would expect the performance of that card to be within 5% of a reference GTX 580, for £80 less.
To put these prices in perspective, I picked up my first GTX470 for £320, and that EVGA card is about 40% faster; in terms of performance per pound at original prices, that's close to 50% better. A decent improvement in less than a year.
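To make the performance-per-pound comparisons above easy to check, here is a small sketch using the rough speed estimates from earlier in the thread (stock GTX470 = 1.0, GTX570 about 1.29, the 797MHz EVGA card scaled by its clock) and the prices quoted in this post. The speed factors are estimates, not benchmark results.

    # Rough crunching-performance-per-pound check of the comparisons above.
    # Speed factors are the clock/shader-count estimates from earlier posts
    # (stock GTX470 = 1.0); prices are the UK prices quoted in this post.

    def perf_per_pound(speed: float, price: float) -> float:
        return speed / price

    gtx470_now  = perf_per_pound(1.00, 180)               # GTX470 at today's £180
    gtx470_orig = perf_per_pound(1.00, 320)               # GTX470 at its original £320
    gtx570      = perf_per_pound(1.29, 260)               # ~29% faster than a GTX470
    evga_570_sc = perf_per_pound(1.29 * 797 / 732, 300)   # SC card scaled by its clock

    print(f"GTX570 £260 vs GTX470 £180:      {gtx570 / gtx470_now:.2f}x")    # ~0.89 (about 10% shy)
    print(f"EVGA 570 SC £300 vs GTX470 £320: {evga_570_sc / gtx470_orig:.2f}x")  # ~1.50 (about 50% better)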


Message boards : Graphics cards (GPUs) : Fermi
