Message boards : Graphics cards (GPUs) : Low power GPUs performance comparative

Author Message
Profile ServicEnginIC
Avatar
Send message
Joined: 24 Sep 10
Posts: 566
Credit: 5,937,027,024
RAC: 10,950,571
Level
Tyr
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 52615 - Posted: 11 Sep 2019 | 17:33:53 UTC

I live in the Canary Islands.
The weather is nice for going to the beach every day of the year, but it is not so kind to high-power electronic devices.
I've set a personal power limit of 120 watts for the graphics cards I purchase, so that they can run most of the time without risk of spontaneous ignition.
No plans for air conditioning so far.

Several days ago, Toni released a limited number of ACEMD3 test WUs.
Thank you very much, Toni. Linux users were waiting for this!
I realized that the ACEMD3 test WUs might be an objective means of testing and comparing my currently working GPUs against one another.
I've been lucky enough to catch at least one TONI_TESTDHFR206b WU for every GPU I currently have in production, all under 64-bit Linux with SWAN_SYNC enabled.
And here is a table comparing their performance.



Where:
GPU: Graphics card model. All of them are factory-overclocked models.
Series: Graphics card micro-architecture.
Power: Maximum rated card power, in watts, as reported by the nvidia-smi command.
Temp.: Maximum temperature reached while processing the WU, in ºC, as reported by the Psensor application, at a room temperature of 26 ºC.
Time: Execution time for the WU, in seconds.
R.P. GTX: Relative performance, comparing the execution time of each graphics card against the others.
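For anyone who wants to reproduce the R.P. column, here is a minimal Python sketch of the calculation (the two times below are taken from a post further down this thread; treat them as examples, not as the table's values):

    # Relative performance from execution times: the slowest card is 100%.
    times = {
        "GTX 1660 Ti": 1830.0,  # seconds per TONI_TESTDHFR206b WU (approx.)
        "GTX 1060":    3940.0,
    }
    slowest = max(times.values())
    for gpu, t in sorted(times.items(), key=lambda kv: kv[1]):
        print(f"{gpu:12s} {t:7.0f} s  R.P. = {slowest / t * 100:5.0f}%")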

Erich56
Send message
Joined: 1 Jan 15
Posts: 1090
Credit: 6,603,906,926
RAC: 21,893,126
Level
Tyr
Scientific publications
watwatwatwatwatwatwatwatwat
Message 52617 - Posted: 11 Sep 2019 | 18:25:53 UTC

ServicEnginIC, thanks for that! Good work :-)

rod4x4
Send message
Joined: 4 Aug 14
Posts: 266
Credit: 2,219,935,054
RAC: 0
Level
Phe
Scientific publications
watwatwatwatwatwatwatwatwatwat
Message 52620 - Posted: 12 Sep 2019 | 0:02:21 UTC - in response to Message 52615.
Last modified: 12 Sep 2019 | 0:36:33 UTC

Nice work!
Very informative.
I guess you don't spend much time at the beach with all this GPUgrid testing!!

Profile ServicEnginIC
Avatar
Send message
Joined: 24 Sep 10
Posts: 566
Credit: 5,937,027,024
RAC: 10,950,571
Level
Tyr
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 52633 - Posted: 17 Sep 2019 | 22:19:00 UTC

I guess you don't spend much time at the beach with all this GPUgrid testing!!

+1 ;-)

These tests confirmed what I previously suspected:
GPUGrid is the most power-demanding project I currently process.
GPUGrid tasks really do squeeze the GPU's power to its maximum!
Here is the nvidia-smi output for my most powerful card while executing one TONI_TESTDHFR206b WU (120 W of 120 W used):



And here are Psensor curves for the same card from GPU resting to executing one TONI_TESTDHFR206b WU:

rod4x4
Send message
Joined: 4 Aug 14
Posts: 266
Credit: 2,219,935,054
RAC: 0
Level
Phe
Scientific publications
watwatwatwatwatwatwatwatwatwat
Message 52634 - Posted: 18 Sep 2019 | 0:00:10 UTC - in response to Message 52633.

GPUGrid is the most power-demanding project I currently process.

Agreed.

here are Psensor curves for the same card

Always interesting to see how other volunteers run their systems!
What I found interesting from your Psensor chart:
- GPU running warm at 80 degrees.
- CPU running cool at 51 degrees (100% utilization reported)
- Chassis fan running at 3600rpm. (do you need ear protection from the noise?)

As a comparison, my GTX 1060 GPU on Win10 processes the a89-TONI_TESTDHFR206b-23-30-RND6008_0 task in 3940 seconds. Your GTX 1660 Ti completes test tasks more quickly, at around the 1830-second mark.
The Turing cards (on Linux) are a nice improvement in performance!

Toni
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Send message
Joined: 9 Dec 08
Posts: 1006
Credit: 5,068,599
RAC: 0
Level
Ser
Scientific publications
watwatwatwat
Message 52636 - Posted: 18 Sep 2019 | 10:46:02 UTC - in response to Message 52634.
Last modified: 18 Sep 2019 | 10:47:12 UTC

Thanks for the data.

There may be a problem with the WINDOWS Cuda 10.1 app - it's slower than it should be. We are working on it.

SWAN_SYNC is ignored in acemd3.

rod4x4
Send message
Joined: 4 Aug 14
Posts: 266
Credit: 2,219,935,054
RAC: 0
Level
Phe
Scientific publications
watwatwatwatwatwatwatwatwatwat
Message 52639 - Posted: 18 Sep 2019 | 11:46:20 UTC - in response to Message 52636.


There may be a problem with the WINDOWS Cuda 10.1 app - it's slower than it should be. We are working on it.

SWAN_SYNC is ignored in acemd3.

Thanks Toni. Good to know.

Profile ServicEnginIC
Avatar
Send message
Joined: 24 Sep 10
Posts: 566
Credit: 5,937,027,024
RAC: 10,950,571
Level
Tyr
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 52640 - Posted: 18 Sep 2019 | 11:50:14 UTC

- CPU running cool at 51 degrees (100% utilization reported)

The CPU in this system is a low-power Q9550S, with a high-performance Arctic Freezer 13 CPU cooler.
And several degrees of the CPU temperature are due to radiated heat coming from the GPU...

- GPU running warm at 80 degrees.

I had to work very hard to keep the GPU temperature below the 80s at full load for this particular graphics card.
But perhaps that is a matter for another thread...
http://www.gpugrid.net/forum_thread.php?id=4988

Profile ServicEnginIC
Avatar
Send message
Joined: 24 Sep 10
Posts: 566
Credit: 5,937,027,024
RAC: 10,950,571
Level
Tyr
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 52658 - Posted: 19 Sep 2019 | 7:48:11 UTC

For comparison:
here is the same GTX 1660 Ti card mentioned above, now running Einstein@Home WUs, following Toni's current request not to run acemd3 test WUs on Linux systems so that the Windows app can be tested more thoroughly.





The spikes in the temperature graph are the transitions between two consecutive E@H WUs.
Comparing the data, GPU power usage drops from 120 W to 83 W, and the temperature accordingly drops from a peak of 80 ºC to 69 ºC.
Toni, this speaks very well of your acemd3 Linux code optimization for getting the maximum out of the GPU.
Now I'm crossing my fingers for your success with the Windows one!

mmonnin
Send message
Joined: 2 Jul 16
Posts: 332
Credit: 3,772,896,065
RAC: 4,765,302
Level
Arg
Scientific publications
watwatwatwatwat
Message 52685 - Posted: 21 Sep 2019 | 1:49:31 UTC

E@H is the most mining-like BOINC project I've seen, in that it responds better to memory OC than to core OC. Try a math project that can easily run parallel calculations if you want to push GPU temps.

Profile Retvari Zoltan
Avatar
Send message
Joined: 20 Jan 09
Posts: 2343
Credit: 16,201,255,749
RAC: 7,520
Level
Trp
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 52707 - Posted: 23 Sep 2019 | 21:57:59 UTC - in response to Message 52636.

SWAN_SYNC is ignored in acemd3.
Well, the CPU time equals the Run time, so could you elaborate this?
Could someone without SWAN_SYNC check their CPU time and Run time for ACEMD3 tasks please?

mmonnin
Send message
Joined: 2 Jul 16
Posts: 332
Credit: 3,772,896,065
RAC: 4,765,302
Level
Arg
Scientific publications
watwatwatwatwat
Message 52710 - Posted: 24 Sep 2019 | 0:10:31 UTC - in response to Message 52707.
Last modified: 24 Sep 2019 | 0:10:42 UTC

SWAN_SYNC is ignored in acemd3.
Well, the CPU time equals the Run time, so could you elaborate this?
Could someone without SWAN_SYNC check their CPU time and Run time for ACEMD3 tasks please?


CPU Time = Run time with the new app.
https://www.gpugrid.net/result.php?resultid=21402171

Profile ServicEnginIC
Avatar
Send message
Joined: 24 Sep 10
Posts: 566
Credit: 5,937,027,024
RAC: 10,950,571
Level
Tyr
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 52974 - Posted: 11 Nov 2019 | 23:24:47 UTC

I can still squeeze a bit more out of the data from my tests:
I'll calculate the energy consumed by each of the tested graphics cards and rank them in a new table.



In this table, cards in higher positions are more energy efficient.
As can be seen, the GTX 1660 Ti happens to be both the most efficient (the least energy consumed for the same kind of task) and the most powerful (the shortest execution time).
And a note to myself: it would, for example, be a self-amortizing idea to replace the GTX 950 graphics card with a GTX 1650 ASAP.
The GTX 1650 consumed less than half the energy of the GTX 950 to finish the same kind of task!
Something to think about...

rod4x4
Send message
Joined: 4 Aug 14
Posts: 266
Credit: 2,219,935,054
RAC: 0
Level
Phe
Scientific publications
watwatwatwatwatwatwatwatwatwat
Message 52975 - Posted: 12 Nov 2019 | 1:27:26 UTC - in response to Message 52974.

Nice stats.
The Turing GPUs are exciting for efficiency and performance. (If only we could get some work for them....)
I will be buying the GTX 1650 SUPER (November 22nd release) to replace my Maxwell cards, once GPUgrid starts releasing ACEMD3 work.

Erich56
Send message
Joined: 1 Jan 15
Posts: 1090
Credit: 6,603,906,926
RAC: 21,893,126
Level
Tyr
Scientific publications
watwatwatwatwatwatwatwatwat
Message 52976 - Posted: 12 Nov 2019 | 6:05:40 UTC - in response to Message 52975.

(If only we could get some work for them....)

if the GPUGRID people send out any work at all in the near future ...
the development over the past weeks has unfortunately been a steadily decreasing number of available tasks :-)

Profile ServicEnginIC
Avatar
Send message
Joined: 24 Sep 10
Posts: 566
Credit: 5,937,027,024
RAC: 10,950,571
Level
Tyr
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 52980 - Posted: 14 Nov 2019 | 15:11:14 UTC

How do we calculate the energy consumed by a GPU to process a certain work unit?
It is easy.
I went to my GPUGrid results page.
As of today:



Taking the execution time for my last processed task: 59,720.55 seconds.
It was processed on a GTX 1050 Ti graphics card.
Taking the rated TDP (Thermal Design Power) for this GPU: 75 watts.
GPUGrid WUs are highly power demanding, and they usually drive the GPU to its rated TDP consistently (in fact, even beyond it if the GPU is overclocked).
Therefore, the approximate energy consumed to process a WU can be calculated with this formula:
Energy (watt-hours, Wh) = Rated GPU TDP (watts) x WU execution time (seconds) / 3,600 (seconds/hour)

For my example WU: 75 x 59,720.55 / 3,600 = 1,244.18 Wh (/ 1,000 = 1.24418 kWh)
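For anyone who wants to script this, here is a minimal Python sketch of the same calculation (the TDP and runtime are just the example figures above; substitute your own card's numbers):

    def wu_energy_wh(tdp_watts, runtime_seconds):
        """Approximate energy for one WU, assuming the GPU sits at its rated TDP."""
        return tdp_watts * runtime_seconds / 3600.0

    # Example from above: GTX 1050 Ti (75 W TDP), 59,720.55 s runtime
    energy = wu_energy_wh(75.0, 59720.55)
    print(f"{energy:.2f} Wh = {energy / 1000.0:.5f} kWh")  # ~1244.18 Wh = ~1.24418 kWh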

the development over the past weeks has unfortunately been a steadily decreasing number of available tasks :-)

This situation currently remains the same.
Who knows, maybe a change of cycle is in progress????

Piri1974
Send message
Joined: 2 Oct 18
Posts: 38
Credit: 6,896,627
RAC: 0
Level
Ser
Scientific publications
wat
Message 53213 - Posted: 29 Nov 2019 | 20:44:22 UTC
Last modified: 29 Nov 2019 | 21:44:01 UTC

Hello ServicEnginIC;

As you are interested in lower-end graphics cards, here is something I posted earlier in another thread.
I meant to post it here, but I accidentally put it in a thread about more powerful GPUs.

DO NOT buy the GPUs mentioned below for GPUGRID anymore:
as GPUGRID work units become bigger in the future, these low-end cards will no longer be able to meet the 5-day deadline.
So I give you this purely as information, NOT as buying advice.


Here is the copy/paste, with some more data about other "low end" cards for the NEW ACEMD3 application.
These are all recent work units from last week, and all completed without errors.
You may add these data to your list if you like.


1) NVIDIA GeForce GTX 745 Maxwell (2048MB 900Mhz DDR3 128-bit) - driver: 430.86:
CPU = AMD FX4320 quad core
Run Time: 166,385.50 sec = +- 46 hours
See
http://www.gpugrid.net/show_host_detail.php?hostid=512409

2) NVIDIA GeForce GT 710 Kepler (2048MB DDR5 64-bit) - driver: 432.0
CPU = Core 2 Duo E8200 Dual core
Run Time: 375,108.98 sec = +- 104 hours
See
http://www.gpugrid.net/show_host_detail.php?hostid=502519

3) NVIDIA GeForce GT 710 Kepler (2048MB 800Mhz DDR3 64-bit) driver: 436.48
CPU = AMD A8-9600 Quad Core
Run Time = 431,583.77 sec = +- 119 hours
See
http://www.gpugrid.net/show_host_detail.php?hostid=514223

4) NVIDIA GeForce GT 1030 Pascal (2048MB DDR5 64-bit) - driver: 436.48
CPU = Celeron E3300 Dual Core
Run Time = 113,797.23 sec = +- 31 hours
See
http://www.gpugrid.net/show_host_detail.php?hostid=500299

5) NVIDIA GeForce GT 730 Kepler (2048MB DDR5 64-bit) - driver: 419.35
CPU = AMD Athlon 200GE Dual Core, Quad thread
Run time = 233,535.98 sec = +- 64 hours
See
http://www.gpugrid.net/show_host_detail.php?hostid=503400



Be aware of the following points, which may have influenced the performance:
- I suppose the ACEMD3 work units are not 100% identical in the amount of work they represent, so compare the results with a big grain of salt.
- All are Windows 10 machines.
- All computers have different processors.
- At the same time, the CPUs were crunching Mapping Cancer Markers work units, one per core/thread.
- One of my GT710s has DDR5 memory, the other has DDR3 memory, as noted above.
- All GPUs and CPUs run at default speeds. No (manual) overclocking was done.
- None of the CPUs or GPUs ever reached 70 degrees Celsius.
In fact, except for the GT1030, all stayed well below 60°.



Hope you like it. :-)
Greetings;
Carl Philip

Piri1974
Send message
Joined: 2 Oct 18
Posts: 38
Credit: 6,896,627
RAC: 0
Level
Ser
Scientific publications
wat
Message 53214 - Posted: 29 Nov 2019 | 21:20:02 UTC - in response to Message 53213.
Last modified: 29 Nov 2019 | 22:09:43 UTC

And by the way, as you mentioned that you live in the Canary Islands and prefer low-power devices, here is some more info about my GPUs:

- The GT710 has a TDP of only 19 Watts for the DDR3 version, the DDR5 version consumes a little more.
- The GT730 has a TDP of 38 Watts for the DDR5 version, the DDR3 version consumes a little less.
- The GT1030 has a TDP of 30 Watts.
- The GTX745 has a TDP of 55 Watts.

So they are all low-power devices.
My preferred website to find info on GPUs is Techpowerup.
Just Google "GT710 Techpowerup" for example and you will find it.

I would not recommend buying these GPUs anymore, except for a PC running low-requirement applications: an office PC, an HTPC, or maybe some light gaming with older games.

For example, the slowest of my cards, the GT710, can perfectly run an old but beautiful game called "Bioshock". I run that game on it in an old PC with a Core2Duo cpu, just 2Gb of RAM and Windows 10. It runs perfectly fluid.

But anyway, I would not recommend buying the GT710 or GT730 anymore unless you need their very low consumption.
The GT1030 is the fastest of them all.
The GTX745 is just a little slower than the GT1030, but it was sold to OEMs only. But like me, you can find it new or second hand on the internet. The Fujitsu model which I have even has a nice blower-style fan which blows the hot air directly outside the pc. There is both a normal model and a low-profile model.
You can find some pictures of it here:
https://picclick.fr/Fujitsu-GPU-NVIDIA-GeForce-GTX-745-2GB-FH-264493447418.html

If low power is important to you, I can recommend some of AMD's APUs with integrated graphics.
I have for example a rather old APU called AMD A8-9600.
Believe it or not, I currently play the latest Star Wars game "Star Wars Jedi: Fallen Order" on this old A8-9600. :-) (in 720p resolution)
But as these APUs have Radeon GPUs, you cannot use them for CUDA projects like GPUGRID. For OpenCL projects like Milkyway, you can.
And my A8-9600 runs circles around my GT1030 in Milkyway... strange but true. It's about 5 times faster than my GT1030... yes, 5 TIMES faster:
My GT1030 completes a Milkyway work unit in about 15 minutes, the A8-9600 does it in about 3 minutes.
That means that in Milkyway, the A8-9600 is probably even faster than a GTX1050.
That is what I call fantastic performance per Watt or per Dollar. :-)

Cheers from Cold Brussels ;-)
Carl

Profile ServicEnginIC
Avatar
Send message
Joined: 24 Sep 10
Posts: 566
Credit: 5,937,027,024
RAC: 10,950,571
Level
Tyr
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 53367 - Posted: 16 Dec 2019 | 22:38:44 UTC - in response to Message 53214.

Thank you very much for your remarks, Carl.
I've read them twice through.
Many points to comment on:

My preferred website to find info on GPUs is Techpowerup.

I didn't know about the Techpowerup site. They have an enormous database of GPU data with very exhaustive specifications. I like it, and I'm taking note.

But anyway, I would not recommend buying the GT710 or GT730 anymore unless you need their very low consumption.

I find both of these models perfect for office computers.
Especially the fanless models, which combine silent, smooth operation for office applications with low power consumption.
But I agree that their performance is rather too low for processing at GPUGrid.
I've made a polite request for the Performance tab to be rebuilt.
At the end of that tab there was a graph named "GPU performance ranking (based on long WU return time)".
Currently this graph is blank.
When it worked, it showed a very useful classification of GPUs according to their respective performance at processing GPUGrid tasks.
The GT 1030 sometimes appeared at the far right (lowest performance) of the graph, and at other times appeared only in the legend, off the graph.
The GTX 750 Ti always appeared borderline on this graph, and the GTX 750 did not appear at all.
I always took that as a polite invitation not to use the "off the graph" GPUs...

The GTX745 is just a little slower than the GT1030, but it was sold to OEMs only

That was the first time I had heard of the GTX 745 GPU.
I thought: a mistake?
But then I searched for "GTX 745 Techpowerup"... and there it was, listed as an OEM GPU! Nice :-)

Pop Piasa
Avatar
Send message
Joined: 8 Aug 19
Posts: 252
Credit: 458,054,251
RAC: 0
Level
Gln
Scientific publications
watwat
Message 53948 - Posted: 19 Mar 2020 | 1:43:43 UTC

Thanks very much for the info.
I recently installed a couple of new Dell/Alienware GTX 1650 4GB cards and they are very productive. They run 60-65 deg.C and around 1850 MHz with a steady 4001 MHz DDR5 speed. I haven't tried overclocking as they appear to be running faster than NVidia advertises already.
At US$135 apiece (Ebay) and slot-powered, I'm pleased.

Profile ServicEnginIC
Avatar
Send message
Joined: 24 Sep 10
Posts: 566
Credit: 5,937,027,024
RAC: 10,950,571
Level
Tyr
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 54244 - Posted: 5 Apr 2020 | 14:40:50 UTC - in response to Message 53948.
Last modified: 5 Apr 2020 | 14:41:13 UTC

I recently installed a couple of new Dell/Alienware GTX 1650 4GB cards and they are very productive...

I also have three GTX 1650 cards of different models currently running 24/7, and I'm very satisfied.
They are very efficient, thanks to their relatively low power consumption of 75 W (max).

On the other hand, all the GPU models listed in my original performance table are processing the current ACEMD3 WUs in time to achieve the full bonus (result returned in <24 h :-)

Keith Myers
Send message
Joined: 13 Dec 17
Posts: 1284
Credit: 4,920,556,959
RAC: 6,336,580
Level
Arg
Scientific publications
watwatwatwatwat
Message 54247 - Posted: 5 Apr 2020 | 15:35:28 UTC - in response to Message 53948.

Thanks very much for the info.
I recently installed a couple of new Dell/Alienware GTX 1650 4GB cards and they are very productive. They run 60-65 deg.C and around 1850 MHz with a steady 4001 MHz DDR5 speed. I haven't tried overclocking as they appear to be running faster than NVidia advertises already.
At US$135 apiece (Ebay) and slot-powered, I'm pleased.

There really isn't much point in overclocking the core clocks because GPU Boost 3.0 overclocks the card on its own in firmware based on the thermal and power limits of the card and host.

You do want to overclock the memory though since the Nvidia drivers penalize all their consumer cards when a compute load is detected on the card and pushes the card down to P2 power state and significantly lower memory clocks than the default gaming memory clock and the stated card spec.

The memory clocks can be returned to P0 power levels and the stated spec by using any of the overclocking utilities in Windows or the Nvidia X Server Settings app in Linux after the coolbits have been set.
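For Linux, a minimal sketch of how that memory offset can be applied from a script, once coolbits are enabled (the performance-level index in brackets varies by card and driver, and the offset value below is purely illustrative, not a recommendation):

    import subprocess

    # Assumes the proprietary Nvidia driver under X11, with coolbits already set,
    # e.g. via "sudo nvidia-xconfig --cool-bits=28" followed by an X restart.
    # Check "nvidia-settings -q GPUMemoryTransferRateOffset" to see which
    # performance-level index your card exposes before writing to it.
    GPU_INDEX = 0
    MEM_OFFSET = 800  # example memory transfer rate offset, in MHz

    subprocess.run(
        ["nvidia-settings",
         "-a", f"[gpu:{GPU_INDEX}]/GPUMemoryTransferRateOffset[3]={MEM_OFFSET}"],
        check=True,
    )

Raise the offset gradually and watch the task error rate, as discussed further down the thread.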

Erich56
Send message
Joined: 1 Jan 15
Posts: 1090
Credit: 6,603,906,926
RAC: 21,893,126
Level
Tyr
Scientific publications
watwatwatwatwatwatwatwatwat
Message 54251 - Posted: 5 Apr 2020 | 18:17:00 UTC - in response to Message 54247.

... the Nvidia drivers penalize all their consumer cards when a compute load is detected on the card and pushes the card down to P2 power state and significantly lower memory clocks than the default gaming memory clock and the stated card spec.

why so ?

Keith Myers
Send message
Joined: 13 Dec 17
Posts: 1284
Credit: 4,920,556,959
RAC: 6,336,580
Level
Arg
Scientific publications
watwatwatwatwat
Message 54253 - Posted: 5 Apr 2020 | 18:21:36 UTC - in response to Message 54251.

... the Nvidia drivers penalize all their consumer cards when a compute load is detected on the card and pushes the card down to P2 power state and significantly lower memory clocks than the default gaming memory clock and the stated card spec.

why so ?

Because Nvidia doesn't want you to purchase inexpensive consumer cards for compute when they want to sell you expensive compute designed Quadros and Teslas.
No reason other than to maximize profit.

Zalster
Avatar
Send message
Joined: 26 Feb 14
Posts: 211
Credit: 4,496,324,562
RAC: 0
Level
Arg
Scientific publications
watwatwatwatwatwatwatwat
Message 54254 - Posted: 5 Apr 2020 | 20:14:29 UTC - in response to Message 54253.

... the Nvidia drivers penalize all their consumer cards when a compute load is detected on the card and pushes the card down to P2 power state and significantly lower memory clocks than the default gaming memory clock and the stated card spec.

why so ?

Because Nvidia doesn't want you to purchase inexpensive consumer cards for compute when they want to sell you expensive compute designed Quadros and Teslas.
No reason other than to maximize profit.


Well, they also say they can't guarantee valid results at the higher P state. Momentary drops or errors are OK in video games, not so in scientific computations. So they say they drop down the P state to avoid those errors. But as Keith says, you can OC the memory. Just be careful: push it too far and you can start to throw A LOT of errors before you know it, and then the server puts you in time-out.
____________

Keith Myers
Send message
Joined: 13 Dec 17
Posts: 1284
Credit: 4,920,556,959
RAC: 6,336,580
Level
Arg
Scientific publications
watwatwatwatwat
Message 54256 - Posted: 5 Apr 2020 | 22:09:55 UTC - in response to Message 54254.

Depends on the card and on the generation family. I can overclock a 1080 or 1080Ti by 2000Mhz because of the GDDR5X memory and run them at essentially 1000Mhz over official "graphics use" spec.

And not generate a single memory error in a task. The Pascal generation really hamstrung the memory in P2 mode. The Pascal cards that used plain GDDR5 memory can not overclock as well as the GDDR5X memory cards. So you can't push the 1070 and such as hard.

But the Turing generation family does not impose such a large deficit/underclock in the P2 state under compute load.

The Turing cards only drop by 400Mhz from "graphics use" memory clocks.

mmonnin
Send message
Joined: 2 Jul 16
Posts: 332
Credit: 3,772,896,065
RAC: 4,765,302
Level
Arg
Scientific publications
watwatwatwatwat
Message 54359 - Posted: 17 Apr 2020 | 11:50:28 UTC

What happens when there is a memory error during a game? Wrong color, screen tearing, odd physics on a frame? The game moves on. So what.

What happens when there is a memory error during compute? Bad calculations that are typically carried through in the task producing invalid results.

Down clocking evidently can reduce some errors so that's what NV does.

Nick Name
Send message
Joined: 3 Sep 13
Posts: 53
Credit: 1,533,531,731
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwat
Message 54361 - Posted: 17 Apr 2020 | 18:30:36 UTC - in response to Message 54359.

What happens when there is a memory error during a game? Wrong color, screen tearing, odd physics on a frame? The game moves on. So what.

What happens when there is a memory error during compute? Bad calculations that are typically carried through in the task producing invalid results.

Down clocking evidently can reduce some errors so that's what NV does.

In my opinion Nvidia does this not because of concern for valid results, it's because they don't really want consumer GPUs used for compute. They'd rather sell you a much more expensive Quadro or Tesla card for that. The performance penalty is just an excuse.
____________
Team USA forum | Team USA page
Join us and #crunchforcures. We are now also folding:join team ID 236370!

Ian&Steve C.
Avatar
Send message
Joined: 21 Feb 20
Posts: 1031
Credit: 35,493,857,483
RAC: 71,175,505
Level
Trp
Scientific publications
wat
Message 54362 - Posted: 17 Apr 2020 | 18:48:05 UTC - in response to Message 54361.

What happens when there is a memory error during a game? Wrong color, screen tearing, odd physics on a frame? The game moves on. So what.

What happens when there is a memory error during compute? Bad calculations that are typically carried through in the task producing invalid results.

Down clocking evidently can reduce some errors so that's what NV does.

In my opinion Nvidia does this not because of concern for valid results, it's because they don't really want consumer GPUs used for compute. They'd rather sell you a much more expensive Quadro or Tesla card for that. The performance penalty is just an excuse.


exactly this.
____________

Keith Myers
Send message
Joined: 13 Dec 17
Posts: 1284
Credit: 4,920,556,959
RAC: 6,336,580
Level
Arg
Scientific publications
watwatwatwatwat
Message 54363 - Posted: 18 Apr 2020 | 1:15:37 UTC - in response to Message 54362.
Last modified: 18 Apr 2020 | 1:16:37 UTC

What happens when there is a memory error during a game? Wrong color, screen tearing, odd physics on a frame? The game moves on. So what.

What happens when there is a memory error during compute? Bad calculations that are typically carried through in the task producing invalid results.

Down clocking evidently can reduce some errors so that's what NV does.

In my opinion Nvidia does this not because of concern for valid results, it's because they don't really want consumer GPUs used for compute. They'd rather sell you a much more expensive Quadro or Tesla card for that. The performance penalty is just an excuse.


exactly this.

+100
You can tell if you are overclocking beyond the bounds of your card's cooling and the compute applications you run simply by monitoring the errors reported, if any.

Why would Nvidia care one whit whether the end user has compute errors on the card? They carry no liability for such actions; it's totally on the end user. They build consumer cards for graphics use in games, and the only error they care about is whether the card drops frames. If you are using the card for secondary purposes, then that is your responsibility.

They simply want to sell you a Quadro or Tesla and have you use it for its intended purpose, which is compute. Those cards are clocked significantly lower than the consumer cards so that they will not produce any compute errors when used in typical business compute applications. Distributed computing is not even on their radar for application usage; DC is such a small percentage of any graphics card use that it is not even considered.

Profile ServicEnginIC
Avatar
Send message
Joined: 24 Sep 10
Posts: 566
Credit: 5,937,027,024
RAC: 10,950,571
Level
Tyr
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 54365 - Posted: 18 Apr 2020 | 15:52:20 UTC

On Apr 5th 2020 | 15:35:28 UTC Keith Myers wrote:

...Nvidia drivers penalize all their consumer cards when a compute load is detected on the card and pushes the card down to P2 power state and significantly lower memory clocks than the default gaming memory clock and the stated card spec.

I've been investigating this.
I suppose this behaviour is more likely with the Windows drivers (?).
I'm currently running GPUGrid as the preferred GPU project on six computers.
All these systems are based on Ubuntu Linux 18.04, with no GPU overclocking apart from some factory-overclocked cards, and coolbits left at the default.
The results of my tests are as follows, as reported by the Linux driver's nvidia-smi command:

* System #325908 (Double GPU)


* System#147723


* System #480458 (Triple GPU)


* System #540272


* System #186626


And the only exception:
* System #482132 (Double GPU)


As can be seen at "Perf" column, all cards but the noted exception are freely running at P0 maximum performance level.
On the double GPU system pointed as exception, GTX 1660 Ti card is running at P2 performance level.
But on this situation, power consumtion is 117W of 120W TDP, and temperature is 77ºC, while GPU utilization is 96%.
I think that increasing performance level, probably would imply to exceed recommended power consumption/temperature, with relatively low performance increase...
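For anyone who wants to run the same check, here is a minimal Python sketch of the nvidia-smi query behind those columns (the field names are standard nvidia-smi query fields; see "nvidia-smi --help-query-gpu" for the full list on your driver):

    import subprocess

    # Report the performance state, utilization, power and temperature of each GPU.
    fields = "index,name,pstate,utilization.gpu,power.draw,power.limit,temperature.gpu"
    out = subprocess.run(
        ["nvidia-smi", f"--query-gpu={fields}", "--format=csv,noheader"],
        capture_output=True, text=True, check=True,
    ).stdout

    for line in out.strip().splitlines():
        print(line)  # e.g. "0, GeForce GTX 1660 Ti, P2, 96 %, 117.00 W, 120.00 W, 77"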

Ian&Steve C.
Avatar
Send message
Joined: 21 Feb 20
Posts: 1031
Credit: 35,493,857,483
RAC: 71,175,505
Level
Trp
Scientific publications
wat
Message 54366 - Posted: 18 Apr 2020 | 16:12:37 UTC - in response to Message 54365.
Last modified: 18 Apr 2020 | 16:18:23 UTC

Keith forgot to mention that the P2 penalty only applies to higher end cards. low end cards in the -50 series or lower do not get this penalty and are allowed to run in the full P0 mode for compute.

you cannot increase the performance level of the GTX 1660Ti from P2 under compute loads. it will only allow P0 if it detects a 3D application like gaming being run. this is a restriction in the driver that nvidia implements on the GTX cards -60 series and higher. there was a way to force the GTX 700 and 900 series cards into P0 for compute using Nvidia Inspector on Windows, but I'm not sure you could do it on Linux.

The only thing you can do is to apply an overclock to the memory while in the P2 state to reflect what you would get in the P0 state. there is no other difference between P2 and P0 besides the memory clocks.
____________

Profile Retvari Zoltan
Avatar
Send message
Joined: 20 Jan 09
Posts: 2343
Credit: 16,201,255,749
RAC: 7,520
Level
Trp
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 54368 - Posted: 18 Apr 2020 | 19:07:31 UTC - in response to Message 54363.

Down clocking evidently can reduce some errors so that's what NV does.
In my opinion Nvidia does this not because of concern for valid results, it's because they don't really want consumer GPUs used for compute. They'd rather sell you a much more expensive Quadro or Tesla card for that. The performance penalty is just an excuse.
They simply want to sell you a Quadro or Tesla and have you use it for its intended purpose which is compute. Those cards are clocked significantly less than the consumer cards so that they will not produce any compute errors when used in typical business compute applications.
To sum up the above: when NV reduces the clocks of a "cheap" high-end consumer card for the sake of correct calculations, that is just an excuse, because they want to make us buy overly expensive professional cards, which are clocked even lower for the sake of correct calculations. That's a totally consistent argument. Oh wait, it's not!

(I've snipped the distracting parts)

Pop Piasa
Avatar
Send message
Joined: 8 Aug 19
Posts: 252
Credit: 458,054,251
RAC: 0
Level
Gln
Scientific publications
watwat
Message 54371 - Posted: 18 Apr 2020 | 22:34:50 UTC

Many thanks to all for your input. I Just finished a PABLO WU which took a little over 12 hrs on one of my GTX1650 GPUs.
I guess long runs are back.

Pop Piasa
Avatar
Send message
Joined: 8 Aug 19
Posts: 252
Credit: 458,054,251
RAC: 0
Level
Gln
Scientific publications
watwat
Message 54372 - Posted: 19 Apr 2020 | 0:13:31 UTC - in response to Message 54368.

I find myself agreeing with Mr Zoltan on this. I am inclined to leave the timing stock on these cards as I have enough problems with errors related to running two dissimilar GPUs on my hosts. When I resolve that issue I might try pushing the memory closer to the 5 GHz advertised DDR5 max speed on my 1650's.
What I'd like to know is what clock speeds other folks have had success with, before I risk blowing WUs to find out.

Profile ServicEnginIC
Avatar
Send message
Joined: 24 Sep 10
Posts: 566
Credit: 5,937,027,024
RAC: 10,950,571
Level
Tyr
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 54374 - Posted: 19 Apr 2020 | 8:35:06 UTC - in response to Message 54371.

...I Just finished a PABLO WU which took a little over 12 hrs on one of my GTX1650 GPUs.
I guess long runs are back.


I guess an explanation of the purpose of these new PABLO WUs will appear in the "News" section shortly... (?)

Profile ServicEnginIC
Avatar
Send message
Joined: 24 Sep 10
Posts: 566
Credit: 5,937,027,024
RAC: 10,950,571
Level
Tyr
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 54375 - Posted: 19 Apr 2020 | 8:38:44 UTC - in response to Message 54372.

...I have enough problems with errors related to running two dissimilar GPUs on my hosts.

This is a known problem with the wrapper-based ACEMD3 tasks, already announced by Toni in a previous post:
Can I use it on multi-GPU systems?

In general yes, with one caveat: if you have DIFFERENT types of NVIDIA GPUs in the same PC, suspending a job in one and restarting it in the other will NOT be possible (errors on restart). Consider restricting the client to one GPU type only ("exclude_gpu",
see here).

Users experiencing this problem may be interested in taking a look at this post by Retvari Zoltan, with a workaround to prevent these errors.
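For reference, a minimal sketch of what such a GPU exclusion looks like in BOINC's cc_config.xml (the device number is an example only; it selects which GPU the client should not use for GPUGrid on that host):

    <cc_config>
      <options>
        <exclude_gpu>
          <url>http://www.gpugrid.net/</url>
          <device_num>1</device_num>
          <type>NVIDIA</type>
        </exclude_gpu>
      </options>
    </cc_config>

After editing cc_config.xml, tell the client to re-read its config files (or restart it) for the exclusion to take effect.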

Pop Piasa
Avatar
Send message
Joined: 8 Aug 19
Posts: 252
Credit: 458,054,251
RAC: 0
Level
Gln
Scientific publications
watwat
Message 54380 - Posted: 19 Apr 2020 | 20:41:38 UTC - in response to Message 54375.

Thanks, ServicEnginIC. I always try to use that method when I reboot after updates. I need to get myself a good UPS backup as I live where the power grid is screwed up by the politicians and activists and momentary interruptions are way too frequent.

As soon as I can afford to upgrade from my 750ti I will put both my 1650's in one host. Am I correct that identical GPUs won't have the problem?

I've noticed that ACEMD tasks which have not yet reached 10% can be restarted without errors when the BOINC client is closed and reopened.

Profile ServicEnginIC
Avatar
Send message
Joined: 24 Sep 10
Posts: 566
Credit: 5,937,027,024
RAC: 10,950,571
Level
Tyr
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 54381 - Posted: 19 Apr 2020 | 21:11:30 UTC - in response to Message 54380.

Am I correct that identical GPUs won't have the problem?

That's what I understand also, but I can't check it by myself for the moment.

I've noticed that ACEMD tasks which have not yet reached 10% can be restarted without errors when the BOINC client is closed and reopened.

Interesting. Thank you for sharing this.

Keith Myers
Send message
Joined: 13 Dec 17
Posts: 1284
Credit: 4,920,556,959
RAC: 6,336,580
Level
Arg
Scientific publications
watwatwatwatwat
Message 54382 - Posted: 19 Apr 2020 | 21:26:02 UTC - in response to Message 54381.

Yes you can stop and start a task at any time when all your cards are the same.
I moved cards around so that all three of my 2080's are in the same host.
No problem stopping and restarting. No tasks lost to errors.

I can't do that on the other host because of dissimilar cards.

Pop Piasa
Avatar
Send message
Joined: 8 Aug 19
Posts: 252
Credit: 458,054,251
RAC: 0
Level
Gln
Scientific publications
watwat
Message 54384 - Posted: 20 Apr 2020 | 13:36:43 UTC - in response to Message 54382.
Last modified: 20 Apr 2020 | 13:43:46 UTC

Yes you can stop and start a task at any time when all your cards are the same.


Thanks for confirming that, Keith.
I've also found that when device zero is in the lead but below 90%, it will (sometimes) pick both tasks back up without erroring. When device 0 is beyond 90%, it will show a computation error even if it is in the lead. Has anyone else observed this?

An unrelated note: BOINC sees a GTX 1650 as device 0 when paired with my 1060 3GB, even though it is slightly slower than the 1060 (almost undetectable running ACEMD tasks).
I think it might be due to Turing being newer than Pascal. I don't see any GPU benchmarking done by the BOINC manager.

Pop Piasa
Avatar
Send message
Joined: 8 Aug 19
Posts: 252
Credit: 458,054,251
RAC: 0
Level
Gln
Scientific publications
watwat
Message 54385 - Posted: 20 Apr 2020 | 14:01:55 UTC

By the way, muchas gracias to ServicEnginIC for hosting this thread! It's totally synchronous with my current experiences as a cruncher. 🥇👍🥇

Ian&Steve C.
Avatar
Send message
Joined: 21 Feb 20
Posts: 1031
Credit: 35,493,857,483
RAC: 71,175,505
Level
Trp
Scientific publications
wat
Message 54386 - Posted: 20 Apr 2020 | 14:02:27 UTC - in response to Message 54384.

An unrelated note: BOINC sees a GTX 1650 as device 0 when paired with my 1060 3GB, even though it is slightly slower than the 1060 (almost undetectable running ACEMD tasks).
I think it might be due to Turing being newer than Pascal. I don't see any GPU benchmarking done by the BOINC manager.


BOINC will order Nvidia cards based on CC (compute capability), and then by how much memory it has.

The 1650 has a CC of 7.5 where the 1060 has 6.1. The 1650 also has more memory than the 3 GB version of the 1060.
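As an illustration of that ordering rule (a sketch of my understanding, not BOINC's actual code):

    # Hypothetical card list: (name, compute capability, memory in MB)
    cards = [
        ("GTX 1060 3GB", (6, 1), 3072),
        ("GTX 1650",     (7, 5), 4096),
    ]

    # Sort descending by compute capability, then by memory size;
    # the first entry ends up as BOINC device 0.
    ordered = sorted(cards, key=lambda c: (c[1], c[2]), reverse=True)
    for device_num, (name, cc, mem) in enumerate(ordered):
        print(f"device {device_num}: {name} (CC {cc[0]}.{cc[1]}, {mem} MB)")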

____________

Profile ServicEnginIC
Avatar
Send message
Joined: 24 Sep 10
Posts: 566
Credit: 5,937,027,024
RAC: 10,950,571
Level
Tyr
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 54387 - Posted: 20 Apr 2020 | 16:31:52 UTC

Pop Piasa wrote:

By the way, muchas gracias...

You're welcome, de nada,
I enjoy learning new matters, and sharing knowledge

Ian&Steve C. wrote:
BOINC will order Nvidia cards based on CC (compute capability), and then by how much memory it has.

I've ever wondered about that.
Thank you very much!, one more detail I've learnt...

Pop Piasa
Avatar
Send message
Joined: 8 Aug 19
Posts: 252
Credit: 458,054,251
RAC: 0
Level
Gln
Scientific publications
watwat
Message 54389 - Posted: 20 Apr 2020 | 20:26:47 UTC - in response to Message 54386.

Thanks for your generous reply Ian & Steve, you guys have a great working knowledge of BOINC's innards!
It pays for me to come here and ask, as a newbie in distrib. computing.

Great that you brought your talents and hardware over to this project when they stopped SETI. (Hmm... does shutting down the search mean that they gave up, or that the search is complete?)👽

Ian&Steve C.
Avatar
Send message
Joined: 21 Feb 20
Posts: 1031
Credit: 35,493,857,483
RAC: 71,175,505
Level
Trp
Scientific publications
wat
Message 54390 - Posted: 20 Apr 2020 | 21:20:43 UTC - in response to Message 54389.

Great that you brought your talents and hardware over to this project when they stopped SETI. (Hmm... does shutting down the search mean that they gave up, or that the search is complete?)👽


Without getting too far off topic, they basically had money/staffing problems. They didn't have the resources to constantly babysit their servers while working on the back-end analysis system at the same time. All of the work processed over the last 20 years has not been analyzed by the scientists yet; it's just sitting in the database. They are building the analysis system that will sort through all the results, so they are shutting down the data distribution process in order to focus on that. They feel like 20 years of listening to the northern sky was enough for now. They left the option open that the project might return someday.
____________

Pop Piasa
Avatar
Send message
Joined: 8 Aug 19
Posts: 252
Credit: 458,054,251
RAC: 0
Level
Gln
Scientific publications
watwat
Message 54394 - Posted: 20 Apr 2020 | 23:50:04 UTC - in response to Message 54390.
Last modified: 20 Apr 2020 | 23:51:54 UTC

they feel like 20 years of listening to the northern sky was enough for now.


When SETI first started I used a Pentium II machine at the University I retired from to crunch when it was idle and my boss dubbed me "Mr Spock". Thanks for filling in the details and providing a factual perspective to counter my active imagination. I should have guessed it was a resource issue.

Profile ServicEnginIC
Avatar
Send message
Joined: 24 Sep 10
Posts: 566
Credit: 5,937,027,024
RAC: 10,950,571
Level
Tyr
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 54396 - Posted: 21 Apr 2020 | 5:39:33 UTC

SETI is written in Distributed Computing history with bold letters.

Profile ServicEnginIC
Avatar
Send message
Joined: 24 Sep 10
Posts: 566
Credit: 5,937,027,024
RAC: 10,950,571
Level
Tyr
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 54397 - Posted: 21 Apr 2020 | 5:57:57 UTC - in response to Message 54371.

On Apr 18th 2020 | 22:34:50 UTC Pop Piasa wrote:

...I Just finished a PABLO WU which took a little over 12 hrs on one of my GTX1650 GPUs.
I guess long runs are back.


I also received one of these PABLO WUs... and as chance would have it, it was assigned to my slowest GPU currently in production.
It took 34.16 hours to complete on a GTX 750, still in time to get the half bonus.

Pop Piasa
Avatar
Send message
Joined: 8 Aug 19
Posts: 252
Credit: 458,054,251
RAC: 0
Level
Gln
Scientific publications
watwat
Message 54400 - Posted: 21 Apr 2020 | 15:31:12 UTC - in response to Message 54397.
Last modified: 21 Apr 2020 | 15:32:21 UTC

I also received one of these PABLO WUs... and as chance would have it, it was assigned to my slowest GPU currently in production.
It took 34.16 hours to complete on a GTX 750, still in time to get the half bonus.


I've run 2 PABLOs so far, but only on my 1650s. They both ran approx. 12:10 Hrs, 11:58 CPU time. If I can get one on my 750ti (also a Dell/Alienware and clocked at 1200Mhz; 2GB 2700Mhz DDR3) we can get a good idea how they all compare.
If anyone has run a PABLO designed ACEMD task on a 1050 or other low power GPU please share your results. I would also like to see how these compare to a 2070-super as I am considering replacing the 750ti with one.

Profile Retvari Zoltan
Avatar
Send message
Joined: 20 Jan 09
Posts: 2343
Credit: 16,201,255,749
RAC: 7,520
Level
Trp
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 54402 - Posted: 21 Apr 2020 | 15:46:42 UTC - in response to Message 54400.

If anyone has run a PABLO designed ACEMD task on a 1050 or other low power GPU please share your results.
e2s112_e1s62p0f90-PABLO_UCB_NMR_KIX_CMYB_5-2-5-RND2497_0
GTX 750Ti
109,532.48 seconds = 30h 25m 32.48s

Ian&Steve C.
Avatar
Send message
Joined: 21 Feb 20
Posts: 1031
Credit: 35,493,857,483
RAC: 71,175,505
Level
Trp
Scientific publications
wat
Message 54403 - Posted: 21 Apr 2020 | 16:20:30 UTC - in response to Message 54400.

If anyone has run a PABLO designed ACEMD task on a 1050 or other low power GPU please share your results. I would also like to see how these compare to a 2070-super as I am considering replacing the 750ti with one.

not a low power card, but something to expect when moving to the 2070-super. I have 2080's and 2070's. the 2070-super should fall somewhere between these two, maybe closer to the 2080 in performance.

my 2080s do them in about ~13000 seconds = ~3.6 hours.
my 2070s do them in about ~15500 seconds = ~4.3 hours.

so maybe around 4hrs on average for a 2070 super.

____________

Pop Piasa
Avatar
Send message
Joined: 8 Aug 19
Posts: 252
Credit: 458,054,251
RAC: 0
Level
Gln
Scientific publications
watwat
Message 54404 - Posted: 21 Apr 2020 | 17:28:50 UTC - in response to Message 54403.

my 2080s do them in about ~13000 seconds = ~3.6 hours.
my 2070s do them in about ~15500 seconds = ~4.3 hours.

so maybe around 4hrs on average for a 2070 super.


Thanks for the info., that pretty much agrees with the percentage ratios I've seen on Userbenchmark.com.
I plan to replace my PSU when I can afford it and eventually have two matching high-power GPUs on my ASUS Prime board (i7-7700K). That will leave me with a 1060 3GB and a 750ti to play with in some other old machine I might acquire from my friends' business throw-aways.

Profile ServicEnginIC
Avatar
Send message
Joined: 24 Sep 10
Posts: 566
Credit: 5,937,027,024
RAC: 10,950,571
Level
Tyr
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 54425 - Posted: 22 Apr 2020 | 22:16:18 UTC - in response to Message 54397.
Last modified: 22 Apr 2020 | 22:21:14 UTC

I also received one of these PABLO WUs... and as chance would have it, it was assigned to my slowest GPU currently in production.
It took 34.16 hours to complete on a GTX 750, still in time to get the half bonus.

I was lucky to receive two more PABLO WUs today.

One of them was assigned to my currently fastest GPU,
so I can compare it with the first one already mentioned, which went to the slowest.
I'll post them in image format, because they'll disappear from GPUGrid's database in a few days.
- GTX 750 - WU 19473911: 122,977 seconds = 34.16 hours
- GTX 1660 Ti - WU 19617512: 24,626 seconds = 6.84 hours
Conclusion: the GTX 1660 Ti processed its PABLO WU in about 1/5 of the time of the GTX 750 (122,977 / 24,626 ≈ 5.0, i.e. about 500% relative performance).

The second PABLO WU received today has again been assigned to another of my GTX 750 GPUs, and it will serve as a consistency check on the execution time of the first one.
At this moment, this WU is 25.8% complete after 9.5 hours of execution.

If I were lucky enough to get PABLO WUs on every one of my GPUs, I'd be able to prepare a new table similar to the one at the very start of this thread...

Or, perhaps, Toni could take this particular batch and rebuild the Performance page with data from every GPU model currently contributing to the project...😏

Pop Piasa
Avatar
Send message
Joined: 8 Aug 19
Posts: 252
Credit: 458,054,251
RAC: 0
Level
Gln
Scientific publications
watwat
Message 54428 - Posted: 23 Apr 2020 | 16:51:48 UTC - in response to Message 54425.
Last modified: 23 Apr 2020 | 16:59:29 UTC

Here you go, ServicEngineIC:

GTX 1650 gpu@1860MHz/ mem@4001MHz
(Dell Optiplex 980/i7-860 @2926MHz/ 12GB PC3-10600 @1200MHz)
45,400.71sec =12.6Hrs
http://www.gpugrid.net/workunit.php?wuid=19582490

GTX 1650 gpu@1860MHz/ mem@4001MHz
(ASUS Prime Z-270/i7-7700K @4200MHz/ 16GB PC4-2132 @3000MHz)
44,027.23sec =12.2Hrs
https://www.gpugrid.net/workunit.php?wuid=19495626

Keith Myers
Send message
Joined: 13 Dec 17
Posts: 1284
Credit: 4,920,556,959
RAC: 6,336,580
Level
Arg
Scientific publications
watwatwatwatwat
Message 54429 - Posted: 23 Apr 2020 | 19:10:41 UTC
Last modified: 23 Apr 2020 | 19:14:39 UTC

Here you go, ServicEngineIC for the Pablo tasks.

https://www.gpugrid.net/result.php?resultid=24501891
RTX2080@1805Mhz mem@14400Mhz
14,864 seconds = 4.13hrs
AMD Ryzen 3950X@4100Mhz @3600Mhz CL14

https://www.gpugrid.net/result.php?resultid=24502927
RTX2070@1700Mhz mem@14400Mhz
13,911 seconds = 3.87hrs
AMD Threadripper 2920X@4100Mhz @3466Mhz CL14

Profile ServicEnginIC
Avatar
Send message
Joined: 24 Sep 10
Posts: 566
Credit: 5,937,027,024
RAC: 10,950,571
Level
Tyr
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 54430 - Posted: 23 Apr 2020 | 21:06:32 UTC

Ok!
Thank you!
I'll try to put up a new comparative table this Sunday, April 26th, with all the data I receive up to Saturday 23:59 UTC...
Data, Data, Crunch, Crunch, 🔥🔥

Pop Piasa
Avatar
Send message
Joined: 8 Aug 19
Posts: 252
Credit: 458,054,251
RAC: 0
Level
Gln
Scientific publications
watwat
Message 54438 - Posted: 25 Apr 2020 | 15:28:37 UTC

Well, I tried to manipulate a PABLO WU over to my gtx 750ti by suspending it at 0.23% on my gtx1650, and when it restarted on the 750ti it crashed, falsifying my theory that ACEMD tasks can all be restarted before 10%. Perhaps it is still true running MDAD tasks.

Dmitriy Otroschenko's gtx 1060 3GB ran it:
39,016.09sec = 10.84Hrs.
AMD Ryzen 7 2700X Eight-Core Processor

I'm currently in the middle of 3 more PABLO tasks (almost as if Grosso knows I want them somehow), one on each of the 3 types of GPUs on my hosts. Not sure about being finished on the 750ti by midnight, ServicEnginIC.

Ian&Steve C.
Avatar
Send message
Joined: 21 Feb 20
Posts: 1031
Credit: 35,493,857,483
RAC: 71,175,505
Level
Trp
Scientific publications
wat
Message 54439 - Posted: 25 Apr 2020 | 16:01:00 UTC - in response to Message 54438.

I think the PABLO tasks cannot be restarted at all, even in a system with identical GPUs. I had a GPU crash the other day, and after rebooting, a PABLO task tried to restart on a different GPU and failed immediately, saying that it cannot be restarted on a different device. I haven't had that problem with the MDAD tasks, though. They seem to be able to be stopped and restarted pretty much whenever, even on another device if it's the same model as the one they started on, from 0-90% without issue.
____________

Keith Myers
Send message
Joined: 13 Dec 17
Posts: 1284
Credit: 4,920,556,959
RAC: 6,336,580
Level
Arg
Scientific publications
watwatwatwatwat
Message 54443 - Posted: 25 Apr 2020 | 17:54:26 UTC - in response to Message 54439.

I am still running with the task-switch timer set to 360 minutes, so BOINC doesn't switch away from a running task for six hours. I think that may have covered my Pablo tasks, but I have never changed my settings, even after moving to an all-RTX-2080 host.

Pop Piasa
Avatar
Send message
Joined: 8 Aug 19
Posts: 252
Credit: 458,054,251
RAC: 0
Level
Gln
Scientific publications
watwat
Message 54446 - Posted: 26 Apr 2020 | 1:48:43 UTC
Last modified: 26 Apr 2020 | 2:14:12 UTC

Here are more PABLO results for your database, ServicEnginIC-

My GTX 1650 on ASUS Prime:
44,027.23sec = 12.23hrs
https://www.gpugrid.net/workunit.php?wuid=1949562

My GTX 1060 3GB on ASUS Prime
37556.23sec = 10.43hrs
https://www.gpugrid.net/workunit.php?wuid=19495626

1 PABLO task still running.

Pop Piasa
Avatar
Send message
Joined: 8 Aug 19
Posts: 252
Credit: 458,054,251
RAC: 0
Level
Gln
Scientific publications
watwat
Message 54448 - Posted: 26 Apr 2020 | 15:54:17 UTC - in response to Message 54446.

Here are more PABLO results for your database, ServicEnginIC-

My GTX 1650 on ASUS Prime:
44,027.23sec = 12.23hrs
https://www.gpugrid.net/workunit.php?wuid=1949562

My GTX 1060 3GB on ASUS Prime
37556.23sec = 10.43hrs
https://www.gpugrid.net/workunit.php?wuid=19495626

1 PABLO task still running.


ServicEnginIC, I botched the info above, please ignore it.
Here is what's correct:

My GTX 1650 on ASUS Prime:
44,027.23sec = 12.23hrs
https://www.gpugrid.net/workunit.php?wuid=19495626

My GTX 1060 3GB on ASUS Prime
37,177.73sec = 10.32hrs
https://www.gpugrid.net/workunit.php?wuid=19422730


New entries below:

My GTX 750ti 2GB on Optiplex 980
116,121.21sec = 32.25hrs
https://www.gpugrid.net/workunit.php?wuid=19730833
(a tortoise)

Found more PABLOs...

My GTX 1650 4GB on Optiplex 980
45,518.61sec = 12.64hrs
https://www.gpugrid.net/workunit.php?wuid=19740597

MyGTX 1060 3GB on ASUS Prime
37,556.23sec = 10.43hrs
https://www.gpugrid.net/workunit.php?wuid=19755484

My GTX 1650 4GB on ASUS Prime
44,103.76sec = 12.25hrs
https://www.gpugrid.net/workunit.php?wuid=19436206

That's all for now.

Profile ServicEnginIC
Avatar
Send message
Joined: 24 Sep 10
Posts: 566
Credit: 5,937,027,024
RAC: 10,950,571
Level
Tyr
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 54449 - Posted: 26 Apr 2020 | 19:06:07 UTC - in response to Message 54430.

I'll try to put up a new comparative table this Sunday, April 26th...

I'm waiting to pick up more data to better complete my original table.
Today I caught a new PABLO WU, and it is now running on a GT 1030.
I'm still missing data for my GTX 950 and GTX 1050 Ti...
I've got data from you for the GTX 750 Ti and the GTX 1650, among several other models. Thank you!

Pop Piasa
Avatar
Send message
Joined: 8 Aug 19
Posts: 252
Credit: 458,054,251
RAC: 0
Level
Gln
Scientific publications
watwat
Message 54495 - Posted: 29 Apr 2020 | 17:23:24 UTC - in response to Message 54449.

I'll try to put up a new comparative table this Sunday, April 26th...

I'm waiting to pick up more data to better complete my original table.
Today I caught a new PABLO WU, and it is now running on a GT 1030.
I'm still missing data for my GTX 950 and GTX 1050 Ti...
I've got data from you for the GTX 750 Ti and the GTX 1650, among several other models. Thank you!


Here's one more PABLO run on my i7-7700K ASUS Prime:
GTX 1650 4GB.
44,092.20 sec = 12.25 hrs.
http://www.gpugrid.net/workunit.php?wuid=19859216

Getting consistent results from these Alienware cards. They run pretty quiet for a single-fan cooler at ~80% fan and a steady 59-60 C at ~24 C room temperature (fan control by MSI Afterburner). They run at ~97% usage and ~90% power consumption on ACEMD tasks. They are factory overclocked +200 MHz, so I'd like to see results from a different make of card for comparison.
I think this might be the most GPU compute power per watt that NVIDIA has made so far. 🚀

Pop Piasa
Avatar
Send message
Joined: 8 Aug 19
Posts: 252
Credit: 458,054,251
RAC: 0
Level
Gln
Scientific publications
watwat
Message 54496 - Posted: 29 Apr 2020 | 21:25:27 UTC

I stumbled on this result while tracing a WU error history.

https://www.gpugrid.net/workunit.php?wuid=19660936

RTX 2080ti/Threadripper 3970X 🛸
13,616.88sec = 3.78hrs

For comparison against low power/low heat setups. Would like to see a Titan RTX result also...

Pop Piasa
Avatar
Send message
Joined: 8 Aug 19
Posts: 252
Credit: 458,054,251
RAC: 0
Level
Gln
Scientific publications
watwat
Message 54497 - Posted: 29 Apr 2020 | 21:54:18 UTC - in response to Message 54496.

I stumbled on this result while tracing a WU error history.

https://www.gpugrid.net/workunit.php?wuid=19660936

RTX 2080ti/Threadripper 3970X 🛸
13,616.88sec = 3.78hrs

For comparison against low power/low heat setups. Would like to see a Titan RTX result also...


Also see these results for a PABLO task on a GTX 1660S

https://www.gpugrid.net/workunit.php?wuid=19883994

GTX 1660S/ i5-9600KF/ Linux
26,123.87sec = 7.26hrs

Profile Retvari Zoltan
Avatar
Send message
Joined: 20 Jan 09
Posts: 2343
Credit: 16,201,255,749
RAC: 7,520
Level
Trp
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 54498 - Posted: 29 Apr 2020 | 22:56:42 UTC - in response to Message 54496.
Last modified: 29 Apr 2020 | 23:04:16 UTC

e8s30_e2s69p4f100-PABLO_UCB_NMR_KIX_CMYB_5-0-5-RND3175

RTX 2080ti/Threadripper 3970X 🛸
13,616.88sec = 3.78hrs = 3h 46m 56.88s
This CPU is probably overcommitted, hindering the GPU's performance.

e6s59_e2s143p4f450-PABLO_UCB_NMR_KIX_CMYB_5-2-5-RND0707_0
11,969.48 sec = 3.33hrs = 3h 19m 29.48s RTX 2080Ti / i3-4160

e15s66_e5s54p0f140-PABLO_UCB_NMR_KIX_CMYB_5-0-5-RND6536_0
10,091.61 sec = 2.80hrs = 2h 48m 11.61s RTX 2080Ti / i3-4160

e7s4_e2s94p3f30-PABLO_UCB_NMR_KIX_CMYB_5-1-5-RND1904_0
9,501.28 sec = 2.64hrs = 2h 38m 21.28s RTX 2080Ti / i5-7500 (This GPU has higher clocks then the previous)

Keith Myers
Send message
Joined: 13 Dec 17
Posts: 1284
Credit: 4,920,556,959
RAC: 6,336,580
Level
Arg
Scientific publications
watwatwatwatwat
Message 54499 - Posted: 30 Apr 2020 | 1:13:10 UTC
Last modified: 30 Apr 2020 | 1:13:58 UTC

My latest Pablo on one of my RTX 2080's

https://www.gpugrid.net/workunit.php?wuid=19871635

Run time: 13,218.66 s, CPU time: 13,202.01 s (3.67 hours)

Helps if you don't overcommit your CPU.

ProDigit
Send message
Joined: 13 Nov 19
Posts: 6
Credit: 87,400,696
RAC: 0
Level
Thr
Scientific publications
wat
Message 54574 - Posted: 4 May 2020 | 12:57:40 UTC

For future reference:
the OP set a GPU TDP limit of 120 W, but if he raised that limit by only 5 more watts, in Linux this could include many of the RTX series.

A 2060 does about 7 Tflops at 125 W stock, while a GTX 1660 Ti does only 5.4 Tflops at 120 W.
A GTX 1660 Ti can be power-limited and overclocked, resulting in the same performance at (an estimated) 85 W. That saves you about $40 a year on electricity.

See post here:
http://www.gpugrid.net/forum_thread.php?id=5113
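As a rough check on that $40/year figure, here is a minimal sketch of the arithmetic, plus how a power limit can be applied on Linux (the electricity price is an assumed example, and the 85 W value is the estimate quoted above; the card and driver must support setting a power limit):

    # Yearly savings from running 35 W lower, 24/7.
    watts_saved = 120 - 85
    price_per_kwh = 0.13  # assumed electricity price, $/kWh
    kwh_per_year = watts_saved * 24 * 365 / 1000.0
    print(f"{kwh_per_year:.0f} kWh/year, about ${kwh_per_year * price_per_kwh:.0f}/year")

    # The power limit itself is applied with nvidia-smi (needs root), e.g.:
    #   sudo nvidia-smi -i 0 -pl 85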

Profile ServicEnginIC
Avatar
Send message
Joined: 24 Sep 10
Posts: 566
Credit: 5,937,027,024
RAC: 10,950,571
Level
Tyr
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 54577 - Posted: 4 May 2020 | 14:25:10 UTC - in response to Message 54574.

A GTX 1660 Ti can be power-limited and overclocked, resulting in the same performance at (an estimated) 85 W. That saves you about $40 a year on electricity.

Thank you for your tip.
I'll add it to my list of "things to test"...

rod4x4
Send message
Joined: 4 Aug 14
Posts: 266
Credit: 2,219,935,054
RAC: 0
Level
Phe
Scientific publications
watwatwatwatwatwatwatwatwatwat
Message 54581 - Posted: 5 May 2020 | 0:27:43 UTC - in response to Message 54577.
Last modified: 5 May 2020 | 0:58:54 UTC

A GTX 1660 Ti can be power-limited and overclocked, resulting in the same performance at (an estimated) 85 W. That saves you about $40 a year on electricity.

Thank you for your tip.
I'll add it to my list of "things to test"...


I agree with ProDigit. I have used the same principle on my GPUs and experienced similar results.

I run my GPUs power-limited, and I have also overclocked them under a power limit, on both ACEMD2 and ACEMD3 work units, with positive results.

Caveat:
I have not tested power draw at the wall, and I find it odd that greater performance can be had by both overclocking and power limiting. Testing at the wall should be the next step to validate this method of increasing output.
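One way to cross-check it without a wall meter, at least at the card level, is to log the reported board power over a task and integrate it into Wh per WU (a minimal sketch; note that nvidia-smi's power.draw reads the card only, so it still understates wall power):

    import subprocess, time

    def sample_power(gpu_index=0):
        """Read the current board power draw in watts via nvidia-smi."""
        out = subprocess.run(
            ["nvidia-smi", "-i", str(gpu_index),
             "--query-gpu=power.draw", "--format=csv,noheader,nounits"],
            capture_output=True, text=True, check=True,
        ).stdout
        return float(out.strip())

    interval = 10.0  # seconds between samples
    energy_wh = 0.0
    try:
        while True:  # stop with Ctrl+C when the WU finishes
            energy_wh += sample_power() * interval / 3600.0
            time.sleep(interval)
    except KeyboardInterrupt:
        print(f"Approximate energy for the run: {energy_wh:.1f} Wh")

Comparing Wh per WU at different power limits then shows whether the power-limited run really comes out ahead.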

Profile ServicEnginIC
Avatar
Send message
Joined: 24 Sep 10
Posts: 566
Credit: 5,937,027,024
RAC: 10,950,571
Level
Tyr
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 54583 - Posted: 5 May 2020 | 6:15:53 UTC - in response to Message 54581.

I've read your post with the exhaustive tests in ProDigit's "Guide: RTX (also GTX), increasing efficiency in Linux" thread.
Good job!

Profile ServicEnginIC
Avatar
Send message
Joined: 24 Sep 10
Posts: 566
Credit: 5,937,027,024
RAC: 10,950,571
Level
Tyr
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 55094 - Posted: 5 Jul 2020 | 21:50:35 UTC
Last modified: 5 Jul 2020 | 21:53:05 UTC

When the last ACEMD3 WU outage arrived, I thought it would last longer.
So I searched for new projects to keep my GPUs active, and I joined PrimeGrid.
Also, as I concluded in a previous post, I had decided to retire from production a Maxwell GTX 950 based graphics card, replacing it with a more efficient Turing GTX 1650 one.
I had tested four different graphics cards based on the same GTX 1650 GPU, and a dilemma arose: which model would be the most convenient one to buy again?
- Model #1: PNY VCG16504SFPPB-O
- Model #2: ASUS TUF-GTX1650-4G-GAMING
- Model #3: ASUS DUAL-GTX1650-O4G
- Model #4: ASUS ROG-STRIX-GTX1650-O4G-GAMING

Finally, I chose a second card of Model #4.
I've already given my reasons in the [AF>Libristes] hermes "Question for new gtx / rtx" thread.

As complementary information (closely related to this "Low power GPUs performance comparative" thread), I built a comparative table of the four tested models.
The main data for this comparison come from the execution of PrimeGrid's "Generalized Fermat Prime Search (n=20)" CUDA tasks.
These tasks are very similar in the amount of calculation required, so their execution times are a good basis for a GPU performance comparison.



Conclusions:
- Model #1 is the (negative) winner as the hottest card... but the (positive) winner as the most power-efficient one (the least energy consumed per WU).
- Model #2 is the least factory overclocked, and therefore the lowest performing one, but it is rock stable and easy to keep cool.
- Model #3 is probably penalized by being mounted in a PCIe rev. 2.0 / DDR3 motherboard. Its boost clock is higher than Model #1's, but Model #1's performance is higher (it sits in a PCIe rev. 3.0 / DDR4 motherboard).
- Model #4's power efficiency is penalized by a factory TDP increase made to achieve higher boost clocks, but because of this it is the winner for the highest performance (the lowest execution time).

rod4x4
Send message
Joined: 4 Aug 14
Posts: 266
Credit: 2,219,935,054
RAC: 0
Level
Phe
Scientific publications
watwatwatwatwatwatwatwatwatwat
Message 55095 - Posted: 6 Jul 2020 | 0:03:11 UTC - in response to Message 55094.

Nice analysis of your GTX 1650 GPUs. I always enjoy your technical hardware posts.

An alternative conclusion from the data presented:
for low-power GPUs, Model #1 can be the winner if you have the time or inclination to adjust the GPU fan speeds and optimise the airflow in the host chassis.
Model #4 offers a 2% gain in speed, but at the penalty of an 11% increase in power consumption.
Considering you are comparing a "VW Polo" (PNY) to a "Ferrari" (ASUS ROG) of the GPU world, the "VW Polo" did pretty well.

It comes down to personal choice as to whether cost, raw speed, power efficiency or a combination of these factors is the goal.
