Message boards : Number crunching : Nvidia 560 Ti seems slow

vorksholk
Joined: 22 Dec 09
Posts: 2
Credit: 31,062,789
RAC: 0
Message 22614 - Posted: 29 Nov 2011 | 0:58:39 UTC

Alright, so a little while ago I built a gaming computer with the following specs:
8 GB 1600 MHz DDR3 RAM
NVidia 560 Ti GPU
Intel Core i5-2500k (3.3 GHz, 3.7 Turboboost)
2 TB HD
Asus P8Z68-v PRO MOBO

So I installed BOINC and began running GPUGrid on Windows 7 64-bit Home Premium. I was able to crunch one of the extra-long work units (the ones they say take 8-12 hours on the best cards) in around 9 hours. Everything seemed great. A few days ago I decided to try overclocking, so I went into AISuite and chose the "Fast" automatic TPU overclock. I rebooted, but World Community Grid, when running, pushed my processor too hot, so I decided to reverse the overclock (I don't have an aftermarket heatsink).
I booted into the BIOS of the P8Z68-v PRO and changed the settings back to what I think was the default (I didn't manually enter any numbers or enable EPU; I simply turned the performance back to "High Performance" and whatnot). Now my GPU seems able to do only half of one of these longer work units in the time it used to take to finish a whole one. CUDA-Z showed around 800 GFlops for single precision but only around 112 GFlops for double precision. I was under the impression that double precision takes around twice as long, so going by the single-precision measurement my double-precision rating should be about 400 GFlops, correct? I was also under the impression that the 560 Ti could hit a TFlop while running a GPUGrid WU, so these numbers were a bit surprising. Did I somehow screw something up in the BIOS, or is it normal for some long work units to take 2x as long as others of supposedly the same length?

Thank you :)

skgiven
Volunteer moderator
Volunteer tester
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Message 22617 - Posted: 29 Nov 2011 | 13:07:08 UTC - in response to Message 22614.
Last modified: 29 Nov 2011 | 13:07:36 UTC

Tasks vary in length, so your GPU is OK.

Task | Work unit | Sent (UTC) | Reported (UTC) | Status | Run time (sec) | CPU time (sec) | Claimed credit | Granted credit | Application
4584836 | 2883476 | 25 Nov 2011 6:25:00 | 26 Nov 2011 17:48:17 | Completed and validated | 28,741.63 | 362.66 | 21,395.64 | 26,744.55 | Long runs (8-12 hours on fastest card) v6.15 (cuda31)
4581046 | 2880431 | 24 Nov 2011 15:12:01 | 25 Nov 2011 18:51:11 | Completed and validated | 34,455.80 | 3,375.61 | 24,990.59 | 31,238.24 | Long runs (8-12 hours on fastest card) v6.15 (cuda31)
4564337 | 2867566 | 19 Nov 2011 23:31:51 | 24 Nov 2011 2:33:16 | Completed and validated | 55,478.35 | 6,468.91 | 44,998.74 | 44,998.74 | Long runs (8-12 hours on fastest card) v6.15 (cuda31)
4563285 | 2866095 | 19 Nov 2011 17:10:08 | 19 Nov 2011 23:31:51 | Completed and validated | 13,774.88 | 199.38 | 5,199.44 | 7,799.16 | ACEMD2: GPU molecular dynamics v6.15 (cuda31)

As you can see, the bottom task was a 'short/normal length' task (13,774 sec). The next task up was a (very) long task. The top two tasks ran for roughly 29K and 34K seconds. Different tasks also use different amounts of CPU time. Credit is awarded depending on return+report time, so the task that took 55K sec was not reported within 48 h and thus received normal credit. Tasks reported in under 48 h get 25% extra credit, and tasks reported within 24 h get 50% extra credit.
Have a look at the FAQ's.
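
For anyone who wants to check the credit figures by hand, here is a minimal Python sketch of the bonus rule described above. The function name `granted_credit` is made up for this example, and the exact handling of the 24 h and 48 h boundaries is an assumption; it is not the project's actual credit code.

```python
from datetime import datetime, timedelta

def granted_credit(claimed, sent, reported):
    """Apply the turnaround bonus described in the post (illustrative only)."""
    turnaround = reported - sent
    if turnaround <= timedelta(hours=24):
        bonus = 0.50   # reported within 24 h
    elif turnaround < timedelta(hours=48):
        bonus = 0.25   # reported in under 48 h
    else:
        bonus = 0.0    # too late for a bonus: normal credit
    return round(claimed * (1 + bonus), 2)

# Two of the tasks listed above: (claimed credit, sent, reported)
short_task = (5199.44, datetime(2011, 11, 19, 17, 10, 8), datetime(2011, 11, 19, 23, 31, 51))
late_task = (44998.74, datetime(2011, 11, 19, 23, 31, 51), datetime(2011, 11, 24, 2, 33, 16))

print(granted_credit(*short_task))
print(granted_credit(*late_task))
```

Feeding in the sent and reported times of the four tasks listed above should give back the granted credit shown for each of them.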

Daniel Neely
Joined: 21 Feb 09
Posts: 5
Credit: 36,705,213
RAC: 134
Message 22633 - Posted: 6 Dec 2011 | 2:32:09 UTC - in response to Message 22614.

Flops64 = flops32/2 is only true if all your hardware pipelines are capable of running either two 32-bit operations or one 64-bit operation at a time. This is generally the case in CPUs. It's not the case in GPUs, because 64-bit is virtually useless for gaming.

For consumer cards, nVidia runs FP64 at 1/8 of the FP32 rate on their top tier (570, 580, 590) and at 1/12 on the middle tier (560, 550). I'm not sure if their lower tiers support FP64 at all. Quadro/Tesla cards based on the same GPU as the 570+ do have 1/2 FP64 performance, but are priced high enough that only companies with deep pockets can afford them. Quadros built around lesser GPUs are as limited as their consumer equivalents.

ATI offers 1/4 FP64 performance on their 69xx series cards, 1/5 on the previous 58xx and 48xx series, and IIRC none at all on lower models, since they're not in the business of marking GPUs up 4-6x to sell to enterprise customers.
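
As a rough illustration of the ratios above, here is a minimal Python sketch that scales a single-precision figure by the quoted FP64 fraction. The table and function names are made up for this example, and the ratios are just the ones quoted in this post. Measured benchmark numbers such as CUDA-Z's will not necessarily line up with them.

```python
from fractions import Fraction

# FP64:FP32 throughput ratios as quoted in this post (not an official table)
FP64_RATIO = {
    "GeForce 570/580/590": Fraction(1, 8),
    "GeForce 550/560": Fraction(1, 12),
    "Radeon 69xx": Fraction(1, 4),
    "Radeon 58xx/48xx": Fraction(1, 5),
}

def estimated_fp64_gflops(fp32_gflops, family):
    """Scale an FP32 GFLOPS figure by the quoted FP64 ratio for a card family."""
    return float(fp32_gflops * FP64_RATIO[family])

# The 800 GFLOPS single-precision reading from the original question
for family in FP64_RATIO:
    print(f"{family}: {estimated_fp64_gflops(800, family):.1f} GFLOPS FP64")
```

The point is only that a 560-class card should be judged against a 1/12 ratio, not the 1/2 assumed in the question.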
