Message boards : Graphics cards (GPUs) : CUDA info for GM204

eXaPower
Joined: 25 Sep 13
Posts: 293
Credit: 1,897,601,978
RAC: 0
Message 38207 - Posted: 30 Sep 2014 | 19:50:09 UTC

I'm in the process of building a couple of DIY Haswell/Maxwell systems. Before buying a GM204 card, I was wondering if someone could post the technical/device info for GM204. I'd be most grateful.

Most everybody at the engineering firm where I work renders on Tesla or GK110 cards. I would like to know GM204's compute features. I write my own CUDA programs, mostly for immunology and microbiology: antigen-antibody complexes, humoral antibodies combating intracellular antigen functions, cellular immunity, and clonal expansion of lymphocytes (B cells) with T-dependent antigens. A few of my colleagues also write programs for testing. Some use double-precision floats, but most use single precision. A few use integer random number types.

A free program such as SiSoftware Sandra, or AIDA64, shows detailed specifications.

For example, here is the CUDA/ComputeShader/OpenCL info for a GK107 from Sandra:

--CUDA

Graphics Processor
Manufacturer : NVIDIA
Model : NVIDIA GeForce GT 650M
OEM Device Name : nVidia GeForce GT 650M (GK107M)
Hardware ID : VEN_10DE DEV_0FD1 REV_A1
Interface : CUDA GP Processor
Type : Graphics Processor
Version : 3.00
Driver Version : 6.50.06.50
Cores per Processor : 2 Unit(s)
Unified Shaders : 384 Unit(s)
Peak Processing Performance (PPP) : 641.28GFLOPS
Adjusted Peak Performance (APP) : 577.15WG

Cache Information
L1D (1st Level) Data Cache : 16kB, 128bytes Line Size
L2 (2nd Level) Data/Unified Cache : 256kB, 32bytes Line Size

Logical/Chipset Memory Banks
Physical Memory : 2GB
Memory Bus Speed : 2GHz
Channels : 2
Width : 128-bit
Block Size : 727.83MB
Maximum Memory Pitch : 2GB

Device Information
Constants Memory : 64kB
Shared Memory per Work-group : 48kB
Dedicated Shared Memory : Yes
Threads per Work-group : 1024 Unit(s)
Registers per Work-group : 65536
Wavefront Size : 32 Unit(s)
Byte/Integer/Long Native Width : 8-bit / 32-bit
Half/Single/Double-Precision Float Native Width : 32-bit / 64-bit
Maximum Work-group Size : 1024 x 1024 x 64
Maximum Grid Size : 2147483647 x 65535 x 65535

Features
FP64 - Double-Precision Floating-Point Support : Yes
FP16 - Half-Precision Floating-Point Support : No
INT64 - Long Integer Support : Yes
OLP - Can Overlap Memory Transfers : Yes
ZC - Zero Copy Memory Access : No
GBF - Global Buffer Support : Yes
Error Correction Capability : No
CC - Compute Cluster : No
UVA - Unified Virtual Addressing : Yes
PEER - Peer 2 Peer Data Transfer : Yes
64-bit Edition : No

--ComputeShader

Graphics Processor
Model : NVIDIA GeForce GT 650M
OEM Device Name : nVidia GeForce GT 650M (GK107M)
Hardware ID : VEN_10DE DEV_0FD1 REV_A1
Interface : Compute Shader Processor
Type : Graphics Processor
Version : 176.00
Driver Version : 9.18.13.4398
Cores per Processor : 2 Unit(s)
Unified Shaders : 384 Unit(s)

Cache Information
L1D (1st Level) Data Cache : 32kB
L2 (2nd Level) Data/Unified Cache : 256kB

Logical/Chipset Memory Banks
Physical Memory : 2GB
Memory Bus Speed : 405MHz
Min/Max Memory Speed : 2GHz
Channels : 2
Width : 128-bit
Block Size : 488MB
Cached Shared Memory : 2GB

Device Information
Constants Memory : 16kB
Shared Memory per Work-group : 32kB
Dedicated Shared Memory : Yes
Threads per Work-group : 1024 Unit(s)
Registers per Work-group : 32768
Maximum Work-group Size : 1024 x 1024 x 64
Maximum Grid Size : 16384 x 16384 x 2048

Features
CS - Compute Shader Support : Yes
FP64 - Double-Precision Floating-Point Support : Yes
FP16 - Half-Precision Floating-Point Support : Yes
INT64 - Long Integer Support : No
Multi-Threading : Yes
Command Lists : Yes
XDP - Double (FP64) eXtensions Support : No

--OpenCL

Graphics Processor
Manufacturer : NVIDIA Corporation
Model : NVIDIA GeForce GT 650M
OEM Device Name : nVidia GeForce GT 650M (GK107M)
Hardware ID : VEN_10DE DEV_0FD1 REV_A1
Interface : OpenCL GP Processor
Type : Graphics Processor
Version : 1.01.03
Driver Version : 343.98
Cores per Processor : 2 Unit(s)
Unified Shaders : 384 Unit(s)
Peak Processing Performance (PPP) : 641.28GFLOPS
Adjusted Peak Performance (APP) : 577.15WG

System Timer : 1GHz

Cache Information
L1D (1st Level) Data Cache : 32kB, 128bytes Line Size
L2 (2nd Level) Data/Unified Cache : 256kB

Logical/Chipset Memory Banks
Physical Memory : 2GB
Memory Bus Speed : 2GHz
Channels : 2
Width : 128-bit
Block Size : 512MB

Device Information
Constants Memory : 64kB
Shared Memory per Work-group : 48kB
Dedicated Shared Memory : Yes
Threads per Work-group : 1024 Unit(s)
Registers per Work-group : 65536
Wavefront Size : 32 Unit(s)
Byte/Integer/Long Native Width : 8-bit / 32-bit
Half/Single/Double-Precision Float Native Width : 32-bit / 64-bit
Maximum Work-group Size : 1024 x 1024 x 64
Maximum Grid Size : 32768 x 32768 x 4096

Features
FP64 - Double-Precision Floating-Point Support : Yes
FP16 - Half-Precision Floating-Point Support : No
INT64 - Long Integer Support : No
IMG - Images/Textures Support : Yes
LEA - Little Endian Arch : Yes
JIT - Compiler Support : Yes
Error Correction Capability : No
S/G - Memory Scatter/Gather Access : Yes
KOCL - Plain Text Kernel Support : Yes
KNAT - Native/Binary Kernel Support : No
OOO - Out-of-Order Execution Support : Yes
PFX - Profiling Execution Support : Yes
OLP - Can Overlap Memory Transfers : Yes
235 - Fission (Partitioning) Support : No
UVA - Unified Virtual Addressing : No
FPCL - Full Profile : Yes
64-bit Edition : No
SVM - Shared Virtual Memory : No

CL (Compute Language) Extensions
cl_khr_byte_addressable_store : Yes
cl_khr_icd : Yes
cl_khr_gl_sharing : Yes
cl_nv_d3d9_sharing : Yes
cl_nv_d3d10_sharing : Yes
cl_khr_d3d10_sharing : Yes
cl_nv_d3d11_sharing : Yes
cl_nv_compiler_options : Yes
cl_nv_device_attribute_query : Yes
cl_nv_pragma_unroll : Yes
: Yes
cl_khr_global_int32_base_atomics : Yes
cl_khr_global_int32_extended_atomics : Yes
cl_khr_local_int32_base_atomics : Yes
cl_khr_local_int32_extended_atomics : Yes
cl_khr_fp64 : Yes

--Display and Video Adapters
SiSoftware Sandra

Video Adapter
Display : \\.\DISPLAY1
VGA Compatible : No
Official Device Name : NVIDIA GeForce GT 650M
Hardware ID : PCI\VEN_10DE&DEV_0FD1&SUBSYS_397217AA&REV_A1
OEM Device Name : nVidia GeForce GT 650M (GK107M)
Device Name : nVidia GeForce GT 650M (GK107M)
Hardware ID : VEN_10DE DEV_0FD1 REV_A1

Chipset
Model : GK231
Revision : A2
Shader Speed : 835MHz
Min/Max/Turbo Speed : 135MHz - 835/920MHz
Peak Processing Performance (PPP) : 641.28GFLOPS
Adjusted Peak Performance (APP) : 577.15WG
Unified Shaders : 384 Unit(s)
Cores per Processor : 2 Unit(s)
Core Voltage Rating : 0.957V
Min/Max Core Voltage : 0.813V - 1.025V

Multi Video Card Support
No. Linked Devices : 2 Unit(s)
SLI Technology : Yes

Logical/Chipset Memory Banks
Total Memory : 2GB
Memory Bus Speed : 2x 2GHz (4GHz)
Min/Max/Turbo Speed : 2x 405MHz (810MHz) - 2x 2GHz (4GHz)
Channels : 2
Width : 128-bit
Maximum Memory Bus Bandwidth : 62.5GB/s

Graphics Adapter Power Management
835MHz / 920MHz / 2GHz Engaged : 1.000V
835MHz / 920MHz / 2GHz : 1.000V
835MHz / 920MHz / 2GHz : 0.850V

Bus
Type : PCIe
Version : 2.00
Width : x8 / x16
Speed : 5Gbps / 5Gbps
Maximum Bus Bandwidth : 3.9GB/s
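
As a sanity check, the two bandwidth figures above (62.5GB/s for memory, 3.9GB/s for the bus) can be reproduced from the listed width and clock values. This is only a rough sketch: the function names are made up for illustration, and the divide-by-1024 step and the choice of the x8 link width are assumptions about how Sandra arrives at its numbers, not something the dump states.

```python
def memory_bandwidth_mb_s(bus_width_bits, effective_clock_mhz):
    """Peak memory bandwidth in decimal MB/s: bytes per transfer times million transfers per second."""
    return bus_width_bits / 8 * effective_clock_mhz


def pcie2_bandwidth_mb_s(lanes):
    """Peak one-direction PCIe 2.0 bandwidth in decimal MB/s."""
    raw_mbit_per_lane = 5000                     # 5 Gbps per lane on the wire
    payload_mbit = raw_mbit_per_lane * 8 / 10    # 8b/10b encoding: 8 data bits per 10 line bits
    return lanes * payload_mbit / 8              # bits -> bytes


def sandra_gb(mb_s):
    # Assumption: Sandra prints "GB/s" as decimal MB/s divided by 1024.
    return mb_s / 1024


# 128-bit bus, 2x 2GHz = 4000 MHz effective (from the memory bank listing)
mem = sandra_gb(memory_bandwidth_mb_s(128, 4000))
# Assumption: the link is running at the first listed width, x8
pcie = sandra_gb(pcie2_bandwidth_mb_s(8))

print(f"memory: {mem:.1f}GB/s")
print(f"PCIe 2.0 x8: {pcie:.1f}GB/s")
```

The same two functions should work for a GM204 card once its bus width, memory clock, and PCIe link width are known.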

SMBus/i2c Controller 1
Model : GeForce GT 650M DDC16
Version : 0.161

Sensor
Model : GeForce GT 650M
Version : 0.161

Temperature Sensor(s)
Board Temperature : 62.00°C

DirectX 11 Device(s)
Interface Version : 11.00
CS - Compute Shader Support : Yes
FP64 - Double-Precision Floating-Point Support : Yes
Model : NVIDIA GeForce GT 650M
Physical Memory : 2GB
Texture Memory : 2GB

DirectX 10 Device(s)
Interface Version : 10.01
Library Version : 9.18.13.4398
Model : NVIDIA GeForce GT 650M
Physical Memory : 2GB
Texture Memory : 2GB

DirectX 9 Device(s)
Interface Version : 9.00
Model : NVIDIA GeForce GT 650M
Video Driver : nvd3dumx.dll
Library Version : 9.18.13.4398
3D Hardware Acceleration : Yes
Hardware Transform and Light : Yes
Heads : 1 Unit(s)
Pixel Shaders Version : 3.00
Vertex Shaders Version : 3.00

Accelerated Video Decoders
MPEG2 IDCT : Yes
{86695F12-340E-4F04-9FD3-9253DD327460} : Yes
MPEG2 VLD : Yes
{6F3EC719-3735-42CC-8063-65CC3CB36616} : Yes
VC1-D2010 (VC1 VLD 2010) : Yes
VC1-D (VC1 VLD) : Yes
VC1-C (VC1 IDCT) : Yes
WMV9-C (WMV9 IDCT) : Yes
{32FCFE3F-DE46-4A49-861B-AC71110649D5} : Yes
H264-S-P (H264 VLD Stereo Progressive) : Yes
H264-S (H264 VLD Stereo) : Yes
H264-E (H264 VLD NoFGT) : Yes
{1B81BE6A-A0C7-11D3-B984-00C04F2E73C5} : Yes
{8EFA5926-BD9E-4B04-8B72-8F977DC44C36} : Yes
H265-M (HEVC VLD Main) : Yes
{EFD64D74-C9E8-41D7-A5E9-E9B0E39FA319} : Yes
{ED418A9F-010D-4EDA-9AE3-9A65358D8D2E} : Yes
{9947EC6F-689B-11DC-A320-0019DBBC4184} : Yes
{B194EB52-19A0-41F0-B754-CC244AC1CB20} : Yes
{6AFFD11E-1D96-42B1-A215-93A31F09A53D} : Yes

Accelerated Video Processors
Pixel Adaptive Device : Yes
{F9F19DA5-3B09-4B2F-9D89-C64753E3EAAB} : Yes
Progressive Device : Yes
Bob Device : Yes
Software Device : Yes

Video Driver
Expected Windows Version : 4.00
Screen Saver Active : No
Low Power Saving Active : No
Power Off Saving Active : No

Mode
Mode : 1920x1080 32-bit
Refresh Rate : 120Hz
Virtual Desktop Size : 1920x1080

Video Modes
Mode 5 : 640x480 32-bit 59Hz 60Hz 59Hz 60Hz 59Hz 60Hz
Mode 11 : 800x600 32-bit 59Hz 60Hz 59Hz 60Hz 59Hz 60Hz
Mode 17 : 1024x768 32-bit 59Hz 60Hz 59Hz 60Hz 59Hz 60Hz
Mode 23 : 1280x720 32-bit 59Hz 60Hz 59Hz 60Hz 59Hz 60Hz
Mode 29 : 1280x768 32-bit 59Hz 60Hz 59Hz 60Hz 59Hz 60Hz
Mode 35 : 1280x800 32-bit 59Hz 60Hz 59Hz 60Hz 59Hz 60Hz
Mode 41 : 1280x1024 32-bit 59Hz 60Hz 59Hz 60Hz 59Hz 60Hz
Mode 47 : 1360x768 32-bit 59Hz 60Hz 59Hz 60Hz 59Hz 60Hz
Mode 53 : 1366x768 32-bit 59Hz 60Hz 59Hz 60Hz 59Hz 60Hz
Mode 59 : 1680x1050 32-bit 59Hz 60Hz 59Hz 60Hz 59Hz 60Hz
Mode 61 : 1920x1080 32-bit 59Hz 60Hz 80Hz 90Hz 100Hz 120Hz

Device Mode Characteristics
Physical Medium Width : 345 mm / 14 in
Physical Medium Height : 194 mm / 8 in
Recommended Display Size : 20 in
Maximum Resolution : 96x96 dpi
Colour Bits/Planes : 32-bit / 1-bit
Colour Resolution : 24-bit
Pixel Width/Height/Diagonal : 36 / 36 / 51

Enhanced Video Settings
Animation Effects Enabled : Yes
Full Windows Drag Enabled : Yes
Font Smoothing Enabled : Yes
High Contrast Enabled : No

Notice that, per Sandra, Kepler supports INT64 under CUDA but not under OpenCL. Also, NVIDIA's OpenCL for GM204 is 1.1, but 1.2 and 1.3 features are included on GK107.
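
For anyone who wants to compare a GM204 dump against this one, the Peak Processing Performance figure in the listing (641.28GFLOPS) appears to be just shaders x 2 x shader clock. Here is a minimal sketch of that arithmetic. The function name is hypothetical, and the factor of 2 assumes one fused multiply-add (two floating-point operations) per shader per cycle, which is the usual convention for single-precision peak numbers.

```python
def peak_gflops(shaders, clock_mhz, flops_per_cycle=2):
    """Theoretical single-precision peak in GFLOPS.

    flops_per_cycle=2 assumes one fused multiply-add per shader per clock.
    """
    return shaders * flops_per_cycle * clock_mhz / 1000


# GK107 values taken from the Sandra listing: 2 "cores" (SMX units),
# 384 unified shaders in total, 835 MHz shader speed.
SMX_COUNT = 2
CORES_PER_SMX = 192          # Kepler SMX size; 2 x 192 = 384 shaders
shaders = SMX_COUNT * CORES_PER_SMX

ppp = peak_gflops(shaders, 835)
print(f"{shaders} shaders -> {ppp:.2f} GFLOPS")
```

Plugging in the shader count and clock from a GM204 report gives a directly comparable number, although real throughput depends on the kernel and on how much of the work is double precision.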

