Kernel [frc_sum_kernel

Message boards : Graphics cards (GPUs) : Kernel [frc_sum_kernel_bond] failed

Author	Message
Zydor Joined: 8 Feb 09 Posts: 252 Credit: 1,309,451 RAC: 0	Message 7306 - Posted: 9 Mar 2009 | 17:17:56 UTC
	The final line - unknown error - probably gives a subtle clue :) However, does anything about this one strike anyone? I had some dramas 2 weeks ago as motherboards melted around me, but that's sorted now and the errant PCs are rebuilt. The GPUGrid WUs are remarkably stable as far as I am concerned - plaudits to the devs - so it's unusual for me to see this. Just curious in case there is something obvious I did to cause it.
	Regards
	Zy

	<core_client_version>6.5.0</core_client_version>
	<![CDATA[
	<message>
	Incorrect function. (0x1) - exit code 1 (0x1)
	</message>
	<stderr_txt>
	# Using CUDA device 0
	# Device 0: "GeForce 9800 GTX/9800 GTX+"
	# Clock rate: 1915000 kilohertz
	# Total amount of global memory: 536870912 bytes
	# Number of multiprocessors: 16
	# Number of cores: 128
	MDIO ERROR: cannot open file "restart.coor"
	# Using CUDA device 0
	# Device 0: "GeForce 9800 GTX/9800 GTX+"
	# Clock rate: 1915000 kilohertz
	# Total amount of global memory: 536870912 bytes
	# Number of multiprocessors: 16
	# Number of cores: 128
	Cuda error: Kernel [frc_sum_kernel_bond] failed in file 'force.cu' in line 283 : unknown error.

	</stderr_txt>
	]]>

uBronan Joined: 1 Feb 09 Posts: 139 Credit: 575,023 RAC: 0	Message 7322 - Posted: 10 Mar 2009 | 15:35:34 UTC - in response to Message 7306.
	Do you have another client, like S@H on CUDA, running as well? When I stopped running the other CUDA client simultaneously, it seemed to run OK again. But time will tell; I am on the second unit now.

Zydor Joined: 8 Feb 09 Posts: 252 Credit: 1,309,451 RAC: 0	Message 7325 - Posted: 10 Mar 2009 | 19:10:40 UTC - in response to Message 7322. Last modified: 10 Mar 2009 | 19:13:39 UTC
	I do run S@H CUDA, but not at the same time. I split the time the card is used between SETI & GPUGrid; at present it's about 50% to each. I will crunch a GPUGrid WU, then when it's done, run SETI CUDA for 12 hours or so. From time to time I might end up suspending the GPUGrid WU for a couple of hours or so (set to unload from memory, not to stay suspended in memory), for many reasons, but that does not seem to have caused an issue in the past.

	The hassle a couple of weeks ago in the crunching - self-evident from my task list - was due to multiple hardware failures on two PCs, which ended with rebuilding both - new motherboards, CPUs, cards etc. - just Murphy's Law hitting all at once.

	The GPUGrid WU is - from where I sit - remarkably stable [thank you to the Devs - nice one :) ], and I can safely say I have never had a problem attributable to the WU itself; issues were caused by me or my equipment. That's why I was curious about this one: all appears OK at this end, and it's so rare to have a WU issue that I thought there might be something obvious I have missed.

	At the end of the day, the world has not ended, just a natural follow up in case there is conventional wisdom I need to be aware of, and have missed.
	Regards
	Zy

ExtraTerrestrial Apes Volunteer moderator Volunteer tester Joined: 17 Aug 08 Posts: 2705 Credit: 1,311,122,549 RAC: 0	Message 7331 - Posted: 10 Mar 2009 | 19:27:36 UTC - in response to Message 7325.
	"At the end of the day, the world has not ended, just a natural follow up in case there is conventional wisdom I need to be aware of, and have missed."

	There may be some wisdom to be found here, but I don't think it's common ;)
	MrS
	____________
	Scanning for our furry friends since Jan 2002

Bymark Joined: 23 Feb 09 Posts: 30 Credit: 5,897,921 RAC: 0	Message 7384 - Posted: 12 Mar 2009 | 16:48:23 UTC - in response to Message 7306.
	I have almost the same error and not a clue what it's about:
	Regards
	Thomas
	-----------------------------------------------------------
	Outcome: Client error
	Client state: Compute error
	Exit status: 1 (0x1)
	Computer ID: 28987
	Report deadline: 16 Mar 2009 10:18:06 UTC
	CPU time: 492.5469
	stderr out:

	<core_client_version>6.4.5</core_client_version>
	<![CDATA[
	<message>
	Incorrect function. (0x1) - exit code 1 (0x1)
	</message>
	<stderr_txt>
	# Using CUDA device 0
	# Device 0: "GeForce GTX 260"
	# Clock rate: 1242000 kilohertz
	# Total amount of global memory: 939196416 bytes
	# Number of multiprocessors: 27
	# Number of cores: 216
	MDIO ERROR: cannot open file "restart.coor"
	# Using CUDA device 0
	# Device 0: "GeForce GTX 260"
	# Clock rate: 1242000 kilohertz
	# Total amount of global memory: 939196416 bytes
	# Number of multiprocessors: 27
	# Number of cores: 216
	Cuda error: Kernel [frc_sum_nb_forces] failed in file 'force.cu' in line 244 : unknown error.

	</stderr_txt>
	]]>

	Validate state: Invalid
	Claimed credit: 2478.98611111111
	Granted credit: 0
	application version: 6.62
	____________
	"Silakka"
	Hello from Turku > Åbo.
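	For anyone comparing several failed tasks, the last stderr line can be pulled apart mechanically. Below is a minimal Python sketch, assuming the line always has the shape seen in the two reports above (Cuda error: Kernel [NAME] failed in file 'FILE' in line N : REASON.). The function name find_cuda_errors and the dictionary keys are made up for illustration; they are not part of BOINC or the GPUGrid application.

```python
import re

# Assumed shape of the error line, taken from the stderr output quoted in this thread.
CUDA_ERROR_RE = re.compile(
    r"Cuda error: Kernel \[(?P<kernel>\w+)\] failed "
    r"in file '(?P<file>[^']+)' in line (?P<line>\d+) : (?P<reason>[^.\n]+)\."
)


def find_cuda_errors(stderr_text):
    """Return one dict per matching 'Cuda error' line (hypothetical helper)."""
    return [
        {
            "kernel": m.group("kernel"),       # name of the kernel that failed
            "file": m.group("file"),           # source file named in the message
            "line": int(m.group("line")),      # source line named in the message
            "reason": m.group("reason").strip(),
        }
        for m in CUDA_ERROR_RE.finditer(stderr_text)
    ]


if __name__ == "__main__":
    # Error lines copied from the two reports in this thread.
    sample = (
        "Cuda error: Kernel [frc_sum_kernel_bond] failed in file 'force.cu' "
        "in line 283 : unknown error.\n"
        "Cuda error: Kernel [frc_sum_nb_forces] failed in file 'force.cu' "
        "in line 244 : unknown error.\n"
    )
    for err in find_cuda_errors(sample):
        print(err["kernel"], err["file"], err["line"], err["reason"])
```

	Run over both reports, this makes it easy to see that the two tasks failed in different kernels but with the same "unknown error" reason. It says nothing about the cause, which the message itself does not reveal.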

Zydor Joined: 8 Feb 09 Posts: 252 Credit: 1,309,451 RAC: 0	Message 7399 - Posted: 12 Mar 2009 | 20:27:14 UTC - in response to Message 7384.
	I've had a couple of BSODs lately; there is a driver clash somewhere, so I came to the (guessed) conclusion that the error was probably related to the BSODs, and put it down to Life's Sweet Pattern :) If it happens again in a short space of time, I'll "perk up" and dig a little.
	Regards
	Zy

Bymark Joined: 23 Feb 09 Posts: 30 Credit: 5,897,921 RAC: 0	Message 7426 - Posted: 13 Mar 2009 | 20:02:39 UTC - in response to Message 7399.
	My error was caused by Remote Desktop, and now everything is working fine. Never use Windows Remote Desktop with GPU projects!
	Regards
	Silakka

uBronan Joined: 1 Feb 09 Posts: 139 Credit: 575,023 RAC: 0	Message 7465 - Posted: 15 Mar 2009 | 10:50:35 UTC
	Lol, agreed. I learned that very soon after I started running SETI: it crashed several times when I was using RDC :D
