Message boards : Graphics cards (GPUs) : Errors after driver update (Linux)

Profile Wang Solutions
Joined: 23 Feb 09
Posts: 10
Credit: 20,238,048
RAC: 0
Message 10947 - Posted: 2 Jul 2009 | 0:18:21 UTC

Hi Everyone,

I updated my Ubuntu 8.10 box to the latest NVIDIA drivers today, and since then every GPUGrid WU has failed with errors similar to this:

<core_client_version>6.4.5</core_client_version>
<![CDATA[
<message>
process exited with code 193 (0xc1, -63)
</message>
<stderr_txt>
# Using CUDA device 0
# Device 0: "GeForce 9800 GT"
# Clock rate: 1512000 kilohertz
# Total amount of global memory: 1073020928 bytes
# Number of multiprocessors: 14
# Number of cores: 112
SIGSEGV: segmentation violation
Stack trace (21 frames):
acemd_6.64_x86_64-pc-linux-gnu__cuda(boinc_catch_signal+0x58)[0x44d6c8]
/lib/libc.so.6[0x7f1e4c3e20a0]
/usr/lib/libcuda.so.1[0x7f1e4d3e0020]
/usr/lib/libcuda.so.1[0x7f1e4d3e5d84]
/usr/lib/libcuda.so.1[0x7f1e4d3af10f]
/usr/lib/libcuda.so.1[0x7f1e4d13ab3b]
/usr/lib/libcuda.so.1[0x7f1e4d14b46b]
/usr/lib/libcuda.so.1[0x7f1e4d133211]
/usr/lib/libcuda.so.1(cuCtxCreate+0xaa)[0x7f1e4d12cfaa]
../../projects/www.gpugrid.net/libcudart.so.2[0x7f1e4cef9d58]
../../projects/www.gpugrid.net/libcudart.so.2[0x7f1e4cefa2a9]
../../projects/www.gpugrid.net/libcudart.so.2(cudaMalloc+0x44)[0x7f1e4cede0e4]
acemd_6.64_x86_64-pc-linux-gnu__cuda[0x41a018]
acemd_6.64_x86_64-pc-linux-gnu__cuda[0x41a15e]
acemd_6.64_x86_64-pc-linux-gnu__cuda[0x41be47]
acemd_6.64_x86_64-pc-linux-gnu__cuda[0x417815]
acemd_6.64_x86_64-pc-linux-gnu__cuda(sin+0x1591)[0x408a11]
acemd_6.64_x86_64-pc-linux-gnu__cuda(sin+0x1815)[0x408c95]
acemd_6.64_x86_64-pc-linux-gnu__cuda(sin+0x28e)[0x40770e]
/lib/libc.so.6(__libc_start_main+0xe6)[0x7f1e4c3cd466]
acemd_6.64_x86_64-pc-linux-gnu__cuda(sinh+0x49)[0x407579]

Exiting...

</stderr_txt>
]]>

The old driver was completely removed and the new one installed properly. Can anyone shed any light on why this would be occurring?
____________
Proud member of BOINC@AUSTRALIA

Profile Wang Solutions
Joined: 23 Feb 09
Posts: 10
Credit: 20,238,048
RAC: 0
Message 10978 - Posted: 6 Jul 2009 | 5:21:58 UTC - in response to Message 10947.

Well, after 5 days the problem remained, so today I uninstalled the new driver and went back to the old 180.x driver, and the work units are now processing correctly again. It appears that something in the new drivers is causing a SIGSEGV error on Linux.
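
For anyone comparing their own stderr output against the trace in the first post, here is a rough Python sketch that picks out the first stack frame after the signal handler. It is only an illustration: the names `parse_frames` and `faulting_frame` are made up here, the sample is a shortened excerpt of the trace above, and skipping the `boinc_catch_signal` and libc frames is a heuristic assumption, not anything BOINC or GPUGrid provides.

```python
import os
import re

# One glibc-style backtrace line: module, optional (symbol+offset), [address].
FRAME_RE = re.compile(
    r"^(?P<module>[^\s(\[]+)(?:\((?P<symbol>[^)]*)\))?\[(?P<addr>0x[0-9a-fA-F]+)\]$"
)

# Shortened excerpt of the stack trace quoted in the first post.
SAMPLE = """\
acemd_6.64_x86_64-pc-linux-gnu__cuda(boinc_catch_signal+0x58)[0x44d6c8]
/lib/libc.so.6[0x7f1e4c3e20a0]
/usr/lib/libcuda.so.1[0x7f1e4d3e0020]
/usr/lib/libcuda.so.1(cuCtxCreate+0xaa)[0x7f1e4d12cfaa]
../../projects/www.gpugrid.net/libcudart.so.2(cudaMalloc+0x44)[0x7f1e4cede0e4]
"""


def parse_frames(trace):
    """Return a list of (module, symbol, address) tuples, one per frame line."""
    frames = []
    for line in trace.splitlines():
        match = FRAME_RE.match(line.strip())
        if match:
            symbol = (match.group("symbol") or "").split("+")[0]
            frames.append((match.group("module"), symbol, match.group("addr")))
    return frames


def faulting_frame(frames):
    """Return the first frame that is not the signal handler or libc (heuristic)."""
    for module, symbol, addr in frames:
        if symbol == "boinc_catch_signal":
            continue  # the BOINC handler that printed the trace
        if os.path.basename(module).startswith("libc.so"):
            continue  # assumed to be the signal trampoline inside libc
        return module, symbol, addr
    return None


if __name__ == "__main__":
    frames = parse_frames(SAMPLE)
    print("first frame after the handler:", faulting_frame(frames))
    print("named symbols:", [symbol for _, symbol, _ in frames if symbol])
```

On this excerpt the first frame after the handler is in /usr/lib/libcuda.so.1, reached through cuCtxCreate from cudaMalloc. That library is installed by the NVIDIA driver, which fits with the rollback to 180.x making the errors go away.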
____________
Proud member of BOINC@AUSTRALIA

Rabinovitch
Joined: 25 Aug 08
Posts: 143
Credit: 64,937,578
RAC: 0
Message 11100 - Posted: 10 Jul 2009 | 2:05:51 UTC

Yep, there is some incompatibility. I see the same thing on my Kubuntu 9.04 box with the latest drivers. Today I'll try to roll back.
____________
From Siberia with love!
