Message boards : Number crunching : WU with error code

Someone537833
Joined: 20 Jan 21
Posts: 2
Credit: 25,756,668
RAC: 509,392
Message 61942 - Posted: 18 Nov 2024 | 23:33:26 UTC

Hello. I am having trouble getting my work units to pass. All the work units I receive exit with an error. It looks like a permission issue?

I know it's not good practice, but I have run chmod 777 on everything I can think of to diagnose the issue, and the errors persist. I was reading in another thread that most of the current apps need lots of VRAM.

My server previously ran GPU Grid 3 years ago without issues, but are my 1060 3GB GPUs just too old now?

http://www.gpugrid.org/show_host_detail.php?hostid=628075

Thanks for the help.

Ian&Steve C.
Joined: 21 Feb 20
Posts: 1069
Credit: 40,231,533,983
RAC: 432
Message 61943 - Posted: 19 Nov 2024 | 0:38:27 UTC - in response to Message 61942.

I think you need to upgrade to newer drivers; 470 is too old, I think. The CUDA version is mislabeled: you actually need CUDA 12 drivers.

Install at least the 535 series, but you could also install the latest 550 or 555 drivers.

The acemd task errors are definitely due to the old driver, and the ATMML tasks probably have the same problem.

You might still see errors with Quantum Chemistry tasks, however, even with newer drivers. These tasks tend to use a lot of VRAM, and 3GB is probably not enough.
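To check a host against the advice above, a rough Python sketch like the following could read the installed driver version and total VRAM from nvidia-smi. The 535 minimum is taken from this reply; the function names are made up for illustration, and the script assumes nvidia-smi is on the PATH.

```python
import subprocess

# Minimum driver series suggested in the reply above (an assumption of this
# sketch, not an official project requirement).
MIN_DRIVER_MAJOR = 535


def driver_major(version: str) -> int:
    """Return the major component of a driver version such as '550.54.14'."""
    return int(version.strip().split(".")[0])


def meets_minimum(version: str, minimum: int = MIN_DRIVER_MAJOR) -> bool:
    """True if the driver's major version is at least the given minimum."""
    return driver_major(version) >= minimum


def query_gpus() -> list[tuple[str, str, str]]:
    """Ask nvidia-smi for (name, driver_version, memory.total) per GPU."""
    out = subprocess.run(
        [
            "nvidia-smi",
            "--query-gpu=name,driver_version,memory.total",
            "--format=csv,noheader",
        ],
        capture_output=True,
        text=True,
        check=True,
    ).stdout
    rows = []
    for line in out.strip().splitlines():
        # Split from the right so a comma in the GPU name cannot shift fields.
        name, driver, mem = (field.strip() for field in line.rsplit(",", 2))
        rows.append((name, driver, mem))
    return rows


def report() -> None:
    """Print one line per GPU with the driver check and the total VRAM."""
    for name, driver, mem in query_gpus():
        status = "ok" if meets_minimum(driver) else "too old"
        print(f"{name}: driver {driver} ({status}), VRAM {mem}")
```

Calling `report()` on the host prints one line per card. A 470-series driver is flagged as too old by this check, while 535 and later pass. The VRAM figure is only printed, not judged, since the thread gives no exact memory requirement.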

Someone537833
Joined: 20 Jan 21
Posts: 2
Credit: 25,756,668
RAC: 509,392
Message 61947 - Posted: 21 Nov 2024 | 6:24:20 UTC - in response to Message 61943.

Thanks for the help. Seems to be working.

I updated the driver to 550, but then BOINC couldn't locate my GPUs anymore. I was running BOINC 7.24; after upgrading to 8.0.2 it seems to be working again. I'm avoiding the Quantum Chemistry tasks.

I have no iGPU and rely on one of my GPUs to drive the monitor. The task won't pause when the computer is in use, and most of the time the screen remains blank. I can still ssh into my computer and use boinccmd.
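For anyone in the same situation, one possible workaround over ssh is to pause GPU work by hand with boinccmd. This is only a sketch: the helper names are hypothetical, and it assumes boinccmd is on the PATH and is run from a location where it can authenticate to the local client.

```python
import subprocess

VALID_MODES = ("always", "auto", "never")


def gpu_mode_command(mode: str, duration_s: int = 0) -> list[str]:
    """Build a `boinccmd --set_gpu_mode` command line.

    mode: 'never' suspends GPU work, 'auto' follows the client preferences,
    'always' runs GPU work regardless. duration_s is how long the override
    lasts in seconds; 0 makes it permanent.
    """
    if mode not in VALID_MODES:
        raise ValueError(f"mode must be one of {VALID_MODES}, got {mode!r}")
    if duration_s < 0:
        raise ValueError("duration_s must be non-negative")
    return ["boinccmd", "--set_gpu_mode", mode, str(duration_s)]


def set_gpu_mode(mode: str, duration_s: int = 0) -> None:
    """Run the command; raises CalledProcessError if boinccmd fails."""
    subprocess.run(gpu_mode_command(mode, duration_s), check=True)


def list_tasks() -> str:
    """Return the raw text output of `boinccmd --get_tasks`."""
    result = subprocess.run(
        ["boinccmd", "--get_tasks"], capture_output=True, text=True, check=True
    )
    return result.stdout
```

For example, `set_gpu_mode("never", 3600)` would suspend GPU computing for an hour so the display card is free, and `set_gpu_mode("auto")` hands control back to the client's preferences.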

I suppose I'm pushing my old cards to the limit. I've been running these 1060s since 2019 and never thought they would last this long. Not complaining, but I'm looking forward to upgrading once they bite the dust.

Have successfully completed two tasks so far.

Thanks again for the help.
