
Message boards : Number crunching : acemd simulation vs. python simulation GPU

Author Message
Jari Kosonen
Joined: 5 May 22
Posts: 22
Credit: 8,923,305
RAC: 779
Message 59093 - Posted: 11 Aug 2022 | 3:19:07 UTC

In the ACEMD case, the GPU driver dropped out,
and the remaining acemd process could not be killed with the 'kill' command.
In that case, I guess the /home partition (where the BOINC files are)
does not always get unmounted, which can crash the /home partition and lead to a lot of fsck runs.
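
One way to see why a plain 'kill' seems to do nothing is to look at the process state that Linux reports in /proc/<pid>/stat. A process in uninterruptible sleep (state D), for example while blocked inside a driver call, does not act on signals until that call returns. The following is only a minimal sketch, assuming a Linux system; the function names and the sample line are made up for illustration and are not taken from BOINC or ACEMD.

```python
def parse_proc_state(stat_line: str) -> str:
    """Return the one-letter state field from a /proc/<pid>/stat line."""
    # The command name sits in parentheses and may itself contain spaces
    # or ')' characters, so split after the LAST closing parenthesis.
    after_comm = stat_line[stat_line.rindex(")") + 1:]
    return after_comm.split()[0]


def read_proc_state(pid: int) -> str:
    """Read the state of a live process (Linux only)."""
    with open(f"/proc/{pid}/stat") as f:
        return parse_proc_state(f.read())


# Hypothetical sample line for illustration, not a real log.
sample = "4242 (acemd) D 1 4242 4242 0 -1 4194304"
print(parse_proc_state(sample))
```

If read_proc_state() returns "D" for the stuck process, the signal stays pending until the blocked call completes, which would match a process that survives 'kill'.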

Keith Myers
Joined: 13 Dec 17
Posts: 1070
Credit: 1,450,903,214
RAC: 419,945
Message 59096 - Posted: 11 Aug 2022 | 17:29:37 UTC - in response to Message 59093.

Not sure what you are describing. Sounds like you lost your video drivers and the task crashed?

I've never had an issue with a driver or card dropping off the bus causing disk corruption. Normally it just throws away all your onboard tasks with errors and puts you into the penalty box awaiting new work.

I have BOINC installed in /home and have never corrupted /home, even though I have lost work many times when an unattended driver update pulled the driver out from underneath running work.

I've never found it necessary to kill a BOINC process. Some of them take a while to end though, like the python processes. You just need the patience to wait out the couple of minutes, and they end by themselves normally.

Jari Kosonen
Joined: 5 May 22
Posts: 22
Credit: 8,923,305
RAC: 779
Message 59126 - Posted: 18 Aug 2022 | 11:04:04 UTC - in response to Message 59096.

I think that if the process does not get killed during shutdown,
the shutdown does not complete normally, and if I then press the "reset" button,
the HDD does not get unmounted, which causes these segment failures.
fsck possibly cannot fix 100% of them correctly during the next boot.
