Recently, a faulty Titan GPU card made my system unstable: it started crashing regularly, which resulted in calculation errors on the GPUGrid WUs.
I have multiple GPUs (GTX 1070 & 1070 Ti) in my system.
Whenever my system crashes due to this faulty card, the WU is automatically and completely lost because of these calculation errors.
This is in stark contrast with CPU WUs (Rosetta, LHC, World Community Grid, ClimatePrediction), which seem to recover fully: they simply restart their calculations from a previously saved intermediate point and calculate their way to successful completion. ClimatePrediction, for example, has WUs running for 36 hours on end.
Long runs on GPUGrid take between 6 and 12 hours; if you're 5h30 or further into a WU when it is lost, that's a massive loss of time and power.
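To make clear what I mean by restarting from a saved intermediate point, here is a minimal sketch in Python. It is only an illustration of the general idea, not how GPUGrid or any of the projects above actually implement it; the function `run_work_unit`, the file name `wu_checkpoint.json` and the "crash" are all made up for the example, and the "calculation" is just a running sum.

```python
import json
import os
import tempfile


def run_work_unit(total_steps, checkpoint_path, checkpoint_every=100, crash_at=None):
    """Sum the integers 0..total_steps-1, saving progress periodically.

    Hypothetical stand-in for a long calculation. crash_at simulates a
    system crash by raising an exception at the given step.
    """
    step, acc = 0, 0
    # Resume from the last saved intermediate point, if there is one.
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            state = json.load(f)
        step, acc = state["step"], state["acc"]

    while step < total_steps:
        if crash_at is not None and step == crash_at:
            raise RuntimeError("simulated crash")
        acc += step
        step += 1
        if step % checkpoint_every == 0:
            # Write to a temporary file first, then rename it, so a crash
            # during the write cannot corrupt the previous checkpoint.
            tmp = checkpoint_path + ".tmp"
            with open(tmp, "w") as f:
                json.dump({"step": step, "acc": acc}, f)
            os.replace(tmp, checkpoint_path)
    return acc


def demo():
    path = os.path.join(tempfile.mkdtemp(), "wu_checkpoint.json")
    try:
        run_work_unit(1000, path, crash_at=550)
    except RuntimeError:
        pass  # the "system" crashed at step 550
    # Restart: work resumes from the checkpoint at step 500, not from zero.
    return run_work_unit(1000, path)
```

In this toy example, the crash at step 550 costs only the 50 steps since the last checkpoint instead of the whole run, and the restarted run still arrives at the same final result.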
Can this kind of checkpointing be developed for GPUGrid too, please?
I would be very grateful (and I guess many crunchers with me) if you could introduce this functionality soon.
Many thanks in advance!