Message boards : Number crunching : No WUs since 5 days

zooxit
Joined: 4 Jul 21
Posts: 19
Credit: 469,504,433
RAC: 1,131,300
Message 57697 - Posted: 30 Oct 2021 | 19:13:27 UTC

Hi!

I am new to GPUGRID, so this might be a hasty question...
WUs were readily coming to my hosts until the 25th of October, but none have arrived since then (on either Linux or Windows).
I saw that WUs were available and were being sent to other hosts/users.

Is this normal behaviour for GPUGRID, or did something go wrong with my computers/hosts/BOINC...?

Any comments or explanations are appreciated.

Erich56
Joined: 1 Jan 15
Posts: 944
Credit: 3,683,395,665
RAC: 853,259
Message 57720 - Posted: 3 Nov 2021 | 7:03:03 UTC

Well, you will need to get used to this situation: tasks are available for some time, and then no tasks are available for some time.

However, it would of course be nice if someone from the team could give at least a vague prediction as to when new tasks will be ready for download.
Will this be in about 1 week, 2 weeks, 1 month, ... ?

That would help many volunteers plan their GPU use.

Keith Myers
Joined: 13 Dec 17
Posts: 1070
Credit: 1,450,903,214
RAC: 419,945
Message 57729 - Posted: 3 Nov 2021 | 14:31:39 UTC - in response to Message 57720.

If you want work, go to preferences and select beta applications and the PythonGPU application.

I've been getting tons of work every day.

Jim1348
Joined: 28 Jul 12
Posts: 819
Credit: 1,591,285,971
RAC: 21,587
Message 57730 - Posted: 3 Nov 2021 | 15:45:53 UTC - in response to Message 57729.

Keith Myers wrote: "I've been getting tons of work every day."


I get tons of work units, though whether they should be characterized as work is another matter.
http://www.gpugrid.net/results.php?hostid=452287

But one has been running for 45 minutes, though it is a _5, so still suspect.

Erich56
Joined: 1 Jan 15
Posts: 944
Credit: 3,683,395,665
RAC: 853,259
Message 57731 - Posted: 3 Nov 2021 | 16:34:46 UTC - in response to Message 57729.

Keith Myers wrote: "If you want work, go to preferences and select beta applications and the PythonGPU application. I've been getting tons of work every day."

And are they being crunched successfully?

If I look at Jim1348's task list, I am not really convinced :-(

Keith Myers
Joined: 13 Dec 17
Posts: 1070
Credit: 1,450,903,214
RAC: 419,945
Message 57734 - Posted: 3 Nov 2021 | 19:53:22 UTC - in response to Message 57731.

I have had half a dozen crunch successfully, and one actually ran for the full length of the parameter test and was awarded normal credit for the full run length.
https://www.gpugrid.net/workunit.php?wuid=27085711

Ian&Steve C.
Joined: 21 Feb 20
Posts: 744
Credit: 4,943,588,494
RAC: 515,229
Message 57738 - Posted: 3 Nov 2021 | 20:12:07 UTC - in response to Message 57734.
Last modified: 3 Nov 2021 | 20:12:36 UTC

Keith, have you managed to watch one of these Python GPU tasks while it was running? I'm still wondering whether these are using the GPU yet.

The last batch of beta tasks (several months ago) only used the CPU and never ran anything on the GPU, despite being labelled as GPU work units.

Erich56
Joined: 1 Jan 15
Posts: 944
Credit: 3,683,395,665
RAC: 853,259
Message 57739 - Posted: 3 Nov 2021 | 20:20:00 UTC - in response to Message 57734.

Keith Myers wrote: "I have had a half dozen crunch successfully and one actually ran for the full length of the parameter test and was awarded credits of normal value for the full run length. https://www.gpugrid.net/workunit.php?wuid=27085711"

okay, I'll give it a try.
Since it's cuda 1121, it should work on Ampere cards, right?

Keith Myers
Joined: 13 Dec 17
Posts: 1070
Credit: 1,450,903,214
RAC: 419,945
Message 57742 - Posted: 3 Nov 2021 | 20:40:50 UTC - in response to Message 57738.

Yes, I watched several in nvidia-smi while they were running. I only saw about 31% utilization and about 200 MB of RAM used. It doesn't look like the tasks that couldn't complete fully used much of the GPU.
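
For anyone who wants to log the same two numbers instead of eyeballing them, here is a minimal sketch. It assumes nvidia-smi is on the PATH and uses its CSV query output; the helper names (parse_sample, poll) and the 5-second interval are only illustrative, and a field the driver reports as unavailable would not parse as a number.

```python
import subprocess
import time

# nvidia-smi query that prints one CSV line per GPU, e.g. "31, 200"
# (utilization in percent, memory used in MiB).
QUERY = [
    "nvidia-smi",
    "--query-gpu=utilization.gpu,memory.used",
    "--format=csv,noheader,nounits",
]


def parse_sample(line):
    """Turn one CSV line such as '31, 200' into (utilization %, memory MiB)."""
    util, mem = (field.strip() for field in line.split(","))
    return int(util), int(mem)


def poll(interval_s=5, samples=12):
    """Print utilization and memory use of every GPU, once per interval."""
    for _ in range(samples):
        out = subprocess.run(
            QUERY, capture_output=True, text=True, check=True
        ).stdout
        for gpu_index, line in enumerate(out.strip().splitlines()):
            util, mem = parse_sample(line)
            print(f"GPU {gpu_index}: {util}% utilization, {mem} MiB used")
        time.sleep(interval_s)


# On a host with an NVIDIA driver installed, call poll() while a task runs.
```

Letting this run for the length of a task would show whether the GPU is busy the whole time or only at the start.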

I didn't see the one task that ran normally for 3146 seconds and finished with what looks like a normal BOINC result, because I was out of town at the eye doctor this morning.

That is the one I would have liked to watch with nvidia-smi near the end of its run.

https://www.gpugrid.net/result.php?resultid=32660133


wandb: Run summary:
wandb: TrainReward 0.86433
wandb: action_loss -0.06106
wandb: entropy_loss 1.47128
wandb: floor 0.66667
wandb: l 366.66667
wandb: loss -0.05398
wandb: seed 65.0
wandb: start 0.0
wandb: t 2413.87847
wandb: value_loss 0.01416
wandb:
wandb: Synced 5 W&B file(s), 0 media file(s), 0 artifact file(s) and 0 other file(s)
wandb: Synced obstacle-ppo_e1a2: https://wandb.ai/rl-team-upf/obstacle-ppo/runs/22662wni
wandb: Find logs at: ./wandb/run-20211103_100419-22662wni/logs/debug.log
wandb:
INFO:mlagents_envs.environment:Environment shut down with return code 0.
10:45:21 (1047421): ./gpugridpy/bin/python exited; CPU time 2404.929384
10:45:21 (1047421): called boinc_finish(0)

</stderr_txt>
]]>

Erich56
Joined: 1 Jan 15
Posts: 944
Credit: 3,683,395,665
RAC: 853,259
Message 57744 - Posted: 4 Nov 2021 | 7:02:53 UTC - in response to Message 57729.

Keith Myers wrote: "If you want work, go to preferences and select beta applications and the PythonGPU application. I've been getting tons of work every day."

I have tried to download Python tasks since yesterday afternoon, but have received none so far.
Only later did it occur to me that these tasks are Linux-only at this point, right?
My system is Windows 10.

zooxit
Joined: 4 Jul 21
Posts: 19
Credit: 469,504,433
RAC: 1,131,300
Message 57745 - Posted: 4 Nov 2021 | 8:13:19 UTC

@Keith Myers: Thanks for the feedback. I have had all the settings (tick marks) turned on in the GPUGRID preferences since I joined this project.

Do you have to install some special software (depending on whether it is a Python, ACEMD... task)?

Still no work since the 25th of October. Bummer, I just received an RTX 3080 a week ago :(

zooxit
Joined: 4 Jul 21
Posts: 19
Credit: 469,504,433
RAC: 1,131,300
Message 57746 - Posted: 4 Nov 2021 | 8:26:44 UTC

I stand corrected (I was at work the last two days and didn't check properly).
I received 3 Python tasks (error) and one ACEMD task in the last two days, but all came to my Linux machine.

So I still have this question:
Do I need to install some additional software on the Windows machine (or was it just a coincidence that WUs came only to the Linux machine)?

Thanks for help.

Erich56
Joined: 1 Jan 15
Posts: 944
Credit: 3,683,395,665
RAC: 853,259
Message 57747 - Posted: 4 Nov 2021 | 8:31:13 UTC - in response to Message 57746.

zooxit wrote: "Do I need to install some additional software on Windows machine (or was it just a coincidence that WUs came only to Linux machine)?"

Most probably it was NO coincidence that the Python tasks came to your Linux machine. If they run only on Linux, which I am almost sure of, only Linux machines will receive them.
And there is nothing you can install on your Windows machine to make a Linux task run on it :-(

Keith Myers
Joined: 13 Dec 17
Posts: 1070
Credit: 1,450,903,214
RAC: 419,945
Message 57760 - Posted: 4 Nov 2021 | 16:55:15 UTC - in response to Message 57745.

zooxit wrote: "@Keith Myers: Thanks for feedback. I have all the settings (tick marks) turned on in GPUGRID preferences (from the time I have joined this project).

Do you have to install some special software (depending if it is a Python, ACEMD... task)?

Still no work since 25th of October. Bummer, I just received rtx3080 a week ago :("


The new PythonGPU application is still in beta at this time and only available to Linux hosts.

It was the same way with the new acemd3 application: applications start out as beta for Linux hosts only, then get brought to mainline, and versions for both Windows and Linux are created.

You shouldn't have to install anything. The tasks are bundled with all the support dependencies they need. Working out those dependencies is what the beta phase on Linux hosts is for.

We found out that the libboost 1.74 library was needed for the tasks to complete successfully, and we added that library manually. The mainline application now released for all hosts includes that package in the task bundle.

You just have to be patient while the Linux hosts work out the bugs for the developers. I'm sure the Python apps will come to Windows hosts shortly.

In case you didn't notice, we picked up a new developer scientist for the Python machine learning applications. New blood for the project, which is very good.

Erich56
Joined: 1 Jan 15
Posts: 944
Credit: 3,683,395,665
RAC: 853,259
Message 57762 - Posted: 4 Nov 2021 | 18:30:11 UTC

Most of the recent posts dealt with the Python tasks.

To come back to the topic of this thread, though: can anyone tell when distribution of ACEMD tasks will resume?

Keith Myers
Joined: 13 Dec 17
Posts: 1070
Credit: 1,450,903,214
RAC: 419,945
Message 57763 - Posted: 4 Nov 2021 | 19:04:41 UTC - in response to Message 57762.

Since the topic of this thread is "No WUs since 5 days" and there has in fact been work (PythonGPU) every day up to today, maybe, if you are the OP, you should retitle the thread to "No acemd3 WUs since 5 days".

The acemd3 work has been spotty since the summer. It comes in bursts. My last acemd3 task was Nov. 1.
