Message boards : Server and website : New user can't get any work
Author | Message |
---|---|
I have a teammate who has joined GPUGrid and is unable to get any work. His cruncher meets all the requirements.

Fri 21 Feb 2020 01:45:45 PM EST | | [work_fetch] Request work fetch: Backoff ended for GPUGRID
Fri 21 Feb 2020 01:45:49 PM EST | | choose_project(): 1582310749.544944
Fri 21 Feb 2020 01:45:49 PM EST | | [work_fetch] ------- start work fetch state -------
Fri 21 Feb 2020 01:45:49 PM EST | | [work_fetch] target work buffer: 86400.00 + 864.00 sec
Fri 21 Feb 2020 01:45:49 PM EST | | [work_fetch] --- project states ---
Fri 21 Feb 2020 01:45:49 PM EST | Einstein@Home | [work_fetch] REC 34160464.119 prio -0.000 can't request work: suspended via Manager
Fri 21 Feb 2020 01:45:49 PM EST | GPUGRID | [work_fetch] REC 0.000 prio 0.000 can request work
Fri 21 Feb 2020 01:45:49 PM EST | SETI@home | [work_fetch] REC 226750485.698 prio -0.000 can't request work: suspended via Manager (88.81 sec)
Fri 21 Feb 2020 01:45:49 PM EST | | [work_fetch] --- state for CPU ---
Fri 21 Feb 2020 01:45:49 PM EST | | [work_fetch] shortfall 1396224.00 nidle 16.00 saturated 0.00 busy 0.00
Fri 21 Feb 2020 01:45:49 PM EST | Einstein@Home | [work_fetch] share 0.000 blocked by project preferences
Fri 21 Feb 2020 01:45:49 PM EST | GPUGRID | [work_fetch] share 0.000 blocked by project preferences
Fri 21 Feb 2020 01:45:49 PM EST | SETI@home | [work_fetch] share 0.000 blocked by project preferences
Fri 21 Feb 2020 01:45:49 PM EST | | [work_fetch] --- state for NVIDIA GPU ---
Fri 21 Feb 2020 01:45:49 PM EST | | [work_fetch] shortfall 5584896.00 nidle 64.00 saturated 0.00 busy 0.00
Fri 21 Feb 2020 01:45:49 PM EST | Einstein@Home | [work_fetch] share 0.000
Fri 21 Feb 2020 01:45:49 PM EST | GPUGRID | [work_fetch] share 1.000
Fri 21 Feb 2020 01:45:49 PM EST | SETI@home | [work_fetch] share 0.000
Fri 21 Feb 2020 01:45:49 PM EST | | [work_fetch] ------- end work fetch state -------
Fri 21 Feb 2020 01:45:49 PM EST | GPUGRID | choose_project: scanning
Fri 21 Feb 2020 01:45:49 PM EST | GPUGRID | can't fetch CPU: blocked by project preferences
Fri 21 Feb 2020 01:45:49 PM EST | GPUGRID | can fetch NVIDIA GPU
Fri 21 Feb 2020 01:45:49 PM EST | GPUGRID | NVIDIA GPU needs work - buffer low
Fri 21 Feb 2020 01:45:49 PM EST | GPUGRID | checking CPU
Fri 21 Feb 2020 01:45:49 PM EST | GPUGRID | CPU can't fetch: blocked by project preferences
Fri 21 Feb 2020 01:45:49 PM EST | GPUGRID | checking NVIDIA GPU
Fri 21 Feb 2020 01:45:49 PM EST | GPUGRID | NVIDIA GPU set_request: 1.000000
Fri 21 Feb 2020 01:45:49 PM EST | GPUGRID | [sched_op] Starting scheduler request
Fri 21 Feb 2020 01:45:49 PM EST | GPUGRID | [work_fetch] request: CPU (0.00 sec, 0.00 inst) NVIDIA GPU (1.00 sec, 64.00 inst)
Fri 21 Feb 2020 01:45:49 PM EST | GPUGRID | Sending scheduler request: To fetch work.
Fri 21 Feb 2020 01:45:49 PM EST | GPUGRID | Requesting new tasks for NVIDIA GPU
Fri 21 Feb 2020 01:45:49 PM EST | GPUGRID | [sched_op] CPU work request: 0.00 seconds; 0.00 devices
Fri 21 Feb 2020 01:45:49 PM EST | GPUGRID | [sched_op] NVIDIA GPU work request: 1.00 seconds; 64.00 devices
Fri 21 Feb 2020 01:45:50 PM EST | GPUGRID | Scheduler request completed: got 0 new tasks
Fri 21 Feb 2020 01:45:50 PM EST | GPUGRID | [sched_op] Server version 613
Fri 21 Feb 2020 01:45:50 PM EST | GPUGRID | No tasks sent
Fri 21 Feb 2020 01:45:50 PM EST | GPUGRID | No tasks are available for New version of ACEMD
Fri 21 Feb 2020 01:45:50 PM EST | GPUGRID | Project requested delay of 31 seconds
Fri 21 Feb 2020 01:45:50 PM EST | GPUGRID | [work_fetch] backing off NVIDIA GPU 518 sec
Fri 21 Feb 2020 01:45:50 PM EST | GPUGRID | [sched_op] Deferring communication for 00:00:31
Fri 21 Feb 2020 01:45:50 PM EST | GPUGRID | [sched_op] Reason: requested by project
Fri 21 Feb 2020 01:45:50 PM EST | | [work_fetch] Request work fetch: RPC complete
Fri 21 Feb 2020 01:45:55 PM EST | | choose_project(): 1582310755.574162
Fri 21 Feb 2020 01:45:55 PM EST | | [work_fetch] ------- start work fetch state -------
Fri 21 Feb 2020 01:45:55 PM EST | | [work_fetch] target work buffer: 86400.00 + 864.00 sec
Fri 21 Feb 2020 01:45:55 PM EST | | [work_fetch] --- project states ---
Fri 21 Feb 2020 01:45:55 PM EST | Einstein@Home | [work_fetch] REC 34160464.119 prio -0.000 can't request work: suspended via Manager
Fri 21 Feb 2020 01:45:55 PM EST | GPUGRID | [work_fetch] REC 0.000 prio 0.000 can't request work: scheduler RPC backoff (25.98 sec)
Fri 21 Feb 2020 01:45:55 PM EST | SETI@home | [work_fetch] REC 226750485.698 prio -0.000 can't request work: suspended via Manager (82.78 sec)
Fri 21 Feb 2020 01:45:55 PM EST | | [work_fetch] --- state for CPU ---
Fri 21 Feb 2020 01:45:55 PM EST | | [work_fetch] shortfall 1396224.00 nidle 16.00 saturated 0.00 busy 0.00
Fri 21 Feb 2020 01:45:55 PM EST | Einstein@Home | [work_fetch] share 0.000 blocked by project preferences
Fri 21 Feb 2020 01:45:55 PM EST | GPUGRID | [work_fetch] share 0.000 blocked by project preferences
Fri 21 Feb 2020 01:45:55 PM EST | SETI@home | [work_fetch] share 0.000 blocked by project preferences
Fri 21 Feb 2020 01:45:55 PM EST | | [work_fetch] --- state for NVIDIA GPU ---
Fri 21 Feb 2020 01:45:55 PM EST | | [work_fetch] shortfall 5584896.00 nidle 64.00 saturated 0.00 busy 0.00
Fri 21 Feb 2020 01:45:55 PM EST | Einstein@Home | [work_fetch] share 0.000
Fri 21 Feb 2020 01:45:55 PM EST | GPUGRID | [work_fetch] share 0.000 project is backed off (resource backoff: 513.41, inc 600.00)
Fri 21 Feb 2020 01:45:55 PM EST | SETI@home | [work_fetch] share 0.000
Fri 21 Feb 2020 01:45:55 PM EST | | [work_fetch] ------- end work fetch state -------
Fri 21 Feb 2020 01:45:55 PM EST | GPUGRID | choose_project: scanning
Fri 21 Feb 2020 01:45:55 PM EST | GPUGRID | skip: scheduler RPC backoff
Fri 21 Feb 2020 01:45:55 PM EST | SETI@home | choose_project: scanning
Fri 21 Feb 2020 01:45:55 PM EST | SETI@home | skip: suspended via Manager
Fri 21 Feb 2020 01:45:55 PM EST | Einstein@Home | choose_project: scanning
Fri 21 Feb 2020 01:45:55 PM EST | Einstein@Home | skip: suspended via Manager
Fri 21 Feb 2020 01:45:55 PM EST | | [work_fetch] No project chosen for work fetch

Anybody see something I don't recognize as an impediment to getting work? Anybody else have a similar problem getting work as a first time volunteer to GPUGrid? | |
ID: 53719 | Rating: 0 | rate: / Reply Quote | |
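A log this long is easier to read once it is reduced to "what was asked for, and what came back". The following is a minimal Python sketch of that reduction, assuming the event log has one entry per line in the `timestamp | project | message` layout shown above. The names `summarize_exchanges`, `Exchange`, and `SAMPLE_LOG` are hypothetical helpers made up for this illustration and are not part of BOINC.

```python
import re
from typing import Dict, List, NamedTuple, Tuple

# Event-log entries in this thread look like "<timestamp> | <project> | <message>".
# The project field is empty for client-wide messages.
LINE_RE = re.compile(r"^(.*?)\s*\|\s*(.*?)\s*\|\s*(.*)$")
REQUEST_RE = re.compile(
    r"\[sched_op\] (.+?) work request: ([\d.]+) seconds; ([\d.]+) devices"
)
REPLY_RE = re.compile(r"Scheduler request completed: got (\d+) new tasks")


class Exchange(NamedTuple):
    project: str
    requests: Dict[str, Tuple[float, float]]  # resource -> (seconds, devices)
    tasks_received: int


def summarize_exchanges(lines: List[str]) -> List[Exchange]:
    """Pair each project's [sched_op] work requests with the scheduler reply."""
    pending: Dict[str, Dict[str, Tuple[float, float]]] = {}
    exchanges: List[Exchange] = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # not an event-log entry
        _timestamp, project, message = match.groups()
        request = REQUEST_RE.search(message)
        if request:
            resource, seconds, devices = request.groups()
            pending.setdefault(project, {})[resource] = (
                float(seconds),
                float(devices),
            )
            continue
        reply = REPLY_RE.search(message)
        if reply:
            exchanges.append(
                Exchange(project, pending.pop(project, {}), int(reply.group(1)))
            )
    return exchanges


# Four entries copied from the log in the first post.
SAMPLE_LOG = [
    "Fri 21 Feb 2020 01:45:49 PM EST | | [work_fetch] ------- end work fetch state -------",
    "Fri 21 Feb 2020 01:45:49 PM EST | GPUGRID | [sched_op] CPU work request: 0.00 seconds; 0.00 devices",
    "Fri 21 Feb 2020 01:45:49 PM EST | GPUGRID | [sched_op] NVIDIA GPU work request: 1.00 seconds; 64.00 devices",
    "Fri 21 Feb 2020 01:45:50 PM EST | GPUGRID | Scheduler request completed: got 0 new tasks",
]

exchanges = summarize_exchanges(SAMPLE_LOG)
for exchange in exchanges:
    print(exchange.project, exchange.requests, "->", exchange.tasks_received, "tasks")
```

Run against the full log, a sketch like this would show each GPUGRID request next to the number of tasks the scheduler returned, which is the part of the output that matters for this problem.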
Just an FYI: the user has all the project configuration settings set correctly, and they match mine to a T. Resource share --- | |
ID: 53720 | Rating: 0 | rate: / Reply Quote | |
Try setting resource share to something above zero just to get the project going. | |
ID: 53721 | Rating: 0 | rate: / Reply Quote | |
He tried that already with a resource share of 100. Tried beta applications also.

Fri 21 Feb 2020 02:45:29 PM EST | GPUGRID | sched RPC pending: Requested by user
Fri 21 Feb 2020 02:45:29 PM EST | GPUGRID | piggyback_work_request()
Fri 21 Feb 2020 02:45:29 PM EST | | [work_fetch] ------- start work fetch state -------
Fri 21 Feb 2020 02:45:29 PM EST | | [work_fetch] target work buffer: 86400.00 + 864.00 sec
Fri 21 Feb 2020 02:45:29 PM EST | | [work_fetch] --- project states ---
Fri 21 Feb 2020 02:45:29 PM EST | Einstein@Home | [work_fetch] REC 34239357.438 prio -0.000 can't request work: suspended via Manager
Fri 21 Feb 2020 02:45:29 PM EST | GPUGRID | [work_fetch] REC 0.000 prio -0.000 can request work
Fri 21 Feb 2020 02:45:29 PM EST | SETI@home | [work_fetch] REC 226350578.976 prio 0.000 can't request work: suspended via Manager
Fri 21 Feb 2020 02:45:29 PM EST | | [work_fetch] --- state for CPU ---
Fri 21 Feb 2020 02:45:29 PM EST | | [work_fetch] shortfall 1396224.00 nidle 16.00 saturated 0.00 busy 0.00
Fri 21 Feb 2020 02:45:29 PM EST | Einstein@Home | [work_fetch] share 0.000 blocked by project preferences
Fri 21 Feb 2020 02:45:29 PM EST | GPUGRID | [work_fetch] share 0.000 blocked by project preferences
Fri 21 Feb 2020 02:45:29 PM EST | SETI@home | [work_fetch] share 0.000 blocked by project preferences
Fri 21 Feb 2020 02:45:29 PM EST | | [work_fetch] --- state for NVIDIA GPU ---
Fri 21 Feb 2020 02:45:29 PM EST | | [work_fetch] shortfall 5584896.00 nidle 64.00 saturated 0.00 busy 0.00
Fri 21 Feb 2020 02:45:29 PM EST | Einstein@Home | [work_fetch] share 0.000
Fri 21 Feb 2020 02:45:29 PM EST | GPUGRID | [work_fetch] share 1.000
Fri 21 Feb 2020 02:45:29 PM EST | SETI@home | [work_fetch] share 0.000
Fri 21 Feb 2020 02:45:29 PM EST | | [work_fetch] ------- end work fetch state -------
Fri 21 Feb 2020 02:45:29 PM EST | GPUGRID | piggyback: resource CPU
Fri 21 Feb 2020 02:45:29 PM EST | GPUGRID | piggyback: can't fetch CPU: blocked by project preferences
Fri 21 Feb 2020 02:45:29 PM EST | GPUGRID | piggyback: resource NVIDIA GPU
Fri 21 Feb 2020 02:45:29 PM EST | GPUGRID | [work_fetch] set_request() for NVIDIA GPU: ninst 64 nused_total 0.00 nidle_now 64.00 fetch share 1.00 req_inst 64.00 req_secs 5584896.00
Fri 21 Feb 2020 02:45:29 PM EST | GPUGRID | [sched_op] Starting scheduler request
Fri 21 Feb 2020 02:45:29 PM EST | GPUGRID | [work_fetch] request: CPU (0.00 sec, 0.00 inst) NVIDIA GPU (5584896.00 sec, 64.00 inst)
Fri 21 Feb 2020 02:45:29 PM EST | GPUGRID | Sending scheduler request: Requested by user.
Fri 21 Feb 2020 02:45:29 PM EST | GPUGRID | Requesting new tasks for NVIDIA GPU
Fri 21 Feb 2020 02:45:29 PM EST | GPUGRID | [sched_op] CPU work request: 0.00 seconds; 0.00 devices
Fri 21 Feb 2020 02:45:29 PM EST | GPUGRID | [sched_op] NVIDIA GPU work request: 5584896.00 seconds; 64.00 devices
Fri 21 Feb 2020 02:45:30 PM EST | GPUGRID | Scheduler request completed: got 0 new tasks
Fri 21 Feb 2020 02:45:30 PM EST | GPUGRID | [sched_op] Server version 613
Fri 21 Feb 2020 02:45:30 PM EST | GPUGRID | No tasks sent
Fri 21 Feb 2020 02:45:30 PM EST | GPUGRID | No tasks are available for New version of ACEMD
Fri 21 Feb 2020 02:45:30 PM EST | GPUGRID | Project requested delay of 31 seconds
Fri 21 Feb 2020 02:45:30 PM EST | GPUGRID | [sched_op] Deferring communication for 00:00:31
Fri 21 Feb 2020 02:45:30 PM EST | GPUGRID | [sched_op] Reason: requested by project
Fri 21 Feb 2020 02:45:30 PM EST | | [work_fetch] Request work fetch: RPC complete
Fri 21 Feb 2020 02:45:35 PM EST | | choose_project(): 1582314335.760470
Fri 21 Feb 2020 02:45:35 PM EST | | [work_fetch] ------- start work fetch state -------
Fri 21 Feb 2020 02:45:35 PM EST | | [work_fetch] target work buffer: 86400.00 + 864.00 sec
Fri 21 Feb 2020 02:45:35 PM EST | | [work_fetch] --- project states ---
Fri 21 Feb 2020 02:45:35 PM EST | Einstein@Home | [work_fetch] REC 34239357.438 prio -0.000 can't request work: suspended via Manager
Fri 21 Feb 2020 02:45:35 PM EST | GPUGRID | [work_fetch] REC 0.000 prio 0.000 can't request work: scheduler RPC backoff (25.98 sec)
Fri 21 Feb 2020 02:45:35 PM EST | SETI@home | [work_fetch] REC 226350578.976 prio 0.000 can't request work: suspended via Manager
Fri 21 Feb 2020 02:45:35 PM EST | | [work_fetch] --- state for CPU ---
Fri 21 Feb 2020 02:45:35 PM EST | | [work_fetch] shortfall 1396224.00 nidle 16.00 saturated 0.00 busy 0.00
Fri 21 Feb 2020 02:45:35 PM EST | Einstein@Home | [work_fetch] share 0.000 blocked by project preferences
Fri 21 Feb 2020 02:45:35 PM EST | GPUGRID | [work_fetch] share 0.000 blocked by project preferences
Fri 21 Feb 2020 02:45:35 PM EST | SETI@home | [work_fetch] share 0.000 blocked by project preferences
Fri 21 Feb 2020 02:45:35 PM EST | | [work_fetch] --- state for NVIDIA GPU ---
Fri 21 Feb 2020 02:45:35 PM EST | | [work_fetch] shortfall 5584896.00 nidle 64.00 saturated 0.00 busy 0.00
Fri 21 Feb 2020 02:45:35 PM EST | Einstein@Home | [work_fetch] share 0.000
Fri 21 Feb 2020 02:45:35 PM EST | GPUGRID | [work_fetch] share 0.000
Fri 21 Feb 2020 02:45:35 PM EST | SETI@home | [work_fetch] share 0.000
Fri 21 Feb 2020 02:45:35 PM EST | | [work_fetch] ------- end work fetch state -------
Fri 21 Feb 2020 02:45:35 PM EST | GPUGRID | choose_project: scanning
Fri 21 Feb 2020 02:45:35 PM EST | GPUGRID | skip: scheduler RPC backoff
Fri 21 Feb 2020 02:45:35 PM EST | SETI@home | choose_project: scanning
Fri 21 Feb 2020 02:45:35 PM EST | SETI@home | skip: suspended via Manager
Fri 21 Feb 2020 02:45:35 PM EST | Einstein@Home | choose_project: scanning
Fri 21 Feb 2020 02:45:35 PM EST | Einstein@Home | skip: suspended via Manager
Fri 21 Feb 2020 02:45:35 PM EST | | [work_fetch] No project chosen for work fetch | |
ID: 53722 | Rating: 0 | rate: / Reply Quote | |
I think that the only two lines in there that have any meaning are:

Fri 21 Feb 2020 02:45:29 PM EST | GPUGRID | [sched_op] NVIDIA GPU work request: 5584896.00 seconds; 64.00 devices
Fri 21 Feb 2020 02:45:30 PM EST | GPUGRID | Scheduler request completed: got 0 new tasks

He asked for work, and I think we must regard that as a refusal, rather than a lack of availability. Unfortunately, at this project we don't get the rejection letter with a checklist of possible reasons at the bottom that some projects send out. So we'll have to turn detective.

We know
* NVIDIA GeForce RTX 2080 (8192MB) driver: 420.69
* Linux Ubuntu

And not much else that the server will want to check. Version of CUDA available? Compute Capability? To be honest, I don't know - but I'm wondering if Turing cards are yet supported under Linux? | |
ID: 53723 | Rating: 0 | rate: / Reply Quote | |
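As a side note on where the 5584896.00 seconds in that request comes from: the figures in the log are consistent with the shortfall being the number of idle instances multiplied by the target work buffer (86400.00 + 864.00 sec). The Python check below only verifies that arithmetic against the logged values. It is an observation about these numbers, not a statement of the BOINC client's exact formula, and `idle_shortfall` is a hypothetical helper name.

```python
# Values copied from the work_fetch log lines in this thread.
TARGET_WORK_BUFFER_SEC = 86400.00 + 864.00  # "target work buffer: 86400.00 + 864.00 sec"


def idle_shortfall(idle_instances: float, buffer_sec: float) -> float:
    """Seconds of work needed to fill the buffer when every instance is idle.

    Assumption for illustration: each idle instance contributes one full buffer.
    """
    return idle_instances * buffer_sec


gpu_shortfall = idle_shortfall(64.00, TARGET_WORK_BUFFER_SEC)  # nidle 64.00 (NVIDIA GPU)
cpu_shortfall = idle_shortfall(16.00, TARGET_WORK_BUFFER_SEC)  # nidle 16.00 (CPU)

print(f"GPU shortfall: {gpu_shortfall:.2f}")  # log reports 5584896.00
print(f"CPU shortfall: {cpu_shortfall:.2f}")  # log reports 1396224.00
```

So the client was asking for roughly a day of work for each of the 64 GPU instances it reported, and the scheduler still returned nothing.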
> I think that the only two lines in there that have any meaning are:

I already questioned whether his spoofed Nvidia driver version was confusing the schedulers. (The driver is 440.59 under the covers.) But he already changed that in a previous try to report the standard version. No dice. Yes, the Turing cards have been working under Linux since last July. He is running a sufficient driver and CC level for the CUDA100 application to be sent. I was hoping you would point out the obvious reason to us since you are more expert in reading and understanding the work_fetch_debug output. | |
ID: 53724 | Rating: 0 | rate: / Reply Quote | |
I see that your friend's host ID # 524248 is showing "[64] NVIDIA GeForce RTX 2080 (8192MB) driver: 440.59" | |
ID: 53725 | Rating: 0 | rate: / Reply Quote | |
Try a reset of the project. Force it to send a test unit to see if the GPUs are viable. | |
ID: 53726 | Rating: 0 | rate: / Reply Quote | |
> I see that your friend's host ID # 524248 is showing "[64] NVIDIA GeForce RTX 2080 (8192MB) driver: 440.59"

I've been getting work all day for all my hosts, the majority of which have spoofed GPU counts. No problem there. | |
ID: 53729 | Rating: 0 | rate: / Reply Quote | |
Hi guys, I'm the guy who was having issues. I finally was able to get some work and earn credit so I can actually post. | |
ID: 53732 | Rating: 0 | rate: / Reply Quote | |
Glad you finally got work, Ian. So how would that work if both Einstein and GPUGrid have a resource share of zero when Seti runs out of work? | |
ID: 53735 | Rating: 0 | rate: / Reply Quote | |
Maybe the servers thought that 64 GPUs was out of bounds. | |
ID: 53736 | Rating: 0 | rate: / Reply Quote | |
> Glad you finally got work Ian. So how would that work if both Einstein and GPUGrid has resource share of zero when Seti runs out of work.

With a 0 resource share, the project is "supposed" to only send you 1 WU per device, and basically send them back at a 1-1 ratio: send one, get one. It appears that GPUGrid is following this. It doesn't look like Einstein is following that, though. Even when I had 64 GPUs listed, it would routinely send me 80-100 WUs. I think JStateson complained about this at Einstein as well.

How do you have resource share allocated for GPUGrid? I thought you ran GPUGrid as a backup project on one or more of your systems.

> Maybe the servers thought that 64 gpus was out of bounds.

If so, this must be some kind of project limit. BOINC's limit is 64, and it never prevented me from getting work at SETI or Einstein. | |
ID: 53737 | Rating: 0 | rate: / Reply Quote | |
No, I don't run any project as backup. They all get a standard resource share. | |
ID: 53738 | Rating: 0 | rate: / Reply Quote | |
Keith has started to do what I do. Dedicated machines for certain projects. | |
ID: 53740 | Rating: 0 | rate: / Reply Quote | |
So setting "0" resource share on both GPUGrid and Einstein results in only Einstein tasks downloading and processing. Is that correct? If so, have you tried adding an app_config.xml file to the Einstein project folder restricting the number of max concurrent tasks to 3 or 4? | |
ID: 53741 | Rating: 0 | rate: / Reply Quote | |
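For anyone who wants to try the app_config.xml suggestion, here is a minimal Python sketch that writes such a file. It uses BOINC's `<project_max_concurrent>` element, which caps the number of tasks a project may run at once. The directory in the example is a temporary stand-in for illustration: the real file has to go in the Einstein project folder inside the BOINC data directory, whose location depends on the installation. `build_app_config` and `write_app_config` are hypothetical helper names. Also note the assumption being tested here: a concurrency cap limits what runs, and it may not by itself reduce how many tasks the server sends.

```python
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path


def build_app_config(max_concurrent: int) -> str:
    """Return app_config.xml content limiting the project's running tasks."""
    if max_concurrent < 1:
        raise ValueError("max_concurrent must be at least 1")
    root = ET.Element("app_config")
    ET.SubElement(root, "project_max_concurrent").text = str(max_concurrent)
    ET.indent(root)  # one element per line; needs Python 3.9 or newer
    return ET.tostring(root, encoding="unicode") + "\n"


def write_app_config(project_dir: Path, max_concurrent: int) -> Path:
    """Write app_config.xml into the given project directory."""
    target = project_dir / "app_config.xml"
    target.write_text(build_app_config(max_concurrent), encoding="utf-8")
    return target


# Example run against a temporary directory instead of a real BOINC install.
with tempfile.TemporaryDirectory() as tmp:
    written = write_app_config(Path(tmp), 4)
    config_text = written.read_text(encoding="utf-8")
    print(config_text)
```

After the file is in place, the client has to re-read it, either through the Manager's option to read config files or by restarting the client, before the limit takes effect.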
Hmm, maybe that would work. I'll have to play around with it. | |
ID: 53742 | Rating: 0 | rate: / Reply Quote | |