Message boards : Server and website : New user can't get any work

Keith Myers
Joined: 13 Dec 17
Posts: 508
Credit: 531,975,454
RAC: 1,711,807
Message 53719 - Posted: 21 Feb 2020 | 19:17:07 UTC

I have a teammate who has joined GPUGrid and is unable to get any work. His cruncher meets all the requirements.
This is his rig:
https://www.gpugrid.net/hosts_user.php?sort=rpc_time&rev=0&show_all=1&userid=552015

This is the output of work_fetch_debug in the Event Log. I don't see any reason why the schedulers respond with no work sent.

Fri 21 Feb 2020 01:45:45 PM EST | | [work_fetch] Request work fetch: Backoff ended for GPUGRID
Fri 21 Feb 2020 01:45:49 PM EST | | choose_project(): 1582310749.544944
Fri 21 Feb 2020 01:45:49 PM EST | | [work_fetch] ------- start work fetch state -------
Fri 21 Feb 2020 01:45:49 PM EST | | [work_fetch] target work buffer: 86400.00 + 864.00 sec
Fri 21 Feb 2020 01:45:49 PM EST | | [work_fetch] --- project states ---
Fri 21 Feb 2020 01:45:49 PM EST | Einstein@Home | [work_fetch] REC 34160464.119 prio -0.000 can't request work: suspended via Manager
Fri 21 Feb 2020 01:45:49 PM EST | GPUGRID | [work_fetch] REC 0.000 prio 0.000 can request work
Fri 21 Feb 2020 01:45:49 PM EST | SETI@home | [work_fetch] REC 226750485.698 prio -0.000 can't request work: suspended via Manager (88.81 sec)
Fri 21 Feb 2020 01:45:49 PM EST | | [work_fetch] --- state for CPU ---
Fri 21 Feb 2020 01:45:49 PM EST | | [work_fetch] shortfall 1396224.00 nidle 16.00 saturated 0.00 busy 0.00
Fri 21 Feb 2020 01:45:49 PM EST | Einstein@Home | [work_fetch] share 0.000 blocked by project preferences
Fri 21 Feb 2020 01:45:49 PM EST | GPUGRID | [work_fetch] share 0.000 blocked by project preferences
Fri 21 Feb 2020 01:45:49 PM EST | SETI@home | [work_fetch] share 0.000 blocked by project preferences
Fri 21 Feb 2020 01:45:49 PM EST | | [work_fetch] --- state for NVIDIA GPU ---
Fri 21 Feb 2020 01:45:49 PM EST | | [work_fetch] shortfall 5584896.00 nidle 64.00 saturated 0.00 busy 0.00
Fri 21 Feb 2020 01:45:49 PM EST | Einstein@Home | [work_fetch] share 0.000
Fri 21 Feb 2020 01:45:49 PM EST | GPUGRID | [work_fetch] share 1.000
Fri 21 Feb 2020 01:45:49 PM EST | SETI@home | [work_fetch] share 0.000
Fri 21 Feb 2020 01:45:49 PM EST | | [work_fetch] ------- end work fetch state -------
Fri 21 Feb 2020 01:45:49 PM EST | GPUGRID | choose_project: scanning
Fri 21 Feb 2020 01:45:49 PM EST | GPUGRID | can't fetch CPU: blocked by project preferences
Fri 21 Feb 2020 01:45:49 PM EST | GPUGRID | can fetch NVIDIA GPU
Fri 21 Feb 2020 01:45:49 PM EST | GPUGRID | NVIDIA GPU needs work - buffer low
Fri 21 Feb 2020 01:45:49 PM EST | GPUGRID | checking CPU
Fri 21 Feb 2020 01:45:49 PM EST | GPUGRID | CPU can't fetch: blocked by project preferences
Fri 21 Feb 2020 01:45:49 PM EST | GPUGRID | checking NVIDIA GPU
Fri 21 Feb 2020 01:45:49 PM EST | GPUGRID | NVIDIA GPU set_request: 1.000000
Fri 21 Feb 2020 01:45:49 PM EST | GPUGRID | [sched_op] Starting scheduler request
Fri 21 Feb 2020 01:45:49 PM EST | GPUGRID | [work_fetch] request: CPU (0.00 sec, 0.00 inst) NVIDIA GPU (1.00 sec, 64.00 inst)
Fri 21 Feb 2020 01:45:49 PM EST | GPUGRID | Sending scheduler request: To fetch work.
Fri 21 Feb 2020 01:45:49 PM EST | GPUGRID | Requesting new tasks for NVIDIA GPU
Fri 21 Feb 2020 01:45:49 PM EST | GPUGRID | [sched_op] CPU work request: 0.00 seconds; 0.00 devices
Fri 21 Feb 2020 01:45:49 PM EST | GPUGRID | [sched_op] NVIDIA GPU work request: 1.00 seconds; 64.00 devices
Fri 21 Feb 2020 01:45:50 PM EST | GPUGRID | Scheduler request completed: got 0 new tasks
Fri 21 Feb 2020 01:45:50 PM EST | GPUGRID | [sched_op] Server version 613
Fri 21 Feb 2020 01:45:50 PM EST | GPUGRID | No tasks sent
Fri 21 Feb 2020 01:45:50 PM EST | GPUGRID | No tasks are available for New version of ACEMD
Fri 21 Feb 2020 01:45:50 PM EST | GPUGRID | Project requested delay of 31 seconds
Fri 21 Feb 2020 01:45:50 PM EST | GPUGRID | [work_fetch] backing off NVIDIA GPU 518 sec
Fri 21 Feb 2020 01:45:50 PM EST | GPUGRID | [sched_op] Deferring communication for 00:00:31
Fri 21 Feb 2020 01:45:50 PM EST | GPUGRID | [sched_op] Reason: requested by project
Fri 21 Feb 2020 01:45:50 PM EST | | [work_fetch] Request work fetch: RPC complete
Fri 21 Feb 2020 01:45:55 PM EST | | choose_project(): 1582310755.574162
Fri 21 Feb 2020 01:45:55 PM EST | | [work_fetch] ------- start work fetch state -------
Fri 21 Feb 2020 01:45:55 PM EST | | [work_fetch] target work buffer: 86400.00 + 864.00 sec
Fri 21 Feb 2020 01:45:55 PM EST | | [work_fetch] --- project states ---
Fri 21 Feb 2020 01:45:55 PM EST | Einstein@Home | [work_fetch] REC 34160464.119 prio -0.000 can't request work: suspended via Manager
Fri 21 Feb 2020 01:45:55 PM EST | GPUGRID | [work_fetch] REC 0.000 prio 0.000 can't request work: scheduler RPC backoff (25.98 sec)
Fri 21 Feb 2020 01:45:55 PM EST | SETI@home | [work_fetch] REC 226750485.698 prio -0.000 can't request work: suspended via Manager (82.78 sec)
Fri 21 Feb 2020 01:45:55 PM EST | | [work_fetch] --- state for CPU ---
Fri 21 Feb 2020 01:45:55 PM EST | | [work_fetch] shortfall 1396224.00 nidle 16.00 saturated 0.00 busy 0.00
Fri 21 Feb 2020 01:45:55 PM EST | Einstein@Home | [work_fetch] share 0.000 blocked by project preferences
Fri 21 Feb 2020 01:45:55 PM EST | GPUGRID | [work_fetch] share 0.000 blocked by project preferences
Fri 21 Feb 2020 01:45:55 PM EST | SETI@home | [work_fetch] share 0.000 blocked by project preferences
Fri 21 Feb 2020 01:45:55 PM EST | | [work_fetch] --- state for NVIDIA GPU ---
Fri 21 Feb 2020 01:45:55 PM EST | | [work_fetch] shortfall 5584896.00 nidle 64.00 saturated 0.00 busy 0.00
Fri 21 Feb 2020 01:45:55 PM EST | Einstein@Home | [work_fetch] share 0.000
Fri 21 Feb 2020 01:45:55 PM EST | GPUGRID | [work_fetch] share 0.000 project is backed off (resource backoff: 513.41, inc 600.00)
Fri 21 Feb 2020 01:45:55 PM EST | SETI@home | [work_fetch] share 0.000
Fri 21 Feb 2020 01:45:55 PM EST | | [work_fetch] ------- end work fetch state -------
Fri 21 Feb 2020 01:45:55 PM EST | GPUGRID | choose_project: scanning
Fri 21 Feb 2020 01:45:55 PM EST | GPUGRID | skip: scheduler RPC backoff
Fri 21 Feb 2020 01:45:55 PM EST | SETI@home | choose_project: scanning
Fri 21 Feb 2020 01:45:55 PM EST | SETI@home | skip: suspended via Manager
Fri 21 Feb 2020 01:45:55 PM EST | Einstein@Home | choose_project: scanning
Fri 21 Feb 2020 01:45:55 PM EST | Einstein@Home | skip: suspended via Manager
Fri 21 Feb 2020 01:45:55 PM EST | | [work_fetch] No project chosen for work fetch


Does anybody see something I'm not recognizing as an impediment to getting work?

Has anybody else had a similar problem getting work as a first-time volunteer to GPUGrid?

Keith Myers
Message 53720 - Posted: 21 Feb 2020 | 19:23:05 UTC

Just an FYI: the user has all the project configuration settings set correctly, and they match mine to a T.

Resource share ---
Use CPU no
Use ATI GPU no
Use NVIDIA GPU yes
Run test applications? no
Is it OK for GPUGRID and your team (if any) to email you? yes
Should GPUGRID show your computers on its web site? yes
Default computer location ---
Maximum CPU % for graphics 20
Run only the selected applications
ACEMD short runs (2-3 hours on fastest card): no
ACEMD long runs (8-12 hours on fastest GPU): no
ACEMD3: yes
Quantum Chemistry (CPU): no
Quantum Chemistry (CPU, beta): no
Python Runtime: no
If no work for selected applications is available, accept work from other applications? no
Use Graphics Processing Unit (GPU) if available yes
Use Central Processing Unit (CPU) yes

biodoc
Joined: 26 Aug 08
Posts: 167
Credit: 1,633,077,546
RAC: 738,979
Message 53721 - Posted: 21 Feb 2020 | 19:37:19 UTC

Try setting resource share to something above zero just to get the project going.

Keith Myers
Message 53722 - Posted: 21 Feb 2020 | 20:06:35 UTC - in response to Message 53721.

He already tried that with a resource share of 100, and tried the beta applications as well.

Still not getting any work from a request.

Fri 21 Feb 2020 02:45:29 PM EST | GPUGRID | sched RPC pending: Requested by user
Fri 21 Feb 2020 02:45:29 PM EST | GPUGRID | piggyback_work_request()
Fri 21 Feb 2020 02:45:29 PM EST | | [work_fetch] ------- start work fetch state -------
Fri 21 Feb 2020 02:45:29 PM EST | | [work_fetch] target work buffer: 86400.00 + 864.00 sec
Fri 21 Feb 2020 02:45:29 PM EST | | [work_fetch] --- project states ---
Fri 21 Feb 2020 02:45:29 PM EST | Einstein@Home | [work_fetch] REC 34239357.438 prio -0.000 can't request work: suspended via Manager
Fri 21 Feb 2020 02:45:29 PM EST | GPUGRID | [work_fetch] REC 0.000 prio -0.000 can request work
Fri 21 Feb 2020 02:45:29 PM EST | SETI@home | [work_fetch] REC 226350578.976 prio 0.000 can't request work: suspended via Manager
Fri 21 Feb 2020 02:45:29 PM EST | | [work_fetch] --- state for CPU ---
Fri 21 Feb 2020 02:45:29 PM EST | | [work_fetch] shortfall 1396224.00 nidle 16.00 saturated 0.00 busy 0.00
Fri 21 Feb 2020 02:45:29 PM EST | Einstein@Home | [work_fetch] share 0.000 blocked by project preferences
Fri 21 Feb 2020 02:45:29 PM EST | GPUGRID | [work_fetch] share 0.000 blocked by project preferences
Fri 21 Feb 2020 02:45:29 PM EST | SETI@home | [work_fetch] share 0.000 blocked by project preferences
Fri 21 Feb 2020 02:45:29 PM EST | | [work_fetch] --- state for NVIDIA GPU ---
Fri 21 Feb 2020 02:45:29 PM EST | | [work_fetch] shortfall 5584896.00 nidle 64.00 saturated 0.00 busy 0.00
Fri 21 Feb 2020 02:45:29 PM EST | Einstein@Home | [work_fetch] share 0.000
Fri 21 Feb 2020 02:45:29 PM EST | GPUGRID | [work_fetch] share 1.000
Fri 21 Feb 2020 02:45:29 PM EST | SETI@home | [work_fetch] share 0.000
Fri 21 Feb 2020 02:45:29 PM EST | | [work_fetch] ------- end work fetch state -------
Fri 21 Feb 2020 02:45:29 PM EST | GPUGRID | piggyback: resource CPU
Fri 21 Feb 2020 02:45:29 PM EST | GPUGRID | piggyback: can't fetch CPU: blocked by project preferences
Fri 21 Feb 2020 02:45:29 PM EST | GPUGRID | piggyback: resource NVIDIA GPU
Fri 21 Feb 2020 02:45:29 PM EST | GPUGRID | [work_fetch] set_request() for NVIDIA GPU: ninst 64 nused_total 0.00 nidle_now 64.00 fetch share 1.00 req_inst 64.00 req_secs 5584896.00
Fri 21 Feb 2020 02:45:29 PM EST | GPUGRID | [sched_op] Starting scheduler request
Fri 21 Feb 2020 02:45:29 PM EST | GPUGRID | [work_fetch] request: CPU (0.00 sec, 0.00 inst) NVIDIA GPU (5584896.00 sec, 64.00 inst)
Fri 21 Feb 2020 02:45:29 PM EST | GPUGRID | Sending scheduler request: Requested by user.
Fri 21 Feb 2020 02:45:29 PM EST | GPUGRID | Requesting new tasks for NVIDIA GPU
Fri 21 Feb 2020 02:45:29 PM EST | GPUGRID | [sched_op] CPU work request: 0.00 seconds; 0.00 devices
Fri 21 Feb 2020 02:45:29 PM EST | GPUGRID | [sched_op] NVIDIA GPU work request: 5584896.00 seconds; 64.00 devices
Fri 21 Feb 2020 02:45:30 PM EST | GPUGRID | Scheduler request completed: got 0 new tasks
Fri 21 Feb 2020 02:45:30 PM EST | GPUGRID | [sched_op] Server version 613
Fri 21 Feb 2020 02:45:30 PM EST | GPUGRID | No tasks sent
Fri 21 Feb 2020 02:45:30 PM EST | GPUGRID | No tasks are available for New version of ACEMD
Fri 21 Feb 2020 02:45:30 PM EST | GPUGRID | Project requested delay of 31 seconds
Fri 21 Feb 2020 02:45:30 PM EST | GPUGRID | [sched_op] Deferring communication for 00:00:31
Fri 21 Feb 2020 02:45:30 PM EST | GPUGRID | [sched_op] Reason: requested by project
Fri 21 Feb 2020 02:45:30 PM EST | | [work_fetch] Request work fetch: RPC complete
Fri 21 Feb 2020 02:45:35 PM EST | | choose_project(): 1582314335.760470
Fri 21 Feb 2020 02:45:35 PM EST | | [work_fetch] ------- start work fetch state -------
Fri 21 Feb 2020 02:45:35 PM EST | | [work_fetch] target work buffer: 86400.00 + 864.00 sec
Fri 21 Feb 2020 02:45:35 PM EST | | [work_fetch] --- project states ---
Fri 21 Feb 2020 02:45:35 PM EST | Einstein@Home | [work_fetch] REC 34239357.438 prio -0.000 can't request work: suspended via Manager
Fri 21 Feb 2020 02:45:35 PM EST | GPUGRID | [work_fetch] REC 0.000 prio 0.000 can't request work: scheduler RPC backoff (25.98 sec)
Fri 21 Feb 2020 02:45:35 PM EST | SETI@home | [work_fetch] REC 226350578.976 prio 0.000 can't request work: suspended via Manager
Fri 21 Feb 2020 02:45:35 PM EST | | [work_fetch] --- state for CPU ---
Fri 21 Feb 2020 02:45:35 PM EST | | [work_fetch] shortfall 1396224.00 nidle 16.00 saturated 0.00 busy 0.00
Fri 21 Feb 2020 02:45:35 PM EST | Einstein@Home | [work_fetch] share 0.000 blocked by project preferences
Fri 21 Feb 2020 02:45:35 PM EST | GPUGRID | [work_fetch] share 0.000 blocked by project preferences
Fri 21 Feb 2020 02:45:35 PM EST | SETI@home | [work_fetch] share 0.000 blocked by project preferences
Fri 21 Feb 2020 02:45:35 PM EST | | [work_fetch] --- state for NVIDIA GPU ---
Fri 21 Feb 2020 02:45:35 PM EST | | [work_fetch] shortfall 5584896.00 nidle 64.00 saturated 0.00 busy 0.00
Fri 21 Feb 2020 02:45:35 PM EST | Einstein@Home | [work_fetch] share 0.000
Fri 21 Feb 2020 02:45:35 PM EST | GPUGRID | [work_fetch] share 0.000
Fri 21 Feb 2020 02:45:35 PM EST | SETI@home | [work_fetch] share 0.000
Fri 21 Feb 2020 02:45:35 PM EST | | [work_fetch] ------- end work fetch state -------
Fri 21 Feb 2020 02:45:35 PM EST | GPUGRID | choose_project: scanning
Fri 21 Feb 2020 02:45:35 PM EST | GPUGRID | skip: scheduler RPC backoff
Fri 21 Feb 2020 02:45:35 PM EST | SETI@home | choose_project: scanning
Fri 21 Feb 2020 02:45:35 PM EST | SETI@home | skip: suspended via Manager
Fri 21 Feb 2020 02:45:35 PM EST | Einstein@Home | choose_project: scanning
Fri 21 Feb 2020 02:45:35 PM EST | Einstein@Home | skip: suspended via Manager
Fri 21 Feb 2020 02:45:35 PM EST | | [work_fetch] No project chosen for work fetch

Richard Haselgrove
Joined: 11 Jul 09
Posts: 1003
Credit: 2,537,540,285
RAC: 3,303,364
Message 53723 - Posted: 21 Feb 2020 | 20:34:31 UTC - in response to Message 53722.

I think that the only two lines in there that have any meaning are:

Fri 21 Feb 2020 02:45:29 PM EST | GPUGRID | [sched_op] NVIDIA GPU work request: 5584896.00 seconds; 64.00 devices
Fri 21 Feb 2020 02:45:30 PM EST | GPUGRID | Scheduler request completed: got 0 new tasks

He asked for work, and I think we must regard that as a refusal, rather than a lack of availability.

Unfortunately, at this project we don't get the rejection letter with a checklist of possible reasons at the bottom that some projects send out. So we'll have to turn detective.

We know

* NVIDIA GeForce RTX 2080 (8192MB) driver: 420.69
* Linux Ubuntu

And not much else that the server will want to check. Version of CUDA available? Compute Capability? To be honest, I don't know - but I'm wondering if Turing cards are yet supported under Linux?

Keith Myers
Message 53724 - Posted: 21 Feb 2020 | 20:53:44 UTC - in response to Message 53723.

I think that the only two lines in there that have any meaning are:

Fri 21 Feb 2020 02:45:29 PM EST | GPUGRID | [sched_op] NVIDIA GPU work request: 5584896.00 seconds; 64.00 devices
Fri 21 Feb 2020 02:45:30 PM EST | GPUGRID | Scheduler request completed: got 0 new tasks

He asked for work, and I think we must regard that as a refusal, rather than a lack of availability.

Unfortunately, at this project we don't get the rejection letter with a checklist of possible reasons at the bottom that some projects send out. So we'll have to turn detective.

We know

* NVIDIA GeForce RTX 2080 (8192MB) driver: 420.69
* Linux Ubuntu

And not much else that the server will want to check. Version of CUDA available? Compute Capability? To be honest, I don't know - but I'm wondering if Turing cards are yet supported under Linux?


I already questioned whether his spoofed Nvidia driver version was confusing the schedulers (the driver is 440.59 under the covers), but he had already changed that in a previous try to report the standard version. No dice.

Yes, the Turing cards have been working under Linux since last July. He is running sufficient driver and CC level for the CUDA100 application to be sent.

I was hoping you would point out the obvious reason to us since you are more expert in reading and understanding the work_fetch_debug output.

ServicEnginIC
Joined: 24 Sep 10
Posts: 199
Credit: 1,458,715,434
RAC: 823,206
Message 53725 - Posted: 21 Feb 2020 | 22:04:09 UTC - in response to Message 53724.

I see that your friend's host ID #524248 is showing "[64] NVIDIA GeForce RTX 2080 (8192MB) driver: 440.59".
Perhaps there is new protection at the server to reject requests from systems with such a high number of GPUs?
Please try instructing this new user to have his system reflect its true number of GPUs for a while, to test whether that corrects the problem.
Toni kindly requested in this post that these kinds of practices not be used.

Zalster
Joined: 26 Feb 14
Posts: 208
Credit: 4,490,828,031
RAC: 7,407
Message 53726 - Posted: 21 Feb 2020 | 22:32:55 UTC

Try a reset of the project. Force it to send a test unit to see if the GPUs are viable.

Keith Myers
Message 53729 - Posted: 21 Feb 2020 | 23:06:27 UTC - in response to Message 53725.

I see that your friend's host ID #524248 is showing "[64] NVIDIA GeForce RTX 2080 (8192MB) driver: 440.59".
Perhaps there is new protection at the server to reject requests from systems with such a high number of GPUs?
Please try instructing this new user to have his system reflect its true number of GPUs for a while, to test whether that corrects the problem.
Toni kindly requested in this post that these kinds of practices not be used.

I've been getting work all day for all my hosts, the majority of which have spoofed GPU counts.

No problem there.

Ian&Steve C.
Joined: 21 Feb 20
Posts: 68
Credit: 951,958,976
RAC: 6,392,869
Message 53732 - Posted: 22 Feb 2020 | 2:22:15 UTC

Hi guys, I'm the guy who was having issues. I was finally able to get some work and earn credit, so I can actually post.

I tried resetting the project, but that did not work. It still gave the same message that no tasks are available.

Then I removed my custom edits and let BOINC re-generate my coproc_info.xml with the correct number of GPUs [7]. It then downloaded some WUs and processed them.

I will have to play around with some things. I see that Keith uses the spoofing also without any detriment, so maybe something just wasn't right in my coproc file.


Second issue: GPUGrid doesn't seem to play nicely with Einstein when they are both set to a 0 resource share. Basically I want to run them BOTH as backups, so that when SETI runs out, they share the backup load equally. Is that possible? Right now, I was only able to download GPUGrid work if I suspended Einstein.

Keith Myers
Message 53735 - Posted: 22 Feb 2020 | 3:36:32 UTC - in response to Message 53732.
Last modified: 22 Feb 2020 | 3:43:09 UTC

Glad you finally got work, Ian. So how would that work if both Einstein and GPUGrid have a resource share of zero when SETI runs out of work?

Who gets the first shot at work? Is it based on where each project comes out in the alphabet? That seems to be the case with my 4-project hosts: Einstein goes first, then GPUGrid, then Milkyway, and finally SETI, when all the projects update at the same time or when you first fire up BOINC.

Or is it the project with the largest REC deficit? I would think that is what is supposed to happen. But Einstein will always send more work than you want unless you set a really small cache size, and if it gets to the schedulers first it will fill up your hard-drive space allocation and prevent the other projects from getting work.

That is what was occurring on my dual Einstein/GPUGrid host. GPUGrid never could get work because the GPU cache was always full with Einstein. Only when there was a brief outage at Einstein and the GPU cache had dwindled down did GPUGrid get a chance to download work. And once that happened, GPUGrid commandeered the GPUs totally because it had to work off a huge REC deficit to Einstein.

Keith Myers
Message 53736 - Posted: 22 Feb 2020 | 3:41:43 UTC

Maybe the servers thought that 64 GPUs was out of bounds, while 24 or 32 spoofed GPUs is acceptable.

Ian&Steve C.
Message 53737 - Posted: 22 Feb 2020 | 3:55:47 UTC - in response to Message 53736.

Glad you finally got work, Ian. So how would that work if both Einstein and GPUGrid have a resource share of zero when SETI runs out of work?

Who gets the first shot at work? Is it based on where each project comes out in the alphabet? That seems to be the case with my 4-project hosts: Einstein goes first, then GPUGrid, then Milkyway, and finally SETI, when all the projects update at the same time or when you first fire up BOINC.

Or is it the project with the largest REC deficit? I would think that is what is supposed to happen. But Einstein will always send more work than you want unless you set a really small cache size, and if it gets to the schedulers first it will fill up your hard-drive space allocation and prevent the other projects from getting work.

That is what was occurring on my dual Einstein/GPUGrid host. GPUGrid never could get work because the GPU cache was always full with Einstein. Only when there was a brief outage at Einstein and the GPU cache had dwindled down did GPUGrid get a chance to download work. And once that happened, GPUGrid commandeered the GPUs totally because it had to work off a huge REC deficit to Einstein.


With a 0 resource share, the project is "supposed" to only send you 1 WU per device, and basically send them back at a 1:1 ratio: send one, get one. It appears that GPUGrid is following this. It doesn't look like Einstein is following it, though; even when I had 64 GPUs listed, it would routinely send me 80-100 WUs. I think JStateson complained about this at Einstein as well.

How do you have resource share allocated for GPUGrid? I thought you ran GPUGrid as a backup project on one or more of your systems.

Maybe the servers thought that 64 GPUs was out of bounds, while 24 or 32 spoofed GPUs is acceptable.


If so, this must be some kind of project limit. BOINC's limit is 64, and it never prevented me from getting work at SETI or Einstein.

Keith Myers
Message 53738 - Posted: 22 Feb 2020 | 4:26:33 UTC - in response to Message 53737.
Last modified: 22 Feb 2020 | 4:28:00 UTC

No, I don't run any project as a backup. They all get a standard resource share.

I have never seen Einstein obey the standard BOINC protocols for work allotment.

They don't run any current server software; by now it's getting to be almost ten years old.

Zalster
Message 53740 - Posted: 22 Feb 2020 | 4:57:59 UTC - in response to Message 53738.
Last modified: 22 Feb 2020 | 4:58:11 UTC

Keith has started to do what I do. Dedicated machines for certain projects.

biodoc
Message 53741 - Posted: 22 Feb 2020 | 11:18:41 UTC
Last modified: 22 Feb 2020 | 11:35:19 UTC

So setting "0" resource share on both GPUGrid and Einstein results in only Einstein tasks downloading and processing. Is that correct? If so, have you tried adding an app_config.xml file to the Einstein project folder restricting the number of max concurrent tasks to 3 or 4?

<app_config>
<project_max_concurrent>3</project_max_concurrent>
</app_config>

EDIT: I'm hoping the above app_config.xml file will limit Einstein to 3 GPUs and leave the other 4 free to process GPUGrid tasks if they download.
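If limiting the whole project proves too blunt, app_config.xml also accepts a per-application cap. A minimal sketch, with the caveat that the app name below is only an example; the real name must match the <name> element for that app in the project's section of client_state.xml:

```xml
<app_config>
  <app>
    <!-- Example app name; replace with the GPU app name from client_state.xml -->
    <name>hsgamma_FGRPB1G</name>
    <!-- Cap just this app at 3 simultaneous tasks -->
    <max_concurrent>3</max_concurrent>
  </app>
</app_config>
```

The client re-reads this via Options > Read config files in BOINC Manager, so no restart should be needed.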

Ian&Steve C.
Message 53742 - Posted: 22 Feb 2020 | 15:21:35 UTC - in response to Message 53741.

Hmm, maybe that would work. I'll have to play around with it.

But since I'm planning to run GPUGrid as a backup, I think on this host I'll just leave Einstein set to NNT (No New Tasks), and only release it if GPUGrid doesn't have any work at the time.

I have 2 other hosts [7x RTX 2070, 10x RTX 2070] that run SETI (prime)/Einstein (backup), but one of them has all 7 of its GPUs on USB PCIe 3.0 x1 risers (PCIe 3.0 x1/x1/x1/x1/x1/x1/x1). That is fine for SETI/Einstein, but looking at PCIe usage on GPUGrid from the 2080 system, it looks to need at least a PCIe 3.0 x4 link. I'm observing ~20% PCIe use on a 3.0 x16 link and ~40% use on a 3.0 x8 link, meaning anything less than a 3.0 x4 link will introduce a bottleneck. The 7x 2080 system has all cards plugged into the board with 3.0 x16/x8/x8/x16/x8/x8/x8 links respectively, thanks to some clever PLX switches integrated into the motherboard.
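For what it's worth, the lane arithmetic above can be sanity-checked in a couple of lines. The inputs are the utilization figures quoted in this post, and this is only a linear back-of-the-envelope estimate, not a real bandwidth measurement:

```python
# Estimate how many fully-used PCIe 3.0 lanes of traffic an app generates,
# assuming reported utilization scales linearly with link width.
def lanes_of_traffic(utilization: float, link_width: int) -> float:
    """Equivalent number of saturated PCIe lanes at the same generation."""
    return utilization * link_width

x16_traffic = lanes_of_traffic(0.20, 16)  # ~3.2 lanes on the x16 card
x8_traffic = lanes_of_traffic(0.40, 8)    # ~3.2 lanes on the x8 card

# Both observations agree on roughly 3.2 lanes' worth of data, so a x4 link
# has a little headroom while a x1 riser would be about a 3x bottleneck.
```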

The other system has 8 GPUs on a 3.0 x8 link each and 2 on USB risers (PCIe 3.0 x8/x8/x8/x8/x8/x8/x8/x8/x1/x1). I could probably play around with the gpu_exclude flag, but it might be easier to just dedicate the one system to SETI (prime)/GPUGrid (backup), and let the other 2 be SETI (prime)/Einstein (backup).
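As a sketch of the gpu_exclude route: a cc_config.xml in the BOINC data directory can keep GPUGrid off specific devices. The device number below is hypothetical; the real numbers are listed in the event log at client startup, and the URL must match the project URL exactly as the client knows it:

```xml
<cc_config>
  <options>
    <!-- Keep GPUGrid off one riser-mounted card; repeat this block
         for each device number to exclude. -->
    <exclude_gpu>
      <url>https://www.gpugrid.net/</url>
      <device_num>8</device_num>
      <type>NVIDIA</type>
    </exclude_gpu>
  </options>
</cc_config>
```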
