Message boards : Server and website : Nearly impossible to UPLOAD
Betting Slip
Joined: 5 Jan 09
Posts: 666
Credit: 2,498,095,550
RAC: 4,265
Level
Phe
Message 42224 - Posted: 25 Nov 2015 | 11:48:22 UTC

Nearly impossible to UPLOAD results and Website extremely SLOW.

Gerard
Volunteer moderator
Project developer
Project scientist
Joined: 26 Mar 14
Posts: 101
Credit: 0
RAC: 0
Message 42226 - Posted: 27 Nov 2015 | 14:31:32 UTC - in response to Message 42224.

fixed?

Betting Slip
Joined: 5 Jan 09
Posts: 666
Credit: 2,498,095,550
RAC: 4,265
Level
Phe
Message 42227 - Posted: 27 Nov 2015 | 17:19:43 UTC - in response to Message 42226.
Last modified: 27 Nov 2015 | 17:20:01 UTC

Yes.

Jacob Klein
Joined: 11 Oct 08
Posts: 1089
Credit: 1,366,824,414
RAC: 725,322
Level
Met
Message 42251 - Posted: 1 Dec 2015 | 3:23:15 UTC - in response to Message 42227.

Cause and solution?

Betting Slip
Joined: 5 Jan 09
Posts: 666
Credit: 2,498,095,550
RAC: 4,265
Level
Phe
Message 42255 - Posted: 1 Dec 2015 | 9:24:59 UTC - in response to Message 42251.
Last modified: 1 Dec 2015 | 9:26:04 UTC

The cause seemed to be something on the project's side, Jacob, so I can't tell you any more than that for a brief period the site was very slow and I couldn't complete the upload of the big file of a WU.

Jacob Klein
Joined: 11 Oct 08
Posts: 1089
Credit: 1,366,824,414
RAC: 725,322
Level
Met
Message 42336 - Posted: 9 Dec 2015 | 3:24:35 UTC

Right. I was hoping Gerard would supply the cause/solution. Thanks though.

Dayle Diamond
Joined: 5 Dec 12
Posts: 77
Credit: 1,300,910,364
RAC: 734,205
Level
Met
Message 42677 - Posted: 22 Jan 2016 | 16:49:13 UTC

My upload is stuck. Been trying for 48 minutes. Anyone else having problems?

TJ
Joined: 26 Jun 09
Posts: 815
Credit: 1,470,348,955
RAC: 0
Level
Met
Message 42969 - Posted: 9 Mar 2016 | 18:30:22 UTC

Uploading is no problem any more, but downloading a new WU now seems not to work: it has been trying for hours and only 20% is done.
____________
Greetings from TJ

Beyond
Joined: 23 Nov 08
Posts: 1073
Credit: 4,594,506,804
RAC: 307,067
Level
Arg
Message 43275 - Posted: 28 Apr 2016 | 19:37:45 UTC - in response to Message 42969.
Last modified: 28 Apr 2016 | 19:38:41 UTC

I've also been having major problems with downloads for quite a while. The problem seems to be with the *-coor_file. It will partially download and then hang for hours, leaving the GPU idle and causing missed bonus deadlines. This is happening on ALL of my machines, with various versions of BOINC, various hardware, and various drivers. Often the only way to get the "coor" file downloaded is to disable and re-enable network communication (sometimes several times before the file completely downloads). How can this problem be solved?

Betting Slip
Joined: 5 Jan 09
Posts: 666
Credit: 2,498,095,550
RAC: 4,265
Level
Phe
Message 43276 - Posted: 28 Apr 2016 | 20:34:17 UTC - in response to Message 43275.
Last modified: 28 Apr 2016 | 20:34:42 UTC

I have this problem as well. The only way I can cure it faster than letting it sort itself out is to wait until the download goes from "active" to "retry" (about 6 minutes) and then click "Retry Now".

Richard Haselgrove
Joined: 11 Jul 09
Posts: 883
Credit: 1,726,401,470
RAC: 1,104,282
Level
His
Message 43277 - Posted: 28 Apr 2016 | 22:40:41 UTC - in response to Message 43276.

Another way is to use the 'Activity' menu: suspend network activity, then return to your normal setting. You can set that any time, as soon as you notice the download has stalled - even if the status is still listed as 'active'.

If this happens on remote machines where you can't intervene manually, you could try setting the following Client Configuration option to a lower figure, like 60 seconds:

<http_transfer_timeout>seconds</http_transfer_timeout>
Abort HTTP transfers if idle for this many seconds; default 300.

Use with care: I haven't needed this myself, so I can't vouch for it personally. But it should save you 4 minutes before the start of each retry.
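
For anyone wanting to try this, here is a minimal sketch of how the option might look in cc_config.xml in the BOINC data directory. The 60-second value is just the example figure from above, not a tested recommendation, and if you already have a cc_config.xml you would add the one line inside its existing <options> section rather than replace the file.

```xml
<!-- cc_config.xml: illustrative sketch only; 60 is an assumed example value -->
<cc_config>
  <options>
    <!-- Abort HTTP transfers if idle for this many seconds (default 300) -->
    <http_transfer_timeout>60</http_transfer_timeout>
  </options>
</cc_config>
```

The client only picks up the change after it re-reads its config files or is restarted.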

Beyond
Joined: 23 Nov 08
Posts: 1073
Credit: 4,594,506,804
RAC: 307,067
Level
Arg
Message 43281 - Posted: 29 Apr 2016 | 18:05:45 UTC - in response to Message 43277.

Thanks for the tips. The problem seems to be with the -coor_file and also the -vel_file (interesting that both are the same size). Very occasionally another GPUGrid download file may fail. I've seen this happen ONLY at GPUGrid and have experienced it on no other project, so it's hard to believe it's not a server misconfiguration here. I'm trying your client configuration suggestion and also another cc_config option. This stalling problem is happening on every GPUGrid download and can leave the GPUs idle for hours. I have GPUGrid project priority set to 0 (the only NVidia project) to try to make some of the bonus times, something that is becoming harder as the WU size has increased so quickly. It would be nice if the bonus deadlines were increased a bit as well, as that would solve some issues. I'll report back with the results.

Beyond
Joined: 23 Nov 08
Posts: 1073
Credit: 4,594,506,804
RAC: 307,067
Level
Arg
Message 43569 - Posted: 25 May 2016 | 15:18:10 UTC
Last modified: 25 May 2016 | 15:52:31 UTC

Can anything be done about the connectivity issues at GPUGrid? Besides the download timeouts, I now have 11 WUs trying to upload. They are progressing, but the keyword is slowly: even if I pause all other uploads, I can't get any single upload to progress at over 17Kbps. Here are 2 WUs that finished over 6 hours ago but are still uploading:

GPUGRID e11s49_e2s8p0f47-GERARD_FXCXCL12R_1480490_1-0-1-RND0590_0 21:39:53 5/25/2016 3:53:55 AM 0.939C + 1NV (d1) Uploading

GPUGRID e11s41_e10s25p0f697-GERARD_FXCXCL12R_1480490_1-0-1-RND2403_0 21:33:05 5/25/2016 3:53:55 AM 0.939C + 1NV (d0) Uploading

The completion times are both under 21:40 but they've now missed the 1 day deadline and are still not nearly uploaded. My internet connection is DSL and not very fast (even by monopolistic USA standards) but it's the top speed available in my area. In fact I spent two days twisting some arms to get provisioned at over what is supposedly the top speed. That still doesn't account for the miserable UL speeds I'm seeing. How can we be expected to meet the tight deadlines under these conditions? It's frustrating...

Edit: Now 12 WUs uploading.

Beyond
Joined: 23 Nov 08
Posts: 1073
Credit: 4,594,506,804
RAC: 307,067
Level
Arg
Message 43576 - Posted: 25 May 2016 | 17:21:09 UTC

Particularly lovely are all the transient upload (and download) failures:

"GPUGRID 5/25/2016 12:09:02 PM Temporarily failed upload of e11s41_e10s25p0f697-GERARD_FXCXCL12R_1480490_1-0-1-RND2403_0_9: transient HTTP error"

Then BOINC backs off for a crazy amount of time (in this case 4 hours and 25 minutes), leaving the 99,995k upload stalled at 99.394% complete:

"GPUGRID 5/25/2016 12:09:02 PM Backing off 04:25:27 on upload of e11s41_e10s25p0f697-GERARD_FXCXCL12R_1480490_1-0-1-RND2403_0_9"

Is there any way to modify or disable this BOINC behavior?

Vagelis Giannadakis
Joined: 5 May 13
Posts: 187
Credit: 349,254,454
RAC: 0
Level
Asp
Message 43582 - Posted: 26 May 2016 | 9:32:51 UTC - in response to Message 43576.

I don't know whether you can change some configuration option and make BOINC behave more sanely in this respect by itself, but the best manual approach I've found to mitigate this is to disable and re-enable network activity. All uploads / downloads resume immediately, disregarding the back-off times.

Beyond
Joined: 23 Nov 08
Posts: 1073
Credit: 4,594,506,804
RAC: 307,067
Level
Arg
Message 43595 - Posted: 27 May 2016 | 4:42:47 UTC - in response to Message 43582.
Last modified: 27 May 2016 | 4:45:40 UTC

I've been doing that, but the file transfers are timing out constantly and I'm babysitting it WAY too much. It turns out the back-off can be controlled, but as far as I can see you have to start BOINC from the command line. The command is in this format (when used from a Windows shortcut):

"D:\Program Files\BOINC\boinc.exe" --pers_retry_delay_max 300

It does work. The last number is in seconds. Give it enough time so that the file is unlocked on the server.

skgiven
Volunteer moderator
Project tester
Volunteer tester
Joined: 23 Apr 09
Posts: 3968
Credit: 1,914,631,885
RAC: 121,638
Level
His
Message 43664 - Posted: 31 May 2016 | 8:36:37 UTC - in response to Message 43595.

Missing GERARD_FXCXCL12R task:

Task, http://84.89.134.145/result.php?resultid=15112472

Work Unit

There was no sign of this ever being on my system. I guess it didn't get sent properly or just didn't download properly. Just saying in case some of the recent timeouts were due to tasks never arriving in the first place. If there was an increase in timeouts recently you might be able to see that server-side.