Message boards : Server and website : Outage? Apparently not?!?

WPrion
Joined: 30 Apr 13
Posts: 77
Credit: 1,034,112,811
RAC: 609
Level
Met
Message 53791 - Posted: 28 Feb 2020 | 14:07:00 UTC
Last modified: 28 Feb 2020 | 14:07:29 UTC

Since I don't see any comments about the "outage" I guess it was just me.

From yesterday until this morning my computer couldn't communicate with gpugrid. I had two finished tasks that would not upload for many hours. Event log example:
2/28/2020 12:16:20 AM | | Project communication failed: attempting access to reference site
2/28/2020 12:16:21 AM | | Internet access OK - project servers may be temporarily down.

I couldn't even connect to the Server Status page or Forums.

So I switched to another project which worked fine, but it also blocked gpugrid (tasks full even though I have gpu set at a much higher priority) until this morning when I manually set things back. Aside from gpugrid my computer and internet connection were fine.

Any ideas?

Thanks,

Win

Erich56
Joined: 1 Jan 15
Posts: 697
Credit: 3,295,149,283
RAC: 329,689
Level
Arg
Message 53792 - Posted: 28 Feb 2020 | 14:11:54 UTC - in response to Message 53791.

Since I don't see any comments about the "outage" I guess it was just me.

I don't think it was just you. Same thing happened here. I guess GPUGRID was cut off from the internet for a while.

Toni
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 9 Dec 08
Posts: 947
Credit: 4,353,973
RAC: 48
Level
Ala
Message 53793 - Posted: 28 Feb 2020 | 15:11:37 UTC - in response to Message 53792.

There was an outage. Resolved this morning. In such cases we put notices on twitter.

WPrion
Joined: 30 Apr 13
Posts: 77
Credit: 1,034,112,811
RAC: 609
Level
Met
Message 53794 - Posted: 28 Feb 2020 | 16:07:20 UTC - in response to Message 53793.

There was an outage. Resolved this morning. In such cases we put notices on twitter.


OK, Thanks.

I guess twitter makes sense when the normal connections are down. But since I don't do twitter I'll have to wait until things are fixed.

I was just surprised I didn't see any post here about the outage after-the-fact.

Keith Myers
Joined: 13 Dec 17
Posts: 508
Credit: 532,080,241
RAC: 1,713,014
Level
Lys
Message 53795 - Posted: 28 Feb 2020 | 17:43:47 UTC

I kept asking in the BOINC forums if everyone else was having issues and saw no posts. Found other people saying they could not upload or contact the project. I looked in the BOINC Projects forum and on the BoincStats site for any news of a project outage and found no mention.

I even looked at Twitter which I don't use and found nothing.


It would have alleviated my concern if the BOINC Projects forum had had a post about project status. It seems the logical place for news if the project website is unreachable or missing.

Richard Haselgrove
Joined: 11 Jul 09
Posts: 1003
Credit: 2,537,765,232
RAC: 3,302,573
Level
Phe
Message 53796 - Posted: 28 Feb 2020 | 20:38:20 UTC - in response to Message 53795.

Would have alleviated my concern if the BOINC Projects forum would have had a post about project status. Seems the logical place for news if the project website is unreachable or missing.
It's usually Moderators or other active volunteers who spread the news of either planned outages or other unexpected breakdowns. This particular overnight outage corresponded with European volunteers' sleep time. Our early-morning observations coincided with the staff restarting the project, so that's what I posted.

ServicEnginIC
Joined: 24 Sep 10
Posts: 199
Credit: 1,458,715,434
RAC: 823,206
Level
Met
Message 53797 - Posted: 29 Feb 2020 | 1:13:12 UTC - in response to Message 53793.

There was an outage. Resolved this morning.

Every project I collaborate with suffers temporary outages from time to time. No exception.

I took this one as an opportunity to experiment.
Early this morning, I found that all of my computers had stalled uploads for finished WUs, and I took screenshots. 5 computers, 12 WUs.
After a little spreadsheet work, we have this:



Some conclusions for current GPUGrid WUs:
-1) Every successfully processed WU returns 6 files, with names ending in _0, _1, _2, _8, _9 and _10
_10 file size is small and constant: 274 bytes
_0 file size is nearly constant, ranging from 3.15 to 3.20 KB in the studied WUs
_1 and _2 file sizes are equal to each other but variable, ranging from 545 to 911 KB in the sample
_8 files have always been the biggest ones, variable, ranging from 2.67 to 4.45 MB in the observed WUs
_9 file sizes are also variable, ranging from 927 to 1536 KB in the sample

-2) The total size of the returned results can be taken as a very good indicator of WU complexity, and it is independent of the computers and GPUs used for processing. Slower computers will take more time to process a WU, but the resulting file sizes are the same as when the same WU is processed on a faster one.

-3) As seen in the Credits/MB column, this value is nearly constant: on average 2824 credits per MB returned.
Credits awarded are proportional to WUs complexity, and therefore to the total size of the returned results.

-4) Average size of the returned results was 6.367 MB per WU. Currently 310,000 WUs are ready to send: potentially 1.88 TB in results (wow)
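
As a rough check of the arithmetic in points 3 and 4, here is a minimal Python sketch. It only uses the three figures quoted above (reading the average size as 6.367 MB, i.e. with a decimal comma). The function names are made up for illustration, and the binary conversion (1 TB = 1024 x 1024 MB) is an assumption that happens to reproduce the 1.88 TB figure; with decimal units the total comes out closer to 1.97 TB.

```python
# Figures quoted in the post; everything else here is illustrative.
AVG_RESULT_MB = 6.367    # average returned size per WU, in MB
CREDITS_PER_MB = 2824    # average credits per MB returned
WUS_READY = 310_000      # WUs reported as ready to send


def total_upload_tb(wus: int, avg_mb: float, binary: bool = True) -> float:
    """Total result volume in TB.

    binary=True assumes 1 TB = 1024 * 1024 MB; binary=False uses 1,000,000 MB.
    """
    mb_per_tb = 1024 * 1024 if binary else 1_000_000
    return wus * avg_mb / mb_per_tb


def estimated_credits(result_mb: float, credits_per_mb: float = CREDITS_PER_MB) -> float:
    """Rough credit estimate for one WU, assuming credits scale linearly with upload size."""
    return result_mb * credits_per_mb


if __name__ == "__main__":
    print(f"Binary units:  {total_upload_tb(WUS_READY, AVG_RESULT_MB):.2f} TB")
    print(f"Decimal units: {total_upload_tb(WUS_READY, AVG_RESULT_MB, binary=False):.2f} TB")
    print(f"Credits for an average WU: {estimated_credits(AVG_RESULT_MB):.0f}")
```

The credit estimate is only as good as the sample: 12 WUs from one batch, so the 2824 credits/MB ratio should not be assumed to hold for other WU types.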

Now my curiosity on this point is somewhat satisfied.

Toni
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 9 Dec 08
Posts: 947
Credit: 4,353,973
RAC: 48
Level
Ala
Message 53799 - Posted: 29 Feb 2020 | 8:32:46 UTC - in response to Message 53797.

All correct.

The crunch time AND credits are proportional (linearly) to system size AND the simulation length. Sim length for these WUs is all the same. System size varies.

Upload size is also proportional (with a ratio that differs for our other projects).

Aurum
Joined: 12 Jul 17
Posts: 252
Credit: 9,791,563,847
RAC: 2,648,925
Level
Tyr
Message 53801 - Posted: 29 Feb 2020 | 9:56:10 UTC - in response to Message 53793.
Last modified: 29 Feb 2020 | 10:09:00 UTC

There was an outage. Resolved this morning. In such cases we put notices on twitter.

I'm banned for life from Tweeter for saying what I really think of ImPOTUS the Imbecile and his Lickspittle Lackeys.

If folks had a day of WUs DLed maybe they wouldn't get so rattled. Time to increase 2 WU/GPU to 12 WU/GPU.

Bedrich Hajek
Joined: 28 Mar 09
Posts: 397
Credit: 5,243,559,537
RAC: 2,478,831
Level
Tyr
Message 53812 - Posted: 1 Mar 2020 | 22:47:23 UTC

I don't know whether or not this is related to the outage, but just before the outage the server created another account for one of my computers.

Computer ID 263612 and computer ID 524728 are the same machine:



http://www.gpugrid.net/hosts_user.php?userid=19626



Bedrich Hajek
Joined: 28 Mar 09
Posts: 397
Credit: 5,243,559,537
RAC: 2,478,831
Level
Tyr
Message 53865 - Posted: 6 Mar 2020 | 1:14:55 UTC - in response to Message 53812.

I don't know whether or not this is related to the outage, but just before the outage the server created another account for one of my computers.

Computer ID 263612 and computer ID 524728 are the same machine:



http://www.gpugrid.net/hosts_user.php?userid=19626





The server created yet another account for my computer, ID 525914 (see link above). It happened around the time the server was temporarily inaccessible. This "server inaccessibility" has happened several times in the last few weeks. During these periods I can't even load the gpugrid website, even though I have full internet access. It seems the server has trouble handling the 300,000+ WUs.






