Message boards : Number crunching : Boinc Testing
Author | Message |
---|---|
The only thing: BOINC does not want to get, or the project is not sending, the 2nd WU at all. Basically it starts to download a new WU only when it finishes crunching, which means a couple of minutes of idling. I'm using v6.12.4 also and am seeing this problem in all GPU projects. It seems that the GPU queue slowly shrinks and has to be constantly enlarged to maintain the same size. It's especially noticeable in Collatz since the queue is much larger. I believe this started in BOINC v6.11.7 and is still unfixed. The only way I've found to reset the queue size is to reinstall v6.11.6 or earlier (I generally use v6.10.58 for this), let it run until the work is downloaded, then install v6.12.4 again. At that point the proper queue is reinstated but slowly starts to shrink again. Anyone else noticing this? | |
ID: 19224 | Rating: 0 | rate: / Reply Quote | |
That sounds like awful babysitting. Any benefit to not sticking with 6.10.58 for now? | |
ID: 19225 | Rating: 0 | rate: / Reply Quote | |
The only thing: BOINC does not want to get, or the project is not sending, the 2nd WU at all. Basically it starts to download a new WU only when it finishes crunching, which means a couple of minutes of idling. Well, in a way that's to be expected. v6.12.4 is prominently marked (MAY BE UNSTABLE - USE ONLY FOR TESTING) - you've obviously found an instability in the cache size algorithm. But unless you report test results, and give the developers some data to work from, the fixing process is inevitably slow. As it happens, a very similar report appeared on boinc_alpha last week, again with reference to Collatz. And again, it refers to v6.12.4: if you've been seeing it since v6.11.7, may I ask why it hasn't been reported until now? My guess is that the unexpected 'High Priority' running a couple of us mentioned in this thread yesterday shares a common cause - over-estimation of the duration of the work currently in the cache. I can work on that - a BOINC programmer is already looking at the logs I submitted. But as to cache size - I'm already, deliberately, only allowing new work ten minutes before the previous task finishes - so my slow cards have an opportunity to return the new bigger tasks within 24 hours. So this new, er, "feature" suits me down to the ground ;-) | |
ID: 19228 | Rating: 0 | rate: / Reply Quote | |
That sounds like awful babysitting. Any benefit to not sticking with 6.10.58 for now? Yes. BOINC 6.11.x fixes the problem that caused the backup project to DL a whole queue full of WUs when the main project ran out of work. BOINC 6.12.4 has a much welcomed fix to the GPU project FIFO problem. Now GPU project priorities are properly honored. Big fixes for me and yes for me worth the hassle. | |
ID: 19235 | Rating: 0 | rate: / Reply Quote | |
The only thing: BOINC does not want to get, or the project is not sending, the 2nd WU at all. Basically it starts to download a new WU only when it finishes crunching, which means a couple of minutes of idling. Hi Richard. In fact I was the one that reported it on the alpha list last week. I did so because there was talk of releasing 6.12.4 (or a slightly modified version) as the preferred client. I probably should have reported it earlier but didn't for a couple of reasons. First, the bug seemed so obvious that I thought they HAD to know about it. Second, my experience is that unless you're one of "the insiders" on the list you generally get either a condescending comment or ignored completely. There also seems to be a bias against what they consider "power users". In this case to my surprise David asked for debug logs. I sent them to him and he replied that I should increase my "connect every" setting instead of using the "additional work buffer". I tried his suggestion and reported to him that it did not work either; the change made no difference. I haven't heard back since so I'll post another alpha list message to all instead of just David. The focus on the list seems to be on devising a bizarre new credit system, wasted effort IMO and one that will probably send many scurrying for the exits. To his credit and in fairness, David did suggest a fix. It just didn't work. Perhaps he missed my follow-up e-mail. I know he's an extremely busy guy. Also I much appreciate the addition of the backup project that I long lobbied for. Probably others did too. It certainly makes my BOINC life easier. Regards/Ed | |
ID: 19236 | Rating: 0 | rate: / Reply Quote | |
Okay, I seem to be one of the ones he's listening to at the moment, so I've got some more stuff I'll post to the list in a moment. | |
ID: 19237 | Rating: 0 | rate: / Reply Quote | |
Hi Richard, have noticed many of your posts on the list and consider you one of the "voices of reason" there. I'm not a programmer (was a network manager and network VAR before retirement) so feel a bit out of place there. What's your take on the proposed new credit system that seems to be such a focus of effort? | |
ID: 19238 | Rating: 0 | rate: / Reply Quote | |
Hi Richard, have noticed many of your posts on the list and consider you one of the "voices of reason" there. I'm not a programmer (was a network manager and network VAR before retirement) so feel a bit out of place there. What's your take on the proposed new credit system that seems to be such a focus of effort? In a word, "immature". It actually needs more development work, rather than less - but until SETI gets its new servers (so there's an active large-scale project, with the working code, just down the lab for inspection), that project is probably on hold. I'll be dragging him back to http://img231.imageshack.us/img231/1146/seticlaimedvsgrantedcre.png in due course. Actually, there's more work going on now with the proposed new client scheduler (details mostly in boinc_dev), largely arising out of discussions like this one. Although I've done some programming elsewhere, so I have some idea of the way their minds work (and much sympathy with the repeated calls for logs and other data), I don't get involved in debugging the actual BOINC code except in the most trivial cases. I just report observations of the macro effects, and leave it to them to fix them. You don't need programming skills to help with the reporting ;-) | |
ID: 19239 | Rating: 0 | rate: / Reply Quote | |
How can I get onto that alpha list? There are a couple of other bugs I see in 6.11.7-8-9 as well as 6.12.4. One of them: BOINC Manager sometimes disappears although my rig continues to crunch. I reopen the manager and everything works fine, but this did not happen on 6.10.58. | |
ID: 19240 | Rating: 0 | rate: / Reply Quote | |
How can I get onto that alpha list? There are a couple of other bugs I see in 6.11.7-8-9 as well as 6.12.4. One of them: BOINC Manager sometimes disappears although my rig continues to crunch. I reopen the manager and everything works fine, but this did not happen on 6.10.58. Good question. It should be shown at BOINC email lists, but isn't. If there was a link on that page, it would lead here. I'll ask why it isn't there. You need to subscribe to the list, and then acknowledge the confirmation message, before you can contribute to the discussions. | |
ID: 19242 | Rating: 0 | rate: / Reply Quote | |
How can I get onto that alpha list? There are a couple of other bugs I see in 6.11.7-8-9 as well as 6.12.4. One of them: BOINC Manager sometimes disappears although my rig continues to crunch. I reopen the manager and everything works fine, but this did not happen on 6.10.58. I've noticed that sometimes when I close BOINC Manager it just exits without popping up the dialogue to close the project apps. For a LONG time now, BOINC will sometimes crash and refuse to start. When this happens, killing boinctray.exe in Task Manager always allows BOINC to restart. In fact, commenting out boinctray.exe in startup seems to solve some of the above problems. Richard, what exactly is the purpose of boinctray.exe? As I remember, it had something to do with Vista, but I never really understood why it's needed and why it is always installed with every new BOINC version. | |
ID: 19244 | Rating: 0 | rate: / Reply Quote | |
Thread for the General Discussion of Boinc Development and Test Versions. | |
ID: 19267 | Rating: 0 | rate: / Reply Quote | |
CTAPbIi - The only thing: BOINC does not want to get, or the project is not sending, the 2nd WU at all. Basically it starts to download a new WU only when it finishes crunching, which means a couple of minutes of idling. | |
ID: 19270 | Rating: 0 | rate: / Reply Quote | |
From the Alpha mailing list regarding the shrinking GPU cache in 6.12.4. Email addresses removed to protect from spam. Date: Thu, 04 Nov 2010 19:07:03 -0700 I think he means <gpu_active_frac> above. | |
ID: 19271 | Rating: 0 | rate: / Reply Quote | |
We were having a chat - off-topic, admittedly - in the hardware area about the curious cases of the incredible shrinking cache and high priority running - in the latest test versions only. <time_stats> <on_frac>0.998909</on_frac> <connected_frac>1.000000</connected_frac> <active_frac>0.999975</active_frac> <gpu_active_frac>0.174423</gpu_active_frac> <last_update>1288914349.413624</last_update> </time_stats> If you change that <gpu_active_frac> to 1.000000, things will return to normal. That should be quicker than reverting to v6.10.58 ;-) The fix (a one-character typo corrected) should be in v6.12.5. | |
ID: 19272 | Rating: 0 | rate: / Reply Quote | |
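For anyone who would rather script the `<gpu_active_frac>` workaround than edit by hand, here is a minimal sketch in Python. It rests on assumptions the thread does not state: that the `<time_stats>` block lives in `client_state.xml` in the BOINC data directory, that the client is stopped before the file is touched, and that a backup copy is kept. The function name `reset_gpu_active_frac` is made up for illustration. The sketch works on a string only, so reading and writing the file is left to the reader.

```python
import re

# Hypothetical helper, not part of BOINC: rewrites the value inside every
# <gpu_active_frac> element and leaves the rest of the text untouched.
_GPU_FRAC = re.compile(r"(<gpu_active_frac>)[^<]*(</gpu_active_frac>)")


def reset_gpu_active_frac(state_xml: str, value: float = 1.0) -> str:
    """Return state_xml with <gpu_active_frac> set to value (six decimals)."""
    return _GPU_FRAC.sub(
        lambda m: f"{m.group(1)}{value:.6f}{m.group(2)}", state_xml
    )


# The block quoted in the post above, used here as sample input.
sample = (
    "<time_stats>\n"
    "    <on_frac>0.998909</on_frac>\n"
    "    <connected_frac>1.000000</connected_frac>\n"
    "    <active_frac>0.999975</active_frac>\n"
    "    <gpu_active_frac>0.174423</gpu_active_frac>\n"
    "    <last_update>1288914349.413624</last_update>\n"
    "</time_stats>\n"
)

print(reset_gpu_active_frac(sample))
```

A plain text substitution is used instead of an XML parser so that nothing else in the file gets reformatted. As noted in the thread, this is only a temporary workaround until a fixed client is installed.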
I've noticed that sometimes when I close BOINC manager it just exits without popping up the dialogue to close the project apps. For a LONG time now sometimes BOINC will crash and refuse to start. When this happens killing boinctray.exe in Task Manager always allows BOINC to restart. In fact commenting out boinctray.exe in startup seems to solve some of the above problems. In theory, boinctray.exe is a simple utility which just monitors for mouse and keyboard activity. That is used to judge whether a user is active on the computer, and hence give effect to BOINC's preferences to suspend activities while the computer is in use, if you use any of those options. I'm not sure how boinctray communicates with the rest of boinc. I don't think it's the TCP/IP used between the core client and the manager: more likely to be the shared memory segment also used for communicating between the core client and the science applications. That might explain why a problem with boinctray blocks the rest of boinc from running. If you don't use 'run according to preferences', your workround should be fine. But for the benefit of other volunteers who do want to use those settings, it would be better to track down the cause properly. Wild speculation as I type: do you use any special input hardware with non-standard or third-party drivers? Given boinctray's job, it should only be in some area like that that problems could arise. | |
ID: 19274 | Rating: 0 | rate: / Reply Quote | |
We were having a chat - off-topic, admittedly - in the hardware area about the curious cases of the incredible shrinking cache and high priority running - in the latest test versions only. Thanks Mark & Richard. Changing <gpu_active_frac> to 1.000000 solved the issue temporarily, certainly better than reinstalling clients. Sure hope 6.12.5 is released soon. | |
ID: 19304 | Rating: 0 | rate: / Reply Quote | |
I've noticed that sometimes when I close BOINC manager it just exits without popping up the dialogue to close the project apps. For a LONG time now sometimes BOINC will crash and refuse to start. When this happens killing boinctray.exe in Task Manager always allows BOINC to restart. In fact commenting out boinctray.exe in startup seems to solve some of the above problems. Thanks for the explanation. I have no special input devices, only generic keyboards and plain vanilla PS/2 or USB mice. No special drivers are loaded for either. Some but not all of the machines are on KVM switches, also with no special drivers loaded. This issue happens on all 9 of my machines from time to time. BOINC manager crashes or loses connection (but boinc.exe and the apps continue) and when reloaded fails to find a BOINC client. Killing boinctray.exe almost always allows the client to reconnect. As CTAPbIi reported above the issue of BOINC manager exiting and/or losing connection to the client has definitely become more common since 6.10.58. | |
ID: 19306 | Rating: 0 | rate: / Reply Quote | |
Thanks for the explanation. I have no special input devices, only generic keyboards and plain vanilla PS/2 or USB mice. No special drivers are loaded for either. Some but not all of the machines are on KVM switches, also with no special drivers loaded. This issue happens on all 9 of my machines from time to time. BOINC manager crashes or loses connection (but boinc.exe and the apps continue) and when reloaded fails to find a BOINC client. Killing boinctray.exe almost always allows the client to reconnect. As CTAPbIi reported above the issue of BOINC manager exiting and/or losing connection to the client has definitely become more common since 6.10.58. Do you use a proxy server? | |
ID: 19315 | Rating: 0 | rate: / Reply Quote | |
Thanks for the explanation. I have no special input devices, only generic keyboards and plain vanilla PS/2 or USB mice. No special drivers are loaded for either. Some but not all of the machines are on KVM switches, also with no special drivers loaded. This issue happens on all 9 of my machines from time to time. BOINC manager crashes or loses connection (but boinc.exe and the apps continue) and when reloaded fails to find a BOINC client. Killing boinctray.exe almost always allows the client to reconnect. As CTAPbIi reported above the issue of BOINC manager exiting and/or losing connection to the client has definitely become more common since 6.10.58. No, direct DSL connection. | |
ID: 19317 | Rating: 0 | rate: / Reply Quote | |
Do you use a proxy server? BOINC was sometimes slow to establish comms for me. Turned out to be waiting on the proxy server to respond. If it's using one, it will be in the BOINC log at startup. | |
ID: 19318 | Rating: 0 | rate: / Reply Quote | |
Do you use a proxy server? I know what a proxy server is. I spent years setting them up as a network manager and later a network VAR. I am not using a proxy server. Here's the line from the BOINC log: 11/5/2010 9:46:11 AM | | Not using a proxy | |
ID: 19319 | Rating: 0 | rate: / Reply Quote | |
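Checking the startup log for proxy messages, as suggested above, can be done with a few lines of Python. This is only a sketch: it assumes the log has been copied or saved as plain text, and it simply looks for the word "proxy" because the only message wording confirmed in this thread is "Not using a proxy". The function name `proxy_lines` and the second sample line are placeholders for illustration.

```python
def proxy_lines(log_text: str) -> list[str]:
    """Return every log line that mentions a proxy, case-insensitively."""
    return [
        line.strip()
        for line in log_text.splitlines()
        if "proxy" in line.lower()
    ]


# First line is the one quoted in the thread; the second is a made-up
# placeholder standing in for any unrelated startup message.
sample_log = (
    "11/5/2010 9:46:11 AM | | Not using a proxy\n"
    "11/5/2010 9:46:11 AM | | (placeholder for another startup message)\n"
)

for line in proxy_lines(sample_log):
    print(line)
```

If nothing is printed for a real log, the startup section probably was not included in the text that was scanned.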
Do you use a proxy server? Is that with a modem (with the computer handling the connection, IP addressing etc.), or with a router (with the connection, login, DNS etc. handled by the router's firmware)? I think I've heard people saying that BOINC can have problems with its internal (localhost) TCP/IP communication between client and manager, if the external (WAN) connection goes down. I've always used routers, so my external connection appears as a LAN, and remains intact even if the DSL goes down - and I've never had this 'losing connection' problem, even though I've tested some pretty ropey versions of BOINC along the way. | |
ID: 19322 | Rating: 0 | rate: / Reply Quote | |
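One way to tell whether the manager or the client is at fault when the connection is lost is to test the localhost TCP port directly. The sketch below assumes the client is listening for the manager on port 31416, which is the usual default as far as I know; adjust it if your setup differs. The function name `client_port_open` is made up for illustration, and an open port only shows that something is listening, not that the client is healthy.

```python
import socket


def client_port_open(host: str = "127.0.0.1",
                     port: int = 31416,
                     timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection raises OSError (including timeouts) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


print("client port reachable:", client_port_open())
```

If this reports the port as reachable while BOINC Manager still says it cannot find a client, that would point at the manager side rather than the network.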
Do you use a proxy server? DSL modem. Used DHCP for a while with the modem handling the routing. Now all the machines have fixed IPs with DNS set on the computer except for a couple that connect wirelessly. BOINC Manager loses connection occasionally in all instances. It was less common with 6.10.58 than with later versions. In the later versions I also sometimes see that BM just disappears as CTAPbIi notes above. Even if this problem is related to network variances, BM should be able to handle them gracefully without crashing or losing connection to the client on its local machine. | |
ID: 19330 | Rating: 0 | rate: / Reply Quote | |
I’ve occasionally received similar messages along the lines of "Boinc is trying to contact the local client, please wait". After a few seconds it connects, so for me it’s not really a problem. I notice it frequently when starting Boinc Manager, but sometimes when I do a manual update. I had put this down to my new router/modem, which also comes with wireless, and I see this on both the wireless and wired nodes (local network, always connected, router assigned IP6+4 address), but it could be an issue with Boinc itself. I’m only using systems that have been around for a while, so it may be a result of numerous Boinc updates. Boinc seems to be losing the local host source route. | |
ID: 19332 | Rating: 0 | rate: / Reply Quote | |
For clarity, I would describe all devices (capable of) running DHCP and supporting multiple computers simply as routers. | |
ID: 19335 | Rating: 0 | rate: / Reply Quote | |
Well yes, for simplicity I would go along with that; it’s a router, but because I use it as a router. | |
ID: 19337 | Rating: 0 | rate: / Reply Quote | |