Message boards : Graphics cards (GPUs) : Beta BOINC client 6.6.12 available

Neil A
Joined: 9 Oct 08
Posts: 50
Credit: 12,676,739
RAC: 0
Message 7173 - Posted: 4 Mar 2009 | 13:39:38 UTC

Beta BOINC client 6.6.12 is available for download and testing. Note that this is a beta release and should be used only by those who are willing to test it and report any problems found.
____________
Crunching for the benefit of humanity and in memory of my dad and other family members.

Neil A
Message 7190 - Posted: 4 Mar 2009 | 19:19:32 UTC

What was fixed:

- client: Revise Apple idle time detection for compatibility with OS 10.6.
- client: fix bugs in "abort_jobs_on_exit" feature:
    - clear project backoffs when start abort sequence
    - don't back off projects if in abort sequence
- boinccmd: fix bug in --set_proxy_settings command
    (it wasn't setting the "use_XXX" flags). Fixes #776
- client: you can now include a <proxy_info> element
    in your cc_config.xml options.

    TODO: the whole proxy info thing needs an overhaul:
    - no separate "use_XXX" flags;
      non-empty http_server_name implies using HTTP proxy, etc.
    - merge PROXY_INFO and GR_PROXY_INFO classes
    - use XML_PARSER for parsing
    - no PROXY_INFO element in HTTP_OP; just use gstate.proxy_info
- client: if you put <host_venue> in global_prefs_override.xml,
    it should select the venue from the network prefs.
    Now it does.
- client: make timeout values into #defines
- client: tag messages with project where possible; fixes #852
- client: show fetch share rather than run share in wfd message
- client: abort jobs that are unstarted and past deadline
- client: abort runaway jobs based on elapsed time instead of CPU time.
    Specifically, abort jobs for which
        elapsed time > WU.rsc_fpops_bound / app_version.flops
    This policy works for
    1) GPU jobs (which may use little CPU time)
    2) jobs that run but because of bugs use little CPU time
       (e.g., because they're sleeping)
    whereas the old policy didn't.
- client: skip the above elapsed-time check for non-CPU-intensive jobs
- MGR: Remove the ability to display graphics on a remote computer,
    or from a remote computer on *nix machines using v5 graphics.
    fixes #674
- client: fix bug where if a GPU job is running,
    and a 2nd GPU job with an earlier deadline arrives,
    neither job is executed ever.
    Reorganized things so that scheduling of GPU jobs is
    done independently of CPU jobs.
    The policy for GPU jobs:
    - always EDF
    - jobs are always removed from memory, regardless of checkpoint
      (GPU memory is not paged, so it's bad to leave an idle app in memory)
- client: shuffle the startup code to avoid showing wrong prefs info
    on first-time startup.
- client: don't do an RPC until we've done CPU benchmarks.
    We need the benchmark values to fill in app_version.flops
- boinccmd: make --get_messages output more readable
- manager: fix roundoff error in Advanced Prefs; fixes #613
- MGR: Make CTRL-SHIFT-A the accelerator in the simple GUI that
    switches back to the advanced view.
    refs #147
- client: fix bug that could cause scheduler RPC
    to ask for work inappropriately,
    and tell user that it wasn't asking for work.
    Here's what was going on:
    There are two different structures with work request fields
    (req_secs, req_instances, estimated_delay):
        COPROC_CUDA *coproc_cuda
    and
        RSC_WORK_FETCH cuda_work_fetch.
    WORK_FETCH::choose_project() copied from cuda_work_fetch to coproc_cuda,
    but only if a project was selected.
    WORK_FETCH::clear_request() clears cuda_work_fetch but not coproc_cuda.

    Scenario:
    - a scheduler op is made to project A requesting X>0 secs of CUDA
    - later, a scheduler op is made to project B for reason
      other than work fetch (e.g., user request)
    - choose_project() doesn't choose anything,
      so the value of coproc_cuda->req_secs remains X
    - clear_request() is called but that doesn't change *coproc_cuda

    Solution: work-fetch code no longer knows about internals of
    COPROC_CUDA and is not responsible for setting its request fields.
    The copying of request fields from RSC_WORK_FETCH to COPROC
    is done at a higher level,
    in CLIENT_STATE::make_scheduler_request()

    Additional bug fix: estimated_delay wasn't being cleared in some cases.
- client: RR sim FLOPS estimate for GPU jobs should reflect
    fraction of time BOINC is running.
- client: if we're going to do a scheduler RPC for reasons
    other than work fetch (e.g., user request, project request)
    temporarily clear resource backoffs while deciding
    whether to request work.
    The backoffs are there only to delay RPCs,
    and we're doing an RPC anyway.
- manager and GUI RPC: always show resource debt and backoff even if zero;
    also show backoff interval
- fix bugs in the above.
- GUI RPC: remove unused items from get_state RPC: host_info, time_stats,
    net_stats, host_venue

    NOTE: any change to a GUI RPC struct requires changing AsyncRPC.cpp.
    This is undesirable.
- Manager: scheduling params are not defined for non-CPU-intensive projects;
    don't show them.
- Update to OpenSSL 0.9.8j for Win32 and Win64
- Update to LibCurl 7.19.4 for Win32 and Win64
- client: fix message: "idle instance" => "starved"
- manager: when filtering messages by project,
    show messages not tagged with a project (fixes #852)
- client: change garbage-collect logic.
    old: reference-count files involved in a PERS_FILE_XFER
    new: if a PERS_FILE_XFER refers to an unreferenced file,
    delete it (and the associated FILE_XFER and HTTP_OP if present)
    May fix #366
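
The runaway-job rule in the notes above (abort when elapsed time exceeds WU.rsc_fpops_bound / app_version.flops, and skip the check for non-CPU-intensive jobs) can be illustrated with a short sketch. This is not the BOINC client's code: the function name, the handling of a missing speed estimate, and the sample numbers are assumptions made for illustration. Only the inequality and the exemption come from the notes.

```python
def should_abort_runaway(elapsed_secs, rsc_fpops_bound, flops,
                         non_cpu_intensive=False):
    """Return True if a job has run longer than its elapsed-time limit.

    Hypothetical helper that mirrors the rule quoted in the change notes:
        elapsed time > WU.rsc_fpops_bound / app_version.flops
    """
    if non_cpu_intensive:
        # The notes say the check is skipped for non-CPU-intensive jobs.
        return False
    if flops <= 0:
        # Assumption: without a speed estimate there is no limit to compare to.
        return False
    limit_secs = rsc_fpops_bound / flops
    return elapsed_secs > limit_secs


# Made-up numbers: a bound of 1e15 floating-point operations on an app
# version estimated at 1e11 FLOPS gives a limit of 10,000 seconds.
print(should_abort_runaway(12_000, 1e15, 1e11))   # over the limit
print(should_abort_runaway(9_000, 1e15, 1e11))    # under the limit
```

Because the comparison uses wall-clock time rather than CPU time, a GPU job that barely touches the CPU, or a job stuck sleeping, still reaches the limit.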

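The GPU policy in the notes ("always EDF", and preempted jobs always leave memory) can be sketched as follows. The class, the function, and the one-job-per-GPU simplification are all assumptions for illustration and are not taken from the client source.

```python
from dataclasses import dataclass


@dataclass
class GpuJob:
    name: str
    deadline: float        # any comparable time value, e.g. seconds
    running: bool = False


def schedule_gpu_jobs(jobs, n_gpus):
    """Choose GPU jobs by earliest deadline first (EDF).

    Returns (to_run, to_remove). Assumes each job needs exactly one GPU.
    Jobs in to_remove were running but lost their slot; per the notes they
    are removed from memory regardless of checkpoint state.
    """
    by_deadline = sorted(jobs, key=lambda j: j.deadline)
    to_run = by_deadline[:n_gpus]
    chosen = {id(j) for j in to_run}
    to_remove = [j for j in jobs if j.running and id(j) not in chosen]
    return to_run, to_remove


# The situation from the bug report: one GPU, job A is running, and job B
# arrives with an earlier deadline.
a = GpuJob("A", deadline=200.0, running=True)
b = GpuJob("B", deadline=100.0)
to_run, to_remove = schedule_gpu_jobs([a, b], n_gpus=1)
print([j.name for j in to_run], [j.name for j in to_remove])
```

With a single GPU the sketch picks B to run and marks A for removal, which is the outcome the fix is meant to guarantee instead of neither job running.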

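The stale work-request bug described in the notes comes down to two copies of the same fields where only one copy gets cleared. A minimal replay, assuming simplified stand-ins for the two structures (the Python class and function names here are hypothetical, and the real client is C++):

```python
from dataclasses import dataclass


@dataclass
class WorkRequest:
    """Stand-in for the request fields shared by both structures."""
    req_secs: float = 0.0
    req_instances: int = 0
    estimated_delay: float = 0.0

    def clear(self):
        self.req_secs = 0.0
        self.req_instances = 0
        self.estimated_delay = 0.0


def copy_request(src, dst):
    dst.req_secs = src.req_secs
    dst.req_instances = src.req_instances
    dst.estimated_delay = src.estimated_delay


def replay_scenario(fixed):
    """Return the CUDA req_secs that would be sent to project B."""
    cuda_work_fetch = WorkRequest()   # plays the role of RSC_WORK_FETCH
    coproc_cuda = WorkRequest()       # plays the role of COPROC_CUDA

    # Scheduler op to project A requesting X = 3600 seconds of CUDA work.
    cuda_work_fetch.req_secs = 3600.0
    cuda_work_fetch.req_instances = 1
    copy_request(cuda_work_fetch, coproc_cuda)

    # Later: op to project B for a reason other than work fetch.
    # clear_request() only touches the work-fetch side.
    cuda_work_fetch.clear()
    project_selected = False          # choose_project() picks nothing

    if fixed:
        # New behaviour: always copy when the scheduler request is built.
        copy_request(cuda_work_fetch, coproc_cuda)
    elif project_selected:
        # Old behaviour: copy only if a project was selected.
        copy_request(cuda_work_fetch, coproc_cuda)

    return coproc_cuda.req_secs


print(replay_scenario(fixed=False))   # stale request survives
print(replay_scenario(fixed=True))    # request is cleared
```

In the old flow the request to project B still carries the 3600 seconds meant for project A, which matches the symptom of the client asking for work while reporting that it was not.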

MarkJ
Volunteer moderator
Volunteer tester
Joined: 24 Dec 08
Posts: 738
Credit: 200,909,904
RAC: 0
Message 7217 - Posted: 5 Mar 2009 | 13:10:35 UTC

Spotted the following little gem from David on the boinc_alpha mailing list. I see Richard has already replied in regard to this.

Your app_info.xml file has an entry for a CUDA app. This confuses BOINC, which expects that all entries in app_info.xml are for CPU apps.

I'll change the client to detect this situation and ignore the CUDA entry in app_info.xml (eventually we'll support anonymous CUDA apps, but not in this release).

In the meantime, remove the entry (i.e. the following lines) and let me know what happens.

-- David

<app>
    <name>setiathome_enhanced</name>
</app>
<file_info>
    <name>MB_6.08_mod_CUDA_V9.exe</name>
    <executable/>
</file_info>
<file_info>
    <name>cudart.dll</name>
    <executable/>
</file_info>
<file_info>
    <name>cufft.dll</name>
    <executable/>
</file_info>
<file_info>
    <name>libfftw3f-3-1-1a_upx.dll</name>
    <executable/>
</file_info>
<app_version>
    <app_name>setiathome_enhanced</app_name>
    <version_num>608</version_num>
    <plan_class>cuda</plan_class>
    <avg_ncpus>0.040000</avg_ncpus>
    <max_ncpus>0.040000</max_ncpus>
    <coproc>
        <type>CUDA</type>
        <count>1</count>
    </coproc>
    <file_ref>
        <file_name>MB_6.08_mod_CUDA_V9.exe</file_name>
        <main_program/>
    </file_ref>
    <file_ref>
        <file_name>cudart.dll</file_name>
    </file_ref>
    <file_ref>
        <file_name>cufft.dll</file_name>
    </file_ref>
    <file_ref>
        <file_name>libfftw3f-3-1-1a_upx.dll</file_name>
    </file_ref>
</app_version>
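
To check whether an app_info.xml contains entries like the one above before trying this client, a small script can list every app_version that declares a plan class or a coprocessor. This is only a sketch: it assumes the file is wrapped in the usual <app_info> root element, the function name is made up, and the CPU entry (version 603) in the sample is invented to show the filter.

```python
import xml.etree.ElementTree as ET

# Trimmed sample. The CUDA entry follows the post above; the CPU entry
# (version 603) is a made-up example.
SAMPLE = """
<app_info>
    <app_version>
        <app_name>setiathome_enhanced</app_name>
        <version_num>603</version_num>
    </app_version>
    <app_version>
        <app_name>setiathome_enhanced</app_name>
        <version_num>608</version_num>
        <plan_class>cuda</plan_class>
        <coproc>
            <type>CUDA</type>
            <count>1</count>
        </coproc>
    </app_version>
</app_info>
"""


def find_gpu_app_versions(app_info_xml):
    """Return (app_name, version_num, plan_class, coproc_types) tuples for
    every <app_version> that has a <plan_class> or a <coproc> child."""
    root = ET.fromstring(app_info_xml)
    hits = []
    for av in root.iter("app_version"):
        plan_class = (av.findtext("plan_class") or "").strip()
        coproc_types = [(c.findtext("type") or "").strip()
                        for c in av.findall("coproc")]
        if plan_class or coproc_types:
            hits.append((
                (av.findtext("app_name") or "").strip(),
                (av.findtext("version_num") or "").strip(),
                plan_class,
                coproc_types,
            ))
    return hits


for entry in find_gpu_app_versions(SAMPLE):
    print(entry)
```

For a real file, read its contents and pass the text to find_gpu_app_versions(). Any entry it reports is a candidate for the removal David suggests.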

____________
BOINC blog

MarkJ
Message 7218 - Posted: 5 Mar 2009 | 13:11:59 UTC

The app version numbers aren't displayed correctly under 6.6.11 and 6.6.12 either.

Beyond
Joined: 23 Nov 08
Posts: 1112
Credit: 6,162,416,256
RAC: 0
Message 7235 - Posted: 6 Mar 2009 | 14:50:25 UTC

Trying this version I get the dreaded message:

GPUGRID 3/6/2009 8:30:11 AM Message from server: No work sent
GPUGRID 3/6/2009 8:30:11 AM Message from server: Full-atom molecular dynamics is not available for your type of computer.
GPUGRID 3/6/2009 8:30:11 AM Message from server: Full-atom molecular dynamics on Cell processor is not available for your type of computer.

Beyond
Message 7236 - Posted: 6 Mar 2009 | 15:27:36 UTC - in response to Message 7235.

Now all of a sudden it worked. Go figure.

naja002
Joined: 25 Sep 08
Posts: 111
Credit: 10,352,599
RAC: 0
Message 7238 - Posted: 6 Mar 2009 | 16:54:28 UTC

I'll just toss this out there:

I'm running 6.6.3 on Vista 64 on 4 rigs. The same complaints: it gets tasks when told not to and doesn't get work when told to....

I've found that in less than 24 hrs--these things seem to correct themselves. In other words, when I select no new tasks....it will get new tasks, but within 24hrs it will stop getting new tasks--long-term. Same with requesting work.

Again, I'm running 6.6.3 right now, but it's been the same for me with each version since these issues have come up. Maybe the same with 6.6.12 also.

I've stuck with 6.6.3, because everything has been running very well....considering the above....

MarkJ
Message 7246 - Posted: 7 Mar 2009 | 2:48:50 UTC - in response to Message 7238.

BOINC said it wasn't asking for work when it actually was. It did this up to 6.6.11, and it is one of the fixes in 6.6.12. It appears to be working properly now.

MarkJ
Message 7247 - Posted: 7 Mar 2009 | 3:53:54 UTC

Now superseded by 6.6.14.
