Message boards : Graphics cards (GPUs) : Really low Run Times, but still Completed and Successful?

Jacob Klein
Joined: 11 Oct 08
Posts: 1127
Credit: 1,901,927,545
RAC: 0
Message 29304 - Posted: 1 Apr 2013 | 9:39:18 UTC

GPU Developers:

I've been testing running multiple tasks at the same time on my GPU, and just noticed something odd in my results.

If you look at these results, you'll see that they say "Completed and Validated", and I got the bonus credit, but... Look at those Run Times and CPU Times!

How can they be valid, if they only ran for less than 10 seconds? What happened?

http://www.gpugrid.net/result.php?resultid=6695537
http://www.gpugrid.net/result.php?resultid=6695975

Note 1: I didn't actually watch these units, and so I don't know how long they really took. Sorry.
Note 2: I'm not trying to cheat any system. This just happened, and I'm looking for an explanation.

Can you figure out what happened here?

Regards,
Jacob

Operator
Joined: 15 May 11
Posts: 108
Credit: 297,176,099
RAC: 0
Message 29305 - Posted: 1 Apr 2013 | 17:15:41 UTC - in response to Message 29304.
Last modified: 1 Apr 2013 | 17:17:24 UTC



http://www.gpugrid.net/result.php?resultid=6695537 Was a short WU and you got only 16,200 credit for it despite what that page says.

Check here: http://www.gpugrid.net/workunit.php?wuid=4315248

http://www.gpugrid.net/result.php?resultid=6695975 Was indeed a long WU and the credit of 70,800 was correct.


As for the CPU time? No idea. Maybe that's the total amount of time the WU consumed getting data moved around on your system.

Operator

Jacob Klein
Message 29306 - Posted: 1 Apr 2013 | 17:26:16 UTC - in response to Message 29305.

One was a short unit, one was a long unit. I already understood that.

My question is:
How could they have only used that little time, and yet been completed and validated, and granted bonus credit? What went wrong?

Toni
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 9 Dec 08
Posts: 1006
Credit: 5,068,599
RAC: 0
Message 29307 - Posted: 1 Apr 2013 | 19:45:32 UTC - in response to Message 29306.
Last modified: 1 Apr 2013 | 19:59:45 UTC

Being valid with such short runtimes is puzzling: thanks for catching it.

After a DB check, it appears that the problem is exclusive to your last two WUs. How it happens is beyond me, but I think BOINC got stuck somewhere in between WUs.

Jacob Klein
Message 29308 - Posted: 1 Apr 2013 | 20:20:12 UTC - in response to Message 29307.
Last modified: 1 Apr 2013 | 20:22:33 UTC

Well... I'll tell you my setup, to see if that helps you at all in determining the cause.

Basically, I have 2 devices:
GPU Device 0: eVGA GTX 660 Ti 3GB FTW
GPU Device 1: eVGA GTX 460 1GB

For each of the 4 GPUGrid applications, I had been running the following in app_config.xml, in an attempt to run 2 tasks on a single GPU device, without dedicating a CPU to each task (since, when they run on Device 1, they don't usually use CPU at all):
<gpu_usage>0.5</gpu_usage>
<cpu_usage>0.001</cpu_usage>

I'm also attached to POEM@Home, and in order to run up-to-6 at a time while allocating a CPU for each task, for POEM's app_config.xml, I have:
<gpu_usage>0.166</gpu_usage>
<cpu_usage>1</cpu_usage>

GPUGrid mainly runs on Device 0, and at one point, I believe I saw the following all running on GPU Device 0, all at the same time:
- GPUGrid Task (0.5 GPU)
- Poem Task (0.166 GPU)
- Poem Task (0.166 GPU)
- Poem Task (0.166 GPU)

Now, that GPUGrid task may have been one that completed normally (ie: took ~35,000 secs), or it may have been one of these low-run-time tasks... I can't be sure, because I wasn't watching it closely enough.

I'll try to keep a better eye out on this, but if you find anything more out on your end, please let me know. I'd like to figure out what's happening.
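
As a rough illustration of why all four tasks could land on Device 0 together: the gpu_usage fractions listed above add up to just under one full GPU. The sketch below is a simplified model and not BOINC's actual scheduling code; the function name `fits_on_one_gpu` is made up for this example.

```python
# Simplified model (an assumption, not BOINC's real scheduler):
# a set of tasks can share one GPU if their gpu_usage fractions sum to <= 1.0.

def fits_on_one_gpu(gpu_usages, capacity=1.0):
    """Return True if the summed gpu_usage fractions fit within one device."""
    return sum(gpu_usages) <= capacity

# One GPUGrid task at 0.5 plus three POEM tasks at 0.166 each.
mixed_load = [0.5, 0.166, 0.166, 0.166]
print(round(sum(mixed_load), 3))    # total fraction of the GPU claimed
print(fits_on_one_gpu(mixed_load))  # whether the mix fits on one device
```

Under this model, adding one more 0.166 task to the same mix would exceed the device, which matches the idea that the fractions act as a budget per GPU.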

nate
Joined: 6 Jun 11
Posts: 124
Credit: 2,928,865
RAC: 0
Message 29309 - Posted: 2 Apr 2013 | 14:20:15 UTC

Running multiple jobs on the same GPU is somewhat unorthodox, and beyond what we can troubleshoot. Feel free to keep experimenting for now, but ultimately this raises two concerns for us:

1) Users could potentially cheat the credit system this way, if this is in fact a bug (and the run times are real, and not a bug that is misleading us).
2) Equally concerning for us, it could affect the integrity of the returned data. Corrupted results might be returned without being flagged as such.

For now, it just looks like a one-off peculiarity. But if either of these things becomes true, we would have to do something to fix it. Thanks for letting us know, and keep us posted if something like this happens again.

Jacob Klein
Message 29310 - Posted: 2 Apr 2013 | 14:43:23 UTC - in response to Message 29309.

Nate,

I understand that we want to prohibit cheating, and that we want to make sure the results are valid. I'll definitely keep watching for more of these, and if I get any more, I'll report them.

But, since it already happened, and it seems that there's no way the result could be valid, then you might consider seeing if you can somehow account for it now (in the validator?), instead of waiting for it to happen again.

I'll let you know when it happens again.

Thanks,
Jacob

Carlesa25
Joined: 13 Nov 10
Posts: 324
Credit: 72,394,453
RAC: 0
Message 29311 - Posted: 2 Apr 2013 | 14:44:33 UTC - in response to Message 29309.

Running multiple jobs on the same GPU is somewhat unorthodox, and beyond what we can troubleshoot. Feel free to keep experimenting for now, but ultimately this raises two concerns for us:


Hello: I can only comment that I ran several batches of short work units at two tasks per GPU (GTX 590 = 4 tasks) without problems on Ubuntu Linux 12.10 64-bit. It is a serious OS... Greetings.

nate
Message 29312 - Posted: 2 Apr 2013 | 16:40:25 UTC - in response to Message 29310.

But, since it already happened, and it seems that there's no way the result could be valid, then you might consider seeing if you can somehow account for it now (in the validator?), instead of waiting for it to happen again.


Indeed. It appears that there is some problem. A file is missing, and the simulation may be affected. Let us know if it happens again. We'll have to figure out a way to search all the files/database and see if this is an issue elsewhere.

skgiven
Volunteer moderator
Volunteer tester
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Message 29343 - Posted: 6 Apr 2013 | 9:18:24 UTC - in response to Message 29312.

I thought it might have switched cards mid-run, or that the WU might have restarted following a driver crash/restart and BOINC reset the run time, but that's not the case:

    Sent 1 Apr 2013 | 6:01:20 UTC
    Received 1 Apr 2013 | 8:17:30 UTC


- A Long WU can't complete in 2h16min.
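
The 2h16min figure is simply the gap between the Sent and Received stamps. A minimal sketch of that arithmetic, assuming the timestamp layout shown on the result pages (the helper name `elapsed` is made up for this example):

```python
from datetime import datetime

# Timestamp layout as it appears on the result pages, e.g. "1 Apr 2013 | 6:01:20 UTC".
STAMP_FORMAT = "%d %b %Y | %H:%M:%S UTC"

def elapsed(sent, received):
    """Return the wall-clock time between the 'Sent' and 'Received' stamps."""
    return datetime.strptime(received, STAMP_FORMAT) - datetime.strptime(sent, STAMP_FORMAT)

window = elapsed("1 Apr 2013 | 6:01:20 UTC", "1 Apr 2013 | 8:17:30 UTC")
print(window)  # upper bound on how long the task could have run
```

This window only bounds the run time from above: the task may have sat in the queue for most of it.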

Maybe you could add a validation criterion so that the WU has to run for at least X amount of time to validate. It would probably be best to base it on the app type (Short, Long).
____________
FAQ's

HOW TO:
- Opt out of Beta Tests
- Ask for Help

Jacob Klein
Message 29349 - Posted: 6 Apr 2013 | 11:11:53 UTC - in response to Message 29343.

Maybe you could add a validation criteria so that the WU has to run for at least X amount of time to Validate. Probably best based on the app types (Short, Long).


No, don't put a validation based on time... which may have to be rewritten someday to accommodate faster cards. Validation should be based on actual validation of the results, and it's clear you guys are currently missing a check somewhere for these units (as Nate said "A file is missing"). So, fix it by checking for that file.
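
A content-based check of the kind suggested here could look roughly like the sketch below. It is not GPUGrid's validator: the file names in `REQUIRED_OUTPUTS` and both function names are hypothetical, since the thread does not say which file was missing.

```python
from pathlib import Path

# Hypothetical list of output files a finished task is expected to upload.
# The real file names used by GPUGrid are not given in this thread.
REQUIRED_OUTPUTS = ["trajectory.out", "restart.out"]

def missing_outputs(result_dir, required=REQUIRED_OUTPUTS):
    """Return the required files that are absent or empty in result_dir."""
    root = Path(result_dir)
    missing = []
    for name in required:
        path = root / name
        if not path.is_file() or path.stat().st_size == 0:
            missing.append(name)
    return missing

def looks_valid(result_dir):
    """A result passes this check only if every required file has content."""
    return not missing_outputs(result_dir)
```

Unlike a run-time threshold, a check like this does not need retuning when faster cards appear.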

Note: I thought I'd copy/paste the workunit details of the 2 workunits, in case the details get removed:

Name I1R446-NATHAN_RPS1_respawn3-8-32-RND4245_0
Workunit 4315248
Created 1 Apr 2013 | 1:43:35 UTC
Sent 1 Apr 2013 | 6:18:32 UTC
Received 1 Apr 2013 | 8:18:18 UTC
Server state Over
Outcome Success
Client state Done
Exit status 0 (0x0)
Computer ID 126725
Report deadline 6 Apr 2013 | 6:18:32 UTC
Run time 11.36
CPU time 3.37
Validate state Valid
Credit 16,200.00
Application version Short runs (2-3 hours on fastest card) v6.52 (cuda42)

Name I13R89-NATHAN_dhfr36_3-23-32-RND7378_0
Workunit 4315612
Created 1 Apr 2013 | 4:33:06 UTC
Sent 1 Apr 2013 | 6:01:20 UTC
Received 1 Apr 2013 | 8:17:30 UTC
Server state Over
Outcome Success
Client state Done
Exit status 0 (0x0)
Computer ID 126725
Report deadline 6 Apr 2013 | 6:01:20 UTC
Run time 8.37
CPU time 1.40
Validate state Valid
Credit 70,800.00
Application version Long runs (8-12 hours on fastest card) v6.18 (cuda42)

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Message 29369 - Posted: 7 Apr 2013 | 11:40:02 UTC - in response to Message 29349.

No, don't put a validation based on time... which may have to be rewritten someday to accommodate faster cards. Validation should be based on actual validation of the results, and it's clear you guys are currently missing a check somewhere for these units (as Nate said "A file is missing"). So, fix it by checking for that file.

Completely agreed! A time-based check is like sloppy 20th-century programming, aka "640k is enough for everyone!", and nobody will remember to update it.

One could extend the usual server-side checks to include such a test and bring unusually fast WUs to the devs' attention automatically, but not invalidate them automatically.
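
The "flag, don't invalidate" idea might be sketched as follows. The lower bounds come from the application names quoted in this thread ("2-3 hours" and "8-12 hours on fastest card"); the 10% cut-off and the function name `needs_review` are assumptions made up for this example.

```python
# Hypothetical lower bounds, in seconds, taken from the application names
# quoted in this thread ("2-3 hours" and "8-12 hours on fastest card").
EXPECTED_MIN_SECONDS = {"short": 2 * 3600, "long": 8 * 3600}

def needs_review(app_type, run_time_seconds, fraction=0.1):
    """Flag a result whose run time is below a fraction of the expected minimum.

    The result is only flagged for a human to inspect; it is not invalidated.
    The default fraction of 0.1 is an arbitrary choice for illustration.
    """
    return run_time_seconds < fraction * EXPECTED_MIN_SECONDS[app_type]

print(needs_review("long", 8.37))     # the 8.37-second long run reported in this thread
print(needs_review("long", 35000.0))  # a run of about 35,000 seconds
```

Because flagged results only go to a person for inspection, a stale threshold costs some attention rather than wrongly rejecting good work.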

MrS
____________
Scanning for our furry friends since Jan 2002

Jacob Klein
Message 29399 - Posted: 10 Apr 2013 | 9:34:17 UTC - in response to Message 29312.

nate,

I had 2 more tasks that appear to have completed way prematurely, yet were granted full credit plus bonus credit.
I think I was suspending and resuming various GPU tasks, when these tasks decided they were completed. That's about all the information I have :-/

Can you please verify that the results are unusable?
And then, could you also build additional checks into the validator, that will not validate tasks like these, whose results are unusable?

Thanks,
Jacob

http://www.gpugrid.net/result.php?resultid=6738039
Name I3R50-NATHAN_dhfr36_3-31-32-RND3834_0
Workunit 4348765
Created 9 Apr 2013 | 22:06:35 UTC
Sent 10 Apr 2013 | 3:51:18 UTC
Received 10 Apr 2013 | 9:06:18 UTC
Server state Over
Outcome Success
Client state Done
Exit status 0 (0x0)
Computer ID 126725
Report deadline 15 Apr 2013 | 3:51:18 UTC
Run time 3,280.11
CPU time 3,247.89
Validate state Valid
Credit 70,800.00
Application version Long runs (8-12 hours on fastest card) v6.18 (cuda42)

http://www.gpugrid.net/result.php?resultid=6738043
Name I10R38-NATHAN_dhfr36_3-31-32-RND3600_0
Workunit 4348769
Created 9 Apr 2013 | 22:09:36 UTC
Sent 10 Apr 2013 | 3:51:18 UTC
Received 10 Apr 2013 | 9:06:18 UTC
Server state Over
Outcome Success
Client state Done
Exit status 0 (0x0)
Computer ID 126725
Report deadline 15 Apr 2013 | 3:51:18 UTC
Run time 3.12
CPU time 0.98
Validate state Valid
Credit 70,800.00
Application version Long runs (8-12 hours on fastest card) v6.18 (cuda42)

skgiven
Volunteer moderator
Volunteer tester
Message 29403 - Posted: 10 Apr 2013 | 19:08:14 UTC - in response to Message 29399.

Do you think these are actual run times, or bogus? I've seen run times reset before.
5h15min would be within the range I've seen for one NATHAN_dhfr36 task at a time.
If you know what your cache was set to and if you were running one at a time or two, you might get some idea.

Thanks,





Jacob Klein
Message 29404 - Posted: 10 Apr 2013 | 19:12:03 UTC - in response to Message 29403.
Last modified: 10 Apr 2013 | 19:12:54 UTC

For that GPU, I usually have 4 GPUGrid tasks queued up, and as many POEM OpenCL tasks queued as possible (they rarely have them available). So a 5-hour difference between download and report means nothing in this scenario.

I'm pretty sure the task started, then immediately died, then immediately reported the result, and cashed in on bonus credits.

I hope they can fix this. I don't want the credits, but even more importantly, I don't want error results mixed in with their other successful results.

Jacob Klein
Message 29408 - Posted: 11 Apr 2013 | 12:30:16 UTC - in response to Message 29399.
Last modified: 11 Apr 2013 | 12:33:26 UTC

Uhoh, Nate!

I just had a whole rash of these, 18 of them, all Nathan long-run dhfr, just happen! And, from the best I can tell, this happened in the middle of the night while I was away from the PC.

Each task took 3 seconds, was marked completed and validated, and then was granted the full 70,800.00 credit. It looks like I'm going to be getting the most credit today that I've ever gotten before.

Have you guys begun your investigation into the validator??
I urge you to take action ASAP.
If there's anything I can do to help test, please let me know.

Thanks,
Jacob




Task Work unit Computer Sent Time reported or deadline Status Run time (sec) CPU time (sec) Credit Application
6742602 4352606 126725 11 Apr 2013 | 11:31:23 UTC 11 Apr 2013 | 11:34:25 UTC Completed and validated 3.61 0.90 70,800.00 Long runs (8-12 hours on fastest card) v6.18 (cuda42)
6742575 4352579 126725 11 Apr 2013 | 11:25:58 UTC 11 Apr 2013 | 11:28:41 UTC Completed and validated 3.23 0.92 70,800.00 Long runs (8-12 hours on fastest card) v6.18 (cuda42)
6742574 4352578 126725 11 Apr 2013 | 11:25:58 UTC 11 Apr 2013 | 11:28:41 UTC Completed and validated 3.56 0.98 70,800.00 Long runs (8-12 hours on fastest card) v6.18 (cuda42)
6742557 4352561 126725 11 Apr 2013 | 11:25:58 UTC 11 Apr 2013 | 11:28:41 UTC Completed and validated 3.60 0.86 70,800.00 Long runs (8-12 hours on fastest card) v6.18 (cuda42)
6742544 4352548 126725 11 Apr 2013 | 11:28:41 UTC 11 Apr 2013 | 11:31:23 UTC Completed and validated 3.25 0.92 70,800.00 Long runs (8-12 hours on fastest card) v6.18 (cuda42)
6742543 4352547 126725 11 Apr 2013 | 11:28:41 UTC 11 Apr 2013 | 11:31:23 UTC Completed and validated 3.71 0.95 70,800.00 Long runs (8-12 hours on fastest card) v6.18 (cuda42)
6742528 4352532 126725 11 Apr 2013 | 11:23:07 UTC 11 Apr 2013 | 11:25:58 UTC Completed and validated 3.66 0.89 70,800.00 Long runs (8-12 hours on fastest card) v6.18 (cuda42)
6742481 4352486 126725 11 Apr 2013 | 11:18:59 UTC 11 Apr 2013 | 11:20:56 UTC Completed and validated 3.24 0.86 70,800.00 Long runs (8-12 hours on fastest card) v6.18 (cuda42)
6742441 4352446 126725 11 Apr 2013 | 11:31:23 UTC 11 Apr 2013 | 11:34:25 UTC Completed and validated 3.26 0.92 70,800.00 Long runs (8-12 hours on fastest card) v6.18 (cuda42)
6742440 4352445 126725 11 Apr 2013 | 11:31:23 UTC 11 Apr 2013 | 11:34:25 UTC Completed and validated 3.64 0.94 70,800.00 Long runs (8-12 hours on fastest card) v6.18 (cuda42)
6742431 4352436 126725 11 Apr 2013 | 11:20:56 UTC 11 Apr 2013 | 11:23:07 UTC Completed and validated 3.65 0.90 70,800.00 Long runs (8-12 hours on fastest card) v6.18 (cuda42)
6742430 4352435 126725 11 Apr 2013 | 11:20:56 UTC 11 Apr 2013 | 11:23:07 UTC Completed and validated 3.21 0.98 70,800.00 Long runs (8-12 hours on fastest card) v6.18 (cuda42)
6742409 4352415 126725 11 Apr 2013 | 11:16:58 UTC 11 Apr 2013 | 11:18:59 UTC Completed and validated 3.71 0.87 70,800.00 Long runs (8-12 hours on fastest card) v6.18 (cuda42)
6742386 4352392 126725 11 Apr 2013 | 11:23:07 UTC 11 Apr 2013 | 11:25:58 UTC Completed and validated 3.19 1.01 70,800.00 Long runs (8-12 hours on fastest card) v6.18 (cuda42)
6742385 4352391 126725 11 Apr 2013 | 11:23:07 UTC 11 Apr 2013 | 11:25:58 UTC Completed and validated 3.43 0.98 70,800.00 Long runs (8-12 hours on fastest card) v6.18 (cuda42)
6742383 4352389 126725 11 Apr 2013 | 11:28:41 UTC 11 Apr 2013 | 11:31:23 UTC Completed and validated 3.74 0.86 70,800.00 Long runs (8-12 hours on fastest card) v6.18 (cuda42)
6742377 4352383 126725 11 Apr 2013 | 11:16:58 UTC 11 Apr 2013 | 11:20:56 UTC Completed and validated 3.22 0.86 70,800.00 Long runs (8-12 hours on fastest card) v6.18 (cuda42)
6741810 4351878 126725 11 Apr 2013 | 6:50:41 UTC 11 Apr 2013 | 11:16:58 UTC Completed and validated 3.27 0.90 70,800.00 Long runs (8-12 hours on fastest card) v6.18 (cuda42)
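
All 18 rows above carry the same 70,800.00 credit, so the list adds up to 18 × 70,800 = 1,274,400 credits for roughly a minute of total run time. A minimal sketch of how such rows could be parsed and totalled, assuming the space-separated layout shown in the list (the regular expression and the helper name `parse_row` are made up for this example; only two rows are included to keep it short):

```python
import re

# One row of the task list: task id, work unit id, computer id, two timestamps,
# status, run time, CPU time, credit, application name.
ROW = re.compile(
    r"^(?P<task>\d+) (?P<wu>\d+) (?P<computer>\d+) "
    r"(?P<sent>.+? UTC) (?P<reported>.+? UTC) "
    r"(?P<status>Completed and validated) "
    r"(?P<run>[\d,.]+) (?P<cpu>[\d,.]+) (?P<credit>[\d,.]+) (?P<app>.+)$"
)

def parse_row(line):
    """Split one flattened table row into typed fields."""
    m = ROW.match(line.strip())
    if m is None:
        raise ValueError(f"unrecognised row: {line!r}")
    row = m.groupdict()
    for key in ("run", "cpu", "credit"):
        row[key] = float(row[key].replace(",", ""))
    return row

rows = [
    "6742602 4352606 126725 11 Apr 2013 | 11:31:23 UTC 11 Apr 2013 | 11:34:25 UTC Completed and validated 3.61 0.90 70,800.00 Long runs (8-12 hours on fastest card) v6.18 (cuda42)",
    "6742575 4352579 126725 11 Apr 2013 | 11:25:58 UTC 11 Apr 2013 | 11:28:41 UTC Completed and validated 3.23 0.92 70,800.00 Long runs (8-12 hours on fastest card) v6.18 (cuda42)",
]
parsed = [parse_row(r) for r in rows]
total_credit = sum(r["credit"] for r in parsed)  # credit granted across the rows
total_run = sum(r["run"] for r in parsed)        # seconds of reported run time
print(total_credit, total_run)
```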

Beyond
Joined: 23 Nov 08
Posts: 1112
Credit: 6,162,416,256
RAC: 0
Message 29414 - Posted: 11 Apr 2013 | 16:18:09 UTC

They should probably disable running multiple WUs here, at least until this issue is resolved.

Jacob Klein
Message 29415 - Posted: 11 Apr 2013 | 16:30:38 UTC - in response to Message 29414.
Last modified: 11 Apr 2013 | 16:32:42 UTC

I disagree.

They have a way to find the erroneously-validated results, and so they have a way to filter them out whenever they go to actually use the data.

Disabling multiple WUs would hinder performance for a lot of people: those who run 2-at-a-time as well as those who run 1-at-a-time (and may not be connected to the internet very often).

What they should do is 2 things:
Priority 1) Fix the validator to stop marking these results valid, and thus stop issuing credits for invalid results
Priority 2) Fix the workunits so they do not error under whatever conditions they are erroring

- Jacob

Beyond
Message 29416 - Posted: 11 Apr 2013 | 18:07:40 UTC - in response to Message 29415.

What they should do is 2 things:
Priority 1) Fix the validator to stop marking these results valid, and thus stop issuing credits for invalid results
Priority 2) Fix the workunits so they do not error under whatever conditions they are erroring - Jacob

Jacob, that would be fine, but primarily this is about the science. If the results are being corrupted the short term answer is to protect the validity of the science. Taking a credit hit is irrelevant.

Jacob Klein
Message 29417 - Posted: 11 Apr 2013 | 18:36:34 UTC - in response to Message 29416.

I know this is about the science, silly!

Fixing priority 1, the validator, will assure that invalid results are not included within any list of valid results, thus preserving "the science". Fixing priority 2 will prevent network strain and user confusion.

I hope they are working on both.

Beyond
Message 29419 - Posted: 11 Apr 2013 | 23:46:21 UTC - in response to Message 29417.

But in the meantime why do something that's causing corrupted results?

Jacob Klein
Message 29420 - Posted: 11 Apr 2013 | 23:57:39 UTC - in response to Message 29419.
Last modified: 12 Apr 2013 | 0:11:51 UTC

Not all results are corrupted, and the admins still have a way of finding the corrupted ones after they've been erroneously validated.

So, to answer your question "why keep doing it", the answer is "the science" (Overall I do more science with a higher throughput, running 2-at-a-time, even if I get some that immediately error out).

If the admins make a request for me to change, I will of course honor it. But, so far, I've been encouraged to test (find problems) and report results (and problems).

I just hope they fix the problems I find.

Regards,
Jacob

skgiven
Volunteer moderator
Volunteer tester
Message 29423 - Posted: 12 Apr 2013 | 12:22:25 UTC - in response to Message 29420.

The apps at GPUGrid were not designed to run more than one task at a time on the same GPU. If running several tasks at once causes failures it might be massively detrimental to the project, so don't be surprised if the use of app_config gets banned. Getting lots of credits for producing failed tasks won't endear you to anyone and typically results in having your credit reduced or account suspended.

I don't think projects have to facilitate crunchers or BOINC add-ons.

app_config was designed for all that can use it but is better used at other projects than here. At GPUGrid it seems the improvement is limited to but a few task types, on cards with more RAM and on Vista/W7/W8.

Toni
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Message 29424 - Posted: 12 Apr 2013 | 12:53:18 UTC - in response to Message 29423.
Last modified: 12 Apr 2013 | 13:00:48 UTC

JK is the only one affected by the problem, as far as I can tell, and it's a nasty one. My guess is that the client does not keep the two WUs separate. Disable as soon as possible.

(Unless there is something more odd like read-only filesystems, file permissions, or the like).

Jacob Klein
Message 29425 - Posted: 12 Apr 2013 | 13:37:58 UTC - in response to Message 29424.
Last modified: 12 Apr 2013 | 14:06:56 UTC

JK is the only one affected by the problem, as far as I can tell, and it's a nasty one. My guess is that the client does not keep the two WUs separate. Disable as soon as possible.

(Unless there is something more odd like read-only filesystems, file permissions, or the like).


Toni:
Per your request (as a project administrator), I will stop running 2-at-a-time on the same GPU.

Toni / Nate / GDF:
Could you please do more research into the problem (hopefully not just guessing and giving up)... and fix the apps so that they can run 2-at-a-time consistently, and fix the validator so it does proper additional validation checks before granting credits? Running the tasks 2-at-a-time does increase throughput, when they work properly, and thus supporting it would be beneficial to your science.

I am available for any testing, and am eager to be allowed to run tasks 2-at-a-time, again.

Thanks,
Jacob Klein

Toni
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Message 29428 - Posted: 12 Apr 2013 | 19:47:17 UTC - in response to Message 29425.
Last modified: 12 Apr 2013 | 19:48:00 UTC

Hi JK, thanks for your fix. Your results now seem to work properly. My guess is that it is a bug in the client (not so much the application) which makes the two tasks not isolated from each other.

Jacob Klein
Message 29430 - Posted: 12 Apr 2013 | 19:52:08 UTC - in response to Message 29428.

Toni,

There are times when GPUGrid-2-at-a-time works just fine for me. In fact, I'd say most of the time, it works correctly.

If it's a bug in the client, as you suggest, then let's get it fixed! I'm a beta tester, testing 7.0.62, and if you can isolate the reason you think it's a client bug, we can contact David Anderson and he can fix it. What makes you think it's a client bug?

I'm much more likely to believe that it's an application problem, though. I'm able to run x-at-a-time successfully on all of my other GPU projects.

If you determine it's an application problem, could you please fix it, as well as the validator?

I look forward to your reply,
Jacob

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Message 29431 - Posted: 12 Apr 2013 | 20:16:22 UTC

JK's got a point in that running multiple concurrent WUs works for many projects. In fact GPU-Grid is the first one I am aware of where it doesn't work. Generally the others don't officially support or encourage it, but didn't have to adjust their code either.

Running multiple WUs per system, on multiple GPUs, works. Since GPU-Grid is rather complex and surely uses quite a few CUDA libraries, I could imagine the following: maybe some of these libraries contain code, variables, or initializations that are global per GPU, so that multiple WUs share them (unintentionally, in this case). An error could then appear if 2 WUs conflict in their use of the resource; otherwise they will run just fine.

MrS

Jacob Klein
Message 29460 - Posted: 15 Apr 2013 | 13:35:53 UTC - in response to Message 29430.

Toni / Nate / GDF:

I contacted David Anderson. He said that he'd be willing to help with any client problem.

I'd like to make progress on fixing the issues within this thread.
Have you guys begun an investigation yet into the cause?

It seems like the stderr output only had a single line of info, showing the BOINC version number. Maybe the code could be changed to indicate how far along in the execution it got, before crashing/completing?

I'd like to resume processing GPUGrid tasks x-at-a-time on the same GPU, like I can with all of my other GPU projects.

If there's anything I can do to help test a change or expedite the fix, I am at your command.

Thank you,
Jacob

Jacob Klein
Message 29461 - Posted: 15 Apr 2013 | 15:37:48 UTC - in response to Message 29460.
Last modified: 15 Apr 2013 | 16:07:40 UTC

Toni / Nate / GDF:

I disabled x-at-a-time for GPUGrid tasks, per Toni's request, on 4/12/2013.

Yet, I had 2 more results that were marked "Completed and validated", with Run time less than 4 seconds, granted full+bonus credit of 70,800, on 4/14/2013!

This leads me to believe that the problem is unrelated to running x-at-a-time.
Again, I ask you, have you begun your investigation??

Task Work unit Computer Sent Time reported or deadline Status Run time (sec) CPU time (sec) Credit Application
6755331 4362872 126725 14 Apr 2013 | 7:01:18 UTC 14 Apr 2013 | 10:38:35 UTC Completed and validated 2.40 0.84 70,800.00 Long runs (8-12 hours on fastest card) v6.18 (cuda42)
6754567 4362305 126725 14 Apr 2013 | 2:29:48 UTC 14 Apr 2013 | 10:38:01 UTC Completed and validated 3.19 0.84 70,800.00 Long runs (8-12 hours on fastest card) v6.18 (cuda42)

Specs:
Windows 8 x64, BOINC v7.0.62 x64 Beta, nVidia GeForce 314.22 WHQL, eVGA GTX 660 TI 3GB FTW, eVGA GTX 460 1GB, GPUGrid app_config using the following settings for all 4 apps:
<max_concurrent>9999</max_concurrent>
<gpu_versions>
<gpu_usage>1</gpu_usage>
<cpu_usage>0.001</cpu_usage>
</gpu_versions>
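
For context, these elements are only a fragment: in a complete app_config.xml they sit inside an <app> entry (with a <name>) under a top-level <app_config> element. A minimal sketch that builds such a file, assuming a placeholder application name, since the actual names of the 4 GPUGrid applications are not given in this thread (the helper `build_app_config` is made up for this example):

```python
import xml.etree.ElementTree as ET

def build_app_config(app_names, gpu_usage, cpu_usage, max_concurrent=None):
    """Build an app_config.xml tree with one <app> entry per application name."""
    root = ET.Element("app_config")
    for name in app_names:
        app = ET.SubElement(root, "app")
        ET.SubElement(app, "name").text = name
        if max_concurrent is not None:
            ET.SubElement(app, "max_concurrent").text = str(max_concurrent)
        versions = ET.SubElement(app, "gpu_versions")
        ET.SubElement(versions, "gpu_usage").text = str(gpu_usage)
        ET.SubElement(versions, "cpu_usage").text = str(cpu_usage)
    return root

# "example_app" is a placeholder; the real application names are not given here.
config = build_app_config(["example_app"], gpu_usage=1, cpu_usage=0.001, max_concurrent=9999)
print(ET.tostring(config, encoding="unicode"))
```

Setting gpu_usage to 0.5 instead of 1 in the same structure would reproduce the earlier 2-at-a-time setup.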

Beyond
Message 29462 - Posted: 15 Apr 2013 | 15:49:16 UTC - in response to Message 29461.

<cpu_usage>.001</cpu_usage>

Jacob, could you try running without the app_config and see if you still get these glitches?

Jacob Klein
Joined: 11 Oct 08
Posts: 1127
Credit: 1,901,927,545
RAC: 0
Level
His
Message 29463 - Posted: 15 Apr 2013 | 15:55:30 UTC - in response to Message 29462.
Last modified: 15 Apr 2013 | 16:08:19 UTC

Maybe. Could you also try running WITH the app_config to see what results you get?
I'm quite frustrated -- it really feels like I'm the only one trying to solve this problem.

Profile Beyond
Joined: 23 Nov 08
Posts: 1112
Credit: 6,162,416,256
RAC: 0
Level
Tyr
Message 29464 - Posted: 15 Apr 2013 | 16:20:31 UTC - in response to Message 29463.

Maybe. Could you also try running WITH the app_config to see what results you get?
I'm quite frustrated -- it really feels like I'm the only one trying to solve this problem.

I don't see it as a problem. Many projects support running multiple instances (although not necessarily officially); this one does not. GPUGrid is extremely demanding compared to most projects. So far you've been seeing only the easiest of the long WUs compared to what we had not too long ago. I wouldn't want to try running multiple long NOELIA or GIANNI WUs. The NATHAN WUs previous to these would not even run properly 1x on my GTX 460/768 cards due to too large a memory footprint. I know you're trying to squeeze every last bit of performance out of your GTX 660 TI 3GB and that's commendable. Maybe the developers can find a solution for you, but I wouldn't be recommending either running 2x or limiting CPU reservation at this point.

Jacob Klein
Joined: 11 Oct 08
Posts: 1127
Credit: 1,901,927,545
RAC: 0
Level
His
Message 29465 - Posted: 15 Apr 2013 | 16:27:13 UTC - in response to Message 29464.
Last modified: 15 Apr 2013 | 16:29:19 UTC

I don't think you understand the cpu limitation portion.
I removed the GPUGrid.net project, and re-added it, and started working on some long-run Nathans (without any app_config override). By default, they use "0.73 CPUs + 1 NVIDIA GPU" on my machine. Having an app_config that sets it to 0.001 does not make any difference if only 1 task is running. That number is only used to determine how many CPU jobs to add.

For instance:

Without app-config, my system runs:
1 GPUGrid.net task (0.73 CPUs + 1 NVIDIA GPU)
2 WCG HCC GPU tasks (each: 1 CPUs + 0.5 NVIDIA GPUs)
6 CPU jobs

With app_config, my system runs:
1 GPUGrid.net task (0.001 CPUs + 1 NVIDIA GPU)
2 WCG HCC GPU tasks (each: 1 CPUs + 0.5 NVIDIA GPUs)
6 CPU jobs

See? I don't see how the app_config could possibly make a difference, when running only 1 GPUGrid task, but I'm testing it anyways, since I'm losing hope that the admins care.
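
To make the arithmetic in the two cases above concrete, here is a rough sketch. It is a simplified model, not the BOINC client's actual scheduling code: it assumes the client keeps starting single-threaded CPU jobs while the total committed CPU is still below the core count. The core count is not stated in this thread, so NCPUS = 8 is a hypothetical value, and the function name is made up for illustration.

```python
def cpu_jobs_started(ncpus, gpu_task_cpu_fractions):
    """Count single-threaded CPU jobs started under a simplified model.

    Assumption (not taken from BOINC source): jobs are added one at a
    time while the committed CPU total is still below the core count.
    """
    committed = sum(gpu_task_cpu_fractions)  # CPU budgeted for GPU tasks
    jobs = 0
    while committed < ncpus:
        committed += 1.0  # each CPU job is budgeted one full core
        jobs += 1
    return jobs


NCPUS = 8  # hypothetical core count, for illustration only

# 1 GPUGrid task plus 2 WCG HCC GPU tasks (1 CPU each), as listed above.
without_app_config = cpu_jobs_started(NCPUS, [0.73, 1.0, 1.0])
with_app_config = cpu_jobs_started(NCPUS, [0.001, 1.0, 1.0])

print(without_app_config, with_app_config)  # prints: 6 6
```

Under this model both settings leave room for the same 6 CPU jobs, which matches the two task lists above. The <cpu_usage> value only changes the budget total, not how much CPU the GPUGrid process actually gets.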

You may not see it as a problem, but it affects their science results. Tasks are failing immediately, their validator is erroneously marking those invalid results as valid... and if I can prove (to myself) that running x-at-a-time is not the cause, then I will likely switch back to running x-at-a-time, for increased performance.

The admins need to begin an investigation, if they care.

Profile Beyond
Joined: 23 Nov 08
Posts: 1112
Credit: 6,162,416,256
RAC: 0
Level
Tyr
Message 29466 - Posted: 15 Apr 2013 | 17:27:03 UTC - in response to Message 29465.

See? I don't see how the app_config could possibly make a difference, when running only 1 GPUGrid task, but I'm testing it anyways

That's all I asked, since it seems like these failures are unique to you and you're probably the only one running the above configuration. The only way to know for sure is to test it.

You may not see it as a problem, but it affects their science results.

Exactly, and if the config is the cause the admins should disable the use of app_info and app_config until it's sorted out.

Jacob Klein
Joined: 11 Oct 08
Posts: 1127
Credit: 1,901,927,545
RAC: 0
Level
His
Message 29467 - Posted: 15 Apr 2013 | 17:41:23 UTC - in response to Message 29466.

And how do you propose they disable those things? Seriously. I don't believe it's possible. <sigh>
Man, I am SICK of all of these "ifs" and "let's do this and that."

I'm the only one testing to find the root cause, so far as I know. It's frustrating, to say the least, it feels like I'm in a hole. But if I'm the only one having the issue, then sure, I can accept being the only one doing the testing. If anyone else truly cares, they should be trying to reproduce the problem, instead of guessing at causes/fixes.

These proposals, some of which are ludicrous (making validation based on CPU time??), some of which are impossible (ban app-config??), and some of which are detrimental (limit to 1 gpu task per person??)... they're all the wrong approaches, in my opinion.

I said it before, and I'll say it again:

What they should do is 2 things:
Priority 1) Fix the validator to stop marking these results valid, and thus stop issuing credits for invalid results
Priority 2) Fix the workunits so they do not error under whatever conditions they are erroring

If you want to help me test, then help me test.
Otherwise, please don't guess at causes/solutions.

Thanks,
Jacob

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Level
Met
Message 29468 - Posted: 15 Apr 2013 | 19:50:00 UTC - in response to Message 29467.

Jacob.. relax. It's only been 1 work day since Toni replied the last time. He could be on vacation or babysitting as far as we know.. or something else got in his way.

I agree the validator should be fixed, and this doesn't sound too hard. Whether they can fix "your" problem, on the other hand, is up in the air as long as we have no further insight into why this is happening.

One more possibility (I'm not calling it a guess ;) would be some weird interaction with WCG@GPU, but then you probably wouldn't be the only one affected.

MrS
____________
Scanning for our furry friends since Jan 2002

Jacob Klein
Joined: 11 Oct 08
Posts: 1127
Credit: 1,901,927,545
RAC: 0
Level
His
Message 29469 - Posted: 15 Apr 2013 | 20:05:56 UTC - in response to Message 29468.

I'll try to relax.

If I knew that an investigation was proceeding, or that a fix was being worked on, then I could relax a lot easier.

It's been 2 full weeks since the problem was reported, and the admin responses thus far have been "oh wow that's odd and beyond me", "wait and see if it happens again", and "probably a client issue".

Profile Beyond
Joined: 23 Nov 08
Posts: 1112
Credit: 6,162,416,256
RAC: 0
Level
Tyr
Message 29470 - Posted: 15 Apr 2013 | 20:50:57 UTC - in response to Message 29467.

some of which are impossible (ban app-config??)

It was my understanding that there is a server setting to allow/disallow the anonymous platform (app_info.xml). Not positive that's true and also don't know if there's a similar way to disallow the app_config. Maybe someone can enlighten us on that point.

Jacob Klein
Joined: 11 Oct 08
Posts: 1127
Credit: 1,901,927,545
RAC: 0
Level
His
Message 29509 - Posted: 19 Apr 2013 | 13:22:55 UTC - in response to Message 29425.
Last modified: 19 Apr 2013 | 13:23:43 UTC

Toni / Nate / GDF:

I have been doing additional testing. Since removing/readding the project, I have not had the issue, even when using a custom app_config for 0.001 CPU and 1-at-a-time task processing.

In order to test whether the remove/readd fixed the problem, I was wondering...
Could I please turn on 2-at-a-time task processing, to continue my testing?

Awaiting your reply,
Jacob

Jacob Klein
Joined: 11 Oct 08
Posts: 1127
Credit: 1,901,927,545
RAC: 0
Level
His
Message 29516 - Posted: 20 Apr 2013 | 16:53:51 UTC - in response to Message 29509.

Toni / Nate / GDF:

Just had another one of these problems. It was using an app_config.xml file that was set for 0.001 CPU and 1.000 GPU, with a cc_config.xml file setup such that GPUGrid could run on either the GTX 660 Ti or the GTX 460.

So, again, I don't think the problem is with running multiple tasks at the same time on the same GPU. But it might still have something to do with using an app_config.xml file. I am still testing that.

HAVE YOU DONE ANYTHING SERVER SIDE TO CORRECT THE VALIDATOR AND CORRECT THE APPLICATION FROM ERRONEOUSLY COMPLETING LIKE THIS? It would also help if we could see, in the Stderr output, what Device the work unit was running on.

Regards,
Jacob

Name I14R3-NATHAN_dhfr36_5-9-32-RND6302_0
Workunit 4380107
Created 20 Apr 2013 | 2:43:43 UTC
Sent 20 Apr 2013 | 4:01:07 UTC
Received 20 Apr 2013 | 6:50:11 UTC
Server state Over
Outcome Success
Client state Done
Exit status 0 (0x0)
Computer ID 149974
Report deadline 25 Apr 2013 | 4:01:07 UTC
Run time 3.28
CPU time 0.83
Validate state Valid
Credit 70,800.00
Application version Long runs (8-12 hours on fastest card) v6.18 (cuda42)
Stderr output

<core_client_version>7.0.64</core_client_version>

Simba123
Joined: 5 Dec 11
Posts: 147
Credit: 69,970,684
RAC: 0
Level
Thr
Message 29517 - Posted: 20 Apr 2013 | 22:16:14 UTC - in response to Message 29516.

This is going to sound narky, but here goes.

This basically proves that there is something about running 2 GPUGrid tasks simultaneously on the same card that will cause them to fail regularly.

Regardless of the fact that other projects manage to do it successfully, GPUGrid obviously does not, so it's probably time to stop until such time as the project managers give the OK.

I can understand the efficiencies you are seeking, but at the moment, all you are doing is dumping erroneous results into the system.

Whatever else they are working on at the moment is taking priority over this.

Jacob Klein
Joined: 11 Oct 08
Posts: 1127
Credit: 1,901,927,545
RAC: 0
Level
His
Message 29518 - Posted: 20 Apr 2013 | 22:21:20 UTC - in response to Message 29517.
Last modified: 20 Apr 2013 | 22:33:20 UTC

Simba123:
Did you read my post completely and carefully?
The most recent task that I reported, could have only run on 1 of the 2 GPUs, alone (ie: not 2-at-once-on-same-GPU), based on my settings.

Thus, again, I've been able to PROVE that these problems are happening even when NOT running 2-at-once-on-same-GPU.

I'm still trying to figure out exactly what causes the problem.
Want to help?

flashawk
Joined: 18 Jun 12
Posts: 297
Credit: 3,572,627,986
RAC: 0
Level
Arg
Message 29519 - Posted: 20 Apr 2013 | 23:31:56 UTC - in response to Message 29518.

Did you read my post completely and carefully?
The most recent task that I reported, could have only run on 1 of the 2 GPUs, alone (ie: not 2-at-once-on-same-GPU), based on my settings.

Thus, again, I've been able to PROVE that these problems are happening even when NOT running 2-at-once-on-same-GPU.

I'm still trying to figure out exactly what causes the problem.
Want to help?


Are you saying that you are getting really short runs that are being validated while running 1 WU per GPU? If that's the case, we might all be turning in fubar results. I've been following this thread since you started it, and I thought this only happened when you ran 2 WUs simultaneously on 1 GPU: you would get a task that finished in a few seconds, the validating server didn't catch it, and you got points for a valid run.

I must not understand completely, because it would seem the cat's out of the bag now and others could duplicate your procedure to try and get some quick points. I realize from what you have written that it doesn't happen every time. I think it would be good if you could explain from the top exactly what's going on, so that I (and others) would know what to look for. I don't want to turn in fubar results; is there anything special we can look for? Is the whole project in jeopardy because of this?

Jacob Klein
Joined: 11 Oct 08
Posts: 1127
Credit: 1,901,927,545
RAC: 0
Level
His
Message 29520 - Posted: 21 Apr 2013 | 0:50:36 UTC - in response to Message 29519.
Last modified: 21 Apr 2013 | 1:46:00 UTC

You must not have been following the thread very closely. I'm trying my best to be informative and provide accurate information.

But relax - :) - The whole project is not in jeopardy. I wish the admins were much more proactive (sigh), but they make it sound like they can dig themselves out of this problem, even if results are quick-returned with this problem.

Here's a summary of what my testing has shown thus far:
- I have seen this issue only on Nathan Long-runs and Nathan Short-runs
- I have seen this issue running <gpu_usage> of 0.5 (aka: 2-at-once-on-same-GPU)
- I have seen this issue running <gpu_usage> of 1 (aka: Still using an app_config.xml file, with a <cpu_usage> of 0.001, but not doing 2-at-once-on-same-GPU)
- I have not yet seen this issue when not running an app_config.xml file; am testing this currently (I removed the GPUGrid project, re-added the GPUGrid project, verified it doesn't have an app_config.xml, got some tasks, verified they say something like "0.728 CPUs + 1 NVIDIA GPU", and am letting them complete without adding an app_config.xml file)
- I have not yet tested using <exclude_gpu> in the cc_config.xml to limit GPUGrid.net to just 1 of my 2 GPUs.
- Toni said (several days ago) that I'm the only one affected by the problem; but I sure wish someone could help to reproduce it with me, to better understand it.

Right now, my gut tells me that this is an application problem, likely related to:
a) Using an app_config.xml file even when it has <gpu_usage> set to 1.000... or
b) Having 2 heterogeneous GPUs in the system (maybe dependent on whether GPUGrid is allowed to run on both, but not necessarily dependent)

You mention "what to look for"... Well, I'm just looking at my results, and seeing if the "Run time" values look correct for the types of tasks that I've completed. That's all I'm doing.
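
For anyone who would rather check a batch of results than eyeball them, here is a minimal sketch of that same check. The 60-second floor and the function name are assumptions chosen for illustration, not anything the project defines, and the last entry in the sample data is a hypothetical normal result added for contrast. The other three entries use run times already posted in this thread.

```python
# Hypothetical floor: assume no genuine task finishes faster than this.
MIN_PLAUSIBLE_RUN_TIME_SEC = 60.0


def suspicious_results(results, min_run_time=MIN_PLAUSIBLE_RUN_TIME_SEC):
    """Return results marked valid whose run time is implausibly short."""
    return [
        r for r in results
        if r["validate_state"] == "Valid" and r["run_time"] < min_run_time
    ]


results = [
    {"workunit": 4362872, "run_time": 2.40, "validate_state": "Valid"},
    {"workunit": 4362305, "run_time": 3.19, "validate_state": "Valid"},
    {"workunit": 4380107, "run_time": 3.28, "validate_state": "Valid"},
    # Hypothetical normal long run, included only for contrast.
    {"workunit": 0, "run_time": 30000.0, "validate_state": "Valid"},
]

flagged = suspicious_results(results)
for r in flagged:
    print(r["workunit"], r["run_time"])
```

This only flags candidates for a closer look; a short run time by itself does not say why the task ended early.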

Hope that helps to summarize. I'm still testing to resolve this issue.
Any help would be welcomed.
If you want to help, then try to provide evidence that one of my 2 "gut feelings" is correct.
- If you only have 1 GPU, then try using an app_config.xml file (use the one below, with <cpu_usage> set to 0.001, and <gpu_usage> set to 1, with <max_concurrent> set to 9999)... to see if it causes a problem.
- If you have 2 heterogeneous GPUs, try letting GPUGrid run on both and see if it causes a problem.

Sorry for being miffed about this whole thing. I care a ton.
- Jacob Klein

Reference app_config.xml file:

<app_config>
    <app>
        <name>acemdbeta</name>
        <max_concurrent>9999</max_concurrent>
        <gpu_versions>
            <gpu_usage>1</gpu_usage>
            <cpu_usage>.001</cpu_usage>
        </gpu_versions>
    </app>
    <app>
        <name>acemdlong</name>
        <max_concurrent>9999</max_concurrent>
        <gpu_versions>
            <gpu_usage>1</gpu_usage>
            <cpu_usage>.001</cpu_usage>
        </gpu_versions>
    </app>
    <app>
        <name>acemdshort</name>
        <max_concurrent>9999</max_concurrent>
        <gpu_versions>
            <gpu_usage>1</gpu_usage>
            <cpu_usage>.001</cpu_usage>
        </gpu_versions>
    </app>
</app_config>
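
If you plan to vary <gpu_usage> or <cpu_usage> between test runs, hand-editing three blocks each time is error-prone. The sketch below generates an equivalent file from a list of app names. The function name is hypothetical; the element names and app names are the ones from the reference file above. It only builds the XML text, so where to save it and how to make the client re-read it are left to you.

```python
import xml.etree.ElementTree as ET


def build_app_config(app_names, gpu_usage, cpu_usage, max_concurrent=9999):
    """Build app_config XML text with identical settings for every app."""
    root = ET.Element("app_config")
    for name in app_names:
        app = ET.SubElement(root, "app")
        ET.SubElement(app, "name").text = name
        ET.SubElement(app, "max_concurrent").text = str(max_concurrent)
        gpu_versions = ET.SubElement(app, "gpu_versions")
        ET.SubElement(gpu_versions, "gpu_usage").text = str(gpu_usage)
        ET.SubElement(gpu_versions, "cpu_usage").text = str(cpu_usage)
    return ET.tostring(root, encoding="unicode")


xml_text = build_app_config(
    ["acemdbeta", "acemdlong", "acemdshort"],
    gpu_usage=1,
    cpu_usage=0.001,
)
print(xml_text)
```

Changing one argument (for example cpu_usage=1.0) then gives a consistent file for the next test run.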

Profile skgiven
Volunteer moderator
Volunteer tester
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Message 29521 - Posted: 21 Apr 2013 | 7:38:39 UTC - in response to Message 29520.
Last modified: 21 Apr 2013 | 7:42:22 UTC

I've used app_config to run two tasks on one GPU (GTX660Ti 2GB), but I did not observe any WU's getting credit for only running a few seconds.
I've also been using app_config to run one task, and again no such problems.
The difference is that I don't saturate the CPU; I always leave some free.
Thus, it's your setup. Most likely the problem stems from completely saturating the CPU, which is a setup that has caused many problems at many projects for many years and is not recommended for crunching here. During the first few seconds of a WU loading, it uses a lot of the CPU:
<cpu_usage> of 0.001 looks Bad.
____________
FAQ's

HOW TO:
- Opt out of Beta Tests
- Ask for Help

Jacob Klein
Joined: 11 Oct 08
Posts: 1127
Credit: 1,901,927,545
RAC: 0
Level
His
Message 29522 - Posted: 21 Apr 2013 | 12:58:11 UTC - in response to Message 29521.
Last modified: 21 Apr 2013 | 13:24:38 UTC

skgiven:

Thank you for reporting your results.

When crunching 1 task, <cpu_usage> of 0.001 should be no different than the normal setup where GPUGrid tasks use about 0.73 CPU by default and app_config.xml is not used. In both of those cases, my computer will allocate the same number of jobs to fill the available CPU resources. So, I do not think using <cpu_usage> of 0.001 can be the problem.

It's possible that just USING the app_config.xml file could be a problem, but I also believe that to be very unlikely, and am still testing to prove it.

It sounds like you might be suggesting that using a BOINC setting of 100% CPU, or using "extreme" app_config settings, could be a cause. While I don't believe that to be the case... Let's try to get more evidence! Please help.

Are you running with a BOINC setting of 100% CPU (so that your CPU resources are slightly over-committed?) If not, would you please consider changing your BOINC setting to use 100% CPU?

Also, are you running using the exact same app_config.xml file I posted 2 posts ago? If not, would you please consider changing your app_config.xml file to be exactly the same as the one I posted 2 posts ago?

The goal should be for you to attempt to reproduce the problem, so we can identify its cause. And, if you believe it's a "setup" issue, due to either the CPU % setting or the app_config.xml file setting, then TEST IT! Use my CPU % setting, and my app_config.xml file, on your machine -- and see what your results are! Note: It may take a week to get the problem to surface, from my experience. So, even if you make the changes and restart BOINC, it's a waiting game to see the results.

I'm betting, and hoping, that:
- you change your settings to 100% CPU
- you change your app_config.xml file to use my exact same app_config.xml file
- you restart BOINC
- you run it for a week
- you still aren't able to reproduce the problem, even after that week.
.... I want this problem to be related to heterogeneous GPUs.

Will you please help, by performing those tests?

Most importantly, I want GPUGrid.net admins to further acknowledge the problem, to put forth effort into identifying its cause (including useful statements in stderr.txt), to fix the application, and to fix the validator. One can only hope that these things become a priority.

Thanks,
Jacob

flashawk
Joined: 18 Jun 12
Posts: 297
Credit: 3,572,627,986
RAC: 0
Level
Arg
Message 29523 - Posted: 21 Apr 2013 | 17:58:22 UTC

Thanks, Jacob, for explaining everything for me all in one post. Sometimes I'm a little slow; it makes much more sense to me now. I appreciate your stubborn determination in getting to the bottom of these anomalies. I just hope some other contributors aren't taking advantage of this flaw in the software.

mikey
Joined: 2 Jan 09
Posts: 286
Credit: 573,060,776
RAC: 356
Level
Lys
Message 29524 - Posted: 21 Apr 2013 | 18:04:40 UTC - in response to Message 29522.

skgiven:

Thank you for reporting your results.

When crunching 1 task, <cpu_usage> of 0.001 should be no different than the normal setup where GPUGrid tasks use about 0.73 CPU by default and app_config.xml is not used. In both of those cases, my computer will allocate the same number of jobs to fill the available CPU resources. So, I do not think using <cpu_usage> of 0.001 can be the problem.
Thanks,
Jacob


I have read this whole thread and, Jacob, you are one dedicated cruncher! I have ONE suggestion though: why not stop all CPU crunching for an hour and change your app_config file to allow 1.0 CPU usage, instead of 0.001, and just 'see what happens'? If no changes occur, then you will have confirmed that the 0.001 is NOT the problem; if everything works normally, then you MAY have found a touch of the problem. The REAL problem is the server validating bad units, but YOU can't fix that part. I KNOW you sort of did that when you ran with no app_config file, but if you would run two GPU units at once and up the CPU percentage, you will at least eliminate a 'perceived' and 'possible' problem. IF my idea works, then you can cut the numbers until you find the error point and then back off just a bit, kind of like overclocking: too much and you get errors, just right and the machine screams thru the units!

Jacob Klein
Joined: 11 Oct 08
Posts: 1127
Credit: 1,901,927,545
RAC: 0
Level
His
Message 29525 - Posted: 21 Apr 2013 | 19:52:59 UTC - in response to Message 29524.

Thanks for the suggestion. I highly doubt that adjusting the <cpu_usage> within the app_config.xml file would trigger the problem, but it is on my list of things to test.

Right now, I'm trying to reproduce the issue without an app_config.xml file. For some reason, I haven't been able to get that to happen yet.

Other tests I will do include:
- Test with app_config set for 1.000 CPU
- Test with BOINC set to use less CPU %
- Test with only 1 GPU in the system

It's not good enough to test for an hour. Sometimes it takes several days for this issue to happen. So, I'd say each test should take about 2 weeks, to "be sure" of the results. I guess I have to be patient and diligent with my testing.

If anyone else can replicate the problem (or at least try very hard to), I would very much appreciate any information you find.

Thanks,
Jacob

Simba123
Joined: 5 Dec 11
Posts: 147
Credit: 69,970,684
RAC: 0
Level
Thr
Message 29532 - Posted: 22 Apr 2013 | 8:45:56 UTC



It's more likely to happen when a workunit starts. CPU usage seems to peak at that point; with a setting limiting the CPU to 0.001, that's probably where it is occasionally choking.

It may also depend on the type of GPUs. I run a 660Ti and a 560Ti. The 660Ti uses as much CPU time as GPU time, whereas the 560Ti uses about 1/10th.

Even though the CPU is not saturated by any means, I have to keep 2 cores free to keep GPU usage steady instead of bouncing around, and that only works with nothing else using the rig. If I were to start web browsing etc., GPU usage starts jumping up and down again.

Jacob Klein
Joined: 11 Oct 08
Posts: 1127
Credit: 1,901,927,545
RAC: 0
Level
His
Message 29533 - Posted: 22 Apr 2013 | 10:11:42 UTC - in response to Message 29532.
Last modified: 22 Apr 2013 | 10:32:41 UTC

Some of you still do not understand something very basic.

All of these 3 scenarios result in the exact same CPU saturation/allocation under my system's setup:
- Not using an app_config.xml file (so GPUGrid tasks use their default CPU usage usually around 0.73 CPU)
- Setting <cpu_usage> to 0.999 via app_config.xml file
- Setting <cpu_usage> to 0.001 via app_config.xml file

In all 3 of these scenarios, my system will run 1 GPUGrid task on the GTX 660 Ti, 2 WCG tasks on my GTX460, and 6 additional CPU tasks. And in all of these scenarios, the GPUGrid task is granted the exact same "amount of CPU"; ie: just because it says "0.001" CPU, that does not mean it is somehow "limiting" the CPU usage. It's just a number that's used to calculate how many total tasks to run right now. That's all. You can do research on how CPUs work, and how BOINC works, to confirm this.

Regarding CPU overloading, I disagree about the necessity to reserve cores. From my observations using ProcessExplorer (Google it and use it for yourself!), GPUGrid tasks are set up to run at a process priority of 6, with an active-thread priority of 6 or 7. These priorities are higher than regular CPU tasks, meaning that GPUGrid tasks are never starved of the CPU. Now, if you browse a webpage or move a window, then that requires GPU, and the GPU usage may fluctuate... but in terms of competing with other CPU tasks, GPUGrid tasks are never starved of CPU, because GPUGrid tasks have higher priority. You can do research into Windows priorities, and watch Task Manager's CPU Usage, and use ProcessExplorer, to confirm all of this information for yourself.

Regarding the differences between types of GPUs, the applications appear to be set up to treat different GPU architectures differently. A GTX 660 Ti appears to be set up to request a full core all of the time (I believe this is equivalent to the now-deprecated "Swan_Sync" "0" setting). A GTX 460, however, because it is a different architecture, appears to be set up to request only a small portion of a core. This is all normal behavior, and is in fact probably a reason that the "default CPU Usage" of the GPUGrid tasks is a value less than 1 (the GPUGrid admins do not want to reserve a whole core unnecessarily, which would waste CPU resources).

I have not ruled out that overloading the CPU could be a causal factor in my problem, but based on my understanding of CPUs and processes, I consider it quite unlikely. One of the tests to prove this would be to control it by decreasing BOINC's "use at most x% of the processors" setting, or to control it by setting <cpu_usage> to 1.000 in the app_config.xml file; both of these are tests on my to-do list from 2 posts ago. But currently I'm still trying to get the issue to occur without an app_config.xml file.

Finally, if you believe I'm wrong in any of this, or if you believe you have some other "answer", I encourage you to use the scientific method and PROVE it. Perform the setting changes necessary to recreate the problem. Seriously. Test, test, test! I challenge you to find the cause of the problem; I would appreciate any and all help testing. And, when you've tested for long enough to prove something, report your results!

Guessing at the problem is no longer helpful to me.

Thank you,
Jacob

Jacob Klein
Joined: 11 Oct 08
Posts: 1127
Credit: 1,901,927,545
RAC: 0
Level
His
Message 29538 - Posted: 23 Apr 2013 | 3:51:03 UTC - in response to Message 29533.

I just had a task fail, without an app_config.xml file... but it actually failed (instead of being instant-done completed successfully with credit).

So, I'm a bit confused.
Nathan, did you change anything?

The details are below... they include an Exit Status of -1, a Client state of Compute error, and no credit granted. (ie: If it had to fail, it at least failed gracefully, without being marked successful and without granting credit).

Note: I was having internet connectivity issues during the time that it started and promptly failed. Don't know if that is a causal factor or not.

Name I9R28-NATHAN_dhfr36_3-31-32-RND4985_4
Workunit 4368491
Created 22 Apr 2013 | 18:29:13 UTC
Sent 22 Apr 2013 | 20:20:40 UTC
Received 23 Apr 2013 | 3:38:48 UTC
Server state Over
Outcome Computation error
Client state Compute error
Exit status -1 (0xffffffffffffffff) Unknown error number
Computer ID 149974
Report deadline 27 Apr 2013 | 20:20:40 UTC
Run time 7.22
CPU time 5.99
Validate state Invalid
Credit 0.00
Application version Long runs (8-12 hours on fastest card) v6.18 (cuda42)
Stderr output

<core_client_version>7.0.64</core_client_version>
<![CDATA[
<message>
(unknown error) - exit code -1 (0xffffffff)
</message>
<stderr_txt>
MDIO: cannot open file "output.restart.coor"

</stderr_txt>
]]>


I'll continue to test.

Simba123
Joined: 5 Dec 11
Posts: 147
Credit: 69,970,684
RAC: 0
Level
Thr
Message 29553 - Posted: 25 Apr 2013 | 2:15:33 UTC

Looking at that task (http://www.gpugrid.net/workunit.php?wuid=4368491), it has failed 8 times. I had one like that yesterday too, so I'd say it was a 'normal' failed task rather than anything else.

Jacob Klein
Joined: 11 Oct 08
Posts: 1127
Credit: 1,901,927,545
RAC: 0
Level
His
Message 29561 - Posted: 25 Apr 2013 | 12:31:20 UTC - in response to Message 29553.

Simba, I agree that the task that had the computational error, which failed for several other users, is unrelated to my research and testing. Thanks for pointing that out -- I should have spotted that earlier.

Also, just a note for the thread, I am now testing with nVidia Beta v320.00 drivers.

Jacob

Jacob Klein
Joined: 11 Oct 08
Posts: 1127
Credit: 1,901,927,545
RAC: 0
Level
His
Message 29572 - Posted: 27 Apr 2013 | 8:51:51 UTC - in response to Message 29561.

I've had another workunit that was Completed Successfully in a really short time. Looking at the logs, it ran on my GTX 660 Ti, while the GTX 460 was busy with 2 WCG tasks. It happened while I was using an app_config.xml file that had:
- 9999 max_concurrent
- 0.001 cpu_usage
- 1.000 gpu_usage
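For anyone following along, here is a rough sketch of how a file with those three settings could be generated. It assumes the usual app_config.xml layout (an app entry containing max_concurrent and a gpu_versions block); the app name "your_app_name" is a placeholder, not the real GPUGrid app name, and the helper function is made up for illustration.

```python
import xml.etree.ElementTree as ET


def build_app_config(app_name, max_concurrent, cpu_usage, gpu_usage):
    """Build an app_config element tree with the settings listed above."""
    root = ET.Element("app_config")
    app = ET.SubElement(root, "app")
    ET.SubElement(app, "name").text = app_name
    ET.SubElement(app, "max_concurrent").text = str(max_concurrent)
    gpu = ET.SubElement(app, "gpu_versions")
    ET.SubElement(gpu, "gpu_usage").text = f"{gpu_usage:.3f}"
    ET.SubElement(gpu, "cpu_usage").text = f"{cpu_usage:.3f}"
    return root


# "your_app_name" is a placeholder; substitute the app name your client reports.
config = build_app_config("your_app_name", 9999, 0.001, 1.0)
xml_text = ET.tostring(config, encoding="unicode")
print(xml_text)
```

The printed text would be saved as app_config.xml in the project directory before restarting BOINC.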

This frustrates me, because so far, I've only been able to reproduce the problem when I use an app_config file with those settings. But I still really believe that using an app_config.xml file shouldn't be causing the problem.

So, again, to continue to test my theory, I've reset the project, and will be running without an app_config.xml file for a while. I'll chime in again, in about 2 more weeks, to report more results.

Task details:

Name I62R17-NATHAN_dhfr36_5-25-32-RND8674_0
Workunit 4399064
Created 26 Apr 2013 | 23:37:52 UTC
Sent 26 Apr 2013 | 23:38:46 UTC
Received 27 Apr 2013 | 6:14:45 UTC
Server state Over
Outcome Success
Client state Done
Exit status 0 (0x0)
Computer ID 149974
Report deadline 1 May 2013 | 23:38:46 UTC
Run time 4.27
CPU time 0.87
Validate state Valid
Credit 70,800.00
Application version Long runs (8-12 hours on fastest card) v6.18 (cuda42)
Stderr output

<core_client_version>7.0.64</core_client_version>

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Level
Met
Message 29578 - Posted: 27 Apr 2013 | 12:08:47 UTC - in response to Message 29533.

Finally, if you believe I'm wrong in any of this

Nope, sounds all right!

Going to test with your app_config now, although I'm only running GPU-Grid as backup for POEM nowadays, so even if I'd get the error it might take me a long time. "Good" that there's so little work at POEM ;)

MrS
____________
Scanning for our furry friends since Jan 2002

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Level
Met
Message 29638 - Posted: 2 May 2013 | 20:36:31 UTC - in response to Message 29578.

I've been running it for a few days by now, no anomalies.

Except that the claimed CPU utilization of 0.001 results in my BOINC 7.0.44 starting 8 Einstein threads on my 8-threaded i7+HT. You said in your case you'd get 7 CPU tasks, whether you use the app_config or not.

Running 8 Einsteins still makes my GTX660Ti use about a full core, but GPU utilization is fluctuating between 86 and 90%, whereas it stays at a nice and steady 92% without any CPU tasks. From quickly experimenting it seems like 3 or 4 CPU tasks would be enough to get full GPU-grid performance.

MrS
____________
Scanning for our furry friends since Jan 2002

Jacob Klein
Send message
Joined: 11 Oct 08
Posts: 1127
Credit: 1,901,927,545
RAC: 0
Level
His
Message 29641 - Posted: 2 May 2013 | 23:23:39 UTC - in response to Message 29638.

I've been running it for a few days by now, no anomalies.

Except that the claimed CPU utilization of 0.001 results in my BOINC 7.0.44 starting 8 Einstein threads on my 8-threaded i7+HT. You said in your case you'd get 7 CPU tasks, whether you use the app_config or not.

Running 8 Einsteins still makes my GTX660Ti use about a full core, but GPU utilization is fluctuating between 86 and 90%, whereas it stays at a nice and steady 92% without any CPU tasks. From quickly experimenting it seems like 3 or 4 CPU tasks would be enough to get full GPU-grid performance.

MrS


I didn't say what you think I said.

I had said:
In all 3 of these scenarios, my system will run 1 GPUGrid task on the GTX 660 Ti, 2 WCG tasks on my GTX460, and 6 additional CPU tasks

Each WCG GPU task was set to use 1.000 CPU, and the GPUGrid task was set to use <1.000 CPU.... so BOINC would schedule the 1 GPUGrid task, the 2 WCG GPU tasks, and then 6 additional CPU tasks.

Now that the WCG GPU app (Help Conquer Cancer) hcc1 is done, my computer now runs (without an app_config.xml file):
1 GPUGrid task on the GTX 660 Ti (0.729 CPU + 1 NVIDIA GPU)
1 GPUGrid task on the GTX 460 (0.729 CPU + 1 NVIDIA GPU)
7 CPU tasks

I still cannot recreate the "completed quickly yet gave credit" problem when not using an app_config.xml file. It's quite frustrating. I'll be switching back over to using the 0.001 CPU app_config.xml file soon, to see if I can recreate the issue while I don't have any WCG HCC tasks. Then, assuming I can recreate it, I'll switch over to using a 0.729 CPU app_config.xml file, which I have not yet tried.

I will find the exact cause of this bug, I swear this to you. Grrrrrr.

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Message 29646 - Posted: 3 May 2013 | 12:33:34 UTC - in response to Message 29641.
Last modified: 3 May 2013 | 12:36:31 UTC

WCG's HCC on GPU WU's began by using a full CPU, then used the GPU (high GPU utilization), and then used a full CPU thread again. When the CPU was being used the GPU was not and when the GPU was being used the CPU was not.

In this situation, you could therefore be running two HCC tasks without using any CPU (depending on how far along the tasks were) or be fully using two CPU threads.
In my opinion when using the app_config and specifying that GPUGrid tasks use 0.001 CPU's this meant that when some GPUGrid tasks started two CPU threads would be allocated and be fully used by HCC tasks. Along with 7 CPU tasks this meant that at that critical time for a GPUGrid task (the start), the CPU was already saturated and Boinc was being told (by the app_config file) that GPUGrid apps hardly needed any CPU (0.001). They failed as a result of CPU starvation.
They were incorrectly granted credit because the system didn't catch the error, but app_config is new and the server hasn't been equipped to catch such unknown errors. While this might be a problem for the researchers, perhaps future server updates will help protect against such problems.
____________
FAQ's

HOW TO:
- Opt out of Beta Tests
- Ask for Help

Jacob Klein
Send message
Joined: 11 Oct 08
Posts: 1127
Credit: 1,901,927,545
RAC: 0
Level
His
Message 29647 - Posted: 3 May 2013 | 13:29:33 UTC - in response to Message 29646.
Last modified: 3 May 2013 | 14:17:26 UTC

WCG's HCC on GPU WU's began by using a full CPU, then used the GPU (high GPU utilization), and then used a full CPU thread again. When the CPU was being used the GPU was not and when the GPU was being used the CPU was not.


That is not what I saw. On my machine, on my nVidia GTX 460, each of the WCG HCC GPU tasks consumed a CPU core fully for the entire time. Also, for each task, there were 2 times within the duration of a task, where the task would not use the GPU for about 20 seconds. I watched in Task Manager and Process Explorer and GPU-Z.

In this situation, you could therefore be running two HCC tasks without using any CPU (depending on how far along the tasks were) or be fully using two CPU threads.


That never happened for me. Again, for me, each WCG HCC task always used a full CPU core, and except for the two 20-second non-GPU periods per task, each task utilized the GPU.

In my opinion when using the app_config and specifying that GPUGrid tasks use 0.001 CPU's this meant that when some GPUGrid tasks started two CPU threads would be allocated and be fully used by HCC tasks. Along with 7 CPU tasks this meant that at that critical time for a GPUGrid task (the start), the CPU was already saturated and Boinc was being told (by the app_config file) that GPUGrid apps hardly needed any CPU (0.001). They failed as a result of CPU starvation.


When running 1 GPUGrid task... Do you have any proof that the system or BOINC does anything different when using an app_config of 0.001, as compared to not using an app_config (where it uses 0.729 CPUs by default)? I'm looking for proof, not opinion or conjecture. My claim is that the number is only used when determining how many tasks to start, and nothing else. But I have not yet been able to conclusively prove it.

They were incorrectly granted credit because the system didn't catch the error, but app_config is new and the server hasn't been equipped to catch such unknown errors. While this might be a problem for the researchers, perhaps future server updates will help protect against such problems.


The admins/researchers are capable, now, of putting in necessary checks in the validator to invalidate these types of results, without any server updates. So far, they have not made this a priority. So, I've given up on them helping the situation.

Profile Beyond
Avatar
Send message
Joined: 23 Nov 08
Posts: 1112
Credit: 6,162,416,256
RAC: 0
Level
Tyr
Message 29649 - Posted: 3 May 2013 | 13:55:46 UTC - in response to Message 29646.

WCG's HCC on GPU WU's began by using a full CPU, then used the GPU (high GPU utilization), and then used a full CPU thread again. When the CPU was being used the GPU was not and when the GPU was being used the CPU was not.

Most of my ATI/AMD cards are on WCG HCC1 and generally they do best with 4 WUs/GPU running concurrently. Along with an NVidia GPU running GPUGrid it works best here to reserve 3 X6 CPU cores for the 5 GPU WUs. BTW the HCC project is winding down fast, they've collected all the data they need for now.

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Message 29655 - Posted: 3 May 2013 | 17:36:11 UTC - in response to Message 29649.
Last modified: 3 May 2013 | 17:42:05 UTC

Just checked and it looks like my GTX660Ti continuously uses a full thread running HCC - it's my ATI that only uses a full thread at the start and end of the run. I don't have a GTX460 to check with but I'll take your word that it's the same. No wonder the NVidia's are so poor there; they're running something on the CPU and probably hammering the PCIE. Might be down to NVidia's OpenCL1.1 driver limitations, but it's winding up, so it's no odds now.

Jacob, exactly how do you propose proving or disproving the effect of using 0.001 or any other number in the app_config file?
Put an acceptable test together with reasoning and I'll test it, but anything that starts killing tasks is a no-no.
____________
FAQ's

HOW TO:
- Opt out of Beta Tests
- Ask for Help

Jacob Klein
Send message
Joined: 11 Oct 08
Posts: 1127
Credit: 1,901,927,545
RAC: 0
Level
His
Message 29658 - Posted: 3 May 2013 | 18:02:50 UTC - in response to Message 29655.
Last modified: 3 May 2013 | 18:05:45 UTC

skgiven,

Your claim is that the CPU is somehow being "overloaded", and that using 0.001 CPU_Usage in an app_config.xml file could be the cause. I don't believe that's possible, but let's have you test it!

So, a suitable test (for you to do) would be:
- Tell BOINC to "use at most 100% of the processors"
- Close BOINC
- Add an app_config.xml file to your GPUGrid.net project directory, and make it look like the one referenced here: http://www.gpugrid.net/forum_thread.php?id=3332&nowrap=true#29520
- Restart BOINC
- Confirm that any running GPUGrid tasks show up as "Running (0.001 CPU + 1 NVIDIA GPU (device x))"
- Confirm that the CPU is either fully saturated, or over-saturated. The more saturation, the better for our test!
- Run that for 3 weeks, monitoring your GPUGrid.net task results
- Let me know if any results have really low run times but are still completed successfully, validated, and granted credit
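The last step above (spotting results with really low run times that still validated) could be sketched roughly as below. The records are typed in by hand from the task details posted in this thread; the 60-second cutoff, the field names and the label for the invalid task are assumptions for illustration only.

```python
def suspicious(results, min_run_time=60.0):
    """Return names of results that validated with credit despite a tiny run time."""
    return [
        r["name"]
        for r in results
        if r["validate_state"] == "Valid"
        and r["credit"] > 0
        and r["run_time"] < min_run_time  # run time in seconds
    ]


# Hand-copied from the task details earlier in the thread.
results = [
    {"name": "I62R17-NATHAN_dhfr36_5-25-32-RND8674_0",
     "run_time": 4.27, "validate_state": "Valid", "credit": 70800.0},
    {"name": "invalid-example (name not shown)",
     "run_time": 7.22, "validate_state": "Invalid", "credit": 0.0},
]

flagged = suspicious(results)
print(flagged)
```

Only the first record is flagged: the second also ran for a few seconds, but it was marked Invalid and got no credit, so it is an ordinary failure.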

Does that sound suitable?

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Message 29660 - Posted: 3 May 2013 | 19:28:52 UTC - in response to Message 29658.

Firstly, I use my systems, so I don't want the CPU to be saturated.
I'm also not keen on deliberately trying to banjax tasks!

I think that over-saturating the CPU is a fundamentally bad setup, one that is basically designed to cause failures and certainly not one I would recommend using. So I would also be concerned that such a setup would generate other failure types.
What is run on the CPU could be significant too.

So far only you have experienced such failures and only when using 0.001 cpu_usage in an app_config.xml file. This strongly suggests that the 0.001 configuration or the use of the app_config file is causing the problem for you.

I know it's supposed to be a scheduling parameter but its terminology suggests that it stipulates the cpu usage of the app in question. If it does just apply scheduling parameters it shouldn't ordinarily limit the GPU app's usage of the CPU; if I set it to 0.001 or 0.99 it shouldn't make any difference - the app will use whatever amount of CPU it requires.

On that theory, yes, perhaps during the start of a task Boinc incorrectly uses this parameter to proportion CPU usage to the loading apps, and only if the CPU becomes overly-saturated does having such a low cpu_usage restrict the GPU apps access to the CPU to the extent that it causes tasks to fail. To determine that however, you would be better setting flags and looking at the logs to get some indication that this is happening. I don't think there are any specific flags for app_config but perhaps <coproc_debug>, <cpu_sched> and <cpu_sched_debug> would be useful?
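As a sketch of how those flags might be switched on, the snippet below builds a cc_config.xml with a log_flags section. It assumes the usual cc_config layout and uses only the three flag names mentioned above; the helper function itself is made up for illustration.

```python
import xml.etree.ElementTree as ET


def build_cc_config(flags):
    """Return cc_config.xml text with each named log flag set to 1."""
    root = ET.Element("cc_config")
    log_flags = ET.SubElement(root, "log_flags")
    for flag in flags:
        ET.SubElement(log_flags, flag).text = "1"
    return ET.tostring(root, encoding="unicode")


cc_config_xml = build_cc_config(["coproc_debug", "cpu_sched", "cpu_sched_debug"])
print(cc_config_xml)
```

After saving the output as cc_config.xml in the BOINC data directory, re-reading the config files should make the extra scheduling messages show up in the event log.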

The problem with this test is that you can't determine if it's the 0.001 setting or just using the app_config file when the CPU is already over-saturated.
To determine that you would have to test, as you suggested, with normal CPU usage assigned (1.0, not Boinc's 0.729 estimate) and also without the cpu_usage line (to show that it's not something else to do with using app_config), if possible. That's 9 weeks of testing something that could turn out to be a peculiarity specific to your individual setup.
____________
FAQ's

HOW TO:
- Opt out of Beta Tests
- Ask for Help

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Level
Met
Message 29661 - Posted: 3 May 2013 | 22:04:57 UTC

@SK: this is not about recommending a setup, it's about helping debugging. And to track the error down we've got to find what triggers it. If this makes you more comfortable to run the test Jacob suggested: I'm doing exactly what he said since last weekend. No problems so far.. of course.

And we all know modern PCs can be strange beasts at times.. but this entire argument of "not overloading CPU, might lead to errors" sounds bollux to me. The OS task scheduler is there to handle this. As I said: I'm running overcommitted now with 1 more thread than logical cores, yet GPU-Grid typically still gets about 12.5% overall CPU time, i.e. one logical core. It's the lower priority (Einstein) jobs which get less CPU time, as Jacob already pointed out several times.

One could conceive a scenario that due to some CPU stalling the GPU isn't fed properly and some timeout triggers, causing the error. But this would result in a "display driver stopped responding" message. Jacob.. did you get such errors? Do they appear in the event log in general?

And this CPU utilization number is really just a guideline for the BOINC task scheduling. Actually, if the developers wanted it to be anything else, they'd be in trouble as BOINC can only start the tasks. What these do afterwards is out of their control. One could even put a multi-threaded app in there, stealing CPU time from other projects while still claiming to use 1 core. Of course, no real project would be dumb enough to try any such thing.

MrS
____________
Scanning for our furry friends since Jan 2002

Jacob Klein
Send message
Joined: 11 Oct 08
Posts: 1127
Credit: 1,901,927,545
RAC: 0
Level
His
Message 29662 - Posted: 3 May 2013 | 23:01:55 UTC - in response to Message 29660.

skgiven,

I thought you were serious about wanting to help test. But it sounds like you don't want to perform the test I recommended. That's unfortunate.

But I'm trying to channel my frustrations into solving this. It's taken me 1 month thus far, with very little guidance/help from the admins. I feel I've made some progress, and if it takes 9 more weeks, or longer, to solve it, then so be it.

You're welcome to help out any time. But it will involve stepping outside your comfort zone.

Regards,
Jacob

PS: Thanks for the advice about the logging flags.

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Level
Met
Message 29668 - Posted: 4 May 2013 | 10:10:27 UTC

Jacob wrote:
I didn't say what you think I said.

I had said:
In all 3 of these scenarios, my system will run 1 GPUGrid task on the GTX 660 Ti, 2 WCG tasks on my GTX460, and 6 additional CPU tasks

Sorry, you're right: I actually get 8 Einsteins and 1 GPU-Grid task with either config, so the same as you. I thought it would have been 7 + 1. Might have been with some prior BOINC version or who-knows!

MrS
____________
Scanning for our furry friends since Jan 2002

Jacob Klein
Send message
Joined: 11 Oct 08
Posts: 1127
Credit: 1,901,927,545
RAC: 0
Level
His
Message 29671 - Posted: 4 May 2013 | 10:51:11 UTC - in response to Message 29668.
Last modified: 4 May 2013 | 10:54:51 UTC

For reference, I believe the BOINC scheduling rule is (and has been for a long time):
- While CPU_Used <= (#CPU+1), schedule GPU tasks to run
- While CPU_Used < (#CPU+1), schedule CPU tasks to run
- If you have any high-priority tasks, they get scheduled first

So, if you're running 1 GPUGrid Task, on an 8-CPU machine, with BOINC set to use 100% CPUs, and the GPUGrid task's CPU_Usage is < 1, then I would expect you to also be running 8 CPU tasks, and it is normal.

Also, you should be running BOINC v7.0.64, which was recently publicly released.

Kind Regards,
Jacob

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Level
Met
Message 29678 - Posted: 4 May 2013 | 15:44:22 UTC - in response to Message 29671.

I thought it would be (#CPU) instead of (#CPU+1), but apparently you're right!

Also, you should be running BOINC v7.0.64, which was recently publicly released.

That's what BOINC keeps telling me as well. But regarding BOINC I like to play it safe and only upgrade if there's anything to be gained for me.

MrS
____________
Scanning for our furry friends since Jan 2002

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Message 29679 - Posted: 4 May 2013 | 20:11:37 UTC - in response to Message 29661.

@SK: this is not about recommending a setup, it's about helping debugging.
Fine so long as the project is happy with people using it to test and debug Boinc code.

And to track the error down we've got to find what triggers it. If this makes you more comfortable to run the test Jacob suggested: I'm doing exactly what he said since last weekend. No problems so far.. of course.
Hence the suggested flags. The point of running without saturating the CPU was to perform a baseline trial. I've done this, and not experienced any similar issues. Unfortunately running with 100% CPU usage has resulted in driver and app crashes for me. While it does make me a bit more comfortable knowing that you are testing, I still have my concerns.

And we all know modern PCs can be strange beasts at times.. but this entire argument of "not overloading CPU, might lead to errors" sounds bollux to me.
The hypothetical argument was to try and understand what's going on; state something and then prove or disprove it. Anyway, the argument was that overloading the CPU leads to errors, and it does for me, just not the same problem. Not overloading the CPU, while using app_config, hasn't so far resulted in any problems, including the same problem as Jacob described (and I've been testing with app_config for a month now). My present concern is that there are important setup differences: I don't have a 3GB card or a GTX460 to test with, and it might be important to crunch the same WU's on the second GPU (and possibly the same CPU WU's). Can we agree on something here?

The OS task scheduler is there to handle this. As I said: I'm running overcommitted now with 1 more thread than logical cores, yet GPU-Grid typically still gets about 12.5% overall CPU time, i.e. one logical core. It's the lower priority (Einstein) jobs which get less CPU time, as Jacob already pointed out several times.
That's nice, but I got an app crash within 5min of running over committed. There is no point running with such a setup if you get lots of other failures, it defeats the purpose.

One could conceive a scenario that due to some CPU stalling the GPU isn't fed properly and some timeout triggers, causing the error. But this would result in a "display driver stopped responding" message. Jacob.. did you get such errors? Do they appear in the event log in general?
One realized this. I doubt that my driver and app crashes are in any way related to Jacob's special error, but they are a show stopper.

And this CPU utilization number is really just a guideline for the BOINC task scheduling. Actually, if the developers wanted it to be anything else, they'd be in trouble as BOINC can only start the tasks. What these do afterwards is out of their control.
I'm aware of what the app_config is supposed to do, but because Jacob has only experienced the problems when it's in use, what it's actually doing is open to debate. We just need to bear in mind that this could still be a client bug, or related to the GTX460.

One could even put a multi-threaded app in there, stealing CPU time from other projects while still claiming to use 1 core. Of course, no real project would be dumb enough to try any such thing.

MrS
Plenty of MT apps have been used including Aqua and more recently SimOne, and yes they stopped other tasks from running including GPU work (and that's post 7.0.40). For that reason I ran SimOne in a VM. Between driver, app and system crashes I still try to use a VM for other projects. This makes it more difficult to commit to testing with a setup that has already and repeatedly resulted in crashes for me (and that's without using the VM).

Is it not the case that without app_config you have 7CPU tasks and 1 GPU task running, but when using app_config it's 8?

Jacob has 2 GPU's in his setup (both NVidia). My setup primarily differs in that I presently have an ATI and an NVidia. Your setup differs in that it just has one NVidia - what I initially tested with. I had fewer errors with the one GPU when trying to use 100% CPU, but I still had some and I wasn't around enough to run like that. Adding the ATI was to facilitate POEM, which as you know doesn't like two NVidia's, but I also thought it would be closer to Jacob's setup.

Jacob, the version of Boinc you were using when you first encountered the credit rich 'errors' was pre-7.0.64, probably .58, .59 or .60.
The release date of .64 was 16-Apr, so the issue seems to span several release versions. While I agree it's best to test on a common Boinc version, I don't think it's essential. I've used 7.0.60 and 7.0.64 and possibly 7.0.62 (albeit for only a few days) - not that I've managed to replicate the problem.

Do you think the operating system or GPU types in the system would make any difference in determining if the problem is CPU or app_config related?
If not I could test for a while on a different system (XP and a GTX470).

At present we still haven't eliminated the possibility of the issue being related to the GTX460. If you pull it, you might find out.

It does look like the issue is related to app_config, but seeing as no-one else has managed to replicate the problem, with or without an identical config file and CPU usage profile, it's still looking like it's rig-specific.
____________
FAQ's

HOW TO:
- Opt out of Beta Tests
- Ask for Help

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Message 29680 - Posted: 4 May 2013 | 20:47:58 UTC - in response to Message 29679.
Last modified: 4 May 2013 | 21:22:31 UTC

OK, well here's a sort-of bug!

Running with Jacob's app_config.
I exited Boinc and removed the app_config file from the GPUGrid project folder.
Start Boinc, and click on the GPUGrid WU Properties. It says using 0.001 CPU's!
It should say what it's actually using which is 1 CPU.

Re-read the config file(s) and the Boinc logs say Boinc isn't using any app_config file:
04/05/2013 21:47:32 | | Re-reading cc_config.xml
04/05/2013 21:47:32 | | No config file found - using defaults

The WU properties still says it's using 0.001 CPU's.

Even after a system restart, it's still saying that the WU's using 0.001 CPU's.

The WU is clearly using a full CPU, but saying that it's using 0.001. That could only have come from the app_config file - which has now been removed. So why does Boinc still think it's using 0.001 CPU's?

The setting is being saved and retained (for future use) in the init_data.xml file in the Boinc Slot associated with the GPUGrid WU:

<project_dir>C:\ProgramData\BOINC/projects/www.gpugrid.net</project_dir>
<boinc_dir>C:\ProgramData\BOINC</boinc_dir>
<wu_name>I94R8-NATHAN_dhfr36_6-8-32-RND5174</wu_name>
<result_name>I94R8-NATHAN_dhfr36_6-8-32-RND5174_0</result_name>
<comm_obj_name>boinc_0</comm_obj_name>
<slot>1</slot>
<client_pid>3508</client_pid>
<wu_cpu_time>10328.080000</wu_cpu_time>
<starting_elapsed_time>10350.163047</starting_elapsed_time>
<using_sandbox>0</using_sandbox>
<user_total_credit>278316356.517260</user_total_credit>
<user_expavg_credit>164632.657628</user_expavg_credit>
<host_total_credit>35865525.000000</host_total_credit>
<host_expavg_credit>81185.211218</host_expavg_credit>
<resource_share_fraction>0.083333</resource_share_fraction>
<checkpoint_period>600.000000</checkpoint_period>
<fraction_done_start>0.000000</fraction_done_start>
<fraction_done_end>1.000000</fraction_done_end>
<gpu_type>NVIDIA</gpu_type>
<gpu_device_num>0</gpu_device_num>
<gpu_opencl_dev_index>0</gpu_opencl_dev_index>
<ncpus>0.001000</ncpus>
<rsc_fpops_est>5000000000000000.000000</rsc_fpops_est>
<rsc_fpops_bound>250000000000000000.000000</rsc_fpops_bound>
<rsc_memory_bound>100000000.000000</rsc_memory_bound>
<rsc_disk_bound>300000000.000000</rsc_disk_bound>
<computation_deadline>1368103379.000000</computation_deadline>
<vbox_window>0</vbox_window>
<host_info>

This means that removing app_config doesn't affect running tasks - they keep using the parameters already set.
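To check what a slot has stored without opening the file by hand, something like the sketch below could pull the ncpus value out of the init_data.xml text. It uses a plain regular expression on a few lines copied from the listing above, so it does not depend on the rest of the file; the function name is made up for illustration.

```python
import re


def read_ncpus(init_data_text):
    """Return the stored <ncpus> value as a float, or None if it is absent."""
    match = re.search(r"<ncpus>\s*([0-9.]+)\s*</ncpus>", init_data_text)
    return float(match.group(1)) if match else None


# A few lines copied from the init_data.xml listing in this post.
fragment = """<slot>1</slot>
<gpu_type>NVIDIA</gpu_type>
<ncpus>0.001000</ncpus>
"""

stored = read_ncpus(fragment)
print(stored)
```

Running it against the real file in the slot directory before and after removing app_config would show whether the stored value ever changes for a task that is already running.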

The problem with this is that Boinc thinks it's using 0.001 CPU's while it's actually using 1 CPU, and if I have 100% CPU usage selected, Boinc will be running 9 WU's (8 CPU tasks and 1 GPUGrid WU).

If I go in and set app_config to use 1 cpu and re-read the config file(s), Boinc rethinks the situation and starts running 8 tasks (7 CPU tasks and 1 GPU task).
Of course the WU's properties still say 0.001 CPU is being used, but at least Boinc isn't actually trying to run 9 tasks and is only running 8 (including the GPU WU).
____________
FAQ's

HOW TO:
- Opt out of Beta Tests
- Ask for Help

Jacob Klein
Send message
Joined: 11 Oct 08
Posts: 1127
Credit: 1,901,927,545
RAC: 0
Level
His
Message 29682 - Posted: 4 May 2013 | 22:16:11 UTC - in response to Message 29680.
Last modified: 4 May 2013 | 22:31:25 UTC

Fine so long as the project is happy with people using it to test and debug Boinc code.

We're not debugging BOINC code, here. We are simply trying to find out what triggers the problem specified in this thread. And, for some of us trying to do that, it's okay with us if we get a few errors along the way. At this time, I'm not prepared to blame BOINC code.

My present concern is that there are important setup differences, I don't have a 3GB card or a GTX460 to test with, and it might be important to crunch the same WU's on the second GPU (and possibly the same CPU WU's). Can we agree on something here?

I understand that a setup difference could be an issue. That is why it is so important for someone other than me, to replicate the problem, on their machine.

That's nice, but I got an app crash within 5min of running over committed. There is no point running with such a setup if you get lots of other failures, it defeats the purpose.

The 'purpose' of running fully committed, or over-committed, would be to try to replicate the issue in this thread. So, even if you get lots of other failures (which would probably be some other issue), it does NOT defeat the purpose of trying to replicate the issue in this thread. You only think it does.

Is it not the case that without app_config you have 7CPU tasks and 1 GPU task running, but when using app_config it's 8?

You're wrong. If he is running a GPUGrid task that says it takes < 1.000 core (either 0.729 CPU default without an app_config, or 0.001 CPU with an app_config), BOINC will still schedule 8 CPU tasks either way, because [0.729 + 8] < [8 + 1], and because [0.001 + 8] < [8 + 1]. This is normal behavior by BOINC, because BOINC's goal is to always keep all resources busy, and if a GPU task says it will only use the CPU part of the time, BOINC makes sure that a CPU task can keep the resource busy for the times where the GPU task doesn't.

So, when my system was running WCG HCC GPU tasks, when I had 2-WCG-at-a-time running on the GTX460, my setup was:
- If I set BOINC to use 100% CPUs, and reset the GPUGrid.net project (so that it doesn't have any app_config), BOINC will run 1 GPUGrid task on the 660 Ti (which says it uses 0.729 CPU, but actually uses a full core), 2 WCG GPU tasks on the GTX 460 (each saying 1.000 CPU, and each actually taking a full core), and 6 other CPU tasks (because [0.729 + 1.000 + 1.000 + 6] < [8 + 1]).
- If I set BOINC to use 100% CPUs, and use an app_config file of 0.001, BOINC will run 1 GPUGrid task on the 660 Ti (which says it uses 0.001 CPU, but actually uses a full core), 2 WCG GPU tasks on the GTX 460 (each saying 1.000 CPU, and each actually taking a full core), and 6 other CPU tasks (because [0.001 + 1.000 + 1.000 + 6] < [8 + 1]).
- So, here in this case, using an app_config file should NOT have made any significant difference at all in terms of CPU scheduling, yet, so far my tests seem to indicate that using the file DID make a difference. I cannot understand how/why.

Let's apply that logic to my current setup, now that WCG HCC GPU is done. Listen carefully, because the math now involves running GPUGrid on 2 GPUs, and it's a bit different -- here goes:
- If I set BOINC to use 100% CPUs, and reset the GPUGrid.net project (so that it doesn't have any app_config), BOINC will run 1 GPUGrid task on the 660 Ti (which says it uses 0.729 CPU, but actually uses a full core), 1 GPUGrid task on the 460 (which says it uses 0.729 CPU, but actually only uses about 1/8th of a core), and 7 other CPU tasks (because [0.729 + 0.729 + 7] < [8 + 1]).
- If I set BOINC to use 100% CPUs, and use an app_config file of 0.001, BOINC will run 1 GPUGrid task on the 660 Ti (which says it uses 0.001 CPU, but actually uses a full core), 1 GPUGrid task on the 460 (which says it uses 0.001 CPU, but actually only uses about 1/8th of a core), and 8 other CPU tasks (because [0.001 + 0.001 + 8] < [8 + 1]).
- I'm actually currently running the ([0.001 + 0.001 + 8] < [8 + 1]) setup right now, trying to recreate the problem without any WCG HCC GPU tasks.
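
For anyone who wants to check the arithmetic in the cases above, here is a minimal Python sketch. It assumes the rule is exactly as stated in this post: CPU tasks keep being added while the CPU fractions claimed by the GPU tasks, plus the number of CPU tasks, stay below ncpus + 1. The function name `max_cpu_tasks` is made up for illustration, and BOINC's real scheduler is more involved than this.

```python
def max_cpu_tasks(gpu_cpu_reservations, ncpus=8):
    """Count CPU tasks allowed under the inequality quoted in this thread.

    gpu_cpu_reservations: the cpu_usage value each running GPU task *claims*
    (not what it really uses). This is an illustration, not BOINC code.
    """
    reserved = sum(gpu_cpu_reservations)
    n = 0
    # Add one more CPU task while [reserved + n + 1] < [ncpus + 1] still holds,
    # and never run more CPU tasks than there are (logical) CPUs.
    while n < ncpus and reserved + (n + 1) < ncpus + 1:
        n += 1
    return n


# Cases taken from the post (8 logical CPUs):
print(max_cpu_tasks([0.729]))            # one GPUGrid task, project default
print(max_cpu_tasks([0.729, 1.0, 1.0]))  # GPUGrid default + 2 WCG GPU tasks
print(max_cpu_tasks([0.001, 1.0, 1.0]))  # GPUGrid at 0.001 + 2 WCG GPU tasks
print(max_cpu_tasks([0.729, 0.729]))     # GPUGrid default on both GPUs
print(max_cpu_tasks([0.001, 0.001]))     # GPUGrid at 0.001 on both GPUs
```

The point of the sketch is that only the claimed fraction enters the sum; how much CPU the task really uses never appears in it.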

Jacob, the version of Boinc you were using when you first encountered the credit rich 'errors' was pre-7.0.64, probably .58, .59 or .60.
The release date of .64 was 16-Apr, so the issue seems to span several release versions. While I agree it's best to test on a common Boinc version, I don't think it's essential. I've used 7.0.60 and 7.0.64 and possibly 7.0.62 (albeit for only a few days) - not that I've managed to replicate the problem. Do you think the operating system or GPU types in the system would make any difference in determining if the problem is CPU or app_config related?
If not I could test for a while on a different system (XP and a GTX470).

I don't think the BOINC version has anything to do with this bug. I'm of the opinion that, in order to pick up all of the other small bug fixes that a user might not be aware of, they should always use the latest publicly-released version of BOINC. That is why I recommended 7.0.64. Regarding OS and GPU types... I'm not sure. If you could please run at 100% CPUs, using an app_config.xml file with a cpu_usage of 0.001, on as many systems as possible, for the sole purpose of trying to recreate the issue, that would be awesome. I challenge you to recreate the issue.

Running with Jacob's app_config.
I exited Boinc and removed the app_config file from the GPUGrid project folder.
Start Boinc, and click on the GPUGrid WU Properties. It says using 0.001 CPU's!
It should say what it's actually using which is 1 CPU.
Even after a system restart, it's still saying that the WU's using 0.001 CPU's.

The WU is clearly using a full CPU, but saying that it's using 0.001. That could only have come from the app_config file - which has now been removed. So why does Boinc still think it's using 0.001 CPU's?

First of all, maybe you are finally realizing that that number is just a number that is used to schedule tasks. You said "It should say what it's actually using", and that is simply not true. Heck, imagine a scenario where you have a GTX 660 Ti (which actually uses a full core), and a GTX 460 (which actually uses 1/8th of a core). BOINC doesn't use that number to "say what it's actually using", because if it did, then it would show "1.000" for my GTX 660 Ti, and "0.125" for my GTX 460. BOINC can't do that! The "number" is a number that must be the same for all tasks of a particular app.

And by default, GPUGrid developers have specified 0.729 to be the default value (at least for me) for every GPU task that runs, regardless of how much CPU the GPU task actually uses, regardless of GPU architecture. See, that number will always be the same for all tasks of a particular app that are running.

It gets set when a task is started/restarted, and can be updated when an app_config file is changed and the file is reread. If no app_config file is ever used, BOINC will use whatever default value the project gave, which for me is 0.729. Once an app_config file is used, the only way for me to get BOINC to start re-using the default value is to reset the project. That may be a bug, not sure, but that's just the way it works for now.

Is this hopefully making sense now? Stop thinking of that number as a "somehow limit the CPU usage to this amount"; BOINC and the OS don't limit the CPU usage using that number! Instead, think of that number as "it's just a number that's added into the total CPU usage amount when determining how many additional CPU tasks to process".

Kind regards,
Jacob Klein

Profile skgiven
Volunteer moderator
Volunteer tester
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Message 29683 - Posted: 4 May 2013 | 23:37:11 UTC - in response to Message 29682.
Last modified: 5 May 2013 | 1:58:47 UTC

Boinc says that my GTX470 uses 0.583 CPU's when running a long Nathan WU. In reality it uses next to nothing. The reason it uses so little cpu is that it's CC2.0 and performs something on the GPU that the Keplers perform on the cpu. The GTX660Ti is CC3.0 and uses a full CPU when running the same task types, despite Boinc saying it uses 0.687 CPU's. I think this 1 full CPU usage requirement is stipulated by the app (akin to swan). In this case it appeared that Boinc recognized this and didn't over-commit the CPU.

When I didn't use app_config and when I had the CPU set to be used 100% by Boinc apps it ran 7 CPU WU's and one GPUGrid WU on the GTX660Ti system (ATI not in use). I didn't perform a project reset; I just deleted the app_config file from the project directory, and when the new GPUGrid WU started it used a full CPU core, reported whatever fraction of the CPU Boinc thinks it uses (0.685), but Boinc didn't over-commit the CPU. So is using app_config the source of your problems? It seems to cause the CPU over-commitment, as when it's not used Boinc runs fewer apps. However, it's misreporting what it's using and what it had actually read. I had set cpu usage to 1, then removed the app_config, re-read the config files and Boinc pretended it was using 0.685 CPU's when in fact it was still using the 1 CPU setting. When the WU crashed and a new one ran it then said it was using 1 CPU. After a project reset, like you suggested, it then said 0.685 again, but ran 8 CPU WU's and one GPU WU.

Note that if I use both GPU's on my main system and try to use 100% CPU with 9 apps or more, it often crashes inside 5 min. My errors are different to yours because you have a different setup.
____________
FAQ's

HOW TO:
- Opt out of Beta Tests
- Ask for Help

Jacob Klein
Joined: 11 Oct 08
Posts: 1127
Credit: 1,901,927,545
RAC: 0
Message 29684 - Posted: 5 May 2013 | 0:58:07 UTC - in response to Message 29683.

When I don't use app_config and when I have the CPU set to be used 100% by Boinc apps I run 7 CPU WU's and one GPUGrid WU on the GTX660Ti system (ATI not in use).

I require proof in the form of screenshots. Please provide this proof, as it goes against everything I've learned about how BOINC functions.

Profile skgiven
Volunteer moderator
Volunteer tester
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Message 29686 - Posted: 5 May 2013 | 1:25:02 UTC - in response to Message 29684.
Last modified: 5 May 2013 | 2:37:35 UTC

Re-read my previous post, I edited it with some findings.

Screen grab below shows 7 POGS CPU WU's and one GPU WU running, with Task Manager showing 100% CPU usage.

At the time the WU properties showed the following,

However this misled me; it's either not correct or was subsequently overwritten. I think the app_config was still being applied and was setting the CPU usage to be 1, and Boinc was applying that wrt task scheduling, but just incorrectly displaying 0.685, which (along with the log file following a read config file) led me to believe app_config was no longer being applied. I think it was, because after that GPUGrid WU crashed (probably due to opening GPUZ) the next GPUGrid WU said it was using 1 CPU. Where did it get that from, and why did it start running 8 CPU WU's after resetting? Something isn't being updated, at least until Boinc is restarted, or the config file is read several times, or the project is reset? I don't think having to reset the project to un-apply app_config settings was part of the plan. So that part isn't working. I'm seeing other anomalies in the WU properties too. The POGS WU's say they have used 1 to 4 minutes of CPU time and yet have run for 32 min. Does that indicate that we can't go by any info being displayed, or is it POGS specific?

This WU suffered an error when I opened GPUZ,
I4R13-NATHAN_dhfr36_5-20-32-RND3729_0 4423712 4 May 2013 | 20:57:09 UTC 5 May 2013 | 1:24:20 UTC Error while computing 7,166.53 7,166.53 --- Long runs (8-12 hours on fastest card) v6.18 (cuda42)

Jacob Klein
Joined: 11 Oct 08
Posts: 1127
Credit: 1,901,927,545
RAC: 0
Message 29687 - Posted: 5 May 2013 | 2:49:12 UTC - in response to Message 29686.
Last modified: 5 May 2013 | 2:50:19 UTC

This is interesting info, but not exactly conclusive, as I believe there are certain bugs whenever you:
- Switch the CPU_Usage in an app_config, without restarting BOINC
- Remove the app_config, without restarting BOINC

So... Let me pose it to you this way. With BOINC set to use 100% CPUs, and GPUGrid CPU_Usage is set to a value that is less than 1.000 (either because it is its default value, or because you have an app_config specifying a CPU_Usage value less than 1).... on a restart of BOINC, has BOINC ever started less than 8 CPU tasks? From how I understand BOINC task scheduling, the answer should be "no, it always starts 8 CPU tasks." If that's not the case, please provide screenshots again, clearer if possible.

I appreciate your help; I do want to make sure the information I say about an app_config's CPU_Usage setting, and about BOINC task scheduling, is all accurate, and I still claim it is.

Thanks,
Jacob

Profile skgiven
Volunteer moderator
Volunteer tester
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Message 29688 - Posted: 5 May 2013 | 11:38:32 UTC - in response to Message 29687.
Last modified: 5 May 2013 | 11:46:43 UTC

This is interesting info, but not exactly conclusive, as I believe there are certain bugs whenever you:
- Switch the CPU_Usage in an app_config, without restarting BOINC
- Remove the app_config, without restarting BOINC

Yes, there are reporting anomalies and what is being reported from Boinc Manager isn't reliable under certain conditions (without a Boinc restart). Obviously if you change app_config settings, you want BM to correctly tell you what the settings are without having to restart Boinc or go to the app_config file and work them out. However, while this misreporting has been rather unhelpful, app_config is just a set of scheduler settings. Boinc is reading the file and applying what settings it can, but not to running WU's, just to new WU's, because scheduling settings are not retroactive; if a WU has already started, the previous settings have already been applied. To apply new settings a Boinc restart seems to be required. I would call this an undesirable feature. This might also depend on what slots WU's start/restart in?

In my case the number of slots Boinc has at its disposal has risen. At present I have 12, because at one stage I was trying to run 4 WU's across the GPU's.

Think about this,
If you don't use any app config files and try to run 1 WU on each of 2 GPU's boinc will launch 9 WU's (dropping one of the CPU WU's if it was running 8). Agreed?
If you stipulate in an app_config file to use 2 WU's each on each of 2 GPU's and that these WU's each use 1 CPU, Boinc will again launch 9 WU's (4 GPU WU's and 5 CPU WU's). Agreed?
When you stipulate in an app_config file to use 2 WU's each on each of 2 GPU's but that these WU's each only use 0.001 CPU's, Boinc will launch 12 WU's (4 GPU WU's and 8 CPU WU's). Do you agree?

In your initial setup (running 2 GPUGrid WU's) you were doing just that, running two WU’s on each of your GPU's. You said your second GPU ran (at least at one time) 2 WCG tasks, and these used a full CPU core each, and we know the Long GPUGrid WU's also use 1 core on your GTX660Ti. Now that’s Over-Saturation of the CPU, in a big way! I think this could be the source of your problems. If the CPU is that over-committed failures are almost inevitable. However what failures occur depends on everything that is running, and your setup. So in my opinion that would need to be replicated; there is no point in randomly running CPU and GPU projects. Again, we need to agree to crunch specific WU's. Even then having different setups might mean replicating the problem is impossible.

I also think there is no point testing with one GPU and using app_config; it's not going to run any more WU's and thus it's not going to test anything. You need to test with 2 GPU's and one of them would need to be running at least 2 WU's. Possibly running two apps on the one GPU would be similar enough, but I don't know.

If these issues only occurred when running HCC on your second GPU, then you will never replicate the problem, nor will anyone else.

So... Let me pose it to you this way. With BOINC set to use 100% CPUs, and GPUGrid CPU_Usage is set to a value that is less than 1.000 (either because it is its default value, or because you have an app_config specifying a CPU_Usage value less than 1).... on a restart of BOINC, has BOINC ever started less than 8 CPU tasks? From how I understand BOINC task scheduling, the answer should be "no, it always starts 8 CPU tasks." If that's not the case, please provide screenshots again, clearer if possible.
Under these conditions it's behaving the way you would expect.

If the issues are due to CPU over-saturation then if you used app_config to run 2 GPUGrid WU's on the 3GB card, this won't cause problems so long as the CPU isn't over-saturated. Thus I would recommend testing that (seeing as the original goal was to do more work), with cpu_usage of 1.0. However I would suggest limiting the CPU in BM to 99%. That way you are guaranteeing to not over-commit the CPU any other way, and thus testing if the cause is using app_config, but nothing to do with cpu saturation. If you get any repeat of the same problem then stop, and report it.

I think there is no point in anyone else trying this, unless they have a 3GB Kepler. I already tried and while I could run the short WU's (just without seeing any real advantage), the Long tasks caused a massive slow down in task turnover and really bad interface problems (GPU downclocking, system lockups, driver crashes).

Jacob Klein
Joined: 11 Oct 08
Posts: 1127
Credit: 1,901,927,545
RAC: 0
Message 29689 - Posted: 5 May 2013 | 12:46:03 UTC - in response to Message 29688.

This is interesting info, but not exactly conclusive, as I believe there are certain bugs whenever you:
- Switch the CPU_Usage in an app_config, without restarting BOINC
- Remove the app_config, without restarting BOINC

Yes, there are reporting anomalies and what is being reported from Boinc Manager isn't reliable under certain conditions (without a Boinc restart).

Yep. That's why I usually recommend a restart anytime app_config is changed.

If you don't use any app config files and try to run 1 WU on each of 2 GPU's boinc will launch 9 WU's (dropping one of the CPU WU's if it was running 8). Agreed?
Agreed.
Without WCG HCC GPU, and GPUGrid using the project default values, my PC will run:
- 1 GPUGrid.net task on the GTX 660 Ti (0.729 CPU + 1 NVIDIA)
- 1 GPUGrid.net task on the GTX 460 (0.729 CPU + 1 NVIDIA)
- 7 CPU tasks (because [0.729 + 0.729 + 7] < [8 + 1])

If you stipulate in an app_config file to use 2 WU's each on each of 2 GPU's and that these WU's each use 1 CPU, Boinc will again launch 9 WU's (4 GPU WU's and 5 CPU WU's). Agreed?
Disagree.
Without WCG HCC GPU, and GPUGrid setup as you suggest, my PC will run:
- 2 GPUGrid.net tasks on the GTX 660 Ti (each: 1 CPU + 0.500 NVIDIA)
- 2 GPUGrid.net tasks on the GTX 460 (each: 1 CPU + 0.500 NVIDIA)
- 4 CPU tasks (because [2 + 2 + 4] < [8 + 1], but [2 + 2 + 5] !< [8 + 1])

When you stipulate in an app_config file to use 2 WU's each on each of 2 GPU's but that these WU's each only use 0.001 CPU's, Boinc will launch 12 WU's (4 GPU WU's and 8 CPU WU's). Do you agree?
Agree.
Without WCG HCC GPU, and GPUGrid setup as you suggest, my PC will run:
- 2 GPUGrid.net tasks on the GTX 660 Ti (each: 0.001 CPU + 0.500 NVIDIA)
- 2 GPUGrid.net tasks on the GTX 460 (each: 0.001 CPU + 0.500 NVIDIA)
- 8 CPU tasks (because [0.002 + 0.002 + 8] < [8 + 1])
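
The three scenarios above can be worked through with a small Python sketch. It is only an illustration of the rule as stated in this thread, under two assumptions: each GPU runs as many tasks as fit into 1.0 "GPU" given `gpu_usage`, and CPU tasks are added while the claimed CPU fractions plus the CPU task count stay below ncpus + 1. The helper name `plan` is hypothetical; this is not BOINC's scheduler code.

```python
import math


def plan(num_gpus, gpu_usage, cpu_usage, ncpus=8):
    """Return (gpu_tasks, cpu_tasks) under the rule described in this thread."""
    # A gpu_usage of 0.5 lets two tasks share one card; 1 allows one per card.
    per_gpu = math.floor(1.0 / gpu_usage + 1e-9)
    gpu_tasks = num_gpus * per_gpu
    # Only the *claimed* cpu_usage counts, not the CPU a task really uses.
    reserved = gpu_tasks * cpu_usage
    cpu_tasks = 0
    while cpu_tasks < ncpus and reserved + (cpu_tasks + 1) < ncpus + 1:
        cpu_tasks += 1
    return gpu_tasks, cpu_tasks


print(plan(2, 1.0, 0.729))  # project defaults, one task per GPU
print(plan(2, 0.5, 1.0))    # two per GPU, each claiming a full CPU
print(plan(2, 0.5, 0.001))  # two per GPU, each claiming 0.001 CPU
```

The middle case is where the two counts in this exchange differ: four GPU tasks claiming 1 CPU each leave room for 4 CPU tasks under the strict inequality, because [4 + 5] is not less than [8 + 1].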

In your initial setup (running 2 GPUGrid WU's) you were doing just that, running two WU’s on each of your GPU's. You said your second GPU ran (at least at one time) 2 WCG tasks, and these used a full CPU core each, and we know the Long GPUGrid WU's also use 1 core on your GTX660Ti. Now that’s Over-Saturation of the CPU, in a big way! I think this could be the source of your problems. If the CPU is that over-committed failures are almost inevitable. However what failures occur depends on everything that is running, and your setup. So in my opinion that would need to be replicated; there is no point in randomly running CPU and GPU projects. Again, we need to agree to crunch specific WU's. Even then having different setups might mean replicating the problem is impossible.

We can only try to replicate the issue. I still claim that "CPU oversaturation" is not the cause of this issue. Right now I'm running 0.001 on both GPUs, 1 task per GPU, and 8 CPU tasks... and not having a problem [yet]. What'll really make you cringe is that I'm considering using the cc_config option for <ncpus> to make BOINC think I have 12 or 16 CPUs. If your theory is correct, I'd see more problems, but if my theory is correct, tasks will still complete but there will be increased resource contention. Do you think that seems like a good test for me to do?

If these issues only occurred when running HCC on your second GPU, then you will never replicate the problem, nor will anyone else.

Yeah, that bugs me, but that may be the case. I hope to recreate the problem soon, but I'm not sure I can.

So... Let me pose it to you this way. With BOINC set to use 100% CPUs, and GPUGrid CPU_Usage is set to a value that is less than 1.000 (either because it is its default value, or because you have an app_config specifying a CPU_Usage value less than 1).... on a restart of BOINC, has BOINC ever started less than 8 CPU tasks? From how I understand BOINC task scheduling, the answer should be "no, it always starts 8 CPU tasks." If that's not the case, please provide screenshots again, clearer if possible.
Under these conditions it's behaving the way you would expect.

That's what I expected. Believe it or not, I really do know what I'm talking about. I try not to guess.

If the issues are due to CPU over-saturation then if you used app_config to run 2 GPUGrid WU's on the 3GB card, this won't cause problems so long as the CPU isn't over-saturated. Thus I would recommend testing that (seeing as the original goal was to do more work), with cpu_usage of 1.0. However I would suggest limiting the CPU in BM to 99%. That way you are guaranteeing to not over-commit the CPU any other way, and thus testing if the cause is using app_config, but nothing to do with cpu saturation. If you get any repeat of the same problem then stop, and report it.

Thanks for the suggestions; you and I sure do think differently.

Right now, because WCG HCC GPU tasks are done, I need to re-test to see if I can reproduce the issue. And, if the re-test will run GPUGrid on both GPUs, then the re-test cannot involve 2-on-1-card, since the GTX 460 doesn't have enough memory. So, I've got WCG set for No-New-Tasks (just to make sure I don't get any rogue WCG HCC GPU tasks), and I want TONS of stress on the system. I'm using 0.001 CPU, with BOINC set to use 100% CPUs, such that right now I'm running 2 GPUGrid tasks alongside 8 CPU tasks, which is already over-committed. But, as I said, I want TONS of stress. So, I'm going to use <ncpus> and see if I can add additional stress, probably setting it to 12, to run 12 CPU tasks. My goal right now is to recreate the problem.

I think there is no point in anyone else trying this, unless they have a 3GB Kepler. I already tried and while I could run the short WU's (just without seeing any real advantage), the Long tasks caused a massive slow down in task turnover and really bad interface problems (GPU downclocking, system lockups, driver crashes).

Right now, I still claim that CPU oversaturation is not a constraint. In fact, I claim that the only real constraint on 2-at-a-time is memory usage, as documented in the Performance thread.
So, I would say that:
- if the user will be running a particular GPUGrid application on a card that is less than 2GB, then I'd recommend against 2-at-a-time for that GPUGrid application
- if the user will be running a particular GPUGrid application only on cards that are 2GB or higher, it would be worth testing 2-at-a-time on each GPU, for increased GPU Usage and a higher throughput.

I'm sure you disagree with my recommendations, but any future discussion on it should be in the other thread.

Kind regards,
Jacob

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Message 29696 - Posted: 5 May 2013 | 20:30:48 UTC

To me it looks like the settings in the app_config are applied to the current WU and remain in effect even if the app_config is removed and BOINC is restarted (no project reset). That's exactly what I did the other day, and I still got the "0.001 CPUs used" as you did. However, the next morning I was back to 0.7 without doing anything except letting my PC finish some work. For me this is all I need to know about this issue.

And incidentally I made this switch to check for myself if I had 8 Einsteins running along GPU-Grid even without the app_config. I had actually written a paragraph stating that it wasn't so (expecting to get 7+1), then added a disclaimer that this was from memory and that I couldn't test right now.. and then thought "oh what the heck, let's do the bloody test..". And, much to my surprise, found I was also getting 8+1 tasks in this case.

I suspect the issue Jacob found might have something to do with his setup (hardware, OS, drivers, BIOS and whatever) or the other CUDA tasks he was running along GPU-Grid. It could, for example, happen that some buffer in some library/CUDA-function being used by both projects wasn't properly (re-)initialized. Ideally such things should not happen, but we all know there are still quite some bugs being found in CUDA.

What I suggest as the next steps:
- I keep testing with the app_info, just to make sure
- SK doesn't test, at least not on his main rig (the instability you mentioned shouldn't happen due to more CPU tasks alone.. but I guess your rig wouldn't listen to me)
- Jacob tries to reproduce the error without the WCG HCC GPU units. Throw everything into the mix except 2 concurrent WUs, including more "logical" CPUs. I doubt this will make any difference.. but let's try. CPU throughput will degrade somewhat due to more context switches and cache contention.. but that shouldn't be a major issue unless you currently run the Pentathlon.

If neither of us can provoke the error within.. say a few weeks, we're left to conclude that it was the combination with WCG HCC, and maybe also the generational jump between Jacob's GPUs.

MrS
____________
Scanning for our furry friends since Jan 2002

Jacob Klein
Joined: 11 Oct 08
Posts: 1127
Credit: 1,901,927,545
RAC: 0
Message 29697 - Posted: 5 May 2013 | 20:40:44 UTC - in response to Message 29696.
Last modified: 5 May 2013 | 21:00:05 UTC

Yes, I like these suggestions.

For your testing, please make sure you are using a 0.001 app_config, and that you're running 8 CPU tasks alongside a GPUGrid task that says it is 0.001 CPU. I only ever saw the issue when running an app_config, so please make sure you are using one, at a setting of 0.001, to try to recreate the issue.

For my testing, I'm currently running the 0.001 app_config, so GPUGrid is on both cards, and I'm using <ncpus> so that I'm also running 12 CPU tasks. I bet skgiven is throwing up :) But, at this point, I'm simply trying to recreate the issue (without WCG HCC GPU tasks).

If we both do those tests for 3-4 weeks, and cannot recreate the issue, then I'd say it had something to do with either running WCG HCC GPU tasks in general, or running them 2-at-once on the GTX 460.

Thanks,
Jacob

Profile skgiven
Volunteer moderator
Volunteer tester
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Message 29698 - Posted: 5 May 2013 | 21:55:49 UTC - in response to Message 29697.
Last modified: 6 May 2013 | 8:04:08 UTC

For reference, I ran HCC WU's for about 4days on my GTX660Ti+HD5850 two tasks at a time on each GPU. I kept the CPU between 75% and ~95%. The results had no such problems. For a day I also ran 1 GPUGrid WU and 2 HCC tasks (only on the ATI), again with no replication of the problem.
Another feature I noticed today was Boinc suspending GPU WU's to run CPU WU's, while running Albert WU's - it suspended work on the ATI to run CPU work also from Albert. It did something similar when running several POEM WU's.

- I expect the admins aren't keen on people having more than two tasks per GPU:
The site shows my main system having 4 GPUGrid WU's in progress,
I25R19-NATHAN_dhfr36_5-19-32-RND8894_0 4429077 5 May 2013 | 23:09:30 UTC 10 May 2013 | 23:09:30 UTC In progress --- --- --- Long runs (8-12 hours on fastest card) v6.18 (cuda42)
I32R6-NATHAN_dhfr36_5-21-32-RND2187_1 4426158 5 May 2013 | 23:09:30 UTC 10 May 2013 | 23:09:30 UTC In progress --- --- --- Long runs (8-12 hours on fastest card) v6.18 (cuda42)
I24R7-NATHAN_dhfr36_5-21-32-RND8627_0 4426423 5 May 2013 | 23:09:30 UTC 10 May 2013 | 23:09:30 UTC In progress --- --- --- Long runs (8-12 hours on fastest card) v6.18 (cuda42)
I50R8-NATHAN_dhfr36_5-18-32-RND4493_0 4426288 5 May 2013 | 1:24:20 UTC 10 May 2013 | 1:24:20 UTC In progress --- --- --- Long runs (8-12 hours on fastest card) v6.18 (cuda42)

However, Boinc shows only 3, and the logs showed Boinc asked for 3,
06/05/2013 00:10:43 | GPUGRID | Scheduler request completed: got 3 new tasks

Boinc appears to have misplaced the I50R8-NATHAN_dhfr36_5-18-32-RND4493_0 WorkUnit, but it was lost following a project reset. At GPUGrid running WU's are not resent following a project reset (though they used to be with the previous server).

(presently using app_config with <gpu_usage>1</gpu_usage> <cpu_usage>0.001</cpu_usage>. I have one NVidia GPU, 1 GPUGrid WU in progress and two in the queue, and had the cache up to 1 day).

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Message 29707 - Posted: 6 May 2013 | 20:39:18 UTC - in response to Message 29697.

For your testing, please make sure you are using a 0.001 app_config, and that you're running 8 CPU tasks alongside a GPUGrid task that says it is 0.001 CPU. I only ever saw the issue when running an app_config, so please make sure you are using one, at a setting of 0.001, to try to recreate the issue.

Check!

MrS
____________
Scanning for our furry friends since Jan 2002

Jacob Klein
Joined: 11 Oct 08
Posts: 1127
Credit: 1,901,927,545
RAC: 0
Message 29776 - Posted: 9 May 2013 | 12:36:32 UTC - in response to Message 29707.
Last modified: 9 May 2013 | 12:42:42 UTC

Well, I finally had 2 more Nathan tasks exhibit the problem. And the logs confirm that no World Community Grid GPU tasks were running anytime recently. Thus, I'm able to conclude that World Community Grid is not involved.

The errors happened while running an app_config that had, for all GPUGrid apps, the following settings:
<max_concurrent>9999</max_concurrent>
<gpu_usage>1</gpu_usage>
<cpu_usage>0.001</cpu_usage>
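
For readers who want to set up the same test, the following Python sketch generates an app_config.xml carrying those three settings. The nesting (`<app>` entries with `<gpu_versions>` inside `<app_config>`) follows my understanding of the BOINC client configuration format and is not quoted from this post, and the app name `acemdlong` is an assumption used only for illustration; substitute the app names your own client reports.

```python
import xml.etree.ElementTree as ET


def build_app_config(app_names, max_concurrent=9999, gpu_usage=1, cpu_usage=0.001):
    """Build app_config.xml text with one <app> entry per name.

    The element layout is an assumption based on the BOINC client
    configuration format; the default values are the ones quoted in the post.
    """
    root = ET.Element("app_config")
    for name in app_names:
        app = ET.SubElement(root, "app")
        ET.SubElement(app, "name").text = name
        ET.SubElement(app, "max_concurrent").text = str(max_concurrent)
        gpu = ET.SubElement(app, "gpu_versions")
        ET.SubElement(gpu, "gpu_usage").text = str(gpu_usage)
        ET.SubElement(gpu, "cpu_usage").text = str(cpu_usage)
    return ET.tostring(root, encoding="unicode")


# "acemdlong" is a hypothetical app name chosen for this example.
xml_text = build_app_config(["acemdlong"])
print(xml_text)
```

The printed text would be saved as app_config.xml in the GPUGrid project folder; as discussed earlier in the thread, restarting BOINC after changing the file avoids the stale values shown in the task properties.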

The 2 tasks that completed immediately and granted credit, are:

http://www.gpugrid.net/result.php?resultid=6846377

Name I87R5-NATHAN_dhfr36_6-4-32-RND9979_0
Workunit 4437702
Created 8 May 2013 | 22:03:08 UTC
Sent 9 May 2013 | 8:16:35 UTC
Received 9 May 2013 | 9:15:33 UTC
Server state Over
Outcome Success
Client state Done
Exit status 0 (0x0)
Computer ID 150631
Report deadline 14 May 2013 | 8:16:35 UTC
Run time 3.24
CPU time 0.90
Validate state Valid
Credit 70,800.00
Application version Long runs (8-12 hours on fastest card) v6.18 (cuda42)
Stderr output
<core_client_version>7.0.64</core_client_version>


http://www.gpugrid.net/result.php?resultid=6847227
Name I94R3-NATHAN_dhfr36_6-10-32-RND1771_1
Workunit 4423738
Created 9 May 2013 | 8:06:18 UTC
Sent 9 May 2013 | 9:16:14 UTC
Received 9 May 2013 | 10:01:48 UTC
Server state Over
Outcome Success
Client state Done
Exit status 0 (0x0)
Computer ID 150631
Report deadline 14 May 2013 | 9:16:14 UTC
Run time 3.09
CPU time 0.97
Validate state Valid
Credit 70,800.00
Application version Long runs (8-12 hours on fastest card) v6.18 (cuda42)
Stderr output
<core_client_version>7.0.64</core_client_version>


I also have snippets of the log files that show the download of each task, and the immediate completion of each task. That log file is here:

09-May-2013 04:17:47 [GPUGRID] Sending scheduler request: To fetch work.
09-May-2013 04:17:47 [GPUGRID] Requesting new tasks for NVIDIA
09-May-2013 04:17:50 [---] [unparsed_xml] FILE_REF::parse(): unrecognized: 'rboinc/'
09-May-2013 04:17:50 [---] [unparsed_xml] FILE_REF::parse(): unrecognized: 'rboinc/'
09-May-2013 04:17:50 [---] [unparsed_xml] FILE_REF::parse(): unrecognized: 'rboinc/'
09-May-2013 04:17:50 [---] [unparsed_xml] FILE_REF::parse(): unrecognized: 'rboinc/'
09-May-2013 04:17:50 [---] [unparsed_xml] FILE_REF::parse(): unrecognized: 'rboinc/'
09-May-2013 04:17:50 [---] [unparsed_xml] FILE_REF::parse(): unrecognized: 'rboinc/'
09-May-2013 04:17:50 [---] [unparsed_xml] FILE_REF::parse(): unrecognized: 'rboinc/'
09-May-2013 04:17:50 [---] [unparsed_xml] FILE_REF::parse(): unrecognized: 'rboinc/'
09-May-2013 04:17:50 [---] [unparsed_xml] FILE_REF::parse(): unrecognized: 'rboinc/'
09-May-2013 04:17:50 [---] [unparsed_xml] FILE_REF::parse(): unrecognized: 'rboinc/'
09-May-2013 04:17:50 [---] [unparsed_xml] FILE_REF::parse(): unrecognized: 'rboinc/'
09-May-2013 04:17:50 [---] [unparsed_xml] FILE_REF::parse(): unrecognized: 'rboinc/'
09-May-2013 04:17:50 [---] [unparsed_xml] FILE_INFO::parse(): unrecognized: rboinc/
09-May-2013 04:17:50 [---] [unparsed_xml] FILE_INFO::parse(): unrecognized: rboinc/
09-May-2013 04:17:50 [---] [unparsed_xml] FILE_INFO::parse(): unrecognized: rboinc/
09-May-2013 04:17:50 [---] [unparsed_xml] FILE_INFO::parse(): unrecognized: rboinc/
09-May-2013 04:17:50 [---] [unparsed_xml] FILE_INFO::parse(): unrecognized: rboinc/
09-May-2013 04:17:50 [---] [unparsed_xml] FILE_INFO::parse(): unrecognized: rboinc/
09-May-2013 04:17:50 [---] [unparsed_xml] FILE_INFO::parse(): unrecognized: rboinc/
09-May-2013 04:17:50 [---] [unparsed_xml] FILE_INFO::parse(): unrecognized: rboinc/
09-May-2013 04:17:50 [---] [unparsed_xml] FILE_INFO::parse(): unrecognized: rboinc/
09-May-2013 04:17:50 [---] [unparsed_xml] FILE_INFO::parse(): unrecognized: rboinc/
09-May-2013 04:17:50 [---] [unparsed_xml] FILE_INFO::parse(): unrecognized: rboinc/
09-May-2013 04:17:50 [---] [unparsed_xml] FILE_INFO::parse(): unrecognized: rboinc/
09-May-2013 04:17:50 [GPUGRID] Scheduler request completed: got 1 new tasks
09-May-2013 04:17:50 [GPUGRID] app acemdbeta not found in app_config.xml
09-May-2013 04:17:50 [GPUGRID] app acemd2 not found in app_config.xml
09-May-2013 04:17:50 [GPUGRID] app acemdshort not found in app_config.xml
09-May-2013 04:17:52 [GPUGRID] Started download of I87R5-NATHAN_dhfr36_6-4-LICENSE
09-May-2013 04:17:52 [GPUGRID] Started download of I87R5-NATHAN_dhfr36_6-4-COPYRIGHT
09-May-2013 04:17:52 [GPUGRID] Started download of I87R5-NATHAN_dhfr36_6-4-I87R5-NATHAN_dhfr36_6-3-32-RND9979_1
09-May-2013 04:17:52 [GPUGRID] Started download of I87R5-NATHAN_dhfr36_6-4-I87R5-NATHAN_dhfr36_6-3-32-RND9979_2
09-May-2013 04:17:55 [GPUGRID] Finished download of I87R5-NATHAN_dhfr36_6-4-LICENSE
09-May-2013 04:17:55 [GPUGRID] Finished download of I87R5-NATHAN_dhfr36_6-4-COPYRIGHT
09-May-2013 04:17:55 [GPUGRID] Started download of I87R5-NATHAN_dhfr36_6-4-I87R5-NATHAN_dhfr36_6-3-32-RND9979_3
09-May-2013 04:17:55 [GPUGRID] Started download of I87R5-NATHAN_dhfr36_6-4-pdb_file
09-May-2013 04:17:59 [GPUGRID] Finished download of I87R5-NATHAN_dhfr36_6-4-I87R5-NATHAN_dhfr36_6-3-32-RND9979_2
09-May-2013 04:17:59 [GPUGRID] Started download of I87R5-NATHAN_dhfr36_6-4-psf_file
09-May-2013 04:18:03 [GPUGRID] Finished download of I87R5-NATHAN_dhfr36_6-4-I87R5-NATHAN_dhfr36_6-3-32-RND9979_1
09-May-2013 04:18:03 [GPUGRID] Finished download of I87R5-NATHAN_dhfr36_6-4-pdb_file
09-May-2013 04:18:03 [GPUGRID] Started download of I87R5-NATHAN_dhfr36_6-4-par_file
09-May-2013 04:18:03 [GPUGRID] Started download of I87R5-NATHAN_dhfr36_6-4-conf_file_enc
09-May-2013 04:18:04 [GPUGRID] Finished download of I87R5-NATHAN_dhfr36_6-4-I87R5-NATHAN_dhfr36_6-3-32-RND9979_3
09-May-2013 04:18:04 [GPUGRID] Finished download of I87R5-NATHAN_dhfr36_6-4-conf_file_enc
09-May-2013 04:18:04 [GPUGRID] Started download of I87R5-NATHAN_dhfr36_6-4-metainp_file
09-May-2013 04:18:04 [GPUGRID] Started download of I87R5-NATHAN_dhfr36_6-4-I87R5-NATHAN_dhfr36_6-3-32-RND9979_7
09-May-2013 04:18:05 [GPUGRID] Finished download of I87R5-NATHAN_dhfr36_6-4-metainp_file
09-May-2013 04:18:05 [GPUGRID] Finished download of I87R5-NATHAN_dhfr36_6-4-I87R5-NATHAN_dhfr36_6-3-32-RND9979_7
09-May-2013 04:18:05 [GPUGRID] Started download of I87R5-NATHAN_dhfr36_6-4-I87R5-NATHAN_dhfr36_6-3-32-RND9979_10
09-May-2013 04:18:06 [GPUGRID] Finished download of I87R5-NATHAN_dhfr36_6-4-I87R5-NATHAN_dhfr36_6-3-32-RND9979_10
09-May-2013 04:18:07 [GPUGRID] Finished download of I87R5-NATHAN_dhfr36_6-4-par_file
09-May-2013 04:18:16 [GPUGRID] Finished download of I87R5-NATHAN_dhfr36_6-4-psf_file


09-May-2013 05:11:54 [GPUGRID] Computation for task 041px31x3-NOELIA_klebe_run-2-3-RND3368_0 finished
09-May-2013 05:12:06 [GPUGRID] Starting task I87R5-NATHAN_dhfr36_6-4-32-RND9979_0 using acemdlong version 618 (cuda42) in slot 0
09-May-2013 05:12:07 [GPUGRID] Started upload of 041px31x3-NOELIA_klebe_run-2-3-RND3368_0_0
09-May-2013 05:12:07 [GPUGRID] Started upload of 041px31x3-NOELIA_klebe_run-2-3-RND3368_0_1
09-May-2013 05:12:07 [GPUGRID] Started upload of 041px31x3-NOELIA_klebe_run-2-3-RND3368_0_2
09-May-2013 05:12:07 [GPUGRID] Started upload of 041px31x3-NOELIA_klebe_run-2-3-RND3368_0_3
09-May-2013 05:12:10 [GPUGRID] Computation for task I87R5-NATHAN_dhfr36_6-4-32-RND9979_0 finished
09-May-2013 05:12:21 [SETI@home] Starting task ap_08oc11aa_B0_P0_00046_20130506_27293.wu_1 using astropulse_v6 version 604 (cuda_opencl_100) in slot 0
09-May-2013 05:12:21 [RNA World] Sending scheduler request: To fetch work.
09-May-2013 05:12:21 [RNA World] Requesting new tasks for NVIDIA
09-May-2013 05:12:24 [RNA World] Scheduler request completed: got 0 new tasks
09-May-2013 05:12:24 [RNA World] No work sent
09-May-2013 05:12:24 [RNA World] No work is available for cmcalibrate 1.0.2
09-May-2013 05:12:24 [RNA World] No work is available for cmalign 1.0.2
09-May-2013 05:12:24 [RNA World] No work is available for cmbuild 1.0.2
09-May-2013 05:12:24 [RNA World] No work available for the applications you have selected. Please check your settings on the web site.
09-May-2013 05:12:29 [ralph@home] Sending scheduler request: To fetch work.
09-May-2013 05:12:29 [ralph@home] Requesting new tasks for NVIDIA
09-May-2013 05:12:33 [ralph@home] Scheduler request completed: got 0 new tasks
09-May-2013 05:12:36 [GPUGRID] Finished upload of 041px31x3-NOELIA_klebe_run-2-3-RND3368_0_0
09-May-2013 05:12:36 [GPUGRID] Started upload of 041px31x3-NOELIA_klebe_run-2-3-RND3368_0_7
09-May-2013 05:12:37 [GPUGRID] Finished upload of 041px31x3-NOELIA_klebe_run-2-3-RND3368_0_7
09-May-2013 05:12:37 [GPUGRID] Started upload of 041px31x3-NOELIA_klebe_run-2-3-RND3368_0_9
09-May-2013 05:12:38 [Docking] Sending scheduler request: To fetch work.
09-May-2013 05:12:38 [Docking] Reporting 1 completed tasks
09-May-2013 05:12:38 [Docking] Requesting new tasks for NVIDIA
09-May-2013 05:12:48 [GPUGRID] Finished upload of 041px31x3-NOELIA_klebe_run-2-3-RND3368_0_3
09-May-2013 05:12:48 [GPUGRID] Started upload of 041px31x3-NOELIA_klebe_run-2-3-RND3368_0_10
09-May-2013 05:12:49 [GPUGRID] Finished upload of 041px31x3-NOELIA_klebe_run-2-3-RND3368_0_10
09-May-2013 05:12:49 [GPUGRID] Started upload of 041px31x3-NOELIA_klebe_run-2-3-RND3368_0_11
09-May-2013 05:12:50 [GPUGRID] Finished upload of 041px31x3-NOELIA_klebe_run-2-3-RND3368_0_11
09-May-2013 05:12:50 [GPUGRID] Started upload of I87R5-NATHAN_dhfr36_6-4-32-RND9979_0_0
09-May-2013 05:12:51 [GPUGRID] Finished upload of I87R5-NATHAN_dhfr36_6-4-32-RND9979_0_0
09-May-2013 05:12:51 [GPUGRID] Started upload of I87R5-NATHAN_dhfr36_6-4-32-RND9979_0_1
09-May-2013 05:13:51 [GPUGRID] Finished upload of I87R5-NATHAN_dhfr36_6-4-32-RND9979_0_1
09-May-2013 05:13:51 [GPUGRID] Started upload of I87R5-NATHAN_dhfr36_6-4-32-RND9979_0_2
09-May-2013 05:14:01 [GPUGRID] Finished upload of 041px31x3-NOELIA_klebe_run-2-3-RND3368_0_1
09-May-2013 05:14:01 [GPUGRID] Finished upload of 041px31x3-NOELIA_klebe_run-2-3-RND3368_0_2
09-May-2013 05:14:01 [GPUGRID] Started upload of I87R5-NATHAN_dhfr36_6-4-32-RND9979_0_3
09-May-2013 05:14:01 [GPUGRID] Started upload of I87R5-NATHAN_dhfr36_6-4-32-RND9979_0_7
09-May-2013 05:14:02 [GPUGRID] Finished upload of I87R5-NATHAN_dhfr36_6-4-32-RND9979_0_7
09-May-2013 05:14:02 [GPUGRID] Started upload of I87R5-NATHAN_dhfr36_6-4-32-RND9979_0_10
09-May-2013 05:14:03 [GPUGRID] Finished upload of I87R5-NATHAN_dhfr36_6-4-32-RND9979_0_10
09-May-2013 05:14:22 [GPUGRID] Finished upload of I87R5-NATHAN_dhfr36_6-4-32-RND9979_0_3
09-May-2013 05:14:31 [GPUGRID] Finished upload of I87R5-NATHAN_dhfr36_6-4-32-RND9979_0_2
09-May-2013 05:15:57 [boincsimap] Computation for task 20130506.263265_0 finished
09-May-2013 05:15:57 [boincsimap] Starting task 20130506.263300_1 using simap version 512 in slot 2
09-May-2013 05:15:59 [boincsimap] Started upload of 20130506.263265_0_0
09-May-2013 05:16:22 [boincsimap] Finished upload of 20130506.263265_0_0
09-May-2013 05:16:39 [Docking] Scheduler request failed: HTTP internal server error
09-May-2013 05:16:39 [Docking] Sending scheduler request: To fetch work.
09-May-2013 05:16:39 [Docking] Reporting 1 completed tasks
09-May-2013 05:16:39 [Docking] Requesting new tasks for NVIDIA
09-May-2013 05:16:41 [Docking] Scheduler request completed: got 0 new tasks
09-May-2013 05:16:41 [Docking] Server can't open database
09-May-2013 05:16:47 [GPUGRID] Sending scheduler request: To report completed tasks.
09-May-2013 05:16:47 [GPUGRID] Reporting 1 completed tasks
09-May-2013 05:16:47 [GPUGRID] Not requesting tasks: project is not highest priority
09-May-2013 05:16:49 [GPUGRID] Scheduler request completed

09-May-2013 05:17:28 [GPUGRID] Sending scheduler request: To fetch work.
09-May-2013 05:17:28 [GPUGRID] Requesting new tasks for NVIDIA
09-May-2013 05:17:30 [---] [unparsed_xml] FILE_REF::parse(): unrecognized: 'rboinc/'
09-May-2013 05:17:30 [---] [unparsed_xml] FILE_REF::parse(): unrecognized: 'rboinc/'
09-May-2013 05:17:30 [---] [unparsed_xml] FILE_REF::parse(): unrecognized: 'rboinc/'
09-May-2013 05:17:30 [---] [unparsed_xml] FILE_REF::parse(): unrecognized: 'rboinc/'
09-May-2013 05:17:30 [---] [unparsed_xml] FILE_REF::parse(): unrecognized: 'rboinc/'
09-May-2013 05:17:30 [---] [unparsed_xml] FILE_REF::parse(): unrecognized: 'rboinc/'
09-May-2013 05:17:30 [---] [unparsed_xml] FILE_REF::parse(): unrecognized: 'rboinc/'
09-May-2013 05:17:30 [---] [unparsed_xml] FILE_REF::parse(): unrecognized: 'rboinc/'
09-May-2013 05:17:30 [---] [unparsed_xml] FILE_REF::parse(): unrecognized: 'rboinc/'
09-May-2013 05:17:30 [---] [unparsed_xml] FILE_REF::parse(): unrecognized: 'rboinc/'
09-May-2013 05:17:30 [---] [unparsed_xml] FILE_REF::parse(): unrecognized: 'rboinc/'
09-May-2013 05:17:30 [---] [unparsed_xml] FILE_REF::parse(): unrecognized: 'rboinc/'
09-May-2013 05:17:30 [---] [unparsed_xml] FILE_INFO::parse(): unrecognized: rboinc/
09-May-2013 05:17:30 [---] [unparsed_xml] FILE_INFO::parse(): unrecognized: rboinc/
09-May-2013 05:17:30 [---] [unparsed_xml] FILE_INFO::parse(): unrecognized: rboinc/
09-May-2013 05:17:30 [---] [unparsed_xml] FILE_INFO::parse(): unrecognized: rboinc/
09-May-2013 05:17:30 [---] [unparsed_xml] FILE_INFO::parse(): unrecognized: rboinc/
09-May-2013 05:17:30 [---] [unparsed_xml] FILE_INFO::parse(): unrecognized: rboinc/
09-May-2013 05:17:30 [---] [unparsed_xml] FILE_INFO::parse(): unrecognized: rboinc/
09-May-2013 05:17:30 [---] [unparsed_xml] FILE_INFO::parse(): unrecognized: rboinc/
09-May-2013 05:17:30 [---] [unparsed_xml] FILE_INFO::parse(): unrecognized: rboinc/
09-May-2013 05:17:30 [---] [unparsed_xml] FILE_INFO::parse(): unrecognized: rboinc/
09-May-2013 05:17:30 [---] [unparsed_xml] FILE_INFO::parse(): unrecognized: rboinc/
09-May-2013 05:17:30 [---] [unparsed_xml] FILE_INFO::parse(): unrecognized: rboinc/
09-May-2013 05:17:30 [GPUGRID] Scheduler request completed: got 1 new tasks
09-May-2013 05:17:30 [GPUGRID] app acemdbeta not found in app_config.xml
09-May-2013 05:17:30 [GPUGRID] app acemd2 not found in app_config.xml
09-May-2013 05:17:30 [GPUGRID] app acemdshort not found in app_config.xml
09-May-2013 05:17:32 [GPUGRID] Started download of I94R3-NATHAN_dhfr36_6-10-LICENSE
09-May-2013 05:17:32 [GPUGRID] Started download of I94R3-NATHAN_dhfr36_6-10-COPYRIGHT
09-May-2013 05:17:32 [GPUGRID] Started download of I94R3-NATHAN_dhfr36_6-10-I94R3-NATHAN_dhfr36_6-9-32-RND1771_1
09-May-2013 05:17:32 [GPUGRID] Started download of I94R3-NATHAN_dhfr36_6-10-I94R3-NATHAN_dhfr36_6-9-32-RND1771_2
09-May-2013 05:17:33 [GPUGRID] Finished download of I94R3-NATHAN_dhfr36_6-10-LICENSE
09-May-2013 05:17:33 [GPUGRID] Finished download of I94R3-NATHAN_dhfr36_6-10-COPYRIGHT
09-May-2013 05:17:33 [GPUGRID] Started download of I94R3-NATHAN_dhfr36_6-10-I94R3-NATHAN_dhfr36_6-9-32-RND1771_3
09-May-2013 05:17:33 [GPUGRID] Started download of I94R3-NATHAN_dhfr36_6-10-pdb_file
09-May-2013 05:17:38 [GPUGRID] Finished download of I94R3-NATHAN_dhfr36_6-10-I94R3-NATHAN_dhfr36_6-9-32-RND1771_1
09-May-2013 05:17:38 [GPUGRID] Started download of I94R3-NATHAN_dhfr36_6-10-psf_file
09-May-2013 05:17:44 [GPUGRID] Finished download of I94R3-NATHAN_dhfr36_6-10-I94R3-NATHAN_dhfr36_6-9-32-RND1771_3
09-May-2013 05:17:44 [GPUGRID] Started download of I94R3-NATHAN_dhfr36_6-10-par_file
09-May-2013 05:17:50 [GPUGRID] Finished download of I94R3-NATHAN_dhfr36_6-10-psf_file
09-May-2013 05:17:50 [GPUGRID] Started download of I94R3-NATHAN_dhfr36_6-10-conf_file_enc
09-May-2013 05:17:51 [GPUGRID] Finished download of I94R3-NATHAN_dhfr36_6-10-I94R3-NATHAN_dhfr36_6-9-32-RND1771_2
09-May-2013 05:17:51 [GPUGRID] Finished download of I94R3-NATHAN_dhfr36_6-10-conf_file_enc
09-May-2013 05:17:51 [GPUGRID] Started download of I94R3-NATHAN_dhfr36_6-10-metainp_file
09-May-2013 05:17:51 [GPUGRID] Started download of I94R3-NATHAN_dhfr36_6-10-I94R3-NATHAN_dhfr36_6-9-32-RND1771_7
09-May-2013 05:17:52 [GPUGRID] Finished download of I94R3-NATHAN_dhfr36_6-10-metainp_file
09-May-2013 05:17:52 [GPUGRID] Finished download of I94R3-NATHAN_dhfr36_6-10-I94R3-NATHAN_dhfr36_6-9-32-RND1771_7
09-May-2013 05:17:52 [GPUGRID] Started download of I94R3-NATHAN_dhfr36_6-10-I94R3-NATHAN_dhfr36_6-9-32-RND1771_10
09-May-2013 05:17:53 [GPUGRID] Finished download of I94R3-NATHAN_dhfr36_6-10-I94R3-NATHAN_dhfr36_6-9-32-RND1771_10
09-May-2013 05:17:55 [GPUGRID] Finished download of I94R3-NATHAN_dhfr36_6-10-par_file
09-May-2013 05:18:02 [GPUGRID] Finished download of I94R3-NATHAN_dhfr36_6-10-pdb_file
09-May-2013 05:24:36 [GPUGRID] Finished upload of 041px31x3-NOELIA_klebe_run-2-3-RND3368_0_9
09-May-2013 05:24:36 [GPUGRID] Sending scheduler request: To report completed tasks.
09-May-2013 05:24:36 [GPUGRID] Reporting 1 completed tasks
09-May-2013 05:24:36 [GPUGRID] Not requesting tasks: project is not highest priority
09-May-2013 05:24:39 [GPUGRID] Scheduler request completed

09-May-2013 06:02:05 [SETI@home] Computation for task ap_08oc11aa_B0_P0_00046_20130506_27293.wu_1 finished
09-May-2013 06:02:05 [GPUGRID] Starting task I94R3-NATHAN_dhfr36_6-10-32-RND1771_1 using acemdlong version 618 (cuda42) in slot 0
09-May-2013 06:02:08 [SETI@home] Started upload of ap_08oc11aa_B0_P0_00046_20130506_27293.wu_1_0
09-May-2013 06:02:09 [RNA World] Sending scheduler request: To fetch work.
09-May-2013 06:02:09 [RNA World] Requesting new tasks for NVIDIA
09-May-2013 06:02:10 [GPUGRID] Computation for task I94R3-NATHAN_dhfr36_6-10-32-RND1771_1 finished
09-May-2013 06:02:23 [GPUGRID] Started upload of I94R3-NATHAN_dhfr36_6-10-32-RND1771_1_0
09-May-2013 06:02:23 [GPUGRID] Started upload of I94R3-NATHAN_dhfr36_6-10-32-RND1771_1_1
09-May-2013 06:02:23 [GPUGRID] Started upload of I94R3-NATHAN_dhfr36_6-10-32-RND1771_1_2
09-May-2013 06:02:23 [GPUGRID] Started upload of I94R3-NATHAN_dhfr36_6-10-32-RND1771_1_3
09-May-2013 06:02:23 [RNA World] Scheduler request completed: got 0 new tasks
09-May-2013 06:02:23 [RNA World] No work sent
09-May-2013 06:02:23 [RNA World] No work is available for cmcalibrate 1.0.2
09-May-2013 06:02:23 [RNA World] No work is available for cmalign 1.0.2
09-May-2013 06:02:23 [RNA World] No work is available for cmbuild 1.0.2
09-May-2013 06:02:23 [RNA World] No work available for the applications you have selected. Please check your settings on the web site.
09-May-2013 06:02:25 [GPUGRID] Finished upload of I94R3-NATHAN_dhfr36_6-10-32-RND1771_1_0
09-May-2013 06:02:25 [GPUGRID] Started upload of I94R3-NATHAN_dhfr36_6-10-32-RND1771_1_7
09-May-2013 06:02:27 [SETI@home] Finished upload of ap_08oc11aa_B0_P0_00046_20130506_27293.wu_1_0
09-May-2013 06:02:27 [GPUGRID] Finished upload of I94R3-NATHAN_dhfr36_6-10-32-RND1771_1_7
09-May-2013 06:02:27 [GPUGRID] Started upload of I94R3-NATHAN_dhfr36_6-10-32-RND1771_1_10
09-May-2013 06:02:28 [ralph@home] Sending scheduler request: To fetch work.
09-May-2013 06:02:28 [ralph@home] Requesting new tasks for NVIDIA
09-May-2013 06:02:29 [GPUGRID] Finished upload of I94R3-NATHAN_dhfr36_6-10-32-RND1771_1_10
09-May-2013 06:02:31 [ralph@home] Scheduler request failed: Error 403
09-May-2013 06:02:36 [World Community Grid] Sending scheduler request: To fetch work.
09-May-2013 06:02:36 [World Community Grid] Requesting new tasks for NVIDIA
09-May-2013 06:02:39 [World Community Grid] Scheduler request completed: got 0 new tasks
09-May-2013 06:02:39 [World Community Grid] No tasks sent
09-May-2013 06:02:39 [World Community Grid] No tasks are available for Say No to Schistosoma
09-May-2013 06:02:39 [World Community Grid] No tasks are available for GO Fight Against Malaria
09-May-2013 06:02:39 [World Community Grid] No tasks are available for Drug Search for Leishmaniasis
09-May-2013 06:02:39 [World Community Grid] No tasks are available for The Clean Energy Project - Phase 2
09-May-2013 06:02:39 [World Community Grid] No tasks are available for Help Conquer Cancer
09-May-2013 06:02:39 [World Community Grid] No tasks are available for Human Proteome Folding - Phase 2
09-May-2013 06:02:39 [World Community Grid] No tasks are available for FightAIDS@Home
09-May-2013 06:02:44 [Cosmology@Home] Sending scheduler request: To fetch work.
09-May-2013 06:02:44 [Cosmology@Home] Requesting new tasks for NVIDIA
09-May-2013 06:02:47 [GPUGRID] Finished upload of I94R3-NATHAN_dhfr36_6-10-32-RND1771_1_3
09-May-2013 06:02:49 [Cosmology@Home] Scheduler request completed: got 0 new tasks
09-May-2013 06:02:49 [Cosmology@Home] No work sent
09-May-2013 06:02:54 [rosetta@home] Sending scheduler request: To fetch work.
09-May-2013 06:02:54 [rosetta@home] Requesting new tasks for NVIDIA
09-May-2013 06:02:56 [GPUGRID] Finished upload of I94R3-NATHAN_dhfr36_6-10-32-RND1771_1_1
09-May-2013 06:02:56 [GPUGRID] Finished upload of I94R3-NATHAN_dhfr36_6-10-32-RND1771_1_2
09-May-2013 06:02:56 [rosetta@home] Scheduler request completed: got 0 new tasks
09-May-2013 06:03:01 [GPUGRID] Sending scheduler request: To report completed tasks.
09-May-2013 06:03:01 [GPUGRID] Reporting 1 completed tasks
09-May-2013 06:03:01 [GPUGRID] Not requesting tasks: project is not highest priority
09-May-2013 06:03:04 [GPUGRID] Scheduler request completed


I can't help but notice that both of the Nathan tasks that exhibited the problem were started in slot 0. I'm curious whether slot 0 is relevant to the problem at all.
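For anyone who wants to check their own logs for the same pattern, here is a minimal sketch that pairs each "Starting task" line with its "Computation for task ... finished" line and lists the tasks that finished within a few seconds, along with the slot they ran in. The function name `short_runs` and the 60-second cutoff are my own assumptions, not anything from BOINC, and the regular expressions only cover the two message formats shown in the log above.

```python
import re
from datetime import datetime

# Timestamp format used in the log above, e.g. "09-May-2013 05:12:06".
# Note: %b expects English month abbreviations (default C locale).
STAMP = "%d-%b-%Y %H:%M:%S"

START = re.compile(
    r"^(\d{2}-\w{3}-\d{4} \d{2}:\d{2}:\d{2}) \[(.+?)\] "
    r"Starting task (\S+) using .* in slot (\d+)$"
)
FINISH = re.compile(
    r"^(\d{2}-\w{3}-\d{4} \d{2}:\d{2}:\d{2}) \[(.+?)\] "
    r"Computation for task (\S+) finished$"
)


def short_runs(lines, max_seconds=60):
    """Return (task, slot, seconds) for tasks that finished within max_seconds.

    max_seconds is a hypothetical cutoff; pick whatever suits your tasks.
    """
    started = {}  # task name -> (start time, slot)
    found = []
    for line in lines:
        line = line.strip()
        m = START.match(line)
        if m:
            started[m.group(3)] = (datetime.strptime(m.group(1), STAMP), int(m.group(4)))
            continue
        m = FINISH.match(line)
        if m and m.group(3) in started:
            begun, slot = started.pop(m.group(3))
            elapsed = (datetime.strptime(m.group(1), STAMP) - begun).total_seconds()
            if elapsed <= max_seconds:
                found.append((m.group(3), slot, elapsed))
    return found


# Lines copied from the log above.
sample = [
    "09-May-2013 05:12:06 [GPUGRID] Starting task I87R5-NATHAN_dhfr36_6-4-32-RND9979_0 using acemdlong version 618 (cuda42) in slot 0",
    "09-May-2013 05:12:10 [GPUGRID] Computation for task I87R5-NATHAN_dhfr36_6-4-32-RND9979_0 finished",
    "09-May-2013 05:12:21 [SETI@home] Starting task ap_08oc11aa_B0_P0_00046_20130506_27293.wu_1 using astropulse_v6 version 604 (cuda_opencl_100) in slot 0",
    "09-May-2013 06:02:05 [SETI@home] Computation for task ap_08oc11aa_B0_P0_00046_20130506_27293.wu_1 finished",
    "09-May-2013 06:02:05 [GPUGRID] Starting task I94R3-NATHAN_dhfr36_6-10-32-RND1771_1 using acemdlong version 618 (cuda42) in slot 0",
    "09-May-2013 06:02:10 [GPUGRID] Computation for task I94R3-NATHAN_dhfr36_6-10-32-RND1771_1 finished",
]

for task, slot, seconds in short_runs(sample):
    print(f"{task}: {seconds:.0f} s in slot {slot}")
```

On the sample lines this lists only the two Nathan tasks; the SETI@home task that ran in the same slot for about 50 minutes is not flagged.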

My next step will be to turn on a couple of debug flags and capture more log detail when the problem happens again.
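As a rough sketch of what turning on debug flags can look like: the BOINC client reads log flags from the `<log_flags>` section of `cc_config.xml` in its data directory. The snippet below only builds that XML as a string. The particular flag list is an assumption on my part, inferred from the bracketed tags (`[sched_op]`, `[slot]`, `[task]`, `[state]`, `[unparsed_xml]`) that appear in the later debug log in this thread; it is not a record of exactly which flags were set.

```python
import xml.etree.ElementTree as ET

# Assumed flag list, inferred from the tags seen in the debug log in this thread.
FLAGS = ["task_debug", "slot_debug", "sched_op_debug", "state_debug", "unparsed_xml"]


def build_cc_config(flags):
    """Build a cc_config.xml document that enables the given log flags."""
    root = ET.Element("cc_config")
    log_flags = ET.SubElement(root, "log_flags")
    for name in flags:
        # A value of 1 enables the flag.
        ET.SubElement(log_flags, name).text = "1"
    return ET.tostring(root, encoding="unicode")


print(build_cc_config(FLAGS))
```

If you already have a `cc_config.xml`, merge the flags into it by hand rather than overwriting the file, since it may hold other options.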

Regards,
Jacob

Jacob Klein
Message 29804 - Posted: 10 May 2013 | 21:33:24 UTC - in response to Message 29776.
Last modified: 10 May 2013 | 21:36:20 UTC

I just had another Nathan long-run dhfr task exhibit the issue. This time I had several debug flags enabled, so there is a lot of log information to go through. Again, it happened in slot 0, and I strongly suspect that this is a slot 0 issue.

The task that completed quickly with credit is:
I40R14-NATHAN_dhfr36_6-9-32-RND5144_0
http://www.gpugrid.net/result.php?resultid=6849877

Name I40R14-NATHAN_dhfr36_6-9-32-RND5144_0
Workunit 4440026
Created 10 May 2013 | 14:51:34 UTC
Sent 10 May 2013 | 16:58:49 UTC
Received 10 May 2013 | 18:36:14 UTC
Server state Over
Outcome Success
Client state Done
Exit status 0 (0x0)
Computer ID 150631
Report deadline 15 May 2013 | 16:58:49 UTC
Run time 3.08
CPU time 0.94
Validate state Valid
Credit 70,800.00
Application version Long runs (8-12 hours on fastest card) v6.18 (cuda42)
Stderr output

<core_client_version>7.0.64</core_client_version>
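The pattern in these result pages (valid, full credit, only a few seconds of run time) can be checked mechanically. This is a minimal sketch, assuming the result page has been copied as plain "Label value" lines like the ones above; the helper names `parse_result` and `looks_suspicious` and the 60-second threshold are hypothetical choices of mine, not part of GPUGRID or BOINC.

```python
# Field labels as they appear on the result page pasted above.
LABELS = ("Name", "Run time", "CPU time", "Validate state", "Credit")


def parse_result(text):
    """Collect the labelled fields from a pasted result page into a dict."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        for label in LABELS:
            if line.startswith(label + " "):
                fields[label] = line[len(label) + 1:]
    return fields


def looks_suspicious(fields, min_run_seconds=60.0):
    """True if the result validated with credit but ran for under min_run_seconds.

    min_run_seconds is an assumed threshold; long-run tasks normally take hours.
    """
    run_time = float(fields["Run time"].replace(",", ""))
    credit = float(fields["Credit"].replace(",", ""))
    return (
        fields.get("Validate state") == "Valid"
        and credit > 0
        and run_time < min_run_seconds
    )


# Fields copied from the result page above.
page = """\
Name I40R14-NATHAN_dhfr36_6-9-32-RND5144_0
Run time 3.08
CPU time 0.94
Validate state Valid
Credit 70,800.00
"""

fields = parse_result(page)
print(fields["Name"], looks_suspicious(fields))
```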

Jacob Klein
Message 29805 - Posted: 10 May 2013 | 21:37:01 UTC - in response to Message 29804.
Last modified: 10 May 2013 | 21:37:55 UTC

Log part 1/3: Where BOINC downloaded the task:
---------------------------------------------------

10-May-2013 13:00:03 [GPUGRID] [sched_op] Starting scheduler request
10-May-2013 13:00:03 [GPUGRID] Sending scheduler request: To fetch work.
10-May-2013 13:00:03 [GPUGRID] Requesting new tasks for NVIDIA
10-May-2013 13:00:03 [GPUGRID] [sched_op] CPU work request: 0.00 seconds; 0.00 devices
10-May-2013 13:00:03 [GPUGRID] [sched_op] NVIDIA work request: 8748.13 seconds; 0.00 devices
10-May-2013 13:00:05 [---] [unparsed_xml] FILE_REF::parse(): unrecognized: 'rboinc/'
10-May-2013 13:00:05 [---] [unparsed_xml] FILE_REF::parse(): unrecognized: 'rboinc/'
10-May-2013 13:00:05 [---] [unparsed_xml] FILE_REF::parse(): unrecognized: 'rboinc/'
10-May-2013 13:00:05 [---] [unparsed_xml] FILE_REF::parse(): unrecognized: 'rboinc/'
10-May-2013 13:00:05 [---] [unparsed_xml] FILE_REF::parse(): unrecognized: 'rboinc/'
10-May-2013 13:00:05 [---] [unparsed_xml] FILE_REF::parse(): unrecognized: 'rboinc/'
10-May-2013 13:00:05 [---] [unparsed_xml] FILE_REF::parse(): unrecognized: 'rboinc/'
10-May-2013 13:00:05 [---] [unparsed_xml] FILE_REF::parse(): unrecognized: 'rboinc/'
10-May-2013 13:00:05 [---] [unparsed_xml] FILE_REF::parse(): unrecognized: 'rboinc/'
10-May-2013 13:00:05 [---] [unparsed_xml] FILE_REF::parse(): unrecognized: 'rboinc/'
10-May-2013 13:00:05 [---] [unparsed_xml] FILE_REF::parse(): unrecognized: 'rboinc/'
10-May-2013 13:00:05 [---] [unparsed_xml] FILE_REF::parse(): unrecognized: 'rboinc/'
10-May-2013 13:00:05 [---] [unparsed_xml] FILE_INFO::parse(): unrecognized: rboinc/
10-May-2013 13:00:05 [---] [unparsed_xml] FILE_INFO::parse(): unrecognized: rboinc/
10-May-2013 13:00:05 [---] [unparsed_xml] FILE_INFO::parse(): unrecognized: rboinc/
10-May-2013 13:00:05 [---] [unparsed_xml] FILE_INFO::parse(): unrecognized: rboinc/
10-May-2013 13:00:05 [---] [unparsed_xml] FILE_INFO::parse(): unrecognized: rboinc/
10-May-2013 13:00:05 [---] [unparsed_xml] FILE_INFO::parse(): unrecognized: rboinc/
10-May-2013 13:00:05 [---] [unparsed_xml] FILE_INFO::parse(): unrecognized: rboinc/
10-May-2013 13:00:05 [---] [unparsed_xml] FILE_INFO::parse(): unrecognized: rboinc/
10-May-2013 13:00:05 [---] [unparsed_xml] FILE_INFO::parse(): unrecognized: rboinc/
10-May-2013 13:00:05 [---] [unparsed_xml] FILE_INFO::parse(): unrecognized: rboinc/
10-May-2013 13:00:05 [---] [unparsed_xml] FILE_INFO::parse(): unrecognized: rboinc/
10-May-2013 13:00:05 [---] [unparsed_xml] FILE_INFO::parse(): unrecognized: rboinc/
10-May-2013 13:00:05 [GPUGRID] Scheduler request completed: got 1 new tasks
10-May-2013 13:00:05 [GPUGRID] [sched_op] Server version 613
10-May-2013 13:00:05 [GPUGRID] Project requested delay of 31 seconds
10-May-2013 13:00:05 [---] [slot] removed file slots/0/init_data.xml
10-May-2013 13:00:05 [---] [slot] removed file slots/4/init_data.xml
10-May-2013 13:00:05 [GPUGRID] app acemdbeta not found in app_config.xml
10-May-2013 13:00:05 [GPUGRID] app acemd2 not found in app_config.xml
10-May-2013 13:00:05 [GPUGRID] app acemdshort not found in app_config.xml
10-May-2013 13:00:05 [GPUGRID] [task] result state=NEW for I40R14-NATHAN_dhfr36_6-9-32-RND5144_0 from handle_scheduler_reply
10-May-2013 13:00:05 [GPUGRID] [sched_op] estimated total CPU task duration: 0 seconds
10-May-2013 13:00:05 [GPUGRID] [sched_op] estimated total NVIDIA task duration: 65312 seconds
10-May-2013 13:00:05 [GPUGRID] [slot] linked projects/www.gpugrid.net/logogpugrid.png to projects/www.gpugrid.net/stat_icon
10-May-2013 13:00:05 [GPUGRID] [slot] linked projects/www.gpugrid.net/project_1.png to projects/www.gpugrid.net/slideshow_ga_00
10-May-2013 13:00:05 [GPUGRID] [slot] linked projects/www.gpugrid.net/project_1.png to projects/www.gpugrid.net/slideshow_cellmd_00
10-May-2013 13:00:05 [GPUGRID] [slot] linked projects/www.gpugrid.net/project_2.png to projects/www.gpugrid.net/slideshow_ga_01
10-May-2013 13:00:05 [GPUGRID] [slot] linked projects/www.gpugrid.net/project_2.png to projects/www.gpugrid.net/slideshow_cellmd_01
10-May-2013 13:00:05 [GPUGRID] [slot] linked projects/www.gpugrid.net/project_3.png to projects/www.gpugrid.net/slideshow_ga_02
10-May-2013 13:00:05 [GPUGRID] [slot] linked projects/www.gpugrid.net/project_3.png to projects/www.gpugrid.net/slideshow_cellmd_02
10-May-2013 13:00:05 [GPUGRID] [state] handle_scheduler_reply(): State after handle_scheduler_reply():
10-May-2013 13:00:05 [---] [state] Client state summary:
10-May-2013 13:00:05 [---] 17 projects:
10-May-2013 13:00:05 [---] http://boinc.bakerlab.org/rosetta/ min RPC 206.544522.0 seconds from now
10-May-2013 13:00:05 [---] http://boinc.drugdiscoveryathome.com/ min RPC 42117.516539.0 seconds from now
10-May-2013 13:00:05 [---] http://boinc.fzk.de/poem/ min RPC -0.162897.0 seconds from now
10-May-2013 13:00:05 [---] http://boinc.umiacs.umd.edu/ min RPC -3951.180071.0 seconds from now
10-May-2013 13:00:05 [---] http://boincsimap.org/boincsimap/ min RPC -17263.646673.0 seconds from now
10-May-2013 13:00:05 [---] http://cbl-boinc-server2.cs.technion.ac.il/superlinkattechnion/ min RPC 3308.156960.0 seconds from now
10-May-2013 13:00:05 [---] http://docking.cis.udel.edu/ min RPC -18.465681.0 seconds from now
10-May-2013 13:00:05 [---] http://mindmodeling.org/ min RPC -4533.884037.0 seconds from now
10-May-2013 13:00:05 [---] http://qcn.stanford.edu/sensor/
10-May-2013 13:00:05 [---] http://ralph.bakerlab.org/ min RPC 46.836083.0 seconds from now
10-May-2013 13:00:05 [---] http://setiathome.berkeley.edu/ min RPC -24524.272766.0 seconds from now
10-May-2013 13:00:05 [---] http://volunteer.cs.und.edu/dna/ min RPC 3548.865614.0 seconds from now
10-May-2013 13:00:05 [---] http://wuprop.boinc-af.org/ min RPC -1448.197718.0 seconds from now
10-May-2013 13:00:05 [---] http://www.cosmologyathome.org/ min RPC 16.679211.0 seconds from now
10-May-2013 13:00:05 [---] http://www.gpugrid.net/ min RPC -16461.104961.0 seconds from now
10-May-2013 13:00:05 [---] http://www.rnaworld.de/rnaworld/ min RPC 3.715845.0 seconds from now
10-May-2013 13:00:05 [---] http://www.worldcommunitygrid.org/ min RPC 92.512416.0 seconds from now
10-May-2013 13:00:05 [---] 547 file_infos:
10-May-2013 13:00:05 [---] minirosetta_graphics_3.43_windows_x86_64.exe status:1 inactive
10-May-2013 13:00:05 [---] Helvetica.txf status:1 inactive
10-May-2013 13:00:05 [---] minirosetta_3.46_windows_x86_64.exe status:1 inactive
10-May-2013 13:00:05 [---] minirosetta_database_rev54943.zip status:1 inactive
10-May-2013 13:00:05 [---] t7_OFD_05_data.zip status:1 inactive
10-May-2013 13:00:05 [---] 2013_5_9_mini_y061_folding.zip status:1 inactive
10-May-2013 13:00:05 [---] t7_OFD_05_relax_SAVE_ALL_OUT_80755_528_0_0 status:0 inactive
10-May-2013 13:00:05 [---] P2_2x3_1_s7_f1_R_6_2_3_1_abinitio_design_y061_003_80737_201_0_0 status:0 inactive
10-May-2013 13:00:05 [---] poempp_1.6_windows_intelx86 status:1 inactive
10-May-2013 13:00:05 [---] poemcl_1.5_windows_intelx86__opencl_nvidia_100 status:1 inactive
10-May-2013 13:00:05 [---] garli-2.0_x64.exe status:1 inactive
10-May-2013 13:00:05 [---] simap_5.12_windows_x86_64.exe status:1 inactive
10-May-2013 13:00:05 [---] 20130506.354130 status:1 inactive
10-May-2013 13:00:05 [---] 20130506.354130_1_0 status:0 inactive
10-May-2013 13:00:05 [---] 20130506.393863 status:1 inactive
10-May-2013 13:00:05 [---] 20130506.393864 status:1 inactive
10-May-2013 13:00:05 [---] 20130506.393831 status:1 inactive
10-May-2013 13:00:05 [---] 20130506.393865 status:1 inactive
10-May-2013 13:00:05 [---] 20130506.393832 status:1 inactive
10-May-2013 13:00:05 [---] 20130506.393833 status:1 inactive
10-May-2013 13:00:05 [---] 20130506.393866 status:1 inactive
10-May-2013 13:00:05 [---] 20130506.393816 status:1 inactive
10-May-2013 13:00:05 [---] 20130506.393867 status:1 inactive
10-May-2013 13:00:05 [---] 20130506.393817 status:1 inactive
10-May-2013 13:00:05 [---] 20130506.393863_1_0 status:0 inactive
10-May-2013 13:00:05 [---] 20130506.393864_0_0 status:0 inactive
10-May-2013 13:00:05 [---] 20130506.393831_1_0 status:0 inactive
10-May-2013 13:00:05 [---] 20130506.393865_0_0 status:0 inactive
10-May-2013 13:00:05 [---] 20130506.393832_1_0 status:0 inactive
10-May-2013 13:00:05 [---] 20130506.393833_0_0 status:0 inactive
10-May-2013 13:00:05 [---] 20130506.393866_0_0 status:0 inactive
10-May-2013 13:00:05 [---] 20130506.393816_1_0 status:0 inactive
10-May-2013 13:00:05 [---] 20130506.393867_0_0 status:0 inactive
10-May-2013 13:00:05 [---] 20130506.393817_1_0 status:0 inactive
10-May-2013 13:00:05 [---] charmm34_6.23_windows_x86_64 status:1 inactive
10-May-2013 13:00:05 [---] charmm34_6.23_graphics_windows_x86_64 status:1 inactive
10-May-2013 13:00:05 [---] grid_probes.rtf status:1 inactive
10-May-2013 13:00:05 [---] lpdb_amino.rtf status:1 inactive
10-May-2013 13:00:05 [---] lpdb.prm status:1 inactive
10-May-2013 13:00:05 [---] lpdb_probes.prm status:1 inactive
10-May-2013 13:00:05 [---] logo.jpg status:1 inactive
10-May-2013 13:00:05 [---] minus.jpg status:1 inactive
10-May-2013 13:00:05 [---] plus.jpg status:1 inactive
10-May-2013 13:00:05 [---] rotate_left.jpg status:1 inactive
10-May-2013 13:00:05 [---] rotate_right.jpg status:1 inactive
10-May-2013 13:00:05 [---] helvetica.txf status:1 inactive
10-May-2013 13:00:05 [---] 1ajv1hsg_mod0014crossdockinghiv1_28492_168327.inp status:1 inactive
10-May-2013 13:00:05 [---] 1ajv1hsg_mod0014crossdockinghiv1_28492_168327_0_0 status:0 inactive
10-May-2013 13:00:05 [---] 1ajv1hsg_mod0014crossdockinghiv1_28492_168327_0_1 status:0 inactive
10-May-2013 13:00:05 [---] 1ajv1hsg_mod0014crossdockinghiv1_28492_168327_0_2 status:0 inactive
10-May-2013 13:00:05 [---] 1ajv1hsg_mod0014crossdockinghiv1_28492_168327_0_3 status:0 inactive
10-May-2013 13:00:05 [---] mm_wrapper_1.14_windows_intelx86.exe status:1 inactive
10-May-2013 13:00:05 [---] windows_intelx86_7za.exe status:1 inactive
10-May-2013 13:00:05 [---] 1.88_windows_intelx86_ccl.exe status:1 inactive
10-May-2013 13:00:05 [---] 1.88_windows_intelx86_ccl.image status:1 inactive
10-May-2013 13:00:05 [---] 1.88_windows_intelx86_job.xml status:1 inactive
10-May-2013 13:00:05 [---] MindModeling-Reuse-aa9c0c035874 status:1 inactive
10-May-2013 13:00:05 [---] MindModeling-Reuse-0fce3f34d81f status:1 inactive
10-May-2013 13:00:05 [---] MindModeling-Reuse-c7e9fb6f2488 status:1 inactive
10-May-2013 13:00:05 [---] mm_wrapper_1.13_windows_intelx86.exe status:1 inactive
10-May-2013 13:00:05 [---] 7za_windows_intelx86.exe status:1 inactive
10-May-2013 13:00:05 [---] python2.7_1.1_windows_intelx86.7z status:1 inactive
10-May-2013 13:00:05 [---] MindModeling-Reuse-7eb084b87178 status:1 inactive
10-May-2013 13:00:05 [---] MindModeling-Reuse-c6d9676373ab status:1 inactive
10-May-2013 13:00:05 [---] MindModeling-Reuse-44acea400127 status:1 inactive
10-May-2013 13:00:05 [---] MindModeling-Reuse-cf97a3f4153d status:1 inactive
10-May-2013 13:00:05 [---] MindModeling-Reuse-9f921a5713e1 status:1 inactive
10-May-2013 13:00:05 [---] MindModeling-Reuse-d8518c1c10f3 status:1 inactive
10-May-2013 13:00:05 [---] MindModeling-Reuse-98f66ca54433 status:1 inactive
10-May-2013 13:00:05 [---] MindModeling-Reuse-f88a4186fdbd status:1 inactive
10-May-2013 13:00:05 [---] MindModeling-Reuse-bbb7272a8f41 status:1 inactive
10-May-2013 13:00:05 [---] MindModeling-Reuse-7276ff629550 status:1 inactive
10-May-2013 13:00:05 [---] MindModeling-Reuse-780583cf71d5 status:1 inactive
10-May-2013 13:00:05 [---] MindModeling-Reuse-18787b5983ac status:1 inactive
10-May-2013 13:00:05 [---] MindModeling-Reuse-54692f5faa10 status:1 inactive
10-May-2013 13:00:05 [---] MindModeling-Reuse-a8e3762615ee status:1 inactive
10-May-2013 13:00:05 [---] MindModeling-Reuse-948a6648497d status:1 inactive
10-May-2013 13:00:05 [---] MindModeling-Reuse-eb30710e0382 status:1 inactive
10-May-2013 13:00:05 [---] MindModeling-Reuse-a89ee691977f status:1 inactive
10-May-2013 13:00:05 [---] MindModeling-Reuse-215882ce14f6 status:1 inactive
10-May-2013 13:00:05 [---] MindModeling-Reuse-2bf8c2409ff2 status:1 inactive
10-May-2013 13:00:05 [---] MindModeling-Reuse-3d7d0c842283 status:1 inactive
10-May-2013 13:00:05 [---] MindModeling-Reuse-f284a375afec status:1 inactive
10-May-2013 13:00:05 [---] MindModeling-Reuse-3cf3467e6733 status:1 inactive
10-May-2013 13:00:05 [---] MindModeling-Reuse-37a2874f70f1 status:1 inactive
10-May-2013 13:00:05 [---] MindModeling-Reuse-d28f11badd46 status:1 inactive
10-May-2013 13:00:05 [---] MindModeling-Reuse-c0cf8dff909e status:1 inactive
10-May-2013 13:00:05 [---] MindModeling-Reuse-d997845c25a3 status:1 inactive
10-May-2013 13:00:05 [---] MindModeling-Reuse-6050af3e502a status:1 inactive
10-May-2013 13:00:05 [---] MindModeling-Reuse-4ef306ae42f8 status:1 inactive
10-May-2013 13:00:05 [---] MindModeling-Reuse-011e7e96c17c status:1 inactive
10-May-2013 13:00:05 [---] MindModeling-Reuse-c38905be247f status:1 inactive
10-May-2013 13:00:05 [---] MindModeling-Reuse-cdb942793fd9 status:1 inactive
10-May-2013 13:00:05 [---] MindModeling-Reuse-596c3d774088 status:1 inactive
10-May-2013 13:00:05 [---] MindModeling-Reuse-e90f064e5241 status:1 inactive
10-May-2013 13:00:05 [---] MindModeling-Reuse-4b02d61289e9 status:1 inactive
10-May-2013 13:00:05 [---] MindModeling-Reuse-e0c3e4ac5182 status:1 inactive
10-May-2013 13:00:05 [---] MindModeling-Reuse-1165454b45ec status:1 inactive
10-May-2013 13:00:05 [---] MindModeling-Reuse-edd5b89938ef status:1 inactive
10-May-2013 13:00:05 [---] MindModeling-Reuse-494dc635b5a4 status:1 inactive
10-May-2013 13:00:05 [---] MindModeling-Reuse-bd14e9ad0370 status:1 inactive
10-May-2013 13:00:05 [---] MindModeling-Reuse-0dc6f1070786 status:1 inactive
10-May-2013 13:00:05 [---] MindModeling-Reuse-87f80513a44d status:1 inactive
10-May-2013 13:00:05 [---] MindModeling-Reuse-9d5b2f204aa1 status:1 inactive
10-May-2013 13:00:05 [---] MindModeling-Reuse-3f5ef2dd76aa status:1 inactive
10-May-2013 13:00:05 [---] MindModeling-Reuse-c8b600890697 status:1 inactive
10-May-2013 13:00:05 [---] MindModeling-Reuse-04d137f90e32 status:1 inactive
10-May-2013 13:00:05 [---] MindModeling-Reuse-e1a05e51f65d status:1 inactive
10-May-2013 13:00:05 [---] MindModeling-Reuse-d211811cae48 status:1 inactive
10-May-2013 13:00:05 [---] MindModeling-Reuse-f0cea5bd65d3 status:1 inactive
10-May-2013 13:00:05 [---] mm_wrapper_1.15_windows_intelx86.exe status:1 inactive
10-May-2013 13:00:05 [---] R2.15.1_1.1_windows_intelx86.7z status:1 inactive
10-May-2013 13:00:05 [---] MindModeling-Reuse-0f72da3bdbcd status:1 inactive
10-May-2013 13:00:05 [---] MindModeling-Reuse-32ffcea3b321 status:1 inactive
10-May-2013 13:00:05 [---] MindModeling-Reuse-59ee1d70b47a status:1 inactive
10-May-2013 13:00:05 [---] MindModeling-Reuse-c7e12b2a7375 status:1 inactive
10-May-2013 13:00:05 [---] MindModeling-Reuse-22329c83e36f status:1 inactive
10-May-2013 13:00:05 [---] MindModeling-Reuse-761ad52b3fa1 status:1 inactive
10-May-2013 13:00:05 [---] MindModeling-Reuse-4b09b9a7cb95 status:1 inactive
10-May-2013 13:00:05 [---] MindModeling-Reuse-08b1d2d07654 status:1 inactive
10-May-2013 13:00:05 [---] MindModeling-Reuse-e50469588e18 status:1 inactive
10-May-2013 13:00:05 [---] MindModeling-Reuse-0f855cb7b5b6 status:1 inactive
10-May-2013 13:00:05 [---] MindModeling-Reuse-d2d37dd612e9 status:1 inactive
10-May-2013 13:00:05 [---] MindModeling-Reuse-bc1e10672e36 status:1 inactive
10-May-2013 13:00:05 [---] MindModeling-Reuse-d8452cc13288 status:1 inactive
10-May-2013 13:00:05 [---] MindModeling-Reuse-9bee61ffe532 status:1 inactive
10-May-2013 13:00:05 [---] MindModeling-Reuse-cea37f8e527d status:1 inactive
10-May-2013 13:00:05 [---] MindModeling-Reuse-04405b890a7a status:1 inactive
10-May-2013 13:00:05 [---] MindModeling-Reuse-ed9946e14ab7 status:1 inactive
10-May-2013 13:00:05 [---] MindModeling-Reuse-f17c9597426a status:1 inactive
10-May-2013 13:00:05 [---] MindModeling-Reuse-e26b27826421 status:1 inactive
10-May-2013 13:00:05 [---] MindModeling-Reuse-46107298ea05 status:1 inactive
10-May-2013 13:00:05 [---] MindModeling-Reuse-5cbc8ad2f46d status:1 inactive
10-May-2013 13:00:05 [---] MindModeling-Reuse-879d8b097c0c status:1 inactive
10-May-2013 13:00:05 [---] MindModeling-Reuse-e76eac12ae21 status:1 inactive
10-May-2013 13:00:05 [---] MindModeling-Reuse-317b929bf433 status:1 inactive
10-May-2013 13:00:05 [---] MindModeling-Reuse-0322160c12a9 status:1 inactive
10-May-2013 13:00:05 [---] MindModeling-Reuse-41a49a1b088b status:1 inactive
10-May-2013 13:00:05 [---] mm_wrapper_1.16_windows_intelx86.exe status:1 inactive
10-May-2013 13:00:05 [---] pypy1.9_1.1_windows_intelx86.7z status:1 inactive
10-May-2013 13:00:05 [---] MindModeling-Reuse-72a3cb28696f status:1 inactive
10-May-2013 13:00:05 [---] mm_ProjectIcon_BETA_01.png status:1 inactive
10-May-2013 13:00:05 [---] activation.png status:1 inactive
10-May-2013 13:00:05 [---] sideview.png status:1 inactive
10-May-2013 13:00:05 [---] qcn_7.06_windows_x86_64__nci.exe status:1 inactive
10-May-2013 13:00:05 [---] qcn_graphics_7.06_windows_x86_64.exe status:1 inactive
10-May-2013 13:00:05 [---] hvt status:1 inactive
10-May-2013 13:00:05 [---] ntpdate_4.2.4p7c_windows_x86_64.exe status:1 inactive
10-May-2013 13:00:05 [---] phidget21x64.dll status:1 inactive
10-May-2013 13:00:05 [---] logo.jpg status:1 inactive
10-May-2013 13:00:05 [---] cbt status:1 inactive
10-May-2013 13:00:05 [---] xyzaxesbl.jpg status:1 inactive
10-May-2013 13:00:05 [---] earthday4096.jpg status:1 inactive
10-May-2013 13:00:05 [---] xyzaxes.jpg status:1 inactive
10-May-2013 13:00:05 [---] earthnight4096.jpg status:1 inactive
10-May-2013 13:00:05 [---] qcnicon.png status:1 inactive
10-May-2013 13:00:05 [---] qcn0.png status:1 inactive
10-May-2013 13:00:05 [---] qcn1.png status:1 inactive
10-May-2013 13:00:05 [---] qcn2.png status:1 inactive
10-May-2013 13:00:05 [---] qcn3.png status:1 inactive
10-May-2013 13:00:05 [---] Helvetica.txf status:1 inactive
10-May-2013 13:00:05 [---] minirosetta_beta_3.42_windows_x86_64.exe status:1 inactive
10-May-2013 13:00:05 [---] minirosetta_graphics_3.42_windows_x86_64.exe status:1 inactive
10-May-2013 13:00:05 [---] minirosetta_database_rev51296.zip status:1 inactive
10-May-2013 13:00:05 [---] minirosetta_graphics_3.43_windows_x86_64.exe status:1 inactive
10-May-2013 13:00:05 [---] minirosetta_3.45_windows_x86_64.exe status:1 inactive
10-May-2013 13:00:05 [---] minirosetta_database_rev52077.zip status:1 inactive
10-May-2013 13:00:05 [---] arecibo_181.png status:1 inactive
10-May-2013 13:00:05 [---] sah_40.png status:1 inactive
10-May-2013 13:00:05 [---] sah_banner_290.png status:1 inactive
10-May-2013 13:00:05 [---] sah_ss_290.png status:1 inactive
10-May-2013 13:00:05 [---] setiathome_6.10_windows_intelx86__cuda_fermi.exe status:1 inactive
10-May-2013 13:00:05 [---] libfftw3f-3-1-1a_upx.dll status:1 inactive
10-May-2013 13:00:05 [---] seti_609.jpg status:1 inactive
10-May-2013 13:00:05 [---] cudart32_30_14.dll status:1 inactive
10-May-2013 13:00:05 [---] cufft32_30_14.dll status:1 inactive
10-May-2013 13:00:05 [---] setiathome-6.10_cuda_AUTHORS status:1 inactive
10-May-2013 13:00:05 [---] setiathome-6.10_cuda_COPYING status:1 inactive
10-May-2013 13:00:05 [---] setiathome-6.10_cuda_COPYRIGHT status:1 inactive
10-May-2013 13:00:05 [---] setiathome-6.10_cuda_README status:1 inactive
10-May-2013 13:00:05 [---] astropulse_6.04_windows_intelx86__opencl_nvidia_100.exe status:1 inactive
10-May-2013 13:00:05 [---] AstroPulse_Kernels_r1316.cl status:1 inactive
10-May-2013 13:00:05 [---] ap_cmdline_6.04_windows_intelx86__opencl_nvidia.txt status:1 inactive
10-May-2013 13:00:05 [---] astropulse_6.04_opencl_nvidia_AUTHORS status:1 inactive
10-May-2013 13:00:05 [---] astropulse_6.04_opencl_nvidia_COPYING status:1 inactive
10-May-2013 13:00:05 [---] astropulse_6.04_opencl_nvidia_COPYRIGHT status:1 inactive
10-May-2013 13:00:05 [---] astropulse_6.04_windows_intelx86__opencl_nvidia_100.pdb status:1 inactive
10-May-2013 13:00:05 [---] libfftw3f-3.dll status:1 inactive
10-May-2013 13:00:05 [---] data_collect_v3_3.50_windows_intelx86__nci.exe status:1 inactive
10-May-2013 13:00:05 [---] camb_2.16_windows_intelx86.exe status:1 inactive
10-May-2013 13:00:05 [---] cosmo_icon.png status:1 inactive
10-May-2013 13:00:05 [---] cosmo_slide_01.png status:1 inactive
10-May-2013 13:00:05 [---] cosmo_slide_02.png status:1 inactive
10-May-2013 13:00:05 [---] cosmo_slide_03.png status:1 inactive
10-May-2013 13:00:05 [---] params_050913_065030_2.ini status:1 inactive
10-May-2013 13:00:05 [---] wu_050913_065030_2_0_0_0 status:0 inactive
10-May-2013 13:00:05 [---] wu_050913_065030_2_0_0_1 status:0 inactive
10-May-2013 13:00:05 [---] wu_050913_065030_2_0_0_2 status:0 inactive
10-May-2013 13:00:05 [---] wu_050913_065030_2_0_0_3 status:0 inactive
10-May-2013 13:00:05 [---] wu_050913_065030_2_0_0_4 status:0 inactive
10-May-2013 13:00:05 [---] wu_050913_065030_2_0_0_5 status:0 inactive
10-May-2013 13:00:05 [---] acemd.2865P.exe status:1 inactive
10-May-2013 13:00:05 [---] cudart32_42_9.dll status:1 inactive
10-May-2013 13:00:05 [---] cufft32_42_9.dll status:1 inactive
10-May-2013 13:00:05 [---] tcl85.dll status:1 inactive
10-May-2013 13:00:05 [---] logogpugrid.png status:1 inactive
10-May-2013 13:00:05 [---] project_1.png status:1 inactive
10-May-2013 13:00:05 [---] project_2.png status:1 inactive
10-May-2013 13:00:05 [---] project_3.png status:1 inactive
10-May-2013 13:00:05 [---] 291px9x1-NOELIA_klebe_run-1-LICENSE status:1 inactive
10-May-2013 13:00:05 [---] 291px9x1-NOELIA_klebe_run-1-COPYRIGHT status:1 inactive
10-May-2013 13:00:05 [---] 291px9x1-NOELIA_klebe_run-1-291px9x1-NOELIA_klebe_run-0-3-RND0554_1 status:1 inactive
10-May-2013 13:00:05 [---] 291px9x1-NOELIA_klebe_run-1-291px9x1-NOELIA_klebe_run-0-3-RND0554_2 status:1 inactive
10-May-2013 13:00:05 [---] 291px9x1-NOELIA_klebe_run-1-291px9x1-NOELIA_klebe_run-0-3-RND0554_3 status:1 inactive
10-May-2013 13:00:05 [---] 291px9x1-NOELIA_klebe_run-1-pdb_file status:1 inactive
10-May-2013 13:00:05 [---] 291px9x1-NOELIA_klebe_run-1-psf_file status:1 inactive
10-May-2013 13:00:05 [---] 291px9x1-NOELIA_klebe_run-1-par_file status:1 inactive
10-May-2013 13:00:05 [---] 291px9x1-NOELIA_klebe_run-1-conf_file_enc status:1 inactive
10-May-2013 13:00:05 [---] 291px9x1-NOELIA_klebe_run-1-metainp_file status:1 inactive
10-May-2013 13:00:05 [---] 291px9x1-NOELIA_klebe_run-1-291px9x1-NOELIA_klebe_run-0-3-RND0554_7 status:1 inactive
10-May-2013 13:00:05 [---] 291px9x1-NOELIA_klebe_run-1-291px9x1-NOELIA_klebe_run-0-3-RND0554_10 status:1 inactive
10-May-2013 13:00:05 [---] 291px9x1-NOELIA_klebe_run-1-3-RND0554_0_0 status:0 inactive
10-May-2013 13:00:05 [---] 291px9x1-NOELIA_klebe_run-1-3-RND0554_0_1 status:0 inactive
10-May-2013 13:00:05 [---] 291px9x1-NOELIA_klebe_run-1-3-RND0554_0_2 status:0 inactive
10-May-2013 13:00:05 [---] 291px9x1-NOELIA_klebe_run-1-3-RND0554_0_3 status:0 inactive
10-May-2013 13:00:05 [---] 291px9x1-NOELIA_klebe_run-1-3-RND0554_0_4 status:0 inactive
10-May-2013 13:00:05 [---] 291px9x1-NOELIA_klebe_run-1-3-RND0554_0_5 status:0 inactive
10-May-2013 13:00:05 [---] 291px9x1-NOELIA_klebe_run-1-3-RND0554_0_6 status:0 inactive
10-May-2013 13:00:05 [---] 291px9x1-NOELIA_klebe_run-1-3-RND0554_0_7 status:0 inactive
10-May-2013 13:00:05 [---] 291px9x1-NOELIA_klebe_run-1-3-RND0554_0_8 status:0 inactive
10-May-2013 13:00:05 [---] 291px9x1-NOELIA_klebe_run-1-3-RND0554_0_9 status:0 inactive
10-May-2013 13:00:05 [---] 291px9x1-NOELIA_klebe_run-1-3-RND0554_0_10 status:0 inactive
10-May-2013 13:00:05 [---] 291px9x1-NOELIA_klebe_run-1-3-RND0554_0_11 status:0 inactive
10-May-2013 13:00:05 [---] I29R17-NATHAN_dhfr36_6-10-LICENSE status:1 inactive
10-May-2013 13:00:05 [---] I29R17-NATHAN_dhfr36_6-10-COPYRIGHT status:1 inactive
10-May-2013 13:00:05 [---] I29R17-NATHAN_dhfr36_6-10-I29R17-NATHAN_dhfr36_6-9-32-RND4705_1 status:1 inactive
10-May-2013 13:00:05 [---] I29R17-NATHAN_dhfr36_6-10-I29R17-NATHAN_dhfr36_6-9-32-RND4705_2 status:1 inactive
10-May-2013 13:00:05 [---] I29R17-NATHAN_dhfr36_6-10-I29R17-NATHAN_dhfr36_6-9-32-RND4705_3 status:1 inactive
10-May-2013 13:00:05 [---] I29R17-NATHAN_dhfr36_6-10-pdb_file status:1 inactive
10-May-2013 13:00:05 [---] I29R17-NATHAN_dhfr36_6-10-psf_file status:1 inactive
10-May-2013 13:00:05 [---] I29R17-NATHAN_dhfr36_6-10-par_file status:1 inactive
10-May-2013 13:00:05 [---] I29R17-NATHAN_dhfr36_6-10-conf_file_enc status:1 inactive
10-May-2013 13:00:05 [---] I29R17-NATHAN_dhfr36_6-10-metainp_file status:1 inactive
10-May-2013 13:00:05 [---] I29R17-NATHAN_dhfr36_6-10-I29R17-NATHAN_dhfr36_6-9-32-RND4705_7 status:1 inactive
10-May-2013 13:00:05 [---] I29R17-NATHAN_dhfr36_6-10-I29R17-NATHAN_dhfr36_6-9-32-RND4705_10 status:1 inactive
10-May-2013 13:00:05 [---] I29R17-NATHAN_dhfr36_6-10-32-RND4705_0_0 status:0 inactive
10-May-2013 13:00:05 [---] I29R17-NATHAN_dhfr36_6-10-32-RND4705_0_1 status:0 inactive
10-May-2013 13:00:05 [---] I29R17-NATHAN_dhfr36_6-10-32-RND4705_0_2 status:0 inactive
10-May-2013 13:00:05 [---] I29R17-NATHAN_dhfr36_6-10-32-RND4705_0_3 status:0 inactive
10-May-2013 13:00:05 [---] I29R17-NATHAN_dhfr36_6-10-32-RND4705_0_4 status:0 inactive
10-May-2013 13:00:05 [---] I29R17-NATHAN_dhfr36_6-10-32-RND4705_0_5 status:0 inactive
10-May-2013 13:00:05 [---] I29R17-NATHAN_dhfr36_6-10-32-RND4705_0_6 status:0 inactive
10-May-2013 13:00:05 [---] I29R17-NATHAN_dhfr36_6-10-32-RND4705_0_7 status:0 inactive
10-May-2013 13:00:05 [---] I29R17-NATHAN_dhfr36_6-10-32-RND4705_0_8 status:0 inactive
10-May-2013 13:00:05 [---] I29R17-NATHAN_dhfr36_6-10-32-RND4705_0_9 status:0 inactive
10-May-2013 13:00:05 [---] I29R17-NATHAN_dhfr36_6-10-32-RND4705_0_10 status:0 inactive
10-May-2013 13:00:05 [---] I29R17-NATHAN_dhfr36_6-10-32-RND4705_0_11 status:0 inactive
10-May-2013 13:00:05 [---] stat_icon_01.png status:1 inactive
10-May-2013 13:00:05 [---] rkn.png status:1 inactive
10-May-2013 13:00:05 [---] MW-6SRNAFragment_mini_txt.png status:1 inactive
10-May-2013 13:00:05 [---] PDB-1EGK_PMID-10864501_FourWayJunction_mini_txt.png status:1 inactive
10-May-2013 13:00:05 [---] PDB-1Q8N_PMID-10617571_MalachiteGreenAptamer_mini_txt.png status:1 inactive
10-May-2013 13:00:05 [---] PDB-1SJ3_PMID-15141216_HDVRibozyme_mini_txt.png status:1 inactive
10-May-2013 13:00:05 [---] PDB-1U9S_PMID-15459389_S-DomainA-TypeRNaseP_mini_txt.png status:1 inactive
10-May-2013 13:00:05 [---] PDB-2A2E_PMID-16113684_A-typeRNaseP_mini_txt.png status:1 inactive
10-May-2013 13:00:05 [---] PDB-2NZ4_PMID-17196404_GlmS-GlcN6P_mini_txt.png status:1 inactive
10-May-2013 13:00:05 [---] faah.protease.dat.gzb status:1 inactive
10-May-2013 13:00:05 [---] stat_v01.png status:1 inactive
10-May-2013 13:00:05 [---] default_00_v01.gif status:1 inactive
10-May-2013 13:00:05 [---] c4cw_00_v01.gif status:1 inactive
10-May-2013 13:00:05 [---] c4cw_01_v01.png status:1 inactive
10-May-2013 13:00:05 [---] c4cw_02_v01.png status:1 inactive
10-May-2013 13:00:05 [---] c4cw_03_v01.png status:1 inactive
10-May-2013 13:00:05 [---] c4cw_04_v01.png status:1 inactive
10-May-2013 13:00:05 [---] c4cw_05_v01.png status:1 inactive
10-May-2013 13:00:05 [---] c4cw_06_v01.png status:1 inactive
10-May-2013 13:00:05 [---] cep2_00_v01.gif status:1 inactive
10-May-2013 13:00:05 [---] cep2_01_v01.gif status:1 inactive
10-May-2013 13:00:05 [---] cep2_02_v01.png status:1 inactive
10-May-2013 13:00:05 [---] cep2_03_v02.png status:1 inactive
10-May-2013 13:00:05 [---] cep2_04_v01.gif status:1 inactive
10-May-2013 13:00:05 [---] cep2_05_v01.png status:1 inactive
10-May-2013 13:00:05 [---] cep2_06_v01.gif status:1 inactive
10-May-2013 13:00:05 [---] cep2_07_v01.gif status:1 inactive
10-May-2013 13:00:05 [---] cep2_08_v01.gif status:1 inactive
10-May-2013 13:00:05 [---] faah_00_v01.gif status:1 inactive
10-May-2013 13:00:05 [---] faah_01_v02.gif status:1 inactive
10-May-2013 13:00:05 [---] faah_02_v04.png status:1 inactive
10-May-2013 13:00:05 [---] faah_03_v02.gif status:1 inactive
10-May-2013 13:00:05 [---] faah_04_v02.gif status:1 inactive
10-May-2013 13:00:05 [---] hpf2_00_v01.gif status:1 inactive
10-May-2013 13:00:05 [---] hpf2_01_v01.gif status:1 inactive
10-May-2013 13:00:05 [---] hpf2_02_v01.png status:1 inactive
10-May-2013 13:00:05 [---] hpf2_03_v01.png status:1 inactive
10-May-2013 13:00:05 [---] hpf2_04_v01.png status:1 inactive
10-May-2013 13:00:05 [---] hpf2_05_v01.png status:1 inactive
10-May-2013 13:00:05 [---] hpf2_06_v01.png status:1 inactive
10-May-2013 13:00:05 [---] hpf2_07_v01.png status:1 inactive
10-May-2013 13:00:05 [---] hpf2_08_v01.png status:1 inactive
10-May-2013 13:00:05 [---] hcc1_00_v01.gif status:1 inactive
10-May-2013 13:00:05 [---] hcc1_01_v01.gif status:1 inactive
10-May-2013 13:00:05 [---] hcc1_02_v01.gif status:1 inactive
10-May-2013 13:00:05 [---] hcc1_03_v01.gif status:1 inactive
10-May-2013 13:00:05 [---] hcc1_04_v01.gif status:1 inactive
10-May-2013 13:00:05 [---] hcc1_05_v01.gif status:1 inactive
10-May-2013 13:00:05 [---] hcc1_06_v01.gif status:1 inactive
10-May-2013 13:00:05 [---] hfcc_00_v01.gif status:1 inactive
10-May-2013 13:00:05 [---] hfcc_01_v01.gif status:1 inactive
10-May-2013 13:00:05 [---] hfcc_02_v01.gif status:1 inactive
10-May-2013 13:00:05 [---] hfcc_03_v01.gif status:1 inactive
10-May-2013 13:00:05 [---] hfcc_04_v01.gif status:1 inactive
10-May-2013 13:00:05 [---] hfcc_05_v01.gif status:1 inactive
10-May-2013 13:00:05 [---] hfcc_06_v01.gif status:1 inactive
10-May-2013 13:00:05 [---] hfcc_07_v01.gif status:1 inactive
10-May-2013 13:00:05 [---] dsfl_00_v01.gif status:1 inactive
10-May-2013 13:00:05 [---] dsfl_01_v01.png status:1 inactive
10-May-2013 13:00:05 [---] dsfl_02_v01.png status:1 inactive
10-May-2013 13:00:05 [---] dsfl_03_v01.png status:1 inactive
10-May-2013 13:00:05 [---] dsfl_04_v01.png status:1 inactive
10-May-2013 13:00:05 [---] dsfl_05_v01.png status:1 inactive
10-May-2013 13:00:05 [---] dsfl_06_v02.png status:1 inactive
10-May-2013 13:00:05 [---] dsfl_07_v02.png status:1 inactive
10-May-2013 13:00:05 [---] gfam_00_v01.gif status:1 inactive
10-May-2013 13:00:05 [---] gfam_01_v01.gif status:1 inactive
10-May-2013 13:00:05 [---] gfam_02_v01.png status:1 inactive
10-May-2013 13:00:05 [---] gfam_03_v01.png status:1 inactive
10-May-2013 13:00:05 [---] gfam_04_v01.png status:1 inactive
10-May-2013 13:00:05 [---] gfam_05_v01.png status:1 inactive
10-May-2013 13:00:05 [---] gfam_06_v03.png status:1 inactive
10-May-2013 13:00:05 [---] gfam_07_v01.png status:1 inactive
10-May-2013 13:00:05 [---] sn2s_00_v01.gif status:1 inactive
10-May-2013 13:00:05 [---] sn2s_01_v02.jpg status:1 inactive
10-May-2013 13:00:05 [---] sn2s_02_v02.jpg status:1 inactive
10-May-2013 13:00:05 [---] sn2s_03_v02.jpg status:1 inactive
10-May-2013 13:00:05 [---] sn2s_04_v02.jpg status:1 inactive
10-May-2013 13:00:05 [---] sn2s_05_v02.jpg status:1 inactive
10-May-2013 13:00:05 [---] sn2s_06_v02.jpg status:1 inactive
10-May-2013 13:00:05 [---] cep2.wcg_logo_tagline.tga status:1 inactive
10-May-2013 13:00:05 [---] cep2.IBM_Logo.tga status:1 inactive
10-May-2013 13:00:05 [---] cep2.CEP1leaf3_v1.3.tga status:1 inactive
10-May-2013 13:00:05 [---] cep2.boinc_logo2.tga status:1 inactive
10-May-2013 13:00:05 [---] cep2.Harvard_Chemistry.tga status:1 inactive
10-May-2013 13:00:05 [---] cep2.CleanEnergyProjectLogo_2.tga status:1 inactive
10-May-2013 13:00:05 [---] cep2.Q-Chem-logo-7.tga status:1 inactive
10-May-2013 13:00:05 [---] cep2.Courier-Bold.txf status:1 inactive
10-May-2013 13:00:05 [---] hpf2.avgE_from_pdb.gz status:1 inactive
10-May-2013 13:00:05 [---] hpf2.Paa_n.gz status:1 inactive
10-May-2013 13:00:05 [---] hpf2.plane_data_table_1015.dat.gz status:1 inactive
10-May-2013 13:00:05 [---] hpf2.bbdep02.May.sortlib.gz status:1 inactive
10-May-2013 13:00:05 [---] hpf2.Paa_pp.gz status:1 inactive
10-May-2013 13:00:05 [---] hpf2.Rama_smooth_dyn.dat_ss_6.4.gz status:1 inactive
10-May-2013 13:00:05 [---] hpf2.sasa_prob_cdf.txt.gz status:1 inactive
10-May-2013 13:00:05 [---] hpf2.paircutoffs.gz status:1 inactive
10-May-2013 13:00:05 [---] hpf2.SASA-angles.dat.gz status:1 inactive
10-May-2013 13:00:05 [---] hpf2.SASA-masks.dat.gz status:1 inactive
10-May-2013 13:00:05 [---] hpf2.pdbpairstats_fine.gz status:1 inactive
10-May-2013 13:00:05 [---] hpf2.sasa_offsets.txt.gz status:1 inactive
10-May-2013 13:00:05 [---] hpf2.phi.theta.36.HS.resmooth.gz status:1 inactive
10-May-2013 13:00:05 [---] hpf2.Paa.gz status:1 inactive
10-May-2013 13:00:05 [---] hpf2.phi.theta.36.SS.resmooth.gz status:1 inactive
10-May-2013 13:00:05 [---] hfcc.CHIBA_logo.tga.gzb status:1 inactive
10-May-2013 13:00:05 [---] hfcc.HFCC_logo2.tga.gzb status:1 inactive
10-May-2013 13:00:05 [---] wcgrid_sn2s_vina_6.20_windows_x86_64 status:1 inactive
10-May-2013 13:00:05 [---] wcgrid_sn2s_vina_prod_x86_64.exe.6.20 status:1 inactive
10-May-2013 13:00:05 [---] wcgrid_sn2s_gfx_prod_x86_64.exe.6.20 status:1 inactive
10-May-2013 13:00:05 [---] sn2s_image01_6.20.tga status:1 inactive
10-May-2013 13:00:05 [---] sn2s_image02_6.20.tga status:1 inactive
10-May-2013 13:00:05 [---] sn2s_image03_6.20.tga status:1 inactive
10-May-2013 13:00:05 [---] sn2s_image04_6.20.tga status:1 inactive
10-May-2013 13:00:05 [---] sn2s_image05_6.20.tga status:1 inactive
10-May-2013 13:00:05 [---] sn2s_image06_6.20.tga status:1 inactive
10-May-2013 13:00:05 [---] sn2s_image07_6.20.tga status:1 inactive
10-May-2013 13:00:05 [---] sn2s_image08_6.20.tga status:1 inactive
10-May-2013 13:00:05 [---] sn2s_image09_6.20.tga status:1 inactive
10-May-2013 13:00:05 [---] sn2s_image10_6.20.tga status:1 inactive
10-May-2013 13:00:05 [---] sn2s_image11_6.20.tga status:1 inactive
10-May-2013 13:00:05 [---] sn2s_image12_6.20.tga status:1 inactive
10-May-2013 13:00:05 [---] sn2s_image13_6.20.tga status:1 inactive
10-May-2013 13:00:05 [---] sn2s_image14_6.20.tga status:1 inactive
10-May-2013 13:00:05 [---] sn2s_image15_6.20.tga status:1 inactive
10-May-2013 13:00:05 [---] sn2s_image16_6.20.tga status:1 inactive
10-May-2013 13:00:05 [---] sn2s_image17_6.20.tga status:1 inactive
10-May-2013 13:00:05 [---] sn2s_image18_6.20.tga status:1 inactive
10-May-2013 13:00:05 [---] sn2s_image19_6.20.tga status:1 inactive
10-May-2013 13:00:05 [---] sn2s_image20_6.20.tga status:1 inactive
10-May-2013 13:00:05 [---] sn2s_image21_6.20.tga status:1 inactive
10-May-2013 13:00:05 [---] sn2s_image22_6.20.tga status:1 inactive
10-May-2013 13:00:05 [---] sn2s_image23_6.20.tga status:1 inactive
10-May-2013 13:00:05 [---] sn2s_image24_6.20.tga status:1 inactive
10-May-2013 13:00:05 [---] sn2s_image25_6.20.tga status:1 inactive
10-May-2013 13:00:05 [---] sn2s_image26_6.20.tga status:1 inactive
10-May-2013 13:00:05 [---] sn2s_image27_6.20.tga status:1 inactive
10-May-2013 13:00:05 [---] sn2s_image28_6.20.tga status:1 inactive
10-May-2013 13:00:05 [---] sn2s_image29_6.20.tga status:1 inactive
10-May-2013 13:00:05 [---] sn2s_image30_6.20.tga status:1 inactive
10-May-2013 13:00:05 [---] sn2s_image31_6.20.tga status:1 inactive
10-May-2013 13:00:05 [---] sn2s_image32_6.20.tga status:1 inactive
10-May-2013 13:00:05 [---] sn2s_image33_6.20.tga status:1 inactive
10-May-2013 13:00:05 [---] sn2s_image34_6.20.tga status:1 inactive
10-May-2013 13:00:05 [---] sn2s_image35_6.20.tga status:1 inactive
10-May-2013 13:00:05 [---] sn2s_image36_6.20.tga status:1 inactive
10-May-2013 13:00:05 [---] sn2s_image37_6.20.tga status:1 inactive
10-May-2013 13:00:05 [---] sn2s_image38_6.20.tga status:1 inactive
10-May-2013 13:00:05 [---] sn2s_image39_6.20.tga status:1 inactive
10-May-2013 13:00:05 [---] sn2s_image40_6.20.tga status:1 inactive
10-May-2013 13:00:05 [---] sn2s_image41_6.20.tga status:1 inactive
10-May-2013 13:00:05 [---] sn2s_image42_6.20.tga status:1 inactive
10-May-2013 13:00:05 [---] wcg_hfcc_autodock_6.40_windows_intelx86 status:1 inactive
10-May-2013 13:00:05 [---] wcg_hfcc_autodock_graphics_6.40_windows_intelx86 status:1 inactive
10-May-2013 13:00:05 [---] hfcc_image05_6.40.tga status:1 inactive
10-May-2013 13:00:05 [---] hfcc_text01_6.40.tga status:1 inactive
10-May-2013 13:00:05 [---] hfcc_image02_6.40.tga status:1 inactive
10-May-2013 13:00:05 [---] hfcc_image01_6.40.tga status:1 inactive
10-May-2013 13:00:05 [---] hfcc_image04_6.40.tga status:1 inactive
10-May-2013 13:00:05 [---] hfcc_protease_6.40.dat status:1 inactive
10-May-2013 13:00:05 [---] hfcc_image03_6.40.tga status:1 inactive
10-May-2013 13:00:05 [---] hfcc_image06_6.40.tga status:1 inactive
10-May-2013 13:00:05 [---] hfcc_image07_6.40.tga status:1 inactive
10-May-2013 13:00:05 [---] wcgrid_cep2_6.40_windows_intelx86 status:1 inactive
10-May-2013 13:00:05 [---] wcgrid_cep2_qchem_6.40_windows_intelx86 status:1 inactive
10-May-2013 13:00:05 [---] wcgrid_cep2_graphics_6.40_windows_intelx86 status:1 inactive
10-May-2013 13:00:05 [---] cep2_images_6.40.zip status:1 inactive
10-May-2013 13:00:05 [---] cep2_qcaux_6.40.zip status:1 inactive
10-May-2013 13:00:05 [---] wcgrid_dsfl_vina_6.25_windows_x86_64 status:1 inactive
10-May-2013 13:00:05 [---] wcgrid_dsfl_vina_prod_x86_64.exe.6.25 status:1 inactive
10-May-2013 13:00:05 [---] wcgrid_dsfl_gfx_prod_x86_64.exe.6.25 status:1 inactive
10-May-2013 13:00:05 [---] dsfl_image01_6.25.tga status:1 inactive
10-May-2013 13:00:05 [---] dsfl_image02_6.25.tga status:1 inactive
10-May-2013 13:00:05 [---] dsfl_image03_6.25.tga status:1 inactive
10-May-2013 13:00:05 [---] dsfl_image04_6.25.tga status:1 inactive
10-May-2013 13:00:05 [---] dsfl_image05_6.25.tga status:1 inactive
10-May-2013 13:00:05 [---] dsfl_image06_6.25.tga status:1 inactive
10-May-2013 13:00:05 [---] dsfl_image07_6.25.tga status:1 inactive
10-May-2013 13:00:05 [---] dsfl_image08_6.25.tga status:1 inactive
10-May-2013 13:00:05 [---] dsfl_image09_6.25.tga status:1 inactive
10-May-2013 13:00:05 [---] dsfl_image10_6.25.tga status:1 inactive
10-May-2013 13:00:05 [---] dsfl_image11_6.25.tga status:1 inactive
10-May-2013 13:00:05 [---] dsfl_image12_6.25.tga status:1 inactive
10-May-2013 13:00:05 [---] dsfl_image13_6.25.tga status:1 inactive
10-May-2013 13:00:05 [---] dsfl_image14_6.25.tga status:1 inactive
10-May-2013 13:00:05 [---] dsfl_image15_6.25.tga status:1 inactive
10-May-2013 13:00:05 [---] dsfl_image16_6.25.tga status:1 inactive
10-May-2013 13:00:05 [---] dsfl_image17_6.25.tga status:1 inactive
10-May-2013 13:00:05 [---] dsfl_image18_6.25.tga status:1 inactive
10-May-2013 13:00:05 [---] dsfl_image19_6.25.tga status:1 inactive
10-May-2013 13:00:05 [---] dsfl_image20_6.25.tga status:1 inactive
10-May-2013 13:00:05 [---] dsfl_image21_6.25.tga status:1 inactive
10-May-2013 13:00:05 [---] dsfl_image22_6.25.tga status:1 inactive
10-May-2013 13:00:05 [---] wcg_hcc1_img_7.05_windows_intelx86__nvidia_hcc1 status:1 inactive
10-May-2013 13:00:05 [---] hcckernel.cl.7.05 status:1 inactive
10-May-2013 13:00:05 [---] hcc1_image01_7.05.tga status:1 inactive
10-May-2013 13:00:05 [---] hcc1_image02_7.05.tga status:1 inactive
10-May-2013 13:00:05 [---] hcc1_image03_7.05.tga status:1 inactive
10-May-2013 13:00:05 [---] hcc1_image04_7.05.tga status:1 inactive
10-May-2013 13:00:05 [---] cep2.1_Picture1.tga status:1 inactive
10-May-2013 13:00:05 [---] cep2.1_Picture2.tga status:1 inactive
10-May-2013 13:00:05 [---] cep2.1_Picture3.tga status:1 inactive
10-May-2013 13:00:05 [---] cep2.1_Picture4.tga status:1 inactive
10-May-2013 13:00:05 [---] cep2.1_Picture5.tga status:1 inactive
10-May-2013 13:00:05 [---] cep2.1_Picture6.tga status:1 inactive
10-May-2013 13:00:05 [---] cep2.1_Picture7.tga status:1 inactive
10-May-2013 13:00:05 [---] wcgrid_dsfl_vina_6.25_windows_intelx86 status:1 inactive
10-May-2013 13:00:05 [---] wcgrid_dsfl_vina_prod_x86.exe.6.25 status:1 inactive
10-May-2013 13:00:05 [---] wcgrid_dsfl_gfx_prod_x86.exe.6.25 status:1 inactive
10-May-2013 13:00:05 [---] wcgrid_gfam_vina_6.12_windows_x86_64 status:1 inactive
10-May-2013 13:00:05 [---] wcgrid_gfam_vina_prod_x86_64.exe.6.12 status:1 inactive
10-May-2013 13:00:05 [---] wcgrid_gfam_gfx_prod_x86_64.exe.6.12 status:1 inactive
10-May-2013 13:00:05 [---] gfam_image01_6.12.tga status:1 inactive
10-May-2013 13:00:05 [---] gfam_image02_6.12.tga status:1 inactive
10-May-2013 13:00:05 [---] gfam_image03_6.12.tga status:1 inactive
10-May-2013 13:00:05 [---] gfam_image04_6.12.tga status:1 inactive
10-May-2013 13:00:05 [---] gfam_image05_6.12.tga status:1 inactive
10-May-2013 13:00:05 [---] gfam_image06_6.12.tga status:1 inactive
10-May-2013 13:00:05 [---] gfam_image07_6.12.tga status:1 inactive
10-May-2013 13:00:05 [---] gfam_image08_6.12.tga status:1 inactive
10-May-2013 13:00:05 [---] gfam_image09_6.12.tga status:1 inactive
10-May-2013 13:00:05 [---] wcgrid_faah_7.15_windows_x86_64 status:1 inactive
10-May-2013 13:00:05 [---] wcgrid_faah_graphics_prod_64.exe.7.15 status:1 inactive
10-May-2013 13:00:05 [---] faah_image01_7.15.tga status:1 inactive
10-May-2013 13:00:05 [---] wcg_hpf2_rosetta_6.40_windows_intelx86 status:1 inactive
10-May-2013 13:00:05 [---] wcg_hpf2_6.40.tga status:1 inactive
10-May-2013 13:00:05 [---] wcg_hpf2_rosetta_graphics_6.40_windows_intelx86 status:1 inactive
10-May-2013 13:00:05 [---] hpf2_6.40_win_paths.txt status:1 inactive
10-May-2013 13:00:05 [---] wcg_hcc1_img_7.05_windows_intelx86 status:1 inactive
10-May-2013 13:00:05 [---] wcgrid_hcc_img_prod_windows_graphics.exe.7.05 status:1 inactive
10-May-2013 13:00:05 [---] wcgrid_sn2s_vina_6.20_windows_intelx86 status:1 inactive
10-May-2013 13:00:05 [---] wcgrid_sn2s_vina_prod_x86.exe.6.20 status:1 inactive
10-May-2013 13:00:05 [---] wcgrid_sn2s_gfx_prod_x86.exe.6.20 status:1 inactive
10-May-2013 13:00:05 [---] wcgrid_gfam_vina_6.12_windows_intelx86 status:1 inactive
10-May-2013 13:00:05 [---] wcgrid_gfam_vina_prod_x86.exe.6.12 status:1 inactive
10-May-2013 13:00:05 [---] wcgrid_gfam_gfx_prod_x86.exe.6.12 status:1 inactive
10-May-2013 13:00:05 [---] d3d876c5ead4a4ad9001c2f53400153f.zip status:1 inactive
10-May-2013 13:00:05 [---] E213053_235_C.33.C28H18N2OSi2.01466061.4.set1d06_3_0 status:0 inactive
10-May-2013 13:00:05 [---] E213053_235_C.33.C28H18N2OSi2.01466061.4.set1d06_3_1 status:0 inactive
10-May-2013 13:00:05 [---] E213053_235_C.33.C28H18N2OSi2.01466061.4.set1d06_3_2 status:0 inactive
10-May-2013 13:00:05 [---] E213053_235_C.33.C28H18N2OSi2.01466061.4.set1d06_3_3 status:0 inactive
10-May-2013 13:00:05 [---] E213053_235_C.33.C28H18N2OSi2.01466061.4.set1d06_3_4 status:0 inactive
10-May-2013 13:00:05 [---] params_050913_223610_0.ini status:1 inactive
10-May-2013 13:00:05 [---] wu_050913_223610_0_0_0_0 status:0 inactive
10-May-2013 13:00:05 [---] wu_050913_223610_0_0_0_1 status:0 inactive
10-May-2013 13:00:05 [---] wu_050913_223610_0_0_0_2 status:0 inactive
10-May-2013 13:00:05 [---] wu_050913_223610_0_0_0_3 status:0 inactive
10-May-2013 13:00:05 [---] wu_050913_223610_0_0_0_4 status:0 inactive
10-May-2013 13:00:05 [---] wu_050913_223610_0_0_0_5 status:0 inactive
10-May-2013 13:00:05 [---] gfam.x3ENM_chnA_loopChrm1_hMAP2K.pdbqt status:1 inactive
10-May-2013 13:00:05 [---] 125b53383767172f2ab523a2be41ed25.job status:1 inactive
10-May-2013 13:00:05 [---] 636d00e7c66c8c65ed410ee324425f9e.zip status:1 inactive
10-May-2013 13:00:05 [---] dsfl.target_00080-13.pdbqt status:1 inactive
10-May-2013 13:00:05 [---] DSFL_00080-13_0000030_0649_DSFL_00080-13_0000030_0649.job status:1 inactive
10-May-2013 13:00:05 [---] DSFL_00080-13_0000030_0649_DSFL_00080-13_0000030_0649.zip status:1 inactive
10-May-2013 13:00:05 [---] GFAM_x3ENM_chnA_loopChrm1_hMAP2K_0103740_0057_0_0 status:0 inactive
10-May-2013 13:00:05 [---] DSFL_00080-13_0000030_0649_0_0 status:0 inactive
10-May-2013 13:00:05 [---] wu_v3_1367866805_161978_0_0 status:0 inactive
10-May-2013 13:00:05 [---] 1ajv1hsg_mod0014crossdockinghiv1_35803_464390.inp status:1 inactive
10-May-2013 13:00:05 [---] 1ajv1hsg_mod0014crossdockinghiv1_35805_144567.inp status:1 inactive
10-May-2013 13:00:05 [---] 1ajv1hsg_mod0014crossdockinghiv1_35803_464390_0_0 status:0 inactive
10-May-2013 13:00:05 [---] 1ajv1hsg_mod0014crossdockinghiv1_35803_464390_0_1 status:0 inactive
10-May-2013 13:00:05 [---] 1ajv1hsg_mod0014crossdockinghiv1_35803_464390_0_2 status:0 inactive
10-May-2013 13:00:05 [---] 1ajv1hsg_mod0014crossdockinghiv1_35803_464390_0_3 status:0 inactive
10-May-2013 13:00:05 [---] 1ajv1hsg_mod0014crossdockinghiv1_35805_144567_0_0 status:0 inactive
10-May-2013 13:00:05 [---] 1ajv1hsg_mod0014crossdockinghiv1_35805_144567_0_1 status:0 inactive
10-May-2013 13:00:05 [---] 1ajv1hsg_mod0014crossdockinghiv1_35805_144567_0_2 status:0 inactive
10-May-2013 13:00:05 [---] 1ajv1hsg_mod0014crossdockinghiv1_35805_144567_0_3 status:0 inactive
10-May-2013 13:00:05 [---] I40R14-NATHAN_dhfr36_6-9-LICENSE status:0 inactive
10-May-2013 13:00:05 [---] I40R14-NATHAN_dhfr36_6-9-COPYRIGHT status:0 inactive
10-May-2013 13:00:05 [---] I40R14-NATHAN_dhfr36_6-9-I40R14-NATHAN_dhfr36_6-8-32-RND5144_1 status:0 inactive
10-May-2013 13:00:05 [---] I40R14-NATHAN_dhfr36_6-9-I40R14-NATHAN_dhfr36_6-8-32-RND5144_2 status:0 inactive
10-May-2013 13:00:05 [---] I40R14-NATHAN_dhfr36_6-9-I40R14-NATHAN_dhfr36_6-8-32-RND5144_3 status:0 inactive
10-May-2013 13:00:05 [---] I40R14-NATHAN_dhfr36_6-9-pdb_file status:0 inactive
10-May-2013 13:00:05 [---] I40R14-NATHAN_dhfr36_6-9-psf_file status:0 inactive
10-May-2013 13:00:05 [---] I40R14-NATHAN_dhfr36_6-9-par_file status:0 inactive
10-May-2013 13:00:05 [---] I40R14-NATHAN_dhfr36_6-9-conf_file_enc status:0 inactive
10-May-2013 13:00:05 [---] I40R14-NATHAN_dhfr36_6-9-metainp_file status:0 inactive
10-May-2013 13:00:05 [---] I40R14-NATHAN_dhfr36_6-9-I40R14-NATHAN_dhfr36_6-8-32-RND5144_7 status:0 inactive
10-May-2013 13:00:05 [---] I40R14-NATHAN_dhfr36_6-9-I40R14-NATHAN_dhfr36_6-8-32-RND5144_10 status:0 inactive
10-May-2013 13:00:05 [---] I40R14-NATHAN_dhfr36_6-9-32-RND5144_0_0 status:0 inactive
10-May-2013 13:00:05 [---] I40R14-NATHAN_dhfr36_6-9-32-RND5144_0_1 status:0 inactive
10-May-2013 13:00:05 [---] I40R14-NATHAN_dhfr36_6-9-32-RND5144_0_2 status:0 inactive
10-May-2013 13:00:05 [---] I40R14-NATHAN_dhfr36_6-9-32-RND5144_0_3 status:0 inactive
10-May-2013 13:00:05 [---] I40R14-NATHAN_dhfr36_6-9-32-RND5144_0_4 status:0 inactive
10-May-2013 13:00:05 [---] I40R14-NATHAN_dhfr36_6-9-32-RND5144_0_5 status:0 inactive
10-May-2013 13:00:05 [---] I40R14-NATHAN_dhfr36_6-9-32-RND5144_0_6 status:0 inactive
10-May-2013 13:00:05 [---] I40R14-NATHAN_dhfr36_6-9-32-RND5144_0_7 status:0 inactive
10-May-2013 13:00:05 [---] I40R14-NATHAN_dhfr36_6-9-32-RND5144_0_8 status:0 inactive
10-May-2013 13:00:05 [---] I40R14-NATHAN_dhfr36_6-9-32-RND5144_0_9 status:0 inactive
10-May-2013 13:00:05 [---] I40R14-NATHAN_dhfr36_6-9-32-RND5144_0_10 status:0 inactive
10-May-2013 13:00:05 [---] I40R14-NATHAN_dhfr36_6-9-32-RND5144_0_11 status:0 inactive
10-May-2013 13:00:05 [---] 32 app_versions
10-May-2013 13:00:05 [---] minirosetta 346
10-May-2013 13:00:05 [---] poempp 106
10-May-2013 13:00:05 [---] poemcl 105
10-May-2013 13:00:05 [---] garli 502
10-May-2013 13:00:05 [---] simap 512
10-May-2013 13:00:05 [---] charmm34 623
10-May-2013 13:00:05 [---] ccl_wrap 189
10-May-2013 13:00:05 [---] python2.7_wrap 102
10-May-2013 13:00:05 [---] python2.7_wrap_winOnly 102
10-May-2013 13:00:05 [---] R2.15.1_wrap 103
10-May-2013 13:00:05 [---] pypy1.9_wrap 303
10-May-2013 13:00:05 [---] qcnsensor 706
10-May-2013 13:00:05 [---] minirosetta_beta 342
10-May-2013 13:00:05 [---] minirosetta 345
10-May-2013 13:00:05 [---] setiathome_enhanced 610
10-May-2013 13:00:05 [---] astropulse_v6 604
10-May-2013 13:00:05 [---] astropulse_v6 604
10-May-2013 13:00:05 [---] data_collect_v3 350
10-May-2013 13:00:05 [---] camb 216
10-May-2013 13:00:05 [---] acemdlong 618
10-May-2013 13:00:05 [---] sn2s 620
10-May-2013 13:00:05 [---] hfcc 640
10-May-2013 13:00:05 [---] cep2 640
10-May-2013 13:00:05 [---] dsfl 625
10-May-2013 13:00:05 [---] hcc1 705
10-May-2013 13:00:05 [---] dsfl 625
10-May-2013 13:00:05 [---] gfam 612
10-May-2013 13:00:05 [---] faah 715
10-May-2013 13:00:05 [---] hpf2 640
10-May-2013 13:00:05 [---] hcc1 705
10-May-2013 13:00:05 [---] sn2s 620
10-May-2013 13:00:05 [---] gfam 612
10-May-2013 13:00:05 [---] 25 workunits
10-May-2013 13:00:05 [---] t7_OFD_05_relax_SAVE_ALL_OUT_80755_528
10-May-2013 13:00:05 [---] P2_2x3_1_s7_f1_R_6_2_3_1_abinitio_design_y061_003_80737_201
10-May-2013 13:00:05 [---] 20130506.354130
10-May-2013 13:00:05 [---] 20130506.393863
10-May-2013 13:00:05 [---] 20130506.393864
10-May-2013 13:00:05 [---] 20130506.393831
10-May-2013 13:00:05 [---] 20130506.393865
10-May-2013 13:00:05 [---] 20130506.393832
10-May-2013 13:00:05 [---] 20130506.393833
10-May-2013 13:00:05 [---] 20130506.393866
10-May-2013 13:00:05 [---] 20130506.393816
10-May-2013 13:00:05 [---] 20130506.393867
10-May-2013 13:00:05 [---] 20130506.393817
10-May-2013 13:00:05 [---] 1ajv1hsg_mod0014crossdockinghiv1_28492_168327
10-May-2013 13:00:05 [---] wu_050913_065030_2_0
10-May-2013 13:00:05 [---] 291px9x1-NOELIA_klebe_run-1-3-RND0554
10-May-2013 13:00:05 [---] I29R17-NATHAN_dhfr36_6-10-32-RND4705
10-May-2013 13:00:05 [---] E213053_235_C.33.C28H18N2OSi2.01466061.4.set1d06
10-May-2013 13:00:05 [---] wu_050913_223610_0_0
10-May-2013 13:00:05 [---] GFAM_x3ENM_chnA_loopChrm1_hMAP2K_0103740_0057
10-May-2013 13:00:05 [---] DSFL_00080-13_0000030_0649
10-May-2013 13:00:05 [---] wu_v3_1367866805_161978
10-May-2013 13:00:05 [---] 1ajv1hsg_mod0014crossdockinghiv1_35803_464390
10-May-2013 13:00:05 [---] 1ajv1hsg_mod0014crossdockinghiv1_35805_144567
10-May-2013 13:00:05 [---] I40R14-NATHAN_dhfr36_6-9-32-RND5144
10-May-2013 13:00:05 [---] 25 results
10-May-2013 13:00:05 [---] wu_050913_065030_2_0_0 state:2
10-May-2013 13:00:05 [---] 291px9x1-NOELIA_klebe_run-1-3-RND0554_0 state:2
10-May-2013 13:00:05 [---] 20130506.354130_1 state:5
10-May-2013 13:00:05 [---] E213053_235_C.33.C28H18N2OSi2.01466061.4.set1d06_3 state:2
10-May-2013 13:00:05 [---] 1ajv1hsg_mod0014crossdockinghiv1_28492_168327_0 state:2
10-May-2013 13:00:05 [---] t7_OFD_05_relax_SAVE_ALL_OUT_80755_528_0 state:2
10-May-2013 13:00:05 [---] P2_2x3_1_s7_f1_R_6_2_3_1_abinitio_design_y061_003_80737_201_0 state:2
10-May-2013 13:00:05 [---] I29R17-NATHAN_dhfr36_6-10-32-RND4705_0 state:2
10-May-2013 13:00:05 [---] 20130506.393817_1 state:5
10-May-2013 13:00:05 [---] 20130506.393816_1 state:5
10-May-2013 13:00:05 [---] 20130506.393864_0 state:5
10-May-2013 13:00:05 [---] 20130506.393831_1 state:5
10-May-2013 13:00:05 [---] 20130506.393833_0 state:5
10-May-2013 13:00:05 [---] 20130506.393866_0 state:2
10-May-2013 13:00:05 [---] 20130506.393867_0 state:2
10-May-2013 13:00:05 [---] 20130506.393832_1 state:2
10-May-2013 13:00:05 [---] 20130506.393863_1 state:2
10-May-2013 13:00:05 [---] 20130506.393865_0 state:2
10-May-2013 13:00:05 [---] wu_050913_223610_0_0_0 state:2
10-May-2013 13:00:05 [---] DSFL_00080-13_0000030_0649_0 state:2
10-May-2013 13:00:05 [---] GFAM_x3ENM_chnA_loopChrm1_hMAP2K_0103740_0057_0 state:2
10-May-2013 13:00:05 [---] wu_v3_1367866805_161978_0 state:2
10-May-2013 13:00:05 [---] 1ajv1hsg_mod0014crossdockinghiv1_35803_464390_0 state:2
10-May-2013 13:00:05 [---] 1ajv1hsg_mod0014crossdockinghiv1_35805_144567_0 state:2
10-May-2013 13:00:05 [---] I40R14-NATHAN_dhfr36_6-9-32-RND5144_0 state:0
10-May-2013 13:00:05 [---] 0 persistent file xfers
10-May-2013 13:00:05 [---] 12 active tasks
10-May-2013 13:00:05 [---] wu_050913_065030_2_0_0
10-May-2013 13:00:05 [---] 291px9x1-NOELIA_klebe_run-1-3-RND0554_0
10-May-2013 13:00:05 [---] E213053_235_C.33.C28H18N2OSi2.01466061.4.set1d06_3
10-May-2013 13:00:05 [---] I29R17-NATHAN_dhfr36_6-10-32-RND4705_0
10-May-2013 13:00:05 [---] 1ajv1hsg_mod0014crossdockinghiv1_28492_168327_0
10-May-2013 13:00:05 [---] t7_OFD_05_relax_SAVE_ALL_OUT_80755_528_0
10-May-2013 13:00:05 [---] wu_050913_223610_0_0_0
10-May-2013 13:00:05 [---] P2_2x3_1_s7_f1_R_6_2_3_1_abinitio_design_y061_003_80737_201_0
10-May-2013 13:00:05 [---] DSFL_00080-13_0000030_0649_0
10-May-2013 13:00:05 [---] GFAM_x3ENM_chnA_loopChrm1_hMAP2K_0103740_0057_0
10-May-2013 13:00:05 [---] wu_v3_1367866805_161978_0
10-May-2013 13:00:05 [---] 20130506.393866_0
10-May-2013 13:00:05 [GPUGRID] [sched_op] Deferring communication for 31 sec
10-May-2013 13:00:05 [GPUGRID] [sched_op] Reason: requested by project
10-May-2013 13:00:07 [GPUGRID] [task] result state=FILES_DOWNLOADING for I40R14-NATHAN_dhfr36_6-9-32-RND5144_0 from CS::update_results
10-May-2013 13:00:08 [GPUGRID] Started download of I40R14-NATHAN_dhfr36_6-9-LICENSE
10-May-2013 13:00:08 [GPUGRID] Started download of I40R14-NATHAN_dhfr36_6-9-COPYRIGHT
10-May-2013 13:00:08 [GPUGRID] Started download of I40R14-NATHAN_dhfr36_6-9-I40R14-NATHAN_dhfr36_6-8-32-RND5144_1
10-May-2013 13:00:08 [GPUGRID] Started download of I40R14-NATHAN_dhfr36_6-9-I40R14-NATHAN_dhfr36_6-8-32-RND5144_2
10-May-2013 13:00:10 [rosetta@home] [checkpoint] result P2_2x3_1_s7_f1_R_6_2_3_1_abinitio_design_y061_003_80737_201_0 checkpointed
10-May-2013 13:00:10 [GPUGRID] Finished download of I40R14-NATHAN_dhfr36_6-9-LICENSE
10-May-2013 13:00:10 [GPUGRID] Finished download of I40R14-NATHAN_dhfr36_6-9-COPYRIGHT
10-May-2013 13:00:10 [GPUGRID] Started download of I40R14-NATHAN_dhfr36_6-9-I40R14-NATHAN_dhfr36_6-8-32-RND5144_3
10-May-2013 13:00:10 [GPUGRID] Started download of I40R14-NATHAN_dhfr36_6-9-pdb_file
10-May-2013 13:00:13 [GPUGRID] Finished download of I40R14-NATHAN_dhfr36_6-9-I40R14-NATHAN_dhfr36_6-8-32-RND5144_3
10-May-2013 13:00:13 [GPUGRID] Started download of I40R14-NATHAN_dhfr36_6-9-psf_file
10-May-2013 13:00:14 [GPUGRID] Finished download of I40R14-NATHAN_dhfr36_6-9-I40R14-NATHAN_dhfr36_6-8-32-RND5144_1
10-May-2013 13:00:14 [GPUGRID] Started download of I40R14-NATHAN_dhfr36_6-9-par_file
10-May-2013 13:00:16 [GPUGRID] Finished download of I40R14-NATHAN_dhfr36_6-9-I40R14-NATHAN_dhfr36_6-8-32-RND5144_2
10-May-2013 13:00:16 [GPUGRID] Started download of I40R14-NATHAN_dhfr36_6-9-conf_file_enc
10-May-2013 13:00:17 [GPUGRID] Finished download of I40R14-NATHAN_dhfr36_6-9-conf_file_enc
10-May-2013 13:00:17 [GPUGRID] Started download of I40R14-NATHAN_dhfr36_6-9-metainp_file
10-May-2013 13:00:18 [GPUGRID] [checkpoint] result I29R17-NATHAN_dhfr36_6-10-32-RND4705_0 checkpointed
10-May-2013 13:00:18 [GPUGRID] Finished download of I40R14-NATHAN_dhfr36_6-9-par_file
10-May-2013 13:00:18 [GPUGRID] Finished download of I40R14-NATHAN_dhfr36_6-9-metainp_file
10-May-2013 13:00:18 [GPUGRID] Started download of I40R14-NATHAN_dhfr36_6-9-I40R14-NATHAN_dhfr36_6-8-32-RND5144_7
10-May-2013 13:00:18 [GPUGRID] Started download of I40R14-NATHAN_dhfr36_6-9-I40R14-NATHAN_dhfr36_6-8-32-RND5144_10
10-May-2013 13:00:19 [GPUGRID] Finished download of I40R14-NATHAN_dhfr36_6-9-I40R14-NATHAN_dhfr36_6-8-32-RND5144_7
10-May-2013 13:00:22 [boincsimap] [checkpoint] result 20130506.393866_0 checkpointed
10-May-2013 13:00:22 [GPUGRID] Finished download of I40R14-NATHAN_dhfr36_6-9-I40R14-NATHAN_dhfr36_6-8-32-RND5144_10
10-May-2013 13:00:26 [GPUGRID] Finished download of I40R14-NATHAN_dhfr36_6-9-psf_file
10-May-2013 13:00:27 [GPUGRID] [checkpoint] result 291px9x1-NOELIA_klebe_run-1-3-RND0554_0 checkpointed
10-May-2013 13:00:29 [GPUGRID] Finished download of I40R14-NATHAN_dhfr36_6-9-pdb_file
10-May-2013 13:00:29 [GPUGRID] [task] result state=FILES_DOWNLOADED for I40R14-NATHAN_dhfr36_6-9-32-RND5144_0 from CS::update_results
10-May-2013 13:00:29 [---] [cpu_sched_debug] Request CPU reschedule: files downloaded
10-May-2013 13:00:29 [---] [cpu_sched_debug] schedule_cpus(): start
10-May-2013 13:00:29 [GPUGRID] [cpu_sched_debug] scheduling 291px9x1-NOELIA_klebe_run-1-3-RND0554_0 (coprocessor job, FIFO) (prio -7.806956)
10-May-2013 13:00:29 [GPUGRID] [cpu_sched_debug] reserving 1.000000 of coproc NVIDIA
10-May-2013 13:00:29 [GPUGRID] [cpu_sched_debug] scheduling I29R17-NATHAN_dhfr36_6-10-32-RND4705_0 (coprocessor job, FIFO) (prio -7.971775)
10-May-2013 13:00:29 [GPUGRID] [cpu_sched_debug] reserving 1.000000 of coproc NVIDIA
10-May-2013 13:00:29 [World Community Grid] [cpu_sched_debug] scheduling E213053_235_C.33.C28H18N2OSi2.01466061.4.set1d06_3 (CPU job, priority order) (prio -0.027087)
10-May-2013 13:00:29 [World Community Grid] [cpu_sched_debug] scheduling DSFL_00080-13_0000030_0649_0 (CPU job, priority order) (prio -0.027241)
10-May-2013 13:00:29 [World Community Grid] [cpu_sched_debug] scheduling GFAM_x3ENM_chnA_loopChrm1_hMAP2K_0103740_0057_0 (CPU job, priority order) (prio -0.027395)
10-May-2013 13:00:29 [Cosmology@Home] [cpu_sched_debug] scheduling wu_050913_065030_2_0_0 (CPU job, priority order) (prio -0.027825)
10-May-2013 13:00:29 [boincsimap] [cpu_sched_debug] scheduling 20130506.393866_0 (CPU job, priority order) (prio -0.027957)
10-May-2013 13:00:29 [rosetta@home] [cpu_sched_debug] scheduling P2_2x3_1_s7_f1_R_6_2_3_1_abinitio_design_y061_003_80737_201_0 (CPU job, priority order) (prio -0.028000)
10-May-2013 13:00:29 [Docking] [cpu_sched_debug] scheduling 1ajv1hsg_mod0014crossdockinghiv1_28492_168327_0 (CPU job, priority order) (prio -0.028000)
10-May-2013 13:00:29 [Cosmology@Home] [cpu_sched_debug] scheduling wu_050913_223610_0_0_0 (CPU job, priority order) (prio -0.028287)
10-May-2013 13:00:29 [---] [cpu_sched_debug] enforce_schedule(): start
10-May-2013 13:00:29 [---] [cpu_sched_debug] preliminary job list:
10-May-2013 13:00:29 [GPUGRID] [cpu_sched_debug] 0: 291px9x1-NOELIA_klebe_run-1-3-RND0554_0 (MD: no; UTS: no)
10-May-2013 13:00:29 [GPUGRID] [cpu_sched_debug] 1: I29R17-NATHAN_dhfr36_6-10-32-RND4705_0 (MD: no; UTS: no)
10-May-2013 13:00:29 [World Community Grid] [cpu_sched_debug] 2: E213053_235_C.33.C28H18N2OSi2.01466061.4.set1d06_3 (MD: no; UTS: yes)
10-May-2013 13:00:29 [World Community Grid] [cpu_sched_debug] 3: DSFL_00080-13_0000030_0649_0 (MD: no; UTS: no)
10-May-2013 13:00:29 [World Community Grid] [cpu_sched_debug] 4: GFAM_x3ENM_chnA_loopChrm1_hMAP2K_0103740_0057_0 (MD: no; UTS: no)
10-May-2013 13:00:29 [Cosmology@Home] [cpu_sched_debug] 5: wu_050913_065030_2_0_0 (MD: no; UTS: no)
10-May-2013 13:00:29 [boincsimap] [cpu_sched_debug] 6: 20130506.393866_0 (MD: no; UTS: yes)
10-May-2013 13:00:29 [rosetta@home] [cpu_sched_debug] 7: P2_2x3_1_s7_f1_R_6_2_3_1_abinitio_design_y061_003_80737_201_0 (MD: no; UTS: no)
10-May-2013 13:00:29 [Docking] [cpu_sched_debug] 8: 1ajv1hsg_mod0014crossdockinghiv1_28492_168327_0 (MD: no; UTS: no)
10-May-2013 13:00:29 [Cosmology@Home] [cpu_sched_debug] 9: wu_050913_223610_0_0_0 (MD: no; UTS: no)
10-May-2013 13:00:29 [---] [cpu_sched_debug] final job list:
10-May-2013 13:00:29 [GPUGRID] [cpu_sched_debug] 0: 291px9x1-NOELIA_klebe_run-1-3-RND0554_0 (MD: no; UTS: no)
10-May-2013 13:00:29 [GPUGRID] [cpu_sched_debug] 1: I29R17-NATHAN_dhfr36_6-10-32-RND4705_0 (MD: no; UTS: no)
10-May-2013 13:00:29 [World Community Grid] [cpu_sched_debug] 2: E213053_235_C.33.C28H18N2OSi2.01466061.4.set1d06_3 (MD: no; UTS: yes)
10-May-2013 13:00:29 [boincsimap] [cpu_sched_debug] 3: 20130506.393866_0 (MD: no; UTS: yes)
10-May-2013 13:00:29 [World Community Grid] [cpu_sched_debug] 4: DSFL_00080-13_0000030_0649_0 (MD: no; UTS: no)
10-May-2013 13:00:29 [World Community Grid] [cpu_sched_debug] 5: GFAM_x3ENM_chnA_loopChrm1_hMAP2K_0103740_0057_0 (MD: no; UTS: no)
10-May-2013 13:00:29 [Cosmology@Home] [cpu_sched_debug] 6: wu_050913_065030_2_0_0 (MD: no; UTS: no)
10-May-2013 13:00:29 [rosetta@home] [cpu_sched_debug] 7: P2_2x3_1_s7_f1_R_6_2_3_1_abinitio_design_y061_003_80737_201_0 (MD: no; UTS: no)
10-May-2013 13:00:29 [Docking] [cpu_sched_debug] 8: 1ajv1hsg_mod0014crossdockinghiv1_28492_168327_0 (MD: no; UTS: no)
10-May-2013 13:00:29 [Cosmology@Home] [cpu_sched_debug] 9: wu_050913_223610_0_0_0 (MD: no; UTS: no)
10-May-2013 13:00:29 [GPUGRID] [coproc] NVIDIA instance 0: confirming for 291px9x1-NOELIA_klebe_run-1-3-RND0554_0
10-May-2013 13:00:29 [GPUGRID] [coproc] NVIDIA instance 1: confirming for I29R17-NATHAN_dhfr36_6-10-32-RND4705_0
10-May-2013 13:00:29 [GPUGRID] [cpu_sched_debug] scheduling 291px9x1-NOELIA_klebe_run-1-3-RND0554_0
10-May-2013 13:00:29 [GPUGRID] [cpu_sched_debug] scheduling I29R17-NATHAN_dhfr36_6-10-32-RND4705_0
10-May-2013 13:00:29 [World Community Grid] [cpu_sched_debug] scheduling E213053_235_C.33.C28H18N2OSi2.01466061.4.set1d06_3
10-May-2013 13:00:29 [boincsimap] [cpu_sched_debug] scheduling 20130506.393866_0
10-May-2013 13:00:29 [World Community Grid] [cpu_sched_debug] scheduling DSFL_00080-13_0000030_0649_0
10-May-2013 13:00:29 [World Community Grid] [cpu_sched_debug] scheduling GFAM_x3ENM_chnA_loopChrm1_hMAP2K_0103740_0057_0
10-May-2013 13:00:29 [Cosmology@Home] [cpu_sched_debug] scheduling wu_050913_065030_2_0_0
10-May-2013 13:00:29 [rosetta@home] [cpu_sched_debug] scheduling P2_2x3_1_s7_f1_R_6_2_3_1_abinitio_design_y061_003_80737_201_0
10-May-2013 13:00:29 [Docking] [cpu_sched_debug] scheduling 1ajv1hsg_mod0014crossdockinghiv1_28492_168327_0
10-May-2013 13:00:29 [Cosmology@Home] [cpu_sched_debug] scheduling

Jacob Klein
Message 29806 - Posted: 10 May 2013 | 21:39:13 UTC - in response to Message 29805.

Log part 2/3:
---------------------------------------
10-May-2013 13:00:29 [Cosmology@Home] [cpu_sched_debug] scheduling wu_050913_223610_0_0_0
10-May-2013 13:00:29 [Cosmology@Home] [cpu_sched_debug] wu_050913_065030_2_0_0 sched state 2 next 2 task state 1
10-May-2013 13:00:29 [GPUGRID] [cpu_sched_debug] 291px9x1-NOELIA_klebe_run-1-3-RND0554_0 sched state 2 next 2 task state 1
10-May-2013 13:00:29 [World Community Grid] [cpu_sched_debug] E213053_235_C.33.C28H18N2OSi2.01466061.4.set1d06_3 sched state 2 next 2 task state 1
10-May-2013 13:00:29 [GPUGRID] [cpu_sched_debug] I29R17-NATHAN_dhfr36_6-10-32-RND4705_0 sched state 2 next 2 task state 1
10-May-2013 13:00:29 [Docking] [cpu_sched_debug] 1ajv1hsg_mod0014crossdockinghiv1_28492_168327_0 sched state 2 next 2 task state 1
10-May-2013 13:00:29 [rosetta@home] [cpu_sched_debug] t7_OFD_05_relax_SAVE_ALL_OUT_80755_528_0 sched state 1 next 1 task state 9
10-May-2013 13:00:29 [Cosmology@Home] [cpu_sched_debug] wu_050913_223610_0_0_0 sched state 2 next 2 task state 1
10-May-2013 13:00:29 [rosetta@home] [cpu_sched_debug] P2_2x3_1_s7_f1_R_6_2_3_1_abinitio_design_y061_003_80737_201_0 sched state 2 next 2 task state 1
10-May-2013 13:00:29 [World Community Grid] [cpu_sched_debug] DSFL_00080-13_0000030_0649_0 sched state 2 next 2 task state 1
10-May-2013 13:00:29 [World Community Grid] [cpu_sched_debug] GFAM_x3ENM_chnA_loopChrm1_hMAP2K_0103740_0057_0 sched state 2 next 2 task state 1
10-May-2013 13:00:29 [WUProp@Home] [cpu_sched_debug] wu_v3_1367866805_161978_0 sched state 2 next 2 task state 1
10-May-2013 13:00:29 [boincsimap] [cpu_sched_debug] 20130506.393866_0 sched state 2 next 2 task state 1
10-May-2013 13:00:29 [Cosmology@Home] [css] running wu_050913_065030_2_0_0 ( )
10-May-2013 13:00:29 [GPUGRID] [css] running 291px9x1-NOELIA_klebe_run-1-3-RND0554_0 (0.001 CPUs + 1 NVIDIA GPU)
10-May-2013 13:00:29 [World Community Grid] [css] running E213053_235_C.33.C28H18N2OSi2.01466061.4.set1d06_3 ( )
10-May-2013 13:00:29 [GPUGRID] [css] running I29R17-NATHAN_dhfr36_6-10-32-RND4705_0 (0.001 CPUs + 1 NVIDIA GPU)
10-May-2013 13:00:29 [Docking] [css] running 1ajv1hsg_mod0014crossdockinghiv1_28492_168327_0 ( )
10-May-2013 13:00:29 [Cosmology@Home] [css] running wu_050913_223610_0_0_0 ( )
10-May-2013 13:00:29 [rosetta@home] [css] running P2_2x3_1_s7_f1_R_6_2_3_1_abinitio_design_y061_003_80737_201_0 ( )
10-May-2013 13:00:29 [World Community Grid] [css] running DSFL_00080-13_0000030_0649_0 ( )
10-May-2013 13:00:29 [World Community Grid] [css] running GFAM_x3ENM_chnA_loopChrm1_hMAP2K_0103740_0057_0 ( )
10-May-2013 13:00:29 [WUProp@Home] [css] running wu_v3_1367866805_161978_0 ( )
10-May-2013 13:00:29 [boincsimap] [css] running 20130506.393866_0 ( )
10-May-2013 13:00:29 [---] [cpu_sched_debug] enforce_schedule: end
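
For anyone who would rather filter a dump like this than read it by eye, here is a rough Python sketch. It assumes every line follows the layout seen above (timestamp, project in brackets, an optional debug flag in brackets, then the message) and that month names are in English. The names `parse_line` and `running_tasks` are made up for illustration and are not part of BOINC.

```python
import re
from datetime import datetime

# Assumed line layout, inferred only from the log lines quoted in this thread:
#   10-May-2013 13:00:29 [GPUGRID] [css] running <task> (...)
LINE_RE = re.compile(
    r"^(?P<ts>\d{2}-[A-Za-z]{3}-\d{4} \d{2}:\d{2}:\d{2}) "
    r"\[(?P<project>[^\]]*)\] "
    r"(?:\[(?P<flag>[a-z_]+)\] )?"   # debug flag such as css, task, slot (optional)
    r"(?P<msg>.*)$"
)

def parse_line(line):
    """Split one event-log line into time, project, flag and message.

    Returns None for lines that do not match the assumed layout.
    """
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    return {
        "time": datetime.strptime(m.group("ts"), "%d-%b-%Y %H:%M:%S"),
        "project": m.group("project"),
        "flag": m.group("flag"),      # None when the line carries no flag
        "msg": m.group("msg"),
    }

def running_tasks(lines):
    """Collect (project, task) pairs from '[css] running <task> (...)' lines."""
    found = []
    for line in lines:
        rec = parse_line(line)
        if rec and rec["flag"] == "css" and rec["msg"].startswith("running "):
            found.append((rec["project"], rec["msg"].split()[1]))
    return found

if __name__ == "__main__":
    sample = [
        "10-May-2013 13:00:29 [GPUGRID] [css] running "
        "291px9x1-NOELIA_klebe_run-1-3-RND0554_0 (0.001 CPUs + 1 NVIDIA GPU)",
        "10-May-2013 13:00:29 [---] [cpu_sched_debug] enforce_schedule: end",
    ]
    for project, task in running_tasks(sample):
        print(project, task)
```

The same `parse_line` records can be filtered on other flags (for example `task` or `slot`) to follow a single result through the log.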

Jacob Klein
Message 29807 - Posted: 10 May 2013 | 21:39:53 UTC - in response to Message 29806.

Log part 3/3: where the task ran in slot 0 and completed immediately!
---------------------------------------------------------------------
10-May-2013 14:34:39 [GPUGRID] [checkpoint] result 291px9x1-NOELIA_klebe_run-1-3-RND0554_0 checkpointed
10-May-2013 14:34:40 [GPUGRID] [task] Process for 291px9x1-NOELIA_klebe_run-1-3-RND0554_0 exited, exit code 0, task state 1
10-May-2013 14:34:40 [GPUGRID] [task] task_state=EXITED for 291px9x1-NOELIA_klebe_run-1-3-RND0554_0 from handle_exited_app
10-May-2013 14:34:40 [---] [slot] cleaning out slots/0: handle_exited_app()
10-May-2013 14:34:40 [---] [slot] removed file slots/0/acemd.2865P.exe
10-May-2013 14:34:40 [---] [slot] removed file slots/0/boinc_finish_called
10-May-2013 14:34:40 [---] [slot] removed file slots/0/boinc_task_state.xml
10-May-2013 14:34:40 [---] [slot] removed file slots/0/COPYRIGHT
10-May-2013 14:34:40 [---] [slot] removed file slots/0/cudart32_42_9.dll
10-May-2013 14:34:40 [---] [slot] removed file slots/0/cufft32_42_9.dll
10-May-2013 14:34:40 [---] [slot] removed file slots/0/init_data.xml
10-May-2013 14:34:40 [---] [slot] removed file slots/0/input
10-May-2013 14:34:40 [---] [slot] removed file slots/0/input.coor
10-May-2013 14:34:40 [---] [slot] removed file slots/0/input.idx
10-May-2013 14:34:40 [---] [slot] removed file slots/0/input.vel
10-May-2013 14:34:40 [---] [slot] removed file slots/0/input.xsc
10-May-2013 14:34:40 [---] [slot] removed file slots/0/LICENSE
10-May-2013 14:34:40 [---] [slot] removed file slots/0/META_INP
10-May-2013 14:34:40 [---] [slot] removed file slots/0/output.coor
10-May-2013 14:34:40 [---] [slot] removed file slots/0/output.dcd
10-May-2013 14:34:40 [---] [slot] removed file slots/0/output.idx
10-May-2013 14:34:40 [---] [slot] removed file slots/0/output.vel
10-May-2013 14:34:40 [---] [slot] removed file slots/0/output.vel.dcd
10-May-2013 14:34:40 [---] [slot] removed file slots/0/output.xsc
10-May-2013 14:34:40 [---] [slot] removed file slots/0/output.xstfile
10-May-2013 14:34:40 [---] [slot] removed file slots/0/output.xtc
10-May-2013 14:34:40 [---] [slot] removed file slots/0/parameters
10-May-2013 14:34:40 [---] [slot] removed file slots/0/progress.log
10-May-2013 14:34:40 [---] [slot] removed file slots/0/restart.coor
10-May-2013 14:34:40 [---] [slot] removed file slots/0/restart.idx
10-May-2013 14:34:40 [---] [slot] removed file slots/0/restart.vel
10-May-2013 14:34:40 [---] [slot] removed file slots/0/restart.xsc
10-May-2013 14:34:40 [---] [slot] removed file slots/0/structure.pdb
10-May-2013 14:34:40 [---] [slot] removed file slots/0/structure.psf
10-May-2013 14:34:40 [---] [slot] removed file slots/0/tcl85.dll
10-May-2013 14:34:40 [---] [cpu_sched_debug] Request CPU reschedule: application exited
10-May-2013 14:34:40 [GPUGRID] Computation for task 291px9x1-NOELIA_klebe_run-1-3-RND0554_0 finished
10-May-2013 14:34:40 [---] [slot] removed file projects/www.gpugrid.net/291px9x1-NOELIA_klebe_run-1-3-RND0554_0_0
10-May-2013 14:34:40 [GPUGRID] [task] result state=FILES_UPLOADING for 291px9x1-NOELIA_klebe_run-1-3-RND0554_0 from CS::app_finished
10-May-2013 14:34:40 [---] [cpu_sched_debug] Request CPU reschedule: handle_finished_apps
10-May-2013 14:34:40 [---] [cpu_sched_debug] schedule_cpus(): start
10-May-2013 14:34:40 [GPUGRID] [cpu_sched_debug] scheduling I29R17-NATHAN_dhfr36_6-10-32-RND4705_0 (coprocessor job, FIFO) (prio -7.807973)
10-May-2013 14:34:40 [GPUGRID] [cpu_sched_debug] reserving 1.000000 of coproc NVIDIA
10-May-2013 14:34:40 [GPUGRID] [cpu_sched_debug] scheduling I40R14-NATHAN_dhfr36_6-9-32-RND5144_0 (coprocessor job, FIFO) (prio -7.972792)
10-May-2013 14:34:40 [GPUGRID] [cpu_sched_debug] reserving 1.000000 of coproc NVIDIA
10-May-2013 14:34:40 [World Community Grid] [cpu_sched_debug] scheduling E213053_235_C.33.C28H18N2OSi2.01466061.4.set1d06_3 (CPU job, priority order) (prio -0.026939)
10-May-2013 14:34:40 [World Community Grid] [cpu_sched_debug] scheduling DSFL_00080-13_0000030_0649_0 (CPU job, priority order) (prio -0.027093)
10-May-2013 14:34:40 [World Community Grid] [cpu_sched_debug] scheduling GFAM_x3ENM_chnA_loopChrm1_hMAP2K_0103740_0057_0 (CPU job, priority order) (prio -0.027247)
10-May-2013 14:34:40 [World Community Grid] [cpu_sched_debug] scheduling GFAM_x3ENM_chnA_loopChrm1_hMAP2K_0100059_0145_1 (CPU job, priority order) (prio -0.027401)
10-May-2013 14:34:40 [World Community Grid] [cpu_sched_debug] scheduling GFAM_x3ENM_chnA_loopChrm1_hMAP2K_0100059_0086_1 (CPU job, priority order) (prio -0.027555)
10-May-2013 14:34:40 [Cosmology@Home] [cpu_sched_debug] scheduling wu_050913_065030_2_0_0 (CPU job, priority order) (prio -0.027758)
10-May-2013 14:34:40 [boincsimap] [cpu_sched_debug] scheduling 20130506.393832_1 (CPU job, priority order) (prio -0.027794)
10-May-2013 14:34:40 [Docking] [cpu_sched_debug] scheduling 1ajv1hsg_mod0014crossdockinghiv1_35803_464390_0 (CPU job, priority order) (prio -0.027825)
10-May-2013 14:34:40 [---] [cpu_sched_debug] enforce_schedule(): start
10-May-2013 14:34:40 [---] [cpu_sched_debug] preliminary job list:
10-May-2013 14:34:40 [GPUGRID] [cpu_sched_debug] 0: I29R17-NATHAN_dhfr36_6-10-32-RND4705_0 (MD: no; UTS: no)
10-May-2013 14:34:40 [GPUGRID] [cpu_sched_debug] 1: I40R14-NATHAN_dhfr36_6-9-32-RND5144_0 (MD: no; UTS: yes)
10-May-2013 14:34:40 [World Community Grid] [cpu_sched_debug] 2: E213053_235_C.33.C28H18N2OSi2.01466061.4.set1d06_3 (MD: no; UTS: yes)
10-May-2013 14:34:40 [World Community Grid] [cpu_sched_debug] 3: DSFL_00080-13_0000030_0649_0 (MD: no; UTS: no)
10-May-2013 14:34:40 [World Community Grid] [cpu_sched_debug] 4: GFAM_x3ENM_chnA_loopChrm1_hMAP2K_0103740_0057_0 (MD: no; UTS: no)
10-May-2013 14:34:40 [World Community Grid] [cpu_sched_debug] 5: GFAM_x3ENM_chnA_loopChrm1_hMAP2K_0100059_0145_1 (MD: no; UTS: yes)
10-May-2013 14:34:40 [World Community Grid] [cpu_sched_debug] 6: GFAM_x3ENM_chnA_loopChrm1_hMAP2K_0100059_0086_1 (MD: no; UTS: yes)
10-May-2013 14:34:40 [Cosmology@Home] [cpu_sched_debug] 7: wu_050913_065030_2_0_0 (MD: no; UTS: no)
10-May-2013 14:34:40 [boincsimap] [cpu_sched_debug] 8: 20130506.393832_1 (MD: no; UTS: yes)
10-May-2013 14:34:40 [Docking] [cpu_sched_debug] 9: 1ajv1hsg_mod0014crossdockinghiv1_35803_464390_0 (MD: no; UTS: no)
10-May-2013 14:34:40 [---] [cpu_sched_debug] final job list:
10-May-2013 14:34:40 [GPUGRID] [cpu_sched_debug] 0: I29R17-NATHAN_dhfr36_6-10-32-RND4705_0 (MD: no; UTS: no)
10-May-2013 14:34:40 [GPUGRID] [cpu_sched_debug] 1: I40R14-NATHAN_dhfr36_6-9-32-RND5144_0 (MD: no; UTS: no)
10-May-2013 14:34:40 [World Community Grid] [cpu_sched_debug] 2: E213053_235_C.33.C28H18N2OSi2.01466061.4.set1d06_3 (MD: no; UTS: yes)
10-May-2013 14:34:40 [World Community Grid] [cpu_sched_debug] 3: GFAM_x3ENM_chnA_loopChrm1_hMAP2K_0100059_0145_1 (MD: no; UTS: yes)
10-May-2013 14:34:40 [World Community Grid] [cpu_sched_debug] 4: GFAM_x3ENM_chnA_loopChrm1_hMAP2K_0100059_0086_1 (MD: no; UTS: yes)
10-May-2013 14:34:40 [boincsimap] [cpu_sched_debug] 5: 20130506.393832_1 (MD: no; UTS: yes)
10-May-2013 14:34:40 [rosetta@home] [cpu_sched_debug] 6: t7_OFD_05_relax_SAVE_ALL_OUT_80755_528_0 (MD: no; UTS: yes)
10-May-2013 14:34:40 [World Community Grid] [cpu_sched_debug] 7: DSFL_00080-13_0000030_0649_0 (MD: no; UTS: no)
10-May-2013 14:34:40 [World Community Grid] [cpu_sched_debug] 8: GFAM_x3ENM_chnA_loopChrm1_hMAP2K_0103740_0057_0 (MD: no; UTS: no)
10-May-2013 14:34:40 [Cosmology@Home] [cpu_sched_debug] 9: wu_050913_065030_2_0_0 (MD: no; UTS: no)
10-May-2013 14:34:40 [Docking] [cpu_sched_debug] 10: 1ajv1hsg_mod0014crossdockinghiv1_35803_464390_0 (MD: no; UTS: no)
10-May-2013 14:34:40 [GPUGRID] [coproc] NVIDIA instance 1: confirming for I29R17-NATHAN_dhfr36_6-10-32-RND4705_0
10-May-2013 14:34:40 [GPUGRID] [coproc] Assigning NVIDIA instance 0 to I40R14-NATHAN_dhfr36_6-9-32-RND5144_0
10-May-2013 14:34:40 [GPUGRID] [cpu_sched_debug] scheduling I29R17-NATHAN_dhfr36_6-10-32-RND4705_0
10-May-2013 14:34:40 [GPUGRID] [cpu_sched_debug] scheduling I40R14-NATHAN_dhfr36_6-9-32-RND5144_0
10-May-2013 14:34:40 [---] [slot] cleaning out slots/0: get_free_slot()
10-May-2013 14:34:40 [GPUGRID] [slot] assigning slot 0 to I40R14-NATHAN_dhfr36_6-9-32-RND5144_0
10-May-2013 14:34:40 [World Community Grid] [cpu_sched_debug] scheduling E213053_235_C.33.C28H18N2OSi2.01466061.4.set1d06_3
10-May-2013 14:34:40 [World Community Grid] [cpu_sched_debug] scheduling GFAM_x3ENM_chnA_loopChrm1_hMAP2K_0100059_0145_1
10-May-2013 14:34:40 [World Community Grid] [cpu_sched_debug] scheduling GFAM_x3ENM_chnA_loopChrm1_hMAP2K_0100059_0086_1
10-May-2013 14:34:40 [boincsimap] [cpu_sched_debug] scheduling 20130506.393832_1
10-May-2013 14:34:40 [rosetta@home] [cpu_sched_debug] scheduling t7_OFD_05_relax_SAVE_ALL_OUT_80755_528_0
10-May-2013 14:34:40 [World Community Grid] [cpu_sched_debug] scheduling DSFL_00080-13_0000030_0649_0
10-May-2013 14:34:40 [World Community Grid] [cpu_sched_debug] scheduling GFAM_x3ENM_chnA_loopChrm1_hMAP2K_0103740_0057_0
10-May-2013 14:34:40 [Cosmology@Home] [cpu_sched_debug] scheduling wu_050913_065030_2_0_0
10-May-2013 14:34:40 [Docking] [cpu_sched_debug] all CPUs used (8.00 >= 8), skipping 1ajv1hsg_mod0014crossdockinghiv1_35803_464390_0
10-May-2013 14:34:40 [Cosmology@Home] [cpu_sched_debug] wu_050913_065030_2_0_0 sched state 2 next 2 task state 1
10-May-2013 14:34:40 [World Community Grid] [cpu_sched_debug] E213053_235_C.33.C28H18N2OSi2.01466061.4.set1d06_3 sched state 2 next 2 task state 1
10-May-2013 14:34:40 [GPUGRID] [cpu_sched_debug] I29R17-NATHAN_dhfr36_6-10-32-RND4705_0 sched state 2 next 2 task state 1
10-May-2013 14:34:40 [rosetta@home] [cpu_sched_debug] t7_OFD_05_relax_SAVE_ALL_OUT_80755_528_0 sched state 2 next 2 task state 1
10-May-2013 14:34:40 [Cosmology@Home] [cpu_sched_debug] wu_050913_223610_0_0_0 sched state 1 next 1 task state 9
10-May-2013 14:34:40 [World Community Grid] [cpu_sched_debug] DSFL_00080-13_0000030_0649_0 sched state 2 next 2 task state 1
10-May-2013 14:34:40 [World Community Grid] [cpu_sched_debug] GFAM_x3ENM_chnA_loopChrm1_hMAP2K_0103740_0057_0 sched state 2 next 2 task state 1
10-May-2013 14:34:40 [Docking] [cpu_sched_debug] 1ajv1hsg_mod0014crossdockinghiv1_35803_464390_0 sched state 1 next 1 task state 9
10-May-2013 14:34:40 [WUProp@Home] [cpu_sched_debug] wu_v3_1367866805_167468_0 sched state 2 next 2 task state 1
10-May-2013 14:34:40 [boincsimap] [cpu_sched_debug] 20130506.393832_1 sched state 2 next 2 task state 1
10-May-2013 14:34:40 [World Community Grid] [cpu_sched_debug] GFAM_x3ENM_chnA_loopChrm1_hMAP2K_0100059_0145_1 sched state 2 next 2 task state 1
10-May-2013 14:34:40 [World Community Grid] [cpu_sched_debug] GFAM_x3ENM_chnA_loopChrm1_hMAP2K_0100059_0086_1 sched state 2 next 2 task state 1
10-May-2013 14:34:40 [GPUGRID] [cpu_sched_debug] I40R14-NATHAN_dhfr36_6-9-32-RND5144_0 sched state 0 next 2 task state 0
10-May-2013 14:34:40 [Cosmology@Home] [css] running wu_050913_065030_2_0_0 ( )
10-May-2013 14:34:40 [---] [cpu_sched_debug] app startup took 11.329414 secs
10-May-2013 14:34:40 [---] [cpu_sched_debug] Request CPU reschedule: slow app startup
10-May-2013 14:34:40 [---] [cpu_sched_debug] enforce_schedule: end
10-May-2013 14:34:52 [---] [cpu_sched_debug] schedule_cpus(): start
10-May-2013 14:34:52 [GPUGRID] [cpu_sched_debug] scheduling I29R17-NATHAN_dhfr36_6-10-32-RND4705_0 (coprocessor job, FIFO) (prio -7.807973)
10-May-2013 14:34:52 [GPUGRID] [cpu_sched_debug] reserving 1.000000 of coproc NVIDIA
10-May-2013 14:34:52 [GPUGRID] [cpu_sched_debug] scheduling I40R14-NATHAN_dhfr36_6-9-32-RND5144_0 (coprocessor job, FIFO) (prio -7.972792)
10-May-2013 14:34:52 [GPUGRID] [cpu_sched_debug] reserving 1.000000 of coproc NVIDIA
10-May-2013 14:34:52 [World Community Grid] [cpu_sched_debug] scheduling E213053_235_C.33.C28H18N2OSi2.01466061.4.set1d06_3 (CPU job, priority order) (prio -0.026939)
10-May-2013 14:34:52 [World Community Grid] [cpu_sched_debug] scheduling DSFL_00080-13_0000030_0649_0 (CPU job, priority order) (prio -0.027093)
10-May-2013 14:34:52 [World Community Grid] [cpu_sched_debug] scheduling GFAM_x3ENM_chnA_loopChrm1_hMAP2K_0103740_0057_0 (CPU job, priority order) (prio -0.027247)
10-May-2013 14:34:52 [World Community Grid] [cpu_sched_debug] scheduling GFAM_x3ENM_chnA_loopChrm1_hMAP2K_0100059_0145_1 (CPU job, priority order) (prio -0.027401)
10-May-2013 14:34:52 [World Community Grid] [cpu_sched_debug] scheduling GFAM_x3ENM_chnA_loopChrm1_hMAP2K_0100059_0086_1 (CPU job, priority order) (prio -0.027555)
10-May-2013 14:34:52 [Cosmology@Home] [cpu_sched_debug] scheduling wu_050913_065030_2_0_0 (CPU job, priority order) (prio -0.027758)
10-May-2013 14:34:52 [boincsimap] [cpu_sched_debug] scheduling 20130506.393832_1 (CPU job, priority order) (prio -0.027794)
10-May-2013 14:34:52 [Docking] [cpu_sched_debug] scheduling 1ajv1hsg_mod0014crossdockinghiv1_35803_464390_0 (CPU job, priority order) (prio -0.027824)
10-May-2013 14:34:52 [---] [cpu_sched_debug] enforce_schedule(): start
10-May-2013 14:34:52 [---] [cpu_sched_debug] preliminary job list:
10-May-2013 14:34:52 [GPUGRID] [cpu_sched_debug] 0: I29R17-NATHAN_dhfr36_6-10-32-RND4705_0 (MD: no; UTS: no)
10-May-2013 14:34:52 [GPUGRID] [cpu_sched_debug] 1: I40R14-NATHAN_dhfr36_6-9-32-RND5144_0 (MD: no; UTS: no)
10-May-2013 14:34:52 [World Community Grid] [cpu_sched_debug] 2: E213053_235_C.33.C28H18N2OSi2.01466061.4.set1d06_3 (MD: no; UTS: yes)
10-May-2013 14:34:52 [World Community Grid] [cpu_sched_debug] 3: DSFL_00080-13_0000030_0649_0 (MD: no; UTS: no)
10-May-2013 14:34:52 [World Community Grid] [cpu_sched_debug] 4: GFAM_x3ENM_chnA_loopChrm1_hMAP2K_0103740_0057_0 (MD: no; UTS: no)
10-May-2013 14:34:52 [World Community Grid] [cpu_sched_debug] 5: GFAM_x3ENM_chnA_loopChrm1_hMAP2K_0100059_0145_1 (MD: no; UTS: yes)
10-May-2013 14:34:52 [World Community Grid] [cpu_sched_debug] 6: GFAM_x3ENM_chnA_loopChrm1_hMAP2K_0100059_0086_1 (MD: no; UTS: yes)
10-May-2013 14:34:52 [Cosmology@Home] [cpu_sched_debug] 7: wu_050913_065030_2_0_0 (MD: no; UTS: no)
10-May-2013 14:34:52 [boincsimap] [cpu_sched_debug] 8: 20130506.393832_1 (MD: no; UTS: yes)
10-May-2013 14:34:52 [Docking] [cpu_sched_debug] 9: 1ajv1hsg_mod0014crossdockinghiv1_35803_464390_0 (MD: no; UTS: no)
10-May-2013 14:34:52 [---] [cpu_sched_debug] final job list:
10-May-2013 14:34:52 [GPUGRID] [cpu_sched_debug] 0: I29R17-NATHAN_dhfr36_6-10-32-RND4705_0 (MD: no; UTS: no)
10-May-2013 14:34:52 [GPUGRID] [cpu_sched_debug] 1: I40R14-NATHAN_dhfr36_6-9-32-RND5144_0 (MD: no; UTS: no)
10-May-2013 14:34:52 [World Community Grid] [cpu_sched_debug] 2: E213053_235_C.33.C28H18N2OSi2.01466061.4.set1d06_3 (MD: no; UTS: yes)
10-May-2013 14:34:52 [World Community Grid] [cpu_sched_debug] 3: GFAM_x3ENM_chnA_loopChrm1_hMAP2K_0100059_0145_1 (MD: no; UTS: yes)
10-May-2013 14:34:52 [World Community Grid] [cpu_sched_debug] 4: GFAM_x3ENM_chnA_loopChrm1_hMAP2K_0100059_0086_1 (MD: no; UTS: yes)
10-May-2013 14:34:52 [boincsimap] [cpu_sched_debug] 5: 20130506.393832_1 (MD: no; UTS: yes)
10-May-2013 14:34:52 [rosetta@home] [cpu_sched_debug] 6: t7_OFD_05_relax_SAVE_ALL_OUT_80755_528_0 (MD: no; UTS: yes)
10-May-2013 14:34:52 [World Community Grid] [cpu_sched_debug] 7: DSFL_00080-13_0000030_0649_0 (MD: no; UTS: no)
10-May-2013 14:34:52 [World Community Grid] [cpu_sched_debug] 8: GFAM_x3ENM_chnA_loopChrm1_hMAP2K_0103740_0057_0 (MD: no; UTS: no)
10-May-2013 14:34:52 [Cosmology@Home] [cpu_sched_debug] 9: wu_050913_065030_2_0_0 (MD: no; UTS: no)
10-May-2013 14:34:52 [Docking] [cpu_sched_debug] 10: 1ajv1hsg_mod0014crossdockinghiv1_35803_464390_0 (MD: no; UTS: no)
10-May-2013 14:34:52 [GPUGRID] [coproc] NVIDIA instance 1: confirming for I29R17-NATHAN_dhfr36_6-10-32-RND4705_0
10-May-2013 14:34:52 [GPUGRID] [coproc] Assigning NVIDIA instance 0 to I40R14-NATHAN_dhfr36_6-9-32-RND5144_0
10-May-2013 14:34:52 [GPUGRID] [cpu_sched_debug] scheduling I29R17-NATHAN_dhfr36_6-10-32-RND4705_0
10-May-2013 14:34:52 [GPUGRID] [cpu_sched_debug] scheduling I40R14-NATHAN_dhfr36_6-9-32-RND5144_0
10-May-2013 14:34:52 [World Community Grid] [cpu_sched_debug] scheduling E213053_235_C.33.C28H18N2OSi2.01466061.4.set1d06_3
10-May-2013 14:34:52 [World Community Grid] [cpu_sched_debug] scheduling GFAM_x3ENM_chnA_loopChrm1_hMAP2K_0100059_0145_1
10-May-2013 14:34:52 [World Community Grid] [cpu_sched_debug] scheduling GFAM_x3ENM_chnA_loopChrm1_hMAP2K_0100059_0086_1
10-May-2013 14:34:52 [boincsimap] [cpu_sched_debug] scheduling 20130506.393832_1
10-May-2013 14:34:52 [rosetta@home] [cpu_sched_debug] scheduling t7_OFD_05_relax_SAVE_ALL_OUT_80755_528_0
10-May-2013 14:34:52 [World Community Grid] [cpu_sched_debug] scheduling DSFL_00080-13_0000030_0649_0
10-May-2013 14:34:52 [World Community Grid] [cpu_sched_debug] scheduling GFAM_x3ENM_chnA_loopChrm1_hMAP2K_0103740_0057_0
10-May-2013 14:34:52 [Cosmology@Home] [cpu_sched_debug] scheduling wu_050913_065030_2_0_0
10-May-2013 14:34:52 [Docking] [cpu_sched_debug] all CPUs used (8.00 >= 8), skipping 1ajv1hsg_mod0014crossdockinghiv1_35803_464390_0
10-May-2013 14:34:52 [Cosmology@Home] [cpu_sched_debug] wu_050913_065030_2_0_0 sched state 2 next 2 task state 1
10-May-2013 14:34:52 [World Community Grid] [cpu_sched_debug] E213053_235_C.33.C28H18N2OSi2.01466061.4.set1d06_3 sched state 2 next 2 task state 1
10-May-2013 14:34:52 [GPUGRID] [cpu_sched_debug] I29R17-NATHAN_dhfr36_6-10-32-RND4705_0 sched state 2 next 2 task state 1
10-May-2013 14:34:52 [rosetta@home] [cpu_sched_debug] t7_OFD_05_relax_SAVE_ALL_OUT_80755_528_0 sched state 2 next 2 task state 1
10-May-2013 14:34:52 [Cosmology@Home] [cpu_sched_debug] wu_050913_223610_0_0_0 sched state 1 next 1 task state 9
10-May-2013 14:34:52 [World Community Grid] [cpu_sched_debug] DSFL_00080-13_0000030_0649_0 sched state 2 next 2 task state 1
10-May-2013 14:34:52 [World Community Grid] [cpu_sched_debug] GFAM_x3ENM_chnA_loopChrm1_hMAP2K_0103740_0057_0 sched state 2 next 2 task state 1
10-May-2013 14:34:52 [Docking] [cpu_sched_debug] 1ajv1hsg_mod0014crossdockinghiv1_35803_464390_0 sched state 1 next 1 task state 9
10-May-2013 14:34:52 [WUProp@Home] [cpu_sched_debug] wu_v3_1367866805_167468_0 sched state 2 next 2 task state 1
10-May-2013 14:34:52 [boincsimap] [cpu_sched_debug] 20130506.393832_1 sched state 2 next 2 task state 1
10-May-2013 14:34:52 [World Community Grid] [cpu_sched_debug] GFAM_x3ENM_chnA_loopChrm1_hMAP2K_0100059_0145_1 sched state 2 next 2 task state 1
10-May-2013 14:34:52 [World Community Grid] [cpu_sched_debug] GFAM_x3ENM_chnA_loopChrm1_hMAP2K_0100059_0086_1 sched state 2 next 2 task state 1
10-May-2013 14:34:52 [GPUGRID] [cpu_sched_debug] I40R14-NATHAN_dhfr36_6-9-32-RND5144_0 sched state 0 next 2 task state 0
10-May-2013 14:34:52 [Cosmology@Home] [css] running wu_050913_065030_2_0_0 ( )
10-May-2013 14:34:52 [World Community Grid] [css] running E213053_235_C.33.C28H18N2OSi2.01466061.4.set1d06_3 ( )
10-May-2013 14:34:52 [GPUGRID] [css] running I29R17-NATHAN_dhfr36_6-10-32-RND4705_0 (0.001 CPUs + 1 NVIDIA GPU)
10-May-2013 14:34:52 [rosetta@home] [css] running t7_OFD_05_relax_SAVE_ALL_OUT_80755_528_0 ( )
10-May-2013 14:34:52 [World Community Grid] [css] running DSFL_00080-13_0000030_0649_0 ( )
10-May-2013 14:34:52 [World Community Grid] [css] running GFAM_x3ENM_chnA_loopChrm1_hMAP2K_0103740_0057_0 ( )
10-May-2013 14:34:52 [WUProp@Home] [css] running wu_v3_1367866805_167468_0 ( )
10-May-2013 14:34:52 [boincsimap] [css] running 20130506.393832_1 ( )
10-May-2013 14:34:52 [World Community Grid] [css] running GFAM_x3ENM_chnA_loopChrm1_hMAP2K_0100059_0145_1 ( )
10-May-2013 14:34:52 [World Community Grid] [css] running GFAM_x3ENM_chnA_loopChrm1_hMAP2K_0100059_0086_1 ( )
10-May-2013 14:34:52 [GPUGRID] [task] task_state=COPY_PENDING for I40R14-NATHAN_dhfr36_6-9-32-RND5144_0 from start
10-May-2013 14:34:52 [GPUGRID] Starting task I40R14-NATHAN_dhfr36_6-9-32-RND5144_0 using acemdlong version 618 (cuda42) in slot 0
10-May-2013 14:34:52 [GPUGRID] [css] running I40R14-NATHAN_dhfr36_6-9-32-RND5144_0 (0.001 CPUs + 1 NVIDIA GPU)
10-May-2013 14:34:52 [---] [cpu_sched_debug] enforce_schedule: end
10-May-2013 14:34:52 [---] [slot] removed file slots/0/init_data.xml
10-May-2013 14:34:52 [GPUGRID] [slot] linked ../../projects/www.gpugrid.net/I40R14-NATHAN_dhfr36_6-9-LICENSE to slots/0/LICENSE
10-May-2013 14:34:52 [GPUGRID] [slot] linked ../../projects/www.gpugrid.net/I40R14-NATHAN_dhfr36_6-9-COPYRIGHT to slots/0/COPYRIGHT
10-May-2013 14:34:52 [GPUGRID] [slot] linked ../../projects/www.gpugrid.net/I40R14-NATHAN_dhfr36_6-9-pdb_file to slots/0/structure.pdb
10-May-2013 14:34:52 [GPUGRID] [slot] linked ../../projects/www.gpugrid.net/I40R14-NATHAN_dhfr36_6-9-psf_file to slots/0/structure.psf
10-May-2013 14:34:52 [GPUGRID] [slot] linked ../../projects/www.gpugrid.net/I40R14-NATHAN_dhfr36_6-9-par_file to slots/0/parameters
10-May-2013 14:34:52 [GPUGRID] [slot] linked ../../projects/www.gpugrid.net/I40R14-NATHAN_dhfr36_6-9-conf_file_enc to slots/0/input
10-May-2013 14:34:52 [GPUGRID] [slot] linked ../../projects/www.gpugrid.net/I40R14-NATHAN_dhfr36_6-9-I40R14-NATHAN_dhfr36_6-8-32-RND5144_10 to slots/0/input.xsc
10-May-2013 14:34:52 [GPUGRID] [slot] linked ../../projects/www.gpugrid.net/I40R14-NATHAN_dhfr36_6-9-32-RND5144_0_0 to slots/0/progress.log
10-May-2013 14:34:52 [GPUGRID] [slot] linked ../../projects/www.gpugrid.net/I40R14-NATHAN_dhfr36_6-9-32-RND5144_0_1 to slots/0/output.coor
10-May-2013 14:34:52 [GPUGRID] [slot] linked ../../projects/www.gpugrid.net/I40R14-NATHAN_dhfr36_6-9-32-RND5144_0_2 to slots/0/output.vel
10-May-2013 14:34:52 [GPUGRID] [slot] linked ../../projects/www.gpugrid.net/I40R14-NATHAN_dhfr36_6-9-32-RND5144_0_3 to slots/0/output.idx
10-May-2013 14:34:52 [GPUGRID] [slot] linked ../../projects/www.gpugrid.net/I40R14-NATHAN_dhfr36_6-9-32-RND5144_0_4 to slots/0/output.dcd
10-May-2013 14:34:52 [GPUGRID] [slot] linked ../../projects/www.gpugrid.net/I40R14-NATHAN_dhfr36_6-9-32-RND5144_0_8 to slots/0/output.vel.dcd
10-May-2013 14:34:52 [GPUGRID] [slot] linked ../../projects/www.gpugrid.net/I40R14-NATHAN_dhfr36_6-9-32-RND5144_0_9 to slots/0/output.xtc
10-May-2013 14:34:52 [GPUGRID] [slot] linked ../../projects/www.gpugrid.net/I40R14-NATHAN_dhfr36_6-9-32-RND5144_0_10 to slots/0/output.xsc
10-May-2013 14:34:52 [GPUGRID] [slot] linked ../../projects/www.gpugrid.net/I40R14-NATHAN_dhfr36_6-9-32-RND5144_0_11 to slots/0/output.xstfile
10-May-2013 14:34:52 [GPUGRID] [task] task_state=EXECUTING for I40R14-NATHAN_dhfr36_6-9-32-RND5144_0 from start
10-May-2013 14:34:53 [GPUGRID] Started upload of 291px9x1-NOELIA_klebe_run-1-3-RND0554_0_0
10-May-2013 14:34:53 [GPUGRID] Started upload of 291px9x1-NOELIA_klebe_run-1-3-RND0554_0_1
10-May-2013 14:34:53 [GPUGRID] Started upload of 291px9x1-NOELIA_klebe_run-1-3-RND0554_0_2
10-May-2013 14:34:53 [GPUGRID] Started upload of 291px9x1-NOELIA_klebe_run-1-3-RND0554_0_3
10-May-2013 14:34:55 [boincsimap] [checkpoint] result 20130506.393832_1 checkpointed
10-May-2013 14:34:56 [GPUGRID] [task] Process for I40R14-NATHAN_dhfr36_6-9-32-RND5144_0 exited, exit code 0, task state 1
10-May-2013 14:34:56 [GPUGRID] [task] task_state=EXITED for I40R14-NATHAN_dhfr36_6-9-32-RND5144_0 from handle_exited_app
10-May-2013 14:34:56 [---] [slot] cleaning out slots/0: handle_exited_app()
10-May-2013 14:34:56 [---] [slot] removed file slots/0/acemd.2865P.exe
10-May-2013 14:34:56 [---] [slot] removed file slots/0/boinc_finish_called
10-May-2013 14:34:56 [---] [slot] removed file slots/0/COPYRIGHT
10-May-2013 14:34:56 [---] [slot] removed file slots/0/cudart32_42_9.dll
10-May-2013 14:34:56 [---] [slot] removed file slots/0/cufft32_42_9.dll
10-May-2013 14:34:56 [---] [slot] removed file slots/0/init_data.xml
10-May-2013 14:34:56 [---] [slot] removed file slots/0/input
10-May-2013 14:34:56 [---] [slot] removed file slots/0/input.coor
10-May-2013 14:34:56 [---] [slot] removed file slots/0/input.idx
10-May-2013 14:34:56 [---] [slot] removed file slots/0/input.vel
10-May-2013 14:34:56 [---] [slot] removed file slots/0/input.xsc
10-May-2013 14:34:56 [---] [slot] removed file slots/0/LICENSE
10-May-2013 14:34:56 [---] [slot] removed file slots/0/META_INP
10-May-2013 14:34:56 [---] [slot] removed file slots/0/output.coor
10-May-2013 14:34:56 [---] [slot] removed file slots/0/output.dcd
10-May-2013 14:34:56 [---] [slot] removed file slots/0/output.idx
10-May-2013 14:34:56 [---] [slot] removed file slots/0/output.vel
10-May-2013 14:34:56 [---] [slot] removed file slots/0/output.vel.dcd
10-May-2013 14:34:56 [---] [slot] removed file slots/0/output.xsc
10-May-2013 14:34:56 [---] [slot] removed file slots/0/output.xstfile
10-May-2013 14:34:56 [---] [slot] removed file slots/0/output.xtc
10-May-2013 14:34:56 [---] [slot] removed file slots/0/parameters
10-May-2013 14:34:56 [---] [slot] removed file slots/0/progress.log
10-May-2013 14:34:56 [---] [slot] removed file slots/0/structure.pdb
10-May-2013 14:34:56 [---] [slot] removed file slots/0/structure.psf
10-May-2013 14:34:56 [---] [slot] removed file slots/0/tcl85.dll
10-May-2013 14:34:56 [---] [cpu_sched_debug] Request CPU reschedule: application exited
10-May-2013 14:34:56 [GPUGRID] Computation for task I40R14-NATHAN_dhfr36_6-9-32-RND5144_0 finished
10-May-2013 14:34:56 [---] [slot] removed file projects/www.gpugrid.net/I40R14-NATHAN_dhfr36_6-9-32-RND5144_0_0
10-May-2013 14:34:56 [GPUGRID] [task] result state=FILES_UPLOADING for I40R14-NATHAN_dhfr36_6-9-32-RND5144_0 from CS::app_finished
10-May-2013 14:34:56 [---] [cpu_sched_debug] Request CPU reschedule: handle_finished_apps
10-May-2013 14:34:56 [---] [cpu_sched_debug] schedule_cpus(): start
10-May-2013 14:34:56 [GPUGRID] [cpu_sched_debug] scheduling I29R17-NATHAN_dhfr36_6-10-32-RND4705_0 (coprocessor job, FIFO) (prio -7.807974)
10-May-2013 14:34:56 [GPUGRID] [cpu_sched_debug] reserving 1.000000 of coproc NVIDIA
10-May-2013 14:34:56 [World Community Grid] [cpu_sched_debug] scheduling E213053_235_C.33.C28H18N2OSi2.01466061.4.set1d06_3 (CPU job, priority order) (prio -0.026939)
10-May-2013 14:34:56 [World Community Grid] [cpu_sched_debug] scheduling DSFL_00080-13_0000030_0649_0 (CPU job, priority order) (prio -0.027093)
10-May-2013 14:34:56 [World Community Grid] [cpu_sched_debug] scheduling GFAM_x3ENM_chnA_loopChrm1_hMAP2K_0103740_0057_0 (CPU job, priority order) (prio -0.027247)
10-May-2013 14:34:56 [World Community Grid] [cpu_sched_debug] scheduling GFAM_x3ENM_chnA_loopChrm1_hMAP2K_0100059_0145_1 (CPU job, priority order) (prio -0.027401)
10-May-2013 14:34:56 [World Community Grid] [cpu_sched_debug] scheduling GFAM_x3ENM_chnA_loopChrm1_hMAP2K_0100059_0086_1 (CPU job, priority order) (prio -0.027555)
10-May-2013 14:34:56 [Cosmology@Home] [cpu_sched_debug] scheduling wu_050913_065030_2_0_0 (CPU job, priority order) (prio -0.027758)
10-May-2013 14:34:56 [boincsimap] [cpu_sched_debug] scheduling 20130506.393832_1 (CPU job, priority order) (prio -0.027794)
10-May-2013 14:34:56 [Docking] [cpu_sched_debug] scheduling 1ajv1hsg_mod0014crossdockinghiv1_35803_464390_0 (CPU job, priority order) (prio -0.027824)
10-May-2013 14:34:56 [---] [cpu_sched_debug] enforce_schedule(): start
10-May-2013 14:34:56 [---] [cpu_sched_debug] preliminary job list:
10-May-2013 14:34:56 [GPUGRID] [cpu_sched_debug] 0: I29R17-NATHAN_dhfr36_6-10-32-RND4705_0 (MD: no; UTS: no)
10-May-2013 14:34:56 [World Community Grid] [cpu_sched_debug] 1: E213053_235_C.33.C28H18N2OSi2.01466061.4.set1d06_3 (MD: no; UTS: yes)
10-May-2013 14:34:56 [World Community Grid] [cpu_sched_debug] 2: DSFL_00080-13_0000030_0649_0 (MD: no; UTS: no)
10-May-2013 14:34:56 [World Community Grid] [cpu_sched_debug] 3: GFAM_x3ENM_chnA_loopChrm1_hMAP2K_0103740_0057_0 (MD: no; UTS: no)
10-May-2013 14:34:56 [World Community Grid] [cpu_sched_debug] 4: GFAM_x3ENM_chnA_loopChrm1_hMAP2K_0100059_0145_1 (MD: no; UTS: yes)
10-May-2013 14:34:56 [World Community Grid] [cpu_sched_debug] 5: GFAM_x3ENM_chnA_loopChrm1_hMAP2K_0100059_0086_1 (MD: no; UTS: yes)
10-May-2013 14:34:56 [Cosmology@Home] [cpu_sched_debug] 6: wu_050913_065030_2_0_0 (MD: no; UTS: no)
10-May-2013 14:34:56 [boincsimap] [cpu_sched_debug] 7: 20130506.393832_1 (MD: no; UTS: yes)
10-May-2013 14:34:56 [Docking] [cpu_sched_debug] 8: 1ajv1hsg_mod0014crossdockinghiv1_35803_464390_0 (MD: no; UTS: no)
10-May-2013 14:34:56 [---] [cpu_sched_debug] final job list:
10-May-2013 14:34:56 [GPUGRID] [cpu_sched_debug] 0: I29R17-NATHAN_dhfr36_6-10-32-RND4705_0 (MD: no; UTS: no)
10-May-2013 14:34:56 [World Community Grid] [cpu_sched_debug] 1: E213053_235_C.33.C28H18N2OSi2.01466061.4.set1d06_3 (MD: no; UTS: yes)
10-May-2013 14:34:56 [World Community Grid] [cpu_sched_debug] 2: GFAM_x3ENM_chnA_loopChrm1_hMAP2K_0100059_0145_1 (MD: no; UTS: yes)
10-May-2013 14:34:56 [World Community Grid] [cpu_sched_debug] 3: GFAM_x3ENM_chnA_loopChrm1_hMAP2K_0100059_0086_1 (MD: no; UTS: yes)
10-May-2013 14:34:56 [boincsimap] [cpu_sched_debug] 4: 20130506.393832_1 (MD: no; UTS: yes)
10-May-2013 14:34:56 [rosetta@home] [cpu_sched_debug] 5: t7_OFD_05_relax_SAVE_ALL_OUT_80755_528_0 (MD: no; UTS: yes)
10-May-2013 14:34:56 [World Community Grid] [cpu_sched_debug] 6: DSFL_00080-13_0000030_0649_0 (MD: no; UTS: no)
10-May-2013 14:34:56 [World Community Grid] [cpu_sched_debug] 7: GFAM_x3ENM_chnA_loopChrm1_hMAP2K_0103740_0057_0 (MD: no; UTS: no)
10-May-2013 14:34:56 [Cosmology@Home] [cpu_sched_debug] 8: wu_050913_065030_2_0_0 (MD: no; UTS: no)
10-May-2013 14:34:56 [Docking] [cpu_sched_debug] 9: 1ajv1hsg_mod0014crossdockinghiv1_35803_464390_0 (MD: no; UTS: no)
10-May-2013 14:34:56 [GPUGRID] [coproc] NVIDIA instance 1: confirming for I29R17-NATHAN_dhfr36_6-10-32-RND4705_0
10-May-2013 14:34:56 [GPUGRID] [cpu_sched_debug] scheduling I29R17-NATHAN_dhfr36_6-10-32-RND4705_0
10-May-2013 14:34:56 [World Community Grid] [cpu_sched_debug] scheduling E213053_235_C.33.C28H18N2OSi2.01466061.4.set1d06_3
10-May-2013 14:34:56 [World Community Grid] [cpu_sched_debug] scheduling GFAM_x3ENM_chnA_loopChrm1_hMAP2K_0100059_0145_1
10-May-2013 14:34:56 [World Community Grid] [cpu_sched_debug] scheduling GFAM_x3ENM_chnA_loopChrm1_hMAP2K_0100059_0086_1
10-May-2013 14:34:56 [boincsimap] [cpu_sched_debug] scheduling 20130506.393832_1
10-May-2013 14:34:56 [rosetta@home] [cpu_sched_debug] scheduling t7_OFD_05_relax_SAVE_ALL_OUT_80755_528_0
10-May-2013 14:34:56 [World Community Grid] [cpu_sched_debug] scheduling DSFL_00080-13_0000030_0649_0
10-May-2013 14:34:56 [World Community Grid] [cpu_sched_debug] scheduling GFAM_x3ENM_chnA_loopChrm1_hMAP2K_0103740_0057_0
10-May-2013 14:34:56 [Cosmology@Home] [cpu_sched_debug] scheduling wu_050913_065030_2_0_0
10-May-2013 14:34:56 [Docking] [cpu_sched_debug] all CPUs used (8.00 >= 8), skipping 1ajv1hsg_mod0014crossdockinghiv1_35803_464390_0
10-May-2013 14:34:56 [Cosmology@Home] [cpu_sched_debug] wu_050913_065030_2_0_0 sched state 2 next 2 task state 1
10-May-2013 14:34:56 [World Community Grid] [cpu_sched_debug] E213053_235_C.33.C28H18N2OSi2.01466061.4.set1d06_3 sched state 2 next 2 task state 1
10-May-2013 14:34:56 [GPUGRID] [cpu_sched_debug] I29R17-NATHAN_dhfr36_6-10-32-RND4705_0 sched state 2 next 2 task state 1
10-May-2013 14:34:56 [rosetta@home] [cpu_sched_debug] t7_OFD_05_relax_SAVE_ALL_OUT_80755_528_0 sched state 2 next 2 task state 1
10-May-2013 14:34:56 [Cosmology@Home] [cpu_sched_debug] wu_050913_223610_0_0_0 sched state 1 next 1 task state 9
10-May-2013 14:34:56 [World Community Grid] [cpu_sched_debug] DSFL_00080-13_0000030_0649_0 sched state 2 next 2 task state 1
10-May-2013 14:34:56 [World Community Grid] [cpu_sched_debug] GFAM_x3ENM_chnA_loopChrm1_hMAP2K_0103740_0057_0 sched state 2 next 2 task state 1
10-May-2013 14:34:56 [Docking] [cpu_sched_debug] 1ajv1hsg_mod0014crossdockinghiv1_35803_464390_0 sched state 1 next 1 task state 9
10-May-2013 14:34:56 [WUProp@Home] [cpu_sched_debug] wu_v3_1367866805_167468_0 sched state 2 next 2 task state 1
10-May-2013 14:34:56 [boincsimap] [cpu_sched_debug] 20130506.393832_1 sched state 2 next 2 task state 1
10-May-2013 14:34:56 [World Community Grid] [cpu_sched_debug] GFAM_x3ENM_chnA_loopChrm1_hMAP2K_0100059_0145_1 sched state 2 next 2 task state 1
10-May-2013 14:34:56 [World Community Grid] [cpu_sched_debug] GFAM_x3ENM_chnA_loopChrm1_hMAP2K_0100059_0086_1 sched state 2 next 2 task state 1
10-May-2013 14:34:56 [Cosmology@Home] [css] running wu_050913_065030_2_0_0 ( )
10-May-2013 14:34:56 [---] [cpu_sched_debug] app startup took 10.949368 secs
10-May-2013 14:34:56 [---] [cpu_sched_debug] Request CPU reschedule: slow app startup
10-May-2013 14:34:56 [---] [cpu_sched_debug] enforce_schedule: end

Jacob Klein
Joined: 11 Oct 08
Posts: 1127
Credit: 1,901,927,545
RAC: 0
Message 29819 - Posted: 11 May 2013 | 12:36:09 UTC - in response to Message 29807.
Last modified: 11 May 2013 | 13:22:51 UTC

GPUGrid.net Admins:
Could you please investigate this more?


I have found some more interesting information recently, and I need your help to solve the problem!

I just had 7 more tasks exhibit the issue. Again, I was using the app_config.xml file with max_concurrent 9999, gpu_usage 1, and cpu_usage 0.001. Every one of the tasks was a NATHAN dhfr36 task, every one was assigned NVIDIA device 0, and, most curiously, every one was assigned slot 0.
Each said: Assigning NVIDIA instance 0
Each said: assigning slot 0
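
For anyone trying to reproduce this, here is a minimal sketch of how an app_config.xml with those three values could be generated. The element layout follows the BOINC app_config.xml format as I understand it; the app name "acemdlong" is taken from the "using acemdlong version 618" line in the log above, and the helper name build_app_config is made up for illustration. Which apps the actual file listed is not shown in this thread, so treat that part as an assumption.

```python
import xml.etree.ElementTree as ET


def build_app_config(app_name, max_concurrent, gpu_usage, cpu_usage):
    """Return app_config.xml text for a single app (hypothetical helper)."""
    root = ET.Element("app_config")
    app = ET.SubElement(root, "app")
    ET.SubElement(app, "name").text = app_name
    ET.SubElement(app, "max_concurrent").text = str(max_concurrent)
    # gpu_usage is the GPU fraction per task; cpu_usage is the CPU fraction per task.
    gpu_versions = ET.SubElement(app, "gpu_versions")
    ET.SubElement(gpu_versions, "gpu_usage").text = str(gpu_usage)
    ET.SubElement(gpu_versions, "cpu_usage").text = str(cpu_usage)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Values quoted in the post; the app name is an assumption taken from the log.
    print(build_app_config("acemdlong", 9999, 1, 0.001))
```

The output would go into the project directory (projects/www.gpugrid.net in the log above), and the client would need to re-read its config files or be restarted to pick it up.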

I have the full log files - Nathan, can we get this fixed yet?

I'm not sure what my next test will be; obviously, my attention is again focused on slot 0. I'm thinking that the errors might only happen in slot 0. That would explain why it's sometimes hard for me to reproduce the error, and also why, when the error happens, it sometimes happens to several units in a row (because they all went into slot 0, out of slot 0, into slot 0, etc.). Looking at the logs, it appears that when Nathan units go into a slot OTHER THAN slot 0, they work correctly for me. It's only when they go into slot 0 that this behavior is exhibited.
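
One way to check the slot 0 theory against a full event log is to pair each "Starting task ... in slot N" line with the matching "Computation for task ... finished" line and list the tasks that finished suspiciously fast, together with their slot. The following is a rough sketch only: the function name short_runs and the 60-second threshold are arbitrary choices for illustration, and the elapsed time is wall-clock time between the two log lines, not the run time the server reports.

```python
import re
from datetime import datetime

# Patterns based on the message formats visible in the log excerpt above.
START = re.compile(
    r"^(\S+ \S+) \[[^\]]*\] Starting task (\S+) using \S+ version \d+ .*in slot (\d+)$"
)
DONE = re.compile(r"^(\S+ \S+) \[[^\]]*\] Computation for task (\S+) finished$")


def parse_time(stamp):
    """Parse a timestamp such as '10-May-2013 14:34:52'."""
    return datetime.strptime(stamp, "%d-%b-%Y %H:%M:%S")


def short_runs(lines, threshold_secs=60):
    """Return (task, slot, elapsed_seconds) for tasks that finished quickly."""
    started = {}
    hits = []
    for line in lines:
        line = line.strip()
        m = START.match(line)
        if m:
            started[m.group(2)] = (parse_time(m.group(1)), int(m.group(3)))
            continue
        m = DONE.match(line)
        if m and m.group(2) in started:
            began, slot = started.pop(m.group(2))
            elapsed = (parse_time(m.group(1)) - began).total_seconds()
            if elapsed < threshold_secs:
                hits.append((m.group(2), slot, elapsed))
    return hits


if __name__ == "__main__":
    sample = [
        "10-May-2013 14:34:52 [GPUGRID] Starting task I40R14-NATHAN_dhfr36_6-9-32-RND5144_0 using acemdlong version 618 (cuda42) in slot 0",
        "10-May-2013 14:34:56 [GPUGRID] Computation for task I40R14-NATHAN_dhfr36_6-9-32-RND5144_0 finished",
    ]
    for task, slot, elapsed in short_runs(sample):
        print(task, "slot", slot, "finished after", elapsed, "seconds")
```

If every hit comes back with slot 0 and tasks started in other slots never show up, that would support the theory; a single hit in another slot would rule it out.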

So, here's a rough plan of some good tests for me:
- Keep the app_config, but suspend all other projects, and cancel all work for those projects, to see if I can easily recreate the issue without any CPU load, by cycling units through slot 0.
- Get rid of the app_config, suspend all other projects, and cancel all work for those projects, reset the GPUGrid.net project, to see if I can easily recreate the issue without any CPU load, and without an app_config, by cycling units through slot 0.

There's a chance I might get a ton of BOINC credit here. I hope you all know that getting credit is not my intent. I'm trying to solve this problem, and am requesting more help from the admins.

Here are the most-recent 7 that had the issue:
I58R6-NATHAN_dhfr36_6-13-32-RND6142_0 http://www.gpugrid.net/result.php?resultid=6851610
I94R18-NATHAN_dhfr36_6-13-32-RND7621_0 http://www.gpugrid.net/result.php?resultid=6851657
I24R9-NATHAN_dhfr36_6-8-32-RND6727_1 http://www.gpugrid.net/result.php?resultid=6851729
I34R9-NATHAN_dhfr36_6-13-32-RND9945_0 http://www.gpugrid.net/result.php?resultid=6851861
I10R6-NATHAN_dhfr36_6-11-32-RND2288_0 http://www.gpugrid.net/result.php?resultid=6851874
I92R1-NATHAN_dhfr36_6-11-32-RND7368_0 http://www.gpugrid.net/result.php?resultid=6851902
I40R1-NATHAN_dhfr36_6-11-32-RND3637_1 http://www.gpugrid.net/result.php?resultid=6852035

Thanks,
Jacob

skgiven
Volunteer moderator
Volunteer tester
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Message 29823 - Posted: 11 May 2013 | 16:31:11 UTC - in response to Message 29819.
Last modified: 11 May 2013 | 17:04:36 UTC

From the run times and CPU times, it looks like these slot 0 WUs might all have run on your GTX460; the 660Ti uses a full CPU from the outset, while the WUs listed in your last post are finishing in between 3.10 and 3.12 seconds with CPU usage of ~0.89 seconds. Perhaps you can confirm or reject this, possibly just by seeing what is presently running and looking at the finish times?

If this is the case, then the problem might be narrowed down to a slot 0 issue while using app_config and GPUs of two different generations, or specifically a GTX460. It also seems specific to Nathan WUs, but there has been little else of late.
____________
FAQ's

HOW TO:
- Opt out of Beta Tests
- Ask for Help

Jacob Klein
Joined: 11 Oct 08
Posts: 1127
Credit: 1,901,927,545
RAC: 0
Message 29824 - Posted: 11 May 2013 | 17:44:49 UTC - in response to Message 29823.
Last modified: 11 May 2013 | 17:57:09 UTC

From the run times and CPU times, it looks like these slot 0 WUs might all have run on your GTX460; the 660Ti uses a full CPU from the outset, while the WUs listed in your last post are finishing in between 3.10 and 3.12 seconds with CPU usage of ~0.89 seconds. Perhaps you can confirm or reject this, possibly just by seeing what is presently running and looking at the finish times?

If this is the case, then the problem might be narrowed down to a slot 0 issue while using app_config and GPUs of two different generations, or specifically a GTX460. It also seems specific to Nathan WUs, but there has been little else of late.


As stated in the previous post... They all said "Assigning NVIDIA instance 0", which is the GTX 660 Ti, not the GTX 460.
Also, this issue has only ever happened on Nathan units, mainly long-run dhfr36 tasks, but occasionally also a short-run RPS1_respawn task. I've processed all of my NOELIA units without exhibiting this issue.
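
The instance numbers can be pulled out of the log mechanically rather than by eye. A small sketch, assuming the two [coproc] message formats seen in the log above ("Assigning NVIDIA instance N to TASK" and "NVIDIA instance N: confirming for TASK"); the function name instance_by_task is invented for illustration. Note that this only reports the instance number BOINC logged, and whether that number matches a particular physical card still has to be confirmed separately.

```python
import re

# Message formats taken from the [coproc] lines in the log excerpt above.
ASSIGN = re.compile(r"\[coproc\] Assigning NVIDIA instance (\d+) to (\S+)$")
CONFIRM = re.compile(r"\[coproc\] NVIDIA instance (\d+): confirming for (\S+)$")


def instance_by_task(lines):
    """Map each task name to the last NVIDIA instance number logged for it."""
    mapping = {}
    for line in lines:
        line = line.strip()
        for pattern in (ASSIGN, CONFIRM):
            m = pattern.search(line)
            if m:
                mapping[m.group(2)] = int(m.group(1))
    return mapping


if __name__ == "__main__":
    sample = [
        "10-May-2013 14:34:52 [GPUGRID] [coproc] NVIDIA instance 1: confirming for I29R17-NATHAN_dhfr36_6-10-32-RND4705_0",
        "10-May-2013 14:34:52 [GPUGRID] [coproc] Assigning NVIDIA instance 0 to I40R14-NATHAN_dhfr36_6-9-32-RND5144_0",
    ]
    for task, instance in instance_by_task(sample).items():
        print(task, "-> NVIDIA instance", instance)
```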

skgiven
Volunteer moderator
Volunteer tester
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Message 29825 - Posted: 11 May 2013 | 19:03:00 UTC - in response to Message 29824.

As stated in the previous post... They all said "Assigning NVIDIA instance 0", which is the GTX 660 Ti, not the GTX 460.
Also, this issue has only ever happened on Nathan units, mainly long-run dhfr36 tasks, but occasionally also a short-run RPS1_respawn task. I've processed all of my NOELIA units without exhibiting this issue.

Just checking that your instance 0 corresponds to your GTX660Ti. I ask because my system says "[2] NVIDIA GeForce GTX 660 Ti (2048MB) driver: 311.6", and GPU-Z and Afterburner indicate that my GTX660Ti is in PCIE0; however, BOINC says the GTX660Ti is device 1 and the GTX470 is device 0. Also, when I check the properties of a NATHAN_dhfr36 WU running on my GTX660Ti after 3 seconds, it says CPU time 2 sec and run time 3 sec.

Jacob Klein
Joined: 11 Oct 08
Posts: 1127
Credit: 1,901,927,545
RAC: 0
Message 29826 - Posted: 11 May 2013 | 19:05:51 UTC - in response to Message 29825.

Right, sorry for any confusion.
Throughout all of this thread, and even now, BOINC has assigned my GPU devices consistently as:
GPU Device 0: eVGA GTX 660 Ti 3GB FTW
GPU Device 1: eVGA GTX 460 1GB

So, those 7 tasks that had the issue, and the 1 task before them... were all for sure running on:
GPU Device 0: eVGA GTX 660 Ti 3GB FTW

I'm almost at the point where I get to test how easily I can recreate the bug, with no CPU tasks. This should prove interesting.

skgiven
Volunteer moderator
Volunteer tester
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Message 29827 - Posted: 11 May 2013 | 19:14:05 UTC - in response to Message 29826.
Last modified: 11 May 2013 | 19:15:45 UTC

Well, that's good; we now know the problem is linked to your GTX660Ti somehow rather than your GTX460, as well as to app_config, possibly NATHAN_dhfr36 WUs, and possibly high CPU usage.

Jacob Klein
Joined: 11 Oct 08
Posts: 1127
Credit: 1,901,927,545
RAC: 0
Message 29829 - Posted: 11 May 2013 | 19:26:54 UTC - in response to Message 29827.

Well, that's good; we now know the problem is linked to your GTX660Ti somehow rather than your GTX460, as well as to app_config, possibly NATHAN_dhfr36 WUs, and possibly high CPU usage.


Well... we only know that the last 8 instances were linked to the GTX 660 Ti and were linked to slot 0. For the previous 10-20 instances, I wasn't tracking closely enough to know that information.

We know for sure it is not related to WCG, and it's not related to running 2-at-a-time-on-1-GPU.

I think we know that it only happens on Nathan units, but not all Nathan units. And, as far as I could tell so far, it only ever happened using an app_config.

I'd still call it progress. Now I get to try to narrow it down... just me, because I'm the only one that has the issue, so far. I'm hopeful I can narrow it down to a point where someone else could easily reproduce it -- that's my goal.

Jacob Klein
Joined: 11 Oct 08
Posts: 1127
Credit: 1,901,927,545
RAC: 0
Message 29835 - Posted: 12 May 2013 | 3:18:04 UTC - in response to Message 29829.

I have good news, to an extent.

In a test earlier this evening, I had no tasks in my local queue. I allowed GPUGrid (with app_config) to work-fetch, and it asked for and received 2 tasks. It got a NOELIA and a NATHAN dhfr36. The NOELIA started in slot 0 on device 0, and the NATHAN started in slot 2 on device 1. Both ran just fine.

In a test just a couple of minutes ago, I again had no tasks in my local queue. I allowed GPUGrid (with app_config) to work-fetch, and it asked for and received 2 tasks. It got 2 NATHAN dhfr36 tasks. When the first downloaded, it started in slot 0 on device 0 and immediately completed, exhibiting the issue; it then uploaded the result and cleaned the slot. When the second task downloaded, it too started in slot 0 on device 0 and immediately completed, exhibiting the issue.

Thus, I can conclude that the issue is not related at all to CPU saturation.
For that reason alone, tonight was a huge step forward, I think.

My next tests will be:
- set BOINC to only run GPUGrid on device 0, and see if it still happens
- set BOINC to only run GPUGrid on device 1, and see if it still happens
- reset GPUGrid, so it uses default values without app_config, and see if it still happens.

If any of you would like to attempt to get NATHAN dhfr36 tasks into your slot 0 to test, that would be helpful.

Thanks,
Jacob

Jacob Klein
Joined: 11 Oct 08
Posts: 1127
Credit: 1,901,927,545
RAC: 0
Message 29836 - Posted: 12 May 2013 | 3:49:29 UTC - in response to Message 29835.
Last modified: 12 May 2013 | 4:02:48 UTC

I tested having BOINC just use device 0, by adding the following line to the cc_config.xml file and restarting BOINC:
<ignore_nvidia_dev>1</ignore_nvidia_dev>

Then I allowed new tasks for GPUGrid (with app_config.xml) and got a NATHAN dhfr36. It was assigned "NVIDIA instance 0" (which I guess is device 0) and slot 0, and it immediately completed, exhibiting the issue.

Getting closer.... :)

Jacob Klein
Joined: 11 Oct 08
Posts: 1127
Credit: 1,901,927,545
RAC: 0
Message 29837 - Posted: 12 May 2013 | 4:02:07 UTC - in response to Message 29836.
Last modified: 12 May 2013 | 4:04:10 UTC

I tested having BOINC just use device 1, by adding the following line to the cc_config.xml file and restarting BOINC:
<ignore_nvidia_dev>0</ignore_nvidia_dev>

Then I allowed new tasks for GPUGrid (with app_config.xml) and got a NATHAN dhfr36. It was assigned "NVIDIA instance 0" (which I guess might be device 1) and slot 0, and it immediately completed, exhibiting the issue.

Perhaps the next test should specify an exclude_gpu block, instead of using ignore_nvidia_dev. This way, I can see in the Tasks pane for sure that the behavior is happening on each device individually. Testing this theory now.

Jacob Klein
Message 29838 - Posted: 12 May 2013 | 4:17:05 UTC - in response to Message 29837.
Last modified: 12 May 2013 | 4:23:09 UTC

Alright, with the following block in my cc_config.xml:
<exclude_gpu>
<url>http://www.gpugrid.net</url>
<device_num>1</device_num>
</exclude_gpu>

... I restarted BOINC, allowed GPUGrid (with app_config) to get a task, it got assigned to instance 0 slot 0, and (the important part) I verified that the Tasks pane showed that it executed on my device 0 (the GTX 660 Ti), and it completed immediately, exhibiting the behavior.

Now to test device 1 (with verification in the Tasks pane)...

Jacob Klein
Message 29839 - Posted: 12 May 2013 | 4:22:31 UTC - in response to Message 29838.

With the following block in my cc_config.xml:
<exclude_gpu>
<url>http://www.gpugrid.net</url>
<device_num>0</device_num>
</exclude_gpu>

... I restarted BOINC, allowed GPUGrid (with app_config) to get a task, it got assigned to instance 1 slot 0, and (the important part) I verified that the Tasks pane showed that it executed on my device 1 (the GTX 460), and it completed immediately, exhibiting the behavior.

Thus, whatever is happening can currently happen on either of my 2 GPUs.
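
For anyone repeating the two exclusion tests above, here is a minimal sketch that generates the cc_config.xml text instead of hand-editing it. It assumes the exclude_gpu block sits inside the usual cc_config/options wrapper; the function name build_cc_config is made up for illustration and is not part of BOINC.

```python
import xml.etree.ElementTree as ET


def build_cc_config(project_url: str, excluded_device: int) -> str:
    """Return cc_config.xml text that excludes one GPU for one project.

    build_cc_config is a hypothetical helper, not a BOINC API.
    """
    root = ET.Element("cc_config")
    options = ET.SubElement(root, "options")
    exclude = ET.SubElement(options, "exclude_gpu")
    ET.SubElement(exclude, "url").text = project_url
    ET.SubElement(exclude, "device_num").text = str(excluded_device)
    # ET.indent only exists on Python 3.9+; skip pretty-printing otherwise.
    if hasattr(ET, "indent"):
        ET.indent(root)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Exclude device 1, so GPUGrid can only run on device 0.
    print(build_cc_config("http://www.gpugrid.net", 1))
```

Passing 0 instead of 1 gives the config for the second test. BOINC still needs a restart (as in the tests above) after the file is replaced.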

I love how far I'm getting tonight. I'm going to keep pushing.

Jacob Klein
Message 29840 - Posted: 12 May 2013 | 4:50:44 UTC - in response to Message 29839.

So... I pulled the GTX 460 completely out of the system, and removed the exclude_gpu blocks from the cc_config.xml file.

... I restarted BOINC, allowed GPUGrid (with app_config) to get a task, it got assigned to instance 0 slot 0, and it completed immediately, exhibiting the behavior.

Thus, I conclude that whatever is happening is not related to having 2 GPUs in the system.

Still pushing onward...

Jacob Klein
Message 29841 - Posted: 12 May 2013 | 5:09:19 UTC - in response to Message 29840.
Last modified: 12 May 2013 | 5:20:22 UTC

Alright, so, with the GTX 460 still out of the system, I stopped using my Account Manager (BOINCstats BAM!), so I could remove the GPUGrid.net project. Then I restarted BOINC. Then I re-added the Account Manager, which re-added the GPUGrid.net project.

So, it initialized, and I did not place an app_config.xml file in the project directory.

And then... using default project app settings (since there was no app_config.xml file), it got a Nathan dhfr36 task, assigned to instance 0 slot 0, and completed immediately, exhibiting the behavior.

Thus, I can now *finally* conclude that:
app_config.xml is COMPLETELY UNRELATED to this issue.

PS: I also ran a test showing that the issue still occurs even if the project is added manually (not from an Account Manager).

I've still got a couple more tests to run... pushing forward...

Log snippet:
5/12/2013 1:03:34 AM | GPUGRID | [coproc] Assigning NVIDIA instance 0 to I98R4-NATHAN_dhfr36_6-15-32-RND2897_0
5/12/2013 1:03:34 AM | GPUGRID | [slot] assigning slot 0 to I98R4-NATHAN_dhfr36_6-15-32-RND2897_0
5/12/2013 1:03:34 AM | GPUGRID | Starting task I98R4-NATHAN_dhfr36_6-15-32-RND2897_0 using acemdlong version 618 (cuda42) in slot 0
5/12/2013 1:03:34 AM | GPUGRID | [css] running I98R4-NATHAN_dhfr36_6-15-32-RND2897_0 (0.729 CPUs + 1 NVIDIA GPU)
5/12/2013 1:03:34 AM | GPUGRID | [task] task_state=EXECUTING for I98R4-NATHAN_dhfr36_6-15-32-RND2897_0 from start
5/12/2013 1:03:36 AM | GPUGRID | [css] running I98R4-NATHAN_dhfr36_6-15-32-RND2897_0 (0.729 CPUs + 1 NVIDIA GPU)
5/12/2013 1:03:37 AM | GPUGRID | [task] Process for I98R4-NATHAN_dhfr36_6-15-32-RND2897_0 exited, exit code 0, task state 1
5/12/2013 1:03:37 AM | GPUGRID | [task] task_state=EXITED for I98R4-NATHAN_dhfr36_6-15-32-RND2897_0 from handle_exited_app
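
To spot this pattern without watching the Event Log, a rough sketch like the one below can pair each "Starting task" line with its "Process for ... exited" line and report tasks that finished suspiciously fast. It assumes the log lines look exactly like the snippet above; the 10-second threshold and the name find_quick_exits are my own choices for illustration, not anything BOINC defines.

```python
import re
from datetime import datetime

# Matches lines like: 5/12/2013 1:03:34 AM | GPUGRID | <message>
LINE_RE = re.compile(
    r"^(?P<ts>\d+/\d+/\d+ \d+:\d+:\d+ [AP]M) \| (?P<project>[^|]*) \| (?P<msg>.*)$"
)
START_RE = re.compile(r"^Starting task (?P<task>\S+) ")
EXIT_RE = re.compile(
    r"^\[task\] Process for (?P<task>\S+) exited, exit code (?P<code>-?\d+)"
)


def find_quick_exits(log_text, threshold_seconds=10):
    """Return (task, elapsed_seconds, exit_code) for tasks that exited quickly.

    threshold_seconds is an assumed cutoff, not a BOINC setting.
    """
    starts = {}
    quick = []
    for line in log_text.splitlines():
        m = LINE_RE.match(line.strip())
        if not m:
            continue
        ts = datetime.strptime(m["ts"], "%m/%d/%Y %I:%M:%S %p")
        started = START_RE.match(m["msg"])
        if started:
            starts[started["task"]] = ts
            continue
        exited = EXIT_RE.match(m["msg"])
        if exited and exited["task"] in starts:
            elapsed = (ts - starts.pop(exited["task"])).total_seconds()
            if elapsed <= threshold_seconds:
                quick.append((exited["task"], elapsed, int(exited["code"])))
    return quick
```

Run against the snippet above, this should report the dhfr36 task as exiting with code 0 about 3 seconds after it started.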

Jacob Klein
Message 29842 - Posted: 12 May 2013 | 5:30:58 UTC - in response to Message 29841.
Last modified: 12 May 2013 | 5:37:46 UTC

Alright, so my next test was to confirm that it was related to slot 0.

So, what I did was set CPU % to 25%, and then allow new tasks for WCG. It got 2 CPU tasks, and started them in slots 0 and 2.

Then I allowed new tasks for GPUGrid. It got 1 task, and started it in slot 3, and appeared to be progressing normally (ie: not exhibiting the issue). Log is below.

Note: The reason slot 1 keeps getting skipped is because, for some reason, it never gets cleaned properly. I'm hoping it's unrelated to my issue, but it'll be the next thing I test.
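
A quick way to check for slots that were never cleaned is sketched below. It assumes the slots live in numbered folders under the BOINC data directory (C:\ProgramData\BOINC\slots on my machine); find_uncleaned_slots is just an illustrative name. A slot that is running a task is also non-empty, so this is only meaningful while no tasks are running.

```python
from pathlib import Path


def find_uncleaned_slots(slots_dir):
    """Return {slot_number: entry_count} for numbered slot folders that are not empty."""
    leftovers = {}
    for entry in Path(slots_dir).iterdir():
        # Only consider folders named 0, 1, 2, ... and ignore anything else.
        if entry.is_dir() and entry.name.isdigit():
            count = sum(1 for _ in entry.rglob("*"))
            if count:
                leftovers[int(entry.name)] = count
    return leftovers


if __name__ == "__main__":
    # Path taken from my system; adjust it for your own data directory.
    slots = Path(r"C:\ProgramData\BOINC\slots")
    if slots.is_dir():
        for slot, count in sorted(find_uncleaned_slots(slots).items()):
            print(f"slot {slot}: {count} leftover entries")
```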

Below you'll find the contents of slot 1, and the log showing how the dhfr36 task progresses normally in slot 3.

Jacob Klein
Message 29843 - Posted: 12 May 2013 | 5:32:37 UTC - in response to Message 29842.
Last modified: 12 May 2013 | 5:34:18 UTC

Slot 1's contents (via: dir /a /s /b) - Part 1
C:\ProgramData\BOINC\slots\1
C:\ProgramData\BOINC\slots\1\minirosetta_database
C:\ProgramData\BOINC\slots\1\minirosetta_database\chemical
C:\ProgramData\BOINC\slots\1\minirosetta_database\external
C:\P