Message boards : Number crunching : long running 290px..-Noelia_KLEBE WUs

dyeman | Joined: 21 Mar 09 | Posts: 35 | Credit: 591,434,551 | RAC: 0
Message 31836 - Posted: 6 Aug 2013 | 6:45:41 UTC

I have received a bunch of 290px...Noelia_KLEBEs. They will be lucky to finish within 24 hours (660 and 660ti) or 2 days (560ti with the additional penalty for only 1GB memory). Are these cards now too slow to process long WUs, or is there a problem with these WUs that makes them run so much longer than earlier ones?

Thanks!

Retvari Zoltan | Joined: 20 Jan 09 | Posts: 2343 | Credit: 16,206,655,749 | RAC: 261,147
Message 31840 - Posted: 6 Aug 2013 | 9:44:39 UTC - in response to Message 31836.

I have received a bunch of 290px...Noelia_KLEBEs. They will be lucky to finish within 24 hours (660 and 660ti) or 2 days (560ti with the additional penalty for only 1GB memory).

I've received a couple of them too. They need about 1.2GB GPU memory.

Are these cards now too slow to process long WUs,

These cards are not too slow to process them, as the deadline is 5 days. However, these cards are too slow to achieve the 24h or 48h return bonus.

... or is there a problem with these WUs that makes them run so much longer than earlier ones?

I had a problem with one of them: the GPU usage was only 9-11%, with occasional spikes to 95%. I restarted the host, and the symptom disappeared for as long as I watched.

skgiven (Volunteer moderator, Volunteer tester) | Joined: 23 Apr 09 | Posts: 3968 | Credit: 1,995,359,260 | RAC: 0
Message 31841 - Posted: 6 Aug 2013 | 9:46:35 UTC - in response to Message 31836.
Last modified: 6 Aug 2013 | 9:50:27 UTC

On my GTX660Ti and GTX660 the last NOELIA_KLEBE WUs finished in just under 11h and 14h:

148px120-NOELIA_KLEBE-0-3-RND5215_1 (WU 4636380, computer 139265): sent 5 Aug 2013 9:18:05 UTC, reported 6 Aug 2013 1:02:54 UTC, Completed and validated, run time 48,352.49 s, CPU time 21,205.20 s, credit 142,800.00, Long runs (8-12 hours on fastest card) v6.18 (cuda42)

109nx159-NOELIA_KLEBE-0-3-RND4695_1 (WU 4636187, computer 139265): sent 4 Aug 2013 22:40:17 UTC, reported 5 Aug 2013 11:34:24 UTC, Completed and validated, run time 39,357.44 s, CPU time 16,969.73 s, credit 142,800.00, Long runs (8-12 hours on fastest card) v6.18 (cuda42)

Still saw a driver reset when I suspended a WU though.

As Zoltan suggests, they need about 1.2GB, so you might want to avoid them on your 1024MB GTX560Ti (switch it to the short queue).
____________
FAQ's

HOW TO:
- Opt out of Beta Tests
- Ask for Help

dyeman | Joined: 21 Mar 09 | Posts: 35 | Credit: 591,434,551 | RAC: 0
Message 31844 - Posted: 6 Aug 2013 | 11:51:30 UTC - in response to Message 31841.

Previous Noelia KLEBEs have been OK - just this batch starting with 290px. Anyway, the one on the 560ti just crashed at 20 hours, after I had to reboot the PC, so I'll give it a rest from long WUs for the moment and see if it handles the new Folding WUs any better.

tomba | Joined: 21 Feb 09 | Posts: 497 | Credit: 700,690,702 | RAC: 0
Message 31846 - Posted: 6 Aug 2013 | 15:28:50 UTC

I just finished a NOELIA 290px86 on my GTX 660. It took 19h 40m, exactly six hours longer than the previous NOELIA 148px111, and for the same credit: 142,800. :(
____________

Beyond | Joined: 23 Nov 08 | Posts: 1112 | Credit: 6,162,416,256 | RAC: 0
Message 31851 - Posted: 7 Aug 2013 | 0:00:21 UTC - in response to Message 31840.
Last modified: 7 Aug 2013 | 0:00:49 UTC

I have received a bunch of 290px...Noelia_KLEBEs. They will be lucky to finish within 24 hours (660 and 660ti) or 2 days (560ti with the additional penalty for only 1GB memory).

I've received a couple of them too. They need about 1.2GB GPU memory.

Just found 2 of them on my 650 Ti 1GB cards: 19-20 hours of run time and less than 40% completed. Noelia strikes again :-(

skgiven (Volunteer moderator, Volunteer tester) | Joined: 23 Apr 09 | Posts: 3968 | Credit: 1,995,359,260 | RAC: 0
Message 31886 - Posted: 7 Aug 2013 | 19:49:36 UTC - in response to Message 31851.

I'm not keen on 20h tasks unless the failure rate is very low. Of late I've mostly been running short WUs anyway.

With previous Noelia WUs we did see a significant slowdown when all the CPU cores/threads were in use, so I would recommend setting BOINC to use 99% of the CPUs for these long WUs, which keeps a core free. That might bring a few home just in time for the bonus and reduce the heartache. Anyone who suspends and resumes crunching a lot would be best sticking to the short queue. One way to apply that setting is sketched below.
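
For anyone who prefers a config file over the Manager's preferences dialog, one way to apply the 99% suggestion is BOINC's global_prefs_override.xml. This is a minimal sketch, not the only way to do it; the file name and tag are standard BOINC, and the 99 is just the value suggested above - adjust to taste:

    <global_preferences>
        <!-- Use at most 99% of the CPUs, leaving roughly one core free for GPU tasks -->
        <max_ncpus_pct>99.0</max_ncpus_pct>
    </global_preferences>

Save it in the BOINC data directory and load it with "Read local prefs file" in BOINC Manager's Advanced menu (or restart the client).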
____________
FAQ's

HOW TO:
- Opt out of Beta Tests
- Ask for Help

chip | Joined: 10 Feb 09 | Posts: 5 | Credit: 50,333,581 | RAC: 0
Message 31893 - Posted: 8 Aug 2013 | 9:13:46 UTC
Last modified: 8 Aug 2013 | 9:16:32 UTC

Current task 284px89-NOELIA_KLEBE-0-3-RND9863 has a run time of 6,570 s at 10% progress, which extrapolates to roughly 65,700 s (over 18 hours) in total!
Previous NOELIA_KLEBE tasks took about 29,000 s, 36,000 s and 42,000 s in total.

John C MacAlister | Joined: 17 Feb 13 | Posts: 181 | Credit: 144,871,276 | RAC: 0
Message 31900 - Posted: 8 Aug 2013 | 18:49:53 UTC

I just lost two more long-run Noelia tasks... only a few seconds lost, but I have gone back to short-run tasks again.

063px68-NOELIA_KLEBE-1-3-RND0830
Application: Long runs (8-12 hours on fastest card)
Created: 4 Aug 2013, 19:24:48 UTC
Minimum quorum: 1; initial replication: 1; max # of error/total/success tasks: 7, 10, 6

Task 7109875 (computer 154680): sent 8 Aug 2013 14:31:49 UTC, reported 8 Aug 2013 14:36:47 UTC, Error while computing, run time 18.55 s, CPU time 4.82 s, credit ---, Long runs (8-12 hours on fastest card) v6.18 (cuda42)
Task 7120477: unsent

063px59-NOELIA_KLEBE-1-3-RND0623
Application: Long runs (8-12 hours on fastest card)
Created: 4 Aug 2013, 17:03:35 UTC
Minimum quorum: 1; initial replication: 1; max # of error/total/success tasks: 7, 10, 6

Task 7109622 (computer 154680): sent 8 Aug 2013 14:31:49 UTC, reported 8 Aug 2013 14:35:14 UTC, Error while computing, run time 15.39 s, CPU time 3.85 s, credit ---, Long runs (8-12 hours on fastest card) v6.18 (cuda42)
Task 7120474 (computer 152522): sent 8 Aug 2013 18:04:16 UTC, deadline 13 Aug 2013 18:04:16 UTC, In progress, Long runs (8-12 hours on fastest card) v6.18 (cuda42)

Bedrich Hajek | Joined: 28 Mar 09 | Posts: 485 | Credit: 11,088,254,313 | RAC: 15,488,504
Message 31903 - Posted: 9 Aug 2013 | 2:21:13 UTC

I ran into a few of these slow units myself. They were not the 20+ hour units, but they did take 14 to 16 hours to finish.

The first two were on a Windows 7 computer with a GTX 690 card, 2 GB of memory, a one-to-one CPU to GPU ratio, and the cards were not underclocked.

http://www.gpugrid.net/result.php?resultid=7107922

http://www.gpugrid.net/result.php?resultid=7107910

The other computer runs Windows XP, with the same type of card and settings.

http://www.gpugrid.net/result.php?resultid=7119318


Normally, these units take about 9 to 12 hours to finish on my computers. It seems to be more than just a memory issue.



pvh | Joined: 17 Mar 10 | Posts: 23 | Credit: 1,173,824,416 | RAC: 0
Message 31904 - Posted: 9 Aug 2013 | 2:37:26 UTC

I too have significant problems with 063px85-NOELIA_KLEBE-1-3-RND2279. It looks like it hangs up the driver: when this task is started, it never makes it onto the GPU and the task hangs at 0% (though it appears to be running in BOINC). The GPU is not capable of running any OpenCL tasks after that, not even very simple test programs. Only a reboot will get the system out of this state. I have now suspended GPUGRID and the GPU is running an Einstein task just fine, so the OS and the hardware are OK.

This is a Linux 64-bit system with a 310.44 driver. My card is a GTX 660 Ti with 2 GiB of memory, so overrunning the memory should not be an issue. I will kill the NOELIA task and see if I can get something that runs. I have to say that I am getting pretty fed up with these NOELIA tasks; they cause way too many problems. All the other task types run just fine, but the NOELIA tasks seem to be the only ones that break...

Now that I have aborted some of these tasks I can see that they initially load on the GPU for some 10 seconds or so and then disappear from the GPU. The run time is then also reset. Memory use never exceeds 1 GiB during that initial time according to nvidia-smi.
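
For anyone who wants to watch for the same behaviour, here is a rough sketch of the kind of nvidia-smi polling that shows whether a task ever lands on the GPU (the query flags exist in recent nvidia-smi versions; the log file name is arbitrary):

    # Log GPU utilization, memory use and temperature every 5 seconds
    nvidia-smi --query-gpu=timestamp,utilization.gpu,memory.used,temperature.gpu \
               --format=csv -l 5 | tee gpu_watch.csv

If a NOELIA task really drops off the GPU after ~10 seconds, utilization should fall back to 0% and memory.used should drop in that log.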

All I can get now are these KLEBE runs, so I will disable the project until this situation has been fixed.

Bedrich Hajek | Joined: 28 Mar 09 | Posts: 485 | Credit: 11,088,254,313 | RAC: 15,488,504
Message 31907 - Posted: 9 Aug 2013 | 10:23:52 UTC

I downloaded a few more of these units. While the 100-series px... units seem to run normally (9 to 12 hours finishing time on my computers), it is the 200-series px... units that are the slow ones (14+ hours finishing time on my computers).

At least that is what is happening on my computers, currently.


John C MacAlister | Joined: 17 Feb 13 | Posts: 181 | Credit: 144,871,276 | RAC: 0
Message 31908 - Posted: 9 Aug 2013 | 11:49:48 UTC - in response to Message 31903.

Hi, Bedrich:

I am happy to see your successes. I am having too many failures to continue with my FX-8350/GTX 650 Ti GPUs, so I will divert these to Alzheimer's, where I experience no failures at all. I will look in on GPUGrid occasionally to see if I want to return.

John

skgiven (Volunteer moderator, Volunteer tester) | Joined: 23 Apr 09 | Posts: 3968 | Credit: 1,995,359,260 | RAC: 0
Message 31909 - Posted: 9 Aug 2013 | 12:18:43 UTC - in response to Message 31893.

Chip, 87°C is too high. I suggest you cool your system/GPU better.
____________
FAQ's

HOW TO:
- Opt out of Beta Tests
- Ask for Help

TJ | Joined: 26 Jun 09 | Posts: 815 | Credit: 1,470,385,294 | RAC: 0
Message 31915 - Posted: 9 Aug 2013 | 14:59:45 UTC
Last modified: 9 Aug 2013 | 15:00:00 UTC

It seems I haven't got a Noelia 290px.. WU yet, but the ones I did get finish OK on my troublesome system. Even with the much-recommended 310.90 driver the Santi SR WUs fail, but the Noelias steam on. They take a little longer than a few weeks back.
I can even suspend them and start them again without problems. The Noelias run with all the drivers I have tried, from 310 to 320 and the latest beta.

I did find out that suspending a GPUGRID WU (Santi and Nate), then closing BOINC, starting BOINC again and resuming the task works without error (a boinccmd sketch of that sequence follows below).
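
For reference, the same suspend / close / restart / resume sequence can also be driven from the command line with boinccmd; a rough sketch, where TASK_NAME is a placeholder for the actual WU name and the client has to be started again by whatever means your system uses:

    # Suspend the running task, then ask the BOINC client to shut down cleanly
    boinccmd --task http://www.gpugrid.net/ TASK_NAME suspend
    boinccmd --quit
    # ...start the BOINC client again, then resume the task...
    boinccmd --task http://www.gpugrid.net/ TASK_NAME resume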

Staff: make a Noelia queue and I'll hook my rigs up to it.
____________
Greetings from TJ

Bedrich Hajek | Joined: 28 Mar 09 | Posts: 485 | Credit: 11,088,254,313 | RAC: 15,488,504
Message 31925 - Posted: 10 Aug 2013 | 1:11:39 UTC

Over the last three days I have had 7 Noelia units crash. That's a lot for me, and so far they have crashed for everybody else as well. Looks like there is a problem here.



109nx25-NOELIA_KLEBE-1-3-RND9879_3 (WU 4640742): sent 10 Aug 2013 0:35:47 UTC, reported 10 Aug 2013 0:38:48 UTC, Error while computing, run time 24.03 s, CPU time 3.85 s, credit ---, Long runs (8-12 hours on fastest card) v6.18 (cuda42)
148px179-NOELIA_KLEBE-1-3-RND0365_1 (WU 4641851): sent 9 Aug 2013 22:47:33 UTC, reported 10 Aug 2013 0:35:47 UTC, Error while computing, run time 26.05 s, CPU time 3.82 s, credit ---, Long runs (8-12 hours on fastest card) v6.18 (cuda42)
109nx159-NOELIA_KLEBE-1-3-RND4695_3 (WU 4640952): sent 9 Aug 2013 22:42:42 UTC, reported 9 Aug 2013 23:55:05 UTC, Error while computing, run time 34.05 s, CPU time 6.99 s, credit ---, Long runs (8-12 hours on fastest card) v6.18 (cuda42)
063px200-NOELIA_KLEBE-1-3-RND2361_3 (WU 4639982): sent 9 Aug 2013 18:55:47 UTC, reported 9 Aug 2013 20:05:08 UTC, Error while computing, run time 26.04 s, CPU time 3.92 s, credit ---, Long runs (8-12 hours on fastest card) v6.18 (cuda42)
255px11-NOELIA_KLEBE-1-3-RND0350_1 (WU 4641822): sent 9 Aug 2013 17:12:36 UTC, reported 9 Aug 2013 18:08:37 UTC, Error while computing, run time 30.04 s, CPU time 4.27 s, credit ---, Long runs (8-12 hours on fastest card) v6.18 (cuda42)
063px194-NOELIA_KLEBE-1-3-RND5437_3 (WU 4639954): sent 9 Aug 2013 7:12:48 UTC, reported 9 Aug 2013 7:55:27 UTC, Error while computing, run time 27.03 s, CPU time 3.92 s, credit ---, Long runs (8-12 hours on fastest card) v6.18 (cuda42)
063px126-NOELIA_KLEBE-1-3-RND3582_0 (WU 4639815): sent 8 Aug 2013 17:58:47 UTC, reported 8 Aug 2013 19:29:46 UTC, Error while computing, run time 16.05 s, CPU time 4.15 s, credit ---, Long runs (8-12 hours on fastest card) v6.18 (cuda42)




chip | Joined: 10 Feb 09 | Posts: 5 | Credit: 50,333,581 | RAC: 0
Message 31927 - Posted: 10 Aug 2013 | 5:34:37 UTC
Last modified: 10 Aug 2013 | 5:37:46 UTC

The same strings appear in stderr.txt:

(0x3) - exit code 3 (0x3)

SWAN : FATAL : Cuda driver error 999 in file 'swanlibnv2.cpp' in line 1574.
Assertion failed: a, file swanlibnv2.cpp, line 59


Windows event:

Log Name: Application
Source: Application Error
Date: 09.08.2013 17:39:07
Event ID: 1000
Task Category: (100)
Level: Error
Keywords: Classic
User: N/A
Computer: ******
Description:
Faulting application name: acemd.2865P.exe, version: 0.0.0.0, time stamp: 0x511b9dc5
Faulting module name: acemd.2865P.exe, version: 0.0.0.0, time stamp: 0x511b9dc5
Exception code: 0x40000015
Fault offset: 0x00015ad1
Faulting process id: 0x11c8
Faulting application start time: 0x01ce950e28118082
Faulting application path: C:\ProgramData\BOINC\projects\www.gpugrid.net\acemd.2865P.exe
Faulting module path: C:\ProgramData\BOINC\projects\www.gpugrid.net\acemd.2865P.exe
Report Id: 71992cef-0101-11e3-8cc4-8c89a52b18d0

dskagcommunity | Joined: 28 Apr 11 | Posts: 456 | Credit: 817,865,789 | RAC: 0
Message 31945 - Posted: 10 Aug 2013 | 23:46:57 UTC

Wow, the longest unit I ever had o.O

http://www.gpugrid.net/workunit.php?wuid=4639483
____________
DSKAG Austria Research Team: http://www.research.dskag.at



dskagcommunity | Joined: 28 Apr 11 | Posts: 456 | Credit: 817,865,789 | RAC: 0
Message 31950 - Posted: 11 Aug 2013 | 10:23:47 UTC

Oh, a "normal" Computingerror on KLEBEs can hang a GPU too. Thats new for me as experience with my equipment thats nearly resistent about the most problems here. Computer reset -> works as usually ^^
____________
DSKAG Austria Research Team: http://www.research.dskag.at


