Message boards : Number crunching : NOELIA Errors - (0x3) - exit code 3 (0x3) - Assertion failed: a, file swanlibnv2.cpp, line 59

Author Message
Jacob Klein
Joined: 11 Oct 08
Posts: 1127
Credit: 1,901,927,545
RAC: 0
Message 29031 - Posted: 7 Mar 2013 | 10:54:04 UTC

GPUGrid Devs,

Several tasks recently errored out on my machine, and now my GPU sits completely idle, presumably because the server is withholding new work, even though the failures appear to come from bugged tasks.

Could you please have a look at:
http://www.gpugrid.net/workunit.php?wuid=4230145
http://www.gpugrid.net/workunit.php?wuid=4230179
http://www.gpugrid.net/workunit.php?wuid=4230210

Other people are having the same error on these NOELIA tasks, and the error says:

The system cannot find the path specified. (0x3) - exit code 3 (0x3)
SWAN : FATAL : Cuda driver error 999 in file 'swanlibnv2.cpp' in line 1574.
Assertion failed: a, file swanlibnv2.cpp, line 59

This application has requested the Runtime to terminate it in an unusual way.
Please contact the application's support team for more information.


Thanks,
Jacob
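
For anyone sifting through many task logs, the failure signature quoted above can be matched mechanically. The following is a minimal Python sketch, not project tooling: the function name `summarize_swan_failure` and both regular expressions are assumptions built only from the three lines quoted in this post, and other driver errors may be worded differently.

```python
import re

# Error text exactly as quoted in the post above.
STDERR_TEXT = """\
The system cannot find the path specified. (0x3) - exit code 3 (0x3)
SWAN : FATAL : Cuda driver error 999 in file 'swanlibnv2.cpp' in line 1574.
Assertion failed: a, file swanlibnv2.cpp, line 59
"""

# Hypothetical patterns, derived only from the quoted lines.
SWAN_FATAL = re.compile(
    r"SWAN : FATAL : Cuda driver error (\d+) in file '([^']+)' in line (\d+)"
)
ASSERTION = re.compile(r"Assertion failed: (\w+), file ([^,\s]+), line (\d+)")


def summarize_swan_failure(text):
    """Return the driver error code and source locations, or None if absent."""
    fatal = SWAN_FATAL.search(text)
    if fatal is None:
        return None
    summary = {
        "driver_error": int(fatal.group(1)),
        "fatal_at": (fatal.group(2), int(fatal.group(3))),
        "assertion_at": None,
    }
    assertion = ASSERTION.search(text)
    if assertion is not None:
        summary["assertion_at"] = (assertion.group(2), int(assertion.group(3)))
    return summary


if __name__ == "__main__":
    print(summarize_swan_failure(STDERR_TEXT))
```

Running it over saved stderr text from several failed tasks would show whether they all share the same driver error code and source line, which is what the reports in this thread suggest.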

Beyond
Joined: 23 Nov 08
Posts: 1112
Credit: 6,162,416,256
RAC: 0
Message 29034 - Posted: 7 Mar 2013 | 11:58:14 UTC - in response to Message 29031.
Last modified: 7 Mar 2013 | 12:02:56 UTC

Join the club: all the new Noelia WUs are doing that. No one can run them and they fail on every machine, yet they haven't been cancelled. Do a cold reboot to get the NVIDIA card working again. It would really be nice if they TESTED these WUs before inflicting them on us :-(

Richard Haselgrove
Joined: 11 Jul 09
Posts: 1451
Credit: 3,575,929,351
RAC: 345,202
Message 29035 - Posted: 7 Mar 2013 | 12:04:25 UTC

That looks like one I've just encountered:

http://www.gpugrid.net/workunit.php?wuid=4230238

I'm the 'abort': the task triggered a driver reset, followed by an "acemd.2865P.exe has encountered a problem and must close" popup.

That also interrupted tasks from another project running on the other GPU. Careful management (suspending the GPUGrid task in BOINC Manager before clicking 'close' in the acemd.2865P.exe popup) allowed both GPUs to continue processing without a reboot, but the GPU running GPUGrid was out of action until I noticed and intervened manually.

Richard Haselgrove
Joined: 11 Jul 09
Posts: 1451
Credit: 3,575,929,351
RAC: 345,202
Message 29037 - Posted: 7 Mar 2013 | 12:10:24 UTC - in response to Message 29034.

Join the club: all the new Noelia WUs are doing that. No one can run them and they fail on every machine, yet they haven't been cancelled. Do a cold reboot to get the NVIDIA card working again. It would really be nice if they TESTED these WUs before inflicting them on us :-(

I think that remark applies only to 'long queue' tasks. My host 43404 seems to be coping with (most) short tasks OK.

algabe
Joined: 23 May 10
Posts: 9
Credit: 739,480,301
RAC: 0
Message 29051 - Posted: 7 Mar 2013 | 16:55:51 UTC

I am very disappointed with this project lately. Yesterday two NOELIA units, processed for 8 hours each, ended in execution errors; today two more NOELIA units, processed for 10 hours each, ended as "aborted by user". This is unacceptable and very unprofessional.
Now I am processing two NATHAN units. If the errors persist with these units, then, much to my regret, I will definitely stop processing for this project. I'm paying a lot of money in electricity bills, with the crisis on, and it is pointless after this effort.


Greetings.

Jim1348
Joined: 28 Jul 12
Posts: 819
Credit: 1,591,285,971
RAC: 21,587
Message 29056 - Posted: 7 Mar 2013 | 18:49:54 UTC - in response to Message 29051.

Now I am processing two NATHAN units. If the errors persist with these units, then, much to my regret, I will definitely stop processing for this project. I'm paying a lot of money in electricity bills, with the crisis on, and it is pointless after this effort.

All the Nathans and Tonis have run fine for me (both long and short); it seems to be just the Noelia longs (6.18). And the ones that failed have all done so early, within the first 10 seconds for me in the most recent batch. It looks to me as though they will be able to catch the bug and fix it.

flashawk
Joined: 18 Jun 12
Posts: 297
Credit: 3,572,627,986
RAC: 0
Message 29057 - Posted: 7 Mar 2013 | 19:01:34 UTC

Last night I had a spontaneous reboot that took out 2 CPDN full resolution ocean models with over 250 hours each; that was a real bummer. I've had short NOELIAs fail too. I think she needs to borrow NATHAN's magic wand!

Jim1348
Joined: 28 Jul 12
Posts: 819
Credit: 1,591,285,971
RAC: 21,587
Message 29058 - Posted: 7 Mar 2013 | 19:14:21 UTC - in response to Message 29057.

Last night I had a spontaneous reboot that took out 2 CPDN full resolution ocean models with over 250 hours each; that was a real bummer. I've had short NOELIAs fail too. I think she needs to borrow NATHAN's magic wand!

I have a couple of GTX 660s that I am waiting to throw at it when they get it fixed.

flashawk
Joined: 18 Jun 12
Posts: 297
Credit: 3,572,627,986
RAC: 0
Message 29059 - Posted: 7 Mar 2013 | 19:51:07 UTC

You're holding out, eh? I've heard those are pretty good cards for this project. I'm paying about $180.00 a month for electricity here in the northern Sierras (PG&E).

Jim1348
Joined: 28 Jul 12
Posts: 819
Credit: 1,591,285,971
RAC: 21,587
Message 29062 - Posted: 7 Mar 2013 | 21:46:58 UTC - in response to Message 29059.

You're holding out, eh? I've heard those are pretty good cards for this project. I'm paying about $180.00 a month for electricity here in the northern Sierras (PG&E).

My electric bills are going down (amazingly) in eastern PA, due to the shale gas. It affects even the price of coal-fired power.

flashawk
Joined: 18 Jun 12
Posts: 297
Credit: 3,572,627,986
RAC: 0
Message 29064 - Posted: 7 Mar 2013 | 22:00:58 UTC

We are all hydro electric here and I have an airtight wood stove. That shale gas, is it found locally or is it like that tar sands crude found in Canada? I don't think I've heard of it before. Sorry to keep asking questions, I'm just curious about stuff like that.

Jim1348
Joined: 28 Jul 12
Posts: 819
Credit: 1,591,285,971
RAC: 21,587
Message 29067 - Posted: 7 Mar 2013 | 22:39:37 UTC - in response to Message 29064.

Search for the Marcellus Shale, and you will find lots of info. It is throughout central PA, extending down into West Virginia and out to Ohio. Also, it goes up into New York state, but they have a moratorium on development due to pollution concerns. It is potentially a VERY big deal, though development will have to be monitored carefully.

And there is shale oil in North Dakota, and even in California.
http://www.economist.com/news/united-states/21571899-california-tries-decide-if-it-wants-join-shale-revolution-big-reserves-big

flashawk
Joined: 18 Jun 12
Posts: 297
Credit: 3,572,627,986
RAC: 0
Message 29068 - Posted: 7 Mar 2013 | 23:08:45 UTC

That's fascinating stuff, I had no idea that they had made such a huge discovery that has that much potential and here I am asking you about it and one of the largest deposits is within 75 miles of me. Boy, I feel pretty stupid.

I remember back in the 80's, we had hundreds of off-shore wells and pumping stations and Californians voted a ban on them because they were considered an eyesore, not for environmental reasons, just an eyesore. All those millionaires that live on the coast didn't like looking at them through their 200 square foot plate glass windows.

Anyway, thanks for the info, I better stop asking these questions before a moderator asks me to stop posting off topic.

John C MacAlister
Joined: 17 Feb 13
Posts: 181
Credit: 144,871,276
RAC: 0
Message 29069 - Posted: 7 Mar 2013 | 23:26:50 UTC
Last modified: 7 Mar 2013 | 23:29:21 UTC

No NOELIA long tasks for me at present. All others appear to be OK.

TJ
Joined: 26 Jun 09
Posts: 815
Credit: 1,470,385,294
RAC: 0
Message 29074 - Posted: 8 Mar 2013 | 8:31:14 UTC

During one Noelia long run this message appeared on the screen: acemd.2865P.exe has encountered a problem and must close.
However, BOINC continued and requested a new task, a short one, which ran to completion without error. I did nothing to intervene as I was not at home.
This was on a Vista 32-bit system with BOINC 6.10.58 and NVIDIA driver 314.7.
Perhaps the newest OSes cause the problems?

____________
Greetings from TJ

TJ
Joined: 26 Jun 09
Posts: 815
Credit: 1,470,385,294
RAC: 0
Message 29075 - Posted: 8 Mar 2013 | 8:34:08 UTC - in response to Message 29068.

That's fascinating stuff, I had no idea that they had made such a huge discovery that has that much potential and here I am asking you about it and one of the largest deposits is within 75 miles of me. Boy, I feel pretty stupid.

I remember back in the 80's, we had hundreds of off-shore wells and pumping stations and Californians voted a ban on them because they were considered an eyesore, not for environmental reasons, just an eyesore. All those millionaires that live on the coast didn't like looking at them through their 200 square foot plate glass windows.

Anyway, thanks for the info, I better stop asking these questions before a moderator asks me to stop posting off topic.

Off topic indeed, but interesting. We have a lot of shale gas in the Netherlands as well, and electricity bills should decline when it is used. Most people, however, are afraid of earthquakes when it is exploited. That's all I'll say about this off-topic subject. Apologies.

____________
Greetings from TJ

Jim1348
Joined: 28 Jul 12
Posts: 819
Credit: 1,591,285,971
RAC: 21,587
Message 29078 - Posted: 8 Mar 2013 | 16:59:04 UTC - in response to Message 29059.
Last modified: 8 Mar 2013 | 17:49:32 UTC

You're holding out, eh? I've heard those are pretty good cards for this project.

My first two Nathan longs took 6 hours 2 minutes and 6 hours 4 minutes on the GTX 660s, so that seems good to me. For comparison, the GTX 650 Ti took 9 hours 19 minutes, and GTX 560 took 9 hours 54 minutes. Those are only single-sample measurements, but that reflects the trend.

Hopefully the Noelia longs will do comparably well. (The Keplers don't do as well on some other distributed computing projects.)
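
To put the run times quoted above on a common scale, the arithmetic can be worked through as follows. This is a minimal Python sketch using only the four single-sample timings from this post; the helper names `to_minutes` and `slowdown_vs_gtx660` are hypothetical, and averaging the two GTX 660 runs as the baseline is an assumption of the sketch.

```python
def to_minutes(hours, minutes):
    """Convert an hours/minutes run time to total minutes."""
    return hours * 60 + minutes


# Single-sample Nathan long run times quoted in the post above.
GTX_660_RUNS = [to_minutes(6, 2), to_minutes(6, 4)]
OTHER_CARDS = {
    "GTX 650 Ti": to_minutes(9, 19),
    "GTX 560": to_minutes(9, 54),
}


def slowdown_vs_gtx660():
    """Return each card's run time as a multiple of the mean GTX 660 run time."""
    baseline = sum(GTX_660_RUNS) / len(GTX_660_RUNS)
    return {card: minutes / baseline for card, minutes in OTHER_CARDS.items()}


if __name__ == "__main__":
    for card, ratio in slowdown_vs_gtx660().items():
        print(f"{card}: {ratio:.2f}x the GTX 660 run time")
```

On these numbers the older cards take roughly one and a half times as long as the GTX 660, though with one sample per card the ratios are only indicative.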

Mungo
Joined: 13 Jan 10
Posts: 2
Credit: 21,938,278
RAC: 0
Message 29203 - Posted: 20 Mar 2013 | 0:40:24 UTC
Last modified: 20 Mar 2013 | 1:36:22 UTC

Just to let you know, I got several "acemd.2865P.exe has encountered a problem and must close" popups while running a long NATHAN task. I aborted it after 3 hours.

Win7 x64
NVIDIA GeForce GTX 460

Mungo
Joined: 13 Jan 10
Posts: 2
Credit: 21,938,278
RAC: 0
Message 29208 - Posted: 21 Mar 2013 | 0:58:15 UTC - in response to Message 29203.

Just to let you know, I got several "acemd.2865P.exe has encountered a problem and must close" popups while running a long NATHAN task. I aborted it after 3 hours.

Win7 x64
NVIDIA GeForce GTX 460


Subsequent NATHAN tasks seem to be performing ok.

John C MacAlister
Joined: 17 Feb 13
Posts: 181
Credit: 144,871,276
RAC: 0
Message 29209 - Posted: 21 Mar 2013 | 14:03:20 UTC

Error while computing

name: I1R231-NATHAN_RPS1_respawn3-0-32-RND2410
application: Short runs (2-3 hours on fastest card)
created: 20 Mar 2013 | 17:10:40 UTC
minimum quorum: 1
initial replication: 1
max # of error/total/success tasks: 7, 10, 6

Task: 6647874
Computer: 146594
Sent: 20 Mar 2013 | 18:40:49 UTC
Time reported: 21 Mar 2013 | 5:39:56 UTC
Status: Error while computing
Run time (sec): 10,908.75
CPU time (sec): 10,751.76
Credit: ---
Application: Short runs (2-3 hours on fastest card) v6.52 (cuda42)
