Author |
Message |
|
Since my system downloaded version 6.61, all of the work units have failed. The messages from BOINC say:
22/01/2009 18:43:19|GPUGRID|Sending scheduler request: To fetch work. Requesting 3021 seconds of work, reporting 1 completed tasks
22/01/2009 18:43:24|GPUGRID|Scheduler request completed: got 1 new tasks
22/01/2009 18:43:26|GPUGRID|Started download of kA28006-SH2_US_4-0-10-SH2_US_41140000-LICENSE
22/01/2009 18:43:26|GPUGRID|Started download of kA28006-SH2_US_4-0-10-SH2_US_41140000-COPYRIGHT
22/01/2009 18:43:27|GPUGRID|Finished download of kA28006-SH2_US_4-0-10-SH2_US_41140000-LICENSE
22/01/2009 18:43:27|GPUGRID|Finished download of kA28006-SH2_US_4-0-10-SH2_US_41140000-COPYRIGHT
22/01/2009 18:43:27|GPUGRID|Started download of kA28006-SH2_US_4-0-10-SH2_US_41140000-smd.1140000.coor
22/01/2009 18:43:27|GPUGRID|Started download of kA28006-SH2_US_4-0-10-SH2_US_41140000-smd.1140000.vel
22/01/2009 18:43:28|GPUGRID|Finished download of kA28006-SH2_US_4-0-10-SH2_US_41140000-smd.1140000.vel
22/01/2009 18:43:28|GPUGRID|Started download of kA28006-SH2_US_4-0-10-SH2_US_41140000-input.idx
22/01/2009 18:43:29|GPUGRID|Finished download of kA28006-SH2_US_4-0-10-SH2_US_41140000-input.idx
22/01/2009 18:43:29|GPUGRID|Started download of kA28006-SH2_US_4-0-10-SH2_US_41140000-complex_full.sol.ionized.pdb
22/01/2009 18:43:38|GPUGRID|Finished download of kA28006-SH2_US_4-0-10-SH2_US_41140000-smd.1140000.coor
22/01/2009 18:43:38|GPUGRID|Started download of kA28006-SH2_US_4-0-10-SH2_US_41140000-complex_full.sol.ionized.psf
22/01/2009 18:43:56|GPUGRID|Finished download of kA28006-SH2_US_4-0-10-SH2_US_41140000-complex_full.sol.ionized.pdb
22/01/2009 18:43:56|GPUGRID|Started download of kA28006-SH2_US_4-0-10-SH2_US_41140000-parameters
22/01/2009 18:43:59|GPUGRID|Finished download of kA28006-SH2_US_4-0-10-SH2_US_41140000-parameters
22/01/2009 18:43:59|GPUGRID|Started download of kA28006-SH2_US_4-0-10-SH2_US_41140000-SH2_US_41140000
22/01/2009 18:44:00|GPUGRID|Finished download of kA28006-SH2_US_4-0-10-SH2_US_41140000-SH2_US_41140000
22/01/2009 18:44:18|GPUGRID|Finished download of kA28006-SH2_US_4-0-10-SH2_US_41140000-complex_full.sol.ionized.psf
22/01/2009 18:44:20|GPUGRID|Starting kA28006-SH2_US_4-0-10-SH2_US_41140000_1
22/01/2009 18:44:20|GPUGRID|Starting task kA28006-SH2_US_4-0-10-SH2_US_41140000_1 using acemd version 661
22/01/2009 18:44:23|GPUGRID|Computation for task kA28006-SH2_US_4-0-10-SH2_US_41140000_1 finished
22/01/2009 18:44:23|GPUGRID|Output file kA28006-SH2_US_4-0-10-SH2_US_41140000_1_1 for task kA28006-SH2_US_4-0-10-SH2_US_41140000_1 absent
22/01/2009 18:44:23|GPUGRID|Output file kA28006-SH2_US_4-0-10-SH2_US_41140000_1_2 for task kA28006-SH2_US_4-0-10-SH2_US_41140000_1 absent
22/01/2009 18:44:23|GPUGRID|Output file kA28006-SH2_US_4-0-10-SH2_US_41140000_1_3 for task kA28006-SH2_US_4-0-10-SH2_US_41140000_1 absent
22/01/2009 18:44:25|GPUGRID|Started upload of kA28006-SH2_US_4-0-10-SH2_US_41140000_1_0
22/01/2009 18:44:27|GPUGRID|Finished upload of kA28006-SH2_US_4-0-10-SH2_US_41140000_1_0
There are similar messages for every work unit that downloaded and failed.
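The failure pattern in the log above (a task that finishes seconds after it starts, followed by "Output file ... absent" lines) can be picked out mechanically. The following is a minimal sketch, assuming every message line has the three pipe-separated fields seen above (timestamp, project, message); the function name `summarize_absent_outputs` is hypothetical and not part of BOINC.

```python
from collections import defaultdict

def summarize_absent_outputs(log_text):
    """Map each task name to the output files the client reported as absent."""
    absent = defaultdict(list)
    for line in log_text.splitlines():
        # Assumed layout, taken from the log above: timestamp|project|message
        parts = line.strip().split("|", 2)
        if len(parts) != 3:
            continue
        _timestamp, _project, message = parts
        words = message.split()
        # Expected shape: "Output file <file> for task <task> absent"
        if (len(words) == 7 and words[:2] == ["Output", "file"]
                and words[3:5] == ["for", "task"] and words[6] == "absent"):
            absent[words[5]].append(words[2])
    return dict(absent)

sample = """\
22/01/2009 18:44:20|GPUGRID|Starting task kA28006-SH2_US_4-0-10-SH2_US_41140000_1 using acemd version 661
22/01/2009 18:44:23|GPUGRID|Computation for task kA28006-SH2_US_4-0-10-SH2_US_41140000_1 finished
22/01/2009 18:44:23|GPUGRID|Output file kA28006-SH2_US_4-0-10-SH2_US_41140000_1_1 for task kA28006-SH2_US_4-0-10-SH2_US_41140000_1 absent
22/01/2009 18:44:23|GPUGRID|Output file kA28006-SH2_US_4-0-10-SH2_US_41140000_1_2 for task kA28006-SH2_US_4-0-10-SH2_US_41140000_1 absent
"""

report = summarize_absent_outputs(sample)
for task, files in report.items():
    print(task, "is missing", len(files), "output file(s)")
```

Running this over a full message log gives a quick count of how many tasks failed this way, which is handy when reporting the problem.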
There were no problems with the previous 6.55 version.
Any ideas on how I can get crunching again?
Ian
|
|
|
|
Strange, your error message is
MDIO ERROR: read error for file "input.vel", byte number 4: number of atoms (1625495040) != (39910) expected
ERROR: mdioload.cu, line 146: Unable to read binvelfile
Sorry, I've never seen this message before and don't know what you could do.
MrS
____________
Scanning for our furry friends since Jan 2002 |
|
|
PeteS
Joined: 1 Jan 09 Posts: 7 Credit: 3,602,175 RAC: 0 |
All of my work units were failing too. Eventually I was told that the daily quota of 8 WUs had been exceeded, and I was not allocated any more.
You can see the failed WUs here:
http://www.gpugrid.net/results.php?userid=12774 |
|
|
UL1
Joined: 16 Sep 07 Posts: 56 Credit: 35,013,195 RAC: 0 |
I've now had almost 15 WUs error out with exactly the same error message that ianmbaker got, but I am running Linux with app version 6.59.
Any ideas what's going wrong here? |
|
|
Donnie
Joined: 13 Nov 08 Posts: 11 Credit: 11,185,470 RAC: 0 |
I also had 9 WUs error out. They appear to have failed back to back, from 18:21:28 UTC through 18:30:37 UTC.
I have 1 "good" 6.61 running at 69% completion.
Same error message as below:
MDIO ERROR: read error for file "input.vel", byte number 4: number of atoms (1078071040) != (39910) expected
ERROR: mdioload.cu, line 146: Unable to read binvelfile
Now I've reached my daily quota and will have two GTX 260 Core 216 cards sitting idle. I suppose I could do Folding@home. |
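The two "number of atoms" values reported in the MDIO errors so far (1625495040 and 1078071040) can be examined at the byte level. The sketch below assumes the value is a 32-bit integer field, which the error text suggests but the thread does not confirm; `swap32` is a hypothetical helper name. It only shows what the same four bytes look like when read in the opposite byte order and does not by itself identify the cause of the failures.

```python
import struct

def swap32(value):
    """Reinterpret a 32-bit unsigned integer with its four bytes reversed."""
    return struct.unpack(">I", struct.pack("<I", value))[0]

# "number of atoms" values quoted in the error messages in this thread
reported = [1625495040, 1078071040]

for value in reported:
    print(f"{value} = 0x{value:08X}, byte-swapped = {swap32(value)}")
```

Read in the opposite byte order, the values come out as 1500000 and 1000000. Neither matches the expected 39910, which may hint that the bytes at the start of `input.vel` were not an atom count at all for these work units.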
|
|
|
I've had a bunch of WUs error out after completing one successfully. It looks like 6.61 needs to be withdrawn and the project rolled back to 6.55. |
|
|
|
I have completed two so far, both a success ...
____________
|
|
|
|
I have a strange workunit error:
http://www.gpugrid.net/result.php?resultid=238028
Does anyone know what this means?
"Cuda error: Kernel [frc_sum_kernel_bond] failed in file 'force.cu' in line 283 : unknown error." |
|
|
|
I have a strange workunit error:
http://www.gpugrid.net/result.php?resultid=238028
Does anyone know what this means?
"Cuda error: Kernel [frc_sum_kernel_bond] failed in file 'force.cu' in line 283 : unknown error."
Bad unknown things ...
Really bad, and really, really unknown things ...
But we know exactly where ...
:)
Sorry, I could not resist ... and it is the only sense of humor that I have ...
____________
|
|
|
|
Bad unknown things ...
Really bad, and really, really unknown things ...
But we know exactly where ...
:)
Sorry, I could not resist ... and it is the only sense of humor that I have ...
Well...there are different kinds of unknowns...
see http://www.youtube.com/watch?v=_RpSv3HjpEw
:)
|
|
|
|
I have completed two so far, both a success ...
Same for me: both WUs (236301 and 236294) succeeded with no errors. Regardless, I am suspending further downloads for now and waiting for all the 6.61 results in my task queue (5 of 7). |
|
|
|
Well...there are different kinds of unknowns...
see http://www.youtube.com/watch?v=_RpSv3HjpEw
:)
As an engineer I lived by those rules ... and it was almost always the unknown unknowns that got me ... which is why I tried so hard to find out what they might be ...
____________
|
|
|
Chris S
Joined: 18 Jan 09 Posts: 21 Credit: 3,950,530 RAC: 0 |
I had 2 with this error, but running a 6.61 OK now
MDIO ERROR: read error for file "input.vel", byte number 4: number of atoms (1625495040) != (39910) expected
ERROR: mdioload.cu, line 146: Unable to read binvelfile
|
|
|
|
It looks like the issue reported in thread http://www.gpugrid.net/forum_thread.php?id=671 was the cause. The failing work units I had were in that series.
Thanks to all who responded.
Ian
____________
|
|
|
|
Same here. I've had two successful WUs in the past 4 days. The rest of the time I maxed out my daily quota, and now it has been reduced to 1/day.
Now I sit idle (well, not me at work, but my computer). I have no problems with my 8800GT on Vista 64. All my issues are with two GTX 280s on a Vista 64 machine.
Pat |
|
|
|
I had 2 with this error, but running a 6.61 OK now
MDIO ERROR: read error for file "input.vel", byte number 4: number of atoms (1625495040) != (39910) expected
ERROR: mdioload.cu, line 146: Unable to read binvelfile
Same error, first time a 6.61 crashed: 239257, Exit status 98 (0x62)
MDIO ERROR: read error for file "input.vel", byte number 4: number of atoms (1625495040) != (39910) expected
ERROR: mdioload.cu, line 146: Unable to read binvelfile
|
|
|
mikaok
Joined: 16 Jan 09 Posts: 12 Credit: 639,094 RAC: 0 |
I had 2 with this error, but running a 6.61 OK now
MDIO ERROR: read error for file "input.vel", byte number 4: number of atoms (1625495040) != (39910) expected
ERROR: mdioload.cu, line 146: Unable to read binvelfile
Same error, first time a 6.61 crashed: 239257, Exit status 98 (0x62)
MDIO ERROR: read error for file "input.vel", byte number 4: number of atoms (1625495040) != (39910) expected
ERROR: mdioload.cu, line 146: Unable to read binvelfile
Same problem here: exit code 98 for the last 11 WUs in a row. Most were SH2_US_4 units, but at least one was SH2_US_5. I hope I don't get penalized for these :D |
|
|