
Message boards : Number crunching : run2-NOELIA_SMD-0-3-RND3019 uploaded failure

Bedrich Hajek
Joined: 28 Mar 09
Posts: 485
Credit: 11,298,029,094
RAC: 10,484,826
Message 26329 - Posted: 16 Jul 2012 | 7:11:41 UTC

Is this unit too big to upload?

http://www.gpugrid.net/workunit.php?wuid=3561000

There was a similar problem with the PAOLA_RND units.

http://www.gpugrid.net/forum_thread.php?id=2970#24795

skgiven
Volunteer moderator
Volunteer tester
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Message 26331 - Posted: 16 Jul 2012 | 10:33:12 UTC - in response to Message 26329.
Last modified: 16 Jul 2012 | 10:56:16 UTC

The previous issue was that some PAOLA_RND tasks errored after completing because of an incorrect output size.

The task you linked to failed on 3 known good systems.
It also ran for ~2 days on high-end GPUs, while other long tasks completed in a quarter of that time (about half a day).
So the problem is with the task, not the hosts.


The only task that ran as 3.1 (which gives a decent stderr output) reported this:


    Stderr output

    <core_client_version>6.12.33</core_client_version>
    <![CDATA[
    <stderr_txt>
    # Using device 1
    # There are 2 devices supporting CUDA
    # Device 0: "GeForce GTX 570"
    # Clock rate: 1.72 GHz
    # Total amount of global memory: 1341325312 bytes
    # Number of multiprocessors: 15
    # Number of cores: 120
    # Device 1: "GeForce GTX 570"
    # Clock rate: 1.72 GHz
    # Total amount of global memory: 1341849600 bytes
    # Number of multiprocessors: 15
    # Number of cores: 120
    # Time per step (avg over 11250000 steps): 10.673 ms
    # Approximate elapsed time for entire WU: 120073.797 s
    22:26:53 (22803): called boinc_finish

    </stderr_txt>
    <message>
    upload failure: <file_xfer_error>
    <file_name>run2-NOELIA_SMD-0-3-RND3019_0_4</file_name>
    <error_code>-131</error_code>
    </file_xfer_error>

    </message>
    ]]>
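As a rough sanity check on the runtime figures in the log above, the total can be recomputed from the per-step average. This is only a sketch: the function name `estimate_runtime_seconds` is made up for illustration, and the two input values are copied from the stderr lines.

```python
def estimate_runtime_seconds(ms_per_step: float, steps: int) -> float:
    """Total runtime in seconds from an average step time in milliseconds."""
    return ms_per_step * steps / 1000.0


# Values copied from the stderr output quoted above.
total_s = estimate_runtime_seconds(10.673, 11_250_000)
total_h = total_s / 3600.0      # seconds -> hours
total_d = total_h / 24.0        # hours -> days

print(f"{total_s:.2f} s = {total_h:.1f} h = {total_d:.2f} days")
```

The result is about 120071 s, a couple of seconds off the 120073.797 s in the log, presumably because the logged average is rounded. That works out to roughly 33 hours on this host.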


-131 is BOINC's "file too big" error (ERR_FILE_TOO_BIG): the output file run2-NOELIA_SMD-0-3-RND3019_0_4 was bigger than the max_nbytes limit set for it.
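If you want to pull this out of a task's stderr page automatically, something like the following would do. It is a minimal sketch, not a BOINC tool: `find_upload_errors` and the `BOINC_ERRORS` table are names invented here, the table lists only the one code discussed in this thread, and a regular expression is used because the stderr dump is not well-formed XML.

```python
import re

# Illustrative subset of BOINC error codes; only -131 comes up in this thread.
BOINC_ERRORS = {-131: "ERR_FILE_TOO_BIG"}

# Matches one <file_xfer_error> entry and captures the file name and code.
XFER_RE = re.compile(
    r"<file_xfer_error>\s*"
    r"<file_name>(?P<name>[^<]+)</file_name>\s*"
    r"<error_code>(?P<code>-?\d+)</error_code>"
)


def find_upload_errors(stderr_text: str) -> list[tuple[str, int, str]]:
    """Return (file name, error code, symbolic name) for each transfer error."""
    results = []
    for match in XFER_RE.finditer(stderr_text):
        code = int(match.group("code"))
        results.append((match.group("name"), code, BOINC_ERRORS.get(code, "unknown")))
    return results


sample = """
<message>
upload failure: <file_xfer_error>
<file_name>run2-NOELIA_SMD-0-3-RND3019_0_4</file_name>
<error_code>-131</error_code>
</file_xfer_error>
</message>
"""

for name, code, symbol in find_upload_errors(sample):
    print(f"{name}: {code} ({symbol})")
```

Codes that are not in the table are reported as "unknown" rather than guessed.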

Probably the same problem as before, or a similar one. Your BOINC event log should confirm this.

Thanks,

noelia
Joined: 5 Jul 12
Posts: 35
Credit: 393,375
RAC: 0
Message 26335 - Posted: 16 Jul 2012 | 14:04:13 UTC - in response to Message 26329.

Hi all,

There was indeed a problem with the output size. It is now fixed and won't happen again. I apologize for all the inconvenience caused.

Sorry again,
Noelia.
