Message boards : Number crunching : WU invalid because of an upload issue at GPUGRIDs server end?
Author | Message |
---|---|
Any idea why this task was marked as invalid after approx. 40,000 seconds of precious run time on my RTX 3080? So far, this machine has not produced a single error, and the end of the log appears to note an upload issue. Name: e5s122_e2s172p0f91-ADRIA_AdB_KIXCMYB_HIP-1-2-RND2100_3 Workunit: 27081741 Created: 12 Oct 2021 | 11:45:25 UTC Sent: 12 Oct 2021 | 11:45:51 UTC Received: 13 Oct 2021 | 1:06:53 UTC Server state: Over Outcome: Computation error Client state: Compute error Exit status: 0 (0x0) Computer ID: 584499 Report deadline: 17 Oct 2021 | 11:45:51 UTC Run time: 40,157.41 CPU time: 39,016.81 Validate state: Invalid Credit: 0.00 Application version: New version of ACEMD v2.18 (cuda1121) Stderr output: <core_client_version>7.16.11</core_client_version> <![CDATA[ <stderr_txt> 15:52:17 (23288): wrapper (7.9.26016): starting 15:52:17 (23288): wrapper: running bin/acemd3.exe (--boinc --device 0) 03:01:09 (23288): bin/acemd3.exe exited; CPU time 39016.812500 03:01:20 (23288): called boinc_finish(0) 0 bytes in 0 Free Blocks. 186 bytes in 4 Normal Blocks. 1144 bytes in 1 CRT Blocks. 0 bytes in 0 Ignore Blocks. 0 bytes in 0 Client Blocks. Largest number used: 0 bytes. Total allocations: 824084403 bytes. Dumping objects -> {389617} normal block at 0x0000028AAECC3BC0, 85 bytes long. Data: <<project_prefere> 3C 70 72 6F 6A 65 63 74 5F 70 72 65 66 65 72 65 ..\api\boinc_api.cpp(309) : {389614} normal block at 0x0000028AAECC4620, 8 bytes long. Data: <  ®Š > 00 00 A0 AE 8A 02 00 00 {388969} normal block at 0x0000028AAECC3C60, 85 bytes long. Data: <<project_prefere> 3C 70 72 6F 6A 65 63 74 5F 70 72 65 66 65 72 65 {388355} normal block at 0x0000028AAECC48F0, 8 bytes long. Data: < ήŠ > 10 9D CE AE 8A 02 00 00 ..\zip\boinc_zip.cpp(122) : {146} normal block at 0x0000028AAECC3090, 260 bytes long. Data: < > 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 {133} normal block at 0x0000028AAECC4670, 16 bytes long. 
Data: <PâË®Š > 50 E2 CB AE 8A 02 00 00 00 00 00 00 00 00 00 00 {132} normal block at 0x0000028AAECBE250, 40 bytes long. Data: <pFÌ®Š conda-pa> 70 46 CC AE 8A 02 00 00 63 6F 6E 64 61 2D 70 61 {125} normal block at 0x0000028AAECBE480, 48 bytes long. Data: <--boinc --device> 2D 2D 62 6F 69 6E 63 20 2D 2D 64 65 76 69 63 65 {124} normal block at 0x0000028AAECC4030, 16 bytes long. Data: <XNÌ®Š > 58 4E CC AE 8A 02 00 00 00 00 00 00 00 00 00 00 {123} normal block at 0x0000028AAECC48A0, 16 bytes long. Data: <0NÌ®Š > 30 4E CC AE 8A 02 00 00 00 00 00 00 00 00 00 00 {122} normal block at 0x0000028AAECC4CB0, 16 bytes long. Data: < NÌ®Š > 08 4E CC AE 8A 02 00 00 00 00 00 00 00 00 00 00 {121} normal block at 0x0000028AAECC4440, 16 bytes long. Data: <àMÌ®Š > E0 4D CC AE 8A 02 00 00 00 00 00 00 00 00 00 00 {120} normal block at 0x0000028AAECC43A0, 16 bytes long. Data: <¸MÌ®Š > B8 4D CC AE 8A 02 00 00 00 00 00 00 00 00 00 00 {119} normal block at 0x0000028AAECC4530, 16 bytes long. Data: < MÌ®Š > 90 4D CC AE 8A 02 00 00 00 00 00 00 00 00 00 00 {118} normal block at 0x0000028AAECC3D60, 16 bytes long. Data: <pMÌ®Š > 70 4D CC AE 8A 02 00 00 00 00 00 00 00 00 00 00 {117} normal block at 0x0000028AAECC4B20, 16 bytes long. Data: <HMÌ®Š > 48 4D CC AE 8A 02 00 00 00 00 00 00 00 00 00 00 {116} normal block at 0x0000028AAECC4760, 16 bytes long. Data: < MÌ®Š > 20 4D CC AE 8A 02 00 00 00 00 00 00 00 00 00 00 {115} normal block at 0x0000028AAECC4D20, 496 bytes long. Data: <`GÌ®Š bin/acem> 60 47 CC AE 8A 02 00 00 62 69 6E 2F 61 63 65 6D {65} normal block at 0x0000028AAECB3280, 16 bytes long. Data: < 굤ö > 80 EA B5 A4 F6 7F 00 00 00 00 00 00 00 00 00 00 {64} normal block at 0x0000028AAECB3230, 16 bytes long. Data: <@鵤ö > 40 E9 B5 A4 F6 7F 00 00 00 00 00 00 00 00 00 00 {63} normal block at 0x0000028AAECB2FB0, 16 bytes long. Data: <øW²¤ö > F8 57 B2 A4 F6 7F 00 00 00 00 00 00 00 00 00 00 {62} normal block at 0x0000028AAECB3190, 16 bytes long. 
Data: <ØW²¤ö > D8 57 B2 A4 F6 7F 00 00 00 00 00 00 00 00 00 00 {61} normal block at 0x0000028AAECB2BF0, 16 bytes long. Data: <P ²¤ö > 50 04 B2 A4 F6 7F 00 00 00 00 00 00 00 00 00 00 {60} normal block at 0x0000028AAECB2BA0, 16 bytes long. Data: <0 ²¤ö > 30 04 B2 A4 F6 7F 00 00 00 00 00 00 00 00 00 00 {59} normal block at 0x0000028AAECB3780, 16 bytes long. Data: <à ²¤ö > E0 02 B2 A4 F6 7F 00 00 00 00 00 00 00 00 00 00 {58} normal block at 0x0000028AAECB2B00, 16 bytes long. Data: < ²¤ö > 10 04 B2 A4 F6 7F 00 00 00 00 00 00 00 00 00 00 {57} normal block at 0x0000028AAECB2EC0, 16 bytes long. Data: <p ²¤ö > 70 04 B2 A4 F6 7F 00 00 00 00 00 00 00 00 00 00 {56} normal block at 0x0000028AAECB3910, 16 bytes long. Data: < À°¤ö > 18 C0 B0 A4 F6 7F 00 00 00 00 00 00 00 00 00 00 Object dump complete. </stderr_txt> <message> upload failure: <file_xfer_error> <file_name>e5s122_e2s172p0f91-ADRIA_AdB_KIXCMYB_HIP-1-2-RND2100_3_0</file_name> <error_code>-240 (stat() failed)</error_code> </file_xfer_error> </message> ]]> Technical data transfer issues due to poor server performance have persisted for many, many years with this project and should be resolved quickly. Michael. ____________ President of Rechenkraft.net - Germany's first and largest distributed computing organization. | |
ID: 57592 | Rating: 0 | |
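For anyone who wants to pull the failing file out of a stderr dump like the one above without scrolling through the heap report, here is a minimal Python sketch. It assumes the `<file_xfer_error>` block always has the shape shown in this post (a `<file_name>` followed by an `<error_code>` of the form `number (text)`); the function name `parse_upload_failures` is made up for illustration and is not part of BOINC.

```python
import re

# Pattern modelled on the <file_xfer_error> block quoted in the post above.
# Assumption: the error code is an integer followed by a reason in parentheses.
_XFER_ERROR = re.compile(
    r"<file_xfer_error>\s*"
    r"<file_name>(?P<name>[^<]+)</file_name>\s*"
    r"<error_code>(?P<code>-?\d+)\s*\((?P<reason>.*?)\)</error_code>\s*"
    r"</file_xfer_error>",
    re.DOTALL,
)


def parse_upload_failures(stderr_text):
    """Return a list of (file_name, error_code, reason) tuples found in the text."""
    return [
        (m.group("name"), int(m.group("code")), m.group("reason"))
        for m in _XFER_ERROR.finditer(stderr_text)
    ]
```

Fed the `<message>` section from the post, this should return one entry naming the `_3_0` file with code -240 and the reason `stat() failed`.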
Yes, this is a problem on the project's side: its server can't accept large files. | |
ID: 57593 | Rating: 0 | |
Actually, that one had an upload error on e5s122_e2s172p0f91-ADRIA_AdB_KIXCMYB_HIP-1-2-RND2100_3_0, not on the _9 file, which usually grows to ~500 MB and sometimes more. | |
ID: 57594 | Rating: 0 | |
"Yes, problem with the project that can't accept large file sizes." OMG. I can't believe this... Check out this one. "Actually, that one had an upload error on e5s122_e2s172p0f91-ADRIA_AdB_KIXCMYB_HIP-1-2-RND2100_3_0 - not the _9 file which usually grows to ~ 500 MB and sometimes more." I have taken a look at some of the tasks I had completed successfully. None of them had a _9 ending in the task name. Still, they had upload file sizes of approx. 500 MB. Hence, as far as I can observe, the file name does not reliably indicate the result file size. Michael. ____________ President of Rechenkraft.net - Germany's first and largest distributed computing organization. | |
ID: 57597 | Rating: 0 | |
The _9 doesn't refer to the task name; it refers to the upload file name. Each task generates multiple upload files. | |
ID: 57599 | Rating: 0 | |
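To make the naming point concrete, here is a small Python sketch that splits an upload file name into the task name and the trailing file index. It assumes, based only on the names quoted in this thread, that the upload file name is the task name plus `_` plus a number; `split_upload_name` is a hypothetical helper, not a BOINC function.

```python
def split_upload_name(upload_name):
    """Split 'TASKNAME_N' into ('TASKNAME', N), where N is the upload file index."""
    task_name, _, index = upload_name.rpartition("_")
    if not task_name or not index.isdigit():
        raise ValueError(f"not an upload file name of the form TASK_N: {upload_name!r}")
    return task_name, int(index)
```

With the file from the first post, this gives the task name ending in `_3` and file index 0, so the `_3` belongs to the task and the `_0` (or `_9`) to the individual upload file.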
Check out this one. It contains the line: Temporarily failed upload of e2s67_e1s44p0f1240-ADRIA_AdB_KIXCMYB_HIP-0-2-RND1963_3_9 | |
ID: 57600 | Rating: 0 | |
a partial cross-post | |
ID: 57611 | Rating: 0 | |
"a partial cross-post" Nothing you can do at the moment, unfortunately. | |
ID: 57614 | Rating: 0 | |
OK, I have made (multiple) backup copies of the entire GPU project folder, and have suspended transfers for now. | |
ID: 57616 | Rating: 0 | |
"OK, I have made (multiple) backup copies of the entire GPU project folder, and have for now suspended transfers." Even if you restart the task from a backup, you will have the same issue. The problem is that the output file is too big and cannot be uploaded, since it exceeds the maximum file size allowed by the project's server. Restarting the computation will generate the same file, still too big. The problem can only be solved by the project. | |
ID: 57619 | Rating: 0 | |
It may be a moot point, but I am NOT in the least interested in rerunning the WU (or, frankly, running ANY GPUGRID WUs for the foreseeable future). | |
ID: 57620 | Rating: 0 | |
You said that there was a file in your transfers tab. That’s the file that won’t upload because it’s too large. BOINC will already keep retrying indefinitely; retransferring all the files that have already been uploaded won’t make any difference. Each GPUGRID task produces several output files that all need to be uploaded. When the _9 file is too big, you run into this problem. Short of gaining control of the project’s upload server and changing their settings for them, there’s really nothing you can do at this point. | |
ID: 57621 | Rating: 0 | |
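If you want to see which output files are likely to hit this, a rough Python sketch follows. It only compares file sizes against a limit you supply; the real limit is set by the project and is not stated anywhere in this thread, so `limit_bytes` is a guess you have to fill in yourself. The folder path is also an assumption (your BOINC data directory may differ), and both function names are invented for this example.

```python
from pathlib import Path


def folder_sizes(folder):
    """Map each regular file directly inside `folder` to its size in bytes."""
    return {p.name: p.stat().st_size for p in Path(folder).iterdir() if p.is_file()}


def over_limit(sizes, limit_bytes):
    """Return (name, size) pairs larger than limit_bytes, biggest first."""
    big = [(name, size) for name, size in sizes.items() if size > limit_bytes]
    return sorted(big, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical path and limit; adjust both to your own installation.
    project_folder = "projects/www.gpugrid.net"
    guessed_limit = 500 * 1024 * 1024  # 500 MiB, an assumption, not the project's value
    if Path(project_folder).is_dir():
        for name, size in over_limit(folder_sizes(project_folder), guessed_limit):
            print(f"{name}: {size / 1024 / 1024:.1f} MiB")
```

This does not fix anything; it only tells you in advance which result files are large enough to be at risk.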
15-12-2024 08:58:22 | GPUGRID | Temporarily failed upload of 1eb6A00_300_1-ANTONIOM_MDCATH300r1se-9-50-RND1180_1_10: transient upload error | |
ID: 62033 | Rating: 0 | |
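Event log lines like the one above are easy to collect automatically if you save the log to a text file. The Python sketch below assumes the exact layout of that single line (timestamp, project, then `Temporarily failed upload of FILE: REASON`, separated by `|`); other client versions or languages may format it differently, and `parse_failed_upload` is a made-up name.

```python
import re

# Layout assumed from the one log line quoted in this thread.
_FAILED_UPLOAD = re.compile(
    r"^(?P<timestamp>.+?)\s*\|\s*(?P<project>.*?)\s*\|\s*"
    r"Temporarily failed upload of (?P<file>\S+): (?P<reason>.+)$"
)


def parse_failed_upload(line):
    """Return a dict with timestamp, project, file and reason, or None if no match."""
    match = _FAILED_UPLOAD.match(line.strip())
    return match.groupdict() if match else None
```

Running it over every line of a saved log and counting the results per file name shows quickly whether it is always the same upload file that fails.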
We can upload the file after some time, sometimes days, by clicking "Retry Now" on the Transfers tab in the Advanced view. | |
ID: 62040 | Rating: 0 | |
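The same retry can be scripted instead of clicked. The Python sketch below assumes the `boinccmd` command-line tool that ships with the BOINC client is on your PATH and uses its `--file_transfer URL filename retry` operation; the project URL shown is an assumption (use whatever URL your client lists for GPUGRID), and the function names are invented for this example.

```python
import subprocess


def retry_transfer_command(project_url, file_name, boinccmd="boinccmd"):
    """Build the boinccmd argument list that asks the client to retry one transfer."""
    return [boinccmd, "--file_transfer", project_url, file_name, "retry"]


def retry_transfer(project_url, file_name):
    """Run the retry command; returns the CompletedProcess so the caller can inspect it."""
    return subprocess.run(
        retry_transfer_command(project_url, file_name),
        capture_output=True,
        text=True,
        check=False,  # a non-zero exit is reported to the caller, not raised
    )


if __name__ == "__main__":
    # Assumed project URL and the file name from the log line earlier in the thread.
    print(" ".join(retry_transfer_command(
        "https://www.gpugrid.net/",
        "1eb6A00_300_1-ANTONIOM_MDCATH300r1se-9-50-RND1180_1_10",
    )))
```

As with the button, this only helps when the failure is transient; a file that is over the server's size limit will keep failing either way.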