test logs are not uploaded to GS bucket
Issue description

I have seen this several times. Here is an example:
https://uberchromegw.corp.google.com/i/chromeos/builders/falco_li-release/builds/825

The Paygen stdio indicates test failure with "host did not return from reboot":

autoupdate_EndToEndTest_paygen_au_canary_full_9129.0.0: http://cautotest.corp.google.com/afe/#tab_id=view_job&object_id=93992696

The suite job has another 1:29:31.074439 till timeout.

Suite job [ PASSED ]
autoupdate_EndToEndTest.paygen_au_canary_delta [ PASSED ]
autoupdate_EndToEndTest.paygen_au_canary_full [ PASSED ]
autoupdate_EndToEndTest.paygen_au_canary_full [ PASSED ]
autoupdate_EndToEndTest.paygen_au_canary_delta [ PASSED ]
autoupdate_EndToEndTest.paygen_au_canary_delta retry_count: 1
autoupdate_EndToEndTest.paygen_au_canary_full [ FAILED ]
autoupdate_EndToEndTest.paygen_au_canary_full ABORT: Host did not return from reboot
autoupdate_EndToEndTest.paygen_au_canary_full retry_count: 1

Links to test logs:
Suite job http://cautotest/tko/retrieve_logs.cgi?job=/results/93977028-chromeos-test/
autoupdate_EndToEndTest.paygen_au_canary_delta http://cautotest/tko/retrieve_logs.cgi?job=/results/93977629-chromeos-test/
autoupdate_EndToEndTest.paygen_au_canary_full http://cautotest/tko/retrieve_logs.cgi?job=/results/93977652-chromeos-test/
autoupdate_EndToEndTest.paygen_au_canary_full http://cautotest/tko/retrieve_logs.cgi?job=/results/93977693-chromeos-test/
autoupdate_EndToEndTest.paygen_au_canary_delta http://cautotest/tko/retrieve_logs.cgi?job=/results/93988318-chromeos-test/
autoupdate_EndToEndTest.paygen_au_canary_full http://cautotest/tko/retrieve_logs.cgi?job=/results/93992696-chromeos-test/

Normally I can see the details of the failure, but in this case the bucket for the failed job:
https://pantheon.corp.google.com/storage/browser/chromeos-autotest-results/93992696-chromeos-test/
is empty ("There are no objects in this folder").
Jan 4 2017
Jan 9 2017
I didn't get a chance to look into this bug. @deputies, please find an owner for the test log uploading problem.
Jan 17 2017
Jan 18 2017
https://pantheon.corp.google.com/storage/browser/chromeos-autotest-results/93992696-chromeos-test/ It loads for me. Sometimes there is a delay in the log upload. cautotest should redirect you to the shard to list logs that have not been uploaded yet. However, sometimes Apache runs into an issue and fails to respond, which is why it redirected you to an empty GS page.
Jan 18 2017
Do you really think this should not be fixed? I am running into this a lot. Are you saying that there is no way to return an error, instead of reporting "there are no objects in this folder"? It makes a big difference to know that the logs will be uploaded later, rather than just thinking "oh drats another infra bug" and never checking again. It would help to just leave an initial file in the bucket with some information about when to expect the logs to arrive.
Jan 18 2017
This is related to the whole log collection workflow. In some cases, like CQ runs, logs for passed tests won't even be uploaded to GS. They are kept on the drone/shard for 24 hours and then deleted. "leave an initial file in the bucket": that requires the test to have up-front knowledge of what to upload and where, which is not likely to work with the current flow. The root cause of this issue is that Apache becomes unresponsive on the shard/drone. We have Nagios alerts on Apache. Whenever I receive that alert, I run "service apache2 reload" to fix the problem. Ideally we could automate that (it would require a bunch of permission hacks for the nagios user to reload Apache). Also, we don't understand why Apache becomes unresponsive in the first place, which may deserve more investigation.
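A rough sketch of what automating that reload could look like, in Python. It only illustrates the idea and is not the lab's actual tooling: the probe URL, the retry count, and the function names are assumptions, and the reload command needs root or an equivalent sudoers rule for whichever user runs it.

```python
import subprocess
import urllib.error
import urllib.request
from typing import Callable


def apache_responds(url: str = "http://localhost/", timeout: float = 10.0) -> bool:
    """Return True if the server answers the probe with any HTTP status.

    The URL is a placeholder; a real check would probe the log-serving path.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # An HTTP error status still means Apache answered the request.
        return True
    except OSError:
        # Connection refused, timeout, DNS failure, and similar.
        return False


def reload_apache() -> None:
    # Command quoted from the comment above; requires sufficient privileges.
    subprocess.run(["service", "apache2", "reload"], check=True)


def check_and_reload(
    probe: Callable[[], bool] = apache_responds,
    reload: Callable[[], None] = reload_apache,
    retries: int = 3,
) -> bool:
    """Reload only if every probe attempt fails.

    Returns True if a reload was issued, False if the server responded.
    """
    for _ in range(retries):
        if probe():
            return False
    reload()
    return True
```

Passing the probe and reload steps in as callables keeps the decision logic testable without touching a real server. This does nothing to explain why Apache stops responding; it only shortens the time until someone reloads it.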
Jan 18 2017
Oh wait, I think I understand now. The message "there are no objects in this folder" leaves the impression that there is an empty folder. But the truth is that there is no folder. Try with an impossible URL:
https://pantheon.corp.google.com/storage/browser/chromeos-autotest-results/1234567890123456789012345678901234567890-chromeos-test/
Result: "There are no objects in this folder"
So that's where the bug is. The message should say "This folder does not exist". If the semantics of GS make it impossible to distinguish between an empty folder and a non-existent folder, the message should say so: "This folder does not exist or it is empty". If there are no folders at all (i.e. directories in the traditional sense), the message should be "There are no objects with this prefix" (although this is unlikely to be the case, because of performance). I would like to pass this along to GS, but if you have any input I'd love to hear it.
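To illustrate the third possibility, here is a toy Python model of a flat object namespace, where a "folder" is nothing more than a shared name prefix. Whether the bucket actually behaves this way is an assumption the thread does not confirm, and the object name below is made up. Under that model, a listing for a folder that never existed and one for a folder whose contents have not arrived yet look identical.

```python
from typing import Iterable, List


def list_prefix(object_names: Iterable[str], prefix: str) -> List[str]:
    """Return the object names that start with the given prefix, sorted."""
    return sorted(name for name in object_names if name.startswith(prefix))


def describe_listing(object_names: Iterable[str], prefix: str) -> str:
    """Produce a message that does not claim the folder exists."""
    matches = list_prefix(object_names, prefix)
    if matches:
        return f"{len(matches)} object(s) found"
    return "There are no objects with this prefix"


# Hypothetical bucket contents: one uploaded job, nothing for the failed job.
bucket = {
    "93977028-chromeos-test/status.log",  # made-up object name
}

# Both lookups return an empty list, so the two cases cannot be told apart.
not_uploaded_yet = list_prefix(bucket, "93992696-chromeos-test/")
never_existed = list_prefix(
    bucket, "1234567890123456789012345678901234567890-chromeos-test/"
)
```

In this model the only honest message for an empty result is the prefix-based one, because there is no folder object whose existence could be checked.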
Jan 18 2017
A better error message would certainly help. I have just trained my brain to translate "no objects in this folder" to "results not available on GS".
Jan 18 2017
I have had a discussion on g/cloud-storage-discuss and g/gcs-hotline about this. As a result, they opened b/34388551. So we're done here; we can close this issue once that bug is closed.
Mar 16 2018
Bulk closing Infra>Client>ChromeOS issues untouched in over a year.
Comment 1 by semenzato@chromium.org, Jan 2 2017