Shards hanging when retrieving test results
Issue description
I'm trying to debug the failures from the latest CQ, and a bunch of shards are just timing out:
https://chromeos-server42.cbf.corp.google.com/results/120840654-chromeos-test/hostless/
https://chromeos-server97.mtv.corp.google.com/afe/#tab_id=view_job&object_id=120840699
https://chromeos-server26.mtv.corp.google.com/results/120840699-chromeos-test/chromeos4-row12-rack11-host3/debug
I don't see anything in the machine stats for these machines that would indicate a problem:
https://viceroy.corp.google.com/chromeos/machines?hostname=chromeos-server42&duration=1d&refresh=-1
https://viceroy.corp.google.com/chromeos/machines?hostname=chromeos-server97&duration=1d&refresh=-1
https://viceroy.corp.google.com/chromeos/machines?hostname=chromeos-server26&duration=1d&refresh=-1
I'll try restarting apache to see if that helps.
,
Jun 2 2017
Could this be some kind of network or firewall configuration change?
,
Jun 2 2017
,
Jun 2 2017
Do we know when this happened? I see that git versions on most of our servers changed around noon today: https://viceroy.corp.google.com/chromeos/deputy-view#_VG_OuDr13hG Was there a push-to-prod?
,
Jun 2 2017
I was just poking around this, and discovered that "https:" fails as described, but "http:" works just fine. I've observed this on two separate hosts.
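For anyone who wants to repeat the comparison, here is a minimal sketch of the check in Python. It is only an illustration, not how the check above was actually done: the function names `probe` and `compare_schemes`, the 10-second timeout, and the injectable `opener` parameter are all assumptions made for this example.

```python
import socket
import urllib.error
import urllib.request


def probe(url, timeout=10, opener=urllib.request.urlopen):
    """Fetch one URL and summarize the outcome as a short string."""
    try:
        with opener(url, timeout=timeout) as resp:
            return "ok (HTTP %d)" % resp.status
    except socket.timeout:
        # Raised directly when the read stalls after connecting.
        return "timeout"
    except urllib.error.HTTPError as e:
        # The server answered, just not with a success status.
        return "http error %d" % e.code
    except urllib.error.URLError as e:
        # Connection-level failures; a connect timeout is wrapped here.
        if isinstance(e.reason, socket.timeout):
            return "timeout"
        return "error: %s" % e.reason


def compare_schemes(host, path="/", timeout=10,
                    opener=urllib.request.urlopen):
    """Probe the same host and path over http: and https:."""
    return {
        scheme: probe("%s://%s%s" % (scheme, host, path), timeout, opener)
        for scheme in ("http", "https")
    }
```

Calling `compare_schemes(host, "/results/")` with one of the server hostnames from the issue description returns a dict keyed by scheme, so a hang on "https:" shows up as "timeout" next to whatever "http:" returns.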
,
Jun 2 2017
Following up: I note that RPC uses "http:", so that's why we've seen no lab failures being provoked by this problem.
,
Jun 2 2017
re: #4, Yes, there was a push to prod.
autotest:
f4610bdd3 Revert "autotest: disable video power tests if AC state is unexpected"
e1729bb15 autotest: add `atest server list -N` option to list only hostnames
9857f8754 [moblab] Add new featutre to run suite to limit the tests run.
61060f8dc chrome_cr50: check ccd_lock at the end of the unlock process
5696954d3 autotest: add metric to track whether test_push passes
9a0ce5604 autotest: add test to cr50 responds to CCD disable flag
828e78005 autotest: temporarily remove autotest_SyncControl from push_to_prod
73fd8d86a Add testtracker_owner
568275829 Added required dtbo_a image for some Android devices
Chromite:
2ddf2434 Use USE_GOMA instead of USE.
1881f9dd metrics: Stop catching AttributeError.
3b8eaf1b Update config settings by config-updater.
,
Jun 2 2017
I've speculatively marked two CLs as Verified: -1:
https://chromium-review.googlesource.com/c/514822/
https://chromium-review.googlesource.com/c/509211/
It could be that the SSH update killed the DUTs completely.
,
Jun 2 2017
#8 is unrelated to this bug. But it did recover the CQ ;)
,
Jun 2 2017
Also, https:// never worked, afaict.
Comment 1 by pho...@chromium.org
, Jun 2 2017