hostinfo attributes refer to incorrect job_repo_url, causing tests to fail
Issue description
These two builds:
https://luci-milo.appspot.com/buildbot/chromeos/cyan-paladin/2810
https://luci-milo.appspot.com/buildbot/chromeos/cyan-paladin/2811
failed because a test doesn't exist. I don't see a relevant CL that the two builds have in common, and the test does appear to have existed for a while. In the last successful build, it was present and worked: http://cautotest.corp.google.com/afe/#tab_id=view_job&object_id=121823880
,
Jun 7 2017
Caroline appears to be running the same test. https://luci-milo.appspot.com/buildbot/chromeos/caroline-paladin/176
,
Jun 7 2017
ihf@ found it. https://pantheon.corp.google.com/storage/browser/chromeos-autotest-results/121885931-chromeos-test/chromeos4-row6-rack9-host12/debug/ 06/06 15:57:12.228 INFO | packages:0207| Successfully fetched packages.checksum from http://100.115.219.136:8082/static/cyan-release/R59-9460.57.0/autotest/packages/packages.checksum
,
Jun 7 2017
The point is that we are pulling CTS from the wrong branch.
,
Jun 7 2017
,
Jun 7 2017
We want to run cyan-paladin/R61-9624.0.0-rc2 but ask for the test artifacts of cyan-release/R59-9460.57.0. Note that the test is versioned, and the version differs between branches. Similarly, cheets_StartAndroid is still called cheets_CTSHelper on R59. So these tests can't be found in the old build's artifacts.
https://pantheon.corp.google.com/storage/browser/chromeos-autotest-results/121852956-chromeos-test/chromeos4-row6-rack9-host12/debug/
06/06 13:57:07.336 DEBUG| autoserv:0696| autoserv command was: /usr/local/autotest/server/autoserv -p -r /usr/local/autotest/results/121852956-chromeos-test/chromeos4-row6-rack9-host12 -m chromeos4-row6-rack9-host12 -u chromeos-test -l cyan-paladin/R61-9624.0.0-rc2/arc-bvt-cq/cheets_StartAndroid.stress.0 -c --lab True -P 121852956-chromeos-test/chromeos4-row6-rack9-host12 -n /usr/local/autotest/results/drone_tmp/attach.2318 --verify_job_repo_url
[...]
06/06 13:57:13.497 DEBUG| utils:0203| Running 'ssh 100.115.219.136 'curl "http://100.115.219.136:8082/is_staged?artifacts=autotest_packages&files=&archive_url=gs://chromeos-image-archive/cyan-release/R59-9460.57.0"''
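The mismatch described in this comment can be checked mechanically. Below is a minimal Python sketch, assuming the job_repo_url layout seen in the logs (http://<devserver>/static/<build>/autotest/packages). The helper names are hypothetical and are not part of autotest.

```python
import re

# Assumed URL layout, inferred from the log lines in this thread:
#   http://<devserver>:8082/static/<builder>/<version>/autotest/packages
_REPO_URL_RE = re.compile(
    r'^https?://[^/]+/static/(?P<build>.+)/autotest/packages$')


def build_from_job_repo_url(job_repo_url):
    """Return the '<builder>/<version>' part of a job_repo_url, or None."""
    match = _REPO_URL_RE.match(job_repo_url)
    return match.group('build') if match else None


def repo_url_matches_build(job_repo_url, requested_build):
    """Return True if the packages URL points at the requested build."""
    return build_from_job_repo_url(job_repo_url) == requested_build
```

With the values from the logs above, the requested build is cyan-paladin/R61-9624.0.0-rc2 while the URL resolves to cyan-release/R59-9460.57.0, so repo_url_matches_build would report a mismatch.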
,
Jun 7 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromiumos/chromite/+/56e6e4fef6776e2e8dc58052db578f98db1c2398 commit 56e6e4fef6776e2e8dc58052db578f98db1c2398 Author: Don Garrett <dgarrett@google.com> Date: Wed Jun 07 01:32:21 2017 chromeos_config: Make cyan-paladin experimental (workaround). Cyan is failing because of bug with how we handle CTS tests. No reason to bring the CQ down until it's fixed (hopefully tomorrow). BUG= chromium:730272 TEST=run_tests Change-Id: Id41c43058138fc2d6b0df5297832922b7e69d535 Reviewed-on: https://chromium-review.googlesource.com/526286 Tested-by: Don Garrett <dgarrett@chromium.org> Reviewed-by: Miao-chen Chou <mcchou@chromium.org> Commit-Queue: Don Garrett <dgarrett@chromium.org> [modify] https://crrev.com/56e6e4fef6776e2e8dc58052db578f98db1c2398/cbuildbot/config_dump.json [modify] https://crrev.com/56e6e4fef6776e2e8dc58052db578f98db1c2398/cbuildbot/chromeos_config.py
,
Jun 7 2017
We need to revert my change after this is fixed.
,
Jun 7 2017
And this needs to have an owner or it will get lost.
,
Jun 7 2017
,
Jun 7 2017
Adding non-PST folks to see if someone can make some headway on this since it's affecting our CQ coverage.
,
Jun 7 2017
Looking... during non-PST time.
,
Jun 7 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromiumos/chromite/+/558836ca3a2757fcff341abefbd1109c79ea4e12 commit 558836ca3a2757fcff341abefbd1109c79ea4e12 Author: David Riley <davidriley@chromium.org> Date: Wed Jun 07 05:20:28 2017 prebuilts_unittest: Temporary disable cyan test. cyan test is broken due to temporary revert to address crbug.com/730272 , so disable the currently broken test until cyan-paladin is re-enabled. BUG= chromium:730272 TEST=run_tests; prebuilts_unittet Change-Id: I77a5e4af56959ab4b7bcda24eb2ee7d37c720134 Reviewed-on: https://chromium-review.googlesource.com/526368 Commit-Queue: David Riley <davidriley@chromium.org> Tested-by: David Riley <davidriley@chromium.org> Trybot-Ready: David Riley <davidriley@chromium.org> Reviewed-by: Hsu Wei-Cheng <mojahsu@chromium.org> [modify] https://crrev.com/558836ca3a2757fcff341abefbd1109c79ea4e12/cbuildbot/prebuilts_unittest.py
,
Jun 7 2017
Update: 2810, 2811 failed. 2812 passed. 2813 failed again. https://luci-milo.appspot.com/buildbot/chromeos/cyan-paladin/2810 https://luci-milo.appspot.com/buildbot/chromeos/cyan-paladin/2811 https://luci-milo.appspot.com/buildbot/chromeos/cyan-paladin/2812 https://luci-milo.appspot.com/buildbot/chromeos/cyan-paladin/2813 In the log, host_info looks somehow unexpected. 2810. 06/06 13:57:07.678 DEBUG| host_info:0265| New host_info: HostInfo [Labels: ['bluetooth', 'hw_video_acc_enc_h264', 'hw_jpeg_acc_dec', 'ec:cros', 'hw_video_acc_vp8', 'hw_video_acc_vp9', 'cyan', 'pool:cq', 'audio_loopback_dongle', 'cts_abi_x86', 'cts_abi_arm', 'arc', 'internal_display', 'os:cros', 'power:battery', 'hw_video_acc_h264', 'storage:mmc', 'webcam', 'board:cyan', 'accel:cros-ec', 'servo', 'phase:PVT', 'touchpad', 'touchscreen', 'sku:cyan_intel_celeron_n3150_4Gb', 'variant:cyan', 'cros-version:cyan-paladin/R61-9624.0.0-rc2'], Attributes: {'powerunit_hostname': 'chromeos4-row5_6-rack9-rpm2', 'job_repo_url': 'http://100.115.219.136:8082/static/cyan-release/R59-9460.57.0/autotest/packages', 'powerunit_outlet': '.A12'}] 2811. 06/06 15:56:34.288 DEBUG| host_info:0265| New host_info: HostInfo [Labels: ['bluetooth', 'hw_video_acc_enc_h264', 'hw_jpeg_acc_dec', 'ec:cros', 'hw_video_acc_vp8', 'hw_video_acc_vp9', 'cyan', 'pool:cq', 'audio_loopback_dongle', 'cts_abi_x86', 'cts_abi_arm', 'arc', 'internal_display', 'os:cros', 'power:battery', 'hw_video_acc_h264', 'storage:mmc', 'webcam', 'board:cyan', 'accel:cros-ec', 'servo', 'phase:PVT', 'touchpad', 'touchscreen', 'sku:cyan_intel_celeron_n3150_4Gb', 'variant:cyan', 'cros-version:cyan-paladin/R61-9624.0.0-rc3'], Attributes: {'powerunit_hostname': 'chromeos4-row5_6-rack9-rpm2', 'job_repo_url': 'http://100.115.219.136:8082/static/cyan-release/R59-9460.57.0/autotest/packages', 'powerunit_outlet': '.A12'}] 2812. 
06/06 18:40:48.153 DEBUG| host_info:0265| New host_info: HostInfo [Labels: ['bluetooth', 'hw_video_acc_enc_h264', 'hw_jpeg_acc_dec', 'ec:cros', 'hw_video_acc_vp8', 'cyan', 'hw_video_acc_vp9', 'audio_loopback_dongle', 'cts_abi_x86', 'cts_abi_arm', 'board:cyan', 'internal_display', 'os:cros', 'power:battery', 'hw_video_acc_h264', 'storage:mmc', 'webcam', 'arc', 'servo', 'pool:cq', 'accel:cros-ec', 'phase:PVT', 'touchpad', 'touchscreen', 'sku:cyan_intel_celeron_n3050_2Gb', 'variant:cyan', 'cros-version:cyan-paladin/R61-9624.0.0-rc4'], Attributes: {'powerunit_hostname': 'chromeos4-row11_12-rack11-rpm1', 'job_repo_url': 'http://100.115.219.130:8082/static/cyan-paladin/R61-9623.0.0-rc4/autotest/packages', 'powerunit_outlet': '.A3'}] 2813. 06/06 21:11:21.310 DEBUG| host_info:0265| New host_info: HostInfo [Labels: ['bluetooth', 'hw_video_acc_enc_h264', 'hw_jpeg_acc_dec', 'ec:cros', 'hw_video_acc_vp8', 'hw_video_acc_vp9', 'cyan', 'pool:cq', 'audio_loopback_dongle', 'cts_abi_x86', 'cts_abi_arm', 'arc', 'internal_display', 'os:cros', 'power:battery', 'hw_video_acc_h264', 'storage:mmc', 'webcam', 'board:cyan', 'accel:cros-ec', 'servo', 'phase:PVT', 'touchpad', 'touchscreen', 'sku:cyan_intel_celeron_n3150_4Gb', 'variant:cyan', 'cros-version:cyan-paladin/R61-9625.0.0-rc1'], Attributes: {'powerunit_hostname': 'chromeos4-row5_6-rack9-rpm2', 'job_repo_url': 'http://100.115.219.136:8082/static/cyan-release/R59-9460.57.0/autotest/packages', 'powerunit_outlet': '.A12'}]
Specifically, HostInfo.Attributes.job_repo_url wrongly refers to the R59 release build in 2810, 2811 and 2813, while in 2812 it correctly refers to a paladin build. An autotest update was pushed today, and it contains some changes around host_info handling.
d252d26af [autotest] Populate HostInfoStore from cli/host.py
bbb2456ea [autotest] Use HostInfo in host factory to obtain host information
368abdf47 [autotest] Use HostInfo to access host attributes
7a3675fc5 [autotest] Use HostInfo to update version labels
cbeab1213 [autotest] Respect exclusive --in-lab or --host-attributes arguments
I suspect these changes.
,
Jun 7 2017
,
Jun 7 2017
All three cheets_StartAndroid.stress failures ran on chromeos4-row6-rack9-host12.
According to that host's job history:
----
Job 121812634 chromeos-test cyan-paladin/R61-9623.0.0-rc3/suite_attr_wrapper/cheets_SELinuxTest 2017-06-06 08:25:37 Completed
06/06 08:25:42.817 DEBUG| host_info:0234| New host_info: HostInfo [Labels: ['bluetooth', 'hw_video_acc_enc_h264', 'hw_jpeg_acc_dec', 'ec:cros', 'hw_video_acc_vp8', 'hw_video_acc_vp9', 'cyan', 'pool:cq', 'audio_loopback_dongle', 'cts_abi_x86', 'cts_abi_arm', 'arc', 'internal_display', 'os:cros', 'power:battery', 'hw_video_acc_h264', 'storage:mmc', 'webcam', 'board:cyan', 'accel:cros-ec', 'servo', 'phase:PVT', 'touchpad', 'touchscreen', 'sku:cyan_intel_celeron_n3150_4Gb', 'variant:cyan', 'cros-version:cyan-paladin/R61-9623.0.0-rc3'], Attributes: {'powerunit_hostname': 'chromeos4-row5_6-rack9-rpm2', 'job_repo_url': 'http://100.115.219.136:8082/static/cyan-paladin/R61-9623.0.0-rc3/autotest/packages', 'powerunit_outlet': '.A12'}]
This looks correct.
----
Job 121817893 chromeos-test cyan-paladin/R61-9623.0.0-rc4/arc-bvt-cq/cheets_CTS_N.7.1_r5.arm.CtsAccountManagerTestCases Failed
06/06 09:31:02.825 DEBUG| host_info:0234| New host_info: HostInfo [Labels: ['bluetooth', 'hw_video_acc_enc_h264', 'hw_jpeg_acc_dec', 'ec:cros', 'hw_video_acc_vp8', 'hw_video_acc_vp9', 'cyan', 'pool:cq', 'audio_loopback_dongle', 'cts_abi_x86', 'cts_abi_arm', 'arc', 'internal_display', 'os:cros', 'power:battery', 'hw_video_acc_h264', 'storage:mmc', 'webcam', 'board:cyan', 'accel:cros-ec', 'servo', 'phase:PVT', 'touchpad', 'touchscreen', 'sku:cyan_intel_celeron_n3150_4Gb', 'variant:cyan', 'cros-version:cyan-paladin/R61-9623.0.0-rc3'], Attributes: {'powerunit_hostname': 'chromeos4-row5_6-rack9-rpm2', 'job_repo_url': 'http://100.115.219.136:8082/static/cyan-paladin/R61-9623.0.0-rc3/autotest/packages', 'powerunit_outlet': '.A12'}]
This appears to refer to the old configuration (cros-version also refers to rc3 rather than rc4).
----
Job 121826391 chromeos-test cyan-paladin/R61-9624.0.0-rc1/suite_attr_wrapper/cheets_CTS_N.7.1_r6.x86.CtsOpenGLTestCases 2017-06-06 11:54:16 Completed
06/06 11:55:25.842 DEBUG| host_info:0221| New host_info: HostInfo [Labels: ['bluetooth', 'hw_video_acc_enc_h264', 'hw_jpeg_acc_dec', 'ec:cros', 'hw_video_acc_vp8', 'hw_video_acc_vp9', 'cyan', 'pool:cq', 'audio_loopback_dongle', 'cts_abi_x86', 'cts_abi_arm', 'arc', 'internal_display', 'os:cros', 'power:battery', 'hw_video_acc_h264', 'storage:mmc', 'webcam', 'board:cyan', 'accel:cros-ec', 'servo', 'phase:PVT', 'touchpad', 'touchscreen', 'sku:cyan_intel_celeron_n3150_4Gb', 'variant:cyan', 'cros-version:cyan-paladin/R61-9624.0.0-rc1'], Attributes: {'powerunit_hostname': 'chromeos4-row5_6-rack9-rpm2', 'job_repo_url': 'http://100.115.219.136:8082/static/cyan-release/R59-9460.57.0/autotest/packages', 'powerunit_outlet': '.A12'}
This has the correct cros-version but the wrong job_repo_url (it refers to cyan-release).
Of those five CLs, this one looks the most suspicious:
368abdf47 [autotest] Use HostInfo to access host attributes
https://chromium-review.googlesource.com/c/517773/
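The manual comparison above (cros-version label versus job_repo_url in each "New host_info" line) could be scripted. This is a minimal Python sketch, assuming the log format shown in this thread; the function name and regular expressions are hypothetical and are not part of autotest.

```python
import re

# Patterns inferred from the "New host_info: HostInfo [...]" log lines above.
_VERSION_RE = re.compile(r"'cros-version:([^']+)'")
_REPO_RE = re.compile(
    r"'job_repo_url': 'https?://[^/]+/static/([^']+)/autotest/packages'")


def check_host_info_line(line):
    """Return (cros_version, repo_build, consistent) for one log line.

    cros_version and repo_build are None when missing from the line.
    """
    version = _VERSION_RE.search(line)
    repo = _REPO_RE.search(line)
    cros_version = version.group(1) if version else None
    repo_build = repo.group(1) if repo else None
    consistent = cros_version is not None and cros_version == repo_build
    return cros_version, repo_build, consistent
```

Note that this only compares the label with the attribute on the same host. It would not flag job 121817893, where both values agree with each other (rc3) but are stale relative to the build the job asked for (rc4).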
,
Jun 7 2017
I'll take a look. Update in ~ 1 hour
,
Jun 7 2017
https://luci-milo.appspot.com/buildbot/chromeos/cyan-paladin/2810 Looking at another test (cheets_GTS.4.1_r2.GtsNetTestCases) from the same build, the job_repo_url is correctly pointing to R61. https://pantheon.corp.google.com/storage/browser/chromeos-autotest-results/121852964-chromeos-test/chromeos4-row6-rack9-host3/debug/ So what's special about the test that failed (consistently in cyan build 2810, 2811, 2813, 2815): cheets_StartAndroid.stress.0 One thing to note is that every instance of this test ran on chromeos4-row6-rack9-host12 Focussing on the first failed instance: http://cautotest.corp.google.com/afe/#tab_id=view_job&object_id=121852956 The provision job before that failure was: https://pantheon.corp.google.com/storage/browser/chromeos-autotest-results/hosts/chromeos4-row6-rack9-host12/407186-provision/ (other tests had since run) When updating HostInfo, that job neither updated nor removed the job_repo_url (so that after this point, job_repo_url was wrong) 06/06 13:43:51.275 DEBUG| dev_server:2122| CrOS auto-update succeed for host chromeos4-row6-rack9-host12 06/06 13:43:51.279 DEBUG| host_info:0221| Committing HostInfo to store <autotest_lib.server.hosts.afe_store.AfeStore object at 0x7f182cccc790> 06/06 13:43:51.342 DEBUG| afe_store:0079| removing labels: ['cros-version:cyan-paladin/R61-9624.0.0-rc1'] 06/06 13:43:51.631 INFO | afe_store:0095| adding labels: ['cros-version:cyan-paladin/R61-9624.0.0-rc2'] 06/06 13:43:52.230 DEBUG| host_info:0225| HostInfo updated to: HostInfo [Labels: ['bluetooth', 'hw_video_acc_enc_h264', 'hw_jpeg_acc_dec', 'ec:cros', 'hw_video_acc_vp8', 'hw_video_acc_vp9', 'cyan', 'pool:cq', 'audio_loopback_dongle', 'cts_abi_x86', 'cts_abi_arm', 'arc', 'internal_display', 'os:cros', 'power:battery', 'hw_video_acc_h264', 'storage:mmc', 'webcam', 'board:cyan', 'accel:cros-ec', 'servo', 'phase:PVT', 'touchpad', 'touchscreen', 'sku:cyan_intel_celeron_n3150_4Gb', 'variant:cyan', 'cros-version:cyan-paladin/R61-9624.0.0-rc2'], Attributes: 
{'powerunit_hostname': 'chromeos4-row5_6-rack9-rpm2', 'job_repo_url': 'http://100.115.219.136:8082/static/cyan-release/R59-9460.57.0/autotest/packages', 'powerunit_outlet': '.A12'}]
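The failure mode seen in that provision log, where the version label is replaced but job_repo_url is left untouched, can be illustrated with a toy model. This is a sketch only and is not autotest code: ToyHostInfo and both provision functions are hypothetical, and only the URL layout is taken from the logs.

```python
class ToyHostInfo:
    """Hypothetical stand-in for a host's version label and attributes."""

    def __init__(self, cros_version, job_repo_url):
        self.cros_version = cros_version
        self.job_repo_url = job_repo_url


def repo_url_for(devserver, build):
    """Build a packages URL using the layout seen in the logs."""
    return 'http://%s/static/%s/autotest/packages' % (devserver, build)


def provision_labels_only(info, build):
    """Buggy variant: only the version label is refreshed."""
    info.cros_version = build


def provision_labels_and_attributes(info, devserver, build):
    """Variant that refreshes the label and job_repo_url together."""
    info.cros_version = build
    info.job_repo_url = repo_url_for(devserver, build)
```

In the first variant a stale job_repo_url survives every later provision, which matches the observation that the attribute stayed wrong after this point.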
,
Jun 7 2017
This is a fix: https://chromium-review.googlesource.com/527276 ... working on verification.
,
Jun 7 2017
Locally verified that my change doesn't blow anything up and fixes the problem seen. (a provision job against cyan DUT works, and updates version labels and job_repo_url as expected) There's only been one change in the autotest repo since the last prod push, and that only touches some site_tests, so I'm going to go full blind with a prod push here, to get my change in fast and verify the fix in-situ.
,
Jun 7 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromiumos/third_party/autotest/+/2aebbbe3b40950de80974a0734bc773854ddc5b1 commit 2aebbbe3b40950de80974a0734bc773854ddc5b1 Author: Prathmesh Prabhu <pprabhu@chromium.org> Date: Wed Jun 07 17:34:12 2017 [autotest] Aggressively commit HostInfo back to AFE. This CL is a much-too-aggressive commit of the HostInfo back to its store. We found one place where the info udpates were stepping over its own feet. Rather than audit right now if another such instance exists within |machine_install| assume the worst. BUG= chromium:730272 TEST=TBD. Change-Id: Ia7d80fc3ea3a175f387f1d268cfd9e9f9b002a22 Reviewed-on: https://chromium-review.googlesource.com/527276 Reviewed-by: Aviv Keshet <akeshet@chromium.org> Tested-by: Prathmesh Prabhu <pprabhu@chromium.org> [modify] https://crrev.com/2aebbbe3b40950de80974a0734bc773854ddc5b1/server/afe_utils.py
,
Jun 7 2017
actually, I got really lucky: pprabhu@pprabhu:files$ git log cros/prod-next..cros/master commit 2aebbbe3b40950de80974a0734bc773854ddc5b1 (HEAD, m/master, cros/master) Author: Prathmesh Prabhu <pprabhu@chromium.org> Date: Wed Jun 7 09:48:33 2017 -0700 [autotest] Aggressively commit HostInfo back to AFE. This CL is a much-too-aggressive commit of the HostInfo back to its store. We found one place where the info udpates were stepping over its own feet. Rather than audit right now if another such instance exists within |machine_install| assume the worst. BUG= chromium:730272 TEST=TBD. Change-Id: Ia7d80fc3ea3a175f387f1d268cfd9e9f9b002a22 Reviewed-on: https://chromium-review.googlesource.com/527276 Reviewed-by: Aviv Keshet <akeshet@chromium.org> Tested-by: Prathmesh Prabhu <pprabhu@chromium.org> My CL is the only CL to be pushed. Since push-to-prod test didn't catch the problem, it clearly won't catch the fix. I'm going blind.
,
Jun 7 2017
My confidence in that fix is ~80%. The problem is that this should have failed more catastrophically (and in fact I kept an eye on things going all red yesterday afternoon given the risk). Looking at the cyan DUTs in the cq pool:
pprabhu@pprabhu:files$ dut-status -b cyan -p cq -w | xargs atest host stat | grep job_repo_url
job_repo_url : http://100.115.219.135:8082/static/cyan-release/R59-9460.50.0/autotest/packages
job_repo_url : http://100.115.219.135:8082/static/cyan-release/R59-9460.50.0/autotest/packages
job_repo_url : http://100.115.219.135:8082/static/cyan-release/R59-9460.50.0/autotest/packages
job_repo_url : http://100.115.219.135:8082/static/cyan-release/R59-9460.50.0/autotest/packages
job_repo_url : http://100.115.219.135:8082/static/cyan-release/R59-9460.50.0/autotest/packages
job_repo_url : http://100.115.219.129:8082/static/cyan-release/R58-9334.58.0/autotest/packages
... if job_repo_url is the problem, why isn't _everything_ failing?
,
Jun 7 2017
Ah #6 explains why only this test fails so badly. Worse, other tests are likely currently using incorrect artifacts (testing the wrong things...) That's what we get for duplicating information :/ Job to validate fix in prod: http://chromeos-server97.mtv.corp.google.com/afe/#tab_id=view_job&object_id=122040337
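The reasoning here can be made concrete with a small Python sketch: a stale job_repo_url only causes a hard failure when the requested test is missing from the old artifacts, and otherwise the job silently runs the old copy. The function and the artifact set below are hypothetical. Only the cheets_StartAndroid / cheets_CTSHelper rename comes from this thread, and 'some_unchanged_test' is a made-up placeholder.

```python
def resolve_test(requested_test, stale_artifact_tests):
    """Describe what happens when a test is looked up in stale artifacts."""
    if requested_test in stale_artifact_tests:
        # The test exists under the same name, so the job runs, but with
        # the old branch's copy of the test.
        return 'runs with wrong artifacts'
    # The name does not exist on the old branch, so the job fails outright.
    return 'fails: test not found'


# Illustrative contents only, not the real R59 package list.
r59_tests = {'cheets_CTSHelper', 'some_unchanged_test'}

for name in ('cheets_StartAndroid', 'some_unchanged_test'):
    print(name, '->', resolve_test(name, r59_tests))
```

The second outcome is the quieter and arguably worse one, since nothing fails visibly while the wrong test code is exercised.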
,
Jun 7 2017
Re #25: That job got past the provision step correctly but was later aborted by the autotest_system. It was given a mere 5 minutes to run. Wonder why?... Probably because my test_that command needed some magic arguments that were missing. I will wait for cyan-paladin to cycle green before landing: https://chromium-review.googlesource.com/c/527336 cyan-paladin seems to be having unrelated problems. But this issue is fixed, methinks.
,
Jun 7 2017
OK, it was aborted because the suite created by test_that didn't give it enough time: Will return from run_suite with status: SUITE_TIMEOUT
,
Jun 7 2017
,
Jun 7 2017
This happened again on chell-release:1174 and 1173. https://uberchromegw.corp.google.com/i/chromeos/builders/chell-release/builds/1173 https://uberchromegw.corp.google.com/i/chromeos/builders/chell-release/builds/1174
,
Jun 7 2017
Re #29: Those failures are from before the fix was pushed around 11:00 AM PST today.
,
Jun 7 2017
Issue 730431 has been merged into this issue.
,
Jun 7 2017
Issue 730683 has been merged into this issue.
,
Jun 7 2017
I was wrong (partially). 730431 uncovered another failure mode we hadn't seen earlier. I've landed and pushed (blind) a fix: https://chromium-review.googlesource.com/c/527664/ I'll validate that failure mode is gone in prod as well. I'm very very confused why these things are failing today (as opposed to yesterday).
,
Jun 7 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromiumos/third_party/autotest/+/de08996f8cd194f4ed87a27a0c5ab8f6b37590e1 commit de08996f8cd194f4ed87a27a0c5ab8f6b37590e1 Author: Prathmesh Prabhu <pprabhu@chromium.org> Date: Wed Jun 07 21:31:04 2017 [autotest] Commit host attributes when provision via devserver A case of bad indentation and blocking of code. This was allowing provision to work in my local setup and moblab, but failing to update host attributes in the lab. BUG= chromium:730272 TEST=None. I'm going blind. Change-Id: I198759ed57cdc844c5b125ab51dcf31ac4fcc456 Reviewed-on: https://chromium-review.googlesource.com/527664 Reviewed-by: Prathmesh Prabhu <pprabhu@chromium.org> Commit-Queue: Prathmesh Prabhu <pprabhu@chromium.org> Tested-by: Prathmesh Prabhu <pprabhu@chromium.org> [modify] https://crrev.com/de08996f8cd194f4ed87a27a0c5ab8f6b37590e1/server/afe_utils.py
,
Jun 7 2017
OK, the fix has been pushed (again). Here's a job I aborted and re-launched in the lab to validate the fix (again): http://chromeos-server98.mtv.corp.google.com/afe/#tab_id=view_job&object_id=122062222
,
Jun 7 2017
These two issues have left jobs and DUTs in the lab in a weird state.
[1] DUTs: They have the wrong cros-version: and/or job_repo_url versions. Coupled with another stupid bug (https://chromium-review.googlesource.com/c/527600/), this means that tests running on these DUTs without first provisioning anew can hang until some timeout aborts the job.
[2] Currently, there are a bunch of jobs in this state.
[1] is less of a problem to the builders, since a new build implies a new provision. I need to find a way to abort jobs stuck in [2] though, to move the stuck builders ahead (via failures).
,
Jun 7 2017
Issue 730787 has been merged into this issue.
,
Jun 7 2017
The job in #36 succeeded. We've restarted CQ and PFQ builders. A lab push (#35) is in progress right now, and as long as that finishes before HWTest stage from the next builds, I expect to recover.
,
Jun 8 2017
Current CQ run has one slave failure but that is not related to this bug. All other slaves have cycled green. Declaring victory.
,
Jun 8 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromiumos/chromite/+/31bce139f55f16180a154f73aac70ef9cdfe7773 commit 31bce139f55f16180a154f73aac70ef9cdfe7773 Author: Prathmesh Prabhu <pprabhu@chromium.org> Date: Thu Jun 08 06:45:44 2017 Revert "chromeos_config: Make cyan-paladin experimental (workaround)." This reverts commit 56e6e4fef6776e2e8dc58052db578f98db1c2398. Reason for revert: Underlying bug failing cyan HWTests has been fixed. Original change's description: > chromeos_config: Make cyan-paladin experimental (workaround). > > Cyan is failing because of bug with how we handle CTS tests. No reason > to bring the CQ down until it's fixed (hopefully tomorrow). > > BUG= chromium:730272 > TEST=run_tests > > Change-Id: Id41c43058138fc2d6b0df5297832922b7e69d535 > Reviewed-on: https://chromium-review.googlesource.com/526286 > Tested-by: Don Garrett <dgarrett@chromium.org> > Reviewed-by: Miao-chen Chou <mcchou@chromium.org> > Commit-Queue: Don Garrett <dgarrett@chromium.org> BUG= chromium:730272 Change-Id: Id31bc092d6d52db478138b82aeccb6760184554a Reviewed-on: https://chromium-review.googlesource.com/527336 Commit-Ready: Prathmesh Prabhu <pprabhu@chromium.org> Tested-by: Prathmesh Prabhu <pprabhu@chromium.org> Reviewed-by: Prathmesh Prabhu <pprabhu@chromium.org> Reviewed-by: David Riley <davidriley@chromium.org> [modify] https://crrev.com/31bce139f55f16180a154f73aac70ef9cdfe7773/cbuildbot/config_dump.json [modify] https://crrev.com/31bce139f55f16180a154f73aac70ef9cdfe7773/cbuildbot/chromeos_config.py
Comment 1 by dgarr...@chromium.org
, Jun 7 2017