passing builds marked as FAILED in cidb
Issue description
,
Feb 10 2017
,
Feb 10 2017
weird
,
Feb 10 2017
It's marked as FAIL in CIDB:

mysql> select * from buildTable where id
  id: 1316536
  last_updated: 2017-02-10 16:37:55
  master_build_id: NULL
  buildbot_generation: 1
  builder_name: master-paladin
  waterfall: chromeos
  build_number: 13622
  build_config: master-paladin
  bot_hostname: cros-wimpy0-c2.c.chromeos-bot.internal
  start_time: 2017-02-10 14:50:50
  finish_time: 2017-02-10 16:37:55
  status: fail
  status_pickle: NULL
  build_type: paladin
  chrome_version: NULL
  milestone_version: 58
  platform_version: 9270.0.0-rc3
  full_version: R58-9270.0.0-rc3
  sdk_version: NULL
  toolchain_url: 2017/02/%(target)s-2017.02.08.023328.tar.xz
  final: 1
  metadata_url: gs://chromeos-image-archive/master-paladin/R58-9270.0.0-rc3/metadata.json
  summary:
  deadline: 2017-02-10 19:21:46
  important: 1
  buildbucket_id: NULL
  unibuild: 0
1 row in set (0.06 sec)
,
Feb 10 2017
https://chromium-review.googlesource.com/c/431158/13/cbuildbot/stages/report_stages.py#967
This should be what is causing the error:
BUILDER_STATUS_FAILED = 'fail'
BUILDER_STATUS_PASSED = 'pass'
FINAL_STATUS_PASSED = 'passed'
FINAL_STATUS_FAILED = 'failed'
,
Feb 10 2017
,
Feb 10 2017
Please replace all uses of (FINAL_STATUS_PASSED, FINAL_STATUS_FAILED) with (BUILDER_STATUS_PASSED, BUILDER_STATUS_FAILED), and then remove (FINAL_STATUS_PASSED, FINAL_STATUS_FAILED):
https://cs.corp.google.com/chromeos_public/chromite/cbuildbot/stages/report_stages.py?type=cs&q=FINAL_STATUS_PASSED&sq=package:%5Echromeos_(internal%7Cpublic)$&l=1025
,
Feb 10 2017
,
Feb 10 2017
Let's say P1 actually. No user-visible impact, but significant impact on deputy flow and on our metrics.
,
Feb 10 2017
,
Feb 10 2017
Wait, is this an ongoing issue, or was it a one-off?
,
Feb 10 2017
Maybe a chump of https://chromium-review.googlesource.com/c/431896/ can solve this problem.
,
Feb 10 2017
Possibly. If that doesn't work, then we could review both that and the previous CL.
Here's a success:
https://chromiumos-build-annotator.googleplex.com/build_annotations/edit_annotations/master-paladin/1316086/?
https://uberchromegw.corp.google.com/i/chromeos/builders/master-paladin/builds/13619/steps/recipe%20result/logs/stdio
and a failure:
https://chromiumos-build-annotator.googleplex.com/build_annotations/edit_annotations/master-paladin/1316318/?
https://uberchromegw.corp.google.com/i/chromeos/builders/master-paladin/builds/13620/steps/recipe%20result/logs/stdio
Both show 'status: SUCCESS' at the end.
,
Feb 10 2017
OK, that failure was the one that merged my CL. But the next one also failed:
https://chromiumos-build-annotator.googleplex.com/build_annotations/edit_annotations/master-paladin/1316536/?
https://uberchromegw.corp.google.com/i/chromeos/builders/master-paladin/builds/13622
,
Feb 10 2017
I don't see how chumping that CL will help. In this build:
https://chromiumos-build-annotator.googleplex.com/build_annotations/edit_annotations/master-paladin/1316536/?
I see:
07:20:08: INFO: Recording status pass for ['master-paladin']
and 'pass' is what we are looking for. I don't understand what the annotator is or what it is doing.
,
Feb 10 2017
Re #15, I see:
https://uberchromegw.corp.google.com/i/chromeos/builders/master-paladin/builds/13622/steps/steps/logs/stdio
08:37:59: INFO: Running cidb query on pid 9727, repr(query) starts with 'SELECT id, build_stage_id, outer_failure_id, exception_type, exception_message, exception_category,
08:37:59: INFO: chromeos master-paladin (id 1316536): 56 slaves, 1952 slave stages
08:37:59: INFO: parrot-paladin:22076 (id 1316540) fail
but in that build:
https://uberchromegw.corp.google.com/i/chromeos/builders/parrot-paladin/builds/22076/steps/steps/logs/stdio
07:46:58: INFO: Recording status pass for ['parrot-paladin']
I cannot see why the query in the master paladin thinks that the parrot paladin failed.
,
Feb 10 2017
translateStatus = lambda s: (constants.BUILDER_STATUS_PASSED
                             if s == constants.FINAL_STATUS_PASSED
                             else constants.BUILDER_STATUS_FAILED)
status_for_db = translateStatus(final_status)

After your previous CL, final_status is 'pass' when the build is successful, so translateStatus returns 'fail' because s != 'passed'. As a result, this build is marked as 'fail' in CIDB.
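The mismatch can be reproduced outside chromite with a minimal sketch. The constant values are the ones quoted earlier in this thread; the standalone `translate_status` function is an illustrative stand-in for the lambda, and the assumption that a successful build reported 'passed' before the earlier CL is inferred from the comment above rather than checked against the code.

```python
# Constant values as quoted earlier in this thread.
BUILDER_STATUS_FAILED = 'fail'
BUILDER_STATUS_PASSED = 'pass'
FINAL_STATUS_PASSED = 'passed'
FINAL_STATUS_FAILED = 'failed'


def translate_status(s):
    """Illustrative stand-in for the translateStatus lambda."""
    return (BUILDER_STATUS_PASSED
            if s == FINAL_STATUS_PASSED
            else BUILDER_STATUS_FAILED)


# Assumed old behaviour: a successful build reports 'passed'.
status_before = translate_status(FINAL_STATUS_PASSED)

# Behaviour described above: a successful build now reports 'pass',
# which is not equal to 'passed', so the else branch is taken.
status_after = translate_status(BUILDER_STATUS_PASSED)

print(status_before, status_after)
```

Anything other than the exact string 'passed' falls through to the failure branch, which is why a passing build ends up recorded as 'fail'.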
,
Feb 10 2017
From what I can tell, the problem is happening in SlaveFailureSummaryStage:
https://cs.corp.google.com/chromeos_public/chromite/cbuildbot/stages/report_stages.py?rcl=632824123bd5c88dbaec32bcf9f17117b4bdd294&l=331
GetSlaveFailures should not include parrot-paladin:22076, but it does. It seems that InsertFailure() is being called from ReportStageFailureToCIDB(). I'm not sure why.
,
Feb 10 2017
Oh I see, thanks. In one case we assume fail if we don't see pass, and in another we assume pass if we don't see fail...
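One way to avoid two code paths with opposite defaults is to reject any status string that is not explicitly recognized. The following is only a sketch under that assumption: `strict_translate_status` and `_KNOWN_STATUSES` are hypothetical names, not chromite functions, and this is not the change that was chumped in this bug.

```python
# Status strings quoted in this thread, mapped to the CIDB-style value.
# The mapping itself is hypothetical, for illustration only.
_KNOWN_STATUSES = {
    'pass': 'pass',
    'passed': 'pass',
    'fail': 'fail',
    'failed': 'fail',
}


def strict_translate_status(s):
    """Translate a status string, raising on anything unrecognized.

    Neither 'pass' nor 'fail' is used as a silent default, so a
    renamed constant surfaces as an error instead of a wrong record.
    """
    try:
        return _KNOWN_STATUSES[s]
    except KeyError:
        raise ValueError('unrecognized build status: %r' % (s,))


print(strict_translate_status('pass'))
print(strict_translate_status('failed'))
```

With this shape, a mismatch between the two sets of constants would fail loudly at the call site rather than marking a passing build as failed.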
,
Feb 10 2017
I've chumped https://chromium-review.googlesource.com/c/431896/11. Hopefully that fixes it. If I had done what Ningning suggested, I would not have hit this problem.
,
Feb 13 2017
Seems to be OK now: https://chromiumos-build-annotator.googleplex.com/build_annotations/builds_list/master-paladin/? shows passing builds.
Comment 1 by sbasi@chromium.org
, Feb 10 2017