
Issue 922585

Starred by 1 user

Issue metadata

Status: WontFix
Owner:
Closed: Today
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 3
Type: Bug




coral: graphics_Gbm failure in CQ: python: command not found

Project Member Reported by semenzato@chromium.org, Jan 16 (6 days ago)

Issue description

This CQ run

https://cros-goldeneye.corp.google.com/chromeos/healthmonitoring/buildDetails?buildbucketId=8924196617257481424

failed after 220 minutes. The Milo logs blame a CL that was modified while it was in the CQ:

01:54:36: INFO: Running cidb query on pid 23355, repr(query) starts with <sqlalchemy.sql.expression.Insert object at 0x7fe004d9ced0>
01:54:36: ERROR: Could not submit niwa:1411473:56e51f57, error: CL:1411473 was modified while the CQ was in the middle of testing it. Patch set 5 was uploaded.

01:54:40: INFO: Running cidb query on pid 23355, repr(query) starts with <sqlalchemy.sql.expression.Insert object at 0x7fe004d9c8d0>

01:54:40: ERROR: FAILED TO SUBMIT ALL CHANGES:  Could not verify that changes niwa:1411473:56e51f57 were submitted.
Submitted 62 changes successfully.
01:54:40: INFO: Translating result FAILED TO SUBMIT ALL CHANGES:  Could not verify that changes niwa:1411473:56e51f57 were submitted.
Submitted 62 changes successfully. to fail.


Patch set 5 was uploaded at 1:05.  The build started at 22:32.
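
As a quick sanity check on the timing, here is a minimal sketch. The calendar dates are assumptions (only clock times are given above, so the build is taken to have started the evening before the upload); the 220-minute duration and both clock times come from this report. The helper name `modified_during_run` is hypothetical.

```python
from datetime import datetime, timedelta

# Assumed dates: the report gives only clock times, so the build is taken
# to start on the evening before the patch set upload.
build_start = datetime(2019, 1, 15, 22, 32)
build_end = build_start + timedelta(minutes=220)  # run failed after 220 minutes
patch_upload = datetime(2019, 1, 16, 1, 5)


def modified_during_run(start, end, upload):
    """Return True if the upload time falls inside the build window."""
    return start <= upload <= end


print(build_end.strftime("%H:%M"))
print(modified_during_run(build_start, build_end, patch_upload))
```

Under these assumptions the upload lands inside the build window, which matches the "modified while the CQ was in the middle of testing it" error.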

Two questions then:

1. It's not clear that the submitter of that patch set knew that this would kill the CQ run.  Can we warn them?  For instance, we could reject the patch set, or require a flag (it's possible that the build will fail anyway without that patch set).

2. In this case there was also a test failure on the coral build.  Can I ignore that and exculpate all CLs?  (Otherwise I'll have to work out whether the test was a flake, which it may be.)

Thanks.



 

Comment 1 by jclinton@chromium.org, Jan 16 (6 days ago)

Owner: semenzato@chromium.org
Status: Assigned (was: Untriaged)
Yes, WAI.

1) A message explaining this was posted on the CL: https://chromium-review.googlesource.com/c/chromiumos/platform/tast-tests/+/1411473

2) This was the only reason that CLs might not have been allowed to land. Someone submitting or updating a change during a CQ run doesn't affect any other CLs. Do you want to land all of the CLs in the batch? Is it possible that the coral test failure is the result of one of these CLs?

Comment 2 by semenzato@chromium.org, Jan 16 (6 days ago)

Owner: jclinton@chromium.org
Summary: coral: graphics_Gbm failure in CQ: phthon: command not found (was: does killing the CQ run when CL is modified WAI?)
Thank you.

The coral failure is weird.  The DUT is in bad shape: python: command not found.
May I give it back to you?  I'll exculpate everybody.

https://ci.chromium.org/p/chromeos/builders/luci.chromeos.general/CQ/b8924195619915248096
https://chromeos-swarming.appspot.com/task?id=426e9a9369c32f10&refresh=10&request_detail=true&show_raw=1

01:06:46 INFO | Starting master ssh connection '/usr/bin/ssh -a -x -N -o ControlMaster=yes -o ControlPath=/tmp/_autotmp_ORvryessh-master/socket -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o BatchMode=yes -o ConnectTimeout=30 -o ServerAliveInterval=900 -o ServerAliveCountMax=3 -o ConnectionAttempts=4 -o Protocol=2 -l root -p 22 chromeos6-row6-rack3-host6'
01:06:50 INFO | Attempting to autodetect if host is of type CrosHost
01:06:53 INFO | get_network_stats: at-start RXbytes 5586916 TXbytes 104839
01:06:53 ERROR| tko parser: {'cidb_build_stage_id': 107398119, 'builds': '{"cros-version": "coral-paladin/R73-11591.0.0-rc2"}', 'job_started': 1547628649, 'job_finished': 1547629604, 'hostname': 'chromeos6-row6-rack3-host6', 'cidb_build_id': 3362503, 'status_version': 1, 'label': 'coral-paladin/R73-11591.0.0-rc2/bvt-inline/graphics_Gbm', 'parent_job_id': '426e9a0c7f1a2411', 'drone': 'cros-full-0042.mtv.corp.google.com', 'build': 'coral-paladin/R73-11591.0.0-rc2', 'suite': 'bvt-inline', 'experimental': 'False', 'user': 'chromeos-test'}
01:06:53 ERROR| tko parser: MACHINE NAME: chromeos6-row6-rack3-host6
01:06:53 ERROR| tko parser: trying to use hostinfo
01:06:53 ERROR| tko parser: MACHINE GROUP: astronaut
01:06:53 ERROR| tko parser: parsing partial test ---- SERVER_JOB
01:06:53 ERROR| tko parser: ignoring line because of extra indentation
01:06:53 ERROR| tko parser: ignoring line because of extra indentation
01:06:53 ERROR| tko parser: parsing test ---- SERVER_JOB
01:06:53 ERROR| tko parser: trying to use hostinfo
01:06:54 INFO | Attempting to autodetect if host is of type CrosHost
01:06:55 ERROR| [stderr] bash: python: command not found


Comment 3 by semenzato@chromium.org, Jan 16 (6 days ago)

Summary: coral: graphics_Gbm failure in CQ: python: command not found (was: coral: graphics_Gbm failure in CQ: phthon: command not found)

Comment 4 by jclinton@chromium.org, Jan 16 (6 days ago)

Components: -Infra>ChromeOS>CI Infra>ChromeOS>Test
Owner: akes...@chromium.org
This is a hardware autotest test failure of some kind. Over to Deputy to triage/route.

Comment 5 by akeshet@google.com, Today (6 hours ago)

Status: WontFix (was: Assigned)
Likely obsolete at this point; closing.
