Issue 923395

Issue metadata

Status: Fixed
Owner: ----
Closed: Jan 18
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 3
Type: Bug



multiple jetstream test failures in CQ, need help with blame

Project Member Reported by semenzato@chromium.org, Jan 18 (4 days ago)

Issue description

This jetstream build:

https://ci.chromium.org/p/chromeos/builders/luci.chromeos.general/CQ/b8924020553524579904

has multiple jetstream test failures, most of them due to missing D-Bus services.  

[Test-Logs]: jetstream_GuestInterfaces: FAIL: D-Bus services not running: ['org.chromium.ap.Controller']
[Test-History]: jetstream_GuestInterfaces
[Test-Logs]: jetstream_GcdCommands: FAIL: D-Bus services not running: ['org.chromium.flimflam']
[Test-History]: jetstream_GcdCommands
[Test-Logs]: jetstream_WanCustomDns: FAIL: D-Bus services not running: ['org.chromium.ap.Controller']
[Test-History]: jetstream_WanCustomDns
[Test-Logs]: jetstream_GuestFirewall: FAIL: D-Bus services not running: ['org.chromium.flimflam']
[Test-History]: jetstream_GuestFirewall
[Test-Logs]: jetstream_ApiServerAttestation: retry_count: 1, FAIL: D-Bus services not running: ['org.chromium.ap.Controller']
[Test-History]: jetstream_ApiServerAttestation
[Test-Logs]: jetstream_NetworkInterfaces: retry_count: 1, FAIL: Radios not ready
[Test-History]: jetstream_NetworkInterfaces
[Test-Logs]: jetstream_LocalApi: retry_count: 1, FAIL: Radios not ready
[Test-History]: jetstream_LocalApi
[Test-Logs]: jetstream_ApiServerDeveloperConfiguration: retry_count: 1, FAIL: Radios not ready
[Test-History]: jetstream_ApiServerDeveloperConfiguration
[Test-Logs]: jetstream_DiagnosticReport: retry_count: 1, FAIL: Radios not ready
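
To make the pattern in the list above easier to see, here is a minimal sketch that groups the failing tests by failure reason. The lines are copied from this report; the regular expression and the `group_failures` helper are illustrative assumptions, not part of any autotest tooling.

```python
import re
from collections import defaultdict

# Failure lines copied from the report above.
LOG_LINES = [
    "[Test-Logs]: jetstream_GuestInterfaces: FAIL: D-Bus services not running: ['org.chromium.ap.Controller']",
    "[Test-Logs]: jetstream_GcdCommands: FAIL: D-Bus services not running: ['org.chromium.flimflam']",
    "[Test-Logs]: jetstream_WanCustomDns: FAIL: D-Bus services not running: ['org.chromium.ap.Controller']",
    "[Test-Logs]: jetstream_GuestFirewall: FAIL: D-Bus services not running: ['org.chromium.flimflam']",
    "[Test-Logs]: jetstream_ApiServerAttestation: retry_count: 1, FAIL: D-Bus services not running: ['org.chromium.ap.Controller']",
    "[Test-Logs]: jetstream_NetworkInterfaces: retry_count: 1, FAIL: Radios not ready",
    "[Test-Logs]: jetstream_LocalApi: retry_count: 1, FAIL: Radios not ready",
    "[Test-Logs]: jetstream_ApiServerDeveloperConfiguration: retry_count: 1, FAIL: Radios not ready",
    "[Test-Logs]: jetstream_DiagnosticReport: retry_count: 1, FAIL: Radios not ready",
]

# Hypothetical pattern: test name, optional retry count, then the failure reason.
LINE_RE = re.compile(
    r"^\[Test-Logs\]: (?P<test>\w+): (?:retry_count: \d+, )?FAIL: (?P<reason>.+)$"
)


def group_failures(lines):
    """Return a dict mapping each failure reason to the tests that hit it."""
    groups = defaultdict(list)
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match:
            groups[match.group("reason")].append(match.group("test"))
    return dict(groups)


if __name__ == "__main__":
    for reason, tests in group_failures(LOG_LINES).items():
        print(f"{len(tests)} test(s): {reason}")
        for test in tests:
            print(f"    {test}")
```

Grouped this way, the nine failures fall into three buckets: a missing org.chromium.ap.Controller, a missing org.chromium.flimflam, and "Radios not ready".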

That build has a few shill-related CLs; I am hoping you can help identify which ones (if any) may have caused these failures.

Here's the link to the test logs.

https://stainless.corp.google.com/browse/chromeos-autotest-results/277995256-chromeos-test/

Syslog snippet:

2017-01-02T00:00:24.139329+00:00 ERR ap-av[360]: [ERROR:object_proxy.cc(581)] Failed to call method: org.chromium.flimflam.Manager.GetProperties: object_path= /: org.freedesktop.DBus.Error.ServiceUnknown: The name org.chromium.flimflam was not provided by any .service files
2017-01-02T00:00:24.139981+00:00 ERR ap-av[360]: [ERROR:dbus_method_invoker.h(113)] CallMethodAndBlockWithTimeout(...): Domain=dbus, Code=org.freedesktop.DBus.Error.ServiceUnknown, Message=The name org.chromium.flimflam was not provided by any .service files
2017-01-02T00:00:24.140639+00:00 WARNING ap-av[360]: [WARNING:shill_connection.h(332)] Unable to read proxy properties
2017-01-02T00:00:24.141155+00:00 INFO ap-av[360]: [INFO:netmgr_client_shill.cc(90)] NetMgrClientShill ctor
2017-01-02T00:00:24.142526+00:00 ERR ap-av[360]: [ERROR:object_proxy.cc(581)] Failed to call method: org.chromium.flimflam.Manager.GetProperties: object_path= /: org.freedesktop.DBus.Error.ServiceUnknown: The name org.chromium.flimflam was not provided by any .service files
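
For longer logs, a rough sketch like the following can pull out which D-Bus names were reported as missing and how often. The sample lines are copied from the snippet above; the pattern and the `missing_dbus_names` helper are assumptions made for illustration and only match the ServiceUnknown wording seen here.

```python
import re
from collections import Counter

# Sample lines copied from the syslog snippet above.
SYSLOG_LINES = [
    "2017-01-02T00:00:24.139329+00:00 ERR ap-av[360]: [ERROR:object_proxy.cc(581)] Failed to call method: org.chromium.flimflam.Manager.GetProperties: object_path= /: org.freedesktop.DBus.Error.ServiceUnknown: The name org.chromium.flimflam was not provided by any .service files",
    "2017-01-02T00:00:24.139981+00:00 ERR ap-av[360]: [ERROR:dbus_method_invoker.h(113)] CallMethodAndBlockWithTimeout(...): Domain=dbus, Code=org.freedesktop.DBus.Error.ServiceUnknown, Message=The name org.chromium.flimflam was not provided by any .service files",
    "2017-01-02T00:00:24.140639+00:00 WARNING ap-av[360]: [WARNING:shill_connection.h(332)] Unable to read proxy properties",
]

# Hypothetical pattern matching the ServiceUnknown message text seen in this log.
MISSING_NAME_RE = re.compile(r"The name (\S+) was not provided by any \.service files")


def missing_dbus_names(lines):
    """Count how many log lines report each D-Bus name as unavailable."""
    counts = Counter()
    for line in lines:
        match = MISSING_NAME_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


if __name__ == "__main__":
    for name, count in missing_dbus_names(SYSLOG_LINES).items():
        print(f"{name}: {count} line(s)")
```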


 

Comment 1 by semenzato@chromium.org, Jan 18 (4 days ago)

Labels: -Pri-1 Pri-2
Well now I am confused.  It looks like all those changes were merged even though the jetstream build failed.  Does anybody know what the explanation is?

Comment 2 by drinkcat@chromium.org, Jan 18 (4 days ago)

gale-paladin was marked as experimental for such a long time (due to crbug.com/920548, which is now fixed). And jetstream-paladin is always experimental IIRC...

So yeah, basically we didn't have CQ coverage for either, so the range where it started to fail might be quite wide :-(

Comment 3 by fqj@google.com, Jan 18 (4 days ago)

1) Does jetstream use shill? If yes, is shill started correctly?

2) Is there anything on jetstream that registers with kernel netlink to suppress kaudit output? It's weird that I'm not seeing any audit output in syslog.

Comment 4 by fqj@google.com, Jan 18 (4 days ago)

3) Does jetstream have ARC? (Sorry, I have no context about jetstream.) If not, (2) doesn't apply: non-ARC boards don't have SELinux enforced. I guess not, since I didn't see gale or whirlwind at go/cros-selinux-tests.

Comment 5 by drinkcat@chromium.org, Jan 18 (4 days ago)

#4: 3) jetstream/gale are Wi-Fi access points, so no, they definitely do not run ARC++.

Comment 6 by briannorris@chromium.org, Jan 18 (4 days ago)

Cc: -briannorris@chromium.org
Labels: OS-Chrome
whirlwind-paladin (to which the original report linked) is marked experimental and has been for a long time. It's mostly been in Exception state (aborted, because it was experimental and not fast enough) or Failure for a long time. I doubt it's a new issue.

I'll remove myself from CC (and maybe star the issue for a bit), but you should probably find Jetstream-related owners if anyone is going to care.

Comment 7 by semenzato@chromium.org, Jan 18 (4 days ago)

Labels: -Pri-2 Pri-3
Yes, thanks. Sorry, I had not noticed that whirlwind was greyed out because I was looking at a different view (the master-paladin build), where it's not greyed out, and I forgot again.

I'll find some jetstream SWEs although they are probably already aware of this.

Comment 8 by semenzato@chromium.org, Jan 18 (4 days ago)

Status: Fixed (was: Untriaged)
Looks like this was fixed by https://crrev.com/i/799266, and addressed by https://issuetracker.google.com/123041453.

Comment 9 by lgoo...@chromium.org, Jan 18 (4 days ago)

The above issue is fixed: b/123041453 (affected both gale and whirlwind)

The breaking change was:
  https://chrome-internal-review.googlesource.com/c/chromeos/ap-daemons/+/727874

Fixed by:
  https://chrome-internal-review.googlesource.com/c/chromeos/ap-daemons/+/799266/

whirlwind-paladin was previously busted for a different reason, b/122766606.
That has also been fixed.

whirlwind-paladin is expected to pass, but is experimental because historically it has had some random low-probability provisioning issues related to the TPM: b/33758106. There is a detection and repair process in place now for this, so it should be less of an issue.