New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 666028 link

Starred by 1 user

Issue metadata

Status: Verified
Owner:
Closed: Nov 2016
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 1
Type: Bug-Regression



Sign in to add a comment

Most policy_* AutoTests fail when policy is not null.

Project Member Reported by scunning...@chromium.org, Nov 16 2016

Issue description

Chrome OS Version: R56-8956.0.0 - Onward (with short break from R56-8971.0.0 to R56-8977.0.0)
Devices: All

Most policy_.* telemetry tests began regularly failing November 3rd, in build R56-8956.0.0. Failures did not happen before that. Tests continue to pass normally in R55 builds. The policy_.* tests are tests like policy_EditBookmarksEnabled, policy_ImagesAllowedForUrls, policy_ForceYouTubeSafetyMode, etc.  These tests have in common being telemetry tests, and using the fake policy_testserver.py.

Here are some example wmatrix links to failing tests:
 - https://wmatrix.googleplex.com/unfiltered?suites=bvt-perbuild&hide_experimental=True&hide_missing=True&days_back=14&releases=56&tests=policy_EditBookmarksEnabled
- https://wmatrix.googleplex.com/unfiltered?suites=bvt-perbuild&hide_experimental=True&hide_missing=True&days_back=14&releases=56&tests=policy_ImagesAllowedForUrls
- https://wmatrix.googleplex.com/unfiltered?suites=bvt-perbuild&hide_experimental=True&hide_missing=True&days_back=30&releases=56&tests=policy_Force.*


Steps to reproduce the problem:
1. On CAutotest: Search for jobs of the above tests, that were run on an R56 build. More than likely, the test will have failed. Exceptions are the test cases using the 'notset' control file, since they expect to not receive any policy values from the policy_testserver.py.
2. On local device running an R56 build: Run $ test_that <IP> e:policy_.*\\..* (where <IP> is the inet address of your local DUT). Same tests will fail.

What is the expected output?
All policy_* test cases should pass on an R56-build, when run in lab as part of the bvt-perbuild or regression suite, or on your local device.

What do you see instead?
Most of the policy_.* tests consistently fail. The client.0.INFO log files contain the following errors:
- File "/usr/local/telemetry/src/chrome/browser/policy/test/policy_testserver.py", line 958, in ProcessCloudPolicy
11/16 09:37:33.444 ERROR|   testserver_base:0052|     verification_sig)
- AttributeError: 'PolicyFetchResponse' object has no attribute 'new_public_key_verification_signature_deprecated'

When a test fails it is because the policy value from the Fake DMS (policy_testserver.py) is always set to 'None', when a non-None value was expected. E.g.,
- TestFail: Policy value shown is not correct: None (expected: false)

Those few test cases that *expect* the policy value to be 'Not set' are passing as expected.

Failures are consistent across M56 builds (since Nov 3rd) and boards.
Strangely, there was a 48-hour period, Nov 8-10, from build R56-8971.0.0 to R56-8977.0.0, when tests began passing, then resumed failing.
Though likely unrelated, this bug has some similarities to  bug 582615 , from the perspective of everything failing in sync.
 
Can somebody take this?  Did I mention that all these tests started failing at the same time? 
Cc: jen...@chromium.org
Owner: khmel@chromium.org
Status: Assigned (was: Available)
Yury, looks like we need a bisect. Can you take a look?

Comment 3 by khmel@chromium.org, Nov 17 2016

Status: Started (was: Assigned)
Any progress on this?  We're flying blind without these tests.

Here's another wmatrix link, featuring only policy_* tests in the regression suite: https://wmatrix.googleplex.com/unfiltered?releases=tot&suites=regression&hide_experimental=true&hide_missing=true&days_back=21&tests=policy_.*

Comment 5 by khmel@chromium.org, Nov 18 2016

WIP, bisect chromium - no result.
At this moment bisection Chrome OS
https://crosland.corp.google.com/log/8955.0.0..8956.0.0
Components: -Infra
This doesn't sound like an infra issue - removing the component.

Comment 7 by khmel@chromium.org, Nov 20 2016

Owner: tnagel@chromium.org
for some reason simplechrome did not reveal the problem and I went to wrong direction. Bisecting ChromeOS did not show the problem.
Finally I could reproduce the problem only when building chrome inside ChromeOS build (CHROME_ORIGIN=LOCAL_SOURCE)
Bisection showed first wrong CL:
https://chromium.googlesource.com/chromium/src/+/78ff9da2c0349d700b15428c6da567c61b894f1a

There is interval when this test became green: R56-8971.0.0 - R56-8977.0.0. This can be explainable because that time we had manual downgrade Chrome pin. Once pin was removed test became failing again.

Reassigning this to tnagel@
PTAL

Comment 8 by tnagel@chromium.org, Nov 21 2016

Cc: ljusten@chromium.org
It seems that the protofiles package needs to uprev in sync with Chrome.  Pardon my ignorance, shouldn't test failures like these prevent Chrome from uprevving?

ljusten@ has a CL in flight that should do the trick, not sure when he's planning to land, though:  https://chromium-review.googlesource.com/#/c/411781/
Thiemo - the policy autotests are not part of bvt-inline so they don't block CQ uprev.
Thanks Achuith.  What's the reasoning behind this?
Generally, that a partuicular AutoTest must prove it's reliability and stability before being promoted from "regression" or "bvt-perbuild" to "bvt-cq" or "bvt-inline". Due to infrastructure and other problems, these have not.

Thanks for the context!  Since enterprise functionality is pretty essential to cros imho, are there plans to promote these tests?
No, there is no plan to move these tests to bvt-cq or further into bvt-inline.  Enterprise problems do not usually block Dev channel releases, much less a Chrome uprev.

Feel free to put something on my calendar if you want to discuss the issue further.
There are a few issues:

The first problem is that these tests have an external dependency on gaia for enrollment. There's a bug somewhere to use the test policy server to drive 'fake' enrollment.

The second is that these tests need a non-owned device, which requires a reboot on hardware.

Thirdly, the entire suite is not suitable - there are too many tests.

We could pick a representative test, and run it in a VM. We still need to solve the first problem of the dependence on GAIA.
Achuith, these tests use fake login, not real gaia.

I would prefer we write a sanity "did the policy get set" test to check whether the fake dms will work for other tests rather than run an existing test in bvt-inline.
Wrt #14: 1) The way we are running these tests in the HW Lab, using the fake DM Server, and a fake user, they have no GAIA dependency. The framework supports the option of a real user to support E2E testing in Kirkland.
2-3) Agreed.
Labels: M-56
Resurfacing this. TE can't overstate how important it is that these automated tests run, since we don't have the resources to test them manually. Please fix this breakage asap.
I just got the +2 for the CL that should fix this. I'll put it into the commit queue first thing tomorrow morning.
Status: Fixed (was: Started)
https://chromium-review.googlesource.com/#/c/411781/ has landed, tests should succeed now.
Labels: M-57
Most (if not all) tests are now passing in M57 builds. However, most are still failing in M56 builds. Can we merge change back to M56?
Cc: zelidrag@chromium.org gkihumba@chromium.org
Labels: Merge-Request-56
+zelidrag, gkihumba for merge permission

Lutz' CL does structural changes to proto ingestion and thus I wouldn't want to merge it back to M56.  But in lieu of a merge, I'm requesting permission to uprev manually on the M-56 branch by updating the protofiles ebuild:

	"proto/cloud"
	"${CROS_GIT_HOST_URL}/chromium/src/components/policy/proto.git"
-	"1c29a8465c755c4e63fe818772307ab6f49ec52a"
+       "e35b0b86418e12108441bf893fdadf1b565444aa"

Comment 22 by dimu@chromium.org, Dec 4 2016

Labels: -Merge-Request-56 Merge-Approved-56 Hotlist-Merge-Approved
Your change meets the bar and is auto-approved for M56 (branch: 2924)
Project Member

Comment 23 by sheriffbot@chromium.org, Dec 7 2016

This issue has been approved for a merge. Please merge the fix to any appropriate branches as soon as possible!

If all merges have been completed, please remove any remaining Merge-Approved labels from this issue.

Thanks for your time! To disable nags, add the Disable-Nags label.

For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
Project Member

Comment 24 by sheriffbot@chromium.org, Dec 12 2016

This issue has been approved for a merge. Please merge the fix to any appropriate branches as soon as possible!

If all merges have been completed, please remove any remaining Merge-Approved labels from this issue.

Thanks for your time! To disable nags, add the Disable-Nags label.

For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
Labels: -Merge-Approved-56
I guess merging this has become pointless by now.  Removing Merge-Approved-56 label.
Status: Verified (was: Fixed)

Sign in to add a comment