Lots of tests failing on chromium perf with "Your connection is not private"
Issue description

Plenty of tests and stories have started failing on different Android bots. Logs show Chrome crashes, e.g.:
https://luci-logdog.appspot.com/v/?s=chrome%2Fbb%2Fchromium.perf%2FAndroid_Nexus7v2_Perf__2_%2F4305%2F%2B%2Frecipes%2Fsteps%2Fsystem_health.common_mobile%2F0%2Fstdout

Many of the screenshots show "Your connection is not private":
https://console.developers.google.com/m/cloudstorage/b/chrome-telemetry-output/o/profiler-file-id_31-2017-04-12_17-27-1444397.png

Most crashes have been seen so far on:
- https://luci-milo.appspot.com/buildbot/chromium.perf/Android%20Nexus6%20Perf%20%281%29/5180
- https://luci-milo.appspot.com/buildbot/chromium.perf/Android%20One%20Perf%20%281%29/5165
- https://luci-milo.appspot.com/buildbot/chromium.perf/Android%20Nexus7v2%20Perf%20%281%29/5008
- https://luci-milo.appspot.com/buildbot/chromium.perf/Android%20Nexus5%20Perf%20%282%29/5431

The intersection of the CL ranges of those builds points to:
http://test-results.appspot.com/revision_range?start=464144&end=464192
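For anyone repeating the range-narrowing step on other bots, here is a minimal sketch of intersecting per-bot regression ranges. It assumes each bot reports an inclusive (start, end) revision range; the function name and the sample numbers are hypothetical and are not the actual per-bot ranges from this bug.

```python
def intersect_ranges(ranges):
    """Return the inclusive (start, end) range shared by all ranges, or None."""
    if not ranges:
        return None
    # The shared range starts at the latest start and ends at the earliest end.
    start = max(r[0] for r in ranges)
    end = min(r[1] for r in ranges)
    return (start, end) if start <= end else None


# Hypothetical regression ranges, one per bot (illustrative values only).
bot_ranges = [(100, 180), (120, 200), (110, 170)]
print(intersect_ranges(bot_ranges))  # (120, 170)
```

If the ranges do not overlap at all, the function returns None, which would suggest more than one culprit or an unrelated flake on one of the bots.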
,
Apr 13 2017
Note there is a WPR roll in the range: https://codereview.chromium.org/2811373002
,
Apr 13 2017
Speculative revert: https://codereview.chromium.org/2814383002/
,
Apr 13 2017
,
Apr 13 2017
Interesting, even if the failures are indeed due to WPR, Chrome should not be crashing because of wonky request responses?
,
Apr 13 2017
So the only thing that changed was removing the SAN from the generate_dummy_ca_cert function, which is used for the test CA, not the server cert (sslproxy.py uses generate_cert which relies on a previously generated generate_dummy_ca_cert). Is there something invoking the WPR tools as part of telemetry? Tracing through this nest of infrastructure that I have no familiarity with is... painful :)
,
Apr 13 2017
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/71c4bbe3a41cc99e2542711043079ce40124b5cf

commit 71c4bbe3a41cc99e2542711043079ce40124b5cf
Author: catapult-deps-roller@chromium.org <catapult-deps-roller@chromium.org>
Date: Thu Apr 13 15:11:11 2017

Roll src/third_party/catapult/ 41bda7361..4709b3a00 (1 commit)

https://chromium.googlesource.com/external/github.com/catapult-project/catapult.git/+log/41bda73617ed..4709b3a00e51

$ git log 41bda7361..4709b3a00 --date=short --no-merges --format='%ad %ae %s'
2017-04-13 nednguyen Revert of [web-page-replay] Roll WPR to the latest commit (patchset #1 id:1 of https://codereview.chromium.org/2811373002/ )

Created with:
roll-dep src/third_party/catapult
BUG= 711232

Documentation for the AutoRoller is here:
https://skia.googlesource.com/buildbot/+/master/autoroll/README.md

If the roll is causing failures, see:
http://www.chromium.org/developers/tree-sheriffs/sheriff-details-chromium#TOC-Failures-due-to-DEPS-rolls

CQ_INCLUDE_TRYBOTS=master.tryserver.chromium.android:android_optional_gpu_tests_rel
TBR=sullivan@chromium.org

Change-Id: I22cdb7239652c5db664e44ce30664733e508f2c9
Reviewed-on: https://chromium-review.googlesource.com/476870
Reviewed-by: <catapult-deps-roller@chromium.org>
Commit-Queue: <catapult-deps-roller@chromium.org>
Cr-Commit-Position: refs/heads/master@{#464404}

[modify] https://crrev.com/71c4bbe3a41cc99e2542711043079ce40124b5cf/DEPS
,
Apr 13 2017
#6: The WPR server is invoked in the Telemetry code base as follows:
- It is wrapped in a wpr_server object, which basically just uses subprocess.Popen(..) to launch the WPR server: https://github.com/catapult-project/catapult/blob/master/telemetry/telemetry/internal/util/wpr_server.py
- Telemetry's network_controller controls wpr_server: https://github.com/catapult-project/catapult/blob/master/telemetry/telemetry/internal/platform/network_controller_backend.py#L163
- The StartReplay method is invoked before we run the browser test: https://github.com/catapult-project/catapult/blob/master/telemetry/telemetry/page/shared_page_state.py#L207
,
Apr 13 2017
Is _InstallTestCA consistently called for these tests? Or is it perhaps skipped?

The context is that _InstallTestCA is responsible for setting _wpr_ca_cert_path, which is necessary to supply the right command-line arguments in https://github.com/catapult-project/catapult/blob/master/telemetry/telemetry/internal/platform/network_controller_backend.py#L260

If _InstallTestCA was not called, then AddWebProxy ( https://github.com/chromium/web-page-replay/blob/7d35257f41aae0860840cbbbda185507ecbff020/replay.py#L144 ) will not be passed the --should_generate_certs flag, and as a result, WPR will use only a SingleCertHttpsProxyServer based on the supplied https_root_ca_cert_path.

The change in https://codereview.chromium.org/2811373002 was that the https_root_ca_cert no longer had SANs added, on the understanding from our conversation that WPR should be using the --should_generate_certs flag (which creates certs ad hoc).

If both paths are possible (that is, there are times when WPR is run with --should_generate_certs and times when it isn't), then code should be added to generate_dummy_ca_cert to add a SAN. From an external point of view, that's a weird config: it's rare (and undesirable) to be standing up a self-signed cert like that, versus always providing --should_generate_certs, but it sounds like both are needed in the current infrastructure. The problem is that there's no clean way to distinguish when the dummy CA cert is being used with a hostname versus without one, and when it's not being used with a hostname, adding a SAN is... less than ideal :)

If that sounds like a reasonable conclusion, we can just do the gross thing of unconditionally adding SANs to the CA cert, to cover all possible codepaths. This applies regardless of whether the CA cert is used as the server cert or to generate server certs; the latter case is fixed and covered by the tests.
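To make the SAN concern concrete, here is a minimal sketch of hostname matching with and without a commonName fallback. It assumes the client matches hostnames only against subjectAltName DNS entries, which is my assumption about why a SAN-less cert would be rejected rather than something established in this thread. Certificates are modelled as plain dicts, and every name below is hypothetical (this is not WPR or Chrome code).

```python
def hostname_matches(cert, hostname, allow_cn_fallback=False):
    """Return True if hostname is covered by the cert.

    cert is a plain dict: {"common_name": str, "dns_sans": [str, ...]}.
    Only the simplest left-most-label wildcard form is handled.
    """
    def match(pattern, name):
        pattern, name = pattern.lower(), name.lower()
        if pattern.startswith("*."):
            # "*.example.com" covers exactly one extra left-most label.
            parts = name.split(".", 1)
            return len(parts) == 2 and parts[0] != "" and parts[1] == pattern[2:]
        return pattern == name

    sans = cert.get("dns_sans", [])
    if any(match(p, hostname) for p in sans):
        return True
    # Legacy behaviour: fall back to the subject commonName if no SAN exists.
    if allow_cn_fallback and not sans:
        return match(cert.get("common_name", ""), hostname)
    return False


# Hypothetical certs: one without SANs, one with.
no_san = {"common_name": "example.com", "dns_sans": []}
with_san = {"common_name": "example.com", "dns_sans": ["example.com"]}

print(hostname_matches(no_san, "example.com"))    # False
print(hostname_matches(with_san, "example.com"))  # True
```

Under that assumption, a single self-signed cert served directly to the browser only validates if it carries a SAN, which is why the single-cert path and the generated-cert path behave differently.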
,
Apr 13 2017
=== Auto-CCing suspected CL author nednguyen@google.com ===

Hi nednguyen@google.com, the bisect results pointed to your CL, please take a look at the results.

=== BISECT JOB RESULTS ===
Test failure found with culprit

Suspected Commit
  Author : nednguyen
  Commit : 41bda73617edc29dbca3eda9b97a7b33f665c393
  Date   : Wed Apr 12 19:15:53 2017
  Subject: [web-page-replay] Roll WPR to the latest commit

Bisect Details
  Configuration: android_nexus6_perf_bisect
  Benchmark    : system_health.memory_mobile
  Metric       : memory:chrome:all_processes:dump_count_avg/browse_social/browse_social_twitter

Revision                               Exit Code    N
chromium@464144                        0 +- N/A     3    good
chromium@464156                        0 +- N/A     3    good
chromium@464156,catapult@41bda73617    1 +- N/A     3    bad    <--
chromium@464157                        1 +- N/A     3    bad
chromium@464158                        1 +- N/A     3    bad
chromium@464159                        1 +- N/A     3    bad
chromium@464162                        1 +- N/A     3    bad
chromium@464168                        1 +- N/A     3    bad
chromium@464192                        1 +- N/A     3    bad

Please refer to the following doc on diagnosing memory regressions:
  https://chromium.googlesource.com/chromium/src/+/master/docs/memory-infra/memory_benchmarks.md

To Run This Test
  src/tools/perf/run_benchmark -v --browser=android-chromium --output-format=chartjson --upload-results --pageset-repeat=1 --also-run-disabled-tests --story-filter=browse.social.twitter system_health.memory_mobile

Debug Info
  https://chromeperf.appspot.com/buildbucket_job_status/8982435949395984896

Is this bisect wrong?
  https://chromeperf.appspot.com/bad_bisect?try_job_id=5816429133168640

| O O | Visit http://www.chromium.org/developers/speed-infra/perf-bug-faq
|  X  | for more information addressing perf regression bugs. For feedback,
| / \ | file a bug with component Speed>Bisection. Thank you!
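For reference, here is a minimal sketch of how the culprit row can be located from the exit codes in the bisect table. It assumes results are ordered oldest to newest and that every revision after the first bad one is also bad. The function name is hypothetical, and the real bisect job runs the benchmark at each step rather than reading a finished table.

```python
def first_bad(results):
    """Binary-search an ordered list of (revision, exit_code) for the first failure."""
    lo, hi = 0, len(results)
    while lo < hi:
        mid = (lo + hi) // 2
        if results[mid][1] != 0:
            hi = mid        # failure here: the culprit is at mid or earlier
        else:
            lo = mid + 1    # still passing: the culprit is after mid
    return results[lo][0] if lo < len(results) else None


# Revisions and exit codes as reported in the bisect table above.
results = [
    ("chromium@464144", 0),
    ("chromium@464156", 0),
    ("chromium@464156,catapult@41bda73617", 1),
    ("chromium@464157", 1),
    ("chromium@464158", 1),
    ("chromium@464159", 1),
    ("chromium@464162", 1),
    ("chromium@464168", 1),
    ("chromium@464192", 1),
]
print(first_bad(results))  # chromium@464156,catapult@41bda73617
```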
,
Apr 14 2017
#9: _InstallTestCA is only activated on certain Android versions: https://github.com/catapult-project/catapult/blob/c5cab904c4abe3c19fc3b208e0ef116bc8dc29a2/telemetry/telemetry/internal/platform/android_platform_backend.py#L538

When it is not activated, though, Telemetry will set the "--ignore-certificate-errors" flag: https://github.com/catapult-project/catapult/blob/master/telemetry/telemetry/internal/backends/chrome/chrome_browser_backend.py#L128

I did a text search on https://luci-logdog.appspot.com/v/?s=chrome%2Fbb%2Fchromium.perf%2FAndroid_Nexus7v2_Perf__2_%2F4305%2F%2B%2Frecipes%2Fsteps%2Fsystem_health.common_mobile%2F0%2Fstdout and can't find "--ignore-certificate-errors", so the certificate is probably installed.
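The reasoning here can be sketched as follows. This is a rough illustration, not the actual Telemetry code: the helper names are hypothetical, and the flag is spelled as Chrome's --ignore-certificate-errors switch. The idea is that the flag is added only when the test CA is not installed, so its absence from the log suggests the CA was installed.

```python
IGNORE_CERT_FLAG = "--ignore-certificate-errors"


def browser_args(is_test_ca_installed):
    """Return extra browser args; skip cert validation only without a test CA."""
    args = []
    if not is_test_ca_installed:
        args.append(IGNORE_CERT_FLAG)
    return args


def flag_in_log(log_text, flag=IGNORE_CERT_FLAG):
    """Plain substring search, equivalent to a text search over the bot's stdout."""
    return flag in log_text


# Hypothetical log lines built from the args above (not real bot output).
log_with_ca = "Starting browser with args: " + " ".join(browser_args(True))
log_without_ca = "Starting browser with args: " + " ".join(browser_args(False))

print(flag_in_log(log_with_ca))     # False
print(flag_in_log(log_without_ca))  # True
```

Note that the inference only holds if the browser command line is actually written to the log being searched.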
,
Apr 14 2017
,
Apr 19 2017
,
May 1 2017
Issue 717140 has been merged into this issue.
,
May 4 2017
,
May 9 2017
This is no longer a problem due to https://codereview.chromium.org/2851563002/
Comment 1 by 42576172...@developer.gserviceaccount.com
, Apr 13 2017