browse:chrome:* stories failing and taking down entire benchmarks |
||||||||
Issue description
browse:chrome:omnibox and browse:chrome:newtab have recently started failing with:
app_ui.WaitForUiNode(resource_id='search_box').Tap()
File "/b/swarming/w/ir/third_party/catapult/devil/devil/android/decorators.py", line 57, in timeout_retry_wrapper
retry_if_func=retry_if_func)
[...]
File "/b/swarming/w/ir/third_party/catapult/devil/devil/utils/timeout_retry.py", line 111, in WaitFor
suffix=' waiting for condition %r' % condition_name)
File "/b/swarming/w/ir/third_party/catapult/devil/devil/utils/timeout_retry.py", line 54, in GetRemainingTime
raise reraiser_thread.TimeoutError(msg)
CommandTimeoutError: Timeout of 30.0 secs expired waiting for condition 'node_found'
Which is treated as an unhandleable error and brings down the entire benchmark.
In addotion to finding the root cause, the error should be made handleable. For the moment I'll just disable the stories to let the rest of the benchmark run.
,
Nov 1
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/b1c0b554e0ff982245a77644057aec7581269487 commit b1c0b554e0ff982245a77644057aec7581269487 Author: Juan Antonio Navarro Perez <perezju@chromium.org> Date: Thu Nov 01 12:35:43 2018 [tools/perf] Disable browse:chrome:* stories These stories ara failing with an unhandleable error, preventing the rest of the benchmark to run. Disable these as a short term quick fix. NOTRY=true TBR=nednguyen@google.com Bug: 900909 Change-Id: I8e51ced1644853469ff8310de03cb232abba9253 Reviewed-on: https://chromium-review.googlesource.com/c/1311924 Reviewed-by: Juan Antonio Navarro Pérez <perezju@chromium.org> Commit-Queue: Juan Antonio Navarro Pérez <perezju@chromium.org> Cr-Commit-Position: refs/heads/master@{#604564} [modify] https://crrev.com/b1c0b554e0ff982245a77644057aec7581269487/tools/perf/expectations.config
,
Nov 1
I think might be we should consider changing the error catching part in story_runner to be a blacklist rather than whitelist, e.g: all error of types X, Y, Z will crash the whole benchmarks, the rest will lead to retry. Few errors that we allow to crash the whole benchmark: KeyboardInterrupt SystemExit EnvironmentError ImportError MemoryError
,
Nov 1
*also some DeviceError type from devil probably also lead to crashing the whole benchmark.
,
Nov 1
📍 Pinpoint job started. https://pinpoint-dot-chromeperf.appspot.com/job/10b4b886e40000
,
Nov 2
Pinpoint job still running, reduced range so far is: http://test-results.appspot.com/revision_range?start=604425&end=604432&n=1000 From there the following sounds suspicious: https://chromium.googlesource.com/chromium/src/+/d3fd7a3e0505a4ca842d5c74cdfa125a8a381ef5 +mheikal could your change be preventing us from finding items in the Chrome menu now?
,
Nov 2
Yes this is my change. The change has been rolled back temporarily for breaking franky tests. So as to make sure I dont break this again on the reland. Can you give me more information about these benchmarks, how are they run (automatically for each commit, once per official build or manually via developers/testers requesting it?). Which target are these run on (chromium/chrome?, monochrome/chromemodern/chrome?)
,
Nov 2
The tests that broke are defined here: https://cs.chromium.org/chromium/src/tools/perf/page_sets/system_health/chrome_stories.py?rcl=81bbcaf91131c5ffbfb08de123fa1e372b25c5b8&l=49 They are run continuously on perf waterfall every few commits. The tests interact with the browser UI find (via uiautomator) elements to "tap" on; e.g. try to find the "url_bar", the "menu_button", etc. You can run these with e.g.: src$ ./tools/perf/run_benchmark system_health.common_mobile --browser android-chromium --story-filter browse:chrome:newtab
,
Nov 2
📍 Found a significant difference after 1 commit. https://pinpoint-dot-chromeperf.appspot.com/job/10b4b886e40000 Reland "[Android] Remove most resource names in resources.arsc" by mheikal@chromium.org https://chromium.googlesource.com/chromium/src/+/d3fd7a3e0505a4ca842d5c74cdfa125a8a381ef5 Failure rate: 0 → 1 (+1) Understanding performance regressions: http://g.co/ChromePerformanceRegressions
,
Nov 5
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/e734bc5eb8b08668196b95a9e38e6bea0773748c commit e734bc5eb8b08668196b95a9e38e6bea0773748c Author: Mohamed Heikal <mheikal@chromium.org> Date: Mon Nov 05 20:48:41 2018 Reland x2 "[Android] Remove most resource names in resources.arsc" This reverts commit 65f0e20ac217ae4cfbfec49fa98d73c004690148. Reason for reland: Fixes crbug.com/900909 and crbug.com/900993 TBR=torne@chromium.org (No changes in webview section) Original change's description: > [Android] Remove most resource names in resources.arsc > > resource names are stored in the resources.arsc file to allow for > getIdentifier with the string name of the resource to work. This cl > obfuscates all of those resources to one name (and thus stored only > once) for all resources that are not accessed via getIdentifier. > > This improves binary size by about 200KB. > > Bug: 894208 > > Change-Id: I28c440c22b90cd045f53017073fdb88c7410d530 > Reviewed-on: https://chromium-review.googlesource.com/c/1265897 > Commit-Queue: Mohamed Heikal <mheikal@chromium.org> > Reviewed-by: Richard Coles <torne@chromium.org> > Reviewed-by: agrieve <agrieve@chromium.org> > Reviewed-by: Ted Choc <tedchoc@chromium.org> > Cr-Commit-Position: refs/heads/master@{#603229} Bug: 894208, 900909 , 900993 Change-Id: Ie1e8790b392b5cbfff6d207c11e12bed6b8cc765 Reviewed-on: https://chromium-review.googlesource.com/c/1316092 Reviewed-by: Mohamed Heikal <mheikal@chromium.org> Reviewed-by: agrieve <agrieve@chromium.org> Commit-Queue: Mohamed Heikal <mheikal@chromium.org> Cr-Commit-Position: refs/heads/master@{#605451} [add] https://crrev.com/e734bc5eb8b08668196b95a9e38e6bea0773748c/android_webview/aapt2.config [modify] https://crrev.com/e734bc5eb8b08668196b95a9e38e6bea0773748c/android_webview/apk/java/src/com/android/webview/chromium/LicenseActivity.java [modify] https://crrev.com/e734bc5eb8b08668196b95a9e38e6bea0773748c/build/android/gyp/apkbuilder.py [modify] https://crrev.com/e734bc5eb8b08668196b95a9e38e6bea0773748c/build/android/gyp/compile_resources.py [modify] https://crrev.com/e734bc5eb8b08668196b95a9e38e6bea0773748c/build/config/android/internal_rules.gni [modify] https://crrev.com/e734bc5eb8b08668196b95a9e38e6bea0773748c/build/config/android/rules.gni [modify] https://crrev.com/e734bc5eb8b08668196b95a9e38e6bea0773748c/chrome/android/chrome_public_apk_tmpl.gni
,
Nov 5
This should be fixed now, the new reland should keep all id resources that are used by automation tools.
,
Nov 5
Thanks! Re-opening and assigning to self to: - Re-enable the disabled tests. - Make sure these kinds of failures are no longer considered as "fatal", other tests should be able to run just fine.
,
Nov 8
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/001618fc4a619a3037f0542ae980922226f89cfe commit 001618fc4a619a3037f0542ae980922226f89cfe Author: Juan Antonio Navarro Pérez <perezju@chromium.org> Date: Thu Nov 08 21:14:32 2018 Revert "[tools/perf] Disable browse:chrome:* stories" This reverts commit b1c0b554e0ff982245a77644057aec7581269487. Reason for revert: Stories show work now after fix of underlying issue. Original change's description: > [tools/perf] Disable browse:chrome:* stories > > These stories ara failing with an unhandleable error, preventing the > rest of the benchmark to run. > > Disable these as a short term quick fix. > > NOTRY=true > TBR=nednguyen@google.com > > Bug: 900909 > Change-Id: I8e51ced1644853469ff8310de03cb232abba9253 > Reviewed-on: https://chromium-review.googlesource.com/c/1311924 > Reviewed-by: Juan Antonio Navarro Pérez <perezju@chromium.org> > Commit-Queue: Juan Antonio Navarro Pérez <perezju@chromium.org> > Cr-Commit-Position: refs/heads/master@{#604564} TBR=perezju@chromium.org,nednguyen@google.com # Not skipping CQ checks because original CL landed > 1 day ago. Bug: 900909 Change-Id: Ia698d28b432cc0606e0bb157124c2dd39ca3eff0 Reviewed-on: https://chromium-review.googlesource.com/c/1327561 Reviewed-by: Juan Antonio Navarro Pérez <perezju@chromium.org> Commit-Queue: Juan Antonio Navarro Pérez <perezju@chromium.org> Cr-Commit-Position: refs/heads/master@{#606602} [modify] https://crrev.com/001618fc4a619a3037f0542ae980922226f89cfe/tools/perf/expectations.config
,
Nov 13
The following revision refers to this bug: https://chromium.googlesource.com/catapult/+/78869faa9193eea919200caed5c17410c6779558 commit 78869faa9193eea919200caed5c17410c6779558 Author: Juan Antonio Navarro Perez <perezju@chromium.org> Date: Tue Nov 13 17:33:08 2018 [telemetry] Treat most exceptions as "possibly handleable" Telemetry classifies errors as "handleable" and "unhandleable" errors. Handleable errors, defined by a short explicit list, are displayed and reported as an individual story failure. Unhandleable errors, any error not in the list, are treated as fatal errors and stop the entire benchmark execution. This is often too drastic, as many errors raised by Telemetry clients are in fact "handleable" and should not stop the execution of the entire benchmark due to an error in one of its stories. In this CL we define instead a small set of explicitly "unhandleable" errors (KeyboardInterrupt, SystemExit, etc) which should definitely halt the benchmark execution, while most other Exception's are treated as "possibly handleable", allowing other stories to continue running but restarting the shared state to be safe. Bug: chromium:900909 Change-Id: I5f23d5c33319b04f7631f0f3e77fe2a2bb62c131 Reviewed-on: https://chromium-review.googlesource.com/c/1326943 Reviewed-by: Caleb Rouleau <crouleau@chromium.org> Reviewed-by: Ned Nguyen <nednguyen@google.com> Commit-Queue: Juan Antonio Navarro Pérez <perezju@chromium.org> [modify] https://crrev.com/78869faa9193eea919200caed5c17410c6779558/telemetry/telemetry/internal/story_runner.py [modify] https://crrev.com/78869faa9193eea919200caed5c17410c6779558/telemetry/telemetry/internal/story_runner_unittest.py
,
Nov 13
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/2aaa8a58d23ae9fc7b6716cd13d2014b1200b691 commit 2aaa8a58d23ae9fc7b6716cd13d2014b1200b691 Author: chromium-autoroll <chromium-autoroll@skia-public.iam.gserviceaccount.com> Date: Tue Nov 13 18:36:19 2018 Roll src/third_party/catapult 84f62b6f08e8..78869faa9193 (1 commits) https://chromium.googlesource.com/catapult.git/+log/84f62b6f08e8..78869faa9193 git log 84f62b6f08e8..78869faa9193 --date=short --no-merges --format='%ad %ae %s' 2018-11-13 perezju@chromium.org [telemetry] Treat most exceptions as "possibly handleable" Created with: gclient setdep -r src/third_party/catapult@78869faa9193 The AutoRoll server is located here: https://autoroll.skia.org/r/catapult-autoroll Documentation for the AutoRoller is here: https://skia.googlesource.com/buildbot/+/master/autoroll/README.md If the roll is causing failures, please contact the current sheriff, who should be CC'd on the roll, and stop the roller if necessary. CQ_INCLUDE_TRYBOTS=luci.chromium.try:android_optional_gpu_tests_rel;luci.chromium.try:linux_optional_gpu_tests_rel;luci.chromium.try:mac_optional_gpu_tests_rel;luci.chromium.try:win_optional_gpu_tests_rel BUG= chromium:900909 TBR=sullivan@chromium.org Change-Id: I17117fac12c5ef19aab8edfd9080d031dccaa087 Reviewed-on: https://chromium-review.googlesource.com/c/1333858 Reviewed-by: chromium-autoroll <chromium-autoroll@skia-public.iam.gserviceaccount.com> Commit-Queue: chromium-autoroll <chromium-autoroll@skia-public.iam.gserviceaccount.com> Cr-Commit-Position: refs/heads/master@{#607666} [modify] https://crrev.com/2aaa8a58d23ae9fc7b6716cd13d2014b1200b691/DEPS
,
Nov 14
,
Jan 16
(6 days ago)
,
Jan 16
(6 days ago)
|
||||||||
►
Sign in to add a comment |
||||||||
Comment 1 by nedngu...@google.com
, Nov 1Components: Speed>Telemetry