rendering.mobile & smoothness.tough_pinch_zoom_cases uploads failing on Nexus5
Issue description
Link to most recently failing build: https://ci.chromium.org/buildbot/chromium.perf/Android%20Nexus5%20Perf/2000
Logs show 500 errors: https://logs.chromium.org/v/?s=chrome%2Fbb%2Fchromium.perf%2FAndroid_Nexus5_Perf%2F2000%2F%2B%2Frecipes%2Fsteps%2Fperformance_test_suite_on_Android_device_Nexus_5%2F0%2Flogs%2FMerge_script_log%2F0
,
Jul 11
This is almost certainly timing out. We need to optimize /add_histograms; I'll take this.
,
Jul 11
The following revision refers to this bug: https://chromium.googlesource.com/catapult/+/a35fcbb27365ab985e61a5de04a3a2f0e53650b7

commit a35fcbb27365ab985e61a5de04a3a2f0e53650b7
Author: Simon <simonhatch@chromium.org>
Date: Wed Jul 11 19:47:11 2018

Dashboard - Add some timing around /add_histograms

Bug: chromium:862666
Change-Id: I3a10eeb728d8b1ae632ac8c3debd6d233dc74ba1
Reviewed-on: https://chromium-review.googlesource.com/1134018
Reviewed-by: Ethan Kuefner <eakuefner@chromium.org>
Commit-Queue: Simon Hatch <simonhatch@chromium.org>

[modify] https://crrev.com/a35fcbb27365ab985e61a5de04a3a2f0e53650b7/dashboard/dashboard/add_histograms.py
,
Jul 11
Here's a cleaned-up profile of a recent slow run of /add_histograms:

decompress:wall=0.216410
json.loads:wall=2.775900
hs.ImportDicts:wall=11.441570
hs.ResolveRelatedHistograms:wall=0.609060
hs.DeduplicateDiagnostics:wall=1.090710
hs._LogDebugInfo:wall=0.000180
InlineDenseSharedDiagnostics:wall=0.357080
_PurgeHistogramBinData:wall=0.238880
_GetDiagnosticValue calls:wall=0.000120
_ValidateMasterBotBenchmarkName:wall=0.000010
ComputeRevision:wall=0.010670
FindSuiteLevelSparseDiagnostics:wall=0.736760
DeduplicateAndPut:wall=0.633650
ReplaceSharedDiagnostic calls:wall=5.631550
_BatchHistogramsIntoTasks:wall=20.673660
_QueueHistogramTasks:wall=0.583730

The three hotspots to tackle are, in order of badness: _BatchHistogramsIntoTasks, histogram_set.ImportDicts, and the various ReplaceSharedDiagnostic calls. These account for the overwhelming majority of the time spent in /add_histograms. I'll poke at _BatchHistogramsIntoTasks a bit tomorrow.
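The ranking in this comment can be reproduced mechanically from the profile text. The following is only a minimal sketch: `parse_profile` and `top_hotspots` are hypothetical helper names, not part of the dashboard code, and it assumes the timed sections do not overlap, which the profile itself does not confirm.

```python
import re

# Profile text as pasted in the comment, one or more entries per line.
PROFILE = """
decompress:wall=0.216410 json.loads:wall=2.775900
hs.ImportDicts:wall=11.441570 hs.ResolveRelatedHistograms:wall=0.609060
hs.DeduplicateDiagnostics:wall=1.090710 hs._LogDebugInfo:wall=0.000180
InlineDenseSharedDiagnostics:wall=0.357080 _PurgeHistogramBinData:wall=0.238880
_GetDiagnosticValue calls:wall=0.000120
_ValidateMasterBotBenchmarkName:wall=0.000010 ComputeRevision:wall=0.010670
FindSuiteLevelSparseDiagnostics:wall=0.736760 DeduplicateAndPut:wall=0.633650
ReplaceSharedDiagnostic calls:wall=5.631550
_BatchHistogramsIntoTasks:wall=20.673660 _QueueHistogramTasks:wall=0.583730
"""


def parse_profile(text):
    """Return {section_name: wall_seconds} from 'name:wall=seconds' entries.

    Section names may contain a space (e.g. 'ReplaceSharedDiagnostic calls').
    """
    pattern = re.compile(r'(\S.*?):wall=([0-9.]+)')
    return {m.group(1): float(m.group(2)) for m in pattern.finditer(text)}


def top_hotspots(timings, n=3):
    """Return the n slowest (name, seconds) pairs, slowest first."""
    return sorted(timings.items(), key=lambda kv: kv[1], reverse=True)[:n]


timings = parse_profile(PROFILE)
total = sum(timings.values())
hotspots = top_hotspots(timings)
share = sum(seconds for _, seconds in hotspots) / total

for name, seconds in hotspots:
    print('%-32s %8.3fs' % (name, seconds))
print('share of summed wall time: %.0f%%' % (share * 100))
```

Under the no-overlap assumption, the three sections named above come out at roughly 84% of the summed wall time.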
,
Jul 11
The following revision refers to this bug: https://chromium.googlesource.com/catapult/+/39de3d2258563f5ef8c7366c0354adf41fe5a94b

commit 39de3d2258563f5ef8c7366c0354adf41fe5a94b
Author: Simon <simonhatch@chromium.org>
Date: Wed Jul 11 22:11:41 2018

Dashboard - Remove unnecessary histogram lookups.

Don't expect this to have much in the way of performance gains, just silly implementation.

Bug: chromium:862666
Change-Id: Iedb12a9f34964194733457d3c47039231c270247
Reviewed-on: https://chromium-review.googlesource.com/1134204
Reviewed-by: Ethan Kuefner <eakuefner@chromium.org>
Commit-Queue: Simon Hatch <simonhatch@chromium.org>

[modify] https://crrev.com/39de3d2258563f5ef8c7366c0354adf41fe5a94b/dashboard/dashboard/add_histograms_test.py
[modify] https://crrev.com/39de3d2258563f5ef8c7366c0354adf41fe5a94b/dashboard/dashboard/add_histograms.py
,
Jul 11
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/720dadbc215c229ce100bc408edb3aee03b0697e

commit 720dadbc215c229ce100bc408edb3aee03b0697e
Author: catapult-chromium-autoroll <catapult-chromium-autoroll@skia-buildbots.google.com.iam.gserviceaccount.com>
Date: Wed Jul 11 22:44:29 2018

Roll src/third_party/catapult 5c0d851abe6e..1af68170e543 (2 commits)

https://chromium.googlesource.com/catapult.git/+log/5c0d851abe6e..1af68170e543

git log 5c0d851abe6e..1af68170e543 --date=short --no-merges --format='%ad %ae %s'
2018-07-11 simonhatch@chromium.org Dashboard - Use oauth scopes for /add_histograms
2018-07-11 simonhatch@chromium.org Dashboard - Add some timing around /add_histograms

Created with: gclient setdep -r src/third_party/catapult@1af68170e543

The AutoRoll server is located here: https://catapult-roll.skia.org

Documentation for the AutoRoller is here: https://skia.googlesource.com/buildbot/+/master/autoroll/README.md

If the roll is causing failures, please contact the current sheriff, who should be CC'd on the roll, and stop the roller if necessary.

CQ_INCLUDE_TRYBOTS=luci.chromium.try:android_optional_gpu_tests_rel;luci.chromium.try:linux_optional_gpu_tests_rel;luci.chromium.try:mac_optional_gpu_tests_rel;luci.chromium.try:win_optional_gpu_tests_rel

BUG=chromium:862730,chromium:862666
TBR=sullivan@chromium.org

Change-Id: If4b4e810ef8ae28ad68c15848c6aebe5ccc3fb56
Reviewed-on: https://chromium-review.googlesource.com/1133869
Reviewed-by: catapult-chromium-autoroll <catapult-chromium-autoroll@skia-buildbots.google.com.iam.gserviceaccount.com>
Commit-Queue: catapult-chromium-autoroll <catapult-chromium-autoroll@skia-buildbots.google.com.iam.gserviceaccount.com>
Cr-Commit-Position: refs/heads/master@{#574387}

[modify] https://crrev.com/720dadbc215c229ce100bc408edb3aee03b0697e/DEPS
,
Jul 12
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/5d4bb6ed80fc74f40da5499723417d45ab616119

commit 5d4bb6ed80fc74f40da5499723417d45ab616119
Author: catapult-chromium-autoroll <catapult-chromium-autoroll@skia-buildbots.google.com.iam.gserviceaccount.com>
Date: Thu Jul 12 01:14:15 2018

Roll src/third_party/catapult 1af68170e543..39de3d225856 (1 commits)

https://chromium.googlesource.com/catapult.git/+log/1af68170e543..39de3d225856

git log 1af68170e543..39de3d225856 --date=short --no-merges --format='%ad %ae %s'
2018-07-11 simonhatch@chromium.org Dashboard - Remove unnecessary histogram lookups.

Created with: gclient setdep -r src/third_party/catapult@39de3d225856

The AutoRoll server is located here: https://catapult-roll.skia.org

Documentation for the AutoRoller is here: https://skia.googlesource.com/buildbot/+/master/autoroll/README.md

If the roll is causing failures, please contact the current sheriff, who should be CC'd on the roll, and stop the roller if necessary.

CQ_INCLUDE_TRYBOTS=luci.chromium.try:android_optional_gpu_tests_rel;luci.chromium.try:linux_optional_gpu_tests_rel;luci.chromium.try:mac_optional_gpu_tests_rel;luci.chromium.try:win_optional_gpu_tests_rel

BUG=chromium:862666
TBR=sullivan@chromium.org

Change-Id: I748004dfd1269077f1c4648897b9d698baadb301
Reviewed-on: https://chromium-review.googlesource.com/1133876
Reviewed-by: catapult-chromium-autoroll <catapult-chromium-autoroll@skia-buildbots.google.com.iam.gserviceaccount.com>
Commit-Queue: catapult-chromium-autoroll <catapult-chromium-autoroll@skia-buildbots.google.com.iam.gserviceaccount.com>
Cr-Commit-Position: refs/heads/master@{#574441}

[modify] https://crrev.com/5d4bb6ed80fc74f40da5499723417d45ab616119/DEPS
,
Jul 12
The following revision refers to this bug: https://chromium.googlesource.com/catapult/+/68c3b1d86065cf3cdf831b9328550b7ad06449a6

commit 68c3b1d86065cf3cdf831b9328550b7ad06449a6
Author: Simon <simonhatch@chromium.org>
Date: Thu Jul 12 17:16:43 2018

Dashboard - Optimize histogram_set.ReplaceSharedDiagnostic

Add a fast path for generic_sets, swap out the contents so that we don't need to iterate over all histograms.

Bug: chromium:862666
Change-Id: Ia17345e81b1e9004b3b34af036867f18c3590ee0
Reviewed-on: https://chromium-review.googlesource.com/1135292
Reviewed-by: Ethan Kuefner <eakuefner@chromium.org>
Commit-Queue: Simon Hatch <simonhatch@chromium.org>

[modify] https://crrev.com/68c3b1d86065cf3cdf831b9328550b7ad06449a6/tracing/tracing/value/diagnostics/generic_set.py
[modify] https://crrev.com/68c3b1d86065cf3cdf831b9328550b7ad06449a6/tracing/tracing/value/histogram_set.py
[modify] https://crrev.com/68c3b1d86065cf3cdf831b9328550b7ad06449a6/tracing/tracing/value/diagnostics/diagnostic.py
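The idea behind the fast path in this commit message can be illustrated roughly as follows. This is a sketch of the general technique only, not the catapult implementation: `SharedSet`, `Hist`, `replace_slow`, and `replace_fast` are hypothetical names invented for the example. When many histograms hold a reference to the same diagnostic object, overwriting that object's contents in place updates every holder at once, whereas rebinding references requires visiting every histogram.

```python
class SharedSet(object):
    """Hypothetical stand-in for a set-like diagnostic shared by histograms."""

    def __init__(self, values):
        self.values = list(values)

    def set_from(self, other):
        # Fast path: copy the other object's contents into this one in place.
        self.values = list(other.values)


class Hist(object):
    """Hypothetical histogram holding named references to diagnostics."""

    def __init__(self, name, diagnostics):
        self.name = name
        self.diagnostics = diagnostics  # dict: diagnostic name -> object


def replace_slow(histograms, old, new):
    """Rebind every reference to `old`; cost grows with the histogram count."""
    for hist in histograms:
        for key, diag in list(hist.diagnostics.items()):
            if diag is old:
                hist.diagnostics[key] = new


def replace_fast(old, new):
    """Swap contents in place; every holder of `old` sees the new values."""
    old.set_from(new)


shared = SharedSet(['bot-a'])
hists = [Hist('h%d' % i, {'bots': shared}) for i in range(1000)]

replace_fast(shared, SharedSet(['bot-b']))
fast_result = [h.diagnostics['bots'].values for h in hists]
```

The in-place swap only works when the old and new objects are of a compatible type and nothing depends on the identity of the replacement object; otherwise the slower rebinding loop is still needed.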
,
Jul 12
There's probably more low-hanging fruit; I just poked around in the ReplaceSharedDiagnostic calls, and those should go to near 0 now. The endpoint probably won't time out anymore, but we should definitely put more effort into bringing the rendering.* calls down under 30s. If there are no objections, since this shouldn't be failing anymore, I'm going to drop this to P2.
,
Jul 12
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/c698a3600cde3a3f002ec5248c50c49c7cfa5fc7

commit c698a3600cde3a3f002ec5248c50c49c7cfa5fc7
Author: catapult-chromium-autoroll <catapult-chromium-autoroll@skia-buildbots.google.com.iam.gserviceaccount.com>
Date: Thu Jul 12 21:28:06 2018

Roll src/third_party/catapult 39de3d225856..66447ba64fcc (6 commits)

https://chromium.googlesource.com/catapult.git/+log/39de3d225856..66447ba64fcc

git log 39de3d225856..66447ba64fcc --date=short --no-merges --format='%ad %ae %s'
2018-07-12 sadrul@chromium.org trace-viewer: Show a link to codesearch.
2018-07-12 dtu@chromium.org [pinpoint] Use ntpath.join instead of manual joining.
2018-07-12 dtu@chromium.org [pinpoint] Add unit test for JobState when adding a midpoint.
2018-07-12 simonhatch@chromium.org Dashboard - Optimize histogram_set.ReplaceSharedDiagnostic
2018-07-12 cphlipot0@gmail.com [Telemetry] Fix crash caused by unhandled atrace timeout
2018-07-11 benjhayden@chromium.org Add /api/describe/$test_suite.

Created with: gclient setdep -r src/third_party/catapult@66447ba64fcc

The AutoRoll server is located here: https://catapult-roll.skia.org

Documentation for the AutoRoller is here: https://skia.googlesource.com/buildbot/+/master/autoroll/README.md

If the roll is causing failures, please contact the current sheriff, who should be CC'd on the roll, and stop the roller if necessary.

CQ_INCLUDE_TRYBOTS=luci.chromium.try:android_optional_gpu_tests_rel;luci.chromium.try:linux_optional_gpu_tests_rel;luci.chromium.try:mac_optional_gpu_tests_rel;luci.chromium.try:win_optional_gpu_tests_rel

BUG=chromium:862666
TBR=sullivan@chromium.org

Change-Id: If765fc1ce617c13fa0f24ba128ef28c63fe61e68
Reviewed-on: https://chromium-review.googlesource.com/1135615
Reviewed-by: catapult-chromium-autoroll <catapult-chromium-autoroll@skia-buildbots.google.com.iam.gserviceaccount.com>
Commit-Queue: catapult-chromium-autoroll <catapult-chromium-autoroll@skia-buildbots.google.com.iam.gserviceaccount.com>
Cr-Commit-Position: refs/heads/master@{#574727}

[modify] https://crrev.com/c698a3600cde3a3f002ec5248c50c49c7cfa5fc7/DEPS
,
Jul 16
We are also seeing 5 upload failures on Mac: https://ci.chromium.org/buildbot/chromium.perf/mac-10_12_laptop_low_end-perf/731
They're all "Error uploading histogram data: HTTP Response 500: Internal Server Error" in https://logs.chromium.org/v/?s=chrome%2Fbb%2Fchromium.perf%2Fmac-10_12_laptop_low_end-perf%2F731%2F%2B%2Frecipes%2Fsteps%2Fperformance_test_suite_on_Intel_GPU_on_Mac_on_Mac-10.12%2F0%2Flogs%2FMerge_script_log%2F0
Simon: should we add retry logic in case the upload fails?
,
Jul 17
So I can't find evidence of problems with the data itself in the logs, and there are no timeouts related to this. I do see quite a few aborted requests from slow instance startup; afaik this happens when App Engine fails to spin up an instance fast enough to serve the request. I didn't set any retries on 500s when I added the histogram path in https://cs.chromium.org/chromium/src/tools/perf/core/results_dashboard.py?q=results_das&sq=package:chromium&g=0&l=540, since the existing retry logic in there is really basic and will just retry forever mindlessly. Maybe the logic can be extended a bit to keep track of how many times an upload has been retried, capping at N before discarding. I filed crbug.com/864565 with this suggestion.
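The capped-retry idea suggested in this comment could look roughly like the sketch below. It is illustrative only and does not reflect the actual code in results_dashboard.py: `UploadError`, `upload_with_retries`, and the `send` callable are hypothetical names, and the exponential backoff between attempts is an added assumption that the comment does not mention.

```python
import time


class UploadError(Exception):
    """Hypothetical error carrying the HTTP status of a failed upload."""

    def __init__(self, status):
        super(UploadError, self).__init__('HTTP %d' % status)
        self.status = status


def upload_with_retries(send, payload, max_attempts=3, base_delay=1.0,
                        sleep=time.sleep):
    """Call send(payload), retrying 5xx failures at most max_attempts times.

    Returns True on success and False once the cap is hit and the payload
    is discarded. Errors below 500 are re-raised without retrying.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            send(payload)
            return True
        except UploadError as error:
            if error.status < 500:
                raise  # Client errors will not succeed on retry.
            if attempt == max_attempts:
                return False  # Cap reached: give up and discard.
            # Assumed policy: wait 1x, 2x, 4x, ... base_delay between tries.
            sleep(base_delay * 2 ** (attempt - 1))
    return False


def flaky_send(payload, _calls=[]):
    """Example sender that fails twice with a 500 and then succeeds."""
    _calls.append(payload)
    if len(_calls) < 3:
        raise UploadError(500)


uploaded = upload_with_retries(flaky_send, '{"histograms": []}',
                               sleep=lambda seconds: None)
```

Passing `sleep` as a parameter keeps the sketch testable without real delays; a caller that gets False back can log and drop the upload instead of retrying forever.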
,
Jul 19
Going to relinquish this; I don't think I have any next steps here. I've written up some suggestions on how to improve the retry logic in the recipe uploader in crbug.com/864565, and I think taking those on would help immensely.
,
Nov 12
Comment 1 by eakuefner@chromium.org, Jul 11