Idle time scheduling changed significantly
Issue description

Example failing build: https://build.chromium.org/p/chromium.perf/builders/Android%20Nexus5%20Perf%20%283%29/builds/2836/steps/v8.todomvc/logs/stdio

Most of the pages fail due to a timeout without a clear cause:

[ FAILED ] 7 tests, listed below:
[ FAILED ] Polymer
[ FAILED ] AngularJS
[ FAILED ] React
[ FAILED ] Backbone.js
[ FAILED ] Ember.js
[ FAILED ] Dart
[ FAILED ] Vanilla JS

The pages themselves seem to show up fine, so I'm guessing there's something wrong with the idleness detector: https://cs.chromium.org/chromium/src/tools/perf/page_sets/todomvc.py?rcl=0&l=47
,
Aug 18 2016
Perf bot health sheriff pinging: Is anyone looking into this?
,
Aug 22 2016
,
Aug 22 2016
All of the last 5 runs on the waterfall have timed out on both tot and the reference build. Raising priority. https://uberchromegw.corp.google.com/i/chromium.perf/builders/Android%20Nexus5%20Perf%20%283%29
,
Aug 24 2016
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/2c3cb38333c08f67b50e482aad3674a7bbfed6be

commit 2c3cb38333c08f67b50e482aad3674a7bbfed6be
Author: jochen <jochen@chromium.org>
Date: Wed Aug 24 23:04:30 2016

Add some more categories for V8.TodoMVC to see what keeps us non-idle

BUG= 636405
R=skyostil@chromium.org
CQ_INCLUDE_TRYBOTS=master.tryserver.chromium.perf:android_s5_perf_cq;master.tryserver.chromium.perf:linux_perf_cq;master.tryserver.chromium.perf:mac_retina_perf_cq;master.tryserver.chromium.perf:winx64_10_perf_cq

Review-Url: https://codereview.chromium.org/2274933002
Cr-Commit-Position: refs/heads/master@{#414187}

[modify] https://crrev.com/2c3cb38333c08f67b50e482aad3674a7bbfed6be/tools/perf/benchmarks/v8.py
[modify] https://crrev.com/2c3cb38333c08f67b50e482aad3674a7bbfed6be/tools/perf/page_sets/todomvc.py
,
Aug 25 2016
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/0f5391d7b30ce4536222f46aa5804f13f65da25d

commit 0f5391d7b30ce4536222f46aa5804f13f65da25d
Author: jochen <jochen@chromium.org>
Date: Thu Aug 25 20:18:10 2016

Temporarily decrease min-idle time for todomvc pagesets

Also record all the events

BUG= 636405
R=skyostil@chromium.org
CQ_INCLUDE_TRYBOTS=master.tryserver.chromium.perf:android_s5_perf_cq;master.tryserver.chromium.perf:linux_perf_cq;master.tryserver.chromium.perf:mac_retina_perf_cq

Review-Url: https://codereview.chromium.org/2276313002
Cr-Commit-Position: refs/heads/master@{#414521}

[modify] https://crrev.com/0f5391d7b30ce4536222f46aa5804f13f65da25d/tools/perf/benchmarks/v8.py
[modify] https://crrev.com/0f5391d7b30ce4536222f46aa5804f13f65da25d/tools/perf/page_sets/todomvc.py
,
Aug 26 2016
Attaching a trace from the bots. It looks like, although there are long stretches of idle time, the rIC is invoked only very late, and then with pretty short idle periods.
,
Aug 30 2016
So looking at the v8.todomvc desktop results, it looks like since https://codereview.chromium.org/2118903002 the idle time behavior changed significantly. Sami, mind investigating this further?
,
Aug 30 2016
Hmm, that is unexpected to say the least :) I'll take a look.
,
Aug 30 2016
Out of curiosity I looked at some histograms and did not (yet?) see movement there: https://uma.googleplex.com/p/chrome/timeline_v2/?sid=e53fc4dfe879c205e249ffcbfba25ef1
,
Aug 30 2016
I'm looking at e.g. release/v8.todomvc/v8_execution_cpu_self_sum/Polymer. I'll locally bisect to verify... sec
,
Aug 30 2016
ok, seems to be something else... still bisecting, but I'm now before the move
,
Sep 1 2016
,
Sep 2 2016
This issue is turning at least 5 of the perf bots red:

https://uberchromegw.corp.google.com/i/chromium.perf/builders/Android%20Nexus9%20Perf%20(3)
https://uberchromegw.corp.google.com/i/chromium.perf/builders/Android%20Nexus7v2%20Perf%20(3)
https://uberchromegw.corp.google.com/i/chromium.perf/builders/Android%20Nexus6%20Perf%20(3)
https://uberchromegw.corp.google.com/i/chromium.perf/builders/Android%20Nexus5%20Perf%20%283%29/builds/3070
https://uberchromegw.corp.google.com/i/chromium.perf/builders/Android%20Galaxy%20S5%20Perf%20(3)

I need to disable these tests while we figure out what is wrong, as the bots have been red for over a week (since 24th August).
,
Sep 2 2016
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/253882fa1c3830b958f13da1c28f1aecbdba3d09

commit 253882fa1c3830b958f13da1c28f1aecbdba3d09
Author: jochen <jochen@chromium.org>
Date: Fri Sep 02 13:24:30 2016

Revert of Temporarily decrease min-idle time for todomvc pagesets (patchset #1 id:1 of https://codereview.chromium.org/2276313002/ )

Reason for revert:
no longer needed

Original issue's description:
> Temporarily decrease min-idle time for todomvc pagesets
>
> Also record all the events
>
> BUG= 636405
> R=skyostil@chromium.org
> CQ_INCLUDE_TRYBOTS=master.tryserver.chromium.perf:android_s5_perf_cq;master.tryserver.chromium.perf:linux_perf_cq;master.tryserver.chromium.perf:mac_retina_perf_cq
>
> Committed: https://crrev.com/0f5391d7b30ce4536222f46aa5804f13f65da25d
> Cr-Commit-Position: refs/heads/master@{#414521}

TBR=skyostil@chromium.org
# Not skipping CQ checks because original CL landed more than 1 days ago.
BUG= 636405

Review-Url: https://codereview.chromium.org/2308593002
Cr-Commit-Position: refs/heads/master@{#416256}

[modify] https://crrev.com/253882fa1c3830b958f13da1c28f1aecbdba3d09/tools/perf/benchmarks/v8.py
[modify] https://crrev.com/253882fa1c3830b958f13da1c28f1aecbdba3d09/tools/perf/page_sets/todomvc.py
,
Sep 2 2016
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/0de9faf1af255a6e7d928ee4c65eb0214ce72a3e

commit 0de9faf1af255a6e7d928ee4c65eb0214ce72a3e
Author: picksi <picksi@google.com>
Date: Fri Sep 02 15:34:57 2016

Disabling v8.todoMVC tests on reference builds

BUG= 636405
NOTRY=true
CQ_INCLUDE_TRYBOTS=master.tryserver.chromium.perf:android_s5_perf_cq;master.tryserver.chromium.perf:linux_perf_cq;master.tryserver.chromium.perf:mac_retina_perf_cq;master.tryserver.chromium.perf:winx64_10_perf_cq

Review-Url: https://codereview.chromium.org/2304113002
Cr-Commit-Position: refs/heads/master@{#416270}

[modify] https://crrev.com/0de9faf1af255a6e7d928ee4c65eb0214ce72a3e/tools/perf/benchmarks/v8.py
,
Sep 2 2016
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/7f488aeee33527e2b9d00a369874affd71f9d22d

commit 7f488aeee33527e2b9d00a369874affd71f9d22d
Author: jochen <jochen@chromium.org>
Date: Fri Sep 02 21:08:12 2016

Revert of Add some more categories for V8.TodoMVC to see what keeps us non-idle (patchset #1 id:1 of https://codereview.chromium.org/2274933002/ )

Reason for revert:
no longer needed

Original issue's description:
> Add some more categories for V8.TodoMVC to see what keeps us non-idle
>
> BUG= 636405
> R=skyostil@chromium.org
> CQ_INCLUDE_TRYBOTS=master.tryserver.chromium.perf:android_s5_perf_cq;master.tryserver.chromium.perf:linux_perf_cq;master.tryserver.chromium.perf:mac_retina_perf_cq;master.tryserver.chromium.perf:winx64_10_perf_cq
>
> Committed: https://crrev.com/2c3cb38333c08f67b50e482aad3674a7bbfed6be
> Cr-Commit-Position: refs/heads/master@{#414187}

TBR=skyostil@chromium.org
# Not skipping CQ checks because original CL landed more than 1 days ago.
BUG= 636405

Review-Url: https://codereview.chromium.org/2306073002
Cr-Commit-Position: refs/heads/master@{#416340}

[modify] https://crrev.com/7f488aeee33527e2b9d00a369874affd71f9d22d/tools/perf/benchmarks/v8.py
[modify] https://crrev.com/7f488aeee33527e2b9d00a369874affd71f9d22d/tools/perf/page_sets/todomvc.py
,
Sep 7 2016
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/be74d8bad819ed4753b82f03003e8d88b1816ed0

commit be74d8bad819ed4753b82f03003e8d88b1816ed0
Author: lpy <lpy@chromium.org>
Date: Wed Sep 07 00:53:45 2016

Revert of Disabling v8.todoMVC tests on reference builds (patchset #1 id:1 of https://codereview.chromium.org/2304113002/ )

Reason for revert:
Enable v8.todoMVC tests on reference builds since they won't fail on TOT now.

Original issue's description:
> Disabling v8.todoMVC tests on reference builds
>
> BUG= 636405
> NOTRY=true
> CQ_INCLUDE_TRYBOTS=master.tryserver.chromium.perf:android_s5_perf_cq;master.tryserver.chromium.perf:linux_perf_cq;master.tryserver.chromium.perf:mac_retina_perf_cq;master.tryserver.chromium.perf:winx64_10_perf_cq
>
> Committed: https://crrev.com/0de9faf1af255a6e7d928ee4c65eb0214ce72a3e
> Cr-Commit-Position: refs/heads/master@{#416270}

TBR=skyostil@chromium.org,picksi@chromium.org,picksi@google.com
# Not skipping CQ checks because original CL landed more than 1 days ago.
BUG= 636405

Review-Url: https://codereview.chromium.org/2313143002
Cr-Commit-Position: refs/heads/master@{#416806}

[modify] https://crrev.com/be74d8bad819ed4753b82f03003e8d88b1816ed0/tools/perf/benchmarks/v8.py
,
Sep 9 2016
I tried to reproduce this locally without success. I just sent a perf-tryjob with renderer.scheduler and disabled-by-default-renderer.scheduler categories turned on. That should let us see (assuming I can get the trace) if IdleHelper::StartIdlePeriod is getting called as expected, and let us see what state the queues are in.
,
Sep 9 2016
Well, that try job (https://codereview.chromium.org/2322073004) passed, although I did get the tracing categories I was after. Interestingly, I note that the delay in the test passing is <1s, whereas in Jochen's test run from #7 it was around a 25s delay. I wonder what's going on there; that doesn't feel right.

https://00e9e64bac99ec5b4a1a5f7b78f44b653d5329acb57d61a63b-apidata.googleusercontent.com/download/storage/v1_internal/b/chrome-telemetry-output/o/trace-file-id_0-2016-09-09_07-16-15-93903.html?qk=AD5uMEs5YVu8N55CjhaDqNo5S5FS9dpmqTplLauDtlCoWc4KTy13p_OHXX_S0CVv9tJ9lrZr85q3NhsxrR6_rTnawTVCp68-LAkPHTyVim3kqHxrH8ueyLvzel39Hq6C5jvEK_EXaY43pcRQolfxAKGip1liDi9Ck8VJBpa603wDiYI2PDi0wYqh7Ap4HSE9iNG8Kjgx-kyWg6FZAdJztlK1Rsug8H6_XRfofCENagHErXBIoKiLj8ZBe6cNTCfzk5y0ho3nQ8FpBVEQObP59eG2mOe99C8LkVLBOhrZ74wNOzuNt6jTN_FJssgXUqfUH5gILdgiDq_p3x6lvAXJM4HJi0Q2ISqqurUnzXUxzYsp4SX-Af5G4d6SCqP11o_ziIkbjh840U4F-e1VtUc961qZlMHNrin7MjeKojUKXbw61PWvlfdynzJJwP_GplD62mHAny5-jqYCh-ghbrv-pGhq52JjNA-adm8FMGdkPmZmYHZEjQxix-lVobea0-XlpOygp7HIawAtBJ2o714SuszvxtbOBNutaGryYmywJ7jPZenKK8Wvlj1Wku_Lqx1aPOJyRfcxJ102Ks8n7uVHE8ITHX688gCgZoA4AHUzwDNw4MAi3g0o1jC9FZNS5kKOD71PGzpTsbICmuGAUeKPdaxNqEFelc8SdBanadFSEo9yHQleYqaiIBSstS7taaRvijvwnTh5gkGAAiuLrTvTUbPu1RwzjugF9tr__Vr-Z4bIdkl5o3J9eAv4dJgvV_TS2b2rZ5VYOVOyUtMv31UfBYVPQFm3lzsTrsC6E_4bxKEC7HEJy91I0VljADWveloEog6q4mi1ai7KiVSQ-UCfjrbedsjeSssXaHzLo9bVvifBXf_wFwiaaOWatq0UuhcPTbWY9e3wZHbJtxhOpEhU-H6ybjal_UAkwg
,
Sep 9 2016
s/test passing/Interaction starting.
,
Sep 9 2016
I used a configuration (since reverted) in which the test finishes as soon as it gets an idle callback of just 10ms; that way I managed to get traces for runs that would otherwise just time out. It might be worthwhile to reland those changes for a day, and also to activate the scheduler categories.
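To make the effect of the min-idle threshold concrete, here is a minimal sketch of the wait condition, assuming the test simply scans idle callbacks in time order and finishes at the first one whose remaining idle time meets the threshold. The function name, the tuple layout, and the sample numbers are all hypothetical; this is not the code in todomvc.py.

```python
from typing import Iterable, Optional, Tuple


def first_sufficient_idle(
    callbacks: Iterable[Tuple[float, float]],
    min_idle_ms: float,
    timeout_ms: float,
) -> Optional[float]:
    """Return the timestamp of the first idle callback that is long enough.

    `callbacks` holds (timestamp_ms, time_remaining_ms) pairs in time order.
    Returns None when no callback qualifies before `timeout_ms`, which is
    the case that would show up as a test timeout.
    """
    for timestamp_ms, time_remaining_ms in callbacks:
        if timestamp_ms > timeout_ms:
            break  # past the deadline: the wait has timed out
        if time_remaining_ms >= min_idle_ms:
            return timestamp_ms
    return None


# Made-up trace: idle callbacks arrive late and with short idle periods.
hypothetical_trace = [(24000.0, 3.0), (25000.0, 12.0), (26000.0, 4.0)]

print(first_sufficient_idle(hypothetical_trace, min_idle_ms=10.0, timeout_ms=60000.0))
print(first_sufficient_idle(hypothetical_trace, min_idle_ms=50.0, timeout_ms=60000.0))
```

With a low threshold the wait completes (late), so a trace can be captured; with a higher threshold the same sequence of callbacks never satisfies the condition and the run times out.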
,
Sep 20 2016
Maybe related to issue 647870
,
Sep 22 2016
Issue 644826 has been merged into this issue.
,
Oct 4 2016
Issue 647870 has been merged into this issue.
,
Oct 4 2016
,
Oct 14 2016
Hey Brian, have you had any time to dig into this?
,
Oct 14 2016
,
Oct 26 2016
So I spent some time digging into this and I *think* this was already fixed by Sunny when he removed retro-BeginFrames from the scheduler -- most recently here: https://codereview.chromium.org/2339633003

v8.todomvc is passing on all the bots I looked at, and the last failure I found was from September 26th, a bit after Sunny's patch was reverted: https://build.chromium.org/p/chromium.perf/builders/Android%20Nexus5%20Perf%20%283%29/builds/3281

My theory of what was happening is this:

1. Scheduler wants a BeginFrame and posts a retro-BeginFrame task.
2. Scheduler changes its mind => NotExpectingBeginFrameSoon.
3. RetroBeginFrame task runs and does a BeginFrame, but since we didn't subscribe to BeginFrames we won't call NotExpectingBeginFrameSoon after it => Blink Scheduler thinks we're never completely idle.

Given that retro frames are gone, this isn't possible anymore. However, I tried to sync back to a build with retro frames intact and I still couldn't reproduce this bug -- I tried with Wikipedia, a custom rIC page and v8.todomvc.

I'll close the bug. Please reopen if you run into this problem elsewhere.
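The three-step theory in the closing comment can be illustrated with a toy model. This is only a sketch of the ordering problem as described there, not Chromium's scheduler code: the classes `ToyIdleSignal` and `ToyCompositorScheduler`, their methods, and the `simulate` helper are all hypothetical names invented for the illustration.

```python
from typing import Callable, List


class ToyIdleSignal:
    """Hypothetical stand-in for the Blink scheduler's view of frame activity."""

    def __init__(self) -> None:
        self.expecting_frames = False

    def on_begin_frame(self) -> None:
        self.expecting_frames = True

    def on_not_expecting_begin_frame_soon(self) -> None:
        self.expecting_frames = False


class ToyCompositorScheduler:
    """Hypothetical model of a scheduler with optional retro-BeginFrame tasks."""

    def __init__(self, signal: ToyIdleSignal, use_retro_frames: bool) -> None:
        self.signal = signal
        self.use_retro_frames = use_retro_frames
        self.subscribed = False
        self.pending_tasks: List[Callable[[], None]] = []

    def request_frame(self) -> None:
        # Step 1: with retro frames, a task is posted instead of subscribing.
        if self.use_retro_frames:
            self.pending_tasks.append(self._run_retro_begin_frame)
        else:
            self.subscribed = True

    def cancel_frame_request(self) -> None:
        # Step 2: the scheduler changes its mind and reports that no frame
        # is expected. The already-posted retro task is not cancelled.
        self.subscribed = False
        self.signal.on_not_expecting_begin_frame_soon()

    def _run_retro_begin_frame(self) -> None:
        # Step 3: the stale task still delivers a BeginFrame. Because there is
        # no subscription to tear down, nothing sends the idle notification.
        self.signal.on_begin_frame()
        if self.subscribed:
            self.subscribed = False
            self.signal.on_not_expecting_begin_frame_soon()

    def run_pending_tasks(self) -> None:
        tasks, self.pending_tasks = self.pending_tasks, []
        for task in tasks:
            task()


def simulate(use_retro_frames: bool) -> bool:
    """Return True if the toy idle signal ends up able to go idle."""
    signal = ToyIdleSignal()
    scheduler = ToyCompositorScheduler(signal, use_retro_frames)
    scheduler.request_frame()
    scheduler.cancel_frame_request()
    scheduler.run_pending_tasks()
    return not signal.expecting_frames


print("retro frames, can go idle:", simulate(use_retro_frames=True))
print("no retro frames, can go idle:", simulate(use_retro_frames=False))
```

In this model the stale retro task is the only thing that differs between the two runs, which matches the comment's point that removing retro frames removes the stuck state.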
Comment 1 by tkonch...@chromium.org
, Aug 11 2016