browse:media:tumblr story in v8.runtimestats.browsing_desktop and v8.browsing_desktop benchmarks is flaky on Mac Air 10.11
Issue description
https://build.chromium.org/p/chromium.perf/builders/Mac%20Air%2010.11%20Perf?numbuilds=200
(INFO) 2017-10-19 21:04:45,322 browser.DumpStateUponFailure:393 DevTools listening on ws://127.0.0.1:58343/devtools/browser/c37dfcd4-a840-4024-80f5-753226735046
2017-10-19 21:03:59.386 Google Chrome[13859:180129] Errors logged by ksadmin: KSKeyedPersistentStore store directory does not exist. [com.google.UpdateEngine.CommonErrorDomain:501 - '/Library/Google/GoogleSoftwareUpdate/TicketStore' - 'KSKeyedPersistentStore.m:372']
KSPersistentTicketStore failed to load tickets. (productID: com.google.Chrome) [com.google.UpdateEngine.CoreErrorDomain:1051 - '/Library/Google/GoogleSoftwareUpdate/TicketStore/Keystone.ticketstore'] (KSKeyedPersistentStore store directory does not exist. - '/Library/Google/GoogleSoftwareUpdate/TicketStore' [com.google.UpdateEngine.CommonErrorDomain:501])
ksadmin cannot access the ticket store:<KSUpdateError:0x10071ae60 domain="com.google.UpdateEngine.CoreErrorDomain" code=1051 userInfo={ function = "-[KSProductKeyedStore(ProtectedMethods) errorForStoreError:productID:message:timeoutMessage:]"; date = 2017-10-20 04:03:59 +0000; productids = {( "com.google.Chrome" )}; filename = "KSProductKeyedStore.m"; line = 91; NSFilePath = "/Library/Google/GoogleSoftwareUpdate/TicketStore/Keystone.ticketstore"; NSUnderlyingError = <KSError:0x1007190c0 domain="com.google.UpdateEngine.CommonErrorDomain" code=501 userInfo={ date = 2017-10-20 04:03:59 +0000; line = 372; filename = "KSKeyedPersistentStore.m"; function = "-[KSKeyedPersistentStore(PrivateMethods) validateStorePath]"; NSFilePath = "/Library/Google/GoogleSoftwareUpdate/TicketStore"; NSLocalizedDescription = "KSKeyedPersistentStore store directory does not exist."; } >; NSLocalizedDescription = "KSPersistentTicketStore failed to load tickets."; } >
libc++abi.dylib: terminating with uncaught exception of type std::out_of_range: vector
(INFO) 2017-10-19 21:04:45,322 browser.DumpStateUponFailure:396 *********** END OF BROWSER STANDARD OUTPUT ************
(INFO) 2017-10-19 21:04:45,322 browser.DumpStateUponFailure:398 ********************* BROWSER LOG *********************
(INFO) 2017-10-19 21:04:45,322 browser.DumpStateUponFailure:400 No log file
(INFO) 2017-10-19 21:04:45,322 browser.DumpStateUponFailure:403 ***************** END OF BROWSER LOG ******************
Traceback (most recent call last):
  File "/b/s/w/ir/third_party/catapult/telemetry/telemetry/internal/story_runner.py", line 104, in _RunStoryAndProcessErrorIfNeeded
    state.RunStory(results)
  File "/b/s/w/ir/third_party/catapult/common/py_trace_event/py_trace_event/trace_event_impl/decorators.py", line 75, in traced_function
    return func(*args, **kwargs)
  File "/b/s/w/ir/third_party/catapult/telemetry/telemetry/page/shared_page_state.py", line 324, in RunStory
    self._current_page.Run(self)
  File "/b/s/w/ir/third_party/catapult/telemetry/telemetry/page/__init__.py", line 118, in Run
    self.RunPageInteractions(action_runner)
  File "/b/s/w/ir/tools/perf/page_sets/system_health/system_health_story.py", line 107, in RunPageInteractions
    self._DidLoadDocument(action_runner)
  File "/b/s/w/ir/tools/perf/page_sets/system_health/browsing_stories.py", line 367, in _DidLoadDocument
    self._ViewMediaItem(action_runner, index)
  File "/b/s/w/ir/tools/perf/page_sets/system_health/browsing_stories.py", line 465, in _ViewMediaItem
    action_runner.MouseClick(selector='#tumblr_lightbox_center_image')
  File "/b/s/w/ir/third_party/catapult/common/py_trace_event/py_trace_event/trace_event_impl/decorators.py", line 75, in traced_function
    return func(*args, **kwargs)
  File "/b/s/w/ir/third_party/catapult/telemetry/telemetry/internal/actions/action_runner.py", line 597, in MouseClick
    self._RunAction(MouseClickAction(selector=selector))
  File "/b/s/w/ir/third_party/catapult/common/py_trace_event/py_trace_event/trace_event_impl/decorators.py", line 75, in traced_function
    return func(*args, **kwargs)
  File "/b/s/w/ir/third_party/catapult/telemetry/telemetry/internal/actions/action_runner.py", line 62, in _RunAction
    action.WillRunAction(self._tab)
  File "/b/s/w/ir/third_party/catapult/common/py_trace_event/py_trace_event/trace_event_impl/decorators.py", line 75, in traced_function
    return func(*args, **kwargs)
  File "/b/s/w/ir/third_party/catapult/telemetry/telemetry/internal/actions/mouse_click.py", line 17, in WillRunAction
    utils.InjectJavaScript(tab, 'mouse_click.js')
  File "/b/s/w/ir/third_party/catapult/telemetry/telemetry/internal/actions/utils.py", line 11, in InjectJavaScript
    tab.ExecuteJavaScript(js)
  File "/b/s/w/ir/third_party/catapult/common/py_trace_event/py_trace_event/trace_event_impl/decorators.py", line 75, in traced_function
    return func(*args, **kwargs)
  File "/b/s/w/ir/third_party/catapult/telemetry/telemetry/internal/browser/web_contents.py", line 192, in ExecuteJavaScript
    return self._inspector_backend.ExecuteJavaScript(*args, **kwargs)
  File "/b/s/w/ir/third_party/catapult/common/py_trace_event/py_trace_event/trace_event_impl/decorators.py", line 75, in traced_function
    return func(*args, **kwargs)
  File "/b/s/w/ir/third_party/catapult/telemetry/telemetry/internal/backends/chrome_inspector/inspector_backend.py", line 38, in Inner
    return func(inspector_backend, *args, **kwargs)
  File "/b/s/w/ir/third_party/catapult/telemetry/telemetry/internal/backends/chrome_inspector/inspector_backend.py", line 225, in ExecuteJavaScript
    self._runtime.Execute(statement, context_id, timeout)
  File "/b/s/w/ir/third_party/catapult/telemetry/telemetry/internal/backends/chrome_inspector/inspector_runtime.py", line 20, in Execute
    self.Evaluate(expr + '; 0;', context_id, timeout)
  File "/b/s/w/ir/third_party/catapult/telemetry/telemetry/internal/backends/chrome_inspector/inspector_runtime.py", line 44, in Evaluate
    res = self._inspector_websocket.SyncRequest(request, timeout)
  File "/b/s/w/ir/third_party/catapult/telemetry/telemetry/internal/backends/chrome_inspector/inspector_websocket.py", line 116, in SyncRequest
    res = self._Receive(timeout)
  File "/b/s/w/ir/third_party/catapult/telemetry/telemetry/internal/backends/chrome_inspector/inspector_websocket.py", line 172, in _Receive
    self._HandleNotification(result)
  File "/b/s/w/ir/third_party/catapult/telemetry/telemetry/internal/backends/chrome_inspector/inspector_websocket.py", line 185, in _HandleNotification
    self._domain_handlers[domain_name](result)
  File "/b/s/w/ir/third_party/catapult/common/py_trace_event/py_trace_event/trace_event_impl/decorators.py", line 75, in traced_function
    return func(*args, **kwargs)
  File "/b/s/w/ir/third_party/catapult/telemetry/telemetry/internal/backends/chrome_inspector/inspector_backend.py", line 460, in _HandleInspectorDomainNotification
    raise exception
DevtoolsTargetCrashException: <unprintable DevtoolsTargetCrashException object>
,
Oct 20 2017
📍 Couldn't reproduce a difference. https://pinpoint-dot-chromeperf.appspot.com/job/1488d517780000
,
Oct 20 2017
📍 Pinpoint job started. https://pinpoint-dot-chromeperf.appspot.com/job/14ab2d17780000
,
Oct 20 2017
📍 Couldn't reproduce a difference. https://pinpoint-dot-chromeperf.appspot.com/job/14ab2d17780000
,
Oct 20 2017
📍 Pinpoint job started. https://pinpoint-dot-chromeperf.appspot.com/job/12ab2d17780000
,
Oct 20 2017
📍 Couldn't reproduce a difference. https://pinpoint-dot-chromeperf.appspot.com/job/12ab2d17780000
,
Oct 20 2017
+simonhatch: these bisects all actually do reproduce, going from a 1.0 success rate to a success rate below 1.0. Is there anything Pinpoint can do to narrow this down?
,
Oct 23 2017
Poking at this a bit today, it looks like the p-value is extremely close to the threshold: 0.00128049769083 (0.001 being the significance level we check for). I'm not an expert on MWU (the Mann-Whitney U test), so Dave may have to weigh in when he gets back. At 0.001 with 15 repetitions, going from 100% stable to flaky currently requires at least 9/15 failures for the bisect to proceed. There seem to be two approaches: raise the significance level or increase the number of repetitions. If we raise the significance level to, say, 0.01 (like recipe bisect), you'd need 6 failures instead of 9, but I imagine we'd get a lot of false positives on the perf side. If we increase repetitions, we increase runtime, which isn't desirable either. Longer term, I know Dave has plans to make Pinpoint smarter about choosing to do more repetitions when needed. Short term, maybe I can deploy, but not land, a version that uses a different p-value threshold for testing the Test quest.
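For anyone who wants to sanity-check the 9/15 and 6/15 figures, here is a rough sketch of the arithmetic. It assumes a two-sided Mann-Whitney U test using the normal approximation with tie and continuity corrections, applied to pass/fail results coded as 1/0; whether Pinpoint computes its p-value exactly this way is an assumption, and the function names `mwu_two_sided_p` and `min_failures_needed` are made up for illustration.

```python
import math


def mwu_two_sided_p(passes_a, fails_a, passes_b, fails_b):
    """Two-sided Mann-Whitney U p-value for two pass/fail samples.

    Uses the normal approximation with tie and continuity corrections.
    Passes are scored 1 and failures 0, so there are only two tie groups.
    """
    n_a = passes_a + fails_a
    n_b = passes_b + fails_b
    n = n_a + n_b
    fails = fails_a + fails_b
    passes = passes_a + passes_b

    # Midranks: all failures share one rank, all passes share another.
    rank_fail = (fails + 1) / 2
    rank_pass = fails + (passes + 1) / 2

    rank_sum_a = fails_a * rank_fail + passes_a * rank_pass
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    mean_u = n_a * n_b / 2

    # Tie-corrected variance of U.
    tie_term = sum(t ** 3 - t for t in (fails, passes)) / (n * (n - 1))
    variance = n_a * n_b / 12 * ((n + 1) - tie_term)
    if variance <= 0:
        # Every result is identical, so there is no evidence of a difference.
        return 1.0

    z = max(abs(u_a - mean_u) - 0.5, 0.0) / math.sqrt(variance)
    return math.erfc(z / math.sqrt(2))


def min_failures_needed(repetitions, significance):
    """Smallest failure count in the second sample that gives p < significance,
    when the first sample passed every repetition. Returns None if no count does."""
    for failures in range(repetitions + 1):
        p = mwu_two_sided_p(repetitions, 0, repetitions - failures, failures)
        if p < significance:
            return failures
    return None


if __name__ == "__main__":
    print(mwu_two_sided_p(15, 0, 7, 8))    # 15/15 passes vs. 8/15 failures
    print(min_failures_needed(15, 0.001))
    print(min_failures_needed(15, 0.01))
```

Under these assumptions, 8 failures out of 15 gives a p-value of roughly 0.00128, just above the 0.001 threshold, which is consistent with needing 9 failures at 0.001 and 6 at 0.01.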
,
Oct 23 2017
It might make sense to support a more aggressive algorithm that we can toggle, so that we can detect lower levels of flake as well. That way we don't have to increase the repetitions, but can still narrow down on tests when we need to.
,
Aug 10
Removing mistakenly applied label.
Comment 1 by 42576172...@developer.gserviceaccount.com
, Oct 20 2017