requestAnimationFrame fails to execute callback after hours of correct operation
Reported by pas...@lindelauf.com, Oct 17
Issue description
UserAgent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/69.0.3497.100 Safari/537.36
Steps to reproduce the problem:
We have an HTML5 application that continuously runs an HTML-based presentation for digital signage purposes. Slides are built, a transition takes place, the next slide is built, and this continues indefinitely. The building of the next slide is done within a requestAnimationFrame callback. In ChromeOS v69 we see that after hours of correct operation, requestAnimationFrame stops executing its callback function. All subsequent requestAnimationFrame calls stop executing as well.
What is the expected behavior?
requestAnimationFrame is guaranteed to execute its callback function under any circumstances.
What went wrong?
After hours of correct operation, requestAnimationFrame stops executing its callback function. All subsequent requestAnimationFrame calls stop executing as well.
Did this work before? Yes, v68 (unverified, though)
Chrome version: 69.0.3497.120 Channel: n/a
OS Version: 69.0.3497.120
Flash Version:
,
Oct 18
Is requestAnimationFrame the only thing that stops? Can you provide a test case that reliably reproduces? There was recently some change to freeze task queues on mobile that was planned for M69. I wonder if that may be related. Though that "hours" timeline does not match with that change. /cc panicker@ in case they are aware of any other possible changes that may be related.
,
Oct 19
Doesn't seem related to freezing. rAF doesn't run in hidden / occluded frames -- are they always visible here?
,
Oct 19
It's a kiosk application, so I think that qualifies as "always visible". The place where we see this happening is in the code where we're constructing the next slide to be shown by recursively adding all the elements to the slide:
    buildElementsRecursively: function (elementArray) {
        "use strict";
        var self = this;
        if (elementArray.length === 0) {
            this._elementsBuilt = true;
        } else {
            requestAnimationFrame(function () {
                var element = elementArray.pop();
                var existingElement = self.elementById(element.element.id);
                if (existingElement &&
                        (existingElement.updatedAt() !== element.element.updated_at ||
                        existingElement.shouldRerenderOnEveryPageBuild())) {
                    existingElement.destroy();
                    self._elements.splice(self._elements.indexOf(existingElement), 1);
                    existingElement = undefined;
                }
                if (!existingElement) {
                    self.addElement(element.element, self._resizeFactor, self._idPrefix + self._id);
                }
                self.buildElementsRecursively(elementArray);
            });
        }
    },
We've established that after some time, the requestAnimationFrame is just not executing anymore.
,
Oct 19
Thank you for providing more feedback. Adding the requester to the cc list. For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
,
Oct 19
Will try to re-pro this locally.
,
Oct 23
Have you verified that there are no script errors before the next call? It's hard to be sure that none of the work between the start of the function and the recursive call throws an error that aborts script execution. If you move the call to self.buildElementsRecursively so that it is the very first thing you do after calling elementArray.pop(), errors in the subsequent processing would not prevent reaching the next requestAnimationFrame. Does this reproduce in a non-kiosk setup (i.e. on a desktop machine)? Have you tried a similar experiment with a simpler script to see if requestAnimationFrame indeed stops firing, i.e. something like this: https://jsbin.com/fepajaj/edit?html,js,output Xida, were you able to reproduce this?
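For illustration, a minimal sketch of the idea in the comment above: make sure an exception in the per-item work cannot break the chain of frame callbacks. This is a variant of the suggested reordering, not the reporter's code. It uses try/finally so the next frame is always scheduled and completion is still signaled only after the last item. The names processPerFrame, handleItem and onDone are hypothetical, and the raf parameter exists only so the sketch can run outside a browser.

```javascript
// Hypothetical helper: handles one item per animation frame.
// `raf` defaults to the browser's requestAnimationFrame; passing a
// substitute is only meant for testing outside a browser.
function processPerFrame(items, handleItem, onDone, raf) {
  raf = raf || requestAnimationFrame;
  if (items.length === 0) {
    onDone();
    return;
  }
  raf(function () {
    var item = items.pop();
    try {
      // Any exception thrown here still propagates to the caller...
      handleItem(item);
    } finally {
      // ...but the next frame (or the completion callback) is always reached.
      processPerFrame(items, handleItem, onDone, raf);
    }
  });
}
```

With this shape, a stall in the chain can no longer be explained by a script error in the per-item work, which narrows the search to the scheduling of the frame callback itself.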
,
Oct 24
I actually left the try..catch block out of this example, to keep it small. But it is in the original code and there is no exception raised. So far we've actually only seen this problem on two different Chromebits that belong to our customers. We're even having a hard time reproducing this consistently on our own Chromebit. We have not been able to reproduce this on regular Chrome. I'll first try to reliably reproduce this on our Chromebit.
,
Oct 24
Re#9: I wasn't able to repro this locally.
,
Oct 24
Looking through the code paths there's a few possibilities. If we are throttling rendering of that frame for some reason we would skip executing animation frame callbacks. +chrishtr any chance we would throttle this kiosk frame? Other possibilities would be deferred commits, thinking the web view is not visible, if ScriptedAnimationController::Pause were called, or possibly some race on the state of whether we need a begin main frame / max_pipeline_stage though I haven't actually discovered any such race. Last but not least, do we know for sure that it's not running out of memory? Perhaps a worthwhile test would be to also set a timeout with the same task after some long time (several seconds) and see if there is ever a case where the timeout fires first (while the page is visible of course - since while hidden the behavior is expected to differ).
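The timeout experiment proposed above could look roughly like the following sketch. It schedules the same task through both requestAnimationFrame and a long timeout, runs it once from whichever fires first, and reports when the timeout wins. The function name runWithWatchdog, the onTimeoutWon callback and the env parameter are assumptions made for this illustration; env only exists so the scheduling functions can be replaced outside a browser.

```javascript
// Hypothetical watchdog: runs `task` exactly once, from whichever of
// requestAnimationFrame or the timeout fires first.
function runWithWatchdog(task, timeoutMs, onTimeoutWon, env) {
  env = env || {};
  var raf = env.raf || requestAnimationFrame;
  var setTimer = env.setTimeout || setTimeout;
  var clearTimer = env.clearTimeout || clearTimeout;
  var finished = false;

  var timerId = setTimer(function () {
    if (finished) { return; }
    finished = true;
    // The frame callback did not run within timeoutMs.
    onTimeoutWon();
    task();
  }, timeoutMs);

  raf(function () {
    if (finished) { return; }
    finished = true;
    clearTimer(timerId);
    task();
  });
}
```

The sketch does not check page visibility. Since a timeout is expected to win while the page is hidden, a real experiment would presumably also record document.visibilityState whenever onTimeoutWon fires.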
,
Oct 24
Re throttling: there is a chance, I guess, if it thinks the frame containing the requestAnimationFrame calls is hidden. But that doesn't seem any more likely than any of the other possible causes.
,
Oct 30
Root cause seems unlikely to be directly related to animations based on analysis in comment #13.
,
Nov 2
pascal: Can you take a chrome://tracing trace when it gets into this state? See: https://www.chromium.org/developers/how-tos/trace-event-profiling-tool/recording-tracing-runs In particular, select manual trace, all the left side categories and all the cc.debug.scheduler categories on the right.
,
Dec 14
Sorry for the late reply. This is incredibly hard to reproduce for us, since we've not seen this happening on our own devices; instead our customers have seen this happening. That's also why we cannot get tracing info (and is this at all possible in a kiosk application?). The best we could do is add extra logging around this issue to track if it happens and hopefully how.
The reason I'm sending you this now is that this logging has revealed another occurrence of this strange behaviour of requestAnimationFrame.
Context
=======
In the code above, I have replaced the requestAnimationFrame with a setTimeout of 10ms. Right after that statement I have put:
var rafId = requestAnimationFrame(function () { logger.log("requestAnimationFrame still working... (id=" + rafId + ")"); });
Expected behaviour
==================
The buildElementsRecursively function is being called multiple times per minute all day long. So you expect to see the log statement above woven through the other application's log lines. And that is exactly what we see in 99.99% of the cases.
What went wrong?
================
In this particular session we see that the "requestAnimationFrame" logging stopped at around 4:30PM, whereas the rest of the logging proceeded, indicating that the application kept working. Then at around 12:50AM (so more than 6 hours later), all of a sudden we get a flood of 'requestAnimationFrame still working...' messages: nothing else. And then after a couple of thousands of them, the session stopped working.
This is on ChromeOS 70.0.3538.110.
I hope this is something you can work with, because this customer is reporting frequent instability of his system. This is the only anomaly we can find on this device.
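One way to make logging like this more telling is to record how long each callback waited between being requested and being run, so that a late flood of callbacks can be matched to the time they were originally queued. The following is only a sketch under assumptions: trackedRaf, report and thresholdMs are hypothetical names, and the raf and now parameters exist only so the sketch can run outside a browser.

```javascript
// Hypothetical wrapper around requestAnimationFrame that reports callbacks
// which waited longer than thresholdMs before being run.
function trackedRaf(callback, report, thresholdMs, raf, now) {
  raf = raf || requestAnimationFrame;
  now = now || Date.now;
  var requestedAt = now();
  return raf(function (timestamp) {
    var delayMs = now() - requestedAt;
    if (delayMs > thresholdMs) {
      // For example, log the delay together with the request time.
      report(delayMs, requestedAt);
    }
    callback(timestamp);
  });
}
```

Used in place of the plain logging call, a report with a delay of several hours would suggest that the callbacks in the flood had been queued while the log was silent and were only released later.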
,
Dec 14
Thanks for the feedback pascal@. Can you also provide the output of chrome://gpu so we can understand the exact hardware you're seeing this on? You can provide this output even if you haven't seen the issue recently. Thanks!
,
Dec 17
Sorry, this is a device running at one of our customer's offices, so I can't run chrome://gpu. However, I know that in this case it's an Asus Chromebit.
,
Dec 17
I would suggest that there are several threads here that are possibly related.
https://bugs.chromium.org/p/chromium/issues/detail?id=903007&q=chromebit&sort=-modified&colspec=ID%20Pri%20M%20Stars%20ReleaseBlock%20Component%20Status%20Owner%20Summary%20OS%20Modified
https://bugs.chromium.org/p/chromium/issues/detail?id=893591&q=chromebit&sort=-modified&colspec=ID%20Pri%20M%20Stars%20ReleaseBlock%20Component%20Status%20Owner%20Summary%20OS%20Modified
https://bugs.chromium.org/p/chromium/issues/detail?id=879081&q=chromebit&sort=-modified&colspec=ID%20Pri%20M%20Stars%20ReleaseBlock%20Component%20Status%20Owner%20Summary%20OS%20Modified
,
Dec 17
Likely related - https://bugs.chromium.org/p/chromium/issues/detail?id=865025 In that issue we discovered a gpu/renderer hang that has been fixed in 71+. It would only reproduce on devices with the Rockchip processor. Does the issue reproduce in versions on or after 71.0.3578.94?
,
Jan 18
pascal@ - can you check if this is addressed on the latest Chrome stable (>=71.0.3578.94)? It feels like the GPU process hanging and being unable to restart, as described in issue 865025, could lead to rAF failing while setTimeout keeps working, so I'm hoping this is the same issue.
,
Jan 21
I will monitor the devices that have been reported as problematic closely in the hopes that I can give you a conclusive answer.
Comment 1 by bugsnash@chromium.org
, Oct 17