requestAnimationFrame interferes with input events
Reported by
alexsuvo...@unity3d.com,
Mar 30 2016
Issue description

Chrome Version : 49.0.2623.87 (64-bit)
URLs (if applicable) : https://jsfiddle.net/va5vpxLz/1/
Other browsers tested:
Add OK or FAIL, along with the version, after other browsers where you have tested this issue:
Safari:
Firefox: OK
IE:
* could not perform reliable testing in other browsers due to inaccurate timers; however, visual comparison confirms Firefox does not have the issue

What steps will reproduce the problem?
(1) open the demo https://jsfiddle.net/va5vpxLz/1/ and enable the "Log the details in console" checkbox; you will see a 60~70 ms processing delay for mousemove events
(2) now enable the "Schedule main loop through setTimeout(0)" checkbox and check the console; you will now see a 0~50 ms processing delay for mouse events, which is adequate for the target 20 fps (50 ms)
(3) * the test simulates a situation where frame computations are heavy enough to degrade the rendering framerate

What is the expected result?
mouse events that appeared after the requestAnimationFrame call and before the next vsync should be processed before the corresponding rAF callback

What happens instead?
several mouse events that appeared before vsync, and sometimes even before the actual requestAnimationFrame call, are not processed prior to the rAF callback, while forcing event queue processing with setTimeout(0) in the rAF callback resolves the situation

Please provide any additional information below. Attach a screenshot if possible.
example log:
[890619.46] enter mainLoop_runner
[890669.72] initiate requestAnimationFrame
[890671.18] process onmousemove with timeStamp: 890608.48 MOUSE PROCESSING DELAY: 62.69
[890671.41] process requestAnimationFrame with timeStamp: 890660.90
[890671.52] enter mainLoop_runner
[890721.74] initiate requestAnimationFrame
[890723.27] process onmousemove with timeStamp: 890644.39 MOUSE PROCESSING DELAY: 78.88
[890723.56] process requestAnimationFrame with timeStamp: 890710.91
as you can see in this example, the onmousemove with timeStamp: 890644.39 was expected to be processed before the requestAnimationFrame callback with timeStamp: 890660.90, but it wasn't, therefore a 1 frame delay was introduced for mouse input
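
For readers following the numbers, here is a minimal sketch of the arithmetic behind the example log. The helper names are hypothetical and are not taken from the jsfiddle demo; the values are copied from the second mousemove entry in the log.

```javascript
// Hypothetical helper: delay between an event's timeStamp and the moment
// its handler actually ran (both in milliseconds on the same clock).
function processingDelay(handledAt, eventTimeStamp) {
  return handledAt - eventTimeStamp;
}

// Hypothetical helper: true if the event already existed when the frame
// identified by rafTimeStamp began, i.e. it could have been handled first.
function arrivedBeforeFrame(eventTimeStamp, rafTimeStamp) {
  return eventTimeStamp < rafTimeStamp;
}

// Second mousemove in the log: handled at 890723.27, stamped 890644.39.
const exampleDelay = processingDelay(890723.27, 890644.39);

// That event predates the rAF callback stamped 890660.90, yet it was only
// handled in the following cycle.
const exampleMissedFrame = arrivedBeforeFrame(890644.39, 890660.90);

console.log(exampleDelay.toFixed(2), exampleMissedFrame);
```

The delay comes out at about 78.88 ms, matching the MOUSE PROCESSING DELAY printed in the log for that event.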
,
Mar 30 2016
,
Mar 30 2016
,
Mar 30 2016
I'm having difficulty comparing Chrome to both Safari and Firefox on my OS X laptop, but subjectively, Chrome 51.0.2694.1 (Official Build) canary (64-bit) looks smoother than the other two browsers. Have you tested with Canary and does it improve the situation for you compared to Stable?
,
Mar 31 2016
alexsuvorov@ Could you please provide a screencast of the actual and expected behavior, so that we can better understand how to compare the mouse events in Chrome and Firefox?
,
Mar 31 2016
Hello. I am currently trying to make it display the correct delay in other browsers for objective comparison, but it does not seem to be easily achievable (i.e. in Safari the timeStamp does not seem to come from a high resolution timer). I will send you a link when I can make it work reliably. We are talking about a 1 frame input delay here, so timing precision is critical for objective comparison. However, at the moment you can do the following visual comparison:
1) Open the following updated link in Chrome https://jsfiddle.net/va5vpxLz/3/ (I have set the throttling fps to 15)
2) Move your mouse left-right-left-right-left-right with a period of about 0.5 sec (2 times to the right each second) within an area of about 200 px (the distance does not really matter). You can see that with this period, due to the input delay, your mouse cursor and the rendered white square are moving in opposite directions.
3) Now enable the setTimeout(0) checkbox: you can see that the input delay got smaller. It is still present, but it can no longer cover the half-period.
You can try the same thing in Firefox for comparison. Please let me know if you are not able to reproduce the visual effect described above.
,
Mar 31 2016
Thank you for providing more feedback. Adding requester "ssamanoori@chromium.org" for another review and adding "Needs-Review" label for tracking. For more details visit https://sites.google.com/a/chromium.org/dev/issue-tracking/autotriage - Your friendly Sheriffbot
,
Mar 31 2016
Could you also pay attention to the log of the rAF request and callback?
The following code:
consoleLog('initiate requestAnimationFrame');
requestAnimationFrame(function(e) {
  consoleLog('process requestAnimationFrame with timeStamp: ' + e.toFixed(2));
  useSetTimeout ? setTimeout(mainLoop_runner, 0) : mainLoop_runner();
});
gives:
[890669.72] initiate requestAnimationFrame
[890671.18] process onmousemove with timeStamp: 890608.48 MOUSE PROCESSING DELAY: 62.69
[890671.41] process requestAnimationFrame with timeStamp: 890660.90
[890671.52] enter mainLoop_runner
[890721.74] initiate requestAnimationFrame
[890723.27] process onmousemove with timeStamp: 890644.39 MOUSE PROCESSING DELAY: 78.88
[890723.56] process requestAnimationFrame with timeStamp: 890710.91
which means between requestAnimationFrame and its callback, just [890669.72]-[890671.41] and [890721.74]-[890723.56] i.e. about 2ms passes on average. While Firefox gives realistic 5-15ms.
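
The "about 2ms" figure can be checked mechanically. Below is a rough sketch that parses log lines of the form shown above and measures the gap between each rAF request and its callback. The function names parseLog and rafGaps are hypothetical, and the message prefixes are assumed to match the demo's log output exactly.

```javascript
// Parse lines of the form "[890669.72] message" into { time, message }.
function parseLog(lines) {
  return lines.map((line) => {
    const match = /^\[(\d+(?:\.\d+)?)\]\s*(.*)$/.exec(line);
    if (!match) throw new Error('unrecognized log line: ' + line);
    return { time: Number(match[1]), message: match[2] };
  });
}

// For each "initiate requestAnimationFrame" entry, measure the time until
// the next "process requestAnimationFrame" entry.
function rafGaps(entries) {
  const gaps = [];
  let requestedAt = null;
  for (const entry of entries) {
    if (entry.message.startsWith('initiate requestAnimationFrame')) {
      requestedAt = entry.time;
    } else if (
      entry.message.startsWith('process requestAnimationFrame') &&
      requestedAt !== null
    ) {
      gaps.push(entry.time - requestedAt);
      requestedAt = null;
    }
  }
  return gaps;
}

const measuredGaps = rafGaps(parseLog([
  '[890669.72] initiate requestAnimationFrame',
  '[890671.18] process onmousemove with timeStamp: 890608.48 MOUSE PROCESSING DELAY: 62.69',
  '[890671.41] process requestAnimationFrame with timeStamp: 890660.90',
  '[890671.52] enter mainLoop_runner',
  '[890721.74] initiate requestAnimationFrame',
  '[890723.27] process onmousemove with timeStamp: 890644.39 MOUSE PROCESSING DELAY: 78.88',
  '[890723.56] process requestAnimationFrame with timeStamp: 890710.91',
]));

console.log(measuredGaps.map((gap) => gap.toFixed(2)));
```

For this log the two gaps are roughly 1.69 ms and 1.82 ms, which is where the "about 2ms" above comes from.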
,
Mar 31 2016
,
Mar 31 2016
It sounds like input buffering might help make this work a little smoother.
,
Mar 31 2016
,
Apr 1 2016
Hello. I am not sure what exactly you mean by input buffering, but I would like to provide additional information. There is now a minimal test: https://jsfiddle.net/va5vpxLz/4/
1) run the demo, which has the initial throttling fps set to 42, and check the console; you will see that the rAF callback is executed immediately, irrespective of vsync:
[10268.36] schedule mainLoop
[10268.79] enter mainLoop
[10292.74] schedule mainLoop
[10293.22] enter mainLoop
2) make the following change and run the demo:
var fps = 90;
now you can see that rAF is working correctly:
[3141.77] schedule mainLoop
[3147.11] enter mainLoop
[3158.35] schedule mainLoop
[3163.89] enter mainLoop
the throttling cycle takes about 1000/90=11ms, then the rAF callback is postponed for about 5ms, which gives the expected 16ms=1000/60 vsync period
Expected behavior: all the vsync events that have been missed due to the thread being busy should be thrown away from the schedule. Instead, the rAF callback should be executed on the next available vsync event. This is exactly what happens in Firefox at throttling 42 fps:
[115412.19] schedule mainLoop
[115418.24] enter mainLoop
[115442.54] schedule mainLoop
[115455.61] enter mainLoop
1000/42=23ms for computation ([115418.24] ~ [115442.54]), then the rAF callback is postponed for 13ms ([115442.54] ~ [115455.61]), giving a total of 23ms+13ms=36ms, which is quite close to the expected 33.3ms for 2 vsync events at 60Hz. As you can see, the first missed vsync, which appeared during computation, has been ignored in Firefox.
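
The expected behavior described above can be sketched as simple arithmetic. This is only an idealized model, assuming a 60Hz display and that the main loop starts exactly on a vsync; the function names are hypothetical. Real measurements, such as the 13ms wait in the Firefox log, will differ when the loop does not start on a vsync boundary.

```javascript
const VSYNC_PERIOD_MS = 1000 / 60; // assumed 60Hz display

// Number of whole vsync periods one frame occupies if the rAF callback
// waits for the first vsync after the work finishes.
// Assumption: the work starts exactly on a vsync.
function vsyncPeriodsUsed(workMs, periodMs = VSYNC_PERIOD_MS) {
  return Math.max(1, Math.ceil(workMs / periodMs));
}

// How long the rAF callback is postponed after the work has finished.
function waitForNextVsync(workMs, periodMs = VSYNC_PERIOD_MS) {
  return vsyncPeriodsUsed(workMs, periodMs) * periodMs - workMs;
}

// Throttling at 90 fps: the work fits into one vsync period.
console.log(vsyncPeriodsUsed(1000 / 90), waitForNextVsync(1000 / 90).toFixed(1));

// Throttling at 42 fps: one vsync is missed, the callback runs on the second.
console.log(vsyncPeriodsUsed(1000 / 42), waitForNextVsync(1000 / 42).toFixed(1));
```

Under this model the 90 fps case waits about 5.6 ms (close to the observed 5ms), and the 42 fps case uses two vsync periods, about 33.3 ms in total.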
,
Apr 1 2016
So there is potentially work to be done here other than input batching ( issue 394562 ). I've attached a trace. This proposal seems pretty reasonable to me. Instead of (| indicates vsync):
RA|F R|AF |RAF
we should have:
RA|F | R|AF | RAF
Scheduling folks, is there any reason we shouldn't do this?
,
Apr 1 2016
It really comes down to whether we want to optimize for latency (i.e., the current system) vs. smoothness. With the current logic you can do things like scheduling a rAF in response to user input and get a new frame out for the "current" vsync instead of waiting for the next one. Sync scrolling also depends on this behavior. I could buy the argument that this isn't the right behavior when the main thread can't reach 60 fps, and indeed we are going to investigate dropping to 30 fps or lower in such cases. This would give us more loosely spaced out rAFs like in #13. The main question in my mind is how do we choose between the two modes reliably.
,
Apr 1 2016
Regarding the original issue in this report. Regardless of what scheme is chosen for rAF, would it be possible to make all the currently pending input events to be processed prior to rAF callback?
,
Apr 1 2016
Yeah, it's tricky. Scheduling input may help with this, depending on how exactly we schedule it. If we handle dispatch input directly before vsync, and rAF directly after, then we could schedule rAF in response to user input without increasing latency too much, and have rAF fire more regularly.
,
Apr 1 2016
I mean, let's consider the current scheme. I am not sure if I understand it correctly, but it seems to me that currently something like this is happening: if vsync appeared while the user code was executing and the thread was busy, then all the input that appeared after this missed rAF will only be processed after the corresponding rAF callback? If this is the case, then what about the following scheme: rAF is never queued while the thread is busy, but its state is cached, and this cached missed rAF can be queued as soon as the user code exits. This way all the input that appeared after vsync while the thread was busy will be processed before the rAF callback. Does this make any sense?
,
Apr 1 2016
Traces attached of the issue with input. Sami, I think there is some scheduling work to do here. It looks like the compositor is behaving completely differently when there's a timeout scheduled. Easy to trace test here: https://output.jsbin.com/lixepa/quiet.
,
Apr 6 2016
Assigning to skyostil@ for additional triage. I think there is some scheduling work to be done here.
,
Apr 6 2016
Thanks for the traces. I'm not sure the scheduler can make a meaningful improvement here (other than with bug 485600). The "no timeout" case looks like this:
1. Input
2. Really long rAF
3. Commit
=> all input events suffer from the long rAF, because it is always on the critical path.
Compare that to the "timeout" case:
1. Input
2. Really long timer
3. Input (because input isn't buffered)
4. Commit
=> every second input event will have lower latency than in the "no timeout" case. (N.B. I *think* buffered input will make both cases behave identically, i.e., have high latency.)
The basic problem is that rAF is always on the critical path to the screen, so if you make that slow, then you make everything slow. Timers on the other hand can run at any time and input can flow around them freely, so you might get lower latency if you get lucky. The only way I can think of to make the "no timeout" case better would be to run rAF *before* input, but that would make it impossible to run animations in sync with input so I don't think we want that. Tim, did you have some other ideas?
,
Apr 7 2016
alexsuvorov@ Thanks for pushing on this! Stepping back from the details (you guys are the experts there, not me), big picture it sounds like Unity is really struggling to get decent performance (in an important use case -gaming) by following our guidance. They're seeing a better end-user experience by using a pattern we're actively trying to discourage (setTimeout-based animation) and intervene against! Also their users are getting a better user experience in Firefox than Chrome. IMHO, coming up with a plan for what we (web platform collectively, not necessarily just scheduler) should be investing in to improve this seems important. Eg:
1. Are there APIs we should be adding to the platform to help us know that we should change scheduling behavior to better suit their use case? Could we land some experiments behind a flag that they could try? Since we've got someone from Unity engaged now, this is urgent (don't want to wait until it's too late like we did with Artillery)
2. If we think buffered input would help, could they do this themselves from JS (maybe with our help)? Eg. on mousemove record the event details, then do interpolation/extrapolation in raf and do the real input handling there (i.e. port our MotionEventBuffer to JS - https://code.google.com/p/chromium/codesearch#chromium/src/ui/events/gesture_detection/motion_event_buffer.h&q=MotionEventBuffer&sq=package:chromium&type=cs&l=20)
3. Can we provide a better version of this test which we believe uses more sound methodology? Eg. I see it's measuring the time between event timestamp and performance.now in the mouse event handler. We know from our work on touch latency that this sort of measurement can be quite misleading in regards to the user experience at frame-level granularity. Eg. 50ms reported latency isn't necessarily better than 60ms if both are hitting the same vsync. Perhaps the test should measure the time to the next rAF call instead to reflect real input-to-swap latency?
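
Point 2 above could look roughly like the following in page script. This is a minimal sketch under stated assumptions, not a port of Chromium's MotionEventBuffer: the class name MouseMoveBuffer and the function extrapolatePointer are hypothetical, and the extrapolation is a plain linear one from the last two samples.

```javascript
// Hypothetical buffer: the mousemove handler only records samples, and the
// frame callback consumes them, so input handling happens once per frame.
class MouseMoveBuffer {
  constructor() {
    this.samples = [];
  }

  // Call from the mousemove handler; cheap, no game logic here.
  record(event) {
    this.samples.push({
      x: event.clientX,
      y: event.clientY,
      timeStamp: event.timeStamp,
    });
  }

  // Call at the start of the rAF callback; returns every sample recorded
  // since the previous frame and empties the buffer.
  drain() {
    const drained = this.samples;
    this.samples = [];
    return drained;
  }
}

// Linearly extrapolate the pointer position to frameTime from the last two
// samples. Returns null when there are no samples at all.
function extrapolatePointer(samples, frameTime) {
  if (samples.length === 0) return null;
  const last = samples[samples.length - 1];
  if (samples.length === 1) return { x: last.x, y: last.y };
  const prev = samples[samples.length - 2];
  const dt = last.timeStamp - prev.timeStamp;
  if (dt <= 0) return { x: last.x, y: last.y };
  const k = (frameTime - last.timeStamp) / dt;
  return {
    x: last.x + (last.x - prev.x) * k,
    y: last.y + (last.y - prev.y) * k,
  };
}

// Usage with plain objects standing in for MouseEvent instances.
const demoBuffer = new MouseMoveBuffer();
demoBuffer.record({ clientX: 0, clientY: 0, timeStamp: 0 });
demoBuffer.record({ clientX: 10, clientY: 20, timeStamp: 10 });
const demoSamples = demoBuffer.drain();
const demoPosition = extrapolatePointer(demoSamples, 15);
console.log(demoPosition);
```

In a page, record would be attached with addEventListener('mousemove', ...) and drain would be called first thing inside the requestAnimationFrame callback, with the callback's timestamp passed as frameTime.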
,
Apr 8 2016
Currently there are not so many applications where users experience the issue. Most applications can provide 40-60 fps so the issue is not noticeable, while other applications are not critical to input delay. However, the number of Unity WebGL applications is constantly growing and it is growing fast, so we might expect growing number of reports regarding this issue in the future. Currently we can see the difference in rAF processing scheme across browsers, specifically between Chrome and Firefox. Originally the difference has been experienced subjectively by a few game developers, while the timestamps have only been used as an attempt to display the issue objectively. You can not trust the timestamps across browsers especially if those are coming from a low-resolution timer, so I will try to prepare a much more objective test, where I plan to poll the mouse position received by JavaScript using direct memory access from another process, and then compare it with real mouse cursor position. While controlling the system cursor position programmatically I think I might be able to generate enough data for objective comparison between at least Chrome and Firefox.
,
Apr 8 2016
One thing I'm not quite clear on yet is how the heavy rAF processing in the example relates to input. In other words, are the input events fed into the rAF work, or are the two unrelated? I'm guessing the rAF processing corresponds to rendering in Unity, in which case they are related and we should deliver input at the beginning of rAF to minimize latency. In the sample code mainLoop_userCode() isn't processing input, so I'm wondering if it's really the pattern we want to optimize for.
,
Apr 8 2016
> In the sample code mainLoop_userCode() isn't processing input, so I'm wondering if it's really the pattern we want to optimize for. The sample code is a simplified version. You may think of the process in the following way. Normally the mainLoop_userCode() will process the input and generate the data necessary for the next frame rendering based on this input. The engine will then use this generated user data in the immediately following mainLoop_render().
,
Apr 8 2016
Thanks for confirming. In that case I believe we will achieve the lowest latency through vsync aligned input. The reason why it looks like setTimeout() version runs faster is that it only measures how quickly the onmousemove handler is scheduled and ignores the time spent in mainLoop_userCode()[1]. I believe the actual user visible latency is identical in both versions. One more question: would you prefer that we ran heavy games at (for example) 45 fps (like we do now) or should we lock them to 30 fps (i.e., issue 485600)? I'm wondering if we can even make that choice without giving the app some control. [1] For example you could save timestamp of the most recent onmousemove event and move the delay computation to the start of mainLoop_userCode().
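
The measurement suggested in footnote [1] might be sketched as follows. The name createLatencyTracker is hypothetical, and the sketch assumes event.timeStamp and the value passed to measure come from the same high resolution clock.

```javascript
// Hypothetical tracker: remember the timestamp of the most recent mousemove
// and compute the delay when the frame's user code starts, rather than
// inside the mousemove handler.
function createLatencyTracker() {
  let latestTimeStamp = null;
  return {
    // Call from the mousemove handler.
    onMouseMove(event) {
      latestTimeStamp = event.timeStamp;
    },
    // Call at the start of mainLoop_userCode() with the current time.
    // Returns the age of the latest mousemove in ms, or null if no
    // mousemove arrived since the previous measurement.
    measure(now) {
      if (latestTimeStamp === null) return null;
      const delay = now - latestTimeStamp;
      latestTimeStamp = null;
      return delay;
    },
  };
}

// Usage with plain objects standing in for MouseEvent instances.
const demoTracker = createLatencyTracker();
const beforeAnyInput = demoTracker.measure(100);
demoTracker.onMouseMove({ timeStamp: 40 });
demoTracker.onMouseMove({ timeStamp: 70 });
const afterInput = demoTracker.measure(100);
console.log(beforeAnyInput, afterInput);
```

Measured this way, the number includes any time the event spent waiting behind mainLoop_userCode(), which is the part the original in-handler measurement leaves out.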
,
Apr 8 2016
> The reason why it looks like setTimeout() version runs faster is that it only measures how quickly the onmousemove handler is scheduled and ignores the time spent in mainLoop_userCode()[1]. I believe the actual user visible latency is identical in both versions. As has been noted before, you may completely ignore the numbers, as they can not be relied on. Those are not the measured or displayed numbers that make the latency higher. > For example you could save timestamp of the most recent onmousemove event and move the delay computation to the start of mainLoop_userCode(). The timestamp of the mousemove event can be higher than actual physical movement of the mouse, but it can not be lower, as the code can not predict the movement of the user hand. Therefore the minimal possible latency (performance.now() - e.timeStamp) is already significantly higher than 50 ms, which should not happen on 20 fps. > One more question: would you prefer that we ran heavy games at (for example) 45 fps (like we do now) or should we lock them to 30 fps. This question deserves additional careful consideration, however at the current moment I don't see any other significant issue with the current scheme, except some difference across browsers related to input. One might argue that the current scheme may cause non-smooth animations, but there have been no reports yet of this being a noticeable issue.
,
Apr 11 2016
Thanks for digging in here Sami. I just spent some time playing with https://output.jsbin.com/lixepa/quiet - it feels to me like the latency is identical whether or not setTimeout is used, thanks for pointing that out. Firefox does feel lower latency, but I agree that vsync aligned input should address this.
,
Apr 12 2016
Hello. This issue appears to be more complicated than it seemed in the beginning. I believe we need to collect additional data to understand what exactly is going on. My suggestion is to temporarily postpone the resolution of this case. We will perform a detailed analysis of the Chrome vs Firefox input latency, related specifically to the Unity Engine test cases. We will prepare a timing diagram for all 4 events: system cursor movement, JavaScript mousemove event handler, vsync, and rendered position on the canvas. An external process will be used to capture this data in high resolution, so we will be able to objectively compare the Chrome and Firefox charts and determine the cause of the difference, whether it is in fact the input latency or just a visual side-effect of the browser rendering scheme. We will try to prepare the charts by the end of next week.
,
Apr 12 2016
alexsuvorov@ - thanks so much for your help with this issue. We really want to make the web a successful gaming platform, and collaboration with a game engine like unity is extremely valuable. I look forward to your results! It would be interesting to compare on a device which delivers input at a fixed offset from vsync, such as a Nexus 5, but I suspect that the methods you're using to measure timing data won't work on Android. It doesn't look like any desktop OS's align input to vsync. We should perhaps do a bit of manual testing on a device like the Nexus 5 to see if that has an impact.
,
Apr 13 2016
Hi everyone, Was just made aware of this thread. I believe we are experiencing this same issue at Floored. Our app was not as responsive to input in our heavy scenes which run at low framerates, and we noticed it in Chrome not Firefox. This was at least partially why we were asking to drop to 30fps. We were seeing evidence that if we just didn't make any draw calls every other frame, fps was capped at 30 and input became more responsive. I agree with alexsuvorov@ that just the input fix may solve most of what we noticed. On the other hand we really are looking for smooth animations as well. Therefore I do advocate for giving us a real way to drop to 30 fps. I look forward to the further research and charts. If there's any way we can tailor our webgl performance stats to supplement yours, we'd be happy to provide.
,
Apr 22 2016
Hello. I have prepared the test charts for rAF scheme comparison between Firefox and Chrome. You can find them here: http://files.unity3d.com/alexsuvorov/webgl/raf/
It works the following way: a pixel sized plane follows the mouse cursor while it is being moved programmatically over the canvas. The JavaScript mouse position (obtained through the mousemove event) is read directly from the browser memory and compared with the real mouse position. The result is displayed over the real vsync chart (color intensity represents the current scanline number). Timing resolution is about 1.5 ms. Initially I also planned to add the rendered position of the plane on the canvas to the chart, but there does not seem to be an efficient way to capture the frontbuffer with such timing resolution (except of course hooking into the browser process and copying the data from the canvas backbuffer). Still, I might add this functionality later if considered necessary. For now it is assumed that the plane will be rendered at the nearest vsync displayed on the chart following the JavaScript main loop.
Note: the testing should be performed on native (or bootcamped) Windows, as a VM might significantly affect the rAF behavior.
Results:
1) Both in Chrome and Firefox, rAF correlates with vsync only if the user rendering loop fits into one monitor frame (see the 90 fps chart). Otherwise the rAF callback is launched immediately, so fps does not degrade by a whole factor (i.e. to 30 or 20).
2) The input latency issue only affects Chrome when the throttling fps is lower than 30 (see the chart with 35 fps)
3) Firefox never misses input events and processes all of them as soon as the JavaScript main loop exits.
4) Chrome always misses input events that appeared during the last JavaScript main loop if the throttling fps is lower than 30, therefore introducing additional input latency of one JavaScript frame. For example, when JavaScript fps degrades to 5, Chrome introduces an additional 200 ms of input latency (see the chart with 5 fps), unlike Firefox.
5) Launching the rAF callback in Chrome through setTimeout(0) forces all the input to be processed before the callback, as has been noted before.
Let me know if you need any assistance with the RAF analyzer.
A note about dropping fps to 30, discussed above. Currently I don't find this option particularly useful for our use case. However, under the assumption that rAF does not correlate with real vsync when JavaScript fps is lower than the monitor refresh rate, it would be quite useful to have an API for getting the current scanline status. This way an application would be able to predict the next real vsync and adjust the animation appropriately (or even implement its own real rAF). This method of course is not as precise as just dropping fps to 30 or 20, however it should provide significantly higher fps with relatively smooth animations. For example, the browser API could provide information about the vsync period and the time to the next vsync.
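
To illustrate how an application might use such information, here is a rough sketch. It assumes a purely hypothetical data shape { periodMs, timeToNextMs }; no existing browser API is implied, and the function names are made up for this example.

```javascript
// vsyncInfo is a HYPOTHETICAL shape: { periodMs, timeToNextMs }.
// It stands in for the data the proposed API would provide.

// Predict the timestamps of the next `count` vsync events.
function predictVsyncs(now, vsyncInfo, count) {
  const times = [];
  for (let i = 0; i < count; i++) {
    times.push(now + vsyncInfo.timeToNextMs + i * vsyncInfo.periodMs);
  }
  return times;
}

// Pick the first predicted vsync at or after the moment the frame is
// expected to be ready, so the animation can be advanced to that time.
function targetVsync(now, vsyncInfo, expectedWorkMs) {
  const readyAt = now + expectedWorkMs;
  const firstVsync = now + vsyncInfo.timeToNextMs;
  if (readyAt <= firstVsync) return firstVsync;
  const periods = Math.ceil((readyAt - firstVsync) / vsyncInfo.periodMs);
  return firstVsync + periods * vsyncInfo.periodMs;
}

// Round numbers chosen for readability, not a real display.
const demoInfo = { periodMs: 10, timeToNextMs: 4 };
const demoVsyncs = predictVsyncs(100, demoInfo, 3);
const demoTarget = targetVsync(100, demoInfo, 25);
console.log(demoVsyncs, demoTarget);
```

With a 10 ms period and 4 ms to the next vsync at time 100, a frame that needs 25 ms of work would target the vsync at 134 rather than the ones at 104, 114 or 124.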
,
Apr 27 2016
Thanks for this detailed analysis, and sorry for the delay here. Sami, do you think there's anything the scheduler can do to make any of these cases better? In particular, it seems like we should be able to do better with result #4, doesn't it?
,
Apr 27 2016
#4 is surprising to me too. Basically input events and rAF go into the same queue, so it's impossible for rAF to skip any input that has arrived since the previous frame. We might need to grab a trace of that.
,
Apr 30 2016
,
May 16 2016
I looked at #4 at 5 fps in more detail -- trace attached. The overall latency looks like this:
1. Mouse move event comes in.
2. Because a previous mouse event is still pending, the event sits in the browser until the previous ACK comes in (+200ms).
3. The mouse event is dispatched to the renderer. Because the renderer is now busy rendering the result of the previous mouse event, the new one is delayed until that is done (+200ms).
4. The mouse event is processed by JS, and a new frame is generated (+200ms).
Basically the expensive rendering has made the input pipeline 3 frames deep, which results in the hilariously bad overall latency. I *think* this is another case where vsync aligned and buffered input will make things saner. With it we should never start rendering a new frame before all input for it has been flushed.
,
May 16 2016
Firefox doesn't do buffered input though, right? Why does it behave reasonably? The current plan for vsync alignment is to align events as they come from the OS. I'm not clear on how that would help this case. Can you outline what part of steps #1-4 would be improved with buffered input? With the current plan, the only thing that would change is the timing of when the mouse move event arrives in #1.
,
May 16 2016
I suspect Firefox doesn't have as deeply nested an input pipeline as we do. The main improvement I believe is an ACK from a previous mouse event doesn't immediately trigger the sending of a new mouse event. Without this there is always one input event stuck in the "queue". With buffered input the end-to-end latency should always be 200-400ms (depending on where in the frame the event comes in from the system.)
,
May 16 2016
"The main improvement I believe is an ACK from a previous mouse event doesn't immediately trigger the sending of a new mouse event. " This isn't the case with the current plan for buffered input. We currently plan on buffering upstream of this, because we want to buffer input for the synchronous input case (input on ui::Views etc). See the discussion on this here: https://docs.google.com/document/d/10EOeLnXvfMdSjjQhTtZxB9ithrus4vzREvZPGAWjPn0/edit?usp=sharing You can simulate the fast behavior by adding MouseMove to the list here: https://code.google.com/p/chromium/codesearch#chromium/src/content/common/input/web_input_event_traits.cc&l=491 This turns off MouseMove coalescing, which results in us processing every single mouse event on the main thread. Here's a probably bad idea: what if when we went to send an event to the main thread, we - check if the main thread is blocked - if it isn't, send the event - if it is, wait - when the main thread is unblocked, it pings us to let us know to send the most recent event - wait for that event before we render the current frame
,
May 16 2016
I think there's probably a fairly simple tweak to event coalescing that we can do to fix this. E.g., what if we sent two events before we started coalescing events? Or even if we sent events with exponential falloff (we send events #1, 2, 4, 8, 16 etc). That would bound the number of events that the main thread needs to chew through when it wakes up, but would also keep the main thread's notion of where the pointer is reasonably up to date. I'll do some testing on other browsers to see how they compare, and draft up a doc. I'll take ownership here for now.
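
The exponential falloff idea can be sketched as a filter over the events that arrive during one long task. This is only an illustration of the selection rule described above, with hypothetical function names; it is not Chromium code, and whether the most recent event should always be forwarded as well is left open here.

```javascript
// True when n (a 1-based event number) is a power of two: 1, 2, 4, 8, 16...
function isPowerOfTwo(n) {
  return n > 0 && (n & (n - 1)) === 0;
}

// Keep only events #1, #2, #4, #8, ... of those that arrived during one
// long task; the remaining events would be coalesced as before.
function selectWithExponentialFalloff(events) {
  return events.filter((_, index) => isPowerOfTwo(index + 1));
}

// Twenty events numbered 1..20 arrive while the main thread is busy.
const busyEvents = Array.from({ length: 20 }, (_, i) => ({ number: i + 1 }));
const forwarded = selectWithExponentialFalloff(busyEvents).map((e) => e.number);
console.log(forwarded);
```

The number of forwarded events grows only logarithmically with the number received, which is what bounds the work the main thread has to chew through when it wakes up.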
,
May 17 2016
Thanks Tim. Thinking about this more, changing ACK behavior probably isn't necessary to fix the problem. What matters more is that we never try sending more than one new input per frame. Since we currently do, the following happens:
browser [input #2]--------->[ack]
/ \
/ \
main [rAAAAAF #0] [input #1] [r\AAAAAF #1] [input #2] [rAAAAAF #2] ...
\ /
\ /
cc [input #2]
In other words, every input event is penalized by ~two rAFs (coalescing avoids the third rAF).
With buffered input I think we'd get:
browser [input #2]-. [ack]
\ /
\ /
main [rAAAAAF #1] \ [input #2] [rAAAAAF #2]
\ /
\/
cc [input #2]
In other words there shouldn't be a way for the rAF to land in the middle of an ack for an older input and the new input event. I'm not sure if additional coalescing/falloff is needed in this case -- WDYT?
(Caveat emptor: I haven't actually tested this :)
,
May 17 2016
"there shouldn't be a way for the rAF to land in the middle of an ack for an older input and the new input event." That property would definitely fix the issue, but I'm not sure how input buffering would give us this property. Input buffering as we've currently described it only aligns input to the hardware vsync. It won't be affected by long tasks in any way, or have any impact on when rAF runs. The issue as I understand it is: 1. A stream of input arrives during a long frame. 2. The first event is dispatched to main. 3. The long frame finishes. 4. We handle the event, and dispatch an ack for it. 5. Before the event triggered by the ack arrives to main, we start the next frame, without knowledge of any input since the first event. You're claiming that #5 won't occur in the buffered input case, correct? Why not?
,
May 17 2016
Hmm, I thought the difference was that the event triggered by the ack would benefit from input coalescing until it is actually sent out at the next vsync. I'm probably imagining things though. Maybe that's how we should make it work though?
,
May 17 2016
We need to align with vsync really high in the stack in order to have consumers other than web contents benefit from vsync alignment. The current plan is to switch to behaving how Android behaves - we just align input to hardware vsync as soon as we get the input. We could potentially _additionally_ align with vsync after receiving the ack for the previous event, but I suspect we'd be introducing a lot of latency at that point. I went through the various options for this here: https://docs.google.com/document/d/10EOeLnXvfMdSjjQhTtZxB9ithrus4vzREvZPGAWjPn0/edit?usp=sharing
,
May 17 2016
Perhaps I don't understand the complexities here but it seems to me that *all* of the events received to date should be sent, as a list, to the renderer for dispatch at the beginning of the frame. Input coalescing doesn't address the basic problem that each event dispatched to the renderer requires an acknowledgment. Is the problem that the renderer may need to veto some of the events, causing them to be bubbled higher up the window hierarchy? Is that a common case? If not, then the common case should be optimized for. Perhaps a prototype should be done to see if this technique could address the latency issue, before considering more complex scheduling changes that may be hard to qualify.
,
May 17 2016
#43: Good point about needing to do this higher up. Still, I'm wondering if we'd also need secondary throttling/batching to deal with content like this which can't match the BeginFrame rate. #44: Tim knows for sure, but I think in this case the ack isn't really blocking anything else than processing of further input, and that's only because we're only allowing one outstanding mousemove event in the renderer. I think Tim was suggesting lifting this limit with the exponential backoff idea.
,
May 17 2016
Yeah, the ack isn't blocking anything other than further processing of input. If we don't wait for the ack, we get rid of this issue, but force the main thread to chew through an unbounded number of events once it's no longer blocked. I think we could fix this in all reasonable cases by sending a bounded number of events during long tasks.
Re #44:
> *all* of the events received to date should be sent, as a list, to the renderer
We don't need all of them. For mousemoves for instance, we only care about the most recent event.
> for dispatch at the beginning of the frame.
Which frame are we referring to here? We'll start rAF in the middle of a beginFrame in some cases, won't we? I think there are cases where this will introduce a bunch of latency, such as in the case where there's a long timer task with a bunch of input happening, immediately followed by rAF.
> Input coalescing doesn't address the basic problem that each event dispatched to the renderer requires an acknowledgment.
I'm proposing to perform less coalescing, not more.
> Is the problem that the renderer may need to veto some of the events, causing them to be bubbled higher up the window hierarchy? Is that a common case? If not, then the common case should be optimized for.
Mousemove events in particular are never veto'd.
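
For readers unfamiliar with coalescing, the "only the most recent mousemove matters" rule can be illustrated as below. This is a simplified sketch with a hypothetical function name and plain objects; it does not reflect how Chromium's input pipeline actually implements coalescing.

```javascript
// Collapse each run of consecutive mousemove events into the newest one.
// Other event types are kept in order and break the run.
function coalesceMouseMoves(queue) {
  const result = [];
  for (const event of queue) {
    const last = result[result.length - 1];
    if (last && last.type === 'mousemove' && event.type === 'mousemove') {
      result[result.length - 1] = event; // newer move supersedes the older one
    } else {
      result.push(event);
    }
  }
  return result;
}

const pendingQueue = [
  { type: 'mousemove', timeStamp: 1 },
  { type: 'mousemove', timeStamp: 2 },
  { type: 'mousedown', timeStamp: 3 },
  { type: 'mousemove', timeStamp: 4 },
  { type: 'mousemove', timeStamp: 5 },
];
const coalescedQueue = coalesceMouseMoves(pendingQueue);
console.log(coalescedQueue.map((e) => e.type + '@' + e.timeStamp));
```

Here five queued events shrink to three: the move at 2, the mousedown at 3, and the move at 5.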
,
May 19 2016
Dave is going to prototype a fix for this for touch events. Here's a version of alexsuvorov's test, but with touch instead of mouse https://jsbin.com/busibut/quiet. When the main thread receives an event, it will ask the MainThreadEventQueue (which lives on the renderer compositor thread) to coalesce the queue's next event into the event it just received.
,
May 20 2016
We can't actually do what's proposed in #47. There are two criteria here we haven't discussed at this point:
1. Developers need to continue to be able to measure event queueing time via high resolution timestamps.
2. Developers need to have correct position/timestamp associations, to enable things like computing finger velocity.
If, after a long task with lots of input, we only dispatch the event with the most recent timestamp, the developer will have no way of measuring the event queueing time. We could fix that by using the timestamp from the older event, but that would break velocity calculation. I think we can safely fix this by just dispatching the original event in addition to the most recent event. This could cause problems on pages with heavy event handlers though. Perhaps we should launch this via Finch to see if there's any measurable performance impact.
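
A small sketch of why dispatching both the original and the most recent event satisfies the two criteria. All names are hypothetical and the events are plain objects carrying x, y and timeStamp; this only models the idea and is not the prototype under discussion.

```javascript
// After a long task, dispatch the oldest pending event and the newest one.
function eventsToDispatch(pending) {
  if (pending.length <= 2) return pending.slice();
  return [pending[0], pending[pending.length - 1]];
}

// Criterion 1: queueing time is measurable from the oldest event, whose
// timestamp is left untouched.
function queueingTime(now, oldestEvent) {
  return now - oldestEvent.timeStamp;
}

// Criterion 2: each event keeps its own position/timestamp pair, so a
// velocity (px per ms) computed between two events stays meaningful.
function velocityBetween(a, b) {
  const dt = b.timeStamp - a.timeStamp;
  if (dt <= 0) return { vx: 0, vy: 0 };
  return { vx: (b.x - a.x) / dt, vy: (b.y - a.y) / dt };
}

// Four touch moves queued up during a long task that ended at t = 250.
const queuedMoves = [
  { x: 0, y: 0, timeStamp: 0 },
  { x: 10, y: 20, timeStamp: 10 },
  { x: 20, y: 40, timeStamp: 20 },
  { x: 30, y: 60, timeStamp: 30 },
];
const [oldestMove, newestMove] = eventsToDispatch(queuedMoves);
const waited = queueingTime(250, oldestMove);
const speed = velocityBetween(oldestMove, newestMove);
console.log(waited, speed);
```

Reusing the older timestamp on the newest position would instead pair x = 30 with timeStamp 0, which is the broken association the comment warns about.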
,
May 20 2016
Good point. My anecdotal experience is that heavy input handlers are less common than other random heavy tasks that get in their way. Do you know if this extra latency is measurable by our UMA? (It should be, right?)
,
May 20 2016
If it has a big impact, we'd see it in our touch scrolling metrics. Rick pointed out that a historical events API would eliminate the need for sending both events. I've filed a bug for the historical events API here ( crbug.com/613540 ). We've talked about this API for a while, but haven't found enough justification for starting the standardization process. I'll spend some time looking for additional use cases. I don't think we should block on this, but it is a path towards eliminating the extra event dispatch.
,
May 24 2016
What's the status of vsync aligned input?
,
May 24 2016
Vsync aligned input turns out to be two completely independent efforts: one to align gesture events being passed to the compositor with vsync, and the other to align input with vsync as soon as we receive it from the OS. I've got a plan in place for both, some of which is captured in this doc: https://docs.google.com/document/d/1Y9wTCWDhaLTmlajiOjp7rK9P9RXdJRuK4btUDzfFo_0 (chromium.org). That doc is a bit out of date though. I'm hoping to devote some time to this next week, to at least flesh out the implementation plan, and how touch and gesture alignment interact.
,
May 25 2016
I've uploaded proof-of-concept code which removes the PostTask queuing and replaces it with an object that is locked between the two threads (compositor and main thread). This prevents a history of events from building up in the PostTask queue. I've also made mouse move events non-blocking because of this coalescing. Don't stare at the code too much. There are 3 queues that I could collapse to one, but for the quickness of this change it was easiest to duplicate. https://codereview.chromium.org/2007413002 I see that the original URI now behaves very similarly between rAF and setTimeout.
,
Jun 8 2016
To test the prototype, download and run https://drive.google.com/file/d/0B7mjRvOU-oG-VFhNd000bDlsRTA/view?usp=sharing. The mini_installer is silent, and just slaps a "chromium" icon on your desktop. Make sure to close any running "chromium" instance (Chrome is fine).
,
Jun 14 2016
I have tested it on the whole fps range and it works well now. Thank you very much for the fix.
,
Jun 14 2016
Awesome, thanks for testing. Dave, let's clean this up and land it.
,
Jul 26 2016
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/cb9341c9f5539577a82aae52e19bd789148001bc
commit cb9341c9f5539577a82aae52e19bd789148001bc
Author: dtapuska <dtapuska@chromium.org>
Date: Tue Jul 26 16:27:53 2016
Generalize the main thread event queue into a common event queue.
Teach the WebInputEventQueue about ScopedWebInputEvents so we can use a single queue instead of one per event class. This reduces code duplication and allows for another patch to then take advantage of this for all events.
BUG=624012, 599152
Review-Url: https://codereview.chromium.org/2170913002
Cr-Commit-Position: refs/heads/master@{#407826}
[modify] https://crrev.com/cb9341c9f5539577a82aae52e19bd789148001bc/content/common/input/event_with_latency_info.cc
[modify] https://crrev.com/cb9341c9f5539577a82aae52e19bd789148001bc/content/common/input/event_with_latency_info.h
[modify] https://crrev.com/cb9341c9f5539577a82aae52e19bd789148001bc/content/common/input/web_input_event_queue.h
[modify] https://crrev.com/cb9341c9f5539577a82aae52e19bd789148001bc/content/renderer/input/main_thread_event_queue.cc
[modify] https://crrev.com/cb9341c9f5539577a82aae52e19bd789148001bc/content/renderer/input/main_thread_event_queue.h
[modify] https://crrev.com/cb9341c9f5539577a82aae52e19bd789148001bc/content/renderer/input/main_thread_event_queue_unittest.cc
[modify] https://crrev.com/cb9341c9f5539577a82aae52e19bd789148001bc/third_party/WebKit/public/platform/WebInputEvent.h
,
Aug 10 2016
alexsuvorov@ Can you give Chrome Canary a run and see whether it works as you expect?
,
Aug 10 2016
bugdroid1 doesn't seem to be updating. Nonetheless, the fix landed 2 days ago: https://codereview.chromium.org/2162143002/
,
Aug 18 2016
Tested on 54.0.2831.0 canary (Windows, 32-bit). Works as expected. Thank you very much for the fix.
Comment 1 by alexsuvo...@unity3d.com
, Mar 30 2016