CHECK failure: VerifyRoundup(n, mul) in math_util.h |
||||||
Issue descriptionDetailed report: https://clusterfuzz.com/testcase?key=6308769984937984 Fuzzer: inferno_twister Job Type: linux_ubsan_vptr_content_shell_drt Platform Id: linux Crash Type: CHECK failure Crash Address: Crash State: VerifyRoundup(n, mul) in math_util.h viz::DirectRenderer::CalculateTextureSizeForRenderPass viz::DirectRenderer::DecideRenderPassAllocationsForFrame Sanitizer: undefined (UBSAN) Regressed: https://clusterfuzz.com/revisions?job=linux_ubsan_vptr_content_shell_drt&range=603126:603127 Reproducer Testcase: https://clusterfuzz.com/download?testcase_id=6308769984937984 Issue filed automatically. See https://github.com/google/clusterfuzz-tools for more information.
,
Oct 27
Automatically assigning owner based on suspected regression changelist https://chromium.googlesource.com/chromium/src/+/c41b2d08296c7d720343f716641e40c9c2f68ad0 (Enable display compositor pixel dumps by default.). If this is incorrect, please let us know why and apply the Test-Predator-Wrong-CLs label. If you aren't the correct owner for this issue, please unassign yourself as soon as possible so it can be re-triaged.
,
Oct 29
danakj@, I'm hoping you can help me triage this and assign it to someone who might know how to fix it? It seems like it was triggered by my CL, but I haven't seen this code in DirectRenderer::CalculateTextureSizeForRenderPass() before. Looks like you wrote most of that function, recently, so I'm hoping you can help. https://cs.chromium.org/chromium/src/components/viz/service/display/direct_renderer.cc?rcl=e8e9e7cc77eff2e5cbc8f76ce0ed329c1ffdef48&l=734
,
Oct 30
It crashes on overflow, so this would imply there is a RenderPass with a backing so large that rounding up to the next multiple of 64 overflows. This isn't a normal state, so I'd look into why that's happening.
,
Oct 30
Ok, I'm thinking (hoping!) this is related to a bunch of other issues stemming from the use of a dangling frame pointer. I have a CL going for that, and I'll re-test this when it lands. Thanks for the help, danakj@. Do you think this could be de-prioritized to pri-2, given that it is a CHECK failure and is likely (again, hopefully) coming from layout test code?
,
Oct 30
I'm not sure what makes these layout tests low priority. I generally put P1 on any test problems, they are the core of what protects our users.
,
Oct 30
Agreed that the tests are important, but maybe protecting the tests from super-crazy clusterfuzz inputs isn't? I'm probably missing something here. Based on your comment, I'll definitely leave it pri-1. I guess I was just worried about blocking a release based on a layout test failure.
,
Oct 30
IMO test failures are probably one of the best reasons to block a release. :) If the clusterfuzz failure doesn't occur before your CL then that points both at a bug that would hit users (the web is approximately as good at finding bugs as clusterfuzz), and missing test coverage.
,
Oct 30
Haha ok, thanks. I guess my point was that the CL I'm working on to fix this only modifies code in the layout test runner. I.e. none of it gets built into Chrome, so it's hard for web users to exercise it at all. Unless they're building layout tests themselves, and from my experience so far, I really hope not many end users have to do that. The CL that caused it is this one: https://chromium-review.googlesource.com/c/chromium/src/+/1213864 I'm actually less sure this is related, so I'll definitely keep it p1 and keep debugging. Thanks for the opinions - I'm still feeling my way around this stuff.
,
Oct 30
Thanks, in this case releasing the test changes doesn't seem problematic, though other Chrome devs may run into the bug and cause them grief - it's like a soft version of a red tree. But the bug being seen may be a production bug that we should be worried about and was present already but missed, so in general I do keep test-only problems as P1. Thanks for looking at this :)
,
Oct 30
Ok, the input that causes this CHECK failure reduces to this:
<script>
testRunner.setBackingScaleFactor(3632535047,function(){});
</script>
It is trying to set the scale factor to a huge number, which (correctly, I think) causes direct_renderer.cc to try to build a huge texture, which then fails on the CheckedRoundUp function.
I now really think this isn't a bug for us, as:
a) it is caused by a call to testRunner which is doing the right thing, just
out of the bounds of reality.
b) strictly only applies to tests, as normal Chrome doesn't get direct access to the scale factor (double-check me on this...?).
About the only thing I could see doing is putting another specific check somewhere on the scale factor, but I don't know what limit we would put on it a-priori. I think the existing checks (like CheckedRoundUp) are already doing the right thing at each point in the pipeline, so I'm hesitant to add more checks, which would be somewhat arbitrary, for this.
I am going to mark this closed. Please let me know if you disagree, and I'll re-open it.
,
Oct 30
You could set a reasonable upper bound and clamp to that in testRunner.setBackingScaleFactor(). That's an approach we've taken elsewhere.
,
Oct 30
What bound would you use? I reduced that number significantly and could still get crashes. Below the point where the scale factor times the texture size exceeded 2^31, the crashes moved to an out of memory error somewhere else in the code. They continued down to something quite small (relative to this test), like 500 or less. Which is still large for a scale factor. But don't we use scale factor for browser zoom, so it could actually get to be 20-50 if zoomed way in? Generally/historically, I really hate hard-coding numbers like that. All it means is you'll hit them years into the future when memory is larger, and wonder why you put such a small limit in place. Again, though, that's a past life. And we're talking about the testRunner harness, so it won't impact users. You tell me what number to limit the scale factor to, and I'll land the CL. :-)
,
Nov 5
500 would be fine. I think 100 would be fine too if you like that. Thanks!
,
Nov 5
Ok, thanks! I'll add that as a CHECK in testRunner.setBackingScaleFactor(). Thanks danakj@.
,
Nov 5
CHECK will still crash fuzzers. I would just suggest clamping to 100 instead. Thanks :)
,
Nov 5
Ahh ok. So scalefactor=min(scalefactor,100.0). Will do, thanks.
,
Nov 6
Re-opening to see if this is fixed by limiting the backing scale factor.
,
Nov 7
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/31de4339dbe4351d16687a5b127b9cddb449b8d4 commit 31de4339dbe4351d16687a5b127b9cddb449b8d4 Author: Mason Freed <masonfreed@chromium.org> Date: Wed Nov 07 16:56:01 2018 Limit backing scale factor to 100X Previous to this CL, any value could be passed to this function, which caused problems if very large values were used. See https://crbug.com/899482 for an example. Per the discussion there, 100X seems like a reasonable limit. Bug: 899482 Change-Id: I0ed9b68510bf9462c76dcf85e6954df416c0df2f Reviewed-on: https://chromium-review.googlesource.com/c/1320015 Reviewed-by: danakj <danakj@chromium.org> Reviewed-by: Mike Pinkerton <pinkerton@chromium.org> Commit-Queue: Mason Freed <masonfreed@chromium.org> Cr-Commit-Position: refs/heads/master@{#606067} [modify] https://crrev.com/31de4339dbe4351d16687a5b127b9cddb449b8d4/content/shell/test_runner/test_runner.cc
,
Nov 7
ClusterFuzz has detected this issue as fixed in range 606066:606067. Detailed report: https://clusterfuzz.com/testcase?key=6308769984937984 Fuzzer: inferno_twister Job Type: linux_ubsan_vptr_content_shell_drt Platform Id: linux Crash Type: CHECK failure Crash Address: Crash State: VerifyRoundup(n, mul) in math_util.h viz::DirectRenderer::CalculateTextureSizeForRenderPass viz::DirectRenderer::DecideRenderPassAllocationsForFrame Sanitizer: undefined (UBSAN) Regressed: https://clusterfuzz.com/revisions?job=linux_ubsan_vptr_content_shell_drt&range=603126:603127 Fixed: https://clusterfuzz.com/revisions?job=linux_ubsan_vptr_content_shell_drt&range=606066:606067 Reproducer Testcase: https://clusterfuzz.com/download?testcase_id=6308769984937984 See https://github.com/google/clusterfuzz-tools for more information. If you suspect that the result above is incorrect, try re-doing that job on the test case report page.
,
Nov 7
Cool, fixed. |
||||||
►
Sign in to add a comment |
||||||
Comment 1 by ClusterFuzz
, Oct 27Labels: Test-Predator-Auto-Components