New issue
Advanced search Search tips

Issue 904371 link

Starred by 1 user

Issue metadata

Status: Verified
Owner:
Closed: Nov 29
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Linux
Pri: 1
Type: Bug



Sign in to add a comment

Ill in HandleFailure<int>

Project Member Reported by ClusterFuzz, Nov 12

Issue description

Detailed report: https://clusterfuzz.com/testcase?key=5257390968274944

Fuzzer: mbarbella_js_mutation_layout
Job Type: linux_asan_content_shell_drt
Platform Id: linux

Crash Type: Ill
Crash Address: 0x564ddd87d976
Crash State:
  HandleFailure<int>
  ValueOrDie<int,
  gfx::Size::GetArea
  
Sanitizer: address (ASAN)

Regressed: https://clusterfuzz.com/revisions?job=linux_asan_content_shell_drt&range=606066:606067

Reproducer Testcase: https://clusterfuzz.com/download?testcase_id=5257390968274944

Issue filed automatically.

See https://github.com/google/clusterfuzz-tools for more information.
 
Project Member

Comment 1 by ClusterFuzz, Nov 12

Components: Internals UI>GFX
Labels: Test-Predator-Auto-Components
Automatically applying components based on crash stacktrace and information from OWNERS files.

If this is incorrect, please apply the Test-Predator-Wrong-Components label.
Project Member

Comment 2 by ClusterFuzz, Nov 12

Labels: Test-Predator-Auto-Owner
Owner: masonfreed@chromium.org
Status: Assigned (was: Untriaged)
Automatically assigning owner based on suspected regression changelist https://chromium.googlesource.com/chromium/src/+/31de4339dbe4351d16687a5b127b9cddb449b8d4 (Limit backing scale factor to 100X).

If this is incorrect, please let us know why and apply the Test-Predator-Wrong-CLs label. If you aren't the correct owner for this issue, please unassign yourself as soon as possible so it can be re-triaged.
Cc: danakj@chromium.org enne@chromium.org
I have verified that my CL did not cause this issue. I believe this is another case, just like  https://crbug.com/899482 , where an absurdly high scale factor causes issues. For that bug, I added the code in the referenced CL, to limit the scale factor to "something reasonable" which was 100X. With this clusterfuzz issue, I verified that if I further reduce that limit to 10x, this crash stops, at least on my machine. But 11x (or higher) will crash.

@danakj, any input? You helped me come up with the 100X original limit. I feel uncomfortable with a 10x limit - that seems very low. Open to suggestions here.
11 is not very large. Is there another big scale being multiplied against the 11 that we need to clamp?
I agree - seems too small. The clusterfuzz test case is really simple - it basically just sets the scale factor and does nothing else that I can see.

<script>
function __f_0() {
    if (!window.testRunner || !window.sessionStorage)
        return;
    window.targetScaleFactor = 1073741823;
    testRunner.setBackingScaleFactor(targetScaleFactor, __f_1);
}
function __f_1() {
}
window.addEventListener("load", __f_0);
</script>
What's the size of the rect it is scaling?
It looks like 800x800, multiplied by the (limited) scale factor of 100x. So 
in gl_renderer.cc(2824), current_framebuffer_texture_->size() is 80000x80000. Which causes this: [30895:30895:1115/123320.920196:ERROR:gles2_cmd_decoder.cc(17416)] [.BrowserCompositor-0x20548de2da20]GL ERROR :GL_INVALID_VALUE : glTexStorage2D: dimensions out of range


Oh ok. Wow so 800x800 clearly reasonable, so I guess 100 device scale factor isn't going to work with our current integer sizes very well :) Current scale factors are between 1 and 3 afaik. 10 is a long way away from that.. I'd say pick 10 but I don't want to come back to this next week where clusterfuzz figured out it can use 8000x8000 and still crash. So how about 5?
More thoughts. I didn't do the math though and I'm not sure exactly what are the bounds of a thing that can be copied like this would be.. Blink limits the size of a layer but a RenderPass can be N layers so possibly arbitrary size. So maybe clusterfuzz will continue to find sizes that are too big. In which case we should just fail the readback cuz it's too big for GL, instead of crashing?
Ok, I'm perfectly comfortable (code-wise) with changing the limit to 5x. If I were to instead fail the readback if the texture is too large, where (in code) would be the best place to do that?
It's crashing here: https://chromium.googlesource.com/chromium/src/+/45a8716f391142751234187fdf367376ae9164a5/components/viz/service/display/gl_renderer_copier.cc#403

If we early out of that constructor we need to signal someway to the caller that the readback is "done" which it does by watching the query() via SignalQuery() and a callback. https://chromium.googlesource.com/chromium/src/+/45a8716f391142751234187fdf367376ae9164a5/components/viz/service/display/gl_renderer_copier.cc#466

That's not really amenable to modifying the behaviour of, so we should skip that whole ordeal instead.

So we could check the size from GLRendererCopier::StartReadbackFromFramebuffer and if it's too large, we can skip directly to what ReadPixelsWorkflow::Finish would do. It's doing 

copy_request_->SendResult(std::move(result));

https://chromium.googlesource.com/chromium/src/+/45a8716f391142751234187fdf367376ae9164a5/components/viz/service/display/gl_renderer_copier.cc#440

But we won't have a GLPixelBufferRGBAResult, we want to send an empty result, as per the docs here: https://cs.chromium.org/chromium/src/components/viz/common/frame_sinks/copy_output_request.h?rcl=a908129ab31d8bc04c0c990fe184c442e3a0b80b&l=31

Destroying the CopyOutputRequest without sending a result will send an empty one by default: https://cs.chromium.org/chromium/src/components/viz/common/frame_sinks/copy_output_request.cc?rcl=d31eecb71020893b067a2b48a9efd45d91f2b2cb&l=28

So in StartReadbackFromFramebuffer:

https://chromium.googlesource.com/chromium/src/+/45a8716f391142751234187fdf367376ae9164a5/components/viz/service/display/gl_renderer_copier.cc#462

If it destroys the CopyOutputRequest it would in fact send the empty result.
Wow, I was expecting a quick pointer to somewhere in the code. Thanks for the detailed howto, danakj@! I appreciate it.
No problem, if you can pay me back with a unit test I'll be super grateful :)
Haha ok I'll see what I can do. In the meantime, this crash happens much earlier on my local system, and doesn't even get to StartReadbackFromFrameBuffer. So I'm working through that first. But thanks!
Project Member

Comment 15 by ClusterFuzz, Nov 29

ClusterFuzz has detected this issue as fixed in range 611887:611894.

Detailed report: https://clusterfuzz.com/testcase?key=5257390968274944

Fuzzer: mbarbella_js_mutation_layout
Job Type: linux_asan_content_shell_drt
Platform Id: linux

Crash Type: Ill
Crash Address: 0x564ddd87d976
Crash State:
  HandleFailure<int>
  ValueOrDie<int,
  gfx::Size::GetArea
  
Sanitizer: address (ASAN)

Regressed: https://clusterfuzz.com/revisions?job=linux_asan_content_shell_drt&range=606066:606067
Fixed: https://clusterfuzz.com/revisions?job=linux_asan_content_shell_drt&range=611887:611894

Reproducer Testcase: https://clusterfuzz.com/download?testcase_id=5257390968274944

See https://github.com/google/clusterfuzz-tools for more information.

If you suspect that the result above is incorrect, try re-doing that job on the test case report page.
Project Member

Comment 16 by ClusterFuzz, Nov 29

Labels: ClusterFuzz-Verified
Status: Verified (was: Assigned)
ClusterFuzz testcase 5257390968274944 is verified as fixed, so closing issue as verified.

If this is incorrect, please add ClusterFuzz-Wrong label and re-open the issue.

Sign in to add a comment