Stack-overflow in blink::LayoutObject::MapLocalToAncestor |
||||||
Issue descriptionDetailed report: https://clusterfuzz.com/testcase?key=6711627681103872 Fuzzer: ochang_domfuzzer Job Type: linux_asan_content_shell_drt Platform Id: linux Crash Type: Stack-overflow Crash Address: 0x7ffd44abfcc0 Crash State: blink::LayoutObject::MapLocalToAncestor Sanitizer: address (ASAN) Reproducer Testcase: https://clusterfuzz.com/download?testcase_id=6711627681103872 Issue filed automatically. See https://github.com/google/clusterfuzz-tools for more information.
,
Aug 30
Predator and CL could not provide any possible suspects. Using Code Search for the file, "layout_object.cc" suspecting the below Cl might have caused this issue Suspect CL: https://chromium.googlesource.com/chromium/src/+/3fe4904a00ba85fe97d3b64dee683803e8d8049b masonfreed@ -- Could you please check whether this is caused with respect to your change, if not please help us in assigning it to the right owner. Thanks!
,
Aug 31
Hello, I tried to run a bisect on this stack overflow, and I don't think it was able to find one. My hypothesis is that this potential crash has been here for a long time, and was only recently found. I'm 99% sure it is not due to my changes, as those a) did not add anything to the stack, and b) should have all produced identical compiled results. I'm new to this process, not sure what to do at this point, sorry!
,
Sep 1
Are you able to reproduce? Since it's a stack overflow, how deep is the layout tree in this testcase? Another thing to do is to look at the sizes of the stack frames during recursive calls. Sometimes compiler optimizations cause local variables declared in helper methods to be inlined, which can greatly increase stack frame sizes. You can deduce the stack size by e.g. subtracting the pointer address for the same variable in successive frames. We've encountered the above types of situations multiple times in the past, so I wouldn't be surprised if there is something here. Although I agree that your CL did not cause this stack overflow.
,
Sep 1
Hmm no, I didn’t directly try to reproduce it. I will. The layout tree for this failure is 1000 deep, and I don’t have a sense for whether that’s a lot. Doesn’t sound too deep. Thanks for the pointers about checking out the stack size - I’ll give that a try and see if it changes before and after my changes. I suppose it’s possible that converting a pointer to a reference could trigger a different compiler output, which might bloat the stack. One question: what’s our threshold for “ok” here? Surely I can always cause exactly this crash by just going 10,000 layers deep, or 100k, or 1M.
,
Sep 5
There is no precise threshold for "ok". I view bugs like this as a hint that we might have an inefficient use of stack memory, which could lead to problems on real sites.
,
Sep 6
Ok, I debugged this further. See below for the details, but my TL;DR is that we are, and have been for some time, quite close to the 1000 stack frame overflow limit. Neither my change (in 3fe4904a00), nor any others that have happened in the last 5 months, have appreciably changed the susceptibility to this crash. I'm therefore inclined to close this bug (wontfix). Please re-open if you disagree.
Details:
To debug, I checked out revision 81eda3bd78bf (chromium commit 587001 from Aug 29, 2018) and revision 272cb4d86b5ef (chromium commit 550200 from Apr 12, 2018), and instrumented both with stack frame size diagnostics within the LayoutObject::MapLocalToAncestor() function. I then ran both with the fuzzer input that caused the crash, and neither actually crash on my machine with the fuzzer's 1000 level deep tree. I had to modify the input to go larger than that to actually see a crash on my machine. For the "current" code, that overflow happens around 1079 stack frames. For the "old" code, it happens around 1017 frames. Which is interesting since the stack frame size seems to have grown slightly, from "old" 1856 to "new" 1888.
I don't know if the ClusterFuzz tester has some built-in limit of 1000-deep trees? Or have we just gotten lucky that it never tried any more than that here.
Using revision 81eda3bd78bf (chromium commit 587001 from Aug 29, 2018)
On a release build, stack frame size is 1888 bytes.
On my machine, I get a stack overflow at 1078-1080 stack frames.
Stack frame start: 0x7ffdb8043c70 (1888 used, 1069 stack depth)
Stack frame start: 0x7ffdb8043510 (1888 used, 1070 stack depth)
Stack frame start: 0x7ffdb8042db0 (1888 used, 1071 stack depth)
Stack frame start: 0x7ffdb8042650 (1888 used, 1072 stack depth)
Stack frame start: 0x7ffdb8041ef0 (1888 used, 1073 stack depth)
Stack frame start: 0x7ffdb8041790 (1888 used, 1074 stack depth)
Stack frame start: 0x7ffdb8041030 (1888 used, 1075 stack depth)
Stack frame start: 0x7ffdb80408d0 (1888 used, 1076 stack depth)
Stack frame start: 0x7ffdb8040170 (1888 used, 1077 stack depth)
Stack frame start: 0x7ffdb803fa10 (1888 used, 1078 stack depth)
Stack frame start: AddressSanitizer:DEADLYSIGNAL
=================================================================
==1==ERROR: AddressSanitizer: stack-overflow on address 0x7ffdb803d9c8 (pc 0x000003591f01 bp 0x7ffdb803e260 sp 0x7ffdb803d9d0 T0
Went back to revision 272cb4d86b5efa2e9c0 (chromium commit 550200 from Apr 12, 2018)
On a release build, stack frame size is 1856 bytes.
On my machine, I get a stack overflow at between 1015-1019 stack frames.
Stack frame start: 0x7ffd353a7ad0 (1856 used, 1004 stack depth)
Stack frame start: 0x7ffd353a7390 (1856 used, 1005 stack depth)
Stack frame start: 0x7ffd353a6c50 (1856 used, 1006 stack depth)
Stack frame start: 0x7ffd353a6510 (1856 used, 1007 stack depth)
Stack frame start: 0x7ffd353a5dd0 (1856 used, 1008 stack depth)
Stack frame start: 0x7ffd353a5690 (1856 used, 1009 stack depth)
Stack frame start: 0x7ffd353a4f50 (1856 used, 1010 stack depth)
Stack frame start: 0x7ffd353a4810 (1856 used, 1011 stack depth)
Stack frame start: 0x7ffd353a40d0 (1856 used, 1012 stack depth)
Stack frame start: 0x7ffd353a3990 (1856 used, 1013 stack depth)
Stack frame start: 0x7ffd353a3250 (1856 used, 1014 stack depth)
Stack frame start: 0x7ffd353a2b10 (1856 used, 1015 stack depth)
Stack frame start: AddressSanitizer:DEADLYSIGNAL
=================================================================
==1==ERROR: AddressSanitizer: stack-overflow on address 0x7ffd353a0ae8 (pc 0x000002fd1241 bp 0x7ffd353a1380 sp 0x7ffd353a0af0 T0)
Here is the input file I used for this:
<style>body {
columns: 150px;
</style><body>
<script>
let root = document.createElement('div');
let p = root;
for (let i = 0; i < 1500; ++i) {
p = p.appendChild(document.createElement('div'));
}
document.body.appendChild(root);
</script>
,
Sep 13
ClusterFuzz testcase 6711627681103872 is still reproducing on tip-of-tree build (trunk). If this testcase was not reproducible locally or unworkable, ignore this notification and we will file another bug soon with hopefully a better and workable testcase. Otherwise, if this is not intended to be fixed (e.g. this is an intentional crash), please add ClusterFuzz-Ignore label to prevent future bug filing with similar crash stacktrace.
,
Sep 14
chrishtr@, what do you recommend here? I can mark it ClusterFuzz-Ignore, but I'm still a little unsure about how the 1000-level deep stack was chosen as a limit. Seems like we're right on the edge of that limit (if that is indeed a limit, and not just bad luck on this Fuzz). If so, maybe we should reduce the limit to 800 or something?
,
Sep 14
I recommend ClusterFuzz-Ignore
,
Sep 14
Per chrishtr@, marking as ClusterFuzz-Ignore.
,
Jan 3
ClusterFuzz has detected this issue as fixed in range 619628:619629. Detailed report: https://clusterfuzz.com/testcase?key=6711627681103872 Fuzzer: ochang_domfuzzer Job Type: linux_asan_content_shell_drt Platform Id: linux Crash Type: Stack-overflow Crash Address: 0x7ffd44abfcc0 Crash State: blink::LayoutObject::MapLocalToAncestor Sanitizer: address (ASAN) Regressed: https://clusterfuzz.com/revisions?job=linux_asan_content_shell_drt&range=550193:550200 Fixed: https://clusterfuzz.com/revisions?job=linux_asan_content_shell_drt&range=619628:619629 Reproducer Testcase: https://clusterfuzz.com/download?testcase_id=6711627681103872 See https://github.com/google/clusterfuzz-tools for instructions to reproduce this bug locally. If you suspect that the result above is incorrect, try re-doing that job on the test case report page. |
||||||
►
Sign in to add a comment |
||||||
Comment 1 by dtapu...@chromium.org
, Aug 29