V8 exception in WebGL blit test: v8::internal::Scavenger::PromoteObject
Issue description

Only seen one flaky failure so far:
https://ci.chromium.org/buildbot/chromium.gpu.fyi/Win10%20Release%20%28NVIDIA%29/4989
https://chromium-swarm.appspot.com/task?id=3ba097510ff6c310&refresh=10&show_raw=1

webgl2_conformance_gl_tests on NVIDIA GPU on Windows on Windows-10

Unexpected Failures:
* gpu_tests.webgl_conformance_integration_test.WebGLConformanceIntegrationTest.WebglConformance_deqp_functional_gles3_framebufferblit_conversion_11

In the logs:

Received fatal exception EXCEPTION_ACCESS_VIOLATION
Backtrace:
    v8::internal::Scavenger::PromoteObject [0x6B7B77FB+27]
    v8::internal::Scavenger::ScavengeObject [0x6B7B6D49+297]
    v8::internal::Scavenger::CheckAndScavengeObject [0x6B7B7CE5+117]
    v8::internal::Scavenger::ScavengePage [0x6B7B65B4+324]
    v8::internal::ScavengingTask::RunInParallel [0x6B77C3E7+663]
    v8::internal::ItemParallelJob::Task::RunInternal [0x6B78FF4C+76]
    base::debug::TaskAnnotator::RunTask [0x6BEADD27+231]
    base::internal::TaskTracker::RunOrSkipTask [0x6BEE6B8E+638]
    base::internal::TaskTracker::RunNextTask [0x6BEE61D1+385]
    base::internal::SchedulerWorker::Thread::ThreadMain [0x6BEF01FF+623]
    base::PlatformThread::SetCurrentThreadPriority [0x6BE52A75+533]
    BaseThreadInitThunk [0x77708744+36]
    RtlGetAppContainerNamedObjectPath [0x77D9582D+253]
    RtlGetAppContainerNamedObjectPath [0x77D957FD+205]

No V8 sheriff is currently listed in https://ci.chromium.org/p/chromium/g/main/console.
,
Feb 14 2018
Assigning to current sheriff. (https://rotation.googleplex.com/index.html#rotation?id=4838401396178944) We have not changed any GC code for weeks now. Likely requires good minidumps or other hints to pin this down.
,
Feb 14 2018
Per the logs from the shard above, it's here: Uploading c:\b\s\w\itgkwfzg\tmp3q2die\reports\bb5475a4-01c6-4281-9f52-5f9c9d370806.dmp to gs://chrome-telemetry-output/minidump-2018-02-11_18-11-45-504544.dmp
,
Feb 15 2018
Just checked the minidump, and it seems to come from a build for which we cannot download symbols (neither in WinDbg nor in Visual Studio). Is that expected?
,
Feb 15 2018
Thanks for the minidump link, Ken. And thanks for looking, Michael. I think the symbols are available only in the official builds. In previous investigations I inspected the minidumps manually. I am currently traveling and will be able to inspect the minidump on Monday.
,
Feb 27 2018
I think I know what's going on here.

The crash happens in PromoteObject when we try to load instance_type from the map of an object. The map pointer 0x4d47fff8 is untagged, which suggests that it is actually a forwarding address stored in the object when it was migrated. The map pointer is also at the end of a page, so a word load at [0x4d47fff8 + 7] accesses an unmapped address.

This looks like a result of a data race, where the object is migrated twice by two different threads. The first thread succeeds and writes a forwarding pointer in the map word. The second thread is incorrectly treating the forwarding pointer as a map and trying to dereference it.

I think the culprit is object->RequiredAlignment() at the beginning of PromoteObject. That function loads the map of the object and does not check for forwarding address:
https://cs.chromium.org/chromium/src/v8/src/heap/scavenger-inl.h?rcl=efc0f3d4959b7539bdef12012cb47b3d31c8b9c1&l=100

Exception info:
    thread id: 7624
    code: C0000005
    context:
        eax: 4d47fff8 *
        ebx: 5c6044e9 T
        ecx: 5b78c109 T
        edx: 00000008
        edi: 0bff16a8 *
        esi: 0bff16a8 *
        ebp: 0af2f81c S
        esp: 0af2f800 S
        eip: 6b7b77fb C
        eflags: 10000001000000110
    modules: chrome.exe at 00FF0000
    stack-top: 0af2f800 S
    stack-bottom: 0af30000

Disassembly around exception.eip:
    6b7b77bb 00000000: 6e                 outs dx,BYTE PTR ds:[esi]
    6b7b77bc 00000001: e8 4f e9 87 01     call 0x6d036110
    6b7b77c1 00000006: 83 c4 0c           add esp,0xc
    6b7b77c4 00000009: 85 37              test DWORD PTR [edi],esi
    6b7b77c6 0000000b: 0f 84 03 ff ff ff  je 0xffffff14
    6b7b77cc 00000011: e9 cc fe ff ff     jmp 0xfffffee2
    6b7b77d1 00000016: cc                 int3
    6b7b77d2 00000017: cc                 int3
    6b7b77d3 00000018: cc                 int3
    6b7b77d4 00000019: cc                 int3
    6b7b77d5 0000001a: cc                 int3
    6b7b77d6 0000001b: cc                 int3
    6b7b77d7 0000001c: cc                 int3
    6b7b77d8 0000001d: cc                 int3
    6b7b77d9 0000001e: cc                 int3
    6b7b77da 0000001f: cc                 int3
    6b7b77db 00000020: cc                 int3
    6b7b77dc 00000021: cc                 int3
    6b7b77dd 00000022: cc                 int3
    6b7b77de 00000023: cc                 int3
    6b7b77df 00000024: cc                 int3
    6b7b77e0 00000025: 55                 push ebp
    6b7b77e1 00000026: 89 e5              mov ebp,esp
    6b7b77e3 00000028: 53                 push ebx
    6b7b77e4 00000029: 57                 push edi
    6b7b77e5 0000002a: 56                 push esi
    6b7b77e6 0000002b: 83 ec 10           sub esp,0x10
    6b7b77e9 0000002e: a1 98 a9 7e 6f     mov eax,ds:0x6f7ea998
    6b7b77ee 00000033: 89 ce              mov esi,ecx
    6b7b77f0 00000035: 8b 4d 10           mov ecx,DWORD PTR [ebp+0x10]
    6b7b77f3 00000038: 31 e8              xor eax,ebp
    6b7b77f5 0000003a: 89 45 f0           mov DWORD PTR [ebp-0x10],eax
    6b7b77f8 0000003d: 8b 41 ff           mov eax,DWORD PTR [ecx-0x1]
  =>6b7b77fb 00000040: 0f b7 40 07        movzx eax,WORD PTR [eax+0x7]
    6b7b77ff 00000044: 3d 92 00 00 00     cmp eax,0x92
    6b7b7804 00000049: 74 0e              je 0x59
    6b7b7806 0000004b: 8b 41 ff           mov eax,DWORD PTR [ecx-0x1]
    6b7b7809 0000004e: 0f b7 40 07        movzx eax,WORD PTR [eax+0x7]
    6b7b780d 00000052: 3d 94 00 00 00     cmp eax,0x94
    6b7b7812 00000057: 75 0b              jne 0x64
    6b7b7814 00000059: 83 79 03 01        cmp DWORD PTR [ecx+0x3],0x1
    6b7b7818 0000005d: b8 01 00 00 00     mov eax,0x1
    6b7b781d 00000062: 77 14              ja 0x78
    6b7b781f 00000064: 8b 41 ff           mov eax,DWORD PTR [ecx-0x1]
    6b7b7822 00000067: 0f b7 48 07        movzx ecx,WORD PTR [eax+0x7]
    6b7b7826 0000006b: 31 c0              xor eax,eax
    6b7b7828 0000006d: 81 f9 81 00 00 00  cmp ecx,0x81
    6b7b782e 00000073: 0f 94 c0           sete al
    6b7b7831 00000076: 01 c0              add eax,eax
    6b7b7833 00000078: 8d 4e 44           lea ecx,[esi+0x44]
    6b7b7836 0000007b: 8d 55 ec           lea edx,[ebp-0x14]
    6b7b7839 0000007e: 50                 push eax
    6b7b783a 0000007f: ff                 .byte 0xff
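
The address arithmetic behind this diagnosis can be checked with a small sketch. It is illustrative only: the constant and function names below are hypothetical and are not V8's definitions. It assumes that tagged heap pointers have the low bit set (consistent with the "T" registers and the [ecx-0x1] load in the dump), that instance_type sits at offset 8 in the map (inferred from [eax+0x7] on a tagged pointer), and that the OS page size is 4 KiB.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical constants for illustration; not V8's actual definitions.
constexpr uint32_t kHeapObjectTag = 1;       // assumed: low bit set on tagged pointers
constexpr uint32_t kPageSize = 0x1000;       // assumed: 4 KiB OS page
constexpr uint32_t kInstanceTypeOffset = 8;  // inferred from [eax+0x7] on a tagged pointer

// A tagged heap pointer has its low bit set; the suspected forwarding
// address in eax does not.
constexpr bool IsTagged(uint32_t word) {
  return (word & kHeapObjectTag) != 0;
}

// Address touched by `movzx eax,WORD PTR [eax+0x7]` for a given map word.
constexpr uint32_t InstanceTypeLoadAddress(uint32_t map_word) {
  return map_word + kInstanceTypeOffset - kHeapObjectTag;
}

// True if a load of `size` bytes starting at `addr` touches more than one page.
constexpr bool CrossesPage(uint32_t addr, uint32_t size) {
  return (addr / kPageSize) != ((addr + size - 1) / kPageSize);
}

// Values taken from the exception context above.
static_assert(!IsTagged(0x4d47fff8), "eax looks like an untagged address");
static_assert(IsTagged(0x5c6044e9), "ebx looks like a tagged pointer");
static_assert(InstanceTypeLoadAddress(0x4d47fff8) == 0x4d47ffff, "load address");
static_assert(CrossesPage(0x4d47ffff, 2), "16-bit load spills into the next page");
```

Under these assumptions, the 16-bit load starts at the last byte of one page and reads its second byte from 0x4d480000, which matches the access violation if that next page is unmapped.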
,
Feb 27 2018
The following revision refers to this bug:
https://chromium.googlesource.com/v8/v8.git/+/4f43be96ca93874f4a0fb96bd9dc0befc8b85f79

commit 4f43be96ca93874f4a0fb96bd9dc0befc8b85f79
Author: Ulan Degenbaev <ulan@chromium.org>
Date: Tue Feb 27 10:48:53 2018

[heap] Fix a data race in Scavenger.

Scavenger::PromoteObject and Scavenger::SemiSpaceCopyObject load and dereference the map of the object to compute the alignment. This is unsafe because the object can be already migrated by another thread and the map word can contain the forwarding address.

This patch removes the map load and uses the provided map argument to compute the alignment.

Bug: chromium:811278,chromium:807178
Change-Id: I7343344dc65ae26eefb2602c55dee87bb511bc72
Reviewed-on: https://chromium-review.googlesource.com/939172
Commit-Queue: Ulan Degenbaev <ulan@chromium.org>
Reviewed-by: Yang Guo <yangguo@chromium.org>
Reviewed-by: Michael Lippautz <mlippautz@chromium.org>
Cr-Commit-Position: refs/heads/master@{#51592}

[modify] https://crrev.com/4f43be96ca93874f4a0fb96bd9dc0befc8b85f79/src/heap/mark-compact.cc
[modify] https://crrev.com/4f43be96ca93874f4a0fb96bd9dc0befc8b85f79/src/heap/scavenger-inl.h
[modify] https://crrev.com/4f43be96ca93874f4a0fb96bd9dc0befc8b85f79/src/objects-inl.h
[modify] https://crrev.com/4f43be96ca93874f4a0fb96bd9dc0befc8b85f79/src/objects.h
[modify] https://crrev.com/4f43be96ca93874f4a0fb96bd9dc0befc8b85f79/src/snapshot/deserializer.cc
[modify] https://crrev.com/4f43be96ca93874f4a0fb96bd9dc0befc8b85f79/src/snapshot/serializer.cc
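
For readers unfamiliar with the pattern, here is a toy model of the idea in the commit message. It is not the V8 code: ToyMap, ToyObject, TryMigrate and the free function RequiredAlignment are hypothetical names, and the compare-and-swap on the map word is an assumption about how a single winning migration could be modelled. The point it tries to show is that once a forwarding address has been written, the map word no longer points at a map, so the alignment has to come from the map the caller already holds.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Toy stand-ins; all names here are hypothetical, not V8's types.
struct ToyMap {
  int alignment;
};

struct ToyObject {
  // Holds either a tagged map pointer or an untagged forwarding address.
  std::atomic<uintptr_t> map_word;
};

constexpr uintptr_t kTag = 1;  // assumed: low bit marks a tagged map pointer

inline uintptr_t TagMap(const ToyMap* map) {
  return reinterpret_cast<uintptr_t>(map) | kTag;
}

inline bool IsForwarded(uintptr_t word) {
  return (word & kTag) == 0;
}

// Tries to claim the migration by swapping the tagged map pointer for the
// (untagged) new address. Only one caller can succeed; the loser must not
// treat the map word as a map afterwards.
inline bool TryMigrate(ToyObject& obj, const ToyMap* map, uintptr_t new_address) {
  uintptr_t expected = TagMap(map);
  return obj.map_word.compare_exchange_strong(expected, new_address);
}

// Pattern described in the commit message: compute the alignment from the
// map passed in by the caller instead of re-reading it from the object.
inline int RequiredAlignment(const ToyMap* map) {
  return map->alignment;
}
```

In this model a second TryMigrate call fails and leaves the first forwarding address in place, while RequiredAlignment(map) stays valid throughout because it never touches the object's map word.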
,
Mar 6 2018
Awesome work, Ulan! Thank you for tracking down that nasty bug!
Comment 1 by kbr@chromium.org
, Feb 13 2018