New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 906468 link

Starred by 2 users

Issue metadata

Status: Fixed
Owner:
Closed: Dec 6
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Windows
Pri: 3
Type: Bug


Show other hotlists

Hotlists containing this issue:
Lucifer


Sign in to add a comment

Null-dereference WRITE in C:\Windows\SYSTEM32\ntdll.dll

Project Member Reported by ClusterFuzz, Nov 18

Issue description

Detailed report: https://clusterfuzz.com/testcase?key=5749495654252544

Fuzzer: libFuzzer_sequence_manager_fuzzer
Job Type: windows_libfuzzer_chrome_asan
Platform Id: windows

Crash Type: Null-dereference WRITE
Crash Address: 0x000000000000
Crash State:
  C:\Windows\SYSTEM32\ntdll.dll
  base::internal::LockImpl::Lock
  base::sequence_manager::ThreadManager::PostDelayedTask
  
Sanitizer: address (ASAN)

Reproducer Testcase: https://clusterfuzz.com/download?testcase_id=5749495654252544

Issue filed automatically.

See https://chromium.googlesource.com/chromium/src/+/master/testing/libfuzzer/reproducing_on_windows.md for more information.

Note: This crash might not be reproducible with the provided testcase. That said, for the past 14 days we've been seeing this crash frequently. If you are unable to reproduce this, please try a speculative fix based on the crash stacktrace in the report. The fix can be verified by looking at the crash statistics in the report, a day after the fix is deployed. We will auto-close the bug if the crash is not seen for 14 days.
 
Project Member

Comment 1 by ClusterFuzz, Nov 18

Labels: Fuzz-Blocker ReleaseBlock-Beta M-72
This crash occurs very frequently on windows platform and is likely preventing the fuzzer sequence_manager_fuzzer from making much progress. Fixing this will allow more bugs to be found.

Marking this bug as a blocker for next Beta release.

If this is incorrect, please add ClusterFuzz-Wrong label and remove the ReleaseBlock-Beta label.
Project Member

Comment 2 by ClusterFuzz, Nov 18

Components: Blink>Scheduling Internals>Core
Labels: Test-Predator-Auto-Components
Automatically applying components based on crash stacktrace and information from OWNERS files.

If this is incorrect, please apply the Test-Predator-Wrong-Components label.
Issue 897872 has been merged into this issue.
I analyzed this bug when it was crbug.com/897872 (a dereference of an uninitialized pointer). It's a clear race condition where thread_manager_ is racily initialized in a freshly created thread. Here's the analysis I pasted in to the original bug:


thread_manager_ is initialized in SimpleThreadImpl::Run() which means it is initialized whenever the thread starts to run which means it is 100% racy.

I realized this was happening by making the changes above (an ugly but effective hack to halt in the debugger on the offending instruction) together with these modifications to GetThreadManagerFor:

-  return threads_[id]->thread_manager();
+  id_copy_ = id;
+  auto* thread_manager = threads_[id]->thread_manager();
+  const char* p = (const char*)thread_manager;
+  char x = *p;
+  base::debug::Alias(&x);
+  ThreadPoolManager* pThis = this;
+  base::debug::Alias(&pThis);
+  return thread_manager;

I could then see that while thread_manager was invalid, the threads_ vector contained only valid data. So, thread_manager_ was initialized *after* this function was called, but before I attached the debugger - i.e.; on another thread. With that in mind, and once I remember to look at thread_manager_ instead of thread_pool_manager_ it was obvious what was going on.

In hindsight we can see that this race condition is documented right above the declaration of thread_manager_:

// The object pointed to by |thread_manager_| is created and destructed from
// the Run function. This is necessary since it has to be constructed from the
// thread it should be bound to and destructed from the same thread.
ThreadManager* thread_manager_;


crrev.com/c/1316657 makes sure that thread_manager_ is always initialized (to nullptr). There is still a race condition, but the dereference will now reliably hit location zero instead of random memory.

Friendly ping for an update as this is marked as Beta blocker for M-72.
Labels: Stability-Sheriff-Desktop
Owner: brucedaw...@chromium.org
Status: Assigned (was: Untriaged)
Assigning to Bruce, please feel free to route to appropriate Dev if its not yours.

+Sheriff for awareness.
[Stability sheriff]

+fdoray@ as this is related to thread manager, is there some plans to fix this race?
Cc: fdoray@chromium.org
Labels: -Pri-1 -ReleaseBlock-Beta Pri-3
I think we can remove the blocker label since this seems to be a bug in the test code and not production chrome https://bugs.chromium.org/p/chromium/issues/detail?id=897872#c39
Cc: brucedaw...@chromium.org
Owner: alexclarke@chromium.org
This is a SequenceManager issue.
Cc: alexclarke@chromium.org skyos...@chromium.org
Owner: ----
Status: Available (was: Assigned)
Thanks Bruce! As far as I can see this is a bug in the fuzzer itself, not the component it's testing.
Labels: -Stability-Crash -Stability-Sheriff-Desktop
Removing this from the stability sheriff queue as this isn't a stability issue.
I agree that this is a bug in the fuzzer itself rather than sequencemanager or other shipping code. But, the original bug said:

> This crash occurs very frequently on windows platform and is likely
> preventing the fuzzer sequence_manager_fuzzer from making much
> progress. Fixing this will allow more bugs to be found.

I assume that this is still true so somebody who owns this code should deal with this bug.

Owner: carlscab@google.com
Status: Started (was: Available)
Project Member

Comment 15 by bugdroid1@chromium.org, Dec 6

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/7333851526d00cee593ca114d3394f77236fcaf8

commit 7333851526d00cee593ca114d3394f77236fcaf8
Author: Carlos Caballero <carlscab@google.com>
Date: Thu Dec 06 12:14:39 2018

Fix race in scheduler fuzzer

ThreadManager instances are created on the new thread and thus their
thread_manager_ member might not be initialized when GetThreadManagerFor
is called.

This patch addresses that issue by having two different containers. One
to keep track of threads and the other for keeping track of all the
ThreadManager instances.

This code could use some nice refactoring, this patch just tries to do
the minimum changes to get rid of the race.

Bug:  906468 
Change-Id: Ie39f5a6a18111248c4f83264f6adcdaef4befa23
Reviewed-on: https://chromium-review.googlesource.com/c/1365442
Commit-Queue: Carlos Caballero <carlscab@google.com>
Reviewed-by: Sami Kyöstilä <skyostil@chromium.org>
Cr-Commit-Position: refs/heads/master@{#614321}
[modify] https://crrev.com/7333851526d00cee593ca114d3394f77236fcaf8/third_party/blink/renderer/platform/scheduler/test/fuzzer/sequence_manager_fuzzer_processor.cc
[modify] https://crrev.com/7333851526d00cee593ca114d3394f77236fcaf8/third_party/blink/renderer/platform/scheduler/test/fuzzer/sequence_manager_fuzzer_processor_unittest.cc
[modify] https://crrev.com/7333851526d00cee593ca114d3394f77236fcaf8/third_party/blink/renderer/platform/scheduler/test/fuzzer/simple_thread_impl.cc
[modify] https://crrev.com/7333851526d00cee593ca114d3394f77236fcaf8/third_party/blink/renderer/platform/scheduler/test/fuzzer/simple_thread_impl.h
[modify] https://crrev.com/7333851526d00cee593ca114d3394f77236fcaf8/third_party/blink/renderer/platform/scheduler/test/fuzzer/thread_pool_manager.cc
[modify] https://crrev.com/7333851526d00cee593ca114d3394f77236fcaf8/third_party/blink/renderer/platform/scheduler/test/fuzzer/thread_pool_manager.h

Status: Fixed (was: Started)
The documentation for reproducing bugs on Windows was moved to: https://chromium.googlesource.com/chromium/src/+/master/testing/libfuzzer/reproducing.md

Sign in to add a comment