New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 797267 link

Starred by 21 users

Chrome_Mac: Crash Report - net::ReadCRL, _platform_memmove$VARIANT$Nehalem

Project Member Reported by cr...@system.gserviceaccount.com, Dec 22 2017

Issue description

reporter:ajha@google.com

Magic Signature: net::ReadCRL

Crash link: https://crash.corp.google.com//browse?q=product.name%3D'Chrome_Mac'%20AND%20product.version%3D'65.0.3301.0'%20AND%20custom_data.ChromeCrashProto.channel%3D'canary'%20AND%20custom_data.ChromeCrashProto.ptype%3D'browser'%20AND%20custom_data.ChromeCrashProto.magic_signature_1.name%3D'net%3A%3AReadCRL'%20AND%20ReportID%3D'2a0743fc8392bbd8'&sql_dialect=googlesql&ignore_case=false&enable_rewrite=true&omit_field_name=&omit_field_value=&omit_field_opt=%3D#3

-------------------------------------------------------------------------------
Sample Report
-------------------------------------------------------------------------------
Product name: Chrome_Mac
Magic Signature : net::ReadCRL
Product Version: 65.0.3301.0
Process type: browser
Report ID: 2a0743fc8392bbd8
Report Url: https://crash.corp.google.com/2a0743fc8392bbd8
Report Time: 2017-12-22T03:14:01-08:00
Upload Time: 2017-12-22T03:14:04.086-08:00
Uptime: 102000 ms
CumulativeProductUptime: 0 ms
OS Name: Mac OS X
OS Version: 10.13.2 17C88
CPU Architecture: amd64
CPU Info: family 6 model 58 stepping 9

-------------------------------------------------------------------------------
Crashing thread: Thread index: 10. Stack Quality: 77%. Thread id: 3776.
-------------------------------------------------------------------------------
0x00007fff63a74e3e (libsystem_kernel.dylib + 0x0001be3e)	__pthread_kill
0x00007fff639d1311 (libsystem_c.dylib + 0x0005d311)	abort
0x00007fff63ace865 (libsystem_malloc.dylib + 0x00003865)	free
0x00007fff61993b1c (libc++.1.dylib + 0x0003cb1c)	std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >::__grow_by_and_replace(unsigned long, unsigned long, unsigned long, unsigned long, unsigned long, unsigned long, char const*)
0x00007fff61993011 (libc++.1.dylib + 0x0003c011)	std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >::assign(char const*, unsigned long)
0x00000001106becc0 (Google Chrome Framework - crl_set_storage.cc: 149)	net::ReadCRL(base::BasicStringPiece<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > >*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >*, std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > >*)
0x00000001106be8f7 (Google Chrome Framework - crl_set_storage.cc: 396)	net::CRLSetStorage::Parse(base::BasicStringPiece<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > >, scoped_refptr<net::CRLSet>*)
0x000000010ff4daef (Google Chrome Framework - crl_set_component_installer.cc: 40)	component_updater::(anonymous namespace)::LoadCRLSet(base::FilePath const&)
0x00000001102d324b (Google Chrome Framework - callback.h: 65)	base::debug::TaskAnnotator::RunTask(char const*, base::PendingTask*)
0x000000011033e46d (Google Chrome Framework - task_tracker.cc: 410)	base::internal::TaskTracker::RunOrSkipTask(base::internal::Task, base::internal::Sequence*, bool)
0x000000011033ef02 (Google Chrome Framework - task_tracker_posix.cc: 23)	base::internal::TaskTrackerPosix::RunOrSkipTask(base::internal::Task, base::internal::Sequence*, bool)
0x000000011033dcba (Google Chrome Framework - task_tracker.cc: 312)	base::internal::TaskTracker::RunNextTask(scoped_refptr<base::internal::Sequence>, base::internal::CanScheduleSequenceObserver*)
0x0000000110338adf (Google Chrome Framework - scheduler_worker.cc: 72)	base::internal::SchedulerWorker::Thread::ThreadMain()
0x0000000110349e86 (Google Chrome Framework - platform_thread_posix.cc: 75)	base::(anonymous namespace)::ThreadFunc(void*)
0x00007fff63bb06c0 (libsystem_pthread.dylib + 0x000036c0)	_pthread_body
0x00007fff63bb056c (libsystem_pthread.dylib + 0x0000356c)	_pthread_start
0x00007fff63bafc5c (libsystem_pthread.dylib + 0x00002c5c)	thread_start
0x0000000110349e2f (Google Chrome Framework + 0x01ee1e2f)	

 

Comment 1 by ajha@chromium.org, Dec 22 2017

Cc: svaldez@chromium.org gov...@chromium.org abdulsyed@chromium.org ajha@chromium.org ligim...@chromium.org
Labels: -Type-Bug -Pri-2 ReleaseBlock-Dev TE-CrashTriage RegressedIn-65 M-65 Target-65 FoundIn-65 Stability-Sheriff-Desktop Pri-0 Type-Bug-Regression
Owner: rsleevi@chromium.org
Status: Assigned (was: Untriaged)
Link to the list of the builds:
===============================
https://crash.corp.google.com/browse?q=custom_data.ChromeCrashProto.magic_signature_1.name%3D%27net%3A%3AReadCRL%27%20AND%20product.name%3D%27Chrome_Mac%27&sql_dialect=googlesql&ignore_case=false&enable_rewrite=true&omit_field_name=&omit_field_value=&omit_field_opt=%3D

Note:
=====
1. This is #1 browser crash on Mac canary(65.0.3301.0- 638 crashes from 560 clients), hence considering below as the changelog:

https://chromium.googlesource.com/chromium/src/+log/65.0.3300.0..65.0.3301.0?pretty=fuller&n=10000

There are no direct file related change from the stack trace in the above regression range. 

>Manually inspecting 
1. Seeing single net certificate related change: https://chromium-review.googlesource.com/c/chromium/src/+/837945 but not 100% sure whether this could be rooted as this is adding Tests.

2. Going down the stack trace seeing crl_set_component_installer.cc which doesn't show any related changes in the above regression range.

Assiginig to Ryan@(chromium//src/net/cert/OWNERS) and for single CL in the regression range that is certificate related.


Ryan@: Could you please help in investigating this crash.

Thank you!



Components: Internals>Network
Owner: ----
Status: Untriaged (was: Assigned)
Not my CL, setting to Untriaged and Internals>Network so it gets network triager visibility as well.

Comment 3 by rsesek@chromium.org, Dec 22 2017

Cc: rsesek@chromium.org
 Issue 797225  has been merged into this issue.

Comment 4 by rsesek@chromium.org, Dec 22 2017

Labels: -Restrict-View-EditIssue

Comment 5 by jleedev@gmail.com, Dec 22 2017

You are probably looking for a change made after 525636 (known good), but no later than 525640 (first known bad).
CHANGELOG URL:
  https://chromium.googlesource.com/chromium/src/+log/9c222110ec241790bc2a559bc85689937d61cfdb..747cb1afa7ec0b160837f87626161d29f97063af

Reproduce by attempting to update "CRLSet" from chrome://components

Comment 6 by rsesek@chromium.org, Dec 22 2017

Cc: awhalley@chromium.org p...@chromium.org agl@chromium.org
Some more notes from the crash report:
abort() called
*** error for object 0x2011: pointer being freed was not allocated

Two possibilities:

1) The compiler roll may have changed some behavior of std::string (https://chromium.googlesource.com/chromium/src/+/ece1b14db2dedf6a9a4086b46e8cc6c2acd1b62d)
2) We pushed a new CRLSet to production that is now triggering some previously untested codepath (https://groups.google.com/a/google.com/forum/#!topic/omaha-core/tznqpEPMbZ8)
Adding some initial notes from trying to view Crash
- This particular magic frame is only on macOS
- Looking more broadly at the stack of functions - e.g. including net::CRLSetStorage::Parse - still shows an extreme bias towards macOS
- Removing the filters for Canary so all versions are considered suggests 65.0.3301.0 only

So it does look valid that this is a mac-only, recently introduced crash, and I definitely agree with the labels.

Looking through the crashes, see a mixture of invalid accesses and NPEs.

The function, ReadCRL, takes in an input blob of bytes and parses it, storing a 32-byte SHA-256 hash (out_parent_spki_hash) and a list of serials (out_serials), which are caller-provided storage.

The crash is consistent in assigning 32 bytes to out_parent_spki_hash. There is a condition check immediately before that to make sure at least 32 bytes are available, so that would indicate something amiss with |out_parent_spki_hash|, rather than |data|

There are only two calls to ReadCRL - when parsing and when applying delta updates. The callstack is consistently in Parse() - although notably, applying delta-updates uses stack-allocated variables.

In the Parse() case, a heap allocated CRLSet is created, and it has a vector of string/vector pairs (the CRLS). A new value is pushed onto the back, and then read (via pop_back), with the pointers into it being used as the output parameters.

Thus, I have trouble believing this is a straight logic bug in the code - in order for those crashes to make sense, memory must be getting stomped 'somewhere else', leading to this crash. It could be elsewhere in this function, or elsewhere in a component, but I don't think this particular crash has enough information to figure it out.
Thanks for the bisect, Comment #5 - that definitely supports rsesek's remark in Comment #6 that this may be a Clang-roll related.

Comment 9 by meh...@chromium.org, Dec 22 2017

 Issue 797315  has been merged into this issue.
One note, in case it's helpful - I deleted my ~/Library/Application Support/Google/Chrome Canary directory, and restarted Chrome Canary, and it worked fine for a while - long enough to set up my account, log in to email, start some music playing, etc, but then I stepped away for a couple minutes and it had crashed again.  That seems like it could indicate that a certain CRL is causing the crash.  
I don't think it's related to a CRLSet push - I think we should revert the Clang roll. It just happens that CRLSets sync after a while, so the first CRLSet will trigger it.
Labels: HasTestcase
I'm seeing this crash after relaunching canary with the latest version update and here is the attached system report incase if that helps.
797267.rtf
100 KB Download

Comment 14 by a...@chromium.org, Dec 22 2017

Cc: a...@chromium.org sdy@chromium.org
This is definitely a clang bug. The last time it was seen was  bug 792188 , also in net::ReadCRL().

Comment 15 by a...@chromium.org, Dec 22 2017

Cc: thakis@chromium.org h...@chromium.org r...@chromium.org mark@chromium.org
 Issue 797361  has been merged into this issue.

Comment 16 by a...@chromium.org, Dec 22 2017

Labels: clang
Summary: Chrome_Mac: Crash Report - net::ReadCRL, _platform_memmove$VARIANT$Nehalem (was: Chrome_Mac: Crash Report - net::ReadCRL)

Comment 17 by r...@chromium.org, Dec 22 2017

Owner: r...@chromium.org
Status: Assigned (was: Untriaged)

Comment 18 by kbr@chromium.org, Dec 22 2017

Cc: kbr@chromium.org

Comment 19 by kbr@chromium.org, Dec 22 2017

Blockedon: 792188
Blocking: 787920

Comment 20 by kbr@chromium.org, Dec 22 2017

I think this needs a postmortem once it's fixed. How did this make it past all of the automated testing machines?

Comment 21 by r...@chromium.org, Dec 22 2017

> I think this needs a postmortem once it's fixed. How did this make it past all of the automated testing machines?

The original roll did pass the CQ, despite probably unrelated test failures on chromeos asan bots. The Mac and Linux bots on the Clang ToT waterfall[1] are currently green, despite the fact that we still appear to have a bug.

[1] https://ci.chromium.org/p/chromium/g/chromium.clang/console

I've spent an hour diffing the code clang generates with and without the memcpyopt patch, and I don't see anything wrong yet. I see differences, but they look correct. This is the interesting code pattern:

  for (size_t crl_index = 0; !data.empty(); crl_index++) {
    // Speculatively push back a pair and pass it to ReadCRL() to avoid
    // unnecessary copies.
    crl_set->crls_.push_back(
        std::make_pair(std::string(), std::vector<std::string>()));
    std::pair<std::string, std::vector<std::string> >* const back_pair =
        &crl_set->crls_.back();

    if (!ReadCRL(&data, &back_pair->first, &back_pair->second)) {

With the suspected culprit change, clang is trying to zero-initialize the last slot in the vector in place. That's well and good, but something appears to be going wrong.

Assuming the revert lands soon, we can downgrade this to P1 and leave this until after the holidays, right?
> Assuming the revert lands soon, we can downgrade this to P1 and leave this until after the holidays, right?

Yes, though we're currently having problems landing the revert (https://chromium-review.googlesource.com/c/chromium/src/+/843242). The Android bots seem to be timing out/crashing during compile.
Project Member

Comment 23 by bugdroid1@chromium.org, Dec 22 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/bcd195d07a842f2ed3745dfa4a0a6bb341375ff5

commit bcd195d07a842f2ed3745dfa4a0a6bb341375ff5
Author: Brad Lassey <lassey@chromium.org>
Date: Fri Dec 22 22:13:47 2017

Revert "Roll clang 318667:321204."

This reverts commit ece1b14db2dedf6a9a4086b46e8cc6c2acd1b62d.

Reason for revert: suspected of causing the crash in crbug/797267

Original change's description:
> Roll clang 318667:321204.
>
> Ran `tools/clang/scripts/upload_revision.py 321204`
> and cherry-picked the non-clang changes from
> https://chromium-review.googlesource.com/c/chromium/src/+/828400 .
>
> TBR=brettw@chromium.org,rsesek@chromium.org,hans@chromium.org
>
> Bug:  787920 
> Cq-Include-Trybots: master.tryserver.chromium.android:android_arm64_dbg_recipe;master.tryserver.chromium.android:android_compile_mips_dbg;master.tryserver.chromium.android:android_compile_x64_dbg;master.tryserver.chromium.android:android_compile_x86_dbg
> Change-Id: Iaec5677e2afee0edb685db816e00f3c900214d14
> Reviewed-on: https://chromium-review.googlesource.com/835358
> Commit-Queue: Peter Collingbourne <pcc@chromium.org>
> Reviewed-by: Robert Sesek <rsesek@chromium.org>
> Reviewed-by: Hans Wennborg <hans@chromium.org>
> Reviewed-by: Peter Collingbourne <pcc@chromium.org>
> Cr-Commit-Position: refs/heads/master@{#525638}

TBR=eugenis@chromium.org,brettw@chromium.org,hans@chromium.org,rnk@chromium.org,pcc@chromium.org,rsesek@chromium.org
NOTRY=true

Skipping the CQ because there appear to be Android bot capacity issues.

Bug:  787920 ,797267
Change-Id: I3a033492561ae2ff4fcf46fe9087efbbd9f015f6
Cq-Include-Trybots: master.tryserver.chromium.android:android_arm64_dbg_recipe;master.tryserver.chromium.android:android_compile_mips_dbg;master.tryserver.chromium.android:android_compile_x64_dbg;master.tryserver.chromium.android:android_compile_x86_dbg
Reviewed-on: https://chromium-review.googlesource.com/843242
Commit-Queue: Reid Kleckner <rnk@chromium.org>
Reviewed-by: Reid Kleckner <rnk@chromium.org>
Reviewed-by: Avi Drissman <avi@chromium.org>
Cr-Commit-Position: refs/heads/master@{#526072}
[modify] https://crrev.com/bcd195d07a842f2ed3745dfa4a0a6bb341375ff5/build/config/BUILDCONFIG.gn
[modify] https://crrev.com/bcd195d07a842f2ed3745dfa4a0a6bb341375ff5/build/config/compiler/BUILD.gn
[modify] https://crrev.com/bcd195d07a842f2ed3745dfa4a0a6bb341375ff5/media/cdm/ppapi/ppapi_cdm_adapter.gni
[modify] https://crrev.com/bcd195d07a842f2ed3745dfa4a0a6bb341375ff5/sandbox/linux/tests/unit_tests.cc
[modify] https://crrev.com/bcd195d07a842f2ed3745dfa4a0a6bb341375ff5/tools/clang/scripts/update.py

Comment 24 by a...@chromium.org, Dec 22 2017

rnk:

>I've spent an hour diffing the code clang generates with and without the memcpyopt patch, and I don't see anything wrong yet.

Anything useful from  bug 792188  in this area?

Comment 25 by a...@chromium.org, Dec 22 2017

(especially comment 7 from  bug 792188 )

Comment 26 by kbr@chromium.org, Dec 22 2017

Blocking: 797441

Comment 27 by kbr@chromium.org, Dec 22 2017

> Yes, though we're currently having problems landing the revert (https://chromium-review.googlesource.com/c/chromium/src/+/843242). The Android bots seem to be timing out/crashing during compile.

Filed Issue 797441 about the Android build exceptions. It looks to me like the Clang rollback caused a full rebuild, and it simply timed out on these two bots because their timeouts are set too low to handle that.

I was looking at the ReadCRL failure in  bug 797168  and saw it's flaky, which may explain why it got through automation undetected. It also happens on chromium revisions at least a month back.

Comment 29 by h...@chromium.org, Dec 23 2017

Since it wasn't mentioned anywhere yet, the memcpyopt change is r321138.

Comment 30 by a...@chromium.org, Dec 24 2017

FYI, canary 65.0.3302.0, built after the revert, launches fine.

Comment 31 by a...@chromium.org, Dec 24 2017

Labels: -Pri-0 Pri-1
Now that the revert is in and the canary launches, I'm downgrading to a P1. Reid, a wonderful Christmas present for you. :)
I can confirm, that with version 65.0.3303.0, it is all running again.
Thanks for acting this fast!

Happy holidays, guys!

Comment 33 by ajha@chromium.org, Dec 27 2017

Labels: -ReleaseBlock-Dev
Removing the blocker and keeping this open to check the behavior once the Clang roll is relanded.

Comment 34 by r...@chromium.org, Dec 27 2017

Here's the beginning of a post-mortem. The revision Peter chose for the reverted clang roll had a bug that removes the zero initialization of the std::pair<std::string, std::vector<std::string>> being pushed onto the back of the crls_ vector. This code *does* have unit tests (CRLSetTest.Parse), but if the uninitialized memory happens to be zero, then the tests will pass. This happens enough of the time that Chromium's flaky test retry mechanisms allowed this bad CL through the CQ.

The last time this came up, Hans was diligently watching the Clang ToT waterfall, so he filed  bug 792188  and reverted the upstream LLVM change before we attempted to roll clang. That was weeks ago, and the patch was relanded early on Wed Dec 20, which is when we were pushing to roll clang. In the meantime, we still hadn't rolled clang, and the Clang ToT waterfall was fairly red for other reasons, so we didn't spent much time consulting it looking for possible redness. Even if we had, it cycles slowly enough that we might not have been able to see a pattern of alternating failing net_unittests runs.

Bob independently filed  bug 797168  specifically about the CRLSetTest.Parse. I'm going to mark this blocked on that, and put the more detailed updates over there. We can close this for good when we fix clang and land the next roll.

Comment 35 by r...@chromium.org, Dec 27 2017

Blockedon: 787920 797168
Blocking: -787920

Comment 36 by r...@chromium.org, Dec 28 2017

The next roll is in flight: https://chromium-review.googlesource.com/c/chromium/src/+/845700

I will manually build and run net_unittests a few times before re-landing.

Comment 37 by kbr@chromium.org, Dec 28 2017

Thanks rnk@ for the detailed explanation. Let's do an official writeup and come up with concrete next steps to prevent this from happening again. Maybe for some of these unit tests we can shim the memory allocator to return memory initialized with a guaranteed non-zero value?

Labels: -Stability-Sheriff-Desktop
Project Member

Comment 39 by bugdroid1@chromium.org, Jan 3 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/0d72bb1494c94523ae7bd944476a495e8b281bcc

commit 0d72bb1494c94523ae7bd944476a495e8b281bcc
Author: Reid Kleckner <rnk@google.com>
Date: Wed Jan 03 18:32:36 2018

Roll clang 318667:321529.

Created by patching in
https://chromium-review.googlesource.com/c/chromium/src/+/835358 and
manually editting the revision in update.py.

TBR-ing rsesek@ again for the sandbox changes.
TBR-ing bretw@ again for the ppapi adapter LLD warning suppression.

I will manually build and test net_unittests after uploading to goma,
but I've built this clang revision locally and they are not flaky for
me.

R=inglorion@chromium.org,hans@chromium.org
TBR=rsesek@chromium.org,brettw@chromium.org

Bug:  787920 , 797267,  797168 
Cq-Include-Trybots: master.tryserver.chromium.android:android_arm64_dbg_recipe;master.tryserver.chromium.android:android_compile_mips_dbg;master.tryserver.chromium.android:android_compile_x64_dbg;master.tryserver.chromium.android:android_compile_x86_dbg
Change-Id: Ifd25c014e79ad7a5d4533941ae7fd74a86860191
Reviewed-on: https://chromium-review.googlesource.com/845700
Reviewed-by: Nico Weber <thakis@chromium.org>
Reviewed-by: Hans Wennborg <hans@chromium.org>
Reviewed-by: Robert Sesek <rsesek@chromium.org>
Reviewed-by: Reid Kleckner <rnk@chromium.org>
Commit-Queue: Reid Kleckner <rnk@chromium.org>
Cr-Commit-Position: refs/heads/master@{#526748}
[modify] https://crrev.com/0d72bb1494c94523ae7bd944476a495e8b281bcc/build/config/BUILDCONFIG.gn
[modify] https://crrev.com/0d72bb1494c94523ae7bd944476a495e8b281bcc/build/config/compiler/BUILD.gn
[modify] https://crrev.com/0d72bb1494c94523ae7bd944476a495e8b281bcc/media/cdm/ppapi/ppapi_cdm_adapter.gni
[modify] https://crrev.com/0d72bb1494c94523ae7bd944476a495e8b281bcc/sandbox/linux/tests/unit_tests.cc
[modify] https://crrev.com/0d72bb1494c94523ae7bd944476a495e8b281bcc/tools/clang/scripts/update.py

Sign in to add a comment