New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 811566 link

Starred by 5 users

Issue metadata

Status: Fixed
Owner:
Last visit > 30 days ago
Closed: Mar 2018
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 2
Type: Bug



Sign in to add a comment

Check failed: leaf_index < tree_size (2 vs. 0)

Project Member Reported by alancutter@chromium.org, Feb 13 2018

Issue description

Chrome Version: canary 6810624d58594c3b2074f0bfe9a2651ed623883d

What steps will reproduce the problem?
(1) Build chrome with args.gn:
target_os = "chromeos"
dcheck_always_on = true
(2) Open chrome.

What is the expected result?

No crash.

What happens instead?

Assertion failure in the first couple of seconds.

[195929:195953:0213/123252.887200:FATAL:merkle_audit_proof.cc(16)] Check failed: leaf_index < tree_size (2 vs. 0)
#0 0x7f9e4a78904c base::debug::StackTrace::StackTrace()
#1 0x7f9e4a7b020b logging::LogMessage::~LogMessage()
#2 0x7f9e493e58b8 net::ct::CalculateAuditPathLength()
#3 0x55e89c8e5090 certificate_transparency::LogDnsClient::AuditProofQuery::RequestAuditProofNodesComplete()
#4 0x55e89c8e4907 certificate_transparency::LogDnsClient::AuditProofQuery::DoLoop()
#5 0x55e89c8e53c5 certificate_transparency::LogDnsClient::AuditProofQuery::OnDnsTransactionComplete()
#6 0x55e89b652e25 _ZN4base8internal7InvokerINS0_9BindStateIMN18chrome_browser_net14DnsProbeRunnerEFvPN3net14DnsTransa
ctionEiPKNS5_11DnsResponseEEJNS_7WeakPtrIS4_EEEEEFvS7_iSA_EE3RunEPNS0_13BindStateBaseES7_iSA_
#7 0x7f9e4946f264 net::(anonymous namespace)::DnsTransactionImpl::DoCallback()
#8 0x7f9e49470db3 net::(anonymous namespace)::DnsTransactionImpl::OnAttemptComplete()
#9 0x7f9e49470279 net::(anonymous namespace)::DnsTransactionImpl::OnUdpAttemptComplete()
#10 0x7f9e49470cb3 net::(anonymous namespace)::DnsUDPAttempt::OnIOComplete()
#11 0x7f9e496494af net::UDPSocketPosix::DoReadCallback()
#12 0x7f9e4964923d net::UDPSocketPosix::DidCompleteRead()
#13 0x7f9e4964906e net::UDPSocketPosix::ReadWatcher::OnFileCanReadWithoutBlocking()
#14 0x7f9e4a7c1443 base::MessagePumpLibevent::OnLibeventNotification()
#15 0x7f9e4a88b91d event_base_loop
#16 0x7f9e4a7c1782 base::MessagePumpLibevent::Run()
#17 0x7f9e4a7be5a9 base::MessageLoop::Run()
#18 0x7f9e4a7f57c9 base::RunLoop::Run()
#19 0x7f9e4a836947 base::Thread::Run()
#20 0x7f9e47a31fa8 content::BrowserThreadImpl::IOThreadRun()
#21 0x7f9e47a320f5 content::BrowserThreadImpl::Run()
#22 0x7f9e4a836efd base::Thread::ThreadMain()
#23 0x7f9e4a82dedf base::(anonymous namespace)::ThreadFunc()
#24 0x7f9e4a90b494 start_thread
#25 0x7f9e3d8d1a8f clone

 

Comment 1 by mgiuca@chromium.org, Feb 13 2018

Owner: robpercival@chromium.org
Status: Assigned (was: Untriaged)
+robpercival who is the blame line for this check in merkle_audit_proof.cc.

Comment 2 by mgiuca@chromium.org, Feb 13 2018

I ran a git bisect. Currently mysterious.

It had no inconsistencies during the bisect, and identified r536127 as the culprit (a WebRTC roll). However, nothing in there looked like the culprit, and when I manually rolled back to r536126, it still exhibited the bug.

So it's possible this is flaky (or stateful). I'll try another bisect using r536126 as the known-bad.

FYI, the bisect log (good: r536025, bad: r536241):
git bisect start
# good: [1285f6b1e65accbfb5ddebd8aa402d7eaca89226] Put CSSReflectValue into the blink::cssvalue namespace.
git bisect good 1285f6b1e65accbfb5ddebd8aa402d7eaca89226
# bad: [592ce94596639204c1b9e21fa7de7e3953f568b3] Reland "[Chromecast] Add API to get clear audio buffer back"
git bisect bad 592ce94596639204c1b9e21fa7de7e3953f568b3
# bad: [7645aecdf4f5d20b53c1f5efa2925a5772f40f02] Roll AFDO from 66.0.3336.3_rc-r1 to 66.0.3345.0_rc-r1
git bisect bad 7645aecdf4f5d20b53c1f5efa2925a5772f40f02
# good: [2676595f0db73d5f35359dd4a33fd8b98122bd7c] Mark http/tests/devtools/tracing-session-id.js as flaky timeout on Linux MSAN.
git bisect good 2676595f0db73d5f35359dd4a33fd8b98122bd7c
# good: [9585347206297483e4bff722c38b3b9956ec110e] Don't cancel editing when tapping omnibox button
git bisect good 9585347206297483e4bff722c38b3b9956ec110e
# good: [1235ace2707eccd944989d8a234ed7c37d51b45f] [MIPS] Do not use hash-style gnu for MIPS architecture
git bisect good 1235ace2707eccd944989d8a234ed7c37d51b45f
# good: [d784121d952ea5183c187c74f19aa050ad0507a9] "Retry" -> "Try again" in signin error UI on Android
git bisect good d784121d952ea5183c187c74f19aa050ad0507a9
# bad: [6323b52097d1274820b2516932806e99139c27b0] Import wpt@9354b7ca6002e7d6351fa352440e6468e69c9eee
git bisect bad 6323b52097d1274820b2516932806e99139c27b0
# bad: [5819a2a74d936f87e858e374cf5b8a77598d3668] Remove ability to set CSSAdditiveAnimations and WebAnimationsAPI in LayoutTests
git bisect bad 5819a2a74d936f87e858e374cf5b8a77598d3668
# bad: [7c7587e45a1eaa68b74361191a8355059a611804] Roll src/third_party/webrtc/ 0c15a0929..0d3c9a3f2 (3 commits)
git bisect bad 7c7587e45a1eaa68b74361191a8355059a611804
# first bad commit: [7c7587e45a1eaa68b74361191a8355059a611804] Roll src/third_party/webrtc/ 0c15a0929..0d3c9a3f2 (3 commits)

Comment 3 by mgiuca@chromium.org, Feb 13 2018

Another bisect of r536025 to r536126 was completely good.

git bisect start
# good: [1285f6b1e65accbfb5ddebd8aa402d7eaca89226] Put CSSReflectValue into the blink::cssvalue namespace.
git bisect good 1285f6b1e65accbfb5ddebd8aa402d7eaca89226
# bad: [d784121d952ea5183c187c74f19aa050ad0507a9] "Retry" -> "Try again" in signin error UI on Android
git bisect bad d784121d952ea5183c187c74f19aa050ad0507a9
# good: [629599659b030790e7b3669864e399387ddcdb49] Fix failing PageLoadMetrics test.
git bisect good 629599659b030790e7b3669864e399387ddcdb49
# good: [885175c664520b5bf89767efe1b19926a4ac5285] Add test coverage of field trial for Certificate Transparency log auditing
git bisect good 885175c664520b5bf89767efe1b19926a4ac5285
# good: [bef56b40a4aa9461aa8565e1d4c5311495fa42fb] s/InPage/SameDocument in TranslateManagerRenderViewHostTest.
git bisect good bef56b40a4aa9461aa8565e1d4c5311495fa42fb
# good: [1235ace2707eccd944989d8a234ed7c37d51b45f] [MIPS] Do not use hash-style gnu for MIPS architecture
git bisect good 1235ace2707eccd944989d8a234ed7c37d51b45f
# good: [73f220adf7a80a14d0569b08b11fe18f345b497b] Fix PageLoaded callback for failed navigation and load.
git bisect good 73f220adf7a80a14d0569b08b11fe18f345b497b
# good: [9acf995e9af77804c7367e7cb2587962250e8654] [SPv175] Rebaseline several tests
git bisect good 9acf995e9af77804c7367e7cb2587962250e8654
# good: [1f5941a9c1fd1161419eadd7fe688f082fb8a3ae] Upstream IIRFilterNode tests to WPT
git bisect good 1f5941a9c1fd1161419eadd7fe688f082fb8a3ae
# first bad commit: [d784121d952ea5183c187c74f19aa050ad0507a9] "Retry" -> "Try again" in signin error UI on Android

r536126 doesn't seem like the culprit. I'm not sure why I managed to get a failure just once when I hit that CL.

Comment 4 by mgiuca@chromium.org, Feb 13 2018

Synced back to r536127 and it's no longer exhibiting the bug. This is either flaky or stateful, I'm afraid. Bisecting probably won't help.

Comment 5 by mgiuca@chromium.org, Feb 13 2018

Another bisect (r536127 to r536241) ended up producing a dud culprit:

git bisect start
# good: [7c7587e45a1eaa68b74361191a8355059a611804] Roll src/third_party/webrtc/ 0c15a0929..0d3c9a3f2 (3 commits)
git bisect good 7c7587e45a1eaa68b74361191a8355059a611804
# bad: [592ce94596639204c1b9e21fa7de7e3953f568b3] Reland "[Chromecast] Add API to get clear audio buffer back"
git bisect bad 592ce94596639204c1b9e21fa7de7e3953f568b3
# good: [4d405791b5078766032d3adf1098eccdd232c4f0] Makes sure Observer is correctly removed from FocusController
git bisect good 4d405791b5078766032d3adf1098eccdd232c4f0
# bad: [fa7613e0289b9e27506ca5d60798dac798e60c77] Network config: Always show 'Allow shared' toggle for shared networks
git bisect bad fa7613e0289b9e27506ca5d60798dac798e60c77
# bad: [9a2ae8981706de73b33b0f892263693b05e6cfe7] CSS transform WPT: transform not transfor
git bisect bad 9a2ae8981706de73b33b0f892263693b05e6cfe7
# bad: [056d08eee04db3f48c53adb416c7044993ebceab] Updating XTBs based on .GRDs from branch master
git bisect bad 056d08eee04db3f48c53adb416c7044993ebceab
# good: [bf0bf643413ae72d5e49a56698544e7f2cbafc01] [LayoutNG] Implement support for text clip masks
git bisect good bf0bf643413ae72d5e49a56698544e7f2cbafc01
# good: [548258ccd28b657f7573ca832fbfd557c2922601] Async Cookies: Fix globals in idlharness test
git bisect good 548258ccd28b657f7573ca832fbfd557c2922601
# good: [314f9e860a4e6a9c845cc297d5b428469efc06d6] Import wpt@0c89ee84a0b28b31f52a2f34aadb35b1ed581381
git bisect good 314f9e860a4e6a9c845cc297d5b428469efc06d6
# first bad commit: [056d08eee04db3f48c53adb416c7044993ebceab] Updating XTBs based on .GRDs from branch master

I'm giving up.
Status: Started (was: Assigned)
What's strange here is that this CHECK is in a codepath that should only run if the "CertificateTransparencyLogAuditing" feature is enabled (https://cs.chromium.org/chromium/src/components/certificate_transparency/tree_state_tracker.cc?l=33&rcl=350b23cb99ea461720fb920cb2427b9844931a4b), but that feature is disabled by default. I did recently enable it for fieldtrial testing (https://crrev.com/c/873531), but that shouldn't affect your local build, should it?

Regardless of how this feature has ended up enabled for you, I'll look into what could cause this CHECK to fail.

Comment 7 by mge...@chromium.org, Feb 13 2018

Components: -Internals>Network>DNS Internals>Network>CertTrans
Builds with is_chrome_branded=false and fieldtrial_testing_like_official_build=false, which are the defaults, run with everything in the fieldtrial testing config enabled, so that's expected.
Thanks for clarifying that, I was under the impression that it only affected TryBots.
This is quite strange - LogDnsClient::AuditProofQuery checks repeatedly that this precondition isn't violated prior to calling the function that contains this CHECK and should abort (without crashing) if it is (see lines 396 and 406 in https://cs.chromium.org/chromium/src/components/certificate_transparency/log_dns_client.cc?l=396&rcl=f57839b0177ee08a74f55550d35325518bead4d9). I'll continue investigating...
I'm hitting this error in my dev workflow on Ash. Currently my workaround is to blow away the profile with `rm -rf ~/.config/chromium` and chromeOS will open without crashing a few more times.
Here's a profile that crashes if it helps to repro this.
chromium-profile.tar.gz
4.9 MB Download
Thanks. I'm currently attempting to reproduce but haven't had a crash myself so far (despite chrome://net-internals showing this feature is active). It must be non-deterministic :(
That profile doesn't crash for me. Can you provide any extra information that might help me reproduce the crash?
I'm having trouble reproing it now sorry. Restarted my build of chromeOS about a dozen times.

This is my full args.gn file:
is_debug = false
dcheck_always_on = true
enable_nacl = false
is_component_build = true
target_os = "chromeos"
use_goma = true

I've managed to reproduce this issue and identified the cause of the crash. Some code calling LogDnsClient is passing in a pointer to a value in a std::map, which can be invalidated if the std::map is modified before the DNS query completes (https://cs.chromium.org/chromium/src/components/certificate_transparency/single_tree_tracker.cc?l=417&rcl=52d3f11d8bb93b3808fb9d5b9b53530c01ab21fe).
I'm not sure that's actually the problem now, as I've learnt that std::map guarantees that a pointer to an element won't be invalidated unless you remove that element. I'm trying to reproduce the issue again with an asan build but no crashes so far.
No luck reproducing the issue with manual testing, but I've gotten a unit test to crash with the same CHECK fail by simulating critical memory pressure (https://crrev.com/c/939627). 
Project Member

Comment 18 by bugdroid1@chromium.org, Mar 27 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/5f3b3230aa4d2944b3f97ef9441ad74c7a134012

commit 5f3b3230aa4d2944b3f97ef9441ad74c7a134012
Author: Rob Percival <robpercival@chromium.org>
Date: Tue Mar 27 18:41:35 2018

Fix use-after-free bug triggered when memory pressure reaches critical

When memory pressure reaches critical, SingleTreeTracker clears the
pending_entries_ map. However, if an inclusion check is in progress for
one or more of those pending entries, LogDnsClient will have a pointer to
a MerkleAuditProof held in that map. This results in it trying to access
freed memory.

The fix is to cancel all inclusion checks when this happens. This is done
by changing LogDnsClient to provide a "resource handle" when it starts a
query, which can be deleted in order to abort the query. Storing this
in pending_entries_ assures that all inclusion checks will be aborted
when pending_entries_ is cleared.

Bug:  811566 
Change-Id: I86b7ff880c050b790d219fa0cd50b42839bc0d3e
Reviewed-on: https://chromium-review.googlesource.com/939627
Reviewed-by: Ryan Sleevi <rsleevi@chromium.org>
Reviewed-by: Matt Mueller <mattm@chromium.org>
Commit-Queue: Rob Percival <robpercival@chromium.org>
Cr-Commit-Position: refs/heads/master@{#546183}
[modify] https://crrev.com/5f3b3230aa4d2944b3f97ef9441ad74c7a134012/components/certificate_transparency/log_dns_client.cc
[modify] https://crrev.com/5f3b3230aa4d2944b3f97ef9441ad74c7a134012/components/certificate_transparency/log_dns_client.h
[modify] https://crrev.com/5f3b3230aa4d2944b3f97ef9441ad74c7a134012/components/certificate_transparency/log_dns_client_unittest.cc
[modify] https://crrev.com/5f3b3230aa4d2944b3f97ef9441ad74c7a134012/components/certificate_transparency/single_tree_tracker.cc
[modify] https://crrev.com/5f3b3230aa4d2944b3f97ef9441ad74c7a134012/components/certificate_transparency/single_tree_tracker_unittest.cc

Status: Fixed (was: Started)

Sign in to add a comment