Chrome crashes while running Connectivity diagnostics on Proxy network configured with invalid values - extensions::TCPSocket::OnConnectComplete |
|||||||||||||||||||||
Issue descriptionChrome Version: <From about:version: Google Chrome 70.0.3538.14> Chrome OS Version: <From about:version: Platform 11021.11.0> Chrome OS Platform: Coral/Santa Please specify Cr-* of the system to which this bug/feature applies (add the label below). Steps To Reproduce: (1)Connect to a proxy network with invalid proxy values. (2)Open any website eg., cnn.com and wait for the page to load. (3)Throws an error and gives a run connectivity diagnostics option. (4)Click on run diagnostics (5)Screen goes black, and chrome browser crashes every time. Expected Result: chrome browser should not crash. Actual Result: chrome browser crashes. How frequently does this problem reproduce? (Always, sometimes, hard to reproduce?) Always Feedback report@ https://listnr.corp.google.com/report/85652614665
,
Sep 11
Seeing this on Cyan as well.
,
Sep 11
,
Sep 12
abodenha@ - please help find an owner for this.
,
Sep 12
There's no client ID in the feedback report. Can you get an ID from chrome://crashes on this?
,
Sep 12
Crash report IDs : e76f4428e7cf5b71 and 0b6020caef52f689
,
Sep 12
Looks like: Thread 3 (id: 0x549) CRASHED [SIGSEGV /SEGV_MAPERR @ 0x00000008 ] MAGIC SIGNATURE THREAD Stack Quality100%Show frame trust levels 0x00005c91498714c1 (chrome -./../../../../../../../home/chrome-bot/chrome_root/src/base/callback.h:99 ) extensions::TCPSocket::OnConnectComplete(int, base::Optional<net::IPEndPoint> const&, base::Optional<net::IPEndPoint> const&, mojo::ScopedHandleBase<mojo::DataPipeConsumerHandle>, mojo::ScopedHandleBase<mojo::DataPipeProducerHandle>) 0x00005c914987590a (chrome -./../../../../../../../home/chrome-bot/chrome_root/src/base/bind_internal.h:516 ) base::internal::Invoker<base::internal::BindState<void (cast_channel::CastSocketImpl::*)(int, base::Optional<net::IPEndPoint> const&, base::Optional<net::IPEndPoint> const&, mojo::ScopedHandleBase<mojo::DataPipeConsumerHandle>, mojo::ScopedHandleBase<mojo::DataPipeProducerHandle>), base::WeakPtr<cast_channel::CastSocketImpl> >, void (int, base::Optional<net::IPEndPoint> const&, base::Optional<net::IPEndPoint> const&, mojo::ScopedHandleBase<mojo::DataPipeConsumerHandle>, mojo::ScopedHandleBase<mojo::DataPipeProducerHandle>)>::RunOnce(base::internal::BindStateBase*, int, base::Optional<net::IPEndPoint> const&, base::Optional<net::IPEndPoint> const&, mojo::ScopedHandleBase<mojo::DataPipeConsumerHandle>&&, mojo::ScopedHandleBase<mojo::DataPipeProducerHandle>&&) 0x00005c91498760f4 (chrome -./../../../../../../../home/chrome-bot/chrome_root/src/base/callback.h:99 ) base::internal::Invoker<base::internal::BindState<base::OnceCallback<void (int, base::Optional<net::IPEndPoint> const&, base::Optional<net::IPEndPoint> const&, mojo::ScopedHandleBase<mojo::DataPipeConsumerHandle>, mojo::ScopedHandleBase<mojo::DataPipeProducerHandle>)>, int, base::Optional<net::IPEndPoint>, base::Optional<net::IPEndPoint>, mojo::ScopedHandleBase<mojo::DataPipeConsumerHandle>, mojo::ScopedHandleBase<mojo::DataPipeProducerHandle> >, void ()>::RunOnce(base::internal::BindStateBase*) 0x00005c91481ed677 (chrome -./../../../../../../../home/chrome-bot/chrome_root/src/base/callback.h:99 ) base::MessageLoop::DoWork() 0x00005c91481fb1cb (chrome -./../../../../../../../home/chrome-bot/chrome_root/src/base/message_loop/message_pump_libevent.cc:210 ) base::MessagePumpLibevent::Run(base::MessagePump::Delegate*) 0x00005c914ad5b403 (chrome -./../../../../../../../home/chrome-bot/chrome_root/src/base/run_loop.cc:102 ) <name omitted> 0x00005c9149073633 (chrome -./../../../../../../../home/chrome-bot/chrome_root/src/base/threading/thread.cc:262 ) content::BrowserProcessSubThread::IOThreadRun(base::RunLoop*) 0x00005c914ad8853e (chrome -./../../../../../../../home/chrome-bot/chrome_root/src/base/threading/thread.cc:357 ) base::Thread::ThreadMain() 0x00005c914adbbd9b (chrome -./../../../../../../../home/chrome-bot/chrome_root/src/base/threading/platform_thread_posix.cc:76 ) base::(anonymous namespace)::ThreadFunc(void*) 0x00007d16265f52b7 (libpthread-2.23.so -pthread_create.c:333 ) start_thread 0x00007d1625abbfac (libc-2.23.so + 0x000f6fac ) clone Bumping to P1 and marking RBS since we have a reliable repro. Untriaging so the right teams can pick it up.
,
Sep 15
,
Oct 1
This bug does not have an owner and is marked RBS. Please update with the plan and ETA to fix the issue or remove the RBS label if this is no longer a blocker. Thanks.
,
Oct 4
Ping. We are nearing M70 Stable. What is the latest on this?
,
Oct 4
,
Oct 9
Ping, is this still happening on the latest Beta? M70 stable is a week out.
,
Oct 9
Still see the issue on latest M70 - 11021.43.0, 70.0.3538.53, on Kevin.
,
Oct 9
,
Oct 9
,
Oct 9
mmenke@ - are you the right owner for this?
,
Oct 9
No. That diagnostics was added by the ChromeOS team. I'm completely unfamiliar with it.
,
Oct 9
,
Oct 9
OH, sorry, this is likely a regression due to https://chromium-review.googlesource.com/c/chromium/src/+/1070434 or subsequent CLs. [+morlovich]: You reviewed that CL, mind taking this one? Happy to help poke at it.
,
Oct 10
Does this fix need to be in M70? We are less than a week from Stable.
,
Oct 11
One plausible reason may be a call to Disconnect while Connect is still pending; though I am not certain if the extension API would permit that.
,
Oct 11
Hmm, can't reproduce with a trunk CromeOS build on Linux. I wonder if I need anything specific for "invalid proxy values".
,
Oct 11
Looks like I would need to install the extension manually somehow to stand a chance of reproducing..
,
Oct 11
I am able to reproduce this with just connecting to a proxy network, without any manual proxy configuration.
,
Oct 11
Basically on my end I am having trouble getting the relevant extension to run, as I am trying to test on Linux (following https://chromium.googlesource.com/chromium/src/+/master/docs/chromeos_build_instructions.md), not ChromeOS proper. I would very much appreciate any advice you can offer for reproducing this from a source build.
,
Oct 11
https://chromium.googlesource.com/chromiumos/docs/+/master/cros_vm.md has instructions for launching and debugging with Chrome OS in a VM.
,
Oct 11
Thanks, can run a source build of CrOS via that; but still not able to reproduce it by running diagnostics; do I need fancier proxy config than 127.0.0.1:<random port>?
,
Oct 16
Ping on this.
,
Oct 16
Hi, this issue happens even if i don't set any manual config for proxy. Just connect to a proxy network with direct setting also causes browser crash.
,
Oct 17
No luck reproducing it, still :(
,
Oct 17
[morlovich]: TCPSocket::Disconnect() doesn't release weak pointers, so if something calls Disconnect() where the socket lives, when there's a Connect() call running on the UI thread, and the TCPSocket is not immediately destroyed...We end up with exactly that crash, when connection succeeds on the UI thread, and posts a task over to the IO thread to call into the destroyed connect_callback_, right? Or am I missing something?
,
Oct 17
Also, looks like we're seeing this crash on more than just ChromeOS, though it may be the most common there, relative to other crashes.
,
Oct 17
,
Oct 17
re: disconnect: I had that same impression (see comment 21), but... the code before the big rewrite: https://cs.chromium.org/chromium/src/extensions/browser/api/socket/tcp_socket.cc?q=tcp_socket.cc&sq=package:chromium&dr&rcl=3149a28489d22a482d963cffb49fa2e11a842f94 ...looks like it would behave the same --- look at all the references to connect_callback_. Hmm, maybe the explicit copy+reset has has different sequencing than the move, though? (Plus I don't know how the extensions layer sequences things). Maybe I should just try testcasing it, either way. re: non-ChromeOS: I just imagine the diagnostics extension is the easiest way of triggering this bug (except for the part where it refuses to do it on my source builds running ChromeOS vms)
,
Oct 17
The old code called Disconnect on net::TCPClientSocket, which would prevent the connect callback from being invoked, and had no PostTasks, so I don't think there was a race here before. If connection were completed already, the consumer's callback would already have been invoked. And if it weren't, disconnecting the socket would prevent the callback to invoke the consumer's callback would not be invoked.
,
Oct 17
Aha, thanks. That certainly makes things make a lot more sense, and makes me a lot more comfortable about changing things w/o being able to reproduce the original report in a dev environment. Since OnConnectCompleteOnUIThread and OnListenCompleteOnUIThread are static, and things other than Connect and Listen don't use the weak pointer factory, sure sounds like clearing it would be the right thing to do, too.
,
Oct 17
Another option would be to just null-check the connect callback, and if it's null, do nothing in the connect complete callback, dropping all arguments on the ground. Think dropping weak pointers is cleaner, though it is perhaps more regression prone.
,
Oct 17
I don't like the null check option since in case of a connect -> disconnect -> connect scenario you may end up with a non-null connect_callback_ when delivering results of the first request, while weak pointers would do the right thing.
,
Oct 17
I didn't realize these were reusable. If they are, yea, a null check certainly doesn't work.
,
Oct 18
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/9566887be99d822632aa4429afb768406f749a23 commit 9566887be99d822632aa4429afb768406f749a23 Author: Maks Orlovich <morlovich@chromium.org> Date: Thu Oct 18 20:29:36 2018 extensions/tcp_socket: Fix crash on disconnect with Connect/Listen pending Bug: 882585 Change-Id: Ic740f1ccd310ba5fc23f7bbb6f6bb40d6a3c59c2 Reviewed-on: https://chromium-review.googlesource.com/c/1287069 Reviewed-by: Ken Rockot <rockot@google.com> Commit-Queue: Maks Orlovich <morlovich@chromium.org> Cr-Commit-Position: refs/heads/master@{#600887} [modify] https://crrev.com/9566887be99d822632aa4429afb768406f749a23/chrome/browser/extensions/api/socket/tcp_socket_unittest.cc [modify] https://crrev.com/9566887be99d822632aa4429afb768406f749a23/extensions/browser/api/socket/tcp_socket.cc
,
Oct 19
morlovich@, is this fixed with #41?
,
Oct 19
Should be.
,
Oct 19
[Auto-generated comment by a script] We noticed that this issue is targeted for M-70; it appears the fix may have landed after branch point, meaning a merge might be required. Please confirm if a merge is required here - if so add Merge-Request-70 label, otherwise remove Merge-TBD label. Thanks.
,
Oct 22
It would be good to hand-test the issue once its rolled into a ChromeOS dev build. I am not certain as to whether the bar is there for including into M70 respins (besides needing testing in dev; whom should I ask?), but 71 hasn't hit the beta stage yet, and it seems important enough for that based on just API-level testcase.
,
Oct 22
jmuppala@ - the fix should be in M72 (11177.0.0 or newer CrOS builds). Please give it a try and update this bug with your observation. morlovich@ - after jmuppala@ confirms the crash is not seen after the fix on ToT, this can be merged back.
,
Oct 22
Connectivity diagnostics ran successfully on Samus 11182.0.0, 72.0.3587.0. However, it took 10 min to complete the diagnostics. Is this expected?
,
Oct 23
You'll have to ask someone familiar with the app, I'm afraid.
,
Oct 23
Hi, we shouldn't merge to M71 for the sake of testing, if that's what's implied in #45. We should only merge when the fix has been verified in ToT/M72, as indicated in #46. Does the 10m completion time block the merge? Concerning?
,
Oct 23
I felt that merge to 71 is justified based on what it fixes in the extension API (null-pointer deref crash with certain calls) given my complexity/risk judgement of the CL, regardless of whether it fixes the connectivity test extension as expected or not. The latter is of course a big factor in whether touching M70 may be justified or not.
,
Oct 23
Per #47, we already have confirmation the diagnostics tool doesn't crash with the fix.
,
Oct 23
Your change meets the bar and is auto-approved for M71. Please go ahead and merge the CL to branch 3578 manually. Please contact milestone owner if you have questions. Owners: benmason@(Android), kariahda@(iOS), kbleicher@(ChromeOS), govind@(Desktop) For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
,
Oct 23
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/27c4668846990d29552aaa12bd47e57da8b8b95a commit 27c4668846990d29552aaa12bd47e57da8b8b95a Author: Maks Orlovich <morlovich@chromium.org> Date: Tue Oct 23 15:19:55 2018 extensions/tcp_socket: Fix crash on disconnect with Connect/Listen pending (M71 merge) Bug: 882585 Change-Id: Ic740f1ccd310ba5fc23f7bbb6f6bb40d6a3c59c2 Reviewed-on: https://chromium-review.googlesource.com/c/1287069 Reviewed-by: Ken Rockot <rockot@google.com> Commit-Queue: Maks Orlovich <morlovich@chromium.org> Cr-Original-Commit-Position: refs/heads/master@{#600887}(cherry picked from commit 9566887be99d822632aa4429afb768406f749a23) Reviewed-on: https://chromium-review.googlesource.com/c/1296617 Reviewed-by: Maks Orlovich <morlovich@chromium.org> Cr-Commit-Position: refs/branch-heads/3578@{#263} Cr-Branched-From: 4226ddf99103e493d7afb23a4c7902ee496108b6-refs/heads/master@{#599034} [modify] https://crrev.com/27c4668846990d29552aaa12bd47e57da8b8b95a/chrome/browser/extensions/api/socket/tcp_socket_unittest.cc [modify] https://crrev.com/27c4668846990d29552aaa12bd47e57da8b8b95a/extensions/browser/api/socket/tcp_socket.cc
,
Oct 23
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/27c4668846990d29552aaa12bd47e57da8b8b95a Commit: 27c4668846990d29552aaa12bd47e57da8b8b95a Author: morlovich@chromium.org Commiter: morlovich@chromium.org Date: 2018-10-23 15:19:55 +0000 UTC extensions/tcp_socket: Fix crash on disconnect with Connect/Listen pending (M71 merge) Bug: 882585 Change-Id: Ic740f1ccd310ba5fc23f7bbb6f6bb40d6a3c59c2 Reviewed-on: https://chromium-review.googlesource.com/c/1287069 Reviewed-by: Ken Rockot <rockot@google.com> Commit-Queue: Maks Orlovich <morlovich@chromium.org> Cr-Original-Commit-Position: refs/heads/master@{#600887}(cherry picked from commit 9566887be99d822632aa4429afb768406f749a23) Reviewed-on: https://chromium-review.googlesource.com/c/1296617 Reviewed-by: Maks Orlovich <morlovich@chromium.org> Cr-Commit-Position: refs/branch-heads/3578@{#263} Cr-Branched-From: 4226ddf99103e493d7afb23a4c7902ee496108b6-refs/heads/master@{#599034}
,
Nov 1
Confirmed on 71.0.3578.27 that's the crash is gone, but that the firewall test step takes forever, on same device/network that seemed to do it fine on M69.
,
Nov 1
Fails as well, actually --- looks like it likely needs further debugging; sadly it doesn't look like the usual extension debugging methods work with a bundled-in extension.
,
Dec 13
|
|||||||||||||||||||||
►
Sign in to add a comment |
|||||||||||||||||||||
Comment 1 by jmuppala@chromium.org
, Sep 11