
Issue 612844

Starred by 2 users

Issue metadata

Status: Fixed
Owner:
please use my google.com address
Closed: May 2016
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Linux
Pri: 1
Type: Bug




ChannelMojo breaks debug-renderer and debug-webkit-test

Project Member Reported by jbroman@chromium.org, May 18 2016

Issue description

Version: r394418 (d7fe10fcc032b07ab22402fafc17bd31b01622e5)
OS: Linux (Ubuntu trusty)

What steps will reproduce the problem?
(1) Make a debug build (gn: is_debug = true).
(2) Run the debug-renderer script, which uses --renderer-startup-dialog to pause the renderer.

    $ third_party/WebKit/Tools/Scripts/debug-renderer out/Default/content_shell https://www.google.com/

(3) Wait for (gdb) prompt, then run "signal SIGUSR1" to resume.

What is the expected output?
When "signal SIGUSR1" is used to resume the renderer, program execution continues.

What do you see instead?
Program terminates in content::ChildThreadImpl::EnsureConnected (because OnChannelConnected was not called before the timeout).

Locally reverting the change to switch to ChannelMojo (https://chromium.googlesource.com/chromium/src/+/88c27242702c9c3d1fb5cd29d9cd9cf099bfa5ec) fixes the issue, so I assume there is a difference in how the Mojo IPC channel implementation invokes IPC::Listener::OnChannelConnected.
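
For readers unfamiliar with the failure mode, here is a minimal, self-contained sketch of the watchdog pattern described above: a delayed task terminates the child unless the channel connected first. This is a simplified model, not Chromium's code. `FakeTaskRunner`, `ChildModel`, and the timeout values are hypothetical names and numbers chosen for illustration. The model cancels the watchdog with a plain flag, whereas the code quoted later in this thread binds the task through `channel_connected_factory_.GetWeakPtr()`.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <utility>

// Hypothetical stand-in for a task runner, driven by a manually advanced clock
// so the behavior is deterministic.
class FakeTaskRunner {
 public:
  void PostDelayedTask(std::function<void()> task, int delay_seconds) {
    tasks_.emplace(now_ + delay_seconds, std::move(task));
  }

  // Advances the clock and runs every task that has become due, in time order.
  void AdvanceBy(int seconds) {
    now_ += seconds;
    while (!tasks_.empty() && tasks_.begin()->first <= now_) {
      std::function<void()> task = std::move(tasks_.begin()->second);
      tasks_.erase(tasks_.begin());
      task();
    }
  }

 private:
  int now_ = 0;
  std::multimap<int, std::function<void()>> tasks_;
};

// Simplified model of a child process that guards its IPC connection.
class ChildModel {
 public:
  ChildModel(FakeTaskRunner* runner, int timeout_seconds) {
    // Arm the watchdog as soon as the child starts.
    runner->PostDelayedTask([this] { EnsureConnected(); }, timeout_seconds);
  }

  // Models IPC::Listener::OnChannelConnected arriving from the browser.
  void OnChannelConnected() { connected_ = true; }

  bool terminated() const { return terminated_; }

 private:
  void EnsureConnected() {
    // The model only records the outcome; the real process is killed here.
    if (!connected_) terminated_ = true;
  }

  bool connected_ = false;
  bool terminated_ = false;
};
```

In this model, a child whose `OnChannelConnected` never arrives before the deadline ends up with `terminated()` returning true, which corresponds to the SIGTERM in the gdb output below. A child that connects in time is unaffected.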


gdb output:

(gdb) signal SIGUSR1
Continuing with signal SIGUSR1.
[25889:25889:0518/120937:538627849530:INFO:child_thread_impl.cc(723)] ChildThreadImpl::EnsureConnected()
[New Thread 0x7f47c4a6f700 (LWP 25942)]
[New Thread 0x7f47c5270700 (LWP 25941)]
[New Thread 0x7f47c5a71700 (LWP 25940)]
[New Thread 0x7f47c6272700 (LWP 25939)]
[New Thread 0x7f47c6a73700 (LWP 25938)]
[New Thread 0x7f47c7274700 (LWP 25937)]
[New Thread 0x7f47c7a75700 (LWP 25936)]

Program received signal SIGTERM, Terminated.
0x00007f47d24c8fb7 in kill () at ../sysdeps/unix/syscall-template.S:81
81      ../sysdeps/unix/syscall-template.S: No such file or directory.
(gdb) bt 10
#0  0x00007f47d24c8fb7 in kill () at ../sysdeps/unix/syscall-template.S:81
#1  0x00007f47e260f9d9 in base::Process::Terminate (this=0x7ffd4c47ef70, exit_code=0, wait=false) at ../../base/process/process_posix.cc:303
#2  0x00007f47e32a8d21 in content::ChildThreadImpl::EnsureConnected (this=0x2dad05f7ea28) at ../../content/child/child_thread_impl.cc:724
#3  0x00007f47e31fdd50 in base::internal::RunnableAdapter<void (content::GpuWatchdogThread::*)()>::Run<content::GpuWatchdogThread*>(content::GpuWatchdogThread*&&) (this=0x7ffd4c47f170, receiver_ptr=<unknown type in /src/chromium/src/out/Default/./libcontent.so, CU 0x0, DIE 0xe0f5>) at ../../base/bind_internal.h:186
#4  0x00007f47e31fdcb5 in base::internal::InvokeHelper<true, void, base::internal::RunnableAdapter<void (content::GpuWatchdogThread::*)()> >::MakeItSo<base::WeakPtr<content::GpuWatchdogThread>>(base::internal::RunnableAdapter<void (content::GpuWatchdogThread::*)()>, base::WeakPtr<content::GpuWatchdogThread>) (
    runnable=..., weak_ptr=base::WeakPtr((content::RenderThreadImpl *)0x2dad05f7ea28)) at ../../base/bind_internal.h:324
#5  0x00007f47e32af1cd in base::internal::Invoker<base::IndexSequence<0ul>, base::internal::BindState<base::internal::RunnableAdapter<void (content::ChildThreadImpl::*)()>, void (content::ChildThreadImpl*), base::WeakPtr<content::ChildThreadImpl> >, base::internal::InvokeHelper<true, void, base::internal::RunnableAdapter<void (content::ChildThreadImpl::*)()> >, void ()>::Run(base::internal::BindStateBase*) (base=0x2dad05e492c0) at ../../base/bind_internal.h:362
#6  0x00007f47e24d19ee in base::Callback<void (), (base::internal::CopyMode)1>::Run() const (this=0x7ffd4c47f4d0) at ../../base/callback.h:397
#7  0x00007f47e24f730e in base::debug::TaskAnnotator::RunTask (this=0x2dad05e96940, queue_function=0x7f47d7e84362 "TaskQueueManager::PostTask", 
    pending_task=From Init()@../../content/child/child_thread_impl.cc:487 = {...}) at ../../base/debug/task_annotator.cc:51
#8  0x00007f47d7e38fb4 in scheduler::TaskQueueManager::ProcessTaskFromWorkQueue (this=0x2dad05e96820, work_queue=0x2dad05c57da0, 
    out_previous_task=0x7ffd4c47f798) at ../../components/scheduler/base/task_queue_manager.cc:289
Python Exception <class 'gdb.error'> There is no member or method named ticks_.: 
#9  0x00007f47d7e36ef2 in scheduler::TaskQueueManager::DoWork (this=0x2dad05e96820, run_time=..., from_main_thread=true)
    at ../../components/scheduler/base/task_queue_manager.cc:201
(More stack frames follow...)
 

Comment 1 by szager@chromium.org, May 25 2016

Cc: szager@chromium.org
/subscribe

This is making me unhappy.

Comment 2 by kbr@chromium.org, May 25 2016

Cc: kbr@chromium.org
Labels: -Pri-2 ReleaseBlock-Dev M-53 Pri-1
This is urgent. It affects all Chromium developers attempting to debug the product. Raising to P1 and adding a suitably urgent release blocker label.

Comment 3 by szager@chromium.org, May 25 2016

BTW, I tried commenting out this code in child_thread_impl.cc:

  /*
  message_loop_->task_runner()->PostDelayedTask(
      FROM_HERE, base::Bind(&ChildThreadImpl::EnsureConnected,
                            channel_connected_factory_.GetWeakPtr()),
      base::TimeDelta::FromSeconds(connection_timeout));
  */

Did not work:

- The renderer process started up as expected.
- I got the expected prompt from WaitForDebugger.
- I set my breakpoints and did 'continue' in gdb.
- I sent SIGUSR1 to the renderer process, and it was caught by gdb, as expected.
- The renderer entered its main event loop as expected, but it never did anything after that.

My guess, then, is that it's the renderer_host giving up on the child process.  I haven't tried debugging the browser process.

Comment 4 by szager@chromium.org, May 25 2016

And for the record, I also used --disable-hang-monitor.

Comment 5 by roc...@chromium.org, May 26 2016

Sorry for the inconvenience. I have a fix for this.

As a compromise I recently (and dubiously) added some logic to drop pending process connections after a fixed timeout. Aside from being a generally bad idea, this is especially a problem for debugging use cases like this.

I'm going to undo that change.
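
Editorial illustration, not the mojo-edk implementation: a fixed expiry on pending connections interacts badly with a child that is paused in a debugger, and the following sketch shows why. `PendingConnections`, its methods, the token string, and the time values are all hypothetical. The sketch assumes only what the comment above states, namely that pending process connections were dropped after a fixed timeout.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical model of a parent that reserves a slot for each child it
// launches and waits for the child to claim it.
class PendingConnections {
 public:
  // A negative expiry means reservations never expire (the reverted behavior).
  explicit PendingConnections(int expiry_seconds)
      : expiry_seconds_(expiry_seconds) {}

  void Reserve(const std::string& token, int now) { reserved_at_[token] = now; }

  // Called when the child finally connects. Returns true only if the
  // reservation exists and has not outlived the expiry.
  bool Claim(const std::string& token, int now) {
    auto it = reserved_at_.find(token);
    if (it == reserved_at_.end()) return false;
    const bool expired =
        expiry_seconds_ >= 0 && now - it->second > expiry_seconds_;
    reserved_at_.erase(it);
    return !expired;
  }

 private:
  int expiry_seconds_;
  std::map<std::string, int> reserved_at_;
};
```

With an expiry of 10, a child that claims its reservation at time 60 (for example, after sitting at --renderer-startup-dialog) is refused, while the same claim succeeds when the expiry is disabled. That matches the symptom in Comment 3, where the renderer reached its event loop but never did anything afterwards.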

Comment 6 by bugdroid1@chromium.org, May 26 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/2e82ffbd8d4dcf137b893f03edd945fff4a03f9a

commit 2e82ffbd8d4dcf137b893f03edd945fff4a03f9a
Author: rockot <rockot@chromium.org>
Date: Thu May 26 03:37:35 2016

[mojo-edk] Revert port reservation timeout behavior

This was a bad idea and it makes debugging harder.

Rather than increase the timeout, this reverts the behavior
altogether. We can add an embedder API to explicitly signal
process death in the future.

BUG= 612844 
R=amistry@chromium.org

Review-Url: https://codereview.chromium.org/2010783004
Cr-Commit-Position: refs/heads/master@{#396103}

[modify] https://crrev.com/2e82ffbd8d4dcf137b893f03edd945fff4a03f9a/mojo/edk/system/node_controller.cc
[modify] https://crrev.com/2e82ffbd8d4dcf137b893f03edd945fff4a03f9a/mojo/edk/system/node_controller.h

Comment 7 by roc...@chromium.org, May 26 2016

Status: Fixed (was: Assigned)
