New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 671662 link

Starred by 3 users

Issue metadata

Status: Archived
Owner:
Closed: Dec 2016
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 2
Type: Bug

Blocking:
issue 669097



Sign in to add a comment

Consider making session_manager emit login-prompt-visible Upstart signal asynchronously

Project Member Reported by derat@chromium.org, Dec 6 2016

Issue description

session_manager currently makes a synchronous D-Bus method call to Upstart to ask it to emit the login-prompt-visible signal:

----

session_manager_impl.cc:

void SessionManagerImpl::EmitLoginPromptVisible(Error* error) {
  login_metrics_->RecordStats("login-prompt-visible");
  dbus_emitter_->EmitSignal(kLoginPromptVisibleSignal);
  if (!init_controller_->TriggerImpulse("login-prompt-visible", {})) {
    static const char msg[] =
        "Emitting login-prompt-visible upstart signal failed.";
    LOG(ERROR) << msg;
    error->Set(dbus_error::kEmitFailed, msg);
  }
}

----

upstart_signal_emitter.cc:

const char UpstartSignalEmitter::kServiceName[] = "com.ubuntu.Upstart";
const char UpstartSignalEmitter::kPath[] = "/com/ubuntu/Upstart";
const char UpstartSignalEmitter::kInterface[] = "com.ubuntu.Upstart0_6";
const char UpstartSignalEmitter::kMethodName[] = "EmitEvent";

...

scoped_ptr<dbus::Response> UpstartSignalEmitter::EmitSignal(
    const std::string& signal_name,
    const std::vector<std::string>& args_keyvals) {
  DLOG(INFO) << "Emitting " << signal_name << " Upstart signal";
  
  dbus::MethodCall method_call(UpstartSignalEmitter::kInterface,
                               UpstartSignalEmitter::kMethodName);
  dbus::MessageWriter writer(&method_call);
  writer.AppendString(signal_name);
  writer.AppendArrayOfStrings(args_keyvals);
  writer.AppendBool(true);
  
  return upstart_dbus_proxy_->CallMethodAndBlock(
              &method_call, dbus::ObjectProxy::TIMEOUT_USE_DEFAULT);
}

----

We've sometimes seen Chrome's D-Bus method call asking session_manager to emit the signal time out (issue 631640, issue 669097), and one theory is that session_manager's call to Upstart is blocking on one or more login-prompt-visible-triggered jobs. Attempts to determine which job(s) are responsible haven't been successful, but Mattias suggested that we could make session_manager call Upstart asynchronously and either:

a) reply immediately to Chrome's request, or
b) reply asynchronously to Chrome's request after Upstart replies to session_manager.

I don't think that we'd get any additional benefit from b). session_manager at least wouldn't be blocked while waiting on Upstart, but from looking at chromeos/dbus/session_manager_client.cc, Chrome is already just making an async call to session_manager and discarding the response.

Given that Chrome doesn't care about waiting for the signal, I'm pretty comfortable switching session_manager's call to be async (as long as we don't do it right before a branch point). :-P

There's a risk that other parts of session_manager could be dependent on the login-prompt-visible signal being fully processed before moving on, but after looking at the dependent jobs (boot-complete, cups-clear-state, send-uptime-metrics, ui-init-late) that seemed unlikely to Mattias and me.
 

Comment 1 by derat@chromium.org, Dec 6 2016

Blocking: 669097

Comment 2 by derat@chromium.org, Dec 6 2016

Cc: sabin.fl...@intel.com jrbarnette@chromium.org
Labels: -OS-Linux -M-56 M-57
I've uploaded changes:

https://chromium-review.googlesource.com/#/c/417126/
https://chromium-review.googlesource.com/#/c/417181/

(The systemd portion of the second change is untested.)
Project Member

Comment 3 by bugdroid1@chromium.org, Dec 7 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/platform2/+/2e22a9bec6e886e89e472e906ce74b8e6ae58dd4

commit 2e22a9bec6e886e89e472e906ce74b8e6ae58dd4
Author: Daniel Erat <derat@chromium.org>
Date: Tue Dec 06 17:54:46 2016

login: Make UpstartSignalEmitter::EmitSignal private.

UpstartSignalEmitter's EmitSignal method is only called by
the same class's TriggerImpulse method, and overridden (but
not called) by the derived StubUpstartSignalEmitter class.
Make it be private rather than public.

BUG= chromium:671662 
TEST=compiles and unit tests still pass

Change-Id: If7fdf9c383e2de2fd608550b7c85561ae1b03b58
Reviewed-on: https://chromium-review.googlesource.com/417126
Commit-Ready: Dan Erat <derat@chromium.org>
Tested-by: Dan Erat <derat@chromium.org>
Reviewed-by: Mattias Nissler <mnissler@chromium.org>

[modify] https://crrev.com/2e22a9bec6e886e89e472e906ce74b8e6ae58dd4/login_manager/stub_upstart_signal_emitter.h
[modify] https://crrev.com/2e22a9bec6e886e89e472e906ce74b8e6ae58dd4/login_manager/upstart_signal_emitter.cc
[modify] https://crrev.com/2e22a9bec6e886e89e472e906ce74b8e6ae58dd4/login_manager/upstart_signal_emitter.h

Project Member

Comment 4 by bugdroid1@chromium.org, Dec 7 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/platform2/+/cc827123b9ccc0a686698bb83d6ae8adbff4b863

commit cc827123b9ccc0a686698bb83d6ae8adbff4b863
Author: Daniel Erat <derat@chromium.org>
Date: Tue Dec 06 18:29:40 2016

login: Emit login-prompt-visible Upstart signal async.

Make session_manager's D-Bus method call to Upstart to emit
the login-prompt-visible signal be asynchronous. This is a
speculative fix for cases where session_manager appears to
block on this call, possibly due to other Upstart jobs
hanging at startup. Continue to use synchronous requests for
all other Upstart signals.

I'm also updating systemd to support starting units
asynchronously, although I'm unsure of whether those D-Bus
calls to systemd are ever expected to block.

BUG= chromium:671662 
TEST=unit tests pass; also deployed to a samus device, ran
     "initctl log-priority info", and verified that that
     emit-login-prompt is still emitted and that boot and
     login appears to work as before

Change-Id: I661a9dc03c2430622a85ce4d51c47ce22ed5d63e
Reviewed-on: https://chromium-review.googlesource.com/417181
Commit-Ready: Dan Erat <derat@chromium.org>
Tested-by: Dan Erat <derat@chromium.org>
Reviewed-by: Mattias Nissler <mnissler@chromium.org>

[modify] https://crrev.com/cc827123b9ccc0a686698bb83d6ae8adbff4b863/login_manager/init_daemon_controller.h
[modify] https://crrev.com/cc827123b9ccc0a686698bb83d6ae8adbff4b863/login_manager/upstart_signal_emitter.h
[modify] https://crrev.com/cc827123b9ccc0a686698bb83d6ae8adbff4b863/login_manager/session_manager_impl.cc
[modify] https://crrev.com/cc827123b9ccc0a686698bb83d6ae8adbff4b863/login_manager/session_manager_impl_unittest.cc
[modify] https://crrev.com/cc827123b9ccc0a686698bb83d6ae8adbff4b863/login_manager/upstart_signal_emitter.cc
[modify] https://crrev.com/cc827123b9ccc0a686698bb83d6ae8adbff4b863/login_manager/systemd_unit_starter.cc
[modify] https://crrev.com/cc827123b9ccc0a686698bb83d6ae8adbff4b863/login_manager/session_manager_dbus_adaptor.cc
[modify] https://crrev.com/cc827123b9ccc0a686698bb83d6ae8adbff4b863/login_manager/stub_upstart_signal_emitter.h
[modify] https://crrev.com/cc827123b9ccc0a686698bb83d6ae8adbff4b863/login_manager/session_manager_impl.h
[modify] https://crrev.com/cc827123b9ccc0a686698bb83d6ae8adbff4b863/login_manager/systemd_unit_starter.h

Cc: windmueller@google.com
Just checking on this bug since it is blocking https://bugs.chromium.org/p/chromium/issues/detail?id=669097 - This seems to have a huge effect on the VC fleet at Google. https://screenshot.googleplex.com/O9PvVOcYCSo - We seemed to lose about 300 units over the weekend and techs have to go to each room to reset them. Can we get an overview of current troubleshooting and if possible an eta of fix?

Comment 6 by derat@chromium.org, Dec 13 2016

Status: Fixed (was: Started)
I checked in this speculative workaround for the Upstart-job-is-hanging theory. I don't think I've seen any solid repro steps for issue 669097, though.

Comment 7 by roy...@google.com, Dec 13 2016

How risky is this change ? 
- Can we merge it to 56 or 55 ?

Comment 8 by derat@chromium.org, Dec 13 2016

I haven't heard of any problems but feel like it's fairly risky. I'd prefer not to merge it unless we have reason to believe that it actually fixes the problem.

(I think that Mattias and I agree that it's probably the right thing to do, but it's still just based on speculation that an Upstart job may be hanging... and we still don't have any ideas about exactly which job is hanging or why; this just makes session_manager not block on Upstart.)

Comment 9 by derat@chromium.org, Dec 13 2016

Cc: xiy...@chromium.org
Xiyuan mentioned that state keys are sent to session_manager in response to the login-prompt-visible signal:

- ui-init-late.conf is "start on login-prompt-visible or stopped ui-collect-machine-info" and runs the ui-init-late script.
- ui-init-late uses dbus-send to make a D-Bus call to session_manager's InitMachineInfo method (but only if /run/session_manager/machine-info exists).

I was initially thinking that we could have a circular dependency here:

- session_manager blocks on its call to Upstart to emit login-prompt-visible.
- That signal runs ui-init-late, which tries to call session_manager but times out since session_manager is still blocked.

If ui-collect-machine-info typically only completes after login-prompt-visible, that would explain why we don't usually hit this.

However, after some more testing (with #4 reverted), I don't think that this theory holds water:

- ui-init-late's dbus-send call doesn't use --print-reply, so it just queues the method call instead of blocking on it, which means that ui-init-late returns even if session_manager is blocked.
- Even when I make ui-init-late use --print-reply, session_manager's call to Upstart isn't blocked on this call finishing -- I see session_manager's call complete first, and then ui-init-late call completes.

So I still don't see any evidence that this fixes the CfM issue. :-(
Labels: VerifyIn-61

Comment 11 by dchan@chromium.org, Jan 22 2018

Status: Archived (was: Fixed)

Sign in to add a comment