session_manager does not restart chrome after killing an unresponsive browser |
|||
Issue descriptionTo repo, you can run desktopui_Hangdetector, or manually do what it's doing: 1. touch /run/session_manager/enable_hang_detection && restart ui 2. pstree -p to locate the main browser process which is the child of the session_manager 3. kill -SIGSTOP <browser_pid> 4. tail -f /var/log/messages looking for something like: 2018-01-02T13:19:46.173981-08:00 WARNING session_manager[8546]: [WARNING:liveness_checker_impl.cc(64)] Browser hang detected! 2018-01-02T13:19:46.174013-08:00 WARNING session_manager[8546]: [WARNING:liveness_checker_impl.cc(68)] Aborting browser process. 2018-01-02T13:19:46.174153-08:00 INFO session_manager[8546]: [INFO:browser_job.cc(167)] Terminating process: Browser did not respond to DBus liveness check. 2018-01-02T13:19:46.174186-08:00 INFO session_manager[8546]: [INFO:system_utils_impl.cc(93)] Sending 6 to 8568 as 1000 5. kill -SIGCONT <browser_pid> At this point, session_manager will not have launched a new browser. pstree -p confirms that session_manager has no children. To reset your DUT: 6. rm /run/session_manager/enable_hang_detection && restart ui
,
Jan 2 2018
Can you try desktopui_HangDetector and see what happens? I tried it on a downloaded amd64-generic VM, which has no OS/browser modifications and was built by a bot, and can see the problem there as well.
,
Jan 2 2018
I just ran "test_that --board=cave chromeos2-row8-rack6-host11.cros desktopui_HangDetector" and saw the test pass.
,
Jan 2 2018
I haven't done any modifications to chrome. I noticed this on my DUT when I was working on crbug.com/793880 . At the end of desktopui_HangDetector test runs, I had to call restart ui to get chrome running again. I can dig a little more into this later this week.
,
Jan 2 2018
CHROMEOS_RELEASE_BUILDER_PATH=cave-paladin/R65-10273.0.0-rc1 CHROMEOS_RELEASE_VERSION=10273.0.0-rc1
,
Jan 2 2018
Yup, the test passes. At the end of the test, we make sure session_manager is still running and hasn't restarted, which it is and hasn't, but we never check to ensure that a new browser has launched. This should probably be added to the test?
,
Jan 2 2018
Do you mean that the test is passing but leaving the system in a state where Chrome is unusable? (I don't know anything about this test; it looks like it was added by cmasone@ back in 2012 and essentially not touched after that.)
,
Jan 2 2018
Isn't this line waiting for Chrome?
# Wait for new browser to come up.
login.wait_for_browser(
,
Jan 2 2018
You're right - I don't know why the test is passing.
I just ran the test again. Here's my /var/log/messages:
2018-01-02T14:36:52.634170-08:00 INFO session_manager[3105]: [INFO:browser_job.cc(147)] Running child /opt/google/chrome/chrome --ppapi-flash-path=/opt/google/chrome/pepper/libpepflashplayer.so --ppapi-flash-version=28.0.0.137 --ui-prioritize-in-gpu-process --use-gl=egl --enable-native-gpu-memory-buffers --gpu-sandbox-failures-fatal=yes --enable-logging --log-level=1 --use-cras --enable-wayland-server --user-data-dir=/home/chronos --max-unused-resource-memory-usage-percentage=5 --system-developer-mode --login-profile=user --has-chromeos-keyboard --default-wallpaper-large=/usr/share/chromeos-assets/wallpaper/oem_large.jpg --default-wallpaper-small=/usr/share/chromeos-assets/wallpaper/oem_small.jpg --default-wallpaper-is-oem --guest-wallpaper-large=/usr/share/chromeos-assets/wallpaper/guest_large.jpg --guest-wallpaper-small=/usr/share/chromeos-assets/wallpaper/guest_small.jpg --enable-consumer-kiosk --enterprise-enrollment-initial-modulus=15 --enterprise-enrollment-modulus-limit=19 --login-manager --vmodule=automatic_reboot_manager=1,tablet_power_button_controller=1,*chromeos/login/*=1,auto_enrollment_controller=1,*plugin*=2,*zygote*=1,*/ui/ozone/*=1,*/ui/display/manager/chromeos/*=1,*night_light*=1,power_button_observer=2,webui_login_view=2,lock_state_controller=2,webui_screen_locker=2,screen_locker=2
2018-01-02T14:36:52.634743-08:00 INFO session_manager[3105]: [INFO:session_manager_service.cc(239)] Browser is 3128
2018-01-02T14:36:53.663171-08:00 INFO session_manager[3105]: [INFO:upstart_signal_emitter.cc(35)] Emitting login-prompt-visible Upstart signal
2018-01-02T14:36:53.704676-08:00 INFO session_manager[3105]: [INFO:server_backed_state_key_generator.cc(126)] Stable device secret missing!
2018-01-02T14:37:00.562519-08:00 DEBUG kernel: [ 791.975201] ieee80211 phy0: device no longer idle - scanning
2018-01-02T14:37:02.645318-08:00 WARNING session_manager[3105]: [WARNING:liveness_checker_impl.cc(64)] Browser hang detected!
2018-01-02T14:37:02.645357-08:00 WARNING session_manager[3105]: [WARNING:liveness_checker_impl.cc(68)] Aborting browser process.
2018-01-02T14:37:02.645479-08:00 INFO session_manager[3105]: [INFO:browser_job.cc(164)] Terminating process: Browser did not respond to DBus liveness check.
2018-01-02T14:37:02.645512-08:00 INFO session_manager[3105]: [INFO:system_utils_impl.cc(93)] Sending 6 to 3128 as 1000
2018-01-02T14:37:05.399980-08:00 WARNING crash_reporter[3306]: [user] Received crash notification for chrome[3128] sig 6, user 1000 (developer build - not testing - always dumping)
2018-01-02T14:37:05.401517-08:00 INFO crash_reporter[3306]: State of crashed process [3128]: S (sleeping)
2018-01-02T14:37:05.406334-08:00 INFO metrics_daemon[2547]: [INFO:metrics_daemon.cc(426)] Got org.chromium.CrashReporter.UserCrash D-Bus signal
2018-01-02T14:37:05.411332-08:00 NOTICE root[3317]: autotest finished iteration /usr/local/autotest/results/default/desktopui_HangDetector/sysinfo/iteration.1
2018-01-02T14:37:05.636314-08:00 INFO kernel: [ 797.050043] tpm_tis tpm_tis: command 0x65 (size 20) returned code 0x0
2018-01-02T14:37:05.672311-08:00 INFO kernel: [ 797.085978] tpm_tis tpm_tis: command 0x65 (size 22) returned code 0x0
2018-01-02T14:37:05.708313-08:00 INFO kernel: [ 797.122253] tpm_tis tpm_tis: command 0x65 (size 22) returned code 0x0
2018-01-02T14:37:05.737314-08:00 INFO kernel: [ 797.151021] tpm_tis tpm_tis: command 0x15 (size 14) returned code 0x0
2018-01-02T14:37:05.767319-08:00 INFO kernel: [ 797.181010] tpm_tis tpm_tis: command 0x15 (size 14) returned code 0x0
2018-01-02T14:37:12.085778-08:00 ERR crash_reporter[3306]: Could not open core file /var/spool/crash/chrome.20180102.143705.3128.core: No such file or directory
2018-01-02T14:37:12.086135-08:00 INFO crash_reporter[3306]: Writing conversion problems as separate crash report.
2018-01-02T14:37:13.144074-08:00 INFO session_manager[3105]: [INFO:browser_job.cc(180)] Cleaned up child 3128
2018-01-02T14:37:13.144231-08:00 ERR session_manager[3105]: [ERROR:object_proxy.cc(582)] Failed to call method: org.chromium.LibCrosServiceInterface.CheckLiveness: object_path= /org/chromium/LibCrosService: org.freedesktop.DBus.Error.NoReply: Message recipient disconnected from message bus without replying
localhost tests # pstree -p | grep session
|-session_manager(3105)---crash_reporter(3304)
pgrep chrome also returns nothing. The screen is black.
I'll look into it some more.
,
Jan 2 2018
,
Jan 2 2018
From the logs that you posted, it looks like session_manager isn't seeing PID 3128 exit. Looking at the code, I think that SessionManagerService::HandleExit is supposed to be called, which produces the "Exiting process is chrome." message in the log snippet that I posted before calling BrowserJob::RunInBackground to restart Chrome. I think that this means that session_manager isn't receiving SIGCHLD with CLD_EXITED (see child_exit_handler.cc), although I don't know why that would be. The crash_reporter process hanging around as a child of session_manager when you ran pstree seems a bit suspicious.
,
May 14 2018
|
|||
►
Sign in to add a comment |
|||
Comment 1 by derat@chromium.org
, Jan 2 2018Components: OS>Systems