factory: Goofy: Start without waiting for UI. |
||||
Issue descriptionCurrently Goofy will open a websocket and wait for Chrome to come connecting, and will terminate Goofy if that timeouts. However, we do see situation that Chrome crashed or delayed in early builds, and will become a "this site can't be reached" message when UI comes up (while goofy goes down). We should probably make sure Goofy can run without dependency of UI.
,
Jul 26 2017
assign to petershih who's also looking at startup improvements.
,
Aug 7 2017
During init, Goofy creates a thread to receive web socket requests, and wait for at least one websocket to connect. However, the wait has no timeout limit. In one of my tests, Goofy can continue to start even if a web socket is connected after an hour. I also traced the commits, this behavior seems not being changed for years. Did I miss something here? Any idea to reproduce this issue?
,
Aug 7 2017
goofy.py: init_ui -> monolithic -> env.launch_chrome -> test_environment.py#DUTEnvironment-> launch_chrome -> sync_utils.WaitFor(self.has_sockets, 90)
,
Aug 7 2017
The timeout mentioned in comment #4 will raise an exception, and fail the goofy.run_tests(). This exception is caught by the event loop (i.e., goofy_base.run_once()), and fails the goofy.run_tests(). As a result, it will not terminates Goofy, but the test list will not get run automatically as desired. This should be fixed. This timeout in launch_chrome() can be traced back in the CL: https://chromium-review.googlesource.com/c/188487, which tries to address an issue regarding to reboot the DUT several thousand times. Back in that time, the Chrome browser seems to not stable enough, so a timeout is added to try to reboot Chrome browser for a few times. My proposal is to remove the wait-for-socket in launch_chrome(), for two reasons: 1. The caller goofy.init_ui() will wait for a web socket after launch_chrome(), so this does not change the flow much. 2. In the CL 188487, the timeout is added for the purpose of restarting Chrome browser for a few times. However, this retrying logic is removed in a later commit, so it should be fine to remove the wait-for-socket from launch_chrome() now. The CL here reflects this change: https://chromium-review.googlesource.com/c/603587 To the question of "should we wait UI in Goofy?" The first commit can be traced back to commit 258a40c3 back in 2012, which is not to address a bug. So I think this is a by-design feature. Further, if we do not wait for the UI, it means the test list might start to run automatically before UI is ready. So I think the goal here is to make sure Goofy to be able to run normally, even when UI takes a (very) long time to get ready?
,
Aug 7 2017
I think (1) for most cases of Chromebooks, we want Goofy to run normally even if UI takes a long time to get ready, since this is seen for few ARM projects in proto stage. (2) for some run-in tests, it seems even better if goofy can start the tests before UI is ready; however this may cause problems in other scenario, for example in dozing stress test, operator may never see UI and can't figure out what's going on. So for short term I think Goofy should wait for UI. In future we may think about adding options in pytest so Goofy can know if it should wait for UI or not. Meanwhile I wonder what's the case for Modular Goofy running with Android as DUT. Can you check with Earl for that part?
,
Aug 8 2017
Just have some offline discussion with Earl. In conclusion, modular goofy should be fine with this change. To run goofy on an Android device, the flow is similar: start goofy first, then navigate to localhost:4012 via a Chrome browser.
,
Aug 8 2017
even for run-in tests? (I doubt if we'll connect browser to run-in tests - can you check with stimim as well, who should tried that on Gale, which should be the case we're looking for)
,
Aug 9 2017
Some offline discussion with stimim@: In Gale, which is a device with no display, we choose to not run Goofy on the DUT. Instead, all the tests are designed to be station-based, and a shell script is used for run-in tests. In conclusion, this change should be fine with the case in Gale. Few more words on the future work mentioned in Comment #6: Quoted from #6: In future we may think about adding options in pytest so Goofy can know if it should wait for UI or not. Our two cents: Two things might be related to this issue: 1. The 'has_ui' flag in 'FactoryTest' 2. After the parallel-pytest is implemented, the meaning of 'has_ui' might need to be re-considered.
,
Aug 9 2017
After parallel pytest with UI is implemented, maybe we can remove the has_ui argument, and have a default UI for tests to display some basic information (test name, test arguments, time elapsed, etc.) Also, there's a "goofy_ui" option in py/goofy/goofy.json in the design doc of modular goofy, but I guess it's not implemented yet.
,
Aug 9 2017
gale was running a lightweight version shell script to drive tests in run-in mode. the long term plan is to run goofy (as modular goofy) to drive run-in mode, but this was never implemented since we don't have new projects needing that. Maybe we'll want a wait_ui in future, but I agree it's not needed for now.
,
Aug 9 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromiumos/platform/factory/+/391120a6e326479a0c1dc29db26449db10db6ad3 commit 391120a6e326479a0c1dc29db26449db10db6ad3 Author: Shen-En Shih <petershih@chromium.org> Date: Wed Aug 09 15:12:44 2017 goofy: Do not wait socket in launch_chrome() A timeout of 90 seconds is located in launch_chrome() to wait a web socket to connect. This timeout was added in an aged CL to try to reboot Chrome browser several times, since back then the Chrome browser seems not stable enough. As time goes, the Chrome browser is getting stable, and the retrying loop has already been removed. Furthermore, the launch_chrome() can now be removed. Noted that, the caller still waits for at least one web socket to get connected, so this does not modify the flow much. Also noted that, we can use init script to set up the Chrome startup webpage if needed. BUG= chromium:734925 TEST=make test TEST=manually test Change-Id: I33c03206a147ee8e5aa3a78162b2cd2657cfc25b Reviewed-on: https://chromium-review.googlesource.com/603587 Commit-Ready: Shen-En Shih <petershih@chromium.org> Tested-by: Shen-En Shih <petershih@chromium.org> Reviewed-by: Hung-Te Lin <hungte@chromium.org> [modify] https://crrev.com/391120a6e326479a0c1dc29db26449db10db6ad3/py/goofy/goofy.py [modify] https://crrev.com/391120a6e326479a0c1dc29db26449db10db6ad3/py/goofy/test_environment.py [modify] https://crrev.com/391120a6e326479a0c1dc29db26449db10db6ad3/py/goofy/goofy_unittest.py [modify] https://crrev.com/391120a6e326479a0c1dc29db26449db10db6ad3/py/goofy/goofy_presenter.py
,
Aug 10 2017
For monolithic mode and device mode, goofy back-end waits for a web socket to be connected when starting the first test. So it does not depend on UI during startup. For presenter mode, it does not depend on UI during startup, neither. If a DUT is connected, but the UI is not ready, it will keep retry in the event loop. Marked as fixed now.
,
Aug 14 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromiumos/platform/factory/+/34835fca8fe61ddcb30cb98bc16fd610be330584 commit 34835fca8fe61ddcb30cb98bc16fd610be330584 Author: Wei-Han Chen <stimim@google.com> Date: Mon Aug 14 08:38:39 2017 goofy unittest: check if init_ui is called in NoHostTest In CL:603587, we change the flow in launch_chrome. But the NoHostTest is not updated properly. Current implementation is not testing what it claimed to test (UI should not be required for no_host test). Update the unittest in this CL. BUG= chromium:734925 TEST=make test Change-Id: Id681f582eb1fe113f225d18eee3c14f182fb4771 Reviewed-on: https://chromium-review.googlesource.com/611682 Commit-Ready: Wei-Han Chen <stimim@chromium.org> Tested-by: Wei-Han Chen <stimim@chromium.org> Reviewed-by: Shen-En Shih <petershih@chromium.org> [modify] https://crrev.com/34835fca8fe61ddcb30cb98bc16fd610be330584/py/goofy/goofy_unittest.py
,
Jan 22 2018
|
||||
►
Sign in to add a comment |
||||
Comment 1 by hungte@chromium.org
, Jul 14 2017