ProcessTest.WaitForExitWithTimeout sometimes fail on Linux x86 |
|||
Issue descriptionChrome Version: 69.0.3454.0 (locally built base_unittests) OS: Linux (Ubuntu 18.04) What steps will reproduce the problem? (1) Build base_unittests (2) ./base_unittests --gtest_filter=ProcessMetricsTest*:ProcessTest.* (3) What is the expected result? All tests pass What happens instead? [ FAILED ] ProcessTest.WaitForExitWithTimeout [ FAILED ] ProcessTest.ChildProcessIsRunning Please use labels and text to provide additional information. I bisected this and found that the issue was introduced by https://chromium-review.googlesource.com/920799 (https://crbug.com/806451). The tests does not fail if I run the tests alone or with either of "ProcessMetricsTest*" and "ProcessTest.*", only if I run both of them. If I run all of the tests in base_unittests these tests fail but the test suite passes due to automatic retries.
,
Jun 8 2018
Running base_unittests with: --gtest_filter=ProcessTest.Terminate:ProcessTest.WaitForExitWithTimeout or --gtest_filter=ProcessTest.Terminate:ProcessTest.ChildProcessIsRunning causes the second test to fail, in both cases. Based on this I was able to trace the issue to being that the next WaitForExit*() after the ProcessTest.Terminate test will always report that the child exited, regardless of whether it did or not. The bug is at https://cs.chromium.org/chromium/src/base/process/process_posix.cc?l=207: if (!WaitpidWithTimeout(handle, &status, timeout)) { // If multiple threads wait on the same |handle| then one wait will succeed // and the other will fail with errno set to ECHILD. > return exited || (errno == ECHILD); } where I added a check against |errno|. |errno| will only be (re)set by the waitpid() call in WaitpidWithTimeout() if it returns failure (-1), but not if it returns no-children-exited (0). Since WaitpidWithTimeout() treats both cases as failure, my change will cause a stale ECHILD from a previous call to be used. Ooops.
,
Jun 12 2018
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/2c2caae3608757dee09e4f802ac002545d71bfad commit 2c2caae3608757dee09e4f802ac002545d71bfad Author: Wez <wez@chromium.org> Date: Tue Jun 12 04:19:53 2018 Remove ECHILD error handling from WaitForExit*(). In https://chromium-review.googlesource.com/c/chromium/src/+/920799 the WaitForExit*() calls were modified to handle the ECHILD error, which will be returned by waitpid() if the process being waited on exits and is reaped by a different thread. The implementation was incorrect, and would return true (process exited) in the case where no processes had exited, but |errno| happened to have been left set to ECHILD by a previous system call. The broken implementation passed the try-bots and waterfall thanks to TestLauncher retries, since the relevant tests pass unless run after a test that leave |errno| set to ECHILD. Remove the ECHILD handling from WaitForExit*() and instead modify the //base unit-test that requires it to do its own special handling. Bug: 850941 Change-Id: I129c5b60428c9bc3eba0cb9e82846f0c45917f85 Reviewed-on: https://chromium-review.googlesource.com/1093974 Reviewed-by: Lei Zhang <thestig@chromium.org> Commit-Queue: Wez <wez@chromium.org> Cr-Commit-Position: refs/heads/master@{#566301} [modify] https://crrev.com/2c2caae3608757dee09e4f802ac002545d71bfad/base/process/process_posix.cc [modify] https://crrev.com/2c2caae3608757dee09e4f802ac002545d71bfad/base/process/process_util_unittest.cc
,
Jun 12 2018
,
Jun 12 2018
That was quick, thanks! I can also confirm that the CL fixed the issue. |
|||
►
Sign in to add a comment |
|||
Comment 1 by w...@chromium.org
, Jun 8 2018Owner: w...@chromium.org
Status: Assigned (was: Untriaged)