Project: chromium Issues People Development process History Sign in
New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.
Starred by 1 user
Status: Archived
Owner:
Closed: Jan 2017
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 0
Type: Bug



Sign in to add a comment
update_engine unittest stuck for 25+ minutes
Project Member Reported by semenzato@chromium.org, Jan 5 2017 Back to list
What could be causing this?  It may be a one-off, so low priority for now.

https://uberchromegw.corp.google.com/i/chromeos/builders/auron-paladin/builds/9177/steps/UnitTest/logs/stdio

update_engine-0.0.3-r2205: [       OK ] PostinstallRunnerActionTest.RunAsRootProgressUpdatesTest (27 ms)
update_engine-0.0.3-r2205: [----------] 10 tests from PostinstallRunnerActionTest (393 ms total)
update_engine-0.0.3-r2205: 
update_engine-0.0.3-r2205: [----------] Global test environment tear-down
update_engine-0.0.3-r2205: [==========] 28 tests from 3 test cases ran. (12384 ms total)
update_engine-0.0.3-r2205: [  PASSED  ] 28 tests.
update_engine-0.0.3-r2205: [INFO:testrunner.cc(62)] unittest return value: 0
=== Still running: job update_engine-0.0.3-r2205 (60m45.6s) ===
Still building update_engine-0.0.3-r2205 (61m45.7s). Logs in /tmp/update_engine-0.0.3-r2205-xk_Fpl
Pending 1/49, Building 1/1, Retrying 1, [Time 07:19:55 | Elapsed 62m53.0s | Load 0.08 0.07 1.73]
Still building update_engine-0.0.3-r2205 (63m45.8s). Logs in /tmp/update_engine-0.0.3-r2205-xk_Fpl
Pending 1/49, Building 1/1, Retrying 1, [Time 07:21:55 | Elapsed 64m53.1s | Load 0.08 0.06 1.53]
Still building update_engine-0.0.3-r2205 (65m46.0s). Logs in /tmp/update_engine-0.0.3-r2205-xk_Fpl
Pending 1/49, Building 1/1, Retrying 1, [Time 07:23:55 | Elapsed 66m53.2s | Load 0.01 0.04 1.34]
Still building update_engine-0.0.3-r2205 (67m46.1s). Logs in /tmp/update_engine-0.0.3-r2205-xk_Fpl
Pending 1/49, Building 1/1, Retrying 1, [Time 07:25:55 | Elapsed 68m53.3s | Load 0.0 0.03 1.18]
Still building update_engine-0.0.3-r2205 (69m46.2s). Logs in /tmp/update_engine-0.0.3-r2205-xk_Fpl
Pending 1/49, Building 1/1, Retrying 1, [Time 07:27:55 | Elapsed 70m53.5s | Load 0.0 0.02 1.04]
Still building update_engine-0.0.3-r2205 (71m46.3s). Logs in /tmp/update_engine-0.0.3-r2205-xk_Fpl
Pending 1/49, Building 1/1, Retrying 1, [Time 07:29:55 | Elapsed 72m53.6s | Load 0.0 0.01 0.91]
Still building update_engine-0.0.3-r2205 (73m46.5s). Logs in /tmp/update_engine-0.0.3-r2205-xk_Fpl
Pending 1/49, Building 1/1, Retrying 1, [Time 07:31:56 | Elapsed 74m53.7s | Load 0.08 0.03 0.81]
Still building update_engine-0.0.3-r2205 (75m46.6s). Logs in /tmp/update_engine-0.0.3-r2205-xk_Fpl
Pending 1/49, Building 1/1, Retrying 1, [Time 07:33:56 | Elapsed 76m53.9s | Load 0.06 0.05 0.72]
Still building update_engine-0.0.3-r2205 (77m46.7s). Logs in /tmp/update_engine-0.0.3-r2205-xk_Fpl
Pending 1/49, Building 1/1, Retrying 1, [Time 07:35:56 | Elapsed 78m54.0s | Load 0.03 0.05 0.64]
Still building update_engine-0.0.3-r2205 (79m46.8s). Logs in /tmp/update_engine-0.0.3-r2205-xk_Fpl
Pending 1/49, Building 1/1, Retrying 1, [Time 07:37:56 | Elapsed 80m54.1s | Load 0.0 0.04 0.56]
Still building update_engine-0.0.3-r2205 (81m47.0s). Logs in /tmp/update_engine-0.0.3-r2205-xk_Fpl
Pending 1/49, Building 1/1, Retrying 1, [Time 07:39:56 | Elapsed 82m54.2s | Load 0.08 0.04 0.5]
Still building update_engine-0.0.3-r2205 (83m47.1s). Logs in /tmp/update_engine-0.0.3-r2205-xk_Fpl
Pending 1/49, Building 1/1, Retrying 1, [Time 07:41:56 | Elapsed 84m54.4s | Load 0.12 0.07 0.46]
Still building update_engine-0.0.3-r2205 (85m47.2s). Logs in /tmp/update_engine-0.0.3-r2205-xk_Fpl
Pending 1/49, Building 1/1, Retrying 1, [Time 07:43:56 | Elapsed 86m54.5s | Load 0.03 0.06 0.4]
Still building update_engine-0.0.3-r2205 (87m47.3s). Logs in /tmp/update_engine-0.0.3-r2205-xk_Fpl
Pending 1/49, Building 1/1, Retrying 1, [Time 07:45:56 | Elapsed 88m54.6s | Load 0.0 0.04 0.36]
cros_sdk: Signaled to shutdown: caught 15 signal.
 
I keep running into issues where some update engine expertise would help.  Do you have recommendations for an owner?  Thanks.
Could it be something else that failed?

It looks like the update engine unit test runner succeeded and returned
  update_engine-0.0.3-r2205: [  PASSED  ] 28 tests.
  update_engine-0.0.3-r2205: [INFO:testrunner.cc(62)] unittest return value: 0

However, two chromite unit tests failed:

chromite-0.0.2-r2735: 06:20:25: ERROR: ### LOG: scripts/export_to_gcloud_unittest
chromite-0.0.2-r2735: Failed to acquire lock on virtualenv
chromite-0.0.2-r2735: Traceback (most recent call last):
chromite-0.0.2-r2735:   File "scripts/export_to_gcloud_unittest", line 102, in <module>
chromite-0.0.2-r2735:     main()
chromite-0.0.2-r2735:   File "scripts/export_to_gcloud_unittest", line 42, in main
chromite-0.0.2-r2735:     _CreateVenv()
chromite-0.0.2-r2735:   File "scripts/export_to_gcloud_unittest", line 52, in _CreateVenv
chromite-0.0.2-r2735:     ], stdout=sys.stderr)
chromite-0.0.2-r2735:   File "/usr/lib64/python2.7/subprocess.py", line 540, in check_call
chromite-0.0.2-r2735:     raise CalledProcessError(retcode, cmd)
chromite-0.0.2-r2735: subprocess.CalledProcessError: Command '['/mnt/host/source/chromite/../infra_virtualenv/create_venv', '/mnt/host/source/chromite/venv/.venv', '/mnt/host/source/chromite/venv/requirements.txt']' returned non-zero exit status 1
chromite-0.0.2-r2735: 
chromite-0.0.2-r2735: 
chromite-0.0.2-r2735: 06:20:25: ERROR: ### LOG: scripts/virtualenv_wrapper_unittest
chromite-0.0.2-r2735: Failed to acquire lock on virtualenv
chromite-0.0.2-r2735: Traceback (most recent call last):
chromite-0.0.2-r2735:   File "scripts/virtualenv_wrapper_unittest", line 102, in <module>
chromite-0.0.2-r2735:     main()
chromite-0.0.2-r2735:   File "scripts/virtualenv_wrapper_unittest", line 42, in main
chromite-0.0.2-r2735:     _CreateVenv()
chromite-0.0.2-r2735:   File "scripts/virtualenv_wrapper_unittest", line 52, in _CreateVenv
chromite-0.0.2-r2735:     ], stdout=sys.stderr)
chromite-0.0.2-r2735:   File "/usr/lib64/python2.7/subprocess.py", line 540, in check_call
chromite-0.0.2-r2735:     raise CalledProcessError(retcode, cmd)
chromite-0.0.2-r2735: subprocess.CalledProcessError: Command '['/mnt/host/source/chromite/../infra_virtualenv/create_venv', '/mnt/host/source/chromite/venv/.venv', '/mnt/host/source/chromite/venv/requirements.txt']' returned non-zero exit status 1
chromite-0.0.2-r2735: 
chromite-0.0.2-r2735: 06:20:25: ERROR: The following 2 tests failed:
chromite-0.0.2-r2735:   scripts/export_to_gcloud_unittest
chromite-0.0.2-r2735:   scripts/virtualenv_wrapper_unittest
chromite-0.0.2-r2735:  * ERROR: chromeos-base/chromite-0.0.2-r2735::chromiumos failed (test phase):
chromite-0.0.2-r2735:  *   (no error message)
chromite-0.0.2-r2735:  * 
chromite-0.0.2-r2735:  * Call stack:
chromite-0.0.2-r2735:  *     ebuild.sh, line  93:  Called src_test
chromite-0.0.2-r2735:  *   environment, line 5880:  Called die
chromite-0.0.2-r2735:  * The specific snippet of code:
chromite-0.0.2-r2735:  *       cd "${CHROMITE_DIR}/cbuildbot" && sudo -u "${PORTAGE_USERNAME}" PATH="${CROS_WORKON_SRCROOT}/../depot_tools:${PATH}" ./run_tests || die
chromite-0.0.2-r2735:  * 
chromite-0.0.2-r2735:  * If you need support, post the output of `emerge --info '=chromeos-base/chromite-0.0.2-r2735::chromiumos'`,
chromite-0.0.2-r2735:  * the complete build log and the output of `emerge -pqv '=chromeos-base/chromite-0.0.2-r2735::chromiumos'`.
chromite-0.0.2-r2735:  * The complete build log is located at '/build/auron/tmp/portage/logs/chromeos-base:chromite-0.0.2-r2735:20170105-141709.log'.
chromite-0.0.2-r2735:  * For convenience, a symlink to the build log is located at '/build/auron/tmp/portage/chromeos-base/chromite-0.0.2-r2735/temp/build.log'.
chromite-0.0.2-r2735:  * The ebuild environment file is located at '/build/auron/tmp/portage/chromeos-base/chromite-0.0.2-r2735/temp/environment'.
chromite-0.0.2-r2735:  * Working directory: '/mnt/host/source/chromite/cbuildbot'
chromite-0.0.2-r2735:  * S: '/mnt/host/source/src/platform/../../chromite'
chromite-0.0.2-r2735: >>> Failed to emerge chromeos-base/chromite-0.0.2-r2735 for /build/auron/, Log file:
chromite-0.0.2-r2735: >>>  '/build/auron/tmp/portage/logs/chromeos-base:chromite-0.0.2-r2735:20170105-141709.log'
chromite-0.0.2-r2735: 
chromite-0.0.2-r2735:  * Messages for package chromeos-base/chromite-0.0.2-r2735 merged to /build/auron/:
chromite-0.0.2-r2735: 
chromite-0.0.2-r2735:  * ERROR: chromeos-base/chromite-0.0.2-r2735::chromiumos failed (test phase):
chromite-0.0.2-r2735:  *   (no error message)
chromite-0.0.2-r2735:  * 
chromite-0.0.2-r2735:  * Call stack:
chromite-0.0.2-r2735:  *     ebuild.sh, line  93:  Called src_test
chromite-0.0.2-r2735:  *   environment, line 5880:  Called die
chromite-0.0.2-r2735:  * The specific snippet of code:
chromite-0.0.2-r2735:  *       cd "${CHROMITE_DIR}/cbuildbot" && sudo -u "${PORTAGE_USERNAME}" PATH="${CROS_WORKON_SRCROOT}/../depot_tools:${PATH}" ./run_tests || die

#2 it could be, but the message says "still building update_engine".

I also just noticed this:

update_engine-0.0.3-r2205: [0105/142542:INFO:utils.cc(453)] 0x000000f0 : 78 38 c5 a1 fe 10 26 88 52 d8 0f d1 53 76 29 dd 
update_engine-0.0.3-r2205: [0105/142542:ERROR:payload_signer.cc(333)] PayloadVerifier::VerifySignature( signature_blob, public_key_path, payload_hash) failed.
update_engine-0.0.3-r2205: [0105/142542:FATAL:generate_delta_main.cc(177)] Check failed: PayloadSigner::VerifySignedPayload(in_file, public_key). 
update_engine-0.0.3-r2205: /usr/lib64/libbase-core-395517.so(base::debug::StackTrace::StackTrace()+0x13) [0x7f90c6972d63]
update_engine-0.0.3-r2205: 
update_engine-0.0.3-r2205: Aborted

Also, earlier in the same file there is this:

update_engine-0.0.3-r2205: [----------] 4 tests from ChromeBrowserProxyResolverTest (1 ms total)
update_engine-0.0.3-r2205: 
update_engine-0.0.3-r2205: [----------] Global test environment tear-down
update_engine-0.0.3-r2205: [==========] 762 tests from 71 test cases ran. (190363 ms total)
update_engine-0.0.3-r2205: [  PASSED  ] 762 tests.
update_engine-0.0.3-r2205: [INFO:testrunner.cc(62)] unittest return value: 0

So it looks like there are multiple unit tests for update_engine, and the last one that completed is not necessarily the last one in the set.  (There may be a "completely done with this package" message, but I don't know.)
Comment 4 by nxia@chromium.org, Jan 5 2017
chromite unittest did fail because it didn't acquire the lock, will open a separate bug for that. chromite unittest failed in 3 mins, and UnitTestStage will retry all unit tests again if the first round failed. The problem causing the timeout is the update_engine unittest. 
Is it caused by a bad CL or test flake?
Comment 5 by nxia@chromium.org, Jan 5 2017
filed the chromite bug at  crbug.com/678677 
the same update engine test failure appears only on build 9174 and 9177, so perhaps a bad CL?
### build 9174 - https://uberchromegw.corp.google.com/i/chromeos/builders/auron-paladin/builds/9174/steps/UnitTest/logs/stdio

update_engine-0.0.3-r2204: [0105/034556:INFO:generate_delta_main.cc(172)] Verifying signed payload.
update_engine-0.0.3-r2204: [0105/034556:INFO:payload_verifier.cc(93)] signature blob size = 264
update_engine-0.0.3-r2204: [0105/034556:ERROR:payload_verifier.cc(118)] None of the 1 signatures is correct. Expected:
update_engine-0.0.3-r2204: [0105/034556:INFO:utils.cc(436)] Logging array of length: 256
update_engine-0.0.3-r2204: [0105/034556:INFO:utils.cc(453)] 0x00000000 : 00 01 ff ff ff ff ff ff ff ff ff ff ff ff ff ff 
update_engine-0.0.3-r2204: [0105/034556:INFO:utils.cc(453)] 0x00000010 : ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 
update_engine-0.0.3-r2204: [0105/034556:INFO:utils.cc(453)] 0x00000020 : ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 
update_engine-0.0.3-r2204: [0105/034556:INFO:utils.cc(453)] 0x00000030 : ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 
update_engine-0.0.3-r2204: [0105/034556:INFO:utils.cc(453)] 0x00000040 : ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 
update_engine-0.0.3-r2204: [0105/034556:INFO:utils.cc(453)] 0x00000050 : ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 
update_engine-0.0.3-r2204: [0105/034556:INFO:utils.cc(453)] 0x00000060 : ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 
update_engine-0.0.3-r2204: [0105/034556:INFO:utils.cc(453)] 0x00000070 : ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 
update_engine-0.0.3-r2204: [0105/034556:INFO:utils.cc(453)] 0x00000080 : ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 
update_engine-0.0.3-r2204: [0105/034556:INFO:utils.cc(453)] 0x00000090 : ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 
update_engine-0.0.3-r2204: [0105/034556:INFO:utils.cc(453)] 0x000000a0 : ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 
update_engine-0.0.3-r2204: [0105/034556:INFO:utils.cc(453)] 0x000000b0 : ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 
update_engine-0.0.3-r2204: [0105/034556:INFO:utils.cc(453)] 0x000000c0 : ff ff ff ff ff ff ff ff ff ff ff ff 00 30 31 30 
update_engine-0.0.3-r2204: [0105/034556:INFO:utils.cc(453)] 0x000000d0 : 0d 06 09 60 86 48 01 65 03 04 02 01 05 00 04 20 
update_engine-0.0.3-r2204: [0105/034556:INFO:utils.cc(453)] 0x000000e0 : c8 88 8a fe 50 c5 12 10 71 4d eb 9c 4a 3f 5d 5a 
update_engine-0.0.3-r2204: [0105/034556:INFO:utils.cc(453)] 0x000000f0 : 05 20 39 0f 7f a4 02 e4 fa c8 b4 0f 7f fe 75 5f 
update_engine-0.0.3-r2204: [0105/034556:ERROR:payload_verifier.cc(121)] But found decrypted hashes:
update_engine-0.0.3-r2204: [0105/034556:INFO:utils.cc(436)] Logging array of length: 256
update_engine-0.0.3-r2204: [0105/034556:INFO:utils.cc(453)] 0x00000000 : 35 e3 5a d4 2b ab 70 cb 98 2e 60 75 15 b1 d6 e0 
update_engine-0.0.3-r2204: [0105/034556:INFO:utils.cc(453)] 0x00000010 : 87 84 32 81 17 d3 b8 00 c4 c5 53 da d0 a7 80 d9 
update_engine-0.0.3-r2204: [0105/034556:INFO:utils.cc(453)] 0x00000020 : f7 4b a8 c0 2f 02 92 2a ab 1a 08 f3 a5 0a 7f e0 
update_engine-0.0.3-r2204: [0105/034556:INFO:utils.cc(453)] 0x00000030 : 14 24 18 d3 43 67 8d 15 00 f1 e1 a2 71 8a ce d9 
update_engine-0.0.3-r2204: [0105/034556:INFO:utils.cc(453)] 0x00000040 : c7 11 9f 6f 9c 7b df 1d b4 7a 9f 15 33 91 ca 0f 
update_engine-0.0.3-r2204: [0105/034556:INFO:utils.cc(453)] 0x00000050 : 58 3a 34 7f 22 0c 3a d8 4f 54 e0 22 33 0f 89 0d 
update_engine-0.0.3-r2204: [0105/034556:INFO:utils.cc(453)] 0x00000060 : 3a b6 b9 69 d8 f0 21 59 a3 2a 36 7b 35 d9 43 b9 
update_engine-0.0.3-r2204: [0105/034556:INFO:utils.cc(453)] 0x00000070 : 37 40 47 06 a1 eb 92 34 67 8f d3 2d 1b 5d 77 dd 
update_engine-0.0.3-r2204: [0105/034556:INFO:utils.cc(453)] 0x00000080 : 3e 3c e9 87 70 d2 e5 90 a0 12 a2 84 93 14 60 75 
update_engine-0.0.3-r2204: [0105/034556:INFO:utils.cc(453)] 0x00000090 : 5d 14 a8 bd 08 ce ab 08 83 f1 62 b3 58 6e d3 e7 
update_engine-0.0.3-r2204: [0105/034556:INFO:utils.cc(453)] 0x000000a0 : 22 d4 09 dc 58 61 85 40 6a ac 8b 5c ab 1e 2c 6c 
update_engine-0.0.3-r2204: [0105/034556:INFO:utils.cc(453)] 0x000000b0 : bf 2b cc 3f fb 43 70 e5 0c 03 1a ad 96 92 bf 48 
update_engine-0.0.3-r2204: [0105/034556:INFO:utils.cc(453)] 0x000000c0 : 62 db 00 80 ac dd a6 d2 05 45 63 d6 54 87 01 26 
update_engine-0.0.3-r2204: [0105/034556:INFO:utils.cc(453)] 0x000000d0 : 84 62 15 98 3d 66 ae b6 b0 2e 73 4e a6 8a 1d b2 
update_engine-0.0.3-r2204: [0105/034556:INFO:utils.cc(453)] 0x000000e0 : 95 a0 1c ae 3c e0 80 db ab ba 04 fd ab a5 e4 35 
update_engine-0.0.3-r2204: [0105/034556:INFO:utils.cc(453)] 0x000000f0 : e4 bc 1a e9 6f 06 eb a0 f7 38 02 c3 98 5f 58 c4 
update_engine-0.0.3-r2204: [0105/034556:ERROR:payload_signer.cc(333)] PayloadVerifier::VerifySignature( signature_blob, public_key_path, payload_hash) failed.
update_engine-0.0.3-r2204: [0105/034556:FATAL:generate_delta_main.cc(177)] Check failed: PayloadSigner::VerifySignedPayload(in_file, public_key). 
update_engine-0.0.3-r2204: /usr/lib64/libbase-core-395517.so(base::debug::StackTrace::StackTrace()+0x13) [0x7f793f23bd63]
update_engine-0.0.3-r2204: 
update_engine-0.0.3-r2204: Aborted

### build 9177 - https://uberchromegw.corp.google.com/i/chromeos/builders/auron-paladin/builds/9177/steps/UnitTest/logs/stdio

update_engine-0.0.3-r2205: [0105/142542:INFO:generate_delta_main.cc(172)] Verifying signed payload.
update_engine-0.0.3-r2205: [0105/142542:INFO:payload_verifier.cc(93)] signature blob size = 264
update_engine-0.0.3-r2205: [0105/142542:ERROR:payload_verifier.cc(118)] None of the 1 signatures is correct. Expected:
update_engine-0.0.3-r2205: [0105/142542:INFO:utils.cc(436)] Logging array of length: 256
update_engine-0.0.3-r2205: [0105/142542:INFO:utils.cc(453)] 0x00000000 : 00 01 ff ff ff ff ff ff ff ff ff ff ff ff ff ff 
update_engine-0.0.3-r2205: [0105/142542:INFO:utils.cc(453)] 0x00000010 : ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 
update_engine-0.0.3-r2205: [0105/142542:INFO:utils.cc(453)] 0x00000020 : ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 
update_engine-0.0.3-r2205: [0105/142542:INFO:utils.cc(453)] 0x00000030 : ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 
update_engine-0.0.3-r2205: [0105/142542:INFO:utils.cc(453)] 0x00000040 : ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 
update_engine-0.0.3-r2205: [0105/142542:INFO:utils.cc(453)] 0x00000050 : ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 
update_engine-0.0.3-r2205: [0105/142542:INFO:utils.cc(453)] 0x00000060 : ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 
update_engine-0.0.3-r2205: [0105/142542:INFO:utils.cc(453)] 0x00000070 : ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 
update_engine-0.0.3-r2205: [0105/142542:INFO:utils.cc(453)] 0x00000080 : ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 
update_engine-0.0.3-r2205: [0105/142542:INFO:utils.cc(453)] 0x00000090 : ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 
update_engine-0.0.3-r2205: [0105/142542:INFO:utils.cc(453)] 0x000000a0 : ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 
update_engine-0.0.3-r2205: [0105/142542:INFO:utils.cc(453)] 0x000000b0 : ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 
update_engine-0.0.3-r2205: [0105/142542:INFO:utils.cc(453)] 0x000000c0 : ff ff ff ff ff ff ff ff ff ff ff ff 00 30 31 30 
update_engine-0.0.3-r2205: [0105/142542:INFO:utils.cc(453)] 0x000000d0 : 0d 06 09 60 86 48 01 65 03 04 02 01 05 00 04 20 
update_engine-0.0.3-r2205: [0105/142542:INFO:utils.cc(453)] 0x000000e0 : 78 70 e0 40 6c e7 e0 e0 89 d8 15 2c 15 65 bc 35 
update_engine-0.0.3-r2205: [0105/142542:INFO:utils.cc(453)] 0x000000f0 : bf 30 55 0c 4c fe 11 51 c2 5c cf 22 02 a2 0d 55 
update_engine-0.0.3-r2205: [0105/142542:ERROR:payload_verifier.cc(121)] But found decrypted hashes:
update_engine-0.0.3-r2205: [0105/142542:INFO:utils.cc(436)] Logging array of length: 256
update_engine-0.0.3-r2205: [0105/142542:INFO:utils.cc(453)] 0x00000000 : 31 e1 57 db b3 b2 07 be c1 19 a3 3b 07 75 9f e9 
update_engine-0.0.3-r2205: [0105/142542:INFO:utils.cc(453)] 0x00000010 : a8 7f e0 f6 60 a4 b4 68 54 ec c5 3d cc 0f e5 f3 
update_engine-0.0.3-r2205: [0105/142542:INFO:utils.cc(453)] 0x00000020 : 52 13 0e 3f a8 f8 77 8b 24 e3 9b 59 39 e7 f3 0e 
update_engine-0.0.3-r2205: [0105/142542:INFO:utils.cc(453)] 0x00000030 : 1d 5a a8 cf e8 d0 cd 2f de 89 b0 de ff 7b f1 73 
update_engine-0.0.3-r2205: [0105/142542:INFO:utils.cc(453)] 0x00000040 : da 89 92 88 f5 73 c3 53 4f af f5 f9 aa fe f4 8c 
update_engine-0.0.3-r2205: [0105/142542:INFO:utils.cc(453)] 0x00000050 : 2c 4d 70 0e 0c e7 f1 1b 11 ba b8 61 df 30 1c 20 
update_engine-0.0.3-r2205: [0105/142542:INFO:utils.cc(453)] 0x00000060 : 7d 4b 2f fe d5 ba 89 19 fb 84 bb cd 59 27 75 83 
update_engine-0.0.3-r2205: [0105/142542:INFO:utils.cc(453)] 0x00000070 : 33 74 bd 0a 18 16 b4 8c 79 3e c1 aa ae c2 74 a1 
update_engine-0.0.3-r2205: [0105/142542:INFO:utils.cc(453)] 0x00000080 : be df 29 2a c8 d5 e8 c5 d4 b7 b9 a7 80 dc c9 b3 
update_engine-0.0.3-r2205: [0105/142542:INFO:utils.cc(453)] 0x00000090 : 0e 39 f1 5a 05 8b c8 20 7a 9c cf 4c 77 3f 00 49 
update_engine-0.0.3-r2205: [0105/142542:INFO:utils.cc(453)] 0x000000a0 : c3 de f9 8e 7a e7 da 44 96 00 96 4c 8e cd 71 cc 
update_engine-0.0.3-r2205: [0105/142542:INFO:utils.cc(453)] 0x000000b0 : 5b a7 34 8d 1e 8f 86 14 69 ba f3 4e 0a 3d 37 78 
update_engine-0.0.3-r2205: [0105/142542:INFO:utils.cc(453)] 0x000000c0 : bd ad 26 11 3f e4 c3 f1 19 d3 30 7d 56 76 73 e2 
update_engine-0.0.3-r2205: [0105/142542:INFO:utils.cc(453)] 0x000000d0 : 0d d6 d2 2c f3 2c 4a bd 9c c9 9c 51 0e d0 a0 22 
update_engine-0.0.3-r2205: [0105/142542:INFO:utils.cc(453)] 0x000000e0 : 29 f9 e3 4c 2f c7 f3 d1 a7 df aa b3 b2 d2 b2 78 
update_engine-0.0.3-r2205: [0105/142542:INFO:utils.cc(453)] 0x000000f0 : 78 38 c5 a1 fe 10 26 88 52 d8 0f d1 53 76 29 dd 
update_engine-0.0.3-r2205: [0105/142542:ERROR:payload_signer.cc(333)] PayloadVerifier::VerifySignature( signature_blob, public_key_path, payload_hash) failed.
update_engine-0.0.3-r2205: [0105/142542:FATAL:generate_delta_main.cc(177)] Check failed: PayloadSigner::VerifySignedPayload(in_file, public_key). 
update_engine-0.0.3-r2205: /usr/lib64/libbase-core-395517.so(base::debug::StackTrace::StackTrace()+0x13) [0x7f90c6972d63]
update_engine-0.0.3-r2205: 
update_engine-0.0.3-r2205: Aborted
We can probably issue a few trybot runs on auron-paladin to see if it's a test flake
#6: I compared the list of CLs for those two builds and there is no overlap.  So it's likely to be test flake.

Labels: -Pri-3 Pri-2
Owner: benchan@chromium.org
Happened again!  This time on a canary.

https://uberchromegw.corp.google.com/i/chromeos/builders/lars-release/builds/732/steps/UnitTest/logs/stdio

update_engine-0.0.3-r2204: [----------] Global test environment tear-down
update_engine-0.0.3-r2204: [==========] 28 tests from 3 test cases ran. (13935 ms total)
update_engine-0.0.3-r2204: [  PASSED  ] 28 tests.
update_engine-0.0.3-r2204: [INFO:testrunner.cc(62)] unittest return value: 0
=== Still running: job update_engine-0.0.3-r2204 (60m14.5s) ===
Still building update_engine-0.0.3-r2204 (61m14.5s). Logs in /tmp/update_engine-0.0.3-r2204-BHYqir
Pending 0/50, Building 1/1, [Time 08:15:14 | Elapsed 74m57.3s | Load 74.1 52.15 40.71]
Still building update_engine-0.0.3-r2204 (63m14.6s). Logs in /tmp/update_engine-0.0.3-r2204-BHYqir
Pending 0/50, Building 1/1, [Time 08:17:14 | Elapsed 76m57.4s | Load 42.53 48.21 40.73]
Still building update_engine-0.0.3-r2204 (65m14.7s). Logs in /tmp/update_engine-0.0.3-r2204-BHYqir
Pending 0/50, Building 1/1, [Time 08:19:14 | Elapsed 78m57.5s | Load 36.4 43.98 40.08]
Still building update_engine-0.0.3-r2204 (67m14.8s). Logs in /tmp/update_engine-0.0.3-r2204-BHYqir
Pending 0/50, Building 1/1, [Time 08:21:14 | Elapsed 80m57.6s | Load 33.01 40.23 39.19]
Still building update_engine-0.0.3-r2204 (69m14.9s). Logs in /tmp/update_engine-0.0.3-r2204-BHYqir
Pending 0/50, Building 1/1, [Time 08:23:15 | Elapsed 82m57.7s | Load 30.75 37.13 38.18]
Still building update_engine-0.0.3-r2204 (71m15.0s). Logs in /tmp/update_engine-0.0.3-r2204-BHYqir
Pending 0/50, Building 1/1, [Time 08:25:15 | Elapsed 84m57.8s | Load 28.93 34.28 37.01]
Still building update_engine-0.0.3-r2204 (73m15.1s). Logs in /tmp/update_engine-0.0.3-r2204-BHYqir
Pending 0/50, Building 1/1, [Time 08:27:15 | Elapsed 86m57.9s | Load 28.73 32.45 36.01]
Still building update_engine-0.0.3-r2204 (75m15.2s). Logs in /tmp/update_engine-0.0.3-r2204-BHYqir
Pending 0/50, Building 1/1, [Time 08:29:15 | Elapsed 88m58.0s | Load 28.56 31.14 35.1]
cros_sdk: Signaled to shutdown: caught 15 signal.


Ben, it looks like you've already made a lot of progress on this.  Let me know if you don't have time for it.

Comment 12 by nxia@chromium.org, Jan 6 2017
Labels: -Pri-2 Pri-1
And here

https://uberchromegw.corp.google.com/i/chromeos/builders/auron-paladin/builds/9184/steps/UnitTest/logs/stdio

This is happening more frequently and failed 4 CQ runs. Raise the priority.
Comment 13 by nxia@chromium.org, Jan 6 2017
When we investigated crbug.com/678852, ayatane@ guessed the different ubuntu & python versions on the auron-paladin may be the reason. No idea if that difference could also affect the update_engine unittest here. As the unittest failures only happen on auron-paladin. we plan to replace the builder with a right system & python version and see how it goes.

Meanwhile, benchan@ can you continue investigating this and keep the bug updated?
Forgot about comment #7, the test 'DeltaPerformerIntegrationTest.RunAsRootSmallImageSignGeneratedShellBadKeyTest' is expected to abort as it uses bad key to sign/verify.

I issued 5 try jobs on auron-paladin (4 passed and 1 failed with the same issue).

I'd suggest someone with ssh access to a trybot to log into the machine and check the update_engine log under /tmp inside the chroot.
The process that hangs is the update_engine unittest executable.  It seems to hang at exit in a futex call (from the kernel stack).  We couldn't get a stack trace for the executable, unfortunately.

Ningning maybe you still have some of the details around?  Can you post those traces here for future reference?

It may be some kind of futex-related kernel bug.

Comment 16 by nxia@chromium.org, Jan 7 2017
Pasting some stack


chrome-bot@build176-m2:(Linux 12.04):/proc/25470$ sudo cat /proc/23241/stack
[<ffffffff810c77b9>] futex_wait_queue_me+0xd9/0x150
[<ffffffff810c8840>] futex_wait+0x100/0x210
[<ffffffff810ca69c>] do_futex+0x7c/0x1b0
[<ffffffff810ca912>] SyS_futex+0x142/0x1a0
[<ffffffff81753cdd>] system_call_fastpath+0x1a/0x1f
[<ffffffffffffffff>] 0xffffffffffffffff






warning: no loadable sections found in added symbol-file system-supplied DSO at 0x7fffe07c6000
0x00007f798429580b in ?? ()
(gdb) where
#0  0x00007f798429580b in ?? ()
#1  0xfffffffffffffbe0 in ?? ()
#2  0x00007f79836c0a10 in ?? ()
#3  0x00007f798420d980 in ?? ()
#4  0x000000000000001d in ?? ()
#5  0x0000000000000000 in ?? ()



Cc: vapier@chromium.org
Vapier you may find this intriguing.  It has been stumping the best minds of our generation, who aren't too proud to ask for help.

It could be as trivial as a kernel bug triggered only by some executables, and hit when the process calls exit(3).

Ningning, I recall the builder kernel being 3.11, can you correct me if I am wrong?
In the meantime, we're also checking if the recent glibc update contributes to the issue. I issued a few try jobs with the glibc change reverted.
Did the tryjobs run?  Did they fail or succeed?
failed once on the trybot with glibc upgrade reverted, so it seems unrelated to the recent glibc upgrade.
However, it seems like there are some leaked processes that kept test runner process around (see https://uberchromegw.corp.google.com/i/chromiumos.tryserver/builders/paladin/builds/1837/steps/UnitTest/logs/stdio)

update_engine-0.0.3-r2207: [INFO:testrunner.cc(62)] unittest return value: 0
update_engine-0.0.3-r2207: Error: the test leaked process sleep with pid 64 (it was forcefully killed)
update_engine-0.0.3-r2207: Error: the test leaked process sleep with pid 82 (it was forcefully killed)



re #23: in order to see that error message, I ran a try job with https://chromium-review.googlesource.com/c/340541/ reverted.
and it seems like  issue 596566 #c1 describes a similar update_engine unit test timeout issue.
Don, can you comment whether this is the long-standing issue you mentioned at the meeting?  If it is, were other bugs opened for it?  (I couldn't find one.)
Cc: jdufault@chromium.org
#25 good find.

Jacob, could this be the old problem somehow coming back?
Cc: rspangler@chromium.org kinaba@chromium.org
Labels: -Pri-1 Pri-0
Auron-paladin 9209 failed again.

Shall we make this one pri-0?  I think we should.

Re #27, #25: the linked issue was not about any failure in particular
This is not the same as the longstanding bug I was thinking of.
Jacob, I am thinking of this CL.

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/platform2/+/ddeba19e6811cc199a3109d908336dd02c759425

commit ddeba19e6811cc199a3109d908336dd02c759425
Author: Jacob Dufault <jdufault@google.com>
Date: Tue Apr 19 19:49:48 2016

vpn: Ensure child process is killed before exiting.

The UnitTest stage is becoming more strict; any auxiliary processes need to be
terminated before the test process exits, otherwise the test will fail.

BUG= chromium:596566 
TEST=cros_run_unit_tests --board link

Change-Id: I582df3a6d388a4993a5cad937bf2f0b2a98aaf0a
Reviewed-on: https://chromium-review.googlesource.com/339741
Commit-Ready: Jacob Dufault <jdufault@chromium.org>
Tested-by: Jacob Dufault <jdufault@chromium.org>
Reviewed-by: Mike Frysinger <vapier@chromium.org>

[modify] https://crrev.com/ddeba19e6811cc199a3109d908336dd02c759425/vpn-manager/daemon_test.cc

Re #31: That CL just ensured a helper process for the test was killed before the test itself exited. I later landed (and then reverted) a CL that ensured all child processes exited before the main test process. I don't think any of those changes are going to be related to this issue.
Thank you Jacob.
From logs in #23, the two leaked processes were 'sleep', which seemed to be launched from these two tests:

  SubprocessTest.CancelTest 
  P2PManagerTest.LookupURL

Both tests try to mimic timeouts by having the child process sleep for some time. But the child process is actually a 'sh' process launching 'sleep', so the 'sleep' process is actually the grandchild of the update engine test runner process. The test runner seems to only kill and reap the 'sh' process, but not the 'sleep' process. Not sure if the 'sleep' process actually keeps the test runner from terminating, but I can try to clean that up to see if the issue goes away.
Taking a look at the code.

The current theory is that we block while exiting from update_engine_unittests, after execution of RunAsRootProgressUpdatesTest in postinstall_runner_action_unittests.cc.

In fact that test appears to spawn a script (postint_progress) which runs asynchronously and outputs

0.25
0.5
1

then exits.

The script is spawned by RunPosinstallAction(), defined in the same file.  (Funny how nobody bothered to fix the typo.)  The spawning of postinstall_program (a string parameter) is hopelessly complicated.  First the executable path is added to a "Partition" object:

  InstallPlan::Partition part;
  ...
  part.postinstall_path = postinstall_program;

Then the partition is added to an InstallPlan object:

  InstallPlan install_plan;
  install_plan.partitions = {part};

Then the InstallPlan is attached to an ObjectFeederAction:

  ObjectFeederAction<InstallPlan> feeder_action;
  feeder_action.set_obj(install_plan);

Then feeder_action is associated with a PostinstallRunnerAction:

  PostinstallRunnerAction runner_action(&fake_boot_control_,
                                        powerwash_marker_file_.c_str());
  ...
  BondActions(&feeder_action, &runner_action);

and with an ObjectCollectorAction:

  ObjectCollectorAction<InstallPlan> collector_action;
  BondActions(&runner_action, &collector_action);

Then these three actions are enqueued in the "processor" object:

  processor.EnqueueAction(&feeder_action);
  processor.EnqueueAction(&runner_action);
  processor.EnqueueAction(&collector_action);
  processor.set_delegate(&processor_delegate_);

Finally the processor is started:

  loop_.PostTask(FROM_HERE,
                 base::Bind([&processor] { processor.StartProcessing(); }));
  loop_.Run();

***************
All this is needed to run a helper script?  Maybe there are good reasons, but to an outsider this looks COMPLETELY NUTS.


PostinstallRunnerActionTest.RunAsRootCancelPostinstallActionTest looks more suspicious to me. It first suspends the 'postinst_suspend' child process with a SIGSTOP, and then terminates the latter with a SIGTERM. When 'postinst_suspend' is suspended, it won't handle SIGTERM until after it's resumed. The 'sleep' grandchild process created by 'postinst_suspend' can also become a zombie waiting for its suspended parent to reap it.

I issued a few trybot runs on auron-paladin with PostinstallRunnerActionTest.RunAsRootCancelPostinstallActionTest disabled and haven't observed the test runner hanging yet. Right now, I'm testing a possible fix with trybot runs to see if issuing a SIGCONT after the SIGTERM to 'postinst_suspend' would help unblock the SIGTERM propagation.
Perhaps a tryjob with "<board>-unittest-stress" will help. It runs the unittests repeatedly to try and reproduce flake failures. No idea if it does a good job of reproducing this failure.
I tried a few CLs to prevent orphaned child processes in the unit tests. 
  https://chromium-review.googlesource.com/c/426878/
  https://chromium-review.googlesource.com/c/427059/
  https://chromium-review.googlesource.com/c/427060/

No child process is orphaned as a result of running the unit tests, but unfortunately, the unit test phase still hangs sometimes:

- passed auron-unittest-stress try run: https://uberchromegw.corp.google.com/i/chromiumos.tryserver/builders/unittest_stress/builds/0
- passed auron-paladin try run: https://uberchromegw.corp.google.com/i/chromiumos.tryserver/builders/paladin/builds/1896
- failed auron-paladin try run: https://uberchromegw.corp.google.com/i/chromiumos.tryserver/builders/paladin/builds/1900

Now need to look at the problem from a different angle.
Labels: -Pri-0 Pri-1
Any success in getting a symbolified stack trace?

I haven't seen this in over a day, I am making it pri-1 again for now.
Labels: -Pri-1 Pri-0
Yes... back to Pri-0. :/
It looks like the hang is at a much later stage. From this run:

https://uberchromegw.corp.google.com/i/chromiumos.tryserver/builders/paladin/builds/1902/steps/UnitTest/logs/stdio

`os.waitpaid(child, 0)` in platform2_test.py:run already returned, which means the update-engine test runner process already terminated. I start tracing up the stack to the python and portage land...

Interesting.  It happens mostly on auron-paladin.  Why not auron-release?  Also, it seems to have happened once on cyan and lars.

What's common to these platforms and why is auron-paladin so overrepresented?
Same GCE builder perhaps?  It could be interesting to swap auron-paladin with another board to see if it follows the platform or builder.
I wonder if this is tied to load.

Unittests are run in parallel with unrelated tasks, (and it parallel with other unittests using parallel emerge).

In practice, load levels are semi-repeatableish on the same machine with the same source/build config. But changing boards changes ebuild dependencies, which can affect load for any given package's test, and paladin vs release will certainly have a big effect.
Cc: h...@chromium.org
Hi Don, thanks.  I like the theory in #45 better (well, I have to, he's my boss).  Could the deputy check the kernel version for this particular builder, compared to some of the others?  Are they all on the same version?

(Additionally, it would be extremely pleasurable if you would include instructions or links to instructions on how to find the builder name and log on to it.)

I am suspicious of this futex_wait() kernel bug:

https://groups.google.com/forum/#!topic/mechanical-sympathy/QbmpZxp6C64%5B1-25%5D
On chromiumos.tryserver, the hang was observed on these builders (based on the last 50 runs):

build247-m2
build264-m2
build266-m2
build268-m2
build270-m2
build271-m2
build272-m2
build275-m2 
build320-m2

Last time we looked into one of the bots, it was running kernel 3.11.? last updated around Dec 2016.
Based on my last try run: https://uberchromegw.corp.google.com/i/chromiumos.tryserver/builders/paladin/builds/1936/steps/UnitTest/logs/stdio

The hang seemed to occur after `platform2_test.py run` invoked sys.exit(0) and before  platform.eclass:platform_src_test:platform_pkg_test returned. There was only one MainThread in the platform2_test Python process.

Components: -OS>Systems Infra>Client>ChromeOS
Labels: Hotlist-TreeCloser
Adding infra component for visibility.
Project Member Comment 51 by bugdroid1@chromium.org, Jan 12 2017
The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/chromite/+/01f6ee56a1b55576a4d047fcfbdb58d1f97f4273

commit 01f6ee56a1b55576a4d047fcfbdb58d1f97f4273
Author: Ben Chan <benchan@chromium.org>
Date: Thu Jan 12 19:14:26 2017

cbuildbot_config: temporarily mark 'auron-paladin' not important

update-engine unit tests has been hanging on 'auron-paradin' more
frequently (chromium:678643) than other builders. This CL temporarily
marks 'auron-paladin' not important to reduce CQ failures as other
builders like 'samus-paladin' provide similar coverage. This CL will be
reverted once the issue is resolved.

BUG= chromium:678643 
TEST=`./cbuildbot/cbuildbot_config_unittest`

Change-Id: Ia8bdc27dca38afc02124e962a7c5f39a95b36750
Reviewed-on: https://chromium-review.googlesource.com/427325
Reviewed-by: Ben Chan <benchan@chromium.org>
Commit-Queue: Ben Chan <benchan@chromium.org>
Tested-by: Ben Chan <benchan@chromium.org>
Trybot-Ready: Ben Chan <benchan@chromium.org>

[modify] https://crrev.com/01f6ee56a1b55576a4d047fcfbdb58d1f97f4273/cbuildbot/config_dump.json
[modify] https://crrev.com/01f6ee56a1b55576a4d047fcfbdb58d1f97f4273/cbuildbot/chromeos_config.py

Status: Started
Project Member Comment 53 by bugdroid1@chromium.org, Jan 14 2017
The following revision refers to this bug:
  https://chromium.googlesource.com/aosp/platform/system/update_engine/+/f13a40924e0b1d71df07738078cebc933dd90a7e

commit f13a40924e0b1d71df07738078cebc933dd90a7e
Author: Ben Chan <benchan@chromium.org>
Date: Tue Jan 10 23:32:11 2017

update_engine: resume suspended PostInstall action to handle termination

PostinstallRunnerActionTest.RunAsRootCancelPostinstallActionTest exposes
an issue when PostinstallRunnerAction tries to terminate an action that
has been suspended. PostinstallRunnerAction::TerminateProcessing() uses
Subprocess::KillExec() to terminate the action by sending SIGTERM to the
child process associated with the action. However, if the action has
been suspended by PostinstallRunnerAction::SuspendAction(), the child
process won't receive the SIGTERM until it's resumed. This CL changes
PostinstallRunnerAction::TerminateProcessing() to resume the child
process after issuing SIGTERM.

BUG= chromium:678643 
TEST=Verified that no orphaned 'sleep' process is left after running
PostinstallRunnerActionTest.RunAsRootCancelPostinstallActionTest.

Change-Id: I6852e560550ce2ce49e28733da2b67a7f2e11ca5
Reviewed-on: https://chromium-review.googlesource.com/426878
Commit-Ready: Ben Chan <benchan@chromium.org>
Tested-by: Ben Chan <benchan@chromium.org>
Reviewed-by: Luigi Semenzato <semenzato@chromium.org>
Reviewed-by: Mike Frysinger <vapier@chromium.org>

[modify] https://crrev.com/f13a40924e0b1d71df07738078cebc933dd90a7e/payload_consumer/postinstall_runner_action.cc
[modify] https://crrev.com/f13a40924e0b1d71df07738078cebc933dd90a7e/payload_consumer/postinstall_runner_action.h

Project Member Comment 54 by bugdroid1@chromium.org, Jan 14 2017
The following revision refers to this bug:
  https://chromium.googlesource.com/aosp/platform/system/update_engine/+/efb56850475b27ab9197c6010a3f3910b078cdc2

commit efb56850475b27ab9197c6010a3f3910b078cdc2
Author: Ben Chan <benchan@chromium.org>
Date: Wed Jan 11 03:56:50 2017

update_engine: clean up sleep process in SubprocessTest.CancelTest

SubprocessTest.CancelTest terminates a 'sh' process, which is put into
sleep by executing `sleep 60`. Upon receiving SIGTERM, the 'sleep'
process isn't reaped and thus becomes orphaned. As a remedy, this CL
modifies the shell command to trap SIGTERM and kill the 'sleep' process
upon receiving SIGTERM.

BUG= chromium:678643 
TEST=Verified that no orphaned 'sleep' process is left after running
SubprocessTest.CancelTest.

Change-Id: I2e00be7eb5ef44f5cf9a7f151c4f9085854f34b5
Reviewed-on: https://chromium-review.googlesource.com/427059
Commit-Ready: Ben Chan <benchan@chromium.org>
Tested-by: Ben Chan <benchan@chromium.org>
Reviewed-by: Mike Frysinger <vapier@chromium.org>

[modify] https://crrev.com/efb56850475b27ab9197c6010a3f3910b078cdc2/common/subprocess_unittest.cc

Project Member Comment 55 by bugdroid1@chromium.org, Jan 14 2017
The following revision refers to this bug:
  https://chromium.googlesource.com/aosp/platform/system/update_engine/+/28f9a886f0699986f7c378a9898bea9c6f7515ee

commit 28f9a886f0699986f7c378a9898bea9c6f7515ee
Author: Ben Chan <benchan@chromium.org>
Date: Wed Jan 11 06:28:22 2017

update_engine: clean up sleep process in P2PManagerTest.LookupURL

P2PManagerTest.LookupURL terminates a 'sh' process, which is put into
sleep by executing `sleep 2`. Upon receiving SIGTERM, the 'sleep'
process isn't reaped and thus becomes orphaned. As a remedy, this CL
modifies the shell command to trap SIGTERM and kill the 'sleep' process
upon receiving SIGTERM.

BUG= chromium:678643 
TEST=Verified that no orphaned 'sleep' process is left after running
P2PManagerTest.LookupURL.

Change-Id: I05aaf14ee39ff77a7a2040f97a19bcc9d70046a8
Reviewed-on: https://chromium-review.googlesource.com/427060
Commit-Ready: Ben Chan <benchan@chromium.org>
Tested-by: Ben Chan <benchan@chromium.org>
Reviewed-by: Mike Frysinger <vapier@chromium.org>

[modify] https://crrev.com/28f9a886f0699986f7c378a9898bea9c6f7515ee/p2p_manager_unittest.cc

Project Member Comment 56 by bugdroid1@chromium.org, Jan 14 2017
The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/platform2/+/c41740c5d6afc1240285a5fb4fdf931a560d2917

commit c41740c5d6afc1240285a5fb4fdf931a560d2917
Author: Jacob Dufault <jdufault@google.com>
Date: Tue Apr 12 17:47:14 2016

common-mk: Kill any auxiliary child processes after the child terminates.

It's possible the child will fork but the forked process will continue
running even after the child exits.  This leads to hard-to-debug timeout
errors since the test never finishes.

This version doesn't make the leaked processes a failure.  We want the
debug info more than being super strict here.

BUG= chromium:596566 
BUG= chromium:678643 
TEST=cros_run_unit_tests --board link --packages="update_engine"
TEST=manually verified process behavior in repl

Change-Id: Iab8652118ffd17be6913330bbfcc8fd6f91fc711
Reviewed-on: https://chromium-review.googlesource.com/428038
Commit-Ready: Ben Chan <benchan@chromium.org>
Tested-by: Mike Frysinger <vapier@chromium.org>
Reviewed-by: Ben Chan <benchan@chromium.org>

[modify] https://crrev.com/c41740c5d6afc1240285a5fb4fdf931a560d2917/common-mk/platform2_test.py

Status: Fixed
Comment 58 by dchan@google.com, Mar 4 2017
Labels: VerifyIn-58
Labels: VerifyIn-59
Labels: VerifyIn-60
Labels: VerifyIn-61
Comment 62 by dchan@chromium.org, Oct 14 (4 days ago)
Status: Archived
Sign in to add a comment