[Flake] eve-tot-chrome-pfq-informational failed at platform_FilePerms due to imageloader component not being removed after test |
||||||||||
Issue description
11/14 13:00:41.105 WARNI| test:0606| The test failed with the following exception
Traceback (most recent call last):
File "/usr/local/autotest/common_lib/test.py", line 600, in _exec
_call_test_function(self.execute, *p_args, **p_dargs)
File "/usr/local/autotest/common_lib/test.py", line 800, in _call_test_function
return func(*args, **dargs)
File "/usr/local/autotest/common_lib/test.py", line 464, in execute
postprocess_profiled_run, args, dargs)
File "/usr/local/autotest/common_lib/test.py", line 371, in _call_run_once
self.run_once(*args, **dargs)
File "/usr/local/autotest/tests/platform_FilePerms/platform_FilePerms.py", line 511, in run_once
raise error.TestFail('Found %d permission errors' % errors)
TestFail: Found 1 permission errors
https://00e9e64bac869e22b458ef9b53b36ca0edc4444dcd54142a5a-apidata.googleusercontent.com/download/storage/v1/b/chromeos-autotest-results/o/257211223-chromeos-test%2Fchromeos6-row4-rack17-host18%2Fdebug%2Fclient.0.DEBUG?qk=AD5uMEuo589b_bBlZXEkxmz0iUEcDBrPZJJbWaXYVLs9YTf2wI7AiaxJS3lX3nLNUZwzUaU7u1WoMN3KQuI11mRjtIPmp4T9KfwFKPMLOFh7Xqk6O4XGTEt93uhY4p3WfsgmRcgyiwEd_PDx1XjNblK9F7z11qNGpOl6IKIPgjq8y2NAi1VJtTlg67tzf9KUtwBW8zpG2RzW_8RA1OSjvVtlkbFs45Lvw7FogAbqMF9qWkPiSJLntr8_hb3uXuQfDbFaqTZ5E-TfzdG59xng5nAggkfNjvSodQ5oDrt_nWRtEnctQqvzgMfgcTl5pLyGSBMKMFB0xiIuzZYvjugrRnFj04dAfytxlkg0q1Z10YruSahSl-5538U-BV7jqECDaLpaB_1UpjghRZ-GuQil8_kXSGVuzSTgZ4w8nNa7Ns4kAGqH4LrgaiUtBem6iGtw_gNoyopbB-5pv64B3x21HDOTGgdsP4EPjCWIjiqtprUkcPjsuCilKPvENTCuM_a9EC19uORoDJQwvEhlfotuOL6ovzLq3esJGv1NoAm3IB126igDQ2dyS_BqHkh-5k2a251AXTxKC5DUSUVSrTpDKdXI9bkKY026MhT9oUMeRkwMZcRW8nt7GpUqE8S9bQV3QAKa4zlpNFS6czXZiFVeCW7bHaSD7UOsBM6ZOUibyMbuQsaJo_GBQoHC7ITAnXJAoGDlWJkqSSj5xVeaSR03UDdKQK-9ibCC7VHBwHh3n-yRn4wCbz5BlBfY-NW-F7Af3YkjtGrOGFERhqIlt4ZuuL5mpuyu5tUbHkF0i45j6twmdr7qIYHbCKlhD2uaXid3gvnG7jk7VLFz7sKPMrS3CbOS32PQr84F8QERqV11qJonJMRP2_QgV4k
,
Nov 16
Over to gardener to investigate.
,
Nov 19
Over to this week's gardener.
,
Nov 19
Flaky test. Last 3 runs are successful. Will examine further.
,
Nov 26
Passing to the next gardener.
,
Nov 27
Here's the error from the failure in #5:
11/20 12:02:08.481 ERROR|platform_FilePerms:0407| No expectations entry for "/run/imageloader/cros-termina/99999.0.0" with info "{'device': '/dev/loop4', 'type': 'ext4', 'options': ['rw', 'seclabel', 'relatime', 'block_validity', 'delalloc', 'nojournal_checksum', 'barrier', 'user_xattr', 'acl']}"
This test is somewhat similar to the file-checking security tests that I've already ported to Tast, but I haven't looked into porting this one yet.
,
Nov 28
eve-tot-chrome-pfq-informational failed again at http://cros-goldeneye/chromeos/healthmonitoring/buildDetails?buildbucketId=8928622358409831392. 11/28 06:14:25.026 ERROR|platform_FilePerms:0407| No expectations entry for "/run/imageloader/cros-termina/99999.0.0" with info "{'device': '/dev/loop4', 'type': 'ext4', 'options': ['rw', 'seclabel', 'relatime', 'block_validity', 'delalloc', 'nojournal_checksum', 'barrier', 'user_xattr', 'acl']}" Xiaochu, please fix this source of flakiness -- platform_FilePerms runs in bvt-inline, so it must pass reliably. Otherwise, it makes CQ and PFQ runs fail. Presumably you need to add a new expectation for this mount to platform_FilePerms.py.
,
Nov 28
/run/imageloader/cros-termina/* is the mount point of an on-demand installed component which is not supposed to exist. Are there any tests that run on PFQ triggering crostini component download?
,
Nov 28
The mount point seems to be leaked by a tast-based test: https://cs.corp.google.com/chromeos_public/src/platform/tast-tests/src/chromiumos/tast/local/vm/util.go?l=35
,
Nov 28
Thanks for digging into it! Stephen, I think that this is your code. Mind taking a look? I suspect that there need to be some deferred cleanup statements. If cleanup requires a context, it's probably also necessary to pass a derived context with a slightly-shorter deadline to subtests so that there's still adequate time for cleanup if the tests time out.
,
Nov 28
,
Nov 28
Taking a further look, somehow imageloader is triggered to mount the termina component with a key and did not fail: 2018-11-28T13:24:26.871905+00:00 NOTICE dbus[660]: [system] Activating service name='org.chromium.ImageLoader' (using servicehelper) 2018-11-28T13:24:26.886288+00:00 NOTICE dbus[660]: [system] Successfully activated service 'org.chromium.ImageLoader' 2018-11-28T13:24:26.898031+00:00 WARNING kernel: [ 3146.975921] device-mapper: verity: Argument 0: 'payload=/dev/loop4' 2018-11-28T13:24:26.898041+00:00 WARNING kernel: [ 3146.975926] device-mapper: verity: Argument 1: 'hashtree=/dev/loop4' 2018-11-28T13:24:26.898043+00:00 WARNING kernel: [ 3146.975929] device-mapper: verity: Argument 2: 'hashstart=421848' 2018-11-28T13:24:26.898044+00:00 WARNING kernel: [ 3146.975931] device-mapper: verity: Argument 3: 'alg=sha256' 2018-11-28T13:24:26.898045+00:00 WARNING kernel: [ 3146.975933] device-mapper: verity: Argument 4: 'root_hexdigest=65277bddbcd96b71899043431646b92d2e56fed61be375347efe1bcf6d4f424a' 2018-11-28T13:24:26.898046+00:00 WARNING kernel: [ 3146.975935] device-mapper: verity: Argument 5: 'salt=56dbc24f3cc57e00f302af3ef3c1101368a6f38deb27d4356b5caba6fd95b69d' 2018-11-28T13:24:26.898047+00:00 WARNING kernel: [ 3146.975937] device-mapper: verity: Argument 6: 'error_behavior=eio' 2018-11-28T13:24:26.913022+00:00 INFO kernel: [ 3146.991109] EXT4-fs (dm-2): mounted filesystem without journal. Opts: 2018-11-28T13:24:26.913432+00:00 INFO debugd[22793]: Starting vm_concierge ... Then later imageloader is trying to unmount the image but failed since the mount point is not proper which is WAI(?) 2018-11-28T13:28:24.040910+00:00 ERR imageloader[26133]: /dev/loop4 is not device mapped! 2018-11-28T13:28:24.041643+00:00 ERR imageloader[26133]: Failed to cleanup "/run/imageloader/cros-termina/99999.0.0" 2018-11-28T13:28:24.041918+00:00 ERR imageloader[26129]: --unmount_all failed! It looks like crostini is active around this time since vm_concierge is started after image mounted.
,
Nov 28
That mount is probably left over from vm.CrostiniStartEverything. Sounds like we need two changes then: 1) Restore the defer call to shut down concierge and the VM. Otherwise unmounting the component would fail with EBUSY. 2) Add a defer to unmount the manually mounted component.
,
Nov 29
The following revision refers to this bug: https://chromium.googlesource.com/chromiumos/platform/tast-tests/+/ed10a53502b8fdbea5d9127ffa77ecf3140cfe59 commit ed10a53502b8fdbea5d9127ffa77ecf3140cfe59 Author: Stephen Barber <smbarber@chromium.org> Date: Thu Nov 29 20:11:08 2018 tast-tests: add UnmountComponent for vm.CrostiniStartEverything Add an UnmountComponent method to allow deferred unmounting of the component by a test, and use that in vm.CrostiniStartEverything. Also clean up vm.CrostiniStartEverything to use a derived deadline for subtests so that we should always have enough time to do cleanup at the end. BUG= chromium:905479 TEST=vm.CrostiniStartEverything; check that component isn't mounted Change-Id: Ife810e47f710a2ec4fcd8bb2166dd13fb2f67bd4 Reviewed-on: https://chromium-review.googlesource.com/1354545 Commit-Ready: ChromeOS CL Exonerator Bot <chromiumos-cl-exonerator@appspot.gserviceaccount.com> Tested-by: Stephen Barber <smbarber@chromium.org> Reviewed-by: Dan Erat <derat@chromium.org> [modify] https://crrev.com/ed10a53502b8fdbea5d9127ffa77ecf3140cfe59/src/chromiumos/tast/local/bundles/cros/vm/crostini_start_everything.go [modify] https://crrev.com/ed10a53502b8fdbea5d9127ffa77ecf3140cfe59/src/chromiumos/tast/local/vm/util.go
,
Nov 29
|
||||||||||
►
Sign in to add a comment |
||||||||||
Comment 1 by afakhry@chromium.org
, Nov 16Labels: Type-Bug