New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 807602 link

Starred by 2 users

Issue metadata

Status: Fixed
Owner:
Closed: May 2018
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 2
Type: Bug



Sign in to add a comment

chaps mTestLogin.LoginNoPrivateWait unittest failures

Reported by vpalatin@chromium.org, Jan 31 2018

Issue description

While reviewing canaries failure, I have noted a few chaps unittest failures which are all identical:

chaps-0.0.1-r2637: [ RUN      ] TestLogin.LoginNoPrivateWait
chaps-0.0.1-r2637: [0131/041403:ERROR:chaps.cc(527)] C_Login CKR_WOULD_BLOCK_FOR_PRIVATE_OBJECTS
chaps-0.0.1-r2637: ../../../../../../../../../mnt/host/source/src/platform2/chaps/chaps_test.cc:886: Failure
chaps-0.0.1-r2637:       Expected: 0x00000000
chaps-0.0.1-r2637:       Which is: 0
chaps-0.0.1-r2637: To be equal to: C_Login(1, 1, __null, 0)
chaps-0.0.1-r2637:       Which is: 3343338496
chaps-0.0.1-r2637: [  FAILED  ] TestLogin.LoginNoPrivateWait (19 ms)

e.g.
https://luci-logdog.appspot.com/v/?s=chromeos%2Fbb%2Fchromeos%2Fnyan_kitty-release%2F1882%2F%2B%2Frecipes%2Fsteps%2FUnitTest%2F0%2Fstdout

I have the following recent failures:
https://luci-milo.appspot.com/buildbot/chromeos/gale-release/2011
https://luci-milo.appspot.com/buildbot/chromeos/nyan_kitty-release/1882
https://luci-milo.appspot.com/buildbot/chromeos/relm-release/1903
https://luci-milo.appspot.com/buildbot/chromeos/bob-nyc-android-pfq/1307
https://luci-milo.appspot.com/buildbot/chromeos/veyron_minnie-release/1904
https://luci-milo.appspot.com/buildbot/chromeos/beaglebone-paladin/15960
https://luci-milo.appspot.com/buildbot/chromeos/bob-paladin/2183
https://luci-milo.appspot.com/buildbot/chromeos/peach_pit-paladin/18338
https://luci-milo.appspot.com/buildbot/chromeos/peppy-paladin/17754
https://luci-milo.appspot.com/buildbot/chromeos/rainier-paladin/334

Most of them are ARM but not all...
 
CKR_WOULD_BLOCK_FOR_PRIVATE_OBJECTS was introduced in https://crrev.com/c/667748. And the code that repeats C_Login and if it gets this error and the (now occasionally failing) TestLogin.LoginNoPrivateWait test that verifies that this retry happens were added in  https://crrev.com/c/669945.

Possibly, in some situations on those canary VMs it fails to retry within 10 ms, which the test sets as an arbitrary timeout for waiting. We can increase that test timeout w/o slowing down the test in faster cases - it finishes as soon as it gets the 2nd (successful) response. It will slow down the test if something is wrong, but that's probably ok.

Comment 2 by derat@chromium.org, May 15 2018

Components: OS>Systems
Labels: -Pri-3 OS-Chrome Pri-2
Owner: ejcaruso@chromium.org
Status: Assigned (was: Untriaged)
I just saw this again on amd64-generic-tot-asan-informational at http://cros-goldeneye/chromeos/healthmonitoring/buildDetails?buildbucketId=8946434049661770096 :

chaps-0.0.1-r2710: [ RUN      ] TestLogin.LoginNoPrivateWait
chaps-0.0.1-r2710: [0515/212210:ERROR:chaps.cc(527)] C_Login - CKR_WOULD_BLOCK_FOR_PRIVATE_OBJECTS
chaps-0.0.1-r2710: ../../../../../../../tmp/portage/chromeos-base/chaps-0.0.1-r2710/work/chaps-0.0.1/chaps/chaps_test.cc:886: Failure
chaps-0.0.1-r2710:       Expected: 0x00000000
chaps-0.0.1-r2710:       Which is: 0
chaps-0.0.1-r2710: To be equal to: C_Login(1, 1, __null, 0)
chaps-0.0.1-r2710:       Which is: 3343338496
chaps-0.0.1-r2710: [  FAILED  ] TestLogin.LoginNoPrivateWait (11 ms)
Same 3343338496 value as before, interestingly.
3343338496 is 0xc7474c00 which is CKR_CHAPS_SPECIFIC_FIRST/CKR_WOULD_BLOCK_FOR_PRIVATE_OBJECTS [CKR_VENDOR_DEFINED (0x80000000) + 0x47474c00]. I imagine asan might be slowing things down enough for this to fail?

I can try bumping the timeout, but there should probably be a better way to do this that isn't so sensitive to timing issues.
We could add extra instrumentation for testing, so that PerformNonBlocking() in chaps.cc waits for the specified number of retries during unit tests rather than timeout. But that'd make LoginNoPrivateWait (almost) pointless, I guess. It's entire purpose is to verify that this loop-with-timeout in PerformNonBlocking exists and works.

A better approach is mocked time. Not sure how hard it is to inject it into libchaps.

Comment 5 by derat@chromium.org, May 19 2018

Should the timeout just be increased for now?
Re #5: yes, on its way to CQ: https://crrev.com/c/1064748
Project Member

Comment 7 by bugdroid1@chromium.org, May 19 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/platform2/+/35597d39512e173e4d8c5beae501ac4f83c818b2

commit 35597d39512e173e4d8c5beae501ac4f83c818b2
Author: Eric Caruso <ejcaruso@chromium.org>
Date: Sat May 19 00:20:02 2018

chaps: increase timeout on TestLogin.LoginNoPrivateWait

Workaround until we can get something more robust in place.

BUG= chromium:807602 
TEST=unit tests

Change-Id: Ibf24d364f4a764c34738ca598ef48a18c884d134
Reviewed-on: https://chromium-review.googlesource.com/1064748
Commit-Ready: Eric Caruso <ejcaruso@chromium.org>
Tested-by: Eric Caruso <ejcaruso@chromium.org>
Reviewed-by: Andrey Pronin <apronin@chromium.org>

[modify] https://crrev.com/35597d39512e173e4d8c5beae501ac4f83c818b2/chaps/chaps_test.cc

Comment 8 by derat@chromium.org, May 19 2018

Thanks!
Status: Fixed (was: Assigned)
I filed bug #844114 for a more robust solution. Marking this one closed.

Sign in to add a comment