New issue
Advanced search Search tips

Issue 915691 link

Starred by 2 users

Issue metadata

Status: Untriaged
Owner:
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 3
Type: Task



Sign in to add a comment

Excessive work between Chrome and libchaps

Project Member Reported by rsleevi@chromium.org, Dec 17

Issue description

While investigating an NSS bug, I had added additional diagnostic logging to NSS to try to trace down some unexpected call sequences. In particular, I noticed multiple threads attempting to enumerate smart cards, which also involves reparsing certificates and fighting over locks.

I'm uncertain as to how to determine whether these traces are significant on ChromeOS (sampling traces?), but it was fruit that seemed low-hanging enough that it would be worth avoiding extra work.

In the NSSCertDatabaseChromeOS, calls to ListCerts() are dispatched back into the pool with CONTINUE_ON_SHUTDOWN behaviour. As a consequence, multiple calls to ListCerts() may result in multiple workers being allocated, each in a ListCertsImpl (which may block, and may require UI intervention). Each of these calls are going to fight over the module list lock, the module lock(s), the token lock(s), and the certificate lock(s).

Further, with the investigation for Issue 844537 adding additional logging, these logging operations may also be dispatched to a worker, adding more contention. This logging operation is triggered by NSSCertDatabase::NotifyObserversCertDBChanged

Given that libchaps is going to serialize these calls over DBus, it may make sense to serialize these calls within the NSSCertDatabase, by using an explicit SequencedTaskRunner that can sequence these tasks. While that serialization would reduce contention, it would add time - but it may make sense to either join existing operations in process (for new requests) or to reparent 'old' requests to new operations, and cancel the in-progress operation.

Of course, none of these may be useful - and it may solely be an artifact of the additional logging that will be removed - but it stood out when trying to trace through call sequences as to why so much activity was going on in such a small window, repeatedly for the same task. Due to interplay with the NSS bug, it added more timing non-determinism (based on whether or not there were listing tasks going on at the time of the test), which further surprised.
 
Cc: hendrich@chromium.org emaxx@chromium.org
I'll send a CL to remove the additional logging in the next few days.

Other than that, I agree that the behavior could be improved. Assuming a scenario of:
(*) ListCerts_Call_1 comes in
(*) ListCerts_Call_2 comes in before ListCerts_Call_1 has provided a result,

IIUC we have these options as you're describing them:
(1) Leave everything as it is. Both operations are independent.
(2) Serialize. ListCerts_Call_2 will start only after ListCerts_Call_1 has finished.
(3) Join. ListCerts_Call_2 will not talk to NSS, but just piggy-back on ListCerts_Call_1.
(4) Reparent w/ serialize. ListCerts_Call_1 will finish but we won't tell anyone. ListCerts_Call_2 starts afterwards, and the results will be used for both calls.
(5) Reparent w/ immediate start. ListCerts_Call_2 will start immediately and its results will be used for both calls. IIUC We can't actively cancel ListCerts_Call_1, so it will continue but its results will be discarded.

(1) is status quo.
Assuming no UI is shown during cert listing, (4) and (5) do not seem to have any immediate advantage over (2), so I'll disregard them for now (or am I missing an advantage)?

That would leave us with (2) and (3). I'm not sure if we should do (3) - it means that an operation such as
- Import Certificate
- List Certificates
could in theory result in the imported certificate not being listed, which (while code should be able to deal with that - the cert could have been deleted in the meantime) feels like it could break assumptions we implicitly have in the code.

(2) seems like an option we can do, the only downside would be that listing certs could take longer in some cases.

WDYT?
Project Member

Comment 2 by bugdroid1@chromium.org, Jan 8

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/4d6bd7ac0faa1bef8e11c5d4e421ef89f6446437

commit 4d6bd7ac0faa1bef8e11c5d4e421ef89f6446437
Author: Pavol Marko <pmarko@chromium.org>
Date: Tue Jan 08 07:19:54 2019

Remove logging of system token user certs

Remove logging added for https://crbug.com/844537 under the assumption
that the issue that the logging should help diagnose has been fixed.

Note: This is not a simple revert of CL:1169478 to keep the NULL-nullptr
clean up done there.

Bug: 844537, 915691
Test: net_unittests
Change-Id: I2b2126ae40118d6cef74ddfe95d4af751ad017a4
Reviewed-on: https://chromium-review.googlesource.com/c/1398282
Reviewed-by: Xiyuan Xia <xiyuan@chromium.org>
Reviewed-by: Matt Mueller <mattm@chromium.org>
Commit-Queue: Pavol Marko <pmarko@chromium.org>
Cr-Commit-Position: refs/heads/master@{#620642}
[modify] https://crrev.com/4d6bd7ac0faa1bef8e11c5d4e421ef89f6446437/chrome/browser/chromeos/chrome_browser_main_chromeos.cc
[modify] https://crrev.com/4d6bd7ac0faa1bef8e11c5d4e421ef89f6446437/net/cert/nss_cert_database.cc
[modify] https://crrev.com/4d6bd7ac0faa1bef8e11c5d4e421ef89f6446437/net/cert/nss_cert_database.h
[modify] https://crrev.com/4d6bd7ac0faa1bef8e11c5d4e421ef89f6446437/net/cert/nss_cert_database_chromeos.cc
[modify] https://crrev.com/4d6bd7ac0faa1bef8e11c5d4e421ef89f6446437/net/cert/nss_cert_database_chromeos.h
[modify] https://crrev.com/4d6bd7ac0faa1bef8e11c5d4e421ef89f6446437/net/cert/nss_cert_database_chromeos_unittest.cc

(2) sounds like the safest option here.

Didn't you plan on removing/replacing NSS altogether? Does that plan still exist? How would this be handled then?
Considering that #2 will end up serialized by NSS with regards to token locks if it has to grab them, #2 works.

This would apply regardless of NSS, as it's a PKCS#11 issue. Even if it wasn't a PKCS#11 issue, with chaps we'd also have the DBus issue, so... serializing all the way down ;)

Sign in to add a comment