New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 749777 link

Starred by 4 users

Issue metadata

Status: WontFix
Owner: ----
Closed: Aug 2017
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Mac
Pri: 3
Type: Bug



Sign in to add a comment

CheckPublicKeyPins takes too long on the IO thread

Reported by bizzbys...@gmail.com, Jul 27 2017

Issue description

Chrome Version       : 58.0.3029.0
OS Version: OS X 10.12.5
URLs (if applicable) : https://cs.chromium.org/chromium/src/net/socket/ssl_client_socket_impl.cc?type=cs&sq=package:chromium&l=1308
Other browsers tested:
  Add OK or FAIL after other browsers where you have tested this issue:
     Safari 5:
  Firefox 4.x:
     IE 7/8/9:

What steps will reproduce the problem?
1. Add code to set start_time to TimeTicks::Now prior to this function
2. Then add this code after this function
  base::TimeDelta duration = base::TimeTicks::Now() - start_time;
  UMA_HISTOGRAM_CUSTOM_TIMES(
    "Net.DoVerifyCertComplete.CheckPublicKeyPins.Latency",
    duration,
    base::TimeDelta::FromMilliseconds(0),
    base::TimeDelta::FromMilliseconds(100),
    100);
3. Build "ninja -C out/Release chrome" with args.gn containing "is_debug = false"
4. Visit 4 https web sites (www.kayak.com, www.washingtonpost.com, www.nytimes.com, www.bankofamerica.com)

What is the expected result?
Mean latency of CheckPublicKeyPins.Latency should be significantly less than 1 ms.

What happens instead of that?
Histogram: Net.DoVerifyCertComplete.CheckPublicKeyPins.Latency recorded 174 samples, mean = 1.0 (flags = 0x41)
0  ------------------------------------------------------------------------O (79 = 45.4%)
1  --------------------------------O                                         (35 = 20.1%) {45.4%}
2  -----------------------------------------O                                (45 = 25.9%) {65.5%}
3  ------------O                                                             (13 = 7.5%) {91.4%}
4  --O  


Please provide any additional information below. Attach a screenshot if
possible.

UserAgentString: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.0 Safari/537.36



 
Components: Internals>Network>Certificate
That indicates it's taking between 0-4ms in an unoptimized build. That does not seem unreasonable at all, particularly for CPU-bound activity that would be negatively impacted by excessive context switching (such as thread hops).

I suspect with

is_official_build=true
is_chrome_branded=false
is_debug=false

Those would be even less.

In your chromium-dev@ thread, you suggested up to 30ms - what lead you to that conclusion?
I saw 30 ms on a test Windows machine that might have some configuration issues. I'm trying to get to the bottom of that and will follow up here. But in the meantime, I ran these experiments on my own developer build on Mac and the above was the result. By the way, I don't see other tasks on the IO thread taking anywhere close to this long, with one exception that I'm still debugging.
Cc: lgar...@chromium.org
Components: Internals>Network>DomainSecurityPolicy
As with any performance measurement, using optimized builds is strongly encouraged to ensure it's representative testing. chrome://tracing can further help you investigate performance.

This may also be a regression due to the growth of domain security policies, and potentially related to the data structure used to look up the policies for the domain.
Further testing indicates that I am seeing the delay in optimized builds but that the delay is here (https://cs.chromium.org/chromium/src/net/socket/ssl_client_socket_impl.cc?type=cs&sq=package:chromium&l=1303) as opposed to CheckPublicKeyPins. Will have an update with more detail soon. Just wanted to make sure no one was staring at the CheckPublicKeyPins method for a possible regression.
What architecture are you using/testing on? (That is, is the MacOS user agent string representative, or are you on mobile?)
So I am using two platforms:
1. my developer machine is a mac desktop
2. my test machine is a windows 7 desktop.

The problem appears to be worse on the windows machine but again I'm still narrowing this down. At this point I don't want to waste your time unless you are already chasing this based on other evidence.
Nope, just wanted to confirm you weren't using a mobile platform. That method performs several CPU-bound activities (thus context-switching doesn't really improve performance) that are critical to connection establishment (thus context-switching can negatively affect all connections).
Labels: Needs-Milestone
Cc: kkaluri@chromium.org
Labels: TE-NeedsTriageHelp
Unable to triage this issue from TE-End, hence adding "TE-NeedsTriageHelp" for further triage.
I opened this issue and now think it should be closed.

First, CheckPublicKeyPins is not where the delay is. The delay is in VerifyCT but the numbers I'm seeing using an optimized build and physical hardware are much lower than I saw on that one test machine, which was a VM and may have some security context issues.

Histogram: IOThreadDelay.DoVerifyCertComplete.VerifyCT.Latency recorded 836 samples, mean = 1.8 (flags = 0x41)
0   ---------------------------------------------------------O                (230 = 27.5%)
1   --------------------O                                                     (79 = 9.4%) {27.5%}
2   ------------------------------------------------------------------------O (291 = 34.8%) {37.0%}
3   ----------------------------O                                             (112 = 13.4%) {71.8%}
4   -------------------------O                                                (100 = 12.0%) {85.2%}
5   ---O                                                                      (13 = 1.6%) {97.1%}
6   -O                                                                        (5 = 0.6%) {98.7%}
7   -O                                                                        (3 = 0.4%) {99.3%}
8   O                                                                         (2 = 0.2%) {99.6%}
9   O                                                                         (0 = 0.0%) {99.9%}
10  O                                                                         (1 = 0.1%) {99.9%}
11  ... 

My guess is that this will be seen as acceptable and so again this issue should be closed.
Status: WontFix (was: Unconfirmed)

Sign in to add a comment