CheckPublicKeyPins takes too long on the IO thread
Reported by
bizzbys...@gmail.com,
Jul 27 2017
|
|||||
Issue descriptionChrome Version : 58.0.3029.0 OS Version: OS X 10.12.5 URLs (if applicable) : https://cs.chromium.org/chromium/src/net/socket/ssl_client_socket_impl.cc?type=cs&sq=package:chromium&l=1308 Other browsers tested: Add OK or FAIL after other browsers where you have tested this issue: Safari 5: Firefox 4.x: IE 7/8/9: What steps will reproduce the problem? 1. Add code to set start_time to TimeTicks::Now prior to this function 2. Then add this code after this function base::TimeDelta duration = base::TimeTicks::Now() - start_time; UMA_HISTOGRAM_CUSTOM_TIMES( "Net.DoVerifyCertComplete.CheckPublicKeyPins.Latency", duration, base::TimeDelta::FromMilliseconds(0), base::TimeDelta::FromMilliseconds(100), 100); 3. Build "ninja -C out/Release chrome" with args.gn containing "is_debug = false" 4. Visit 4 https web sites (www.kayak.com, www.washingtonpost.com, www.nytimes.com, www.bankofamerica.com) What is the expected result? Mean latency of CheckPublicKeyPins.Latency should be significantly less than 1 ms. What happens instead of that? Histogram: Net.DoVerifyCertComplete.CheckPublicKeyPins.Latency recorded 174 samples, mean = 1.0 (flags = 0x41) 0 ------------------------------------------------------------------------O (79 = 45.4%) 1 --------------------------------O (35 = 20.1%) {45.4%} 2 -----------------------------------------O (45 = 25.9%) {65.5%} 3 ------------O (13 = 7.5%) {91.4%} 4 --O Please provide any additional information below. Attach a screenshot if possible. UserAgentString: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.0 Safari/537.36
,
Jul 27 2017
That indicates it's taking between 0-4ms in an unoptimized build. That does not seem unreasonable at all, particularly for CPU-bound activity that would be negatively impacted by excessive context switching (such as thread hops). I suspect with is_official_build=true is_chrome_branded=false is_debug=false Those would be even less. In your chromium-dev@ thread, you suggested up to 30ms - what lead you to that conclusion?
,
Jul 27 2017
I saw 30 ms on a test Windows machine that might have some configuration issues. I'm trying to get to the bottom of that and will follow up here. But in the meantime, I ran these experiments on my own developer build on Mac and the above was the result. By the way, I don't see other tasks on the IO thread taking anywhere close to this long, with one exception that I'm still debugging.
,
Jul 27 2017
As with any performance measurement, using optimized builds is strongly encouraged to ensure it's representative testing. chrome://tracing can further help you investigate performance. This may also be a regression due to the growth of domain security policies, and potentially related to the data structure used to look up the policies for the domain.
,
Jul 28 2017
Further testing indicates that I am seeing the delay in optimized builds but that the delay is here (https://cs.chromium.org/chromium/src/net/socket/ssl_client_socket_impl.cc?type=cs&sq=package:chromium&l=1303) as opposed to CheckPublicKeyPins. Will have an update with more detail soon. Just wanted to make sure no one was staring at the CheckPublicKeyPins method for a possible regression.
,
Jul 28 2017
What architecture are you using/testing on? (That is, is the MacOS user agent string representative, or are you on mobile?)
,
Jul 28 2017
So I am using two platforms: 1. my developer machine is a mac desktop 2. my test machine is a windows 7 desktop. The problem appears to be worse on the windows machine but again I'm still narrowing this down. At this point I don't want to waste your time unless you are already chasing this based on other evidence.
,
Jul 28 2017
Nope, just wanted to confirm you weren't using a mobile platform. That method performs several CPU-bound activities (thus context-switching doesn't really improve performance) that are critical to connection establishment (thus context-switching can negatively affect all connections).
,
Aug 2 2017
,
Aug 3 2017
Unable to triage this issue from TE-End, hence adding "TE-NeedsTriageHelp" for further triage.
,
Aug 3 2017
I opened this issue and now think it should be closed.
First, CheckPublicKeyPins is not where the delay is. The delay is in VerifyCT but the numbers I'm seeing using an optimized build and physical hardware are much lower than I saw on that one test machine, which was a VM and may have some security context issues.
Histogram: IOThreadDelay.DoVerifyCertComplete.VerifyCT.Latency recorded 836 samples, mean = 1.8 (flags = 0x41)
0 ---------------------------------------------------------O (230 = 27.5%)
1 --------------------O (79 = 9.4%) {27.5%}
2 ------------------------------------------------------------------------O (291 = 34.8%) {37.0%}
3 ----------------------------O (112 = 13.4%) {71.8%}
4 -------------------------O (100 = 12.0%) {85.2%}
5 ---O (13 = 1.6%) {97.1%}
6 -O (5 = 0.6%) {98.7%}
7 -O (3 = 0.4%) {99.3%}
8 O (2 = 0.2%) {99.6%}
9 O (0 = 0.0%) {99.9%}
10 O (1 = 0.1%) {99.9%}
11 ...
My guess is that this will be seen as acceptable and so again this issue should be closed.
,
Aug 9 2017
|
|||||
►
Sign in to add a comment |
|||||
Comment 1 by davidben@chromium.org
, Jul 27 2017