Issue 736236

Issue metadata

Status: Verified
Owner:
Closed: Aug 2017
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 1
Type: Feature



install DNS cache on shards to speed up ssh and other networking tools

Project Member Reported by ihf@chromium.org, Jun 23 2017

Issue description

For context see issue 734887 and issue 726481.

We can speed up tests on the CQ by 20-30s (or 5-15%) simply by installing a DNS cache. This should free up to one DUT out of the pool:cq (usually around 7). This is not the final fix for ssh being slow, just a general quick improvement for the servers (and should be easy using puppet).

https://bugs.chromium.org/p/chromium/issues/detail?id=734887#c16

Time improvements:
                                        1284   1301   1305   delta    %

security_NetworkListeners               138s   142s   118s     20s  14%
cheets_KeyboardTest                     176s   165s   141s     24s  14%
cheets_CTS_N.CtsOpenGlPerf2TestCases    484s   494s   461s     23s   5%
cheets_StartAndroid.stress              681s   683s   650s     31s   5%
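
The delta column above can be reproduced with a few lines of Python. This is only a sketch under assumptions not stated in the issue: the first two columns (1284, 1301) are taken to be runs without the cache, the third (1305) a run with it, and the delta is measured against the faster of the two uncached runs. The percentages it computes are relative to that same baseline, so they may differ from the table by a rounding step.

```python
# Timings in seconds, copied from the table above.
# Assumption: columns 1284 and 1301 are uncached runs, 1305 is the cached run.
TIMINGS = {
    "security_NetworkListeners": (138, 142, 118),
    "cheets_KeyboardTest": (176, 165, 141),
    "cheets_CTS_N.CtsOpenGlPerf2TestCases": (484, 494, 461),
    "cheets_StartAndroid.stress": (681, 683, 650),
}


def improvement(before_a, before_b, after):
    """Return (delta_seconds, percent) measured against the faster uncached run."""
    baseline = min(before_a, before_b)  # assumption: conservative baseline
    delta = baseline - after
    return delta, 100.0 * delta / baseline


for name, runs in TIMINGS.items():
    delta, pct = improvement(*runs)
    print(f"{name:40s} {delta:3d}s {pct:5.1f}%")
```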

---

If you want to just follow chromeos-server98:
apt-get install unbound

Change the config in /etc/unbound/unbound.conf to something like:
forward-zone:
    name: "."
    forward-addr: 8.8.8.8
    forward-addr: 8.8.4.4
    forward-addr: 208.67.222.222
    forward-addr: 208.67.220.220

To check that it is working, run:
unbound-control stats

Also, after repeated use the query time should drop to 0ms or 1ms, instead of about 40ms without the cache:
dig chromeos6-row2-rack23-host8.cros | grep time
;; Query time: 0 msec
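
If you want to script the dig check rather than eyeball it, something like the following Python sketch might do. The helper names (`query_time_msec`, `looks_cached`, `dig`) are made up for illustration, and the 1 msec threshold is just the "0ms or 1ms" figure from above; only the `;; Query time: N msec` line format comes from dig itself.

```python
import re
import subprocess

# Matches the summary line dig prints, e.g. ";; Query time: 0 msec".
QUERY_TIME_RE = re.compile(r"^;; Query time: (\d+) msec", re.MULTILINE)


def query_time_msec(dig_output):
    """Return the query time reported by dig in milliseconds, or None if absent."""
    match = QUERY_TIME_RE.search(dig_output)
    return int(match.group(1)) if match else None


def looks_cached(dig_output, threshold_msec=1):
    """Heuristic: a lookup answered within threshold_msec probably hit the cache."""
    elapsed = query_time_msec(dig_output)
    return elapsed is not None and elapsed <= threshold_msec


def dig(hostname):
    """Run dig for hostname and return its stdout (requires dig to be installed)."""
    result = subprocess.run(
        ["dig", hostname], capture_output=True, text=True, check=True
    )
    return result.stdout
```

For example, calling `looks_cached(dig("chromeos6-row2-rack23-host8.cros"))` twice in a row should return True on the second call if the cache is doing its job.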

 
Labels: -Type-Bug Type-Feature
Owner: ayatane@chromium.org
Status: (was: Untriaged)
Labels: -M-61
Status: Assigned
The last time I added a DNS cache service I broke a lot of things spectacularly.  Shards are Goobuntu machines, so our Puppet contends with Goobuntu's Puppet, and Goobuntu's Puppet deploys its own DNS cache/resolver service (dnsmasq I believe).

What I'm saying is, this isn't quite as straightforward as it may seem.

Comment 3 by ihf@chromium.org, Jun 30 2017

Fair enough - do you recall the old issue?

We should keep monitoring chromeos-server98 then, as it has been using the cache for the last week:
unbound-control stats
thread0.num.queries=86602626
thread0.num.cachehits=85057603
thread0.num.cachemiss=1545023
time.up=838964.203310
time.elapsed=657699.519068
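
For what it's worth, the counters above work out to a cache hit rate of roughly 98%. A minimal Python sketch for pulling that number out of `unbound-control stats` output follows; the function names are hypothetical, and it assumes the `key=value` line format shown above with a single `thread0`.

```python
def parse_stats(text):
    """Parse 'key=value' lines, as printed by unbound-control stats, into a dict."""
    stats = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:  # skip lines that are not key=value pairs
            stats[key.strip()] = float(value)
    return stats


def cache_hit_ratio(stats, thread="thread0"):
    """Return cache hits divided by total queries for one thread."""
    queries = stats[f"{thread}.num.queries"]
    hits = stats[f"{thread}.num.cachehits"]
    return hits / queries if queries else 0.0


# Sample copied from the output above.
SAMPLE = """\
thread0.num.queries=86602626
thread0.num.cachehits=85057603
thread0.num.cachemiss=1545023
time.up=838964.203310
time.elapsed=657699.519068
"""

ratio = cache_hit_ratio(parse_stats(SAMPLE))
print(f"cache hit ratio: {ratio:.1%}")
```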

The issue is simply that our Puppet and Goobuntu Puppet would deploy different configuration on top of each other.  If you have a machine successfully using it, we should be able to replicate the same setup through Puppet.

What DNS cache and configuration are you using?

Comment 5 by ihf@chromium.org, Jul 19 2017

The cache is called "unbound" and the config is above. I just checked and it still works fine on chromeos-server98.mtv.
Cc: ayatane@chromium.org pho...@chromium.org
Labels: -Pri-2 Pri-1
Owner: ----
Status: Available (was: Assigned)
Upgrading to P1, rationale: significant performance impact on class 1 service (CQ).

+deputies for the week. If you have spare cycles, can you pick up adding a puppet-owned conf file as described in OP?


Comment 7 by pho...@chromium.org, Jul 19 2017

Cc: jrbarnette@chromium.org
+jrbarnette (#6 should have included him)

Comment 8 by pho...@chromium.org, Jul 19 2017

Owner: pho...@chromium.org
Status: Assigned (was: Available)
I can pick this up.

Comment 9 by pho...@chromium.org, Aug 18 2017

Status: Fixed (was: Assigned)

Comment 10 by ihf@chromium.org, Aug 19 2017

Cc: kenobi@chromium.org hidehiko@chromium.org marc...@chromium.org snanda@chromium.org
Status: Verified (was: Fixed)
The change landed as
https://chrome-internal-review.googlesource.com/#/c/chromeos/chromeos-admin/+/416090/

Looks like the runtime of client test security_AccountsBaseline on reef-paladin went from 42-52s to 28-32s with this change.

Good stuff.

--

Details:
Last slow run is
https://viceroy.corp.google.com/chromeos/build_details?build_config=reef-paladin&build_number=3337
https://viceroy.corp.google.com/chromeos/suite_details?build_id=1765328

First fast run is
https://viceroy.corp.google.com/chromeos/build_details?build_config=reef-paladin&build_number=3338
Start time	Aug. 17, 2017, 9:05 p.m.
https://viceroy.corp.google.com/chromeos/suite_details?build_id=1765579

More samples
security_ModuleLocking  48s vs. 35s
graphics_Gbm            52s vs. 40s
graphics_dEQP.bvt       54s vs. 38s

In other words, a 20-25% speedup of client tests.