Network overloaded by offloading large files to GS
Issue description

The kilobyte transfer rate of gsoffloader has spiked since 6am on April 12, which suggests that the number of large files to offload has increased since then. We looked at several large result folders, and they all include big crash dump files. We suspect that a bad Chrome version is generating big crash files and overloading the network. The network connectivity issue has been affecting the CQ/Canary/release builds a lot (see crbug.com/713004).

jobs: http://shortn/_30BwQteEGu
kilobytes: http://shortn/_WwvfnYqEGk
bytes offloaded per job: http://shortn/_iApXsTMdYc
Apr 20 2017
There are multiple issues surrounding the problem that excessive crashes in the lab, caused by a product bug, can bring the lab network (and other things) down. Issue 489845 has collected some past outages caused by this, and has some ideas about what needs to be done.
Apr 20 2017
We're going to exclude 'chrome*core' files in collecting_logs for now.
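As a rough illustration of the exclusion described above, the following is a minimal sketch of glob-based filtering, not the actual autotest change. The function names, the `EXCLUDED_PATTERNS` list, and the sample paths are all hypothetical; only the `chrome*core` pattern comes from this thread.

```python
import fnmatch
import os

# Hypothetical pattern list; the thread only names 'chrome*core'.
EXCLUDED_PATTERNS = ['chrome*core']


def is_excluded(path, patterns=EXCLUDED_PATTERNS):
    """Return True if the file's base name matches any excluded glob pattern."""
    base = os.path.basename(path)
    return any(fnmatch.fnmatchcase(base, pattern) for pattern in patterns)


def filter_log_files(paths, patterns=EXCLUDED_PATTERNS):
    """Return only the paths that should still be collected."""
    return [path for path in paths if not is_excluded(path, patterns)]


if __name__ == '__main__':
    # Made-up example paths, for illustration only.
    candidates = [
        '/var/log/messages',
        '/var/spool/crash/chrome.20170412.1234.core',
        '/tmp/chrome_debug.log',
    ]
    for path in filter_log_files(candidates):
        print(path)
```

Matching on the base name only means that a directory named like a core file would not cause its contents to be skipped; whether that is the desired behavior is an open design choice.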
Apr 21 2017
Apr 21 2017
Apr 21 2017
Apr 21 2017
Apr 24 2017
Bytes offloaded per job (http://shortn/_iApXsTMdYc) has dropped back to the normal level. Downgrading this to P1 and cc'ing the current deputy.
Apr 24 2017
(note: it may just have dropped because we aren't running anything, so let's not declare victory yet :) )
Apr 25 2017
Apr 25 2017
+dshi is going to be working on a design doc in this area.
Apr 28 2017
At this point, the GS Offloader dashboard says we're back to roughly the level prior to the outage. As best we can tell, this outage is over.
Aug 1 2017
Jan 22 2018
Comment 1 by nxia@chromium.org, Apr 20 2017