New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 739526 link

Starred by 6 users

Issue metadata

Status: Verified
Owner:
Last visit > 30 days ago
Closed: Jul 2017
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 2
Type: Bug



Sign in to add a comment

lots of "HttpsMonitor.send received status 400" spam in cbuildbot logs

Project Member Reported by akes...@chromium.org, Jul 5 2017

Issue description

See for instance https://luci-logdog.appspot.com/v/?s=chromeos%2Fbb%2Fchromeos%2Fmaster-paladin%2F15283%2F%2B%2Frecipes%2Fsteps%2FCommitQueueCompletion%2F0%2Fstdout

[1;33m10:57:53: WARNING: HttpsMonitor.send received status 400: {
  "error": {
    "code": 400,
    "message": "Operation was attempted past the valid range.",
    "status": "OUT_OF_RANGE"
  }
}

I suspect this is related to recent ts_mon changes, but not certain.

I doubt it is logdog, since the logs appear to be available in logdog.
 
Status: Started (was: Untriaged)
Hm, if I'm right, this is actually a pre-existing bug that was exacerbated by the recent change to ts-mon. By fixing the ts_mon_config metrics-sending process to properly send metrics every minute (even if there were no messages received in the queue), we now emit pings to /chrome/infra/presence/up every minute from both the parent cbuildbot and the rexec'd cbuildbot processes.

The proper solution to this problem would be to use different task_num arguments in the parent and child cbuildbot processes. See https://chromium-review.googlesource.com/560632 .

Comment 3 by glevin@chromium.org, Jul 10 2017

Cc: derat@chromium.org steve...@chromium.org glevin@chromium.org
Is this just spam?  I'm seeing this error in

https://uberchromegw.corp.google.com/i/chromeos/builders/master-chromium-pfq/builds/4596

but in this case, the master pfq builder's MasterSlaveSyncCompletion step seems to be caught in a loop with this error:

https://uberchromegw.corp.google.com/i/chromeos/builders/master-chromium-pfq/builds/4596/steps/MasterSlaveSyncCompletion/logs/stdio

Comment 4 by glevin@chromium.org, Jul 10 2017

Ok, nevermind.  It did finally finish right after I posted this, and I've been reassured that it is just logspam.  Carry on :-)

Comment 5 by sjg@google.com, Jul 11 2017

Cc: sjg@chromium.org jclinton@chromium.org
logspam is still a problem, though. Can we rollback the ts-mon change?

No, we can't rollback at this point.

Comment 8 by pho...@chromium.org, Jul 17 2017

 Issue 744593  has been merged into this issue.
Is there an easy way to fix this?
Re #9, see #2 - that approach should fix it, but it's not working for some reason.

http://shortn/_jf9Fv4NF7i shows that cbuildbot is not using a different task_num than 0, even though https://chromium-review.googlesource.com/560632 has landed.

I'll look into it this week.
Cc: dgarr...@chromium.org
+dgarrett 

Oh, I think this is due to a bug Don mentioned - the cbuildbot version wasn't incremented in constants.py and wasn't supplied when adding the argument, so it's possible that the argument isn't even being sent to the cbuildbot subprocess.
Cool, thanks for the update.
hm, now I'm confused - from https://cs.corp.google.com/chromeos_public/chromite/cbuildbot/stages/sync_stages.py?type=cs&q=api_version+file:%5Echromite/+package:%5Echromeos_public$&l=251 it seems like it's only filtering out arguments from options.parsed_args (line 251); ts-mon-task-num gets added separately on line 251. Indeed, it seems like cbuildbot is invoked with the flag (e.g. find "ts-mon-task-num" in https://uberchromegw.corp.google.com/i/chromeos/builders/master-paladin/builds/15424/steps/steps/logs/stdio gives several results).

Maybe the code that uses the argument isn't being called, or doesn't have the desired effect?
I figured out the problem - the task_num argument isn't being passed down to the subprocess when SetupTsMonGlobalState is called with indirect=True, as it is in cbuildbot.
Good catch!

Comment 16 by sjg@google.com, Jul 20 2017

OK great!
Can we get this fixed soon? This error is usually about 50% of log lines in files like https://uberchromegw.corp.google.com/i/chromeos/builders/edgar-paladin/builds/181/steps/HWTest%20%5Bbvt-cq%5D/logs/stdio.
There's a CL in flight to fix it.
Just saw the update on the CL, thanks =)
Project Member

Comment 20 by bugdroid1@chromium.org, Jul 22 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/chromite/+/8dca32a7c95b69e103a97d5f8e111bda666b9106

commit 8dca32a7c95b69e103a97d5f8e111bda666b9106
Author: Paul Hobbs <phobbs@google.com>
Date: Sat Jul 22 07:09:11 2017

ts_mon_config: Pass through task_num to subprocess

Any new argument to SetupTsMonGlobalState should be passed through to
_CreateTsMonFlushingProcess. When I added the task_num argument, I didn't also
pass it through to _CreateTsMonFlushingProcess.

BUG= chromium:739526 
TEST=ts_mon_config_unittest

Change-Id: I9c96b8326a799e452790002d61085645ddab7ec5
Reviewed-on: https://chromium-review.googlesource.com/580507
Commit-Ready: Paul Hobbs <phobbs@google.com>
Tested-by: Paul Hobbs <phobbs@google.com>
Reviewed-by: Simon Glass <sjg@chromium.org>

[modify] https://crrev.com/8dca32a7c95b69e103a97d5f8e111bda666b9106/lib/ts_mon_config.py
[modify] https://crrev.com/8dca32a7c95b69e103a97d5f8e111bda666b9106/lib/ts_mon_config_unittest.py

Project Member

Comment 21 by bugdroid1@chromium.org, Jul 22 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/chromite/+/e6efe5108eec569a701555324e97286fdf23713f

commit e6efe5108eec569a701555324e97286fdf23713f
Author: Paul Hobbs <phobbs@google.com>
Date: Sat Jul 22 07:09:11 2017

ts_mon_config: Pass options instead of kwargs

Passing an argparse options object is safer than passing kwargs, because we can
reduce the number of places that the options are used. If this change had been
made, we would have avoided a bug whereby a new kwarg was not passed through to
_CreateTsMonFlushingProcess, making the flag broken for cbuildbot (which uses
indirect metrics).

BUG= chromium:739526 
TEST=unit tests

Change-Id: I672b94b59789483de0692d3120e7e186405b9db0
Reviewed-on: https://chromium-review.googlesource.com/580577
Commit-Ready: Paul Hobbs <phobbs@google.com>
Tested-by: Paul Hobbs <phobbs@google.com>
Reviewed-by: Simon Glass <sjg@chromium.org>

[modify] https://crrev.com/e6efe5108eec569a701555324e97286fdf23713f/lib/ts_mon_config.py
[modify] https://crrev.com/e6efe5108eec569a701555324e97286fdf23713f/lib/ts_mon_config_unittest.py

Status: Verified (was: Started)

Sign in to add a comment