lots of "HttpsMonitor.send received status 400" spam in cbuildbot logs |
||||||
Issue descriptionSee for instance https://luci-logdog.appspot.com/v/?s=chromeos%2Fbb%2Fchromeos%2Fmaster-paladin%2F15283%2F%2B%2Frecipes%2Fsteps%2FCommitQueueCompletion%2F0%2Fstdout [1;33m10:57:53: WARNING: HttpsMonitor.send received status 400: { "error": { "code": 400, "message": "Operation was attempted past the valid range.", "status": "OUT_OF_RANGE" } } I suspect this is related to recent ts_mon changes, but not certain. I doubt it is logdog, since the logs appear to be available in logdog.
,
Jul 6 2017
Hm, if I'm right, this is actually a pre-existing bug that was exacerbated by the recent change to ts-mon. By fixing the ts_mon_config metrics-sending process to properly send metrics every minute (even if there were no messages received in the queue), we now emit pings to /chrome/infra/presence/up every minute from both the parent cbuildbot and the rexec'd cbuildbot processes. The proper solution to this problem would be to use different task_num arguments in the parent and child cbuildbot processes. See https://chromium-review.googlesource.com/560632 .
,
Jul 10 2017
Is this just spam? I'm seeing this error in https://uberchromegw.corp.google.com/i/chromeos/builders/master-chromium-pfq/builds/4596 but in this case, the master pfq builder's MasterSlaveSyncCompletion step seems to be caught in a loop with this error: https://uberchromegw.corp.google.com/i/chromeos/builders/master-chromium-pfq/builds/4596/steps/MasterSlaveSyncCompletion/logs/stdio
,
Jul 10 2017
Ok, nevermind. It did finally finish right after I posted this, and I've been reassured that it is just logspam. Carry on :-)
,
Jul 11 2017
,
Jul 11 2017
logspam is still a problem, though. Can we rollback the ts-mon change?
,
Jul 11 2017
No, we can't rollback at this point.
,
Jul 17 2017
Issue 744593 has been merged into this issue.
,
Jul 17 2017
Is there an easy way to fix this?
,
Jul 17 2017
Re #9, see #2 - that approach should fix it, but it's not working for some reason. http://shortn/_jf9Fv4NF7i shows that cbuildbot is not using a different task_num than 0, even though https://chromium-review.googlesource.com/560632 has landed. I'll look into it this week.
,
Jul 17 2017
+dgarrett Oh, I think this is due to a bug Don mentioned - the cbuildbot version wasn't incremented in constants.py and wasn't supplied when adding the argument, so it's possible that the argument isn't even being sent to the cbuildbot subprocess.
,
Jul 17 2017
Cool, thanks for the update.
,
Jul 17 2017
hm, now I'm confused - from https://cs.corp.google.com/chromeos_public/chromite/cbuildbot/stages/sync_stages.py?type=cs&q=api_version+file:%5Echromite/+package:%5Echromeos_public$&l=251 it seems like it's only filtering out arguments from options.parsed_args (line 251); ts-mon-task-num gets added separately on line 251. Indeed, it seems like cbuildbot is invoked with the flag (e.g. find "ts-mon-task-num" in https://uberchromegw.corp.google.com/i/chromeos/builders/master-paladin/builds/15424/steps/steps/logs/stdio gives several results). Maybe the code that uses the argument isn't being called, or doesn't have the desired effect?
,
Jul 20 2017
I figured out the problem - the task_num argument isn't being passed down to the subprocess when SetupTsMonGlobalState is called with indirect=True, as it is in cbuildbot.
,
Jul 20 2017
Good catch!
,
Jul 20 2017
OK great!
,
Jul 21 2017
Can we get this fixed soon? This error is usually about 50% of log lines in files like https://uberchromegw.corp.google.com/i/chromeos/builders/edgar-paladin/builds/181/steps/HWTest%20%5Bbvt-cq%5D/logs/stdio.
,
Jul 21 2017
There's a CL in flight to fix it.
,
Jul 21 2017
Just saw the update on the CL, thanks =)
,
Jul 22 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromiumos/chromite/+/8dca32a7c95b69e103a97d5f8e111bda666b9106 commit 8dca32a7c95b69e103a97d5f8e111bda666b9106 Author: Paul Hobbs <phobbs@google.com> Date: Sat Jul 22 07:09:11 2017 ts_mon_config: Pass through task_num to subprocess Any new argument to SetupTsMonGlobalState should be passed through to _CreateTsMonFlushingProcess. When I added the task_num argument, I didn't also pass it through to _CreateTsMonFlushingProcess. BUG= chromium:739526 TEST=ts_mon_config_unittest Change-Id: I9c96b8326a799e452790002d61085645ddab7ec5 Reviewed-on: https://chromium-review.googlesource.com/580507 Commit-Ready: Paul Hobbs <phobbs@google.com> Tested-by: Paul Hobbs <phobbs@google.com> Reviewed-by: Simon Glass <sjg@chromium.org> [modify] https://crrev.com/8dca32a7c95b69e103a97d5f8e111bda666b9106/lib/ts_mon_config.py [modify] https://crrev.com/8dca32a7c95b69e103a97d5f8e111bda666b9106/lib/ts_mon_config_unittest.py
,
Jul 22 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromiumos/chromite/+/e6efe5108eec569a701555324e97286fdf23713f commit e6efe5108eec569a701555324e97286fdf23713f Author: Paul Hobbs <phobbs@google.com> Date: Sat Jul 22 07:09:11 2017 ts_mon_config: Pass options instead of kwargs Passing an argparse options object is safer than passing kwargs, because we can reduce the number of places that the options are used. If this change had been made, we would have avoided a bug whereby a new kwarg was not passed through to _CreateTsMonFlushingProcess, making the flag broken for cbuildbot (which uses indirect metrics). BUG= chromium:739526 TEST=unit tests Change-Id: I672b94b59789483de0692d3120e7e186405b9db0 Reviewed-on: https://chromium-review.googlesource.com/580577 Commit-Ready: Paul Hobbs <phobbs@google.com> Tested-by: Paul Hobbs <phobbs@google.com> Reviewed-by: Simon Glass <sjg@chromium.org> [modify] https://crrev.com/e6efe5108eec569a701555324e97286fdf23713f/lib/ts_mon_config.py [modify] https://crrev.com/e6efe5108eec569a701555324e97286fdf23713f/lib/ts_mon_config_unittest.py
,
Jul 23 2017
https://uberchromegw.corp.google.com/i/chromeos/builders/edgar-paladin/builds/190/steps/HWTest%20%5Bbvt-cq%5D/logs/stdio reads so much easier now. Thanks!
,
Jul 31 2017
|
||||||
►
Sign in to add a comment |
||||||
Comment 1 by x...@chromium.org
, Jul 5 2017