v8.browsing_desktop failing on chromium.perf/Linux Perf
Issue description
Imgur page in v8.browsing_desktop failing on chromium.perf/Linux Perf
Builders failed on:
- Linux Perf: https://build.chromium.org/p/chromium.perf/builders/Linux%20Perf
,
Nov 28 2017
Regardless of the solution to this, can we disable the test that's failing? (I don't know how to do that with these Python tests.) This bot has been unhealthy for several days; we need to make it green.
,
Nov 28 2017
I will be landing a CL to disable the imgur story that is causing the timeout. It looks like the timeout happens while computing the metric. The trace looks as follows:
(INFO) 2017-11-27 14:48:14,644 trace_data.Serialize:191 Trace sizes in bytes: {'traceEvents': 566015235, 'telemetry': 211820, 'tabIds': 38}
+-----------------------------------------------------------------------------+
| End of shard 0 |
| Pending: 3554.1s Duration: 1511.6s Bot: build30-a9 Exit: -15 TIMED_OUT |
+-----------------------------------------------------------------------------+
Total duration: 1511.6s
WARNING:root:collect_cmd had non-zero return code: 241
Traceback (most recent call last):
File "/b/rr/tmpCH3dfE/rw/checkout/scripts/slave/recipe_modules/swarming/resources/standard_isolated_script_merge.py", line 45, in <module>
sys.exit(main())
File "/b/rr/tmpCH3dfE/rw/checkout/scripts/slave/recipe_modules/swarming/resources/standard_isolated_script_merge.py", line 41, in main
return StandardIsolatedScriptMerge(args.output_json, args.jsons_to_merge)
File "/b/rr/tmpCH3dfE/rw/checkout/scripts/slave/recipe_modules/swarming/resources/standard_isolated_script_merge.py", line 24, in StandardIsolatedScriptMerge
shard_results_list.append(json.load(f))
File "/usr/lib/python2.7/json/__init__.py", line 290, in load
**kw)
File "/usr/lib/python2.7/json/__init__.py", line 338, in loads
return _default_decoder.decode(s)
File "/usr/lib/python2.7/json/decoder.py", line 366, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "/usr/lib/python2.7/json/decoder.py", line 384, in raw_decode
raise ValueError("No JSON object could be decoded")
ValueError: No JSON object could be decoded
WARNING:root:merge_cmd had non-zero return code: 1
step returned non-zero exit code: 241
,
Nov 28 2017
Ned, Juan, any idea what could have gone wrong recently? It seems to be happening only on the Linux perf bots; on Windows it is working fine.
,
Nov 28 2017
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/564d279682098c46cdba237d51476e8bbba34a18
commit 564d279682098c46cdba237d51476e8bbba34a18
Author: Mythri Alle <mythria@chromium.org>
Date: Tue Nov 28 17:58:55 2017
Disable imgur page in v8.browsing_desktop on linux platform
Bug: chromium:788796
Change-Id: I208022746d63a6379dcf55cb12c1ec8fea8aabcc
Reviewed-on: https://chromium-review.googlesource.com/794110
Reviewed-by: Juan Antonio Navarro Pérez <perezju@chromium.org>
Commit-Queue: Mythri Alle <mythria@chromium.org>
Cr-Commit-Position: refs/heads/master@{#519740}
[modify] https://crrev.com/564d279682098c46cdba237d51476e8bbba34a18/tools/perf/benchmarks/v8_browsing.py
,
Nov 28 2017
I would rely on bisect to figure out what went wrong on imgur page.
,
Nov 29 2017
+jbudorick: the call stack in #3 is about an error in standard_isolated_script_merge.py. I'm not sure what this has to do with the imgur page?
,
Nov 29 2017
I am not sure, but looking at "ValueError: No JSON object could be decoded" at the end, I guess the page times out when producing the data. The script may be failing because there is no data. I didn't think about this yesterday, but the imgur page has been failing for quite some time on the v8.runtimestats.browsing_desktop benchmark, which I recently renamed to v8.browsing_desktop. So this is not a recent failure. I am sorry, I didn't look carefully yesterday. Also, compared to other pages, the number of trace events is much higher on this one. For example, for the trace in #3 the number of trace events is 566015235. Does this seem reasonable? Is telemetry expected to handle such numbers?
,
Nov 29 2017
Which build on Linux Perf is that stack from #3 from?
,
Nov 29 2017
#8: yeah, it looks like standard_isolated_script_merge.py is being handed an empty but existing file. Sending out a CL to clean up how that's handled here: https://chromium-review.googlesource.com/c/chromium/tools/build/+/797150
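For illustration only, a minimal sketch of what tolerating an empty-but-existing shard file could look like. This is not the code from the CL above; the function name `merge_shard_results` and its return shape are hypothetical, chosen just to show the idea of skipping unusable shard outputs instead of letting `json.load` raise.

```python
import json
import os


def merge_shard_results(json_paths):
    """Load each shard's JSON output, skipping missing, empty, or invalid files.

    Hypothetical helper: returns (parsed_results, skipped_paths) so the caller
    can report which shards produced no usable output.
    """
    shard_results = []
    skipped = []
    for path in json_paths:
        # A timed-out shard may leave behind a file that exists but is empty.
        if not os.path.exists(path) or os.path.getsize(path) == 0:
            skipped.append(path)
            continue
        try:
            with open(path) as f:
                shard_results.append(json.load(f))
        except ValueError:
            # Raised for undecodable JSON (JSONDecodeError subclasses ValueError).
            skipped.append(path)
    return shard_results, skipped
```

With something along these lines, a timed-out shard would show up in the skipped list rather than as a traceback in the merge step.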
,
Nov 29 2017
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/tools/build/+/f011851a07aa38e7bdba78a3d9834d74d5511f09
commit f011851a07aa38e7bdba78a3d9834d74d5511f09
Author: John Budorick <jbudorick@chromium.org>
Date: Wed Nov 29 17:44:56 2017
[api.swarming] Revise how empty shard JSON files are handled by collect_task.
Bug: 788796
Change-Id: Iaeaf276150d1ed15e6e7abf121731451981f71ec
Reviewed-on: https://chromium-review.googlesource.com/797150
Commit-Queue: John Budorick <jbudorick@chromium.org>
Reviewed-by: Stephen Martinis <martiniss@chromium.org>
Reviewed-by: Marc-Antoine Ruel <maruel@chromium.org>
[modify] https://crrev.com/f011851a07aa38e7bdba78a3d9834d74d5511f09/scripts/slave/recipe_modules/swarming/resources/standard_isolated_script_merge.py
[modify] https://crrev.com/f011851a07aa38e7bdba78a3d9834d74d5511f09/scripts/slave/recipe_modules/swarming/unittests/collect_task_test.py
[modify] https://crrev.com/f011851a07aa38e7bdba78a3d9834d74d5511f09/scripts/slave/recipe_modules/swarming/resources/collect_task.py
,
Nov 29 2017
,
Dec 7 2017
It seems like this test suite isn't failing on the bot in question; presumably the CLs above helped. ->mythria to decide what further needs to be done on this bug; it's not clear to me what further action should be taken here. Removing from sheriff queue.
,
Dec 7 2017
,
Dec 7 2017
I have disabled imgur page, so I guess the bot is happy. Though I still think we need to look into why the json file was not produced in the first place. The fix from jbudorick@ (in #12) would handle the case when there are empty JSON files more gracefully. So, I think we still need to keep this bug to track failures on imgur.
,
Dec 13 2017
,
Dec 14 2017
,
Dec 14 2017
Shouldn't we re-enable imgur to check whether the CL helped?
,
Dec 19 2017
I am sorry, I was planning to look into it but didn't get to it. In my understanding, the JSON file should not be empty. The imgur page fails locally too, though not consistently. That is the reason I haven't enabled imgur yet. I thought it could be because of the large number of trace events it produces (10x more than the average), but I haven't verified that yet. I will be on vacation from tomorrow. I will have a look at this once I am back after Christmas.
,
Jan 30 2018
Is there a test page we can use to run the bisect for this?
,
Jan 2
Since we are moving to new pages and this is quite old, it may not be worth investigating now.
Comment 1 by altimin@chromium.org, Nov 27 2017
Status: Assigned (was: Available)