Add log data to json test results produced by telemetry
Issue description: Telemetry currently produces JSON test results; IIRC this was implemented by ashleymarie@. Apparently the ability to include logs for each test already exists and is used by other test suites. We'd like to use this ability for perf benchmarks: we're changing how we execute our tests, and our current system for viewing logs (reading the Swarming log) isn't very scalable under the new architecture. It's unclear exactly how this works; https://build.chromium.org/p/chromium.linux/builders/Linux%20Tests%20%28dbg%29%281%29/builds/66591 is an example of it in action. I'll investigate this more.
Sep 15 2017
I looked into how https://test-results.appspot.com/dashboards/flakiness_dashboard.html#testType=webkit_layout_tests&showExpectations=true&tests=http%2Ftests%2Fnavigation%2Fstart-load-during-provisional-loader-detach.html shows results. It turns out to be a rough version of my original plan: the tests upload their data to a Google Storage bucket under a prefix known to the test-results code (https://cs.chromium.org/chromium/infra/go/src/infra/appengine/test-results/frontend/static/dashboards/js/flakiness_dashboard.js?q=flakiness_dashboard.js&sq=package:chromium&l=38), and the flakiness dashboard then fetches the results from there and displays them. It uses Google Storage as a backend rather than LogDog, but that's roughly what we had decided to do. However, there's nothing in the test results JSON file that indicates where the logs are, so I think the first order of business is to add that capability. Does that sound correct, Dirk?
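A minimal sketch of what that capability might look like, assuming a hypothetical "logs" field added to each test leaf of the version-3 results format (the key name, the benchmark/story names, and the gs:// paths below are all made up for illustration, not part of the current format):

  # Illustrative only: a version-3 test results payload with a
  # hypothetical per-test "logs" field pointing at per-story logs.
  example_results = {
      'version': 3,
      'interrupted': False,
      'path_delimiter': '/',
      'seconds_since_epoch': 1505500000,
      'num_failures_by_type': {'PASS': 1, 'FAIL': 1},
      'tests': {
          'system_health.memory_mobile': {
              'load:search:google': {
                  'expected': 'PASS',
                  'actual': 'PASS',
                  # Hypothetical: where a dashboard would fetch this story's log.
                  'logs': ['gs://chromium-test-logs/build-66591/load_search_google.log'],
              },
              'load:social:facebook': {
                  'expected': 'PASS',
                  'actual': 'FAIL',
                  'is_unexpected': True,
                  'logs': ['gs://chromium-test-logs/build-66591/load_social_facebook.log'],
              },
          },
      },
  }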
Sep 18 2017
Reply to comment #1, regarding the chicken-and-egg problem:
1) Where do we generate the results in Telemetry?
2) At that point, what data don't we have?
3) What data do we actually need? Do we have enough at that point?
4) There were discussions around what we actually care to look at in SoM and in the Swarming logs. Do we care about a bunch of debug statements? That is the exceptional debugging case (i.e. really knowledgeable bot health sheriffs like Stephen). If we want to hand this off to Chromium sheriffs, they want the high-level view.
I think only Telemetry, not recipe code, should have to know about Telemetry and what is needed to debug it. Let's try and push it in there if we can.
Sep 18 2017
In reply to comment #2: Stephen, can you clarify, for the benefit of those on this bug who weren't in the Friday meeting, what the original plan you reference was? I agree with you, Stephen, that the next step is to add the capability to indicate where the logs are in the test results format (ashleymarie@ will hopefully guide us on where in Telemetry to add these). One of the other open questions in my mind is whether Google Storage is the right destination for these links. If we are to add links to the test results format, where do those links lead: Google Storage? LogDog? I know we talked about not doing LogDog for one buildbot step right now, but is that the right path in the future? I thought LogDog was backed by Google Storage anyway, so why don't we just use the built-in Python library Robbie was referring to in the meeting instead of doing an intermediate solution?
Sep 18 2017
#3:
1) The code for JSON test results is in https://cs.chromium.org/chromium/src/third_party/catapult/telemetry/telemetry/internal/results/json_3_output_formatter.py?dr=CSs. The method that generates the results is invoked in https://cs.chromium.org/chromium/src/third_party/catapult/telemetry/telemetry/internal/story_runner.py?rcl=49fbcfa16b5d148a595cc50036d5e8354739f13f&l=364 and https://cs.chromium.org/chromium/src/third_party/catapult/telemetry/telemetry/internal/story_runner.py?rcl=49fbcfa16b5d148a595cc50036d5e8354739f13f&l=304 (for the case where the whole benchmark is disabled).
2) At that point, we wouldn't have the log of anything that happens afterwards. Most of that doesn't matter, except for things like:
+ A stack trace if there is a crash in the code that generates results.
+ A stack trace if there is a crash in the code that uploads Telemetry's artifacts to the perf dashboard.
+ Log messages related to Telemetry's best effort to shut down lingering processes (we do this through https://cs.chromium.org/chromium/src/third_party/catapult/common/py_utils/py_utils/atexit_with_log.py?dr&q=atexit+file:%5Esrc/third_party/catapult/+package:%5Echromium$&l=1).
3 & 4) These are great questions. For simplification, I can totally see dividing Telemetry's log messages into two categories:
+ Non-experts: Telemetry's logging data for each test, stack traces, Chrome browser crash stacks, screenshots.
+ Experts: the full Telemetry log, which includes the things in (2) and things that happen outside the story test run loop.
** Background on Telemetry's test life cycle. Telemetry runs tests in the following way:
The command line for running a Telemetry benchmark is invoked.
Telemetry prepares for the benchmark suite run (discovering the platform to run on, picking the browser, ...).
Start run story 1 ... Finish run story 1
Start run story 2 ... Finish run story 2
...
Start run story N ... Finish run story N
Telemetry cleans up after all stories are run: generating the test results, uploading files to cloud storage, killing off lingering processes, ...
+Juan in case I missed anything.
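As a rough illustration of capturing a per-story log fragment inside that loop, here's a sketch using Python's logging module; run_one_story and _StoryLogHandler are hypothetical names for illustration, not the actual story_runner API:

  import logging

  class _StoryLogHandler(logging.Handler):
    """Buffers the log records emitted while a single story runs."""

    def __init__(self):
      logging.Handler.__init__(self)
      self.lines = []

    def emit(self, record):
      self.lines.append(self.format(record))

  def run_story_with_log_capture(story, run_one_story):
    # Attach a fresh handler for the duration of one story run so that
    # only this story's records end up in the fragment; the existing root
    # handlers still receive everything, so the full "experts" log is
    # unaffected.
    handler = _StoryLogHandler()
    handler.setFormatter(
        logging.Formatter('%(asctime)s %(levelname)s %(message)s'))
    root = logging.getLogger()
    root.addHandler(handler)
    try:
      run_one_story(story)
    finally:
      root.removeHandler(handler)
    # This fragment could then be uploaded to cloud storage and its URL
    # recorded next to the story's entry in the results JSON.
    return '\n'.join(handler.lines)

Note that this would still miss anything logged after the results are generated, per (2) above.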
Sep 19 2017
I feel like I'm missing a bit of context on what the current plans are, so some questions/comments follow:
- When adding logs to the JSON results, does that mean we can assign an individual log fragment to each individual story run? It would be amazing if one could quickly go from a failed story link (on buildbot or SoM) to the log of _that_ particular story.
- I think it's fine if we miss some things outside of the main story-run loop, as long as those are kept somewhere in the main "one huge log" for the entire benchmark run.
- What to keep in the per-story logs? At least browser info, plus exception/crash info and a screenshot in case of failure (i.e. what Ned described as the "non-experts" log sgtm).
Sep 22 2017
> When adding logs to the json results, does that mean we can assign an
> individual log fragment to each individual story run?

Yes, I think that's the goal.
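For illustration, assuming the hypothetical per-leaf "logs" field sketched in an earlier comment, a consumer (dashboard or recipe) could map each failed story to its log links by walking the results trie:

  import json

  def failed_story_logs(results_path):
    # Maps each unexpectedly failing story to its per-story log links,
    # assuming the hypothetical 'logs' field on each test leaf.
    with open(results_path) as f:
      results = json.load(f)
    delimiter = results.get('path_delimiter', '/')

    failures = {}

    def walk(node, path):
      if 'actual' in node:  # Leaf: an individual story result.
        if 'FAIL' in node['actual'].split():
          failures[delimiter.join(path)] = node.get('logs', [])
        return
      for name, child in node.items():
        walk(child, path + [name])

    walk(results['tests'], [])
    return failures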
Sep 24
This issue has been Available for over a year. If it's no longer important or seems unlikely to be fixed, please consider closing it out. If it is important, please re-triage the issue. Sorry for the inconvenience if the bug really should have been left as Available. For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot