Issue 772215 link

Starred by 2 users

Issue metadata

Status: Fixed
Owner:
Closed: Jan 2018
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 1
Type: Feature

Blocked on:
issue 792389
issue 793433

Blocking:
issue 772208



Updating telemetry to utilize artifacts

Project Member Reported by eyaich@chromium.org, Oct 6 2017

Issue description

We are moving towards a model in which Telemetry outputs debugging information locally and adds corresponding entries to the JSON test results format, to be passed through to Sheriff-o-Matic.

This requires two parts: 

1) Cleaning up parts of Telemetry to output the debugging information (crash stacks, screenshots, etc.) locally instead of uploading it to cloud storage.

2) Adding the entries to the test results format when it is generated in Telemetry.



 
We should note that #2 above is relatively straightforward, but #1 might take some design work and refactoring in telemetry.  

Currently we upload most of this data to cloud storage from telemetry and we will need to re-route that.  

It might be a good time to do some work on the crash stacks problem (https://docs.google.com/document/d/1CAHzQlCueCZUNNDiCK9HwjNWBm6wjut5012K3PTVoAQ/edit#heading=h.l42hld6pjqvn) but that might be out of scope of this project.
Status: Assigned (was: Untriaged)
Cc: perezju@chromium.org kbr@chromium.org
Some quick thoughts: this project can be divided into 4 parts:

(1) Design an object that tracks all artifacts in Telemetry. Some questions are:
i) Who owns that object? Is it global? What is its life cycle?
ii) How do we make sure that it can track artifacts on a per story_run basis (for Telemetry benchmark tests)?
iii) How do we make sure that it can track artifacts on a per test basis (for Telemetry browser tests)?

(2) Modify Telemetry so that:
   i) All logging to stdout/stderr goes through the Python logging module. Any place that does print "foobar" or sys.stdout.write("foobar") bypasses the logging module, so we can't capture that output to a file.
   ii) There is a log handler that does both: writes the log to a file and writes the log to stdout/stderr. The first is for SOM serving, the second is for the global log view for expert debugging. Documentation on log handlers can be found at https://docs.python.org/2/library/logging.html#handler-objects

(3) Add the artifact entries to the test results in Telemetry (https://cs.chromium.org/chromium/src/third_party/catapult/telemetry/telemetry/internal/results/json_3_output_formatter.py) so that they contain the artifacts per story run (note that a story run is different from a story, as a single story can map to multiple story runs, https://cs.chromium.org/chromium/src/third_party/catapult/telemetry/telemetry/internal/results/story_run.py?q=story_run.py&dr).

(4) Modify the rest of Telemetry to use artifacts for storing files. Right now Telemetry uploads to cloud storage from all over the place.
Examples:
Uploading trace: 
https://cs.chromium.org/chromium/src/third_party/catapult/telemetry/telemetry/internal/results/page_test_results.py?rcl=6697a19406e0ef46e929f0c4b092cc00496825e3&l=492

Uploading profile files:
https://cs.chromium.org/chromium/src/third_party/catapult/telemetry/telemetry/internal/results/page_test_results.py?rcl=6697a19406e0ef46e929f0c4b092cc00496825e3&l=496

Uploading minidump:
https://cs.chromium.org/chromium/src/third_party/catapult/telemetry/telemetry/internal/backends/chrome/desktop_browser_backend.py?rcl=6697a19406e0ef46e929f0c4b092cc00496825e3&l=539

Uploading browser log:
https://cs.chromium.org/chromium/src/third_party/catapult/telemetry/telemetry/internal/browser/browser.py?rcl=6697a19406e0ef46e929f0c4b092cc00496825e3&l=274
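To make (1) and (3) a bit more concrete, here is a minimal sketch of a tracker keyed per story run, plus a helper that attaches its contents to a per-test results entry. Everything here is hypothetical: ArtifactTracker, AddArtifact, GetArtifacts and BuildTestEntry are illustrative names, not existing Telemetry APIs, and the 'artifacts' layout (artifact name mapped to a list of paths) is an assumption about what the results entry could look like.

```python
import collections


class ArtifactTracker(object):
  """Hypothetical tracker: maps each story run to its named artifact paths."""

  def __init__(self):
    # (story name, run index) -> artifact name -> list of local file paths.
    self._artifacts = collections.defaultdict(
        lambda: collections.defaultdict(list))

  def AddArtifact(self, story_name, run_index, name, path):
    # A single story can run several times, so key on (story, run index).
    self._artifacts[(story_name, run_index)][name].append(path)

  def GetArtifacts(self, story_name, run_index):
    # .get() avoids creating an empty entry for runs with no artifacts.
    per_run = self._artifacts.get((story_name, run_index), {})
    return {name: list(paths) for name, paths in per_run.items()}


def BuildTestEntry(tracker, story_name, run_index, actual, expected='PASS'):
  """Builds one per-test dict; 'artifacts' is only added when non-empty."""
  entry = {'actual': actual, 'expected': expected}
  artifacts = tracker.GetArtifacts(story_name, run_index)
  if artifacts:
    entry['artifacts'] = artifacts
  return entry
```

Whether such an object should be global or owned by the results object is exactly question (1)-(i); the sketch does not take a position on that, it only shows the per story run keying.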

------------------------------------

Now that's a lot of work to do (1), (2), (3) & (4) all together, so in Q4 we should scope this to unblocking One BuildBot step only. The total amount of work we need for Q4 would be:

(1)-(i), (1)-(ii) 
(2) 
(3)

Supporting Telemetry browser test (item (1)-(iii)) & (4) can be done in subsequent quarters.


+Juan: another Telemetry owner

+kbr@: client of Telemetry browser test.
Cc: -nednguyen@chromium.org nedngu...@google.com
Components: Speed>Telemetry Tests>Telemetry
I'm not sure about (1). I feel like it could maybe be attached to the results object, which has a lifecycle of the test run. Not sure exactly what it's supposed to be used for.

For (2), I did a bit of digging in Telemetry, although I'm not the most familiar with it. I did a basic grep and didn't see many bits of code (outside of test code) that use sys.stdout or sys.stderr.

For ii), I think something like what's described in https://docs.python.org/3/howto/logging-cookbook.html#multiple-handlers-and-formatters should work. 
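A minimal sketch of the multiple-handlers idea, using only the standard logging module. LogToFileAndStream is a hypothetical helper name (not something that exists in Telemetry or py_utils), and the format string is an arbitrary choice for illustration.

```python
import contextlib
import logging
import sys


@contextlib.contextmanager
def LogToFileAndStream(log_path, logger=None, stream=None):
  """While active, sends log records to both log_path and a console stream."""
  if logger is None:
    logger = logging.getLogger()  # Root logger by default.
  formatter = logging.Formatter('%(levelname)s %(name)s: %(message)s')
  handlers = [
      logging.FileHandler(log_path),                 # Per-run log file.
      logging.StreamHandler(stream or sys.stderr),   # Global log view.
  ]
  for handler in handlers:
    handler.setFormatter(formatter)
    logger.addHandler(handler)
  try:
    yield
  finally:
    # Detach both handlers so the next run starts with a clean logger.
    for handler in handlers:
      logger.removeHandler(handler)
      handler.close()
```

Wrapping each story run in something like "with LogToFileAndStream(path):" would give one log file per run. If the logger already has a console handler configured, only the file handler would need to be added, otherwise console output gets duplicated.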

I made https://chromium-review.googlesource.com/c/catapult/+/706424 as a very WIP idea of what we could do to implement all of this. I set up the logger before creating the results object, and then pass it into the CreateResults call. There might be some cleanup to do afterwards, I'm not sure.

This all sounds pretty doable to me, in general.
Ah, I don't think my idea of using the results object would really work for tracking artifacts per story run or per test. I'm not sure what the best way to do that is.
(1) +1 on attaching to results object, sounds like the most relevant place.

On (4), tbh, I think the best approach would be to have some module which we can call to upload the artifact and get back a cloud storage URL, which then gets stored in the results object.

But I still think it's probably best for Telemetry to be the one calling this module directly rather than relying on an external "results processor" to swap local files with cloud URLs. This way, even if something terrible happens (some bad error killing the whole of Telemetry so that we are unable to produce the results.json), we should still be able to inspect logs and recover the uploaded artifacts while debugging.
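Roughly what that could look like, as a sketch only: UploadAndRecord is a hypothetical helper, upload_fn stands in for whatever upload module we end up with (it is assumed to take a local path and return a URL string), and a plain dict stands in for the results object.

```python
import logging


def UploadAndRecord(results, name, local_path, upload_fn):
  """Uploads local_path via upload_fn and records the returned URL.

  `results` is a plain dict standing in for the results object, and
  `upload_fn` is a hypothetical callable (local path -> URL string).
  """
  url = upload_fn(local_path)
  # Log right away: if the run dies before results.json is written, the
  # URL can still be recovered from the logs.
  logging.info('Uploaded artifact %s: %s -> %s', name, local_path, url)
  results.setdefault('artifacts', {}).setdefault(name, []).append(url)
  return url
```

The point of logging before touching the results is the failure case described above: the upload location is on record even if nothing else gets written.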

As you mention this can be scoped to next quarter.
Project Member

Comment 10 by bugdroid1@chromium.org, Dec 5 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/catapult/+/a9cf2fa54a3334ff6834e0598c782156e0b3d364

commit a9cf2fa54a3334ff6834e0598c782156e0b3d364
Author: Stephen Martinis <martiniss@chromium.org>
Date: Tue Dec 05 22:50:09 2017

Emit artifacts from story runner

Currently only emits test log (from python logger) and screenshots.

Bug: chromium:772215 
Change-Id: I0ac06536fcfd8a1d2072838f53b79c27380ae921
Reviewed-on: https://chromium-review.googlesource.com/772831
Commit-Queue: Stephen Martinis <martiniss@chromium.org>
Reviewed-by: Juan Antonio Navarro Pérez <perezju@chromium.org>
Reviewed-by: Ned Nguyen <nednguyen@google.com>

[add] https://crrev.com/a9cf2fa54a3334ff6834e0598c782156e0b3d364/common/py_utils/py_utils/logging_util.py
[modify] https://crrev.com/a9cf2fa54a3334ff6834e0598c782156e0b3d364/telemetry/telemetry/page/page_run_end_to_end_unittest.py
[modify] https://crrev.com/a9cf2fa54a3334ff6834e0598c782156e0b3d364/telemetry/telemetry/internal/results/page_test_results.py
[add] https://crrev.com/a9cf2fa54a3334ff6834e0598c782156e0b3d364/common/py_utils/py_utils/logging_util_unittest.py
[modify] https://crrev.com/a9cf2fa54a3334ff6834e0598c782156e0b3d364/telemetry/telemetry/internal/results/artifact_results_unittest.py
[modify] https://crrev.com/a9cf2fa54a3334ff6834e0598c782156e0b3d364/telemetry/telemetry/internal/story_runner.py
[modify] https://crrev.com/a9cf2fa54a3334ff6834e0598c782156e0b3d364/common/py_utils/py_utils/py_utils_unittest.py
[modify] https://crrev.com/a9cf2fa54a3334ff6834e0598c782156e0b3d364/telemetry/telemetry/page/shared_page_state.py
[modify] https://crrev.com/a9cf2fa54a3334ff6834e0598c782156e0b3d364/telemetry/telemetry/internal/story_runner_unittest.py
[modify] https://crrev.com/a9cf2fa54a3334ff6834e0598c782156e0b3d364/telemetry/telemetry/internal/results/results_options.py
[modify] https://crrev.com/a9cf2fa54a3334ff6834e0598c782156e0b3d364/telemetry/telemetry/internal/results/artifact_results.py

Blockedon: 792389
Blockedon: 793433
Status: Fixed (was: Assigned)
I believe this work is done. Thanks Stephen!
Components: Test>Telemetry
Components: -Speed>Telemetry