Add a metric about devserver artifact "caching efficiency"
Issue description
Easy way: After a build is staged, increment a counter every time you reply True to an is_staged request (and possibly also increment a separate download counter for every download request, per artifact). When the build expires and is un-staged, send these counts to monarch / statsd.
Harder way, long term: Emit a data point to a database for every build that is staged and every build that is downloaded, from every devserver. This would let us make more sophisticated queries and figure out which builds or artifact types are not being cached efficiently.
Starting with dgarrett@, who may have a way to quickly implement the easy way.
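A minimal sketch of the "easy way" counters, for discussion only. The class name `StagingStats`, its method names, and the `emit` callback are all hypothetical; none of them exist in devserver, and the callback stands in for whatever monarch / statsd client ends up being used.

```python
import collections
import threading


class StagingStats(object):
    """Hypothetical per-build counters; not an existing devserver class."""

    def __init__(self, emit):
        # emit(build, hits, downloads) is an assumed callback that would
        # forward the numbers to monarch / statsd.
        self._emit = emit
        self._lock = threading.Lock()
        self._hits = collections.Counter()
        self._downloads = collections.Counter()

    def record_is_staged(self, build, staged):
        """Count only the is_staged requests that were answered True."""
        if staged:
            with self._lock:
                self._hits[build] += 1

    def record_download(self, build, artifact):
        """Count one download request for an artifact of a build."""
        with self._lock:
            self._downloads[(build, artifact)] += 1

    def flush_on_unstage(self, build):
        """Report and forget the counters when the build is un-staged."""
        with self._lock:
            hits = self._hits.pop(build, 0)
            downloads = {}
            for key in [k for k in self._downloads if k[0] == build]:
                downloads[key[1]] = self._downloads.pop(key)
        self._emit(build, hits, downloads)
        return hits, downloads
```

The counters live in memory, so under this sketch a devserver restart would lose the counts for builds that are still staged.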
,
Jun 21 2016
,
Jun 21 2016
,
Jun 21 2016
So... I know where the code that stages artifacts is now. How do I report stats about it in a way that is useful?
,
Jun 21 2016
My preferred approach would be to use ts_mon from the devserver to export to monarch. I think jrbarnette is working on an example CL in autotest that will do that from the scheduler. The devserver doesn't have autotest, but I think it has chromite available to import. Look at how cidb.py uses ts_mon for an example. We can discuss the schema in person.
,
Jun 22 2016
The devserver doesn't currently import from chromite. It seems to be pretty standalone.
,
Jun 22 2016
I think there must be something on the devserver that is importing from chromite, because we do have devserver stats in statsd, which I think we get from chromite these days. Maybe it's a separate watchdog process and not devserver.py. dshi@, do you know? Also, you may want to base your metric around this wrapper class: https://chromium-review.googlesource.com/#/c/354691/
,
Jun 22 2016
Ah... devserver itself does not, but some libraries in the same directory tree do. I don't THINK any of those libraries are imported to devserver proper. It would really surprise me if chromite was available for cases like "cros flash".
grep -R chromite
host/lib/tools.py:from chromite.lib import cros_build_lib
host/lib/tools.py:from chromite.lib import git
host/lib/cros_output.py:from chromite.lib import terminal
host/willis: import chromite.lib.git
host/willis: import chromite.lib.git
host/willis: print 'Failed to import chromite library'
host/willis:import chromite.lib.terminal
host/willis:Colorizer = chromite.lib.terminal.Color(os.isatty(sys.stdout.fileno()))
host/willis: if chromite.lib.git.IsSHA1(project['revision']):
host/willis: repo_path = chromite.lib.git.FindRepoDir(os.getcwd())
host/willis: repo_path = chromite.lib.git.FindRepoDir(os.path.dirname(sys.argv[0]))
host/willis: manifest = chromite.lib.git.ManifestCheckout(root)
checkfiles/devserver/gs_check.py:from chromite.lib import cros_build_lib
checkfiles/devserver/gs_check.py: # Additionally, we cannot use chromite's timeout_util as it is
,
Jun 22 2016
Re #7: devserver stats in graphite are reported from autotest. Most of the data is reported based on the check_health call.
,
Jun 22 2016
I had a thought about this last night. The current devserver logs show when something is staged. To quickly identify staging problems in production, we can just grep logs. Then build a better answer when there isn't such a rush.
,
Jun 22 2016
After going through logs at some length, we don't have enough information there to see if we are doing excessive staging. We carefully log whenever we ask to stage something, and we log the basename of every file we download. But there is no good way to tie those logs together to see what builds we are downloading files for. To fix this, we'd need to update this log message appropriately: https://cs.corp.google.com/chromeos_public/src/platform/dev/build_artifact.py?rcl=57d181772c8d3cf99b383ccdb9864f2ffc1cbade&l=327
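As a rough illustration of what tying those logs together could look like: the sketch below logs the build alongside each downloaded file and then counts downloads per build from such lines. The message format, the logger name, and both function names are assumptions made for this example; the actual log message in build_artifact.py is not reproduced here.

```python
import collections
import logging
import re

_LOG = logging.getLogger('staging_sketch')  # hypothetical logger name

# Assumed message format; the real devserver message may differ.
_DOWNLOAD_RE = re.compile(
    r'Downloading artifact (?P<name>\S+) for build (?P<build>\S+)')


def log_download(build, artifact_name):
    """Log one download with enough context to attribute it to a build."""
    _LOG.info('Downloading artifact %s for build %s', artifact_name, build)


def downloads_per_build(log_lines):
    """Count download log lines per build, skipping unrelated lines."""
    counts = collections.Counter()
    for line in log_lines:
        match = _DOWNLOAD_RE.search(line)
        if match:
            counts[match.group('build')] += 1
    return counts
```

With a message like this, repeated downloads for the same build would show up as a count greater than the number of distinct artifacts for that build.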
,
Jul 16 2016
We didn't pull out enough information to be useful as part of this project, but an unrelated metrics effort should address the same problem in a more generalized way.
Comment 1 by autumn@chromium.org, Jun 21 2016