Error: https://pantheon.corp.google.com/errors/8504089329729126393?time=P30D&filter&project=findit-for-me&authuser=1

This is happening with large analyses, since we store the data points as raw data in the analysis. When I measured the analysis responsible for the error, I found that it was approximately half a MB in size.

Analysis: https://findit-for-me.appspot.com/waterfall/flake?key=ag9zfmZpbmRpdC1mb3ItbWVy4QELEhdNYXN0ZXJGbGFrZUFuYWx5c2lzUm9vdCKqAWNocm9taXVtLndpbi9XaW4xMCBUZXN0cyB4NjQvMTc2NzAvcmVuZGVyZXJfc2lkZV9uYXZpZ2F0aW9uX2Jyb3dzZXJfdGVzdHMgb24gV2luZG93cy0xMC0xNDM5My9WR2gxYldKdVlXbHNWR1Z6ZEM1VGFHOTFiR1JEWVhCMGRYSmxUMjVPWVhacFoyRjBhVzVuUVhkaGVVVjRjR3hwWTJsMFYyRnBkQT09DAsSE01hc3RlckZsYWtlQW5hbHlzaXMYAQw

Reference entity: https://pantheon.corp.google.com/datastore/entities/query?project=findit-for-me&authuser=1&ns=&kind=MasterFlakeAnalysis&filter=7%2F__key__%7CKEY%7CEQ%7C327%2Fag9zfmZpbmRpdC1mb3ItbWVy4QELEhdNYXN0ZXJGbGFrZUFuYWx5c2lzUm9vdCKqAWNocm9taXVtLndpbi9XaW4xMCBUZXN0cyB4NjQvMTc2NzAvcmVuZGVyZXJfc2lkZV9uYXZpZ2F0aW9uX2Jyb3dzZXJfdGVzdHMgb24gV2luZG93cy0xMC0xNDM5My9WR2gxYldKdVlXbHNWR1Z6ZEM1VGFHOTFiR1JEWVhCMGRYSmxUMjVPWVhacFoyRjBhVzVuUVhkaGVVVjRjR3hwWTJsMFYyRnBkQT09DAsSE01hc3RlckZsYWtlQW5hbHlzaXMYAQw

There are three large fields in the analysis:

(1) algorithm_parameters. These are just config values. We should store them at the root level and access them via a static method rather than storing them on every analysis.

(2) data_points. These are stored as raw data in the analysis. As we scale, these should be factored out into entities of their own, and the analysis should look them up by key.

(3) swarming_rerun_results. This was the biggest field as far as I could tell from visual inspection. It is only ever set and never read. We should remove it ASAP.
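A rough sketch of what (1) and (2) could look like, to make the size argument concrete. This is not the Findit code: it uses a plain dict as a stand-in for the datastore, and every name in it (`_STORE`, `put`, `Analysis`, `data_point_keys`, `iterations_to_rerun`, the data point fields) is hypothetical. The idea is only that the analysis keeps keys, while data points and shared config live in their own entities.

```python
import itertools
import json

# Hypothetical in-memory stand-in for the datastore (not the ndb API).
_STORE = {}
_IDS = itertools.count(1)
_CONFIG_KEY = ("Config", "flake_algorithm_parameters")


def put(kind, value):
    """Stores value as its own entity and returns its key."""
    key = (kind, next(_IDS))
    _STORE[key] = value
    return key


class Analysis(object):
    """Holds only keys; the data points live in their own entities."""

    def __init__(self):
        self.data_point_keys = []

    @staticmethod
    def algorithm_parameters():
        # Shared config is read from one root-level entity instead of
        # being copied onto every analysis.
        return _STORE[_CONFIG_KEY]

    def add_data_point(self, data_point):
        self.data_point_keys.append(put("DataPoint", data_point))

    def data_points(self):
        # Look the data points up by key when they are needed.
        return [_STORE[key] for key in self.data_point_keys]

    def stored_size(self):
        # Rough size of the analysis entity itself, in characters of JSON.
        return len(json.dumps({"data_point_keys": self.data_point_keys}))


# Assumed config value, for illustration only.
_STORE[_CONFIG_KEY] = {"iterations_to_rerun": 100}

analysis = Analysis()
for build_number in range(500):
    analysis.add_data_point({
        "build_number": build_number,
        "pass_rate": 0.5,
        "task_ids": ["task-%d-%d" % (build_number, i) for i in range(20)],
    })

# Size the analysis would have if the data points were stored inline.
inline_size = len(json.dumps({"data_points": analysis.data_points()}))
print(analysis.stored_size(), inline_size)
```

With the data points stored separately, the analysis entity grows only by one key per data point, so it stays well below the inline size even for large analyses. The trade-off is an extra lookup whenever the data points are read.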
The following revision refers to this bug:
  https://chromium.googlesource.com/infra/infra/+/bead4bb6d98b5069ae0edce201324c80727ccbb5

commit bead4bb6d98b5069ae0edce201324c80727ccbb5
Author: Brandon Wylie <wylieb@chromium.org>
Date: Mon Nov 13 20:48:25 2017

[Findit] Flake Analyzer - Remove swarming_rerun_results from MasterFlakeAnalysis

This field is very large, and we don't use the data. Removing this as a first step to minimizing the MasterFlakeAnalysis model.

Bug: 783848
Change-Id: I5f9f1180302b9f82ec79101fcb7673a23d7ea5d5
Reviewed-on: https://chromium-review.googlesource.com/764592
Reviewed-by: Shuotao Gao <stgao@chromium.org>
Reviewed-by: Jeffrey Li <lijeffrey@chromium.org>
Commit-Queue: Brandon Wylie <wylieb@chromium.org>

[modify] https://crrev.com/bead4bb6d98b5069ae0edce201324c80727ccbb5/appengine/findit/model/flake/master_flake_analysis.py
[modify] https://crrev.com/bead4bb6d98b5069ae0edce201324c80727ccbb5/appengine/findit/waterfall/flake/update_flake_analysis_data_points_pipeline.py
[modify] https://crrev.com/bead4bb6d98b5069ae0edce201324c80727ccbb5/appengine/findit/model/flake/test/master_flake_analysis_test.py
[modify] https://crrev.com/bead4bb6d98b5069ae0edce201324c80727ccbb5/appengine/findit/waterfall/flake/test/update_flake_analysis_data_points_pipeline_test.py
Comment 1 by bugdroid1@chromium.org, Nov 13 2017