Several telemetry tests crash on Android
Issue description
,
Apr 1 2016
Common cause seems to be a null pointer in Omaha code:
E/AndroidRuntime(21168): Process: org.chromium.chrome, PID: 21168
E/AndroidRuntime(21168): java.lang.NullPointerException: Attempt to invoke virtual method 'java.lang.String org.chromium.chrome.browser.omaha.RequestGenerator.generateXML(java.lang.String, java.lang.String, long, org.chromium.chrome.browser.omaha.RequestData)' on a null object reference
E/AndroidRuntime(21168): at org.chromium.chrome.browser.omaha.OmahaClient.generateAndPostRequest(OmahaClient.java:326)
E/AndroidRuntime(21168): at org.chromium.chrome.browser.omaha.OmahaClient.handlePostRequestIntent(OmahaClient.java:299)
E/AndroidRuntime(21168): at org.chromium.chrome.browser.omaha.OmahaClient.onHandleIntent(OmahaClient.java:210)
E/AndroidRuntime(21168): at android.app.IntentService$ServiceHandler.handleMessage(IntentService.java:65)
E/AndroidRuntime(21168): at android.os.Handler.dispatchMessage(Handler.java:102)
E/AndroidRuntime(21168): at android.os.Looper.loop(Looper.java:145)
E/AndroidRuntime(21168): at android.os.HandlerThread.run(HandlerThread.java:61)
,
Apr 1 2016
Doing a cq dry run with a revert of the only Android-related patch in the range: https://codereview.chromium.org/1849193002#
,
Apr 1 2016
Last good and first bad revisions from all the Android bots:
last_good = [384290, 384238, 384244, 384313, 384245, 384279, 384200, 384252, 384262, 384251, 384244, 384294, 384313, 384196, 384287, 384197, 384235, 384287, 384275, 384291, 384275]
first_bad = [384311, 384275, 384313, 384317, 384317, 384313, 384327, 384311, 384292, 384333, 384317, 384315, 384322, 384316, 384317, 384338, 384281, 384315, 384313, 384317, 384323]
>>> max(last_good)
384313
>>> min(first_bad)
384275
In other words, the last known good build was using a later revision than the first bad build (?!).
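The consistency check above can be written as a small helper. This is only a sketch: the function name `regression_range` is made up for illustration, and it assumes that higher revision numbers are always later commits. It returns a range only when every bot's last good revision precedes every bot's first bad revision.

```python
def regression_range(last_good, first_bad):
    """Return (lo, hi) if a single culprit range fits all bots, else None.

    Hypothetical helper: a common range exists only if the latest
    known-good revision is strictly older than the earliest known-bad one.
    """
    lo = max(last_good)
    hi = min(first_bad)
    if lo >= hi:
        # The per-bot ranges contradict each other, so the inputs
        # (or the assumption of one culprit revision) are suspect.
        return None
    return lo, hi


# Two of the per-bot values from the lists above are enough to
# reproduce the contradiction.
print(regression_range([384290, 384313], [384311, 384275]))
```

A `None` result does not by itself say whether the breakage is unrelated to a revision or the input data is wrong, so it is worth re-checking how the per-bot values were collected before drawing conclusions.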
,
Apr 1 2016
I used a script to crawl through the build logs and got this result: last_good=384533 first_bad=384340 Again, the last good build is later than the first bad build :P
,
Apr 1 2016
I think bug 584114 has somehow started to manifest itself on all the perf bots, and there's no common revision range for the failures. Nothing jumped out from the internal build repo changes either. I was able to repro locally with an official build of chrome_public, so I'll dig into that. Any ideas what might be going on?
,
Apr 1 2016
The Omaha ping is being sent because https://code.google.com/p/chromium/codesearch#chromium/src/chrome/android/java/src/org/chromium/chrome/browser/omaha/OmahaClient.java&l=220 does not early-out in an official build. I tried making a build from earlier this week and it crashes in the same way. Therefore I think something in the build configuration has changed rather than something in the code.
,
Apr 1 2016
John, Dirk, any idea of what could have changed in the build repo? This is causing all of the Android perf bots to crash. https://build.chromium.org/p/chromium.perf/console
,
Apr 1 2016
I don't know of any changes to infra-land stuff that would have caused this, nor do I see anything blatantly suspicious in the git log. Still looking, though. Bumping priority, as all Android bots on chromium.perf are red.
,
Apr 1 2016
I took a look at the oldest build I was able to find[1] on the Android perf builder and it's using an identical gn config to the latest one (i.e., official build and Chrome branding):
/b/build/slave/Android_Builder/build/src/buildtools/linux64/gn gen //out/Release '--args=is_chrome_branded=true is_official_build=true is_debug=false use_goma=true goma_dir="/b/build/goma" symbol_level=1 target_os="android" ffmpeg_branding="Chrome" proprietary_codecs=true' --check
[1] https://build.chromium.org/p/chromium.perf/builders/Android%20Builder/builds/74617/steps/steps/logs/stdio/text
,
Apr 1 2016
(A gn config change would also be a source-side change to //tools/mb/mb_config.pyl)
,
Apr 1 2016
I'm not aware of anything, no ...
,
Apr 1 2016
Good find :) I must have messed up my revision range math as I completely failed to spot that. I'll go back and see what tripped me up, but meanwhile we should probably revert that patch.
,
Apr 1 2016
,
Apr 1 2016
Analysis post-mortem: in the first pass I failed to spot that build bot blame logs go from oldest to newest instead of the other way around. In the second pass my fancy log crawler script fingered the wrong builds because of silent network errors.
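Both failure modes can be guarded against in a crawler. The following is a minimal sketch, not the script actually used here: `find_transition` and the `(build_number, revision, passed)` tuple layout are assumptions for illustration. It sorts builds explicitly instead of trusting the order of the input, and it raises when a build result could not be fetched instead of silently skipping it.

```python
def find_transition(builds):
    """Return (last_good_revision, first_bad_revision) for one bot.

    `builds` is an iterable of hypothetical (build_number, revision, passed)
    tuples in any order. `passed` is True or False, or None when the build
    log could not be fetched (for example after a network error).
    Returns None if the newest build passed or no passing build is found.
    """
    first_bad = None
    # Walk from the newest build backwards, whatever order the input had.
    for number, revision, passed in sorted(builds, key=lambda b: b[0], reverse=True):
        if passed is None:
            # Fail loudly: a missing result could hide the real transition.
            raise ValueError(f"no result for build {number}")
        if passed:
            return (revision, first_bad) if first_bad is not None else None
        first_bad = revision
    return None


# Input deliberately out of order; the revisions are illustrative.
builds = [(3, 384317, False), (1, 384200, True), (2, 384313, True), (4, 384320, False)]
print(find_transition(builds))
```

Raising on a missing result trades convenience for correctness: the crawl has to be retried, but it cannot quietly report the wrong pair of builds.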
,
Apr 4 2016
Breakage is gone now that the revert is in -- although it looks like some new failures snuck in while the tree was red.
,
Apr 4 2016
Comment 1 by skyos...@chromium.org
, Apr 1 2016