Issue metadata
Sign in to add a comment
|
cheets_GTS.GtsAdminTestCases failing on eve-tot-chrome-pfq-informational |
||||||||||||||||||||
Issue descriptionIn HWTest [bvt-arc] on eve-tot-chrome-pfq-informational, cheets_GTS.GtsAdminTestCases failed: https://ci.chromium.org/p/chromeos/builds/b8946537013290657920 I don't see anything meaningful in the logs: 05/14 12:28:37.477 ERROR| utils:0282| [stderr] cat: /tmp/sysinfo/autoserv-MMoFrk/.checksum: No such file or directory 05/14 12:29:25.956 ERROR| utils:0282| [stderr] SELinux: Loaded file_contexts contexts from /file_contexts.bin. 05/14 12:30:40.473 ERROR| utils:0282| [stderr] SELinux: Loaded file_contexts contexts from /file_contexts.bin. 05/14 12:31:40.128 ERROR| utils:0282| [stderr] SELinux: Loaded file_contexts contexts from /file_contexts.bin. 05/14 12:32:22.186 ERROR| utils:0282| [stderr] cat: /tmp/sysinfo/autoserv-MMoFrk/.checksum: No such file or directory 05/14 12:32:46.860 ERROR| logging_manager:0626| tko parser: {'cidb_build_stage_id': 79688286, 'builds': "{'cros-version': 'eve-tot-chrome-pfq-informational/R68-10678.0.0-b2569696'}", 'job_started': 1526326068, 'hostname': 'chromeos6-row4-rack11-host3', 'cidb_build_id': 2569696, 'status_version': 1, 'label': 'eve-tot-chrome-pfq-informational/R68-10678.0.0-b2569696/bvt-arc/cheets_GTS.5.1_r4.GtsAdminTestCases', 'parent_job_id': 200094324, 'drone': 'cros-full-0004.mtv.corp.google.com', 'user': 'chromeos-test', 'suite': 'bvt-arc', 'job_queued': 1526321990, 'experimental': 'False', 'build': 'eve-tot-chrome-pfq-informational/R68-10678.0.0-b2569696'} 05/14 12:32:46.861 ERROR| logging_manager:0626| tko parser: MACHINE NAME: chromeos6-row4-rack11-host3 05/14 12:32:46.861 ERROR| logging_manager:0626| tko parser: MACHINE GROUP: eve 05/14 12:32:46.861 ERROR| logging_manager:0626| tko parser: parsing partial test ---- SERVER_JOB 05/14 12:32:46.861 ERROR| logging_manager:0626| tko parser: parsing partial test cheets_GTS.GtsAdminTestCases cheets_GTS.GtsAdminTestCases 05/14 12:32:46.862 ERROR| logging_manager:0626| tko parser: RUNNING: RUNNING 05/14 12:32:46.862 ERROR| logging_manager:0626| Subdir: cheets_GTS.GtsAdminTestCases 05/14 12:32:46.862 ERROR| logging_manager:0626| Testname: cheets_GTS.GtsAdminTestCases 05/14 12:32:46.862 ERROR| logging_manager:0626| 05/14 12:32:46.862 ERROR| logging_manager:0626| tko parser: update RUNNING reason: Failed: after 2 retries giving up. passed=2, failed=0, waived=0, notexec>=1. run(p=2, f=0, w=0) retry(p=2, f=0, w=0) retry(p=2, f=0, w=0) 05/14 12:32:46.862 ERROR| logging_manager:0626| tko parser: parsing test cheets_GTS.GtsAdminTestCases cheets_GTS.GtsAdminTestCases 05/14 12:32:46.863 ERROR| logging_manager:0626| tko parser: ADD: FAIL 05/14 12:32:46.863 ERROR| logging_manager:0626| Subdir: cheets_GTS.GtsAdminTestCases 05/14 12:32:46.863 ERROR| logging_manager:0626| Testname: cheets_GTS.GtsAdminTestCases 05/14 12:32:46.863 ERROR| logging_manager:0626| Failed: after 2 retries giving up. passed=2, failed=0, waived=0, notexec>=1. run(p=2, f=0, w=0) retry(p=2, f=0, w=0) retry(p=2, f=0, w=0) 05/14 12:32:46.863 ERROR| logging_manager:0626| tko parser: parsing test ---- SERVER_JOB I guess it's saying that one or more tests weren't run?
,
May 15 2018
In the Android log, the test process is crashing in: 05-14 12:32:02.325 2071 2071 E AndroidRuntime: FATAL EXCEPTION: main 05-14 12:32:02.325 2071 2071 E AndroidRuntime: Process: com.google.android.admin.gts, PID: 2071 05-14 12:32:02.325 2071 2071 E AndroidRuntime: java.lang.NullPointerException: Attempt to invoke virtual method 'android.content.res.Configuration android.content.res.Resources.getConfiguration()' on a null object reference 05-14 12:32:02.325 2071 2071 E AndroidRuntime: at android.app.ActivityThread.updateLocaleListFromAppContext(ActivityThread.java:5204) 05-14 12:32:02.325 2071 2071 E AndroidRuntime: at android.app.ActivityThread.handleBindApplication(ActivityThread.java:5441) 05-14 12:32:02.325 2071 2071 E AndroidRuntime: at android.app.ActivityThread.-wrap2(ActivityThread.java) 05-14 12:32:02.325 2071 2071 E AndroidRuntime: at android.app.ActivityThread$H.handleMessage(ActivityThread.java:1595) 05-14 12:32:02.325 2071 2071 E AndroidRuntime: at android.os.Handler.dispatchMessage(Handler.java:102) 05-14 12:32:02.325 2071 2071 E AndroidRuntime: at android.os.Looper.loop(Looper.java:154) 05-14 12:32:02.325 2071 2071 E AndroidRuntime: at android.app.ActivityThread.main(ActivityThread.java:6320) 05-14 12:32:02.325 2071 2071 E AndroidRuntime: at java.lang.reflect.Method.invoke(Native Method) 05-14 12:32:02.325 2071 2071 E AndroidRuntime: at com.android.internal.os.ZygoteInit$MethodAndArgsCaller.run(ZygoteInit.java:891) 05-14 12:32:02.325 2071 2071 E AndroidRuntime: at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:781) 05-14 12:32:02.326 2071 2071 I Process : Sending signal. PID: 2071 SIG: 9
,
May 15 2018
Looks like an Android issue not specific to ARC++/ChromeOS..? b/issues?q=updateLocaleListFromAppContext%20status:open b/63333966 But that said, we seemed to have mitigated the flakiness by retries (orange cells) in the past. https://stainless.corp.google.com/search?view=matrix&row=build&col=test&first_date=2018-04-15&last_date=2018-05-15&test=cheets_GTS.5.*Gts%28Admin%7CNet%7CPlacement%29TestCases&board=%5Eeve%24&status=GOOD&status=WARN&status=FAIL&status=ERROR&exclude_cts=false&exclude_not_run=false&exclude_non_release=false&exclude_au=true&exclude_acts=true&exclude_retried=true&exclude_non_production=false Recently, even after the retries the failures tend to alive more.
,
May 15 2018
(passing to the PST constable. I have no progress since #3)
,
May 16 2018
Looked at the logs, no conclusion yet. Our crash/ server (ChromeOS_ARC category) does have crashes at updateLocaleListFromAppContext() as I wrote in b/63333966#comment16 but I'm not sure if this particular GTS failure shares the same root cause with the crashes because GTS test failures seem to happen more often than crashes in crash/. The root cause of the upstream bug b/63333966 seems also unknown but the log from GTS does not match the one in b/63333966#comment8 at least. It's hard to say if this is the same as upstream's. The logcat log when the test fails is really strange. Before the FATAL EXCEPTION, it has 05-14 12:32:02.310 168 178 I ActivityManager: Force stopping com.google.android.admin.gts appid=10058 user=0: start instr 05-14 12:32:02.321 168 178 I ActivityManager: Start proc 2071:com.google.android.admin.gts/u0a58 for added application com.google.android.admin.gts 05-14 12:32:02.324 2071 2071 W System : ClassLoader referenced unknown path: /data/app/com.google.android.admin.gts-1/base.apk 05-14 12:32:02.324 2071 2071 W System : ClassLoader referenced unknown path: /data/app/com.google.android.admin.gts-1/lib/x86 05-14 12:32:02.324 2071 2071 W linker : Warning: unable to resolve "/data/app/com.google.android.admin.gts-1/base.apk": No such file or directory 05-14 12:32:02.325 2071 2071 W linker : Warning: unable to resolve "/data/app/com.google.android.admin.gts-1/base.apk": No such file or directory 05-14 12:32:02.325 2071 2071 W art : JIT profile information will not be recorded: profile file does not exits. 05-14 12:32:02.325 2071 2071 W asset : Asset path /data/app/com.google.android.admin.gts-1/base.apk is neither a directory nor file (type=1). 05-14 12:32:02.325 2071 2071 E ResourcesManager: failed to add asset path /data/app/com.google.android.admin.gts-1/base.apk 05-14 12:32:02.325 2071 2071 D AndroidRuntime: Shutting down VM 05-14 12:32:02.325 2071 2071 E AndroidRuntime: FATAL EXCEPTION: main and the linker line is from Bionic. if (realpath(zip_path.c_str(), resolved_path) == nullptr) { DL_WARN("Warning: unable to resolve \"%s\": %s", zip_path.c_str(), strerror(errno)); continue; } This means that the base.apk file does not exist at all at this point. Just in case, I searched for SELinux denials, but found none. I'm comparing two logs (test failure vs success) and one thing I've noticed is when it fails, the log only shows dex2oat for com.google.android.admin.gts for isa=x86: 05-14 12:32:01.388 168 194 I PackageManager.DexOptimizer: Running dexopt (dex2oat) on: /data/app/vmdl91353407.tmp/base.apk pkg=com.google.android.admin.gts isa=x86 vmSafeMode=false debuggable=false target-filter=interpret-only oatDir = /data/app/vmdl91353407.tmp/oat sharedLibraries=null 05-14 12:32:01.396 2032 2032 I dex2oat : /system/bin/dex2oat --no-digest-validation --compiler-filter=interpret-only 05-14 12:32:01.570 2032 2036 W dex2oat : Class android.support.test.rule.logging.AtraceLogger failed lock verification and will run slower. 05-14 12:32:01.570 2032 2036 W dex2oat : Common causes for lock verification issues are non-optimized dex code 05-14 12:32:01.570 2032 2036 W dex2oat : and incorrect proguard optimizations. 05-14 12:32:01.766 2032 2032 I dex2oat : dex2oat took 370.840ms (threads: 4) arena alloc=17KB (18352B) java alloc=2MB (2987816B) native alloc=7MB (8337248B) free=3MB (3197088B) 05-14 12:32:01.769 168 194 I BackgroundDexOptService: onBootPhase 05-14 12:32:01.769 168 194 W SELinux : SELinux: setxattr restorecon_last failed (expected): /data/app/com.google.android.admin.gts-1: Operation not permitted. 05-14 12:32:01.805 62 62 D installd: Detected label change from u:object_r:system_data_file:s0 to u:object_r:app_data_file:s0:c512,c768 at /data/data/com.google.android.admin.gts; running recursive restorecon 05-14 12:32:01.806 62 62 D installd: Detected label change from u:object_r:system_data_file:s0 to u:object_r:app_data_file:s0:c512,c768 at /data/user_de/0/com.google.android.admin.gts; running recursive restorecon 05-14 12:32:01.806 168 194 V BackupManagerService: restoreAtInstall pkg=com.google.android.admin.gts token=4 restoreSet=0 05-14 12:32:01.806 168 194 V BackupManagerService: Finishing install immediately 05-14 12:32:01.806 168 194 I art : Starting a blocking GC Explicit 05-14 12:32:01.815 168 194 I art : Explicit concurrent mark sweep GC freed 5889(462KB) AllocSpace objects, 5(104KB) LOS objects, 33% free, 7MB/11MB, paused 167us total 8.941ms 05-14 12:32:01.816 62 62 E : Couldn't opendir /data/app/vmdl91353407.tmp: No such file or directory // this line seems benign btw. When it succeeds, dex2oat runs for com.google.android.admin.gts for isa=x86_64 happens first, then start com.google.android.admin.gts, then dex2oat runs for isa=x86. What I'm probably going to do next is to check if the test fails if I limit the isa to only x86, and if it actually fails, investigate why sometimes only x86 is used for dex2oat. (but if kinaba@ has other ideas, please do.)
,
May 16 2018
x86 vs x86_64 was not related to the issue. The bad case only tried x86 simply because it was a retry and tradefed explicitly excluded x86_64 variant from the test.
I checked the first trial's log instead, and it did use x86_64 and still failed in the same way. I again compared good and bad logcat logs, but the only first obvious difference is that the bad case had the following:
System : ClassLoader referenced unknown path: /data/app/com.google.android.admin.gts-1/base.apk
which basically said base.apk was neither a directory or file (maybe nonexistent.)
My best guess is that the code for renaming /data/app/vmdl*.tpm to /data/app/com.google.android.admin.gts-1 was not called for some reason??? This is the code and unless DEBUG_INSTALL is enabled, nothing is logged here:
// PackageManagerService.java
if (DEBUG_INSTALL) Slog.d(TAG, "Renaming " + beforeCodeFile + " to " + afterCodeFile);
try {
Os.rename(beforeCodeFile.getAbsolutePath(), afterCodeFile.getAbsolutePath());
} catch (ErrnoException e) {
Slog.w(TAG, "Failed to rename", e);
return false;
}
kinaba@, I have two questions:
1) Can I temporarily enable the debug log above? (will send a CL shortly)
2) According to the host side logcat log, P's tradefed command is used for this rather than N's. Is that expected?
,
May 17 2018
Landed: ag/4067574
,
May 17 2018
Answered offline, but for record: Re: (2), GTS is built from the latest release branch (now O MR1) targeting all the existing branches. So, newer GTS running on N is expected/intended.
,
May 17 2018
,
May 17 2018
10688.0.0 has the logging patch. Waiting for the gts test to fail...
,
May 19 2018
Unfortunately, #8 didn't tell us anything. https://pantheon.corp.google.com/storage/browser/chromeos-autotest-results/200963879-chromeos-test this failure for example, has the log (== /data/app/app-name-1/ was created) but right after that, the log says /data/app/app-name-1/base.apk is missing. It's unclear what happened in between. I already reverted #8 since it was useless. I guess we need even more logging to know what's going on... I sent ag/4112534 to you (nya@). nya@, I'm reassigning this to you, next week's constable. Can you take a look? I think this is the most important cheets auto test flake at this point.
,
May 21 2018
Sorry, I don't have any news about this issue. Assigning to PST constable.
,
May 21 2018
Added some more logging at ag/4112534. Please revert once this is fixed (or the logs are useless.)
,
May 24 2018
Recent GTS crashes left different logs: https://storage.cloud.google.com/chromeos-autotest-results/202442445-chromeos-test/chromeos6-row4-rack11-host7/cheets_GTS.GtsNetTestCases/results/android-gts/logs/2018.05.23_21.28.13/inv_564407203681777484/device_logcat_setup_7538128073790608880.txt.gz 05-23 21:27:54.221 75 75 F libc : Fatal signal 6 (SIGABRT), code -6 in tid 75 (system_server) 05-23 21:27:54.221 6 6 I chatty : uid=0(root) /system/bin/debuggerd64 expire 1 line 05-23 21:27:54.226 160 160 I chatty : uid=0(root) expire 1 line 05-23 21:27:54.277 160 160 F DEBUG : *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** 05-23 21:27:54.277 160 160 F DEBUG : Build fingerprint: 'google/eve/eve_cheets:7.1.1/R68-10714.0.0/4799854:user/release-keys' 05-23 21:27:54.277 160 160 F DEBUG : Revision: '0' 05-23 21:27:54.277 160 160 F DEBUG : ABI: 'x86_64' 05-23 21:27:54.277 160 160 F DEBUG : pid: 75, tid: 75, name: system_server >>> system_server <<< 05-23 21:27:54.277 160 160 F DEBUG : signal 6 (SIGABRT), code -6 (SI_TKILL), fault addr -------- 05-23 21:27:54.278 160 160 F DEBUG : Abort message: 'art/runtime/gc/space/image_space.cc:256] Unable to find a relocated version of image file /system/framework/boot.art' 05-23 21:27:54.278 160 160 F DEBUG : rax 0000000000000000 rbx 00007fed1b3a7be8 rcx 00007fed1623a587 rdx 0000000000000006 05-23 21:27:54.278 160 160 F DEBUG : rsi 000000000000004b rdi 000000000000004b 05-23 21:27:54.278 160 160 F DEBUG : r8 ffffffffffffffc8 r9 000000000026c92e r10 0000000000000008 r11 0000000000000206 05-23 21:27:54.278 160 160 F DEBUG : r12 000000000000004b r13 0000000000000006 r14 00007fed0b999fd0 r15 00007fed15c95a00 05-23 21:27:54.278 160 160 F DEBUG : cs 0000000000000033 ss 000000000000002b 05-23 21:27:54.278 160 160 F DEBUG : rip 00007fed1623a587 rbp 000000000000000d rsp 00007ffc2e140978 eflags 0000000000000206 05-23 21:27:54.321 60 60 I ServiceManager: Waiting for service media.camera.proxy... 05-23 21:27:54.421 60 60 I ServiceManager: Waiting for service media.camera.proxy... 05-23 21:27:54.522 60 60 I ServiceManager: Waiting for service media.camera.proxy... 05-23 21:27:54.596 160 160 F DEBUG : 05-23 21:27:54.596 160 160 F DEBUG : backtrace: 05-23 21:27:54.596 160 160 F DEBUG : #00 pc 000000000008d587 /system/lib64/libc.so (tgkill+7) 05-23 21:27:54.597 160 160 F DEBUG : #01 pc 000000000008a041 /system/lib64/libc.so (pthread_kill+65) 05-23 21:27:54.597 160 160 F DEBUG : #02 pc 0000000000030201 /system/lib64/libc.so (raise+17) 05-23 21:27:54.597 160 160 F DEBUG : #03 pc 000000000002887d /system/lib64/libc.so (abort+77) 05-23 21:27:54.597 160 160 F DEBUG : #04 pc 00000000004ed81a /system/lib64/libart.so (_ZN3art7Runtime5AbortEPKc+474) 05-23 21:27:54.597 160 160 F DEBUG : #05 pc 00000000001509d9 /system/lib64/libart.so (_ZN3art10LogMessageD1Ev+1785) 05-23 21:27:54.597 160 160 F DEBUG : #06 pc 00000000002ad4a3 /system/lib64/libart.so (_ZN3art2gc5space10ImageSpace20ReadImageHeaderOrDieEPKcNS_14InstructionSetE+147) 05-23 21:27:54.597 160 160 F DEBUG : #07 pc 0000000000460f73 /system/lib64/libart.so (_ZN3art16OatFileAssistant12GetImageInfoEv+451) 05-23 21:27:54.597 160 160 F DEBUG : #08 pc 000000000045ff52 /system/lib64/libart.so (_ZN3art16OatFileAssistant23GivenOatFileIsOutOfDateERKNS_7OatFileE+722) 05-23 21:27:54.597 160 160 F DEBUG : #09 pc 00000000004603d4 /system/lib64/libart.so (_ZN3art16OatFileAssistant22GivenOatFileIsUpToDateERKNS_7OatFileE+36) 05-23 21:27:54.597 160 160 F DEBUG : #10 pc 000000000045aa88 /system/lib64/libart.so (_ZN3art16OatFileAssistant15GetDexOptNeededENS_14CompilerFilter6FilterE+344) 05-23 21:27:54.597 160 160 F DEBUG : #11 pc 0000000000421303 /system/lib64/libart.so (_ZN3artL23DexFile_getDexOptNeededEP7_JNIEnvP7_jclassP8_jstringS5_S5_h+435) 05-23 21:27:54.597 160 160 F DEBUG : #12 pc 0000000000857160 /system/framework/x86_64/boot-core-libart.oat (offset 0x666000) (int dalvik.system.DexFile.getDexOptNeeded(java.lang.String, java.lang.String, java.lang.String, boolean)+268) 05-23 21:27:54.597 160 160 F DEBUG : #13 pc 00000000018f2f0c /system/framework/oat/x86_64/services.odex (offset 0x1083000) (com.android.server.pm.PackageManagerService.<init>+6632) 05-23 21:27:54.597 160 160 F DEBUG : #14 pc 000000000121fefe /system/framework/oat/x86_64/services.odex (offset 0x1083000) (com.android.server.SystemServer.startBootstrapServices+1274) 05-23 21:27:54.597 160 160 F DEBUG : #15 pc 000000000121f637 /system/framework/oat/x86_64/services.odex (offset 0x1083000) (com.android.server.SystemServer.run+1347) 05-23 21:27:54.597 160 160 F DEBUG : #16 pc 000000000121eae7 /system/framework/oat/x86_64/services.odex (offset 0x1083000) (com.android.server.SystemServer.main+83) 05-23 21:27:54.597 160 160 F DEBUG : #17 pc 000000000013c016 /system/lib64/libart.so (art_quick_invoke_static_stub+806) 05-23 21:27:54.597 160 160 F DEBUG : #18 pc 0000000000149edc /system/lib64/libart.so (_ZN3art9ArtMethod6InvokeEPNS_6ThreadEPjjPNS_6JValueEPKc+268) 05-23 21:27:54.597 160 160 F DEBUG : #19 pc 00000000004e66c7 /system/lib64/libart.so (_ZN3artL18InvokeWithArgArrayERKNS_33ScopedObjectAccessAlreadyRunnableEPNS_9ArtMethodEPNS_8ArgArrayEPNS_6JValueEPKc+87) 05-23 21:27:54.597 160 160 F DEBUG : #20 pc 00000000004e87b4 /system/lib64/libart.so (_ZN3art12InvokeMethodERKNS_33ScopedObjectAccessAlreadyRunnableEP8_jobjectS4_S4_m+1444) 05-23 21:27:54.597 160 160 F DEBUG : #21 pc 0000000000448cf8 /system/lib64/libart.so (_ZN3artL13Method_invokeEP7_JNIEnvP8_jobjectS3_S3_+56) 05-23 21:27:54.597 160 160 F DEBUG : #22 pc 0000000000813527 /system/framework/x86_64/boot.oat (offset 0x73b000) (java.lang.Object java.lang.reflect.Method.invoke(java.lang.Object, java.lang.Object[])+227) 05-23 21:27:54.597 160 160 F DEBUG : #23 pc 0000000002fb0faa /system/framework/x86_64/boot-framework.oat (offset 0x1fcb000) (void com.android.internal.os.ZygoteInit$MethodAndArgsCaller.run()+134) 05-23 21:27:54.597 160 160 F DEBUG : #24 pc 0000000002fb219f /system/framework/x86_64/boot-framework.oat (offset 0x1fcb000) (void com.android.internal.os.ZygoteInit.main(java.lang.String[])+1355) 05-23 21:27:54.597 160 160 F DEBUG : #25 pc 000000000013c016 /system/lib64/libart.so (art_quick_invoke_static_stub+806) 05-23 21:27:54.597 160 160 F DEBUG : #26 pc 0000000000149edc /system/lib64/libart.so (_ZN3art9ArtMethod6InvokeEPNS_6ThreadEPjjPNS_6JValueEPKc+268) 05-23 21:27:54.597 160 160 F DEBUG : #27 pc 00000000004e66c7 /system/lib64/libart.so (_ZN3artL18InvokeWithArgArrayERKNS_33ScopedObjectAccessAlreadyRunnableEPNS_9ArtMethodEPNS_8ArgArrayEPNS_6JValueEPKc+87) 05-23 21:27:54.597 160 160 F DEBUG : #28 pc 00000000004e6274 /system/lib64/libart.so (_ZN3art17InvokeWithVarArgsERKNS_33ScopedObjectAccessAlreadyRunnableEP8_jobjectP10_jmethodIDP13__va_list_tag+388) 05-23 21:27:54.597 160 160 F DEBUG : #29 pc 00000000003e288d /system/lib64/libart.so (_ZN3art3JNI21CallStaticVoidMethodVEP7_JNIEnvP7_jclassP10_jmethodIDP13__va_list_tag+781) 05-23 21:27:54.597 160 160 F DEBUG : #30 pc 00000000000ac1d2 /system/lib64/libandroid_runtime.so (_ZN7_JNIEnv20CallStaticVoidMethodEP7_jclassP10_jmethodIDz+146) 05-23 21:27:54.597 160 160 F DEBUG : #31 pc 00000000000af039 /system/lib64/libandroid_runtime.so (_ZN7android14AndroidRuntime5startEPKcRKNS_6VectorINS_7String8EEEb+665) 05-23 21:27:54.597 160 160 F DEBUG : #32 pc 00000000000027f7 /system/bin/app_process64 05-23 21:27:54.597 160 160 F DEBUG : #33 pc 000000000001c9d4 /system/lib64/libc.so (__libc_init+84) 05-23 21:27:54.597 160 160 F DEBUG : #34 pc 000000000000209f /system/bin/app_process64 05-23 21:27:54.597 160 160 F DEBUG : #35 pc 0000000000000005 <unknown>
,
May 30 2018
Recent run still fails like comment#2: https://stainless.corp.google.com/browse/chromeos-autotest-results/203643264-chromeos-test/ One interesting thing is that the test runs 3 ABIs, and in the failed run the first one (x86_64) pass, and the other two (x86, armeabi-v7) fail. ========================================= 5/29 03:10:21.896 DEBUG| utils:0283| [stdout] 05-29 03:10:21 I/CompatibilityTest: ======================================== 05/29 03:10:21.896 DEBUG| utils:0283| [stdout] 05-29 03:10:21 I/CompatibilityTest: Starting a run with 3 unique modules. 05/29 03:10:21.897 DEBUG| utils:0283| [stdout] 05-29 03:10:21 I/CompatibilityTest: ======================================== 05/29 03:10:21.898 DEBUG| utils:0283| [stdout] 05-29 03:10:21 I/ModuleRepo: chromeos6-row4-rack11-host13:22 running 3 test sub-modules, expected to complete in 3m 0s. 05/29 03:10:21.900 DEBUG| utils:0283| [stdout] 05-29 03:10:21 I/CompatibilityTest: Starting 3 test sub-modules on chromeos6-row4-rack11-host13:22 05/29 03:10:24.469 DEBUG| utils:0283| [stdout] 05-29 03:10:24 I/ConsoleReporter: [chromeos6-row4-rack11-host13:22] Starting x86_64 GtsAdminTestCases with 2 tests 05/29 03:10:24.473 DEBUG| utils:0283| [stdout] 05-29 03:10:24 I/ConsoleReporter: [1/2 x86_64 GtsAdminTestCases chromeos6-row4-rack11-host13:22] com.google.android.admin.gts.EnterprisePrivacySettingsTest#testEnterprisePrivacySettings pass 05/29 03:10:24.475 DEBUG| utils:0283| [stdout] 05-29 03:10:24 I/ConsoleReporter: [2/2 x86_64 GtsAdminTestCases chromeos6-row4-rack11-host13:22] com.google.android.admin.gts.ProfileOwnerTest#testSetProfileOwnerActivityExists pass 05/29 03:10:24.476 DEBUG| utils:0283| [stdout] 05-29 03:10:24 I/ConsoleReporter: [chromeos6-row4-rack11-host13:22] x86_64 GtsAdminTestCases completed in 8 ms. 2 passed, 0 failed, 0 not executed 05/29 03:10:24.891 DEBUG| utils:0283| [stdout] 05-29 03:10:24 W/CompatibilityTest: Inaccurate runtime hint for x86_64 GtsAdminTestCases, expected 1m 0s was 2s 05/29 03:10:25.544 DEBUG| utils:0283| [stdout] 05-29 03:10:25 D/KeyguardStatusChecker: KeyguardControllerState is not supported by the device. Skipping System Checker. 05/29 03:10:28.082 DEBUG| utils:0283| [stdout] 05-29 03:10:28 W/InstrumentationTest: Run failure Instrumentation run failed due to 'Process crashed.' when collecting tests to run for com.google.android.admin.gts on device chromeos6-row4-rack11-host13:22. 05/29 03:10:28.410 DEBUG| utils:0283| [stdout] 05-29 03:10:28 I/ConsoleReporter: [chromeos6-row4-rack11-host13:22] Starting x86 GtsAdminTestCases with 0 test 05/29 03:10:28.411 DEBUG| utils:0283| [stdout] 05-29 03:10:28 I/ConsoleReporter: [chromeos6-row4-rack11-host13:22] Instrumentation run failed due to 'Process crashed.' 05/29 03:10:28.411 DEBUG| utils:0283| [stdout] 05-29 03:10:28 I/ConsoleReporter: [chromeos6-row4-rack11-host13:22] x86 GtsAdminTestCases completed in 0 ms. 0 passed, 0 failed, 0 not executed 05/29 03:10:29.344 DEBUG| utils:0283| [stdout] 05-29 03:10:29 W/CompatibilityTest: Inaccurate runtime hint for x86 GtsAdminTestCases, expected 1m 0s was 3s 05/29 03:10:29.943 DEBUG| utils:0283| [stdout] 05-29 03:10:29 D/KeyguardStatusChecker: KeyguardControllerState is not supported by the device. Skipping System Checker. 05/29 03:10:32.346 DEBUG| utils:0283| [stdout] 05-29 03:10:32 W/InstrumentationTest: Run failure Instrumentation run failed due to 'Process crashed.' when collecting tests to run for com.google.android.admin.gts on device chromeos6-row4-rack11-host13:22. 05/29 03:10:32.630 DEBUG| utils:0283| [stdout] 05-29 03:10:32 I/ConsoleReporter: [chromeos6-row4-rack11-host13:22] Starting armeabi-v7a GtsAdminTestCases with 0 test 05/29 03:10:32.630 DEBUG| utils:0283| [stdout] 05-29 03:10:32 I/ConsoleReporter: [chromeos6-row4-rack11-host13:22] Instrumentation run failed due to 'Process crashed.' 05/29 03:10:32.631 DEBUG| utils:0283| [stdout] 05-29 03:10:32 I/ConsoleReporter: [chromeos6-row4-rack11-host13:22] armeabi-v7a GtsAdminTestCases completed in 0 ms. 0 passed, 0 failed, 0 not executed ============================================= It looks like sometimes the harness cannot replace the apk correctly (maybe some race?). @kinaba, is there any known issues with running multiple ABIs in the same invocation? Should we run them separately instead, using distinct gts-tradefed commands? Otherwise, please reassign to current constable for further analysis. Also we might consider running only primary/default abi until root cause is diagnosed to get rid of the flakiness More observations (maybe irrelevant): 1. Failing run has many suspicious Finsky logs (good run doesn't have them): 05-29 03:12:31.227 1147 1147 W Finsky : [1] com.google.android.finsky.application.d.cY(896): No account configured on this device 2. eve-tot-chrome-pfq-informational seems to fail much more often that other builders: https://stainless.corp.google.com/search?view=list&first_date=2018-05-03&last_date=2018-05-30&builder_name=%5E.*eve.*%24&test=cheets_GTS.*GtsAdminTestCases&status=FAIL&status=ERROR&status=ABORT&exclude_cts=false&exclude_not_run=false&exclude_non_release=false&exclude_au=true&exclude_acts=true&exclude_retried=false&exclude_non_production=false Maybe the flakiness depends on the specific hardware/network properties.
,
May 30 2018
,
May 30 2018
Sorry I just now noticed I got assigned. > @kinaba, is there any known issues with running multiple ABIs in the same invocation? > Should we run them separately instead, using distinct gts-tradefed commands? I haven't heard any such issues... at least I haven't ever bitten by this kind of bug on regularly running arc-gts suite. As you wrote, eve-tot-chrome-pfq-informational looks much more affected. Weird. Separating the GTS execution per ABI is technically possible, but GTS is usually not run in that way... I mean, running multiple ABIs at once has to be a perfectly valid execution. If it failed, the test detected some badness correctly. I don't think we want to change the test to ignore bugs. That said, > Also we might consider running only primary/default abi until root cause is diagnosed to get rid of the flakiness that sounds like a practical choice we need to consider. Is there any way to detect (from the test itself) from which suite the test is run? Currently the GtsAdminTestCases is shared between bvt and arc-gts(-qual), where the latter is used for generating the certification of Android compatibility hence we cannot omit any supported ABI.
,
May 31 2018
Looking at the dashboard: https://stainless.corp.google.com/search?view=matrix&row=build&col=test&first_date=2018-05-04&last_date=2018-05-31&test=cheets_GTS.5.*r4.*Gts%28Admin%7CNet%7CPlacement%29TestCases&board=%5Eeve%24&status=GOOD&status=WARN&status=FAIL&status=ERROR&exclude_cts=false&exclude_not_run=false&exclude_non_release=false&exclude_au=true&exclude_acts=true&exclude_retried=true&exclude_non_production=false (1) The failures are only seen on non-release builds (-b#######). (2) Until R68-10655, retry made the test passing (=yellow cells). Since R68-10658 the failure started to persist. The "retry" of (2) restarts the browser and recreates the ChromeOS user and hence Android /data, so to me it looked weird that any bad state kept beyond the retries. @cmtm, I wonder if https://chromium-review.googlesource.com/1043488 in the regression range may explain it. The delayed timing of unmount may be leaving something unexpectedly.
,
May 31 2018
And if we widen the dashboard range https://stainless.corp.google.com/search?view=matrix&row=build&col=test&first_date=2018-02-04&last_date=2018-05-31&test=cheets_GTS.5.*Gts%28Admin%7CNet%7CPlacement%29TestCases&board=%5Eeve%24&status=GOOD&status=WARN&status=FAIL&status=ERROR&exclude_cts=false&exclude_not_run=false&exclude_non_release=false&exclude_au=true&exclude_acts=true&exclude_retried=true&exclude_non_production=false even the yellow cell is first seen on R68-10598.0.0, so it can be a relatively recent regression
,
May 31 2018
#15 may be closer to the root cause. Actually it is not a different logs. All the failing instances are first crashing by #15 (at the container start up phase), and then crashing by #2 while running the GTS test. Passing runs show neither of them. So probably the situation may probably be that #15 crashed art/system_server/whatever and left the container in a bad state that triggered #2.
,
May 31 2018
And IIUC #15 is triggered when the checksum of /system/framework/<image_isa>/boot.art and /data/dalvik-cache/<image_isa>/system@framework@boot.art did not match, or the latter did not exist. https://android.googlesource.com/platform/art/+/nougat-mr1-dev/runtime/gc/space/image_space.cc#274
,
May 31 2018
arc-boot-continue.log comes with "Unexpected mount info". Does it have something in common with crbug.com/834479 ? [0529/031154:INFO:arc_setup.cc(653)] Unmounting: /opt/google/containers/android/rootfs/android-data/data/dalvik-cache/arm [0529/031154:INFO:arc_setup.cc(654)] Running arc_mounter_->UmountLazily(mount_path)... [0529/031154:ERROR:arc_setup_util.cc(357)] Failed to lazy-umount /opt/google/containers/android/rootfs/android-data/data/dalvik-cache/arm: Invalid argument [0529/031154:INFO:arc_setup.cc(654)] Ignoring failures: arc_mounter_->UmountLazily(mount_path) [0529/031154:INFO:arc_setup.cc(653)] Unmounting: /opt/google/containers/android/rootfs/android-data/data/dalvik-cache/x86 [0529/031154:INFO:arc_setup.cc(654)] Running arc_mounter_->UmountLazily(mount_path)... [0529/031154:ERROR:arc_setup.cc(242)] Unexpected mount info: 113 18 259:1 /unencrypted/art-data/dalvik-cache/x86 /opt/google/containers/android/rootfs/android-data/data/dalvik-cache/x86 rw,nosuid,nodev,noexec,noatime - ext4 /dev/nvme0n1p1 rw,seclabel,commit=600,data=ordered [0529/031154:ERROR:arc_setup.cc(242)] Unexpected mount info: 114 18 259:1 /unencrypted/art-data/dalvik-cache/x86_64 /opt/google/containers/android/rootfs/android-data/data/dalvik-cache/x86_64 rw,nosuid,nodev,noexec,noatime - ext4 /dev/nvme0n1p1 rw,seclabel,commit=600,data=ordered [0529/031154:ERROR:arc_setup.cc(242)] Unexpected mount info: 120 114 259:1 /unencrypted/art-data/dalvik-cache/x86_64 /opt/google/containers/android/rootfs/android-data/data/dalvik-cache/x86_64 rw,nosuid,nodev,noexec,noatime - ext4 /dev/nvme0n1p1 rw,seclabel,commit=600,data=ordered [0529/031154:INFO:arc_setup.cc(653)] Unmounting: /opt/google/containers/android/rootfs/android-data/data/dalvik-cache/x86_64 [0529/031154:INFO:arc_setup.cc(654)] Running arc_mounter_->UmountLazily(mount_path)... [0529/031154:ERROR:arc_setup.cc(242)] Unexpected mount info: 114 18 259:1 /unencrypted/art-data/dalvik-cache/x86_64 /opt/google/containers/android/rootfs/android-data/data/dalvik-cache/x86_64 rw,nosuid,nodev,noexec,noatime - ext4 /dev/nvme0n1p1 rw,seclabel,commit=600,data=ordered
,
May 31 2018
To #19, after the suite is expanded to job, it only records its parent suite id, not the suite name. I do not know any method to fetch it from the test itself without reaching out the database for the information. BTW, dynamically adjust test's behavior based on which suite it is scheduled on sounds not right to me. Suite only serves as a simple wrapper of test collection. If we want different behavior of the test, we should create new test control file for it if needed.
,
May 31 2018
#24 The cheets_ContainerMount failure happens occasionally. Not very often, but it sometimes does. I'm trying to fix the issue by moving the mount operation from arc-setup to config.json, but haven't finished it yet. For totally unknown reasons, betty pre-cq freezes with my CL.....
,
Jun 8 2018
The following revision refers to this bug: https://chromium.googlesource.com/chromiumos/platform2/+/3031d6000a517cb20d4d61d6e80704fa828817fb commit 3031d6000a517cb20d4d61d6e80704fa828817fb Author: yusukes <yusukes@google.com> Date: Fri Jun 08 02:56:16 2018 arc: Mount /data/dalvik-cache/<isa> in the container namespace With run_oci, there's no reason to do that in the init namespace. This will fix the occasional mount point leak reported at crbug.com/834479 too. BUG= chromium:834479 BUG= chromium:842927 TEST=ARC++ still starts, cheets_ContainerMount CQ-DEPEND=CL:1065582 Change-Id: I4b442d4702c3a09f020a5cc7a40463b5bd5dff59 Reviewed-on: https://chromium-review.googlesource.com/1065587 Commit-Ready: Yusuke Sato <yusukes@chromium.org> Tested-by: Yusuke Sato <yusukes@chromium.org> Reviewed-by: Luis Hector Chavez <lhchavez@chromium.org> [modify] https://crrev.com/3031d6000a517cb20d4d61d6e80704fa828817fb/arc/container-bundle/pi/config.json [modify] https://crrev.com/3031d6000a517cb20d4d61d6e80704fa828817fb/arc/container-bundle/nyc/config.json [modify] https://crrev.com/3031d6000a517cb20d4d61d6e80704fa828817fb/arc/container-bundle/master/config.json [modify] https://crrev.com/3031d6000a517cb20d4d61d6e80704fa828817fb/arc/setup/arc_setup.cc
,
Jun 8 2018
#27 should remove #24's unexpected mount info errors. Let's see if this fixes the flake...
,
Jun 12 2018
Is the test still flaky?
,
Jun 12 2018
I assume it fixed. |
|||||||||||||||||||||
►
Sign in to add a comment |
|||||||||||||||||||||
Comment 1 by kinaba@chromium.org
, May 14 2018Owner: kinaba@chromium.org