"CvoxBrailleUtilUnitTest.TextField" is flaky (Multiple chromevox_tests are flaky)
Issue description
"CvoxBrailleUtilUnitTest.TextField" is flaky. This issue was created automatically by the chromium-try-flakes app. Please find the right owner to fix the respective test/step and assign this issue to them. If the step/test is infrastructure-related, please add Infra-Troopers label and change issue status to Untriaged. When done, please remove the issue from Sheriff Bug Queue by removing the Sheriff-Chromium label.
We have detected 3 recent flakes. List of all flakes can be found at https://chromium-try-flakes.appspot.com/all_flake_occurrences?key=ahVzfmNocm9taXVtLXRyeS1mbGFrZXNyLAsSBUZsYWtlIiFDdm94QnJhaWxsZVV0aWxVbml0VGVzdC5UZXh0RmllbGQM.
Flaky tests should be disabled within 30 minutes unless culprit CL is found and reverted. Please see more details here: https://sites.google.com/a/chromium.org/dev/developers/tree-sheriffs/sheriffing-bug-queues#triaging-auto-filed-flakiness-bugs
This flaky test/step was previously tracked in issue 540070.
,
Apr 3 2018
Issue 828342 has been merged into this issue.
,
Apr 3 2018
Issue 828320 has been merged into this issue.
,
Apr 3 2018
Issue 828319 has been merged into this issue.
,
Apr 3 2018
Issue 828289 has been merged into this issue.
,
Apr 3 2018
Issue 828106 has been merged into this issue.
,
Apr 3 2018
Issue 828080 has been merged into this issue.
,
Apr 3 2018
Issue 828070 has been merged into this issue.
,
Apr 3 2018
Issue 827824 has been merged into this issue.
,
Apr 3 2018
Issue 827823 has been merged into this issue.
,
Apr 3 2018
Issue 827818 has been merged into this issue.
,
Apr 3 2018
Issue 827808 has been merged into this issue.
,
Apr 3 2018
Issue 827766 has been merged into this issue.
,
Apr 3 2018
Issue 827795 has been merged into this issue.
,
Apr 3 2018
Issue 827765 has been merged into this issue.
,
Apr 3 2018
Issue 827657 has been merged into this issue.
,
Apr 3 2018
Issue 827655 has been merged into this issue.
,
Apr 3 2018
Issue 827636 has been merged into this issue.
,
Apr 3 2018
Sheriff here. There's tons of flake across chromevox_tests and it's been happening for a few days now. There is no clear regression CL that I can find. Here's an example log which has no real useful information: https://logs.chromium.org/v/?s=chromium%2Fbb%2Ftryserver.chromium.chromiumos%2Flinux-chromeos-rel%2F91089%2F%2B%2Frecipes%2Fsteps%2Fchromevox_tests__with_patch_%2F0%2Fstdout
I can repro locally, but flake is only seen when tests are run in batch. It seems to affect tests randomly, probably unrelated to specific test code. Here's the stack I'm seeing (below). +James, does this look potentially like the Ozone initialization race you were telling me about before?
#0 0x00007fc76d739655 in base::internal::Invoker<base::internal::BindState<content::GpuProcessHost::InitOzone()::$_1, int, base::RepeatingCallback<void (IPC::Message*)> >, void (ui::OzonePlatform*)>::RunOnce(base::internal::BindStateBase*, ui::OzonePlatform*) () from /work/chrome/src/out/cros/./libcontent.so
(gdb) bt
#0 0x00007fc76d739655 in base::internal::Invoker<base::internal::BindState<content::GpuProcessHost::InitOzone()::$_1, int, base::RepeatingCallback<void (IPC::Message*)> >, void (ui::OzonePlatform*)>::RunOnce(base::internal::BindStateBase*, ui::OzonePlatform*) () from /work/chrome/src/out/cros/./libcontent.so
#1 0x00007fc76d738ead in base::internal::Invoker<base::internal::BindState<content::(anonymous namespace)::OzoneRegisterStartupCallbackHelper(base::OnceCallback<void (ui::OzonePlatform*)>)::$_3, base::internal::RetainedRefWrapper<base::SingleThreadTaskRunner>, base::internal::PassedWrapper<base::OnceCallback<void (ui::OzonePlatform*)> > >, void (ui::OzonePlatform*)>::RunOnce(base::internal::BindStateBase*, ui::OzonePlatform*) () from /work/chrome/src/out/cros/./libcontent.so
#2 0x00007fc769fb6b4a in ui::OzonePlatform::RegisterStartupCallback(base::OnceCallback<void (ui::OzonePlatform*)>) () from /work/chrome/src/out/cros/./libozone.so
#3 0x00007fc76d735825 in content::(anonymous namespace)::OzoneRegisterStartupCallbackHelper(base::OnceCallback<void (ui::OzonePlatform*)>) () from /work/chrome/src/out/cros/./libcontent.so
#4 0x00007fc76d735730 in content::GpuProcessHost::InitOzone() () from /work/chrome/src/out/cros/./libcontent.so
#5 0x00007fc76d73387e in content::GpuProcessHost::Init() () from /work/chrome/src/out/cros/./libcontent.so
#6 0x00007fc76d73320b in content::GpuProcessHost::Get(content::GpuProcessHost::GpuProcessKind, bool) () from /work/chrome/src/out/cros/./libcontent.so
#7 0x00007fc76d721a36 in content::BrowserGpuChannelHostFactory::EstablishRequest::EstablishOnIO() () from /work/chrome/src/out/cros/./libcontent.so
#8 0x00007fc76d185a3e in base::internal::Invoker<base::internal::BindState<base::internal::IgnoreResultHelper<bool (content::PlatformNotificationContextImpl::*)()>, scoped_refptr<content::PlatformNotificationContextImpl> >, void ()>::Run(base::internal::BindStateBase*) () from /work/chrome/src/out/cros/./libcontent.so
#9 0x00007fc7702da654 in base::debug::TaskAnnotator::RunTask(char const*, base::PendingTask*) () from /work/chrome/src/out/cros/./libbase.so
#10 0x00007fc77030ccb9 in base::internal::IncomingTaskQueue::RunTask(base::PendingTask*) () from /work/chrome/src/out/cros/./libbase.so
#11 0x00007fc77031080b in base::MessageLoop::RunTask(base::PendingTask*) () from /work/chrome/src/out/cros/./libbase.so
#12 0x00007fc770310baa in base::MessageLoop::DeferOrRunPendingTask(base::PendingTask) () from /work/chrome/src/out/cros/./libbase.so
#13 0x00007fc770310e0c in base::MessageLoop::DoWork() () from /work/chrome/src/out/cros/./libbase.so
#14 0x00007fc770313239 in base::MessagePumpLibevent::Run(base::MessagePump::Delegate*) () from /work/chrome/src/out/cros/./libbase.so
#15 0x00007fc770310139 in base::MessageLoop::Run(bool) () from /work/chrome/src/out/cros/./libbase.so
#16 0x00007fc770346e59 in base::RunLoop::Run() () from /work/chrome/src/out/cros/./libbase.so
#17 0x00007fc770386b07 in base::Thread::Run(base::RunLoop*) () from /work/chrome/src/out/cros/./libbase.so
#18 0x00007fc76d59c374 in content::BrowserProcessSubThread::IOThreadRun(base::RunLoop*) () from /work/chrome/src/out/cros/./libcontent.so
#19 0x00007fc76d59c314 in content::BrowserProcessSubThread::Run(base::RunLoop*) () from /work/chrome/src/out/cros/./libcontent.so
#20 0x00007fc7703870bd in base::Thread::ThreadMain() () from /work/chrome/src/out/cros/./libbase.so
,
Apr 3 2018
Issue 827471 has been merged into this issue.
,
Apr 3 2018
Issue 827466 has been merged into this issue.
,
Apr 3 2018
Issue 827472 has been merged into this issue.
,
Apr 3 2018
Issue 827524 has been merged into this issue.
,
Apr 3 2018
Issue 827590 has been merged into this issue.
,
Apr 3 2018
Issue 827581 has been merged into this issue.
,
Apr 3 2018
Robert, the source of flake appears to be this line of code: https://cs.chromium.org/chromium/src/content/browser/gpu/gpu_process_host.cc?rcl=d134a09bfac89f0e5e9961403fa3425f6731f302&l=736 Specifically I can confirm that local repros have GetGpuPlatformSupportHost() racily returning null. Any ideas? I think it's obviously a bit late to revert the change which introduced this problem, but we should get it fixed ASAP.
,
Apr 3 2018
Also +sadrul to get more eyes on this
,
Apr 3 2018
Issue 828488 has been merged into this issue.
,
Apr 3 2018
OK, did more investigation and the issue appears to be Mus. GpuProcessHost::InitOzone() assumes OzonePlatform::InitializeForUI() has already been called by the time it runs. This is always true except when Mus is enabled, where we don't call OzonePlatform::InitializeForUI() until the ui service is started on its own background thread. I think we need to defer GpuProcessHost::InitOzone()[1] (or ui::OzonePlatform::RegisterStartupCallback's invocation of its callback[2]) until after the ui service has been brought up. I defer to sky@ and/or rjkroege@ to help figure out the best way to do that.
[1] https://cs.chromium.org/chromium/src/content/browser/gpu/gpu_process_host.cc?rcl=ae910bccac3f28ce18316a1a308a6a66a5f8b993&l=742
[2] https://cs.chromium.org/chromium/src/ui/ozone/public/ozone_platform.cc?rcl=a6aadcce39e544094dc0ec25d538d4636462d03e&l=110
,
Apr 3 2018
I have an alternative proposal for a fix which I'll express in CL form shortly.
,
Apr 3 2018
Issue 828514 has been merged into this issue.
,
Apr 3 2018
Issue 828534 has been merged into this issue.
,
Apr 3 2018
Issue 828548 has been merged into this issue.
,
Apr 3 2018
Issue 828555 has been merged into this issue.
,
Apr 3 2018
OK, I've exhausted my available energy for looking into this bug. It is definitely a race between ui service initialization (OnStart) and GpuProcessHost initialization, as explained in c#29. I was going to suggest that we simply let Aura initialize OzonePlatform in the Mus case, but this is complicated by global state management around both InputDeviceManager and DeviceDataManager. Not sure what the best path forward is, but Mus-enabled Chrome will be flakily crashy on startup until this is resolved.
,
Apr 3 2018
This sounds vaguely similar to issue 807781 "DCHECK gfx::ClientNativePixmapFactory::GetInstance() in ui::Service::OnStart()", where both ui service OnStart and aura init were fighting over initialization, in that case of a pixmap factory. There is a separate issue 824809 "mus: Startup crash on device in ui::DrmDisplayHostManager::DrmDisplayHostManager", but that's on-device only.
,
Apr 3 2018
Sadrul is more familiar with this than I am. Sadrul, is there an easy fix for this? If not, two options:
1. Make ChromeVox tests force mus off.
2. Disable mus again for dev builds.
I'm happy to do either of these, let me know what you think.
,
Apr 4 2018
Issue 828980 has been merged into this issue.
,
Apr 5 2018
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/a51e3935c81c5572eb1439a1f8d83c9184f3f128

commit a51e3935c81c5572eb1439a1f8d83c9184f3f128
Author: kylechar <kylechar@chromium.org>
Date: Thu Apr 05 17:57:00 2018

Fix two Ozone initialization races.

The first race is only for Ozone DRM. In DrmDisplayHostManager |primary_drm_device_handle_| was being accessed from multiple threads. The value was changed by the IO thread while the OzoneUI thread was still using it. Change calls to GpuThreadObserver OnGpuProcessLaunched() so it happens on the OzoneUI thread. This delays the call slightly but it removes the possibility for a race modifying |primary_drm_device_handle_|.

The second race was with OzonePlatform::RegisterStartupCallback(). This is called from the IO thread and it checks if there is an OzonePlatform instance and if Ozone UI initialization has happened. If both of those things are true, then it runs a callback immediately, otherwise it runs a callback after those things become true. The problem is that |g_platform_initialized_ui| was set true on the UI thread before Ozone UI initialization was fully finished. The callback accessed a null OzonePlatform member variable and crashed. Make sure that |g_platform_initialized_ui| is set after initialization is finished and that variable change is protected by the same lock used in RegisterStartupCallback(). The lock only protects changing |g_platform_initialized_ui|, not all of initialization, so the IO thread won't block for an extended period.

Bug: 824809, 828407
Change-Id: I73b9404c823c9eeaaeaba99feb1e113953a5bb1b
Reviewed-on: https://chromium-review.googlesource.com/980574
Reviewed-by: Robert Kroeger <rjkroege@chromium.org>
Commit-Queue: kylechar <kylechar@chromium.org>
Cr-Commit-Position: refs/heads/master@{#548481}

[modify] https://crrev.com/a51e3935c81c5572eb1439a1f8d83c9184f3f128/ui/ozone/platform/drm/host/drm_display_host_manager.cc
[modify] https://crrev.com/a51e3935c81c5572eb1439a1f8d83c9184f3f128/ui/ozone/platform/drm/host/drm_gpu_platform_support_host.cc
[modify] https://crrev.com/a51e3935c81c5572eb1439a1f8d83c9184f3f128/ui/ozone/platform/drm/host/drm_gpu_platform_support_host.h
[modify] https://crrev.com/a51e3935c81c5572eb1439a1f8d83c9184f3f128/ui/ozone/platform/drm/host/gpu_thread_observer.h
[modify] https://crrev.com/a51e3935c81c5572eb1439a1f8d83c9184f3f128/ui/ozone/public/ozone_platform.cc
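For readers following along, the second race in the commit message above can be modeled in a few lines of standard C++. This is only a sketch of the pattern, not the code from the CL: FakePlatform, gpu_support_host(), and host_storage_ are hypothetical stand-ins, and std::mutex and std::function replace Chromium's base::Lock and base::OnceCallback. The point it illustrates is the ordering: the "initialized" flag is flipped under the lock only after setup has finished, so a callback can never observe a half-initialized platform.

```cpp
#include <cassert>
#include <functional>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical stand-in for ui::OzonePlatform; not the real Chromium class.
class FakePlatform {
 public:
  using StartupCallback = std::function<void(FakePlatform*)>;

  // May be called from any thread. Runs |callback| right away if UI
  // initialization has finished, otherwise queues it for later.
  void RegisterStartupCallback(StartupCallback callback) {
    {
      std::lock_guard<std::mutex> guard(lock_);
      if (!initialized_ui_) {
        pending_.push_back(std::move(callback));
        return;
      }
    }
    // The flag was observed under the lock, so setup is already complete.
    callback(this);
  }

  // Called once, on the thread that owns UI initialization.
  void InitializeForUI() {
    // Do the real setup first, outside the lock, so that other threads
    // are not blocked for the whole of initialization.
    gpu_support_host_ = &host_storage_;

    std::vector<StartupCallback> to_run;
    {
      std::lock_guard<std::mutex> guard(lock_);
      initialized_ui_ = true;  // Published only after setup is complete.
      to_run.swap(pending_);
    }
    // Simplification: queued callbacks run on the initializing thread.
    for (auto& callback : to_run)
      callback(this);
  }

  // Null until InitializeForUI() has completed.
  const int* gpu_support_host() const { return gpu_support_host_; }

 private:
  std::mutex lock_;
  bool initialized_ui_ = false;            // Guarded by lock_.
  std::vector<StartupCallback> pending_;   // Guarded by lock_.
  int host_storage_ = 42;                  // Placeholder for real state.
  const int* gpu_support_host_ = nullptr;
};
```

The buggy ordering would correspond to setting initialized_ui_ before the assignment to gpu_support_host_, in which case a callback registered from another thread in that window could see a null pointer, which is roughly what the stack earlier in this thread shows.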
,
Apr 5 2018
Detected 3 new flakes for test/step "BackgroundTest.OptionChildIndexCount". To see the actual flakes, please visit https://chromium-try-flakes.appspot.com/all_flake_occurrences?key=ahVzfmNocm9taXVtLXRyeS1mbGFrZXNyLwsSBUZsYWtlIiRCYWNrZ3JvdW5kVGVzdC5PcHRpb25DaGlsZEluZGV4Q291bnQM. This message was posted automatically by the chromium-try-flakes app. Since flakiness is ongoing, the issue was moved back into Sheriff Bug Queue (unless already there).
,
Apr 5 2018
Issue 829489 has been merged into this issue.
,
Apr 5 2018
Issue 829502 has been merged into this issue.
,
Apr 5 2018
Issue 829515 has been merged into this issue.
,
Apr 5 2018
chromevox_tests seems pretty flaky since 3/29/2018 https://findit-for-me.appspot.com/waterfall/list-flakes?step_name=chromevox_tests
,
Apr 6 2018
Looks like the vast majority of the flake is fixed? https://test-results.appspot.com/dashboards/flakiness_dashboard.html#testType=chromevox_tests
Comment 1 by zmin@chromium.org, Apr 3 2018
Owner: dmazz...@chromium.org
Status: Assigned (was: Untriaged)