GPU tests failing randomly on Mac Retinas with AMD GPU
Issue description

In this tryjob run:
https://build.chromium.org/p/tryserver.chromium.mac/builders/mac_optional_gpu_tests_rel/builds/1094/steps/webgl2_conformance_tests%20on%20ATI%20GPU%20on%20Mac%20Retina%20%28with%20patch%29%20on%20Mac-10.10/logs/stdio

It looks like the disk is full on one of the new MacBook Pro Retinas with AMD GPU. Troopers, can you please confirm?
Jun 1 2016
fyi the disk is not full on these:

chrome-bot@build53-b1:(Mac 10.10.5):~$ df -h
Filesystem     Size   Used  Avail Capacity    iused    ifree %iused  Mounted on
/dev/disk0s2  465Gi   90Gi  375Gi    20%   23676122 98252107    19%  /

chrome-bot@build486-m4:(Mac 10.10.5):~$ df -h
Filesystem     Size   Used  Avail Capacity    iused    ifree %iused  Mounted on
/dev/disk1    465Gi   91Gi  373Gi    20%   23982741 97856873    20%  /
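For anyone who wants to script the same check rather than run df by hand, here is a minimal sketch using Python's standard library. The function name disk_usage_report and the 5% threshold are assumptions for illustration only; they are not part of the bots' tooling.

```python
import shutil


def disk_usage_report(path="/", min_free_fraction=0.05):
    """Return (free_fraction, is_low) for the filesystem containing `path`.

    `min_free_fraction` is an arbitrary illustrative threshold, not a value
    used by the Chromium infrastructure.
    """
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    free_fraction = usage.free / usage.total
    return free_fraction, free_fraction < min_free_fraction


if __name__ == "__main__":
    free_fraction, is_low = disk_usage_report("/")
    print(f"free: {free_fraction:.1%}, low: {is_low}")
```

On the two machines above this would report roughly 80% free, consistent with the df output.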
Jun 1 2016
Hmm. Thanks for checking, Bryce. The warning about important_file_writer.cc must be spurious. It's probably because the isolate's file system is read-only. I'll continue digging and remove the Troopers and Infra labels in the next edit.
Jun 1 2016
Jun 1 2016
The first test failure in
https://build.chromium.org/p/tryserver.chromium.mac/builders/mac_optional_gpu_tests_rel/builds/1089/steps/webgl2_conformance_tests%20on%20ATI%20GPU%20on%20Mac%20Retina%20%28with%20patch%29%20on%20Mac-10.10/logs/stdio
is as follows:

[ RUN ] WebglConformance.conformance2_textures_image_bitmap_from_video_tex_3d_r11f_g11f_b10f_rgb_unsigned_int_10f_11f_11f_rev
(INFO) 2016-06-01 13:56:52,169 cache_temperature.EnsurePageCacheTemperature:55 PageCacheTemperature: any
[9138:1299:0601/135652:WARNING:webmediaplayer_impl.cc(345)] Using MultibufferDataSource
Traceback (most recent call last):
  File "/b/swarm_slave/work/isolated/run3kxECH/third_party/catapult/telemetry/telemetry/internal/story_runner.py", line 84, in _RunStoryAndProcessErrorIfNeeded
    state.RunStory(results)
  File "/b/swarm_slave/work/isolated/run3kxECH/content/test/gpu/gpu_tests/gpu_test_base.py", line 122, in RunStory
    RunStoryWithRetries(DesktopGpuSharedPageState, self, results)
  File "/b/swarm_slave/work/isolated/run3kxECH/content/test/gpu/gpu_tests/gpu_test_base.py", line 72, in RunStoryWithRetries
    super(cls, shared_page_state).RunStory(results)
  File "/b/swarm_slave/work/isolated/run3kxECH/third_party/catapult/telemetry/telemetry/page/shared_page_state.py", line 304, in RunStory
    self._current_page.Run(self)
  File "/b/swarm_slave/work/isolated/run3kxECH/third_party/catapult/telemetry/telemetry/page/__init__.py", line 95, in Run
    shared_state.page_test.RunNavigateSteps(self, current_tab)
  File "/b/swarm_slave/work/isolated/run3kxECH/third_party/catapult/telemetry/telemetry/page/legacy_page_test.py", line 191, in RunNavigateSteps
    page.RunNavigateSteps(action_runner)
  File "/b/swarm_slave/work/isolated/run3kxECH/content/test/gpu/gpu_tests/webgl_conformance.py", line 192, in RunNavigateSteps
    'webglTestHarness._finished', timeout_in_seconds=300)
  File "/b/swarm_slave/work/isolated/run3kxECH/third_party/catapult/telemetry/telemetry/internal/actions/action_runner.py", line 186, in WaitForJavaScriptCondition
    self._tab.WaitForJavaScriptExpression(condition, timeout_in_seconds)
  File "/b/swarm_slave/work/isolated/run3kxECH/third_party/catapult/telemetry/telemetry/internal/browser/web_contents.py", line 136, in WaitForJavaScriptExpression
    e.message + '\n' + debug_message)
TimeoutException: Timed out while waiting 300s for IsJavaScriptExpressionTrue.
Console output:
[ FAILED ] WebglConformance.conformance2_textures_image_bitmap_from_video_tex_3d_r11f_g11f_b10f_rgb_unsigned_int_10f_11f_11f_rev (313767 ms)

All the subsequent tests failed, until the tryjob failed after an hour. Mo, your laptop is the same model as these -- are you seeing this kind of failure?
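For readers unfamiliar with this kind of failure, the traceback ends in a wait-for-condition call that gave up after 300 seconds. The following is a generic sketch of a polling wait with a deadline, written only to illustrate the pattern. It is not Telemetry's actual implementation; wait_for, its parameters, and this local TimeoutException class are hypothetical names.

```python
import time


class TimeoutException(Exception):
    """Raised when the condition does not become true before the deadline."""


def wait_for(condition, timeout_in_seconds, poll_interval=0.1,
             clock=time.monotonic, sleep=time.sleep):
    """Poll `condition()` until it returns a truthy value or time runs out.

    `clock` and `sleep` are injectable so the loop can be exercised in tests
    without real waiting. This is an illustrative sketch, not Telemetry code.
    """
    deadline = clock() + timeout_in_seconds
    while True:
        if condition():
            return
        if clock() >= deadline:
            raise TimeoutException(
                "Timed out while waiting %ss" % timeout_in_seconds)
        sleep(poll_interval)
```

If the page under test never sets its "finished" flag (for example because the browser or GPU process is hung), a loop of this shape runs until the deadline and then raises, which matches the roughly 313-second duration reported for the failed test.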
Jun 2 2016
Jun 2 2016
It looks like many tests are failing randomly on these machines now. I don't know whether there's a pattern to the failures -- i.e., whether they're happening on specific machines. The symptom seems to be that the browser hangs during launch. This is really serious. Browser hangs on startup are being seen on other platforms too; see Issue 615044. I have a feeling they're all related, so I am blocking this on the other bug. The Linux Intel bots on the chromium.gpu.fyi waterfall seem to fail one test on nearly every run with this symptom, so I think taking one of them offline and debugging directly on it is the best way to proceed.
Jun 2 2016
Unrestricting access.
Jun 2 2016
Jun 2 2016
I suspect the root cause is screenshot capture: Issue 614394. Issue 615044 and older reported hangs on browser start are likely unrelated.
Jun 7 2016
Duplicating all screenshot-related timeouts on the AMD-based Retina MacBook Pros into Issue 599776. The root cause is known and a fix / workaround is underway.
Comment 1 by kbr@chromium.org, Jun 1 2016
Components: Infra>Platform>Swarming
Summary: Disk apparently full on build53-b1 and build486-m4 (was: Disk apparently full on build53-b1)