[kevin/bob] graphics_PerfControl regression [irq/66-chromeos] |
|||||||||
Issue descriptionLooks like a kernel change/test that ran way before us causing a CPU to spin? https://stainless.corp.google.com/search?exclude_retried=true&first_date=2018-05-08&master_builder_name=&builder_name_number=&shard=&exclude_acts=true&builder_name=&master_builder_name_number=&owner=&retry=&exclude_cts=true&exclude_non_production=false&hostname=&board=&test=%5Egraphics%5C_PerfControl%24&exclude_not_run=false&build=%5ER68%5C-10674%5C.0%5C.0%24&status=FAIL&status=ERROR&status=ABORT&reason=&waterfall=&suite=&last_date=2018-05-14&exclude_non_release=true&exclude_au=true&model=&view=list&board_model=%5Ebob%5C%2Fbob%24 https://pantheon.corp.google.com/storage/browser/chromeos-autotest-results/199581220-chromeos-test/chromeos2-row6-rack3-host1/graphics_PerfControl/debug/ 05/12 23:46:14.819 INFO | utils:1612| Starting to wait up to 60.0s for idle CPU... 05/12 23:46:15.822 INFO | utils:1624| After waiting 1.0s CPU utilization is 0.173. 05/12 23:46:17.825 INFO | utils:1624| After waiting 3.0s CPU utilization is 0.172. 05/12 23:46:21.830 INFO | utils:1624| After waiting 7.0s CPU utilization is 0.179. 05/12 23:46:29.840 INFO | utils:1624| After waiting 15.0s CPU utilization is 0.179. 05/12 23:46:45.857 INFO | utils:1624| After waiting 31.0s CPU utilization is 0.175. 05/12 23:47:01.874 INFO | utils:1624| After waiting 47.0s CPU utilization is 0.176. 05/12 23:47:17.892 INFO | utils:1624| After waiting 63.0s CPU utilization is 0.175. 05/12 23:47:17.892 WARNI| utils:1626| CPU did not become idle. 05/12 23:47:17.901 INFO | utils:1644| Logging current process activity using top and ps. 05/12 23:47:17.901 DEBUG| utils:0214| Running 'top -b -n1 -c' 05/12 23:47:18.121 INFO | utils:1647| * Command: top -b -n1 -c Exit status: 0 Duration: 0.210731983185 stdout: top - 23:47:18 up 11 min, 0 users, load average: 2.26, 3.69, 2.23 Tasks: 241 total, 2 running, 238 sleeping, 0 stopped, 1 zombie %Cpu(s): 11.6 us, 24.3 sy, 0.6 ni, 63.2 id, 0.3 wa, 0.0 hi, 0.1 si, 0.0 st KiB Mem : 3904868 total, 1991572 free, 234884 used, 1678412 buff/cache KiB Swap: 3813344 total, 3813344 free, 0 used. 3470228 avail Mem PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 120 root -51 0 0 0 0 R 100.0 0.0 11:40.84 [irq/66-ch+ 119 root 20 0 0 0 0 S 41.2 0.0 0:59.54 [spi5] 17254 root 20 0 2616 1380 1204 R 17.6 0.0 0:00.05 top -b -n1+ 1 root 20 0 3404 2548 1580 S 0.0 0.1 0:03.13 /sbin/init [...] 1 0 120 2 -51 - 0 0 - R ? 11:40 [irq/66-chromeos] -- Also need to import error at the right place? File "/usr/local/autotest/tests/graphics_PerfControl/graphics_PerfControl.py", line 40, in run_once raise error.TestFailure('Failed: %s' % pc.get_error_reason()) AttributeError: 'module' object has no attribute 'TestFailure'
,
May 14 2018
66: 15747595 0 0 0 0 0 rockchip_gpio_irq 1 Level chromeos-ec -> to Puneet for ec
,
May 14 2018
,
May 14 2018
note that this is still happening on chromeos2-row6-rack3-host1.cros
,
May 14 2018
+Ilja, maybe we want to also remove that device from our pools until the irq storm is fixed
,
May 14 2018
Same behavior is observed in other machine as well. chromeos6-row2-rack24-host*.cros etc. PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 124 root -51 0 0 0 0 R 98.4 0.0 7:34.27 irq/66-chromeos
,
May 14 2018
+sadolfsson Looking at the CL landed in 10674.0.0 This looks fishing to me as it touches cros_ec.c https://chromium-review.googlesource.com/c/chromiumos/third_party/kernel/+/1051085
,
May 15 2018
The following revision refers to this bug: https://chromium.googlesource.com/chromiumos/third_party/autotest/+/d335682a57150067457d848a0f5e9b344720929b commit d335682a57150067457d848a0f5e9b344720929b Author: Po-Hsien Wang <pwang@chromium.org> Date: Tue May 15 04:29:52 2018 graphics_PerfControl: fix TestFailure TestFailure is not used in autotest anymore. Use TestFail instead. BUG= chromium:842873 TEST=Check error.py Change-Id: Ic888833e0721a65a464b0e69adda2ece0255f402 Reviewed-on: https://chromium-review.googlesource.com/1058142 Commit-Ready: Pohsien Wang <pwang@chromium.org> Tested-by: Pohsien Wang <pwang@chromium.org> Reviewed-by: Ilja H. Friedel <ihf@chromium.org> [modify] https://crrev.com/d335682a57150067457d848a0f5e9b344720929b/client/site_tests/graphics_PerfControl/graphics_PerfControl.py
,
May 15 2018
,
May 15 2018
Looking at stainless history https://stainless.corp.google.com/search?view=matrix&row=board_model&col=build&first_date=2018-05-09&last_date=2018-05-15&test=%5Egraphics%5C_PerfControl%24&board=kevin%7Cbob%7Cscarlet&exclude_cts=true&exclude_not_run=false&exclude_non_release=true&exclude_au=true&exclude_acts=true&exclude_retried=true&exclude_non_production=false it is clear that the regression started https://crosland.corp.google.com/log/10673.0.0..10674.0.0 There are only two suspicious changes. I will revert them.
,
May 15 2018
The following revision refers to this bug: https://chromium.googlesource.com/chromiumos/platform/ec/+/f139d3a0ca9215b5b5bb2abc1f120ff6171036c9 commit f139d3a0ca9215b5b5bb2abc1f120ff6171036c9 Author: Ilja H. Friedel <ihf@chromium.org> Date: Tue May 15 17:00:53 2018 Revert "npcx: CEC: Send CEC message in mkbp event" This reverts commit 74b5a2ccb58739d4e21fdeb36e40fe01c0ca7ede. Suspected to have broken perf tests by keeping a CPU busy on kevin/bob. BUG= chromium:842873 , b:76467407 Change-Id: Iebbbb4623116840b851656e3ec28e75dc99cff79 Reviewed-on: https://chromium-review.googlesource.com/1060073 Reviewed-by: Ilja H. Friedel <ihf@chromium.org> Tested-by: Ilja H. Friedel <ihf@chromium.org> [modify] https://crrev.com/f139d3a0ca9215b5b5bb2abc1f120ff6171036c9/include/ec_commands.h [modify] https://crrev.com/f139d3a0ca9215b5b5bb2abc1f120ff6171036c9/common/mkbp_event.c [modify] https://crrev.com/f139d3a0ca9215b5b5bb2abc1f120ff6171036c9/util/ectool.c [modify] https://crrev.com/f139d3a0ca9215b5b5bb2abc1f120ff6171036c9/chip/npcx/cec.c
,
May 15 2018
The following revision refers to this bug: https://chromium.googlesource.com/chromiumos/third_party/kernel/+/7f22e51e592d4cac1ebc832688148910cc1965c9 commit 7f22e51e592d4cac1ebc832688148910cc1965c9 Author: Ilja H. Friedel <ihf@chromium.org> Date: Tue May 15 17:01:22 2018 Revert "CHROMIUM: mkbp: Increase maximum mkbp event size" This reverts commit 8e3c039f25b874f5c22dbf1109f9ee97cd9ec9f0. Suspected to have broken perf tests by keeping a CPU busy on kevin/bob. BUG= chromium:842873 , b:76467407 Change-Id: I4a452f1ab472cbdcb4b70cc24526bc8ed051b335 Reviewed-on: https://chromium-review.googlesource.com/1059947 Reviewed-by: Ilja H. Friedel <ihf@chromium.org> Tested-by: Ilja H. Friedel <ihf@chromium.org> [modify] https://crrev.com/7f22e51e592d4cac1ebc832688148910cc1965c9/include/linux/mfd/cros_ec.h [modify] https://crrev.com/7f22e51e592d4cac1ebc832688148910cc1965c9/drivers/mfd/cros_ec.c [modify] https://crrev.com/7f22e51e592d4cac1ebc832688148910cc1965c9/include/linux/mfd/cros_ec_commands.h
,
May 15 2018
Just to clarify, the condition is not caused by graphics_PerfControl, but detected by it. It is caused by a test that runs earlier. As an example here https://pantheon.corp.google.com/storage/browser/chromeos-autotest-results/200307442-chromeos-test/chromeos6-row1-rack24-host9/graphics_PerfControl/debug/ one CPU got burned for 7:27.51 minutes at 09:19:55.851. There were three tests run on the DUT since boot. I guess the first login into Chrome caused the problem (or maybe that is even not needed). 2018-05-15T16:12:21.688965+00:00 INFO kernel: [ 0.000000] Booting Linux on physical CPU 0x0 2018-05-15T16:12:43.475008+00:00 NOTICE root[3508]: autotest runtest desktopui_SimpleLogin 2018-05-15T16:14:31.313580+00:00 NOTICE root[5945]: autotest runtest desktopui_ChromeSanity 2018-05-15T16:16:26.383237+00:00 NOTICE root[8666]: autotest runtest logging_FeedbackReport 2018-05-15T16:18:50.805711+00:00 NOTICE root[12756]: autotest starting iteration /usr/local/autotest/results/default/graphics_PerfControl/sysinfo/iteration.1 on kevin-arcnext_2.0GHz_4GB 05/15 09:19:55.842 WARNI| utils:1626| CPU did not become idle. 05/15 09:19:55.851 INFO | utils:1644| Logging current process activity using top and ps. 05/15 09:19:55.851 DEBUG| utils:0214| Running 'top -b -n1 -c' 05/15 09:19:56.073 INFO | utils:1647| * Command: top -b -n1 -c Exit status: 0 Duration: 0.212687969208 stdout: top - 09:19:56 up 7 min, 0 users, load average: 2.03, 2.58, 1.37 Tasks: 233 total, 2 running, 230 sleeping, 0 stopped, 1 zombie %Cpu(s): 11.3 us, 24.3 sy, 0.0 ni, 64.0 id, 0.4 wa, 0.0 hi, 0.1 si, 0.0 st KiB Mem : 3904868 total, 2773236 free, 218716 used, 912916 buff/cache KiB Swap: 3813344 total, 3813344 free, 0 used. 3488828 avail Mem PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 121 root -51 0 0 0 0 R 94.4 0.0 7:27.51 [irq/66-ch+
,
May 15 2018
ihf, is the reverted patch still the suspect?
,
May 15 2018
(the one I authored)
,
May 15 2018
Add a comment and make changes below ⚐ Flag issue as spam [kevin/bob] graphics_PerfControl regression [irq/66-chromeos] Project Member Reported by ihf@chromium.org, Today (23 hours ago) Edit description Back to list Issue description Looks like a kernel change/test that ran way before us causing a CPU to spin? https://stainless.corp.google.com/search?exclude_retried=true&first_date=2018-05-08&master_builder_name=&builder_name_number=&shard=&exclude_acts=true&builder_name=&master_builder_name_number=&owner=&retry=&exclude_cts=true&exclude_non_production=false&hostname=&board=&test=%5Egraphics%5C_PerfControl%24&exclude_not_run=false&build=%5ER68%5C-10674%5C.0%5C.0%24&status=FAIL&status=ERROR&status=ABORT&reason=&waterfall=&suite=&last_date=2018-05-14&exclude_non_release=true&exclude_au=true&model=&view=list&board_model=%5Ebob%5C%2Fbob%24 https://pantheon.corp.google.com/storage/browser/chromeos-autotest-results/199581220-chromeos-test/chromeos2-row6-rack3-host1/graphics_PerfControl/debug/ 05/12 23:46:14.819 INFO | utils:1612| Starting to wait up to 60.0s for idle CPU... 05/12 23:46:15.822 INFO | utils:1624| After waiting 1.0s CPU utilization is 0.173. 05/12 23:46:17.825 INFO | utils:1624| After waiting 3.0s CPU utilization is 0.172. 05/12 23:46:21.830 INFO | utils:1624| After waiting 7.0s CPU utilization is 0.179. 05/12 23:46:29.840 INFO | utils:1624| After waiting 15.0s CPU utilization is 0.179. 05/12 23:46:45.857 INFO | utils:1624| After waiting 31.0s CPU utilization is 0.175. 05/12 23:47:01.874 INFO | utils:1624| After waiting 47.0s CPU utilization is 0.176. 05/12 23:47:17.892 INFO | utils:1624| After waiting 63.0s CPU utilization is 0.175. 05/12 23:47:17.892 WARNI| utils:1626| CPU did not become idle. 05/12 23:47:17.901 INFO | utils:1644| Logging current process activity using top and ps. 05/12 23:47:17.901 DEBUG| utils:0214| Running 'top -b -n1 -c' 05/12 23:47:18.121 INFO | utils:1647| * Command: top -b -n1 -c Exit status: 0 Duration: 0.210731983185 stdout: top - 23:47:18 up 11 min, 0 users, load average: 2.26, 3.69, 2.23 Tasks: 241 total, 2 running, 238 sleeping, 0 stopped, 1 zombie %Cpu(s): 11.6 us, 24.3 sy, 0.6 ni, 63.2 id, 0.3 wa, 0.0 hi, 0.1 si, 0.0 st KiB Mem : 3904868 total, 1991572 free, 234884 used, 1678412 buff/cache KiB Swap: 3813344 total, 3813344 free, 0 used. 3470228 avail Mem PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 120 root -51 0 0 0 0 R 100.0 0.0 11:40.84 [irq/66-ch+ 119 root 20 0 0 0 0 S 41.2 0.0 0:59.54 [spi5] 17254 root 20 0 2616 1380 1204 R 17.6 0.0 0:00.05 top -b -n1+ 1 root 20 0 3404 2548 1580 S 0.0 0.1 0:03.13 /sbin/init [...] 1 0 120 2 -51 - 0 0 - R ? 11:40 [irq/66-chromeos] -- Also need to import error at the right place? File "/usr/local/autotest/tests/graphics_PerfControl/graphics_PerfControl.py", line 40, in run_once raise error.TestFailure('Failed: %s' % pc.get_error_reason()) AttributeError: 'module' object has no attribute 'TestFailure' Comment 1 by marcheu@chromium.org, Today (23 hours ago) ⚐ can you look at /proc/interrupts to see what is irq 66? Comment 2 by marcheu@chromium.org, Today (23 hours ago) ⚐ Owner: puneetster@chromium.org 66: 15747595 0 0 0 0 0 rockchip_gpio_irq 1 Level chromeos-ec -> to Puneet for ec Comment 3 by pwang@chromium.org, Today (23 hours ago) ⚐ Cc: pwang@chromium.org Comment 4 by marcheu@chromium.org, Today (22 hours ago) ⚐ note that this is still happening on chromeos2-row6-rack3-host1.cros Comment 5 by marcheu@chromium.org, Today (22 hours ago) ⚐ Cc: ihf@chromium.org +Ilja, maybe we want to also remove that device from our pools until the irq storm is fixed Comment 6 by pwang@chromium.org, Today (22 hours ago) ⚐ Same behavior is observed in other machine as well. chromeos6-row2-rack24-host*.cros etc. PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 124 root -51 0 0 0 0 R 98.4 0.0 7:34.27 irq/66-chromeos Comment 7 by pwang@chromium.org, Today (22 hours ago) ⚐ Cc: sadolfsson@chromium.org +sadolfsson Looking at the CL landed in 10674.0.0 This looks fishing to me as it touches cros_ec.c https://chromium-review.googlesource.com/c/chromiumos/third_party/kernel/+/1051085 Project Member Comment 8 by bugdroid1@chromium.org, Today (15 hours ago) ⚐ The following revision refers to this bug: https://chromium.googlesource.com/chromiumos/third_party/autotest/+/d335682a57150067457d848a0f5e9b344720929b commit d335682a57150067457d848a0f5e9b344720929b Author: Po-Hsien Wang <pwang@chromium.org> Date: Tue May 15 04:29:52 2018 graphics_PerfControl: fix TestFailure TestFailure is not used in autotest anymore. Use TestFail instead. BUG= chromium:842873 TEST=Check error.py Change-Id: Ic888833e0721a65a464b0e69adda2ece0255f402 Reviewed-on: https://chromium-review.googlesource.com/1058142 Commit-Ready: Pohsien Wang <pwang@chromium.org> Tested-by: Pohsien Wang <pwang@chromium.org> Reviewed-by: Ilja H. Friedel <ihf@chromium.org> [modify] https://crrev.com/d335682a57150067457d848a0f5e9b344720929b/client/site_tests/graphics_PerfControl/graphics_PerfControl.py Comment 9 by deanliao@google.com, Today (12 hours ago) ⚐ Blocking: 843037 Comment 10 by ihf@chromium.org, Today (2 hours ago) ⚐ Looking at stainless history https://stainless.corp.google.com/search?view=matrix&row=board_model&col=build&first_date=2018-05-09&last_date=2018-05-15&test=%5Egraphics%5C_PerfControl%24&board=kevin%7Cbob%7Cscarlet&exclude_cts=true&exclude_not_run=false&exclude_non_release=true&exclude_au=true&exclude_acts=true&exclude_retried=true&exclude_non_production=false it is clear that the regression started https://crosland.corp.google.com/log/10673.0.0..10674.0.0 There are only two suspicious changes. I will revert them. Project Member Comment 11 by bugdroid1@chromium.org, Today (2 hours ago) ⚐ The following revision refers to this bug: https://chromium.googlesource.com/chromiumos/platform/ec/+/f139d3a0ca9215b5b5bb2abc1f120ff6171036c9 commit f139d3a0ca9215b5b5bb2abc1f120ff6171036c9 Author: Ilja H. Friedel <ihf@chromium.org> Date: Tue May 15 17:00:53 2018 Revert "npcx: CEC: Send CEC message in mkbp event" This reverts commit 74b5a2ccb58739d4e21fdeb36e40fe01c0ca7ede. Suspected to have broken perf tests by keeping a CPU busy on kevin/bob. BUG= chromium:842873 , b:76467407 Change-Id: Iebbbb4623116840b851656e3ec28e75dc99cff79 Reviewed-on: https://chromium-review.googlesource.com/1060073 Reviewed-by: Ilja H. Friedel <ihf@chromium.org> Tested-by: Ilja H. Friedel <ihf@chromium.org> [modify] https://crrev.com/f139d3a0ca9215b5b5bb2abc1f120ff6171036c9/include/ec_commands.h [modify] https://crrev.com/f139d3a0ca9215b5b5bb2abc1f120ff6171036c9/common/mkbp_event.c [modify] https://crrev.com/f139d3a0ca9215b5b5bb2abc1f120ff6171036c9/util/ectool.c [modify] https://crrev.com/f139d3a0ca9215b5b5bb2abc1f120ff6171036c9/chip/npcx/cec.c Project Member Comment 12 by bugdroid1@chromium.org, Today (2 hours ago) ⚐ Labels: merge-merged-chromeos-4.4 The following revision refers to this bug: https://chromium.googlesource.com/chromiumos/third_party/kernel/+/7f22e51e592d4cac1ebc832688148910cc1965c9 commit 7f22e51e592d4cac1ebc832688148910cc1965c9 Author: Ilja H. Friedel <ihf@chromium.org> Date: Tue May 15 17:01:22 2018 Revert "CHROMIUM: mkbp: Increase maximum mkbp event size" This reverts commit 8e3c039f25b874f5c22dbf1109f9ee97cd9ec9f0. Suspected to have broken perf tests by keeping a CPU busy on kevin/bob. BUG= chromium:842873 , b:76467407 Change-Id: I4a452f1ab472cbdcb4b70cc24526bc8ed051b335 Reviewed-on: https://chromium-review.googlesource.com/1059947 Reviewed-by: Ilja H. Friedel <ihf@chromium.org> Tested-by: Ilja H. Friedel <ihf@chromium.org> [modify] https://crrev.com/7f22e51e592d4cac1ebc832688148910cc1965c9/include/linux/mfd/cros_ec.h [modify] https://crrev.com/7f22e51e592d4cac1ebc832688148910cc1965c9/drivers/mfd/cros_ec.c [modify] https://crrev.com/7f22e51e592d4cac1ebc832688148910cc1965c9/include/linux/mfd/cros_ec_commands.h Comment 13 by ihf@chromium.org, Today (2 hours ago) ⚐ Just to clarify, the condition is not caused by graphics_PerfControl, but detected by it. It is caused by a test that runs earlier. As an example here https://pantheon.corp.google.com/storage/browser/chromeos-autotest-results/200307442-chromeos-test/chromeos6-row1-rack24-host9/graphics_PerfControl/debug/ one CPU got burned for 7:27.51 minutes at 09:19:55.851. There were three tests run on the DUT since boot. I guess the first login into Chrome caused the problem (or maybe that is even not needed). 2018-05-15T16:12:21.688965+00:00 INFO kernel: [ 0.000000] Booting Linux on physical CPU 0x0 2018-05-15T16:12:43.475008+00:00 NOTICE root[3508]: autotest runtest desktopui_SimpleLogin 2018-05-15T16:14:31.313580+00:00 NOTICE root[5945]: autotest runtest desktopui_ChromeSanity 2018-05-15T16:16:26.383237+00:00 NOTICE root[8666]: autotest runtest logging_FeedbackReport 2018-05-15T16:18:50.805711+00:00 NOTICE root[12756]: autotest starting iteration /usr/local/autotest/results/default/graphics_PerfControl/sysinfo/iteration.1 on kevin-arcnext_2.0GHz_4GB 05/15 09:19:55.842 WARNI| utils:1626| CPU did not become idle. 05/15 09:19:55.851 INFO | utils:1644| Logging current process activity using top and ps. 05/15 09:19:55.851 DEBUG| utils:0214| Running 'top -b -n1 -c' 05/15 09:19:56.073 INFO | utils:1647| * Command: top -b -n1 -c Exit status: 0 Duration: 0.212687969208 stdout: top - 09:19:56 up 7 min, 0 users, load average: 2.03, 2.58, 1.37 Tasks: 233 total, 2 running, 230 sleeping, 0 stopped, 1 zombie %Cpu(s): 11.3 us, 24.3 sy, 0.0 ni, 64.0 id, 0.4 wa, 0.0 hi, 0.1 si, 0.0 st KiB Mem : 3904868 total, 2773236 free, 218716 used, 912916 buff/cache KiB Swap: 3813344 total, 3813344 free, 0 used. 3488828 avail Mem PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 121 root -51 0 0 0 0 R 94.4 0.0 7:27.51 [irq/66-ch+ Comment 14 by sadolfsson@chromium.org, Today (38 minutes ago) ⚐ ihf, is the reverted patch still the suspect? Comment 15 by sadolfsson@chromium.org, Today (33 minutes ago) ⚐ (the one I authored) Comment 16 by vpalatin@google.com, Today (moments ago) Delete comment ⚐ as commented in the revert patch, the EC changes is very unlikely to cause this kind of troubles as we are never rolling out ToT firmware changes on test DUTs. So the firmware changes are effectivily no-op for the perf tests. There is a small ectool userspace diff but looking at it, it seems quite innocuous (as in, there is a high probability these code paths are not even executed). the kernel one might be the issue though (particularly as in this case it was running without the matching EC change, the fallback in that case might have been badly tested)
,
May 15 2018
I am having a hard time reading Vincent's comment. Stefan, we will know tomorrow. If the tests still fail with the revert I will reland the patches.
,
May 15 2018
He is saying "npcx: CEC: Send CEC message in mkbp event" which only touches EC-firmware should not even be in this test, but ""CHROMIUM: mkbp: Increase maximum mkbp event size" which is in the kernel does. Also there is some dependency between them, so having the second one without the first one has some special handling in the code.
,
May 16 2018
R68-10682.0.0 passed on kevin/bob. https://stainless.corp.google.com/search?view=matrix&row=model&col=build&first_date=2018-05-10&last_date=2018-05-16&test=graphics_PerfControl%7Cvideo_HangoutHardwarePerf&board=bob%7Ckevin&exclude_cts=false&exclude_not_run=false&exclude_non_release=true&exclude_au=true&exclude_acts=true&exclude_retried=true&exclude_non_production=false Checking some of the DUT in the lab. It looks like the situation is sane. $ ssh chromeos2-row6-rack3-host1.cros PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 27050 root 20 0 2448 1536 1396 R 61.9 0.0 0:06.13 tar $ ssh chromeos6-row2-rack24-host11.cros PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 17258 chronos 12 -8 574536 177852 83436 S 15.0 4.6 7:20.24 chrome I'll close this for now.
,
May 16 2018
FYI, my kernel commit was the problem. I have a fixed version out for review now with some comments in the commit message on what went wrong: https://chromium-review.googlesource.com/c/chromiumos/third_party/kernel/+/1061879
,
May 16 2018
The following revision refers to this bug: https://chromium.googlesource.com/chromiumos/platform/ec/+/86734119fcba3909f02ab294ae3b2b31540c17a7 commit 86734119fcba3909f02ab294ae3b2b31540c17a7 Author: Stefan Adolfsson <sadolfsson@chromium.org> Date: Wed May 16 23:43:03 2018 Reland "npcx: CEC: Send CEC message in mkbp event" This reverts commit f139d3a0ca9215b5b5bb2abc1f120ff6171036c9. Reason for revert: Verified that the problem is in the kernel, not EC. Original change's description: > Revert "npcx: CEC: Send CEC message in mkbp event" > > This reverts commit 74b5a2ccb58739d4e21fdeb36e40fe01c0ca7ede. > > Suspected to have broken perf tests by keeping a CPU busy on kevin/bob. > > BUG= chromium:842873 , b:76467407 > > Change-Id: Iebbbb4623116840b851656e3ec28e75dc99cff79 > Reviewed-on: https://chromium-review.googlesource.com/1060073 > Reviewed-by: Ilja H. Friedel <ihf@chromium.org> > Tested-by: Ilja H. Friedel <ihf@chromium.org> Bug: chromium:842873 , b:76467407 Change-Id: I7d8990b2b8901b7de08f190a993bec645bbdacd2 Reviewed-on: https://chromium-review.googlesource.com/1061854 Commit-Ready: Stefan Adolfsson <sadolfsson@chromium.org> Tested-by: Stefan Adolfsson <sadolfsson@chromium.org> Reviewed-by: Stefan Adolfsson <sadolfsson@chromium.org> Reviewed-by: Vincent Palatin <vpalatin@chromium.org> [modify] https://crrev.com/86734119fcba3909f02ab294ae3b2b31540c17a7/include/ec_commands.h [modify] https://crrev.com/86734119fcba3909f02ab294ae3b2b31540c17a7/common/mkbp_event.c [modify] https://crrev.com/86734119fcba3909f02ab294ae3b2b31540c17a7/util/ectool.c [modify] https://crrev.com/86734119fcba3909f02ab294ae3b2b31540c17a7/chip/npcx/cec.c
,
May 18 2018
The following revision refers to this bug: https://chromium.googlesource.com/chromiumos/third_party/kernel/+/ed5018b2daae9fabbf95d819b6fb14c968372dc8 commit ed5018b2daae9fabbf95d819b6fb14c968372dc8 Author: Stefan Adolfsson <sadolfsson@chromium.org> Date: Fri May 18 02:35:02 2018 CHROMIUM: mkbp: Increase maximum mkbp event size Having a 16 byte mkbp event size makes it possible to send CEC messages from the EC to the AP directly inside the mkbp event instead of first doing a notification and then a read. This version of the change includes a fix for a failed performance auto-test on bob/kevin. The problem was that the previous version did not properly fall back to v0 of the mkbp event since the return value that it looked on was that of the transfer, not that of the protocol error. Signed-off-by: Stefan Adolfsson <sadolfsson@chromium.org> BRANCH=none TEST=Verified that bob falls back to v0 version of the mkbp event and that Fizz works with both v0 and v1 (using some extra debug prints). Also made sure there is nothing strange when running "top" on bob, and that bobs keyboard works. BUG= chromium:842873 , b:76467407 Change-Id: I39e96cd17213f6b702707b3448859fafa682f146 Reviewed-on: https://chromium-review.googlesource.com/1061879 Commit-Ready: Stefan Adolfsson <sadolfsson@chromium.org> Tested-by: Stefan Adolfsson <sadolfsson@chromium.org> Reviewed-by: Stefan Adolfsson <sadolfsson@chromium.org> Reviewed-by: Vincent Palatin <vpalatin@chromium.org> [modify] https://crrev.com/ed5018b2daae9fabbf95d819b6fb14c968372dc8/include/linux/mfd/cros_ec.h [modify] https://crrev.com/ed5018b2daae9fabbf95d819b6fb14c968372dc8/drivers/mfd/cros_ec.c [modify] https://crrev.com/ed5018b2daae9fabbf95d819b6fb14c968372dc8/include/linux/mfd/cros_ec_commands.h
,
Jun 19 2018
The following revision refers to this bug: https://chromium.googlesource.com/chromiumos/third_party/kernel/+/a9d2c1832967c9f1864391bef407c328281e44ff commit a9d2c1832967c9f1864391bef407c328281e44ff Author: Stefan Adolfsson <sadolfsson@chromium.org> Date: Tue Jun 19 20:10:11 2018 CHROMIUM: mkbp: Increase maximum mkbp event size Having a 16 byte mkbp event size makes it possible to send CEC messages from the EC to the AP directly inside the mkbp event instead of first doing a notification and then a read. This version of the change includes a fix for a failed performance auto-test on bob/kevin. The problem was that the previous version did not properly fall back to v0 of the mkbp event since the return value that it looked on was that of the transfer, not that of the protocol error. Signed-off-by: Stefan Adolfsson <sadolfsson@chromium.org> BRANCH=none TEST=Verified that bob falls back to v0 version of the mkbp event and that Fizz works with both v0 and v1 (using some extra debug prints). Also made sure there is nothing strange when running "top" on bob, and that bobs keyboard works. BUG= chromium:842873 , b:76467407 Change-Id: I39e96cd17213f6b702707b3448859fafa682f146 Reviewed-on: https://chromium-review.googlesource.com/1061879 Commit-Ready: Stefan Adolfsson <sadolfsson@chromium.org> Tested-by: Stefan Adolfsson <sadolfsson@chromium.org> Reviewed-by: Stefan Adolfsson <sadolfsson@chromium.org> Reviewed-by: Vincent Palatin <vpalatin@chromium.org> (cherry picked from commit ed5018b2daae9fabbf95d819b6fb14c968372dc8) Reviewed-on: https://chromium-review.googlesource.com/1071670 Reviewed-by: Daisuke Nojiri <dnojiri@chromium.org> Commit-Queue: Stefan Adolfsson <sadolfsson@chromium.org> [modify] https://crrev.com/a9d2c1832967c9f1864391bef407c328281e44ff/include/linux/mfd/cros_ec.h [modify] https://crrev.com/a9d2c1832967c9f1864391bef407c328281e44ff/drivers/mfd/cros_ec.c [modify] https://crrev.com/a9d2c1832967c9f1864391bef407c328281e44ff/include/linux/mfd/cros_ec_commands.h |
|||||||||
►
Sign in to add a comment |
|||||||||
Comment 1 by marc...@chromium.org
, May 14 2018