High rate of RFMF_GET_COOKIES_BAD_ORIGIN renderer kills in the wild |
||||||||||||||||
Issue descriptionThe dashboard shows an elevated number of RFMF_GET_COOKIES_BAD_ORIGIN renderer kills, and also some presumably related RFMF_SET_COOKIE_BAD_ORIGIN kills. These are likely due to a logic bug. We should add temporary logging to determine the cause of these, and fix the underlying problem.
,
Apr 19 2016
,
May 3 2016
Re-opening this since these crashes are still happening, though at a much lower rate.
,
May 3 2016
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/0e6e9faae9e11de8684440b128e75569b98c7553 commit 0e6e9faae9e11de8684440b128e75569b98c7553 Author: nick <nick@chromium.org> Date: Tue May 03 23:34:14 2016 Add DumpWithoutCrashing and crash keys to get more context for RFMF_SET_COOKIE_BAD_ORIGIN and RFMF_GET_COOKIES_BAD_ORIGIN renderer kills. This is all temporary code, which will be reverted after we understand the issue. BUG= 600441 TEST=RenderFrameMessageFilterBrowserTest.CrossSiteCookieSecurityEnforcement CQ_INCLUDE_TRYBOTS=tryserver.chromium.linux:linux_site_isolation Review-Url: https://codereview.chromium.org/1855383002 Cr-Commit-Position: refs/heads/master@{#391396} [modify] https://crrev.com/0e6e9faae9e11de8684440b128e75569b98c7553/chrome/common/crash_keys.cc [modify] https://crrev.com/0e6e9faae9e11de8684440b128e75569b98c7553/content/browser/child_process_security_policy_impl.cc [modify] https://crrev.com/0e6e9faae9e11de8684440b128e75569b98c7553/content/browser/child_process_security_policy_impl.h [modify] https://crrev.com/0e6e9faae9e11de8684440b128e75569b98c7553/content/browser/frame_host/render_frame_message_filter.cc
,
May 4 2016
The data from the DumpWithoutCrashing cases has come back from canary, and the following seems true: - policy->CanAccessDataForOrigin() returns false not because of an origin lock failure, but because the SecurityStateMap::find operation returns false. In other words, there is no ChildProcessSecurityPolicy for this process at all. - Thus, these kills don't seem to require any active OOPIF mode. That makes this bug much more urgent than we had anticipated. Here is a plan of action: 1. As a short term fix, I will simply force CanAccessDataForOrigin to return true always. This should be safe enough to merge into all branches. 2. Then, I will try to understand what the possibilities are to produce this scenario, and work to obtain a repro. 3. In parallel with 2, we can look for patterns in the URLs we see in the crash keys. 4. We'll try to create a browsertest that reproduces this scenario.
,
May 4 2016
Current theory of this bug based on code inspection:
ChildProcessSecurityPolicyImpl::Remove() is called synchronously on ~RenderProcessHostImpl, but the IO thread MessageFilters may briefly outlive the RPH.
We could could try to repro this, potentially, by doing something like reading document.cookie in a loop, and closing the tab while that goes on.
We might consider doing a PostTask of the Remove operation to the IO thread, but this doesn't help message filters on non-{UI|IO} threads. There are some usages of ChildProcessSecurityPolicy from message filters running on other threads: for example, FileAPIMessageFilter dispatches FileSystemHostMsg_SyncGetPlatformPath to the blocking pool, though it currently doesn't trigger hard failures.
,
May 4 2016
Users experienced this crash on the following builds: Win Canary 52.0.2724.0 - 1.44 CPM, 19 reports, 19 clients (signature [Dump without crash] content::RenderFrameMessageFilter::OnGetCookies) Win Canary 52.0.2724.0 - 0.61 CPM, 8 reports, 6 clients (signature [Dump without crash] content::RenderFrameMessageFilter::OnSetCookie) Mac Canary 52.0.2724.0 - 2.15 CPM, 3 reports, 3 clients (signature [Dump without crash] content::RenderFrameMessageFilter::OnGetCookies) Mac Canary 52.0.2724.0 - 2.15 CPM, 3 reports, 2 clients (signature [Dump without crash] content::RenderFrameMessageFilter::OnSetCookie) If this update was incorrect, please add "Fracas-Wrong" label to prevent future updates. - Go/Fracas
,
May 4 2016
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/470457d18d306a0f32ea548ad111371c24e509e1 commit 470457d18d306a0f32ea548ad111371c24e509e1 Author: nick <nick@chromium.org> Date: Wed May 04 23:06:18 2016 ChildProcessSecurityPolicy::CanAccessDataForOrigin workaround to suppress bad kills crbug.com/600441 apparently involves a race where we check ChildProcessSecurityPolicy for a process_id that it doesn't know -- probably a renderer shutdown race. Returning true instead of false here is a temporary workaround to suppress these bad kills, which are affecting the non-oopif user population. This is a tolerable short-term behavior as far as security is concerned, since CanAccessDataForOrigin is only meant to offer meaningful protection in --site-per-process mode, which is not yet launched. BUG= 600441 Review-Url: https://codereview.chromium.org/1945173003 Cr-Commit-Position: refs/heads/master@{#391678} [modify] https://crrev.com/470457d18d306a0f32ea548ad111371c24e509e1/content/browser/child_process_security_policy_impl.cc [modify] https://crrev.com/470457d18d306a0f32ea548ad111371c24e509e1/content/browser/renderer_host/media/webrtc_identity_service_host_unittest.cc
,
May 5 2016
Data from crash/ confirm that the above band-aid fix ( https://codereview.chromium.org/1945173003 ) has resolved the problem in the latest canary. Given the low risk of the fix, I'd like to merge it to the M51 beta channel. It should improve our UMA renderer crashes by 1-2 CPM.
,
May 5 2016
Your change meets the bar and is auto-approved for M51 (branch: 2704)
,
May 5 2016
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/36602ec1b414e65ddd7c55d4a27f167174825d9e commit 36602ec1b414e65ddd7c55d4a27f167174825d9e Author: Nick Carter <nick@chromium.org> Date: Thu May 05 22:44:16 2016 ChildProcessSecurityPolicy::CanAccessDataForOrigin workaround to suppress bad kills crbug.com/600441 apparently involves a race where we check ChildProcessSecurityPolicy for a process_id that it doesn't know -- probably a renderer shutdown race. Returning true instead of false here is a temporary workaround to suppress these bad kills, which are affecting the non-oopif user population. This is a tolerable short-term behavior as far as security is concerned, since CanAccessDataForOrigin is only meant to offer meaningful protection in --site-per-process mode, which is not yet launched. BUG= 600441 Review-Url: https://codereview.chromium.org/1945173003 Cr-Commit-Position: refs/heads/master@{#391678} (cherry picked from commit 470457d18d306a0f32ea548ad111371c24e509e1) Review URL: https://codereview.chromium.org/1948223005 . Cr-Commit-Position: refs/branch-heads/2704@{#402} Cr-Branched-From: 6e53600def8f60d8c632fadc70d7c1939ccea347-refs/heads/master@{#386251} [modify] https://crrev.com/36602ec1b414e65ddd7c55d4a27f167174825d9e/content/browser/child_process_security_policy_impl.cc [modify] https://crrev.com/36602ec1b414e65ddd7c55d4a27f167174825d9e/content/browser/renderer_host/media/webrtc_identity_service_host_unittest.cc
,
May 6 2016
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/ab59274bcbf4cf2507e7ba5a017e9bb5ec39ca5b commit ab59274bcbf4cf2507e7ba5a017e9bb5ec39ca5b Author: nick <nick@chromium.org> Date: Fri May 06 17:34:28 2016 Revert "Add DumpWithoutCrashing and crash keys to get more context" This reverts commit 0e6e9faae9e11de8684440b128e75569b98c7553. TBR=kmarshall@chromium.org BUG= 600441 CQ_INCLUDE_TRYBOTS=tryserver.chromium.linux:linux_site_isolation Review-Url: https://codereview.chromium.org/1957553002 Cr-Commit-Position: refs/heads/master@{#392085} [modify] https://crrev.com/ab59274bcbf4cf2507e7ba5a017e9bb5ec39ca5b/blimp/engine/app/blimp_engine_crash_keys.cc [modify] https://crrev.com/ab59274bcbf4cf2507e7ba5a017e9bb5ec39ca5b/chrome/common/crash_keys.cc [modify] https://crrev.com/ab59274bcbf4cf2507e7ba5a017e9bb5ec39ca5b/content/browser/child_process_security_policy_impl.cc [modify] https://crrev.com/ab59274bcbf4cf2507e7ba5a017e9bb5ec39ca5b/content/browser/child_process_security_policy_impl.h [modify] https://crrev.com/ab59274bcbf4cf2507e7ba5a017e9bb5ec39ca5b/content/browser/frame_host/render_frame_message_filter.cc
,
Dec 6 2016
nick@, is the rate of kills still high? Do we need to do any more work on this issue?
,
Feb 28 2017
Issue 697023 has been merged into this issue.
,
Mar 1 2017
Users experienced this crash on the following builds: Mac Canary 58.0.3026.3 - 1.32 CPM, 2 reports, 1 clients (signature [Renderer kill 79] content::RenderFrameMessageFilter::GetCookies) If this update was incorrect, please add "Fracas-Wrong" label to prevent future updates. - Go/Fracas
,
Mar 31 2017
Users experienced this crash on the following builds: Mac Dev 59.0.3053.3 - 0.20 CPM, 1 reports, 1 clients (signature [Renderer kill 79] content::RenderFrameMessageFilter::GetCookies) If this update was incorrect, please add "Fracas-Wrong" label to prevent future updates. - Go/Fracas
,
May 21 2017
Users experienced this crash on the following builds: Mac Canary 60.0.3106.0 - 0.85 CPM, 1 reports, 1 clients (signature [Renderer kill 79] content::RenderFrameMessageFilter::GetCookies) If this update was incorrect, please add "Fracas-Wrong" label to prevent future updates. - Go/Fracas
,
Jun 3 2017
Users experienced this crash on the following builds: Mac Canary 61.0.3118.0 - 0.28 CPM, 1 reports, 1 clients (signature [Renderer kill 79] content::RenderFrameMessageFilter::GetCookies) If this update was incorrect, please add "Fracas-Wrong" label to prevent future updates. - Go/Fracas
,
Jun 6 2017
Users experienced this crash on the following builds: Linux Dev 60.0.3112.10 - 0.54 CPM, 2 reports, 1 clients (signature [Renderer kill 79] content::RenderFrameMessageFilter::GetCookies) If this update was incorrect, please add "Fracas-Wrong" label to prevent future updates. - Go/Fracas
,
Sep 22 2017
Users experienced this crash on the following builds: Win Canary 63.0.3221.0 - 0.10 CPM, 1 reports, 1 clients (signature [Renderer kill 79] content::RenderFrameMessageFilter::GetCookies) If this update was incorrect, please add "Fracas-Wrong" label to prevent future updates. - Go/Fracas
,
Oct 18 2017
Users experienced this crash on the following builds: Mac Canary 64.0.3242.0 - 0.95 CPM, 3 reports, 2 clients (signature [Renderer kill 79] content::RenderFrameMessageFilter::GetCookies) Linux Dev 63.0.3236.7 - 0.29 CPM, 1 reports, 1 clients (signature [Renderer kill 79] content::RenderFrameMessageFilter::GetCookies) If this update was incorrect, please add "Fracas-Wrong" label to prevent future updates. - Go/Fracas
,
Nov 18 2017
Interesting-- we were seeing a pretty steady stream of these renderer kills until 64.0.3256.0 (15 days ago). Alex, I wonder if one of your recent fixes for LockToOrigin might have put a stop to these? Crash link: https://crash.corp.google.com/browse?q=custom_data.ChromeCrashProto.magic_signature_1.name%3D%27%5BRenderer%20kill%2079%5D%20content%3A%3ARenderFrameMessageFilter%3A%3AGetCookies%27&sql_dialect=dremelsql&ignore_case=false&enable_rewrite=true&omit_field_name=&omit_field_value=&omit_field_opt=%3D#-property-selector,+productversion:1000
,
Jan 2 2018
I'm guessing that r513607, a fix for incorrect process reuse which went into 64.0.3257.0, helped with this. I see that this kill appeared in a few other canary versions after that though. The big spike in 65.0.3286.* is due to issue 792413. There are a few more versions with a single kill, though none after 65.0.3300.0 - those might be legitimate kills from our site isolation trials? I'll mark this as fixed for now, but let's reopen or file a new bug if this spikes up again.
,
Dec 14
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/ec08cda18f9773396a600b2f974a9242df8b3be5 commit ec08cda18f9773396a600b2f974a9242df8b3be5 Author: Aaron Colwell <acolwell@google.com> Date: Fri Dec 14 00:36:35 2018 Remove missing security state workaround in CanAccessDataForOrigin(). - Remove an old workaround that was put in place before site isolation was on by default. - Fixed a test that depended on this behavior to pass. Bug: 898281, 600441 Change-Id: I1f08a0d7af59514c84f7eebd5f48027758a0f63b Reviewed-on: https://chromium-review.googlesource.com/c/1374831 Commit-Queue: Aaron Colwell <acolwell@chromium.org> Reviewed-by: Charlie Reis <creis@chromium.org> Reviewed-by: Alex Moshchuk <alexmos@chromium.org> Reviewed-by: Marijn Kruisselbrink <mek@chromium.org> Cr-Commit-Position: refs/heads/master@{#616524} [modify] https://crrev.com/ec08cda18f9773396a600b2f974a9242df8b3be5/content/browser/child_process_security_policy_impl.cc [modify] https://crrev.com/ec08cda18f9773396a600b2f974a9242df8b3be5/content/browser/dom_storage/session_storage_context_mojo_unittest.cc
,
Dec 14
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/0435a1c98b06a29493ce5207d3bf144ae4def1f4 commit 0435a1c98b06a29493ce5207d3bf144ae4def1f4 Author: Alex Moshchuk <alexmos@chromium.org> Date: Fri Dec 14 22:02:59 2018 Revert "Remove missing security state workaround in CanAccessDataForOrigin()." This reverts commit ec08cda18f9773396a600b2f974a9242df8b3be5. Reason for revert: Caused renderer kills. See https://crbug.com/898281#c8. Original change's description: > Remove missing security state workaround in CanAccessDataForOrigin(). > > - Remove an old workaround that was put in place before site isolation > was on by default. > - Fixed a test that depended on this behavior to pass. > > Bug: 898281, 600441 > Change-Id: I1f08a0d7af59514c84f7eebd5f48027758a0f63b > Reviewed-on: https://chromium-review.googlesource.com/c/1374831 > Commit-Queue: Aaron Colwell <acolwell@chromium.org> > Reviewed-by: Charlie Reis <creis@chromium.org> > Reviewed-by: Alex Moshchuk <alexmos@chromium.org> > Reviewed-by: Marijn Kruisselbrink <mek@chromium.org> > Cr-Commit-Position: refs/heads/master@{#616524} TBR=creis@chromium.org,acolwell@chromium.org,mek@chromium.org,alexmos@chromium.org Change-Id: Ic67afdbd498f1484ca728431427909a049985550 No-Presubmit: true No-Tree-Checks: true No-Try: true Bug: 898281, 600441 Reviewed-on: https://chromium-review.googlesource.com/c/1379032 Reviewed-by: Alex Moshchuk <alexmos@chromium.org> Commit-Queue: Alex Moshchuk <alexmos@chromium.org> Cr-Commit-Position: refs/heads/master@{#616840} [modify] https://crrev.com/0435a1c98b06a29493ce5207d3bf144ae4def1f4/content/browser/child_process_security_policy_impl.cc [modify] https://crrev.com/0435a1c98b06a29493ce5207d3bf144ae4def1f4/content/browser/dom_storage/session_storage_context_mojo_unittest.cc
,
Dec 19
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/263ce44bb356617e098d2cbad147146e0a36fc64 commit 263ce44bb356617e098d2cbad147146e0a36fc64 Author: Aaron Colwell <acolwell@google.com> Date: Wed Dec 19 20:39:02 2018 Replace security state workaround in CanAccessDataForOrigin() - Replace workaround with code that is more strict about enforcing security policy during child process shutdown. The old code would always allow data access for IDs not in the security_state_ map. The new code adds a pending map so we can deal with UI/IO thread races during child process removal AND rejects any unknown IDs. - Fixed a test that depended on the old behavior where unknown IDs always allowed access. Bug: 898281, 600441 , 915203 Change-Id: I26ca1e48536672b05d2310d8a17be47d5b6ef5c7 Reviewed-on: https://chromium-review.googlesource.com/c/1382855 Commit-Queue: Aaron Colwell <acolwell@chromium.org> Reviewed-by: Tom Sepez <tsepez@chromium.org> Reviewed-by: Alex Moshchuk <alexmos@chromium.org> Cr-Commit-Position: refs/heads/master@{#617937} [modify] https://crrev.com/263ce44bb356617e098d2cbad147146e0a36fc64/content/browser/child_process_security_policy_impl.cc [modify] https://crrev.com/263ce44bb356617e098d2cbad147146e0a36fc64/content/browser/child_process_security_policy_impl.h [modify] https://crrev.com/263ce44bb356617e098d2cbad147146e0a36fc64/content/browser/child_process_security_policy_unittest.cc [modify] https://crrev.com/263ce44bb356617e098d2cbad147146e0a36fc64/content/browser/dom_storage/session_storage_context_mojo_unittest.cc [modify] https://crrev.com/263ce44bb356617e098d2cbad147146e0a36fc64/content/browser/dom_storage/test/mojo_test_with_file_service.h [modify] https://crrev.com/263ce44bb356617e098d2cbad147146e0a36fc64/content/browser/fileapi/browser_file_system_helper_unittest.cc [modify] https://crrev.com/263ce44bb356617e098d2cbad147146e0a36fc64/content/browser/network_service_client_unittest.cc
,
Dec 22
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/b06974abd43560ae11fe5a98fa63245d0fe3d5f8 commit b06974abd43560ae11fe5a98fa63245d0fe3d5f8 Author: Nasko Oskov <nasko@chromium.org> Date: Sat Dec 22 01:04:27 2018 Revert "Replace security state workaround in CanAccessDataForOrigin()" This reverts commit 263ce44bb356617e098d2cbad147146e0a36fc64. Reason for revert: There are still renderer process terminations due to this CL, see details in https://crbug.com/915203#c9. Original change's description: > Replace security state workaround in CanAccessDataForOrigin() > > - Replace workaround with code that is more strict about enforcing > security policy during child process shutdown. The old code would > always allow data access for IDs not in the security_state_ map. The > new code adds a pending map so we can deal with UI/IO thread races > during child process removal AND rejects any unknown IDs. > > - Fixed a test that depended on the old behavior where unknown IDs > always allowed access. > > Bug: 898281, 600441 , 915203 > Change-Id: I26ca1e48536672b05d2310d8a17be47d5b6ef5c7 > Reviewed-on: https://chromium-review.googlesource.com/c/1382855 > Commit-Queue: Aaron Colwell <acolwell@chromium.org> > Reviewed-by: Tom Sepez <tsepez@chromium.org> > Reviewed-by: Alex Moshchuk <alexmos@chromium.org> > Cr-Commit-Position: refs/heads/master@{#617937} TBR=acolwell@chromium.org,tsepez@chromium.org,alexmos@chromium.org # Not skipping CQ checks because original CL landed > 1 day ago. Bug: 898281, 600441 , 915203 Change-Id: I3fa9efdede12c4403ace0df0678049358009e4e4 Reviewed-on: https://chromium-review.googlesource.com/c/1389025 Reviewed-by: Nasko Oskov <nasko@chromium.org> Commit-Queue: Nasko Oskov <nasko@chromium.org> Cr-Commit-Position: refs/heads/master@{#618700} [modify] https://crrev.com/b06974abd43560ae11fe5a98fa63245d0fe3d5f8/content/browser/child_process_security_policy_impl.cc [modify] https://crrev.com/b06974abd43560ae11fe5a98fa63245d0fe3d5f8/content/browser/child_process_security_policy_impl.h [modify] https://crrev.com/b06974abd43560ae11fe5a98fa63245d0fe3d5f8/content/browser/child_process_security_policy_unittest.cc [modify] https://crrev.com/b06974abd43560ae11fe5a98fa63245d0fe3d5f8/content/browser/dom_storage/session_storage_context_mojo_unittest.cc [modify] https://crrev.com/b06974abd43560ae11fe5a98fa63245d0fe3d5f8/content/browser/dom_storage/test/mojo_test_with_file_service.h [modify] https://crrev.com/b06974abd43560ae11fe5a98fa63245d0fe3d5f8/content/browser/fileapi/browser_file_system_helper_unittest.cc [modify] https://crrev.com/b06974abd43560ae11fe5a98fa63245d0fe3d5f8/content/browser/network_service_client_unittest.cc
,
Jan 16
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/220d502c977ee3d8617f42eef7aea68e6d8c77ac commit 220d502c977ee3d8617f42eef7aea68e6d8c77ac Author: Aaron Colwell <acolwell@google.com> Date: Wed Jan 16 04:56:55 2019 Replace security state workaround in CanAccessDataForOrigin() - Replace workaround with code that is more strict about enforcing security policy during child process shutdown. The old code would always allow data access for IDs not in the security_state_ map. The new code adds a pending map so we can deal with UI/IO thread races during child process removal AND rejects any unknown IDs. - Fixed a test that depended on the old behavior where unknown IDs always allowed access. Bug: 898281, 600441 , 915203 Change-Id: I4b164eb3ec1cbb110479b633e73bcd883ef9a604 Reviewed-on: https://chromium-review.googlesource.com/c/1409732 Commit-Queue: Aaron Colwell <acolwell@chromium.org> Reviewed-by: Charlie Reis <creis@chromium.org> Reviewed-by: Alex Moshchuk <alexmos@chromium.org> Cr-Commit-Position: refs/heads/master@{#623114} [modify] https://crrev.com/220d502c977ee3d8617f42eef7aea68e6d8c77ac/content/browser/child_process_security_policy_impl.cc [modify] https://crrev.com/220d502c977ee3d8617f42eef7aea68e6d8c77ac/content/browser/child_process_security_policy_impl.h [modify] https://crrev.com/220d502c977ee3d8617f42eef7aea68e6d8c77ac/content/browser/child_process_security_policy_unittest.cc [modify] https://crrev.com/220d502c977ee3d8617f42eef7aea68e6d8c77ac/content/browser/dom_storage/session_storage_context_mojo_unittest.cc [modify] https://crrev.com/220d502c977ee3d8617f42eef7aea68e6d8c77ac/content/browser/dom_storage/test/mojo_test_with_file_service.h [modify] https://crrev.com/220d502c977ee3d8617f42eef7aea68e6d8c77ac/content/browser/fileapi/browser_file_system_helper_unittest.cc [modify] https://crrev.com/220d502c977ee3d8617f42eef7aea68e6d8c77ac/content/browser/network_service_client_unittest.cc |
||||||||||||||||
►
Sign in to add a comment |
||||||||||||||||
Comment 1 by bugdroid1@chromium.org
, Apr 8 2016