Issue metadata
Sign in to add a comment
|
ServiceManager should pass GUID per service instance to ServiceManagerListener for disambiguation |
||||||||||||||||||||||||
Issue description"NetworkServiceRestartBrowserTest.BrowserIOFactoryInfo" is flaky. This issue was created automatically by the chromium-try-flakes app. Please find the right owner to fix the respective test/step and assign this issue to them. If the step/test is infrastructure-related, please add Infra-Troopers label and change issue status to Untriaged. When done, please remove the issue from Sheriff Bug Queue by removing the Sheriff-Chromium label. We have detected 5 recent flakes. List of all flakes can be found at https://chromium-try-flakes.appspot.com/all_flake_occurrences?key=ahVzfmNocm9taXVtLXRyeS1mbGFrZXNyQAsSBUZsYWtlIjVOZXR3b3JrU2VydmljZVJlc3RhcnRCcm93c2VyVGVzdC5Ccm93c2VySU9GYWN0b3J5SW5mbww. Flaky tests should be disabled within 30 minutes unless culprit CL is found and reverted. Please see more details here: https://sites.google.com/a/chromium.org/dev/developers/tree-sheriffs/sheriffing-bug-queues#triaging-auto-filed-flakiness-bugs
,
May 7 2018
Detected 5 new flakes for test/step "NetworkServiceRestartBrowserTest.BrowserIOFactoryInfo". To see the actual flakes, please visit https://chromium-try-flakes.appspot.com/all_flake_occurrences?key=ahVzfmNocm9taXVtLXRyeS1mbGFrZXNyQAsSBUZsYWtlIjVOZXR3b3JrU2VydmljZVJlc3RhcnRCcm93c2VyVGVzdC5Ccm93c2VySU9GYWN0b3J5SW5mbww. This message was posted automatically by the chromium-try-flakes app. Since flakiness is ongoing, the issue was moved back into Sheriff Bug Queue (unless already there).
,
May 7 2018
Disabling test on Windows and removing from Sheriff queue. Will let chongz@ decide if/when to re-enable the test.
,
May 7 2018
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/9cbbe3833f83d4206f92ae7cca1d2e42acea3dc8 commit 9cbbe3833f83d4206f92ae7cca1d2e42acea3dc8 Author: Guido Urdaneta <guidou@chromium.org> Date: Mon May 07 14:51:06 2018 Disable flaky NetworkServiceRestartBrowserTest.BrowserIOFactoryInfo on Windows TBR=chongz@chromium.org Bug: 840127 Change-Id: Idb63b21d4b1ea0c6b0eca07f6bcec0f7871dffbd Reviewed-on: https://chromium-review.googlesource.com/1046659 Reviewed-by: Guido Urdaneta <guidou@chromium.org> Commit-Queue: Guido Urdaneta <guidou@chromium.org> Cr-Commit-Position: refs/heads/master@{#556436} [modify] https://crrev.com/9cbbe3833f83d4206f92ae7cca1d2e42acea3dc8/content/browser/network_service_restart_browsertest.cc
,
May 7 2018
,
May 7 2018
Lowering priority since it's test only and has been disabled (potential 'process_map.cc' issue is long standing issue 818593 ). +hjd@, primiano@ - Owners of '//src/services/resource_coordinator/memory_instrumentation/process_map.cc': My test intentionally crashes Network Service process and restarts it, and it seems that |ProcessMap| could receive new PID before the old one was removed. Can I have some help on what might be happening? Thanks!
,
May 8 2018
I'm afraid I'm OOO for the next two weeks so I might be slow to respond. This sounds like it is to do with the semantics of the ServiceManager. The ServiceManager (apparently) doesn't guarantee you see OnServiceStopped for the crashing service before you see OnServicePIDReceived for the restarting service and so we end up trying to emplace into instances_ before we've erased the old entry. Given this there's two paths forward: a) Change ServiceManager to have the guarantee. b) Change ProcessMap to cope without the guarantee. Does that seem right? If so: For a): I don't know much about the ServiceManager code so I can't comment on how difficult/easy a) would be. For b): the ProcessMap code is much simpler but it's going to be pretty hard to write something that does the right thing in all cases (in most cases OnServiceStopped should clear the map but when OnServiceStopped comes after the OnServicePIDReceived for the restarted service you have to *not* clear the map and so you have to be able to tell the restarted service identity from the old service identity etc). Sorry to not be much help! People familiar with ServiceManager have better information.
,
May 8 2018
Re #c7: Thanks for the detailed analysis, that sounds reasonable! +rockot@: My feeling is that we should do a), but do you have any other comments before I dig into it? Thanks!
,
May 14 2018
(a) would be the wrong thing to do. It's important to note that a service instance can become unreachable (i.e. new connections won't be routed to it) before it's stopped, so what you're observing is an entirely legitimate order of events. There are two good options I can think of to make ServiceManagerListener more useful in light of this ambiguity: 1. Add an OnServiceUnreachable event which ProcessMap can use instead. This is simple, but has the drawback that it wouldn't accurately track process lifetime. i.e. an unreachable service process may continue running for an arbitrary period of time (or even indefinitely, if the service is behaving badly). 2. Generate a GUID for each created service instance which can be passed to all events on ServiceManagerListener for disambiguation, since Identity alone cannot be sufficient. I think the second option is the right way to go.
,
May 14 2018
Thanks for the detailed explanation! That makes sense. I'll implement the second option probably in a week or two. (Changing OS to ALL since it could potentially affect other OS as well)
,
Jun 19 2018
,
Aug 22
[chongz]: Issue 876179 is possible a duplicate of this. Any progress in the last two months?
,
Aug 22
Apologize for the delay - I was working on other stuff and probably won't have time for this in the near future. Marked as available so others could pick up.
,
Aug 22
Adding the services>network label - I think we want this fixed before we launch the network service on beta or stable, at least.
,
Aug 23
Issue 876179 has been merged into this issue.
,
Aug 24
,
Aug 27
Detected 3 new flakes for test/step "InMemoryApp/NetworkContextConfigurationHttpsStrippingPacBrowserTest.PacHttpsUrlStripping/0". To see the actual flakes, please visit https://chromium-try-flakes.appspot.com/all_flake_occurrences?key=ahVzfmNocm9taXVtLXRyeS1mbGFrZXNyZQsSBUZsYWtlIlpJbk1lbW9yeUFwcC9OZXR3b3JrQ29udGV4dENvbmZpZ3VyYXRpb25IdHRwc1N0cmlwcGluZ1BhY0Jyb3dzZXJUZXN0LlBhY0h0dHBzVXJsU3RyaXBwaW5nLzAM. This message was posted automatically by the chromium-try-flakes app. Since flakiness is ongoing, the issue was moved back into Sheriff Bug Queue (unless already there).
,
Aug 28
Issue 878076 has been merged into this issue.
,
Aug 28
Some flaky tests are being root-caused into this issue. - InMemoryApp/NetworkContextConfigurationHttpsStrippingPacBrowserTest.PacHttpsUrlStripping/0 - OnDiskApp/NetworkContextConfigurationHttpsStrippingPacBrowserTest.PacHttpsUrlStripping/0 They don't appear to be disabled yet, but I'm disabling them to meet the "Flaky tests should be disabled within 30 minutes" SLA. assigning an owner based on Issue 818593
,
Aug 28
,
Aug 28
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/c4bed4374c9793fadf2b6b9686668cb0fe4d3b28 commit c4bed4374c9793fadf2b6b9686668cb0fe4d3b28 Author: Trent Apted <tapted@chromium.org> Date: Tue Aug 28 04:31:41 2018 Disable flaky */NetworkContextConfigurationHttpsStrippingPacBrowserTest.PacHttpsUrlStripping TBR=hjd@chromium.org Bug: 840127 Change-Id: Iaa027caeac28a2989ea382bde5a6a44756294bec Reviewed-on: https://chromium-review.googlesource.com/1192369 Reviewed-by: Trent Apted <tapted@chromium.org> Commit-Queue: Trent Apted <tapted@chromium.org> Cr-Commit-Position: refs/heads/master@{#586580} [modify] https://crrev.com/c4bed4374c9793fadf2b6b9686668cb0fe4d3b28/chrome/browser/net/network_context_configuration_browsertest.cc
,
Sep 26
Right now Identiy has name, user_id, and instance_id fields. The instance_id may be used by clients to select different instances, e.g. for partitioning service requests into separate processes. So if you want a Blink renderer instance but you want to ensure you aren't reusing the one used for some other origin, you might generate a unique instance_id and use that when connecting. Because all this instance and process management stuff is inherently racy to some extent, that Identity structure alone is not enough for global disambiguation, as discussed earlier in the thread. Rather than modifying ServiceManagerListener to hack around this, my proposal now would be to revamp Identity as such: - name and user_id stay the same - what we call "instance_id" becomes "partition", and remains a way for sufficiently privileged clients to select isolated service instances - a 4th field (repurposing the name "instance_id") is added to Identity. It is an error to set this field in any Identity used to call BindInterface and is effectively a "read-only" field, always generated and provided by the Service Manager. It's basically just a GUID generated when the instance is created. Then ServiceManagerListener has sufficient information to disambiguate one instance from another even over time. There would still be a problem of the reverse mapping, i.e. the OS may re-use a PID and we have no real way to guarantee that notification about a new process using $PID isn't received before notification about the death of an old process that was using $PID. But I don't think this is a concern in practice?
,
Oct 11
,
Oct 11
,
Oct 11
I'm gonna take this.
,
Oct 17
,
Nov 9
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/1060490ecae01406391a04a8109e03f5c2ddfb00 commit 1060490ecae01406391a04a8109e03f5c2ddfb00 Author: Ken Rockot <rockot@google.com> Date: Fri Nov 09 20:52:08 2018 [service-manager] Add globally unique ID to Identity Adds a globally_unique_id field to each service's Identity, generated and assigned by the Service Manager when starting any new service instance. This ensures that every running service instance has an Identity that is globally unique across space and time. TBR=jam@chromium.org Bug: 840127 , 895591 Change-Id: If32863b7d8562e3a2cd9625d8eed2db8e83c67cb Reviewed-on: https://chromium-review.googlesource.com/c/1318806 Commit-Queue: Ken Rockot <rockot@google.com> Reviewed-by: John Abd-El-Malek <jam@chromium.org> Reviewed-by: Tom Sepez <tsepez@chromium.org> Reviewed-by: Oksana Zhuravlova <oksamyt@chromium.org> Cr-Commit-Position: refs/heads/master@{#606960} [modify] https://crrev.com/1060490ecae01406391a04a8109e03f5c2ddfb00/content/browser/renderer_host/render_process_host_impl.cc [modify] https://crrev.com/1060490ecae01406391a04a8109e03f5c2ddfb00/content/public/common/services.md [modify] https://crrev.com/1060490ecae01406391a04a8109e03f5c2ddfb00/services/service_manager/README.md [modify] https://crrev.com/1060490ecae01406391a04a8109e03f5c2ddfb00/services/service_manager/public/cpp/identity.cc [modify] https://crrev.com/1060490ecae01406391a04a8109e03f5c2ddfb00/services/service_manager/public/cpp/identity.h [modify] https://crrev.com/1060490ecae01406391a04a8109e03f5c2ddfb00/services/service_manager/public/cpp/identity_mojom_traits.h [modify] https://crrev.com/1060490ecae01406391a04a8109e03f5c2ddfb00/services/service_manager/public/cpp/service_binding.h [modify] https://crrev.com/1060490ecae01406391a04a8109e03f5c2ddfb00/services/service_manager/public/mojom/BUILD.gn [modify] https://crrev.com/1060490ecae01406391a04a8109e03f5c2ddfb00/services/service_manager/public/mojom/connector.mojom [modify] https://crrev.com/1060490ecae01406391a04a8109e03f5c2ddfb00/services/service_manager/service_manager.cc [modify] https://crrev.com/1060490ecae01406391a04a8109e03f5c2ddfb00/services/service_manager/service_manifests.md [modify] https://crrev.com/1060490ecae01406391a04a8109e03f5c2ddfb00/services/service_manager/tests/connect/BUILD.gn [modify] https://crrev.com/1060490ecae01406391a04a8109e03f5c2ddfb00/services/service_manager/tests/connect/connect_unittest.cc [modify] https://crrev.com/1060490ecae01406391a04a8109e03f5c2ddfb00/services/service_manager/tests/service_manager/BUILD.gn [add] https://crrev.com/1060490ecae01406391a04a8109e03f5c2ddfb00/services/service_manager/tests/service_manager/service_manager_listener_unittest.cc
,
Nov 15
,
Jan 17
(5 days ago)
|
|||||||||||||||||||||||||
►
Sign in to add a comment |
|||||||||||||||||||||||||
Comment 1 by ksakamoto@chromium.org
, May 7 2018Owner: chongz@chromium.org
Status: Assigned (was: Untriaged)