New issue
Advanced search Search tips

Issue 831541 link

Starred by 1 user

Issue metadata

Status: Available
Owner: ----
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Linux
Pri: 2
Type: Bug



Sign in to add a comment

Garbage collection observable in inspector-protocol/css/css-collect-class-names.js

Project Member Reported by mlippautz@chromium.org, Apr 11 2018

Issue description

Presumably, the write barrier for parent rule/stylesheet is missing. 

Additionally, we need to figure out why the verifier does not catch this problem.

gn args:
is_component_build = true
is_debug = true
use_goma = true
symbol_level = 2
enable_blink_heap_incremental_marking = true

failing test:
out/Debug/content_shell --single-process  --enable-logging=stderr --vmodule=*/heap/*=3   --run-layout-test --js-flags="--trace-gc"  --enable-blink-features=HeapIncrementalMarking inspector-protocol/css/css-collect-class-names.js

The test checks that we can enumerate all the style sheets. With incremental marking kicking in somewhere during the test, iteration does not catch the parent sheets.
 
The test is here
  https://cs.chromium.org/chromium/src/third_party/WebKit/LayoutTests/inspector-protocol/css/css-collect-class-names.js

While we are missing the write barrier this is not the problem of the test.


The problem seems to be some asynchronous processing that is not yet done. I.e., the second loop that collects classnames sometimes completely bails out and is dependent on how much of the CSSStyleSheet has already been processed (?)


Forcing conservative GCs seems fine so maybe some other write barrier or unsupported object is temporarily holding the references.

Presumably something that is fine in the atomic marking case. E.g. some other object owning rules temporarily.
In the good (working) case I see SetParentRule being called in time, in the other one not.
In the good case 
  InspectorNetworkAgent::FetchResourceContent
finds a cached_resource.

In the bad case it doesn't and we completely fall through all branches and CollectFlatRules() doesn't find any rules as there's no resource.

https://cs.chromium.org/chromium/src/third_party/blink/renderer/core/inspector/InspectorNetworkAgent.cpp?l=1746
It also never happens with --single-process. I begin to wonder if that is a race condition with some cached resource where incremental marking is just delaying some thread too long.
We do set ResourceFetcher::cached_resource_map_ for the CSS URL in question.

https://cs.chromium.org/chromium/src/third_party/blink/renderer/platform/loader/fetch/resource_fetcher.cc?l=813

Since that field is weak, the question now is why the incremental GC clears the field.
Keishi mentioned that the Resource is held alive from 
  HTMLLinkElement::link_ through ResourceClient.

However, after finishing loading we explicitly drop that reference.

#1  0x00007f46c26cb0df in blink::ResourceClient::ClearResource() (this=0x2cfce67984a0)
    at ../../third_party/blink/renderer/platform/loader/fetch/resource_client.h:79
#2  0x00007f46c365619b in blink::DocumentLoader::FinishedLoading(base::TimeTicks) (this=0x2cfce67984a0, finish_time=...)
    at ../../third_party/blink/renderer/core/loader/document_loader.cc:484
#3  0x00007f46c3655dac in blink::DocumentLoader::NotifyFinished(blink::Resource*) (this=0x2cfce67984a0, resource=0x2cfce6798f88)
    at ../../third_party/blink/renderer/core/loader/document_loader.cc:398
#4  0x00007f46c09bb43f in blink::Resource::NotifyFinished() (this=0x2cfce6798f88)
    at ../../third_party/blink/renderer/platform/loader/fetch/resource.cc:280
#5  0x00007f46c09b37a5 in blink::RawResource::NotifyFinished() (this=0x2cfce6798f88)
    at ../../third_party/blink/renderer/platform/loader/fetch/raw_resource.cc:332
#6  0x00007f46c09bc39a in blink::Resource::Finish(double, base::SingleThreadTaskRunner*) (this=0x2cfce6798f88, load_finish_time=1449746.416274, task_runner=0x308035599f20) at ../../third_party/blink/renderer/platform/loader/fetch/resource.cc:369
#7  0x00007f46c09d76fd in blink::ResourceFetcher::HandleLoaderFinish(blink::Resource*, double, blink::ResourceFetcher::LoaderFinishType, unsigned int, bool) (this=0x2cfce6798d20, resource=0x2cfce6798f88, finish_time=1449746.416274, type=blink::ResourceFetcher::kDidFinishLoading, inflight_keepalive_bytes=0, blocked_cross_site_document=false)
    at ../../third_party/blink/renderer/platform/loader/fetch/resource_fetcher.cc:1461
#8  0x00007f46c0a03e26 in blink::ResourceLoader::DidFinishLoading(double, long, long, long, bool) (this=0x2cfce6799688, finish_time=1449746.416274, encoded_data_length=0, encoded_body_length=292, decoded_body_length=292, blocked_cross_site_document=false)
    at ../../third_party/blink/renderer/platform/loader/fetch/resource_loader.cc:692
#9  0x00007f46d05fcd83 in content::WebURLLoaderImpl::Context::OnCompletedRequest(network::URLLoaderCompletionStatus const&) (this=0x3080357567a0, status=...) at ../../content/renderer/loader/web_url_loader_impl.cc:1005

Not sure whether the inspector is broken here somewhere (maybe it should load the resource contents from somewhere else after finishing loading?)
The InspectorNetworkAgent cache also depends on whether the Resource is still alive, so I am unsure how that should work in general.

https://cs.chromium.org/chromium/src/third_party/blink/renderer/core/inspector/network_resources_data.cc?l=133
Cc: dgozman@chromium.org
+dgozman for help on how this should work. See #4.
NetworkResourcesData keeps a weekmember to the resource and gets notified when resource is about to die, so that we can copy content out of it. Does that answer the question?
Maybe :)

So I guess the intended behavior is to use the Resource from some other caches if possible and create a fallback cache in the inspector when the other caches are about to be cleared?

If that is the case, then building up the fallback cache seems to be broken with incremental marking and I need to further investigate that.
> So I guess the intended behavior is to use the Resource from some other caches if possible and create a fallback cache in the inspector when the other caches are about to be cleared?

Correct. It's unfortunate that it does not work with incremental marking. I'm happy to answer any questions I can.


Well, it has to work.

Investigating further.
Hmm. Why can't you simply use Member for the fallback cache? You can explicitly remove the Resource from the fallback cache when it finishes loading.

The weak processing looks complex and suspicious. There's a couple of restrictions for what a weak processing can do. e.g., a weak processing is not allowed to mutate any Oilpan's graph. I don't see any bug in the weak processing but it is not encouraged to do complex operations there.




> Hmm. Why can't you simply use Member for the fallback cache? You can explicitly remove the Resource from the fallback cache when it finishes loading.

Regular references would obviously be appreciated. I would guess the inspector was designed to retain as much of the regular lifetimes and behavior as possible. Maybe I am off here.

FYI: By the time we are trying to fetch the resource InspectorNetworkAgent::resources_data_ is empty. I am not sure where that should be populated initially.
Still unsure what is actually going on.

I see 
  DocumentLoader::FinishedLoading
but I don't see 
  InspectorNetworkAgent::DidFinishLoading

Consequently
  InspectorNetworkAgent::resources_data_
is not filled and we bail out in the test without iterating the contents of the CSS file.

At this point I am unsure whether this is some problem in the InspectorNetworkAgent or in the GC.
> Hmm. Why can't you simply use Member for the fallback cache? You can explicitly remove the Resource from the fallback cache when it finishes loading.

IIRC, at the time of MemoryCache it was undesirable to have a strong reference, because then resource won't get evicted from MemoryCache. With weak reference, we ensure that loading works exactly the same with/without DevTools (that's important for developers to trust the tools), and we only retain some data as we see fit (e.g. enforcing our configurable limit on total data size).

> I see 
>   DocumentLoader::FinishedLoading
> but I don't see 
>   InspectorNetworkAgent::DidFinishLoading

Perhaps, we need some help from loading team?

I'm getting confused.

In the first place, why can it happen that a resource that hasn't yet finished loading is garbage-collected? Someone should be holding the resource until the resource finishes loading, right?

Then DevTools can just hold the resources with Members. DevTools can explicitly clear the Members when the resource loading is finished. Then it won't change any lifetime...?




The resource is held alive by HTMLLinkElement until its loading finishes. 

As long as the resource is loading the test works because the resource can be retrieved from the memory cache [1]. The cache holds the resource with a weak ref though, so after loading finished the memory cache will no bit a hit.

DevTools could explicitly clear things but I also understand the design principle of being non-intrusive wrt to lifetimes as much as possible. 

In any case, some handover when loading is finished seems not working and help is appreciated. Can we add somebody knowledgable? 

[1] https://cs.chromium.org/chromium/src/third_party/blink/renderer/core/inspector/InspectorNetworkAgent.cpp?l=1746
> Then DevTools can just hold the resources with Members. DevTools can explicitly clear the Members when the resource loading is finished. Then it won't change any lifetime...?

DevTools retains resource data long after resource loading has finished. On the other hand, we don't want to retain another copy of it nor to affect how long resource lives in MemoryCache. Sorry, but we should probably talk in terms of code if someone has a specific proposal, or we won't get anywhere :)
My specific proposal is to remove the weak processing altogether.

What you need is to save the loaded data to NetworkResourceData, right? Then why does NetworkResourceData need to have a weak reference to the Resource in the first place? You can just ask Resource to copy the contents to NetworkResourceData.

Or are you saying that you want to make the copying lazy? (i.e., copy the contents only when the Resource dies.) If you need that behavior, you can ask the Resource to copy the contents to NetworkResourceData when the Resource is going to die. You won't need to add a weak member from NetworkResourceData to Resource.


> Or are you saying that you want to make the copying lazy? (i.e., copy the contents only when the Resource dies.)

Correct.

> If you need that behavior, you can ask the Resource to copy the contents to NetworkResourceData when the Resource is going to die. You won't need to add a weak member from NetworkResourceData to Resource.

How is this different from weak member? It is just reversing dependency: now Resource is aware of NetworkResourceData, and we need to cleanup Resource->NetworkResourceData pointer when we disable DevTools. I think WeakMember serves our purpose very well - we do not retain the object but get a callback when it dies.

Anyway, no matter how we structure the code, there will still be a call from when Resource dies to NetworkResourcesData. Seems like it won't help here?


I think a better solution would be:

- Create a class that holds the resource contents.
- Make the class RefCounted (or GarbageCollected).
- Make Resource and NetworkResourceData hold a strong reference to the class.

Will it work?

Manual weak processing should be avoided as much as possible since it has a couple of subtle restrictions:

- It cannot mutate an object graph at all. If it mutates, it will cause use-after-free.
- Imagine that X has a WeakMember to Y. The weak callback runs only when X is alive but Y is dead. It's not called when X and Y die at the same GC round.

I don't see any specific issue in the weak callback of NetworkResourceData, but due to the trickiness, it's not a good idea to do something complex there :)

> I think a better solution would be:
> - Create a class that holds the resource contents.
> - Make the class RefCounted (or GarbageCollected).
> - Make Resource and NetworkResourceData hold a strong reference to the class.
> Will it work?

This will work, however it requires large changes in Resource, which I think is not justifiable by inspector needs. If the loader team thinks it's ok, I'll be happy to review any patches!
Any folks to CC on this issue?
Project Member

Comment 27 by bugdroid1@chromium.org, Apr 19 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/31bdd331f36ca1e8dfc9803c1ce87ed055576f48

commit 31bdd331f36ca1e8dfc9803c1ce87ed055576f48
Author: Michael Lippautz <mlippautz@chromium.org>
Date: Thu Apr 19 22:23:24 2018

[oilpan] Temporarily adjust expectations for inspector test

... on incremental marking builds.

Tbr: haraken@chromium.org
No-try: true
Bug: chromium:831541
Change-Id: I3c1e3f4b244eaead7824aee43386807ec5eb8d62
Reviewed-on: https://chromium-review.googlesource.com/1019971
Reviewed-by: Michael Lippautz <mlippautz@chromium.org>
Commit-Queue: Michael Lippautz <mlippautz@chromium.org>
Cr-Commit-Position: refs/heads/master@{#552185}
[modify] https://crrev.com/31bdd331f36ca1e8dfc9803c1ce87ed055576f48/third_party/WebKit/LayoutTests/FlagExpectations/enable-blink-features=HeapIncrementalMarking

Blocking: -757440
Cc: kouhei@chromium.org
Components: Blink>Loader
Owner: ----
Status: Available (was: Assigned)
Summary: Garbage collection observable in inspector-protocol/css/css-collect-class-names.js (was: Incrementally tracing blink:CSSRule is broken)
Seems related to issue 837596 where GC is observable during loading. This is not an incremental marking issue but a more general problem.

Sign in to add a comment