gfx::Image should not internally refcount (should be non-copyable) |
|||
Issue descriptionThis has been on my mind for many years and I think I shall finally do something about it (in light of Issue 596348 and related). gfx::Image is a rather strange class in that it internally refcounts its image storage, so that whenever you copy an Image object, it just creates a new reference to its internal ImageStorage. There are a number of issues with this: 1. Copying an Image behaves differently to copying just about any other object. The user must be aware of the implicit reference semantics. This is essentially Java object semantics; that isn't normal for C++. Reference semantics should be explicit. 2. Image is not thread-safe; normally that means you have to be careful about passing references/pointers across threads. But with Image, you have to understand that *copies* are also not safe to pass across threads ( Issue 600229 ). 3. Lots and lots of client code does not need reference semantics but lazily uses it anyway; for example, Images are passed by value when they could be passed by reference, or passed by value when they could be in a scoped_ptr and passed with std::move. 4. Classes in Chromium should be non-copyable by default. There is no good reason for Image to be copyable. 5. ImageSkia (which is the sole representation for most Images) is already ref-counted internally (and unlike Image, can be made thread-safe). There is usually no need to refcount the Image when you can just cheaply copy the underlying ImageSkia and construct a new Image object around it. Now, sometimes it is helpful to have shared garbage-collected pointers to an Image, but clients should explicitly ask for that (with a scoped_refptr), rather than getting it for free. Here's what I want to refactor it into: class Image { private: // This is basically ImageStorage, but exposing all the user-facing methods of Image. Image::RepresentationMap representations_; DISALLOW_COPY_AND_ASSIGN(Image); }; class RefCountedImage : base::RefCounted<RefCountedImage> { public: // All the methods of Image exposed here, delegated to |image_|. private: Image image_; DISALLOW_COPY_AND_ASSIGN(RefCountedImage); }; typedef scoped_refptr<RefCountedImage> ImageRef; Now clients should prefer to use Image where possible (either as a bare object or in a scoped_ptr). If an Image is passed by value but there is one clear owner, it can be changed so it is passed by reference, or with a scoped_ptr via std::move. If ref-counted semantics are legitimately required, an ImageRef should be used. Alternatively instead of using scoped_ptr, we could make Image itself a move-only type. It's possible that we can remove all usages of ImageRef, by passing around the underlying ImageSkia by value (which has reference semantics). Could supply a helper method CopyImageSkiaToNewImage() that calls ToImageSkia() and constructs a new Image from it. (I am reluctant to actually make Image copyable, because then you may inadvertently copy a large PNG data, for example.) There are hundreds of clients and this will break any client that uses reference semantics. Here is a strawman plan to migrate to the new system: 1. Create a new class called ImageRef that just wraps Image and exposes all of its methods through delegation. Make ImageRef a friend of Image. At this stage, ImageRef and Image will be exactly the same. 2. Make a move constructor for Image (but not ImageRef). 3. Make Image's copy constructor and assignment operator private, and make ImageRef expose them publicly. Now ImageRef is the same as what Image used to be; Image itself is moveable but non-copyable. This will break many many clients of Image. 4. For each broken client, if the copy/assignment can be trivially avoided (e.g., pass by reference or use std::move), do that. Otherwise, change to use ImageRef. (Obviously, land this before #3.) Now all clients are using the appropriate class out of Image and ImageRef. 5. Refactor Image so it doesn't have an internal refcount, and create the RefCountedImage class as shown above. ImageRef is still a separate class; internally it now has a scoped_refptr<RefCountedImage>. This should not affect any clients. 6. Replace the ImageRef class with the typedef shown above. This will break all clients and they need to be updated with an extra level of syntactic indirection (-> instead of .). [Note: I think this is worth doing because it reduces the number of distinct classes and makes the client code explicit about indirection across a shared pointer.] I don't anticipate any performance regressions in any of these steps. Step #5 may give a performance benefit to direct clients of Image (ImageRef clients will have the same performance characteristic). Robert, you are the primary owner of gfx::Image. - What are your thoughts on this? - Do you agree with my assessment about the current state of Image, or do you prefer it stays refcounted? - Do you think this is a worthwhile change in principle? - Do you think it is worthwhile in practice? - And do you think my plan is sound? (I am happy to do all the work, but it will be a background task for me, not a priority.) Thanks!
,
Apr 7 2016
> - What are your thoughts on this? You cover this point as an ancillary nicety, but this is actually one of the core reasons for why gfx::Image exists: > Now, sometimes it is helpful to have shared garbage-collected pointers to an Image, but clients should explicitly ask for that (with a scoped_refptr), rather than getting it for free. Originally there was ScopedImage, which was a single-ownership model for image data, but it turned out to actually be very difficult in several places to manage lifetimes correctly as they were passed through various UI models and controllers (depending on the source of the data, though we also didn't have move semantics at the time). Often the UI models expected value semantics, which without the internal refcouting, would mean needlessly copying around image bytes. > - Do you agree with my assessment about the current state of Image, or do you prefer it stays refcounted? I think the major issue is with thread safety, which I agree is a problem. Originally gfx::Image only dealt with things that lived on the UI thread, but clearly it is now used outside that scope. I'm also not sure how all your points above only apply to gfx::Image and not ImageSkia as well, which is patterned the same way. Not all gfx::Images are backed by ImageSkia, so #5 isn't an accurate. In the case of images that are ultimately drawn by native UI code, there shouldn't be an ImageSkia involved at all. Another core reason for gfx::Image is to be able to create an image in the native type for the platform, pass the it through cross-platform code, and ultimately draw it from the native type, so that the image is not needlessly converted between container types. We used to round-trip everything through SkBitmap, which wasted time by copying image bytes between the containers. > - Do you think this is a worthwhile change in principle? I don't think creating two classes from one (Image and RefCountedImage) is necessarily an API improvement. One thing to also consider is that the types of reps that Image stores have changed over time. Now, all the image representations themselves are actually thread-safe reference counted: kImageRepCoca -> NSImage -> refcounted NSObject kImageRepCocoaTouch -> UIImage -> refcounted NSObject kImageRepPNG -> RefCountedMemory -> RefCountedThreadSafe kImageRepSkia -> ImageSkia -> ImageSkiaStorage -> RefCountedThreadSafe So it seems like gfx::Image could be kept cheaply copyable by changing the storage to rely on the refcounting of the underlying data instead. > - Do you think it is worthwhile in practice? I think this design introduces a large change and I'm not sold on the benefits yet. If the core issue is trying to solve the thread safety problem, is there a more precise and less invasive fix?
,
Apr 8 2016
Hi Robert, sorry for this very long reply. > Originally there was ScopedImage, which was a single-ownership model for image data, but it turned out to actually be very difficult in several places to manage lifetimes correctly as they were passed through various UI models and controllers (depending on the source of the data, though we also didn't have move semantics at the time). Often the UI models expected value semantics, which without the internal refcouting, would mean needlessly copying around image bytes. Interesting, didn't know about that history. So as I said, I've done a fairly big survey of the client code using Image in this past week (just from doing the above steps and seeing what breaks, then trying to fix it). There is a lot of code that just needs value semantics on Image (relies on the copy constructor in a meaningful way); I haven't tried to change those (nor am I proposing to), just replace them with ImageRef (note: I think in hindsight I'd rather call it SharedImage). But there is also a lot of code that doesn't *need* the copy constructor at all. Some of this code doesn't use the copy constructor (and that code doesn't even come up as a compiler error when I make this change). Some of it does, but the copies can be trivially replaced with either moving the Image, or passing by reference/pointer. It is this latter class of code that I believe can have a huge benefit from this refactor: this is all about making it possible to reason about ownership of Images, where possible. I think having move semantics really helps reduce the number of places where ImageRef is needed. I don't think completely getting rid of ImageRef is possible (although I believe a lot of the places where I gave up and used ImageRef could in theory be refactored to not need reference counting, I'm happy to leave them be). My main goal is to make Images behave like normal DISALLOW_COPY_AND_ASSIGN classes by default, so that next time someone needs to make an Image, they get the single-ownership semantics unless they really want the shared semantics. This is aligned with the style guide: https://google.github.io/styleguide/cppguide.html#Ownership_and_Smart_Pointers "Prefer to have single, fixed owners for dynamically allocated objects." ... "Shared ownership can be a tempting alternative to careful ownership design, obfuscating the design of a system." ... "Do not design your code to use shared ownership without a very good reason." The quote about "obfuscating the design of the system" really speaks to me. I stumbled upon this (most recently) due to this bug: https://bugs.chromium.org/p/chromium/issues/detail?id=590882#c11 See the lengths I had to go to to reason about the lifetime and ownership of the ImageStorage object that was being used across two threads! > I think the major issue is with thread safety, which I agree is a problem. Originally gfx::Image only dealt with things that lived on the UI thread, but clearly it is now used outside that scope. > > I'm also not sure how all your points above only apply to gfx::Image and not ImageSkia as well, which is patterned the same way. Mostly because ImageSkia *is* thread-safe, or at least can be made thread-safe once its creator no longer needs to modify it. Once an ImageSkia is read-only-thread-safe, it is essentially an immutable value and so having internal reference counting is indistinguishable from copying. Because Image is not thread-safe, I have to care about ownership, and scoped_refptr is the enemy of reasoning about ownership. I tried to make Image thread-safe 2 years ago ( Issue 468010 ) and we decided it was not worth it from a performance standpoint, and instead we would avoid thread-unsafe uses. > Not all gfx::Images are backed by ImageSkia, so #5 isn't an accurate. Yep that's a good point. Image can't necessarily be cheaply copied, so you often just need to make an ImageRef in those cases. But this does apply if you are looking at code that passes an Image over to another thread then calls ToImageSkia() on it (which is what the code above is doing); then you can just pass the ImageSkia instead or pass a new Image with an ImageSkia inside it. > I don't think creating two classes from one (Image and RefCountedImage) is necessarily an API improvement. One thing to also consider is that the types of reps that Image stores have changed over time. Now, all the image representations themselves are actually thread-safe reference counted: > > kImageRepCoca -> NSImage -> refcounted NSObject > kImageRepCocoaTouch -> UIImage -> refcounted NSObject > kImageRepPNG -> RefCountedMemory -> RefCountedThreadSafe > kImageRepSkia -> ImageSkia -> ImageSkiaStorage -> RefCountedThreadSafe > > So it seems like gfx::Image could be kept cheaply copyable by changing the storage to rely on the refcounting of the underlying data instead. Good suggestion. OK I tried this https://codereview.chromium.org/1873463004 and there are two main problems with this approach (detailed in that CL description): 1. ToImageSkia() now returns a pointer to the ImageSkia that lives inside that Image object. Clients that rely on the ImageSkia* outliving the immediate Image object will now have dangling pointers. I think the only fix for this is to delete ToImageSkia and update all clients to use AsImageSkia (so they get their own shallow copy of the ImageSkia). This in turn requires refactoring a lot of ImageSkia* to ImageSkia. 2. More fundamental problem: it breaks caching of alternative representations (see CL description for details). I think copying Image is not as cheap as you might think. On top of the above #2, it requires copying several layers of maps. I think I'd prefer to leave Images ref-counted over doing a shallow copy. > I think this design introduces a large change and I'm not sold on the benefits yet. If the core issue is trying to solve the thread safety problem, is there a more precise and less invasive fix? So I am independently working on a CL to introduce a thread checker (https://codereview.chromium.org/1771033003/) which, at least at runtime, helps us diagnose thread safety issues. So that's essentially the "fix" for the thread safety issue. This refactor is more about encouraging correct design in line with the style guide, so when I have to write code that passes an Image over a thread boundary, I can make an informed decision like "can I std::move the Image so it is owned by the other thread, or should I pull out its ImageSkia and pass that?". Right now, Images have no clear owner so that question is hard to answer.
,
May 26 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/69dabee977f04d3166b0a42bdc335818567a762e commit 69dabee977f04d3166b0a42bdc335818567a762e Author: Taiju Tsuiki <tzik@google.com> Date: Fri May 26 07:14:02 2017 Avoid touching gfx::Image from multiple thread in devtools page_handler PageHandler::ScreencastFrameCaptured creates a gfx::Image and uses it on another sequence. However, the usage causes a data race as gfx::Image has a non-threadsafe ref count internally, and its copies can't cross sequences. Bug: 600237 , 725706 Change-Id: I645d529db1b8887b3c67f81a7916cdab0940b1d0 Reviewed-on: https://chromium-review.googlesource.com/513665 Reviewed-by: Dmitry Gozman <dgozman@chromium.org> Commit-Queue: Taiju Tsuiki (tzik) <tzik@google.com> Cr-Commit-Position: refs/heads/master@{#474952} [modify] https://crrev.com/69dabee977f04d3166b0a42bdc335818567a762e/content/browser/devtools/protocol/page_handler.cc
,
Jul 27 2017
,
Jul 31 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/3d83f26ad153fd480b6e2330a90c5ba37f4dd623 commit 3d83f26ad153fd480b6e2330a90c5ba37f4dd623 Author: Matt Giuca <mgiuca@chromium.org> Date: Mon Jul 31 03:37:23 2017 gfx::Image, ImageFamily: Add and remove copying/moving methods. Image: - Adds move constructor/assignment. Not technically required as Image is cheaply copyable, but this allows more efficient moving without incrementing/decrementing the refcount. Also safer for multi-threaded usage. - Removes SwapRepresentations. Was unused, and superseded by std::move. ImageFamily: - Removes copy constructor/assignment. ImageFamily can be fairly heavyweight, so it's best to move it rather than copying. Also it is heavily used across threads and accidental ImageFamily copying is responsible for data races ( https://crbug.com/749342 ). - Adds Clone method, an explicit copy operator for use if necessary. - Adds move constructor/assignment. This should be used where possible. BUG= 600237 , 749342 Change-Id: I00482c8440d62edb8d9fb216c3de65bf4102d153 Reviewed-on: https://chromium-review.googlesource.com/588033 Reviewed-by: Robert Sesek <rsesek@chromium.org> Reviewed-by: Ben Wells <benwells@chromium.org> Commit-Queue: Matt Giuca <mgiuca@chromium.org> Cr-Commit-Position: refs/heads/master@{#490704} [modify] https://crrev.com/3d83f26ad153fd480b6e2330a90c5ba37f4dd623/chrome/browser/web_applications/web_app.cc [modify] https://crrev.com/3d83f26ad153fd480b6e2330a90c5ba37f4dd623/extensions/browser/image_loader_unittest.cc [modify] https://crrev.com/3d83f26ad153fd480b6e2330a90c5ba37f4dd623/ui/gfx/image/image.cc [modify] https://crrev.com/3d83f26ad153fd480b6e2330a90c5ba37f4dd623/ui/gfx/image/image.h [modify] https://crrev.com/3d83f26ad153fd480b6e2330a90c5ba37f4dd623/ui/gfx/image/image_family.cc [modify] https://crrev.com/3d83f26ad153fd480b6e2330a90c5ba37f4dd623/ui/gfx/image/image_family.h [modify] https://crrev.com/3d83f26ad153fd480b6e2330a90c5ba37f4dd623/ui/gfx/image/image_family_unittest.cc [modify] https://crrev.com/3d83f26ad153fd480b6e2330a90c5ba37f4dd623/ui/gfx/image/image_unittest.cc
,
Sep 24
|
|||
►
Sign in to add a comment |
|||
Comment 1 by mgiuca@chromium.org
, Apr 7 2016