Review WriteFile() logic (e.g. support for partial-completion) in mojo::edk::ChannelWin |
||||||
Issue descriptionmojo::edk::ChannelWin keeps separate MessagePumpForIO::IOContext structures, each containing an OVERLAPPED structure, for each different kind of I/O operation. For connect & read operations this is fine, since only one operation is in-progress at any given moment (and actually we never have both connect and read operations in-progress at the same time, so they could in principle share a context). However, we call WriteFile() for each Write() operation, but pass the same IOContext/OVERLAPPED structure to each call - although this happens to work for us in practice, it is not correct with respect to the API documentation. Since we already maintain an in-memory queue of outgoing messages, we can easily associate a dedicated IOContext with each MessageView, to use in the WriteFile() call.
,
Mar 14 2018
,
Mar 16 2018
Constructed a trivial loopback experiment which writes a 10MB message to the mojo::edk::Channel, and modified the Channel implementation to: - Log the size of data passed to WriteFile. - Constraint each ReadFile to 4096 bytes. - Log the bytes_read and bytes_written. Observed that (on Windows 10) the WriteFile operation reports completion only when the reader has completely consumed the entire buffer. Modifying the test to stop reading part-way through the message, and close the channel, I did not observe any write-completion notification via OnWriteDone. Based on these observations, removing support from ChannelWin for partial write completions, and maintaining an IOContext/OVERLAPPED structure for each |outgoing_messages_| entry seems the correct resolution.
,
Mar 16 2018
Nice investigation. This seems to also validate the WriteFile MSDN documentation as well.
,
Mar 16 2018
My impression was that the MSDN docs were pretty ambiguous where OVERLAPPED I/O is concerned; there's mention of overlapped operations "blocking" a system thread rather than a userspace thread, but blocking WriteFile() can still partially-complete in some instances, I think, so that doesn't necessarily mean an OVERLAPPED WriteFile() can't partially complete. However, there is an overlapped pipe server example which explicitly checks whether cbWritten passed to the completion is == the amount of bytes we intended to write, which supports our case as well. :)
,
Mar 20 2018
My original assessment of this was wrong, since we have an explicitly check that [appears intended to] stop us from having more than one WriteFile() in-flight at a given time (see https://cs.chromium.org/chromium/src/mojo/edk/system/channel_win.cc?l=104): bool write_now = !delay_writes_ && outgoing_messages_.empty(); outgoing_messages_.emplace_back(std::move(message), 0); if (write_now && !WriteNoLock(outgoing_messages_.front())) reject_writes_ = write_error = true; This suggests that we are not _deliberately_ re-using the same OVERLAPPED for multiple WriteFile() operations. However, given that the current ref-counting works fine, whereas holding a single self-reference during write operations didn't, it suggest that we are somehow ending up with multiple WriteFile()s in-progress in some edge-case. Do we want to: 1. Remove support for partial WriteFile() completion, since that complexity does not seem to be required? 2. Add support for dispatching multiple WriteFile()s in parallel? Depends on #1. May provide some reduction in context switching for chatty IPC exchanges.
,
Mar 20 2018
Re the ref-counting issues: #epicfacepalm If you look at https://chromium-review.googlesource.com/c/chromium/src/+/959762/4/mojo/edk/system/channel_win.cc#237 then you'l notice that the |is_write_pending_| self-reference is being move()d into a reference on the stack, from the IO thread. Since Write() is thread-safe, though, |is_write_pending_| could be _set_ from some arbitrary other thread.
,
Mar 21 2018
Dropping priority since the initial analysis suggesting broken write-completion handling vs multiple writes was incorrect.
,
Mar 21 2018
We should remove partial write handling.
,
Mar 21 2018
,
Mar 21 2018
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/b8bc6981c6d135e32fcdcfbb6dee72f41140d792 commit b8bc6981c6d135e32fcdcfbb6dee72f41140d792 Author: Wez <wez@chromium.org> Date: Wed Mar 21 16:08:27 2018 Remove support for partial WriteFile() completion from ChannelWin. Overlapped WriteFile() operations on named pipes do not report partial- completion, so simplify the mojo::edk::ChannelWin implementation by removing the partial-completion logic. Bug: 821903 Change-Id: Iaa7132e86d7e88b097b4243342eccfe3959c7e69 Reviewed-on: https://chromium-review.googlesource.com/972625 Commit-Queue: Wez <wez@chromium.org> Reviewed-by: Ken Rockot <rockot@chromium.org> Cr-Commit-Position: refs/heads/master@{#544721} [modify] https://crrev.com/b8bc6981c6d135e32fcdcfbb6dee72f41140d792/mojo/edk/system/channel_win.cc
,
Apr 3 2018
rockot: We could replace the Boolean |is_<foo>_pending_| flags with scoped_refptr<> instances, but while it makes the ref-counting more intuitive, it makes OnIOCompleted() more complex. This is in part because the current implementation relies on Release() being thread-safe, whereas if we used a scoped_refptr<> for |is_write_pending_|, we would need to take |write_lock_| while clearing the refptr<>, which doesn't seem worth it. |
||||||
►
Sign in to add a comment |
||||||
Comment 1 by w...@chromium.org
, Mar 14 2018