New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 821903 link

Starred by 1 user

Issue metadata

Status: Fixed
Owner:
Closed: Apr 2018
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Windows
Pri: 2
Type: Bug



Sign in to add a comment

Review WriteFile() logic (e.g. support for partial-completion) in mojo::edk::ChannelWin

Project Member Reported by w...@chromium.org, Mar 14 2018

Issue description

mojo::edk::ChannelWin keeps separate MessagePumpForIO::IOContext structures, each containing an OVERLAPPED structure, for each different kind of I/O operation.  For connect & read operations this is fine, since only one operation is in-progress at any given moment (and actually we never have both connect and read operations in-progress at the same time, so they could in principle share a context).

However, we call WriteFile() for each Write() operation, but pass the same IOContext/OVERLAPPED structure to each call - although this happens to work for us in practice, it is not correct with respect to the API documentation.

Since we already maintain an in-memory queue of outgoing messages, we can easily associate a dedicated IOContext with each MessageView, to use in the WriteFile() call.
 

Comment 1 by w...@chromium.org, Mar 14 2018

OTOH, the rest of ChannelWin appears to assume that the WriteFile() operation may result in partial-completion, in which case attempting to WriteFile() the remainder of the data subsequently will result in data reordering if any other message had been passed to WriteFile() in the meantime.

We should probably arrange to have only a single WriteFile() in-flight at any time, though it would be good to confirm whether partial completion is really a possibility for pipes.

Comment 2 by w...@chromium.org, Mar 14 2018

Labels: -Pri-2 Pri-1
Summary: Review use of multiple pending WriteFile() calls, using the same OVERLAPPED structure, in mojo::edk::ChannelWin (was: mojo::edk::ChannelWin::Write() passes the same OVERLAPPED structure to multiple ::WriteFile() calls)

Comment 3 by w...@chromium.org, Mar 16 2018

Constructed a trivial loopback experiment which writes a 10MB message to the mojo::edk::Channel, and modified the Channel implementation to:

- Log the size of data passed to WriteFile.
- Constraint each ReadFile to 4096 bytes.
- Log the bytes_read and bytes_written.

Observed that (on Windows 10) the WriteFile operation reports completion only when the reader has completely consumed the entire buffer.

Modifying the test to stop reading part-way through the message, and close the channel, I did not observe any write-completion notification via OnWriteDone.

Based on these observations, removing support from ChannelWin for partial write completions, and maintaining an IOContext/OVERLAPPED structure for each |outgoing_messages_| entry seems the correct resolution.
Nice investigation. This seems to also validate the WriteFile MSDN documentation as well.

Comment 5 by w...@chromium.org, Mar 16 2018

My impression was that the MSDN docs were pretty ambiguous where OVERLAPPED
I/O is concerned; there's mention of overlapped operations "blocking" a
system thread rather than a userspace thread, but blocking WriteFile() can
still partially-complete in some instances, I think, so that doesn't
necessarily mean an OVERLAPPED WriteFile() can't partially complete.

However, there is an overlapped pipe server example which explicitly checks
whether cbWritten passed to the completion is == the amount of bytes we
intended to write, which supports our case as well. :)

Comment 6 by w...@chromium.org, Mar 20 2018

Summary: Review WriteFile() logic (e.g. support for partial-completion) in mojo::edk::ChannelWin (was: Review use of multiple pending WriteFile() calls, using the same OVERLAPPED structure, in mojo::edk::ChannelWin)
My original assessment of this was wrong, since we have an explicitly check that [appears intended to] stop us from having more than one WriteFile() in-flight at a given time (see https://cs.chromium.org/chromium/src/mojo/edk/system/channel_win.cc?l=104):

      bool write_now = !delay_writes_ && outgoing_messages_.empty();
      outgoing_messages_.emplace_back(std::move(message), 0);
      if (write_now && !WriteNoLock(outgoing_messages_.front()))
        reject_writes_ = write_error = true;

This suggests that we are not _deliberately_ re-using the same OVERLAPPED for multiple WriteFile() operations. However, given that the current ref-counting works fine, whereas holding a single self-reference during write operations didn't, it suggest that we are somehow ending up with multiple WriteFile()s in-progress in some edge-case.

Do we want to:
1. Remove support for partial WriteFile() completion, since that complexity does not seem to be required?
2. Add support for dispatching multiple WriteFile()s in parallel?  Depends on #1.  May provide some reduction in context switching for chatty IPC exchanges.

Comment 7 by w...@chromium.org, Mar 20 2018

Re the ref-counting issues: #epicfacepalm

If you look at https://chromium-review.googlesource.com/c/chromium/src/+/959762/4/mojo/edk/system/channel_win.cc#237 then you'l notice that the |is_write_pending_| self-reference is being move()d into a reference on the stack, from the IO thread.  Since Write() is thread-safe, though, |is_write_pending_| could be _set_ from some arbitrary other thread.

Comment 8 by w...@chromium.org, Mar 21 2018

Labels: -Pri-1 Pri-2
Dropping priority since the initial analysis suggesting broken write-completion handling vs multiple writes was incorrect.

Comment 9 by roc...@chromium.org, Mar 21 2018

We should remove partial write handling.

Comment 10 by w...@chromium.org, Mar 21 2018

Status: Started (was: Assigned)
Project Member

Comment 11 by bugdroid1@chromium.org, Mar 21 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/b8bc6981c6d135e32fcdcfbb6dee72f41140d792

commit b8bc6981c6d135e32fcdcfbb6dee72f41140d792
Author: Wez <wez@chromium.org>
Date: Wed Mar 21 16:08:27 2018

Remove support for partial WriteFile() completion from ChannelWin.

Overlapped WriteFile() operations on named pipes do not report partial-
completion, so simplify the mojo::edk::ChannelWin implementation by
removing the partial-completion logic.

Bug:  821903 
Change-Id: Iaa7132e86d7e88b097b4243342eccfe3959c7e69
Reviewed-on: https://chromium-review.googlesource.com/972625
Commit-Queue: Wez <wez@chromium.org>
Reviewed-by: Ken Rockot <rockot@chromium.org>
Cr-Commit-Position: refs/heads/master@{#544721}
[modify] https://crrev.com/b8bc6981c6d135e32fcdcfbb6dee72f41140d792/mojo/edk/system/channel_win.cc

Comment 12 by w...@chromium.org, Apr 3 2018

Status: Fixed (was: Started)
rockot: We could replace the Boolean |is_<foo>_pending_| flags with scoped_refptr<> instances, but while it makes the ref-counting more intuitive, it makes OnIOCompleted() more complex. This is in part because the current implementation relies on Release() being thread-safe, whereas if we used a scoped_refptr<> for |is_write_pending_|, we would need to take |write_lock_| while clearing the refptr<>, which doesn't seem worth it.

Sign in to add a comment