We need to handle failures from the underlying components (Model, Driver, and FileMonitor). If any of them fails, we probably want to drop all jobs and notify the Client.
Current behavior:
(1) If any component fails, we make the service effectively unavailable and notify the Client.
Future behavior:
(1) If any component fails, we nuke all state and start from scratch.
(2) We notify the Client that we had a bad startup.
It is worth considering whether we still have "unrecoverable" failures that cause the service to effectively turn off. In that case we could just reject all StartDownload() requests by passing an UNKNOWN error to the callback.
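
A rough sketch of the proposed future behavior, to make the discussion concrete. Everything here is hypothetical: ServiceSketch, InitStatus, ServiceState, StartResult, AddRestoredJob() and the can_recover flag are illustrative names, not the real service API. It assumes the three components report their initialization results together, and that something upstream can tell whether wiping state is expected to help.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_set>
#include <utility>

// Hypothetical types for illustration only; not the actual service interface.
enum class StartResult { ACCEPTED, UNKNOWN };
enum class ServiceState { READY, RECOVERED_FROM_BAD_STARTUP, UNAVAILABLE };

// Initialization outcome of the three underlying components.
struct InitStatus {
  bool model_ok = false;
  bool driver_ok = false;
  bool file_monitor_ok = false;
  bool AllOk() const { return model_ok && driver_ok && file_monitor_ok; }
};

class ServiceSketch {
 public:
  using StartCallback =
      std::function<void(const std::string& guid, StartResult result)>;
  using ClientNotifier = std::function<void(ServiceState state)>;

  explicit ServiceSketch(ClientNotifier notifier)
      : notifier_(std::move(notifier)) {}

  // Stands in for jobs loaded from persisted state before initialization.
  void AddRestoredJob(const std::string& guid) { jobs_.insert(guid); }

  // Called once all components have reported. |can_recover| is an assumed
  // input saying whether starting from scratch is expected to work.
  void OnComponentsInitialized(const InitStatus& status, bool can_recover) {
    if (status.AllOk()) {
      state_ = ServiceState::READY;
    } else {
      // Any component failure: drop all jobs and start from scratch.
      jobs_.clear();
      state_ = can_recover ? ServiceState::RECOVERED_FROM_BAD_STARTUP
                           : ServiceState::UNAVAILABLE;
    }
    // Tell the Client how startup went (including a bad startup).
    notifier_(state_);
  }

  void StartDownload(const std::string& guid, const StartCallback& callback) {
    if (state_ == ServiceState::UNAVAILABLE) {
      // "Unrecoverable" case: the service is effectively off, so every
      // request is rejected with UNKNOWN.
      callback(guid, StartResult::UNKNOWN);
      return;
    }
    jobs_.insert(guid);
    callback(guid, StartResult::ACCEPTED);
  }

  std::size_t job_count() const { return jobs_.size(); }

 private:
  ClientNotifier notifier_;
  // Requests are rejected until initialization has completed.
  ServiceState state_ = ServiceState::UNAVAILABLE;
  std::unordered_set<std::string> jobs_;
};
```

In this sketch the Client hears about a bad startup through the same notification path as a clean one, so it can decide whether to re-issue its downloads. Whether an UNAVAILABLE state should exist at all is exactly the open question above.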
Comment 1 by dtrainor@chromium.org, Jul 11 2017
Status: Started (was: Available)