autoupdate_Cellular* tests fail to start the update on some devices in pool:cellular but not others |
||||||
Issue descriptionBoth regular update and forced OOBE over cellular are failing: https://stainless.corp.google.com/search?view=list&first_date=2018-10-20&last_date=2018-11-02&suite=%5Ecellular%5C_au%24&model=nautilus&exclude_cts=true&exclude_not_run=false&exclude_non_release=true&exclude_au=true&exclude_acts=true&exclude_retried=true&exclude_non_production=false autoupdate_CannedOmahaUpdate client test did not pass (reason: Update attempt failed: Command <update_engine_client -update -omaha_url=http://127.0.0.1:39709/update > failed, rc=1, Command returned non-zero exit status). autoupdate_StartOOBEUpdate client test did not pass (reason: Update status was unexpectedly IDLE when we were waiting for the update to complete. Please check the update engine logs.).
,
Nov 6
The only thing I can see is that it can't receive any data over the cellular. Benchan@ would be the expert on this :)
,
Nov 9
I have tested at my desk and I cannot reproduce this autoupdate_Cellular.delta astronaut: 5/5 pass nautilus: 3/3 pass autoupdate_ForcedOOBEUpdate.cellular.delta astronaut: 3/3 pass nautilus: 3/3 pass
,
Nov 9
However it is still happening in the lab. It looks to be flaky as it is only happening sometimes. Latest build the test ran on two different devices: One passed, one failed with update did not start Passing: https://stainless.corp.google.com/browse/chromeos-autotest-results/256146464-chromeos-test/ Failing: https://stainless.corp.google.com/browse/chromeos-autotest-results/256146011-chromeos-test/
,
Nov 9
,
Nov 9
A bit more investigation... I kicked off the test from my desk against some devices in pool:cellular chromeos1-row2-rack7-host1 (daisy): Fail 3/3 with the failure in this bug chromeos1-row2-rack7-host4 (nautilus): Pass 3/3 chromeos1-row1-rack8-host2 (quawks): Pass 2/3 1 failure for INIT_ERROR: org.chromium.flimflam.Error.Failure: Serial command timed out
,
Nov 9
chromeos1-row2-rack7-host3 (nautilus): Fail 2/2 with the failure in this bug
,
Nov 9
Interestingly the two devices that failed for me are also the two that have last failed in the lab. chromeos1-row2-rack7-host1 (daisy) chromeos1-row2-rack7-host3 (nautilus)
,
Nov 13
,
Jan 11
This issue has an owner, a component and a priority, but is still listed as untriaged or unconfirmed. By definition, this bug is triaged. Changing status to "assigned". Please reach out to me if you disagree with how I've done this. |
||||||
►
Sign in to add a comment |
||||||
Comment 1 by dhaddock@google.com
, Nov 3Example update_engine.log [1101/001437:INFO:omaha_request_action.cc(847)] Request: <?xml version="1.0" encoding="UTF-8"?> <request protocol="3.0" version="ChromeOSUpdateEngine-0.1.0.0" updaterversion="ChromeOSUpdateEngine-0.1.0.0" installsource="ondemandupdate" ismachine="1"> <os version="Indy" platform="Chrome OS" sp="ForcedUpdate_x86_64"></os> <app appid="{85F8FA82-F276-4EA6-8980-93FE091F6D25}" version="ForcedUpdate" track="stable-channel" lang="en-US" board="nautilus" hardware_class="NAUTILUS F3B-A2D-B5W-394-Q8M" delta_okay="true" fw_version="" ec_version="" > <event eventtype="13" eventresult="1"></event> </app> </request> [1101/001437:INFO:libcurl_http_fetcher.cc(146)] Starting/Resuming transfer [1101/001437:INFO:libcurl_http_fetcher.cc(165)] Using proxy: no [1101/001437:INFO:libcurl_http_fetcher.cc(286)] Not setting http(s) curl options because we are running a dev/test image [1101/001437:INFO:libcurl_http_fetcher.cc(433)] HTTP response code: 200 [1101/001437:INFO:libcurl_http_fetcher.cc(509)] Transfer completed (200), 1365 bytes downloaded [1101/001437:INFO:omaha_request_action.cc(1210)] Omaha request response: <response protocol="3.0"> <daystart elapsed_seconds="44801"/> <app appid="{85F8FA82-F276-4EA6-8980-93FE091F6D25}" status="ok"> <ping status="ok"/> <updatecheck status="ok"> <urls> <url codebase="https://storage.googleapis.com/"/><url codebase="https://storage.googleapis.com/"/> </urls> <manifest version="999999.0.0"> <packages> <package hash_sha256="71c4b1ac12a01d537204f0663618dfdedacaf9c7f400d8beafe298d9b10043e1" name="chromeos-throw-away-bucket/CrOSPayloads/Cellular/chromeos_R72-11197.0.0_nautilus_full_dev.bin" size="1396035064" required="true"/> </packages> <actions> <action event="postinstall" ChromeOSVersion="999999.0.0" sha256="71c4b1ac12a01d537204f0663618dfdedacaf9c7f400d8beafe298d9b10043e1" needsadmin="false" IsDeltaPayload="false" MaxFailureCountPerUrl="1" DisablePayloadBackoff="true" deadline="now" /> </actions> </manifest> </updatecheck> </app> </response> [1101/001437:INFO:omaha_request_action.cc(1899)] No device policy is loaded. Assuming rollback enabled. [1101/001437:INFO:omaha_request_action.cc(1929)] Rollback is enabled. Setting kernel_max_rollforward to 65537 [1101/001437:INFO:metrics_reporter_omaha.cc(609)] Sending 65537 for metric UpdateEngine.KernelKey.MinVersion [1101/001437:INFO:metrics_reporter_omaha.cc(614)] Sending 65537 for metric UpdateEngine.KernelKey.MaxRollforwardVersion [1101/001437:INFO:metrics_reporter_omaha.cc(619)] Sending 1 for metric UpdateEngine.KernelKey.MaxRollforwardSetSuccess (bool) [1101/001437:INFO:action_processor.cc(116)] ActionProcessor: finished OmahaRequestAction with code ErrorCode::kSuccess [1101/001437:INFO:action_processor.cc(143)] ActionProcessor: starting DownloadAction [1101/001437:INFO:install_plan.cc(83)] InstallPlan: resume, version: 999999.0.0, source_slot: B, target_slot: A, url: https://storage.googleapis.com/chromeos-throw-away-bucket/CrOSPayloads/Cellular/chromeos_R72-11197.0.0_nautilus_full_dev.bin, payload: (size: 1396035064, metadata_size: 0, metadata signature: , hash: 71C4B1AC12A01D537204F0663618DFDEDACAF9C7F400D8BEAFE298D9B10043E1, payload type: full), hash_checks_mandatory: false, powerwash_required: false, switch_slot_on_reboot: true, run_post_install: true [1101/001437:INFO:download_action.cc(200)] Marking new slot as unbootable [1101/001437:INFO:boot_control_chromeos.cc(201)] Marking slot A unbootable [1101/001437:INFO:multi_range_http_fetcher.cc(45)] starting first transfer [1101/001437:INFO:multi_range_http_fetcher.cc(74)] starting transfer of range 0+66039 [1101/001437:INFO:libcurl_http_fetcher.cc(146)] Starting/Resuming transfer [1101/001437:INFO:libcurl_http_fetcher.cc(165)] Using proxy: no [1101/001437:INFO:libcurl_http_fetcher.cc(286)] Not setting http(s) curl options because we are running a dev/test image [1101/001437:ERROR:libcurl_http_fetcher.cc(436)] Unable to get http response code. [1101/001437:INFO:libcurl_http_fetcher.cc(468)] Transfer resulted in an error (0), 0 bytes downloaded [1101/001437:INFO:libcurl_http_fetcher.cc(482)] No further proxies, indicating transfer complete [1101/001437:INFO:multi_range_http_fetcher.cc(168)] Received transfer complete. [1101/001437:INFO:multi_range_http_fetcher.cc(125)] TransferEnded w/ code 0 [1101/001437:INFO:multi_range_http_fetcher.cc(140)] Didn't get enough bytes. Ending w/ failure. [1101/001437:INFO:action_processor.cc(116)] ActionProcessor: finished DownloadAction with code ErrorCode::kDownloadTransferError [1101/001437:INFO:action_processor.cc(121)] ActionProcessor: Aborting processing due to failure. [1101/001437:INFO:update_attempter.cc(993)] Processing Done. [1101/001437:INFO:cpu_limiter.cc(71)] Setting cgroup cpu shares to 1024 [1101/001437:INFO:cpu_limiter.cc(79)] CPU shares = 1024 [1101/001437:ERROR:update_attempter.cc(1407)] Update failed.